The nonverbal structure of patient case discussions in multidisciplinary medical team meetings

Abstract

Meeting analysis has a long theoretical tradition in social psychology, with established practical ramifications in computer science, especially in computer-supported cooperative work. More recently, a good deal of research has focused on the issues of indexing and browsing multimedia records of meetings. Most research in this area, however, is still based on data collected in laboratories, under somewhat artificial conditions. This article presents an analysis of the discourse structure and spontaneous interactions at real-life multidisciplinary medical team meetings held as part of the work routine in a major hospital. It is hypothesized that the conversational structure of these meetings, as indicated by sequencing and duration of vocalizations, enables segmentation into individual patient case discussions. The task of segmenting audio-visual records of multidisciplinary medical team meetings is described as a topic segmentation task, and a method for automatic segmentation is proposed. An empirical evaluation based on hand-labelled data is presented, which determines the optimal length of vocalization sequences for segmentation and establishes the competitiveness of the method with approaches based on more complex knowledge sources. The effectiveness of Bayesian classification as a segmentation method and its applicability to meeting segmentation in other domains are discussed.
