ABSTRACT

An analysis of multiparty interaction in the problem-solving sessions of the Multimodal Math Data Corpus is presented. The analysis focuses on non-verbal cues extracted from the audio tracks. Algorithms for expert identification and performance prediction (correctness of solution) are implemented based on patterns of speech activity among session participants. Both categorisation algorithms employ an underlying graph-based representation of the dialogue for each individual problem-solving activity. The proposed Bayesian approach to expert prediction proved quite effective, reaching accuracy levels of over 92% with as few as 6 dialogues of training data. Performance prediction was less effective. Although the simple graph-matching strategy employed for predicting incorrect solutions improved considerably over a Monte Carlo simulated baseline (F1 score increased by a factor of 2.3), there is still much room for improvement in this task.
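
As an illustration only, a graph-based representation of speech activity of the kind mentioned above might be built as in the following sketch. The segment format, the function name `vocalisation_graph`, and the choice of speaker-to-speaker transition probabilities as edge weights are all assumptions made for this example; the abstract does not specify how the graphs are actually constructed.

```python
from collections import Counter, defaultdict


def vocalisation_graph(segments):
    """Build a speaker-transition graph from speech activity segments.

    `segments` is a list of (start, end, speaker) tuples, assumed here to be
    non-overlapping (a hypothetical input format, not the corpus format).
    Returns a dict mapping each speaker to a dict of
    {next_speaker: transition probability}.
    """
    ordered = sorted(segments)  # order by start time
    counts = defaultdict(Counter)
    # Count each change of speaker between consecutive segments.
    for (_, _, src), (_, _, dst) in zip(ordered, ordered[1:]):
        if src != dst:
            counts[src][dst] += 1
    # Normalise outgoing counts into probabilities.
    return {
        src: {dst: n / sum(c.values()) for dst, n in c.items()}
        for src, c in counts.items()
    }


# Toy dialogue with three participants (invented data).
segments = [
    (0.0, 2.0, "A"),
    (2.0, 3.0, "B"),
    (3.0, 5.0, "A"),
    (5.0, 6.0, "C"),
    (6.0, 7.0, "A"),
]
graph = vocalisation_graph(segments)
```

In this toy dialogue, speaker A hands over to B and to C equally often, while B and C always hand back to A. A classifier could take such edge weights as features, although whether the algorithms described here do so is not stated in the abstract.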
Index Terms
Automatic identification of experts and performance prediction in the multimodal math data corpus through analysis of speech interaction





