Abstract
Although it has become easier for individuals to track their personal health data (e.g., heart rate, step count, and nutrient intake), there is still a wide gap between collecting data and generating meaningful summaries that help users understand what their data means to them. With a better understanding of their data, users can act on the new information and work toward their health goals. We aim to bridge the gap between data collection and summary generation by mining the data for interesting behavioral findings that may provide hints about a user’s tendencies. Our focus is on improving the explainability of temporal personal health data via a set of informative summary templates, or “protoforms.” These protoforms span both evaluation-based summaries, which help users evaluate their health goals, and pattern-based summaries, which explain their implicit behaviors. In addition to individual-level summaries, the protoforms we use are also designed for population-level summaries. We apply our approach to generate summaries (both univariate and multivariate) from real user health data and show that the summaries our system generates are both interesting and useful.
A Framework for Generating Summaries from Temporal Personal Health Data