Physics-Guided Machine Learning for Scientific Discovery: An Application in Simulating Lake Temperature Profiles

Published: 18 May 2021

Abstract

Physics-based models are often used to study engineering and environmental systems. The ability to model these systems is key to achieving environmental sustainability and improving the quality of human life. This article focuses on simulating lake water temperature, which is critical for understanding the impact of a changing climate on aquatic ecosystems and for informing aquatic resource management decisions. The General Lake Model (GLM) is a state-of-the-art physics-based model used to address such problems. However, like other physics-based models of scientific and engineering systems, it has several well-known limitations that stem from simplified representations of the physical processes being modeled and from the difficulty of selecting appropriate parameters. State-of-the-art machine learning models can sometimes outperform physics-based models when given ample training data, but they can produce results that are physically inconsistent. This article proposes a physics-guided recurrent neural network model (PGRNN) that combines recurrent neural networks (RNNs) and physics-based models to leverage their complementary strengths and improve the modeling of physical processes. Specifically, we show that a PGRNN can improve prediction accuracy over that of physics-based models (by over 20% even with very little training data) while generating outputs consistent with physical laws. An important aspect of the PGRNN approach is its ability to incorporate the knowledge encoded in physics-based models, which allows the PGRNN to be trained with very few observations while still achieving high prediction accuracy. Although we present and evaluate this methodology in the context of modeling lake temperature dynamics, it applies more widely to scientific and engineering disciplines in which physics-based (also known as mechanistic) models are used.
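To make the idea of "outputs consistent with physical laws" concrete, the following is a minimal sketch of a physics-guided loss, not the PGRNN training objective itself, which the abstract does not specify. It assumes one illustrative physical relationship for lakes: denser water sits below lighter water, so predicted density should not decrease with depth. The function names, the weight `lam`, and the choice of constraint are hypothetical; the density formula is a commonly used empirical fit for freshwater and is also an assumption here.

```python
from typing import Optional, Sequence


def water_density(temp_c: float) -> float:
    """Empirical freshwater density in kg/m^3 for a temperature in deg C.

    Assumption: a commonly used empirical fit, not taken from the abstract.
    """
    return 1000.0 * (
        1.0
        - (temp_c + 288.9414) * (temp_c - 3.9863) ** 2
        / (508929.2 * (temp_c + 68.12963))
    )


def physics_guided_loss(
    pred: Sequence[float],
    obs: Sequence[Optional[float]],
    lam: float = 1.0,
) -> float:
    """Data loss on observed depths plus a penalty for density inversions.

    pred: predicted temperatures, ordered from shallowest to deepest.
    obs:  observed temperatures at the same depths, None where unobserved.
    lam:  hypothetical weight on the physics penalty.
    """
    # Supervised term: mean squared error over depths that have observations.
    sq_err = [(p - o) ** 2 for p, o in zip(pred, obs) if o is not None]
    data_term = sum(sq_err) / len(sq_err) if sq_err else 0.0

    # Physics term: penalize any layer that is denser than the layer below it.
    dens = [water_density(t) for t in pred]
    viol = [max(0.0, dens[i] - dens[i + 1]) for i in range(len(dens) - 1)]
    physics_term = sum(viol) / len(viol) if viol else 0.0

    return data_term + lam * physics_term


if __name__ == "__main__":
    stratified = [20.0, 15.0, 10.0, 5.0]  # warm surface, cold bottom
    inverted = [5.0, 10.0, 15.0, 20.0]    # physically implausible in summer
    sparse_obs = [20.0, None, None, 5.0]  # only two depths observed
    print(physics_guided_loss(stratified, sparse_obs))
    print(physics_guided_loss(inverted, [None] * 4))
```

Because the physics term needs no labels, it can be evaluated at every depth, including those without observations. This is one way a model could be steered toward physically plausible profiles when observed data are scarce.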

Published in

ACM/IMS Transactions on Data Science, Volume 2, Issue 3
August 2021, 302 pages
ISSN: 2691-1922
DOI: 10.1145/3465442

Copyright © 2021 Association for Computing Machinery.

Publisher: Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 18 May 2021
• Accepted: 1 January 2021
• Revised: 1 July 2020
• Received: 1 December 2019


Qualifiers

• research-article
• Refereed
