Abstract
Executable biology presents new challenges to formal methods. This paper addresses two problems that cell biologists face when developing formally analyzable models.
First, we show how to automatically synthesize a concurrent in-silico model for cell development given in-vivo experiments of how particular mutations influence the experimental outcome. The problem of synthesis under mutations is unique because mutations may produce non-deterministic outcomes (presumably by introducing races between competing signaling pathways in the cells) and the synthesized model must be able to replay all these outcomes in order to faithfully describe the modeled cellular processes. In contrast, a "regular" concurrent program is correct if it picks any outcome allowed by the non-deterministic specification. We developed synthesis algorithms and synthesized a model of cell fate determination of the nematode worm C. elegans. A version of this model previously took systems biologists months to develop.
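The "replay all outcomes" requirement can be illustrated with a minimal sketch. Everything in it is a hypothetical toy: the two pathways `A` and `B`, the inhibitor gene `I`, the six-element model space, and the brute-force enumeration are assumptions made for illustration, not the paper's C. elegans model or its synthesis algorithm. A candidate is accepted only if, for every mutation experiment, the set of fates reachable over all interleavings equals the observed set.

```python
from itertools import permutations, product
from typing import NamedTuple

PATHWAYS = ("A", "B")  # two competing signaling pathways (hypothetical)
FATES = ("primary", "secondary", "tertiary")


class Model(NamedTuple):
    a_inhibits_b: bool  # does pathway A block pathway B through gene "I"?
    default: str        # fate adopted when no pathway fires


def run(model, knocked_out, schedule):
    """Execute one interleaving of the pathways and return the cell fate."""
    fate, b_blocked = model.default, False
    for pathway in schedule:
        if pathway in knocked_out:
            continue  # a knocked-out pathway never fires
        if pathway == "A":
            fate = "primary"
            if model.a_inhibits_b and "I" not in knocked_out:
                b_blocked = True
        elif not b_blocked:  # pathway == "B" and it is not inhibited
            fate = "secondary"
    return fate


def outcomes(model, knocked_out):
    """All fates the model can reach for one mutation, over every schedule."""
    return frozenset(run(model, knocked_out, s) for s in permutations(PATHWAYS))


def synthesize(spec):
    """Keep candidates whose outcome set EQUALS the observed set per experiment."""
    candidates = (Model(i, d) for i, d in product((False, True), FATES))
    return [m for m in candidates
            if all(outcomes(m, ko) == fates for ko, fates in spec.items())]


# Knocked-out genes -> set of fates observed across repeated experiments.
SPEC = {
    frozenset(): frozenset({"primary"}),
    frozenset({"I"}): frozenset({"primary", "secondary"}),  # race exposed
    frozenset({"A"}): frozenset({"secondary"}),
    frozenset({"A", "B"}): frozenset({"tertiary"}),
}

models = synthesize(SPEC)
```

In this toy, knocking out `I` removes the inhibition and exposes a race between the two pathways, so the accepted model must be able to produce both observed fates for that experiment. Checking set equality rather than membership is what separates this notion of correctness from the "regular" one described above.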
Second, we address the problem of under-constrained specifications that arise due to incomplete sets of mutation experiments. Under-constrained specifications give rise to distinct models, each explaining the same phenomenon differently. Addressing the ambiguity of specifications corresponds to analyzing the space of plausible models. We develop algorithms for detecting ambiguity in specifications, i.e., whether there exist alternative models that would produce different fates on some unperformed experiment, and for removing redundancy from specifications, i.e., computing minimal non-ambiguous specifications.
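Ambiguity detection and redundancy removal can be sketched on a small hypothetical model space. The four named rules, the two genes, and the greedy minimization below are illustrative assumptions, not the algorithms developed in the paper. A specification is treated as ambiguous when two models consistent with it disagree on some unperformed experiment.

```python
from itertools import combinations

GENES = frozenset({"A", "B"})
# Every possible knockout experiment, smallest first.
EXPERIMENTS = [frozenset(c)
               for r in range(len(GENES) + 1)
               for c in combinations(sorted(GENES), r)]

# Hypothetical model space: each rule maps the genes present to "induced?".
MODELS = {
    "needs_A": lambda present: "A" in present,
    "needs_B": lambda present: "B" in present,
    "needs_both": lambda present: {"A", "B"} <= present,
    "needs_either": lambda present: bool(present & {"A", "B"}),
}


def consistent(spec):
    """Names of the models that reproduce every performed experiment."""
    return [name for name, rule in MODELS.items()
            if all(rule(GENES - ko) == induced for ko, induced in spec.items())]


def ambiguity_witness(spec):
    """An unperformed experiment on which consistent models differ, else None."""
    names = consistent(spec)
    for ko in EXPERIMENTS:
        if ko not in spec and len({MODELS[n](GENES - ko) for n in names}) > 1:
            return ko
    return None


def minimize(spec):
    """Greedily drop experiments whose removal does not introduce ambiguity."""
    kept = dict(spec)
    for ko in list(spec):
        trial = {k: v for k, v in kept.items() if k != ko}
        if ambiguity_witness(trial) is None:
            kept = trial
    return kept


# Knocked-out genes -> whether induction was observed.
PARTIAL = {frozenset(): True, frozenset({"A", "B"}): False}
FULL = {frozenset(): True, frozenset({"A"}): False,
        frozenset({"B"}): True, frozenset({"A", "B"}): False}

witness = ambiguity_witness(PARTIAL)  # experiment worth performing next
MINIMAL = minimize(FULL)              # irreducible non-ambiguous spec
```

Here `witness` names an experiment that would separate competing hypotheses, and `MINIMAL` keeps only the experiments that are needed to pin down the behavior. The greedy pass works in this sketch because removing experiments can only enlarge the set of consistent models, so a specification that is ambiguous stays ambiguous when more experiments are dropped.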
Additionally, we develop a modeling language and embed it into Scala. We describe how this language design and embedding allows us to build an efficient synthesizer. For our C. elegans case study, we infer two observationally equivalent models expressing different biological hypotheses through different protein interactions. One of these hypotheses was previously unknown to biologists.
Synthesis of biological models from mutation experiments
POPL '13: Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages