
SOLVENT: A Mixed Initiative System for Finding Analogies between Research Papers

Published: 01 November 2018

Abstract

Scientific discoveries are often driven by finding analogies in distant domains, but the growing number of papers makes it difficult to find relevant ideas in a single discipline, let alone distant analogies in other domains. To provide computational support for finding analogies across domains, we introduce SOLVENT, a mixed-initiative system where humans annotate aspects of research papers that denote their background (the high-level problems being addressed), purpose (the specific problems being addressed), mechanism (how they achieved their purpose), and findings (what they learned/achieved), and a computational model constructs a semantic representation from these annotations that can be used to find analogies among the research papers. We demonstrate that this system finds more analogies than baseline information-retrieval approaches; that annotators and annotations can generalize beyond domain; and that the resulting analogies found are useful to experts. These results demonstrate a novel path towards computationally supported knowledge sharing in research communities.
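As a rough illustration of the idea described in the abstract, the sketch below stores the four annotated aspects of each paper and ranks a small corpus by similarity on one chosen aspect. It is not the authors' model: the bag-of-words vectors, the cosine similarity measure, and the names `AnnotatedPaper`, `bag_of_words`, `cosine`, and `rank_by_aspect` are all assumptions introduced here for illustration, and the abstract does not specify how SOLVENT builds its semantic representation.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class AnnotatedPaper:
    """A paper with human-annotated text for the four aspects named in the abstract."""
    title: str
    background: str  # high-level problems being addressed
    purpose: str     # specific problems being addressed
    mechanism: str   # how the purpose was achieved
    findings: str    # what was learned or achieved


def bag_of_words(text: str) -> Counter:
    # Assumption: raw token counts stand in for a real semantic representation.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors; 0.0 if either is empty.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_aspect(query: AnnotatedPaper, corpus: list, aspect: str = "purpose") -> list:
    # Score every corpus paper against the query on a single annotated aspect,
    # then return (score, title) pairs from most to least similar.
    q = bag_of_words(getattr(query, aspect))
    scored = [(cosine(q, bag_of_words(getattr(p, aspect))), p.title) for p in corpus]
    return sorted(scored, reverse=True)
```

Matching on `purpose` while ignoring `mechanism` is one plausible way to surface papers that tackle a similar problem with a different approach, which is the kind of cross-domain analogy the abstract describes. A real system would presumably replace the token counts with a richer representation.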


