DOI: 10.1145/2505515.2505573 (CIKM conference proceedings)
research-article

Programming with personalized pagerank: a locally groundable first-order probabilistic logic

Published: 27 October 2013

ABSTRACT

Many information-management tasks (including classification, retrieval, information extraction, and information integration) can be formalized as inference in an appropriate probabilistic first-order logic. However, most probabilistic first-order logics are not efficient enough for realistically sized instances of these tasks. One key problem is that queries are typically answered by "grounding" the query, i.e., mapping it to a propositional representation and then performing propositional inference. With a large database of facts, groundings can be very large, making inference and learning computationally expensive. Here we present a first-order probabilistic language that is well suited to approximate "local" grounding: in particular, every query $Q$ can be approximately grounded with a small graph. The language is an extension of stochastic logic programs in which inference is performed by a variant of personalized PageRank. Experimentally, we show that the approach performs well on an entity resolution task, a classification task, and a joint inference task; that the cost of inference is independent of database size; and that learning can be sped up by multi-threading.
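To make the idea of "local" grounding more concrete, the following is a minimal sketch of a generic push-style approximation to personalized PageRank. It is not taken from the paper and may differ from the variant the authors use. The function name `approx_ppr`, the adjacency-dict graph format, and the parameters `alpha` (restart probability) and `eps` (residual tolerance) are all assumptions made for illustration. The point it illustrates is that the computation only touches nodes near the seed, so its cost depends on `alpha` and `eps` rather than on the total size of the graph.

```python
from collections import defaultdict


def approx_ppr(graph, seed, alpha=0.15, eps=1e-4):
    """Push-style approximate personalized PageRank (illustrative sketch).

    graph: dict mapping node -> list of out-neighbors (assumed format).
    seed:  the node the walk restarts from (standing in for a query).
    alpha: restart probability (assumed default).
    eps:   residual tolerance per out-edge; smaller values explore more nodes.
    Returns (p, r): approximate scores and the residual mass not yet pushed.
    """
    p = defaultdict(float)  # approximate PageRank scores
    r = defaultdict(float)  # residual probability mass still to distribute
    r[seed] = 1.0

    def out(v):
        return graph.get(v, [])

    def needs_push(v):
        # A node is processed only while its residual is large enough.
        return r[v] >= eps * max(len(out(v)), 1)

    stack = [seed]
    while stack:
        u = stack.pop()
        if not needs_push(u):
            continue
        mass, r[u] = r[u], 0.0
        p[u] += alpha * mass           # keep a fraction alpha at u
        targets = out(u) or [seed]     # dangling nodes send mass back to the seed
        share = (1.0 - alpha) * mass / len(targets)
        for v in targets:
            r[v] += share              # spread the rest over out-neighbors
            if needs_push(v):
                stack.append(v)
    return dict(p), dict(r)


# Toy usage on a hypothetical four-node graph.
toy = {"q": ["a", "b"], "a": ["q"], "b": ["c"], "c": []}
scores, residual = approx_ppr(toy, "q")
```

Each push moves at least `alpha * eps` of probability mass into `p`, so the number of pushes is bounded by `1 / (alpha * eps)` regardless of how many nodes the graph contains. The nodes that end up with a score form the small local graph for the seed.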

