Abstract
We focus on the problem of predicting missing assertions in Web ontologies. We start from the assumption that individual resources that are similar in some aspects are more likely to be linked by specific relations: this phenomenon is also referred to as homophily and emerges in a variety of relational domains. In this article, we propose a method for (1) identifying which relations in the ontology are more likely to link similar individuals and (2) efficiently propagating knowledge across chains of similar individuals. By enforcing sparsity in the model parameters, the proposed method is able to select only the most relevant relations for a given prediction task. Our experimental evaluation demonstrates the effectiveness of the proposed method in comparison to state-of-the-art methods from the literature.
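The idea of propagating knowledge across similar individuals, with sparse weights choosing which relations matter, can be illustrated with a small sketch. This is not the method proposed in the article: it is a generic harmonic label-propagation loop in plain Python, and the function name `propagate`, the relation names, and the weights are all hypothetical. Each relation contributes edges scaled by a nonnegative weight, so a weight of zero removes that relation from the prediction, which is the effect a sparsity-inducing penalty aims for.

```python
def propagate(labels, relations, weights, n, iters=200):
    """Spread known labels over a weighted similarity graph (illustrative only).

    labels:    dict mapping node index -> known score (e.g. +1.0 or -1.0)
    relations: dict mapping relation name -> list of (i, j) edges
    weights:   dict mapping relation name -> nonnegative weight (assumed given,
               not learned here); a zero weight switches the relation off
    n:         number of individuals
    """
    # Combine all relations into one symmetric weighted adjacency matrix.
    W = [[0.0] * n for _ in range(n)]
    for name, edges in relations.items():
        w = weights.get(name, 0.0)
        for i, j in edges:
            W[i][j] += w
            W[j][i] += w

    # Unlabeled individuals start at 0; labeled ones are clamped to their label.
    f = [float(labels.get(i, 0.0)) for i in range(n)]
    for _ in range(iters):
        new = f[:]
        for i in range(n):
            if i in labels:
                continue  # keep known labels fixed
            degree = sum(W[i])
            if degree > 0:
                # Each unlabeled score becomes the weighted mean of its neighbors.
                new[i] = sum(W[i][j] * f[j] for j in range(n)) / degree
        f = new
    return f


# Hypothetical toy data: a chain 0-1-2-3 under "knows", plus one "dislikes" edge.
relations = {"knows": [(0, 1), (1, 2), (2, 3)], "dislikes": [(0, 2)]}
labels = {0: 1.0, 3: -1.0}

# A zero weight on "dislikes" means only "knows" drives the propagation.
scores = propagate(labels, relations, {"knows": 1.0, "dislikes": 0.0}, n=4)
print(scores)
```

In this toy run the unlabeled individuals end up with scores between those of their labeled neighbors, and raising the weight on "dislikes" changes the result. Learning the weights from data, as the abstract describes, is outside the scope of the sketch.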
Adaptive Knowledge Propagation in Web Ontologies