Scalable semantic analytics on social networks for addressing the problem of conflict of interest detection

Published: 03 March 2008

Abstract

In this article, we demonstrate the applicability of semantic techniques to detecting Conflict of Interest (COI). We explain the common challenges involved in building scalable Semantic Web applications, in particular those addressing connecting-the-dots problems. We describe in detail the challenges involved in two important aspects of building Semantic Web applications, namely, data acquisition and entity disambiguation (or reference reconciliation). We extend our previous work, in which we integrated the collaborative network of a subset of DBLP researchers with persons in a Friend-of-a-Friend (FOAF) social network. Our method finds the connections between people, measures collaboration strength, and includes heuristics that use friendship/affiliation information to estimate potential COI in a peer-review scenario. We evaluate the method by measuring the COI that could have existed between accepted papers in various conference tracks and their respective program committee members. The experimental results, obtained on a dataset of over 3 million entities (all bibliographic data from DBLP and a large collection of FOAF documents), demonstrate that the approach scales.
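As a rough illustration of the method outlined in the abstract, the sketch below measures collaboration strength from co-authorship data and combines it with friendship information to estimate a COI level. The abstract does not give the formulas, so everything specific here is an assumption: the weighting (each joint paper with n authors adds 1/(n - 1) to a pair's strength, a common choice in the collaboration network literature), the four COI levels, the `high` threshold, and the names `collaboration_strength` and `estimate_coi` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def collaboration_strength(papers):
    """Return {(a, b): weight} for every co-author pair, with a < b.

    Assumed weighting: each joint paper with n authors adds 1 / (n - 1).
    """
    strength = defaultdict(float)
    for authors in papers:
        unique = sorted(set(authors))  # drop duplicates, fix pair order
        if len(unique) < 2:
            continue  # single-author papers create no collaboration
        for pair in combinations(unique, 2):
            strength[pair] += 1.0 / (len(unique) - 1)
    return dict(strength)


def estimate_coi(reviewer, author, strength, friends=frozenset(), high=1.0):
    """Classify potential COI as 'high', 'medium', 'low', or 'none'.

    The levels and the `high` threshold are hypothetical placeholders.
    """
    pair = tuple(sorted((reviewer, author)))
    weight = strength.get(pair, 0.0)
    if weight >= high:
        return "high"
    if weight > 0 or pair in friends:
        return "medium"
    # No direct link: look for a shared co-author (a two-hop connection).
    coauthors = defaultdict(set)
    for a, b in strength:
        coauthors[a].add(b)
        coauthors[b].add(a)
    if coauthors[reviewer] & coauthors[author]:
        return "low"
    return "none"


# Toy data: each inner list is the author list of one paper.
papers = [["A", "B"], ["A", "B", "C"], ["C", "D"]]
strength = collaboration_strength(papers)
# Hypothetical FOAF "knows" link, stored as a sorted pair.
friends = {("D", "E")}
```

With this toy data, `estimate_coi("A", "B", strength)` returns `"high"` (two joint papers give a strength of 1.5), `estimate_coi("A", "D", strength)` returns `"low"` (A and D share the co-author C), and `estimate_coi("D", "E", strength, friends)` returns `"medium"` because of the friendship link alone. At the scale the abstract reports, the co-author index would be built once rather than on every call.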
