AJAXSearch: crawling, indexing and searching web 2.0 applications

Online: 01 August 2008

Abstract

Current search engines such as Google and Yahoo! are prevalent for searching the Web. Search in dynamic pages, however, is either nonexistent or far from perfect. AJAX and Rich Internet Applications are examples of such dynamic pages. They are increasingly common on the Web (YouTube, Amazon, GMail, Yahoo! Mail) and on mobile devices, and they offer the user a high degree of interactivity by seamlessly loading content from the server without refreshing the page. Current search engines cannot correctly index AJAX applications: because they do not understand the application logic that loads content dynamically, they produce false positives and false negatives. Crawling an AJAX application is a difficult problem. Since content changes as the user invokes events on the page, a crawler must identify the different application states generated by the client-side logic. This demo sets the stage for this new type of search and shows that a search engine for AJAX can be built. The challenges, as opposed to those of traditional search engines, include: automatically identifying states by triggering events, efficiently crawling application states, avoiding the invocation of potentially very numerous events, scalability in the number of events, duplicate elimination of states, result presentation and aggregation, and ranking. The demo presents the AJAX search engine (crawler, indexer, and query processor) applied to a real application and showcases challenges and solutions.
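
To make the crawling challenge concrete, the following is a minimal sketch of state discovery with duplicate elimination. It is not the system presented in the demo: the breadth-first strategy, the content hashing, and every name in it (`state_fingerprint`, `crawl_states`, `list_events`, `fire_event`, `max_states`, and the toy three-page news application) are assumptions made for illustration only. A real crawler would fire events inside a JavaScript-capable browser engine rather than through plain Python callbacks.

```python
from collections import deque
import hashlib


def state_fingerprint(content: str) -> str:
    """Hash whitespace-normalized content so identical states collapse."""
    normalized = " ".join(content.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def crawl_states(initial_state, list_events, fire_event, max_states=100):
    """Explore application states breadth-first by triggering events.

    list_events(state) -> iterable of event names    (hypothetical callback)
    fire_event(state, event) -> resulting content    (hypothetical callback)

    Returns (states, transitions): states maps fingerprint -> content,
    transitions lists (source_fingerprint, event, target_fingerprint).
    """
    start = state_fingerprint(initial_state)
    states = {start: initial_state}
    transitions = []
    queue = deque([start])
    while queue:
        source = queue.popleft()
        for event in list_events(states[source]):
            result = fire_event(states[source], event)
            target = state_fingerprint(result)
            if target not in states:
                if len(states) >= max_states:
                    # Budget exhausted: skip the new state and its transition.
                    continue
                states[target] = result
                queue.append(target)
            transitions.append((source, event, target))
    return states, transitions


# Toy stand-in for an AJAX news application with three pages (assumption).
PAGES = ["news page 1", "news page 2", "news page 3"]


def toy_events(state):
    # Every state exposes the same two clickable events.
    return ["next", "prev"]


def toy_fire(state, event):
    # Move one page forward or backward, clamped to the valid range.
    index = PAGES.index(state)
    step = 1 if event == "next" else -1
    return PAGES[max(0, min(len(PAGES) - 1, index + step))]


states, transitions = crawl_states(PAGES[0], toy_events, toy_fire)
```

In this toy run, pressing "prev" on the first page or "next" on the last page reproduces an already known state, so the fingerprint check records the transition without re-queuing the state. The `max_states` budget is one simple way to bound the work when the number of events is very large.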
