DOI: 10.1145/1807085.1807122
research-article

A learning algorithm for top-down XML transformations

Published: 06 June 2010

ABSTRACT

The classical result that any regular language can be learned from examples is generalized from strings to trees and from languages to translations: it is shown that for any deterministic top-down tree transformation there exists a sample set of polynomial size (with respect to the minimal transducer) from which the translation can be inferred. Until now, similar results were known only for string transducers and for simple relabeling tree transducers. Learning deterministic top-down tree transducers (dtops) is far more involved because a dtop can copy, delete, and permute its input subtrees. Thus, the algorithm needs to maintain complex dependencies between labeled input paths and output paths. First, a Myhill-Nerode theorem is presented for dtops, which is interesting in its own right. This theorem is then used to construct a learning algorithm for dtops. Finally, it is shown how our result can be applied to XML transformations (e.g., XSLT programs). For this, a new DTD-based encoding of unranked trees by ranked ones is presented. Over such encodings, dtops can realize many practically interesting XML transformations which cannot be realized on first-child/next-sibling encodings.
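
To make the remark about copying, deleting, and permuting concrete, here is a minimal sketch of a dtop interpreter. It is an illustration only, not the paper's formal definition or its learning algorithm: the tuple encoding of trees, the `Call` placeholder, the `run` function, and the example alphabet (`f`, `g`, `p`, `h`, `p1`, `a`, `b`) are all assumptions made for this example, and a single start state stands in for the transducer's initial output.

```python
from dataclasses import dataclass

# A ranked tree is a tuple (label, child_1, ..., child_k); a leaf is (label,).

@dataclass(frozen=True)
class Call:
    """Placeholder in a rule's right-hand side (hypothetical helper):
    translate input child number `child` (0-based) in state `state`."""
    state: str
    child: int

def run(rules, state, tree):
    """Apply the rule for (state, root label), top-down and deterministically."""
    label, *children = tree
    rhs = rules[(state, label)]  # KeyError if no rule: the transducer is partial
    return _instantiate(rules, rhs, children)

def _instantiate(rules, rhs, children):
    """Build the output tree by replacing every Call with a recursive translation."""
    if isinstance(rhs, Call):
        return run(rules, rhs.state, children[rhs.child])
    label, *subs = rhs
    return (label, *(_instantiate(rules, s, children) for s in subs))

# Example rules (assumed for illustration), all in one state "q".
RULES = {
    ("q", "f"): ("f", Call("q", 1), Call("q", 0)),  # permute the two subtrees
    ("q", "g"): ("h", Call("q", 0), Call("q", 0)),  # copy the only subtree
    ("q", "p"): ("p1", Call("q", 0)),               # delete the second subtree
    ("q", "a"): ("a",),
    ("q", "b"): ("b",),
}

source = ("f", ("g", ("a",)), ("p", ("a",), ("b",)))
result = run(RULES, "q", source)
# result == ("f", ("p1", ("a",)), ("h", ("a",), ("a",)))
```

Even in this small example the output subtree under `h` depends on an input path that has been swapped to the other side of the root, which hints at why a learner has to track how input paths relate to output paths.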


      Published in

      PODS '10: Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
      June 2010
      350 pages
      ISBN:9781450300339
      DOI:10.1145/1807085

      Copyright © 2010 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 476 of 1,835 submissions, 26%
