ABSTRACT
A generalization from strings to trees and from languages to translations is given of the classical result that any regular language can be learned from examples: it is shown that for any deterministic top-down tree transformation there exists a sample set of polynomial size (with respect to the minimal transducer) from which the translation can be inferred. Until now, similar results were known only for string transducers and for simple relabeling tree transducers. Learning deterministic top-down tree transducers (dtops) is far more involved because a dtop can copy, delete, and permute its input subtrees. Thus, the algorithm must maintain complex dependencies between labeled input paths and output paths. First, a Myhill-Nerode theorem is presented for dtops, which is interesting in its own right. This theorem is then used to construct a learning algorithm for dtops. Finally, it is shown how our result can be applied to XML transformations (e.g. XSLT programs). For this, a new DTD-based encoding of unranked trees by ranked ones is presented. Over such encodings, dtops can realize many practically interesting XML transformations which cannot be realized on first-child/next-sibling encodings.
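To make the remark about copying, deleting, and permuting concrete, here is a minimal sketch of a dtop interpreter. It is an illustration only, not the paper's formalism or learning algorithm: the tuple representation of trees, the names `Call`, `run`, `instantiate`, and `RULES`, and the use of a single initial state in place of an axiom are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """Leaf of a rule's right-hand side: process input child `index` in `state`."""
    state: str
    index: int


def run(rules, state, tree):
    """Apply the transducer to `tree`, starting in `state`.

    Trees are tuples (label, child_1, ..., child_k). Determinism means there is
    at most one rule per (state, label) pair, so a dict lookup suffices.
    """
    label, *children = tree
    rhs = rules[(state, label)]
    return instantiate(rules, rhs, children)


def instantiate(rules, rhs, children):
    """Build the output tree for one rule by resolving its state calls."""
    if isinstance(rhs, Call):
        return run(rules, rhs.state, children[rhs.index])
    label, *subterms = rhs
    return (label, *(instantiate(rules, s, children) for s in subterms))


# Hypothetical rules over input symbols f, d (binary) and a, b (constants).
RULES = {
    # q(f(x1, x2)) -> g(q(x2), q(x1), q(x1)): permutes x1 and x2, copies x1.
    ("q", "f"): ("g", Call("q", 1), Call("q", 0), Call("q", 0)),
    # q(d(x1, x2)) -> q(x1): deletes the subtree x2.
    ("q", "d"): Call("q", 0),
    ("q", "a"): ("a",),
    ("q", "b"): ("b",),
}

source = ("f", ("d", ("a",), ("b",)), ("b",))
result = run(RULES, "q", source)
print(result)
```

In this sketch a single rule for `f` already reorders and duplicates its arguments, which hints at why the correspondence between input and output paths is harder to track than in the string case.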
A learning algorithm for top-down XML transformations