Ordered labeled trees are trees in which the sibling order matters. This paper presents algorithms for three problems having to do with approximate matching for such trees with variable length don′t cares (VLDCs). In strings, a VLDC symbol in the pattern may substitute for zero or more symbols in the data string. For example, if “com*er” is the pattern, then the “*” would substitute for the substring “put” when matching the data string “computer.” Approximate VLDC matching in strings means that after the best possible substitution, the pattern still need not be the same as the data string for a match to be allowed. For example, “com*er” matches “counter” within distance one (representing the cost of removing the “m” from “com*er” and having the “*” substitute for “unt”). We generalize approximate VLDC string matching to three algorithms for approximate VLDC matching on trees. The time complexity of our algorithms is O(|P| × |D| × min(depth(P), leaves(P)) × min(depth(D), leaves(D))) (where |P| and |D| are the number of nodes respectively of the pattern P and the data tree D), the same as for the best approximate tree matching algorithm without VLDCs previously reported.
All Science Journal Classification (ASJC) codes
- Control and Optimization
- Computational Mathematics
- Computational Theory and Mathematics