xylia.sax.contentmodelparser
Class ConvertToDFA

java.lang.Object
  |
  +--xylia.sax.contentmodelparser.ConvertToDFA

public class ConvertToDFA
extends java.lang.Object

Converts a regular expression to a Deterministe Finite Automaton (DFA). The alogrithm used is that presented in Aho, Sethi, and Ullman's second edition of Compilers: Principles, Techniques, and Tools, in section 3.9. The automaton is not minimized.

The algorithm first converts a regular expression r to r# where # is a unique symbol. A position is a node in r# that represents a symbol. In XML terms, the symbols are element names or the special symbol #PCDATA. Then the following functions are computed for each node of r#.

Next we compute the following function on the postions of r#

Now followpos provides the structure of an NFA. We do a subset construction on followpos to compute the states and transitions of a DFA as follows (symbols is the set of all symbols excluding #).
     const start := firstpos(r#)
     states := {start}
     var unmarked := {start}
     while unmarked not= {} 
         const T :in unmarked
         unmarked := unmarked \ {T}
         for a in symbols 
             const U := Union( p in T . {q in followpos(p) | q matches a})
             if U not= {}
                 states := states union {U}
                 delta(T,a) := U
     final := {S in states | S contains a position matching #} 
 

Now the DFA is (states, symbols, delta, start, final).

However the above construction creates states that are sets. For efficiency, this class creates DFAs with Integers as states.

Copyright: Copyright (c) 2002 Theodore S. Norvell

Company: MUN

Version:
1.0
Author:
Theodore S. Norvell

Constructor Summary
ConvertToDFA()
           
 
Method Summary
static FiniteStateAutomaton convertToDFA(RegExp r)
          Implements the algorithm in of Aho Sethi and Ullman section 3.9 converting a regular expression (unaugmented) to a dfa (not minimized).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ConvertToDFA

public ConvertToDFA()
Method Detail

convertToDFA

public static FiniteStateAutomaton convertToDFA(RegExp r)
Implements the algorithm in of Aho Sethi and Ullman section 3.9 converting a regular expression (unaugmented) to a dfa (not minimized).
Parameters:
r - a regular expression. r must be a tree (no shared nodes).