4.3. Tree reduction
We reduce the problem of graph pattern matching to matching on rooted trees – as we will see in section 4.6, a much simpler problem to solve. The map between graphs and (rooted) trees is given by rooted dual trees. Call tree-like if is connected and the underlying undirected graph of is acyclic.
Let be a tree-like graph with operations . Then given a root operation , the rooted dual tree of rooted at , written is the tree given by
- the nodes of the tree are the operations of ,
- the parent and children of are the operations that share a value with in ; the parent is the unique operation on the path from to ,
- the children of an operation are ordered according to the port labels.
Unlike graphs, tree nodes are identified uniquely by their path from the root. Trees isomorphic as graphs with identical root are thus considered equal.
A tree reduction using path splitting #
To reduce a graph to a tree using the rooted dual tree construction, it suffices to reduce to a tree-like graph. The following result shows that this can always be achieved by repeatedly applying operation splitting transformations.
A tree-like graph can be obtained from any connected graph by applying operation splittings. The resulting graph is a path-split graph (PSG) of .
Proof
Consider the undirected simple graph , where vertices are linear paths, and there is an edge between two linear paths for every operation that belongs to both paths. We call the interface graph of .
Splitting an operation in a graph corresponds to removing the corresponding edge in . On the other hand, the underlying undirected graph of has a cycle if and only if there is a cycle in . Indeed, a cycle in cannot belong to a single linear path in , by acyclicity of minIR graphs. There is, therefore, a cycle of operations that span multiple linear paths, thus forming a cycle in .
Hence, the operations to be split to turn into a tree-like graph are given by the set of edges in that must be removed to obtain a spanning tree of 1
As we consider typed graph, the splitting of an operation is unique; however the choice of spanning tree of is not unique, and thus multiple PSGs exist for a given graph .
If is a PSG of some graph , then call an operation of an anchor operation if it is on two linear paths and it is not split in . The set of all anchors operations fully determines the path-split graph. We write for the PSG of obtained by anchors .
Proof
We have assumed in section 4.2 that every operation in is on at most two linear paths and thus can be connected to at most four values. Each value is linear and hence connected to at most one other operation. It results that every operation in has at most four neighbouring operations – one parent and three children. A tree leaf can be chosen as the root operation to ensure the root does not have four children.
We can make the path splitting transformation reversible by separately storing the set of split operations in that correspond to a single operation in . As every operation of can get split in at most two split operations, we can store the pairs of split operations in that correspond to an operation in in a partial map that defines weights for (a subset of) the operations of :
This maps a split operation to the unique undirected path in from to the other half of the split operation.
This defines a map , the inverse of the path splitting transformation .
Contracted path-split graphs #
We can further simplify the structure of the data of a PSG by contracting all operations of that are on a single linear path. The result is the contracted path-split graph (cPSG) of , written .
We employ a similar trick as above to make this transformation reversible, this time by introducing weights on the values of that store the string of operations that were contracted2
where are the values of and are the optypes of operations in , i.e. the optypes of the minIR graph along with the optypes of the split operations. This defines a second map that is the inverse of the path-split graph contraction transformation . In summary, we have the composition
Contracted PSGs are particularly useful for the study of the asymptotic complexity of the pattern matching algorithm we propose, as they have a very regular structure. This is expressed by the following proposition that further extends the statement of Proposition 4.3:
Proof
That the tree is ternary follows from Proposition 4.3. Every node of the tree corresponds to an operation in , which is on exactly two linear paths. As a result of acyclicity of the tree, a tree of nodes spans linear paths – and hence, we conclude .
We conclude the construction presented in this section with the following result, expressing graph pattern matching in terms of tree equality:
Let be a pattern graph and a graph. Let be a PSG of . There is an embedding if and only if there is and a PSG of such that
and the trees have equal weight maps and .
The proof of this follows directly from our construction, the unicity of trees under isomorphism and the bijection between the graphs and their cPSGs.
We have thus successfully reduced the problem of pattern matching to the problem of matching on trees. Given that the ordering of children of a node in a tree is fixed, checking trees for equality is a simple matter of checking node and weight equality, one node (and edge) at a time.
We conclude this section with a figure summarising the constructions we have presented.
A graph , along with the path-split graph , the contracted path-split graph and their rooted dual trees. The anchor operations are (grey) and (red). The root of the rooted dual trees is .