
Chapter 4

Pattern Matching in Large Graph Transformation Systems

To our knowledge, the first practical proposal for a GTS-based quantum compiler was presented by Xu et al. (2022) (Mingkuan Xu, Zikun Li, Oded Padon, Sina Lin, Jessica Pointing, Auguste Hirth, Henry Ma, Jens Palsberg, Alex Aiken, Umut A. Acar and Zhihao Jia. 2022. Quartz: Superoptimization of Quantum Circuits. In Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, June 2022. Association for Computing Machinery, 625--640. doi: 10.1145/3519939.3523433) and then refined by Xu et al. (2023) (Amanda Xu, Abtin Molavi, Lauren Pick, Swamit Tannu and Aws Albarghouthi. 2023. Synthesizing Quantum-Circuit Optimizers. Proceedings of the ACM on Programming Languages 7, PLDI (June 2023), 835--859. doi: 10.1145/3591254). In both works, the set of possible graph transformations is obtained by exhaustive enumeration. Using SAT solvers and fingerprinting techniques, the set of all small programs up to a certain size can be generated ahead of time and clustered into disjoint partitions of equivalent programs. This concisely expresses every possible peephole optimisation up to the specified size: for every small enough subset of operations of an input program, its equivalence class can be determined. Any replacement of that set of operations with another program in the same equivalence class is a valid transformation and, thus, a potential peephole optimisation. Transformation systems on minIR graphs based on equivalence classes were formalised in section 3.4.

First results of this approach are promising. Xu et al. (2022) demonstrated that optimisation performance improves markedly with larger sets of transformation rules. Such workloads, however, rely heavily on pattern matching, the computational task that identifies subgraphs on which transformation rules apply. In both Xu et al. (2022) and Xu et al. (2023), pattern matching is carried out separately for each pattern, which becomes a significant bottleneck for large rule sets. In Xu et al. (2022), performance peaks at around 50,000 transformation rules, after which the additional overhead from pattern matching becomes dominant and the compilation results deteriorate.

In this chapter, we solve these scaling difficulties by presenting an algorithm for pattern matching on minIR graphs that uses a pre-computed data structure to return all pattern matches in a single query. The set of transformation rules is directly encoded in this data structure. After a one-time cost for construction, pattern-matching queries can be answered in running time independent of the number of rules in the transformation system.

The asymptotic complexity results presented in this chapter depend on some simplifying assumptions about the properties that the pattern graphs and embeddings must satisfy. These assumptions restrict the generality of minIR graphs, but we have not found them to limit the usefulness of the result in practice. As discussed in section 4.7, none of them is required by the implementation. We have not observed any impact on performance when the imposed constraints are lifted, so we conjecture that at least some of these assumptions can be relaxed and our results generalised.

After a discussion of related work in section 4.1, Section 4.2 presents the assumptions that we are making in detail, along with some relevant definitions for the rest of the chapter. Sections 4.3, 4.4, and 4.5 present the core ideas of our approach, respectively introducing: a reduction of minIR graphs to equivalent trees, a canonical construction for the tree reduction and an efficient way to enumerate all possible subtrees of a graph. We also prove bounds on the size and number of the resulting trees.

In section 4.6, we introduce a pre-computation step and show that the pattern-matching problem reduced to tree structures can be solved using a prefix tree-like automaton that is fixed and pre-computed for a given set of patterns. Combining the automaton construction with bounds from section 4.5 leads to the final result. We conclude in section 4.7 with benchmarks on a real-world dataset of 10000 quantum circuits, obtaining a 20x speedup over a leading C++ implementation of pattern matching for quantum circuits.

4.1. Related work

Our proposed solution can be seen as a specialisation of RETE networks (Charles L. Forgy. 1982. Rete: A fast algorithm for the many pattern/many object pattern match problem. Artificial Intelligence 19, 1 (September 1982), 17--37. doi: 10.1016/0004-3702(82)90020-0; Gergely Varró and Frederik Deckwerth. 2013. A Rete Network Construction Algorithm for Incremental Pattern Matching) and their derivatives (Ian Wright and James A. R. Marshall. 2003. The execution kernel of RC++: RETE*, a faster RETE with TREAT as a special case. International Journal of Intelligent Games and Simulation 2, 1 (February 2003), 36--48; Dylan Armstrong. 2014. Memory Efficient Stream Reasoning on Resource-Limited Devices; D. P. Miranker. 1987. TREAT: a new and efficient match algorithm for AI production systems) to the case of graph pattern matching. The additional structure obtained from restricting our considerations to graphs results in a simplified network design that allows us to derive worst-case asymptotic runtime and space bounds that are polynomial in the parameters relevant to our use case1 – overcoming a key limitation of RETE.

Another well-studied application of large-scale pattern matching is in the context of stochastic biomolecular simulations (Michael W Sneddon, James R Faeder and Thierry Emonet. 2010. Efficient modeling, simulation and coarse-graining of biological complexity with NFsim. Nature Methods 8, 2 (December 2010), 177--183. doi: 10.1038/nmeth.1546; John A Bachman and Peter Sorger. 2011. New approaches to modeling complex biochemistry. Nature Methods 8, 2 (January 2011), 130--131. doi: 10.1038/nmeth0211-130), particularly the Kappa project (Vincent Danos and Cosimo Laneve. 2004. Formal molecular biology. Theoretical Computer Science 325, 1 (September 2004), 69--110. doi: 10.1016/j.tcs.2004.03.065). Stochastic simulations depend on performing many rounds of fast pattern matching for continuous Monte Carlo simulations (Jin Yang, Michael I. Monine, James R. Faeder and William S. Hlavacek. 2008. Kinetic Monte Carlo method for rule-based modeling of biochemical networks. Physical Review E 78, 3 (September 2008), 031910. doi: 10.1103/physreve.78.031910). However, unlike our use case, the procedure typically does not need to scale well to a large number of patterns. In Danos et al. (2007) (Vincent Danos, Jérôme Feret, Walter Fontana and Jean Krivine. 2007. Scalable Simulation of Cellular Signaling Networks), a pre-computation step was introduced to accelerate matching by establishing relations between patterns that activate or inhibit further patterns. This idea was later expanded upon and formalised in categorical language by Boutillier et al. (2017) (Pierre Boutillier, Thomas Ehrhard and Jean Krivine. 2017. Incremental Update for Graph Rewriting). The ideas presented there are similar to ours; their formalism has the advantage of being more general, but it does not come with asymptotic complexity bounds and suffers from similar worst-case complexities as RETE.

A similar problem has also been studied in the context of multiple-query optimisation for database queries (Timos K. Sellis. 1988. Multiple-query optimization. ACM Transactions on Database Systems 13, 1 (March 1988), 23--52. doi: 10.1145/42201.42203; Xuguang Ren and Junhu Wang. 2016. Multi-Query Optimization for Subgraph Isomorphism Search. Proceedings of the VLDB Endowment 10, 3 (November 2016), 121--132. doi: 10.14778/3021924.3021929), but that line of work has limited itself to developing caching strategies and search heuristics for specific use cases. Finally, using a pre-compiled data structure for pattern matching was already proposed in Messmer and Bunke (1999) (Bruno T. Messmer and Horst Bunke. 1999. A decision tree approach to graph and subgraph isomorphism detection. Pattern Recognition 32, 12 (December 1999), 1979--1998. doi: 10.1016/S0031-3203(98)90142-X). However, with a $n^{\Theta(m)}$ space complexity – where $n$ is the input size and $m$ the pattern size – it does not scale to large input graphs, even for small patterns.


  1. RETE networks have been shown to have exponential worst-case space (and thus time) complexity (Abdur Rakib and Ijaz Uddin. 2018. An Efficient Rule-Based Distributed Reasoning Framework for Resource-bounded Systems. Mobile Networks and Applications 24, 1 (October 2018), 82--99. doi: 10.1007/s11036-018-1140-x), although performance in practical use cases can vary widely (Ijaz Uddin, Hafiz Mahfooz Ul Haque, Abdur Rakib and Mohamad Rafi Segi Rahmat. 2016. Resource-Bounded Context-Aware Applications: A Survey and Early Experiment). ↩︎

4.2. Preliminaries and simplifying assumptions

For simplicity, we will throughout consider minIR graphs that admit a type system $\Sigma$, though most of the results can also be adapted to other graph domains. We will write $T$ for the types of $\Sigma$ (i.e. its values) and $\Gamma$ for the operation types (i.e. its edges).

Linear paths and operation splitting

An operation type $\nu \in \Gamma$ in the type system $\Sigma$ is a hyperedge. Its endpoints

$$\textit{src}(\nu) = \textit{src}_1(\nu) \cdots \textit{src}_s(\nu) \quad \textrm{and} \quad \textit{tgt}(\nu) = \textit{tgt}_1(\nu) \cdots \textit{tgt}_t(\nu)$$

are strings of data types that define the input and output signature of the operation $\nu$. We can refer to the set of all hyperedge endpoints of $\nu$ using the string indices $\mathrm{Idx}(\cdot)$ ($\sqcup$ denotes the disjoint set union):

$$P_\nu = \mathrm{Idx}(\textit{src}(\nu)) \sqcup \mathrm{Idx}(\textit{tgt}(\nu)) = \{1, \dots, s\} \sqcup \{1, \dots, t\}.$$

Fix a partition of $P_\nu$ into disjoint pairs

$$P_\nu = \{p_1, p_1'\} \,\sqcup\, \{p_2, p_2'\} \,\sqcup\, \cdots,$$

where the last set of the partition may be a singleton if $|P_\nu|$ is odd. For every $\nu \in \Gamma$, we then define $f = \lceil |P_\nu| / 2 \rceil$ new split operation types $\nu_1, \dots, \nu_f$, each with at most two endpoints: the $i$-th operation type $\nu_i$ has endpoints $p_i$ and $p_i'$ in $P_\nu$. For every operation $o \in E$ of type $\nu$, we can then split $o$ into $f$ operations $o_1, \dots, o_f$, each of arity 1 or 2 and of types $\nu_1, \dots, \nu_f$ respectively. We will refer to the graph transformation that replaces an operation $o$ in a minIR graph with the operations $o_i$ for $1 \leqslant i \leqslant f$ as operation splitting.
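
To make the construction concrete, the short Python sketch below splits a single operation into parts of arity at most 2, pairing the $i$-th source with the $i$-th target (the convention used in the figure below); the `Op` record and the string value ids are hypothetical stand-ins, not the minIR data structures used in the rest of the chapter.

```python
from dataclasses import dataclass

@dataclass
class Op:
    """Hypothetical operation record: a name plus source / target value ids."""
    name: str
    srcs: list[str]
    tgts: list[str]

def split_operation(op: Op) -> list[Op]:
    """Split `op` into f = ceil((s + t) / 2) operations of arity at most 2,
    pairing the i-th source with the i-th target; leftover endpoints
    (when s != t) are paired among themselves."""
    s, t = op.srcs, op.tgts
    k = min(len(s), len(t))
    parts = [Op(f"{op.name}_{i}", [s[i]], [t[i]]) for i in range(k)]
    rest, rest_is_src = (s[k:], True) if len(s) > len(t) else (t[k:], False)
    for j in range(0, len(rest), 2):
        chunk = rest[j : j + 2]
        parts.append(Op(f"{op.name}_{k + j // 2}",
                        chunk if rest_is_src else [],
                        [] if rest_is_src else chunk))
    return parts

# An operation with 3 sources and 2 targets splits into ceil(5 / 2) = 3 parts.
example = Op("g", srcs=["v1", "v2", "v3"], tgts=["w1", "w2"])
assert len(split_operation(example)) == 3
```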

It is important to note that the splitting of an operation $o$ is unique and determined by the type of $o$, and thus invariant under (typed) morphisms: there is a morphism $P \to G$ of a pattern into a graph if and only if there is a morphism $P' \to G'$ from the split pattern $P'$ into the split graph $G'$.

Operation splitting

A transformation rule splitting an operation with 3 sources and 2 targets. The choice of endpoint partition made here, obtained by pairing the $i$-th use with the $i$-th define, is arbitrary but convenient for quantum gates, as the paired endpoints correspond to the input and output values of the same qubit.

The endpoint partitions $P_\nu$ also define linear paths. Two values $v, v'$ in a minIR graph are on the same linear path if there are values $u_1, \dots, u_k$ with $v = u_1$ and $v' = u_k$ such that each $u_i$ is connected to $u_{i+1}$ through an operation $o$ and the two values correspond to the same pair of endpoints in the endpoint partition (i.e. the indices of $P_{type(o)}$ that correspond to the values $u_i$ and $u_{i+1}$ in $\textit{src}(o) \sqcup \textit{tgt}(o)$ form one of the pairs $\{p_j, p_j'\}$).

Linearity assumption and rigidity

Recall that in Definition ., $V_\textit{src}$ and $V_\textit{tgt}$ refer to the subsets of values that are within the domain of definition of $\textit{src}$ and $\textit{tgt}$ respectively. For this chapter, we will assume $V_\textit{src} = V_\textit{tgt} = V$; in other words, minIR graphs are IO-free (this is w.l.o.g., see the discussion after Proposition .) and all values are linear1 (this is definitely not w.l.o.g.!).

As a result of this assumption, the subcategory of minIR graphs that we consider forms a rigid category, as introduced by Danos et al. (Vincent Danos, Reiko Heckel and Pawel Sobocinski. 2014. Transformation and Refinement of Rigid Structures). The definition, which we reproduce here, is given in terms of morphisms that intersect all components of the codomain. We refer to Danos et al. (2014) for the precise definition of that notion: in the context of linear-valued minIR graphs, it is equivalent to requiring that the image of the graph homomorphism intersects every connected component of the codomain.

Definition 4.1 (Rigid Category)

A category $\mathbb C$ is rigid if for all morphisms $A \xrightarrow{h} B$ in $\mathbb C$ that intersect all components of $B$ and for all $A \xrightarrow{g} C$ that factorise as $g = f \circ h$, the morphism $B \xrightarrow{f} C$ is unique.

In other words, there is a unique way to extend a morphism $A \xrightarrow{g} C$ to a morphism $B \xrightarrow{f} C$, if one exists. If we interpret $A$ and $B$ as graph patterns that we are interested in matching, with $A \subseteq B$, then rigidity guarantees that there is (at most) one way to extend a match morphism $A \to G$ into a match morphism $B \to G$ on the larger pattern.

The linearity assumption also has other useful consequences. Every linear value has exactly one use and one definition. As a result, all linear paths are disjoint and form a partition of the values of the graph. They correspond to the paths that form the connected components of the fully split graph, i.e. the graph obtained by splitting every operation. We call the number of linear paths (and hence the number of connected components of the fully split graph) the circuit width, written $width(G)$. We also use the linear path decomposition to define the circuit depth, written $depth(G)$, as the length of the longest linear path in $G$.
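
As an illustration only, the following sketch computes the linear-path decomposition, width and depth of a fully split graph; the representation (values as strings, one endpoint pair per split operation) is an assumption made for the example, and depth is measured here as the number of values per path.

```python
from collections import defaultdict

def linear_paths(values: list[str], endpoint_pairs: list[tuple[str, str]]) -> list[set[str]]:
    """Linear paths as the connected components of the fully split graph:
    each (v, v') pair comes from a split operation whose two endpoints
    lie, by definition, on the same linear path (simple union-find)."""
    parent = {v: v for v in values}
    def find(v: str) -> str:
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path compression
            v = parent[v]
        return v
    for a, b in endpoint_pairs:
        parent[find(a)] = find(b)
    paths = defaultdict(set)
    for v in values:
        paths[find(v)].add(v)
    return list(paths.values())

def width(values, endpoint_pairs):
    return len(linear_paths(values, endpoint_pairs))

def depth(values, endpoint_pairs):
    # depth measured here as the number of values on the longest linear path
    return max(len(p) for p in linear_paths(values, endpoint_pairs))

# Two split operations pair (a, c) and (c, e) on one line, one pairs (b, d)
# on another: the linear paths are {a, c, e} and {b, d}.
vals = ["a", "b", "c", "d", "e"]
pairs = [("a", "c"), ("c", "e"), ("b", "d")]
assert width(vals, pairs) == 2 and depth(vals, pairs) == 3
```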

As discussed in section 3.4, minIR rewrites are instantiated from transformation rules by minIR match morphisms $m: P \to G$. Restricting our considerations to linear-valued minIR graphs has the further implication that all such match morphisms $m$ are injective. We call $m$ an embedding and write it using Greek letters and a hooked arrow

$$m = \varphi: P \hookrightarrow G.$$

Finding such embeddings $P \hookrightarrow G$ is the pattern matching problem that we are solving. This problem is equivalent to finding minIR subgraphs $H \subseteq G$ of $G$ such that $H$ is isomorphic to the pattern, $P \simeq H$.

Convexity

According to Proposition ., a necessary condition for a subgraph HH to define a valid minIR rewrite is convexity. In this chapter we weaken this requirement and propose a condition based on circuit width:

Proposition 4.1 (Necessary condition for convexity)

Let $\varphi: P \hookrightarrow G$ be an embedding of a pattern $P$ into a linear-valued minIR graph $G$ such that $\varphi(P)$ is a convex subgraph of $G$. Then for every subgraph $H \subseteq G$ such that $\varphi(P) \subseteq H$, it holds that $width(P) \leq width(H)$.

Proof

Up to isomorphism, we can assume $P \subseteq G$. Suppose there is $H \subseteq G$ such that $P \subseteq H$ and $width(P) > width(H)$. Let $\mathcal{L}_P, \mathcal{L}_H \subseteq \mathcal{P}(V)$ be the partitions of $V_P$ and $V_H$ into sets of values that are on the same linear path of $P$ and $H$ respectively. It must hold that for all $\ell \in \mathcal{L}_P$ there is $\ell' \in \mathcal{L}_H$ such that $\ell \subseteq \ell'$, because $P \subseteq H$ and operation splitting is preserved under embeddings. As the resulting map from $\mathcal{L}_P$ to $\mathcal{L}_H$ cannot be injective, there must be $\ell_1, \ell_2 \in \mathcal{L}_P$ and $\ell' \in \mathcal{L}_H$ such that $\ell_1 \neq \ell_2$ and $\ell_1, \ell_2 \subseteq \ell'$. We conclude that there must be a path in the fully split graph of $H$ between a value of $\ell_1$ and a value of $\ell_2$ that is not in the fully split graph of $P$. Given that $P$ is convex, this path must be in $P$, which contradicts the preservation of operation splitting under embeddings.

In this chapter, whenever we define a subgraph $H \subseteq G$ of a graph $G$, we will assume that $H$ satisfies the above weakened convexity condition.

The converse of Proposition ., however, is not true. The pattern-matching technique presented below will find a strict superset of convex embeddings. To restrict considerations to convex embeddings, it suffices to filter out the non-convex ones in a post-processing step.

Ignoring the minIR hierarchy

So far, we have omitted discussing one part of the minIR structure: the nested hierarchy of operations. Syntactically, the hierarchy formed by $parent$ relations between minIR operations can be viewed as just another value type that operations are incident to: parent operations define an additional output that child operations consume as an additional input. Because of the bijectivity requirement of minIR morphisms on parent-child relations in Definition ., these parent-child relations behave, in fact, like linear values – and hence do not violate the linearity assumption we have imposed.

However, by treating them as such, we have further weakened the constraints on pattern embeddings. We do not enforce that boundary values must be in the same regions or that parent-child relations cannot be boundary values. Similarly to convexity, we defer checking these properties to a post-processing step.

Further assumptions (harmless)

We will further simplify the problem by making presentation choices that do not imply any loss of generality. First of all, we assume that all patterns have the same width $w$ and depth $d$, are connected graphs and have at least 2 operations. These conditions can always be fulfilled by adding “dummy” operations if necessary. Embeddings of disconnected patterns can be computed one connected component at a time.

We will further assume that all operations are on at most two linear paths (and thus, in particular, have at most 4 endpoints). Operations on $\Delta > 2$ linear paths can always be broken up into a composition of $\Delta - 1$ operations, each on two linear paths, as follows:

Gate decomposition

Expressing an operation on $\Delta = 3$ linear paths as a composition of two operations on 2 linear paths.

This transformation leaves the circuit width unchanged but may multiply the graph depth by up to a factor $\Delta$.

We furthermore define the set of all port labels

$$P_{all} = \bigcup_{\nu \in \Gamma} P_\nu$$

so that we can associate every operation endpoint in a minIR graph $G$ with a port label from the set $P_{all}$. We further endow the labels $P_{all}$ with a total order (for instance, based on the string index values). The total order on $P_{all}$ then induces a total order on the paths $v_1 \cdots v_k \in V^\ast$ in $G$ that start in the same value $v_1$: the paths are equivalently described by the sequence of port labels of the operations traversed. These form strings in $P_{all}^\ast$, which we order lexicographically. Given a root value $r$, for every value $v$ in $G$ there is thus a unique smallest path from $r$ to $v$ in $G$2. This path is invariant under isomorphism of the underlying graph (i.e. relabelling of the values and operations while preserving the port labels). With this we conclude the discussion of the specificities of minIR graphs related to typing, linearity and hierarchy, and the related assumptions that we are making.
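
The sketch below, assuming a simplified adjacency representation in which each operation lists its neighbouring operations keyed by a single port label, illustrates how the smallest port-label paths can be computed: a depth-first traversal that visits neighbours in port-label order discovers every operation via its lexicographically smallest path, which is also how CanonicalPathSplit obtains its sorted operation order in section 4.4.

```python
def smallest_paths(adj: dict[str, list[tuple[int, str]]], root: str) -> dict[str, tuple[int, ...]]:
    """For every operation reachable from `root`, return the lexicographically
    smallest path from `root`, encoded as a tuple of port labels.
    `adj[op]` lists (port_label, neighbour) pairs, here directed away from the root."""
    paths: dict[str, tuple[int, ...]] = {root: ()}
    def dfs(op: str) -> None:
        for label, nxt in sorted(adj[op]):      # visit neighbours in port-label order
            if nxt not in paths:                # first DFS visit = smallest path
                paths[nxt] = paths[op] + (label,)
                dfs(nxt)
    dfs(root)
    return paths

# root reaches y both directly (label 1) and through x (labels 0, 0);
# the path (0, 0) is lexicographically smaller and is the one recorded.
adj = {"root": [(0, "x"), (1, "y")], "x": [(0, "y")], "y": []}
assert smallest_paths(adj, "root") == {"root": (), "x": (0,), "y": (0, 0)}
```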

To summarise, the minIR graphs considered in this chapter are hypergraphs (Definition .) that satisfy the following properties:

  • every vertex (value) is incident to exactly two hyperedges (operations): it is the target of one hyperedge (its definition) and the source of another one (its use),
  • every hyperedge is incident to at most four vertices,
  • every hyperedge can be split in a unique way (invariant under isomorphism) into at most two split operations, each with at most two endpoints.

When modelling subgraphs of IO-free minIR graphs (typically patterns for pattern matching), some hyperedge connections at the boundary of the subgraph will be missing. We say a value is open if its use or its definition is missing (i.e. it is a boundary value in a minIR subgraph).

We will simply refer to hypergraphs that satisfy the above assumptions as graphs. In the unique instance in this chapter where a graph that does not satisfy these assumptions is referred to, we will specifically call it a simple graph.

We conclude with the following notable bound on circuit width.

Proposition 4.2 (Bound on circuit width)

Let $G$ be a graph with $n_\textrm{odd}$ operations of odd arity (i.e. $|\textit{def}(o) + \textit{use}(o)|$ is odd) and $n_\omega$ open values. Then, the circuit width of $G$ is

$$width(G) = \lfloor (n_\textrm{odd} + n_\omega) / 2 \rfloor.$$

Proof

For any linear path $P \subseteq V^\ast$ in $G$, consider its two ends $v_1$ and $v_2$, i.e. the two values in $P$ with only one neighbouring value in $P$ (by definition, linear paths cannot be empty). In the fully split graph of $G$, these values are either open or connected to two operations. In the latter case, at least one of the two operations must have a single endpoint (otherwise, by acyclicity, the end value would have two distinct neighbouring values on the path).

In a fully split graph, operations with a single endpoint result from a split operation with an odd number of endpoints. We conclude that for every linear path, there are either two operations with an odd number of endpoints in $G$, or one such operation and one open value, or two open values. The result follows.
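
For a quick numerical sanity check of the formula (the gate examples are illustrative assumptions, not taken from the text):

```python
def width_from_counts(n_odd: int, n_open: int) -> int:
    """Circuit width as given by Proposition 4.2: floor((n_odd + n_open) / 2)."""
    return (n_odd + n_open) // 2

# A single 2-input / 2-output gate: even arity, 4 open values -> width 2.
assert width_from_counts(0, 4) == 2
# A single 1-input / 2-output operation: odd arity, 3 open values -> width 2.
assert width_from_counts(1, 3) == 2
# A chain of 1-input / 1-output gates on one linear path: 2 open values -> width 1.
assert width_from_counts(0, 2) == 1
```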


  1. This restriction is necessary for our results: copyable values may admit an arbitrary number of adjacent hyperedges. As a result, minIR graph pattern matching with copyable values is a generalisation of the subgraph isomorphism problem, a well-known NP-complete problem (Stephen A. Cook. 1971. The complexity of theorem-proving procedures. In Proceedings of the third annual ACM symposium on Theory of computing - STOC ’71. ACM Press, 151--158. doi: 10.1145/800157.805047). The approach generalises to non-linear types, but the complexity analysis no longer holds (we pay a computational price for every non-linear value matched). ↩︎

  2. Remark that the ordering of the operations thus defined is a particular case of a depth-first search (DFS) ordering of the graph: given an operation $o$ that has been visited, all its descendants will be visited before proceeding to any other operation. ↩︎

4.3. Tree reduction

We reduce the problem of graph pattern matching to matching on rooted trees – as we will see in section 4.6, a much simpler problem to solve. The map between graphs and (rooted) trees is given by rooted dual trees. Call $G$ tree-like if $G$ is connected and the underlying undirected graph $G_U$ of $G$ is acyclic.

Definition 4.2 (Rooted dual tree)

Let $G$ be a tree-like graph with operations $O$. Then given a root operation $r \in O$, the rooted dual tree of $G$ rooted at $r$, written $\tau_r(G)$, is the tree given by:

  • the nodes of the tree are the operations $O$ of $G$,
  • the parent and children of $o \in O$ are the operations that share a value with $o$ in $G$; the parent is the unique such operation on the path from $o$ to $r$,
  • the children of an operation are ordered according to the port labels.

Unlike graphs, tree nodes are identified uniquely by their path from the root. Trees that are isomorphic as graphs and share the same root are thus considered equal.
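
A minimal sketch of Definition 4.2, under the assumption of a toy adjacency representation where each operation lists its neighbouring operations keyed by the port label of the shared value; children are emitted in port-label order and the parent is the neighbour through which a node was reached from the root.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TreeNode:
    op: str
    children: list["TreeNode"] = field(default_factory=list)

def rooted_dual_tree(adj: dict[str, list[tuple[int, str]]], root: str) -> TreeNode:
    """Rooted dual tree of a tree-like graph: nodes are operations, the parent of a
    node is the neighbour through which it was reached from the root, and children
    are listed in port-label order. `adj[op]` lists (port_label, neighbour) pairs."""
    def build(op: str, parent: Optional[str]) -> TreeNode:
        node = TreeNode(op)
        for _label, nxt in sorted(adj[op]):
            if nxt != parent:
                node.children.append(build(nxt, op))
        return node
    return build(root, None)

# A three-operation chain d - e - f, rooted at e: the children of e are d, then f.
adj = {"d": [(0, "e")], "e": [(0, "d"), (1, "f")], "f": [(0, "e")]}
assert [c.op for c in rooted_dual_tree(adj, "e").children] == ["d", "f"]
```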

A tree reduction using path splitting

To reduce a graph $G$ to a tree using the rooted dual tree construction, it suffices to reduce $G$ to a tree-like graph. The following result shows that this can always be achieved by repeatedly applying operation splitting transformations.

Proposition 4.6 (Path splitting)

A tree-like graph can be obtained from any connected graph $G$ by applying operation splittings. The resulting graph is a path-split graph (PSG) of $G$.

Proof

Consider the undirected simple graph $\mathcal{I}$ whose vertices are the linear paths of $G$ and which has an edge between two linear paths for every operation that belongs to both paths. We call $\mathcal{I}$ the interface graph of $G$.

Splitting an operation $o$ in a graph $G$ corresponds to removing the corresponding edge in $\mathcal{I}$. On the other hand, the underlying undirected graph $G_U$ of $G$ has a cycle if and only if there is a cycle in $\mathcal{I}$. Indeed, a cycle in $G_U$ cannot be contained in a single linear path of $G$, by acyclicity of minIR graphs. It must therefore traverse a cycle of operations that span multiple linear paths, thus forming a cycle in $\mathcal{I}$.

Hence, the operations to be split to turn $G$ into a tree-like graph are given by the set of edges $E^-$ in $\mathcal{I}$ that must be removed to obtain a spanning tree of $\mathcal{I}$.1

As we consider typed graphs, the splitting of an operation is unique; however, the choice of spanning tree of $\mathcal I$ is not, and thus multiple PSGs exist for a given graph $G$.

If $G'$ is a PSG of some graph $G$, call an operation $o$ of $G$ an anchor operation if it is on two linear paths and it is not split in $G'$. The set of all anchor operations $\pi \subseteq O$ fully determines the path-split graph. We write $G^\pi = G'$ for the PSG of $G$ obtained with anchors $\pi$.

Proposition 4.3 (Rooted dual trees are ternary)
Consider a PSG $G^\pi$ of a graph $G$. There is a root operation $r \in O$ such that the rooted dual tree $\tau_r(G^\pi)$ is a ternary tree, i.e. every node of $\tau_r(G^\pi)$ has at most three children.
Proof

We have assumed in section 4.2 that every operation in $G$ is on at most two linear paths and can thus be connected to at most four values. Each value is linear and hence connected to at most one other operation. It follows that every operation in $\tau_r(G^\pi)$ has at most four neighbouring operations – one parent and three children. A tree leaf can be chosen as the root operation to ensure that the root does not have four children.

We can make the path splitting transformation $G \to G^\pi$ reversible by separately storing the sets of split operations in $G^\pi$ that correspond to a single operation in $G$. As every operation of $G$ is split into at most two split operations, we can store the pairs of split operations in $G^\pi$ that correspond to an operation in $G$ in a partial map that defines weights for (a subset of) the operations $O^\pi$ of $G^\pi$:

$$split: O^\pi \rightharpoonup P^\ast.$$

This maps a split operation $o$ to the unique undirected path in $G^\pi$ from $o$ to the other half of the split operation.

This defines a map $\sigma_1: (G^\pi, split) \mapsto G$, the inverse of the path splitting transformation $G \to G^\pi$.

Contracted path-split graphs

We can further simplify the structure of a PSG by contracting all operations of $G^\pi$ that are on a single linear path. The result is the contracted path-split graph (cPSG) of $G$, written $c(G^\pi)$.

We employ a similar trick as above to make this transformation reversible, this time by introducing weights on the values of $c(G^\pi)$ that store the string of operations that were contracted2:

$$contract: V_C \rightharpoonup (\Gamma^\pi)^\ast$$

where $V_C$ are the values of $c(G^\pi)$ and $\Gamma^\pi$ are the optypes of operations in $G^\pi$, i.e. the optypes of the minIR graph $G$ along with the optypes of the split operations. This defines a second map $\sigma_2: (c(G^\pi), contract) \mapsto G^\pi$ that is the inverse of the path-split graph contraction transformation $c(\cdot)$. In summary, we have the composition

$$(c(G^\pi), contract, split) \xrightarrow{\sigma_2 \times Id} (G^\pi, split) \xrightarrow{\sigma_1} G.$$

Contracted PSGs are particularly useful for the study of the asymptotic complexity of the pattern matching algorithm we propose, as they have a very regular structure. This is expressed by the following proposition that further extends the statement of Proposition 4.3:

Proposition 4.4 (Contracted PSG)
Consider a PSG $G^\pi$ of a graph $G$. There is a root operation $r \in O$ such that the rooted dual tree of the contracted PSG, $\tau_r(c(G^\pi))$, is a ternary tree with $width(G) - 1$ nodes.
Proof

That the tree is ternary follows from Proposition 4.3. Every node of the tree corresponds to an operation in $c(G^\pi)$, which is on exactly two linear paths. As the tree is acyclic, a tree with $k$ nodes spans $k + 1$ linear paths – and hence we conclude $k = width(G) - 1$.

We conclude the construction presented in this section with the following result, expressing graph pattern matching in terms of tree equality:

Proposition 4.5 (Reduction to tree pattern matching)

Let $P$ be a pattern graph and $G$ a graph. Let $P^\pi$ be a PSG of $P$. There is an embedding $P \hookrightarrow G$ if and only if there is $H \subseteq G$ and a PSG $H^{\pi'}$ of $H$ such that

$$\tau(c(P^\pi)) = \tau(c(H^{\pi'}))$$

and the trees have equal weight maps $split$ and $contract$.

The proof follows directly from our construction, the unicity of trees under isomorphism and the bijection between the graphs $P, H$ and their cPSGs.

We have thus successfully reduced the problem of pattern matching to the problem of matching on trees. Given that the ordering of children of a node in a tree is fixed, checking trees for equality is a simple matter of checking node and weight equality, one node (and edge) at a time.

We conclude this section with a figure summarising the constructions we have presented.


A graph $G$, along with the path-split graph $G^\pi$, the contracted path-split graph $c(G^\pi)$ and their rooted dual trees. The anchor operations are $d$ (grey) and $e$ (red). The root of the rooted dual trees is $e$.


  1. It is a simple result from graph theory that such a set of edges always exists – it suffices to remove one edge from every cycle in the graph. ↩︎

  2. Because all contracted operations apply on a single, shared, linear path, they indeed form a string of operations. ↩︎

4.4. Canonicalising the tree reduction

The reduction of graph matching to ternary trees from the previous section is a big step towards an algorithm for graph matching. However, Proposition 4.5 is expressed in terms of existence of PSGs – it is as yet unclear how the trees can be constructed. This is the purpose of this section.

We introduce for this purpose a canonical (that is, invariant under isomorphism) choice of PSG $G^\pi$ of $G$. The result is a unique canonical transformation $G \mapsto G^\pi \mapsto c(G^\pi)$ from $G$ to a cPSG that we can use for pattern matching.

We proceed by using the total order that we have defined on port labels, which extends lexicographically to paths outgoing from a shared root operation (see section 4.2 for more details). Whenever more than one path from $r$ to $o$ exists in $G$, it suffices to consider the smallest one. For a choice of root operation $r \in O$, we thus obtain a total order on all operations $O$ in $G$.

We then restrict our attention to operations on two linear paths and consider them in this order. We keep track of the linear paths that have been visited and proceed as follows to determine whether $o \in O$ must be split:

  • if $o$ is on a linear path that has not been seen before, it is left unchanged and the set of visited linear paths is updated;
  • otherwise, i.e. if $o$ is on two linear paths that have both already been visited, the operation is split, resulting in two operations each on a single linear path.

The pseudocode CanonicalPathSplit implements this algorithm. We use Operations(G) to retrieve all the operations of the graph G and LinearPaths(G, op) to retrieve the linear paths of the operation op. The linear paths are identified using integer indices that can be pre-computed and stored in time linear in the graph size. SplitOperation(G, op) returns the graph resulting from splitting op into two operations on a single linear path. Finally, PathAsPortLabels(G, root, v) returns the string of port labels that encodes the path from root to v in the graph G. The strings are ordered lexicographically. The non-capitalised functions set, union, sort1, len, and issubset have their standard meanings.

 1def CanonicalPathSplit(G: Graph, root: Operation) -> Graph:
 2  new_G := G
 3  all_operations := Operations(G)
 4  sorted_operations := sort(
 5      all_operations,
 6      sort_key= lambda v: PathAsPortLabels(G, root, v)
 7  )
 8
 9  # keep track of the visited linear paths
10  seen_paths := set()
11  for op in sorted_operations:
12    # Get the (pre-computed) indices of the linear paths
13    op_linear_paths := LinearPaths(G, op)
14    if len(op_linear_paths) == 2:
15      if issubset(op_linear_paths, seen_paths):
16        # The two linear paths of `op` are already visited
17        new_G = SplitOperation(new_G, op)
18      else:
19        # Mark the new linear paths as visited
20        seen_paths = union(seen_paths, op_linear_paths)
21  return new_G

The following figure shows an example of splitting a graph into its canonical PSG using CanonicalPathSplit.


Splitting a graph into its canonical PSG. Ports are ordered counter-clockwise on each edge, and numbered according to the lexicographic order of the paths from root to the port, as returned by PathAsPortLabels. This induces an order on the hyperedges, reflected in the alphabetic order of the edge labels. Linear paths are formed by ports in a horizontal line (as marked by the dotted lines). Vertex root is chosen as the root of the canonical splitting. Vertices d and g are not split because they are the smallest edges that contain the fourth, respectively the first, linear path.

Proposition 4.7 (Correctness of CanonicalPathSplit)
For a graph $G$, the graph returned by CanonicalPathSplit(G) is a valid PSG of $G$. It is deterministic and invariant under isomorphism of $G$. The runtime of CanonicalPathSplit is $O(|G|)$, where $|G|$ is the number of operations in the graph $G$.
Proof

Let $G^\pi$ be the graph returned by CanonicalPathSplit(G). From the discussion in the proof of Proposition 4.6, we know it is sufficient to show that the interface graph $\mathcal{I}$ of $G^\pi$ is acyclic and connected.

$\mathcal{I}$ is acyclic. If there were a cycle in $\mathcal{I}$, there would be operations $o_0, \dots, o_{k-1}$ in $G$, none of which is split in $G^\pi$, that pairwise $(o_i, o_{i+1\ \mathrm{mod}\ k})$ share a linear path. One of these operations must be considered last in the for loop of lines 11–20; suppose it is $o_{k-1}$. But every linear path of $o_{k-1}$ is either also a linear path of $o_{k-2}$ or a linear path of $o_0$, and both of these operations have already added their linear paths to seen_paths: $o_{k-1}$ therefore satisfies the condition on line 15 and is split, so the corresponding edge cannot be in $\mathcal{I}$, a contradiction. Hence $\mathcal{I}$ is acyclic.

$\mathcal{I}$ is connected. We proceed inductively to show the following invariant for the main for loop (lines 11–20): for every linear path in seen_paths, there is a path in $\mathcal{I}$ to a linear path of the root operation. seen_paths is only modified on line 20. If op is the root operation, then trivially there is a path from the linear paths op_linear_paths to a linear path of the root operation. Otherwise, we claim that at least one of the paths in op_linear_paths must already be in seen_paths. From this it follows that there is a path in $\mathcal{I}$ from a root path to the newly seen linear path, given by the path in $\mathcal{I}$ to the linear path already in seen_paths, followed by the edge in $\mathcal{I}$ that corresponds to op.

By connectedness of $G$, there is a path from the root operation to op. The path is not empty because op is not the root operation, so we can consider the prefix of the path containing all operations except op. Call op' the last operation preceding op and op_linear_paths' its linear paths. Two successive operations on a path must share a linear path: op_linear_paths $\cap$ op_linear_paths' cannot be empty. According to line 4, op' must have been visited before op, thus op_linear_paths' $\subseteq$ seen_paths. It follows that at least one element of op_linear_paths must be in seen_paths.

Deterministic and isomorphism invariant. The pseudocode above is deterministic and only depends on paths in $G$ encoded as strings of port labels, which are invariant under isomorphism.

Runtime complexity. Lines 2 and 3 run in $O(|G|)$ time. With the exception of the sort function on lines 4–7, every other line can be run in $O(1)$ time:

  • lines 13 and 15 run in constant time because the size of op_linear_paths is always at most 2;
  • line 20 (and the issubset check on line 15) can be run in constant time by representing the seen_paths set as a fixed-size boolean array of size $w$, with the $i$-th bit indicating whether the $i$-th linear path has been seen;
  • line 17 is a constant-time transformation if we allow in-place modification of new_G.

The for loop runs $|G|$ iterations, for a total of $O(|G|)$ runtime. Finally, the sorting operation would naively take time $O(|G| \log |G|)$. However, given that the ordering is obtained lexicographically from the paths starting at the root, we can obtain the sorted list of operations by a depth-first traversal of the graph starting at the root. The result follows.

Using CanonicalPathSplit, we can now sketch what the pattern matching algorithm should look like. For each pattern, we first compute its canonical PSG for an arbitrary choice of pattern root operation; then, given a graph $G$, we can find all embeddings of patterns into $G$ by iterating over all possible PSGs within $G$. Naively, this involves enumerating all possible subgraphs of $G$ and then, for each of them, iterating over all possible root choices.

This can be significantly sped up by realising that many of the PSGs computed when iterating over all possible subgraphs and root choices are redundant2. We will see in the next section that we can i) iterate once over all possible root choices $r$ in $G$ and ii) introduce a new procedure AllPathSplits that will efficiently enumerate all possible rooted dual trees of PSGs that are rooted in $r$ for subgraphs within $G$. In the process, we will also see that we can replace the tree equality check of line 12 with a subtree inclusion check, further reducing the number of PSGs that must be considered.

Naive pattern matching.

 1# Precompute all PSGs
 2allT = [CanonicalPathSplit(
 3    P, root_P
 4) for (P, root_P) in patterns]
 5
 6for S in Subgraphs(G):
 7  for root_S in Operations(S):
 8    TG = CanonicalPathSplit(
 9        S, root_S
10    )
11    for T in allT:
12      if T == TG:
13        yield T

Improved using AllPathSplits (section 4.5).

 1# Precompute all PSGs
 2allT = [CanonicalPathSplit(
 3    P, root_P
 4) for (P, root_P) in patterns]
 5
 6for root_G in Operations(G):
 7  for TG in AllPathSplits(
 8      G, root_G
 9  ):
10    for T in allT:
11      # Replace == with subtree
12      if IsSubTree(T, TG):
13        yield T

  1. The sort_key parameter of the sort function defines the total order according to which the elements are sorted, from smallest to largest. ↩︎

  2. Think for example of the same root operation $r$ that is considered repeatedly for every overlapping subgraph of $G$ that contains $r$. ↩︎

4.5. Enumerating all path-split graphs

The CanonicalPathSplit procedure in the previous section defines, for every graph $G$ and choice of root operation $r$, a canonical PSG $G^\pi$, and thus a canonical set of anchors $\pi$ that we write as $\pi_r(G) = \pi$.

Instead of CanonicalPathSplit, we can equivalently consider a CanonicalAnchors procedure, which computes $\pi_r(G)$ directly instead of the graph $G^{\pi_r(G)}$.

We formulate this computation below, using recursion instead of a for loop. This form generalises better to the AllAnchors procedure that we will introduce next.

The equivalence of the CanonicalAnchors procedure with CanonicalPathSplit follows from the observation made in section 4.2 that ordering operations in lexicographic order of the port labels is equivalent to a depth-first traversal of the graph.

CanonicalAnchors implements a recursive depth-first traversal (DFS), with the twist that the recursion is explicit only on the anchor nodes and otherwise relies on the lexicographic ordering, just like CanonicalPathSplit: lines 5–15 of CanonicalAnchors correspond to the iterations of the for loop (lines 11–20) of CanonicalPathSplit until an anchor operation is found (i.e. until the else branch on lines 18–20 is executed). From there, the graph traversal proceeds recursively.

We introduce the ConnectedComponent, Neighbours and RemoveOperation procedures; the first returns the connected component of the current operation, whereas the other two are used to traverse, respectively modify, the graph $G$. Importantly, Neighbours(G, op) returns the neighbours of op ordered by port label.

To ensure that the recursive DFS does not visit the same operation twice, we modify the graph with RemoveOperation on lines 11 and 15, ensuring that no visited operation remains in G. As a consequence, CanonicalAnchors may be called on disconnected graphs, which explains why an additional call to ConnectedComponent (line 4) is required.

Proposition 4.8 (Equivalence of CanonicalPathSplit and CanonicalAnchors)

Let $G$ be a connected graph and let $r$ be a root operation in $G$. Then CanonicalAnchors maps the graph to the canonical anchor set:

$$(G, r, \{\}) \mapsto (\pi_r(G), L, \varnothing),$$

where $L$ is the set of all linear paths in $G$ and $\varnothing$ designates the empty graph.

The proof follows directly from the previous paragraphs.

 1def CanonicalAnchors(
 2    G: Graph, root: Operation, seen_paths: Set[int]
 3) -> (Set[Operation], Set[int], Graph):
 4  operations = Operations(ConnectedComponent(G, root))
 5  # sort by PathAsPortLabels, as previously
 6  sorted_operations := sort(operations)
 7  operations_queue := queue(sorted_operations)
 8
 9  # Skip all operations that are not anchors
10  op := operations_queue.pop() # never empty, contains root
11  G = RemoveOperation(G, op)
12  while len(LinearPaths(G, op)) == 1 or
13        issubset(LinearPaths(G, op), seen_paths):
14    op = operations_queue.pop() or return ({}, {}, G)
15    G = RemoveOperation(G, op)
16
17  # op is anchor, update seen_paths and recurse
18  seen_paths = union(seen_paths, LinearPaths(G, op))
19  anchors := [op]
20  # sort by port labels
21  for child in Neighbours(G, op):
22    (new_anchors, seen_paths, G) = CanonicalAnchors(
23        G, child, seen_paths
24    )
25    anchors += new_anchors
26
27  return (anchors, seen_paths, G)

Maximal PSGs

In addition to “simplifying” the data required to define path splitting, the definition of PSGs using anchor operations has another advantage that is fundamental to the pattern matching algorithm.

Consider the rooted dual tree $\tau_r(G^\pi)$ of a PSG with root operation $r$ in $G^\pi$. Recall that tree nodes are uniquely identified by their path from the root and thus are considered equal if they are isomorphic as graphs. We can in the same way define a tree inclusion relation $\subseteq$ on rooted dual trees that corresponds to checking that the trees have the same root and that the left-hand side is isomorphic to a subtree of the right-hand side. We also require that the operation weights given by the $split$ map coincide on the common subtree.
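
The subtree inclusion check sketched below corresponds to the IsSubTree test used in the improved matching loop of section 4.4; the `Node` record is a hypothetical stand-in in which a single `optype` label abbreviates the operation type together with the weights that must coincide.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    optype: str                                                     # operation type plus weights
    children: list[Optional["Node"]] = field(default_factory=list)  # ordered by port label

def is_subtree(pattern: Optional[Node], host: Optional[Node]) -> bool:
    """True if `pattern`, rooted at the same root as `host`, is isomorphic to a
    subtree of `host`: every node present in the pattern must appear in the host
    with the same label and at the same child position; the host may have more."""
    if pattern is None:
        return True                                # nothing is required at this position
    if host is None or pattern.optype != host.optype:
        return False
    if len(pattern.children) > len(host.children):
        return False
    return all(is_subtree(p, h) for p, h in zip(pattern.children, host.children))

# The pattern may omit the grandchild under f and still match; a wrong label does not.
host = Node("e", [Node("d"), Node("f", [Node("g")])])
assert is_subtree(Node("e", [Node("d"), Node("f")]), host)
assert not is_subtree(Node("e", [Node("x")]), host)
```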

Proposition 4.9 (Maximal PSG)

Let $G$ be a connected graph, $\pi$ a set of operations in $G$ and $r \in \pi$ a root operation. Consider the set $\mathcal{G}_\pi = \{H \subseteq G \mid \pi_r(H) = \pi \}$.

There is a subgraph $M \subseteq G$ such that for all subgraphs $H \in \mathcal{G}_\pi$: $H \subseteq M$. Furthermore, for every graph $P$, there are $r'$ and $\pi' = \pi_{r'}(P)$ such that

$$P \simeq H \in \mathcal{G}_\pi \quad\Leftrightarrow\quad \tau_{r'}(P^{\pi'}) \subseteq \tau_r(M^\pi). \qquad (1)$$

We call $M^\pi$ the maximal PSG with anchors $\pi$ in $G$.

The proof gives an explicit construction for $M$.

Proof

Assume $\mathcal{G}_\pi \neq \varnothing$; otherwise the statement is trivial.

Construction of $M$. Let $L$ be the set of linear paths in $G$ that go through at least one operation in $\pi$. Consider the set of operations $O_L$ in $G$ given by the operations whose linear paths are all contained in $L$. This defines a subgraph $G|_{O_L}$ of $G$. Since $\mathcal{G}_\pi \neq \varnothing$, there exists $H \in \mathcal{G}_\pi$. By assumption, $H$ is connected, and thus the anchors $\pi$ of $H$ are connected in $H$. There is therefore a connected component $M \subseteq G|_{O_L}$ that contains the set $\pi$.

Well-definedness of $M^\pi$. Consider the PSG $M^\pi$ of $M$. We must show that $M^\pi$ is a tree-like graph for the proposition statement to be well-defined. In other words, we must show that the interface graph $\mathcal{I}$ of $M^\pi$ is acyclic and connected. $M$ is connected by construction, which implies connectedness of $M^\pi$ and thus of $\mathcal{I}$. It is acyclic because $width(M) = |\pi| + 1$ and $M$ has exactly $|\pi|$ operations on more than one linear path. $\mathcal{I}$ is thus a tree.

$H \subseteq M$. For any subgraph $H \in \mathcal{G}_\pi$, its operations must be contained in $O_L$. Since any $H \in \mathcal{G}_\pi$ is connected and contains $\pi$, it must further hold that $H \subseteq M$.

We can now prove the $\Leftrightarrow$ equivalence of (1).

$\Leftarrow$: If $\tau_{r'}(P^{\pi'}) \subseteq \tau_r(M^\pi)$, then there exists $H' \subseteq M^\pi$ with rooted dual tree

$$\tau_r(H^\pi) = \tau_r(P^{\pi'}).$$

Furthermore, by definition of $\subseteq$ on rooted trees, a $split$ map is defined on $H'$, given by the $split$ map of $M^\pi$ restricted to the domain $H'$. Recall from section 4.3 that there is a map $\sigma$ that maps

$$(H', split) \overset{\sigma}{\longmapsto} H \quad\textrm{and}\quad (M^\pi, split) \overset{\sigma}{\longmapsto} M.$$
It merges split operations pairwise, and thus it is immediate that $H' \subseteq M^\pi$ implies $H \subseteq M$. Thus $H \in \mathcal{G}_\pi$ and $H' = H^\pi$. By construction, one can also derive that $P \simeq H$. The statement follows.

$\Rightarrow$: Since $H \in \mathcal{G}_\pi$, we know from the point above that $H \subseteq M$. Thus we can define an injective embedding $\varphi: P \to M$.

Operation splitting leaves the set of values unchanged, both from $H$ to $H^\pi$ and from $M$ to $M^\pi$. Similarly, the values shared between two operations in $H^\pi$ (resp. $M^\pi$) correspond to the edges of $\tau_r(H^\pi)$ (resp. $\tau_r(M^\pi)$). The pattern embedding $\varphi$ hence defines an injective map $\phi_E$ from tree edges in $\tau_r(H^\pi)$ to tree edges in $\tau_r(M^\pi)$. We extend this map to a map on the trees $\phi: \tau_r(H^\pi) \to \tau_r(M^\pi)$ by induction over the node set of $\tau_r(H^\pi)$. We start with the root, setting $\phi(r) = r$. Using $\phi_E$, we can then uniquely define the image of any child node of $r$ in $\tau_r(H^\pi)$, and so forth inductively.

We now show that the map $\phi$ thus defined is injective. Suppose $v, v'$ are nodes in $\tau_r(H^\pi)$ such that $\phi(v) = \phi(v')$. By the inductive construction, there are paths from the root $r$ to $v$ and $v'$ respectively such that their images under $\phi_E$ are two paths from $r$ to $\phi(v) = \phi(v')$. But $\tau_r(M^\pi)$ is a tree, so both image paths must be equal. By injectivity of $\phi_E$, it follows that $v = v'$, and thus $\phi$ is injective. Finally, the value and operation weights are invariant under pattern embedding and thus are preserved by definition.

This result means that instead of listing all PSGs for every possible subgraph of GG, it is sufficient to proceed as follows:

  1. for every pattern $P$, fix a root operation $r_P$ and construct the rooted dual tree of the canonical PSG
     $$\tau_P := \tau_{r_P}(P^{\pi_{r_P}(P)}).$$
  2. enumerate every possible root operation $r$ in $G$,
  3. enumerate every possible set of anchors $\pi$ in $G$ with root $r$,
  4. for each set $\pi$, find the maximal PSG $M$ with anchors $\pi$ in $G$, and take its rooted dual tree $\tau_M := \tau_r(M^\pi)$,
  5. find all patterns $P$ such that $\tau_P \subseteq \tau_M$.

In other words, if AllAnchors is a procedure that enumerates all possible sets of anchors $\pi$ in $G$ and MaximalPathSplit computes the maximal PSG $M$ as presented in the proof of Proposition 4.9, then AllPathSplits(G) can simply be obtained by calling AllAnchors and then returning their respective maximal PSGs in $G$:

def AllPathSplits(G: Graph, root: Operation) -> Set[Graph]:
  all_anchors = AllAnchors(G, root)
  return {MaximalPathSplit(G, pi) for pi in all_anchors}

The missing piece: AllAnchors

We can now complete the definition of AllPathSplits by defining the AllAnchors procedure, which enumerates all possible sets of anchors in $G$ given a root operation $r$.

The procedure is similar to CanonicalAnchors, described in detail in the previous paragraphs. In addition to the arguments of CanonicalAnchors, AllAnchors requires a width argument $w \geq 1$. It then returns all sets of at most $w$ operations1 that form the canonical anchors of some width-$w$ subgraph of $G$ with root $r$. The main difference between CanonicalAnchors and AllAnchors is that the successive recursive calls (line 22 in CanonicalAnchors) are replaced by a series of nested loops (lines 27–33 in AllAnchors) that exhaustively iterate over the possible outcomes for different subgraphs of $G$. The results of every possible combination of recursive calls are then collected into a list of anchor sets, which is returned.

The parts of the pseudocode without comments are unchanged from CanonicalAnchors. Using Proposition 4.3, we know that we can assume that every operation has at most 3 children, and thus at most 3 neighbours in G, given that the operations corresponding to parent nodes have been removed.

 1def AllAnchors(
 2    G: Graph, root: Operation, w: int,
 3    seen_paths: Set[int] = {}
 4) -> List[(Set[Operation], Set[int], Graph)]:
 5  # Base case: return one empty anchor list
 6  if w == 0:
 7    return [({}, {}, G)]
 8
 9  operations = Operations(ConnectedComponent(G, root))
10  sorted_operations := sort(operations)
11  operations_queue := queue(sorted_operations)
12
13  op := operations_queue.pop()
14  G = RemoveOperation(G, op)
15  while len(LinearPaths(G, op)) == 1 or
16        issubset(LinearPaths(G, op), seen_paths):
17    op = operations_queue.pop() or return [({}, {}, G)]
18    G = RemoveOperation(G, op)
19
20  seen0 = union(seen_paths, LinearPaths(G, op))
21  # There are always at most three neighbours: we
22  # unroll the for loop of CanonicalAnchors.
23  [child1, child2, child3] = Neighbours(G, op)
24  # Iterate over all ways to split w-1 anchors over
25  # the three children and solve recursively
26  all_anchors = []
27  for 0 <= w1, w2, w3 < w with w1 + w2 + w3 == w - 1:
28    for (anchors1, seen1, G1) in
29        AllAnchors(G, child1, w1, seen0):
30      for (anchors2, seen2, G2) in
31          AllAnchors(G1, child2, w2, seen1):
32        for (anchors3, seen3, G3) in
33            AllAnchors(G2, child3, w3, seen2):
34          # Concatenate new anchor with anchors from all paths
35          anchors = union([op], anchors1, anchors2, anchors3)
36          all_anchors.push((anchors, seen3, G3))
37  return all_anchors

We can represent the sequence of recursive calls to AllAnchors as a tree. The call tree for the graph used as example to illustrate CanonicalAnchors earlier is given on the next page.

We now show correctness of the procedure. Let us write $\Pi_r^w(G)$ for the set of sets of anchors returned by AllAnchors(G, r, w, {}).

Proposition 4.11 (Correctness of AllAnchors)

Let $G$ be a graph and $H \subseteq G$ a subgraph of $G$ of width $w$. Let $r$ be a choice of root operation in $H$. We have $\pi_r(H) \in \Pi_r^w(G)$.


A call tree for an execution of AllAnchors on the example graph of the previous figure with $w = 3$. Starting from the root, each node in the tree corresponds to either picking an operation as an anchor or not (thus splitting it). Edges are labelled by the values assigned to $w$ for the respective children of the source node. One path from root to leaf leads to no solution (it is impossible to find an unseen linear path from operation $b$). The other paths each lead to a valid set of three anchors.

The proof is by induction over the width ww of the subgraph HH. The idea is to map every recursive call in CanonicalAnchors to one of the calls to AllAnchors on lines 29, 31 or 33. All recursive results are concatenated on line 36, and thus, the value returned by CanonicalAnchors will be one of the anchor sets in the list returned by AllAnchors.

Proof

Let $H \subseteq G$ be a connected subgraph of $G$ of width $w$. We prove inductively over $w$ that if $(X, S', H') =$ CanonicalAnchors$(H, r, S)$, then there is a graph $G'$ with $H' \subseteq G' \subseteq G$ such that

$$(X, S', G') \in \textrm{AllAnchors}(G, r, w, S)$$

for all valid root operations $r$ of $H$ and all sets $S$ of linear paths of $H$ passed as seen_paths. The statement of the proposition follows directly from this claim.

For the base case $w = 1$, CanonicalAnchors will return the anchors anchors = [op] as defined on line 19: there is only one linear path and, once line 18 has been executed, it is in seen_paths; thus for every recursive call to CanonicalAnchors, the while condition on line 12 will always be satisfied until all operations have been exhausted and empty sets are returned. In AllAnchors, on the other hand, the only values of w1, w2 and w3 that satisfy the loop condition on line 27 for $w = 1$ are w1 == w2 == w3 == 0. As a result, given the w == 0 base case on lines 6–7, lines 35 and 36 of AllAnchors are only executed once, and the definition of anchors on line 35 is equivalent to its definition in CanonicalAnchors.

We now prove the claim for $w > 1$ by induction. As documented in AllAnchors, we can assume that every operation has at most 3 children. This simplifies the loop on lines 21–25 of CanonicalAnchors to at most three recursive calls to CanonicalAnchors.

Consider a call to CanonicalAnchors for a graph $H \subseteq G$, a root operation $r$ in $H$ and a set $S$ of linear paths. Let $w_a$, $w_b$ and $w_c$ be the lengths of the anchor lists returned by the three recursive calls to CanonicalAnchors on line 22 for the execution of CanonicalAnchors with arguments $H$, $r$ and $S$. Let $c_a, c_b$ and $c_c$ be the three neighbours of $r$ in $H$. If the child $c_x$ does not exist, then one can set $w_x = 0$ and it can be ignored – the argument below still holds in that case. The definition of seen0 on line 20 of AllAnchors coincides with the update to the variable seen_paths on line 18 of CanonicalAnchors; similarly, the updates to G on lines 14 and 18 of AllAnchors are identical to lines 11 and 15 of CanonicalAnchors that update H. Let the updated seen_paths be the set $S_a$, the updated G be $G_a$ and the updated $H$ be $H_a$, with $H_a \subseteq G_a$.

As every anchor operation reduces the number of unseen linear paths by exactly one (using the simplifying assumptions of section 4.2), it must hold that $w_a + w_b + w_c + 1 = w$. Thus, for a call to AllAnchors with the arguments $G$, $r$, $w$ and $S$, there is an iteration of the for loop on line 27 of AllAnchors such that w1 $= w_a$, w2 $= w_b$ and w3 $= w_c$. It follows that on line 29 of AllAnchors, the procedure is called recursively with arguments $(G_a, c_a, w_a, S_a)$. From the induction hypothesis, we obtain that there is an iteration of the for loop on line 29 in which the values of anchors1 and seen1 coincide with the values of the new_anchors and seen_paths variables after the first iteration of the for loop on line 21 of CanonicalAnchors. Call the value of seen1 (and seen_paths) $S_b$. Similarly, call the updated value of G in AllAnchors $G_b$ and the updated value of G in CanonicalAnchors $H_b$. We have, by the induction hypothesis, that $H_b \subseteq G_b$.

Repeating the argument, we obtain that there are iterations of the for loops on lines 30 and 32 of AllAnchors that correspond to the second and third recursive calls to CanonicalAnchors on line 22 of the procedure. Finally, the concatenation of anchor lists on line 35 of AllAnchors is equivalent to the repeated concatenations on line 25 of CanonicalAnchors, and so we conclude that the claim holds for $w$.

We will see that the overall runtime complexity of AllAnchors can be easily derived from a bound on the size of the returned list. For this, we use the following result:

Proposition 4.10 (Number of anchor sets in AllAnchors)
For a graph $G$, a root operation $r$ in $G$ and $1 \leq w \leq \mathrm{width}(G)$, the length of the list AllAnchors$(G, r, w)$ is in $O(c^w \cdot w^{-3/2})$, where $c = 6.75$ is a constant.
Proof

Let $C_w$ be an upper bound for the length of the list returned by a call to AllAnchors for width $w$. For the base case $w = 0$, $C_0 = 1$. The returned all_anchors list is obtained by pushing anchor lists one by one on line 36. We can count the number of times this line is executed by multiplying the lengths of the lists returned by the recursive calls on lines 28–32, giving us the recurrence relation

$$C_w \leq \sum_{\substack{0 \leq w_1, w_2, w_3 < w\\ w_1 + w_2 + w_3 = w - 1}} C_{w_1} \cdot C_{w_2} \cdot C_{w_3}.$$

Since $C_w$ is meant to be an upper bound, we replace $\leq$ with equality above to obtain a recurrence relation for $C_w$. This recurrence relation is a generalisation of the well-known Catalan numbers Stanley, 2015Richard P. Stanley. 2015. Catalan Numbers. Cambridge University Press. doi: 10.1017/CBO9781139871495, equivalent to counting the number of ternary trees with $w$ internal nodes: a ternary tree with $w \geq 1$ internal nodes is made of a root along with three subtrees with $w_1, w_2$ and $w_3$ internal nodes respectively, with $w_1 + w_2 + w_3 = w - 1$. A closed-form solution to this problem can be found in Aval, 2008Jean-Christophe Aval. 2008. Multivariate Fuss–Catalan numbers. Discrete Mathematics 308, 20 (October 2008, 4660–4669). doi: 10.1016/j.disc.2007.08.100:

$$C_w = \frac{\binom{3w}{w}}{2w + 1} = \Theta \left(\frac{c^w}{w^{3/2}} \right)$$

satisfying the above recurrence relation with equality, where $c = 27/4 = 6.75$ is a constant obtained from the Stirling approximation:

$$\begin{aligned}\binom{3w}{w} = \frac{(3w)!}{(2w)!\,w!} &= \Theta\left(\frac{1}{\sqrt{w}}\right) \Big(\frac{(3w)^3}{e^3}\Big)^{w}\Big(\frac{e^2}{(2w)^2}\Big)^{w}\Big(\frac{e}{w}\Big)^{w}\\ &= \Theta\left(\frac{(27/4)^w}{w^{1/2}}\right).\end{aligned}$$
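As a sanity check, the recurrence and the closed form can be compared numerically. The following Python snippet (illustrative only, not part of the proof) computes both for small $w$:

from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def C_recursive(w):
    # C_w = sum over w1 + w2 + w3 = w - 1 of C_{w1} * C_{w2} * C_{w3}, with C_0 = 1
    if w == 0:
        return 1
    return sum(
        C_recursive(w1) * C_recursive(w2) * C_recursive(w - 1 - w1 - w2)
        for w1 in range(w)
        for w2 in range(w - w1)
    )

def C_closed(w):
    # Fuss-Catalan numbers: binom(3w, w) / (2w + 1)
    return comb(3 * w, w) // (2 * w + 1)

# Both agree: C_1..C_5 = 1, 3, 12, 55, 273, ...
assert all(C_recursive(w) == C_closed(w) for w in range(12))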

To obtain a runtime bound for AllAnchors, it is useful to identify how much of $G$ needs to be traversed. If we suppose that all patterns have depth at most $d$, then it immediately follows that any operation in $G$ that is in the image of a pattern embedding must be at distance at most $d$ from an anchor operation. We can thus equivalently call AllAnchors on a subgraph of $G$ such that no linear path is longer than $2d$. This gives the following runtime bound.

Proposition 4.12 (Runtime of AllAnchors)

For patterns of width at most $w$ and depth at most $d$, the total runtime of AllAnchors is in

$$O\left(\frac{c^w \cdot d}{w^{1/2}}\right).$$

Proof

We restrict Operations on line 9 to only return the first $d$ operations on the linear path in each direction, starting at the anchor operation: operations more than distance $d$ away from the anchor cannot be part of a pattern of depth $d$.

We use the bound on the length of the list returned by calls to AllAnchors of Proposition 4.10 to bound the runtime. We can ignore the non-constant runtime of the concatenation of the outputs of recursive calls on line 35, as the total size of the outputs is asymptotically at worst of the same complexity as the runtime of the recursive calls themselves. Excluding the recursive calls, the only remaining lines of AllAnchors that are not executed in constant time are the while loop on lines 15–18 and the Operations and sort calls on lines 9–11. Using the same argument as in CanonicalAnchors, we can ignore the latter two calls by replacing the queue of operations by a lazy iterator over operations. The next operation given op and the graph G can always be computed in $O(1)$ time using a depth-first traversal of G.
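To illustrate the lazy-iterator argument, the following Python sketch (with a hypothetical next_op neighbour lookup, not the actual implementation) yields the operations along one direction of a linear path on demand, one constant-time step at a time:

def ops_within_depth(graph, anchor, direction, d):
    """Lazily yield the operations at distance 1..d from `anchor` along a
    linear path, using one O(1) neighbour lookup per step (hypothetical API)."""
    op = anchor
    for _ in range(d):
        op = graph.next_op(op, direction)  # assumed O(1) lookup
        if op is None:                     # end of the linear path
            return
        yield op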

Consider the recursion tree of AllAnchors, i.e. the tree in which the nodes are the recursive calls to AllAnchors and the children are the executions spawned by the nested for loops on lines 28–32. This tree has at most

$$C_w = \Theta\left(\frac{c^w}{w^{3/2}}\right)$$

leaves. A path from the root to a leaf corresponds to a stack of recursive calls to AllAnchors. Along this recursion path, the seen_paths set is always strictly growing (line 35) and the operations removed from G on lines 14 and 18 are all distinct. For each linear path, at most $2d$ operations are traversed. Thus the total runtime of the while loop (lines 15–18) along a path from root to leaf in the recursion tree is in $O(w \cdot d)$. We can thus bound the overall complexity of executing the entire recursion tree by $O(C_w \cdot w \cdot d) = O\left(\frac{c^w \cdot d}{w^{1/2}}\right)$.


  1. Every anchor operation is on at least one previously unseen linear path, thus there can be at most $w$ operations in the set of anchors. ↩︎

4.6. An automaton for multi-pattern matching

We have shown in the previous sections that graph pattern-matching can be reduced to a problem of tree inclusions, with trees of fixed width $w$. To complete the pattern-matching algorithm, we must provide a fast way to evaluate the subtree relation for many trees representing the set of all patterns we wish to match.

More precisely, for patterns $P_1, \dots, P_\ell$ of width $w$, fix a root operation $r_i$ in $P_i$ for each $1 \leqslant i \leqslant \ell$ and consider the rooted tree duals of the canonical PSGs $\tau_{r_i}(P_i^{\pi_i})$, with $\pi_i = \pi_{r_i}(P_i)$ the canonical anchors. Then, given a subject graph $G$, we wish to compute the set

$$\{1 \leqslant i \leqslant \ell \mid \tau_{r_i}(P_i^{\pi_i}) \subseteq \tau_r(G^\pi)\},$$

for all anchor sets $\pi \in \Pi_r^w(G)$ and root operations $r$ in $G$. This corresponds to the IsSubTree predicate introduced in the sketch of the algorithm in section 4.4.

Instead of considering the trees of PSGs, it will prove easier to consider the contracted PSGs (cPSGs)

$$\tau_{r_i}(c(P_i^{\pi_i}))\quad\textrm{and}\quad \tau_r(c(G^\pi)).$$

Such tree inclusions are equivalent to finding embeddings in the subject graph itself, provided that we keep track of the $split$ and $contract$ weight maps (see section 4.3).

It will be useful to recall the following properties of contracted PSGs. Every operation of a cPSG (and thus every node in its rooted dual tree) is an anchor operation of the PSG. Per Proposition 4.4, the rooted dual tree of a cPSG is a ternary tree and has exactly $\mathrm{width}(G) - 1$ nodes. Finally, recall the concept of an open value of a graph, i.e. a value that is missing either a use or a define operation (see section 4.2).

Reduction of tree inclusion to string prefix matching

Now consider two contracted spanning tree reductions $c(G_1^{\pi_1})$ and $c(G_2^{\pi_2})$ with values $V_1$ and $V_2$. To simplify notation, define

$$\tau_1 = \tau_{r_1}(c(G_1^{\pi_1})) \quad\textrm{and}\quad \tau_2 = \tau_{r_2}(c(G_2^{\pi_2}))$$

for some choice of root operations $r_1$ and $r_2$ in $G_1$ and $G_2$, respectively. We lift the $\subseteq$ relation on rooted dual trees of PSGs introduced in section 4.5 to rooted dual trees of cPSGs in such a way that there is an inclusion relation between two rooted dual trees of PSGs if and only if the same relation holds on the rooted duals of the cPSGs.

We say that $\tau_1 \subseteq \tau_2$ if and only if

  • the trees share the same root operation,
  • $\tau_1$ is a subtree of $\tau_2$,
  • the $split$ map coincides on the common subtree, and
  • the $contract$ map satisfies, for all $v \in V_1$:
    $$\begin{cases}contract(v) \subseteq contract(f(v))\quad&\textrm{if }v\textrm{ is an open value},\\contract(v) = contract(f(v))\quad&\textrm{otherwise},\end{cases}$$

where $f: V_1 \hookrightarrow V_2$ designates the embedding of $V_1$ into $V_2$ given by the tree embedding.

The first three conditions are taken as-is from the $\subseteq$ relation on non-contracted trees, whilst the fourth condition on the $contract$ map is specific to contracted trees.

Using Proposition 4.2, there are at most 2 open values for each linear path in the graph, and thus at most $2w$ open values in a rooted dual tree of a cPSG of width $w$. For each such contracted rooted dual, we can thus define a contracted string tuple $S = (s_1, \dots, s_{2w}) \in (O^\ast)^{2w}$ given by the values of the $contract$ map evaluated at the (up to) $2w$ open values1.

If $contract|_C$ is the restriction of $contract$ to the non-open values of a cPSG, the fourth condition of the inclusion relation $\subseteq$ on rooted dual cPSGs given above becomes an equality condition when restricted to non-open values. A special case of this property of particular interest to us is stated as the following result. The $\subseteq$ relation on strings refers to prefix inclusion, i.e. $s \subseteq t$ if and only if $s$ is a prefix of $t$.

Proposition 4.14 (Inclusion of equal-width trees)

Let $S = (s_1, \dots, s_{2w})$ and $T = (t_1, \dots, t_{2w}) \in (O^\ast)^{2w}$ be the contracted string tuples of $\tau_1$ and $\tau_2$ respectively. Then $\tau_1 \subseteq \tau_2$ if and only if the trees share the same root, are isomorphic, have the same $split$ and $contract|_C$ maps and, for all $i \in \{1, \dots, 2w\}$: $s_i \subseteq t_i$.

The proof of this follows directly from observing that the two rooted duals of cPSGs of width $w$ have the same number of nodes and that the restriction $contract|_C$ to non-open values must satisfy equality.
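For concreteness, the criterion of Proposition 4.14 translates directly into a membership test. The Python sketch below assumes a hypothetical interface for rooted dual trees (root_op, same_shape, split, contract_closed and contracted_strings are illustrative names, not part of the formal development):

def is_prefix(s, t):
    return len(s) <= len(t) and t[:len(s)] == s

def included_equal_width(tau1, tau2):
    """Check tau1 <= tau2 for rooted dual cPSGs of the same width w,
    following Proposition 4.14 (hypothetical tree interface)."""
    if tau1.root_op != tau2.root_op:
        return False
    if not tau1.same_shape(tau2):            # isomorphic as rooted ternary trees
        return False
    if tau1.split != tau2.split:             # split maps must coincide
        return False
    if tau1.contract_closed != tau2.contract_closed:  # contract on non-open values
        return False
    # open values: contract strings must be component-wise prefixes
    return all(is_prefix(s, t)
               for s, t in zip(tau1.contracted_strings(), tau2.contracted_strings()))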

Why restrict ourselves to trees of the same width $w$? It is sufficient for our purposes: all patterns are of width $w$ by assumption, and so are the rooted dual trees of the form $\tau_r(G^\pi)$, given that $\pi \in \Pi_r^w(G)$.

The string prefix matching problem is a simple computational task that can be generalised to check for multiple string patterns at the same time using a prefix tree. An overview of this problem can be found in Appendix A. We can thus obtain a solution to the pattern matching problem for $\ell$ patterns:

Proposition 4.15 (Fixed anchor pattern matching)

As above, let

  • $G$ be a graph, with $\pi \in \Pi_r^w(G)$ a set of $w - 1$ operations and $r \in \pi$ a choice of root operation,
  • $P_1, \dots, P_\ell$ be patterns of width $w$ and depth $d$, with choices of root operations $r_1, \dots, r_\ell$ and canonical anchors $\pi_i = \pi_{r_i}(P_i)$.

The set of all pattern embeddings mapping the canonical anchor set $\pi_i$ to $\pi$ and root $r_i$ to $r$ for $1 \leq i \leq \ell$ can be computed in time $O(w \cdot d)$ using at most $\ell$ pre-computed prefix trees of size at most $(\ell \cdot d)^w + 1$, each constructed in time complexity $O((\ell \cdot d)^w)$.

Proof

For each pattern, we consider its canonical spanning tree reduction and construct a multi-dimensional prefix tree (see Appendix A) for each group of patterns that share the same spanning tree reduction.

Given a graph $G$, we can compute the cPSG of $G$ for anchors $\pi$ and map its rooted dual tree to the corresponding prefix tree. This can be done in $O(|T_G|)$ time by using a search tree. We can restrict $G^\pi$ to a graph of size $O(w \cdot d)$ by truncating the linear paths to length at most $2d$, as in the proof of Proposition 4.12. Thus we can assume $|G^\pi| \in O(w \cdot d)$.

The rest of the proof and the runtime bound follow from the multi-dimensional prefix tree construction detailed in Appendix A.
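To give an intuition for the prefix-tree step, the following is a minimal one-dimensional Python sketch (the construction referenced above is multi-dimensional; all names here are illustrative): it stores the patterns' contracted strings and, given a subject string, reports every stored pattern whose string is a prefix of it.

class PrefixTree:
    """Minimal trie: insert pattern strings, then report every pattern id
    whose string is a prefix of a queried subject string."""
    def __init__(self):
        self.children = {}
        self.pattern_ids = []                 # patterns ending exactly at this node

    def insert(self, s, pattern_id):
        node = self
        for symbol in s:
            node = node.children.setdefault(symbol, PrefixTree())
        node.pattern_ids.append(pattern_id)

    def matches(self, subject):
        node, found = self, list(self.pattern_ids)
        for symbol in subject:
            node = node.children.get(symbol)
            if node is None:
                break
            found.extend(node.pattern_ids)    # every node on the path is a prefix
        return found

Matching a subject string of length $n$ thus takes $O(n)$ time, independently of the number of patterns stored in the tree, which is the property exploited in Proposition 4.15.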

Combining everything

Finally, putting Proposition 4.15 and Proposition 4.12 together, we obtain our main result.

Proposition 4.13 (Pattern matching)

Let $P_1, \dots, P_\ell$ be patterns of width $w$ and depth $d$. The pre-computation runs in time and space complexity

$$O \left( (d \cdot \ell)^w \cdot \ell + \ell \cdot w \cdot d \right).$$

For any subject graph $G$, the pre-computed prefix trees can be used to find all pattern embeddings $P_i \to G$ in time

$$O \left( |G| \cdot \frac{c^w}{w^{1/2}} \cdot d \right),$$

where $c = 6.75$ is a constant.

Proof

The pre-computation consists of running the CanonicalAnchors procedure on each of the $\ell$ patterns and then transforming them into a map of prefix trees using Proposition 4.15. By Proposition 4.7, CanonicalAnchors runs in $O(w \cdot d)$ for each pattern, where we used that $|P_i| \leqslant w \cdot d$ for all patterns. The total runtime of the pre-computation is thus

$$O \left( (d \cdot \ell)^w \cdot \ell + \ell \cdot w \cdot d \right).$$

The complexity of pattern matching itself, on the other hand, is composed of two parts: the computation of all possible anchor sets $\Pi_r^w(G)$, and the execution of the prefix string matcher for each of the trees resulting from these sets $\pi \in \Pi_r^w(G)$. As AllAnchors must be run for every choice of root operation $r$ in $G$, the runtime is obtained by multiplying i) $|G|$ with ii) the runtime of the prefix tree matching (Proposition 4.15), and with iii) $|\Pi_r^w(G)|$, i.e. the number of anchor lists returned by AllAnchors (Proposition 4.10):

$$O(|G| \cdot w \cdot d \cdot C_w),$$

where $C_w$ is the bound on the number of anchor lists returned by AllAnchors. The result follows.


  1. The values can be ordered as usual by using the total lexicographic order on port labels of the tree. ↩︎

4.7. Benchmarks

Proposition 4.13 shows that pattern-independent matching can scale to large datasets of patterns but imposes some restrictions on the patterns and embeddings that can be matched. In this section, we discuss these limitations and give empirical evidence that the pattern-matching approach we have presented can be used on a large scale and outperforms existing solutions.

Pattern limitations

In section 4.2, we imposed conditions on the pattern embeddings to obtain a complexity bound for pattern-independent matching. We argued how these restrictions are natural for applications in quantum computing, and most of the arguments will also hold for a much broader class of computation graphs.

In future work, it would nonetheless be of theoretical interest to explore the importance of these assumptions and their impact on the complexity of the problem. As a first step towards a generalisation, our implementation and all our benchmarks in this section do not make any of these simplifying assumptions. Our results below give empirical evidence that a significant performance advantage can be obtained regardless.

Implementation

We provide an open-source implementation in Rust of pattern-independent matching using the results of this chapter, available on GitHub. The code and datasets used for the benchmarks themselves are available in a dedicated repository.

The implementation works for weighted or unweighted port graphs – of which typed minIR graphs are a special case – and makes none of the simplifying assumptions employed in the theoretical analysis. Pattern matching proceeds in two phases: precomputation and runtime.

Precomputation.  In a first step, all graph patterns are processed and compiled into a single state automaton that will be used at runtime for fast pattern-independent matching. The automaton in the implementation combines, in one data structure, two distinct computations of this chapter:

  • the recursive branching logic used in the AllAnchors procedure to enumerate all possible choices of anchors, and
  • the automaton described in section 4.6 that matches patterns for a fixed set of anchors.

The former is implemented with non-deterministic state transitions – each transition corresponding to choosing an additional anchor – whereas the latter is implemented deterministically.

Concretely, the automaton is constructed by following the construction of section 4.4 to decompose each pattern into its canonical path-split graph. We then order the nodes of the PSG and express each node as a condition that ensures the connectivity and node weight in the graph match the pattern. We thus obtain a chain of conditions, with a transition between any two consecutive conditions; transitions are deterministic by default and marked as non-deterministic whenever they lead to a condition on an anchor node. The state automaton for all patterns is then obtained by joining all chains of conditions into a tree.

Runtime.  Pattern matching is then as simple as simulating the state automaton, evaluating all conditions on the graph $G$ passed as input. The states in the automaton corresponding to the last condition of a pattern must be marked as end states, along with a label identifying the pattern that was matched. This can then be used at runtime to report all patterns found.
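Purely as an illustration of this construction (and not the actual Rust data structures of the implementation), one can picture each pattern as a chain of condition states joined into a tree, which is then simulated against the input graph; State, condition and satisfy below are illustrative names:

from dataclasses import dataclass, field

@dataclass
class State:
    condition: object        # predicate on the subject graph; anchor-choosing
                             # (non-deterministic) conditions may be satisfied in
                             # several ways, deterministic ones in at most one
    matched_patterns: list = field(default_factory=list)   # end-state labels
    children: list = field(default_factory=list)

def simulate(state, graph, binding):
    """Walk the condition tree depth-first, reporting the labels of every end
    state whose whole chain of conditions holds on the graph (sketch only)."""
    found = []
    for new_binding in state.condition.satisfy(graph, binding):
        found.extend(state.matched_patterns)
        for child in state.children:
            found.extend(simulate(child, graph, new_binding))
    return found

Each way of satisfying a non-deterministic condition spawns its own branch of the simulation, mirroring the branching of AllAnchors, whereas deterministic conditions simply extend the current branch.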

Our implementation has been tested for correctness, i.e. on the one hand that all reported matches are correct, and on the other hand that all pattern matches are found. This was done by comparing the matches of our implementation with the results obtained from matching every pattern separately on millions of randomly generated graphs and edge cases. We also ensured during benchmarking that the number of matches reported by our implementation and by Quartz was always the same.

Benchmarks

Baseline.  To assess practical use, we have benchmarked our implementation against a leading C++ implementation of pattern matching for quantum circuits from the Quartz superoptimiser project Xu, 2022Mingkuan Xu, Zikun Li, Oded Padon, Sina Lin, Jessica Pointing, Auguste Hirth, Henry Ma, Jens Palsberg, Alex Aiken, Umut A. Acar and Zhihao Jia. 2022. Quartz: Superoptimization of Quantum Circuits. In Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, June 2022. Association for Computing Machinery, 625--640. doi: 10.1145/3519939.3523433. This implementation is the principal component of an end-to-end quantum circuit optimisation pipeline. The results and speedups we obtain here thus apply and transfer directly to this application.

Dataset.  We further ensure that our results apply in practice by using a real-world dataset of patterns. The Quartz optimiser finds opportunities for circuit optimisation by relying on precomputed equivalence classes of circuits (ECC). These are obtained exhaustively by enumerating all possible small quantum circuits, computing their unitaries and clustering them into classes of circuits with identical unitaries.

The generation of ECC sets is parametrised on the number of qubits, the maximum number of gates and the gate set in use. For these benchmarks we chose the minimal set of gates $T, H, CX$ and considered circuits with up to 6 gates and 2, 3 or 4 qubits. The size of these pattern circuits is typical for the application1.

Thus, for our patterns, we have the bound $d \leq 6$ on the depth and widths $w = 2, 3, 4$. In all experiments, the graph $G$ subject to pattern matching was the barenco_tof_10 input, i.e. a 19-qubit circuit with 674 gates obtained by decomposing a 10-qubit Toffoli gate using the Barenco decomposition Barenco, 1995Adriano Barenco, Charles H. Bennett, Richard Cleve, David P. DiVincenzo, Norman Margolus, Peter Shor, Tycho Sleator, John A. Smolin and Harald Weinfurter. 1995. Elementary gates for quantum computation. Physical Review A 52, 5 (November 1995, 3457--3467). doi: 10.1103/PhysRevA.52.3457.

Results.  We study the runtime of our implementation as a function of the number $\ell$ of patterns being matched, up to $\ell = 10^4$ patterns. We expect the runtime of pattern matching algorithms that match one pattern at a time to scale linearly with $\ell$. On the other hand, Proposition 4.13 results in a matching complexity that is independent of $\ell$.

For each value of $\ell$, we select a subset of all patterns in the ECC sets at random. For $w = 2$, there are only a total of $\ell = 1954$ patterns, explaining why we do not report results beyond that number. For $\ell = 200$ patterns, our proposed algorithm is $3\times$ faster than Quartz. As expected, the advantage of our approach increases as we match more patterns, scaling up to a $20\times$ speedup for $\ell = 10^4$. The results are summarised in the following figure:

Runtime of pattern matching for $\ell = 0 \dots 10^4$ patterns on 2, 3 and 4 qubit quantum circuits from the Quartz ECC dataset, for our implementation (Portmatching) and the Quartz project. All $\ell = 1954$ two-qubit circuits were used, whereas for 3 and 4 qubit circuits, $\ell = 10^4$ random samples were drawn.

Dependency on $w$ and $\ell$.  We further study the runtime of our algorithm as a function of its two main parameters, the number of patterns $\ell$ and the pattern width $w$, on an expanded dataset. To this end, we generate random sets of 10,000 pattern circuits with 15 gates and between $w = 2$ and $w = 10$ qubits, using the same gate set as previously. The resulting pattern matching runtimes are shown in the figure below.

From Proposition 4.13, we expect the pattern matching runtime to be upper bounded by an $\ell$-independent constant. Our results support this prediction for $w = 2$ and $w = 3$ qubit patterns, where the runtime indeed saturates, reaching an observable plateau at large $\ell$.

We suspect, on the other hand, that the exponential $c^w$ dependency in the complexity bound of Proposition 4.13 makes it difficult to observe a similar plateau for $w \geq 4$, as we expect this upper bound on the runtime to increase rapidly with the qubit count. A runtime ceiling is not directly observable at this experiment size, but the gradual decrease in the slope of the curve is consistent with the existence of the $\ell$-independent upper bound predicted by Proposition 4.13.

Runtime of our pattern matching for random quantum circuits with up to 10 qubits.


  1. Such small circuit sizes are imposed in part by the fact that ECCs of larger circuits quickly become infeasible to generate, as their number grows exponentially. In practice, large circuit transformations can often be expressed as the composition of smaller atomic transformations, hence the good performance of this approach. ↩︎