Chapter 3
Quantum Compilation as a Graph Transformation Problem
The specialised optimisation techniques that we reviewed in section 2.2 are effective for the scenarios they were designed for, but they are challenging to adapt to new hardware primitives, constraints, or cost functions.
This thesis proposes interpreting quantum compilation as a graph transformation system (GTS). GTSs endow quantum compilation with well-defined semantics and strong theoretical foundations Lack, 2005. 2005. Adhesive and quasiadhesive categories. RAIRO - Theoretical Informatics and Applications 39, 3 (July 2005, 511--545). doi: 10.1051/ITA:2005028. They establish a practical, purely declarative framework in which compiler transformations can be defined and studied.
This allows us to decouple the semantics of quantum programs and the architecture specifics from the compiler infrastructure itself. We can thus focus on building and designing scalable and efficient graph transformation algorithms that can then be applied on a wide range of compilation problems and hardware targets.
In this chapter, we formalise quantum computation and optimisation based on graphs and graph transformations, providing the foundation for all considerations in later chapters. Albeit slightly simplified, the intermediate representation IR we propose here is based on joint work Mark K., 2025. 2025. HUGR: A Quantum-Classical Intermediate Representation. Retrieved (talk recording) from https://www.youtube.com/live/D8esZrt7ogk?feature=shared&t=5217, as well as ongoing development.
The words graph rewrite and graph transformation are often used interchangeably in the literature. In the context of this thesis, we will take these words to distinguish two slightly different problems:
The study of equivalences and other relations between graphs under well-defined semantics is the subject of graph transformations. For instance:
- a graph transformation rule (Definition .) expresses that an instance of can always be transformed into an instance of , reflecting the semantics of the system that the graph is modelling.
- a minIR equivalence class (Definition .) is an instance of a graph transformation system (GTS), which uses known semantic relations, expressed, for example, as graph transformation rules, to define how graphs can be transformed.
Graph rewriting, on the other hand, encapsulates the algorithmic procedures and data structures that mutate graphs. A rewrite (Definition .) is the tuple of data required to turn a graph into a new graph .
Given matches of patterns on a graph , a graph transformation system can consider the set of graph transformation rules that define the semantics of to produce a set of rewrites that can be applied to and mutate .
Our contributions in the subsequent chapters are mostly preoccupied with problems of graph rewriting, i.e. the definition and application of the data required to mutate graphs, as opposed to graph transformations. This chapter nonetheless considers both, using the mature graph transformation framework as a foundation to define IR rewriting semantics.
Section 3.1 starts with a review of previous related work at the intersection of graph transformation software and quantum program optimisation. We then discuss in section 3.2 a fundamental difference between classical computation graphs and the requirements of quantum computation. This motivates a new graph-based IR tailored to quantum computation that we present in section 3.3, along with formal graph rewriting semantics based on sesqui-pushout (SqPO) transformations (section 3.4). Whilst the SqPO transformation definition is constructive, its existence is not guaranteed. We conclude the chapter in section 3.5 by discussing a more restricted “operational” notion of graph rewriting that will be useful for the rest of the thesis.
3.1. Related work
Graph rewriting on computation graphs. Optimisation of computation graphs is a long-standing problem in computer science that is seeing renewed interest in the compiler Lattner, 2021. 2021. MLIR: Scaling Compiler Infrastructure for Domain Specific Computation. In 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), February 2021. IEEE, 2--14. doi: 10.1109/CGO51591.2021.9370308, machine learning (ML) Jia, 2019. 2019. TASO: optimizing deep learning computation with automatic generation of graph substitutions. In Proceedings of the 27th ACM Symposium on Operating Systems Principles, October 2019. ACM, 47--62. doi: 10.1145/3341301.3359630 Fang, 2020. 2020. Optimizing DNN computation graph using graph substitutions. Proceedings of the VLDB Endowment 13, 12 (August 2020, 2734--2746). doi: 10.14778/3407790.3407857 and quantum computing communities Xu, 2022. 2022. Quartz: Superoptimization of Quantum Circuits. In Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, June 2022. Association for Computing Machinery, 625--640. doi: 10.1145/3519939.3523433 Xu, 2023. 2023. Synthesizing Quantum-Circuit Optimizers. Proceedings of the ACM on Programming Languages 7, PLDI (June 2023, 835--859). doi: 10.1145/3591254. In all these domains, graphs encode computations that are either expensive to execute or evaluated repeatedly over many iterations, making the optimisation of the execution cost of the computation a primary concern.
Domain-specific heuristics are the most common approach in compiler optimisations Paszke, 2019. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Neural Information Processing Systems. doi: 10.5555/3454287.3455008 Sivara., 2020. 2020. t|ket⟩: a retargetable compiler for NISQ devices. Quantum Science and Technology 6, 1 (November 2020, 014003). doi: 10.1088/2058-9565/ab8e92 – a more flexible alternative are optimisation engines based on declarative sets of graph transformations Bonchi, 2022. 2022. String Diagram Rewrite Theory I: Rewriting with Frobenius Structure. Journal of the ACM 69, 2 (March 2022, 1 - 58). doi: 10.1145/3502719 Bonchi, 2022. 2022. String diagram rewrite theory II: Rewriting with symmetric monoidal structure. Mathematical Structures in Computer Science 32, 4 (April 2022, 511--541). doi: 10.1017/s0960129522000317. In such systems, a graph transformation system (GTS) is used to find a sequence of allowed transformations that rewrite a computation graph given as input into a computation graph with minimal cost.
Transformation systems were first studied on strings Dersho., 1990. 1990. Rewrite Systems, then generalised to trees and terms Bezem, 2003. 2003. Term Rewriting Systems (1. publ. ed.). Cambridge University Press, Cambridge, before being applied to graph domains Ehrig, 1973. 1973. Graph-Grammars: An Algebraic Approach. In 14th Annual Symposium on Switching and Automata Theory, Iowa City, Iowa, USA, October 15-17, 1973. IEEE Computer Society, 167--180. doi: 10.1109/SWAT.1973.11 Rozenb., 1997. 1997. Handbook of Graph Grammars and Computing by Graph Transformations, Volume 1: Foundations. World Scientific König, 2018. 2018. A Tutorial on Graph Transformation. In Graph Transformation, Specifications, and Nets - In Memory of Hartmut Ehrig. Springer, 83--104. doi: 10.1007/978-3-319-75396-6_5. Their use in quantum computing is part of a long tradition of diagrammatic reasoning in physics Penrose, 1964. 1964. Conformal treatment of infinity. General Relativity and Gravitation 43, 3 (901--922 (reprint)). doi: 10.1007/s10714-010-1110-5 Feynman, 1949. 1949. Space-Time Approach to Quantum Electrodynamics. Physical Review 76, 6 (Septempter 1949, 769--789). doi: 10.1103/physrev.76.769, and particularly in quantum mechanics with the advent of categorical quantum mechanics Abrams., 2008. 2008. Categorical quantum mechanics. arXiv: 0808.1023 [quant-ph] Coecke, 2012. 2012. Strong Complementarity and Non-locality in Categorical Quantum Mechanics. In 2012 27th Annual IEEE Symposium on Logic in Computer Science, June 2012. IEEE, 245--254. doi: 10.1109/lics.2012.35 Coecke, 2017. 2017. Picturing Quantum Processes: A First Course in Quantum Theory and Diagrammatic Reasoning. Cambridge University Press. doi: 10.1017/9781316219317.
GTS in quantum computing. In quantum computing, the ZX calculus Coecke, 2008. 2008. Interacting Quantum Observables and other diagrammatic theories that derive from it are particularly important. Properties of GTSs such as completeness, confluence and termination Verma, 1995. 1995. Transformations and confluence for rewrite systems. Theoretical Computer Science 152, 2 (December 1995, 269--283). doi: 10.1016/0304-3975(94)00255-0 are well-studied within this field Backens, 2014. 2014. The ZX-calculus is complete for stabilizer quantum mechanics. New Journal of Physics 16, 9 (Septempter 2014, 093021). doi: 10.1088/1367-2630/16/9/093021 Backens, 2019. 2019. ZH: A Complete Graphical Calculus for Quantum Computations Involving Classical Non-linearity. Electronic Proceedings in Theoretical Computer Science 287 (January 2019, 23--42). doi: 10.4204/EPTCS.287.2 Biamon., 2023. 2023. The ZX-Calculus is Canonical in the Heisenberg Picture for Stabilizer Quantum Mechanics. arXiv: 2301.05717 [quant-ph]. These results have formed the basis for software implementations of circuit optimisations with soundness and performance guarantees Duncan, 2020. 2020. Graph-theoretic Simplification of Quantum Circuits with the ZX-calculus. Quantum 4 (June 2020, 279). doi: 10.22331/q-2020-06-04-279 Kissin., 2020. 2020. PyZX: Large Scale Automated Diagrammatic Reasoning. In Proceedings 16th International Conference on Quantum Physics and Logic, Chapman University, Orange, CA, USA., 10-14 June 2019. Open Publishing Association, 229-241. doi: 10.4204/EPTCS.318.14 Sivara., 2020. 2020. t|ket⟩: a retargetable compiler for NISQ devices. Quantum Science and Technology 6, 1 (November 2020, 014003). doi: 10.1088/2058-9565/ab8e92 Borgna, 2023. 2023. Towards a compiler toolchain for quantum programs.
Great strides are also being made in our theoretical understanding of transformation systems for quantum circuits. Recently, Clément et al. established completeness for the first time Cléme., 2023. 2023. A Complete Equational Theory for Quantum Circuits. In 2023 38th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), June 2023. IEEE, 1--13. doi: 10.1109/lics56636.2023.10175801 as well as minimality Cléme., 2024. 2024. Minimal Equational Theories for Quantum Circuits. In Proceedings of the 39th Annual ACM/IEEE Symposium on Logic in Computer Science, July 2024. ACM, 1--14. doi: 10.1145/3661814.3662088 of a GTS for quantum circuits. A set of circuit transformation rules were presented such that no rule is redundant, and for any two equivalent quantum circuits, there exists a sequence of local transformations rewriting one into the other. Such systems are however not confluent, and this is unlikely to change: most circuit optimisation problems are known to be computationally hard Weteri., 2024. 2024. Optimising quantum circuits is generally hard. arXiv: 2310.05958 [quant-ph].
There is also another inherent tension in integrating diagrammatic calculi into compilers. Diagrammatic theories arise from abstract primitives that admit a simple rewriting logic Heurtel, 2024. 2024. A Complete Graphical Language for Linear Optical Circuits with Finite-Photon-Number Sources and Detectors. arXiv: 2402.17693 [quant-ph] Booth, 2024. 2024. Graphical Symplectic Algebra. arXiv: 2401.07914 [cs.LO] Felice, 2023. 2023. Light-Matter Interaction in the ZXW Calculus. Electronic Proceedings in Theoretical Computer Science 384 (August 2023, 20--46). doi: 10.4204/EPTCS.384.2 Carette, 2023. 2023. Complete Graphical Language for Hermiticity-Preserving Superoperators. In 2023 38th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), June 2023. IEEE, 1--22. doi: 10.1109/LICS56636.2023.10175712; compilers meanwhile must capture all the expressivity, constraints and messiness of real-world hardware targets, with all the edge cases and exceptions that this entails.
An example of this is the ZX circuit extraction problem Quanz, 2024. 2024. Parallel Quantum Circuit Extraction from MBQC-Patterns. In 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 1078-1087. doi: 10.1109/IPDPSW63119.2024.00179 Backens, 2021. 2021. There and back again: A circuit extraction tale. Quantum 5 (March 2021, 421). doi: 10.22331/q-2021-03-25-421: it is in general hard to recover an executable quantum circuit from a ZX diagram as the latter is strictly more general and primitives cannot be mapped one-to-one. Similarly, while simple quantum-classical hybrid computations can be expressed using extensions of ZX Borgna, 2021. 2021. Hybrid Quantum-Classical Circuit Simplification with the ZX-Calculus. In Programming Languages and Systems, Cham. Springer International Publishing, 121--139. doi: 10.1007/978-3-030-89051-3_8 Carette, 2021. 2021. Completeness of Graphical Languages for Mixed State Quantum Mechanics. ACM Transactions on Quantum Computing 2, 4 (December 2021, 1--28). doi: 10.1145/3464693 Koziel., 2024. 2024. Hybrid Quantum-Classical Machine Learning with String Diagrams. arXiv: 2407.03673 [quant-ph], it will never be possible to capture the full breadth and generality of classical CPU instruction sets in a practical and extensible (and algebraically satisfying) way.
Peephole optimisations. As an alternative to the very principled approach of elegant calculi, graph transformations can also be used in the absence of theoretical guarantees in a more ad hoc fashion. Indeed, many existing (classical and quantum) compiler optimisations can already be understood as graph transformations. For as long as compilation has existed, compilers have relied on local transformations of the IR, typically referred to as peephole optimisations McKeem., 1965. 1965. Peephole optimization. Communications of the ACM 8, 7 (July 1965, 443--444). doi: 10.1145/364995.365000 Tanenb., 1982. 1982. Using Peephole Optimization on Intermediate Code. ACM Transactions on Programming Languages and Systems 4, 1 (January 1982, 21--36). doi: 10.1145/357153.357155. Such optimisation strategies are based on the heuristic that local optimisations to the program will produce a well-optimised result overall. Mature compiler ecosystems have developed tools for declarative definitions, as well as automatic generation and correctness proving of peephole optimisations Menend., 2017. 2017. Alive-Infer: data-driven precondition inference for peephole optimizations in LLVM. ACM SIGPLAN Notices 52, 6 (June 2017, 49--63). doi: 10.1145/3140587.3062372 Lopes, 2015. 2015. Provably correct peephole optimizations with alive. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2015. ACM, 22--32. doi: 10.1145/2737924.2737965 Riddle, 2021. 2021. PDLL: a new declarative rewrite frontend for MLIR. (November 2021). Retrieved on 13/01/2025 (RFC on Discourse) from https://discourse.llvm.org/t/rfc-pdll-a-new-declarative-rewrite-frontend-for-mlir/4798. We refer to the classical compiler literature, e.g. Muchni., 2007. 2007. Advanced compiler design and implementation ([Nachdr.] ed.). Morgan Kaufmann, San Francisco, Calif. [u.a.], for more details on the various types of common peephole optimisations.
Quantum compilers adopted peephole-style optimisations from the beginning Cheung, 2007. 2007. Translation techniques between quantum circuit architectures. In Workshop on quantum information processing, 1--3 Steiger, 2018. 2018. ProjectQ: an open source software framework for quantum computing. Quantum 2 (January 2018, 49). doi: 10.22331/q-2018-01-31-49 Sivara., 2020. 2020. t|ket⟩: a retargetable compiler for NISQ devices. Quantum Science and Technology 6, 1 (November 2020, 014003). doi: 10.1088/2058-9565/ab8e92. They encompass some of the most common optimisations in quantum computing, including the Euler Angle reduction Chatzi., 2009. 2009. Improving quantum gate fidelities using optimized Euler angles. Physical Review A 80, 5 (November 2009, 052329). doi: 10.1103/physreva.80.052329, the two-qubit KAK decomposition Tucci, 2005. 2005. An Introduction to Cartan's KAK Decomposition for QC Programmers. arXiv: quant-ph/0507171 [quant-ph] Cross, 2019. 2019. Validating quantum computers using randomized model circuits. Physical Review A 100, 3 (Septempter 2019, 032328). doi: 10.1103/physreva.100.032328 and all gate set rebases contri., 2025. 2025. Documentation: pytket.passes.AutoRebase. Retrieved on 13/01/2025 (TKET docs) from https://docs.quantinuum.com/tket/api-docs/passes.html#pytket.passes.AutoRebase. A quantum-specific flavour of peephole optimisation with close links to GTSs, template matching Maslov, 2008. 2008. Quantum Circuit Simplification and Level Compaction. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 27, 3 (March 2008, 436--444). doi: 10.1109/tcad.2007.911334 Iten, 2022. 2022. Exact and Practical Pattern Matching for Quantum Circuit Optimization. ACM Transactions on Quantum Computing 3, 1 (January 2022, 1--41). doi: 10.1145/3498325, achieved state-of-the-art results for Clifford circuit optimisation Bravyi, 2021. 2021. Clifford Circuit Optimization with Templates and Symbolic Pauli Gates. Quantum 5 (November 2021, 580). doi: 10.22331/q-2021-11-16-580. Recently, quantum peephole optimisations were also proposed that leverage provable state information to perform contextual optimisations Liu, 2021. 2021. Relaxed Peephole Optimization: A Novel Compiler Optimization for Quantum Circuits. In 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), February 2021. IEEE, 301--314. doi: 10.1109/cgo51591.2021.9370310, similar to strength reduction and optimisation with preconditions in classical compilation Lopes, 2015. 2015. Provably correct peephole optimizations with alive. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2015. ACM, 22--32. doi: 10.1145/2737924.2737965.
Internal representations. The graph formalisation of quantum computations we will define in this chapter also draws a lot from the internal representations (IR) of programs in classical compilers. The classical compilation community has found significant advantages in sharing a common standardised IR format. Indeed, while the exact syntax constructs and abstractions vary across programming languages, and, at the other end of the compiler stack, the specific assembly instructions emitted differ between hardware targets, much of the compiler middleware can be broadly shared across use cases. This gave rise to the LLVM Lattner, 2004. 2004. LLVM: A compilation framework for lifelong program analysis & transformation. In International Symposium on Code Generation and Optimization, 2004. CGO 2004.. IEEE, 75--86. doi: 10.1109/CGO.2004.1281665 and, more recently, the MLIR Lattner, 2021. 2021. MLIR: Scaling Compiler Infrastructure for Domain Specific Computation. In 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), February 2021. IEEE, 2--14. doi: 10.1109/CGO51591.2021.9370308 projects, which provide common compiler IRs, along with all the infrastructure compilers typically require: IR transformation tooling, translation into hardware-specific assembly, efficient serialisations, in-memory formats etc.
The idea of adopting LLVM for quantum was championed by QIR QIR Al., 2021. 2021. QIR Specification v0.1. Retrieved on 31/12/24 from https://www.qir-alliance.org/, a standard introducing quantum primitives into the LLVM IR. This was subsequently adopted by many quantum hardware providers for its superior expressive power compared to circuit-based formats QIR Al., 2023. 2023. NVIDIA Joins the QIR Alliance as the Effort Enters Year Two. Retrieved on 02/01/2025 from https://www.qir-alliance.org/posts/year_one_in_review/. Building on top of QIR, an IR specifically for quantum-classical programs was proposed in Mark K., 2025. 2025. HUGR: A Quantum-Classical Intermediate Representation. Retrieved (talk recording) from https://www.youtube.com/live/D8esZrt7ogk?feature=shared&t=5217, with additional soundness guarantees based, among others, on the no-cloning principle of quantum information. In parallel, projects with similar aims have also emerged McCask., 2021. 2021. A MLIR Dialect for Quantum Assembly Languages. In 2021 IEEE International Conference on Quantum Computing and Engineering (QCE), October 2021. IEEE. doi: 10.1109/QCE52317.2021.00043 Ittah, 2022. 2022. QIRO: A Static Single Assignment-based Quantum Program Representation for Optimization. ACM Transactions on Quantum Computing 3, 3 (June 2022, 1--32). doi: 10.1145/3491247 that use the full MLIR and LLVM toolchain.
Challenges of GTS for compilation. Peephole optimisations of compiler IRs have proven to be a fast, general and scalable approach to compilation and code optimisation in practice. However, the optimisation results depend heavily on well-designed transformation orderings and the performance may vary widely across (equivalent) input programs. This is commonly known in compiler research as the phase ordering problem Click, 1995. 1995. Combining analyses, combining optimizations. ACM Transactions on Programming Languages and Systems 17, 2 (March 1995, 181--196). doi: 10.1145/201059.201061. When a compiler can modify code in multiple ways, it must determine which transformations to apply and in what sequence to achieve optimal results Whitfi., 1997. 1997. An approach for exploring code improving transformations. ACM Transactions on Programming Languages and Systems 19, 6 (November 1997, 1053--1084). doi: 10.1145/267959.267960 Liang, 2023. 2023. Learning Compiler Pass Orders using Coreset and Normalized Value Prediction. In Proceedings of the 40th International Conference on Machine Learning. JMLR.org. doi: 10.48550/ARXIV.2301.05104. This is a common design challenge in GTSs, often addressed through mechanisms such as rule controls Heckel, 2020. 2020. Graph Transformation for Software Engineers: With Applications to Model-Based Development and Domain-Specific Language Engineering. Springer International Publishing. doi: 10.1007/978-3-030-43916-3.
This issue is also a key challenge within quantum compilation, as can be verified by comparing the performance of peephole-based compilers with provably optimal circuit synthesis strategies. On problem sizes where exhaustive search is feasible, unitary synthesis tools can sometimes outperform current, mostly peephole-based compilers Sivara., 2020. 2020. t|ket⟩: a retargetable compiler for NISQ devices. Quantum Science and Technology 6, 1 (November 2020, 014003). doi: 10.1088/2058-9565/ab8e92 by up to 50%1 Wu, 2020. 2020. QGo: Scalable Quantum Circuit Optimization Using Automated Synthesis. arXiv: 2012.09835 [quant-ph].
at the cost of many hours of compute, of course. ↩︎
3.2. Computation graphs and linearity
Computation graphs represent the flow of data between operations in a program, with nodes as operations and edges as data dependencies. Widely used in machine learning frameworks and GPU optimisations Bergst., 2011. 2011. Theano: Deep learning on gpus with python. In NIPS 2011, BigLearning Workshop, Granada, Spain Zhao, 2023. 2023. AutoGraph: Optimizing DNN Computation Graph for Parallel GPU Kernel Execution. Proceedings of the AAAI Conference on Artificial Intelligence 37, 9 (June 2023, 11354--11362). doi: 10.1609/aaai.v37i9.26343, they are conceptually equivalent to dataflow graphs used in compiler design, which were pioneered by Feo, 1990. 1990. A report on the sisal language project. Journal of Parallel and Distributed Computing 10, 4 (December 1990, 349--366). doi: 10.1016/0743-7315(90)90035-n and Kahn, 1976. 1976. Coroutines and networks of parallel processes and are now central to most compiler IRs.
In classical computations, these graph representations of computations are essentially term graphs Barend., 1987. 1987. Term Graph Rewriting – sets of algebraic expressions that are stored as trees, combined with an important optimisation known as term sharing. When identical subexpressions appear multiple times, they can be represented as one computation and referenced from multiple locations, creating a directed acyclic graph rather than a term tree Plump, 1999. 1999. Term Graph Rewriting. (October 1999). This sharing enables a more efficient representation. It can also be used as a compiler optimisation to identify subexpressions that can be cached and shared across expression evaluations for a more efficient execution – a technique known as common subexpression elimination (CSE) Cocke, 1970. 1970. Global common subexpression elimination. In Proceedings of a symposium on Compiler optimization -. ACM Press, 20--24. doi: 10.1145/800028.808480.
Each edge of a computation graphs corresponds to a unique value: the output of a previous computation that is being passed on to new operations. These values flow along edges in the graph – hence dataflow graph. Values are immutable: they are defined once and then passed as input to further operations, where they can only be consumed, never modified. In compiler speak, programs expressed using such immutable values are often called single static assignment (SSA) programs Cytron, 1991. 1991. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems 13, 4 (October 1991, 451--490). doi: 10.1145/115372.115320 Rosen, 1988. 1988. Global Value Numbers and Redundant Computations. In Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages - POPL ’88. ACM Press, 12--27. doi: 10.1145/73560.73562. In SSA:
- Every value is defined exactly once,
- Every value may be used any number of times (including zero).
Quantum computing throws this second pillar of SSA into the bin. Values in quantum computations are the result of computations on quantum data, and as such must obey the no-cloning and no-deleting theorems (section 2.1). We call values subject to these restrictions linear1. They introduce the following constraint on valid computation graphs:
Every linear value must be used exactly once.
Linear values change fundamentally how transformations of the computation graph must be specified. Where compilers on classical data can:
- freely share common subexpressions (term sharing),
- undo term sharing, i.e. duplicate shared terms into independent subterms, and
- delegate the identification and deletion of obsolete code to specialised passes (e.g., dead code elimination Cytron, 1991. 1991. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems 13, 4 (October 1991, 451--490). doi: 10.1145/115372.115320 Briggs, 1994. 1994. Effective partial redundancy elimination. ACM SIGPLAN Notices 29, 6 (June 1994, 159--170). doi: 10.1145/773473.178257),
quantum compilers must enforce much stricter invariants on IR transformations – or risk producing invalid programs.
In classical compilers, IR modification APIs (such as MLIR’s PatternRewriter) decouple program transformation from code deletion. Program transformations are specified by copying existing values and introducing new values and operations as needed, while the actual deletion of unused code is deferred to specialized dead code elimination passes. This approach is no longer feasible in the presence of linear values. Computation graphs for quantum computations must adopt proper graph rewriting semantics, in which the explicit deletion of obsolete values and operations is just as much a part of the rewriting data as the new code generation.
The terminology comes from “linear” logic Girard, 1987. 1987. Linear logic. Theoretical Computer Science 50, 1 (1--101). doi: 10.1016/0304-3975(87)90045-4. I apologise for slamming additional semantics on what I recognise is an already very overloaded term. ↩︎
3.3. A graph representation for quantum programs
For the purposes of this thesis, we introduce a simplified graph-based IR for quantum computations that we will call minIR. It captures all the expressiveness that we require for hybrid programs whilst remaining sufficiently abstract to be applicable to a variety of IRs and, by extension, quantum programming languages and compiler frameworks.
MinIR can be thought of as being built from statements of the form
x, y, ... := op(a, b, c, ...)
to be understood as an operation op applied on the SSA values a, b, c, ...
and producing the output values x, y, ....
Computation and dataflow graphs are commonly defined with operations as vertices
and values as edges. To faithfully capture the function signature of op, this
requires storing and preserving an ordering of the incoming and outgoing edges
(also known as port graphs Ferná., 2018. 2018. Labelled Port Graph – A Formal Structure for Models and Computations. Electronic Notes in Theoretical Computer Science 338 (October 2018, 3--21). doi: 10.1016/j.entcs.2018.10.002).
Instead, we adopt the hypergraph formalisation of computation graphs, which is more common within category theory (see string diagrams Seling., 2010. 2010. A Survey of Graphical Languages for Monoidal Categories and their formalisation in the hypergraph category Bonchi, 2022. 2022. String Diagram Rewrite Theory I: Rewriting with Frobenius Structure. Journal of the ACM 69, 2 (March 2022, 1 - 58). doi: 10.1145/3502719 Wilson, 2022. 2022. The Cost of Compositionality: A High-Performance Implementation of String Diagram Composition. Electronic Proceedings in Theoretical Computer Science 372 (November 2022, 262--275). doi: 10.4204/eptcs.372.19). This definition is particularly well-suited for our purposes because it frames the graph transformations of interest to us in the well-studied language of rewriting within adhesive categories Lack, 2004. 2004. Adhesive Categories. In Foundations of Software Science and Computation Structures, Berlin, Heidelberg. Springer Berlin Heidelberg, 273--288. doi: 10.1007/978-3-540-24727-2_20.
Hypergraphs and minIR #
At a minimum, a directed hypergraph – for simplicity sometimes in the following referred to simply as graph – is defined by a set of vertices and a set of (hyper) edges . We will always consider hypergraphs where the edges are directed and the vertices attached to are given by ordered lists. We formalise this incidence relation between vertices in and edges in by writing as the partition over the disjoint sets
and introducing source and target maps for each . Why we write sets in boldface will become clear in a moment.
A directed hypergraph is given by sets and for , along with maps
for all .
Note that in this thesis, as in most common uses of hypergraphs, the sets and will always be finite, and thus for a finite number of only.
For simplicity, we can further omit the subscript of the source and target maps whenever it can be inferred from the domain of definition of the map. For , we call the source vertices of and the target vertices of .
We introduce the notation to signify that there is an edge from to , i.e. there is for some and such that and . We define the equivalence relation of connected vertices, given by the transitive, symmetric and reflexive closure of . The equivalence classes of are the connected components of the graph. We will write , resp. for the connected component that contains the vertex , resp. the edge .
To proceed, it is useful to frame the hypergraph definition in a categorical setting. We write for the presheaf topos of the category , i.e. the category with functors as objects and natural transformations as morphisms. Definition 3.1 can be equivalently restated as:
hypergraphs are objects in the presheaf topos ,
where the category has objects and for and arrows given by (1), now interpreted as morphisms in rather than as functions in . In this framing, a graph is a functor that defines a set for each object of and specifies functions between these sets – one for each arrow in .
This is where the distinction between bold and non-bold typeface comes from: we use bold letters to refer to images in of a hypergraph functor, whereas the non-bold typeface is refers to objects in the indexing category . The distinction between and is less important for morphisms – it will typically be clear from the context. We thus use the same symbols for both.
Linearity constraints #
The introduction of not only gives us a notion of hypergraph homomorphisms – maps between hypergraphs that preserve the structure of the graph. It also provides us with a way to express the linearity constraints that arise from our discussion in section 3.2, and which we must enforce on our computation graphs.
The definition that follows adds the coproduct explicitly as an object of the category (which we did not need to do in Definition 3.1), as we need it as the codomain of the new morphisms and . The adhesitivity of the category does no longer comes for free – we will get back to this in section 3.4.
The category is the category given by objects
and for all . The morphisms are split monomorphisms and the following diagrams commute for all and :
Directed hypergraphs with linearity conditions are objects in the full subcategory given by objects such that
We probably owe an explanation for this definition – at least for the sake of the few computer scientists that are still following us.
First of all, notice that every hypergraph with linearity constraint corresponds to a hypergraph in the sense of Definition 3.1: there is an obvious functor that maps each object and morphism in to the object or morphism with the same name in . By contravariance, we can thus (functorially) map every hypergraph with linearity constraints to the hypergraph .
Another way of looking at this is to realise that by requiring that be split monomorphisms, we obtain that the resulting functions in are injective. Up to isomorphism, we can consider that are subsets of vertices in . A hypergraph with linearity constraints is thus a directed hypergraph with two selected subsets of vertices and .
Vertices within these subsets are special. For every , there exist unique indices , and edge such that . In words, for every there is a unique edge in the hypergraph that has as one of its sources. We then say that is the unique use of vertex . Similarly, vertices in have a unique edge in the hypergraph with as one of its targets – it is the unique definition of .
Typed graphs #
MinIR graphs are strongly typed. We introduce typed graphs for this purpose, a concept for which graph transformation was first formalised in Ehrig, 2004. 2004. Fundamental Theory for Typed Attributed Graph Transformation. In Graph Transformations, Berlin, Heidelberg. Springer Berlin Heidelberg, 161--177. doi: 10.1007/978-3-540-30203-2_13. A type system for directed hypergraphs is just another object . A typed graph is then an object of the slice category , that is to say, typed graphs are morphisms of and morphisms between typed graphs are given by the subset of morphisms of that make the triangle diagram formed by , and commute.
To type hypergraphs with linearity constraints, we do not pick the type system in , as the existence of and morphisms impose restrictions that are too strict. We consider instead the category given by the same objects as , as well as the same morphisms, with the omission of and . There is an obvious functor and thus, by contravariance, every hypergraph with linearity constraints can be mapped to a hypergraph in . We say that a hypergraph with linearity constraints is -typed for a type system if there is a morphism , when interpreting is as an object of .
Two example hypergraphs. Vertices (labelled with capital letters) are circles and hyperedges (labelled with small letters) span between them. Vertices that are attached to an edge in the black half of the circle correspond to source vertices of the edge; those in the white half correspond to target vertices. The functions and map edges to the incident vertices, defining the directed hypergraph. On the left, there are further functions and that map vertices to the unique edge that uses or defines them. This defines a hypergraph with linearity constraints, with and . These cannot be defined on the graph on the right. On the other hand, the edges b and c are in , and are in the domain of functions and , thus defining a child region of a hierarchical graph. Note that it would be invalid to have any edge connecting or to or . The and vertices also have incidence morphisms, not displayed here.
Hierarchical hypergraphs #
A final bit of structure that minIR graphs require is a notion of hierarchy
between regions of the graph. This will be useful to define functions, control
flow blocks such as if-else, or any subroutine that can itself be viewed as an
operation in the computation.
Hierarchical hypergraphs were first proposed in Drewes, 2002. 2002. Hierarchical Graph Transformation. Journal of Computer and System Sciences 64, 2 (March 2002, 249--283). doi: 10.1006/jcss.2001.1790 and further generalised in Busatto, 2005. 2005. Abstract hierarchical graph transformation. Mathematical Structures in Computer Science 15, 4 (July 2005, 773--819). doi: 10.1017/s0960129505004846 Palacz, 2004. 2004. Algebraic hierarchical graph transformation. Journal of Computer and System Sciences 68, 3 (May 2004, 497--520). doi: 10.1016/s0022-0000(03)00064-3. However, we opt to use a more restrictive definition, closer to the notion of flattened hypergraphs of Drewes, 2002. 2002. Hierarchical Graph Transformation. Journal of Computer and System Sciences 64, 2 (March 2002, 249--283). doi: 10.1006/jcss.2001.1790. The reason for this is twofold. Firstly, hierarchical (hyper)graphs are typically defined recursively. It is not obvious under which conditions (and if) such definitions form adhesive categories, although progess in this direction was made in Padberg, 2017. 2017. Hierarchical Graph Transformation Revisited: Transformations of Coalgebraic Graphs. In Graph Transformation, Cham. Springer International Publishing, 20--35. doi: 10.1007/978-3-319-61470-0_2 with the introduction of coalgebraic graphs. As a result, to the extent that graph transformation results can be applied to such structures, it must be done so carefully.
The second, more practical, reason is that the notion of typed graph introduced above cannot be directly lifted to the hierarchical graph setting: while some subset of the hierarchical relation in minIR should be enforced by the type system, the type graph of a nested graph should be identical to the parent’s (as opposed to being itself nested within the type graph of the parent).
It is therefore more convenient for us to encode hierarchy in directed hypergraphs as follows. Note that it is not clear that our definition is adhesive either1 – but at least it is framed as a subcategory of a base category that is.
The category is the category with objects and arrows of along with additional objects for and arrows:
- that are split monomorphisms,
- input arrows , and output arrows .
Hierarchical hypergraph are the objects in the full subcategory given by objects such that
- for any edge of , the set has at most one element.
- the transitive and reflexive closure of for is a partial order on the connected components of .
Here and are the functions with domain defined piecewise as and for all on their respective (disjoint) domains of definition.
The same definition can also be applied to to obtain the category of hierarchical directed hypergraphs with linearity constraints . Similarly, we define the associated category for type systems; however, we do not impose any of the two conditions related to for the type category, i.e. is the full presheaf category, rather than a subcategory of it.
As with the incidence morphisms and , we will drop the subscript for the IO arrows and when it can be inferred from the domain of definition.
Just as in the discussion of Definition 3.2, we interpret the split monomorphisms as equivalent to requiring . Taking over terminology from Drewes, 2002. 2002. Hierarchical Graph Transformation. Journal of Computer and System Sciences 64, 2 (March 2002, 249--283). doi: 10.1006/jcss.2001.1790, we call elements the frames of . For each frame , there is thus a unique input edge and a unique output edge in that have respectively targets and zero sources, and sources and zero targets.
By the first condition we imposed on , the partial function mapping connected components to their parent edge:
is well-defined. We call the subgraphs of that share a same parent a region of 2. The subgraph of vertices and edges without parent is the root region of .
The minIR computation graph #
In minIR, the vertices are the values of the computation, while the hyperedges define the operations. This imposes some constraints for a hypergraph to be a valid minIR graph:
- all values in minIR must have a unique operation that defines them;
- values that are linear must also have a unique operation that uses them;
- the graph must be acyclic, meaning that no value can be defined in terms of itself.
This can be expressed as a hypergraph with linearity constraints by choosing and , where is the subset of linear values.
The following definition then comes as no surprise:
Let be a type system. A minIR graph typed in is an object of that is -typed and such that the adjacency relation is acyclic. We call , the linear values of .
In the context of minIR, relations encode the data flow of values across the computation. The lack of explicit operation ordering differentiates minIR (and HUGR) from most classical IRs, which, unless specified otherwise, typically assume that instructions may have side effects and thus cannot be reordered. All quantum operations (and the classical operations we are interested in) are side-effect free, which significantly simplifies our IR.
Input and output values #
Notice that in Definition 3.4, it is not enforced that every value has a definition, i.e. there might be ; nor that every value with a linear type is in , i.e. if is the typing morphism and are the linear types in the type system, there might be .
This would be easy to fix: we could on the one hand enforce the equality , thus guaranteeing that every value has a unique definition in the graph. On the other hand, we could define as the coproduct , where is a new object introduced to explicitly capture the set of non-linear values. Morphisms in this category would guarantee that the linearity of values is always preserved, and thus in particular the type morphism would map a value to a linear type if and only if it is linear.
Instead, we opt to allow undefined values and unused linear values to be able to express rewrite rules that match sugraphs of minIR graphs within the same category.
For a minIR graph with typing morphism , we call the set the input values of and its output values, where are the linear values in .
If , we say that is IO-free.
Note that by this definition, an output value in always has a linear type! This is because non-linear values do not need to be treated specially when they are outputs: unlike linear values that must always be used in a non-output position, non-linear values may have no outgoing edge, in which case they are simply discarded in the computation.
Structured control flow #
The operations and values of a minIR graph define the data flow of a program. However, a program must also be able to control and change the data flow at run time in order to express loops, conditionals, function calls etc. This is the program control flow, which minIR expresses using regions and so-called structured control flow.
Using regions, any non-trivial control flow (function calls, conditionals, loops etc.) is captured by a frame, a “black box” operation within the data flow of the program. Its implementation is then defined in the nested region of the frame. This can be used for function calls, but also for branches of control flow. A simple function call, that unconditionally redirects the control flow to the operations within a nested region, could for example be represented as follows:
In this figure and below, circles are SSA values (the vertices of the
hypergraph), while the edges spanning between them are the operations. Edges
attached to the white half of circles are value definitions, while hyperedges
attached to the black half of circles are value uses. The call and +
operation can be read left to right: for instance, the two values x and y on
the left of call are inputs (the operation uses those values, and is thus
attached to the black half of the circles), whereas the x + y value on the
right is the output of the call operation (the operation defines this value,
which is thus attached to the white half of the circle).
Dashed arrows indicate hierarchical dependencies that map the frame edge to the two input and output edges in the child region; dashed rectangles mark the non-root regions of the graph.
Importantly, the frame edge representing the call operation must intuitively
“forward” all its input values to the in operation of the child region, and
similarly passes on the value at the out operation to the output value of the
call output in the parent graph. Passing function arguments and retrieving
returned values in this fashion will be very familiar to any computer scientist.
Unlike most programming languages, this is also how in minIR values are passed
to and from any control flow constructs we would wish to model.
In terms of graph structure, this relation between values in parent and child
regions means that the arity and types of the inputs and outpus of call fix
the signatures of the child in and out operations.
Definition 3.3 already ensures that the input and
output arities of in and out are correct. The correct typing of these input
and output values will be ensured by the type system, which we discuss in a
separate section below.
To handle constructs that require more than one child region, such as an
if-else statement, we can use frames that have zero input and one output:
The output of the ifblock is intuitively a higher-order type representing an
operation that takes two inputs and ouputs the sum.
An if-else statement might then look as follows:
The if and else blocks must expect the same input and output values. This is
key to respecting any linearity constraints that values passed to ifelse might
be subject to. By definition, all operations that use or define a value will
be in the same region – in other words, values are only available within their
defining region. This in effect implements “variable scoping”. With some
imagination, this construction can easily be adapted to model loops, complex
control flow graphs, or any other control flow structures.
Why not plain branch statements? #
There is a simpler – and at least as popular – way of expressing control flow in IRs without requiring regions and operation hierarchies, using branch statements3. For instance, LLVM IR provides a conditional branch statement
br i1 %cond, label %iftrue, label %iffalse
that will jump to the operations under the iftrue label if %cond is true and
to the iffalse label otherwise.
This is a simple and versatile approach to control flow that can be used to express any higher-level language constructs. Unfortunately, conditional branching does not mix well with linear values.
Linearity, as defined in Definition 3.4, is a simple
constraint to impose on minIR graphs. In the presence of conditional branching,
however, the constraint would have to be relaxed to allow for single use in
each mutually exclusive branch of control flow. For instance, the following two
uses of b should be allowed (in pseudo-IR):
b := h(a)
<if cond> c := x(b)
<else> d := h(b)
This is a much harder invariant to check on the IR: linearity would no longer be enforceable as a syntactic constraint on the minIR graph as in Definition 3.4, but would instead depend on the semantics of the operations to establish mutual exclusivity of control flow4. Forbidding arbitrary branching in minIR and resorting instead to structured control flow as described above to express control flow is just as expressive and gives the linearity constraint a much simpler form.
Type graph #
We have seen how minIR graphs impose some structure on the types of
computations that can be expressed: a linear value cannot be used by two (or
zero) operations, frames will always have a unique in and out operation in
their child region with correct arities, etc.
However, without a “good” type system and associated semantics, it is still
possible to express nonsensical programs: we have mentioned for instance earlier
that it is up to the type system to enforce that the types of the in and out
operations match the types of the frame. Similarly, it is possible to construct
programs that break linearity: take the ifelse operation discussed in the
previous section, but now replace its semantics to be do-in-parallel, i.e. it
will execute both the if-block and else-block in parallel on the inputs that
it is given. This would violate the linearity of its inputs, but would
nonetheless be a syntactically valid minIR graph!
To resolve this we present here some typed operations, along with their semantics, that can be used to construct well-behaved type systems: programs typed in this system model the kind of quantum programs that we are interested in expressing and are guaranteed to be valid computations. Categorising all valid constructions or an exhaustive enumeration of conditions that type systems must satisfy to guarantee the validity of programs is beyond the scope of this thesis. It is in practice often straightforward to combine and extend the elements presented here to support further custom syntactic constructs and types.
Basic types and operations #
The most elementary types in our computations are Bits and Qubits. The
former is typically known as a Boolean and represents the purely classical
values 0 and 1. The latter is the canonical quantum example of a linear
type. Indeed, just like values in minIR graphs, the type system in
distinguishes between linear and
non-linear types.
Other typical classical types such as integers, floats, strings, custom
algebraic data types (ADT) etc could also be introduced as required. In the
figure below, we for instance introduce the Angle type to represent rotation
angles that parametrise quantum gates. Further examples of linear types, on the
other hand, include higher-dimensional qudits, but also any ADT that contains a
linear type within it.
As we saw in section 2.1, the number of input qubits in pure
quantum operations will match the number of output qubits: the single-qubit h
(hadamard gate) and two-qubit cx (controlled NOT) operations thus have one or
two Qubits as both inputs and outputs rz (Z rotation) is also a single-qubit
operation, but it takes an additional input of type Angle to specify the
rotation angle.
On the pure classical side, we are free to add any side-effect free operations
on our types; in our example we model addition + on Angles and negation
not on Bits. In the type system
, each type is represented
by a single vertex.
In our example, we thus have three vertices:
We introduce a different colour for each type. Operations such as cx are
represented by a hyperedge with two sources on Qubit and two targets on
Qubit. As in the previous diagrams, we can distinguish operation inputs from
outputs by whether they are attached to the dark or light half of the type
vertex: the rz operation thus has one Qubit input, one Angle input and one
Qubit output.
As you can tell from the diagram, whilst Qubit is a linear type in the type
system, it is not a linear value in the sense of a minIR graph: the Qubit type
has multiple uses and defines in the cx operation alone. This is the key
difference between and
.
Qubit allocation and measurement #
We also introduce non-pure quantum operations qalloc and measure which
respectively “create” a qubit (so no input, one Qubit output) and “destroy” it
(one Qubit input, one Bit output – depending on whether the qubit was
projected onto the or state). Remember that the reason these
operations seem to “break” the laws of pure quantum physics is because they
result from interactions with the classical environment.
measure is fundamental, as it connects quantum values with classical ones!
Region definition and structured control flow #
Our type system is so far missing a crucial aspect of minIR: the hierarchical structures. For this we need frame types, i.e. frames in the type graph. We must introduce a distinct type for each possible type signature of a frame. To keep this as simple as possible, we will introduce exactly one type for each signature.
If we write for the set of types in our type system (i.e. Bit, Qubit and
Angle in our example), then a type signature of an edge is given by a pair
of ordered lists of types . For each such
pair, we introduce
- the frame type
regiondef<I, O>, - the in and out types
in<I,O>andout<I,O>, - along with a new non-linear type
Region<I, O>, the higher-order type representing a region with inputs and outputs .
The regiondef<I, O> op takes zero inputs and returns one output
Region<I, O>, whereas in<I,O> takes zero inputs and returns values of type
and out<I,O> takes inputs of type and returns nothing. For instance,
for Qubit, Qubit and Qubit, Bit, we have the
following type graph.
Note that there is an important distinction in
in comparison to
: there is no notion of regions in the type system:
the Qubit and Bit types in the above diagram would be in the child region of
regiondef<I, O> if it were a graph in , but in the
type system, they might also be used by other operations in other regions (such
as cx, rz, h etc. defined earlier).
Using the Region<I,O> types, it is then easy to define typed operations for
any structured control flow of interest, such as the if-else example above.
The following figure gives an overview of the entire type system of our example.
For display purposes, we have included multiple copies of each type vertex; we
remind the reader that in the actual type graph, all circles of the same type
(colour) are one and the same.
A complete minIR type graph, following the example in this section. Value vertices with the same label (and same colour) form a single vertex in the type graph. They have been split into multiple vertices in this representation for better readability. The data types and op types with the <I,O> suffix are parametrised on the signature type for .
An example minIR program #
Taking a step back, let us make the introduced ideas more concrete through an example. We demonstrate how a simple program written in textual form can be translated and expressed as a minIR graph. All statements are of the form
x, y, ... := op(a, b, c, ...)
where a, b, c etc are the SSA values passed to op (or used by op),
and x, y etc are the SSA values returned by op (or defined by op). We
use curly bracket to define the child region of a regiondef operation. A valid
minIR program might then look as follows:
1main := regiondef<(Qubit, Qubit), (Qubit, Bit)> {
2 q0, q1 := in()
3
4 q0_1 := h(q0)
5 q0_2, q1_1 := cx(q0_1, q1)
6
7 m0 := measure(q0_2)
8
9 ifregion := regiondef<(Qubit,), (Qubit,)> {
10 q1 := in()
11 out(q1)
12 }
13 elseregion := regiondef<(Qubit,), (Qubit,)> {
14 q1 = in()
15 q1_1 := h(q1)
16 out(q1_1)
17 }
18 q1_2 := ifelse(m0, q1_1, ifregion, elseregion)
19
20 out(q1_2, m0)
21}
Note that the in() and out(..) operations are only allowed within nested
regions (as required by the type system). We have omitted the type parameters on
these operations, as it mirrors exactly the paremeter of the regiondef.
It corresponds to the two minIR graphs on the following page. We use “wiggly hyperedges” that stretch between values, as in the first figure. They may look unusual if you are used to computation graphs. One can opt to draw the same graph with boxes for hyperedges and wires for values, yielding the second figure. The two representations are equivalent, but the rewriting semantics are most explicit when viewing values as vertices.
An example of an IO-free minIR graph. The vertex colours indicate their types in the type system presented in the previous figure. The main, ifregion and elseregion ops are all of op type regiondef (with type parameters omitted), labelled here with custom names for clarity. The type parameters of the ifelse, in and out op type have similarly been omitted. All other operation types are given as labels on the edges.
An equivalent representation of the computation above, now representing operations as boxes and values as wires. The arrow direction indicates the flow from value definition to value use(s). Dashed arrows have been changed to point to regions instead of individual operations.
Differences to the quantum circuit model #
We conclude this presentation of minIR by highlighting the differences between this IR-based representation and the quantum circuit model that most quantum computing and quantum information scientists are familiar with5.
When restricted to purely quantum operations and no nested regions, the string diagram representation of a minIR graph (i.e. operations as boxes and values as wires) looks very similar to a quantum circuit. There is, however, a fundamental shift under the hood from reference to value semantics – to borrow terminology from C++.
In the reference semantics of quantum circuits, operations are typically thought of as “placed” on a qubit (the “lines” in the circuit representation), for instance, by referring to a qubit index. This qubit reference exists for the entire computation duration, and the quantum data it refers to will change over time as operations are applied to that qubit.
In the value semantics of computation graphs and SSA, on the other hand, qubits only exist in the form of the data they encode. When applying an operation, the (quantum) data is consumed by the operation and new data is returned. Given that the input data no longer exists, linearity conditions are required to ensure that no other operation can be fed the same value.
To make the difference clear, compare the program representations of the following computation:
Quantum circuit (pytket)6
import pytket as tk
circ = tk.Circuit(2)
circ.H(0)
circ.CX(0, 1)
circ.X(1)
SSA (minIR)
q0_0, q1_0 := in()
q0_1 := h(q0_0)
q0_2, q1_1 := cx(q0_1, q1_0)
q1_2 := x(q1_1)
out(q0_2, q1_2)
In value semantics, it becomes much harder to track physical qubits across their lifespan. This has very practical implications: without the convenient naming scheme, it would, for example, be non-trivial to count how many qubits are required in the SSA representation of the computation above. However, it is a drastically simpler picture from the point of view of the compiler and the optimiser – hence its popularity in classical compilers. When operations are defined based on qubit references, the compiler must carefully track the ordering of these operations: operations on the same qubit must always be applied in order. Through multi-qubit gates, this also imposes a partial ordering on operations across different qubits that must be respected.
SSA values remove this dependency tracking altogether: the notion of physical qubit disappears, and the ordering of statements becomes irrelevant. All that matters is connecting each use of a value (i.e. an input to an operation) with its unique definition, the output of a previous operation. In other words, the global ordering imposed by reference semantics is replaced by a causal order on the diagram Kissin., 2019. 2019. A categorical semantics for causal structure. Logical Methods in Computer Science Volume 15, Issue 3 (August 2019). doi: 10.23638/lmcs-15(3:15)2019.
All the concepts of minIR embed themselves very easily within the MLIR-based quantum IRs and the HUGR IR Mark K., 2025. 2025. HUGR: A Quantum-Classical Intermediate Representation. Retrieved (talk recording) from https://www.youtube.com/live/D8esZrt7ogk?feature=shared&t=5217. In this sense, our toy IR serves as the minimum denominator across IRs and compiler technologies so that proposals and contributions we are about to make can be applied regardless of the underlying technical details.
By waiving goodbye to the circuit model, we have been able to integrate much of the theory of traditional compiler design, bringing us in the process much closer to traditional compiler research and the large-scale software infrastructure that is already available. This gives us access to all the classical optimisation and program transformation techniques developed over decades. Using structured control flow, we were also able to model linear resources such as qubits well – by using value semantics and SSA, checking that no-cloning is not violated is as simple as checking that each linear value is used exactly once.
Finally, this new design is also extremely extensible. Not only does it support arbitrary operations, but the type system is also very flexible. There is dedicated support for linear types, but this does not have to be restricted to qubits: lists of qubits could be added or even, depending on the target hardware, higher dimensional qudits, continuous variable quantum data, etc.
Note that a region may not be a connected subgraph. Albeit, it is a simple exercice to convince yourself that any non-root region contains either one or two connected components. ↩︎
You may know this from prehistoric times as the
gotostatement, in languages such as Fortran, C, and, yes, even Go. ↩︎You might be thinking “oh, but all that is required here are phi nodes!”, if you are familiar with those. No – you’d also need a sort of “phi inverse”. Besides, see this discussion for more arguments on why no phi nodes. ↩︎
Note that these comments apply specifically to characteristics of quantum circuits. Other diagrammatic representations of quantum processes in use, such as string diagrams, quantum combs etc may not share the same properties. ↩︎
This is python code:
pip install pytket. ↩︎
3.4. Graph transformation in minIR
As discussed in section 3.2, computation graphs with linear values, such as minIR, must adopt strict graph transformation semantics to ensure that linear constraints are satisfied at all times. In this section, we use the minIR graph category presented in the previous section to define transformation semantics that lean on the double pushout (DPO) Ehrig, 1976. 1976. Parallelism of manipulations in multidimensional information structures and sesqui-pushout (SqPO) Corrad., 2006. 2006. Sesqui-Pushout Rewriting. In Graph Transformations, Berlin, Heidelberg. Springer Berlin Heidelberg, 30--45. doi: 10.1007/11841883_4 semantics in adhesive categories Lack, 2005. 2005. Adhesive and quasiadhesive categories. RAIRO - Theoretical Informatics and Applications 39, 3 (July 2005, 511--545). doi: 10.1051/ITA:2005028.
Adhesivity of hypergraph categories #
The natural place to start this section is by studying which of the categories defined in section 3.3 are adhesive. From adhesivity follows that transforming graphs using DPO and SqPO constructions is well-defined and unique, at least in the regimes of interest to us.
A category is said to be adhesive if it has all pullbacks and pushouts along monos, as well as some compatibility conditions between them, the so-called “Van Kampen squares”. We refer to the literature (e.g. Lack, 2005. 2005. Adhesive and quasiadhesive categories. RAIRO - Theoretical Informatics and Applications 39, 3 (July 2005, 511--545). doi: 10.1051/ITA:2005028) for a complete definition. For our purposes, the following two results are sufficient:
- Every presheaf topos is adhesive (Corollary 3.6 in Lack, 2005. 2005. Adhesive and quasiadhesive categories. RAIRO - Theoretical Informatics and Applications 39, 3 (July 2005, 511--545). doi: 10.1051/ITA:2005028);
- Every full subcategory of an adhesive category is adhesive if the pullbacks and pushouts in of objects in are again in (a simple result; if the Van Kampen squares commute in , they must commute in ).
A first result immediately follows from the first result:
Proof
It is a presheaf.
This does not immediately generalise to , as unlike , Definition 3.2 imposes that be a coproduct. However, the result still holds:
Proof
is a full subcategory of the adhesive category . We must show the existence of pullbacks and pushouts along monos in .
Pullbacks. Consider a pullback of in , with . We must show that is in . Colimits are computed pointwise in presheaves, so we know that is the pullback of in . If we can show that is the coproduct of for , then we are done.
Let . Because and are coproducts in Set, i.e. a disjoint union, there must be such that and . By naturality of and , it follows that and . But by commutativity of the pullback diagram, , and thus and . We conclude by unicity of the pullback that and thus .
Pushouts. The same argument as for pullbacks also applies to pushouts: given a pushout of in with , an element that makes the pushout square commute must have preimages in and for some . Thus the pushout distributes over the coproduct, and we can conclude that is the coproduct of pushouts.
The same argument also applies to 1.
Now to the spicy stuff:
Proof
is a presheaf – hence adhesive.
The following pushout square shows that cannot be adhesive: the pushout square is valid in , but the pushout at the bottom right is not in , because the child regions cannot each be assigned a unique parent.
Double pushout semantics #
From Proposition 3.3, it follows that minIR graph transformations can be performed through the double pushout (DPO) construction Ehrig, 1976. 1976. Parallelism of manipulations in multidimensional information structures in the category.
A transformation rule in an adhesive category is a span . For objects , we then write or if there is a matching morphism and a context object along with morphisms and such that the following diagram commutes and both squares are pushouts:
If the DPO transformation exists for some rule and match , then we say is a valid DPO rewrite.
To ensure that a DPO rewrite is valid in minIR, we must impose certain conditions. Let be an IO-free minIR graph, i.e. , there is a morphism in for some type system and .
A DPO rewrite is a valid minIR DPO rewrite if there is a transformation in and
- is left-mono, i.e. the morphism is mono,2
- the pushout complement and pushout also exist in the slice category ,
- satisfies the hierarchy condition of Definition 3.3,
- is IO-free.
Proof
We know by construction that . We must show that further satisfies the constraints to be an object in the full subcategory of minIR graphs.
The first condition is standard in DPO and guarantees that and are unique if they exist.
The third condition we impose on corresponds directly to the constraint that defines hierarchical graphs in . The fourth condition ensures that valid minIR DPO rewrites map IO-free graphs to IO-free graphs.
Finally, the second condition is imposed to ensure well-typedness of . The functor that forgets the and morphisms is a left adjoint (it possesses a right Kan extension defined pointwise), and thus preserves colimits. The images of and thus form pushout squares in , and by unicity, must match the pushout squares in . Hence is well-typed.
The restriction to rewrites of IO-free graphs is not a restriction of
generality: if we are interested in rewriting computations with inputs and
outputs, we can always express them as IO-free graphs by adding input and
output ops with the values in as outputs, respectively as inputs. We
assign them dedicated types distinct from all other operations; these operations
will never be matched by transformation rules and can be removed at the end of
rewriting.
Generalising to sesqui-pushouts #
We restricted minIR rewrites to DPO transformations obtained form left-mono rules, to ensure that the construction is unique. This excludes rules that may identify two values in but split them into two different values in . Such rules allow for cloning values, which is a useful transformation in minIR for non-linear values. An example of a transformation rule that we would like to allow in minIR:
For this example we added a 2x operation that multiplies an angle value passed
as input by two. The transformation rule replaces a rotation of angle
by two rotations of angle by cloning the input angle.
Such semantics are possible using the sesqui-pushout construction (SqPO) by Corradini et al. Corrad., 2006. 2006. Sesqui-Pushout Rewriting. In Graph Transformations, Berlin, Heidelberg. Springer Berlin Heidelberg, 30--45. doi: 10.1007/11841883_4. We can reuse the same notation: when DPO is restricted to left-mono rules as we have done, SqPO is a generalisation of DPO (i.e. the construction coincides whenever the DPO exists).
A transformation rule in an adhesive category is a span . For objects , we then write or if there is a matching morphism and a context object along with morphisms and such that is the final pullback complement of and the right square is a pushout:
If the SqPO transformation exists for some rule and match , then we say is a valid (SqPO) rewrite.
The left square is redundant in the diagram above, as it follows from the requirement that be the final pullback complement (FPC). It is kept to highlight the similarities to DPO. As the commuting diagram indicates, the final pullback complement (FPC) construction forms a pullback square. Furthermore, unlike pushout complements, the FPC is defined by a universality property that ensures uniqueness if it exists. We refer to Corrad., 2006. 2006. Sesqui-Pushout Rewriting. In Graph Transformations, Berlin, Heidelberg. Springer Berlin Heidelberg, 30--45. doi: 10.1007/11841883_4 for the exact FPC construction.
With SqPO, we can define the set of valid minIR rewrites as given by the SqPO transformations in satisfying the relaxed set of conditions
- the pushout complement and pushout also exist in the slice category ,
- satisfies the hierarchy condition of Definition 3.3,
- is IO-free.
We conclude this section with a discussion of some of the properties of minIR transformations using SqPO (referring again to Corradini Corrad., 2006. 2006. Sesqui-Pushout Rewriting. In Graph Transformations, Berlin, Heidelberg. Springer Berlin Heidelberg, 30--45. doi: 10.1007/11841883_4 or König, 2018. 2018. A Tutorial on Graph Transformation. In Graph Transformation, Specifications, and Nets - In Memory of Hartmut Ehrig. Springer, 83--104. doi: 10.1007/978-3-319-75396-6_5 for a more detailed explanation of the concepts discussed):
Deletion in unknown context. A key difference between DPO and SqPO transformations is that SqPO transformations on graphs will delete edges attached to a vertex that is deleted by the transformation rule (i.e. but of the rule). The DPO transformation on the other hand is only well-defined when all edges incident to are in the image of and thus explicitly deleted (this is known as the dangling condition).
As minIR rewrites follow SqPO semantics, transformation rules such as the following are allowed:
Here denotes the multiplication of angles and the zero angle. Any operation that would be connected to the starred value on the left would be deleted by this rule. However such an implicit operation deletion only yields valid minIR graphs if all incident values are non-linear and none of the target values of the deleted operation are used.
Non-left-mono rules. As discussed in the introduction to SqPO, the
cloning of values is allowed in minIR rewrites. However, linear values may never
be cloned (the FPC or pushout will not exist in these cases). Thus any minIR
transformation rule will be left-mono on linear values. It must further be
left-linear on all (linear and non-linear) values in that are mapped to
outputs in : if a value is produced by op applied to , then cloning
and 'op will result in two definitions of .
Non-right-mono rules. Non-right-mono rules are allowed in both DPO and SqPO. They result in vertex merges. In minIR, the situation for right-mono is symmetric to left-mono: the map must be mono on linear values (otherwise the same value will have multiple uses or definitions) and it must be mono on all values in that are mapped to inputs in (otherwise a value in the rewritten minIR graph will have more than one value definition).
In fact, a much simpler argument applies: the category is isomorphic to the presheaf category , where is obtained from by removing the object . Adhesivity follows. ↩︎
This is often called left-linear in the literature. We avoid this term in this thesis to avoid confusion with the linearity property of values in minIR. ↩︎
3.5. MinIR rewriting, operationally
The previous section proposed to view minIR rewrites as the result of a (DPO or SqPO) graph transformation. This yields valid rewriting semantics elegantly (and with little effort!). However, the conditions that must be imposed on the transformation to be valid, along with the fact that pushouts may not exist mean that the existence of a rewrite given a transformation rule and a match is not guaranteed.
In this section, we address this by considering a more restricted notion of minIR rewriting, for which the existence of the right-hand side of the rewrite is guaranteed. In addition, in place of the categorical presentation of the last section, we express the rewriting operation operationally, i.e. as data and a procedure on sets that translates directly into an algorithmic implementation.
We find that this rewrite definition is sufficient in practice. We conclude the section with an example of how more complex rewrites can be achieved by composition of simpler rewrites that can be expressed in this framework.
Graph glueings and rewrites #
Throughout, we consider graph glueings on disjoint vertex and (hyper)edge sets. To underline this, we will use the symbol to denote disjoint set unions.
As we will be working exclusively with vertex and edge sets in this section (as opposed to the objects in the indexing category), we will drop the bold typeface for sets, writing e.g. instead of for the set of vertices of a hypergraph.
Finally, all minIR graphs in this section are IO free.
We define local graph rewrites in terms of graph glueings. Consider first the case of two arbitrary graphs and , along with a relation . Let be the equivalence relation induced by , i.e. the smallest relation on that is reflexive, symmetric and transitive, and satisifes for all and ,
Then, we can define
- is the set of all equivalence classes of , and
- for , is the equivalence class of that belongs to.
The glueing of and according to the glueing relation is given by the vertices and the edges
We write the glueing graph as .
In other words, the glueing is the disjoint union of the two graphs, with identification (and merging) of vertices that are related in .
This allows us to define a rewrite on a graph :
A rewrite on a graph is given by a tuple , with
- is a graph called the replacement graph,
- is the vertex deletion set,
- is the edge deletion set, and
- is the glueing relation, a partial function that maps a subset of the deleted vertices of to vertices in the replacement graph.
The domain of definition is known as the boundary values of .
A graph rewrite per this definition can always be generated by a single pushout (SPO) transformation Löwe, 1991. 1991. Extended algebraic graph transformation. Retrieved from http://d-nb.info/910935696.
- define as the graph . Then the injection is the match morphism ;
- the partial map maps a subset of to vertices in the replacement . By injectivity of the match morphism, it also defines a partial map .
We opted for SPO-like semantics in this definition, as they are the simplest to write in set-theoretic terms and coincide with DPO and SqPO in our restricted domain of interest.
The result of the rewrite is computed by gluing the right-hand side to the context subgraph of given by
The partial function is a special case of a glueing relation , and thus defines a glueing of with . The rewritten graph resulting from applying to is
An example of a graph rewrite is given in the next figure. This is equivalent to an SPO transformation with the graph induced by on the left-hand side, the graph on the right-hand side and the partial map given by .
Application of a graph rewrite. On the left, the original graph along with the replacement graph (grey box). On the right, the rewritten graph . Only the vertex has been deleted, as other vertices in are in the boundary (in orange). The (singleton) edge deletion set is red. The blue edge connects a vertex of to a boundary vertex, and is thus also present on the right-hand side. The purple edge, on the other hand, connects a vertex of to a non-boundary vertex of , and is thus deleted.
When there are no edges between and (purple in the example above), this definition corresponds to graph rewrites that can be produced using DPO transformations (see discussion in section 3.4). Otherwise, such edges are deleted.
The notions of graph glueing and graph rewrite can straightforwardly be lifted to hypergraphs and, by extension, to minIR graphs. Notice that in this case, values are glued together, not operations (the former were defined as the graph’s vertices, the latter as its hyperedges).
However, the glueing of two valid minIR graphs – and the result of applying a valid rewrite – may not be a valid minIR graph. Glueing two values of a linear type, for instance, is a sure way to introduce multiple uses (or definitions) of it. Thus, we must be careful to only consider glueings and rewrites of minIR graphs that preserve all the constraints we have imposed in Definition 3.4.
Ensuring rewrite validity: interfaces #
As a sufficient condition for valid minIR rewrites, we introduce minIR interfaces, a concept closely related to the “hypergraph with interfaces” construction of Bonchi, 2017. 2017. Confluence of Graph Rewriting with Interfaces or the supermaps of quantum causality Hefford, 2024. 2024. A Profunctorial Semantics for Quantum Supermaps. In Proceedings of the 39th Annual ACM/IEEE Symposium on Logic in Computer Science, July 2024. ACM, 1--15. doi: 10.1145/3661814.3662123. We eschew the presentation of holes as a slice category in favour of a definition that fits naturally within minIR and is sufficient for our purposes.
Let be a -typed minIR graph with data types and linear types . Consider type strings . We define the index sets
corresponding respectively to the set of all indices into and the subset of indices of linear types. For any , we denote by the type at position i in .
We define a partial order 1 on where and say that can be coerced into if there exists an index map such that
- types are preserved: , and
- is well-defined and bijective on the restriction to indices of linear
types
Let be a set of data types. An interface is a pair of type strings .
We say that an interface can be coerced into an interface , written , if and .
We can define the interface associated with an operation in a minIR graph by considering the values used and defined by . Calling the type morphism on and assuming to be an operation in with inputs and outputs, we define the interface of in as the pair of strings in
Similarly, we can assign interfaces to subgraphs of minIR graphs:
Consider a subset of values and operations and . Define the use and define boundary sets
The tuple of is called a minIR subgraph of if there exists a region of such that all boundary values of are in :
We write to indicate that is a minIR subgraph of .
Note that is exactly the set of inputs in the non-IO free minIR graph given by the subgraph of the minIR graph. is a superset of the outputs of : it includes all linear values in that do not have a use in , but also any non-linear value that has a use outside of .
Unlike interfaces, subgraph boundary values are not ordered. An ordering of is a string along with a bijective map
If there are strings and orderings of and
then we can set and in complete analogy to operations. We will write and for the strings and respectively. We say that the subgraph implements the interface
where the type morphism was extended element-wise to strings .
Remark, though, that unlike operations, the same subgraph may implement more than one interface as a result of various choices of orderings and .
As mentioned, the subgraph forms a non-IO free minIR graph. We can always construct an IO-free minIR graph from by adding two operations and in the root region respectively in and inputs-outputs, defined by
We call the resulting graph an interface graph. It implements the
interface if implements . Calling to mind the illustrations of
section 3.3, looks like one of the nested regions
within regiondef operations that we were considering.
MinIR operation rewrite #
Consider
- an operation in a minIR graph with values
- an interface graph with values and its associated subgraph , such that implements an interface
- the index maps and that define the generalisation (per Definition 3.10).
We can define a glueing relation
This is almost enough to define a rewrite that replaces the operation in with the values and operations of – the interface compatibility constraint that we have imposed ensures that the resulting minIR graph is valid. Unfortunately, is not a partial function as required by Definition 3.4.
This is resolved in the following proposition:
Let , and such that , as defined above. Then
i.e. the graph obtained by removing the operation from the glueing of and along , is a valid minIR graph.
There is a graph with values and a partial function such that the graph (5) is the graph , obtained from the rewrite
We call the rewrite of into .
The definition of the rewrite of into a graph behaves as one would expect – the only subtleties relate to handling non-linear (i.e. copyable) values at the boundary of the rewrite. The following example illustrates some of these edge cases.
Rewriting operation in the graph (top left) into the operations and of the graph (bottom left). Coloured dots indicate the index maps and from inputs of to inputs of , respectively from outputs of to outputs of .
When the index maps and are not injective (yellow and green dots), values are merged, resulting in multiple uses of the value (i.e. copies). This is why the index maps must be injective on linear values (dots in shades of blue). Value merging also happens when a value is used multiple times in (yellow and red dots). This will never happen with linear values (as they can never have more than one use in ), nor with any value definitions (the same value can never be defined more than once). Finally, values not in the image of or (purple dot) are discarded. This case is also excluded for linear values by requiring surjectivity.
Proof
We start this proof with the explicit construction of and . Define as the smallest equivalence relation such that
Then we define , the graph obtained by glueing together values within the same equivalence class of .
Claim 1: is a valid minIR graph.
Claim 1 follows from the observation that only values of non-linear types are glued together. If , then either or there exist such that
Define as the subgraph obtained from by removing the operations . Let be the set of values of (and of ). Writing for the equivalence class of that belongs to, we can define as:
Claim 2: is a partial function .
In other words, for all , then . Let and be values in . First of all, for all , otherwise is not acyclic. So either , or , but not both.
The simpler case: if , then there exists such that . Furthermore is unique because by minIR definition, has a unique definition in . It follows from (4) that and hence .
Otherwise, there exists and such that and as well as . By definition of , we have , and thus
proving Claim 2.
Claim 3: is given by .
It follows directly from our construction of and that the equivalence classes of (the smallest equivalence relation closure of) is equal to the equivalence classes of (the smallest equivalence relation closure of) . The claim follows by Definition 3.8 and the definition of .
And finally, Claim 4: is a valid minIR graph.
Per Definition 3.4, We must check four properties: (i) every value is defined exactly once, (ii) every linear value is used exactly once, (iii) the graph is acyclic, and (iv) every region has (at most) one parent.
(iii) follows from the fact that and are acyclic and a single operation in is replaced: any cycle across and would also be a cycle in by replacing the subpath in with . (iv) follows from the fact that and are in the root region of , by definition of interface implementation. (i): removing from removes the unique definitions of all values that are targets of . Each such value is glued to a unique value in – the new and unique definition of in (ii) follows from the same argument as in (i), but relying on injectivity of on linear values to establish uniqueness.
Arbitrary minIR rewrites #
We have so far defined rewrites of single operations into graphs . We can generalise these rewrites to rewrite subgraphs , provided the minIR subgraphs satisfy some constraints. We require for this a notion of convexity, as discussed in Bonchi, 2022. 2022. String diagram rewrite theory II: Rewriting with symmetric monoidal structure. Mathematical Structures in Computer Science 32, 4 (April 2022, 511--541). doi: 10.1017/s0960129522000317.
As usual, let us consider a minIR graph with values , linear values , edges , the incidence maps and as well as their inverses and . Consider further a subgraph of that we will now call , to distinguish from .
Let us further define the partial morphism that maps a value to the parent of the region of .
A minIR subgraph is convex if the following conditions hold:
- for all , any path along from to contains only vertices in ,
- parent-child relations are contained within the subgraph, i.e.
Define the sets of boundary values and , as in (2); then fix the boundary orderings and as in (3). The subgraph implements the interface
Consider an interface graph that implements such that . Instead of defining a gluing relation from values of an operation to values of , we replace the interface with . This generalises the definition of from (4) to a glueing defined as
With the set of boundary operations defined as2
we are able to define minIR rewrites in their most general form.
Let and such that and is convex, as defined above. Then,
i.e. the graph obtained by removing the values and operations from the glueing of and along , is a valid minIR graph.
There is a graph with values and a partial function such that the graph (8) is the graph , obtained from the rewrite
We call the rewrite of into .
Proof
Consider an operation that implements . We can define the interface graph given by three operations , and . Its associated subgraph only includes . Let be the glueing relation
Consider the rewrite . If we write for the subgraph of given by
Our claim is that is a valid minIR graph.
The graph (8) is then obtained by applying the rewrite as given by (6) to . Defining the rewrite as the composition of followed by , the result follows from our claim and Proposition 3.5.
We now prove the claim, by showing the four properties of minIR graphs as per Definition 3.4. Property i) requires showing that every value is defined exactly once. As is obtained by removing values and operations from a valid minIR graph , no value in can be defined more than once. A value that is not defined in must be in the boundary of . By the boundary definitions of (2), cannot be in and thus must be in . It follows by the definition of the glueing that in , will be in the definitions of : . The glueing is bijective between the values of and and thus we can conclude that has a unique definition in .
The same argument applies to property ii). Property iii) follows from the convexity requirement of . Finally, property iv) (every region has at most one parent) follows from two observations. First, by convexity of , no deleted value or operation could be the parent of any value not in , and thus the relation is well-defined on : . Secondly, all new values and operations added to the boundary region of are from the root region of , and thus do not have a parent, ensuring that parent uniqueness is preserved.
This simple and limited graph transformation framework captures a remarkably large set of minIR program transformations. It may seem at first that the restriction to boundary values within a single region of Definition 3.11, as well as the convexity requirements of Definition 3.12 represent significant limitations on the expressivity of the rewrites. In practice, however, the semantics of minIR operations can be used to decompose more complex rewrites into a sequence of simple rewrites to which Proposition 3.6 applies.
Consider minIR graphs with a type system that includes regiondef and call
operations as discussed in examples of the previous section – respectively
defining a code block by a nested region and redirecting control flow to a code
block defined using a regiondef. Then all constraints that we impose on
rewriting can be effectively side-stepped using the region outlining and
value hoisting transformations.
Region outlining moves a valid minIR subgraph into its own separate region,
and replaces the hole left by the subgraph in the computation by a call
operation to the newly outlined region.
Value hoisting moves a value definition within a region to its parent region and passes the value down to the nested region through an additional input. In case of linear values, we can similarly hoist the unique use of the value to the parent region.
Using these transformations, non-convex subgraphs can always be made convex by taking the convex hull and outlining any parts within it that are not part of the subgraph. Outlined regions can then be passed as additional inputs to the subgraph. Step 1 of the figure below illustrates this transformation. Similarly, a subgraph that includes operations without their parent can be extended to cover the entire region and its parent, outlining any parts of the region that are not part of the subgraph.
Finally, whenever a boundary value belongs to a region that is not the top level region of the subgraph3, we can repeatedley hoist to its parent region until it is in the top level region. The value is then recursively passed as argument to descendant regions until the region that it is required in. Subgraphs can thus always be transformed to only have input and output boundary values at the top level region. Step 2 of the figure below illustrates this transformation.
A non-convex minIR graph rewrite, obtained by decomposition into valid convex rewrites, using outlining and hoisting. For simplicity, regiondef operations were made implicit and represented by nested boxes: a region within an operation corresponds to a region definition that is passed as an argument to the operation. Edge colours correspond to value types. Step 1 outlines the ... operations into a dedicated region, which step 2 hoists outside of the region being rewritten. Step 3 and 4 together correspond to a minIR sugraph rewrite. They have been split into two steps following the proof strategy. Step 4 is an instance of a minIR operation rewrite.
To be precise, is a partial order on the type strings up to isomorphism. ↩︎
The set operations and are again understood to apply to the unordered set of elements contained in the lists and . ↩︎
We can always extend a subgraph to contain more ancestor regions, until there is indeed a unique top-level region in the subgraph. ↩︎