Chapter 3

Quantum Compilation as a Graph Transformation Problem

The specialised optimisation techniques that we reviewed in section 2.2 are effective for the scenarios they were designed for, but they are challenging to adapt to new hardware primitives, constraints, or cost functions.

This thesis proposes interpreting quantum compilation as a graph transformation system (GTS). GTSs endow quantum compilation with well-defined semantics and strong theoretical foundations Lack, 2005Stephen Lack and Pawel Sobocinski. 2005. Adhesive and quasiadhesive categories. RAIRO - Theoretical Informatics and Applications 39, 3 (July 2005, 511--545). doi: 10.1051/ITA:2005028. They establish a practical, purely declarative framework in which compiler transformations can be defined and studied.

This allows us to decouple the semantics of quantum programs and the architecture specifics from the compiler infrastructure itself. We can thus focus on building and designing scalable and efficient graph transformation algorithms that can then be applied on a wide range of compilation problems and hardware targets.

In this chapter, we formalise quantum computation and optimisation based on graphs and graph transformations, providing the foundation for all considerations in later chapters. Albeit slightly simplified, the intermediate representation IR we propose here is based on joint work Mark K., 2025Seyon Sivarajah, Alan Lawrence, Alec Edgington, Douglas Wilson, Craig Roy, Luca Mondada, Lukas Heidemann, Ross Duncan Mark Koch. 2025. HUGR: A Quantum-Classical Intermediate Representation. Retrieved (talk recording) from https://www.youtube.com/live/D8esZrt7ogk?feature=shared&t=5217, as well as ongoing development.

The words graph rewrite and graph transformation are often used interchangeably in the literature. In the context of this thesis, we will take these words to distinguish two slightly different problems:
The study of equivalences and other relations between graphs under well-defined semantics is the subject of graph transformations. For instance:
a graph transformation rule $L \to R$ (Definition .) expresses that an instance of $L$ can always be transformed into an instance of $R$ , reflecting the semantics of the system that the graph is modelling.
a minIR equivalence class (Definition .) is an instance of a graph transformation system (GTS), which uses known semantic relations, expressed, for example, as graph transformation rules, to define how graphs can be transformed.
Graph rewriting, on the other hand, encapsulates the algorithmic procedures and data structures that mutate graphs. A rewrite (Definition .) is the tuple of data required to turn a graph $G$ into a new graph $G'$ .
Given matches of patterns $L$ on a graph $G$ , a graph transformation system can consider the set of graph transformation rules that define the semantics of $G$ to produce a set of rewrites that can be applied to $G$ and mutate $G$ .

Our contributions in the subsequent chapters are mostly preoccupied with problems of graph rewriting, i.e. the definition and application of the data required to mutate graphs, as opposed to graph transformations. This chapter nonetheless considers both, using the mature graph transformation framework as a foundation to define IR rewriting semantics.

Section 3.1 starts with a review of previous related work at the intersection of graph transformation software and quantum program optimisation. We then discuss in section 3.2 a fundamental difference between classical computation graphs and the requirements of quantum computation. This motivates a new graph-based IR tailored to quantum computation that we present in section 3.3, along with formal graph rewriting semantics based on sesqui-pushout (SqPO) transformations (section 3.4). Whilst the SqPO transformation definition is constructive, its existence is not guaranteed. We conclude the chapter in section 3.5 by discussing a more restricted “operational” notion of graph rewriting that will be useful for the rest of the thesis.

3.1. Related work

Graph rewriting on computation graphs. Optimisation of computation graphs is a long-standing problem in computer science that is seeing renewed interest in the compiler Lattner, 2021Chris Lattner, Mehdi Amini, Uday Bondhugula, Albert Cohen, Andy Davis, Jacques Pienaar, River Riddle, Tatiana Shpeisman, Nicolas Vasilache and Oleksandr Zinenko. 2021. MLIR: Scaling Compiler Infrastructure for Domain Specific Computation. In 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), February 2021. IEEE, 2--14. doi: 10.1109/CGO51591.2021.9370308, machine learning (ML) Jia, 2019Zhihao Jia, Oded Padon, James Thomas, Todd Warszawski, Matei Zaharia and Alex Aiken. 2019. TASO: optimizing deep learning computation with automatic generation of graph substitutions. In Proceedings of the 27th ACM Symposium on Operating Systems Principles, October 2019. ACM, 47--62. doi: 10.1145/3341301.3359630 Fang, 2020Jingzhi Fang, Yanyan Shen, Yue Wang and Lei Chen. 2020. Optimizing DNN computation graph using graph substitutions. Proceedings of the VLDB Endowment 13, 12 (August 2020, 2734--2746). doi: 10.14778/3407790.3407857 and quantum computing communities Xu, 2022Mingkuan Xu, Zikun Li, Oded Padon, Sina Lin, Jessica Pointing, Auguste Hirth, Henry Ma, Jens Palsberg, Alex Aiken, Umut A. Acar and Zhihao Jia. 2022. Quartz: Superoptimization of Quantum Circuits. In Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, June 2022. Association for Computing Machinery, 625--640. doi: 10.1145/3519939.3523433 Xu, 2023Amanda Xu, Abtin Molavi, Lauren Pick, Swamit Tannu and Aws Albarghouthi. 2023. Synthesizing Quantum-Circuit Optimizers. Proceedings of the ACM on Programming Languages 7, PLDI (June 2023, 835--859). doi: 10.1145/3591254. In all these domains, graphs encode computations that are either expensive to execute or evaluated repeatedly over many iterations, making the optimisation of the execution cost of the computation a primary concern.

Domain-specific heuristics are the most common approach in compiler optimisations Paszke, 2019Adam Paszke, Sam Gross, Francisco Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kö pf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai and S. Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Neural Information Processing Systems. doi: 10.5555/3454287.3455008 Sivara., 2020Seyon Sivarajah, Silas Dilkes, Alexander Cowtan, Will Simmons, Alec Edgington and Ross Duncan. 2020. t|ket⟩: a retargetable compiler for NISQ devices. Quantum Science and Technology 6, 1 (November 2020, 014003). doi: 10.1088/2058-9565/ab8e92 – a more flexible alternative are optimisation engines based on declarative sets of graph transformations Bonchi, 2022Filippo Bonchi, Fabio Gadducci, Aleks Kissinger, Pawel Sobocinski and Fabio Zanasi. 2022. String Diagram Rewrite Theory I: Rewriting with Frobenius Structure. Journal of the ACM 69, 2 (March 2022, 1 - 58). doi: 10.1145/3502719 Bonchi, 2022Filippo Bonchi, Fabio Gadducci, Aleks Kissinger, Pawel Sobocinski and Fabio Zanasi. 2022. String diagram rewrite theory II: Rewriting with symmetric monoidal structure. Mathematical Structures in Computer Science 32, 4 (April 2022, 511--541). doi: 10.1017/s0960129522000317. In such systems, a graph transformation system (GTS) is used to find a sequence of allowed transformations that rewrite a computation graph given as input into a computation graph with minimal cost.

Transformation systems were first studied on strings Dersho., 1990Nachum Dershowitz and Jean-Pierre Jouannaud. 1990. Rewrite Systems, then generalised to trees and terms Bezem, 2003Marc Bezem, Jan Willem Klop and Roel Vrijer. 2003. Term Rewriting Systems (1. publ. ed.). Cambridge University Press, Cambridge, before being applied to graph domains Ehrig, 1973Hartmut Ehrig, Michael Pfender and Hans Jürgen Schneider. 1973. Graph-Grammars: An Algebraic Approach. In 14th Annual Symposium on Switching and Automata Theory, Iowa City, Iowa, USA, October 15-17, 1973. IEEE Computer Society, 167--180. doi: 10.1109/SWAT.1973.11 Rozenb., 1997Grzegorz Rozenberg. 1997. Handbook of Graph Grammars and Computing by Graph Transformations, Volume 1: Foundations. World Scientific König, 2018Barbara König, Dennis Nolte, Julia Padberg and Arend Rensink. 2018. A Tutorial on Graph Transformation. In Graph Transformation, Specifications, and Nets - In Memory of Hartmut Ehrig. Springer, 83--104. doi: 10.1007/978-3-319-75396-6_5. Their use in quantum computing is part of a long tradition of diagrammatic reasoning in physics Penrose, 1964Roger Penrose. 1964. Conformal treatment of infinity. General Relativity and Gravitation 43, 3 (901--922 (reprint)). doi: 10.1007/s10714-010-1110-5 Feynman, 1949R. P. Feynman. 1949. Space-Time Approach to Quantum Electrodynamics. Physical Review 76, 6 (Septempter 1949, 769--789). doi: 10.1103/physrev.76.769, and particularly in quantum mechanics with the advent of categorical quantum mechanics Abrams., 2008Samson Abramsky and Bob Coecke. 2008. Categorical quantum mechanics. arXiv: 0808.1023 [quant-ph] Coecke, 2012Bob Coecke, Ross Duncan, Aleks Kissinger and Quanlong Wang. 2012. Strong Complementarity and Non-locality in Categorical Quantum Mechanics. In 2012 27th Annual IEEE Symposium on Logic in Computer Science, June 2012. IEEE, 245--254. doi: 10.1109/lics.2012.35 Coecke, 2017Bob Coecke and Aleks Kissinger. 2017. Picturing Quantum Processes: A First Course in Quantum Theory and Diagrammatic Reasoning. Cambridge University Press. doi: 10.1017/9781316219317.

GTS in quantum computing. In quantum computing, the ZX calculus Coecke, 2008Bob Coecke and Ross Duncan. 2008. Interacting Quantum Observables and other diagrammatic theories that derive from it are particularly important. Properties of GTSs such as completeness, confluence and termination Verma, 1995Rakesh M. Verma. 1995. Transformations and confluence for rewrite systems. Theoretical Computer Science 152, 2 (December 1995, 269--283). doi: 10.1016/0304-3975(94)00255-0 are well-studied within this field Backens, 2014Miriam Backens. 2014. The ZX-calculus is complete for stabilizer quantum mechanics. New Journal of Physics 16, 9 (Septempter 2014, 093021). doi: 10.1088/1367-2630/16/9/093021 Backens, 2019Miriam Backens and Aleks Kissinger. 2019. ZH: A Complete Graphical Calculus for Quantum Computations Involving Classical Non-linearity. Electronic Proceedings in Theoretical Computer Science 287 (January 2019, 23--42). doi: 10.4204/EPTCS.287.2 Biamon., 2023J Biamonte and A Nasrallah. 2023. The ZX-Calculus is Canonical in the Heisenberg Picture for Stabilizer Quantum Mechanics. arXiv: 2301.05717 [quant-ph]. These results have formed the basis for software implementations of circuit optimisations with soundness and performance guarantees Duncan, 2020Ross Duncan, Aleks Kissinger, Simon Perdrix and John van de Wetering. 2020. Graph-theoretic Simplification of Quantum Circuits with the ZX-calculus. Quantum 4 (June 2020, 279). doi: 10.22331/q-2020-06-04-279 Kissin., 2020Aleks Kissinger and John van de Wetering. 2020. PyZX: Large Scale Automated Diagrammatic Reasoning. In Proceedings 16th International Conference on Quantum Physics and Logic, Chapman University, Orange, CA, USA., 10-14 June 2019. Open Publishing Association, 229-241. doi: 10.4204/EPTCS.318.14 Sivara., 2020Seyon Sivarajah, Silas Dilkes, Alexander Cowtan, Will Simmons, Alec Edgington and Ross Duncan. 2020. t|ket⟩: a retargetable compiler for NISQ devices. Quantum Science and Technology 6, 1 (November 2020, 014003). doi: 10.1088/2058-9565/ab8e92 Borgna, 2023Agustín Borgna. 2023. Towards a compiler toolchain for quantum programs.

Great strides are also being made in our theoretical understanding of transformation systems for quantum circuits. Recently, Clément et al. established completeness for the first time Cléme., 2023Alexandre Clément, Nicolas Heurtel, Shane Mansfield, Simon Perdrix and Benoît Valiron. 2023. A Complete Equational Theory for Quantum Circuits. In 2023 38th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), June 2023. IEEE, 1--13. doi: 10.1109/lics56636.2023.10175801 as well as minimality Cléme., 2024Alexandre Clément, Noé Delorme and Simon Perdrix. 2024. Minimal Equational Theories for Quantum Circuits. In Proceedings of the 39th Annual ACM/IEEE Symposium on Logic in Computer Science, July 2024. ACM, 1--14. doi: 10.1145/3661814.3662088 of a GTS for quantum circuits. A set of circuit transformation rules were presented such that no rule is redundant, and for any two equivalent quantum circuits, there exists a sequence of local transformations rewriting one into the other. Such systems are however not confluent, and this is unlikely to change: most circuit optimisation problems are known to be computationally hard Weteri., 2024John van de Wetering and Matt Amy. 2024. Optimising quantum circuits is generally hard. arXiv: 2310.05958 [quant-ph].

There is also another inherent tension in integrating diagrammatic calculi into compilers. Diagrammatic theories arise from abstract primitives that admit a simple rewriting logic Heurtel, 2024Nicolas Heurtel. 2024. A Complete Graphical Language for Linear Optical Circuits with Finite-Photon-Number Sources and Detectors. arXiv: 2402.17693 [quant-ph] Booth, 2024Robert I. Booth, Titouan Carette and Cole Comfort. 2024. Graphical Symplectic Algebra. arXiv: 2401.07914 [cs.LO] Felice, 2023Giovanni de Felice, Razin A. Shaikh, Boldizsár Poór, Lia Yeh, Quanlong Wang and Bob Coecke. 2023. Light-Matter Interaction in the ZXW Calculus. Electronic Proceedings in Theoretical Computer Science 384 (August 2023, 20--46). doi: 10.4204/EPTCS.384.2 Carette, 2023Titouan Carette, Timothée Hoffreumon, Émile Larroque and Renaud Vilmart. 2023. Complete Graphical Language for Hermiticity-Preserving Superoperators. In 2023 38th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), June 2023. IEEE, 1--22. doi: 10.1109/LICS56636.2023.10175712; compilers meanwhile must capture all the expressivity, constraints and messiness of real-world hardware targets, with all the edge cases and exceptions that this entails.

An example of this is the ZX circuit extraction problem Quanz, 2024Marcel Quanz, Korbinian Staudacher and Karl Fürlinger. 2024. Parallel Quantum Circuit Extraction from MBQC-Patterns. In 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 1078-1087. doi: 10.1109/IPDPSW63119.2024.00179 Backens, 2021Miriam Backens, Hector Miller-Bakewell, Giovanni de Felice, Leo Lobski and John van de Wetering. 2021. There and back again: A circuit extraction tale. Quantum 5 (March 2021, 421). doi: 10.22331/q-2021-03-25-421: it is in general hard to recover an executable quantum circuit from a ZX diagram as the latter is strictly more general and primitives cannot be mapped one-to-one. Similarly, while simple quantum-classical hybrid computations can be expressed using extensions of ZX Borgna, 2021Agustín Borgna, Simon Perdrix and Benoît Valiron. 2021. Hybrid Quantum-Classical Circuit Simplification with the ZX-Calculus. In Programming Languages and Systems, Cham. Springer International Publishing, 121--139. doi: 10.1007/978-3-030-89051-3_8 Carette, 2021Titouan Carette, Emmanuel Jeandel, Simon Perdrix and Renaud Vilmart. 2021. Completeness of Graphical Languages for Mixed State Quantum Mechanics. ACM Transactions on Quantum Computing 2, 4 (December 2021, 1--28). doi: 10.1145/3464693 Koziel., 2024Alexander Koziell-Pipe and Aleks Kissinger. 2024. Hybrid Quantum-Classical Machine Learning with String Diagrams. arXiv: 2407.03673 [quant-ph], it will never be possible to capture the full breadth and generality of classical CPU instruction sets in a practical and extensible (and algebraically satisfying) way.

Peephole optimisations. As an alternative to the very principled approach of elegant calculi, graph transformations can also be used in the absence of theoretical guarantees in a more ad hoc fashion. Indeed, many existing (classical and quantum) compiler optimisations can already be understood as graph transformations. For as long as compilation has existed, compilers have relied on local transformations of the IR, typically referred to as peephole optimisations McKeem., 1965W. M. McKeeman. 1965. Peephole optimization. Communications of the ACM 8, 7 (July 1965, 443--444). doi: 10.1145/364995.365000 Tanenb., 1982Andrew S. Tanenbaum, Hans van Staveren and Johan W. Stevenson. 1982. Using Peephole Optimization on Intermediate Code. ACM Transactions on Programming Languages and Systems 4, 1 (January 1982, 21--36). doi: 10.1145/357153.357155. Such optimisation strategies are based on the heuristic that local optimisations to the program will produce a well-optimised result overall. Mature compiler ecosystems have developed tools for declarative definitions, as well as automatic generation and correctness proving of peephole optimisations Menend., 2017David Menendez and Santosh Nagarakatte. 2017. Alive-Infer: data-driven precondition inference for peephole optimizations in LLVM. ACM SIGPLAN Notices 52, 6 (June 2017, 49--63). doi: 10.1145/3140587.3062372 Lopes, 2015Nuno P. Lopes, David Menendez, Santosh Nagarakatte and John Regehr. 2015. Provably correct peephole optimizations with alive. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2015. ACM, 22--32. doi: 10.1145/2737924.2737965 Riddle, 2021River Riddle. 2021. PDLL: a new declarative rewrite frontend for MLIR. (November 2021). Retrieved on 13/01/2025 (RFC on Discourse) from https://discourse.llvm.org/t/rfc-pdll-a-new-declarative-rewrite-frontend-for-mlir/4798. We refer to the classical compiler literature, e.g. Muchni., 2007Steven S. Muchnick. 2007. Advanced compiler design and implementation ([Nachdr.] ed.). Morgan Kaufmann, San Francisco, Calif. [u.a.], for more details on the various types of common peephole optimisations.

Quantum compilers adopted peephole-style optimisations from the beginning Cheung, 2007Donny Cheung, Dmitri Maslov and Simone Severini. 2007. Translation techniques between quantum circuit architectures. In Workshop on quantum information processing, 1--3 Steiger, 2018Damian S. Steiger, Thomas Häner and Matthias Troyer. 2018. ProjectQ: an open source software framework for quantum computing. Quantum 2 (January 2018, 49). doi: 10.22331/q-2018-01-31-49 Sivara., 2020Seyon Sivarajah, Silas Dilkes, Alexander Cowtan, Will Simmons, Alec Edgington and Ross Duncan. 2020. t|ket⟩: a retargetable compiler for NISQ devices. Quantum Science and Technology 6, 1 (November 2020, 014003). doi: 10.1088/2058-9565/ab8e92. They encompass some of the most common optimisations in quantum computing, including the Euler Angle reduction Chatzi., 2009K. Ch. Chatzisavvas, G. Chadzitaskos, C. Daskaloyannis and S. G. Schirmer. 2009. Improving quantum gate fidelities using optimized Euler angles. Physical Review A 80, 5 (November 2009, 052329). doi: 10.1103/physreva.80.052329, the two-qubit KAK decomposition Tucci, 2005Robert R. Tucci. 2005. An Introduction to Cartan's KAK Decomposition for QC Programmers. arXiv: quant-ph/0507171 [quant-ph] Cross, 2019Andrew W. Cross, Lev S. Bishop, Sarah Sheldon, Paul D. Nation and Jay M. Gambetta. 2019. Validating quantum computers using randomized model circuits. Physical Review A 100, 3 (Septempter 2019, 032328). doi: 10.1103/physreva.100.032328 and all gate set rebases contri., 2025TKET contributors. 2025. Documentation: pytket.passes.AutoRebase. Retrieved on 13/01/2025 (TKET docs) from https://docs.quantinuum.com/tket/api-docs/passes.html#pytket.passes.AutoRebase. A quantum-specific flavour of peephole optimisation with close links to GTSs, template matching Maslov, 2008D. Maslov, G.W. Dueck, D.M. Miller and C. Negrevergne. 2008. Quantum Circuit Simplification and Level Compaction. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 27, 3 (March 2008, 436--444). doi: 10.1109/tcad.2007.911334 Iten, 2022Raban Iten, Romain Moyard, Tony Metger, David Sutter and Stefan Woerner. 2022. Exact and Practical Pattern Matching for Quantum Circuit Optimization. ACM Transactions on Quantum Computing 3, 1 (January 2022, 1--41). doi: 10.1145/3498325, achieved state-of-the-art results for Clifford circuit optimisation Bravyi, 2021Sergey Bravyi, Ruslan Shaydulin, Shaohan Hu and Dmitri Maslov. 2021. Clifford Circuit Optimization with Templates and Symbolic Pauli Gates. Quantum 5 (November 2021, 580). doi: 10.22331/q-2021-11-16-580. Recently, quantum peephole optimisations were also proposed that leverage provable state information to perform contextual optimisations Liu, 2021Ji Liu, Luciano Bello and Huiyang Zhou. 2021. Relaxed Peephole Optimization: A Novel Compiler Optimization for Quantum Circuits. In 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), February 2021. IEEE, 301--314. doi: 10.1109/cgo51591.2021.9370310, similar to strength reduction and optimisation with preconditions in classical compilation Lopes, 2015Nuno P. Lopes, David Menendez, Santosh Nagarakatte and John Regehr. 2015. Provably correct peephole optimizations with alive. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2015. ACM, 22--32. doi: 10.1145/2737924.2737965.

Internal representations. The graph formalisation of quantum computations we will define in this chapter also draws a lot from the internal representations (IR) of programs in classical compilers. The classical compilation community has found significant advantages in sharing a common standardised IR format. Indeed, while the exact syntax constructs and abstractions vary across programming languages, and, at the other end of the compiler stack, the specific assembly instructions emitted differ between hardware targets, much of the compiler middleware can be broadly shared across use cases. This gave rise to the LLVM Lattner, 2004Chris Lattner and Vikram Adve. 2004. LLVM: A compilation framework for lifelong program analysis & transformation. In International Symposium on Code Generation and Optimization, 2004. CGO 2004.. IEEE, 75--86. doi: 10.1109/CGO.2004.1281665 and, more recently, the MLIR Lattner, 2021Chris Lattner, Mehdi Amini, Uday Bondhugula, Albert Cohen, Andy Davis, Jacques Pienaar, River Riddle, Tatiana Shpeisman, Nicolas Vasilache and Oleksandr Zinenko. 2021. MLIR: Scaling Compiler Infrastructure for Domain Specific Computation. In 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), February 2021. IEEE, 2--14. doi: 10.1109/CGO51591.2021.9370308 projects, which provide common compiler IRs, along with all the infrastructure compilers typically require: IR transformation tooling, translation into hardware-specific assembly, efficient serialisations, in-memory formats etc.

The idea of adopting LLVM for quantum was championed by QIR QIR Al., 2021 QIR Alliance. 2021. QIR Specification v0.1. Retrieved on 31/12/24 from https://www.qir-alliance.org/, a standard introducing quantum primitives into the LLVM IR. This was subsequently adopted by many quantum hardware providers for its superior expressive power compared to circuit-based formats QIR Al., 2023 QIR Alliance. 2023. NVIDIA Joins the QIR Alliance as the Effort Enters Year Two. Retrieved on 02/01/2025 from https://www.qir-alliance.org/posts/year_one_in_review/. Building on top of QIR, an IR specifically for quantum-classical programs was proposed in Mark K., 2025Seyon Sivarajah, Alan Lawrence, Alec Edgington, Douglas Wilson, Craig Roy, Luca Mondada, Lukas Heidemann, Ross Duncan Mark Koch. 2025. HUGR: A Quantum-Classical Intermediate Representation. Retrieved (talk recording) from https://www.youtube.com/live/D8esZrt7ogk?feature=shared&t=5217, with additional soundness guarantees based, among others, on the no-cloning principle of quantum information. In parallel, projects with similar aims have also emerged McCask., 2021Alexander McCaskey and Thien Nguyen. 2021. A MLIR Dialect for Quantum Assembly Languages. In 2021 IEEE International Conference on Quantum Computing and Engineering (QCE), October 2021. IEEE. doi: 10.1109/QCE52317.2021.00043 Ittah, 2022David Ittah, Thomas Häner, Vadym Kliuchnikov and Torsten Hoefler. 2022. QIRO: A Static Single Assignment-based Quantum Program Representation for Optimization. ACM Transactions on Quantum Computing 3, 3 (June 2022, 1--32). doi: 10.1145/3491247 that use the full MLIR and LLVM toolchain.

Challenges of GTS for compilation. Peephole optimisations of compiler IRs have proven to be a fast, general and scalable approach to compilation and code optimisation in practice. However, the optimisation results depend heavily on well-designed transformation orderings and the performance may vary widely across (equivalent) input programs. This is commonly known in compiler research as the phase ordering problem Click, 1995Cliff Click and Keith D. Cooper. 1995. Combining analyses, combining optimizations. ACM Transactions on Programming Languages and Systems 17, 2 (March 1995, 181--196). doi: 10.1145/201059.201061. When a compiler can modify code in multiple ways, it must determine which transformations to apply and in what sequence to achieve optimal results Whitfi., 1997Deborah L. Whitfield and Mary Lou Soffa. 1997. An approach for exploring code improving transformations. ACM Transactions on Programming Languages and Systems 19, 6 (November 1997, 1053--1084). doi: 10.1145/267959.267960 Liang, 2023Youwei Liang, Kevin Stone, Ali Shameli, Chris Cummins, Mostafa Elhoushi, Jiadong Guo, Benoit Steiner, Xiaomeng Yang, Pengtao Xie, Hugh Leather and Yuandong Tian. 2023. Learning Compiler Pass Orders using Coreset and Normalized Value Prediction. In Proceedings of the 40th International Conference on Machine Learning. JMLR.org. doi: 10.48550/ARXIV.2301.05104. This is a common design challenge in GTSs, often addressed through mechanisms such as rule controls Heckel, 2020Reiko Heckel and Gabriele Taentzer. 2020. Graph Transformation for Software Engineers: With Applications to Model-Based Development and Domain-Specific Language Engineering. Springer International Publishing. doi: 10.1007/978-3-030-43916-3.

This issue is also a key challenge within quantum compilation, as can be verified by comparing the performance of peephole-based compilers with provably optimal circuit synthesis strategies. On problem sizes where exhaustive search is feasible, unitary synthesis tools can sometimes outperform current, mostly peephole-based compilers Sivara., 2020Seyon Sivarajah, Silas Dilkes, Alexander Cowtan, Will Simmons, Alec Edgington and Ross Duncan. 2020. t|ket⟩: a retargetable compiler for NISQ devices. Quantum Science and Technology 6, 1 (November 2020, 014003). doi: 10.1088/2058-9565/ab8e92 by up to 50%¹ Wu, 2020Xin-Chuan Wu, Marc Grau Davis, Frederic T. Chong and Costin Iancu. 2020. QGo: Scalable Quantum Circuit Optimization Using Automated Synthesis. arXiv: 2012.09835 [quant-ph].

at the cost of many hours of compute, of course. ↩︎

3.2. Computation graphs and linearity

Computation graphs represent the flow of data between operations in a program, with nodes as operations and edges as data dependencies. Widely used in machine learning frameworks and GPU optimisations Bergst., 2011James Bergstra, Frédéric Bastien, Olivier Breuleux, Pascal Lamblin, Razvan Pascanu, Olivier Delalleau, Guillaume Desjardins, David Warde-Farley, Ian Goodfellow, Arnaud Bergeron and others. 2011. Theano: Deep learning on gpus with python. In NIPS 2011, BigLearning Workshop, Granada, Spain Zhao, 2023Yuxuan Zhao, Qi Sun, Zhuolun He, Yang Bai and Bei Yu. 2023. AutoGraph: Optimizing DNN Computation Graph for Parallel GPU Kernel Execution. Proceedings of the AAAI Conference on Artificial Intelligence 37, 9 (June 2023, 11354--11362). doi: 10.1609/aaai.v37i9.26343, they are conceptually equivalent to dataflow graphs used in compiler design, which were pioneered by Feo, 1990John T. Feo, David C. Cann and Rodney R. Oldehoeft. 1990. A report on the sisal language project. Journal of Parallel and Distributed Computing 10, 4 (December 1990, 349--366). doi: 10.1016/0743-7315(90)90035-n and Kahn, 1976Gilles Kahn and David MacQueen. 1976. Coroutines and networks of parallel processes and are now central to most compiler IRs.

In classical computations, these graph representations of computations are essentially term graphs Barend., 1987H. P. Barendregt, M. C. J. D. Eekelen, J. R. W. Glauert, J. R. Kennaway, M. J. Plasmeijer and M. R. Sleep. 1987. Term Graph Rewriting – sets of algebraic expressions that are stored as trees, combined with an important optimisation known as term sharing. When identical subexpressions appear multiple times, they can be represented as one computation and referenced from multiple locations, creating a directed acyclic graph rather than a term tree Plump, 1999Detlef Plump. 1999. Term Graph Rewriting. (October 1999). This sharing enables a more efficient representation. It can also be used as a compiler optimisation to identify subexpressions that can be cached and shared across expression evaluations for a more efficient execution – a technique known as common subexpression elimination (CSE) Cocke, 1970John Cocke. 1970. Global common subexpression elimination. In Proceedings of a symposium on Compiler optimization -. ACM Press, 20--24. doi: 10.1145/800028.808480.

Each edge of a computation graphs corresponds to a unique value: the output of a previous computation that is being passed on to new operations. These values flow along edges in the graph – hence dataflow graph. Values are immutable: they are defined once and then passed as input to further operations, where they can only be consumed, never modified. In compiler speak, programs expressed using such immutable values are often called single static assignment (SSA) programs Cytron, 1991Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman and F. Kenneth Zadeck. 1991. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems 13, 4 (October 1991, 451--490). doi: 10.1145/115372.115320 Rosen, 1988B. K. Rosen, M. N. Wegman and F. K. Zadeck. 1988. Global Value Numbers and Redundant Computations. In Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages - POPL ’88. ACM Press, 12--27. doi: 10.1145/73560.73562. In SSA:

Every value is defined exactly once,
Every value may be used any number of times (including zero).

Quantum computing throws this second pillar of SSA into the bin. Values in quantum computations are the result of computations on quantum data, and as such must obey the no-cloning and no-deleting theorems (section 2.1). We call values subject to these restrictions linear¹. They introduce the following constraint on valid computation graphs:

Every linear value must be used exactly once.

Linear values change fundamentally how transformations of the computation graph must be specified. Where compilers on classical data can:

freely share common subexpressions (term sharing),
undo term sharing, i.e. duplicate shared terms into independent subterms, and
delegate the identification and deletion of obsolete code to specialised passes (e.g., dead code elimination Cytron, 1991Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman and F. Kenneth Zadeck. 1991. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems 13, 4 (October 1991, 451--490). doi: 10.1145/115372.115320 Briggs, 1994Preston Briggs and Keith D. Cooper. 1994. Effective partial redundancy elimination. ACM SIGPLAN Notices 29, 6 (June 1994, 159--170). doi: 10.1145/773473.178257),

quantum compilers must enforce much stricter invariants on IR transformations – or risk producing invalid programs.

In classical compilers, IR modification APIs (such as MLIR’s PatternRewriter) decouple program transformation from code deletion. Program transformations are specified by copying existing values and introducing new values and operations as needed, while the actual deletion of unused code is deferred to specialized dead code elimination passes. This approach is no longer feasible in the presence of linear values. Computation graphs for quantum computations must adopt proper graph rewriting semantics, in which the explicit deletion of obsolete values and operations is just as much a part of the rewriting data as the new code generation.

The terminology comes from “linear” logic Girard, 1987Jean-Yves Girard. 1987. Linear logic. Theoretical Computer Science 50, 1 (1--101). doi: 10.1016/0304-3975(87)90045-4. I apologise for slamming additional semantics on what I recognise is an already very overloaded term. ↩︎

3.3. A graph representation for quantum programs

For the purposes of this thesis, we introduce a simplified graph-based IR for quantum computations that we will call minIR. It captures all the expressiveness that we require for hybrid programs whilst remaining sufficiently abstract to be applicable to a variety of IRs and, by extension, quantum programming languages and compiler frameworks.

MinIR can be thought of as being built from statements of the form

x, y, ... := op(a, b, c, ...)

to be understood as an operation op applied on the SSA values a, b, c, ... and producing the output values x, y, ....

Computation and dataflow graphs are commonly defined with operations as vertices and values as edges. To faithfully capture the function signature of op, this requires storing and preserving an ordering of the incoming and outgoing edges (also known as port graphs Ferná., 2018Maribel Fernández, Hélène Kirchner and Bruno Pinaud. 2018. Labelled Port Graph – A Formal Structure for Models and Computations. Electronic Notes in Theoretical Computer Science 338 (October 2018, 3--21). doi: 10.1016/j.entcs.2018.10.002).

Instead, we adopt the hypergraph formalisation of computation graphs, which is more common within category theory (see string diagrams Seling., 2010P. Selinger. 2010. A Survey of Graphical Languages for Monoidal Categories and their formalisation in the hypergraph category Bonchi, 2022Filippo Bonchi, Fabio Gadducci, Aleks Kissinger, Pawel Sobocinski and Fabio Zanasi. 2022. String Diagram Rewrite Theory I: Rewriting with Frobenius Structure. Journal of the ACM 69, 2 (March 2022, 1 - 58). doi: 10.1145/3502719 Wilson, 2022Paul Wilson and Fabio Zanasi. 2022. The Cost of Compositionality: A High-Performance Implementation of String Diagram Composition. Electronic Proceedings in Theoretical Computer Science 372 (November 2022, 262--275). doi: 10.4204/eptcs.372.19). This definition is particularly well-suited for our purposes because it frames the graph transformations of interest to us in the well-studied language of rewriting within adhesive categories Lack, 2004Stephen Lack and Paweł Sobociński. 2004. Adhesive Categories. In Foundations of Software Science and Computation Structures, Berlin, Heidelberg. Springer Berlin Heidelberg, 273--288. doi: 10.1007/978-3-540-24727-2_20.

Hypergraphs and minIR #

At a minimum, a directed hypergraph – for simplicity sometimes in the following referred to simply as graph – is defined by a set of vertices $\mathbf V$ and a set of (hyper) edges $\mathbf E$ . We will always consider hypergraphs where the edges $e \in \mathbf E$ are directed and the vertices attached to $e$ are given by ordered lists. We formalise this incidence relation between vertices in $\mathbf V$ and edges in $\mathbf E$ by writing $\mathbf E$ as the partition over the disjoint sets $\mathbf E_{st}$

\mathbf E = \bigsqcup_{s, t \in \mathbb{N}} \mathbf E_{st}

and introducing $s$ source and $t$ target maps for each $\mathbf E_{st}$ . Why we write sets in boldface will become clear in a moment.

Definition 3.1Directed hypergraph

A directed hypergraph is given by sets $\mathbf V$ and $\mathbf E_{st}$ for $s, t \in\mathbb{N}$ , along with maps

\begin{aligned} \textit{src}_{st,i}&: \mathbf E_{st} \to \mathbf V\quad&\textrm{for } 1 \leqslant i \leqslant s\\ \textit{tgt}_{st,j}&: \mathbf E_{st} \to \mathbf V&\textrm{for } 1 \leqslant j \leqslant t\\ \end{aligned}

for all $s, t \in \mathbb{N}$ .

Note that in this thesis, as in most common uses of hypergraphs, the sets $\mathbf V$ and $\mathbf E = \bigsqcup \mathbf E_{st}$ will always be finite, and thus $\mathbf E_{st} \neq \varnothing$ for a finite number of $s, t \in \mathbb{N}$ only.

For simplicity, we can further omit the $st$ subscript of the source $\textit{src}_{st,i}$ and target $\textit{tgt}_{st,j}$ maps whenever it can be inferred from the domain of definition of the map. For $e \in \mathbf{E}_{st}$ , we call $\textit{src}_{1}(e), \dots, \textit{src}_{s}(e) \in \mathbf{V}$ the $s$ source vertices of $e$ and $\textit{tgt}_{1}(e), \dots, \textit{tgt}_{t}(e) \in \mathbf{V}$ the $t$ target vertices of $e$ .

We introduce the notation $u \leadsto v$ to signify that there is an edge from $u$ to $v$ , i.e. there is $e \in \mathbf E_{st}$ for some $s, t \in \mathbb N$ and $1 \leqslant i \leqslant s, 1 \leqslant j \leqslant t$ such that $u = src_i(e)$ and $v = tgt_j(e)$ . We define the equivalence relation $\sim \subseteq \mathbf V^2$ of connected vertices, given by the transitive, symmetric and reflexive closure of $\leadsto$ . The equivalence classes of $\sim$ are the connected components of the graph. We will write $[v]$ , resp. $[e]$ for the connected component that contains the vertex $v$ , resp. the edge $e$ .

To proceed, it is useful to frame the hypergraph definition in a categorical setting. We write $[\mathbb{C}, \mathrm{Set}]$ for the presheaf topos of the category $\mathbb{C}$ , i.e. the category with functors $\mathbb{C} \to \mathrm{Set}$ as objects and natural transformations as morphisms. Definition 3.1 can be equivalently restated as:

hypergraphs are objects in the presheaf topos $\mathbb H = [\mathbb{C}, \mathrm{Set}]$ ,

where the category $\mathbb{C}$ has objects $V$ and $E_{st}$ for $s, t \in \mathbb{N}$ and arrows given by (1), now interpreted as morphisms in $\mathbb{C}$ rather than as functions in $\mathrm{Set}$ . In this framing, a graph is a functor that defines a set for each object of $\mathbb{C}$ and specifies functions between these sets – one for each arrow in $\mathbb{C}$ .

This is where the distinction between bold and non-bold typeface comes from: we use bold letters to refer to images in $\mathrm{Set}$ of a hypergraph functor, whereas the non-bold typeface is refers to objects in the indexing category $\mathbb C$ . The distinction between $\mathbb C$ and $\mathrm{Set}$ is less important for morphisms – it will typically be clear from the context. We thus use the same symbols for both.

Linearity constraints #

The introduction of $\mathbb H$ not only gives us a notion of hypergraph homomorphisms – maps between hypergraphs that preserve the structure of the graph. It also provides us with a way to express the linearity constraints that arise from our discussion in section 3.2, and which we must enforce on our computation graphs.

The definition that follows adds the coproduct $E$ explicitly as an object of the category (which we did not need to do in Definition 3.1), as we need it as the codomain of the new morphisms $\textit{use}$ and $\textit{def}$ . The adhesitivity of the category does no longer comes for free – we will get back to this in section 3.4.

Definition 3.2Hypergraph with linearity constraints

The category $\textrm{lin-}\mathbb{C}$ is the category given by objects

\{V, V_\textit{src}, V_\textit{tgt}\} \cup \{ E \} \cup \{ E_{st}\, |\, s, t \in \mathbb{N}\}.

Its arrows are the incidence morphisms given in (1), along with

\begin{aligned} \mathit{use}&: V_\textit{src} \to E\quad& \mathit{def}&: V_\textit{tgt} \to E\\ \lambda_\mathit{src}&: V_\textit{src} \rightarrowtail V& \lambda_\mathit{tgt}&: V_\textit{tgt} \rightarrowtail V\\ \end{aligned}

and $\iota_{st}: E_{st} \rightarrowtail E$ for all $s, t \in \mathbb{N}$ . The morphisms $\lambda_\mathit{src}, \lambda_\mathit{tgt}, \iota_{st}$ are split monomorphisms and the following diagrams commute for all $s, t \in \mathbb{N}$ and $1 \leqslant i \leqslant s, 1 \leqslant j \leqslant t$ :

Directed hypergraphs with linearity conditions are objects in the full subcategory $\textrm{lin-}\mathbb H$ given by objects $H_\mathrm{lin} \in [\textrm{lin-}\mathbb{C}, \mathrm{Set}]$ such that

\mathbf E = \bigsqcup \mathbf E_{st}

is the coproduct in

\mathrm{Set}

and

H_\mathrm{lin}(\iota_{st}): H_\mathrm{lin}(E_{st}) \to H_\mathrm{lin}(E)

are the injections into

H_\mathrm{lin}(E)

We probably owe an explanation for this definition – at least for the sake of the few computer scientists that are still following us.

First of all, notice that every hypergraph with linearity constraint corresponds to a hypergraph in the sense of Definition 3.1: there is an obvious functor $\mathcal{L}: \mathbb{C} \to \textrm{lin-}\mathbb{C}$ that maps each object and morphism in $\mathbb{C}$ to the object or morphism with the same name in $\textrm{lin-}\mathbb{C}$ . By contravariance, we can thus (functorially) map every hypergraph with linearity constraints $H_\mathrm{lin} \in \textrm{lin-}\mathbb{H}$ to the hypergraph $H = H_\mathrm{lin} \circ \mathcal{L} \in \mathbb{H}$ .

Another way of looking at this is to realise that by requiring that $\lambda_\mathit{src}, \lambda_\mathit{tgt}$ be split monomorphisms, we obtain that the resulting functions in $\mathrm{Set}$ are injective. Up to isomorphism, we can consider that $\mathbf{V}_\textit{src}, \mathbf{V}_\textit{tgt} \subseteq \mathbf{V}$ are subsets of vertices in $H_\mathrm{lin}$ . A hypergraph with linearity constraints is thus a directed hypergraph with two selected subsets of vertices $\mathbf{V}_\textit{src}$ and $\mathbf{V}_\textit{tgt}$ .

Vertices within these subsets are special. For every $v \in \mathbf{V}_\textit{src}$ , there exist unique indices $s, t \in \mathbb{N}$ , $1 \leq i \leq s$ and edge $use(v) = e \in \mathbf{E}_{st}$ such that $\textit{src}_i(e) = v$ . In words, for every $v\in \mathbf{V}_\textit{src}$ there is a unique edge in the hypergraph that has $v$ as one of its sources. We then say that $e$ is the unique use of vertex $v$ . Similarly, vertices in $\mathbf{V}_\textit{tgt}$ have a unique edge $e$ in the hypergraph with $v$ as one of its targets – it is the unique definition of $v$ .

Typed graphs #

MinIR graphs are strongly typed. We introduce typed graphs for this purpose, a concept for which graph transformation was first formalised in Ehrig, 2004Hartmut Ehrig, Ulrike Prange and Gabriele Taentzer. 2004. Fundamental Theory for Typed Attributed Graph Transformation. In Graph Transformations, Berlin, Heidelberg. Springer Berlin Heidelberg, 161--177. doi: 10.1007/978-3-540-30203-2_13. A type system for directed hypergraphs is just another object $\Sigma \in \mathbb H$ . A typed graph is then an object of the slice category $\mathbb H \searrow \Sigma$ , that is to say, typed graphs are morphisms $H \to \Sigma$ of $\mathbb{H}$ and morphisms between typed graphs are given by the subset of morphisms $H_1 \rightarrow H_2$ of $\mathbb H$ that make the triangle diagram formed by $H_1$ , $H_2$ and $\Sigma$ commute.

To type hypergraphs with linearity constraints, we do not pick the type system $\Sigma$ in $\textrm{lin-}\mathbb H$ , as the existence of $use$ and $def$ morphisms impose restrictions that are too strict. We consider instead the category $\textrm{lin-}\mathbb C_\textrm{type}$ given by the same objects as $\textrm{lin-}\mathbb C$ , as well as the same morphisms, with the omission of $use$ and $def$ . There is an obvious functor $\textrm{lin-}\mathbb C_\textrm{type} \to \textrm{lin-}\mathbb C$ and thus, by contravariance, every hypergraph with linearity constraints $H_\textrm{lin} \in \textrm{lin-}\mathbb H$ can be mapped to a hypergraph in $\textrm{lin-}\mathbb H_\textrm{type}$ . We say that a hypergraph with linearity constraints $H_\textrm{lin} \in \textrm{lin-}\mathbb H$ is $\Sigma$ -typed for a type system $\Sigma \in \textrm{lin-}\mathbb H_\textrm{type}$ if there is a morphism $\Sigma \to H_\textrm{lin}$ , when interpreting $H_\textrm{lin}$ is as an object of $\textrm{lin-}\mathbb H_\textrm{type}$ .

Two example hypergraphs. Vertices (labelled with capital letters) are circles and hyperedges (labelled with small letters) span between them. Vertices that are attached to an edge in the black half of the circle correspond to source vertices of the edge; those in the white half correspond to target vertices. The functions srcisrc_isrci and tgtjtgt_jtgtj map edges to the incident vertices, defining the directed hypergraph. On the left, there are further functions use\textit{use}use and def\textit{def}def that map vertices to the unique edge that uses or defines them. This defines a hypergraph with linearity constraints, with Vtgt={A1,A2}V_\textit{tgt} = \{A_1, A_2\}Vtgt={A1,A2} and Vsrc={A3}V_\textit{src} = \{A_3\}Vsrc={A3}. These cannot be defined on the graph on the right. On the other hand, the edges b and c are in F11F_{11}F11, and are in the domain of functions in11in_{11}in11 and out11out_{11}out11, thus defining a child region of a hierarchical graph. Note that it would be invalid to have any edge connecting i1i_1i1 or o1o_1o1 to i2i_2i2 or o2o_2o2. The iii and ooo vertices also have incidence morphisms, not displayed here. — Two example hypergraphs. Vertices (labelled with capital letters) are circles and hyperedges (labelled with small letters) span between them. Vertices that are attached to an edge in the black half of the circle correspond to source vertices of the edge; those in the white half correspond to target vertices. The functions $src_i$ and $tgt_j$ map edges to the incident vertices, defining the directed hypergraph. On the left, there are further functions $\textit{use}$ and $\textit{def}$ that map vertices to the unique edge that uses or defines them. This defines a hypergraph with linearity constraints, with $V_\textit{tgt} = \{A_1, A_2\}$ and $V_\textit{src} = \{A_3\}$ . These cannot be defined on the graph on the right. On the other hand, the edges `b` and `c` are in $F_{11}$ , and are in the domain of functions $in_{11}$ and $out_{11}$ , thus defining a child region of a hierarchical graph. Note that it would be invalid to have any edge connecting $i_1$ or $o_1$ to $i_2$ or $o_2$ . The $i$ and $o$ vertices also have incidence morphisms, not displayed here.

Hierarchical hypergraphs #

A final bit of structure that minIR graphs require is a notion of hierarchy between regions of the graph. This will be useful to define functions, control flow blocks such as if-else, or any subroutine that can itself be viewed as an operation in the computation.

Hierarchical hypergraphs were first proposed in Drewes, 2002Frank Drewes, Berthold Hoffmann and Detlef Plump. 2002. Hierarchical Graph Transformation. Journal of Computer and System Sciences 64, 2 (March 2002, 249--283). doi: 10.1006/jcss.2001.1790 and further generalised in Busatto, 2005Giorgio Busatto, Hans-Jörg Kreowski and Sabine Kuske. 2005. Abstract hierarchical graph transformation. Mathematical Structures in Computer Science 15, 4 (July 2005, 773--819). doi: 10.1017/s0960129505004846 Palacz, 2004Wojciech Palacz. 2004. Algebraic hierarchical graph transformation. Journal of Computer and System Sciences 68, 3 (May 2004, 497--520). doi: 10.1016/s0022-0000(03)00064-3. However, we opt to use a more restrictive definition, closer to the notion of flattened hypergraphs of Drewes, 2002Frank Drewes, Berthold Hoffmann and Detlef Plump. 2002. Hierarchical Graph Transformation. Journal of Computer and System Sciences 64, 2 (March 2002, 249--283). doi: 10.1006/jcss.2001.1790. The reason for this is twofold. Firstly, hierarchical (hyper)graphs are typically defined recursively. It is not obvious under which conditions (and if) such definitions form adhesive categories, although progess in this direction was made in Padberg, 2017Julia Padberg. 2017. Hierarchical Graph Transformation Revisited: Transformations of Coalgebraic Graphs. In Graph Transformation, Cham. Springer International Publishing, 20--35. doi: 10.1007/978-3-319-61470-0_2 with the introduction of coalgebraic graphs. As a result, to the extent that graph transformation results can be applied to such structures, it must be done so carefully.

The second, more practical, reason is that the notion of typed graph introduced above cannot be directly lifted to the hierarchical graph setting: while some subset of the hierarchical relation in minIR should be enforced by the type system, the type graph of a nested graph should be identical to the parent’s (as opposed to being itself nested within the type graph of the parent).

It is therefore more convenient for us to encode hierarchy in directed hypergraphs as follows. Note that it is not clear that our definition is adhesive either¹ – but at least it is framed as a subcategory of a base category that is.

Definition 3.3Hierarchical hypergraph

The category $\textrm{hier-}\mathbb C$ is the category with objects and arrows of $\mathbb C$ along with additional objects $F_{st}$ for $s, t \in \mathbb{N}$ and arrows:

$F_{st} \rightarrowtail E_{st}$ that are split monomorphisms,
input arrows $F_{st} \xrightarrow{in_{st}} E_{0s}$ , and output arrows $F_{st} \xrightarrow{out_{st}} E_{t0}$ .

Hierarchical hypergraph are the objects in the full subcategory $\textrm{hier-}\mathbb H$ given by objects $H_\textrm{hier} \in [\textrm{hier-}\mathbb{C}, \mathrm{Set}]$ such that

for any edge $e \in \mathbf{E}$ of $H_\textrm{hier}$ , the set $P([e]) = \overline{in}^{-1}([e]) \cup \overline{out}^{-1}([e])$ has at most one element.
the transitive and reflexive closure of $[e] \preccurlyeq [P([e])]$ for $e \in \mathbf E$ is a partial order on the connected components of $H_\textrm{hier}$ .

Here $\overline{in}$ and $\overline{out}$ are the functions with domain $\mathbf E$ defined piecewise as $in_{st}$ and $out_{st}$ for all $s, t$ on their respective (disjoint) domains of definition.

The same definition can also be applied to $\textrm{lin-}\mathbb H$ to obtain the category of hierarchical directed hypergraphs with linearity constraints $\textrm{hier-lin-}\mathbb H$ . Similarly, we define the associated category for type systems; however, we do not impose any of the two conditions related to $P(\cdot)$ for the type category, i.e. $\textrm{hier-lin-}\mathbb H_\textrm{type} = [\textrm{hier-lin-}\mathbb C_\textrm{type}, \mathrm{Set}]$ is the full presheaf category, rather than a subcategory of it.

As with the incidence morphisms $\textit{src}$ and $\textit{tgt}$ , we will drop the $st$ subscript for the IO arrows $\textit{in}$ and $\textit{out}$ when it can be inferred from the domain of definition.

Just as in the discussion of Definition 3.2, we interpret the split monomorphisms as equivalent to requiring $\mathbf{F}_{st} \subseteq \mathbf{E}_{st}$ . Taking over terminology from Drewes, 2002Frank Drewes, Berthold Hoffmann and Detlef Plump. 2002. Hierarchical Graph Transformation. Journal of Computer and System Sciences 64, 2 (March 2002, 249--283). doi: 10.1006/jcss.2001.1790, we call elements $f \in \mathbf{F}_{st}$ the frames of $H_\textrm{hier} \in \textrm{hier-}\mathbb H$ . For each frame $f$ , there is thus a unique input edge $in(f)$ and a unique output edge $out(f)$ in $H_\textrm{hier}$ that have respectively $s$ targets and zero sources, and $t$ sources and zero targets.

By the first condition we imposed on $P(\cdot)$ , the partial function $parent$ mapping connected components to their parent edge:

parent([e]) = \begin{cases} p\quad&\textrm{ if there exists } p \in P(e)\\ \bot&\textrm{ otherwise.} \end{cases}

is well-defined. We call the subgraphs of $H_\textrm{hier}$ that share a same parent a region of $H_\textrm{hier}$ ². The subgraph of vertices and edges without parent is the root region of $H_\textrm{hier}$ .

The minIR computation graph #

In minIR, the vertices are the values of the computation, while the hyperedges define the operations. This imposes some constraints for a hypergraph to be a valid minIR graph:

all values in minIR must have a unique operation that defines them;
values that are linear must also have a unique operation that uses them;
the graph must be acyclic, meaning that no value can be defined in terms of itself.

This can be expressed as a hypergraph with linearity constraints by choosing $\mathbf V_\textit{tgt} = \mathbf V$ and $\mathbf V_\textit{src} = \mathbf V_L$ , where $\mathbf V_L \subseteq \mathbf V$ is the subset of linear values.

The following definition then comes as no surprise:

Definition 3.4MinIR graph

Let $\Sigma \in \textrm{hier-lin-}\mathbb H_\textrm{type}$ be a type system. A minIR graph $H$ typed in $\Sigma$ is an object of $\textrm{hier-lin-}\mathbb H$ that is $\Sigma$ -typed and such that the adjacency relation $\leadsto$ is acyclic. We call $\mathbf V_\textit{src} = \mathbf V_L$ , the linear values of $H$ .

In the context of minIR, $\leadsto$ relations encode the data flow of values across the computation. The lack of explicit operation ordering differentiates minIR (and HUGR) from most classical IRs, which, unless specified otherwise, typically assume that instructions may have side effects and thus cannot be reordered. All quantum operations (and the classical operations we are interested in) are side-effect free, which significantly simplifies our IR.

Input and output values #

Notice that in Definition 3.4, it is not enforced that every value has a definition, i.e. there might be $v \in \mathbf V \setminus \mathbf V_\textrm{tgt}$ ; nor that every value with a linear type is in $\mathbf V_L$ , i.e. if $\tau: H \to \Sigma$ is the typing morphism and $\mathbf T_L$ are the linear types in the type system, there might be $v \in \tau^{-1}(\mathbf T_L) \setminus \mathbf V_L$ .

This would be easy to fix: we could on the one hand enforce the equality $\mathbf V_\textrm{tgt} = \mathbf V$ , thus guaranteeing that every value has a unique definition in the graph. On the other hand, we could define $\mathbf V$ as the coproduct $\mathbf V = \mathbf V_L \sqcup \mathbf V_\textit{NL}$ , where $V_\textit{NL}$ is a new object introduced to explicitly capture the set of non-linear values. Morphisms in this category would guarantee that the linearity of values is always preserved, and thus in particular the type morphism would map a value to a linear type if and only if it is linear.

Instead, we opt to allow undefined values and unused linear values to be able to express rewrite rules that match sugraphs of minIR graphs within the same category.

Definition 3.5Inputs and outputs of minIR graphs

For a minIR graph $H$ with typing morphism $\tau: H \to \Sigma$ , we call the set $I = \mathbf V \setminus \mathbf V_\textit{tgt}$ the input values of $H$ and $O = \tau^{-1}(\mathbf T_L) \setminus \mathbf V_L$ its output values, where $\mathbf T_L$ are the linear values in $\Sigma$ .

If $I = O = \varnothing$ , we say that $H$ is IO-free.

Note that by this definition, an output value in $O$ always has a linear type! This is because non-linear values do not need to be treated specially when they are outputs: unlike linear values that must always be used in a non-output position, non-linear values may have no outgoing edge, in which case they are simply discarded in the computation.

Structured control flow #

The operations and values of a minIR graph define the data flow of a program. However, a program must also be able to control and change the data flow at run time in order to express loops, conditionals, function calls etc. This is the program control flow, which minIR expresses using regions and so-called structured control flow.

Using regions, any non-trivial control flow (function calls, conditionals, loops etc.) is captured by a frame, a “black box” operation within the data flow of the program. Its implementation is then defined in the nested region of the frame. This can be used for function calls, but also for branches of control flow. A simple function call, that unconditionally redirects the control flow to the operations within a nested region, could for example be represented as follows:

In this figure and below, circles are SSA values (the vertices of the hypergraph), while the edges spanning between them are the operations. Edges attached to the white half of circles are value definitions, while hyperedges attached to the black half of circles are value uses. The call and + operation can be read left to right: for instance, the two values x and y on the left of call are inputs (the operation uses those values, and is thus attached to the black half of the circles), whereas the x + y value on the right is the output of the call operation (the operation defines this value, which is thus attached to the white half of the circle).

Dashed arrows indicate hierarchical dependencies that map the frame edge to the two input and output edges in the child region; dashed rectangles mark the non-root regions of the graph.

Importantly, the frame edge representing the call operation must intuitively “forward” all its input values to the in operation of the child region, and similarly passes on the value at the out operation to the output value of the call output in the parent graph. Passing function arguments and retrieving returned values in this fashion will be very familiar to any computer scientist. Unlike most programming languages, this is also how in minIR values are passed to and from any control flow constructs we would wish to model.

In terms of graph structure, this relation between values in parent and child regions means that the arity and types of the inputs and outpus of call fix the signatures of the child in and out operations. Definition 3.3 already ensures that the input and output arities of in and out are correct. The correct typing of these input and output values will be ensured by the type system, which we discuss in a separate section below.

To handle constructs that require more than one child region, such as an if-else statement, we can use frames that have zero input and one output:

The output of the ifblock is intuitively a higher-order type representing an operation that takes two inputs and ouputs the sum.

An if-else statement might then look as follows:

The if and else blocks must expect the same input and output values. This is key to respecting any linearity constraints that values passed to ifelse might be subject to. By definition, all operations that use or define a value $v$ will be in the same region – in other words, values are only available within their defining region. This in effect implements “variable scoping”. With some imagination, this construction can easily be adapted to model loops, complex control flow graphs, or any other control flow structures.

Why not plain branch statements? #

There is a simpler – and at least as popular – way of expressing control flow in IRs without requiring regions and operation hierarchies, using branch statements³. For instance, LLVM IR provides a conditional branch statement

br i1 %cond, label %iftrue, label %iffalse

that will jump to the operations under the iftrue label if %cond is true and to the iffalse label otherwise.

This is a simple and versatile approach to control flow that can be used to express any higher-level language constructs. Unfortunately, conditional branching does not mix well with linear values.

Linearity, as defined in Definition 3.4, is a simple constraint to impose on minIR graphs. In the presence of conditional branching, however, the constraint would have to be relaxed to allow for single use in each mutually exclusive branch of control flow. For instance, the following two uses of b should be allowed (in pseudo-IR):

b := h(a)
<if cond> c := x(b)
<else>    d := h(b)

This is a much harder invariant to check on the IR: linearity would no longer be enforceable as a syntactic constraint on the minIR graph as in Definition 3.4, but would instead depend on the semantics of the operations to establish mutual exclusivity of control flow⁴. Forbidding arbitrary branching in minIR and resorting instead to structured control flow as described above to express control flow is just as expressive and gives the linearity constraint a much simpler form.

Type graph #

We have seen how minIR graphs impose some structure on the types of computations that can be expressed: a linear value cannot be used by two (or zero) operations, frames will always have a unique in and out operation in their child region with correct arities, etc.

However, without a “good” type system and associated semantics, it is still possible to express nonsensical programs: we have mentioned for instance earlier that it is up to the type system to enforce that the types of the in and out operations match the types of the frame. Similarly, it is possible to construct programs that break linearity: take the ifelse operation discussed in the previous section, but now replace its semantics to be do-in-parallel, i.e. it will execute both the if-block and else-block in parallel on the inputs that it is given. This would violate the linearity of its inputs, but would nonetheless be a syntactically valid minIR graph!

To resolve this we present here some typed operations, along with their semantics, that can be used to construct well-behaved type systems: programs typed in this system model the kind of quantum programs that we are interested in expressing and are guaranteed to be valid computations. Categorising all valid constructions or an exhaustive enumeration of conditions that type systems must satisfy to guarantee the validity of programs is beyond the scope of this thesis. It is in practice often straightforward to combine and extend the elements presented here to support further custom syntactic constructs and types.

Basic types and operations #

The most elementary types in our computations are Bits and Qubits. The former is typically known as a Boolean and represents the purely classical values 0 and 1. The latter is the canonical quantum example of a linear type. Indeed, just like values in minIR graphs, the type system in $\textrm{hier-lin-}\mathbb H_\textrm{type}$ distinguishes between linear and non-linear types.

Other typical classical types such as integers, floats, strings, custom algebraic data types (ADT) etc could also be introduced as required. In the figure below, we for instance introduce the Angle type to represent rotation angles that parametrise quantum gates. Further examples of linear types, on the other hand, include higher-dimensional qudits, but also any ADT that contains a linear type within it.

As we saw in section 2.1, the number of input qubits in pure quantum operations will match the number of output qubits: the single-qubit h (hadamard gate) and two-qubit cx (controlled NOT) operations thus have one or two Qubits as both inputs and outputs rz (Z rotation) is also a single-qubit operation, but it takes an additional input of type Angle to specify the rotation angle.

On the pure classical side, we are free to add any side-effect free operations on our types; in our example we model addition + on Angles and negation not on Bits. In the type system $\Sigma \in \textrm{hier-lin-}\mathbb H_\textrm{type}$ , each type is represented by a single vertex.

In our example, we thus have three vertices:

We introduce a different colour for each type. Operations such as cx are represented by a hyperedge with two sources on Qubit and two targets on Qubit. As in the previous diagrams, we can distinguish operation inputs from outputs by whether they are attached to the dark or light half of the type vertex: the rz operation thus has one Qubit input, one Angle input and one Qubit output.

As you can tell from the diagram, whilst Qubit is a linear type in the type system, it is not a linear value in the sense of a minIR graph: the Qubit type has multiple uses and defines in the cx operation alone. This is the key difference between $\textrm{lin-}\mathbb H_\textrm{type}$ and $\textrm{lin-}\mathbb H$ .

Qubit allocation and measurement #

We also introduce non-pure quantum operations qalloc and measure which respectively “create” a qubit (so no input, one Qubit output) and “destroy” it (one Qubit input, one Bit output – depending on whether the qubit was projected onto the $\ket{0}$ or $\ket{1}$ state). Remember that the reason these operations seem to “break” the laws of pure quantum physics is because they result from interactions with the classical environment.

measure is fundamental, as it connects quantum values with classical ones!

Region definition and structured control flow #

Our type system is so far missing a crucial aspect of minIR: the hierarchical structures. For this we need frame types, i.e. frames in the type graph. We must introduce a distinct type for each possible type signature of a frame. To keep this as simple as possible, we will introduce exactly one type for each signature.

If we write $T$ for the set of types in our type system (i.e. Bit, Qubit and Angle in our example), then a type signature of an edge is given by a pair $(I, O)$ of ordered lists of types $I, O \in T^\ast$ . For each such $(I, O)$ pair, we introduce

the frame type regiondef<I, O>,
the in and out types in<I,O> and out<I,O>,
along with a new non-linear type Region<I, O>, the higher-order type representing a region with inputs $I$ and outputs $O$ .

The regiondef<I, O> op takes zero inputs and returns one output Region<I, O>, whereas in<I,O> takes zero inputs and returns values of type $I$ and out<I,O> takes inputs of type $O$ and returns nothing. For instance, for $I = ($ Qubit, Qubit $)$ and $O = ($ Qubit, Bit $)$ , we have the following type graph.

Note that there is an important distinction in $\textrm{hier-lin-}\mathbb H_\textrm{type}$ in comparison to $\textrm{hier-lin-}\mathbb H$ : there is no notion of regions in the type system: the Qubit and Bit types in the above diagram would be in the child region of regiondef<I, O> if it were a graph in $\textrm{hier-}\mathbb H$ , but in the type system, they might also be used by other operations in other regions (such as cx, rz, h etc. defined earlier).

Using the Region<I,O> types, it is then easy to define typed operations for any structured control flow of interest, such as the if-else example above. The following figure gives an overview of the entire type system of our example. For display purposes, we have included multiple copies of each type vertex; we remind the reader that in the actual type graph, all circles of the same type (colour) are one and the same.

A complete minIR type graph, following the example in this section. Value vertices with the same label (and same colour) form a single vertex in the type graph. They have been split into multiple vertices in this representation for better readability. The data types and op types with the <I,O> suffix are parametrised on the signature type (I,O)(I,O)(I,O) for I,O∈T∗I,O \in T^\astI,O∈T∗. — A complete minIR type graph, following the example in this section. Value vertices with the same label (and same colour) form a single vertex in the type graph. They have been split into multiple vertices in this representation for better readability. The data types and op types with the `<I,O>` suffix are parametrised on the signature type $(I,O)$ for $I,O \in T^\ast$ .

An example minIR program #

Taking a step back, let us make the introduced ideas more concrete through an example. We demonstrate how a simple program written in textual form can be translated and expressed as a minIR graph. All statements are of the form

x, y, ... := op(a, b, c, ...)

where a, b, c etc are the SSA values passed to op (or used by op), and x, y etc are the SSA values returned by op (or defined by op). We use curly bracket to define the child region of a regiondef operation. A valid minIR program might then look as follows:

 1main := regiondef<(Qubit, Qubit), (Qubit, Bit)> {
 2    q0, q1 := in()
 3
 4    q0_1 := h(q0)
 5    q0_2, q1_1 := cx(q0_1, q1)
 6
 7    m0 := measure(q0_2)
 8
 9    ifregion := regiondef<(Qubit,), (Qubit,)> {
10        q1 := in()
11        out(q1)
12    }
13    elseregion := regiondef<(Qubit,), (Qubit,)> {
14        q1 = in()
15        q1_1 := h(q1)
16        out(q1_1)
17    }
18    q1_2 := ifelse(m0, q1_1, ifregion, elseregion)
19
20    out(q1_2, m0)
21}

Note that the in() and out(..) operations are only allowed within nested regions (as required by the type system). We have omitted the type parameters on these operations, as it mirrors exactly the paremeter of the regiondef.

It corresponds to the two minIR graphs on the following page. We use “wiggly hyperedges” that stretch between values, as in the first figure. They may look unusual if you are used to computation graphs. One can opt to draw the same graph with boxes for hyperedges and wires for values, yielding the second figure. The two representations are equivalent, but the rewriting semantics are most explicit when viewing values as vertices.

An example of an IO-free minIR graph. The vertex colours indicate their types in the type system presented in the previous figure. The main, ifregion and elseregion ops are all of op type regiondef (with type parameters omitted), labelled here with custom names for clarity. The type parameters of the ifelse, in and out op type have similarly been omitted. All other operation types are given as labels on the edges. — An example of an IO-free minIR graph. The vertex colours indicate their types in the type system presented in the previous figure. The `main`, `ifregion` and `elseregion` ops are all of op type `regiondef` (with type parameters omitted), labelled here with custom names for clarity. The type parameters of the `ifelse`, `in` and `out` op type have similarly been omitted. All other operation types are given as labels on the edges.

An equivalent representation of the computation above, now representing operations as boxes and values as wires. The arrow direction indicates the flow from value definition to value use(s). Dashed arrows have been changed to point to regions instead of individual operations.

Differences to the quantum circuit model #

We conclude this presentation of minIR by highlighting the differences between this IR-based representation and the quantum circuit model that most quantum computing and quantum information scientists are familiar with⁵.

When restricted to purely quantum operations and no nested regions, the string diagram representation of a minIR graph (i.e. operations as boxes and values as wires) looks very similar to a quantum circuit. There is, however, a fundamental shift under the hood from reference to value semantics – to borrow terminology from C++.

In the reference semantics of quantum circuits, operations are typically thought of as “placed” on a qubit (the “lines” in the circuit representation), for instance, by referring to a qubit index. This qubit reference exists for the entire computation duration, and the quantum data it refers to will change over time as operations are applied to that qubit.

In the value semantics of computation graphs and SSA, on the other hand, qubits only exist in the form of the data they encode. When applying an operation, the (quantum) data is consumed by the operation and new data is returned. Given that the input data no longer exists, linearity conditions are required to ensure that no other operation can be fed the same value.

To make the difference clear, compare the program representations of the following computation:

Quantum circuit (pytket)⁶

import pytket as tk
circ = tk.Circuit(2)
circ.H(0)
circ.CX(0, 1)
circ.X(1)

SSA (minIR)

q0_0, q1_0 := in()
q0_1       := h(q0_0)
q0_2, q1_1 := cx(q0_1, q1_0)
q1_2       := x(q1_1)
out(q0_2, q1_2)

In value semantics, it becomes much harder to track physical qubits across their lifespan. This has very practical implications: without the convenient naming scheme, it would, for example, be non-trivial to count how many qubits are required in the SSA representation of the computation above. However, it is a drastically simpler picture from the point of view of the compiler and the optimiser – hence its popularity in classical compilers. When operations are defined based on qubit references, the compiler must carefully track the ordering of these operations: operations on the same qubit must always be applied in order. Through multi-qubit gates, this also imposes a partial ordering on operations across different qubits that must be respected.

SSA values remove this dependency tracking altogether: the notion of physical qubit disappears, and the ordering of statements becomes irrelevant. All that matters is connecting each use of a value (i.e. an input to an operation) with its unique definition, the output of a previous operation. In other words, the global ordering imposed by reference semantics is replaced by a causal order on the diagram Kissin., 2019Aleks Kissinger and Sander Uijlen. 2019. A categorical semantics for causal structure. Logical Methods in Computer Science Volume 15, Issue 3 (August 2019). doi: 10.23638/lmcs-15(3:15)2019.

All the concepts of minIR embed themselves very easily within the MLIR-based quantum IRs and the HUGR IR Mark K., 2025Seyon Sivarajah, Alan Lawrence, Alec Edgington, Douglas Wilson, Craig Roy, Luca Mondada, Lukas Heidemann, Ross Duncan Mark Koch. 2025. HUGR: A Quantum-Classical Intermediate Representation. Retrieved (talk recording) from https://www.youtube.com/live/D8esZrt7ogk?feature=shared&t=5217. In this sense, our toy IR serves as the minimum denominator across IRs and compiler technologies so that proposals and contributions we are about to make can be applied regardless of the underlying technical details.

By waiving goodbye to the circuit model, we have been able to integrate much of the theory of traditional compiler design, bringing us in the process much closer to traditional compiler research and the large-scale software infrastructure that is already available. This gives us access to all the classical optimisation and program transformation techniques developed over decades. Using structured control flow, we were also able to model linear resources such as qubits well – by using value semantics and SSA, checking that no-cloning is not violated is as simple as checking that each linear value is used exactly once.

Finally, this new design is also extremely extensible. Not only does it support arbitrary operations, but the type system is also very flexible. There is dedicated support for linear types, but this does not have to be restricted to qubits: lists of qubits could be added or even, depending on the target hardware, higher dimensional qudits, continuous variable quantum data, etc.

And in fact, we will see in section 3.4 that it is not. ↩︎
Note that a region may not be a connected subgraph. Albeit, it is a simple exercice to convince yourself that any non-root region contains either one or two connected components. ↩︎
You may know this from prehistoric times as the goto statement, in languages such as Fortran, C, and, yes, even Go. ↩︎
You might be thinking “oh, but all that is required here are phi nodes!”, if you are familiar with those. No – you’d also need a sort of “phi inverse”. Besides, see this discussion for more arguments on why no phi nodes. ↩︎
Note that these comments apply specifically to characteristics of quantum circuits. Other diagrammatic representations of quantum processes in use, such as string diagrams, quantum combs etc may not share the same properties. ↩︎
This is python code: pip install pytket. ↩︎

3.4. Graph transformation in minIR

As discussed in section 3.2, computation graphs with linear values, such as minIR, must adopt strict graph transformation semantics to ensure that linear constraints are satisfied at all times. In this section, we use the minIR graph category presented in the previous section to define transformation semantics that lean on the double pushout (DPO) Ehrig, 1976Hartmut Ehrig and Hans-Jörg Kreowski. 1976. Parallelism of manipulations in multidimensional information structures and sesqui-pushout (SqPO) Corrad., 2006Andrea Corradini, Tobias Heindel, Frank Hermann and Barbara König. 2006. Sesqui-Pushout Rewriting. In Graph Transformations, Berlin, Heidelberg. Springer Berlin Heidelberg, 30--45. doi: 10.1007/11841883_4 semantics in adhesive categories Lack, 2005Stephen Lack and Pawel Sobocinski. 2005. Adhesive and quasiadhesive categories. RAIRO - Theoretical Informatics and Applications 39, 3 (July 2005, 511--545). doi: 10.1051/ITA:2005028.

Adhesivity of hypergraph categories #

The natural place to start this section is by studying which of the categories defined in section 3.3 are adhesive. From adhesivity follows that transforming graphs using DPO and SqPO constructions is well-defined and unique, at least in the regimes of interest to us.

A category is said to be adhesive if it has all pullbacks and pushouts along monos, as well as some compatibility conditions between them, the so-called “Van Kampen squares”. We refer to the literature (e.g. Lack, 2005Stephen Lack and Pawel Sobocinski. 2005. Adhesive and quasiadhesive categories. RAIRO - Theoretical Informatics and Applications 39, 3 (July 2005, 511--545). doi: 10.1051/ITA:2005028) for a complete definition. For our purposes, the following two results are sufficient:

Every presheaf topos $[\mathbb C, \mathrm{Set}]$ is adhesive (Corollary 3.6 in Lack, 2005Stephen Lack and Pawel Sobocinski. 2005. Adhesive and quasiadhesive categories. RAIRO - Theoretical Informatics and Applications 39, 3 (July 2005, 511--545). doi: 10.1051/ITA:2005028);
Every full subcategory $\mathbb D \subseteq \mathbb C$ of an adhesive category is adhesive if the pullbacks and pushouts in $\mathbb C$ of objects in $\mathbb D$ are again in $\mathbb D$ (a simple result; if the Van Kampen squares commute in $\mathbb C$ , they must commute in $\mathbb D$ ).

A first result immediately follows from the first result:

Proposition 3.1Adhesivity of directed hypergraphs

The category

\mathbb H

of directed hypergraphs is adhesive.

Proof

It is a presheaf.

This does not immediately generalise to $\textrm{lin-}\mathbb H$ , as unlike $\mathbb H$ , Definition 3.2 imposes that $E$ be a coproduct. However, the result still holds:

Proposition 3.2Adhesivity of hypergraphs with linearity constraints

The categories

\textrm{lin-}\mathbb H

and

\textrm{lin-}\mathbb H_\textrm{type}

are adhesive.

Proof

$\textrm{lin-}\mathbb H$ is a full subcategory of the adhesive category $[\textrm{lin-}\mathbb C, \mathrm{Set}]$ . We must show the existence of pullbacks and pushouts along monos in $\textrm{lin-}\mathbb H$ .

Pullbacks. Consider a pullback $A \xleftarrow{p_a} P \xrightarrow{p_b}$ of $A \xrightarrow{a} C \xleftarrow{b} B$ in $[\textrm{lin-}\mathbb C, \mathrm{Set}]$ , with $A, B, C \in \textrm{lin-}\mathbb H$ . We must show that $P$ is in $\textrm{lin-}\mathbb H$ . Colimits are computed pointwise in presheaves, so we know that $P(E)$ is the pullback of $A(E) \to C(E) \leftarrow B(E)$ in $\textrm{Set}$ . If we can show that $P(E)$ is the coproduct of $P(E_{st})$ for $s, t \in \mathbb{N}$ , then we are done.

Let $v \in P(E)$ . Because $A(E)$ and $B(E)$ are coproducts in Set, i.e. a disjoint union, there must be $s, t, s', t' \in \mathbb{N}$ such that $p_a(v) \in A(E_{st})$ and $p_b(v) \in B(E_{st'})$ . By naturality of $a$ and $b$ , it follows that $a(p_a(v)) \in C(E_{st})$ and $b(p_b(v)) \in B(E_{s't'})$ . But by commutativity of the pullback diagram, $a(p_a(v)) = b(p_b(v))$ , and thus $s = s'$ and $t = t'$ . We conclude by unicity of the pullback that $v \in P(E_{st})$ and thus $P(E) = \bigsqcup_{st} P(E_{st})$ .

Pushouts. The same argument as for pullbacks also applies to pushouts: given a pushout $P$ of $A \xrightarrow{a} C \xleftarrow{b} B$ in $[\textrm{lin-}\mathbb C, \mathrm{Set}]$ with $A, B, C \in \textrm{lin-}\mathbb H$ , an element $v \in P(E)$ that makes the pushout square commute must have preimages in $A(E_{st}), B(E_{st})$ and $C(E_{st})$ for some $s, t \in \mathbb{N}$ . Thus the pushout distributes over the coproduct, and we can conclude that $P(E)$ is the coproduct of pushouts.

The same argument also applies to $\textrm{lin-}\mathbb H_\textrm{type}$ ¹.

Now to the spicy stuff:

Proposition 3.3Non-adhesivity of hierarchical hypergraphs

Whilst

\textrm{hier-lin-}\mathbb H_\textrm{type}

is adhesive, the category

\textrm{hier-lin-}\mathbb H

is NOT adhesive.

Proof

$\textrm{hier-lin-}\mathbb H_\textrm{type}$ is a presheaf – hence adhesive.

The following pushout square shows that $\textrm{hier-lin-}\mathbb H$ cannot be adhesive: the pushout square is valid in $[\textrm{hier-lin-}\mathbb C, \mathrm{Set}]$ , but the pushout at the bottom right is not in $\textrm{hier-lin-}\mathbb H$ , because the child regions cannot each be assigned a unique parent.

Double pushout semantics #

From Proposition 3.3, it follows that minIR graph transformations can be performed through the double pushout (DPO) construction Ehrig, 1976Hartmut Ehrig and Hans-Jörg Kreowski. 1976. Parallelism of manipulations in multidimensional information structures in the $[\textrm{hier-lin-}\mathbb C, \mathrm{Set}]$ category.

Definition 3.6Double pushout (DPO) transformation

A transformation rule $p$ in an adhesive category $\mathbb A$ is a span $L \leftarrow I \rightarrow R$ . For objects $G, H \in \mathbb A$ , we then write $G \xRightarrow{(p,m)} H$ or $G \xRightarrow{p} H$ if there is a matching morphism $m: L \to G$ and a context object $C$ along with morphisms $G \leftarrow C \to H$ and $I \to C$ such that the following diagram commutes and both squares are pushouts:

If the DPO transformation $G \xRightarrow{(p,m)} H$ exists for some rule $p$ and match $m$ , then we say $G \Rightarrow H$ is a valid DPO rewrite.

To ensure that a DPO rewrite is valid in minIR, we must impose certain conditions. Let $G$ be an IO-free minIR graph, i.e. $G \in \textrm{hier-lin-}\mathbb H$ , there is a morphism $G \to \Sigma$ in $\textrm{hier-lin-}\mathbb H_\textrm{type}$ for some type system $\Sigma$ and $I = O = \varnothing$ .

A DPO rewrite $G \Rightarrow H$ is a valid minIR DPO rewrite if there is a transformation $G \xRightarrow{p} H$ in $[\textrm{hier-lin-}\mathbb C, \mathrm{Set}]$ and

$p$ is left-mono, i.e. the morphism $I \to L$ is mono,²
the pushout complement $C$ and pushout $H$ also exist in the slice category $\textrm{hier-lin-}\mathbb H_\textrm{type} \searrow \Sigma$ ,
$H$ satisfies the hierarchy condition of Definition 3.3,
$H$ is IO-free.

Proposition 3.4

G

is a minIR graph and

G \Rightarrow H

is a valid minIR DPO rewrite, then

H

is a valid minIR graph.

Proof

We know by construction that $H \in [\textrm{hier-lin-}\mathbb C, \mathrm{Set}]$ . We must show that $H$ further satisfies the constraints to be an object in the full subcategory of minIR graphs.

The first condition is standard in DPO and guarantees that $C$ and $D$ are unique if they exist.

The third condition we impose on $H$ corresponds directly to the constraint that defines hierarchical graphs in $\textrm{hier-lin-}\mathbb H$ . The fourth condition ensures that valid minIR DPO rewrites map IO-free graphs to IO-free graphs.

Finally, the second condition is imposed to ensure well-typedness of $H$ . The functor $\textrm{hier-lin-}\mathbb H \to \textrm{hier-lin-}\mathbb H_\textrm{type}$ that forgets the $\textit{def}$ and $\textit{use}$ morphisms is a left adjoint (it possesses a right Kan extension defined pointwise), and thus preserves colimits. The images of $C$ and $H$ thus form pushout squares in $\textrm{hier-lin-}\mathbb H_\textrm{type}$ , and by unicity, must match the pushout squares in $\textrm{hier-lin-}\mathbb H_\textrm{type} \searrow \Sigma$ . Hence $H$ is well-typed.

The restriction to rewrites of IO-free graphs is not a restriction of generality: if we are interested in rewriting computations with inputs and outputs, we can always express them as IO-free graphs by adding input and output ops with the values in $I$ as outputs, respectively $O$ as inputs. We assign them dedicated types distinct from all other operations; these operations will never be matched by transformation rules and can be removed at the end of rewriting.

Generalising to sesqui-pushouts #

We restricted minIR rewrites to DPO transformations obtained form left-mono rules, to ensure that the construction is unique. This excludes rules that may identify two values in $G$ but split them into two different values in $H$ . Such rules allow for cloning values, which is a useful transformation in minIR for non-linear values. An example of a transformation rule that we would like to allow in minIR:

For this example we added a 2x operation that multiplies an angle value passed as input by two. The transformation rule replaces a rotation of angle $2\alpha$ by two rotations of angle $\alpha$ by cloning the input angle.

Such semantics are possible using the sesqui-pushout construction (SqPO) by Corradini et al. Corrad., 2006Andrea Corradini, Tobias Heindel, Frank Hermann and Barbara König. 2006. Sesqui-Pushout Rewriting. In Graph Transformations, Berlin, Heidelberg. Springer Berlin Heidelberg, 30--45. doi: 10.1007/11841883_4. We can reuse the same $\xRightarrow{(p,m)}$ notation: when DPO is restricted to left-mono rules as we have done, SqPO is a generalisation of DPO (i.e. the construction coincides whenever the DPO exists).

Definition 3.7Sesqui-pushout (SqPO) transformation

A transformation rule $p$ in an adhesive category $\mathbb A$ is a span $L \leftarrow I \rightarrow R$ . For objects $G, H \in \mathbb A$ , we then write $G \xRightarrow{(p,m)} H$ or $G \xRightarrow{p} H$ if there is a matching morphism $m: L \to G$ and a context object $C$ along with morphisms $G \leftarrow C \to H$ and $I \to C$ such that $C$ is the final pullback complement of $I \to L \xrightarrow{m} G$ and the right square is a pushout:

If the SqPO transformation $G \xRightarrow{(p,m)} H$ exists for some rule $p$ and match $m$ , then we say $G \Rightarrow H$ is a valid (SqPO) rewrite.

The left square is redundant in the diagram above, as it follows from the requirement that $C$ be the final pullback complement (FPC). It is kept to highlight the similarities to DPO. As the commuting diagram indicates, the final pullback complement (FPC) construction forms a pullback square. Furthermore, unlike pushout complements, the FPC is defined by a universality property that ensures uniqueness if it exists. We refer to Corrad., 2006Andrea Corradini, Tobias Heindel, Frank Hermann and Barbara König. 2006. Sesqui-Pushout Rewriting. In Graph Transformations, Berlin, Heidelberg. Springer Berlin Heidelberg, 30--45. doi: 10.1007/11841883_4 for the exact FPC construction.

With SqPO, we can define the set of valid minIR rewrites as given by the SqPO transformations $G \xRightarrow{p} H$ in $[\textrm{hier-lin-}\mathbb C, \mathrm{Set}]$ satisfying the relaxed set of conditions

the pushout complement $C$ and pushout $H$ also exist in the slice category $\textrm{hier-lin-}\mathbb H_\textrm{type} \searrow \Sigma$ ,
$H$ satisfies the hierarchy condition of Definition 3.3,
$H$ is IO-free.

We conclude this section with a discussion of some of the properties of minIR transformations using SqPO (referring again to Corradini Corrad., 2006Andrea Corradini, Tobias Heindel, Frank Hermann and Barbara König. 2006. Sesqui-Pushout Rewriting. In Graph Transformations, Berlin, Heidelberg. Springer Berlin Heidelberg, 30--45. doi: 10.1007/11841883_4 or König, 2018Barbara König, Dennis Nolte, Julia Padberg and Arend Rensink. 2018. A Tutorial on Graph Transformation. In Graph Transformation, Specifications, and Nets - In Memory of Hartmut Ehrig. Springer, 83--104. doi: 10.1007/978-3-319-75396-6_5 for a more detailed explanation of the concepts discussed):

Deletion in unknown context. A key difference between DPO and SqPO transformations is that SqPO transformations on graphs will delete edges attached to a vertex $v_d$ that is deleted by the transformation rule (i.e. $v_d \in L$ but $v_d \not\in R$ of the rule). The DPO transformation on the other hand is only well-defined when all edges incident to $v_d$ are in the image of $m$ and thus explicitly deleted (this is known as the dangling condition).

As minIR rewrites follow SqPO semantics, transformation rules such as the following are allowed:

Here $\times$ denotes the multiplication of angles and $\textsf{const(0)}$ the zero angle. Any operation that would be connected to the starred value on the left would be deleted by this rule. However such an implicit operation deletion only yields valid minIR graphs if all incident values are non-linear and none of the target values of the deleted operation are used.

Non-left-mono rules. As discussed in the introduction to SqPO, the cloning of values is allowed in minIR rewrites. However, linear values may never be cloned (the FPC or pushout will not exist in these cases). Thus any minIR transformation rule will be left-mono on linear values. It must further be left-linear on all (linear and non-linear) values in $I$ that are mapped to outputs in $R$ : if a value $w$ is produced by op applied to $v$ , then cloning $v$ and 'op will result in two definitions of $w$ .

Non-right-mono rules. Non-right-mono rules are allowed in both DPO and SqPO. They result in vertex merges. In minIR, the situation for right-mono is symmetric to left-mono: the map must be mono on linear values (otherwise the same value will have multiple uses or definitions) and it must be mono on all values in $I$ that are mapped to inputs in $L$ (otherwise a value in the rewritten minIR graph will have more than one value definition).

In fact, a much simpler argument applies: the category $\textrm{lin-}\mathbb H_\textrm{type}$ is isomorphic to the presheaf category $[\textrm{lin-}\tilde{\mathbb C}_\textrm{type}, \mathrm{Set}]$ , where $\textrm{lin-}\tilde{\mathbb C}_\textrm{type}$ is obtained from $\textrm{lin-}\mathbb C_\textrm{type}$ by removing the object $E$ . Adhesivity follows. ↩︎
This is often called left-linear in the literature. We avoid this term in this thesis to avoid confusion with the linearity property of values in minIR. ↩︎

3.5. MinIR rewriting, operationally

The previous section proposed to view minIR rewrites as the result of a (DPO or SqPO) graph transformation. This yields valid rewriting semantics elegantly (and with little effort!). However, the conditions that must be imposed on the transformation to be valid, along with the fact that pushouts may not exist mean that the existence of a rewrite given a transformation rule and a match is not guaranteed.

In this section, we address this by considering a more restricted notion of minIR rewriting, for which the existence of the right-hand side of the rewrite is guaranteed. In addition, in place of the categorical presentation of the last section, we express the rewriting operation operationally, i.e. as data and a procedure on sets that translates directly into an algorithmic implementation.

We find that this rewrite definition is sufficient in practice. We conclude the section with an example of how more complex rewrites can be achieved by composition of simpler rewrites that can be expressed in this framework.

Graph glueings and rewrites #

Throughout, we consider graph glueings on disjoint vertex and (hyper)edge sets. To underline this, we will use the $\sqcup$ symbol to denote disjoint set unions.
As we will be working exclusively with vertex and edge sets in this section (as opposed to the objects in the indexing category), we will drop the bold typeface for sets, writing e.g. $V$ instead of $\mathbf V$ for the set of vertices of a hypergraph.
Finally, all minIR graphs in this section are IO free.

We define local graph rewrites in terms of graph glueings. Consider first the case of two arbitrary graphs $G_1 = (V_1, E_2)$ and $G_2 = (V_2, E_2)$ , along with a relation $\mu\ \subseteq V_1 \times V_2$ . Let $\sim_\mu \ \subseteq (V_1 \sqcup V_2)^2$ be the equivalence relation induced by $\mu$ , i.e. the smallest relation on $V_1 \sqcup V_2$ that is reflexive, symmetric and transitive, and satisifes for all $v_1 \in V_1$ and $v_2 \in V_2$ ,

(v_1, v_2) \in \mu \Rightarrow v_1 \sim_\mu v_2.

Then, we can define

$V = (V_1 \sqcup V_2)/\sim_\mu$ is the set of all equivalence classes of $\sim_\mu$ , and
for $v \in V_1 \sqcup V_2$ , $\alpha_\mu(v) \in V$ is the equivalence class of $\sim_\mu$ that $v$ belongs to.

Definition 3.8Graph glueing

The glueing of $G_1$ and $G_2$ according to the glueing relation $\mu$ is given by the vertices $V = (V_1 \sqcup V_2)/\sim_\mu$ and the edges

E = \{(\alpha_\mu(u), \alpha_\mu(v)) \mid (u,v) \in E_1 \sqcup E_2 \} \subseteq V^2.

We write the glueing graph as $(G_1 \sqcup G_2) / \sim_\mu$ .

In other words, the glueing is the disjoint union of the two graphs, with identification (and merging) of vertices that are related in $\mu$ .

This allows us to define a rewrite on a graph $G$ :

Definition 3.9Graph rewrite

A rewrite $r$ on a graph $G = (V, E)$ is given by a tuple $r = (G_R, V^-, E^-, \mu)$ , with

$G_R = (V_R, E_R)$ is a graph called the replacement graph,
$V^- \subseteq V$ is the vertex deletion set,
$E^- \subseteq E \cap dom(\mu)^2$ is the edge deletion set, and
$\mu: V^- \rightharpoonup V_R$ is the glueing relation, a partial function that maps a subset of the deleted vertices of $G$ to vertices in the replacement graph.

The domain of definition $dom(\mu)$ is known as the boundary values of $r$ .

A graph rewrite per this definition can always be generated by a single pushout (SPO) transformation Löwe, 1991Michael Löwe. 1991. Extended algebraic graph transformation. Retrieved from http://d-nb.info/910935696.

define $L$ as the graph $(V^-, E^-)$ . Then the injection $L \subseteq G$ is the match morphism $L \to G$ ;
the partial map $\mu$ maps a subset of $V^-$ to vertices in the replacement $R = G_R$ . By injectivity of the match morphism, it also defines a partial map $L \rightharpoonup R$ .

We opted for SPO-like semantics in this definition, as they are the simplest to write in set-theoretic terms and coincide with DPO and SqPO in our restricted domain of interest.

The result of the rewrite is computed by gluing the right-hand side $G_R$ to the context subgraph $G_C = (V_C, E_C)$ of $G$ given by

\begin{aligned}V_C &= (V \smallsetminus V^-) \ \cup\ dom(\mu)\\E_C &= (E \smallsetminus E^-)\ \cap\ V_C^2.\end{aligned}

The partial function $\mu$ is a special case of a glueing relation $\mu \subseteq V_C \times V_R$ , and thus defines a glueing of $G_C$ with $G_R$ . The rewritten graph resulting from applying $r$ to $G$ is $r(G) = (G_C \sqcup G_R) / \sim_\mu.$

An example of a graph rewrite is given in the next figure. This is equivalent to an SPO transformation with the graph induced by $V^-$ on the left-hand side, the graph $G_R$ on the right-hand side and the partial map $L \hookrightarrow R$ given by $\mu$ .

Application of a graph rewrite. On the left, the original graph GGG along with the replacement graph GRG_RGR (grey box). On the right, the rewritten graph r(G)r(G)r(G). Only the vertex ggg has been deleted, as other vertices in V−V^-V− are in the boundary dom(μ)dom(\mu)dom(μ) (in orange). The (singleton) edge deletion set is red. The blue edge connects a vertex of V∖V−V \smallsetminus V^-V∖V− to a boundary vertex, and is thus also present on the right-hand side. The purple edge, on the other hand, connects a vertex of V∖V−V \smallsetminus V^-V∖V− to a non-boundary vertex of V−V^-V−, and is thus deleted. — Application of a graph rewrite. On the left, the original graph $G$ along with the replacement graph $G_R$ (grey box). On the right, the rewritten graph $r(G)$ . Only the vertex $g$ has been deleted, as other vertices in $V^-$ are in the boundary $dom(\mu)$ (in orange). The (singleton) edge deletion set is red. The blue edge connects a vertex of $V \smallsetminus V^-$ to a boundary vertex, and is thus also present on the right-hand side. The purple edge, on the other hand, connects a vertex of $V \smallsetminus V^-$ to a non-boundary vertex of $V^-$ , and is thus deleted.

When there are no edges between $V \smallsetminus V^-$ and $V^- \smallsetminus dom(\mu)$ (purple in the example above), this definition corresponds to graph rewrites that can be produced using DPO transformations (see discussion in section 3.4). Otherwise, such edges are deleted.

The notions of graph glueing and graph rewrite can straightforwardly be lifted to hypergraphs and, by extension, to minIR graphs. Notice that in this case, values are glued together, not operations (the former were defined as the graph’s vertices, the latter as its hyperedges).

However, the glueing of two valid minIR graphs – and the result of applying a valid rewrite – may not be a valid minIR graph. Glueing two values of a linear type, for instance, is a sure way to introduce multiple uses (or definitions) of it. Thus, we must be careful to only consider glueings and rewrites of minIR graphs that preserve all the constraints we have imposed in Definition 3.4.

Ensuring rewrite validity: interfaces #

As a sufficient condition for valid minIR rewrites, we introduce minIR interfaces, a concept closely related to the “hypergraph with interfaces” construction of Bonchi, 2017Filippo Bonchi, Fabio Gadducci, Aleks Kissinger, Paweł Sobociński and Fabio Zanasi. 2017. Confluence of Graph Rewriting with Interfaces or the supermaps of quantum causality Hefford, 2024James Hefford and Matt Wilson. 2024. A Profunctorial Semantics for Quantum Supermaps. In Proceedings of the 39th Annual ACM/IEEE Symposium on Logic in Computer Science, July 2024. ACM, 1--15. doi: 10.1145/3661814.3662123. We eschew the presentation of holes as a slice category in favour of a definition that fits naturally within minIR and is sufficient for our purposes.

Let $G$ be a $\Sigma$ -typed minIR graph with data types $T$ and linear types $T_L \subseteq T$ . Consider type strings $S, S' \in T^\ast$ . We define the index sets

\begin{aligned}\mathrm{Idx}(S) &= \{i \in \mathbb{N} \mid 1 \leq i \leq |S|\}\\\mathrm{Idx}_L(S) &= \{i \in \mathrm{Idx}(S) \mid S_i \in T_L\} \subseteq \mathrm{Idx}(S)\end{aligned}

corresponding respectively to the set of all indices into $S$ and the subset of indices of linear types. For any $i \in \mathrm{Idx}(S)$ , we denote by $S_i$ the type at position i in $S$ .

We define a partial order $\preccurlyeq$ ¹ on $T^\ast$ where $S \preccurlyeq S'$ and say that $S'$ can be coerced into $S$ if there exists an index map $\rho: \mathrm{Idx}(S) \to \mathrm{Idx}(S')$ such that

types are preserved: $S_i = S'_{\rho(i)}$ , and
$\rho$ is well-defined and bijective on the restriction to indices of linear types
$\left.\rho\right|_{\mathrm{Idx}_L(S)}: \mathrm{Idx}_L(S) \to \mathrm{Idx}_L(S').$

Definition 3.10Interface

Let $T$ be a set of data types. An interface $I = (U, D)$ is a pair of type strings $U, D \in T^\ast$ .

We say that an interface $I' = (U', D')$ can be coerced into an interface $I = (U, D)$ , written $I \triangleleft I'$ , if $U \succcurlyeq U'$ and $D \preccurlyeq D'$ .

We can define the interface associated with an operation $o$ in a minIR graph $G$ by considering the values used and defined by $o$ . Calling $\tau$ the type morphism on $G$ and assuming $o \in E_{st}$ to be an operation in $G$ with $s$ inputs and $t$ outputs, we define the interface of $o$ in $G$ as the pair of strings in $T^\ast$

I(o) = (\tau(\textit{src}_1(o))\cdots\tau(\textit{src}_s(o)), \tau(\mathit{tgt}_1(o))\cdots\tau(\mathit{tgt}_t(o))).

Similarly, we can assign interfaces to subgraphs of minIR graphs:

Definition 3.11MinIR subgraph

Consider a subset of values and operations $V_H \subseteq V$ and $E_H \subseteq E$ . Define the use and define boundary sets

\begin{aligned} B_U &= \{v \in V_H \mid \mathit{def}\,(v) \in E \smallsetminus E_H \},\\B_D &= \{v \in V_H \mid use(v) \in E \smallsetminus E_H \}.\end{aligned}

The tuple $H = (V_H, E_H)$ of $G$ is called a minIR subgraph of $G$ if there exists a region $R$ of $G$ such that all boundary values of $H$ are in $R$ :

B= B_U \cup B_D \subseteq R.

We write $H \subseteq G$ to indicate that $H$ is a minIR subgraph of $G$ .

Note that $B_U$ is exactly the set of inputs $I$ in the non-IO free minIR graph given by the subgraph $(V_H, E_H)$ of the minIR graph. $B_D$ is a superset of the outputs $O$ of $H$ : it includes all linear values in $H$ that do not have a use in $H$ , but also any non-linear value that has a use outside of $H$ .

Unlike interfaces, subgraph boundary values are not ordered. An ordering of $B \subseteq V$ is a string $S \in V^\ast$ along with a bijective map

\mathrm{ord}: B \to \mathrm{Idx}(S) \quad\textrm{such that}\quad v = S_{\mathrm{ord}(v)}.

If there are strings $S_U, S_D \in V^\ast$ and orderings of $B_U$ and $B_D$

\begin{aligned}\textrm{ord}_U:\ &B_U \to \mathrm{Idx}(S_U)&\quad\textrm{ord}_D:\ &B_D \to \mathrm{Idx}(S_D),\end{aligned}

then we can set $\textit{src}_i\,(H) = (S_U)_i$ and $\textit{tgt}_i\,(H) = (S_D)_i$ in complete analogy to operations. We will write $\textit{src}(H)$ and $\textit{tgt}(H)$ for the strings $\textit{src}_1(H)\cdots\textit{src}_{|S_U|}(H)$ and $\textit{tgt}_1(H)\cdots\textit{tgt}_{|S_D|}(H)$ respectively. We say that the subgraph $H$ implements the interface

I_H = (\tau(\textit{src}(H)), \tau(\mathit{tgt}(H)),

where the type morphism $\tau$ was extended element-wise to strings $V^\ast$ .

Remark, though, that unlike operations, the same subgraph may implement more than one interface as a result of various choices of orderings $\textrm{ord}_U$ and $\textrm{ord}_D$ .

As mentioned, the subgraph $H$ forms a non-IO free minIR graph. We can always construct an IO-free minIR graph from $H$ by adding two operations $o_{in}$ and $o_{out}$ in the root region respectively in $E_{0, |S_U|}$ and $E_{|S_D|, 0}$ inputs-outputs, defined by

\textit{tgt}_i\,(o_{in}) = \textit{src}_i(H),\quad\quad \textit{src}_i(o_{out}) = \textit{tgt}_i\,(H).

We call the resulting graph $\bar{H}$ an interface graph. It implements the interface $I_H$ if $H$ implements $I_H$ . Calling to mind the illustrations of section 3.3, $\bar{H}$ looks like one of the nested regions within regiondef operations that we were considering.

MinIR operation rewrite #

Consider

an operation $o$ in a minIR graph $G$ with values $V,$
an interface graph $\bar{H}$ with values $V_H$ and its associated subgraph $H \subseteq \bar{H}$ , such that $H$ implements an interface $I(o) \triangleleft I_H,$
the index maps $\rho: \mathrm{Idx}(\textit{src}(H)) \to \mathrm{Idx}(\textit{src}(o))$ and $\sigma: \mathrm{Idx}(\textit{tgt}\,(o)) \to \mathrm{Idx}(\textit{tgt}\,(H))$ that define the generalisation $I(o) \triangleleft I_H$ (per Definition 3.10).

We can define a glueing relation $\mu_o \subseteq V \times V_H$

\begin{aligned}\mu_o =\ & \{ \left(\textit{src}_{\rho(i)}(o), \textit{src}_{i}(H)\right) \mid i \in \mathrm{Idx}(\textit{src}(H)) \}\ \cup \\& \{ \left(\mathit{tgt}_{i}\,(o), \mathit{tgt}_{\sigma(i)}(H)\right) \mid i \in \mathrm{Idx}(\textit{tgt}\,(o)) \}.\end{aligned}

This is almost enough to define a rewrite that replaces the operation $o$ in $G$ with the values and operations of $H$ – the interface compatibility constraint $I(o) \triangleleft I_H$ that we have imposed ensures that the resulting minIR graph is valid. Unfortunately, $\mu_o$ is not a partial function as required by Definition 3.4.

This is resolved in the following proposition:

Proposition 3.5MinIR operation rewrite

Let $G$ , $o$ and $H$ such that $I(o) \triangleleft I_H$ , as defined above. Then

\big((G \sqcup H) / \sim_{\mu_o}\!\big) \smallsetminus \{o\},

i.e. the graph obtained by removing the operation $o$ from the glueing of $G$ and $H$ along $\mu_o$ , is a valid minIR graph.

There is a graph $G_R$ with values $V_R$ and a partial function $\mu_o': V \rightharpoonup V_R$ such that the graph (5) is the graph $r_o(G)$ , obtained from the rewrite

r_o = (G_R, dom(\mu_o), \{o\}, \mu_o').

We call $r_o$ the rewrite of $o$ into $H$ .

The definition of the rewrite of $o$ into a graph $H$ behaves as one would expect – the only subtleties relate to handling non-linear (i.e. copyable) values at the boundary of the rewrite. The following example illustrates some of these edge cases.

Rewriting operation ooo in the graph GGG (top left) into the operations o1o_1o1 and o2o_2o2 of the graph Hˉ\bar{H}Hˉ (bottom left). Coloured dots indicate the index maps ρ\rhoρ and σ\sigmaσ from inputs BUB_UBU of Hˉ\bar{H}Hˉ to inputs of ooo, respectively from outputs of ooo to outputs BDB_DBD of Hˉ\bar{H}Hˉ. — Rewriting operation $o$ in the graph $G$ (top left) into the operations $o_1$ and $o_2$ of the graph $\bar{H}$ (bottom left). Coloured dots indicate the index maps $\rho$ and $\sigma$ from inputs $B_U$ of $\bar{H}$ to inputs of $o$ , respectively from outputs of $o$ to outputs $B_D$ of $\bar{H}$ .

When the index maps $\rho$ and $\sigma$ are not injective (yellow and green dots), values are merged, resulting in multiple uses of the value (i.e. copies). This is why the index maps must be injective on linear values (dots in shades of blue). Value merging also happens when a value is used multiple times in $o$ (yellow and red dots). This will never happen with linear values (as they can never have more than one use in $o$ ), nor with any value definitions (the same value can never be defined more than once). Finally, values not in the image of $\rho$ or $\sigma$ (purple dot) are discarded. This case is also excluded for linear values by requiring surjectivity.

Proof

We start this proof with the explicit construction of $G_R$ and $\mu_o'$ . Define $\sim_R \subseteq (V_H)^2$ as the smallest equivalence relation such that

\textit{src}_{\rho(i)}(o) = \textit{src}_{\rho(j)}(o) \Rightarrow \textit{tgt}_i\,(o_{in}) \sim_R \textit{tgt}_j\,(o_{in}).

Then we define $\bar{G}_R = \bar{H} / \sim_R$ , the graph obtained by glueing together values within the same equivalence class of $\sim_R$ .

Claim 1: $\bar{G}_R$ is a valid minIR graph.

Claim 1 follows from the observation that only values of non-linear types are glued together. If $v \sim_R v'$ , then either $v = v'$ or there exist $i \neq j$ such that

\textit{tgt}_i\,(o_{in}) \sim_R \textit{tgt}_j\,(o_{in}).

\rho(i) = \rho(j)

, then

\rho

is not injective on

i

and

j

, and by the definition of

\rho

\tau(v)\notin T_L

and

\tau(v') \notin T_L

. Otherwise, there are

i' = \rho(i) \neq \textit{src}_{\rho(j)}(o) = j'

such that

\textit{src}_{i'}(o) = \textit{src}_{j'}(o)

. The same value is used twice, which is only a valid minIR graph if

v

and

v'

are not linear, thus proving Claim 1.

Define $G_R$ as the subgraph obtained from $\bar{G}_R$ by removing the operations $\{o_{in}, o_{out}\}$ . Let $V_R = V_H / \sim_R$ be the set of values of $G_R$ (and of $\bar{G}_R$ ). Writing $\alpha_R(v) \in V_R$ for the equivalence class of $\sim_R$ that $v \in V_H$ belongs to, we can define $\mu_o' \in V \times V_R$ as:

(v, w) \in \mu_o \Leftrightarrow (v, \alpha_R(w)) \in \mu_o'.

Claim 2: $\mu_o'$ is a partial function $V \rightharpoonup V_R$ .

In other words, for all $(v, \alpha_1), (v, \alpha_2) \in \mu_o'$ , then $\alpha_1 = \alpha_2$ . Let $w_1 \in \alpha_1$ and $w_2 \in \alpha_2$ be values in $V_H$ . First of all, $\textit{src}_i(o) \neq \textit{tgt}_j\,(o)$ for all $i, j$ , otherwise $G$ is not acyclic. So either $use(v) = o$ , or $\textit{def}(v) = o$ , but not both.

The simpler case: if $\textit{def}\,(v) = o$ , then there exists $i$ such that $\textit{tgt}\,(o)_i = v$ . Furthermore $i$ is unique because by minIR definition, $v$ has a unique definition in $G$ . It follows from (4) that $w_1 = \textit{src}_{\rho(i)}(o_{out}) = w_2$ and hence $\alpha_1 = \alpha_2$ .

Otherwise, there exists $i$ and $j$ such that $v = \textit{src}_{\rho(i)}(o) = \textit{src}_{\rho(j)}(o)$ and $\textit{tgt}_i\,(o_{in}) = w_1$ as well as $\textit{tgt}_j\,(o_{in}) = w_2$ . By definition of $\sim_R$ , we have $w \sim_R w'$ , and thus

\alpha_1 = \alpha_R(w_1) = \alpha_R(w_2) = \alpha_2,

proving Claim 2.

Claim 3: $r_o(G)$ is given by $((G \sqcup H) / \sim_{\mu_o}) \smallsetminus \{o\}$ .

It follows directly from our construction of $\sim_R$ and $\mu_o'$ that the equivalence classes of (the smallest equivalence relation closure of) $\mu_o' \circ \alpha_R$ is equal to the equivalence classes of (the smallest equivalence relation closure of) $\mu_o$ . The claim follows by Definition 3.8 and the definition of $r_o$ .

And finally, Claim 4: $r_o(G)$ is a valid minIR graph.

Per Definition 3.4, We must check four properties: (i) every value is defined exactly once, (ii) every linear value is used exactly once, (iii) the graph is acyclic, and (iv) every region has (at most) one parent.

(iii) follows from the fact that $G$ and $H$ are acyclic and a single operation $o$ in $G$ is replaced: any cycle across $G$ and $H$ would also be a cycle in $G$ by replacing the subpath in $H$ with $o$ . (iv) follows from the fact that $o_{in}$ and $o_{out}$ are in the root region of $\bar{H}$ , by definition of interface implementation. (i): removing $o$ from $G$ removes the unique definitions of all values that are targets of $o$ . Each such value $v$ is glued to a unique value $\mathit{src}_i\,(o_{out})$ in $H$ – the new and unique definition of $v$ in $r_o(G).$ (ii) follows from the same argument as in (i), but relying on injectivity of $\rho$ on linear values to establish uniqueness.

Arbitrary minIR rewrites #

We have so far defined rewrites of single operations into graphs $H$ . We can generalise these rewrites to rewrite subgraphs $P \subseteq G$ , provided the minIR subgraphs satisfy some constraints. We require for this a notion of convexity, as discussed in Bonchi, 2022Filippo Bonchi, Fabio Gadducci, Aleks Kissinger, Pawel Sobocinski and Fabio Zanasi. 2022. String diagram rewrite theory II: Rewriting with symmetric monoidal structure. Mathematical Structures in Computer Science 32, 4 (April 2022, 511--541). doi: 10.1017/s0960129522000317.

As usual, let us consider a minIR graph $G$ with values $V$ , linear values $V_L \subseteq V$ , edges $E$ , the incidence maps $\textit{src}_i$ and $\textit{tgt}_j$ as well as their inverses $\textit{use}$ and $\textit{def}$ . Consider further a subgraph of $G$ that we will now call $P = (V_P, E_P) \subseteq G$ , to distinguish from $H$ .

Let us further define the partial $parent$ morphism that maps a value $v \in V$ to the parent of the region of $v$ .

Definition 3.12Convex minIR subgraph

A minIR subgraph $P \subseteq G$ is convex if the following conditions hold:

for all $v_1, v_2 \in V_P$ , any path along $\leadsto$ from $v_1$ to $v_2$ contains only vertices in $V_P$ ,
parent-child relations are contained within the subgraph, i.e.
$v \in V_P \cap dom(\mathit{parent}) \Leftrightarrow \mathit{parent}(v) \in V_P.$

Define the sets of boundary values $B_U, B_D$ and $B = B_U \cup B_D$ , as in (2); then fix the boundary orderings $\textit{src}(P)$ and $\textit{tgt}\,(P)$ as in (3). The subgraph $P$ implements the interface

Consider an interface graph $\bar{H}$ that implements $I_H$ such that $I_P \triangleleft I_H$ . Instead of defining a gluing relation from values of an operation $o$ to values of $H$ , we replace the interface $I(o)$ with $I_P$ . This generalises the definition of $\mu_o$ from (4) to a glueing $\mu\subseteq B \times V_H$ defined as

\begin{aligned}\mu =\ & \{ \left((\textit{src}_{\rho(i)}(P)), \textit{src}_{i}\,(H)\right) \mid i \in \mathrm{Idx}(\textit{src}(H)) \}\ \cup \\& \{ \left((\textit{tgt}_{i}\,(P)), \textit{tgt}_{\sigma(i)}\,(H)\right) \mid i \in \mathrm{Idx}(\textit{tgt}\,(P)) \},\end{aligned}

With the set of boundary operations defined as²

E_B = \left\{o \in E_P \mid \left(\mathit{tgt}\,(o) \cup \textit{src}(o)\right) \subseteq B\right\},

we are able to define minIR rewrites in their most general form.

Proposition 3.6MinIR subgraph rewrite

Let $P \subseteq G$ and $H$ such that $I_P \triangleleft I_H$ and $P$ is convex, as defined above. Then,

\big((G \sqcup H) / \sim_{\mu}\!\big) \smallsetminus (V_P \smallsetminus B, E_B),

i.e. the graph obtained by removing the values $V_P \smallsetminus B$ and operations $E_B$ from the glueing of $G$ and $H$ along $\mu$ , is a valid minIR graph.

There is a graph $G_R$ with values $V_R$ and a partial function $\mu': V_P \rightharpoonup V_R$ such that the graph (8) is the graph $r_P(G)$ , obtained from the rewrite

r_P = (G_R, V_P, E_B, \mu').

We call $r_P$ the rewrite of $P$ into $H$ .

Proof

Consider an operation $o$ that implements $I_P = (U_P, D_P)$ . We can define the interface graph $\bar{H}_o$ given by three operations $o_{in}$ , $o_{out}$ and $o$ . Its associated subgraph $H_o \subseteq \bar{H}_o$ only includes $o$ . Let $\tilde \mu$ be the glueing relation

\begin{aligned}\tilde \mu =\ &\{ (\textit{src}_i(P), \textit{src}_i(o)) \mid i \in \mathrm{Idx}(U_P) \}\ \cup \\& \{ (\textit{tgt}_i\,(P), \textit{tgt}_i\,(o)) \mid i \in \mathrm{Idx}(D_P) \}.\end{aligned}

Consider the rewrite $r = (H_o, V_P \smallsetminus B, E_B, \tilde\mu)$ . If we write $G' = (V', E')$ for the subgraph of $G$ given by

\begin{aligned}V' &= (V \smallsetminus (V_P \smallsetminus B)) \\E' &= (E \smallsetminus E_B) \cap (V')^\ast,\end{aligned}

then according to (1), the graph resulting from applying

r

G

can be expressed as the glueing

G_o = r(G) = (G' \sqcup H_o) / \sim_{\tilde \mu}.

Our claim is that $G_o$ is a valid minIR graph.

The graph (8) is then obtained by applying the rewrite $r_o$ as given by (6) to $G_o$ . Defining the rewrite $r_P$ as the composition of $r$ followed by $r_o$ , the result follows from our claim and Proposition 3.5.

We now prove the claim, by showing the four properties of minIR graphs as per Definition 3.4. Property i) requires showing that every value is defined exactly once. As $G'$ is obtained by removing values and operations from a valid minIR graph $G$ , no value in $V'$ can be defined more than once. A value $v \in V'$ that is not defined in $G'$ must be in the boundary $v \in B$ of $P$ . By the boundary definitions of (2), $v$ cannot be in $B_U$ and thus must be in $B_D$ . It follows by the definition of the glueing $\tilde \mu$ that in $G_o$ , $v$ will be in the definitions of $o$ : $\textit{def}(v) = o$ . The glueing $\tilde \mu$ is bijective between the values of $P$ and $o$ and thus we can conclude that $v$ has a unique definition in $G_o$ .

The same argument applies to property ii). Property iii) follows from the convexity requirement of $P$ . Finally, property iv) (every region has at most one parent) follows from two observations. First, by convexity of $P$ , no deleted value or operation could be the parent of any value not in $P$ , and thus the $parent$ relation is well-defined on $G'$ : $im(parent) \subseteq E'$ . Secondly, all new values and operations added to the boundary region of $G'$ are from the root region of $H$ , and thus do not have a parent, ensuring that parent uniqueness is preserved.

This simple and limited graph transformation framework captures a remarkably large set of minIR program transformations. It may seem at first that the restriction to boundary values within a single region of Definition 3.11, as well as the convexity requirements of Definition 3.12 represent significant limitations on the expressivity of the rewrites. In practice, however, the semantics of minIR operations can be used to decompose more complex rewrites into a sequence of simple rewrites to which Proposition 3.6 applies.

Consider minIR graphs with a type system that includes regiondef and call operations as discussed in examples of the previous section – respectively defining a code block by a nested region and redirecting control flow to a code block defined using a regiondef. Then all constraints that we impose on rewriting can be effectively side-stepped using the region outlining and value hoisting transformations.

Region outlining moves a valid minIR subgraph into its own separate region, and replaces the hole left by the subgraph in the computation by a call operation to the newly outlined region.

Value hoisting moves a value definition within a region to its parent region and passes the value down to the nested region through an additional input. In case of linear values, we can similarly hoist the unique use of the value to the parent region.

Using these transformations, non-convex subgraphs can always be made convex by taking the convex hull and outlining any parts within it that are not part of the subgraph. Outlined regions can then be passed as additional inputs to the subgraph. Step 1 of the figure below illustrates this transformation. Similarly, a subgraph that includes operations without their parent can be extended to cover the entire region and its parent, outlining any parts of the region that are not part of the subgraph.

Finally, whenever a boundary value $v$ belongs to a region that is not the top level region of the subgraph³, we can repeatedley hoist $v$ to its parent region until it is in the top level region. The value is then recursively passed as argument to descendant regions until the region that it is required in. Subgraphs can thus always be transformed to only have input and output boundary values at the top level region. Step 2 of the figure below illustrates this transformation.

A non-convex minIR graph rewrite, obtained by decomposition into valid convex rewrites, using outlining and hoisting. For simplicity, regiondef operations were made implicit and represented by nested boxes: a region within an operation corresponds to a region definition that is passed as an argument to the operation. Edge colours correspond to value types. Step 1 outlines the ... operations into a dedicated region, which step 2 hoists outside of the region being rewritten. Step 3 and 4 together correspond to a minIR sugraph rewrite. They have been split into two steps following the proof strategy. Step 4 is an instance of a minIR operation rewrite. — A non-convex minIR graph rewrite, obtained by decomposition into valid convex rewrites, using outlining and hoisting. For simplicity, `regiondef` operations were made implicit and represented by nested boxes: a region within an operation corresponds to a region definition that is passed as an argument to the operation. Edge colours correspond to value types. Step 1 *outlines* the `...` operations into a dedicated region, which step 2 *hoists* outside of the region being rewritten. Step 3 and 4 together correspond to a minIR sugraph rewrite. They have been split into two steps following the proof strategy. Step 4 is an instance of a minIR operation rewrite.

To be precise, $\preccurlyeq$ is a partial order on the type strings up to isomorphism. ↩︎
The set operations $\subseteq$ and $\cup$ are again understood to apply to the unordered set of elements contained in the lists $\mathit\,{tgt}(o)$ and $\mathit{src}\,(o)$ . ↩︎
We can always extend a subgraph to contain more ancestor regions, until there is indeed a unique top-level region in the subgraph. ↩︎