Quantum Compilation as a Graph Transformation Problem

Chapter 3

Quantum Compilation as a Graph Transformation Problem

The specialised optimisation techniques that we reviewed in section 2.2 are effective for the scenarios they were designed for, but they are challenging to adapt to new hardware primitives, constraints, or cost functions.

This thesis proposes interpreting quantum compilation as a graph transformation system (GTS). GTSs endow quantum compilation with well-defined semantics and strong theoretical foundations Lack, 2005Stephen Lack and Pawel Sobocinski. 2005. Adhesive and quasiadhesive categories. RAIRO - Theoretical Informatics and Applications 39, 3 (July 2005, 511--545). doi: 10.1051/ITA:2005028. They establish a practical, purely declarative framework in which compiler transformations can be defined and studied.

This allows us to decouple the semantics of quantum programs and the architecture specifics from the compiler infrastructure itself. We can thus focus on building and designing scalable and efficient graph transformation algorithms that can then be applied on a wide range of compilation problems and hardware targets.

In this chapter, we formalise quantum computation and optimisation based on graphs and graph transformations, providing the foundation for all considerations in later chapters. Albeit slightly simplified, the intermediate representation IR we propose here is based on joint work Mark K., 2025Seyon Sivarajah, Alan Lawrence, Alec Edgington, Douglas Wilson, Craig Roy, Luca Mondada, Lukas Heidemann, Ross Duncan Mark Koch. 2025. HUGR: A Quantum-Classical Intermediate Representation. Retrieved (talk recording) from https://www.youtube.com/live/D8esZrt7ogk?feature=shared&t=5217, as well as ongoing development.

The words graph rewrite and graph transformation are often used interchangeably in the literature. In the context of this thesis, we will take these words to distinguish two slightly different problems:

The study of equivalences and other relations between graphs under well-defined semantics is the subject of graph transformations. For instance:

  • a graph transformation rule LRL \to R (Definition .) expresses that an instance of LL can always be transformed into an instance of RR, reflecting the semantics of the system that the graph is modelling.
  • a minIR equivalence class (Definition .) is an instance of a graph transformation system (GTS), which uses known semantic relations, expressed, for example, as graph transformation rules, to define how graphs can be transformed.

Graph rewriting, on the other hand, encapsulates the algorithmic procedures and data structures that mutate graphs. A rewrite (Definition .) is the tuple of data required to turn a graph GG into a new graph GG'.

Given matches of patterns LL on a graph GG, a graph transformation system can consider the set of graph transformation rules that define the semantics of GG to produce a set of rewrites that can be applied to GG and mutate GG.

Our contributions in the subsequent chapters are mostly preoccupied with problems of graph rewriting, i.e. the definition and application of the data required to mutate graphs, as opposed to graph transformations. This chapter nonetheless considers both, using the mature graph transformation framework as a foundation to define IR rewriting semantics.

Section 3.1 starts with a review of previous related work at the intersection of graph transformation software and quantum program optimisation. We then discuss in section 3.2 a fundamental difference between classical computation graphs and the requirements of quantum computation. This motivates a new graph-based IR tailored to quantum computation that we present in section 3.3, along with formal graph rewriting semantics based on sesqui-pushout (SqPO) transformations (section 3.4). Whilst the SqPO transformation definition is constructive, its existence is not guaranteed. We conclude the chapter in section 3.5 by discussing a more restricted “operational” notion of graph rewriting that will be useful for the rest of the thesis.

3.1. Related work

Graph rewriting on computation graphs.  Optimisation of computation graphs is a long-standing problem in computer science that is seeing renewed interest in the compiler Lattner, 2021Chris Lattner, Mehdi Amini, Uday Bondhugula, Albert Cohen, Andy Davis, Jacques Pienaar, River Riddle, Tatiana Shpeisman, Nicolas Vasilache and Oleksandr Zinenko. 2021. MLIR: Scaling Compiler Infrastructure for Domain Specific Computation. In 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), February 2021. IEEE, 2--14. doi: 10.1109/CGO51591.2021.9370308, machine learning (ML) Jia, 2019Zhihao Jia, Oded Padon, James Thomas, Todd Warszawski, Matei Zaharia and Alex Aiken. 2019. TASO: optimizing deep learning computation with automatic generation of graph substitutions. In Proceedings of the 27th ACM Symposium on Operating Systems Principles, October 2019. ACM, 47--62. doi: 10.1145/3341301.3359630 Fang, 2020Jingzhi Fang, Yanyan Shen, Yue Wang and Lei Chen. 2020. Optimizing DNN computation graph using graph substitutions. Proceedings of the VLDB Endowment 13, 12 (August 2020, 2734--2746). doi: 10.14778/3407790.3407857 and quantum computing communities Xu, 2022Mingkuan Xu, Zikun Li, Oded Padon, Sina Lin, Jessica Pointing, Auguste Hirth, Henry Ma, Jens Palsberg, Alex Aiken, Umut A. Acar and Zhihao Jia. 2022. Quartz: Superoptimization of Quantum Circuits. In Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, June 2022. Association for Computing Machinery, 625--640. doi: 10.1145/3519939.3523433 Xu, 2023Amanda Xu, Abtin Molavi, Lauren Pick, Swamit Tannu and Aws Albarghouthi. 2023. Synthesizing Quantum-Circuit Optimizers. Proceedings of the ACM on Programming Languages 7, PLDI (June 2023, 835--859). doi: 10.1145/3591254. In all these domains, graphs encode computations that are either expensive to execute or evaluated repeatedly over many iterations, making the optimisation of the execution cost of the computation a primary concern.

Domain-specific heuristics are the most common approach in compiler optimisations Paszke, 2019Adam Paszke, Sam Gross, Francisco Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kö pf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai and S. Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Neural Information Processing Systems. doi: 10.5555/3454287.3455008 Sivara., 2020Seyon Sivarajah, Silas Dilkes, Alexander Cowtan, Will Simmons, Alec Edgington and Ross Duncan. 2020. t|ket⟩: a retargetable compiler for NISQ devices. Quantum Science and Technology 6, 1 (November 2020, 014003). doi: 10.1088/2058-9565/ab8e92​ – a more flexible alternative are optimisation engines based on declarative sets of graph transformations Bonchi, 2022Filippo Bonchi, Fabio Gadducci, Aleks Kissinger, Pawel Sobocinski and Fabio Zanasi. 2022. String Diagram Rewrite Theory I: Rewriting with Frobenius Structure. Journal of the ACM 69, 2 (March 2022, 1 - 58). doi: 10.1145/3502719 Bonchi, 2022Filippo Bonchi, Fabio Gadducci, Aleks Kissinger, Pawel Sobocinski and Fabio Zanasi. 2022. String diagram rewrite theory II: Rewriting with symmetric monoidal structure. Mathematical Structures in Computer Science 32, 4 (April 2022, 511--541). doi: 10.1017/s0960129522000317. In such systems, a graph transformation system (GTS) is used to find a sequence of allowed transformations that rewrite a computation graph given as input into a computation graph with minimal cost.

Transformation systems were first studied on strings Dersho., 1990Nachum Dershowitz and Jean-Pierre Jouannaud. 1990. Rewrite Systems, then generalised to trees and terms Bezem, 2003Marc Bezem, Jan Willem Klop and Roel Vrijer. 2003. Term Rewriting Systems (1. publ. ed.). Cambridge University Press, Cambridge, before being applied to graph domains Ehrig, 1973Hartmut Ehrig, Michael Pfender and Hans Jürgen Schneider. 1973. Graph-Grammars: An Algebraic Approach. In 14th Annual Symposium on Switching and Automata Theory, Iowa City, Iowa, USA, October 15-17, 1973. IEEE Computer Society, 167--180. doi: 10.1109/SWAT.1973.11 Rozenb., 1997Grzegorz Rozenberg. 1997. Handbook of Graph Grammars and Computing by Graph Transformations, Volume 1: Foundations. World Scientific König, 2018Barbara König, Dennis Nolte, Julia Padberg and Arend Rensink. 2018. A Tutorial on Graph Transformation. In Graph Transformation, Specifications, and Nets - In Memory of Hartmut Ehrig. Springer, 83--104. doi: 10.1007/978-3-319-75396-6_5. Their use in quantum computing is part of a long tradition of diagrammatic reasoning in physics Penrose, 1964Roger Penrose. 1964. Conformal treatment of infinity. General Relativity and Gravitation 43, 3 (901--922 (reprint)). doi: 10.1007/s10714-010-1110-5 Feynman, 1949R. P. Feynman. 1949. Space-Time Approach to Quantum Electrodynamics. Physical Review 76, 6 (Septempter 1949, 769--789). doi: 10.1103/physrev.76.769, and particularly in quantum mechanics with the advent of categorical quantum mechanics Abrams., 2008Samson Abramsky and Bob Coecke. 2008. Categorical quantum mechanics. arXiv: 0808.1023 [quant-ph] Coecke, 2012Bob Coecke, Ross Duncan, Aleks Kissinger and Quanlong Wang. 2012. Strong Complementarity and Non-locality in Categorical Quantum Mechanics. In 2012 27th Annual IEEE Symposium on Logic in Computer Science, June 2012. IEEE, 245--254. doi: 10.1109/lics.2012.35 Coecke, 2017Bob Coecke and Aleks Kissinger. 2017. Picturing Quantum Processes: A First Course in Quantum Theory and Diagrammatic Reasoning. Cambridge University Press. doi: 10.1017/9781316219317.

GTS in quantum computing.  In quantum computing, the ZX calculus Coecke, 2008Bob Coecke and Ross Duncan. 2008. Interacting Quantum Observables and other diagrammatic theories that derive from it are particularly important. Properties of GTSs such as completeness, confluence and termination Verma, 1995Rakesh M. Verma. 1995. Transformations and confluence for rewrite systems. Theoretical Computer Science 152, 2 (December 1995, 269--283). doi: 10.1016/0304-3975(94)00255-0 are well-studied within this field Backens, 2014Miriam Backens. 2014. The ZX-calculus is complete for stabilizer quantum mechanics. New Journal of Physics 16, 9 (Septempter 2014, 093021). doi: 10.1088/1367-2630/16/9/093021 Backens, 2019Miriam Backens and Aleks Kissinger. 2019. ZH: A Complete Graphical Calculus for Quantum Computations Involving Classical Non-linearity. Electronic Proceedings in Theoretical Computer Science 287 (January 2019, 23--42). doi: 10.4204/EPTCS.287.2 Biamon., 2023J Biamonte and A Nasrallah. 2023. The ZX-Calculus is Canonical in the Heisenberg Picture for Stabilizer Quantum Mechanics. arXiv: 2301.05717 [quant-ph]. These results have formed the basis for software implementations of circuit optimisations with soundness and performance guarantees Duncan, 2020Ross Duncan, Aleks Kissinger, Simon Perdrix and John van de Wetering. 2020. Graph-theoretic Simplification of Quantum Circuits with the ZX-calculus. Quantum 4 (June 2020, 279). doi: 10.22331/q-2020-06-04-279 Kissin., 2020Aleks Kissinger and John van de Wetering. 2020. PyZX: Large Scale Automated Diagrammatic Reasoning. In Proceedings 16th International Conference on Quantum Physics and Logic, Chapman University, Orange, CA, USA., 10-14 June 2019. Open Publishing Association, 229-241. doi: 10.4204/EPTCS.318.14 Sivara., 2020Seyon Sivarajah, Silas Dilkes, Alexander Cowtan, Will Simmons, Alec Edgington and Ross Duncan. 2020. t|ket⟩: a retargetable compiler for NISQ devices. Quantum Science and Technology 6, 1 (November 2020, 014003). doi: 10.1088/2058-9565/ab8e92 Borgna, 2023Agustín Borgna. 2023. Towards a compiler toolchain for quantum programs.

Great strides are also being made in our theoretical understanding of transformation systems for quantum circuits. Recently, Clément et al. established completeness for the first time Cléme., 2023Alexandre Clément, Nicolas Heurtel, Shane Mansfield, Simon Perdrix and Benoît Valiron. 2023. A Complete Equational Theory for Quantum Circuits. In 2023 38th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), June 2023. IEEE, 1--13. doi: 10.1109/lics56636.2023.10175801 as well as minimality Cléme., 2024Alexandre Clément, Noé Delorme and Simon Perdrix. 2024. Minimal Equational Theories for Quantum Circuits. In Proceedings of the 39th Annual ACM/IEEE Symposium on Logic in Computer Science, July 2024. ACM, 1--14. doi: 10.1145/3661814.3662088 of a GTS for quantum circuits. A set of circuit transformation rules were presented such that no rule is redundant, and for any two equivalent quantum circuits, there exists a sequence of local transformations rewriting one into the other. Such systems are however not confluent, and this is unlikely to change: most circuit optimisation problems are known to be computationally hard Weteri., 2024John van de Wetering and Matt Amy. 2024. Optimising quantum circuits is generally hard. arXiv: 2310.05958 [quant-ph].

There is also another inherent tension in integrating diagrammatic calculi into compilers. Diagrammatic theories arise from abstract primitives that admit a simple rewriting logic Heurtel, 2024Nicolas Heurtel. 2024. A Complete Graphical Language for Linear Optical Circuits with Finite-Photon-Number Sources and Detectors. arXiv: 2402.17693 [quant-ph] Booth, 2024Robert I. Booth, Titouan Carette and Cole Comfort. 2024. Graphical Symplectic Algebra. arXiv: 2401.07914 [cs.LO] Felice, 2023Giovanni de Felice, Razin A. Shaikh, Boldizsár Poór, Lia Yeh, Quanlong Wang and Bob Coecke. 2023. Light-Matter Interaction in the ZXW Calculus. Electronic Proceedings in Theoretical Computer Science 384 (August 2023, 20--46). doi: 10.4204/EPTCS.384.2 Carette, 2023Titouan Carette, Timothée Hoffreumon, Émile Larroque and Renaud Vilmart. 2023. Complete Graphical Language for Hermiticity-Preserving Superoperators. In 2023 38th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), June 2023. IEEE, 1--22. doi: 10.1109/LICS56636.2023.10175712; compilers meanwhile must capture all the expressivity, constraints and messiness of real-world hardware targets, with all the edge cases and exceptions that this entails.

An example of this is the ZX circuit extraction problem Quanz, 2024Marcel Quanz, Korbinian Staudacher and Karl Fürlinger. 2024. Parallel Quantum Circuit Extraction from MBQC-Patterns. In 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 1078-1087. doi: 10.1109/IPDPSW63119.2024.00179 Backens, 2021Miriam Backens, Hector Miller-Bakewell, Giovanni de Felice, Leo Lobski and John van de Wetering. 2021. There and back again: A circuit extraction tale. Quantum 5 (March 2021, 421). doi: 10.22331/q-2021-03-25-421​: it is in general hard to recover an executable quantum circuit from a ZX diagram as the latter is strictly more general and primitives cannot be mapped one-to-one. Similarly, while simple quantum-classical hybrid computations can be expressed using extensions of ZX Borgna, 2021Agustín Borgna, Simon Perdrix and Benoît Valiron. 2021. Hybrid Quantum-Classical Circuit Simplification with the ZX-Calculus. In Programming Languages and Systems, Cham. Springer International Publishing, 121--139. doi: 10.1007/978-3-030-89051-3_8 Carette, 2021Titouan Carette, Emmanuel Jeandel, Simon Perdrix and Renaud Vilmart. 2021. Completeness of Graphical Languages for Mixed State Quantum Mechanics. ACM Transactions on Quantum Computing 2, 4 (December 2021, 1--28). doi: 10.1145/3464693 Koziel., 2024Alexander Koziell-Pipe and Aleks Kissinger. 2024. Hybrid Quantum-Classical Machine Learning with String Diagrams. arXiv: 2407.03673 [quant-ph], it will never be possible to capture the full breadth and generality of classical CPU instruction sets in a practical and extensible (and algebraically satisfying) way.

Peephole optimisations.  As an alternative to the very principled approach of elegant calculi, graph transformations can also be used in the absence of theoretical guarantees in a more ad hoc fashion. Indeed, many existing (classical and quantum) compiler optimisations can already be understood as graph transformations. For as long as compilation has existed, compilers have relied on local transformations of the IR, typically referred to as peephole optimisations McKeem., 1965W. M. McKeeman. 1965. Peephole optimization. Communications of the ACM 8, 7 (July 1965, 443--444). doi: 10.1145/364995.365000 Tanenb., 1982Andrew S. Tanenbaum, Hans van Staveren and Johan W. Stevenson. 1982. Using Peephole Optimization on Intermediate Code. ACM Transactions on Programming Languages and Systems 4, 1 (January 1982, 21--36). doi: 10.1145/357153.357155. Such optimisation strategies are based on the heuristic that local optimisations to the program will produce a well-optimised result overall. Mature compiler ecosystems have developed tools for declarative definitions, as well as automatic generation and correctness proving of peephole optimisations Menend., 2017David Menendez and Santosh Nagarakatte. 2017. Alive-Infer: data-driven precondition inference for peephole optimizations in LLVM. ACM SIGPLAN Notices 52, 6 (June 2017, 49--63). doi: 10.1145/3140587.3062372 Lopes, 2015Nuno P. Lopes, David Menendez, Santosh Nagarakatte and John Regehr. 2015. Provably correct peephole optimizations with alive. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2015. ACM, 22--32. doi: 10.1145/2737924.2737965 Riddle, 2021River Riddle. 2021. PDLL: a new declarative rewrite frontend for MLIR. (November 2021). Retrieved on 13/01/2025 (RFC on Discourse) from https://discourse.llvm.org/t/rfc-pdll-a-new-declarative-rewrite-frontend-for-mlir/4798. We refer to the classical compiler literature, e.g. Muchni., 2007Steven S. Muchnick. 2007. Advanced compiler design and implementation ([Nachdr.] ed.). Morgan Kaufmann, San Francisco, Calif. [u.a.], for more details on the various types of common peephole optimisations.

Quantum compilers adopted peephole-style optimisations from the beginning Cheung, 2007Donny Cheung, Dmitri Maslov and Simone Severini. 2007. Translation techniques between quantum circuit architectures. In Workshop on quantum information processing, 1--3 Steiger, 2018Damian S. Steiger, Thomas Häner and Matthias Troyer. 2018. ProjectQ: an open source software framework for quantum computing. Quantum 2 (January 2018, 49). doi: 10.22331/q-2018-01-31-49 Sivara., 2020Seyon Sivarajah, Silas Dilkes, Alexander Cowtan, Will Simmons, Alec Edgington and Ross Duncan. 2020. t|ket⟩: a retargetable compiler for NISQ devices. Quantum Science and Technology 6, 1 (November 2020, 014003). doi: 10.1088/2058-9565/ab8e92. They encompass some of the most common optimisations in quantum computing, including the Euler Angle reduction Chatzi., 2009K. Ch. Chatzisavvas, G. Chadzitaskos, C. Daskaloyannis and S. G. Schirmer. 2009. Improving quantum gate fidelities using optimized Euler angles. Physical Review A 80, 5 (November 2009, 052329). doi: 10.1103/physreva.80.052329, the two-qubit KAK decomposition Tucci, 2005Robert R. Tucci. 2005. An Introduction to Cartan's KAK Decomposition for QC Programmers. arXiv: quant-ph/0507171 [quant-ph] Cross, 2019Andrew W. Cross, Lev S. Bishop, Sarah Sheldon, Paul D. Nation and Jay M. Gambetta. 2019. Validating quantum computers using randomized model circuits. Physical Review A 100, 3 (Septempter 2019, 032328). doi: 10.1103/physreva.100.032328 and all gate set rebases contri., 2025TKET contributors. 2025. Documentation: pytket.passes.AutoRebase. Retrieved on 13/01/2025 (TKET docs) from https://docs.quantinuum.com/tket/api-docs/passes.html#pytket.passes.AutoRebase. A quantum-specific flavour of peephole optimisation with close links to GTSs, template matching Maslov, 2008D. Maslov, G.W. Dueck, D.M. Miller and C. Negrevergne. 2008. Quantum Circuit Simplification and Level Compaction. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 27, 3 (March 2008, 436--444). doi: 10.1109/tcad.2007.911334 Iten, 2022Raban Iten, Romain Moyard, Tony Metger, David Sutter and Stefan Woerner. 2022. Exact and Practical Pattern Matching for Quantum Circuit Optimization. ACM Transactions on Quantum Computing 3, 1 (January 2022, 1--41). doi: 10.1145/3498325, achieved state-of-the-art results for Clifford circuit optimisation Bravyi, 2021Sergey Bravyi, Ruslan Shaydulin, Shaohan Hu and Dmitri Maslov. 2021. Clifford Circuit Optimization with Templates and Symbolic Pauli Gates. Quantum 5 (November 2021, 580). doi: 10.22331/q-2021-11-16-580. Recently, quantum peephole optimisations were also proposed that leverage provable state information to perform contextual optimisations Liu, 2021Ji Liu, Luciano Bello and Huiyang Zhou. 2021. Relaxed Peephole Optimization: A Novel Compiler Optimization for Quantum Circuits. In 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), February 2021. IEEE, 301--314. doi: 10.1109/cgo51591.2021.9370310, similar to strength reduction and optimisation with preconditions in classical compilation Lopes, 2015Nuno P. Lopes, David Menendez, Santosh Nagarakatte and John Regehr. 2015. Provably correct peephole optimizations with alive. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2015. ACM, 22--32. doi: 10.1145/2737924.2737965.

Internal representations.  The graph formalisation of quantum computations we will define in this chapter also draws a lot from the internal representations (IR) of programs in classical compilers. The classical compilation community has found significant advantages in sharing a common standardised IR format. Indeed, while the exact syntax constructs and abstractions vary across programming languages, and, at the other end of the compiler stack, the specific assembly instructions emitted differ between hardware targets, much of the compiler middleware can be broadly shared across use cases. This gave rise to the LLVM Lattner, 2004Chris Lattner and Vikram Adve. 2004. LLVM: A compilation framework for lifelong program analysis & transformation. In International Symposium on Code Generation and Optimization, 2004. CGO 2004.. IEEE, 75--86. doi: 10.1109/CGO.2004.1281665 and, more recently, the MLIR Lattner, 2021Chris Lattner, Mehdi Amini, Uday Bondhugula, Albert Cohen, Andy Davis, Jacques Pienaar, River Riddle, Tatiana Shpeisman, Nicolas Vasilache and Oleksandr Zinenko. 2021. MLIR: Scaling Compiler Infrastructure for Domain Specific Computation. In 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), February 2021. IEEE, 2--14. doi: 10.1109/CGO51591.2021.9370308 projects, which provide common compiler IRs, along with all the infrastructure compilers typically require: IR transformation tooling, translation into hardware-specific assembly, efficient serialisations, in-memory formats etc.

The idea of adopting LLVM for quantum was championed by QIR QIR Al., 2021 QIR Alliance. 2021. QIR Specification v0.1. Retrieved on 31/12/24 from https://www.qir-alliance.org/, a standard introducing quantum primitives into the LLVM IR. This was subsequently adopted by many quantum hardware providers for its superior expressive power compared to circuit-based formats QIR Al., 2023 QIR Alliance. 2023. NVIDIA Joins the QIR Alliance as the Effort Enters Year Two. Retrieved on 02/01/2025 from https://www.qir-alliance.org/posts/year_one_in_review/. Building on top of QIR, an IR specifically for quantum-classical programs was proposed in Mark K., 2025Seyon Sivarajah, Alan Lawrence, Alec Edgington, Douglas Wilson, Craig Roy, Luca Mondada, Lukas Heidemann, Ross Duncan Mark Koch. 2025. HUGR: A Quantum-Classical Intermediate Representation. Retrieved (talk recording) from https://www.youtube.com/live/D8esZrt7ogk?feature=shared&t=5217, with additional soundness guarantees based, among others, on the no-cloning principle of quantum information. In parallel, projects with similar aims have also emerged McCask., 2021Alexander McCaskey and Thien Nguyen. 2021. A MLIR Dialect for Quantum Assembly Languages. In 2021 IEEE International Conference on Quantum Computing and Engineering (QCE), October 2021. IEEE. doi: 10.1109/QCE52317.2021.00043 Ittah, 2022David Ittah, Thomas Häner, Vadym Kliuchnikov and Torsten Hoefler. 2022. QIRO: A Static Single Assignment-based Quantum Program Representation for Optimization. ACM Transactions on Quantum Computing 3, 3 (June 2022, 1--32). doi: 10.1145/3491247 that use the full MLIR and LLVM toolchain.

Challenges of GTS for compilation.  Peephole optimisations of compiler IRs have proven to be a fast, general and scalable approach to compilation and code optimisation in practice. However, the optimisation results depend heavily on well-designed transformation orderings and the performance may vary widely across (equivalent) input programs. This is commonly known in compiler research as the phase ordering problem Click, 1995Cliff Click and Keith D. Cooper. 1995. Combining analyses, combining optimizations. ACM Transactions on Programming Languages and Systems 17, 2 (March 1995, 181--196). doi: 10.1145/201059.201061​. When a compiler can modify code in multiple ways, it must determine which transformations to apply and in what sequence to achieve optimal results Whitfi., 1997Deborah L. Whitfield and Mary Lou Soffa. 1997. An approach for exploring code improving transformations. ACM Transactions on Programming Languages and Systems 19, 6 (November 1997, 1053--1084). doi: 10.1145/267959.267960 Liang, 2023Youwei Liang, Kevin Stone, Ali Shameli, Chris Cummins, Mostafa Elhoushi, Jiadong Guo, Benoit Steiner, Xiaomeng Yang, Pengtao Xie, Hugh Leather and Yuandong Tian. 2023. Learning Compiler Pass Orders using Coreset and Normalized Value Prediction. In Proceedings of the 40th International Conference on Machine Learning. JMLR.org. doi: 10.48550/ARXIV.2301.05104. This is a common design challenge in GTSs, often addressed through mechanisms such as rule controls Heckel, 2020Reiko Heckel and Gabriele Taentzer. 2020. Graph Transformation for Software Engineers: With Applications to Model-Based Development and Domain-Specific Language Engineering. Springer International Publishing. doi: 10.1007/978-3-030-43916-3.

This issue is also a key challenge within quantum compilation, as can be verified by comparing the performance of peephole-based compilers with provably optimal circuit synthesis strategies. On problem sizes where exhaustive search is feasible, unitary synthesis tools can sometimes outperform current, mostly peephole-based compilers Sivara., 2020Seyon Sivarajah, Silas Dilkes, Alexander Cowtan, Will Simmons, Alec Edgington and Ross Duncan. 2020. t|ket⟩: a retargetable compiler for NISQ devices. Quantum Science and Technology 6, 1 (November 2020, 014003). doi: 10.1088/2058-9565/ab8e92 by up to 50%1 Wu, 2020Xin-Chuan Wu, Marc Grau Davis, Frederic T. Chong and Costin Iancu. 2020. QGo: Scalable Quantum Circuit Optimization Using Automated Synthesis. arXiv: 2012.09835 [quant-ph].


  1. at the cost of many hours of compute, of course. ↩︎

3.2. Computation graphs and linearity

Computation graphs represent the flow of data between operations in a program, with nodes as operations and edges as data dependencies. Widely used in machine learning frameworks and GPU optimisations Bergst., 2011James Bergstra, Frédéric Bastien, Olivier Breuleux, Pascal Lamblin, Razvan Pascanu, Olivier Delalleau, Guillaume Desjardins, David Warde-Farley, Ian Goodfellow, Arnaud Bergeron and others. 2011. Theano: Deep learning on gpus with python. In NIPS 2011, BigLearning Workshop, Granada, Spain Zhao, 2023Yuxuan Zhao, Qi Sun, Zhuolun He, Yang Bai and Bei Yu. 2023. AutoGraph: Optimizing DNN Computation Graph for Parallel GPU Kernel Execution. Proceedings of the AAAI Conference on Artificial Intelligence 37, 9 (June 2023, 11354--11362). doi: 10.1609/aaai.v37i9.26343, they are conceptually equivalent to dataflow graphs used in compiler design, which were pioneered by Feo, 1990John T. Feo, David C. Cann and Rodney R. Oldehoeft. 1990. A report on the sisal language project. Journal of Parallel and Distributed Computing 10, 4 (December 1990, 349--366). doi: 10.1016/0743-7315(90)90035-n and Kahn, 1976Gilles Kahn and David MacQueen. 1976. Coroutines and networks of parallel processes and are now central to most compiler IRs.

In classical computations, these graph representations of computations are essentially term graphs Barend., 1987H. P. Barendregt, M. C. J. D. Eekelen, J. R. W. Glauert, J. R. Kennaway, M. J. Plasmeijer and M. R. Sleep. 1987. Term Graph Rewriting​ – sets of algebraic expressions that are stored as trees, combined with an important optimisation known as term sharing. When identical subexpressions appear multiple times, they can be represented as one computation and referenced from multiple locations, creating a directed acyclic graph rather than a term tree Plump, 1999Detlef Plump. 1999. Term Graph Rewriting. (October 1999). This sharing enables a more efficient representation. It can also be used as a compiler optimisation to identify subexpressions that can be cached and shared across expression evaluations for a more efficient execution – a technique known as common subexpression elimination (CSE) Cocke, 1970John Cocke. 1970. Global common subexpression elimination. In Proceedings of a symposium on Compiler optimization -. ACM Press, 20--24. doi: 10.1145/800028.808480.

Each edge of a computation graphs corresponds to a unique value: the output of a previous computation that is being passed on to new operations. These values flow along edges in the graph – hence dataflow graph. Values are immutable: they are defined once and then passed as input to further operations, where they can only be consumed, never modified. In compiler speak, programs expressed using such immutable values are often called single static assignment (SSA) programs Cytron, 1991Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman and F. Kenneth Zadeck. 1991. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems 13, 4 (October 1991, 451--490). doi: 10.1145/115372.115320 Rosen, 1988B. K. Rosen, M. N. Wegman and F. K. Zadeck. 1988. Global Value Numbers and Redundant Computations. In Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages - POPL ’88. ACM Press, 12--27. doi: 10.1145/73560.73562. In SSA:

  1. Every value is defined exactly once,
  2. Every value may be used any number of times (including zero).

Quantum computing throws this second pillar of SSA into the bin. Values in quantum computations are the result of computations on quantum data, and as such must obey the no-cloning and no-deleting theorems (section 2.1). We call values subject to these restrictions linear1. They introduce the following constraint on valid computation graphs:

Every linear value must be used exactly once.

Linear values change fundamentally how transformations of the computation graph must be specified. Where compilers on classical data can:

  • freely share common subexpressions (term sharing),
  • undo term sharing, i.e. duplicate shared terms into independent subterms, and
  • delegate the identification and deletion of obsolete code to specialised passes (e.g., dead code elimination Cytron, 1991Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman and F. Kenneth Zadeck. 1991. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems 13, 4 (October 1991, 451--490). doi: 10.1145/115372.115320 Briggs, 1994Preston Briggs and Keith D. Cooper. 1994. Effective partial redundancy elimination. ACM SIGPLAN Notices 29, 6 (June 1994, 159--170). doi: 10.1145/773473.178257),

quantum compilers must enforce much stricter invariants on IR transformations – or risk producing invalid programs.

In classical compilers, IR modification APIs (such as MLIR’s PatternRewriter) decouple program transformation from code deletion. Program transformations are specified by copying existing values and introducing new values and operations as needed, while the actual deletion of unused code is deferred to specialized dead code elimination passes. This approach is no longer feasible in the presence of linear values. Computation graphs for quantum computations must adopt proper graph rewriting semantics, in which the explicit deletion of obsolete values and operations is just as much a part of the rewriting data as the new code generation.


  1. The terminology comes from “linear” logic Girard, 1987Jean-Yves Girard. 1987. Linear logic. Theoretical Computer Science 50, 1 (1--101). doi: 10.1016/0304-3975(87)90045-4. I apologise for slamming additional semantics on what I recognise is an already very overloaded term. ↩︎

3.3. A graph representation for quantum programs

For the purposes of this thesis, we introduce a simplified graph-based IR for quantum computations that we will call minIR. It captures all the expressiveness that we require for hybrid programs whilst remaining sufficiently abstract to be applicable to a variety of IRs and, by extension, quantum programming languages and compiler frameworks.

MinIR can be thought of as being built from statements of the form

x, y, ... := op(a, b, c, ...)

to be understood as an operation op applied on the SSA values a, b, c, ... and producing the output values x, y, ....

Computation and dataflow graphs are commonly defined with operations as vertices and values as edges. To faithfully capture the function signature of op, this requires storing and preserving an ordering of the incoming and outgoing edges (also known as port graphs Ferná., 2018Maribel Fernández, Hélène Kirchner and Bruno Pinaud. 2018. Labelled Port Graph – A Formal Structure for Models and Computations. Electronic Notes in Theoretical Computer Science 338 (October 2018, 3--21). doi: 10.1016/j.entcs.2018.10.002).

Instead, we adopt the hypergraph formalisation of computation graphs, which is more common within category theory (see string diagrams Seling., 2010P. Selinger. 2010. A Survey of Graphical Languages for Monoidal Categories and their formalisation in the hypergraph category Bonchi, 2022Filippo Bonchi, Fabio Gadducci, Aleks Kissinger, Pawel Sobocinski and Fabio Zanasi. 2022. String Diagram Rewrite Theory I: Rewriting with Frobenius Structure. Journal of the ACM 69, 2 (March 2022, 1 - 58). doi: 10.1145/3502719 Wilson, 2022Paul Wilson and Fabio Zanasi. 2022. The Cost of Compositionality: A High-Performance Implementation of String Diagram Composition. Electronic Proceedings in Theoretical Computer Science 372 (November 2022, 262--275). doi: 10.4204/eptcs.372.19). This definition is particularly well-suited for our purposes because it frames the graph transformations of interest to us in the well-studied language of rewriting within adhesive categories Lack, 2004Stephen Lack and Paweł Sobociński. 2004. Adhesive Categories. In Foundations of Software Science and Computation Structures, Berlin, Heidelberg. Springer Berlin Heidelberg, 273--288. doi: 10.1007/978-3-540-24727-2_20.

Hypergraphs and minIR #

At a minimum, a directed hypergraph – for simplicity sometimes in the following referred to simply as graph – is defined by a set of vertices V\mathbf V and a set of (hyper) edges E\mathbf E. We will always consider hypergraphs where the edges eEe \in \mathbf E are directed and the vertices attached to ee are given by ordered lists. We formalise this incidence relation between vertices in V\mathbf V and edges in E\mathbf E by writing E\mathbf E as the partition over the disjoint sets Est\mathbf E_{st}

E=s,tNEst\mathbf E = \bigsqcup_{s, t \in \mathbb{N}} \mathbf E_{st}

and introducing ss source and tt target maps for each Est\mathbf E_{st}. Why we write sets in boldface will become clear in a moment.

Definition 3.1Directed hypergraph

A directed hypergraph is given by sets V\mathbf V and Est\mathbf E_{st} for s,tNs, t \in\mathbb{N}, along with maps

srcst,i:EstVfor 1istgtst,j:EstVfor 1jt \begin{aligned} \textit{src}_{st,i}&: \mathbf E_{st} \to \mathbf V\quad&\textrm{for } 1 \leqslant i \leqslant s\\ \textit{tgt}_{st,j}&: \mathbf E_{st} \to \mathbf V&\textrm{for } 1 \leqslant j \leqslant t\\ \end{aligned}

for all s,tNs, t \in \mathbb{N}.

Note that in this thesis, as in most common uses of hypergraphs, the sets V\mathbf V and E=Est\mathbf E = \bigsqcup \mathbf E_{st} will always be finite, and thus Est\mathbf E_{st} \neq \varnothing for a finite number of s,tNs, t \in \mathbb{N} only.

For simplicity, we can further omit the stst subscript of the source srcst,i\textit{src}_{st,i} and target tgtst,j\textit{tgt}_{st,j} maps whenever it can be inferred from the domain of definition of the map. For eEste \in \mathbf{E}_{st}, we call src1(e),,srcs(e)V\textit{src}_{1}(e), \dots, \textit{src}_{s}(e) \in \mathbf{V} the ss source vertices of ee and tgt1(e),,tgtt(e)V\textit{tgt}_{1}(e), \dots, \textit{tgt}_{t}(e) \in \mathbf{V} the tt target vertices of ee.

We introduce the notation uvu \leadsto v to signify that there is an edge from uu to vv, i.e. there is eEste \in \mathbf E_{st} for some s,tNs, t \in \mathbb N and 1is,1jt1 \leqslant i \leqslant s, 1 \leqslant j \leqslant t such that u=srci(e)u = src_i(e) and v=tgtj(e)v = tgt_j(e). We define the equivalence relation V2\sim \subseteq \mathbf V^2 of connected vertices, given by the transitive, symmetric and reflexive closure of \leadsto. The equivalence classes of \sim are the connected components of the graph. We will write [v][v], resp. [e][e] for the connected component that contains the vertex vv, resp. the edge ee.

To proceed, it is useful to frame the hypergraph definition in a categorical setting. We write [C,Set][\mathbb{C}, \mathrm{Set}] for the presheaf topos of the category C\mathbb{C}, i.e. the category with functors CSet\mathbb{C} \to \mathrm{Set} as objects and natural transformations as morphisms. Definition 3.1 can be equivalently restated as:

hypergraphs are objects in the presheaf topos H=[C,Set]\mathbb H = [\mathbb{C}, \mathrm{Set}],

where the category C\mathbb{C} has objects VV and EstE_{st} for s,tNs, t \in \mathbb{N} and arrows given by (1), now interpreted as morphisms in C\mathbb{C} rather than as functions in Set\mathrm{Set}. In this framing, a graph is a functor that defines a set for each object of C\mathbb{C} and specifies functions between these sets – one for each arrow in C\mathbb{C}.

This is where the distinction between bold and non-bold typeface comes from: we use bold letters to refer to images in Set\mathrm{Set} of a hypergraph functor, whereas the non-bold typeface is refers to objects in the indexing category C\mathbb C. The distinction between C\mathbb C and Set\mathrm{Set} is less important for morphisms – it will typically be clear from the context. We thus use the same symbols for both.

Linearity constraints #

The introduction of H\mathbb H not only gives us a notion of hypergraph homomorphisms – maps between hypergraphs that preserve the structure of the graph. It also provides us with a way to express the linearity constraints that arise from our discussion in section 3.2, and which we must enforce on our computation graphs.

The definition that follows adds the coproduct EE explicitly as an object of the category (which we did not need to do in Definition 3.1), as we need it as the codomain of the new morphisms use\textit{use} and def\textit{def}. The adhesitivity of the category does no longer comes for free – we will get back to this in section 3.4.

Definition 3.2Hypergraph with linearity constraints

The category lin-C\textrm{lin-}\mathbb{C} is the category given by objects

{V,Vsrc,Vtgt}{E}{Ests,tN}.\{V, V_\textit{src}, V_\textit{tgt}\} \cup \{ E \} \cup \{ E_{st}\, |\, s, t \in \mathbb{N}\}.
Its arrows are the incidence morphisms given in (1), along with

use:VsrcEdef:VtgtEλsrc:VsrcVλtgt:VtgtV \begin{aligned} \mathit{use}&: V_\textit{src} \to E\quad& \mathit{def}&: V_\textit{tgt} \to E\\ \lambda_\mathit{src}&: V_\textit{src} \rightarrowtail V& \lambda_\mathit{tgt}&: V_\textit{tgt} \rightarrowtail V\\ \end{aligned}

and ιst:EstE\iota_{st}: E_{st} \rightarrowtail E for all s,tNs, t \in \mathbb{N}. The morphisms λsrc,λtgt,ιst\lambda_\mathit{src}, \lambda_\mathit{tgt}, \iota_{st} are split monomorphisms and the following diagrams commute for all s,tNs, t \in \mathbb{N} and 1is,1jt1 \leqslant i \leqslant s, 1 \leqslant j \leqslant t:

Directed hypergraphs with linearity conditions are objects in the full subcategory lin-H\textrm{lin-}\mathbb H given by objects Hlin[lin-C,Set]H_\mathrm{lin} \in [\textrm{lin-}\mathbb{C}, \mathrm{Set}] such that

E=Est\mathbf E = \bigsqcup \mathbf E_{st}
is the coproduct in Set\mathrm{Set} and Hlin(ιst):Hlin(Est)Hlin(E)H_\mathrm{lin}(\iota_{st}): H_\mathrm{lin}(E_{st}) \to H_\mathrm{lin}(E) are the injections into Hlin(E)H_\mathrm{lin}(E).

We probably owe an explanation for this definition – at least for the sake of the few computer scientists that are still following us.

First of all, notice that every hypergraph with linearity constraint corresponds to a hypergraph in the sense of Definition 3.1: there is an obvious functor L:Clin-C\mathcal{L}: \mathbb{C} \to \textrm{lin-}\mathbb{C} that maps each object and morphism in C\mathbb{C} to the object or morphism with the same name in lin-C\textrm{lin-}\mathbb{C}. By contravariance, we can thus (functorially) map every hypergraph with linearity constraints Hlinlin-HH_\mathrm{lin} \in \textrm{lin-}\mathbb{H} to the hypergraph H=HlinLHH = H_\mathrm{lin} \circ \mathcal{L} \in \mathbb{H}.

Another way of looking at this is to realise that by requiring that λsrc,λtgt\lambda_\mathit{src}, \lambda_\mathit{tgt} be split monomorphisms, we obtain that the resulting functions in Set\mathrm{Set} are injective. Up to isomorphism, we can consider that Vsrc,VtgtV\mathbf{V}_\textit{src}, \mathbf{V}_\textit{tgt} \subseteq \mathbf{V} are subsets of vertices in HlinH_\mathrm{lin}. A hypergraph with linearity constraints is thus a directed hypergraph with two selected subsets of vertices Vsrc\mathbf{V}_\textit{src} and Vtgt\mathbf{V}_\textit{tgt}.

Vertices within these subsets are special. For every vVsrcv \in \mathbf{V}_\textit{src}, there exist unique indices s,tNs, t \in \mathbb{N}, 1is1 \leq i \leq s and edge use(v)=eEstuse(v) = e \in \mathbf{E}_{st} such that srci(e)=v\textit{src}_i(e) = v. In words, for every vVsrcv\in \mathbf{V}_\textit{src} there is a unique edge in the hypergraph that has vv as one of its sources. We then say that ee is the unique use of vertex vv. Similarly, vertices in Vtgt\mathbf{V}_\textit{tgt} have a unique edge ee in the hypergraph with vv as one of its targets – it is the unique definition of vv.

Typed graphs #

MinIR graphs are strongly typed. We introduce typed graphs for this purpose, a concept for which graph transformation was first formalised in Ehrig, 2004Hartmut Ehrig, Ulrike Prange and Gabriele Taentzer. 2004. Fundamental Theory for Typed Attributed Graph Transformation. In Graph Transformations, Berlin, Heidelberg. Springer Berlin Heidelberg, 161--177. doi: 10.1007/978-3-540-30203-2_13. A type system for directed hypergraphs is just another object ΣH\Sigma \in \mathbb H. A typed graph is then an object of the slice category HΣ\mathbb H \searrow \Sigma, that is to say, typed graphs are morphisms HΣH \to \Sigma of H\mathbb{H} and morphisms between typed graphs are given by the subset of morphisms H1H2H_1 \rightarrow H_2 of H\mathbb H that make the triangle diagram formed by H1H_1, H2H_2 and Σ\Sigma commute.

To type hypergraphs with linearity constraints, we do not pick the type system Σ\Sigma in lin-H\textrm{lin-}\mathbb H, as the existence of useuse and defdef morphisms impose restrictions that are too strict. We consider instead the category lin-Ctype\textrm{lin-}\mathbb C_\textrm{type} given by the same objects as lin-C\textrm{lin-}\mathbb C, as well as the same morphisms, with the omission of useuse and defdef. There is an obvious functor lin-Ctypelin-C\textrm{lin-}\mathbb C_\textrm{type} \to \textrm{lin-}\mathbb C and thus, by contravariance, every hypergraph with linearity constraints Hlinlin-HH_\textrm{lin} \in \textrm{lin-}\mathbb H can be mapped to a hypergraph in lin-Htype\textrm{lin-}\mathbb H_\textrm{type}. We say that a hypergraph with linearity constraints Hlinlin-HH_\textrm{lin} \in \textrm{lin-}\mathbb H is Σ\Sigma-typed for a type system Σlin-Htype\Sigma \in \textrm{lin-}\mathbb H_\textrm{type} if there is a morphism ΣHlin\Sigma \to H_\textrm{lin}, when interpreting HlinH_\textrm{lin} is as an object of lin-Htype\textrm{lin-}\mathbb H_\textrm{type}.

Two example hypergraphs. Vertices (labelled with capital letters) are circles and hyperedges (labelled with small letters) span between them. Vertices that are attached to an edge in the black half of the circle correspond to source vertices of the edge; those in the white half correspond to target vertices. The functions srcisrc_isrci​ and tgtjtgt_jtgtj​ map edges to the incident vertices, defining the directed hypergraph. On the left, there are further functions use\textit{use}use and def\textit{def}def that map vertices to the unique edge that uses or defines them. This defines a hypergraph with linearity constraints, with Vtgt={A1,A2}V_\textit{tgt} = \{A_1, A_2\}Vtgt​={A1​,A2​} and Vsrc={A3}V_\textit{src} = \{A_3\}Vsrc​={A3​}. These cannot be defined on the graph on the right. On the other hand, the edges b and c are in F11F_{11}F11​, and are in the domain of functions in11in_{11}in11​ and out11out_{11}out11​, thus defining a child region of a hierarchical graph. Note that it would be invalid to have any edge connecting i1i_1i1​ or o1o_1o1​ to i2i_2i2​ or o2o_2o2​. The iii and ooo vertices also have incidence morphisms, not displayed here.

Two example hypergraphs. Vertices (labelled with capital letters) are circles and hyperedges (labelled with small letters) span between them. Vertices that are attached to an edge in the black half of the circle correspond to source vertices of the edge; those in the white half correspond to target vertices. The functions srcisrc_i and tgtjtgt_j map edges to the incident vertices, defining the directed hypergraph. On the left, there are further functions use\textit{use} and def\textit{def} that map vertices to the unique edge that uses or defines them. This defines a hypergraph with linearity constraints, with Vtgt={A1,A2}V_\textit{tgt} = \{A_1, A_2\} and Vsrc={A3}V_\textit{src} = \{A_3\}. These cannot be defined on the graph on the right. On the other hand, the edges b and c are in F11F_{11}, and are in the domain of functions in11in_{11} and out11out_{11}, thus defining a child region of a hierarchical graph. Note that it would be invalid to have any edge connecting i1i_1 or o1o_1 to i2i_2 or o2o_2. The ii and oo vertices also have incidence morphisms, not displayed here.

Hierarchical hypergraphs #

A final bit of structure that minIR graphs require is a notion of hierarchy between regions of the graph. This will be useful to define functions, control flow blocks such as if-else, or any subroutine that can itself be viewed as an operation in the computation.

Hierarchical hypergraphs were first proposed in Drewes, 2002Frank Drewes, Berthold Hoffmann and Detlef Plump. 2002. Hierarchical Graph Transformation. Journal of Computer and System Sciences 64, 2 (March 2002, 249--283). doi: 10.1006/jcss.2001.1790 and further generalised in Busatto, 2005Giorgio Busatto, Hans-Jörg Kreowski and Sabine Kuske. 2005. Abstract hierarchical graph transformation. Mathematical Structures in Computer Science 15, 4 (July 2005, 773--819). doi: 10.1017/s0960129505004846 Palacz, 2004Wojciech Palacz. 2004. Algebraic hierarchical graph transformation. Journal of Computer and System Sciences 68, 3 (May 2004, 497--520). doi: 10.1016/s0022-0000(03)00064-3. However, we opt to use a more restrictive definition, closer to the notion of flattened hypergraphs of Drewes, 2002Frank Drewes, Berthold Hoffmann and Detlef Plump. 2002. Hierarchical Graph Transformation. Journal of Computer and System Sciences 64, 2 (March 2002, 249--283). doi: 10.1006/jcss.2001.1790. The reason for this is twofold. Firstly, hierarchical (hyper)graphs are typically defined recursively. It is not obvious under which conditions (and if) such definitions form adhesive categories, although progess in this direction was made in Padberg, 2017Julia Padberg. 2017. Hierarchical Graph Transformation Revisited: Transformations of Coalgebraic Graphs. In Graph Transformation, Cham. Springer International Publishing, 20--35. doi: 10.1007/978-3-319-61470-0_2 with the introduction of coalgebraic graphs. As a result, to the extent that graph transformation results can be applied to such structures, it must be done so carefully.

The second, more practical, reason is that the notion of typed graph introduced above cannot be directly lifted to the hierarchical graph setting: while some subset of the hierarchical relation in minIR should be enforced by the type system, the type graph of a nested graph should be identical to the parent’s (as opposed to being itself nested within the type graph of the parent).

It is therefore more convenient for us to encode hierarchy in directed hypergraphs as follows. Note that it is not clear that our definition is adhesive either1 – but at least it is framed as a subcategory of a base category that is.

Definition 3.3Hierarchical hypergraph

The category hier-C\textrm{hier-}\mathbb C is the category with objects and arrows of C\mathbb C along with additional objects FstF_{st} for s,tNs, t \in \mathbb{N} and arrows:

  • FstEstF_{st} \rightarrowtail E_{st} that are split monomorphisms,
  • input arrows FstinstE0sF_{st} \xrightarrow{in_{st}} E_{0s}, and output arrows FstoutstEt0F_{st} \xrightarrow{out_{st}} E_{t0}.

Hierarchical hypergraph are the objects in the full subcategory hier-H\textrm{hier-}\mathbb H given by objects Hhier[hier-C,Set]H_\textrm{hier} \in [\textrm{hier-}\mathbb{C}, \mathrm{Set}] such that

  • for any edge eEe \in \mathbf{E} of HhierH_\textrm{hier}, the set P([e])=in1([e])out1([e])P([e]) = \overline{in}^{-1}([e]) \cup \overline{out}^{-1}([e]) has at most one element.
  • the transitive and reflexive closure of [e][P([e])][e] \preccurlyeq [P([e])] for eEe \in \mathbf E is a partial order on the connected components of HhierH_\textrm{hier}.

Here in\overline{in} and out\overline{out} are the functions with domain E\mathbf E defined piecewise as instin_{st} and outstout_{st} for all s,ts, t on their respective (disjoint) domains of definition.

The same definition can also be applied to lin-H\textrm{lin-}\mathbb H to obtain the category of hierarchical directed hypergraphs with linearity constraints hier-lin-H\textrm{hier-lin-}\mathbb H. Similarly, we define the associated category for type systems; however, we do not impose any of the two conditions related to P()P(\cdot) for the type category, i.e. hier-lin-Htype=[hier-lin-Ctype,Set]\textrm{hier-lin-}\mathbb H_\textrm{type} = [\textrm{hier-lin-}\mathbb C_\textrm{type}, \mathrm{Set}] is the full presheaf category, rather than a subcategory of it.

As with the incidence morphisms src\textit{src} and tgt\textit{tgt}, we will drop the stst subscript for the IO arrows in\textit{in} and out\textit{out} when it can be inferred from the domain of definition.

Just as in the discussion of Definition 3.2, we interpret the split monomorphisms as equivalent to requiring FstEst\mathbf{F}_{st} \subseteq \mathbf{E}_{st}. Taking over terminology from Drewes, 2002Frank Drewes, Berthold Hoffmann and Detlef Plump. 2002. Hierarchical Graph Transformation. Journal of Computer and System Sciences 64, 2 (March 2002, 249--283). doi: 10.1006/jcss.2001.1790, we call elements fFstf \in \mathbf{F}_{st} the frames of Hhierhier-HH_\textrm{hier} \in \textrm{hier-}\mathbb H. For each frame ff, there is thus a unique input edge in(f)in(f) and a unique output edge out(f)out(f) in HhierH_\textrm{hier} that have respectively ss targets and zero sources, and tt sources and zero targets.

By the first condition we imposed on P()P(\cdot), the partial function parentparent mapping connected components to their parent edge:

parent([e])={p if there exists pP(e) otherwise. parent([e]) = \begin{cases} p\quad&\textrm{ if there exists } p \in P(e)\\ \bot&\textrm{ otherwise.} \end{cases}

is well-defined. We call the subgraphs of HhierH_\textrm{hier} that share a same parent a region of HhierH_\textrm{hier}2. The subgraph of vertices and edges without parent is the root region of HhierH_\textrm{hier}.

The minIR computation graph #

In minIR, the vertices are the values of the computation, while the hyperedges define the operations. This imposes some constraints for a hypergraph to be a valid minIR graph:

  • all values in minIR must have a unique operation that defines them;
  • values that are linear must also have a unique operation that uses them;
  • the graph must be acyclic, meaning that no value can be defined in terms of itself.

This can be expressed as a hypergraph with linearity constraints by choosing Vtgt=V\mathbf V_\textit{tgt} = \mathbf V and Vsrc=VL\mathbf V_\textit{src} = \mathbf V_L, where VLV\mathbf V_L \subseteq \mathbf V is the subset of linear values.

The following definition then comes as no surprise:

Definition 3.4MinIR graph

Let Σhier-lin-Htype\Sigma \in \textrm{hier-lin-}\mathbb H_\textrm{type} be a type system. A minIR graph HH typed in Σ\Sigma is an object of hier-lin-H\textrm{hier-lin-}\mathbb H that is Σ\Sigma-typed and such that the adjacency relation \leadsto is acyclic. We call Vsrc=VL\mathbf V_\textit{src} = \mathbf V_L, the linear values of HH.

In the context of minIR, \leadsto relations encode the data flow of values across the computation. The lack of explicit operation ordering differentiates minIR (and HUGR) from most classical IRs, which, unless specified otherwise, typically assume that instructions may have side effects and thus cannot be reordered. All quantum operations (and the classical operations we are interested in) are side-effect free, which significantly simplifies our IR.

Input and output values #

Notice that in Definition 3.4, it is not enforced that every value has a definition, i.e. there might be vVVtgtv \in \mathbf V \setminus \mathbf V_\textrm{tgt}; nor that every value with a linear type is in VL\mathbf V_L, i.e. if τ:HΣ\tau: H \to \Sigma is the typing morphism and TL\mathbf T_L are the linear types in the type system, there might be vτ1(TL)VLv \in \tau^{-1}(\mathbf T_L) \setminus \mathbf V_L.

This would be easy to fix: we could on the one hand enforce the equality Vtgt=V\mathbf V_\textrm{tgt} = \mathbf V, thus guaranteeing that every value has a unique definition in the graph. On the other hand, we could define V\mathbf V as the coproduct V=VLVNL\mathbf V = \mathbf V_L \sqcup \mathbf V_\textit{NL}, where VNLV_\textit{NL} is a new object introduced to explicitly capture the set of non-linear values. Morphisms in this category would guarantee that the linearity of values is always preserved, and thus in particular the type morphism would map a value to a linear type if and only if it is linear.

Instead, we opt to allow undefined values and unused linear values to be able to express rewrite rules that match sugraphs of minIR graphs within the same category.

Definition 3.5Inputs and outputs of minIR graphs

For a minIR graph HH with typing morphism τ:HΣ\tau: H \to \Sigma, we call the set I=VVtgtI = \mathbf V \setminus \mathbf V_\textit{tgt} the input values of HH and O=τ1(TL)VLO = \tau^{-1}(\mathbf T_L) \setminus \mathbf V_L its output values, where TL\mathbf T_L are the linear values in Σ\Sigma.

If I=O=I = O = \varnothing, we say that HH is IO-free.

Note that by this definition, an output value in OO always has a linear type! This is because non-linear values do not need to be treated specially when they are outputs: unlike linear values that must always be used in a non-output position, non-linear values may have no outgoing edge, in which case they are simply discarded in the computation.

Structured control flow #

The operations and values of a minIR graph define the data flow of a program. However, a program must also be able to control and change the data flow at run time in order to express loops, conditionals, function calls etc. This is the program control flow, which minIR expresses using regions and so-called structured control flow.

Using regions, any non-trivial control flow (function calls, conditionals, loops etc.) is captured by a frame, a “black box” operation within the data flow of the program. Its implementation is then defined in the nested region of the frame. This can be used for function calls, but also for branches of control flow. A simple function call, that unconditionally redirects the control flow to the operations within a nested region, could for example be represented as follows:

In this figure and below, circles are SSA values (the vertices of the hypergraph), while the edges spanning between them are the operations. Edges attached to the white half of circles are value definitions, while hyperedges attached to the black half of circles are value uses. The call and + operation can be read left to right: for instance, the two values x and y on the left of call are inputs (the operation uses those values, and is thus attached to the black half of the circles), whereas the x + y value on the right is the output of the call operation (the operation defines this value, which is thus attached to the white half of the circle).

Dashed arrows indicate hierarchical dependencies that map the frame edge to the two input and output edges in the child region; dashed rectangles mark the non-root regions of the graph.

Importantly, the frame edge representing the call operation must intuitively “forward” all its input values to the in operation of the child region, and similarly passes on the value at the out operation to the output value of the call output in the parent graph. Passing function arguments and retrieving returned values in this fashion will be very familiar to any computer scientist. Unlike most programming languages, this is also how in minIR values are passed to and from any control flow constructs we would wish to model.

In terms of graph structure, this relation between values in parent and child regions means that the arity and types of the inputs and outpus of call fix the signatures of the child in and out operations. Definition 3.3 already ensures that the input and output arities of in and out are correct. The correct typing of these input and output values will be ensured by the type system, which we discuss in a separate section below.

To handle constructs that require more than one child region, such as an if-else statement, we can use frames that have zero input and one output:

The output of the ifblock is intuitively a higher-order type representing an operation that takes two inputs and ouputs the sum.

An if-else statement might then look as follows:

The if and else blocks must expect the same input and output values. This is key to respecting any linearity constraints that values passed to ifelse might be subject to. By definition, all operations that use or define a value vv will be in the same region – in other words, values are only available within their defining region. This in effect implements “variable scoping”. With some imagination, this construction can easily be adapted to model loops, complex control flow graphs, or any other control flow structures.

Why not plain branch statements? #

There is a simpler – and at least as popular – way of expressing control flow in IRs without requiring regions and operation hierarchies, using branch statements3. For instance, LLVM IR provides a conditional branch statement

br i1 %cond, label %iftrue, label %iffalse

that will jump to the operations under the iftrue label if %cond is true and to the iffalse label otherwise.

This is a simple and versatile approach to control flow that can be used to express any higher-level language constructs. Unfortunately, conditional branching does not mix well with linear values.

Linearity, as defined in Definition 3.4, is a simple constraint to impose on minIR graphs. In the presence of conditional branching, however, the constraint would have to be relaxed to allow for single use in each mutually exclusive branch of control flow. For instance, the following two uses of b should be allowed (in pseudo-IR):

b := h(a)
<if cond> c := x(b)
<else>    d := h(b)

This is a much harder invariant to check on the IR: linearity would no longer be enforceable as a syntactic constraint on the minIR graph as in Definition 3.4, but would instead depend on the semantics of the operations to establish mutual exclusivity of control flow4. Forbidding arbitrary branching in minIR and resorting instead to structured control flow as described above to express control flow is just as expressive and gives the linearity constraint a much simpler form.

Type graph #

We have seen how minIR graphs impose some structure on the types of computations that can be expressed: a linear value cannot be used by two (or zero) operations, frames will always have a unique in and out operation in their child region with correct arities, etc.

However, without a “good” type system and associated semantics, it is still possible to express nonsensical programs: we have mentioned for instance earlier that it is up to the type system to enforce that the types of the in and out operations match the types of the frame. Similarly, it is possible to construct programs that break linearity: take the ifelse operation discussed in the previous section, but now replace its semantics to be do-in-parallel, i.e. it will execute both the if-block and else-block in parallel on the inputs that it is given. This would violate the linearity of its inputs, but would nonetheless be a syntactically valid minIR graph!

To resolve this we present here some typed operations, along with their semantics, that can be used to construct well-behaved type systems: programs typed in this system model the kind of quantum programs that we are interested in expressing and are guaranteed to be valid computations. Categorising all valid constructions or an exhaustive enumeration of conditions that type systems must satisfy to guarantee the validity of programs is beyond the scope of this thesis. It is in practice often straightforward to combine and extend the elements presented here to support further custom syntactic constructs and types.

Basic types and operations #

The most elementary types in our computations are Bits and Qubits. The former is typically known as a Boolean and represents the purely classical values 0 and 1. The latter is the canonical quantum example of a linear type. Indeed, just like values in minIR graphs, the type system in hier-lin-Htype\textrm{hier-lin-}\mathbb H_\textrm{type} distinguishes between linear and non-linear types.

Other typical classical types such as integers, floats, strings, custom algebraic data types (ADT) etc could also be introduced as required. In the figure below, we for instance introduce the Angle type to represent rotation angles that parametrise quantum gates. Further examples of linear types, on the other hand, include higher-dimensional qudits, but also any ADT that contains a linear type within it.

As we saw in section 2.1, the number of input qubits in pure quantum operations will match the number of output qubits: the single-qubit h (hadamard gate) and two-qubit cx (controlled NOT) operations thus have one or two Qubits as both inputs and outputs rz (Z rotation) is also a single-qubit operation, but it takes an additional input of type Angle to specify the rotation angle.

On the pure classical side, we are free to add any side-effect free operations on our types; in our example we model addition + on Angles and negation not on Bits. In the type system Σhier-lin-Htype\Sigma \in \textrm{hier-lin-}\mathbb H_\textrm{type}, each type is represented by a single vertex.

In our example, we thus have three vertices:

We introduce a different colour for each type. Operations such as cx are represented by a hyperedge with two sources on Qubit and two targets on Qubit. As in the previous diagrams, we can distinguish operation inputs from outputs by whether they are attached to the dark or light half of the type vertex: the rz operation thus has one Qubit input, one Angle input and one Qubit output.

As you can tell from the diagram, whilst Qubit is a linear type in the type system, it is not a linear value in the sense of a minIR graph: the Qubit type has multiple uses and defines in the cx operation alone. This is the key difference between lin-Htype\textrm{lin-}\mathbb H_\textrm{type} and lin-H\textrm{lin-}\mathbb H.

Qubit allocation and measurement #

We also introduce non-pure quantum operations qalloc and measure which respectively “create” a qubit (so no input, one Qubit output) and “destroy” it (one Qubit input, one Bit output – depending on whether the qubit was projected onto the 0\ket{0} or 1\ket{1} state). Remember that the reason these operations seem to “break” the laws of pure quantum physics is because they result from interactions with the classical environment.

measure is fundamental, as it connects quantum values with classical ones!

Region definition and structured control flow #

Our type system is so far missing a crucial aspect of minIR: the hierarchical structures. For this we need frame types, i.e. frames in the type graph. We must introduce a distinct type for each possible type signature of a frame. To keep this as simple as possible, we will introduce exactly one type for each signature.

If we write TT for the set of types in our type system (i.e. Bit, Qubit and Angle in our example), then a type signature of an edge is given by a pair (I,O)(I, O) of ordered lists of types I,OTI, O \in T^\ast. For each such (I,O)(I, O) pair, we introduce

  • the frame type regiondef<I, O>,
  • the in and out types in<I,O> and out<I,O>,
  • along with a new non-linear type Region<I, O>, the higher-order type representing a region with inputs II and outputs OO.

The regiondef<I, O> op takes zero inputs and returns one output Region<I, O>, whereas in<I,O> takes zero inputs and returns values of type II and out<I,O> takes inputs of type OO and returns nothing. For instance, for I=(I = (Qubit, Qubit)) and O=(O = (Qubit, Bit)), we have the following type graph.

Note that there is an important distinction in hier-lin-Htype\textrm{hier-lin-}\mathbb H_\textrm{type} in comparison to hier-lin-H\textrm{hier-lin-}\mathbb H: there is no notion of regions in the type system: the Qubit and Bit types in the above diagram would be in the child region of regiondef<I, O> if it were a graph in hier-H\textrm{hier-}\mathbb H, but in the type system, they might also be used by other operations in other regions (such as cx, rz, h etc. defined earlier).

Using the Region<I,O> types, it is then easy to define typed operations for any structured control flow of interest, such as the if-else example above. The following figure gives an overview of the entire type system of our example. For display purposes, we have included multiple copies of each type vertex; we remind the reader that in the actual type graph, all circles of the same type (colour) are one and the same.

A complete minIR type graph, following the example in this section. Value vertices with the same label (and same colour) form a single vertex in the type graph. They have been split into multiple vertices in this representation for better readability. The data types and op types with the <I,O> suffix are parametrised on the signature type (I,O)(I,O)(I,O) for I,O∈T∗I,O \in T^\astI,O∈T∗.

A complete minIR type graph, following the example in this section. Value vertices with the same label (and same colour) form a single vertex in the type graph. They have been split into multiple vertices in this representation for better readability. The data types and op types with the <I,O> suffix are parametrised on the signature type (I,O)(I,O) for I,OTI,O \in T^\ast.

An example minIR program #

Taking a step back, let us make the introduced ideas more concrete through an example. We demonstrate how a simple program written in textual form can be translated and expressed as a minIR graph. All statements are of the form

x, y, ... := op(a, b, c, ...)

where a, b, c etc are the SSA values passed to op (or used by op), and x, y etc are the SSA values returned by op (or defined by op). We use curly bracket to define the child region of a regiondef operation. A valid minIR program might then look as follows:

 1main := regiondef<(Qubit, Qubit), (Qubit, Bit)> {
 2    q0, q1 := in()
 3
 4    q0_1 := h(q0)
 5    q0_2, q1_1 := cx(q0_1, q1)
 6
 7    m0 := measure(q0_2)
 8
 9    ifregion := regiondef<(Qubit,), (Qubit,)> {
10        q1 := in()
11        out(q1)
12    }
13    elseregion := regiondef<(Qubit,), (Qubit,)> {
14        q1 = in()
15        q1_1 := h(q1)
16        out(q1_1)
17    }
18    q1_2 := ifelse(m0, q1_1, ifregion, elseregion)
19
20    out(q1_2, m0)
21}

Note that the in() and out(..) operations are only allowed within nested regions (as required by the type system). We have omitted the type parameters on these operations, as it mirrors exactly the paremeter of the regiondef.

It corresponds to the two minIR graphs on the following page. We use “wiggly hyperedges” that stretch between values, as in the first figure. They may look unusual if you are used to computation graphs. One can opt to draw the same graph with boxes for hyperedges and wires for values, yielding the second figure. The two representations are equivalent, but the rewriting semantics are most explicit when viewing values as vertices.

An example of an IO-free minIR graph. The vertex colours indicate their types in the type system presented in the previous figure. The main, ifregion and elseregion ops are all of op type regiondef (with type parameters omitted), labelled here with custom names for clarity. The type parameters of the ifelse, in and out op type have similarly been omitted. All other operation types are given as labels on the edges.

An example of an IO-free minIR graph. The vertex colours indicate their types in the type system presented in the previous figure. The main, ifregion and elseregion ops are all of op type regiondef (with type parameters omitted), labelled here with custom names for clarity. The type parameters of the ifelse, in and out op type have similarly been omitted. All other operation types are given as labels on the edges.

An equivalent representation of the computation above, now representing operations as boxes and values as wires. The arrow direction indicates the flow from value definition to value use(s). Dashed arrows have been changed to point to regions instead of individual operations.

An equivalent representation of the computation above, now representing operations as boxes and values as wires. The arrow direction indicates the flow from value definition to value use(s). Dashed arrows have been changed to point to regions instead of individual operations.

Differences to the quantum circuit model #

We conclude this presentation of minIR by highlighting the differences between this IR-based representation and the quantum circuit model that most quantum computing and quantum information scientists are familiar with5.

When restricted to purely quantum operations and no nested regions, the string diagram representation of a minIR graph (i.e. operations as boxes and values as wires) looks very similar to a quantum circuit. There is, however, a fundamental shift under the hood from reference to value semantics – to borrow terminology from C++.

In the reference semantics of quantum circuits, operations are typically thought of as “placed” on a qubit (the “lines” in the circuit representation), for instance, by referring to a qubit index. This qubit reference exists for the entire computation duration, and the quantum data it refers to will change over time as operations are applied to that qubit.

In the value semantics of computation graphs and SSA, on the other hand, qubits only exist in the form of the data they encode. When applying an operation, the (quantum) data is consumed by the operation and new data is returned. Given that the input data no longer exists, linearity conditions are required to ensure that no other operation can be fed the same value.

To make the difference clear, compare the program representations of the following computation:

Quantum circuit (pytket)6

import pytket as tk
circ = tk.Circuit(2)
circ.H(0)
circ.CX(0, 1)
circ.X(1)

SSA (minIR)

q0_0, q1_0 := in()
q0_1       := h(q0_0)
q0_2, q1_1 := cx(q0_1, q1_0)
q1_2       := x(q1_1)
out(q0_2, q1_2)

In value semantics, it becomes much harder to track physical qubits across their lifespan. This has very practical implications: without the convenient naming scheme, it would, for example, be non-trivial to count how many qubits are required in the SSA representation of the computation above. However, it is a drastically simpler picture from the point of view of the compiler and the optimiser – hence its popularity in classical compilers. When operations are defined based on qubit references, the compiler must carefully track the ordering of these operations: operations on the same qubit must always be applied in order. Through multi-qubit gates, this also imposes a partial ordering on operations across different qubits that must be respected.

SSA values remove this dependency tracking altogether: the notion of physical qubit disappears, and the ordering of statements becomes irrelevant. All that matters is connecting each use of a value (i.e. an input to an operation) with its unique definition, the output of a previous operation. In other words, the global ordering imposed by reference semantics is replaced by a causal order on the diagram Kissin., 2019Aleks Kissinger and Sander Uijlen. 2019. A categorical semantics for causal structure. Logical Methods in Computer Science Volume 15, Issue 3 (August 2019). doi: 10.23638/lmcs-15(3:15)2019.


All the concepts of minIR embed themselves very easily within the MLIR-based quantum IRs and the HUGR IR Mark K., 2025Seyon Sivarajah, Alan Lawrence, Alec Edgington, Douglas Wilson, Craig Roy, Luca Mondada, Lukas Heidemann, Ross Duncan Mark Koch. 2025. HUGR: A Quantum-Classical Intermediate Representation. Retrieved (talk recording) from https://www.youtube.com/live/D8esZrt7ogk?feature=shared&t=5217. In this sense, our toy IR serves as the minimum denominator across IRs and compiler technologies so that proposals and contributions we are about to make can be applied regardless of the underlying technical details.

By waiving goodbye to the circuit model, we have been able to integrate much of the theory of traditional compiler design, bringing us in the process much closer to traditional compiler research and the large-scale software infrastructure that is already available. This gives us access to all the classical optimisation and program transformation techniques developed over decades. Using structured control flow, we were also able to model linear resources such as qubits well – by using value semantics and SSA, checking that no-cloning is not violated is as simple as checking that each linear value is used exactly once.

Finally, this new design is also extremely extensible. Not only does it support arbitrary operations, but the type system is also very flexible. There is dedicated support for linear types, but this does not have to be restricted to qubits: lists of qubits could be added or even, depending on the target hardware, higher dimensional qudits, continuous variable quantum data, etc.


  1. And in fact, we will see in section 3.4 that it is not. ↩︎

  2. Note that a region may not be a connected subgraph. Albeit, it is a simple exercice to convince yourself that any non-root region contains either one or two connected components. ↩︎

  3. You may know this from prehistoric times as the goto statement, in languages such as Fortran, C, and, yes, even Go↩︎

  4. You might be thinking “oh, but all that is required here are phi nodes!”, if you are familiar with those. No – you’d also need a sort of “phi inverse”. Besides, see this discussion for more arguments on why no phi nodes. ↩︎

  5. Note that these comments apply specifically to characteristics of quantum circuits. Other diagrammatic representations of quantum processes in use, such as string diagrams, quantum combs etc may not share the same properties. ↩︎

  6. This is python code: pip install pytket↩︎

3.4. Graph transformation in minIR

As discussed in section 3.2, computation graphs with linear values, such as minIR, must adopt strict graph transformation semantics to ensure that linear constraints are satisfied at all times. In this section, we use the minIR graph category presented in the previous section to define transformation semantics that lean on the double pushout (DPO) Ehrig, 1976Hartmut Ehrig and Hans-Jörg Kreowski. 1976. Parallelism of manipulations in multidimensional information structures and sesqui-pushout (SqPO) Corrad., 2006Andrea Corradini, Tobias Heindel, Frank Hermann and Barbara König. 2006. Sesqui-Pushout Rewriting. In Graph Transformations, Berlin, Heidelberg. Springer Berlin Heidelberg, 30--45. doi: 10.1007/11841883_4 semantics in adhesive categories Lack, 2005Stephen Lack and Pawel Sobocinski. 2005. Adhesive and quasiadhesive categories. RAIRO - Theoretical Informatics and Applications 39, 3 (July 2005, 511--545). doi: 10.1051/ITA:2005028.

Adhesivity of hypergraph categories #

The natural place to start this section is by studying which of the categories defined in section 3.3 are adhesive. From adhesivity follows that transforming graphs using DPO and SqPO constructions is well-defined and unique, at least in the regimes of interest to us.

A category is said to be adhesive if it has all pullbacks and pushouts along monos, as well as some compatibility conditions between them, the so-called “Van Kampen squares”. We refer to the literature (e.g. Lack, 2005Stephen Lack and Pawel Sobocinski. 2005. Adhesive and quasiadhesive categories. RAIRO - Theoretical Informatics and Applications 39, 3 (July 2005, 511--545). doi: 10.1051/ITA:2005028) for a complete definition. For our purposes, the following two results are sufficient:

  • Every presheaf topos [C,Set][\mathbb C, \mathrm{Set}] is adhesive (Corollary 3.6 in Lack, 2005Stephen Lack and Pawel Sobocinski. 2005. Adhesive and quasiadhesive categories. RAIRO - Theoretical Informatics and Applications 39, 3 (July 2005, 511--545). doi: 10.1051/ITA:2005028);
  • Every full subcategory DC\mathbb D \subseteq \mathbb C of an adhesive category is adhesive if the pullbacks and pushouts in C\mathbb C of objects in D\mathbb D are again in D\mathbb D (a simple result; if the Van Kampen squares commute in C\mathbb C, they must commute in D\mathbb D).

A first result immediately follows from the first result:

Proposition 3.1Adhesivity of directed hypergraphs
The category H\mathbb H of directed hypergraphs is adhesive.
Proof

It is a presheaf.

This does not immediately generalise to lin-H\textrm{lin-}\mathbb H, as unlike H\mathbb H, Definition 3.2 imposes that EE be a coproduct. However, the result still holds:

Proposition 3.2Adhesivity of hypergraphs with linearity constraints
The categories lin-H\textrm{lin-}\mathbb H and lin-Htype\textrm{lin-}\mathbb H_\textrm{type} are adhesive.
Proof

lin-H\textrm{lin-}\mathbb H is a full subcategory of the adhesive category [lin-C,Set][\textrm{lin-}\mathbb C, \mathrm{Set}]. We must show the existence of pullbacks and pushouts along monos in lin-H\textrm{lin-}\mathbb H.

Pullbacks. Consider a pullback ApaPpbA \xleftarrow{p_a} P \xrightarrow{p_b} of AaCbBA \xrightarrow{a} C \xleftarrow{b} B in [lin-C,Set][\textrm{lin-}\mathbb C, \mathrm{Set}], with A,B,Clin-HA, B, C \in \textrm{lin-}\mathbb H. We must show that PP is in lin-H\textrm{lin-}\mathbb H. Colimits are computed pointwise in presheaves, so we know that P(E)P(E) is the pullback of A(E)C(E)B(E)A(E) \to C(E) \leftarrow B(E) in Set\textrm{Set}. If we can show that P(E)P(E) is the coproduct of P(Est)P(E_{st}) for s,tNs, t \in \mathbb{N}, then we are done.

Let vP(E)v \in P(E). Because A(E)A(E) and B(E)B(E) are coproducts in Set, i.e. a disjoint union, there must be s,t,s,tNs, t, s', t' \in \mathbb{N} such that pa(v)A(Est)p_a(v) \in A(E_{st}) and pb(v)B(Est)p_b(v) \in B(E_{st'}). By naturality of aa and bb, it follows that a(pa(v))C(Est)a(p_a(v)) \in C(E_{st}) and b(pb(v))B(Est)b(p_b(v)) \in B(E_{s't'}). But by commutativity of the pullback diagram, a(pa(v))=b(pb(v))a(p_a(v)) = b(p_b(v)), and thus s=ss = s' and t=tt = t'. We conclude by unicity of the pullback that vP(Est)v \in P(E_{st}) and thus P(E)=stP(Est)P(E) = \bigsqcup_{st} P(E_{st}).

Pushouts. The same argument as for pullbacks also applies to pushouts: given a pushout PP of AaCbBA \xrightarrow{a} C \xleftarrow{b} B in [lin-C,Set][\textrm{lin-}\mathbb C, \mathrm{Set}] with A,B,Clin-HA, B, C \in \textrm{lin-}\mathbb H, an element vP(E)v \in P(E) that makes the pushout square commute must have preimages in A(Est),B(Est)A(E_{st}), B(E_{st}) and C(Est)C(E_{st}) for some s,tNs, t \in \mathbb{N}. Thus the pushout distributes over the coproduct, and we can conclude that P(E)P(E) is the coproduct of pushouts.

The same argument also applies to lin-Htype\textrm{lin-}\mathbb H_\textrm{type}1.

Now to the spicy stuff:

Proposition 3.3Non-adhesivity of hierarchical hypergraphs
Whilst hier-lin-Htype\textrm{hier-lin-}\mathbb H_\textrm{type} is adhesive, the category hier-lin-H\textrm{hier-lin-}\mathbb H is NOT adhesive.
Proof

hier-lin-Htype\textrm{hier-lin-}\mathbb H_\textrm{type} is a presheaf – hence adhesive.

The following pushout square shows that hier-lin-H\textrm{hier-lin-}\mathbb H cannot be adhesive: the pushout square is valid in [hier-lin-C,Set][\textrm{hier-lin-}\mathbb C, \mathrm{Set}], but the pushout at the bottom right is not in hier-lin-H\textrm{hier-lin-}\mathbb H, because the child regions cannot each be assigned a unique parent.

Double pushout semantics #

From Proposition 3.3, it follows that minIR graph transformations can be performed through the double pushout (DPO) construction Ehrig, 1976Hartmut Ehrig and Hans-Jörg Kreowski. 1976. Parallelism of manipulations in multidimensional information structures in the [hier-lin-C,Set][\textrm{hier-lin-}\mathbb C, \mathrm{Set}] category.

Definition 3.6Double pushout (DPO) transformation

A transformation rule pp in an adhesive category A\mathbb A is a span LIRL \leftarrow I \rightarrow R. For objects G,HAG, H \in \mathbb A, we then write G(p,m)HG \xRightarrow{(p,m)} H or GpHG \xRightarrow{p} H if there is a matching morphism m:LGm: L \to G and a context object CC along with morphisms GCHG \leftarrow C \to H and ICI \to C such that the following diagram commutes and both squares are pushouts:

If the DPO transformation G(p,m)HG \xRightarrow{(p,m)} H exists for some rule pp and match mm, then we say GHG \Rightarrow H is a valid DPO rewrite.

To ensure that a DPO rewrite is valid in minIR, we must impose certain conditions. Let GG be an IO-free minIR graph, i.e. Ghier-lin-HG \in \textrm{hier-lin-}\mathbb H, there is a morphism GΣG \to \Sigma in hier-lin-Htype\textrm{hier-lin-}\mathbb H_\textrm{type} for some type system Σ\Sigma and I=O=I = O = \varnothing.

A DPO rewrite GHG \Rightarrow H is a valid minIR DPO rewrite if there is a transformation GpHG \xRightarrow{p} H in [hier-lin-C,Set][\textrm{hier-lin-}\mathbb C, \mathrm{Set}] and

  1. pp is left-mono, i.e. the morphism ILI \to L is mono,2
  2. the pushout complement CC and pushout HH also exist in the slice category hier-lin-HtypeΣ\textrm{hier-lin-}\mathbb H_\textrm{type} \searrow \Sigma,
  3. HH satisfies the hierarchy condition of Definition 3.3,
  4. HH is IO-free.
Proposition 3.4
If GG is a minIR graph and GHG \Rightarrow H is a valid minIR DPO rewrite, then HH is a valid minIR graph.
Proof

We know by construction that H[hier-lin-C,Set]H \in [\textrm{hier-lin-}\mathbb C, \mathrm{Set}]. We must show that HH further satisfies the constraints to be an object in the full subcategory of minIR graphs.

The first condition is standard in DPO and guarantees that CC and DD are unique if they exist.

The third condition we impose on HH corresponds directly to the constraint that defines hierarchical graphs in hier-lin-H\textrm{hier-lin-}\mathbb H. The fourth condition ensures that valid minIR DPO rewrites map IO-free graphs to IO-free graphs.

Finally, the second condition is imposed to ensure well-typedness of HH. The functor hier-lin-Hhier-lin-Htype\textrm{hier-lin-}\mathbb H \to \textrm{hier-lin-}\mathbb H_\textrm{type} that forgets the def\textit{def} and use\textit{use} morphisms is a left adjoint (it possesses a right Kan extension defined pointwise), and thus preserves colimits. The images of CC and HH thus form pushout squares in hier-lin-Htype\textrm{hier-lin-}\mathbb H_\textrm{type}, and by unicity, must match the pushout squares in hier-lin-HtypeΣ\textrm{hier-lin-}\mathbb H_\textrm{type} \searrow \Sigma. Hence HH is well-typed.

The restriction to rewrites of IO-free graphs is not a restriction of generality: if we are interested in rewriting computations with inputs and outputs, we can always express them as IO-free graphs by adding input and output ops with the values in II as outputs, respectively OO as inputs. We assign them dedicated types distinct from all other operations; these operations will never be matched by transformation rules and can be removed at the end of rewriting.

Generalising to sesqui-pushouts #

We restricted minIR rewrites to DPO transformations obtained form left-mono rules, to ensure that the construction is unique. This excludes rules that may identify two values in GG but split them into two different values in HH. Such rules allow for cloning values, which is a useful transformation in minIR for non-linear values. An example of a transformation rule that we would like to allow in minIR:

For this example we added a 2x operation that multiplies an angle value passed as input by two. The transformation rule replaces a rotation of angle 2α2\alpha by two rotations of angle α\alpha by cloning the input angle.

Such semantics are possible using the sesqui-pushout construction (SqPO) by Corradini et al. Corrad., 2006Andrea Corradini, Tobias Heindel, Frank Hermann and Barbara König. 2006. Sesqui-Pushout Rewriting. In Graph Transformations, Berlin, Heidelberg. Springer Berlin Heidelberg, 30--45. doi: 10.1007/11841883_4. We can reuse the same (p,m)\xRightarrow{(p,m)} notation: when DPO is restricted to left-mono rules as we have done, SqPO is a generalisation of DPO (i.e. the construction coincides whenever the DPO exists).

Definition 3.7Sesqui-pushout (SqPO) transformation

A transformation rule pp in an adhesive category A\mathbb A is a span LIRL \leftarrow I \rightarrow R. For objects G,HAG, H \in \mathbb A, we then write G(p,m)HG \xRightarrow{(p,m)} H or GpHG \xRightarrow{p} H if there is a matching morphism m:LGm: L \to G and a context object CC along with morphisms GCHG \leftarrow C \to H and ICI \to C such that CC is the final pullback complement of ILmGI \to L \xrightarrow{m} G and the right square is a pushout:

If the SqPO transformation G(p,m)HG \xRightarrow{(p,m)} H exists for some rule pp and match mm, then we say GHG \Rightarrow H is a valid (SqPO) rewrite.

The left square is redundant in the diagram above, as it follows from the requirement that CC be the final pullback complement (FPC). It is kept to highlight the similarities to DPO. As the commuting diagram indicates, the final pullback complement (FPC) construction forms a pullback square. Furthermore, unlike pushout complements, the FPC is defined by a universality property that ensures uniqueness if it exists. We refer to Corrad., 2006Andrea Corradini, Tobias Heindel, Frank Hermann and Barbara König. 2006. Sesqui-Pushout Rewriting. In Graph Transformations, Berlin, Heidelberg. Springer Berlin Heidelberg, 30--45. doi: 10.1007/11841883_4 for the exact FPC construction.

With SqPO, we can define the set of valid minIR rewrites as given by the SqPO transformations GpHG \xRightarrow{p} H in [hier-lin-C,Set][\textrm{hier-lin-}\mathbb C, \mathrm{Set}] satisfying the relaxed set of conditions

  1. the pushout complement CC and pushout HH also exist in the slice category hier-lin-HtypeΣ\textrm{hier-lin-}\mathbb H_\textrm{type} \searrow \Sigma,
  2. HH satisfies the hierarchy condition of Definition 3.3,
  3. HH is IO-free.

We conclude this section with a discussion of some of the properties of minIR transformations using SqPO (referring again to Corradini Corrad., 2006Andrea Corradini, Tobias Heindel, Frank Hermann and Barbara König. 2006. Sesqui-Pushout Rewriting. In Graph Transformations, Berlin, Heidelberg. Springer Berlin Heidelberg, 30--45. doi: 10.1007/11841883_4 or König, 2018Barbara König, Dennis Nolte, Julia Padberg and Arend Rensink. 2018. A Tutorial on Graph Transformation. In Graph Transformation, Specifications, and Nets - In Memory of Hartmut Ehrig. Springer, 83--104. doi: 10.1007/978-3-319-75396-6_5 for a more detailed explanation of the concepts discussed):

Deletion in unknown context.  A key difference between DPO and SqPO transformations is that SqPO transformations on graphs will delete edges attached to a vertex vdv_d that is deleted by the transformation rule (i.e. vdLv_d \in L but vd∉Rv_d \not\in R of the rule). The DPO transformation on the other hand is only well-defined when all edges incident to vdv_d are in the image of mm and thus explicitly deleted (this is known as the dangling condition).

As minIR rewrites follow SqPO semantics, transformation rules such as the following are allowed:

Here ×\times denotes the multiplication of angles and const(0)\textsf{const(0)} the zero angle. Any operation that would be connected to the starred value on the left would be deleted by this rule. However such an implicit operation deletion only yields valid minIR graphs if all incident values are non-linear and none of the target values of the deleted operation are used.

Non-left-mono rules.  As discussed in the introduction to SqPO, the cloning of values is allowed in minIR rewrites. However, linear values may never be cloned (the FPC or pushout will not exist in these cases). Thus any minIR transformation rule will be left-mono on linear values. It must further be left-linear on all (linear and non-linear) values in II that are mapped to outputs in RR: if a value ww is produced by op applied to vv, then cloning vv and 'op will result in two definitions of ww.

Non-right-mono rules.  Non-right-mono rules are allowed in both DPO and SqPO. They result in vertex merges. In minIR, the situation for right-mono is symmetric to left-mono: the map must be mono on linear values (otherwise the same value will have multiple uses or definitions) and it must be mono on all values in II that are mapped to inputs in LL (otherwise a value in the rewritten minIR graph will have more than one value definition).


  1. In fact, a much simpler argument applies: the category lin-Htype\textrm{lin-}\mathbb H_\textrm{type} is isomorphic to the presheaf category [lin-C~type,Set][\textrm{lin-}\tilde{\mathbb C}_\textrm{type}, \mathrm{Set}], where lin-C~type\textrm{lin-}\tilde{\mathbb C}_\textrm{type} is obtained from lin-Ctype\textrm{lin-}\mathbb C_\textrm{type} by removing the object EE. Adhesivity follows. ↩︎

  2. This is often called left-linear in the literature. We avoid this term in this thesis to avoid confusion with the linearity property of values in minIR. ↩︎

3.5. MinIR rewriting, operationally

The previous section proposed to view minIR rewrites as the result of a (DPO or SqPO) graph transformation. This yields valid rewriting semantics elegantly (and with little effort!). However, the conditions that must be imposed on the transformation to be valid, along with the fact that pushouts may not exist mean that the existence of a rewrite given a transformation rule and a match is not guaranteed.

In this section, we address this by considering a more restricted notion of minIR rewriting, for which the existence of the right-hand side of the rewrite is guaranteed. In addition, in place of the categorical presentation of the last section, we express the rewriting operation operationally, i.e. as data and a procedure on sets that translates directly into an algorithmic implementation.

We find that this rewrite definition is sufficient in practice. We conclude the section with an example of how more complex rewrites can be achieved by composition of simpler rewrites that can be expressed in this framework.

Graph glueings and rewrites #

Throughout, we consider graph glueings on disjoint vertex and (hyper)edge sets. To underline this, we will use the \sqcup symbol to denote disjoint set unions.

As we will be working exclusively with vertex and edge sets in this section (as opposed to the objects in the indexing category), we will drop the bold typeface for sets, writing e.g. VV instead of V\mathbf V for the set of vertices of a hypergraph.

Finally, all minIR graphs in this section are IO free.

We define local graph rewrites in terms of graph glueings. Consider first the case of two arbitrary graphs G1=(V1,E2)G_1 = (V_1, E_2) and G2=(V2,E2)G_2 = (V_2, E_2), along with a relation μ V1×V2\mu\ \subseteq V_1 \times V_2. Let μ (V1V2)2\sim_\mu \ \subseteq (V_1 \sqcup V_2)^2 be the equivalence relation induced by μ\mu, i.e. the smallest relation on V1V2V_1 \sqcup V_2 that is reflexive, symmetric and transitive, and satisifes for all v1V1v_1 \in V_1 and v2V2v_2 \in V_2,

(v1,v2)μv1μv2.(v_1, v_2) \in \mu \Rightarrow v_1 \sim_\mu v_2.

Then, we can define

  • V=(V1V2)/μV = (V_1 \sqcup V_2)/\sim_\mu is the set of all equivalence classes of μ\sim_\mu, and
  • for vV1V2v \in V_1 \sqcup V_2, αμ(v)V\alpha_\mu(v) \in V is the equivalence class of μ\sim_\mu that vv belongs to.
Definition 3.8Graph glueing

The glueing of G1G_1 and G2G_2 according to the glueing relation μ\mu is given by the vertices V=(V1V2)/μV = (V_1 \sqcup V_2)/\sim_\mu and the edges

E={(αμ(u),αμ(v))(u,v)E1E2}V2.E = \{(\alpha_\mu(u), \alpha_\mu(v)) \mid (u,v) \in E_1 \sqcup E_2 \} \subseteq V^2.

We write the glueing graph as (G1G2)/μ(G_1 \sqcup G_2) / \sim_\mu.

In other words, the glueing is the disjoint union of the two graphs, with identification (and merging) of vertices that are related in μ\mu.

This allows us to define a rewrite on a graph GG:

Definition 3.9Graph rewrite

A rewrite rr on a graph G=(V,E)G = (V, E) is given by a tuple r=(GR,V,E,μ)r = (G_R, V^-, E^-, \mu), with

  • GR=(VR,ER)G_R = (V_R, E_R) is a graph called the replacement graph,
  • VVV^- \subseteq V is the vertex deletion set,
  • EEdom(μ)2E^- \subseteq E \cap dom(\mu)^2 is the edge deletion set, and
  • μ:VVR\mu: V^- \rightharpoonup V_R is the glueing relation, a partial function that maps a subset of the deleted vertices of GG to vertices in the replacement graph.

The domain of definition dom(μ)dom(\mu) is known as the boundary values of rr.

A graph rewrite per this definition can always be generated by a single pushout (SPO) transformation Löwe, 1991Michael Löwe. 1991. Extended algebraic graph transformation. Retrieved from http://d-nb.info/910935696.

  • define LL as the graph (V,E)(V^-, E^-). Then the injection LGL \subseteq G is the match morphism LGL \to G;
  • the partial map μ\mu maps a subset of VV^- to vertices in the replacement R=GRR = G_R. By injectivity of the match morphism, it also defines a partial map LRL \rightharpoonup R.

We opted for SPO-like semantics in this definition, as they are the simplest to write in set-theoretic terms and coincide with DPO and SqPO in our restricted domain of interest.

The result of the rewrite is computed by gluing the right-hand side GRG_R to the context subgraph GC=(VC,EC)G_C = (V_C, E_C) of GG given by

VC=(VV)  dom(μ)EC=(EE)  VC2.\begin{aligned}V_C &= (V \smallsetminus V^-) \ \cup\ dom(\mu)\\E_C &= (E \smallsetminus E^-)\ \cap\ V_C^2.\end{aligned}

The partial function μ\mu is a special case of a glueing relation μVC×VR\mu \subseteq V_C \times V_R, and thus defines a glueing of GCG_C with GRG_R. The rewritten graph resulting from applying rr to GG is r(G)=(GCGR)/μ.r(G) = (G_C \sqcup G_R) / \sim_\mu.

An example of a graph rewrite is given in the next figure. This is equivalent to an SPO transformation with the graph induced by VV^- on the left-hand side, the graph GRG_R on the right-hand side and the partial map LRL \hookrightarrow R given by μ\mu.

Application of a graph rewrite. On the left, the original graph GGG along with the replacement graph GRG_RGR​ (grey box). On the right, the rewritten graph r(G)r(G)r(G). Only the vertex ggg has been deleted, as other vertices in V−V^-V− are in the boundary dom(μ)dom(\mu)dom(μ) (in orange). The (singleton) edge deletion set is red. The blue edge connects a vertex of V∖V−V \smallsetminus V^-V∖V− to a boundary vertex, and is thus also present on the right-hand side. The purple edge, on the other hand, connects a vertex of V∖V−V \smallsetminus V^-V∖V− to a non-boundary vertex of V−V^-V−, and is thus deleted.

Application of a graph rewrite. On the left, the original graph GG along with the replacement graph GRG_R (grey box). On the right, the rewritten graph r(G)r(G). Only the vertex gg has been deleted, as other vertices in VV^- are in the boundary dom(μ)dom(\mu) (in orange). The (singleton) edge deletion set is red. The blue edge connects a vertex of VVV \smallsetminus V^- to a boundary vertex, and is thus also present on the right-hand side. The purple edge, on the other hand, connects a vertex of VVV \smallsetminus V^- to a non-boundary vertex of VV^-, and is thus deleted.

When there are no edges between VVV \smallsetminus V^- and Vdom(μ)V^- \smallsetminus dom(\mu) (purple in the example above), this definition corresponds to graph rewrites that can be produced using DPO transformations (see discussion in section 3.4). Otherwise, such edges are deleted.

The notions of graph glueing and graph rewrite can straightforwardly be lifted to hypergraphs and, by extension, to minIR graphs. Notice that in this case, values are glued together, not operations (the former were defined as the graph’s vertices, the latter as its hyperedges).

However, the glueing of two valid minIR graphs – and the result of applying a valid rewrite – may not be a valid minIR graph. Glueing two values of a linear type, for instance, is a sure way to introduce multiple uses (or definitions) of it. Thus, we must be careful to only consider glueings and rewrites of minIR graphs that preserve all the constraints we have imposed in Definition 3.4.

Ensuring rewrite validity: interfaces #

As a sufficient condition for valid minIR rewrites, we introduce minIR interfaces, a concept closely related to the “hypergraph with interfaces” construction of Bonchi, 2017Filippo Bonchi, Fabio Gadducci, Aleks Kissinger, Paweł Sobociński and Fabio Zanasi. 2017. Confluence of Graph Rewriting with Interfaces or the supermaps of quantum causality Hefford, 2024James Hefford and Matt Wilson. 2024. A Profunctorial Semantics for Quantum Supermaps. In Proceedings of the 39th Annual ACM/IEEE Symposium on Logic in Computer Science, July 2024. ACM, 1--15. doi: 10.1145/3661814.3662123. We eschew the presentation of holes as a slice category in favour of a definition that fits naturally within minIR and is sufficient for our purposes.

Let GG be a Σ\Sigma-typed minIR graph with data types TT and linear types TLTT_L \subseteq T. Consider type strings S,STS, S' \in T^\ast. We define the index sets

Idx(S)={iN1iS}IdxL(S)={iIdx(S)SiTL}Idx(S)\begin{aligned}\mathrm{Idx}(S) &= \{i \in \mathbb{N} \mid 1 \leq i \leq |S|\}\\\mathrm{Idx}_L(S) &= \{i \in \mathrm{Idx}(S) \mid S_i \in T_L\} \subseteq \mathrm{Idx}(S)\end{aligned}

corresponding respectively to the set of all indices into SS and the subset of indices of linear types. For any iIdx(S)i \in \mathrm{Idx}(S), we denote by SiS_i the type at position i in SS.

We define a partial order \preccurlyeq1 on TT^\ast where SSS \preccurlyeq S' and say that SS' can be coerced into SS if there exists an index map ρ:Idx(S)Idx(S)\rho: \mathrm{Idx}(S) \to \mathrm{Idx}(S') such that

  • types are preserved: Si=Sρ(i)S_i = S'_{\rho(i)}, and
  • ρ\rho is well-defined and bijective on the restriction to indices of linear types
    ρIdxL(S):IdxL(S)IdxL(S).\left.\rho\right|_{\mathrm{Idx}_L(S)}: \mathrm{Idx}_L(S) \to \mathrm{Idx}_L(S').
Definition 3.10Interface

Let TT be a set of data types. An interface I=(U,D)I = (U, D) is a pair of type strings U,DTU, D \in T^\ast.

We say that an interface I=(U,D)I' = (U', D') can be coerced into an interface I=(U,D)I = (U, D), written III \triangleleft I', if UUU \succcurlyeq U' and DDD \preccurlyeq D'.

We can define the interface associated with an operation oo in a minIR graph GG by considering the values used and defined by oo. Calling τ\tau the type morphism on GG and assuming oEsto \in E_{st} to be an operation in GG with ss inputs and tt outputs, we define the interface of oo in GG as the pair of strings in TT^\ast

I(o)=(τ(src1(o))τ(srcs(o)),τ(tgt1(o))τ(tgtt(o))).I(o) = (\tau(\textit{src}_1(o))\cdots\tau(\textit{src}_s(o)), \tau(\mathit{tgt}_1(o))\cdots\tau(\mathit{tgt}_t(o))).

Similarly, we can assign interfaces to subgraphs of minIR graphs:

Definition 3.11MinIR subgraph

Consider a subset of values and operations VHVV_H \subseteq V and EHEE_H \subseteq E. Define the use and define boundary sets

BU={vVHdef(v)EEH},BD={vVHuse(v)EEH}.\begin{aligned} B_U &= \{v \in V_H \mid \mathit{def}\,(v) \in E \smallsetminus E_H \},\\B_D &= \{v \in V_H \mid use(v) \in E \smallsetminus E_H \}.\end{aligned}

The tuple H=(VH,EH)H = (V_H, E_H) of GG is called a minIR subgraph of GG if there exists a region RR of GG such that all boundary values of HH are in RR:

B=BUBDR. B= B_U \cup B_D \subseteq R.

We write HGH \subseteq G to indicate that HH is a minIR subgraph of GG.

Note that BUB_U is exactly the set of inputs II in the non-IO free minIR graph given by the subgraph (VH,EH)(V_H, E_H) of the minIR graph. BDB_D is a superset of the outputs OO of HH: it includes all linear values in HH that do not have a use in HH, but also any non-linear value that has a use outside of HH.

Unlike interfaces, subgraph boundary values are not ordered. An ordering of BVB \subseteq V is a string SVS \in V^\ast along with a bijective map

ord:BIdx(S)such thatv=Sord(v).\mathrm{ord}: B \to \mathrm{Idx}(S) \quad\textrm{such that}\quad v = S_{\mathrm{ord}(v)}.

If there are strings SU,SDVS_U, S_D \in V^\ast and orderings of BUB_U and BDB_D

ordU: BUIdx(SU)ordD: BDIdx(SD),\begin{aligned}\textrm{ord}_U:\ &B_U \to \mathrm{Idx}(S_U)&\quad\textrm{ord}_D:\ &B_D \to \mathrm{Idx}(S_D),\end{aligned}

then we can set srci(H)=(SU)i\textit{src}_i\,(H) = (S_U)_i and tgti(H)=(SD)i\textit{tgt}_i\,(H) = (S_D)_i in complete analogy to operations. We will write src(H)\textit{src}(H) and tgt(H)\textit{tgt}(H) for the strings src1(H)srcSU(H)\textit{src}_1(H)\cdots\textit{src}_{|S_U|}(H) and tgt1(H)tgtSD(H)\textit{tgt}_1(H)\cdots\textit{tgt}_{|S_D|}(H) respectively. We say that the subgraph HH implements the interface

IH=(τ(src(H)),τ(tgt(H)),I_H = (\tau(\textit{src}(H)), \tau(\mathit{tgt}(H)),

where the type morphism τ\tau was extended element-wise to strings VV^\ast.

Remark, though, that unlike operations, the same subgraph may implement more than one interface as a result of various choices of orderings ordU\textrm{ord}_U and ordD\textrm{ord}_D.

As mentioned, the subgraph HH forms a non-IO free minIR graph. We can always construct an IO-free minIR graph from HH by adding two operations oino_{in} and oouto_{out} in the root region respectively in E0,SUE_{0, |S_U|} and ESD,0E_{|S_D|, 0} inputs-outputs, defined by

tgti(oin)=srci(H),srci(oout)=tgti(H). \textit{tgt}_i\,(o_{in}) = \textit{src}_i(H),\quad\quad \textit{src}_i(o_{out}) = \textit{tgt}_i\,(H).

We call the resulting graph Hˉ\bar{H} an interface graph. It implements the interface IHI_H if HH implements IHI_H. Calling to mind the illustrations of section 3.3, Hˉ\bar{H} looks like one of the nested regions within regiondef operations that we were considering.

MinIR operation rewrite #

Consider

  • an operation oo in a minIR graph GG with values V,V,
  • an interface graph Hˉ\bar{H} with values VHV_H and its associated subgraph HHˉH \subseteq \bar{H}, such that HH implements an interface I(o)IH,I(o) \triangleleft I_H,
  • the index maps ρ:Idx(src(H))Idx(src(o))\rho: \mathrm{Idx}(\textit{src}(H)) \to \mathrm{Idx}(\textit{src}(o)) and σ:Idx(tgt(o))Idx(tgt(H))\sigma: \mathrm{Idx}(\textit{tgt}\,(o)) \to \mathrm{Idx}(\textit{tgt}\,(H)) that define the generalisation I(o)IHI(o) \triangleleft I_H (per Definition 3.10).

We can define a glueing relation μoV×VH\mu_o \subseteq V \times V_H

μo= {(srcρ(i)(o),srci(H))iIdx(src(H))} {(tgti(o),tgtσ(i)(H))iIdx(tgt(o))}.\begin{aligned}\mu_o =\ & \{ \left(\textit{src}_{\rho(i)}(o), \textit{src}_{i}(H)\right) \mid i \in \mathrm{Idx}(\textit{src}(H)) \}\ \cup \\& \{ \left(\mathit{tgt}_{i}\,(o), \mathit{tgt}_{\sigma(i)}(H)\right) \mid i \in \mathrm{Idx}(\textit{tgt}\,(o)) \}.\end{aligned}

This is almost enough to define a rewrite that replaces the operation oo in GG with the values and operations of HH – the interface compatibility constraint I(o)IHI(o) \triangleleft I_H that we have imposed ensures that the resulting minIR graph is valid. Unfortunately, μo\mu_o is not a partial function as required by Definition 3.4.

This is resolved in the following proposition:

Proposition 3.5MinIR operation rewrite

Let GG, oo and HH such that I(o)IHI(o) \triangleleft I_H, as defined above. Then

((GH)/μo ⁣){o},\big((G \sqcup H) / \sim_{\mu_o}\!\big) \smallsetminus \{o\},

i.e. the graph obtained by removing the operation oo from the glueing of GG and HH along μo\mu_o, is a valid minIR graph.

There is a graph GRG_R with values VRV_R and a partial function μo:VVR\mu_o': V \rightharpoonup V_R such that the graph (5) is the graph ro(G)r_o(G), obtained from the rewrite

ro=(GR,dom(μo),{o},μo).r_o = (G_R, dom(\mu_o), \{o\}, \mu_o').

We call ror_o the rewrite of oo into HH.

The definition of the rewrite of oo into a graph HH behaves as one would expect – the only subtleties relate to handling non-linear (i.e. copyable) values at the boundary of the rewrite. The following example illustrates some of these edge cases.

Rewriting operation ooo in the graph GGG (top left) into the operations o1o_1o1​ and o2o_2o2​ of the graph Hˉ\bar{H}Hˉ (bottom left). Coloured dots indicate the index maps ρ\rhoρ and σ\sigmaσ from inputs BUB_UBU​ of Hˉ\bar{H}Hˉ to inputs of ooo, respectively from outputs of ooo to outputs BDB_DBD​ of Hˉ\bar{H}Hˉ.

Rewriting operation oo in the graph GG (top left) into the operations o1o_1 and o2o_2 of the graph Hˉ\bar{H} (bottom left). Coloured dots indicate the index maps ρ\rho and σ\sigma from inputs BUB_U of Hˉ\bar{H} to inputs of oo, respectively from outputs of oo to outputs BDB_D of Hˉ\bar{H}.

When the index maps ρ\rho and σ\sigma are not injective (yellow and green dots), values are merged, resulting in multiple uses of the value (i.e. copies). This is why the index maps must be injective on linear values (dots in shades of blue). Value merging also happens when a value is used multiple times in oo (yellow and red dots). This will never happen with linear values (as they can never have more than one use in oo), nor with any value definitions (the same value can never be defined more than once). Finally, values not in the image of ρ\rho or σ\sigma (purple dot) are discarded. This case is also excluded for linear values by requiring surjectivity.

Proof

We start this proof with the explicit construction of GRG_R and μo\mu_o'. Define R(VH)2\sim_R \subseteq (V_H)^2 as the smallest equivalence relation such that

srcρ(i)(o)=srcρ(j)(o)tgti(oin)Rtgtj(oin).\textit{src}_{\rho(i)}(o) = \textit{src}_{\rho(j)}(o) \Rightarrow \textit{tgt}_i\,(o_{in}) \sim_R \textit{tgt}_j\,(o_{in}).

Then we define GˉR=Hˉ/R\bar{G}_R = \bar{H} / \sim_R, the graph obtained by glueing together values within the same equivalence class of R\sim_R.

Claim 1: GˉR\bar{G}_R is a valid minIR graph.

Claim 1 follows from the observation that only values of non-linear types are glued together. If vRvv \sim_R v', then either v=vv = v' or there exist iji \neq j such that

tgti(oin)Rtgtj(oin).\textit{tgt}_i\,(o_{in}) \sim_R \textit{tgt}_j\,(o_{in}).
If ρ(i)=ρ(j)\rho(i) = \rho(j), then ρ\rho is not injective on ii and jj, and by the definition of ρ\rho, τ(v)TL\tau(v)\notin T_L and τ(v)TL\tau(v') \notin T_L. Otherwise, there are i=ρ(i)srcρ(j)(o)=ji' = \rho(i) \neq \textit{src}_{\rho(j)}(o) = j' such that srci(o)=srcj(o)\textit{src}_{i'}(o) = \textit{src}_{j'}(o). The same value is used twice, which is only a valid minIR graph if vv and vv' are not linear, thus proving Claim 1.

Define GRG_R as the subgraph obtained from GˉR\bar{G}_R by removing the operations {oin,oout}\{o_{in}, o_{out}\}. Let VR=VH/RV_R = V_H / \sim_R be the set of values of GRG_R (and of GˉR\bar{G}_R). Writing αR(v)VR\alpha_R(v) \in V_R for the equivalence class of R\sim_R that vVHv \in V_H belongs to, we can define μoV×VR\mu_o' \in V \times V_R as:

(v,w)μo(v,αR(w))μo.(v, w) \in \mu_o \Leftrightarrow (v, \alpha_R(w)) \in \mu_o'.

Claim 2: μo\mu_o' is a partial function VVRV \rightharpoonup V_R.

In other words, for all (v,α1),(v,α2)μo(v, \alpha_1), (v, \alpha_2) \in \mu_o', then α1=α2\alpha_1 = \alpha_2. Let w1α1w_1 \in \alpha_1 and w2α2w_2 \in \alpha_2 be values in VHV_H. First of all, srci(o)tgtj(o)\textit{src}_i(o) \neq \textit{tgt}_j\,(o) for all i,ji, j, otherwise GG is not acyclic. So either use(v)=ouse(v) = o, or def(v)=o\textit{def}(v) = o, but not both.

The simpler case: if def(v)=o\textit{def}\,(v) = o, then there exists ii such that tgt(o)i=v\textit{tgt}\,(o)_i = v. Furthermore ii is unique because by minIR definition, vv has a unique definition in GG. It follows from (4) that w1=srcρ(i)(oout)=w2w_1 = \textit{src}_{\rho(i)}(o_{out}) = w_2 and hence α1=α2\alpha_1 = \alpha_2.

Otherwise, there exists ii and jj such that v=srcρ(i)(o)=srcρ(j)(o)v = \textit{src}_{\rho(i)}(o) = \textit{src}_{\rho(j)}(o) and tgti(oin)=w1\textit{tgt}_i\,(o_{in}) = w_1 as well as tgtj(oin)=w2\textit{tgt}_j\,(o_{in}) = w_2. By definition of R\sim_R, we have wRww \sim_R w', and thus

α1=αR(w1)=αR(w2)=α2,\alpha_1 = \alpha_R(w_1) = \alpha_R(w_2) = \alpha_2,

proving Claim 2.

Claim 3: ro(G)r_o(G) is given by ((GH)/μo){o}((G \sqcup H) / \sim_{\mu_o}) \smallsetminus \{o\}.

It follows directly from our construction of R\sim_R and μo\mu_o' that the equivalence classes of (the smallest equivalence relation closure of) μoαR\mu_o' \circ \alpha_R is equal to the equivalence classes of (the smallest equivalence relation closure of) μo\mu_o. The claim follows by Definition 3.8 and the definition of ror_o.

And finally, Claim 4: ro(G)r_o(G) is a valid minIR graph.

Per Definition 3.4, We must check four properties: (i) every value is defined exactly once, (ii) every linear value is used exactly once, (iii) the graph is acyclic, and (iv) every region has (at most) one parent.

(iii) follows from the fact that GG and HH are acyclic and a single operation oo in GG is replaced: any cycle across GG and HH would also be a cycle in GG by replacing the subpath in HH with oo. (iv) follows from the fact that oino_{in} and oouto_{out} are in the root region of Hˉ\bar{H}, by definition of interface implementation. (i): removing oo from GG removes the unique definitions of all values that are targets of oo. Each such value vv is glued to a unique value srci(oout)\mathit{src}_i\,(o_{out}) in HH – the new and unique definition of vv in ro(G).r_o(G). (ii) follows from the same argument as in (i), but relying on injectivity of ρ\rho on linear values to establish uniqueness.

Arbitrary minIR rewrites #

We have so far defined rewrites of single operations into graphs HH. We can generalise these rewrites to rewrite subgraphs PGP \subseteq G, provided the minIR subgraphs satisfy some constraints. We require for this a notion of convexity, as discussed in Bonchi, 2022Filippo Bonchi, Fabio Gadducci, Aleks Kissinger, Pawel Sobocinski and Fabio Zanasi. 2022. String diagram rewrite theory II: Rewriting with symmetric monoidal structure. Mathematical Structures in Computer Science 32, 4 (April 2022, 511--541). doi: 10.1017/s0960129522000317.

As usual, let us consider a minIR graph GG with values VV, linear values VLVV_L \subseteq V, edges EE, the incidence maps srci\textit{src}_i and tgtj\textit{tgt}_j as well as their inverses use\textit{use} and def\textit{def}. Consider further a subgraph of GG that we will now call P=(VP,EP)GP = (V_P, E_P) \subseteq G, to distinguish from HH.

Let us further define the partial parentparent morphism that maps a value vVv \in V to the parent of the region of vv.

Definition 3.12Convex minIR subgraph

A minIR subgraph PGP \subseteq G is convex if the following conditions hold:

  • for all v1,v2VPv_1, v_2 \in V_P, any path along \leadsto from v1v_1 to v2v_2 contains only vertices in VPV_P,
  • parent-child relations are contained within the subgraph, i.e.
    vVPdom(parent)parent(v)VP.v \in V_P \cap dom(\mathit{parent}) \Leftrightarrow \mathit{parent}(v) \in V_P.

Define the sets of boundary values BU,BDB_U, B_D and B=BUBDB = B_U \cup B_D, as in (2); then fix the boundary orderings src(P)\textit{src}(P) and tgt(P)\textit{tgt}\,(P) as in (3). The subgraph PP implements the interface

Consider an interface graph Hˉ\bar{H} that implements IHI_H such that IPIHI_P \triangleleft I_H. Instead of defining a gluing relation from values of an operation oo to values of HH, we replace the interface I(o)I(o) with IPI_P. This generalises the definition of μo\mu_o from (4) to a glueing μB×VH\mu\subseteq B \times V_H defined as

μ= {((srcρ(i)(P)),srci(H))iIdx(src(H))} {((tgti(P)),tgtσ(i)(H))iIdx(tgt(P))},\begin{aligned}\mu =\ & \{ \left((\textit{src}_{\rho(i)}(P)), \textit{src}_{i}\,(H)\right) \mid i \in \mathrm{Idx}(\textit{src}(H)) \}\ \cup \\& \{ \left((\textit{tgt}_{i}\,(P)), \textit{tgt}_{\sigma(i)}\,(H)\right) \mid i \in \mathrm{Idx}(\textit{tgt}\,(P)) \},\end{aligned}

With the set of boundary operations defined as2

EB={oEP(tgt(o)src(o))B},E_B = \left\{o \in E_P \mid \left(\mathit{tgt}\,(o) \cup \textit{src}(o)\right) \subseteq B\right\},

we are able to define minIR rewrites in their most general form.

Proposition 3.6MinIR subgraph rewrite

Let PGP \subseteq G and HH such that IPIHI_P \triangleleft I_H and PP is convex, as defined above. Then,

((GH)/μ ⁣)(VPB,EB),\big((G \sqcup H) / \sim_{\mu}\!\big) \smallsetminus (V_P \smallsetminus B, E_B),

i.e. the graph obtained by removing the values VPBV_P \smallsetminus B and operations EBE_B from the glueing of GG and HH along μ\mu, is a valid minIR graph.

There is a graph GRG_R with values VRV_R and a partial function μ:VPVR\mu': V_P \rightharpoonup V_R such that the graph (8) is the graph rP(G)r_P(G), obtained from the rewrite

rP=(GR,VP,EB,μ).r_P = (G_R, V_P, E_B, \mu').

We call rPr_P the rewrite of PP into HH.

Proof

Consider an operation oo that implements IP=(UP,DP)I_P = (U_P, D_P). We can define the interface graph Hˉo\bar{H}_o given by three operations oino_{in}, oouto_{out} and oo. Its associated subgraph HoHˉoH_o \subseteq \bar{H}_o only includes oo. Let μ~\tilde \mu be the glueing relation

μ~= {(srci(P),srci(o))iIdx(UP)} {(tgti(P),tgti(o))iIdx(DP)}.\begin{aligned}\tilde \mu =\ &\{ (\textit{src}_i(P), \textit{src}_i(o)) \mid i \in \mathrm{Idx}(U_P) \}\ \cup \\& \{ (\textit{tgt}_i\,(P), \textit{tgt}_i\,(o)) \mid i \in \mathrm{Idx}(D_P) \}.\end{aligned}

Consider the rewrite r=(Ho,VPB,EB,μ~)r = (H_o, V_P \smallsetminus B, E_B, \tilde\mu). If we write G=(V,E)G' = (V', E') for the subgraph of GG given by

V=(V(VPB))E=(EEB)(V),\begin{aligned}V' &= (V \smallsetminus (V_P \smallsetminus B)) \\E' &= (E \smallsetminus E_B) \cap (V')^\ast,\end{aligned}
then according to (1), the graph resulting from applying rr to GG can be expressed as the glueing

Go=r(G)=(GHo)/μ~.G_o = r(G) = (G' \sqcup H_o) / \sim_{\tilde \mu}.

Our claim is that GoG_o is a valid minIR graph.

The graph (8) is then obtained by applying the rewrite ror_o as given by (6) to GoG_o. Defining the rewrite rPr_P as the composition of rr followed by ror_o, the result follows from our claim and Proposition 3.5.

We now prove the claim, by showing the four properties of minIR graphs as per Definition 3.4. Property i) requires showing that every value is defined exactly once. As GG' is obtained by removing values and operations from a valid minIR graph GG, no value in VV' can be defined more than once. A value vVv \in V' that is not defined in GG' must be in the boundary vBv \in B of PP. By the boundary definitions of (2), vv cannot be in BUB_U and thus must be in BDB_D. It follows by the definition of the glueing μ~\tilde \mu that in GoG_o, vv will be in the definitions of oo: def(v)=o\textit{def}(v) = o. The glueing μ~\tilde \mu is bijective between the values of PP and oo and thus we can conclude that vv has a unique definition in GoG_o.

The same argument applies to property ii). Property iii) follows from the convexity requirement of PP. Finally, property iv) (every region has at most one parent) follows from two observations. First, by convexity of PP, no deleted value or operation could be the parent of any value not in PP, and thus the parentparent relation is well-defined on GG': im(parent)Eim(parent) \subseteq E'. Secondly, all new values and operations added to the boundary region of GG' are from the root region of HH, and thus do not have a parent, ensuring that parent uniqueness is preserved.

This simple and limited graph transformation framework captures a remarkably large set of minIR program transformations. It may seem at first that the restriction to boundary values within a single region of Definition 3.11, as well as the convexity requirements of Definition 3.12 represent significant limitations on the expressivity of the rewrites. In practice, however, the semantics of minIR operations can be used to decompose more complex rewrites into a sequence of simple rewrites to which Proposition 3.6 applies.

Consider minIR graphs with a type system that includes regiondef and call operations as discussed in examples of the previous section – respectively defining a code block by a nested region and redirecting control flow to a code block defined using a regiondef. Then all constraints that we impose on rewriting can be effectively side-stepped using the region outlining and value hoisting transformations.

Region outlining moves a valid minIR subgraph into its own separate region, and replaces the hole left by the subgraph in the computation by a call operation to the newly outlined region.

Value hoisting moves a value definition within a region to its parent region and passes the value down to the nested region through an additional input. In case of linear values, we can similarly hoist the unique use of the value to the parent region.

Using these transformations, non-convex subgraphs can always be made convex by taking the convex hull and outlining any parts within it that are not part of the subgraph. Outlined regions can then be passed as additional inputs to the subgraph. Step 1 of the figure below illustrates this transformation. Similarly, a subgraph that includes operations without their parent can be extended to cover the entire region and its parent, outlining any parts of the region that are not part of the subgraph.

Finally, whenever a boundary value vv belongs to a region that is not the top level region of the subgraph3, we can repeatedley hoist vv to its parent region until it is in the top level region. The value is then recursively passed as argument to descendant regions until the region that it is required in. Subgraphs can thus always be transformed to only have input and output boundary values at the top level region. Step 2 of the figure below illustrates this transformation.

A non-convex minIR graph rewrite, obtained by decomposition into valid convex rewrites, using outlining and hoisting. For simplicity, regiondef operations were made implicit and represented by nested boxes: a region within an operation corresponds to a region definition that is passed as an argument to the operation. Edge colours correspond to value types. Step 1 outlines the ... operations into a dedicated region, which step 2 hoists outside of the region being rewritten. Step 3 and 4 together correspond to a minIR sugraph rewrite. They have been split into two steps following the proof strategy. Step 4 is an instance of a minIR operation rewrite.

A non-convex minIR graph rewrite, obtained by decomposition into valid convex rewrites, using outlining and hoisting. For simplicity, regiondef operations were made implicit and represented by nested boxes: a region within an operation corresponds to a region definition that is passed as an argument to the operation. Edge colours correspond to value types. Step 1 outlines the ... operations into a dedicated region, which step 2 hoists outside of the region being rewritten. Step 3 and 4 together correspond to a minIR sugraph rewrite. They have been split into two steps following the proof strategy. Step 4 is an instance of a minIR operation rewrite.


  1. To be precise, \preccurlyeq is a partial order on the type strings up to isomorphism↩︎

  2. The set operations \subseteq and \cup are again understood to apply to the unordered set of elements contained in the lists tgt(o)\mathit\,{tgt}(o) and src(o)\mathit{src}\,(o)↩︎

  3. We can always extend a subgraph to contain more ancestor regions, until there is indeed a unique top-level region in the subgraph. ↩︎