4.1. Related work
Our proposed solution can be seen as a specialisation of RETE networks Forgy, 1982. 1982. Rete: A fast algorithm for the many pattern/many object pattern match problem. Artificial Intelligence 19, 1 (Septempter 1982, 17--37). doi: 10.1016/0004-3702(82)90020-0 Varró, 2013. 2013. A Rete Network Construction Algorithm for Incremental Pattern Matching and derivatives Ian, 2003. 2003. The execution kernel of RC++: RETE*, a faster RETE with TREAT as a special case. International Journal of Intelligent Games and Simulation 2, 1 (Feb 2003, 36-48) Armstr., 2014. 2014. Memory Efficient Stream Reasoning onResource-Limited Devices Mirank., 1987. 1987. TREAT: a new and efficient match algorithm for AI production systems to the case of graph pattern matching. The additional structure obtained from restricting our considerations to graphs results in a simplified network design that allows us to derive worst-case asymptotic runtime and space bounds that are polynomial in the parameters relevant to our use case1 – overcoming a key limitation of RETE.
Another well-studied application of large-scale pattern matching is in the context of stochastic biomolecular simulations Sneddon, 2010. 2010. Efficient modeling, simulation and coarse-graining of biological complexity with NFsim. Nature Methods 8, 2 (December 2010, 177--183). doi: 10.1038/nmeth.1546 Bachman, 2011. 2011. New approaches to modeling complex biochemistry. Nature Methods 8, 2 (January 2011, 130--131). doi: 10.1038/nmeth0211-130, particularly the Kappa project Danos, 2004. 2004. Formal molecular biology. Theoretical Computer Science 325, 1 (Septempter 2004, 69--110). doi: 10.1016/j.tcs.2004.03.065. Stochastic simulations depend on performing many rounds of fast pattern-matching for continuous Monte Carlo simulations Yang, 2008. 2008. Kinetic Monte Carlo method for rule-based modeling of biochemical networks. Physical Review E 78, 3 (Septempter 2008, 031910). doi: 10.1103/physreve.78.031910. However, unlike our use case, the procedure typically does not need to scale well to a large number of patterns. In Danos, 2007. 2007. Scalable Simulation of Cellular Signaling Networks, Danos et al. introduced a pre-computation step to accelerate matching by establishing relations between patterns that activate or inhibit further patterns. This idea was later expanded upon and formalised in categorical language in Boutil., 2017. 2017. Incremental Update for Graph Rewriting. The ideas presented in Boutil., 2017. 2017. Incremental Update for Graph Rewriting are similar to ours; their formalism has the advantage of being more general but does not present any asymptotic complexity bounds and suffers from similar worst-case complexities as RETE.
A similar problem has also been studied in the context of multiple-query optimisation for database queries Sellis, 1988. 1988. Multiple-query optimization. ACM Transactions on Database Systems 13, 1 (March 1988, 23–52). doi: 10.1145/42201.42203 Ren, 2016. 2016. Multi-Query Optimization for Subgraph Isomorphism Search. Proceedings of the VLDB Endowment 10, 3 (November 2016, 121–132). doi: 10.14778/3021924.3021929, but it has limited itself to developing caching strategies and search heuristics for specific use cases. Finally, using a pre-compiled data structure for pattern matching was already proposed in Messmer, 1999. 1999. A decision tree approach to graph and subgraph isomorphism detection. Pattern Recognition 32, 12 (December 1999, 1979--1998). doi: 10.1016/S0031-3203(98)90142-X. However, with a space complexity – is the input size and the pattern size – it does not scale to large input graphs, even for small patterns.
RETE networks have been shown to have exponential worst-case space (and thus time) complexity Rakib, 2018. 2018. An Efficient Rule-Based Distributed Reasoning Framework for Resource-bounded Systems. Mobile Networks and Applications 24, 1 (October 2018, 82--99). doi: 10.1007/s11036-018-1140-x, although performance in practical use cases can vary widely Uddin, 2016. 2016. Resource-Bounded Context-Aware Applications: A Survey and Early Experiment. ↩︎