I am the founder of the Ontario Database Day, a one-day symposium that will bring together the database and data-focused research community of Ontario. The first edition of OnDBD was held at McMaster University. The next edition of OnDBD is organized by Sujaya Maiyya and Jelle Hellings and will be held at Waterloo University.
I am currently not looking for new students. Due to the large volume of unsolicited e-mails of prospective students, I will refrain from answering each of them individually. If you are interested in becoming a Master or PhD student at McMaster, then please follow the official application process.
I typically have a large volume of 60-80 undergraduate students applying as TA for my courses. To keep the hiring process fair, I will only consider applications via the official Undergraduate Teaching Assistant Application portal. I will invite a small selection of applicants that stand out (e.g., based on experiences and grades) for interviews a few weeks before courses start. If you have applied via the application portal, then there is no need to e-mail me about further details.
My research is centered around novel directions for high-performance large-scale data management systems. My research has a strong theoretical component (e.g., lower bound results, finite model theory, dependency theory) and a strong algorithmic component (e.g., external-memory algorithms, distributed algorithms, join algorithms). Currently, my focus is on high-performance query answering with a main focus on static query plan construction and query plan compilation and on scalable resilient systems that can manage data and processing complex transactions, while providing strong guarantees toward users in the presence of faulty behavior (e.g., hardware failures, software failures, and malicious attacks).
Previously, I was a Postdoc Scholar in the Exploratory Systems Lab at the Computer Science Department of the University of California, Davis, 🇺🇸, where I worked on scalable resilient distributed data processing under the supervision of Mohammad Sadoghi. I did my doctoral training in the Databases and Theoretical Computer Science research group at Hasselt University, 🇧🇪, under the supervision of Marc Gyssens. During my doctoral training, I worked on semi-structured data with a main focus on graph databases (e.g., graph query languages, constraints on graph data, graph query evaluation algorithms). Finally, I did my Bachelor and Master in computer science and engineering at the Eindhoven University of Technology, 🇳🇱, where my final Master project focused on external-memory algorithms for indexing very large graph datasets.
In the 2024-2025 Academic Year, I am the lecturer responsible for the following three courses.
In the preceding years, I organized introduction courses on programming in Python and Java, on databases, and on algorithms and data structures. In addition, I organized graduate courses on database theory and introduction courses on programming for people outside Computer Science. Finally, I supervised several students at the undergraduate level (Bachelor thesis, software engineering projects and graduate level (Master thesis, PhD research).
I have served in the program committees of ACM DEBS (2020, 2021, 2022, 2023), IEEE ICDCS 2021, FAB (2021, 2024), ACM SIGMOD (2022, 2025), IEEE DAPPS (2022, 2023, 2024), IEEE ICDE (2023, 2025), ASTRIDE 2023, VLDB (2023 tutorials, 2024 tutorials), and SoCC (2023, 2024). I served as the Demo and Poster Co-Chair for Middleware 2019. I served as external reviewer for GDMA 2018, LICS 2019, ICDE 2020, ICALP 2020, PODS 2021, CIKM 2021, CSL 2023, and ICDT 2024. I reviewed for the journals Information Processing Letters, ACM Transactions on Database Systems, The VLDB Journal, Theoretical Computer Science, Journal of Computer Languages, Distributed and Parallel Databases, IEEE Transactions on Dependable and Secure Computing, Systems & Control Letters, IEEE Transactions on Network and Service Management, IEEE Transactions on Parallel and Distributed Systems, IEEE Transactions on Knowledge and Data Engineering; International Journal of Data Science, Annals of Mathematics and Artificial Intelligence, and Expert Systems With Applications; and for the Mitacs funding agency.
Jump to
Books,
Journal Papers,
Conference Proceedings,
Tutorials, Demos, and Talks,
Theses.
Fault-tolerant distributed transactions on blockchain. (2021). In: Synthesis Lectures on Data Management. Morgan & Claypool. DOI: 10.2200/S01068ED1V01Y202012DTM065. Slides Chapter 1, Slides Chapter 2, Slides Chapter 3, Slides Chapter 4.
.Since the introduction of Bitcoin--the first widespread application driven by blockchain--the interest of the public and private sectors in blockchain has skyrocketed. In recent years, blockchain-based fabrics have been used to address challenges in diverse fields such as trade, food production, property rights, identity-management, aid delivery, health care, and fraud prevention.
These fundamental concepts and the technologies behind them--a generic ledger-based data model, cryptographically ensured data integrity, and consensus-based replication--prove to be a powerful and inspiring combination, a catalyst to promote computational trust. In this book, we present an in-depth study of blockchain, unraveling its revolutionary promise to instill computational trust in society, all carefully tailored to a broad audience including students, researchers, and practitioners. We offer a comprehensive overview of theoretical limitations and practical usability of consensus protocols while examining the diverse landscape of how blockchains are manifested in their permissioned and permissionless forms.
Byzantine cluster-sending in expected constant cost and constant time. (2023). In: Journal of Systems Research, 3(1). eScholarship. author copy, project page.
.Traditional resilient systems operate on fully-replicated fault-tolerant clusters, which limits their scalability and performance. One way to make the step towards resilient high-performance systems that can deal with huge workloads is by enabling independent fault-tolerant clusters to efficiently communicate and cooperate with each other, as this also enables the usage of high-performance techniques such as sharding. Recently, such inter-cluster communication was formalized as the Byzantine cluster-sending problem. Unfortunately, existing worst-case optimal protocols for cluster-sending all have linear complexity in the size of the clusters involved.
In this paper, we propose probabilistic cluster-sending techniques as a solution for the cluster-sending problem with only an expected constant message complexity, this independent of the size of the clusters involved and this even in the presence of highly unreliable communication. Depending on the robustness of the clusters involved, our techniques require only two-to-four message round-trips (without communication failures). Furthermore, our protocols can support worst-case linear communication between clusters. Finally, we have put our techniques to the test in an in-depth experimental evaluation that further underlines the exceptional low expected costs of our techniques in comparison with other protocols. As such, our work provides a strong foundation for the further development of resilient high-performance systems.
Cerberus: minimalistic multi-shard Byzantine-resilient transaction processing. (2023). In: Journal of Systems Research, 3(1). eScholarship. author copy.
.To enable scalable resilient blockchain systems, several powerful general-purpose approaches toward sharding such systems have been demonstrated. Unfortunately, these approaches all come with substantial costs for ordering andexecution of multi-shard transactions.
In this work, we ask whether one can achieve significantcost reductions for processing multi-shard transactions by limiting the type of workloads supported. To initiate the study of this problem, we propose Cerberus, a family of minimalistic primitives for processing single-shard and multi-shard UTXO-like transactions. The first Cerberus variant we propose is core-Cerberus (CCerberus). CCerberus uses strict UTXO-based environmental requirements to enable powerful multi-shard transaction processing with an absolute minimum amount of coordination between shards. In the environment we designed CCerberus for, CCerberus will operate perfectly with respect to all transactions proposed and approved by well-behaved clients, but does not provide any other guarantees.
To illustrate that CCerberus-like protocols can also be of use in environments with faulty clients, we also demonstrate two generalizations of CCerberus, optimistic-Cerberus and resilient-Cerberus, that make different tradeoffs in complexity and costs when dealing with faulty behavior and attacks. Finally, we compare these three protocols and show their potential scalability and performance benefits over state-of-the-art general-purpose systems. These results underline the importance of the study of specialized approaches toward sharding in resilient systems.
ByShard: sharding in a Byzantine environment. (2023). In: The VLDB Journal, 32(6), 1343-1367. Springer. DOI: 10.1007/s00778-023-00794-0. See also VLDB 2021. author copy, project page.
.The emergence of blockchains has fueled the development of resilient systems that deal with Byzantine failures due to crashes, bugs, or even malicious behavior. Recently, we have also seen the exploration of sharding in these resilient systems, this to provide the scalability required by very large data-based applications. Unfortunately, current sharded resilient systems all use system-specific specialized approaches toward sharding that do not provide the flexibility of traditional sharded data management systems. To improve on this situation, we fundamentally look at the design of sharded resilient systems. We do so by introducing ByShard, a unifying framework for the study of sharded resilient systems. Within this framework, we show how two-phase commit and two-phase locking—two techniques central to providing atomicity and isolation in traditional sharded databases—can be implemented efficiently in a Byzantine environment, this with a minimal usage of costly Byzantine resilient primitives. Based on these techniques, we propose eighteen multi-shard transaction processing protocols. Finally, we practically evaluate these protocols and show that each protocol supports high transaction throughput and provides scalability while each striking its own trade-off between throughput, isolation level, latency, and abort rate. As such, our work provides a strong foundation for the development of ACID-compliant general-purpose and flexible sharded resilient data management systems.
The power of Tarski's relation algebra on trees. (2022). In: Journal of Logical and Algebraic Methods in Programming, 126. Elsevier. DOI: 10.1016/j.jlamp.2022.100748. See also FoIKS 2018a, PhD thesis. author copy, project page.
.Fragments of Tarski's relation algebra form the basis of many versatile graph and tree query languages including the regular path queries, XPath, and SPARQL. Surprisingly, however, a systematic study of the relative expressive power of relation algebra fragments on trees has not yet been undertaken. In this work, we perform such a systematic study. Our approach is to start from a basic fragment which only allows composition and union. We then study how the expressive power of the query language changes if we add diversity, converse, projections, coprojections, intersection, and/or difference, both for path queries and Boolean queries. For path queries on labeled trees, we found that adding intersection and difference yields more expressive power for some fragments, while adding one of the other operators always yields more expressive power. For Boolean queries on labeled trees, we obtain a similar picture for the relative expressive power, except for a few fragments where adding converse or projection yields no more expressive power. Additionally, we also studied querying unlabeled trees, for which we have found several redundancies. One challenging problem remains open, however, for both path and Boolean queries: does adding difference yield more expressive power to fragments containing at least diversity, coprojections, and intersection?
ByShard: Sharding in a Byzantine environment. (2021). In: Proceedings of the VLDB Endowment, 14(11), 2230-2243. VLDB. DOI: 10.14778/3476249.3476275. See also JVLDB 2023. author copy, slides, poster, project page.
.The emergence of blockchains has fueled the development of resilient systems that can deal with Byzantine failures due to crashes, bugs, or even malicious behavior. Recently, we have also seen the exploration of sharding in these resilient systems, this to provide the scalability required by very large data-based applications. Unfortunately, current sharded resilient systems all use system-specific specialized approaches toward sharding that do not provide the flexibility of traditional sharded data management systems.
To improve on this situation, we fundamentally look at the design of sharded resilient systems. We do so by introducing ByShard, a unifying framework for the study of sharded resilient systems. Within this framework, we show how two-phase commit and two-phase locking--two techniques central to providing atomicity and isolation in traditional sharded databases--can be implemented efficiently in a Byzantine environment, this with a minimal usage of costly Byzantine resilient primitives. Based on these techniques, we propose eighteen multi-shard transaction processing protocols. Finally, we practically evaluate these protocols and show that each protocol supports high transaction throughput and provides scalability while each striking its own trade-off between throughput, isolation level, latency, and abort rate. As such, our work provides a strong foundation for the development of ACID-compliant general-purpose and flexible sharded resilient data management systems.
From relation algebra to semi-join algebra: An approach to graph query optimization. (2020). In: The Computer Journal, 64(5), 789-811. Oxford University Press. DOI: 10.1093/comjnl/bxaa031. See also DBPL 2017, PhD thesis. author copy.
.Many graph query languages rely on composition to navigate graphs and select nodes of interest, even though evaluating compositions of relations can be costly. Often, this need for composition can be reduced by rewriting towards queries using semi-joins instead, resulting in a significant reduction of the query evaluation cost. We study techniques to recognize and apply such rewritings. Concretely, we study the relationship between the expressive power of the relation algebras, which heavily rely on composition, and the semi-join algebras, which replace composition in favor of semi-joins. Our main result is that each fragment of the relation algebras where intersection and/or difference is only used on edges (and not on complex compositions) is expressively equivalent to a fragment of the semi-join algebras. This expressive equivalence holds for node queries evaluating to sets of nodes. For practical relevance, we exhibit constructive rules for rewriting relation algebra queries to semi-join algebra queries, and prove that they lead to only a well-bounded increase in the number of steps needed to evaluate the rewritten queries. In addition, on sibling-ordered trees, we establish new relationships among the expressive power of Regular XPath, Conditional XPath, FO-logic, and the semi-join algebra augmented with restricted fixpoint operators.
ResilientDB: global scale resilient blockchain fabric. (2020). In: Proceedings of the VLDB Endowment, 13(6), 868-883. VLDB. DOI: 10.14778/3380750.3380757. See also FAB 2020. author copy, technical report, video by Suyash Gupta.
.Recent developments in blockchain technology have inspired innovative new designs in resilient distributed and database systems. At their core, these blockchain applications typically use Byzantine fault-tolerant consensus protocols to maintain a common state across all replicas, even if some replicas are faulty or malicious. Unfortunately, existing consensus protocols are not designed to deal with geo-scale deployments in which many replicas spread across a geographically large area participate in consensus.
To address this, we present the Geo-Scale Byzantine Fault-Tolerant consensus protocol (GeoBFT). GeoBFT is designed for excellent scalability by using a topological-aware grouping of replicas in local clusters, by introducing parallelization of consensus at the local level, and by minimizing communication between clusters. To validate our vision of high-performance geo-scale resilient distributed systems, we implement GeoBFT in our efficient ResilientDB permissioned blockchain fabric. We show that GeoBFT is not only sound and provides great scalability, but also outperforms state-of-the-art consensus protocols by a factor of six in geo-scale deployments.
Comparing the expressiveness of downward fragments of the relation algebra with transitive closure on trees. (2020). In: Information Systems, 89. Elsevier. DOI: 10.1016/j.is.2019.101467. See also DBPL 2015, PhD thesis. author copy, technical report, project page.
.Motivated by the continuing interest in the tree data model, we study the expressive power of downward navigational query languages on trees and chains. Basic navigational queries are built from the identity relation and edge relations using composition and union. We study the effects on relative expressiveness when we add transitive closure, projections, coprojections, intersection, and difference; this for boolean queries and path queries on labeled and unlabeled structures. In all cases, we present the complete Hasse diagram. In particular, we establish, for each query language fragment that we study on trees, whether it is closed under difference and intersection.
First-order definable counting-only queries. (2019). In: Annals of Mathematics and Artificial Intelligence, 87, 109-136. Springer. DOI: 10.1007/s10472-019-09652-8. See also FoIKS 2018b. author copy.
.Many data sources can be represented easily by collections of sets of objects. For several practical queries on such collections of sets of objects, the answer does not depend on the precise composition of these sets, but only on the number of sets to which each object belongs. This is the case k = 1 for the more general situation where the query answer only depends on the number of sets to which each collection of at most k objects belongs. We call such queries k-counting-only. Here, we focus on k-SyCALC, i.e., k-counting-only queries that are first-order definable. As k-SyCALC is semantically defined, however, it is not surprising that it is already undecidable whether a first-order query is in 1-SyCALC. Therefore, we introduce SimpleCALC-k, a syntactically defined (strict) fragment of k-SyCALC. It turns out that many practical queries in k-SyCALC can already be expressed in SimpleCALC-k. We also define the query language GCount-k, which expresses counting-only queries directly by using generalized counting terms, and show that this language is equivalent to SimpleCALC-k. We prove that the k-counting-only queries form a non-collapsing hierarchy: for every k, there exist (k+1)-counting-only queries that are not k-counting-only. This result specializes to both SimpleCALC-k and k-SyCALC. Finally, we establish a strong dichotomy between 1-SyCALC and SimpleCALC-k on the one hand and 2-SyCALC on the other hand by showing that satisfiability, validity, query containment, and query equivalence are decidable for the former two languages, but not for the latter one.
Calculi for symmetric queries. (2019). In: Journal of Computer and System Sciences, 105, 54-86. Elsevier. DOI: 10.1016/j.jcss.2019.04.003. author copy.
.Symmetric queries are introduced as queries on a sequence of sets of objects the result of which does not depend on the order of the sets. An appropriate data model is proposed, and two query languages are introduced, QuineCALC and SyCALC. They are correlated with the symmetric Boolean functions of Quine, respectively symmetric relational functions. The former correlation yields an incidence-based normal form for QuineCALC queries. More generally, we propose counting-only queries as those SyCALC queries the result of which only depends on incidence information, and characterize them as quantified Boolean combinations of QuineCALC queries. A normal form is proposed for them too. It is shown that, while it is undecidable whether a SyCALC query is counting-only, it is decidable whether a counting-only query is a QuineCALC query. Finally, some classical decidability problems are considered which are shown to be undecidable for SyCALC, but decidable for QuineCALC and counting-only queries.
Implication and axiomatization of functional and constant constraints. (2016). In: Annals of Mathematics and Artificial Intelligence, 76(3), 251-279. Springer. DOI: 10.1007/s10472-015-9473-7. See also FoIKS 2014. author copy.
.Akhtar et al. introduced equality-generating constraints and functional constraints as a first step towards dependency-like integrity constraints for RDF data. Here, we focus on functional constraints. Since the usefulness of functional constraints is not limited to the RDF data model, we study the functional constraints in the more general setting of relations with arbitrary arity. We further introduce constant constraints and study the functional and constant constraints combined.
Our main results are sound and complete axiomatizations for the functional and constant constraints, both separately and combined. These axiomatizations are derived using the chase algorithm for equality-generating constraints. For derivations of constant constraints, we show how every chase step can be simulated by a bounded number of applications of inference rules. For derivations of functional constraints, we show that the chase algorithm can be normalized to a more specialized symmetry-preserving chase algorithm performing so-called symmetry-preserving steps. We then show how each symmetry-preserving step can be simulated by a bounded number of applications of inference rules. The axiomatization for functional constraints is in particular applicable to the RDF data model, solving a major open problem of Akhtar et al.
SpotLess: Concurrent Rotational Consensus Made Practical through Rapid View Synchronization. (2024). In: 2024 IEEE 40th International Conference on Data Engineering (ICDE), 1916-1929. IEEE. DOI: 10.1109/ICDE60146.2024.00157. author copy.
.The emergence of blockchain technology has renewed the interest in consensus-based data management systems that are resilient to failures. To maximize the throughput of these systems, we have recently seen several prototype consensus solutions that optimize for throughput at the expense of overall implementation complexity, high costs, and reliability. Due to this, it remains unclear how these prototypes will perform in real-world environments. In this paper, we present SpotLess, a novel concurrent rotational consensus protocol made practical. Central to SpotLess is the combination of (1) a chained consensus design for replicating requests with a reduced message cost; (2) the novel Rapid View Synchronization protocol that enables robust and continuous low-cost failure recovery that eliminates the traditional complex, error-prone view-change protocol; and (3) a high-performance concurrent consensus architecture in which independent instances of the chained consensus operate concurrently to process requests with high throughput and without single-replica bottlenecks. Due to the concurrent consensus architecture, SpotLess greatly outperforms traditional primary-backup consensus protocols such as PBFT (by up to 430%), Narwhal-HS (by up to 137%), and HotStuff (by up to 3803%). Due to its reduced message cost, SpotLess is even able to outperform RCC, a state-of-the-art high-throughput concurrent consensus protocol, by up to 23%. Furthermore, SpotLess is able to maintain a stable and low latency and consistently high throughput even during failures.
Expressive Completeness of Two-Variable First-Order Logic with Counting for First-Order Queries on Unranked Trees. (2023). In: 2023 38th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), 1-13. IEEE. DOI: 10.1109/LICS56636.2023.10175828. author copy, slides, poster.
.We consider the class of finite, rooted, unranked, unordered, node-labeled trees. Such trees are represented as structures with only the parent-child relation, in addition to any number of unary predicates for node labels. We prove that every unary first-order query over the considered class of trees is already expressible in two-variable first-order logic with counting. Somewhat to our surprise, we have not seen this result being conjectured in the extensive literature on logics for trees. Our proof is based on a global variant of local equivalence notions on nodes of trees. This variant applies to entire trees, and involves counting ancestors of locally equivalent nodes.
The fault-tolerant cluster-sending problem. (2022). In: Foundations of Information and Knowledge Systems, 168-186. Springer. DOI: 10.1007/978-3-031-11321-5_10. author copy, slides.
.The emergence of blockchains is fueling the development of resilient data management systems that can deal with Byzantine failures due to crashes, bugs, or even malicious behavior. As traditional resilient systems lack the scalability required for modern data, several recent systems explored using sharding. Enabling these sharded designs requires two basic primitives: a primitive to reliably make decisions within a cluster and a primitive to reliably communicate between clusters. Unfortunately, such communication has not yet been formally studied.
In this work, we improve on this situation by formalizing the cluster-sending problem: the problem of sending a message from one resilient system to another in a fault-tolerant manner. We also establish lower bounds on the complexity of cluster-sending under both crashes and Byzantine failures. Finally, we present worst-case optimal cluster-sending protocols that meet these lower bounds in practical settings. As such, our work provides a strong foundation for the future development of sharded resilient data management systems.
Optimizing multiset relational algebra queries using weak-equivalent rewrite rules. (2022). In: Foundations of Information and Knowledge Systems, 187-205. Springer. DOI: 10.1007/978-3-031-11321-5_11. author copy, slides.
.Relational query languages rely heavily on costly join operations to combine tuples from multiple tables into a single resulting tuple. In many cases, the cost of query evaluation can be reduced by manually optimizing (parts of) queries to use cheaper semi-joins instead of joins. Unfortunately, existing database products can only apply such optimizations automatically in rather limited cases.
To improve on this situation, we propose a framework for automatic query optimization via weak-equivalent rewrite rules for a multiset relational algebra (that serves as a faithful formalization of core SQL). The weak-equivalent rewrite rules we propose aim at replacing joins by semi-joins. To further maximize their usability, these rewrite rules do so by only providing weak guarantees on the evaluation results of rewritten queries. We show that, in the context of certain operators, these weak-equivalent rewrite rules still provide strong guarantees on the final evaluation results of the rewritten queries.
Proof-of-Execution: Reaching consensus through fault-tolerant speculation. (2021). In: Proceedings of the 24th International Conference on Extending Database Technology (EDBT), 301-312. OpenProceedings.org. DOI: 10.5441/002/edbt.2021.27. author copy, video by Suyash Gupta.
.Multi-party data management and blockchain systems require data sharing among participants. To provide resilient and consistent data sharing, transactions engines rely on Byzantine Fault-Tolerant consensus (BFT), which enables operations during failures and malicious behavior. Unfortunately, existing BFT protocols are unsuitable for high-throughput applications due to their high computational costs, high communication costs, high client latencies, and/or reliance on twin-paths and non-faulty clients.
In this paper, we present the Proof-of-Execution consensus protocol (PoE) that alleviates these challenges. At the core of PoE are out-of-order processing and speculative execution, which allow PoE to execute transactions before consensus is reached among the replicas. With these techniques, PoE manages to reduce the costs of BFT in normal cases, while guaranteeing reliable consensus for clients in all cases. We envision the use of PoE in high-throughput multi-party data-management and blockchain systems. To validate this vision, we implement PoE in our efficient ResilientDB fabric and extensively evaluate PoE against several state-of-the-art BFT protocols. Our evaluation showcases that PoE achieves up-to-80% higher throughputs than existing BFT protocols in the presence of failures.
RCC: Resilient concurrent consensus for high-throughput secure transaction processing. (2021). In: 2021 IEEE 37th International Conference on Data Engineering (ICDE), 1392-1403. IEEE. DOI: 10.1109/ICDE51399.2021.00124. author copy, video by Suyash Gupta.
.Recently, we saw the emergence of consensus-based database systems that promise resilience against failures, strong data provenance, and federated data management. Typically, these fully-replicated systems are operated on top of a primary-backup consensus protocol, which limits the throughput of these systems to the capabilities of a single replica (the primary).
To push throughput beyond this single-replica limit, we propose concurrent consensus. In concurrent consensus, replicas independently propose transactions, thereby reducing the influence of any single replica on performance. To put this idea in practice, we propose our RCC paradigm that can turn any primary-backup consensus protocol into a concurrent consensus protocol by running many consensus instances concurrently. RCC is designed with performance in mind and requires minimal coordination between instances. Furthermore, RCC also promises increased resilience against failures. We put the design of RCC to the test by implementing it in ResilientDB, our high-performance resilient blockchain fabric, and comparing it with state-of-the-art primary-backup consensus protocols. Our experiments show that RCC achieves up to 2.75 times higher throughput than other consensus protocols and can be scaled to 91 replicas.
Stab-Forests: Dynamic Data Structures for Efficient Temporal Query Processing. (2020). In: 27th International Symposium on Temporal Representation and Reasoning (TIME 2020), 18:1-18:19. Schloss Dagstuhl. DOI: 10.4230/LIPIcs.TIME.2020.18. author copy, slides, project page.
.Many sources of data have temporal start and end attributes or are created in a time-ordered manner. Hence, it is only natural to consider joining datasets based on these temporal attributes. To do so efficiently, several internal-memory temporal join algorithms have recently been proposed. Unfortunately, these join algorithms are designed to join entire datasets and cannot efficiently join skewed datasets in which only few events participate in the join result.
To support high-performance internal-memory temporal joins of skewed datasets, we propose the skip-join algorithm, which operates on stab-forests. The stab-forest is a novel dynamic data structure for indexing temporal data that allows efficient updates when events are appended in a time-based order. Our stab-forests efficiently support not only traditional temporal stab-queries, but also more general multi-stab-queries. We conducted an experimental evaluation to compare the skip-join algorithm with state-of-the-art techniques using real-world datasets. We observed that the skip-join algorithm outperforms other techniques by an order of magnitude when joining skewed datasets and delivers comparable performance to other techniques on non-skewed datasets.
Explaining results of path queries on graphs. (2020). In: Software Foundations for Data Interoperability and Large Scale Graph Data Analytics, 84-98. Springer. DOI: 10.1007/978-3-030-61133-0_7. author copy, technical report, slides, project page.
.Many graph query languages use, at their core, path queries that yield node pairs that are connected by a path of interest. For the end-user, such node pairs only give limited insight as to why this query result is obtained, as the pair does not directly identify the underlying path of interest. To address this limitation of path queries, we propose the single-path semantics, which evaluates path queries to, for each node pair (m,n), a single path from m to n satisfying the conditions of the query. To put our proposal in practice, we provide an efficient algorithm for evaluating context-free path queries, a particular powerful type of path queries, using the single-path semantics. Additionally, we perform a short evaluation of our techniques that shows that the single-path semantics is practically feasible, even when query results grow large.
Coordination-free Byzantine replication with minimal communication costs. (2020). In: 23rd International Conference on Database Theory (ICDT 2020), 17:1-17:20. Schloss Dagstuhl. DOI: 10.4230/LIPIcs.ICDT.2020.17. author copy, slides, video.
.State-of-the-art fault-tolerant and federated data management systems rely on fully-replicated designs in which all participants have equivalent roles. Consequently, these systems have only limited scalability and are ill-suited for high-performance data management. As an alternative, we propose a hierarchical design in which a Byzantine cluster manages data, while an arbitrary number of learners can reliable learn these updates and use the corresponding data.
To realize our design, we propose the delayed-replication algorithm, an efficient solution to the Byzantine learner problem that is central to our design. The delayed-replication algorithm is coordination-free, scalable, and has minimal communication cost for all participants involved. In doing so, the delayed-broadcast algorithm opens the door to new high-performance fault-tolerant and federated data management systems. To illustrate this, we show that the delayed-replication algorithm is not only useful to support specialized learners, but can also be used to reduce the overall communication cost of permissioned blockchains and to improve their storage scalability.
Brief announcement: The fault-tolerant cluster-sending problem. (2019). In: 33rd International Symposium on Distributed Computing (DISC 2019), 45:1-45:3. Schloss Dagstuhl. DOI: 10.4230/LIPIcs.DISC.2019.45. author copy, technical report, slides.
.The development of fault-tolerant distributed systems that can tolerate Byzantine behavior has traditionally been focused on consensus protocols, which support fully-replicated designs. For the development of more sophisticated high-performance Byzantine distributed systems, more specialized fault-tolerant communication primitives are necessary, however.
In this brief announcement, we identify the cluster-sending problem``the problem of sending a message from one Byzantine cluster to another Byzantine cluster in a reliable manner--as such an essential communication primitive. We not only formalize this fundamental problem, but also establish lower bounds on the complexity of this problem under crash failures and Byzantine failures. Furthermore, we develop practical cluster-sending protocols that meet these lower bounds and, hence, have optimal complexity. As such, our work provides a strong foundation for the further exploration of novel designs that address challenges encountered in fault-tolerant distributed systems.
Brief announcement: Revisiting consensus protocols through wait-free parallelization. (2019). In: 33rd International Symposium on Distributed Computing (DISC 2019), 44:1-44:3. Schloss Dagstuhl. DOI: 10.4230/LIPIcs.DISC.2019.44. See also ICDE 2021. author copy, technical report, slides.
.In this brief announcement, we propose a protocol-agnostic approach to improve the design of primary-backup consensus protocols. At the core of our approach is a novel wait-free design of running several instances of the underlying consensus protocol in parallel. To yield a high-performance parallelized design, we present coordination-free techniques to order operations across parallel instances, deal with instance failures, and assign clients to specific instances. Consequently, the design we present is able to reduce the load on individual instances and primaries, while also reducing the adverse effects of any malicious replicas. Our design is fine-tuned such that the instances coordinated by non-faulty replicas are wait-free: they can continuously make consensus decisions, independent of the behavior of any other instances.
The power of Tarski's relation algebra on trees. (2018). In: Foundations of Information and Knowledge Systems, 244-264. Springer. DOI: 10.1007/978-3-319-90050-6_14. See also JLAMP 2022, PhD thesis. author copy, slides, project page.
.Fragments of Tarski's relation algebra form the basis of many versatile graph and tree query languages including the regular path queries, XPath, and SPARQL. Surprisingly, however, a systematic study of the relative expressive power of relation algebra fragments on trees has not yet been undertaken. Our approach is to start from a basic fragment which only allows composition and union. We then study how the expressive power of the query language changes if we add diversity, converse, projections, coprojections, intersections, and/or difference, both for path queries and Boolean queries. For path queries, we found that adding intersection and difference yields more expressive power for some fragments, while adding one of the other operators always yields more expressive power. For Boolean queries, we obtain a similar picture for the relative expressive power, except for a few fragments where adding converse or projection yields no more expressive power. One challenging problem remains open, however, for both path and Boolean queries: does adding difference yields more expressive power to fragments containing at least diversity, coprojections, and intersection?
First-order definable counting-only queries. (2018). In: Foundations of Information and Knowledge Systems, 225-243. Springer. DOI: 10.1007/978-3-319-90050-6_13. See also AMAI 2019. author copy, slides.
.For several practical queries on bags of sets of objects, the answer does not depend on the precise composition of these sets, but only on the number of sets to which each object belongs. This is the case k=$1$ for the more general situation where the query answer only depends on the number of sets to which each group of at most k objects belongs. We call such queries k-counting-only. Here, we focus on k-SyCALC, k-counting-only queries that are first-order definable. As k-SyCALC is semantically defined, however, it is not surprising that it is already undecidable whether a first-order query is in 1-SyCALC. Therefore, we introduce SimpleCALC-k, a syntactically defined (strict) fragment of k-SyCALC. It turns out that many practical queries in k-SyCALC can already be expressed in SimpleCALC-k. We prove that the k-counting-only queries form a non-collapsing hierarchy: for every k, there exist (k+1)-counting-only queries that are not k-counting-only. This result specializes to both SimpleCALC-k and k-SyCALC. Finally, we establish a strong dichotomy between 1-SyCALC and SimpleCALC-k on the one hand and 2-SyCALC on the other hand by showing that satisfiability, validity, query containment, and query equivalence are decidable for the former two languages, but not for the latter one.
From relation algebra to semi-join algebra: An approach for graph query optimization. (2017). In: Proceedings of the 16th International Symposium on Database Programming Languages, 5:1-5:10. ACM. DOI: 10.1145/3122831.3122833. See also CJ 2020, PhD thesis. author copy, slides.
.Many graph query languages rely on the composition operator to navigate graphs and select nodes of interests, even though evaluating compositions of relations can be costly. Often, this need for composition can be reduced by rewriting towards queries that use semi-joins instead. In this way, the cost of evaluating queries can be significantly reduced.
We study techniques to recognize and apply such rewritings. Concretely, we study the relationship between the expressive power of the relation algebras, that heavily rely on composition, and the semi-join algebras, that replace the composition operator in favor of the semi-join operators.
As our main result, we show that each fragment of the relation algebras where intersection and/or difference is only used on edges (and not on complex compositions) is expressively equivalent to a fragment of the semi-join algebras. This expressive equivalence holds for node queries that evaluate to sets of nodes. For practical relevance, we exhibit constructive steps for rewriting relation algebra queries to semi-join algebra queries, and prove that these steps lead to only a well-bounded increase in the number of steps needed to evaluate the rewritten queries.
In addition, on node-labeled graphs that are sibling-ordered trees, we establish new relationships among the expressive power of Regular XPath, Conditional XPath, FO-logic, and the semi-join algebra augmented with restricted fixpoint operators.
Relative expressive power of downward fragments of navigational query languages on trees and chains. (2015). In: Proceedings of the 15th Symposium on Database Programming Languages, 59-68. ACM. DOI: 10.1145/2815072.2815081. See also IS 2020, PhD thesis. author copy, slides, project page.
.Motivated by the continuing interest in the tree data model, we study the expressive power of downward fragments of navigational query languages on trees. The basic navigational query language we consider expresses queries by building binary relations from the edge relations and the identity relation, using composition and union. We study the effects on the expressive power when we add transitive closure, projections, coprojections, intersection, and difference. We study expressiveness at the level of boolean queries and path queries, on labeled and unlabeled trees, and on labeled and unlabeled chains. In all these cases, we are able to present the complete Hasse diagram of relative expressiveness. In particular, we were able to decide, for each fragment of the navigational query languages that we study, whether it is closed under difference and intersection when applied on trees.
Conjunctive context-free path queries. (2014). In: Proceedings of the 17th International Conference on Database Theory (ICDT), 119-130. OpenProceedings.org. DOI: 10.5441/002/icdt.2014.15. author copy, slides.
.In graph query languages, regular expressions are commonly used to specify the labeling of paths. A natural step in increasing the expressive power of these query languages is replacing regular expressions by context-free grammars. With the Conjunctive Context-Free Path Queries (CCFPQ) we introduce such a language based on the well-known Conjunctive Regular Path Queries (CRPQ).
First, we show that query evaluation of CCFPQ has polynomial time data complexity. Secondly, we look at the generalization of regular expressions, as used in CRPQ, to regular relations and show how similar generalizations can be applied to context-free grammars, as used in CCFPQ. Thirdly, we investigate the relations between the expressive power of CRPQ, CCFPQ, and their generalizations. In several cases we show that replacing regular expressions by context-free grammars does increase expressive power. Finally, we look at including context-free grammars in more powerful logics than conjunctive queries. We do so by adding negation and provide expressivity relations between the obtained languages.
Implication and axiomatization of functional constraints on patterns with an application to the RDF data model. (2014). In: Foundations of Information and Knowledge Systems, 250-269. Springer. DOI: 10.1007/978-3-319-04939-7_12. See also AMAI 2016. author copy, slides.
.Akhtar et al. introduced equality-generating constraints and functional constraints as an initial step towards dependency-like integrity constraints for RDF data. Here, we focus on functional constraints. The usefulness of functional constraints is not limited to the RDF data model. Therefore, we study the functional constraints in the more general setting of relations with arbitrary arity. We show that a chase algorithm for functional constraints can be normalized to a more specialized symmetry-preserving chase algorithm. This symmetry-preserving chase algorithm is subsequently used to construct a sound and complete axiomatization for the functional constraints. This axiomatization is in particular applicable in the RDF data model, solving a major open problem of Akhtar et al.
Walk logic as a framework for path query languages on graph databases. (2013). In: Proceedings of the 16th International Conference on Database Theory, 117-128. ACM. DOI: 10.1145/2448496.2448512. author copy, slides.
.Motivated by the current interest in languages for expressing path queries to graph databases, this paper proposes to investigate Walk Logic (WL): the extension of first-order logic on finite graphs with the possibility to explicitly quantify over walks. WL can serve as a unifying framework for path query languages. To support this claim, WL is compared in expressive power with various established query languages for graphs, such as first-order logic extended with reachability; the monadic second-order logic of graphs; hybrid computation tree logic; and regular path queries. WL also serves as a framework to investigate the following natural questions: Is quantifying over walks more powerful than quantifying over paths (walks without repeating nodes) only? Is quantifying over infinite walks more powerful than quantifying over finite walks only? WL model checking is decidable, but determining the precise complexity remains an open problem.
Efficient external-memory bisimulation on DAGs. (2012). In: Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, 553-564. ACM. DOI: 10.1145/2213836.2213899. See also DBDBD 2011, Master thesis. author copy, slides, project page.
.In this paper we introduce the first efficient external-memory algorithm to compute the bisimilarity equivalence classes of a directed acyclic graph (DAG). DAGs are commonly used to model data in a wide variety of practical applications, ranging from XML documents and data provenance models, to web taxonomies and scientific workflows. In the study of efficient reasoning over massive graphs, the notion of node bisimilarity plays a central role. For example, grouping together bisimilar nodes in an XML data set is the first step in many sophisticated approaches to building indexing data structures for efficient XPath query evaluation. To date, however, only internal-memory bisimulation algorithms have been investigated. As the size of real-world DAG data sets often exceeds available main memory, storage in external memory becomes necessary. Hence, there is a practical need for an efficient approach to computing bisimulation in external memory.
Our general algorithm has a worst-case IO-complexity of O(Sort(|N| + |E|)), where |N| and |E| are the numbers of nodes and edges, resp., in the data graph and Sort(n) is the number of accesses to external memory needed to sort an input of size n. We also study specializations of this algorithm to common variations of bisimulation for tree-structured XML data sets. We empirically verify efficient performance of the algorithms on graphs and XML documents having billions of nodes and edges, and find that the algorithms can process such graphs efficiently even when very limited internal memory is available. The proposed algorithms are simple enough for practical implementation and use, and open the door for further study of external-memory bisimulation algorithms. To this end, the full open-source C++ implementation has been made freely available.
Do Programming Languages need Query Languages?. (2024). slides.
.Do Programming Languages need Query Languages?. (2024). slides.
.Do Programming Languages need Query Languages?. (2023). slides.
.Resilient Data Management Systems: Challenges and Opportunities. (2023). slides.
.The emergence of blockchain technology is fueling the development of new blockchain-based resilient data management systems (BC-RDMSs) that can manage data between fully-independent parties (federated data management) and provide resilience to Byzantine failures (e.g., hardware failures, software failures, and malicious behavior). Due to these qualities, the usage of BC-RDMSs has been proposed in areas such as finance, health care, IoT, agriculture, and fraud-prevention.
At their core, these BC-RDMSs are distributed fully-replicated systems in which each participant maintains a copy of a ledger that stores an append-only list of all transactions requested by users and executed by the system. This ledger is constructed and stored in a tamper-proof manner: new transactions can only be appended via consensus-based agreement steps that require support of a majority of replicas, ruling out unintended changes due to a minority of faulty replicas. As the ledger is maintained by all replicas, it is stored in a highly redundant manner and will survive even if individual replicas fail.
On the one hand, current BC-RDMSs provide novel guarantees not provided by traditional database systems. On the other hand, the current consensus-based techniques used for BC-RDMSs are costly and put heavy restrictions on the operations of these systems, limiting the practical usage of BC-RDMSs. In this talk, I will provide a tour of BC-RDMSs: we will look at how they work, at their limitations, and how ongoing research aims at improving BC-RDMSs.
Efficient fault-tolerant cluster-sending: Reliable and efficient communication between Byzantine fault-tolerant clusters. (2021). abstract, slides.
.Traditional resilient systems operate on fully-replicated fault-tolerant clusters, which limits their scalability and performance. One way to make the step towards resilient high-performance systems that can deal with huge workloads, is by enabling independent fault-tolerant clusters to efficiently communicate and cooperate with each other, as this also enables the usage of high-performance techniques such as sharding and parallel processing.
To enable such efficient communication, we identify the cluster-sending problem: the problem of sending a message from one Byzantine cluster to another Byzantine cluster in a reliable manner, an essential communication primitive. We not only formalize this fundamental problem, but also establish lower bounds on the complexity of this problem under crash failures and Byzantine failures. Furthermore, we develop practical cluster-sending protocols that meet these lower bounds and, hence, have optimal complexity. Finally, we propose probabilistic cluster-sending techniques that only have an expected constant message complexity, this independent of the size of the clusters involved. Depending on the robustness of the clusters involved, these probabilistic techniques require only two-to-four message round-trips while supporting worst-case linear communication between clusters, which is optimal. As such, our work provides a strong foundation for the further development of resilient high-performance systems.
Building High Throughput Permissioned Blockchain Fabrics: Challenges and Opportunities. (2020). In: Proceedings of the VLDB Endowment, 13(12), 3441-3444. VLDB. DOI: 10.14778/3415478.3415565. author copy, slides, video of the entire tutorial.
.Since the introduction of Bitcoin--the first widespread application driven by blockchains--the interest in the design of blockchain-based applications has increased tremendously. At the core of these applications are consensus protocols that securely replicate client requests among all replicas, even if some replicas are Byzantine faulty. Unfortunately, these consensus protocols typically have low throughput, and this lack of performance is often cited as the reason for the slow wider adoption of blockchain technology. Consequently, many works focus on designing more efficient consensus protocols to increase throughput of consensus.
We believe that this focus on consensus protocols only explains part of the story. To investigate this belief, we raise a simple question: Can a well-crafted system using a classical consensus protocol outperform systems using modern protocols? In this tutorial, we answer this question by diving deep into the design of blockchain systems. Further, we take an in-depth look at the theory behind consensus, which can help users select the protocol that best-fits their requirements. Finally, we share our vision of high-throughput blockchain systems that operate at large scales.
Scalable, resilient, and configurable permissioned blockchain fabric. (2020). In: Proceedings of the VLDB Endowment, 13(12), 2893-2896. VLDB. DOI: 10.14778/3415478.3415502. author copy, video by Sajjad Rahnama.
.With the advent of Bitcoin, the interest of the database community in blockchain systems has steadily grown. Many existing blockchain applications use blockchains as a platform for monetary transactions, however. We deviate from this philosophy and present ResilientDB, which can serve in a suite of non-monetary data-processing blockchain applications. Our ResilientDB uses state-of-the-art technologies and includes a novel visualization that helps in monitoring the state of the blockchain application.
Blockchain consensus unraveled: Virtues and limitations. (2020). In: Proceedings of the 14th ACM International Conference on Distributed and Event-Based Systems, 218-221. ACM. DOI: 10.1145/3401025.3404099. author copy, slides, video.
.Since the introduction of Bitcoin--the first wide-spread application driven by blockchains--the interest of the public and private sector in blockchains has skyrocketed. At the core of this interest are the ways in which blockchains can be used to improve data management, e.g., by enabling federated data management via decentralization, resilience against failure and malicious actors via replication and consensus, and strong data provenance via a secured immutable ledger.
In practice, high-performance blockchains for data management are usually built in permissioned environments in which the participants are vetted and can be identified. In this setting, blockchains are typically powered by Byzantine fault-tolerant consensus protocols. These consensus protocols are used to provide full replication among all honest blockchain participants by enforcing an unique order of processing incoming requests among the participants.
In this tutorial, we take an in-depth look at Byzantine fault-tolerant consensus. First, we take a look at the theory behind replicated computing and consensus. Then, we delve into how common consensus protocols operate. Finally, we take a look at current developments and briefly look at our vision moving forward.
An in-depth look of BFT consensus in blockchain: Challenges and opportunities. (2020). slides.
.ResilientDB: Global scale resilient blockchain fabric. (2020). In: The third International Symposium on Foundations and Applications of Blockchain. abstract, video by Suyash Gupta.
.Recent developments in blockchain technology have inspired innovative new designs in resilient distributed and database systems. At their core, these blockchain applications typically use Byzantine fault-tolerant consensus protocols to maintain a common state across all replicas, even if some replicas are faulty or malicious. Unfortunately, existing consensus protocols are not designed to deal with geo-scale deployments in which many replicas spread across a geographically large area participate in consensus.
To address this, we present the Geo-Scale Byzantine Fault-Tolerant consensus protocol (GeoBFT). GeoBFT is designed for excellent scalability by using a topological-aware grouping of replicas in local clusters, by introducing parallelization of consensus at the local level, and by minimizing communication between clusters. To validate our vision of high-performance geo-scale resilient distributed systems, we implement GeoBFT in our efficient ResilientDB permissioned blockchain fabric. We show that GeoBFT is not only sound and provides great scalability, but also outperforms state-of-the-art consensus protocols by a factor of six in geo-scale deployments.
An in-depth look of BFT consensus in blockchain: Challenges and opportunities. (2019). In: Proceedings of the 20th International Middleware Conference--Tutorials, 6-10. ACM. DOI: 10.1145/3366625.3369437. author copy, slides.
.Since the introduction of Bitcoin--the first wide-spread application driven by blockchains--the interest of the public and private sector in blockchains has skyrocketed. At the core of this interest are the ways in which blockchains can be used to improve data management, e.g., by enabling federated data management via decentralization, resilience against failure and malicious actors via replication and consensus, and strong data provenance via a secured immutable ledger.
In practice, high-performance blockchains for data management are usually built in permissioned environments in which the participants are vetted and can be identified. In this setting, blockchains are typically powered by Byzantine fault-tolerant consensus protocols. These consensus protocols are used to provide full replication among all honest blockchain participants by enforcing an unique order of processing incoming requests among the participants.
In this tutorial, we take an in-depth look at Byzantine fault-tolerant consensus. First, we take a look at the theory behind replicated computing and consensus. Then, we delve into how common consensus protocols operate. Finally, we take a look at current developments and briefly look at our vision moving forward.
Efficient transaction processing in Byzantine fault tolerant environments. (2019). In: 18th International Workshop on High Performance Transaction Systems. abstract, video by Suyash Gupta.
.Graph query optimization using semi-join rewritings. (2016). In: The Dutch-Belgian DataBase Day. See also CJ 2020, DBPL 2017, PhD thesis. abstract, slides.
.Path querying on graph databases. (2013). In: WOG (Wetenschappelijke Onderzoeksgemeenschap/Scientific Research Network) Meeting. long abstract, short abstract, slides.
.Efficient external-memory bisimulation on DAGs. (2011). In: The Dutch-Belgian DataBase Day. See also SIGMOD 2012, Master thesis. abstract, slides, project page.
.On Tarski's relation algebra: Querying trees and chains and the semi-join algebra. (2018). Hasselt University and transnational University of Limburg. Adviser: Marc Gyssens. See also CJ 2020, IS 2020, FoIKS 2018a, DBPL 2017, DBDBD 2016, DBPL 2015. author copy, thesis, slides, project page.
.Many practical query languages for graph data are based on fragments of Tarski's relation algebra which, optionally, is augmented with the Kleene-star operator. Examples include XPath, SPARQL, the RPQs, and GXPath. Because of this central role of (fragments of) the relation algebra, we study two aspects in more detail. Combined, these two studies give a detailed picture of the expressive power of the fragments of the relation algebra. Moreover, our results provide several opportunities for the development of new techniques for the efficient evaluation of graph queries.
Bisimulation partitioning and partition maintenance. (2011). Eindhoven University of Technology. Adviser: George H. L. Fletcher. See also SIGMOD 2012, DBDBD 2011. author copy, thesis, final slides, mid-term slides, poster, project page.
.The combination of graphs and node bisimulation is widely used within and outside of computer science. One example of this combination is constructing indices for speeding up queries on XML documents. Thereby XML documents can be represented by trees and many index types for indexing XML documents utilize the notion of bisimulation. Thereby the notion of bisimulation is used to relate nodes that have equivalent behavior with respect to queries performed on the XML documents. By replacing these bisimilar nodes one can reduce the size of the XML document and as such speed up queries. The objective of this thesis is to develop techniques for constructing and maintaining bisimulation partitions. Thereby a bisimulation partition groups nodes based on bisimilarity. In this thesis we primarily focus on very large directed acyclic graphs. The results in this thesis can for example be used to index very large XML documents.