MSc George Zakhour

george.zakhour@unisg.ch
geezee
https://grgz.me

School of Computer Science
Torstrasse 25
9000 St. Gallen, Switzerland

I have been a PhD student at the Programming Group since August 2021. Before that, I was an R&D software developer at Agilent Technologies. My research interests are the formal study of programming languages, mathematical logic (and its relation to programming), quantum computation from the perspective of computability and complexity, and numerical simulations.

Publications

POPL
Propel

Dis/Equality Graphs

George Zakhour, Pascal Weisenburger, Jahrim Gabriele Cesario, Guido Salvaneschi

Proceedings of the ACM on Programming Languages 9 (POPL), 2025

Abstract PDF Supp

E-graphs are a data structure to compactly represent a program space and reason about equality of program terms. E-graphs have been successfully applied to a number of domains, including program optimization and automated theorem proving. In many applications, however, it is necessary to reason about disequality of terms as well as equality. While disequality reasoning can be encoded, direct support for disequalities increases performance and simplifies the metatheory.
In this paper, we develop a framework independent of a specific implementation to formally reason about e-graphs. For the first time, we prove the equivalence of e-graphs to the reflexive, symmetric, transitive, and congruent closure of the equivalence relation they are expected to encode. We use these results to present the first formalization of an extension of e-graphs that directly supports disequalities and prove an analytical result about their superior efficiency compared to embedding techniques that are commonly used in SMT solvers and automated verifiers. We further profile an SMT solver and find that it spends a measurable amount of time handling disequalities.
We implement our approach in an extension to egg, a popular e-graph Rust library. We evaluate our solution in an SMT solver and an automated theorem prover using standard benchmarks. The results indicate that direct support for disequalities outperforms other encodings based on equality embedding, confirming the results obtained analytically.
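To illustrate what direct support for disequalities means at the smallest scale, here is a minimal sketch of a union-find that records asserted disequalities and refuses merges that would contradict them. It is only an illustration under stated assumptions: the class and method names are hypothetical, it omits congruence closure entirely, and it is neither the algorithm from the paper nor the egg API.

```python
class DisequalityUnionFind:
    """Union-find over hashable terms that also tracks asserted disequalities.

    Hypothetical illustration: no congruence closure, no e-nodes.
    """

    def __init__(self):
        self.parent = {}   # term -> parent term
        self.diseq = []    # asserted pairs (a, b) meaning a != b

    def find(self, x):
        """Return the representative of x's class, compressing the path."""
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def known_distinct(self, a, b):
        """True if some asserted disequality separates the classes of a and b."""
        ra, rb = self.find(a), self.find(b)
        return any({self.find(x), self.find(y)} == {ra, rb}
                   for x, y in self.diseq)

    def union(self, a, b):
        """Merge the classes of a and b; return False if that is inconsistent."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return True
        if self.known_distinct(ra, rb):
            return False
        self.parent[ra] = rb
        return True

    def assert_diseq(self, a, b):
        """Record a != b; return False if a and b are already equal."""
        if self.find(a) == self.find(b):
            return False
        self.diseq.append((a, b))
        return True


uf = DisequalityUnionFind()
uf.union("a", "b")         # a = b
uf.assert_diseq("b", "c")  # b != c, hence a != c as well
print(uf.known_distinct("a", "c"))
print(uf.union("a", "c"))  # refused: it would contradict b != c
```

The linear scan over recorded disequalities keeps the sketch short; a real implementation would index them per equivalence class.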
TSE

Consistent Local-First Software: Enforcing Safety and Invariants for Local-First Applications

Mirko Köhler, George Zakhour, Pascal Weisenburger, Guido Salvaneschi

IEEE Transactions on Software Engineering, 2024

Abstract PDF

Local-first software embraces data replication as a means to achieve scalability and offline availability. A crucial ingredient of local-first software is mergeable data types, like conflict-free replicated data types (CRDTs), which feature eventual consistency by enabling processes to access data locally and later merge it with other replicas in an asynchronous manner. Notably, the merging process needs to adhere to application constraints for correctness. Ensuring such application-level invariants poses a challenge, as developers must reason about the replicated program state and resort to manual synchronization of specific application components to enforce the invariant.
This paper introduces ConLoc (Consistent Local-First Software), a novel system designed to automatically enforce safety and maintain invariants in local-first applications. ConLoc effectively addresses the issue of preserving invariants in the execution of programs with replicated data types, including CRDTs. Our approach is able to verify the correctness of many CRDTs examined in the literature and in implementations, such as the ones used in the Riak database. ConLoc ensures that applications are automatically synchronized correctly, resulting in substantial latency and throughput improvements when compared to sequential execution, while upholding the same set of invariants.
PLDI
Propel

Automated Verification of Fundamental Algebraic Laws

George Zakhour, Pascal Weisenburger, Guido Salvaneschi

Proceedings of the ACM on Programming Languages 8 (PLDI), 2024

Abstract PDF Supp Code Website

Algebraic laws of functions in mathematics – such as commutativity, associativity, and idempotence – are often used as the basis to derive more sophisticated properties of complex mathematical structures and are heavily used in abstract computational thinking. Algebraic laws of functions in coding, however, are rarely considered. Yet, they are essential. For example, commutativity and associativity are crucial to ensure correctness of a variety of software systems in numerous domains, such as compiler optimization, big data processing, data flow processing, machine learning or distributed algorithms and data structures. Still, most programming languages lack built-in mechanisms to enforce and verify that operations adhere to such properties.
In this paper, we propose a verifier specialized in a set of fundamental algebraic laws that ensures that such laws hold in application code. The verifier can conjecture auxiliary properties and can reason about both equalities and inequalities of expressions, which is crucial to prove a given property in cases where competitors do not succeed. We implement these ideas in the Propel verifier. Our evaluation against five state-of-the-art verifiers on a total of 142 instances of algebraic properties shows that Propel is able to automatically deduce algebraic properties in different domains that rely on such properties for correctness, even in cases where competitors fail to verify the same properties or time out.
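For readers unfamiliar with the laws named above, the following sketch states commutativity, associativity, and idempotence as executable checks over a small finite sample. This is plain exhaustive testing, which is an assumption of the example and not how Propel works: a verifier proves the laws for all inputs, whereas this can only find counterexamples within the sample. All function names are hypothetical.

```python
from itertools import product


def is_commutative(f, domain):
    """f(a, b) == f(b, a) for every pair in the sample."""
    return all(f(a, b) == f(b, a) for a, b in product(domain, repeat=2))


def is_associative(f, domain):
    """f(f(a, b), c) == f(a, f(b, c)) for every triple in the sample."""
    return all(f(f(a, b), c) == f(a, f(b, c))
               for a, b, c in product(domain, repeat=3))


def is_idempotent(f, domain):
    """f(a, a) == a for every element in the sample."""
    return all(f(a, a) == a for a in domain)


sample = range(-3, 4)  # small finite sample, not a proof for all integers

for name, f in [("max", max),
                ("add", lambda a, b: a + b),
                ("sub", lambda a, b: a - b)]:
    print(name,
          is_commutative(f, sample),
          is_associative(f, sample),
          is_idempotent(f, sample))
```

On this sample, `max` passes all three checks, addition fails only idempotence, and subtraction fails all three.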
EGRAPHS
Propel

Disequalities in E-Graphs: An Experiment

George Zakhour, Pascal Weisenburger, Guido Salvaneschi

Presentation at the 3rd Workshop on E-Graph Research, Applications, Practices, and Human-factors (EGRAPHS), 2024

Abstract PDF

This talk explores the integration of disequalities into e-graphs for enhancing the efficiency of automated theorem provers. We discuss two existing approaches, which we implement in the egg e-graph library, presenting preliminary results on comparing their effectiveness. Our initial experiments demonstrate the feasibility of integrating disequalities into e-graphs implemented in egg, with promising results suggesting improved efficiency in the proof search algorithm. We plan to refine this approach, integrate it into our Propel automated theorem prover, and make our extensions to egg available to the research community.
CP
ScalaLoci

Exploring Algebraic Placement in Multiparty Languages

George Zakhour, Pascal Weisenburger, Guido Salvaneschi

Presentation at the 1st Workshop on Choreographic Programming (CP), 2024

Abstract PDF

This talk provides an overview of our ongoing research on the relationship between type systems and placement systems in programming languages for distributed systems.
In distributed systems, the placement of data and computation plays a crucial role in ensuring protocol correctness, fault tolerance, security of information flow, and performance optimization. Thus, researchers have explored various techniques to express and manage data placement. Type-based approaches have proven particularly effective in modeling places and their interactions. For example, choreographic programming ensures safe communication protocols across different locations by modeling these locations – so-called roles – as types. Similarly, multiparty session types specify a communication protocol for message exchange over communication channels. Recent languages for multitier programming – a programming paradigm that provides language abstractions to specify the placement of data and computations on the different components of the distributed system – also opted for expressing placement in the type system.
Reification of placements into language-level concepts enables programmers to reason about which components perform computations and about the communication between them.
PLF
Propel

Type-Checking CRDTs with Propel

George Zakhour, Pascal Weisenburger, Guido Salvaneschi

Presentation at the 2nd Workshop on Programming Local-First Software (PLF), 2023

Abstract PDF

Conflict-free Replicated Data Types (CRDTs) are modern distributed data types that allow replicating data to different devices in a distributed system and enable local copies to diverge until they are merged with other replicas ensuring eventual consistency. CRDTs play a vital role in building local-first applications, i.e., applications where devices can always progress their local state independently while also enabling seamless collaboration among devices without being blocked by devices that are (temporarily) unreachable on the network. CRDTs enable keeping replicated data consistent while guaranteeing the absence of conflicts among replicas. CRDTs come in two flavors: state-based and operation-based (op-based). For correct operation, state-based CRDTs rely on a merge function for two states that is commutative, associative and idempotent, while operation-based CRDTs rely on an application function for operations on the state that commutes with itself.
However, ensuring that such algebraic properties are satisfied by implementations is left to the programmer, resulting in a process that is complex and error-prone. While techniques based on testing, automatic verification of models, and mechanized or handwritten proofs are available, we lack an approach that is able to verify such properties on concrete CRDT implementations.
In this talk, the first author will present the first type system that captures the algebraic properties required by a correct CRDT implementation. The type system is designed in Propel and can reason about programs and derive proofs of such properties with complex rules such as case analysis and induction: sum types guide the case analysis and algebraic properties in function types enable induction. Propel’s key feature is its capacity (a) to reason about algebraic properties in terms of rewrite rules and (b) to derive the equality or inequality of expressions from the properties.
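As a concrete instance of a state-based CRDT, the sketch below implements a grow-only counter whose merge is the pointwise maximum of per-replica counts, and shows two replicas converging regardless of merge order. It is a textbook-style illustration written in Python, not Propel code; the function names and replica identifiers are hypothetical.

```python
def increment(state, replica):
    """Return a new state in which `replica` has counted one more event."""
    new_state = dict(state)
    new_state[replica] = new_state.get(replica, 0) + 1
    return new_state


def merge(a, b):
    """Pointwise maximum of two counter states (replica id -> count)."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}


def value(state):
    """The counter's value is the sum of all per-replica counts."""
    return sum(state.values())


# Two replicas start from the same state and diverge locally.
shared = {"A": 1}
on_a = increment(increment(shared, "A"), "A")  # replica A counts twice
on_b = increment(shared, "B")                  # replica B counts once

# Merging in either order yields the same state (commutativity),
# and merging a state with itself changes nothing (idempotence).
assert merge(on_a, on_b) == merge(on_b, on_a)
assert merge(on_a, on_a) == on_a
print(value(merge(on_a, on_b)))
```

The assertions only exercise the properties on these particular states; establishing them for every possible state is the job of a verifier or type system.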
OOPSLA
ScalaLoci

Type-Safe Dynamic Placement with First-Class Placed Values

George Zakhour, Pascal Weisenburger, Guido Salvaneschi

Proceedings of the ACM on Programming Languages 7 (OOPSLA2), 2023

Abstract PDF Supp Code Website

Several distributed programming language solutions have been proposed to reason about the placement of data, computations, and peer interaction. Such solutions include, among others, multitier programming, choreographic programming and various approaches based on behavioral types. These methods statically ensure safety properties thanks to complete knowledge of the placement of data and computation at compile time. In distributed systems, however, dynamic placement of computation and data is crucial to enable performance optimizations, e.g., driven by data locality or in the presence of a number of other constraints such as security and compliance regarding data storage location. Unfortunately, in existing programming languages, dynamic placement conflicts with static reasoning about distributed programs: the flexibility required by dynamic placement hinders statically tracking the location of data and computation.
In this paper, we present Dyno, a programming language that enables static reasoning about dynamic placement. Dyno features a type system where values are explicitly placed, but in contrast to existing approaches, placed values are also first class, ensuring that they can be passed around and referred to from other locations. Building on top of this mechanism, we provide a novel interpretation of dynamic placement as unions of placement types. We formalize type soundness, placement correctness (as part of type soundness) and architecture conformance. In case studies and benchmarks, our evaluation shows that Dyno enables static reasoning about programs even in the presence of dynamic placement, ensuring type safety and placement correctness of programs at negligible performance cost. We reimplement an Android app with 7K LOC in Dyno, find a bug in the existing implementation, and show that the app’s approach is representative of a common way to implement dynamic placement found in over 100 apps in a large open-source app store.
PLDI
Propel

Type-Checking CRDT Convergence

George Zakhour, Pascal Weisenburger, Guido Salvaneschi

Proceedings of the ACM on Programming Languages 7 (PLDI), 2023

Abstract PDF Supp Code Website

Conflict-free Replicated Data Types (CRDTs) are a recent approach for keeping replicated data consistent while guaranteeing the absence of conflicts among replicas. For correct operation, CRDTs rely on a merge function that is commutative, associative and idempotent. Ensuring that such algebraic properties are satisfied by implementations, however, is left to the programmer, resulting in a process that is complex and error-prone. While techniques based on testing, automatic verification of a model, and mechanized or handwritten proofs are available, we lack an approach that is able to verify such properties on concrete CRDT implementations.
In this paper, we present Propel, a programming language with a type system that captures the algebraic properties required by a correct CRDT implementation. The Propel type system deduces such properties by case analysis and induction: sum types guide the case analysis and algebraic properties in function types enable induction for free. Propel’s key feature is its capacity (a) to reason about algebraic properties in terms of rewrite rules and (b) to derive the equality or inequality of expressions from the properties. We provide an implementation of Propel as a Scala embedding, implement several CRDTs, verify them with Propel, and compare the verification process with four state-of-the-art verification tools. Our evaluation shows that Propel is able to automatically deduce the properties that are relevant for common CRDT implementations found in open-source libraries even in cases in which competitors time out.