CSE Colloquium – Towards Safe and Resilient Large-scale Distributed Programming

Presenter: Philipp Haller, KTH Royal Institute of Technology
Abstract:
Distributed programming is notoriously difficult. Not only are distributed systems concurrent, they pose additional challenges including data consistency and fault tolerance. At the same time, the share of software systems that are necessarily distributed systems is growing rapidly. As a result, too many software developers are asked to become distributed systems experts. Thus, tools and techniques for ensuring the correctness of distributed systems are urgently needed in order to leave this unsustainable trajectory. This talk presents research results towards the design and implementation of programming systems that support emerging applications and workloads; provide reliability and trust; and embrace simplicity and accessibility. Concretely, the presented work focuses on two directions.
The first direction explores a distributed programming model that provides consistency while enabling high availability for workloads operating on join-semilattices without sacrificing partition tolerance. We propose a new consistency protocol, called observable atomic consistency protocol (OACP), which leverages on-demand coordination to support both coordination-free operations as well as totally-ordered operations on replicated data types. We present a formal, mechanized model of OACP in rewriting logic and verify key correctness properties using the model checking tool Maude. Furthermore, we present the evaluation of a prototype implementation of OACP based on Akka, a widely-used actor-based middleware. The second direction explores a programming system that aims to reconcile the scalability and fault tolerance of stream processing systems with the flexibility of the actor concurrency model. The programming system ensures a failure-transparency property, effectively masking failures through transparent recovery. Our work is the first to formalize failure transparency using a small-step operational semantics, and to provide proofs of failure transparency for stateful dataflow streaming and a fault-tolerant actor-based programming model.
Bio:
Philipp Haller is an Associate Professor in the School of Electrical Engineering and Computer Science (EECS) at KTH Royal Institute of Technology in Stockholm, Sweden. His main research interests are in the design and implementation of programming languages, type systems, concurrency, and distributed programming. He was part of the team that received the 2019 ACM SIGPLAN Programming Languages Software Award for the development of the Scala programming language. Prior to KTH, he was an early employee at Akka (previously Lightbend, Inc.), a start-up company developing and supporting Scala as well as frameworks for large-scale distributed programming. Prior to Akka, he was a post-doctoral fellow at Stanford University, USA, and at EPFL, Switzerland. In 2010 he received his PhD in computer science from EPFL, including a nomination for the 2010 EPFL Doctorate Award. In 2006 he received his Dipl.-Inform. degree from Karlsruhe Institute of Technology (previously University of Karlsruhe), Germany.
Hosted by: Professor Mohsen Lesani
Location: Engineering 2, Room E2-180 (Refreshments such as fruit, pastries, coffee, and tea will be provided.)
Zoom Option: https://ucsc.zoom.us/j/93445911992?pwd=YkJ2TQtF79h0PcNXbEcpZLbpK0coiY.1&jst=3