20th International Workshop
on High Performance Transaction Systems (HPTS)
September 15-18, 2024
Every two years, HPTS brings together a lively and opinionated group of technologists to discuss and debate the pressing topics that affect today's systems and their design and implementation, especially where performance and scalability are concerned. The workshop includes position paper presentations, panels, moderated discussions, and significant time for casual interaction. The presentations are not recorded, and the only publications are slide decks by presenters, who are strongly encouraged to post them.
Since its inception in 1985, HPTS has always been about large-scale systems --- systems that extend the state-of-the-art. Over the years the focus has expanded from scalable transaction processing to very large databases to cloud computing. Today, scalability is also about data analytics, machine learning and globally distributed systems. Here are some of the questions and topics we hope 2024 participants will address:
- OLTP and OLAP at Scale
- OLTP in the cloud: The current generation of cloud OLTP database services shows that architectural assumptions change when OLTP is built for the cloud. In what interesting ways are modern OLTP systems pushing the boundaries?
- OLAP in the cloud: ditto for OLAP.
- Hybrid systems: In what interesting ways are these systems converging into a single platform? Or what other hybrid systems are being built from the ground up?
- Revolution vs Evolution: One class of services was built for scale natively in the cloud (revolutionary), while others evolved existing on-premises systems to become 'cloud native'. In what ways are these systems converging?
- The return of small data: small in-process systems like DuckDB (OLAP) and SQLite (OLTP) are being deployed at scale - what does the future look like for systems built with these small engines as primitives?
- ETL/ELT: Schema now? Schema later? Reverse ETL? What is the state of the art in data movement/prep/pipelines?
- Data Stores
- Consistency and coordination: What is the state of the art in transaction coordination and consistency mechanisms? How can distributed data be kept sufficiently consistent with minimal coordination?
- Performance under failures: How can we best measure performance in the presence of failure modes?
- Latency and correctness: Who's doing the best job of working around global latencies and the CAP theorem nowadays?
- Serverless data stores: How do we design data stores for the serverless computing era? Can storage and compute be scaled and billed independently with predictable costs and latencies?
- Impact of cost: Explore the continuum between how cheaply and how fast we can store and process data.
- Immutability: Is immutability helping (or hurting) storage systems at scale?
- Hardware
- Silicon: More and more data processing systems are running on ARM, mostly in the cloud. What can we learn or ask from hardware folks shipping custom silicon at scale?
- Differential trends in costs and performance: How have changes in relative speed and cost of computing, storing and networking changed the ways that large-scale systems are (or should be) implemented?
- Novel memory architectures: Non-volatile memory, processor in memory, multi-terabyte RAM, disaggregated memories.
- Compute offload and accelerators: What lessons does the DB community have to share about cost models, query optimization, planning and execution that will be relevant to offloading and scheduling computation on heterogeneous compute?
- Distributed Systems
- Managing huge systems: How to build and manage distributed systems with partially connected nodes at scale?
- Debugging serverless: How do we debug failures in serverless applications?
- Monitoring at scale: Extracting actionable events from a flood of distributed unstructured logging data.
- Heterogeneous hardware at scale: How can we reliably, predictably and seamlessly run massive computations (e.g., simulations) across different configurations (e.g., with different cloud configurations, chipsets, accelerators)?
- Metastable systems in the wild: How do we handle failures at scale that 'feed' and strengthen their own 'failed' condition?
- Testing: What is the state of the art in testing distributed systems and database systems?
- Networking: In what interesting ways are we using the network in building modern scalable systems?
- Platforms
- Lifecycle of applications at scale: As hardware and system layers evolve, and as scale increases, how are services and applications maintained and extended?
- Safe coordination of applications: Can we provide safety properties using coordination at the application level?
- Are new application models emerging for scale? What new applications will be driving scalability requirements in the next few years?
- Semantics of reliability and durability: How should we build applications that reliably and durably process billions of dollars every day? Do we need new programming models at scale?
- Edge: what can we (should we) push to the network edge? How should cloud architectures integrate the edge?
- Containerization: What is new in containers and microVMs? How are we rethinking database systems with recent advancements in containerization and virtualization?
- Compliance: With distributed systems running at global scale, what are interesting/important topics that we need to think about?
- Confidential computing: What interesting products/systems are being built using trusted execution environments and/or cryptographic techniques? What is the state of the art in confidential data analytics and AI?
- Systems for ML and ML for Systems
- Systems support for large-scale machine learning: What is the state of the art in large-scale systems for machine learning? What are the bottlenecks for large-scale machine learning systems, and how can they be overcome? What systems work is needed to make LLMs, model training with billions and trillions of parameters, or inference more energy efficient and fault tolerant?
- Vector databases: What is the future of vector databases/indexes? Should we be building an entirely new system to support vector retrieval? Should we evolve existing systems to support this workload? What systems and scalability challenges are there for both approaches?
- Machine learning for systems: How is ML making its way into systems? What ML techniques are being used to augment the way we operate systems at scale?
- Other topics are welcome, as long as they are likely to be of interest to people building large-scale systems.
Submission process:
- Send something thought-provoking to us: If you would like to attend, please submit a one-page technical position paper that presents a viewpoint on a controversial topic, a summary of lessons learned, experience with a large or unusual system, an innovative mechanism, an enormous problem looming on the horizon, or anything else that convinces the program committee that you have something interesting to say to builders of large-scale systems.
- Easy and simple submission process: The submission process is very lightweight, in part to attract systems developers who can't set aside time to write a paper.
- Authorship: Authorship of proposals is a consideration for invitations to the conference. Each submission can have only one author, and each author may submit only one proposal. Nevertheless, please feel free to report on joint work if you are asked to present.
The nature of the HPTS workshop:
The structure of HPTS: The workshop is by invitation only and will have under 100 participants. The submissions drive both the invitation process and the workshop agenda. Participants may be asked to give a presentation at the workshop. Students are particularly encouraged to submit and will enjoy a discounted workshop fee.
HPTS is about the breaks. The presentations punctuate the breaks! HPTS exists to promote community and relationships. This community comprises people who share a common interest in scalable systems and all their challenges. We emphasize discussion during breaks and deliberately seek out presentations that spark thought-provoking controversy.
What to submit:
- Take a stand: A one-page abstract or position statement, as text or as a link to a pdf, in 10pt font or larger. We do not want polished papers. Convince us you have interesting ideas!
- Tell us who you are and what interests you: Optionally, a short summary of the corresponding author's current work, such as a link to the author's homepage or LinkedIn page, or a short bio added to the position statement (beyond the one page).
- Additional material: Optionally, a link to one or both of the following:
- Maximum of 3 PowerPoint or pdf slides, augmenting your position statement.
- Maximum 2-minute video illustrating your position statement, such as a demo or a presentation of you speaking (with or without slides).
- Short and sweet: The length limits will be strictly enforced. We won't consider submissions that exceed the maximum length.
Where to submit:
Important Dates:
- Submissions Due: March 1, 2024
- Notification of Acceptance: May 1, 2024
- HPTS Workshop: September 15-18, 2024
Program Chair:
- Justin Levandoski (Google)
Program Committee:
- Anastasia Ailamaki (EPFL)
- Marc Brooker (Amazon)
- Sudipto Das (Amazon)
- Sailesh Krishnamurthy (Google)
- Christos Kozyrakis (Stanford)
- Viktor Leis (TU-Munich)
- Danica Porobic (Oracle)
- Dan Ports (Microsoft Research)
- Mehul Shah (Aryn)
- Alex Szalay (Johns Hopkins University)
- Rebecca Taft (Cockroach)
Organizing Committee:
- Shel Finkelstein (UC Santa Cruz)
- Pat Helland (Salesforce)
- Ippokratis Pandis (Amazon)
- Mark Little (Red Hat)