Characterizing Microservice Architectures

Empirically characterizing the topology and request-workflow properties of large-scale microservice deployments.

The microservice architecture has become the dominant paradigm for building large-scale distributed applications, yet the characteristics of real industrial deployments remain largely invisible to the research community. Most tools and testbeds used in academic research rest on assumptions about microservice topologies and request workflows that have never been validated against production systems, so research findings and tools built on them may not apply in practice.

This project addresses that gap through empirical characterization of large-scale microservice architectures. Our work includes analyses of Meta’s production microservice architecture (ATC 2023), where we find that the topology is massive (18,500+ services, 12 million instances), highly dynamic, and heterogeneous in ways that violate assumptions common in research testbeds and topology generators. We also find that request workflows are wide and shallow, and that many traces are partially unrecoverable due to rate limiting and uninstrumented services.
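To make "wide and shallow" concrete, the sketch below computes the depth and maximum width of a single trace's call tree. It is an illustration only: the `(caller, callee)` edge format, the function name `depth_and_max_width`, and the exact definitions of depth and width used here are assumptions for this example and may differ from the definitions in the ATC 2023 paper.

```python
from collections import defaultdict


def depth_and_max_width(edges, root):
    """Return (depth, max_width) for one trace's call tree.

    edges: iterable of (caller, callee) pairs (hypothetical format).
    depth: number of call levels, counting the root as level 1.
    max_width: largest number of calls found at any single level.
    """
    children = defaultdict(list)
    for caller, callee in edges:
        children[caller].append(callee)

    depth, max_width = 0, 0
    level = [root]
    seen = {root}  # guards against cycles or duplicated edges
    while level:
        depth += 1
        max_width = max(max_width, len(level))
        next_level = []
        for node in level:
            for child in children.get(node, ()):
                if child not in seen:
                    seen.add(child)
                    next_level.append(child)
        level = next_level
    return depth, max_width


# A toy trace: the root fans out to four services, one of which makes one call.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("a", "e"), ("b", "f")]
print(depth_and_max_width(toy_edges, "a"))  # (3, 4)
```

A trace is "wide and shallow" in this sense when its maximum width is large relative to its depth. The sketch ignores any calls that are missing from the trace, so on partially unrecoverable traces it would understate both values.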

We complement this with a systematization-of-knowledge (SoK) study (JSys 2022) comparing popular open-source microservice testbeds against industry practitioners’ perceptions. Through analysis of seven testbeds and interviews with twelve practitioners, we identify key mismatches: real deployments feature non-hierarchical topologies, mixed communication protocols, cycles, and hundreds to thousands of services, none of which are captured by existing testbeds. Finally, our ICPE 2024 paper examines Alibaba’s widely used, publicly available microservice trace datasets. It finds pervasive inconsistencies in their structure and introduces Casper, an algorithm that exploits structural redundancies to correctly reconstruct trace topologies that would otherwise be discarded or distorted.

👤 Members

Darby Huye
Vishwanath Seshagiri
Max Liu
Avani Wildani
Yuri Shkuro
Raja Sambasivan

📄 Related Publications

2024

  1. ICPE
    Systemizing and mitigating topological inconsistencies in Alibaba’s microservice call-graph datasets
    Darby Huye, Lan Liu, and Raja R. Sambasivan
    In ACM/SPEC International Conference on Performance Engineering, May 2024

2023

  1. ATC
    Lifting the veil on Meta’s microservice architecture: Analyses of topology and request workflows
    Darby Huye, Yuri Shkuro, and Raja R. Sambasivan
    In USENIX Annual Technical Conference, Jul 2023

2022

  1. JSys
    [SoK] Identifying Mismatches Between Microservice Testbeds and Industrial Perceptions of Microservices
    Vishwanath Seshagiri, Darby Huye, Lan Liu, and 2 more authors
    Journal of Systems Research, Jul 2022

⚙️ Code and Datasets

2024

  1. Code
    CASPER: Alibaba Microservice Call-graph Reconstruction
    Darby Huye, Lan Liu, and Raja R. Sambasivan
    2024
    Code for: Systemizing and mitigating topological inconsistencies in Alibaba’s microservice call-graph datasets (ICPE’24)
  2. Dataset
    Alibaba 2021 Microservice Call-Graph Traces (Pre-shuffled)
    Darby Huye, Lan Liu, and Raja R. Sambasivan
    2024
    Data for: Systemizing and mitigating topological inconsistencies in Alibaba’s microservice call-graph datasets (ICPE’24)
  3. Dataset
    Alibaba 2022 Microservice Call-Graph Traces (Pre-shuffled)
    Darby Huye, Lan Liu, and Raja R. Sambasivan
    2024
    Data for: Systemizing and mitigating topological inconsistencies in Alibaba’s microservice call-graph datasets (ICPE’24)

2023

  1. Dataset
    Distributed Traces from Meta’s Microservices Architecture
    Darby Huye, Yuri Shkuro, and Raja R. Sambasivan
    2023
    Licensed CC BY-NC 4.0. Data for: Lifting the veil on Meta’s microservice architecture: Analyses of topology and request workflows (USENIX ATC’23)