
Architecture

The term "AutoMQ Kafka" mentioned in this article specifically refers to the source available project automq-for-kafka under the GitHub AutoMQ organization by AutoMQ CO., LTD.

Since AutoMQ was made source-available, the community has discussed its architecture extensively in recent weeks, particularly how it differs from existing open-source and commercial solutions. This document is intended to answer those questions.

Differences with Tiered Storage

With the release of Apache Kafka 3.6.0, the KIP-405 tiered storage solution finally arrived. Although the feature is still in early access, many developers in the community, including on Twitter, have asked where the core advantages of the AutoMQ architecture lie compared with KIP-405.

When Apache Kafka and AutoMQ Kafka are deployed in the cloud, the two architectures do look similar: both are built on EBS and object storage (S3). The storage media are the same, but the way each architecture uses them is fundamentally different.

First, let's talk about EBS:

  • Apache Kafka treats EBS as an ordinary block device, no different in principle from a physical disk in an on-premises data center. It therefore replicates data on top of EBS, typically with 3 replicas.
  • AutoMQ Kafka treats EBS as a reliable, highly available cloud storage service and fully leverages the reliability of EBS's own triple replication (5 to 9 nines of reliability) and its failover capabilities within and across availability zones. This lets AutoMQ Kafka avoid an extra layer of replication on top of EBS, saving a large amount of storage, network, and compute resources (a topic-creation sketch follows this list).
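
The difference is visible at topic-creation time. Below is a minimal sketch using Kafka's AdminClient; the bootstrap address, topic names, and partition counts are placeholders, and the single-replica topic is only meant to illustrate the idea of delegating replication to EBS/S3, not a tuning recommendation.

```java
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class ReplicationFactorSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

        try (Admin admin = Admin.create(props)) {
            // Apache Kafka on EBS: durability comes from Kafka-level replication,
            // so topics typically carry 3 replicas, tripling storage and network traffic.
            NewTopic threeReplicas = new NewTopic("orders-on-kafka", 16, (short) 3);

            // AutoMQ Kafka: durability is delegated to EBS/S3, so one replica per
            // partition is enough and no replication traffic crosses brokers.
            NewTopic singleReplica = new NewTopic("orders-on-automq", 16, (short) 1);

            admin.createTopics(List.of(threeReplicas, singleReplica)).all().get();
        }
    }
}
```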

Next, let's look at Object Storage (S3):

  • With tiered storage, Apache Kafka uses S3 as the second tier, while the first tier is still built on EBS. By design, each topic partition must keep at least its active segment in the first tier, so the amount of EBS space the first tier consumes is variable and grows with the number of partitions.
  • AutoMQ Kafka uses S3 as primary storage, with EBS positioned as a high-speed write buffer, essentially a low-latency persistent cache. Each AutoMQ Kafka broker therefore needs only a 2 GB EBS volume, holding at most 500 MB of Delta data that is not yet on S3 (configurable; a configuration sketch follows this list).
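
For concreteness, here is a hedged sketch of what the tiered-storage side of this comparison looks like through the AdminClient. The remote.storage.enable, local.retention.bytes, and retention.ms settings are real KIP-405 topic configurations (they also require remote log storage to be enabled on the brokers); the topic names, partition counts, and retention values are illustrative only.

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class TieredStorageTopicSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

        try (Admin admin = Admin.create(props)) {
            // KIP-405 tiered storage: S3 is only the second tier. Every partition keeps
            // its active segment plus local.retention.* on EBS, so first-tier usage
            // still scales with the partition count and has to be capacity-planned.
            NewTopic tiered = new NewTopic("clicks-tiered", 32, (short) 3)
                    .configs(Map.of(
                            "remote.storage.enable", "true",
                            "local.retention.bytes", String.valueOf(1L << 30),          // ~1 GiB kept locally per partition
                            "retention.ms", String.valueOf(7L * 24 * 60 * 60 * 1000))); // 7 days of total retention

            // AutoMQ Kafka: S3 is the primary store and EBS holds only the small Delta
            // WAL buffer, so there is no per-topic local retention to size here.
            NewTopic s3Primary = new NewTopic("clicks-automq", 32, (short) 1);

            admin.createTopics(List.of(tiered, s3Primary)).all().get();
        }
    }
}
```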

Because EBS serves only as a buffer, its storage footprint is deterministic, which makes the EBS overhead of the AutoMQ Kafka architecture extremely low: a 2 GB EBS volume costs only about 1 yuan per month. With at most 500 MB of buffered data not yet on S3, uploads to S3 complete in seconds, which in turn enables second-level partition reassignment.
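
To make the "seconds" claim concrete, here is a back-of-the-envelope calculation; the upload bandwidth figure is an assumption for illustration, not a measured number.

```java
public class DeltaDrainEstimate {
    public static void main(String[] args) {
        // At most 500 MB of Delta data is not yet on S3 (the configurable bound above).
        double deltaBytes = 500.0 * 1024 * 1024;
        // Assumed usable upload bandwidth from one broker to S3 (illustrative only).
        double uploadBytesPerSec = 200.0 * 1024 * 1024;

        double drainSeconds = deltaBytes / uploadBytesPerSec;
        // ~2.5 s: draining the buffer, and therefore releasing a partition or a broker,
        // finishes in seconds rather than hours.
        System.out.printf("Worst-case Delta drain time: ~%.1f s%n", drainSeconds);
    }
}
```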

In contrast, the first-tier storage in the tiered storage architecture is not fixed in size, so EBS capacity has to be estimated in advance, which is often a difficult task, and large EBS volumes still have to be provisioned for each broker node. Moreover, the amount of data each partition leaves on EBS is not fixed, so partition reassignment time is unpredictable and the cluster cannot scale quickly. In the example given by Confluent, expanding a high-traffic cluster takes 43 hours without tiered storage and 1.4 hours with it. Tiered storage therefore greatly alleviates the inherent problems of the Kafka architecture, but it does not achieve statelessness the way the AutoMQ Kafka architecture does.
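
A rough comparison of how much data has to move when a partition is reassigned makes this concrete. The numbers below are purely illustrative and are not measurements of any particular cluster (including the Confluent example above).

```java
public class ReassignmentDataSketch {
    public static void main(String[] args) {
        // Illustrative assumptions only.
        long fullRetentionBytes = 500L * 1024 * 1024 * 1024; // non-tiered: full retained data per partition
        long activeSegmentBytes = 1L * 1024 * 1024 * 1024;   // tiered storage: roughly the active segment stays on EBS
        long deltaWalBytes      = 500L * 1024 * 1024;        // AutoMQ: <=500 MB Delta WAL for the whole broker

        System.out.printf("Non-tiered Kafka: ~%.1f GiB copied per partition moved%n", gib(fullRetentionBytes));
        System.out.printf("Tiered storage  : ~%.1f GiB copied per partition moved (varies with local retention)%n", gib(activeSegmentBytes));
        System.out.printf("AutoMQ Kafka    : ~%.1f GiB drained to S3, shared by all partitions on the broker%n", gib(deltaWalBytes));
    }

    private static double gib(long bytes) {
        return bytes / (1024.0 * 1024 * 1024);
    }
}
```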

To sum up, compared with the tiered storage solution, the cloud-native architecture of AutoMQ Kafka turns a quantitative improvement into a qualitative change: through architectural optimization it achieves statelessness, arbitrary scaling, and second-level partition reassignment. Tiered storage, by contrast, is a worthwhile optimization, but the architecture remains stateful.

Differences with WarpStream

WarpStream re-implements the Kafka protocol and builds Kafka directly on top of S3 object storage. The protocol is implemented mainly by an Agent component, and since the Agent does not need to mount EBS locally, it can be considered completely stateless. However, S3's I/O characteristics get in the way: S3 is billed per API call, and each call carries a latency of hundreds of milliseconds. WarpStream therefore has to accumulate batches on the synchronous write path, resulting in a P99 write latency of roughly 400 ms and a P99 end-to-end latency of about 1 s.
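
The batching trade-off can be illustrated with a short sketch of the general accumulate-then-PUT pattern. This is not WarpStream's actual implementation; the batch size, linger time, bucket name, and object key layout are all assumptions, and the AWS SDK for Java v2 is used for the S3 call.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

public class BatchedS3Writer {
    private static final int BATCH_BYTES = 4 * 1024 * 1024; // flush after ~4 MiB (assumed)
    private static final long LINGER_MS = 250;              // or after ~250 ms (assumed)

    private final S3Client s3 = S3Client.create();
    private final ByteBuffer batch = ByteBuffer.allocate(BATCH_BYTES);
    private long firstAppendAtMs = -1;

    /** A record is only acknowledged to the producer after its batch has landed on S3. */
    public synchronized void append(byte[] record) {
        if (record.length > batch.remaining()) {
            flush(); // assumes a single record always fits in an empty batch
        }
        if (firstAppendAtMs < 0) {
            firstAppendAtMs = System.currentTimeMillis();
        }
        batch.put(record);
        if (System.currentTimeMillis() - firstAppendAtMs >= LINGER_MS) {
            flush();
        }
    }

    private void flush() {
        if (batch.position() == 0) {
            return;
        }
        batch.flip();
        byte[] payload = new byte[batch.remaining()];
        batch.get(payload);
        // One PUT per batch keeps the per-request API bill down, but every record in
        // the batch waits out the S3 round trip (often >100 ms) before it can be acked,
        // which is where the high tail latency on the write path comes from.
        s3.putObject(PutObjectRequest.builder()
                        .bucket("example-bucket")            // placeholder
                        .key("batches/" + UUID.randomUUID()) // placeholder key layout
                        .build(),
                RequestBody.fromBytes(payload));
        batch.clear();
        firstAppendAtMs = -1;
    }
}
```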

Statelessness

From the perspective of statelessness, WarpStream uses no local storage at all and can be considered completely stateless, which makes partition reassignment and scaling operations naturally easy.

AutoMQ Kafka, for its part, needs only 2 GB of local storage, of which at most 500 MB is Delta data. It still achieves second-level partition reassignment and node scale-out and scale-in, so it can effectively be considered a stateless solution as well.

Cost

From a cost perspective, a 2 GB EBS volume costs only about 1 yuan per month, which is negligible. In terms of cost, AutoMQ and WarpStream are therefore essentially equivalent, and both approach the theoretical optimum.

Latency

WarpStream has made a clear trade-off by sacrificing latency: users who adopt it must accept a P99 write latency of around 400 ms.

AutoMQ Kafka's Delta WAL mechanism, by contrast, delivers highly competitive latency. Because the WAL's storage structure is simple, it can be written directly to the raw block device, bypassing file system overhead entirely and pushing latency to the minimum.
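
A minimal sketch of the idea of appending a WAL directly to a raw block device follows. The device path and 4 KiB block size are assumptions, and a real Delta WAL would add record headers, CRCs, and recovery metadata; the point is only that writes go straight to the device with no file system in between.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class RawDeviceWalSketch {
    private static final int BLOCK_SIZE = 4096; // assumed logical block size of the EBS volume

    private final FileChannel device;
    private long writeOffset = 0;

    public RawDeviceWalSketch(String devicePath) throws IOException {
        // e.g. "/dev/nvme1n1" for an attached EBS volume on EC2 (placeholder path)
        this.device = FileChannel.open(Paths.get(devicePath),
                StandardOpenOption.WRITE, StandardOpenOption.DSYNC);
    }

    /** Appends one record, padded to the block size so every write stays block-aligned. */
    public synchronized long append(byte[] record) throws IOException {
        int paddedLen = ((record.length + BLOCK_SIZE - 1) / BLOCK_SIZE) * BLOCK_SIZE;
        ByteBuffer block = ByteBuffer.allocateDirect(paddedLen);
        block.put(record);
        block.position(0).limit(paddedLen);

        long startOffset = writeOffset;
        while (block.hasRemaining()) {
            // Positional write straight to the block device: no directory entries,
            // no file-system metadata or journal to update.
            device.write(block, startOffset + block.position());
        }
        writeOffset += paddedLen;
        return startOffset; // the caller tracks (offset, length) to read the record back
    }
}
```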

Availability

In terms of dependence on S3, the availability of the two products is aligned. AutoMQ additionally depends on EBS, so it relies on EBS's Detach/Attach APIs and NVMe-related APIs to implement failover.
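
A hedged sketch of the detach/attach step using the AWS SDK for Java v2 is shown below. The volume ID, instance ID, and device name are placeholders, and a production implementation would also wait for the detach to complete and handle the NVMe device discovery mentioned above.

```java
import software.amazon.awssdk.services.ec2.Ec2Client;
import software.amazon.awssdk.services.ec2.model.AttachVolumeRequest;
import software.amazon.awssdk.services.ec2.model.DetachVolumeRequest;

public class EbsFailoverSketch {
    public static void main(String[] args) {
        try (Ec2Client ec2 = Ec2Client.create()) {
            String walVolumeId = "vol-0123456789abcdef0";    // WAL volume of the failed broker (placeholder)
            String healthyInstanceId = "i-0fedcba987654321"; // takeover instance (placeholder)

            // Force-detach the WAL volume from the unreachable broker's instance ...
            ec2.detachVolume(DetachVolumeRequest.builder()
                    .volumeId(walVolumeId)
                    .force(true)
                    .build());

            // ... then attach it to a healthy broker, which replays the <=500 MB of
            // Delta data still on the volume and uploads it to S3.
            ec2.attachVolume(AttachVolumeRequest.builder()
                    .volumeId(walVolumeId)
                    .instanceId(healthyInstanceId)
                    .device("/dev/sdf") // placeholder device name
                    .build());
        }
    }
}
```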

Compatibility

WarpStream chose protocol re-implementation. So far only the main APIs are supported; advanced APIs such as those related to transactions are not yet covered. Implementing the Kafka protocol is genuinely difficult: after so many years of development, some APIs have dozens of versions, so full compatibility is hard to achieve, and a fairly long maturation period can be expected even after the implementation is complete.

AutoMQ Kafka is built by cutting in at Apache Kafka's LogSegment level, replacing only the storage layer. The changes are small enough that AutoMQ Kafka can keep merging and following upstream Apache Kafka code, staying consistent with its features and bug fixes.

Currently, AutoMQ Kafka has passed all of the community's E2E tests, and can be said to be 100% compatible with Apache Kafka.

Differences with Redpanda

The architecture and philosophy of Redpanda and AutoMQ are quite different. Redpanda rewrites Kafka in a native language (C++), which brings some of the performance advantages of native code, but its approach to replication, object storage, and so on is not fundamentally different from Apache Kafka's.

In addition, Redpanda emphasizes extreme per-machine, even per-core, performance. That mindset suits on-premises data centers, where idle physical resources can be fully exploited. In the cloud, however, the resources offered by providers are relatively balanced and essentially pay-as-you-go, so using them more economically is what AutoMQ values more.