Overview

AutoMQ re-engineers Kafka for the cloud by offloading storage to object storage. While remaining 100% compatible with Apache Kafka®, it offers up to 10 times better cost efficiency and up to 100 times better elasticity.

Design Philosophy

Decoupling Storage to Shared Cloud Storage Services

The advantages of separating compute and storage are widely recognized. However, the industry often implements this separation by moving storage into a self-managed distributed storage system, which significantly increases the complexity of deployment, maintenance, and governance. AutoMQ believes that taking storage out of the software entirely and moving it to shared cloud storage services is the optimal solution in the cloud-native era.

AutoMQ uses an S3-based stream repository, S3Stream, to offload storage to shared storage services offered by cloud providers, such as EBS and S3. By combining the storage characteristics of both, AutoMQ offers low-cost, low-latency, highly available, highly durable, and virtually infinite streaming storage. For more technical details, please refer to the technical architecture chapter.

Shared Storage vs. Shared Nothing Architecture

Apache Kafka® uses local disks to achieve high-durability storage, providing an abstraction of infinite streaming storage for business logic. All data is stored on the disks of each node according to specific logic, a design commonly referred to as Shared Nothing architecture.

The local disks of a single node do not scale, so a Shared Nothing architecture typically achieves higher throughput by scaling horizontally. Shared cloud storage, by contrast, has become highly elastic with near "infinite" capacity, so a shared storage architecture can take full advantage of cloud storage at a lower cost.

Reliability and Availability Separation

Apache Kafka® natively supports a multi-replica mechanism to ensure data durability through replica redundancy. It enhances system availability via a fast failover mechanism achieved through leader election among different replicas.

In cloud environments, EBS has a built-in 3-replica storage mechanism. If Kafka also keeps 3 replicas on top of it, each of those replicas is itself stored 3 times, so the data ends up being stored 9 times, significantly increasing storage, bandwidth, and compute costs. AutoMQ believes that cloud-native Kafka no longer needs a multi-replica mechanism to provide both reliability and availability. By delegating reliability to cloud storage and providing availability independently, AutoMQ achieves a truly cloud-native implementation.
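The arithmetic behind the "stored 9 times" figure can be written out as a minimal sketch. The function name and parameters are illustrative only and are not part of AutoMQ or Apache Kafka.

```python
def physical_copies(app_replicas: int, volume_replicas: int) -> int:
    """Physical copies of each byte when application-level replication
    is stacked on top of storage that already replicates internally."""
    if app_replicas < 1 or volume_replicas < 1:
        raise ValueError("replica counts must be at least 1")
    return app_replicas * volume_replicas


# 3 Kafka replicas, each on a volume that already keeps 3 copies.
print(physical_copies(app_replicas=3, volume_replicas=3))  # 9

# A single application-level copy that relies on the volume's own 3 copies.
print(physical_copies(app_replicas=1, volume_replicas=3))  # 3
```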

Technical Advantages

AutoMQ has the following advantages compared to Apache Kafka®:

10x Cost Advantage

The cloud-native architecture of AutoMQ takes full advantage of the high availability and elastic provisioning of object storage, offering customers up to a 10x cost advantage over Apache Kafka®.

  • Using object storage as primary storage significantly reduces storage costs.

  • High availability is achieved without maintaining multiple replicas, saving 2/3 of the traffic and replication costs.

  • Native support for Spot instances and auto scaling removes the need to reserve resources for peak loads.
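One way to read the 2/3 figure in the list above, assuming a replication factor of 3: every produced byte is received once by the leader and then copied to two followers, so two of every three bytes on the write path are replication traffic. The sketch below only illustrates that arithmetic; the function name is hypothetical.

```python
from fractions import Fraction


def replication_share(replication_factor: int) -> Fraction:
    """Share of write-path traffic that is follower replication.

    Assumes each byte is received once by the leader and then
    copied to (replication_factor - 1) followers.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return Fraction(replication_factor - 1, replication_factor)


print(replication_share(3))  # 2/3
print(replication_share(1))  # 0
```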

Extreme Elasticity

AutoMQ offloads state to object storage services, leaving the business logic layer completely stateless. AutoMQ clusters can complete partition reassignment and traffic self-balancing within seconds, which addresses the slow rebalancing and difficult partition reassignment encountered when scaling Apache Kafka. Integrating with cloud providers' elastic scaling group policies makes adaptive scaling of the cluster straightforward.
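The difference between the two reassignment styles can be illustrated with a toy model. It is a sketch under stated assumptions, not AutoMQ's implementation: the partition size, bandwidth, partition name, and broker names are all hypothetical.

```python
def copy_seconds(partition_bytes: int, bandwidth_bytes_per_s: int) -> float:
    """Shared Nothing: the new broker must receive the data before serving it."""
    return partition_bytes / bandwidth_bytes_per_s


def move_ownership(ownership: dict, partition: str, new_broker: str) -> dict:
    """Shared storage: only the partition-to-broker mapping changes."""
    updated = dict(ownership)  # copy so the caller's mapping is not mutated
    updated[partition] = new_broker
    return updated


# Hypothetical figures: a 500 GiB partition copied at 100 MiB/s.
print(copy_seconds(500 * 1024**3, 100 * 1024**2))  # 5120.0 seconds, about 85 minutes

# Hypothetical names: the data stays in object storage, only ownership moves.
print(move_ownership({"orders-0": "broker-1"}, "orders-0", "broker-2"))
# {'orders-0': 'broker-2'}
```

In the first case the cost grows with the amount of data held by the partition; in the second it is independent of data size, which is why reassignment can complete in seconds.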

Cold Reads Do Not Affect Message Writing

Each Apache Kafka broker has a fixed IOPS budget that message reads and writes share. If a large-scale cold read pushes IOPS to the limit, message writes may queue up and time out. This is a limitation of Kafka's integrated storage and compute architecture. In contrast, AutoMQ separates storage from compute, so cold reads and hot reads/writes do not interfere with each other. Cold read throughput depends on the throughput capacity of object storage, while EBS is used exclusively as the write-ahead log (WAL) on the write path. The AutoMQ architecture therefore avoids this issue.
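A toy model of the contention described above, with hypothetical IOPS figures: when reads and writes share one disk budget, a heavy cold read shrinks what is left for writes; when cold reads are served from a separate store, the write budget is unaffected. This is a simplification for illustration, not a performance model of either system.

```python
def write_iops_available(disk_iops_limit: int, cold_read_iops: int,
                         shared_disk: bool) -> int:
    """IOPS left for writes under a simple fixed-budget model."""
    if shared_disk:
        # Cold reads and writes compete for the same disk budget.
        return max(disk_iops_limit - cold_read_iops, 0)
    # Cold reads go to a separate store, so writes keep the whole budget.
    return disk_iops_limit


# Hypothetical numbers: a 3000 IOPS disk and a cold read consuming 2800 IOPS.
print(write_iops_available(3000, 2800, shared_disk=True))   # 200
print(write_iops_available(3000, 2800, shared_disk=False))  # 3000
```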

Zero Service Interruption

AutoMQ stores all data on S3, so scaling the cluster requires no data replication and the cluster can respond quickly to sudden traffic spikes. In comparison, Apache Kafka needs substantial bandwidth to replicate data after scaling, which makes sudden traffic surges hard to handle. Through auto-scaling, automatic traffic balancing, and automatic fault recovery, AutoMQ achieves a high level of system autonomy and higher availability without manual intervention.

100% Compatibility with Apache Kafka®

Unlike other vendors who reimplement the Kafka protocol, AutoMQ replaces only a minimal part of the storage layer. By modifying only the underlying LogSegment implementation and leaving the rest of the Apache Kafka code unchanged, AutoMQ achieves 100% compatibility with Apache Kafka and can adopt new versions quickly.