Glossary

This article defines the technical terms used across the modules of AutoMQ and gives a concise summary of each.

Terminology Classification

  • Cloud Service Concepts: This section discusses the cloud services and product components utilized by AutoMQ. Users can consult the documentation from various cloud providers for additional information.

  • Apache Kafka Concepts: This section reviews some existing concepts from Apache Kafka, which might be modified in the AutoMQ implementation.

  • AutoMQ Concepts: This section introduces new concepts created for each module of AutoMQ.

Cloud Service Terminology

EBS

EBS (Elastic Block Store) provides high-performance, scalable, durable, low-latency block storage. In AutoMQ's system design, EBS temporarily holds message data that has not yet been transferred to object storage, which reduces latency when sending and receiving messages. Note that other cloud service providers may offer similar services under different product names.

S3

S3 (Simple Storage Service) provides secure, durable, and highly scalable object storage. In AutoMQ's system architecture, object storage is the primary storage medium for messages, offering on-demand access, pay-as-you-go pricing, and storage costs up to 90% lower than Apache Kafka®. Subsequent documents also refer to S3 as object storage; other cloud service providers may use different names for their object storage services.

Bucket

A Bucket is the basic container for object storage services, facilitating efficient data management. Before deploying AutoMQ, it is essential to allocate some Buckets for configuring message storage.

Auto Scaling Group (ASG)

Auto Scaling Group (ASG) is a service that automatically adjusts computing resources to match application load. An ASG increases or decreases the number of instances in a group of virtual hosts automatically, keeping applications highly available while optimizing costs. AutoMQ relies on ASG for features such as automatic elastic scaling. Other cloud service providers may offer equivalent services under different product names.

Apache Kafka® Terminology

Broker

A Broker in the Apache Kafka® system acts as the logical role for processing, storing, and transmitting messages, with multiple Broker nodes together forming a Kafka cluster. In AutoMQ's system design, a Broker specifically refers to the standard role involved in message sending and receiving, excluding the Controller role responsible for scheduling and allocation.

Controller

A Controller in the Apache Kafka® system is the logical role for scheduling and coordinating task distribution among multiple nodes. Depending on the version, the implementation of a Controller may vary. In AutoMQ's system design, the Controller is built on the KRaft mode and does not rely on ZooKeeper services. Among multiple Controller nodes, one acts as the Active Controller, serving as the primary decision-making node.

Partition

A Partition is the logical segment of an Apache Kafka® Topic, used to facilitate parallel processing of data and enhance throughput capabilities. Each Partition is an ordered, immutable sequence of messages. In AutoMQ's system design, the Partition retains its original functional definition but is no longer stored on local disks; instead, it is constructed using object storage to achieve unlimited capacity and on-demand scalability.

AutoMQ® Terminology

AutoMQ

AutoMQ is a next-generation Apache Kafka® distribution, redesigned around cloud-native principles. It remains 100% compatible with the Apache Kafka® protocol while offering up to a tenfold cost advantage and scalability improved by a factor of hundreds.

S3Stream

S3Stream is a low-latency, high-throughput, scalable, and cost-effective stream storage library built on EBS and object storage, which it integrates behind the Stream operation interface. AutoMQ encapsulates S3Stream and uses it to replace Apache Kafka®'s log storage. This keeps 100% compatibility with the upper-layer functions of Apache Kafka® while providing a tenfold cost reduction and scalability improved by a factor of hundreds compared with Apache Kafka®.

S3Url

S3Url is a unified configuration item utilized by AutoMQ for swift cluster startup, encompassing object storage access points and identity details. Users are advised to employ installation tools to create the S3Url configuration to pre-validate parameter accuracy and resource compatibility, thereby bypassing the complex cluster ID creation and storage formatting required in Apache Kafka®.

WAL

WAL (Write-Ahead Log) is a high-throughput, low-latency, persistent cache within the S3Stream library. It is based on EBS and temporarily holds data that has not yet been committed to object storage. In AutoMQ, one WAL is assigned per Broker. When a message is received, the Broker first appends it to the WAL sequentially, responds to the client promptly, and then uploads the WAL content to object storage asynchronously.
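The write path described above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not AutoMQ's implementation: `ToyBroker`, `produce`, `upload_pending`, and the `wal-batch-...` key format are hypothetical names, a Python list stands in for the EBS-backed WAL, and a dict stands in for an object storage Bucket. The asynchronous upload is modeled as an explicit method call so that the example stays deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBroker:
    """Hypothetical model of one Broker's write path (not AutoMQ code)."""

    wal: list = field(default_factory=list)             # stands in for the EBS-backed WAL
    object_storage: dict = field(default_factory=dict)  # stands in for a Bucket
    uploaded_upto: int = 0                              # number of WAL entries already uploaded

    def produce(self, message: bytes) -> int:
        """Append the message to the WAL and acknowledge at once.

        Returns the position of the entry in the WAL. Nothing is
        written to object storage on this path.
        """
        self.wal.append(message)
        return len(self.wal) - 1

    def upload_pending(self) -> int:
        """Background step: copy not-yet-uploaded WAL entries to object storage.

        Returns the number of entries uploaded by this call.
        """
        pending = self.wal[self.uploaded_upto:]
        if not pending:
            return 0
        key = f"wal-batch-{self.uploaded_upto:08d}"  # hypothetical key format
        self.object_storage[key] = list(pending)
        self.uploaded_upto = len(self.wal)
        return len(pending)
```

In this model, calling `produce` twice leaves `object_storage` empty until `upload_pending` runs, which mirrors the idea that the client is acknowledged before the data reaches object storage.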

Stream Object

Stream Object is the fundamental unit within S3Stream designated for storing Stream data. Data from each Stream is divided among several Stream Objects, collectively emulating an infinite Stream.
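One way to picture how several Stream Objects emulate a single continuous Stream is an offset lookup. The sketch below assumes that each Stream Object covers a contiguous, non-overlapping offset range and that the objects are kept sorted by start offset; the source does not specify this layout, and `ToyStreamObject`, `find_object`, and the key strings are hypothetical names used only for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyStreamObject:
    """Hypothetical descriptor of one Stream Object."""

    start_offset: int  # first offset held by the object (inclusive)
    end_offset: int    # one past the last offset held (exclusive)
    key: str           # hypothetical object storage key


def find_object(objects, offset):
    """Return the object that covers `offset`, or None if none does.

    `objects` must be sorted by start_offset and must not overlap.
    """
    starts = [o.start_offset for o in objects]
    i = bisect_right(starts, offset) - 1  # last object starting at or before offset
    if i >= 0 and offset < objects[i].end_offset:
        return objects[i]
    return None
```

A reader that wants offset 120 of a Stream whose objects cover [0, 100) and [100, 250) would be directed to the second object, so the caller sees one Stream even though the data lives in separate objects.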

Stream Set Object

Stream Set Object is a temporary data structure in S3Stream used to consolidate fragmented Stream write requests. When temporary data is transferred from WAL to object storage, data from multiple fragmented Streams is merged into a single Stream Set Object before upload. The Stream Set Object is later sorted and reorganized into regular Stream Objects asynchronously.
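The two stages described above, merging first and reorganizing later, can be sketched as follows. This is a simplified model under stated assumptions rather than the S3Stream implementation: records are represented as `(stream_id, offset, payload)` tuples, and `build_stream_set_object` and `split_into_stream_objects` are hypothetical function names.

```python
from collections import defaultdict


def build_stream_set_object(wal_records):
    """Pack records from many Streams into one combined object.

    `wal_records` is a list of (stream_id, offset, payload) tuples in
    arrival order. The records are kept interleaved, so a single
    upload covers all of them.
    """
    return list(wal_records)


def split_into_stream_objects(stream_set_object):
    """Later, asynchronous step: regroup the combined object by Stream.

    Returns a dict mapping stream_id to that Stream's records, sorted
    by offset, which stands in for one regular Stream Object each.
    """
    per_stream = defaultdict(list)
    for stream_id, offset, payload in stream_set_object:
        per_stream[stream_id].append((offset, payload))
    return {
        stream_id: sorted(records, key=lambda r: r[0])
        for stream_id, records in per_stream.items()
    }
```

The point of the first stage is that many small writes to different Streams cost one upload instead of one upload per Stream; the second stage restores a per-Stream layout that is easier to read sequentially.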