
MinIO

Preface

MinIO is a high-performance, distributed object storage system that runs on standard hardware, delivering an excellent balance of cost and functionality for a wide range of applications. It is purpose-built for private clouds, with a streamlined architecture that combines strong performance with robust object storage capabilities. Whether for traditional secondary storage, disaster recovery, and archiving, or newer workloads such as machine learning, big data, and private or hybrid clouds, MinIO is both versatile and fast.

Thanks to MinIO's full compatibility with the S3 API, you can deploy an AutoMQ cluster on MinIO in your private data center, creating a streaming system that is fully compatible with Kafka while offering better cost efficiency, greater elasticity, and lower latency. This article shows you how to set up an AutoMQ cluster on MinIO in your private data center.

Prerequisites

  • A functioning MinIO environment. If you don't already have MinIO set up, please consult its official installation guide.

  • Prepare 5 hosts for the AutoMQ cluster deployment. It's advisable to use Linux amd64 hosts with 2 cores and 16GB of memory, and to set up two virtual storage volumes. Here's an example:

    | Role       | IP          | Node ID | System Volume | Data Volume |
    |------------|-------------|---------|---------------|-------------|
    | CONTROLLER | 192.168.0.1 | 0       | EBS 20GB      | EBS 20GB    |
    | CONTROLLER | 192.168.0.2 | 1       | EBS 20GB      | EBS 20GB    |
    | CONTROLLER | 192.168.0.3 | 2       | EBS 20GB      | EBS 20GB    |
    | BROKER     | 192.168.0.4 | 3       | EBS 20GB      | EBS 20GB    |
    | BROKER     | 192.168.0.5 | 4       | EBS 20GB      | EBS 20GB    |

    Tips:

    • Ensure these machines are in the same subnet and can communicate with each other.

    • In non-production settings, you can deploy a single Controller which, by default, also functions as a Broker.

  • Download the latest official binary installation package from AutoMQ Github Releases to install AutoMQ.

  • Create two object storage buckets on MinIO, named automq-data and automq-ops.

    1. You can configure environment variables to provide the Access Key and Secret Key needed for the AWS CLI.

    export AWS_ACCESS_KEY_ID=X1J0E1EC3KZMQUZCVHED
    export AWS_SECRET_ACCESS_KEY=Hihmu8nIDN1F7wshByig0dwQ235a0WAeUvAEiWSD

    2. Use the AWS CLI to create the buckets.

    aws s3api create-bucket --bucket automq-data --endpoint-url=http://10.1.0.240:9000
    aws s3api create-bucket --bucket automq-ops --endpoint-url=http://10.1.0.240:9000
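If you script this step, a small dry-run sketch can print the create-bucket command for each bucket before anything is executed (the endpoint value is an assumption; substitute your actual MinIO address):

```shell
# Dry run: print the create-bucket command for each bucket.
# ENDPOINT is an assumed example value -- replace with your MinIO address.
ENDPOINT="http://10.1.0.240:9000"
for bucket in automq-data automq-ops; do
  echo "aws s3api create-bucket --bucket $bucket --endpoint-url $ENDPOINT"
done
```

Remove the echo to actually run the commands.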

Install and Initiate the AutoMQ Cluster

Step 1: Generate an S3 URL

AutoMQ offers the automq-kafka-admin.sh tool for rapid startup of AutoMQ. Simply supply an S3 URL with the required S3 access point and authentication details, and you can start AutoMQ with a single command, eliminating the need to manually generate a cluster ID or format storage.


### Command Line Usage Example
bin/automq-kafka-admin.sh generate-s3-url \
--s3-access-key=xxx \
--s3-secret-key=yyy \
--s3-region=cn-northwest-1 \
--s3-endpoint=s3.cn-northwest-1.amazonaws.com.cn \
--s3-data-bucket=automq-data \
--s3-ops-bucket=automq-ops

When using MinIO, you can generate the S3 URL with the following configuration.

| Parameter Name | Default Value in This Example | Description |
|----------------|-------------------------------|-------------|
| --s3-access-key | minioadmin | Environment variable MINIO_ROOT_USER |
| --s3-secret-key | minio-secret-key-CHANGE-ME | Environment variable MINIO_ROOT_PASSWORD |
| --s3-region | us-west-2 | This parameter is not applicable to MinIO and can be set to any value, such as us-west-2 |
| --s3-endpoint | http://10.1.0.240:9000 | The endpoint can be obtained by running the command sudo systemctl status minio.service |
| --s3-data-bucket | automq-data | - |
| --s3-ops-bucket | automq-ops | - |

Output Results

After you run this command, the tool proceeds through the following stages:

  1. Probe core S3 features with the supplied accessKey and secretKey to verify compatibility with AutoMQ.

  2. Generate an s3url from the credentials and endpoint information.

  3. Print an example AutoMQ startup command based on the s3url. In that command, substitute --controller-list and --broker-list with the actual CONTROLLER and BROKER addresses for your deployment.

Here's an example of the execution result:


############ Ping S3 ########################

[ OK ] Write s3 object
[ OK ] Read s3 object
[ OK ] Delete s3 object
[ OK ] Write s3 object
[ OK ] Upload s3 multipart object
[ OK ] Read s3 multipart object
[ OK ] Delete s3 object
############ String of S3url ################

Your s3url is:

s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=xxx&s3-secret-key=yyy&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA


############ Usage of S3url ################
To start AutoMQ, generate the start commandline using s3url.
bin/automq-kafka-admin.sh generate-start-command \
--s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" \
--controller-list="192.168.0.1:9093;192.168.0.2:9093;192.168.0.3:9093" \
--broker-list="192.168.0.4:9092;192.168.0.5:9092"

TIPS: Please replace the controller-list and broker-list with your actual IP addresses.
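The s3url is an ordinary URL query string, so individual fields can be pulled out with standard shell tools; for example, this sketch extracts the cluster-id from the example URL above:

```shell
# Extract cluster-id from a generated s3url (example URL from the output above).
S3URL='s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=xxx&s3-secret-key=yyy&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA'
QUERY="${S3URL#*\?}"                                   # drop the scheme and host part
CLUSTER_ID=$(printf '%s\n' "$QUERY" | tr '&' '\n' | sed -n 's/^cluster-id=//p')
echo "$CLUSTER_ID"    # prints 40ErA_nGQ_qNPDz0uodTEA
```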

Step 2: Generate the Command List

Replace --controller-list and --broker-list in the previously generated command with your server details: the IP addresses of the 3 CONTROLLER hosts and 2 BROKER hosts prepared during environment setup, using the default ports 9093 and 9092 respectively.


bin/automq-kafka-admin.sh generate-start-command \
--s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" \
--controller-list="192.168.0.1:9093;192.168.0.2:9093;192.168.0.3:9093" \
--broker-list="192.168.0.4:9092;192.168.0.5:9092"

Parameter Explanation

| Parameter Name | Required | Description |
|----------------|----------|-------------|
| --s3-url | Yes | Generated by the command-line tool bin/automq-kafka-admin.sh generate-s3-url; includes authentication details and the cluster ID |
| --controller-list | Yes | Requires at least one address, used as the IP and port list of the CONTROLLER hosts. Format: IP1:PORT1;IP2:PORT2;IP3:PORT3 |
| --broker-list | Yes | Requires at least one address, used as the IP and port list of the BROKER hosts. Format: IP1:PORT1;IP2:PORT2;IP3:PORT3 |
| --controller-only-mode | No | Determines whether a CONTROLLER node serves solely as CONTROLLER. Defaults to false, meaning a deployed CONTROLLER node also acts as a BROKER. |
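Since --controller-list and controller.quorum.voters (used later at startup) are derived from the same controller addresses, a small shell sketch using this article's example IPs can build both strings consistently:

```shell
# Build --controller-list and controller.quorum.voters from one list of IPs.
CONTROLLERS="192.168.0.1 192.168.0.2 192.168.0.3"   # example IPs from this article
PORT=9093
LIST="" VOTERS="" id=0
for ip in $CONTROLLERS; do
  LIST="${LIST}${ip}:${PORT};"            # IP:PORT entries joined by ';'
  VOTERS="${VOTERS}${id}@${ip}:${PORT},"  # nodeid@IP:PORT entries joined by ','
  id=$((id + 1))
done
LIST="${LIST%;}"; VOTERS="${VOTERS%,}"
echo "--controller-list=\"$LIST\""
echo "controller.quorum.voters=$VOTERS"
```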

Output Result

After the command executes, it outputs the commands used to start each AutoMQ node.


############ Start Commandline ##############
To start an AutoMQ Kafka server, please navigate to the directory where your AutoMQ tgz file is located and run the following command.

Before running the command, make sure that Java 17 is installed on your host. You can verify the Java version by executing 'java -version'.

bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker,controller --override node.id=0 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.1:9092,CONTROLLER://192.168.0.1:9093 --override advertised.listeners=PLAINTEXT://192.168.0.1:9092

bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker,controller --override node.id=1 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.2:9092,CONTROLLER://192.168.0.2:9093 --override advertised.listeners=PLAINTEXT://192.168.0.2:9092

bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker,controller --override node.id=2 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.3:9092,CONTROLLER://192.168.0.3:9093 --override advertised.listeners=PLAINTEXT://192.168.0.3:9092

bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker --override node.id=3 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.4:9092 --override advertised.listeners=PLAINTEXT://192.168.0.4:9092

bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker --override node.id=4 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.5:9092 --override advertised.listeners=PLAINTEXT://192.168.0.5:9092


TIPS: Start controllers first and then the brokers.

By default, node.id starts from 0 automatically.

Step 3: Start AutoMQ

To initiate the cluster, sequentially run the list of commands provided in the previous step on the designated CONTROLLER or BROKER host. For example, to start the first CONTROLLER process at 192.168.0.1, execute the first command from the list of startup commands.


bin/kafka-server-start.sh --s3-url="s3://s3.cn-northwest-1.amazonaws.com.cn?s3-access-key=XXX&s3-secret-key=YYY&s3-region=cn-northwest-1&s3-endpoint-protocol=https&s3-data-bucket=automq-data&s3-path-style=false&s3-ops-bucket=automq-ops&cluster-id=40ErA_nGQ_qNPDz0uodTEA" --override process.roles=broker,controller --override node.id=0 --override controller.quorum.voters=0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 --override listeners=PLAINTEXT://192.168.0.1:9092,CONTROLLER://192.168.0.1:9093 --override advertised.listeners=PLAINTEXT://192.168.0.1:9092

Parameter Description

When using the startup command, any parameters not specified will revert to Apache Kafka's default configuration. For newly introduced parameters by AutoMQ, the default settings of AutoMQ will be applied. To customize the default settings, you can append additional --override key=value parameters at the end of the command.

| Parameter Name | Mandatory | Description |
|----------------|-----------|-------------|
| s3-url | Yes | Generated by the bin/automq-kafka-admin.sh generate-s3-url command-line tool; includes authentication details, the cluster ID, etc. |
| process.roles | Yes | Options are CONTROLLER or BROKER. If a host serves as both CONTROLLER and BROKER, the value is CONTROLLER,BROKER. |
| node.id | Yes | An integer that uniquely identifies a BROKER or CONTROLLER; must be unique within the cluster. |
| controller.quorum.voters | Yes | Host information for participating in KRAFT elections, including nodeid, ip, and port, for example: 0@192.168.0.1:9093,1@192.168.0.2:9093,2@192.168.0.3:9093 |
| listeners | Yes | The IP and port to listen on |
| advertised.listeners | Yes | The address the BROKER advertises for client access |
| log.dirs | No | Directory for storing KRAFT and BROKER metadata |
| s3.wal.path | No | In production environments, it is recommended to store AutoMQ WAL data on the raw device of a separately mounted data volume for better performance, as AutoMQ supports writing data to raw devices, which reduces latency. Ensure the correct path is configured to store WAL data. |
| autobalancer.controller.enable | No | Defaults to false, meaning traffic self-balancing is disabled. When enabled, AutoMQ's auto balancer component automatically reassigns partitions to keep overall traffic balanced. |

Tips: To enable continuous traffic self-balancing, or self-balancing when cluster nodes change (see the example: Self-Balancing When Cluster Nodes Change), it is advisable to explicitly set --override autobalancer.controller.enable=true when starting the Controllers.

Running in the Background

If background mode is required, append the following code to the end of the command:


command > /dev/null 2>&1 &
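Redirecting everything to /dev/null discards the server log; an alternative sketch (the helper name, log path, and placeholder command are illustrative) keeps a log file and records the PID for later shutdown:

```shell
# Sketch: start a command in the background, keep its log, record its PID.
start_bg() {   # usage: start_bg <logfile> <command...>
  log="$1"; shift
  nohup "$@" > "$log" 2>&1 &
  echo $! > "${log%.log}.pid"   # write the PID file next to the log
}

# Illustrative usage with a placeholder command instead of kafka-server-start.sh:
start_bg /tmp/automq-demo.log sleep 5
cat /tmp/automq-demo.pid
```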

Data Volume Path

Use the lsblk command on Linux to view local data volumes; unpartitioned block devices are considered data volumes. In the following example, vdb is an unpartitioned raw block device.


vda 253:0 0 20G 0 disk
├─vda1 253:1 0 2M 0 part
├─vda2 253:2 0 200M 0 part /boot/efi
└─vda3 253:3 0 19.8G 0 part /
vdb 253:16 0 20G 0 disk

By default, AutoMQ stores metadata and WAL data in the /tmp directory. However, it's important to note that if the /tmp directory is mounted on tmpfs, it is unsuitable for production environments.
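A quick way to check whether /tmp is backed by tmpfs on a host (a sketch assuming GNU coreutils df):

```shell
# Warn when /tmp is tmpfs, since data stored there is lost on reboot.
fstype=$(df --output=fstype /tmp | tail -n 1 | tr -d ' ')
if [ "$fstype" = "tmpfs" ]; then
  echo "WARNING: /tmp is tmpfs; point log.dirs and s3.wal.path elsewhere"
else
  echo "/tmp filesystem type: $fstype"
fi
```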

For production or formal testing environments, it is recommended to modify the configuration so that the metadata directory log.dirs and the WAL data directory s3.wal.path (the raw device of the data disk) point to other locations:


bin/kafka-server-start.sh ...\
--override s3.telemetry.metrics.exporter.type=prometheus \
--override s3.metrics.exporter.prom.host=0.0.0.0 \
--override s3.metrics.exporter.prom.port=9090 \
--override log.dirs=/root/kraft-logs \
--override s3.wal.path=/dev/vdb \
> /dev/null 2>&1 &

Tips:

  • Please change s3.wal.path to the actual name of the local raw device. To place AutoMQ's Write-Ahead Log (WAL) on local SSD storage instead, ensure the specified path is on an SSD with more than 10GB of available space, for example: --override s3.wal.path=/home/admin/automq-wal

  • When deploying AutoMQ in a private data center for production, ensure the reliability of the local SSD, such as by using RAID technology.

With this, you have completed the deployment of an AutoMQ cluster on MinIO, giving you a low-cost, low-latency, self-balancing Kafka cluster. To further explore AutoMQ's near-instant partition reassignment and continuous self-balancing features, refer to the official example.