> ## Documentation Index
> Fetch the complete documentation index at: https://docs.automq.com/llms.txt
> Use this file to discover all available pages before exploring further.

# StarRocks

> AutoMQ enables real-time data import to StarRocks, leveraging Kafka compatibility for integration and cloud-native scalability. Achieve efficient, high-concurrency analysis.

[StarRocks](https://www.starrocks.io/) is a high-performance analytical data warehouse that leverages advanced technologies such as vectorization, MPP architecture, CBO, intelligent materialized views, and a real-time updatable columnar storage engine. It supports multidimensional, real-time, and high-concurrency data analysis.

This article will introduce how to use StarRocks Routine Load to import data from AutoMQ into StarRocks. For a detailed understanding of the basic principles of Routine Load, refer to the [Routine Load Basic Principles](https://docs.starrocks.io/docs/loading/load_concept/strict_mode/#routine-load) documentation.

## Prerequisites

### Prepare StarRocks and Test Data

Ensure that a usable StarRocks cluster is already prepared. For demonstration purposes, we refer to [Deploy StarRocks with Docker](https://docs.starrocks.io/docs/3.0/quick_start/deploy_with_docker/) to install a demonstration cluster on a Linux machine.

Create test tables for the database and primary key model:

```sql theme={null}
create database automq_db;
create table users (
    id bigint NOT NULL,
    name string NOT NULL,
    timestamp string NULL,
    status string NULL
) PRIMARY KEY (id)
DISTRIBUTED BY HASH(id)
PROPERTIES (
    "replication_num" = "1",
    "enable_persistent_index" = "true"
);

```

### Prepare AutoMQ and Test Data

Refer to [Deploy Multi-Nodes Cluster on Linux▸](/automq/deployment/deploy-multi-nodes-cluster-on-linux) to deploy AutoMQ and ensure network connectivity between AutoMQ and StarRocks.

Quickly create a topic named `example_topic` in AutoMQ and write a test JSON data to it following these steps.

#### **Create Topic**

Use the Apache Kafka® command-line tool to create a topic. Ensure you have access to the Kafka environment and that the Kafka service is running. Below is an example command to create a topic:

```bash theme={null}
./kafka-topics.sh --create --topic exampleto_topic --bootstrap-server 10.0.96.4:9092  --partitions 1 --replication-factor 1

```

<Tip>
  When executing the command, replace `topic` and `bootstrap-server` with the actual Kafka server address.
</Tip>

After creating the topic, you can use the following command to verify whether the topic has been successfully created.

```bash theme={null}
./kafka-topics.sh --describe example_topic --bootstrap-server 10.0.96.4:9092
```

#### **Generate Test Data**

Generate test data in JSON format that corresponds to the table mentioned earlier.

```json theme={null}
{
  "id": 1,
  "name": "test_user",
  "timestamp": "2023-11-10T12:00:00",
  "status": "active"
}

```

#### **Writing Test Data**

Use Kafka's command-line tools or programming methods to write test data into a Topic named `example_topic`. Here is an example using the command-line tool:

```bash theme={null}
echo '{"id": 1, "name": "Test User", "timestamp": "2023-11-10T12:00:00", "status": "active"}' | sh kafka-console-producer.sh --broker-list 10.0.96.4:9092 --topic example_topic

```

Use the following command to view the data just written to the Topic:

```bash theme={null}
sh kafka-console-consumer.sh --bootstrap-server 10.0.96.4:9092 --topic example_topic --from-beginning
```

<Tip>
  When executing the command, replace `topic` and `bootstrap-server` with the actual Kafka server address.
</Tip>

## Creating Routine Load Import Job

Create a Routine Load job in the StarRocks command line to continuously import data from the AutoMQ Kafka Topic.

```sql theme={null}
CREATE ROUTINE LOAD automq_example_load ON users
COLUMNS(id, name, timestamp, status)
PROPERTIES
(
    "desired_concurrent_number" = "5",
    "format" = "json",
    "jsonpaths" = "[\"$.id\",\"$.name\",\"$.timestamp\",\"$.status\"]"
 )
FROM KAFKA
(
    "kafka_broker_list" = "10.0.96.4:9092",
    "kafka_topic" = "example_topic",
    "kafka_partitions" = "0",
    "property.kafka_default_offsets" = "OFFSET_BEGINNING"
);

```

<Tip>
  When executing the command, replace `kafka_broker_list` with the actual Kafka server address.
</Tip>

### Parameter Description

#### **Data Format**

The data format needs to be specified as JSON in the `PROPERTIES` clause with `"format" = "json"`.

#### **Data Extraction and Transformation**

If you need to specify a mapping and conversion relationship between the source data and the target table columns, you can configure the `COLUMNS` and `jsonpaths` parameters. In `COLUMNS`, the column names correspond to the column names of the **target table**, and the order of columns corresponds to the order of columns in **source data**. The `jsonpaths` parameter is used to extract the necessary field data from the JSON data, similar to newly generated CSV data. Subsequently, the `COLUMNS` parameter will temporarily name the fields in the order specified by `jsonpaths`. For more information on data conversion, refer to [Data Conversion Implementation During Import](https://docs.starrocks.io/docs/3.0/loading/Etl_in_loading/).

<Tip>
  If each line contains a JSON object where the names and number of keys correspond to the columns in the target table (order does not need to match), the `COLUMNS` configuration is not required.
</Tip>

## Validate Data Import

First, check the status of the Routine Load import job to ensure the task is running.

```sql theme={null}

show routine load\G;

```

And then querying the relevant tables in the StarRocks database, you can see that the data has been successfully imported.

```sql theme={null}
StarRocks > select * from users;
+------+--------------+---------------------+--------+
| id   | name         | timestamp           | status |
+------+--------------+---------------------+--------+
|    1 | Test User     | 2023-11-10T12:00:00 | active |
|    2 | Test User     | 2023-11-10T12:00:00 | active |
+------+--------------+---------------------+--------+
2 rows in set (0.01 sec)

```
