Deploy a simple Multi-Node Clickhouse Cluster with docker-compose in minutes.

Overview

Simple Multi Node Clickhouse Cluster

I hate those single-node clickhouse clusters and manually installation, I mean, why should we:

this is just weird!

So this repo tries to solve these problem.

Note

  • This is a simplified model of Multi Node Clickhouse Cluster, which lacks: LoadBalancer config/Automated Failover/MultiShard Config generation.
  • All clickhouse data is persisted under event-data, if you need to move clickhouse to some other directory, you'll just need to move the directory(that contains docker-compose.yml) and docker-compose up -d to fire it up again.
  • Host network mode is used to simplify the whole deploy procedure, so you might need to create addition firewall rules if you are running this on a public accessible machine.

Prerequisites

To use this, we need docker and docker-compose installed, recommended OS is ubuntu, and it's recommended to install clickhouse-client on machine, so on a typical ubuntu server, doing the following should be sufficient.

apt update
curl -fsSL https://get.docker.com -o get-docker.sh && sh get-docker.sh && rm -f get-docker.sh
apt install docker-compose clickhouse-client -y

Usage

  1. Clone this repo
  2. Edit the necessary server info in topo.yml
  3. Run python3 generate.py
  4. Your cluster info should be in the cluster directory now
  5. Sync those files to related nodes and run docker-compose up -d on them
  6. Your cluster is ready to go

If you still cannot understand what I'm saying above, see the example below.

Example Usage

Edit information

I've Clone the repo and would like to set a 3-master clickhouse cluster and has the following specs

  • 3 replica(one replica on each node)
  • 1 Shard only

So I need to edit the topo.yml as follows:

global:
  clickhouse_image: "yandex/clickhouse-server:21.3.2.5"
  zookeeper_image: "bitnami/zookeeper:3.6.1"

zookeeper_servers:
  - host: 192.168.33.101
  - host: 192.168.33.102
  - host: 192.168.33.103

clickhouse_servers:
  - host: 192.168.33.101
  - host: 192.168.33.102
  - host: 192.168.33.103

clickhouse_topology:
  - clusters:
      - name: "novakwok_cluster"
        shards:
          - name: "novakwok_shard"
            servers:
              - host: 192.168.33.101
              - host: 192.168.33.102
              - host: 192.168.33.103

Generate Config

After python3 generate.py, a structure has been generated under cluster directory, looks like this:

➜  simple-multinode-clickhouse-cluster git:(master) ✗ python3 generate.py 
Write clickhouse-config.xml to cluster/192.168.33.101/clickhouse-config.xml
Write clickhouse-config.xml to cluster/192.168.33.102/clickhouse-config.xml
Write clickhouse-config.xml to cluster/192.168.33.103/clickhouse-config.xml

➜  simple-multinode-clickhouse-cluster git:(master) ✗ tree cluster 
cluster
├── 192.168.33.101
│   ├── clickhouse-config.xml
│   ├── clickhouse-user-config.xml
│   └── docker-compose.yml
├── 192.168.33.102
│   ├── clickhouse-config.xml
│   ├── clickhouse-user-config.xml
│   └── docker-compose.yml
└── 192.168.33.103
    ├── clickhouse-config.xml
    ├── clickhouse-user-config.xml
    └── docker-compose.yml

3 directories, 9 files

Sync Config

Now we need to sync those files to related hosts(of course you can use ansible here):

rsync -aP ./cluster/192.168.33.101/ [email protected]:/root/ch/
rsync -aP ./cluster/192.168.33.102/ [email protected]:/root/ch/
rsync -aP ./cluster/192.168.33.103/ [email protected]:/root/ch/

Start Cluster

Now run docker-compose up -d on every hosts' /root/ch/ directory.

Validation

On 192.168.33.101, use clickhouse-client to connect to local instance and check if cluster is there.

root@192-168-33-101:~/ch# clickhouse-client 
ClickHouse client version 18.16.1.
Connecting to localhost:9000.
Connected to ClickHouse server version 21.3.2 revision 54447.

192-168-33-101 :) SELECT * FROM system.clusters;

SELECT *
FROM system.clusters 

┌─cluster──────────────────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name──────┬─host_address───┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─estimated_recovery_time─┐
│ novakwok_cluster                             │         1 │            1 │           1 │ 192.168.33.101 │ 192.168.33.101 │ 9000 │        1 │ default │                  │            0 │                       0 │
│ novakwok_cluster                             │         1 │            1 │           2 │ 192.168.33.102 │ 192.168.33.102 │ 9000 │        0 │ default │                  │            0 │                       0 │
│ novakwok_cluster                             │         1 │            1 │           3 │ 192.168.33.103 │ 192.168.33.103 │ 9000 │        0 │ default │                  │            0 │                       0 │
│ test_cluster_two_shards                      │         1 │            1 │           1 │ 127.0.0.1      │ 127.0.0.1      │ 9000 │        1 │ default │                  │            0 │                       0 │
│ test_cluster_two_shards                      │         2 │            1 │           1 │ 127.0.0.2      │ 127.0.0.2      │ 9000 │        0 │ default │                  │            0 │                       0 │
│ test_cluster_two_shards_internal_replication │         1 │            1 │           1 │ 127.0.0.1      │ 127.0.0.1      │ 9000 │        1 │ default │                  │            0 │                       0 │
│ test_cluster_two_shards_internal_replication │         2 │            1 │           1 │ 127.0.0.2      │ 127.0.0.2      │ 9000 │        0 │ default │                  │            0 │                       0 │
│ test_cluster_two_shards_localhost            │         1 │            1 │           1 │ localhost      │ 127.0.0.1      │ 9000 │        1 │ default │                  │            0 │                       0 │
│ test_cluster_two_shards_localhost            │         2 │            1 │           1 │ localhost      │ 127.0.0.1      │ 9000 │        1 │ default │                  │            0 │                       0 │
│ test_shard_localhost                         │         1 │            1 │           1 │ localhost      │ 127.0.0.1      │ 9000 │        1 │ default │                  │            0 │                       0 │
│ test_shard_localhost_secure                  │         1 │            1 │           1 │ localhost      │ 127.0.0.1      │ 9440 │        0 │ default │                  │            0 │                       0 │
│ test_unavailable_shard                       │         1 │            1 │           1 │ localhost      │ 127.0.0.1      │ 9000 │        1 │ default │                  │            0 │                       0 │
│ test_unavailable_shard                       │         2 │            1 │           1 │ localhost      │ 127.0.0.1      │    1 │        0 │ default │                  │            0 │                       0 │
└──────────────────────────────────────────────┴───────────┴──────────────┴─────────────┴────────────────┴────────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────────────┘
↘ Progress: 13.00 rows, 1.58 KB (4.39 thousand rows/s., 532.47 KB/s.) 
13 rows in set. Elapsed: 0.003 sec. 

Let's create a DB with replica:

192-168-33-101 :) create database novakwok_test on cluster novakwok_cluster;

CREATE DATABASE novakwok_test ON CLUSTER novakwok_cluster

┌─host───────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ 192.168.33.103 │ 9000 │      0 │       │                   2 │                0 │
│ 192.168.33.101 │ 9000 │      0 │       │                   1 │                0 │
│ 192.168.33.102 │ 9000 │      0 │       │                   0 │                0 │
└────────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
← Progress: 3.00 rows, 174.00 B (16.07 rows/s., 931.99 B/s.)  99%
3 rows in set. Elapsed: 0.187 sec. 

192-168-33-101 :) show databases;

SHOW DATABASES

┌─name──────────┐
│ default       │
│ novakwok_test │
│ system        │
└───────────────┘
↑ Progress: 3.00 rows, 479.00 B (855.61 rows/s., 136.61 KB/s.) 
3 rows in set. Elapsed: 0.004 sec. 

Connect to another host to see if it's really working.

root@192-168-33-101:~/ch# clickhouse-client -h 192.168.33.102
ClickHouse client version 18.16.1.
Connecting to 192.168.33.102:9000.
Connected to ClickHouse server version 21.3.2 revision 54447.

192-168-33-102 :) show databases;

SHOW DATABASES

┌─name──────────┐
│ default       │
│ novakwok_test │
│ system        │
└───────────────┘
↘ Progress: 3.00 rows, 479.00 B (623.17 rows/s., 99.50 KB/s.) 
3 rows in set. Elapsed: 0.005 sec. 

License

GPL

You might also like...
DAMPP (gui) is a Python based program to run simple webservers using MySQL, Php, Apache and PhpMyAdmin inside of Docker containers.

DAMPP (gui) is a Python based program to run simple webservers using MySQL, Php, Apache and PhpMyAdmin inside of Docker containers.

Visual disk-usage analyser for docker images
Visual disk-usage analyser for docker images

whaler What? A command-line tool for visually investigating the disk usage of docker images Why? Large images are slow to move and expensive to store.

A Python library for the Docker Engine API

Docker SDK for Python A Python library for the Docker Engine API. It lets you do anything the docker command does, but from within Python apps – run c

Docker Container wallstreetbets-sentiment-analysis

Docker Container wallstreetbets-sentiment-analysis A docker container using restful endpoints exposed on port 5000 "/analyze" to gather sentiment anal

Hackergame nc 类题目的 Docker 容器资源限制、动态 flag、网页终端

Hackergame nc 类题目的 Docker 容器资源限制、动态 flag、网页终端 快速入门 配置证书 证书用于验证用户 Token。请确保这里的证书文件(cert.pem)与 Hackergame 平台 配置的证书相同,这样 Hackergame 平台为每个用户生成的 Token 才可以通

docker-compose工程部署时的辅助脚本

okta-cmd Introduction docker-compose 辅助脚本

Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions
Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions

Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions

🐳 Docker templates for various languages.
🐳 Docker templates for various languages.

Docker Deployment Templates One Stop repository for Docker Compose and Docker Templates for Deployment. Features Python (FastAPI, Flask) Screenshots D

Push Container Image To Docker Registry In Python

push-container-image-to-docker-registry 概要 push-container-image-to-docker-registry は、エッジコンピューティング環境において、特定のエッジ端末上の Private Docker Registry に特定のコンテナイメー

Owner
Nova Kwok
43EC 6073 0BFF A16C 34BB 9EF2 8D42 A0E6 99E5 0639
Nova Kwok
Checkmk kube agent - Checkmk Kubernetes Cluster and Node Collectors

Checkmk Kubernetes Cluster and Node Collectors Checkmk cluster and node collecto

tribe29 GmbH 15 Dec 26, 2022
Build and Push docker image in Python (luigi + docker-py)

Docker build images workflow in Python Since docker hub stopped building images for free accounts, I've been looking for another way to do it. I could

Fabien D. 2 Dec 15, 2022
Python IMDB Docker - A docker tutorial to containerize a python script.

Python_IMDB_Docker A docker tutorial to containerize a python script. Build the docker in the current directory: docker build -t python-imdb . Run the

Sarthak Babbar 1 Dec 30, 2021
Define and run multi-container applications with Docker

Docker Compose Docker Compose is a tool for running multi-container applications on Docker defined using the Compose file format. A Compose file is us

Docker 28.2k Jan 8, 2023
Ganeti is a virtual machine cluster management tool built on top of existing virtualization technologies such as Xen or KVM and other open source software.

Ganeti 3.0 =========== For installation instructions, read the INSTALL and the doc/install.rst files. For a brief introduction, read the ganeti(7) m

null 395 Jan 4, 2023
sysctl/sysfs settings on a fly for Kubernetes Cluster. No restarts are required for clusters and nodes.

SysBindings Daemon Little toolkit for control the sysctl/sysfs bindings on Kubernetes Cluster on the fly and without unnecessary restarts of cluster o

Wallarm 19 May 6, 2022
MLops tools review for execution on multiple cluster types: slurm, kubernetes, dask...

MLops tools review focused on execution using multiple cluster types: slurm, kubernetes, dask...

null 4 Nov 30, 2022
Play Wordle from any Kubernetes cluster.

wordle-operator ?? ⬛ ?? ?? ⬛ Play Wordle from any Kubernetes cluster. Using the power of CustomResourceDefinitions and Kubernetes Operators, now you c

Lucas Melin 1 Jan 15, 2022
Helperpod - A CLI tool to run a Kubernetes utility pod with pre-installed tools that can be used for debugging/testing purposes inside a Kubernetes cluster

Helperpod is a CLI tool to run a Kubernetes utility pod with pre-installed tools that can be used for debugging/testing purposes inside a Kubernetes cluster.

Atakan Tatlı 2 Feb 5, 2022