Provisioning a Galera Cluster on Ubuntu 18.04

So, we want to bring up a Galera cluster, and do some basic testing of how to bring it back online should things go pear-shaped.

First, install MariaDB on all three nodes:

# apt-get install software-properties-common
# apt-key adv --recv-keys --keyserver hkp://keyserver.ubuntu.com:80 0xF1656F24C74CD1D8
# add-apt-repository "deb [arch=amd64,arm64,ppc64el] http://mariadb.mirror.liquidtelecom.com/repo/10.4/ubuntu $(lsb_release -cs) main"
# apt update
# apt -y install mariadb-server mariadb-client
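Before going further, it's worth a quick sanity check that the packages came from the MariaDB 10.4 repo rather than the stock Ubuntu archive (exact version strings will vary with the current point release):

# apt-cache policy mariadb-server
# mysql --version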

At this point MariaDB is installed, but has no root password configured.

# mysql_secure_installation
 NOTE: RUNNING ALL PARTS OF THIS SCRIPT IS RECOMMENDED FOR ALL MariaDB
       SERVERS IN PRODUCTION USE!  PLEASE READ EACH STEP CAREFULLY!
 In order to log into MariaDB to secure it, we'll need the current
 password for the root user. If you've just installed MariaDB, and
 haven't set the root password yet, you should just press enter here.
 Enter current password for root (enter for none): 
 OK, successfully used password, moving on…
 Setting the root password or using the unix_socket ensures that nobody
 can log into the MariaDB root user without the proper authorisation.
 You already have your root account protected, so you can safely answer 'n'.
 Switch to unix_socket authentication [Y/n] 
 Enabled successfully!
 Reloading privilege tables..
  … Success!
 You already have your root account protected, so you can safely answer 'n'.
 Change the root password? [Y/n] y
 New password: 
 Re-enter new password: 
 Password updated successfully!
 Reloading privilege tables..
  … Success!
 By default, a MariaDB installation has an anonymous user, allowing anyone
 to log into MariaDB without having to have a user account created for
 them.  This is intended only for testing, and to make the installation
 go a bit smoother.  You should remove them before moving into a
 production environment.
 Remove anonymous users? [Y/n] y
  … Success!
 Normally, root should only be allowed to connect from 'localhost'.  This
 ensures that someone cannot guess at the root password from the network.
 Disallow root login remotely? [Y/n] y
  … Success!
 By default, MariaDB comes with a database named 'test' that anyone can
 access.  This is also intended only for testing, and should be removed
 before moving into a production environment.
 Remove test database and access to it? [Y/n] y
 Dropping test database…
 … Success!
 Removing privileges on test database…
 … Success! 
 Reloading the privilege tables will ensure that all changes made so far
 will take effect immediately.
 Reload privilege tables now? [Y/n] y
  … Success!
 Cleaning up…
 All done!  If you've completed all of the above steps, your MariaDB
 installation should now be secure.
 Thanks for using MariaDB!

Do this on all three servers, and you’re now ready to configure the Galera cluster portion! On each node, create a /etc/mysql/mariadb.conf.d/galera.cnf file:

# cat /etc/mysql/mariadb.conf.d/galera.cnf
[mysqld]
character-set-server = utf8
character_set_server = utf8
bind-address=0.0.0.0
port=3306
default_storage_engine=InnoDB
binlog_format=row
innodb_autoinc_lock_mode=2
# Galera cluster configuration
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://192.168.1.201,192.168.1.202,192.168.1.203"
wsrep_cluster_name="galera-cluster-1"
wsrep_sst_method=rsync
# Cluster node configuration
wsrep_node_address="192.168.1.201"
wsrep_node_name="galera-host-01"

The only difference between this file on each node is the last two lines – wsrep_node_address and wsrep_node_name. Your wsrep_cluster_address line lists the IPs of all of your cluster nodes. Note: unless you have reliable DNS configured internally, you’ll probably want hosts file entries on your nodes mapping IPs to names and vice versa, as name resolution affects your cluster status displays.
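For example, each node’s /etc/hosts could carry entries for the addresses used in the config above:

192.168.1.201 galera-host-01
192.168.1.202 galera-host-02
192.168.1.203 galera-host-03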

Stop MariaDB on all nodes, and bootstrap Galera on the first node:

node2# systemctl stop mariadb
node3# systemctl stop mariadb
node1# systemctl stop mariadb
node1# galera_new_cluster

Your cluster should now have started. Let’s check the cluster state:

root@galera-host-01:~# mysql -e "show status like 'wsrep_%'"
 
+-------------------------------+-------------------------------------------------+
| Variable_name                 | Value                                           |
+-------------------------------+-------------------------------------------------+
| wsrep_local_state_uuid        | fd6dbdcc-c95e-11e9-ac52-570534ceb766            |
| wsrep_protocol_version        | 10                                              |
| wsrep_last_committed          | 1                                               |
| wsrep_replicated              | 0                                               |
| wsrep_replicated_bytes        | 0                                               |
| wsrep_repl_keys               | 0                                               |
| wsrep_repl_keys_bytes         | 0                                               |
| wsrep_repl_data_bytes         | 0                                               |
| wsrep_repl_other_bytes        | 0                                               |
| wsrep_received                | 2                                               |
| wsrep_received_bytes          | 144                                             |
| wsrep_local_commits           | 0                                               |
| wsrep_local_cert_failures     | 0                                               |
| wsrep_local_replays           | 0                                               |
| wsrep_local_send_queue        | 0                                               |
| wsrep_local_send_queue_max    | 1                                               |
| wsrep_local_send_queue_min    | 0                                               |
| wsrep_local_send_queue_avg    | 0                                               |
| wsrep_local_recv_queue        | 0                                               |
| wsrep_local_recv_queue_max    | 1                                               |
| wsrep_local_recv_queue_min    | 0                                               |
| wsrep_local_recv_queue_avg    | 0                                               |
| wsrep_local_cached_downto     | 1                                               |
| wsrep_flow_control_paused_ns  | 0                                               |
| wsrep_flow_control_paused     | 0                                               |
| wsrep_flow_control_sent       | 0                                               |
| wsrep_flow_control_recv       | 0                                               |
| wsrep_cert_deps_distance      | 0                                               |
| wsrep_apply_oooe              | 0                                               |
| wsrep_apply_oool              | 0                                               |
| wsrep_apply_window            | 0                                               |
| wsrep_commit_oooe             | 0                                               |
| wsrep_commit_oool             | 0                                               |
| wsrep_commit_window           | 0                                               |
| wsrep_local_state             | 4                                               |
| wsrep_local_state_comment     | Synced                                          |
| wsrep_cert_index_size         | 0                                               |
| wsrep_causal_reads            | 0                                               |
| wsrep_cert_interval           | 0                                               |
| wsrep_open_transactions       | 0                                               |
| wsrep_open_connections        | 0                                               |
| wsrep_incoming_addresses      | AUTO                                            |
| wsrep_cluster_weight          | 1                                               |
| wsrep_desync_count            | 0                                               |
| wsrep_evs_delayed             |                                                 |
| wsrep_evs_evict_list          |                                                 |
| wsrep_evs_repl_latency        | 0/0/0/0/0                                       |
| wsrep_evs_state               | OPERATIONAL                                     |
| wsrep_gcomm_uuid              | fd6cf1bd-c95e-11e9-98ab-d2e5733d21d0            |
| wsrep_applier_thread_count    | 1                                               |
| wsrep_cluster_capabilities    |                                                 |
| wsrep_cluster_conf_id         | 18446744073709551615                            |
| wsrep_cluster_size            | 1                                               |
| wsrep_cluster_state_uuid      | fd6dbdcc-c95e-11e9-ac52-570534ceb766            |
| wsrep_cluster_status          | Primary                                         |
| wsrep_connected               | ON                                              |
| wsrep_local_bf_aborts         | 0                                               |
| wsrep_local_index             | 0                                               |
| wsrep_provider_capabilities   | :MULTI_MASTER:CERTIFICATION:PARALLEL_APPLYING:T |
| wsrep_provider_name           | Galera                                          |
| wsrep_provider_vendor         | Codership Oy <info@codership.com>               |
| wsrep_provider_version        | 26.4.2(r4498)                                   |
| wsrep_ready                   | ON                                              |
| wsrep_rollbacker_thread_count | 2                                               |
| wsrep_thread_count            | 3                                               |
+-------------------------------+-------------------------------------------------+

Check the cluster size to make sure the cluster came up – it should be a cluster of one right now

root@galera-host-01:~# mysql -e "show status like 'wsrep_cluster_size'"
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 1     |
+--------------------+-------+

This looks good – time to boot the second server

galera-host-02# systemctl start mariadb

Check the cluster size again – it should now be 2

root@galera-host-01:~# mysql -u root -p -e "show status like 'wsrep_cluster_size'"
Enter password: 
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 2     |
+--------------------+-------+

Start node3, and you’ll see the cluster size go to 3. Nice.
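For completeness, that’s the same dance again on the third node:

galera-host-03# systemctl start mariadb

root@galera-host-01:~# mysql -e "show status like 'wsrep_cluster_size'"
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+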

Cluster monitoring for HAProxy

Having a cluster is no good if your client machines aren’t load balancing across the nodes, talking only to ‘up’ servers.

For this you need a way for HAProxy to know the state of each server. I like the ‘clustercheck’ script from obissick. (https://github.com/obissick/Galera-ClusterCheck)

First, we add a cluster check user to MariaDB.

CREATE USER 'clustercheckuser'@'localhost' IDENTIFIED BY 'h3fUU3373Pb17Vjt&^C39hFHelA';
GRANT PROCESS ON *.* TO 'clustercheckuser'@'localhost';
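You can verify the grant works before going further – this is much the same query the check script runs (the single quotes keep the shell away from the & and ^ in the password):

# mysql -u clustercheckuser -p'h3fUU3373Pb17Vjt&^C39hFHelA' -e "show status like 'wsrep_local_state_comment'"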

Then install the clustercheck script, and edit it:

curl https://raw.githubusercontent.com/obissick/Galera-ClusterCheck/master/clustercheck.sh > /usr/bin/clustercheck
chmod +x /usr/bin/clustercheck
vi /usr/bin/clustercheck
Edit the MYSQL_PASSWORD line to set the password you chose above, i.e.
MYSQL_PASSWORD="${2-h3fUU3373Pb17Vjt&^C39hFHelA}"
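Before wiring it into xinetd, you can run the script by hand to check the credentials work. It speaks HTTP, so expect a couple of header lines ahead of the status text:

# /usr/bin/clustercheck
HTTP/1.1 200 OK
Content-Type: text/plain

Galera Cluster Node is synced.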
apt-get install -y xinetd
echo "mysqlchk 9200/tcp #mysql check script" >> /etc/services
cat > /etc/xinetd.d/mysqlchk << __END__
# default: on
# description: mysqlchk
service mysqlchk
{
        disable         = no
        flags           = REUSE
        socket_type     = stream
        port            = 9200
        wait            = no
        user            = nobody
        server          = /usr/bin/clustercheck
        log_on_failure  += USERID
        only_from       = 192.168.1.0/24
        per_source      = UNLIMITED
}
__END__
service xinetd restart

Note: you will want to change the only_from to match any IPs which will be running HAProxy. Deploy the above to all members of your Galera cluster.

Now, on each client machine, we can configure a block in haproxy similar to the below:

frontend mysql-dev-front
    bind 127.0.0.1:3307
    mode tcp
    default_backend mysql-dev-back

backend mysql-dev-back
    mode tcp
    balance leastconn
    option tcpka
    option httpchk
    default-server port 9200 inter 2s downinter 5s rise 3 fall 2 slowstart 60s weight 100
    server node1 192.168.1.201:3306 check
    server node2 192.168.1.202:3306 check
    server node3 192.168.1.203:3306 check

NOTE: Make sure that you can connect to port 9200 on all of your MySQL servers from your client server BEFORE enabling this config in HAProxy!

curl http://192.168.1.201:9200
Galera Cluster Node is synced.
curl: (56) Recv failure: Connection reset by peer

(Don’t worry about the Recv failure – the check script closes the connection as soon as it has sent its response, and curl grumbles about it. HAProxy’s httpchk only cares about the HTTP status code it gets back: 200 when the node is synced, 503 when it isn’t.)

Once this is complete, any applications on your client machine can connect to localhost:3307.
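As a quick test from the client (swap ‘someuser’ – a placeholder here – for any account allowed in through the proxy), you can ask the proxied port which backend you landed on:

client# mysql -h 127.0.0.1 -P 3307 -u someuser -p -e "select @@hostname;"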

Failure Modes

Ok, so this is one of the most important things we need to think about. There are a few failure modes we need to be able to handle in the cluster.

One node is shut down

Let’s insert some rows into a test database, and then shut down node 2.

root@galera-host-01:~# mysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 50
Server version: 10.4.7-MariaDB-1:10.4.7+maria~bionic-log mariadb.org binary distribution
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> create database ninjas;
Query OK, 1 row affected (0.006 sec)
MariaDB [(none)]> use ninjas;
Database changed
MariaDB [ninjas]> create table table1 (row1 integer not null);
Query OK, 0 rows affected (0.013 sec)
MariaDB [ninjas]> insert into table1 values (1), (2), (3), (4), (5), (6);
Query OK, 6 rows affected (0.006 sec)
Records: 6  Duplicates: 0  Warnings: 0

root@galera-host-02:~# mysql ninjas -e "select * from table1;"
 +------+
 | row1 |
 +------+
 |    1 |
 |    2 |
 |    3 |
 |    4 |
 |    5 |
 |    6 |
 +------+
root@galera-host-02:~# systemctl stop mariadb
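While host-02 is down, the surviving nodes should still report a Primary cluster, just one member smaller:

root@galera-host-01:~# mysql -e "show status like 'wsrep_cluster_size'"
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 2     |
+--------------------+-------+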

root@galera-host-01:~# mysql ninjas -e "insert into table1 values (7), (8), (9);"

Ok, so we now have data in the DB from after host-02 was shut down. We’re now going to bring host-02 back up, and check that it comes back into the cluster cleanly.

root@galera-host-02:~# systemctl start mariadb
root@galera-host-02:~# mysql ninjas -e "select * from table1;"
+------+
| row1 |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
+------+

Perfect.

All the nodes are shut down, and re-started in the correct order

How about when we have to shut down the whole cluster, and bring it back online? Let’s shut down the nodes – first in reverse order (03, 02, 01) – then bring them up as 01, 02, 03. (Yes, that’s the way you’re meant to do it. We’re going to do it wrong soon, and see how to recover from that.)

root@galera-host-03:~# systemctl stop mariadb
root@galera-host-02:~# systemctl stop mariadb
root@galera-host-01:~# systemctl stop mariadb

If we have a look in /var/lib/mysql/grastate.dat on each host, we’ll see that galera-host-01 is indeed the host we should be booting the cluster from:

root@galera-host-01:~# cat /var/lib/mysql/grastate.dat 
GALERA saved state
version: 2.1
uuid:    a10015aa-cd62-11e9-a80b-87dacc3a89c3
seqno:   11
safe_to_bootstrap: 1

root@galera-host-02:~# cat /var/lib/mysql/grastate.dat 
GALERA saved state
version: 2.1
uuid:    a10015aa-cd62-11e9-a80b-87dacc3a89c3
seqno:   10
safe_to_bootstrap: 0

root@galera-host-03:~# cat /var/lib/mysql/grastate.dat 
GALERA saved state
version: 2.1
uuid:    a10015aa-cd62-11e9-a80b-87dacc3a89c3
seqno:   9
safe_to_bootstrap: 0

Ok, so we should simply be able to boot the cluster by running galera_new_cluster on galera-host-01, and then starting the other hosts

root@galera-host-01:~# galera_new_cluster
root@galera-host-01:~# mysql -e "show status like 'wsrep_cluster_size'"
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 1     |
+--------------------+-------+

root@galera-host-02:~# systemctl start mariadb

root@galera-host-01:~# mysql -e "show status like 'wsrep_cluster_size'"
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 2     |
+--------------------+-------+

root@galera-host-03:~# systemctl start mariadb

root@galera-host-01:~# mysql -e "show status like 'wsrep_cluster_size'"
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+

Restarting the cluster in the wrong order

Ok, now let’s shut it down with host 01 first, and then boot it from host 01 first. So, we’re going to shut down the nodes, and then check our Galera replication state files:

root@galera-host-01:~# systemctl stop mariadb
root@galera-host-02:~# systemctl stop mariadb
root@galera-host-03:~# systemctl stop mariadb

root@galera-host-01:~# cat /var/lib/mysql/grastate.dat 
GALERA saved state
version: 2.1
uuid:    a10015aa-cd62-11e9-a80b-87dacc3a89c3
seqno:   15
safe_to_bootstrap: 0

root@galera-host-02:~# cat /var/lib/mysql/grastate.dat 
GALERA saved state
version: 2.1
uuid:    a10015aa-cd62-11e9-a80b-87dacc3a89c3
seqno:   16
safe_to_bootstrap: 0

root@galera-host-03:~# cat /var/lib/mysql/grastate.dat 
GALERA saved state
version: 2.1
uuid:    a10015aa-cd62-11e9-a80b-87dacc3a89c3
seqno:   17
safe_to_bootstrap: 1

Yep, so host-03 would be the correct host to start. But we’re going to start host-01.

root@galera-host-01:~# galera_new_cluster 
 Job for mariadb.service failed because the control process exited with error code.
 See "systemctl status mariadb.service" and "journalctl -xe" for details.

Well, that’s promising. It won’t let us do it. If we check the log, we see:

2019-09-02  9:58:05 0 [Note] WSREP: Start replication
2019-09-02  9:58:05 0 [Note] WSREP: Connecting with bootstrap option: 1
2019-09-02  9:58:05 0 [Note] WSREP: Setting GCS initial position to a10015aa-cd62-11e9-a80b-87dacc3a89c3:15
2019-09-02  9:58:05 0 [ERROR] WSREP: It may not be safe to bootstrap the cluster from this node. It was not the last one to leave the cluster and may not contain all the updates. To force cluster bootstrap with this node, edit the grastate.dat file manually and set safe_to_bootstrap to 1 .
2019-09-02  9:58:05 0 [ERROR] WSREP: wsrep::connect(gcomm://192.168.1.201,192.168.1.202,192.168.1.203) failed: 7
2019-09-02  9:58:05 0 [ERROR] Aborting

Ok, let’s make it really bad – let’s edit grastate.dat on host-01 and set safe_to_bootstrap to 1 😀
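A quick way to flip the flag (editing the file by hand works just as well):

root@galera-host-01:~# sed -i 's/^safe_to_bootstrap: 0/safe_to_bootstrap: 1/' /var/lib/mysql/grastate.dat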

root@galera-host-01:~# galera_new_cluster 
root@galera-host-01:~# mysql -e "show status like 'wsrep_cluster_size'"
 +--------------------+-------+
 | Variable_name      | Value |
 +--------------------+-------+
 | wsrep_cluster_size | 1     |
 +--------------------+-------+

Ok, promising. Let’s start host-02.

root@galera-host-02:~# systemctl start mariadb
root@galera-host-01:~# mysql -e "show status like 'wsrep_cluster_size'"
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 2     |
+--------------------+-------+

Well, that’s beautiful! Ok, let’s see what happens when we start host-03, which still has safe_to_bootstrap: 1 in its grastate file.

root@galera-host-03:~# systemctl start mariadb
root@galera-host-01:~# mysql -e "show status like 'wsrep_cluster_size'"
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+

Well, colour me impressed! Are they REALLY in sync?

root@galera-host-01:~# mysql ninjas -e "insert into table1 values (14), (15), (16);"

root@galera-host-02:~# mysql ninjas -e "select * from table1;"
+------+
| row1 |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   14 |
|   15 |
|   16 |
+------+

root@galera-host-03:~# mysql ninjas -e "select * from table1;"
+------+
| row1 |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   14 |
|   15 |
|   16 |
+------+

Well bugger me, that looks good!

Rebuilding a failed node

SHOULD you happen to have a node get completely buggered (say you have enough data that the systemd 90 second startup timeout screws you during SST and leaves your node broken), you may need to do a clean reinstall of MariaDB to get Galera going again.

galera-host-02# apt-get purge mariadb-server-10.4
 Reading package lists… Done
 Building dependency tree       
 Reading state information… Done
 The following packages were automatically installed and are no longer required:
   galera-4 libaio1 libcgi-fast-perl libcgi-pm-perl libfcgi-perl libhtml-template-perl mariadb-server-core-10.4 socat
 Use 'apt autoremove' to remove them.
 The following packages will be REMOVED:
   mariadb-server* mariadb-server-10.4*
 0 upgraded, 0 newly installed, 2 to remove and 0 not upgraded.
 After this operation, 77.7 MB disk space will be freed.
 Do you want to continue? [Y/n] y
 (Reading database … 68199 files and directories currently installed.)
 Removing mariadb-server (1:10.4.7+maria~bionic) …
 Removing mariadb-server-10.4 (1:10.4.7+maria~bionic) …
 Processing triggers for man-db (2.8.3-2ubuntu0.1) …
 (Reading database … 68069 files and directories currently installed.)
 Purging configuration files for mariadb-server-10.4 (1:10.4.7+maria~bionic) …

galera-host-02:~# rm -rf /var/lib/mysql/*
galera-host-02:~# mv /etc/mysql/mariadb.conf.d/galera.cnf /root/galera.cnf

galera-host-02:~# apt -y install mariadb-server mariadb-client
 Reading package lists… Done
 Building dependency tree       
 Reading state information… Done
 mariadb-client is already the newest version (1:10.4.7+maria~bionic).
 Suggested packages:
   mailx mariadb-test tinyca
 The following NEW packages will be installed:
   mariadb-server mariadb-server-10.4
 0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
 Need to get 4,627 kB of archives.
 After this operation, 77.7 MB of additional disk space will be used.
 Get:1 http://mariadb.mirror.liquidtelecom.com/repo/10.4/ubuntu bionic/main amd64 mariadb-server-10.4 amd64 1:10.4.7+maria~bionic [4,624 kB]
 Get:2 http://mariadb.mirror.liquidtelecom.com/repo/10.4/ubuntu bionic/main amd64 mariadb-server all 1:10.4.7+maria~bionic [3,180 B]
 Fetched 4,627 kB in 6s (824 kB/s)          
 Preconfiguring packages …
 Selecting previously unselected package mariadb-server-10.4.
 (Reading database … 68059 files and directories currently installed.)
 Preparing to unpack …/mariadb-server-10.4_1%3a10.4.7+maria~bionic_amd64.deb …
 Unpacking mariadb-server-10.4 (1:10.4.7+maria~bionic) …
 Selecting previously unselected package mariadb-server.
 Preparing to unpack …/mariadb-server_1%3a10.4.7+maria~bionic_all.deb …
 Unpacking mariadb-server (1:10.4.7+maria~bionic) …
 Setting up mariadb-server-10.4 (1:10.4.7+maria~bionic) …
 Failed to stop mysql.service: Unit mysql.service not loaded.
 Created symlink /etc/systemd/system/mysql.service → /lib/systemd/system/mariadb.service.
 Created symlink /etc/systemd/system/mysqld.service → /lib/systemd/system/mariadb.service.
 Created symlink /etc/systemd/system/multi-user.target.wants/mariadb.service → /lib/systemd/system/mariadb.service.
 Setting up mariadb-server (1:10.4.7+maria~bionic) …
 Processing triggers for man-db (2.8.3-2ubuntu0.1) …

galera-host-02:~# mysql_secure_installation
......
<snipped>

galera-host-02:~# systemctl stop mariadb
galera-host-02:~# cp /root/galera.cnf /etc/mysql/mariadb.conf.d/
galera-host-02:~# mkdir -p /etc/systemd/system/mariadb.service.d
galera-host-02:~# printf '[Service]\nTimeoutStartSec=infinity\n' > /etc/systemd/system/mariadb.service.d/timeout.conf
galera-host-02:~# systemctl daemon-reload
galera-host-02:~# systemctl start mariadb

(The drop-in is what removes the startup timeout, so a long SST can finish. Appending TimeoutSec directly to the end of the unit file lands outside the [Service] section and gets ignored.)
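If the SST completes, host-02 should rejoin, and the cluster should be back at full strength:

root@galera-host-01:~# mysql -e "show status like 'wsrep_cluster_size'"
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+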

And bingo, we’re back in the cluster 🙂