Ubuntu 12.04 / LTS : 3 Node MariaDB Cluster + XtraBackup + GLB Load Balancer ( aka MySQL + Galera )

Assumptions:

Completed Server Preparation: KVM / VMware / Standalone Ubuntu 12.04 / 12.10 Server Preparation or Proxmox / OpenVZ Container / Ubuntu 12.04 /12.10 Server Preparation

MariaDB CLluster provides the following features
Synchronous replication
Active/active multi-master topology
Read and write to any cluster node
Automatic membership control, failed nodes drop from the cluster
Automatic node joining
True parallel row level replication
Direct client connections
Drop-in replacement for native MySQL
All the MariaDB benefits

Benefits using Galera Replication
High Availability
No slave lag
No lost transactions
No more data inconsistency
Smaller client latencies
Read scalability and write throughput improvement (3 times and more, depending on your workload

#Hostnames
dbnode1 11.11.11.11
dbnode2 22.22.22.22
dbnode3 33.33.33.33

#Ports
TCP 22 : SSH

#MariaDB Cluster Ports
TCP 3306 : Regular MySQL
TCP 4567 : Group Communication (wsrep_provider gmcast.listen_addr)
TCP 4568 : Incremental State Transfer (wsrep_provider ist.recv_addr)
TCP 4444 : State Transfer (wsrep_sst_receive_address)

Set the timezone

Set the system timezone to UTC

Remove conflicting solutions and configs

**all nodes**

Setup Hostnames and SSH Passwordless logins between the nodes

Generate the rsa key **all nodes**

Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:

Optional: Add hosts to /etc/hosts.allow **all nodes**

Node1

Node2

Node3

Install the latest MariaDB Cluster

Install the latest mariadb ( mysql )

**all nodes**
add the repository

Import the GP key **all nodes**

Apt-Pinning the packages **all nodes**

Install MariaDB **all nodes**

**SET A MARIADB ROOT PASSWORD, SAME ON ALL NODES**

Initial MariaDB COnfiguration

Shutdown all instances of mariadb **all nodes**

Node1

Replace dbclusterroot with a random username
Replace securepassword with a random password

Configure MariaDB

Add missing my.cnf options **all nodes**

Create the cluster root sql user **all nodes**

Replace securepassword with a random password
*avoid all special characters*

Default is 4GB ram
**optional: gcache for 512MB ram **all nodes**

**optional: gcache for 1024MB ram **all nodes**

**optional: gcache for 2GB ram **all nodes**

**optional: Disable slow query logging **all nodes**

Node1

Node2

Node3

Bugfix: Access denied for user ‘debian-sys-maint’@’localhost’

[Warning] Access denied for user ‘debian-sys-maint’@’localhost’ (using password: YES)
Node1

Optional: Install the latest xtrabackup ( wsrep_sst_method=xtrabackup ) to avoid read-only Donor during SST

The main advantage of using xtrabackup to synchronize the nodes, is that the Donor is writeable during the synchronization process.
add the repository **all nodes**

Import the GP key **all nodes**

Install Xtrabackup and Percona-toolkit **all nodes**
Note: install netcat-openbsd fixes the error: nc: invalid option — ‘d’

Set the wsrep_sst_method to xtrabackup **all nodes**

Bootstrap the cluster

Stop mysql** all nodes **

Bootstrap node1** node 1 **
Set the wsrep_cluster_address to gcom:// to enable the cluster to be bootstrapped

Start MySQL Server

Check that mysql is online, functioning and to monitor real-time node status

Variable_name Value
wsrep_local_state_uuid 2e1a97ef-d284-11e2-0800-122d8af6b436
wsrep_local_state_comment Synced
wsrep_incoming_addresses 33.33.33.33:3306
wsrep_cluster_size 1
wsrep_cluster_state_uuid 2e1a97ef-d284-11e2-0800-122d8af6b436
wsrep_cluster_status Primary
wsrep_connected ON
wsrep_ready ON

Restore the wsrep_cluster_address to the cluster, to enable node1 to connect to the cluster

Bootstrap node2**node 2 **

Check that mysql is online, functioning and to monitor real-time node status

Variable_name Value
wsrep_local_state_uuid 2e1a97ef-d284-11e2-0800-122d8af6b436
wsrep_local_state_comment Synced
wsrep_incoming_addresses 33.33.33.33:3306,11.11.11.11:3306
wsrep_cluster_size 2
wsrep_cluster_state_uuid 2e1a97ef-d284-11e2-0800-122d8af6b436
wsrep_cluster_status Primary
wsrep_connected ON
wsrep_ready ON

Bootstrap node2**node 3 **

Check that mysql is online, functioning and to monitor real-time node status

Variable_name Value
wsrep_local_state_uuid 2e1a97ef-d284-11e2-0800-122d8af6b436
wsrep_local_state_comment Synced
wsrep_incoming_addresses 33.33.33.33:3306,22.22.22.22:3306,11.11.11.11:3306
wsrep_cluster_size 3
wsrep_cluster_state_uuid 2e1a97ef-d284-11e2-0800-122d8af6b436
wsrep_cluster_status Primary
wsrep_connected ON
wsrep_ready ON

Notes / Bug Fixes: * Starting MariaDB database server mysqld [fail]

I have had issues with large databases and mysql failing to start. This is caused by the start script timing out before rsync is able to complete.
A simple solution to this is to start the failed node manually with the following:
Make sure mysql is not running and force kill any processes

Start mysql manually

Test

Monitor the MariaDB Cluster Status

Variable_name Value
wsrep_local_state_uuid 2e1a97ef-d284-11e2-0800-122d8af6b436
wsrep_protocol_version 4
wsrep_last_committed 44150
wsrep_replicated 20000
wsrep_replicated_bytes 22623578
wsrep_received 4163
wsrep_received_bytes 4863775
wsrep_local_commits 20000
wsrep_local_cert_failures 0
wsrep_local_bf_aborts 0
wsrep_local_replays 0
wsrep_local_send_queue 0
wsrep_local_send_queue_avg 0.000000
wsrep_local_recv_queue 0
wsrep_local_recv_queue_avg 0.000000
wsrep_flow_control_paused 0.000000
wsrep_flow_control_sent 0
wsrep_flow_control_recv 0
wsrep_cert_deps_distance 0.000000
wsrep_apply_oooe 0.000000
wsrep_apply_oool 0.000000
wsrep_apply_window 0.000000
wsrep_commit_oooe 0.000000
wsrep_commit_oool 0.000000
wsrep_commit_window 0.000000
wsrep_local_state 4
wsrep_local_state_comment Synced
wsrep_cert_index_size 0
wsrep_causal_reads 0
wsrep_incoming_addresses 33.33.33.33:3306,22.22.22.22:3306,11.11.11.11:3306
wsrep_cluster_conf_id 22
wsrep_cluster_size 3
wsrep_cluster_state_uuid 2e1a97ef-d284-11e2-0800-122d8af6b436
wsrep_cluster_status Primary
wsrep_connected ON
wsrep_local_index 0
wsrep_provider_name Galera
wsrep_provider_vendor Codership Oy
wsrep_provider_version 23.2.4(r147)
wsrep_ready ON

Variable_name Value
wsrep_local_state_uuid 2e1a97ef-d284-11e2-0800-122d8af6b436
wsrep_local_state_comment Synced
wsrep_incoming_addresses 33.33.33.33:3306,22.22.22.22:3306,11.11.11.11:3306
wsrep_cluster_size 3
wsrep_cluster_state_uuid 2e1a97ef-d284-11e2-0800-122d8af6b436
wsrep_cluster_status Primary
wsrep_connected ON
wsrep_ready ON

Performance Testing

Create the sbtest database with 1 000 000 tables** Node1 **

Replace securepassword with a random password

Run the test on all nodes** All Nodes **

Replace securepassword with a random password

Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct cases)
Using “BEGIN” for starting transactions
Using auto_inc on the id column
Maximum number of requests for OLTP test is limited to 500000
Threads started!
Time limit exceeded, exiting…
(last message repeated 15 times)
Done.

OLTP test statistics:
queries performed:
read: 856198
write: 305785
other: 122314
total: 1284297
transactions: 61157 (203.81 per sec.)
deadlocks: 0 (0.00 per sec.)
read/write requests: 1161983 (3872.47 per sec.)
other operations: 122314 (407.63 per sec.)

Test execution summary:
total time: 300.0627s
total number of events: 61157
total time taken by event execution: 4799.9976
per-request statistics:
min: 55.43ms
avg: 78.49ms
max: 689.76ms
approx. 95 percentile: 98.52ms

Threads fairness:
events (avg/stddev): 3822.3125/6.61
execution time (avg/stddev): 299.9999/0.02

Test Cluster load with Pen Load Balancer

This is just to test load balance performance, I suggest you use either GLB or HAProxy in a production environment
Install Pen** Node 1 **

Run Pen and listen to port 4406** Node 1 **

Execute the Sysbench, using the Pen Load Balancer

Number of threads: 16

Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct cases)
Using “BEGIN” for starting transactions
Using auto_inc on the id column
Maximum number of requests for OLTP test is limited to 500000
Threads started!
Time limit exceeded, exiting…
(last message repeated 15 times)
Done.

OLTP test statistics:
queries performed:
read: 855974
write: 305705
other: 122282
total: 1283961
transactions: 61141 (203.76 per sec.)
deadlocks: 0 (0.00 per sec.)
read/write requests: 1161679 (3871.43 per sec.)
other operations: 122282 (407.52 per sec.)

Test execution summary:
total time: 300.0643s
total number of events: 61141
total time taken by event execution: 4799.9052
per-request statistics:
min: 55.38ms
avg: 78.51ms
max: 605.87ms
approx. 95 percentile: 94.22ms

Threads fairness:
events (avg/stddev): 3821.3125/11.27
execution time (avg/stddev): 299.9941/0.02

Terminate Pen

Optional: Galera Load Balancer ( GLB )

Note: I will only write a guide once the 1.0 version is released.
http://www.codership.com/downloads/glb
http://www.codership.com/files/glb/README-1.0.0rc1

Optional: CSF Firewall

Allow/add the dbnode IPs to csf** all nodes **

==== THE ITEMS BELOW ARE IN PROGRESS :: DO NOT USE =====

Optional: UFW Firewall

Install UFW

Secure SSH for galera/wsrep
Secure MariaDB

Enter current password for root (enter for none): yoursecurepassword
Change the root password? N
Remove anonymous users? Y
Disallow root login remotely? Y
Remove test database and access to it? Y
Reload privilege tables now? Y

Restart MariaDB (mysql server)

Confirm MariaDB is running

tcp 0 0 localhost:mysql *:* LISTEN 11878/mysqld

———–

9 replies
  1. David
    David says:

    Hi,

    Thank you for this amazing run-through on Galera and MariaDB. Great software and a great resource for help.

    Just a quick note about long state-transfers causing the “service mysql start” command to report “fail”…. it’s actually fine to just leave it. The state-transfer is very-much still running in the background and eventually (checking /var/log/syslog) you’ll see that it completes. Once the state-transfer is done, running “service mysql status” reports that it is now operational on the new node… so no need to foreground “mysqld” just to do a state-transfer.

    Hope that helps and thanks again for the write-up. It proved to be an invaluable resource.

    Reply
  2. Marco
    Marco says:

    Hi, this is a great HowTo and everything was going great until the bootstrapping step and I had to restart the service. Here is the error:

    Oct 22 11:58:57 dbprd01 /etc/init.d/mysql[11322]:
    Oct 22 12:00:36 dbprd01 mysqld_safe: Starting mysqld daemon with databases from /var/lib/mysql
    Oct 22 12:00:36 dbprd01 mysqld_safe: WSREP: Running position recovery with –log_error=/tmp/tmp.6Tz5wcupW1 –pid-file=/var/lib/mysql/dbprd01-recover.pid
    Oct 22 12:00:39 dbprd01 mysqld_safe: WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [Note] WSREP: wsrep_start_position var submitted: ‘00000000-0000-0000-0000-000000000000:- 1’
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [Note] WSREP: Read nil XID from storage engines, skipping position init
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [Note] WSREP: wsrep_load(): loading provider library ‘/usr/lib/galera/libgalera_smm.so’
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [Note] WSREP: wsrep_load(): Galera 23.2.7-wheezy(r) by Codership Oy loaded succesfully.
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [Note] WSREP: Reusing existing ‘/var/lib/mysql//galera.cache’.
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [Note] WSREP: Passing config to GCS: base_host = dbprd01.nubi.net; base_port = 4567; cert .log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera. cache; gcache.page_size = 128M; gcache.size = 256M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs .max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_do nor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [Note] WSREP: wsrep_sst_grab()
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [Note] WSREP: Start replication
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [ERROR] WSREP: failed to create gcomm backend connection: 22: Can’t parse port number fro m ‘4564″‘: 22 (Invalid argument)
    Oct 22 12:00:39 dbprd01 mysqld: #011 at galerautils/src/gu_uri.cpp:parse_authority():69
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [ERROR] WSREP: gcs/src/gcs_core.c:gcs_core_open():202: Failed to initialize backend using ‘gcomm://”dbprd01:4564,dbprd02:4564,dbprd03:4564″‘: -22 (Invalid argument)
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [ERROR] WSREP: gcs/src/gcs.c:gcs_open():1289: Failed to open channel ‘”MariaDB_Cluster”‘ at ‘gcomm://”dbprd01:4564,dbprd02:4564,dbprd03:4564″‘: -22 (Invalid argument)
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [ERROR] WSREP: gcs connect failed: Invalid argument
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [ERROR] WSREP: wsrep::connect() failed: 6
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [ERROR] Aborting
    Oct 22 12:00:39 dbprd01 mysqld:
    Oct 22 12:00:39 dbprd01 mysqld: 131022 12:00:39 [Note] WSREP: Service disconnected.
    Oct 22 12:00:40 dbprd01 mysqld: 131022 12:00:40 [Note] WSREP: Some threads may fail to exit.
    Oct 22 12:00:40 dbprd01 mysqld: 131022 12:00:40 [Note] /usr/sbin/mysqld: Shutdown complete
    Oct 22 12:00:40 dbprd01 mysqld:

    If I comment this line in my.conf then I can start the mysql service:
    #wsrep_cluster_address=gcomm://”dbprd01,dbprd02,dbprd03″

    Any thoughts?

    Thank you.

    Reply
    • admin
      admin says:

      You need to bootstrap one server to wsrep_cluster_address=gcomm://
      and then once its running, bootstrap the other clusters to it: wsrep_cluster_address=”gcomm://dbprd01,dbprd02,dbprd03″

      If this is a non test setup, I suggest you use the debian 7 or centos 6 guides over the ubuntu lts ones.

      Reply
  3. Marco
    Marco says:

    Thank you, besides that the scipt that inserts the nodes has a mistake I think. I went back and found my entrey to be: wsrep_cluster_address=gcomm://”dbprd01,dbprd02,dbprd03″ note that the first quotaion starts after the slashes and should start after the equl sign like this: wsrep_cluster_address=”gcomm://dbprd01,dbprd02,dbprd03″

    I do want to thank you for taking the time to create this tutorial, nicely done and I had alot of fun following it.

    Reply
  4. Kneeoh
    Kneeoh says:

    Is this a good result? I’m running 8 cores and 32GB of ram. I have not done any my.cnf tweaking and followed your instructions exactly. Are there any tweaks you would recommend based on my resources?

    Doing OLTP test.
    Running mixed OLTP test
    Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct cases)
    Using “BEGIN” for starting transactions
    Using auto_inc on the id column
    Maximum number of requests for OLTP test is limited to 500000
    Threads started!
    Time limit exceeded, exiting…
    (last message repeated 15 times)
    Done.

    OLTP test statistics:
    queries performed:
    read: 6857984
    write: 2449280
    other: 979712
    total: 10286976
    transactions: 489856 (1632.77 per sec.)
    deadlocks: 0 (0.00 per sec.)
    read/write requests: 9307264 (31022.61 per sec.)
    other operations: 979712 (3265.54 per sec.)

    Test execution summary:
    total time: 300.0155s
    total number of events: 489856
    total time taken by event execution: 4797.3500
    per-request statistics:
    min: 2.34ms
    avg: 9.79ms
    max: 878.40ms
    approx. 95 percentile: 28.68ms

    Threads fairness:
    events (avg/stddev): 30616.0000/149.28
    execution time (avg/stddev): 299.8344/0.00

    Reply
  5. Robert
    Robert says:

    A buddy of mine sent me this link to get a Galera/MariaDB cluster up and running.  I’ve got 3 Fedora 20 servers online and their my.cnf are non-existent for any of the sed pieces above to work.  Any chance on getting a full my.cnf file or some of the ones below set up for it.

     

    /etc/my.cnf
    /etc/my.cnf.d
    /etc/my.cnf.rpmnew
    /etc/my.cnf.rpmsave
    /etc/my.cnf.d/mysql-clients.cnf
    /etc/my.cnf.d/server.cnf
    /etc/my.cnf.d/tokudb.cnf

    Reply
    • admin
      admin says:

      Please check the following locations:
      /etc/my.cnf
      /etc/mysql/my.cnf
      SYSCONFDIR/my.cnf

      MariaDB should have sample files in : /usr/local/mysql/share/mysql

      Reply
  6. Kneeoh
    Kneeoh says:

    What is the recommended best practice for cross-datacenter replication? 2 clusters with seed nodes, or one big cluster with 3 nodes at each datacenter? Or something else all together?

    Reply

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply

Your email address will not be published. Required fields are marked *