Introducing Agent-Based Database Monitoring with ClusterControl 1.7


We are excited to announce the 1.7 release of ClusterControl - the only management system you’ll ever need to take control of your open source database infrastructure!

ClusterControl 1.7 introduces exciting new agent-based monitoring features for MySQL, Galera Cluster, PostgreSQL & ProxySQL, as well as security and cloud scaling features ... and more!

Release Highlights

Monitoring & Alerting

  • Agent-based monitoring with Prometheus
  • New performance dashboards for MySQL, Galera Cluster, PostgreSQL & ProxySQL

Security & Compliance

  • Enable/disable Audit Logging on your MariaDB databases
  • Enable policy-based monitoring and logging of connection and query activity

Deployment & Scaling

  • Automatically launch cloud instances and add nodes to your cloud deployments

Additional Highlights

  • Support for MariaDB v10.3

View the ClusterControl ChangeLog for all the details!


Release Details

Monitoring & Alerting

Agent-based monitoring with Prometheus

ClusterControl was originally designed to address modern, highly distributed database setups based on replication or clustering. It provides a systems view of all the components of a distributed cluster, including load balancers, and maintains a logical topology view of the cluster.

So far we had gone the agentless monitoring route with ClusterControl, and although we love the simplicity of not having to install or manage agents on the monitored database hosts, an agent-based approach can provide a higher resolution of monitoring data and has certain advantages in terms of security.

With that in mind, we’re happy to introduce agent-based monitoring as a new feature added in ClusterControl 1.7!

It makes use of Prometheus, a full monitoring and trending system that includes built-in and active scraping and storing of metrics based on time series data. One Prometheus server can be used to monitor multiple clusters. ClusterControl takes care of installing and maintaining Prometheus as well as exporters on the monitored hosts.

Users can now enable their database clusters to use Prometheus exporters to collect metrics on their nodes and hosts, thus avoiding excessive SSH activity for monitoring and metrics collection, and reserving SSH connectivity for management operations only.
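
If you are curious about what the exporters actually expose, you can query their metrics endpoints directly over HTTP. A minimal sketch, assuming the typical default ports for node_exporter (9100) and mysqld_exporter (9104), with db-node as a placeholder host:

$ curl -s http://db-node:9100/metrics | grep '^node_load1'   # host metrics from node_exporter
$ curl -s http://db-node:9104/metrics | grep '^mysql_up'     # MySQL metrics from mysqld_exporter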

Monitoring & Alerting

New performance dashboards for MySQL, Galera Cluster, PostgreSQL & ProxySQL

ClusterControl users now have access to a set of new dashboards that have Prometheus as the data source with its flexible query language and multi-dimensional data model, where time series data is identified by metric name and key/value pairs. This allows for greater accuracy and customization options while monitoring your database clusters.
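
As an illustration of that data model, the same metrics can be queried straight from the Prometheus HTTP API with PromQL. A hedged example, with prometheus-host and the instance label as placeholders and a metric name typically exposed by mysqld_exporter:

$ curl -s -G 'http://prometheus-host:9090/api/v1/query' \
    --data-urlencode 'query=rate(mysql_global_status_queries{instance="db-node:9104"}[5m])'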

The new dashboards include:

  • Cross Server Graphs
  • System Overview
  • MySQL Overview, Replication, Performance Schema & InnoDB Metrics
  • Galera Cluster Overview & Graphs
  • PostgreSQL Overview
  • ProxySQL Overview

Security & Compliance

Audit Log for MariaDB

Continuous auditing is an imperative task for monitoring your database environment. By auditing your database, you can achieve accountability for actions taken or content accessed. Moreover, the audit may cover critical system components, such as those associated with financial data, to support regulations like SOX or the EU GDPR. Usually, it is achieved by logging information about DB operations on the database to an external log file.

With ClusterControl 1.7, users can now enable a plugin that will log all of their MariaDB database connections or queries to a file for further review. ClusterControl 1.7 also introduces support for MariaDB v10.3.
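
ClusterControl drives this through the UI, but under the hood it relies on the MariaDB Audit Plugin (server_audit). A rough, manual sketch of what enabling it looks like on a MariaDB node (the event filter shown is illustrative):

$ mysql -u root -p -e "INSTALL SONAME 'server_audit'"
$ mysql -u root -p -e "SET GLOBAL server_audit_events = 'CONNECT,QUERY'"
$ mysql -u root -p -e "SET GLOBAL server_audit_logging = ON"
$ mysql -u root -p -e "SHOW GLOBAL VARIABLES LIKE 'server_audit%'"   # confirm the plugin settings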

Additional New Functionalities

View the ClusterControl ChangeLog for all the details!

Download ClusterControl today!

Happy Clustering!


How to Monitor MySQL or MariaDB Galera Cluster with Prometheus Using SCUMM


ClusterControl version 1.7 introduced a new way to watch your database clusters. The new agent-based approach was designed for demanding, high-resolution monitoring. ClusterControl's agentless, secure SSH-based remote stats collection has been extended with modern high-frequency monitoring based on time series data.

SCUMM is the new monitoring and trending system in ClusterControl.

It includes built-in active scraping and storing of metrics based on time series data. The new version of ClusterControl 1.7 uses Prometheus exporters to gather more data from monitored hosts and services.

In this blog, we will see how to use SCUMM to monitor multi-master Percona XtraDB Cluster and MariaDB Galera Cluster instances. We will also see what metrics are available, and which ones are tracked with built-in dashboards.

What is SCUMM?

To be able to understand the new functionality and recent changes in ClusterControl, let’s take a quick look first at the new monitoring architecture - SCUMM.

SCUMM (Severalnines ClusterControl Unified Monitoring & Management) is a new agent-based solution with agents installed on the database nodes. It consists of two core elements:

  • Prometheus server, a time series database that collects the data.
  • Exporters, which expose metrics from services such as the Galera Cluster WSREP API.

More details can be found in our introduction blog for SCUMM.

How to use SCUMM for Galera Cluster?

The main requirement is to upgrade your current ClusterControl installation to version 1.7 or later. You can find the upgrade and installation procedures in the online documentation. You will then see a new tab called Dashboards. By default, the new Dashboards tab is disabled; until it is enabled, ClusterControl monitoring remains agentless, over secure SSH.

ClusterControl takes care of installing and maintaining Prometheus as well as exporters on the monitored hosts. The installation process is automated.

ClusterControl: Enable Agent-Based Monitoring

After choosing Enable Agent-Based Monitoring, you will see the following.

ClusterControl: Enable Agent-Based Monitoring

Here you can specify the host on which to install our Prometheus server. The host can be your ClusterControl server, or a dedicated one. It is also possible to re-use another Prometheus server that is managed by ClusterControl.

Other options to specify are:

  • Scrape Interval (seconds): Set how often the nodes are scraped for metrics. By default 10.
  • Data Retention (days): Set how long the metrics are kept before being removed. By default 15.

We can monitor the installation of the server and agents from the Activity section in ClusterControl and, once it is finished, we can see our cluster with the agents enabled on the main ClusterControl screen.
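
If you want to double-check from the command line that Prometheus is actually scraping the new exporters, its HTTP API lists the scrape targets and their health. A small sketch, with prometheus-host as a placeholder:

$ curl -s http://prometheus-host:9090/api/v1/targets | grep -o '"health":"[a-z]*"' | sort | uniq -c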

Dashboards

With our agents enabled, if we go to the Dashboards section, we will see something like this:

Default dashboard

There are seven different kinds of dashboards available: System Overview, Cross Server Graphs, MySQL Overview, MySQL InnoDB Metrics, MySQL Performance Schema, Galera Cluster Overview, Galera Graphs.

List of dashboards

Here we can also specify which node to monitor, the time range and the refresh rate.

Dashboard time ranges

In the configuration section, we can enable or disable our agents (exporters), check the agents' status and verify the version of our Prometheus server.

Prometheus Configuration

Galera Cluster Overview


The Galera Cluster Overview dashboard shows the most critical Galera Cluster metrics. The information here is based on the WSREP API status variables (these can also be read straight from any node, as shown after this list).

  • Galera Cluster Size: shows the number of nodes in the cluster.
  • Flow Control Paused Time: the time during which flow control was in effect.
  • Flow Control Messages Sent: the number of messages sent to other cluster members asking them to slow down.
  • Writeset Inbound Traffic: transaction commits that the node receives from the cluster.
  • Writeset Outbound Traffic: transaction commits that the node sends to the cluster.
  • Receive Queue: the number of write-sets waiting to be applied.
  • Send Queue: the number of write-sets waiting to be sent.
  • Transactions Received: the number of transactions received.
  • Transactions Replicated: the number of transactions replicated.
  • Average Incoming Transaction Size: the average size of incoming transactions per node.
  • Average Replicated Transaction: the average size of replicated transactions per node.
  • FC Trigger Low Limit: the point at which Flow Control engages.
  • FC Trigger High Limit.
  • Sequence numbers of transactions.
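
Most of these panels map to standard wsrep_ status variables, so you can cross-check them directly on any Galera node. A quick sketch:

$ mysql -u root -p -e "SHOW GLOBAL STATUS LIKE 'wsrep_%'" | \
    grep -E 'wsrep_cluster_size|wsrep_flow_control_paused|wsrep_local_recv_queue|wsrep_local_send_queue'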

MySQL Overview

  • MySQL Uptime: The amount of time since the MySQL server process was started.
  • Current QPS: The number of queries executed by the server during the last second.
  • InnoDB Buffer Pool Size: InnoDB buffer pool used for caching data and indexes in memory.
  • Buffer Pool Size % of Total RAM: The ratio between InnoDB buffer pool size and total memory.
  • MySQL Connections: The number of connection attempts (successful or not).
  • MySQL Client Thread Activity: Number of threads.
  • MySQL Questions: The number of queries sent to the server by clients, excluding those executed within stored programs.
  • MySQL Thread Cache: The thread_cache_size variable determines how many threads the server should cache for reuse.
  • MySQL Temporary Objects.
  • MySQL Select Types: counters for selects not done with indexes.
  • MySQL Sorts: Full table scans for sorts.
  • MySQL Slow Queries: Slow queries are queries that take longer than the long_query_time setting.
  • MySQL Aborted Connections: Number of aborted connections.
  • MySQL Table Locks: Number of table locks.
  • MySQL Network Traffic: Shows how much network traffic is generated by MySQL.
  • MySQL Network Usage Hourly: Shows how much network traffic is generated by MySQL per hour.
  • MySQL Internal Memory Overview: Shows various uses of memory within MySQL.
  • Top Command Counters: The number of times each statement has been executed.
  • MySQL Handlers: Internal statistics on how MySQL is selecting, updating, inserting, and modifying rows, tables, and indexes.
  • MySQL Transaction Handlers.
  • Process States.
  • Top Process States Hourly.
  • MySQL Query Cache Memory.
  • MySQL Query Cache Activity.
  • MySQL File Openings.
  • MySQL Open Files: Number of files opened by MySQL.
  • MySQL Table Open Cache Status.
  • MySQL Open Tables: Number of open tables.
  • MySQL Table Definition Cache.

MySQL InnoDB Metrics

  • InnoDB Checkpoint Age.
  • InnoDB Transactions.
  • InnoDB Row Operations.
  • InnoDB Row Lock Time.
  • InnoDB I/O.
  • InnoDB Log File Usage Hourly.
  • InnoDB Logging Performance.
  • InnoDB Deadlocks.
  • Index Condition Pushdown.
  • InnoDB Buffer Pool Content.
  • InnoDB Buffer Pool Pages.
  • InnoDB Buffer Pool I/O.
  • InnoDB Buffer Pool Requests.
  • InnoDB Read-Ahead.
  • InnoDB Change Buffer.
  • InnoDB Change Buffer Activity.

MySQL Performance Schema


This dashboard provides a way to inspect the internal execution of the server at runtime. It requires the Performance Schema to be enabled.

  • Performance Schema File IO (Events).
  • Performance Schema File IO (Load).
  • Performance Schema File IO (Bytes).
  • Performance Schema Waits (Events).
  • Performance Schema Waits (Load).
  • Index Access Operations (Load).
  • Table Access Operations (Load).
  • Performance Schema SQL & External Locks (Events).
  • Performance Schema SQL and External Locks (Seconds).

System Overview Metrics


To monitor the system, the following metrics are available for each server (all of them for the selected node):

  • System Uptime: Time since the server was started.
  • CPUs: Number of CPUs.
  • RAM: Total amount of RAM.
  • Memory Available: Percentage of RAM available.
  • Load Average: Min, max and average server load.
  • Memory: Available, total and used server memory.
  • CPU Usage: Min, max and average server CPU usage information.
  • Memory Distribution: Memory distribution (buffer, cache, free and used) on the selected node.
  • Saturation Metrics: Min, max, and average of IO load and CPU load on the selected node.
  • Memory Advanced Details: Memory usage details like pages, buffer and more, on the selected node.
  • Forks: Number of forked processes. A fork is an operation whereby a process creates a copy of itself; it is usually a system call, implemented in the kernel.
  • Processes: Number of processes running or waiting on the operating system.
  • Context Switches: Number of context switches. A context switch is the action of storing the state of a thread or process so that execution can be resumed later.
  • Interrupts: Number of interrupts. An interrupt is an event that alters the normal execution flow of a program and can be generated by hardware devices or even by the CPU itself.
  • Network Traffic: Inbound and outbound network traffic in KBytes per second on the selected node.
  • Network Utilization Hourly: Traffic sent and received in the last day.
  • Swap: Swap usage (free and used) on the selected node.
  • Swap Activity: Data read from and written to swap.
  • I/O Activity: Page-in and page-out I/O activity.
  • File Descriptors: Allocated and limit file descriptors.

Cross-Server Graphs Metrics


If we want to see the general state of all our servers, with the OS and MySQL information combined, we can use this dashboard with the following metrics:

  • Load Average: Load average for each server.
  • Memory Usage: Percentage of memory usage for each server.
  • Network Traffic: Min, max and average kBytes of network traffic per second.
  • MySQL Connections: Number of client connections to MySQL server.
  • MySQL Queries: Number of queries executed.
  • MySQL Traffic: Min, max and average data sent and received.

Galera Graphs


In this view, you can check Galera-specific metrics for each cluster node. The dashboard lists all of your cluster nodes, so you can easily filter for performance metrics of a particular node.

  • Ready to Accept Queries: Identifies whether the node is able to run database operations.
  • Local State: Shows the node state.
  • Desync Mode: Identifies whether the node participates in Flow Control.
  • Cluster Status: Cluster component status.
  • gcache Size: Galera Cluster cache size.
  • FC (normal traffic): Flow control status.
  • Galera Replication Queues: Size of the replication queue.
  • Galera Cluster Size: Number of nodes in the cluster.
  • Galera Flow Control: Shows the number of FC calls.
  • Galera Parallelization Efficiency: An average distance between highest and lowest sequence numbers that are concurrently applied, committed and can be possibly applied in parallel.
  • Galera Writing Conflicts: The number of local transactions committed on this node that failed certification.
  • Available Downtime before SST Required: How long, in minutes, a node can be down before an SST would be required.
  • Galera Writeset Count: the count of transactions replicated to the cluster (from this node) and received from the cluster (any other node).
  • Galera Writeset Size: This graph shows the average transaction size sent/received.
  • Galera Network Usage Hourly: Network usage hourly (received and replicated).

Conclusion

Monitoring is an area where operations teams commonly spend time developing custom solutions. It is common to find IT teams integrating several such tools in order to get a holistic view of their systems.

ClusterControl provides a complete monitoring system with real-time data to know what is happening now, high-resolution metrics for better accuracy, configurable dashboards, and a wide range of third-party notification services for alerting. Download ClusterControl today (it’s free).

How to Monitor Your ProxySQL with Prometheus and ClusterControl


ClusterControl 1.7.0 introduces a bold new feature - integration with Prometheus for agent-based monitoring. We call this SCUMM (Severalnines ClusterControl Unified Management and Monitoring). In previous versions, monitoring tasks were performed solely agentlessly. If you wonder how ClusterControl performs its monitoring functions, check out this documentation page.

ProxySQL, a high-performance reverse proxy which understands MySQL protocols, commonly sits on top of MySQL Replication and Galera Cluster to act as a gateway to the backend MySQL service. It can be configured as a query router, query firewall, query cache, traffic dispatcher and much more. ProxySQL also collects and exposes key metrics via its STATS schema, which is very useful for analyzing performance and understanding what actually happens behind the scenes. Visit our comprehensive tutorial for ProxySQL to learn more about it.
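
For reference, that STATS schema can be queried directly over the ProxySQL admin interface. A minimal sketch, assuming the default admin port 6032 and whatever admin credentials your instance uses:

$ mysql -u admin -p -h 127.0.0.1 -P 6032 -e "SHOW TABLES FROM stats"
$ mysql -u admin -p -h 127.0.0.1 -P 6032 -e "SELECT * FROM stats_mysql_global WHERE variable_name IN ('Client_Connections_connected','Questions','Active_Transactions')"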

In this blog post, we are going to look into monitoring the ProxySQL instances in-depth with this new approach. In this example, we have a ProxySQL instance on top of our two-node MySQL Replication (1 master, 1 slave), deployed via ClusterControl. Our high-level architecture looks something like this:

We also have the following query rules defined in the ProxySQL instance (just for reference, to make sense of the collected monitoring metrics further down):

Enabling Prometheus

ClusterControl's agent-based monitoring is enabled per cluster. ClusterControl can deploy a new Prometheus server for you, or use an existing Prometheus server (deployed by ClusterControl for another cluster). Enabling Prometheus is pretty straightforward. Just go to ClusterControl -> pick the cluster -> Dashboards -> Enable Agent Based Monitoring:

Then, specify the IP address or hostname of the new Prometheus server, or just pick an existing Prometheus host from the dropdown:

ClusterControl will install and configure the necessary packages (Prometheus on the Prometheus server, exporters on the database and ProxySQL nodes), connect to the Prometheus as data source and start visualizing the monitoring data in the UI.

Once the deployment job finishes, you should be able to access the Dashboards tab as shown in the next section.

ProxySQL Dashboard

You can access the ProxySQL Dashboards by going to the respective cluster under Dashboards tab. Clicking on the Dashboard dropdown will list out dashboards related to our cluster (MySQL Replication). You can find the ProxySQL Overview dashboard under the "Load Balancers" section:

There are a number of panels for ProxySQL, some of them are self-explanatory. Nevertheless, let's visit them one by one.

Hostgroup Size

Hostgroup size is simply the total number of hosts across all hostgroups:

In this case, we have two hostgroups - 10 (writer) and 20 (reader). Hostgroup 10 consists of one host (master) while hostgroup 20 has two hosts (master and slave), which sums up to a total of three hosts.

Unless you change the hostgroup configuration (introducing a new host or removing an existing one) in ProxySQL, you should expect that nothing will change in this graph.

Client Connections

The number of client connections being processed by ProxySQL for all hostgroups:

The above graph simply tells us that there are consistently 8 MySQL clients connected to our ProxySQL instance on port 6033 for the last 45 minutes (you can change this under the "Range Selection" option). If you stop connecting your application to ProxySQL (or bypass it), the value should eventually drop to 0.

Client Questions

The graph visualizes the number of Questions being processed by ProxySQL for all hostgroups:

According to MySQL documentation, Questions are simply the number of statements executed by the server. This includes only statements sent to the server by clients and not statements executed within stored programs, unlike the Queries variable. This variable does not count COM_PING, COM_STATISTICS, COM_STMT_PREPARE, COM_STMT_CLOSE, or COM_STMT_RESET commands. Again, if you stop connecting your application to ProxySQL (or bypass it), the value should drop to 0.

Active Backend Connections

The number of connections that ProxySQL maintains to the backend MySQL servers per host:

It simply tells us how many connections are currently being used by ProxySQL for sending queries to the backend servers. As long as the max value is not close to the connection limit for the particular server (set via max_connections when the server is added into the ProxySQL hostgroup), we are in good shape.
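
The same information can be pulled from the stats_mysql_connection_pool table on the ProxySQL admin interface, which is roughly what the exporter samples. The port and credentials below are the usual defaults and may differ in your setup:

$ mysql -u admin -p -h 127.0.0.1 -P 6032 -e "SELECT hostgroup, srv_host, status, ConnUsed, ConnFree, ConnERR FROM stats_mysql_connection_pool"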

Failed Backend Connections

The number of connections that were not established successfully by ProxySQL:

The above example simply shows that no backend connection failure happened in the last 45 minutes.

Queries Routed

This graph provides insight on the distribution of incoming statements to the backend servers:

As you can see, most of the reads are going to the reader hostgroup (HG20). From here we can understand the balancing pattern being performed by ProxySQL which matches our query rules in this read-intensive workload.

Connection Free

The graph shows how many connections are currently free:

The connections are kept open in order to minimize the time cost of sending a query to the backend server.

Latency

The current ping time in microseconds, as reported from ProxySQL monitoring thread:

This simply tells us how stable the connection is from ProxySQL to the backend MySQL servers. A consistently high value over a long period mostly indicates a network issue between them.

Query Cache Memory

This graph visualizes memory consumption of queries that are being cached by ProxySQL:

From the graph above, we can tell that the ProxySQL query cache consumes a total of 8MB of memory. After it reaches the 8MB limit (configurable via the mysql-query_cache_size_MB variable), the memory will be purged by ProxySQL's purge thread. This graph will not be populated if you have no query cache rule defined.
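
If you want to cross-check these numbers outside the dashboard, the query cache counters and the size limit are visible from the ProxySQL admin interface. A small sketch, assuming the default admin port:

$ mysql -u admin -p -h 127.0.0.1 -P 6032 -e "SELECT * FROM stats_mysql_global WHERE variable_name LIKE 'Query_Cache%'"
$ mysql -u admin -p -h 127.0.0.1 -P 6032 -e "SELECT * FROM global_variables WHERE variable_name = 'mysql-query_cache_size_MB'"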

By the way, caching a query in ProxySQL can be done in just two clicks with ClusterControl. Go to ProxySQL's Top Queries page, hover over a query, click Query Cache and click on "Add Rule":

Query Cache Efficiency

This graph visualizes the efficiency of cached queries:

The blue line tells us the successful ratio of GET requests executed against the Query Cache where the resultset was present and not expired. The pink line shows the ratio of data written (inserted) into or read from the Query Cache. In this case, our data read from the Query Cache is higher than the data written, indicating an efficient cache configuration.

This graph will not be populated if you have no query cache rule defined.

Network Traffic

This graph visualizes the network traffic (data receive + data sent) from/to the backend MySQL servers, per host:

The above screenshot tells us that a significant amount of traffic is being forwarded/received from/to the reader hostgroup (HG20). In this read-intensive workload, read operations commonly consume much more traffic, mainly due to the result set size of the SELECT statements.

Only a smaller share of the traffic is forwarded/received from/to the writer hostgroup (HG10), which is expected since write operations usually consume less network traffic, with significantly smaller result sets being returned to the clients.

Mirroring Efficiency

The graph simply shows the traffic mirroring related status like Mirror_concurrency vs Mirror_queue_length:

The graph is only populated if you have configured traffic mirroring (mirror_hostgroup inside a query rule). If the mirror queue is building up, reducing the mirroring concurrency limit will increase the mirroring efficiency; this can be controlled via the mysql-mirror_max_concurrency variable. In simple words, zero queue entries is what the most efficient mirroring is all about.
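
As a hedged example of tuning this from the admin interface (the value 32 is purely illustrative; pick something that suits your workload):

$ mysql -u admin -p -h 127.0.0.1 -P 6032 -e "UPDATE global_variables SET variable_value = '32' WHERE variable_name = 'mysql-mirror_max_concurrency'; LOAD MYSQL VARIABLES TO RUNTIME; SAVE MYSQL VARIABLES TO DISK;"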

Memory Utilization

The graph illustrates the memory utilization by main components inside ProxySQL - connection pool, query cache and persistent storage (SQLite):

The above screenshot tells us that ProxySQL's memory footprint is rather small, less than 12 MB in total. The connection pool only consumes 1.3 MB at most to accommodate our 8-thread (client) workload. With more free RAM available on the host, we should be able to increase the number of client connections to ProxySQL three- to four-fold, or cache many more hot queries inside ProxySQL to offload our MySQL backend servers.

Bonus Feature - Node Performance

ClusterControl 1.7.0 now includes host performance metrics for the ProxySQL instances. In the previous version, ClusterControl only monitored the ProxySQL-related metrics as exposed by the ProxySQL stats schema. This new feature can be accessed under the Nodes tab -> ProxySQL instance -> Node Performance:

The histograms provide insight into the key host metrics, similar to what is being sampled for database nodes under the Nodes -> Overview section. If your ProxySQL instance is co-located on the same server as the application, you are literally using ClusterControl to monitor the application server as well. How cool is that?!

Final Thoughts

ClusterControl integration with Prometheus offers an alternative way to monitor and analyze your database stack, up until the reverse-proxy tier. You now have a choice to offload the monitoring jobs to Prometheus, or keep on using the default ClusterControl agentless monitoring approach for your database infrastructure.

MySQL on Docker: Running ProxySQL as a Helper Container on Kubernetes


ProxySQL commonly sits between the application and database tiers, in the so-called reverse-proxy tier. When your application containers are orchestrated and managed by Kubernetes, you might want to use ProxySQL in front of your database servers.

In this post, we’ll show you how to run ProxySQL on Kubernetes as a helper container in a pod. We are going to use Wordpress as an example application. The data service is provided by our two-node MySQL Replication, deployed using ClusterControl and sitting outside of the Kubernetes network on a bare-metal infrastructure, as illustrated in the following diagram:

ProxySQL Docker Image

In this example, we are going to use the ProxySQL Docker image maintained by Severalnines, a general public image built for multi-purpose usage. The image comes with no entrypoint script and supports Galera Cluster (in addition to built-in support for MySQL Replication), where an extra script is required for health check purposes.

Basically, to run a ProxySQL container, simply execute the following command:

$ docker run -d -v /path/to/proxysql.cnf:/etc/proxysql.cnf severalnines/proxysql

The image recommends binding a ProxySQL configuration file to the mount point /etc/proxysql.cnf, although you can skip this and configure it later using the ProxySQL Admin console. Example configurations are provided on the Docker Hub page and the GitHub page.
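
Once the container is up, a quick sanity check is to confirm that it is running and look at its startup log (the container ID is whatever docker ps reports on your host):

$ docker ps --filter ancestor=severalnines/proxysql --format '{{.ID}} {{.Status}}'
$ docker logs <container_id> | tail -n 20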

ProxySQL on Kubernetes

Designing the ProxySQL architecture is a subjective topic and highly dependent on the placement of the application and database containers as well as the role of ProxySQL itself. ProxySQL does not only route queries, it can also be used to rewrite and cache queries. Efficient cache hits might require a custom configuration tailored specifically for the application database workload.

Ideally, we can configure ProxySQL to be managed by Kubernetes with two configurations:

  1. ProxySQL as a Kubernetes service (centralized deployment).
  2. ProxySQL as a helper container in a pod (distributed deployment).

The first option is pretty straightforward, where we create a ProxySQL pod and attach a Kubernetes service to it. Applications will then connect to the ProxySQL service via networking on the configured ports. These default to 6033 for the MySQL load-balanced port and 6032 for the ProxySQL administration port. This deployment will be covered in an upcoming blog post.

The second option is a bit different. Kubernetes has a concept called a "pod". You can have one or more containers per pod; these are relatively tightly coupled. A pod's contents are always co-located and co-scheduled, and run in a shared context. A pod is the smallest manageable container unit in Kubernetes.

Both deployments can be distinguished easily by looking at the following diagram:

The primary reason that pods can have multiple containers is to support helper applications that assist a primary application. Typical examples of helper applications are data pullers, data pushers, and proxies. Helper and primary applications often need to communicate with each other. Typically this is done through a shared filesystem, as shown in this exercise, or through the loopback network interface, localhost. An example of this pattern is a web server along with a helper program that polls a Git repository for new updates.

This blog post will cover the second configuration - running ProxySQL as a helper container in a pod.

ProxySQL as Helper in a Pod

In this setup, we run ProxySQL as a helper container to our Wordpress container. The following diagram illustrates our high-level architecture:

In this setup, the ProxySQL container is tightly coupled with the Wordpress container, and we named it the "blog" pod. If rescheduling happens, e.g. the Kubernetes worker node goes down, these two containers will always be rescheduled together as one logical unit on the next available host. To keep the application containers' content persistent across multiple nodes, we have to use a clustered or remote file system, which in this case is NFS.

ProxySQL's role is to provide a database abstraction layer to the application container. Since we are running a two-node MySQL Replication as the backend database service, read-write splitting is vital to maximize resource utilization on both MySQL servers. ProxySQL excels at this and requires minimal to no changes to the application.

There are a number of other benefits to running ProxySQL in this setup:

  • Query caching capability brought closer to the application layer running in Kubernetes.
  • Secure implementation by connecting through ProxySQL UNIX socket file. It is like a pipe that the server and the clients can use to connect and exchange requests and data.
  • Distributed reverse proxy tier with shared nothing architecture.
  • Less network overhead due to "skip-networking" implementation.
  • Stateless deployment approach by utilizing Kubernetes ConfigMaps.

Preparing the Database

Create the wordpress database and user on the master and assign the correct privileges:

mysql-master> CREATE DATABASE wordpress;
mysql-master> CREATE USER wordpress@'%' IDENTIFIED BY 'passw0rd';
mysql-master> GRANT ALL PRIVILEGES ON wordpress.* TO wordpress@'%';

Also, create the ProxySQL monitoring user:

mysql-master> CREATE USER proxysql@'%' IDENTIFIED BY 'proxysqlpassw0rd';

Then, reload the grant table:

mysql-master> FLUSH PRIVILEGES;

Preparing the Pod

Now, copy and paste the following lines into a file called blog-deployment.yml on the host where kubectl is configured:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: blog
  labels:
    app: blog
spec:
  replicas: 1
  selector:
    matchLabels:
      app: blog
      tier: frontend
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: blog
        tier: frontend
    spec:

      restartPolicy: Always

      containers:
      - image: wordpress:4.9-apache
        name: wordpress
        env:
        - name: WORDPRESS_DB_HOST
          value: localhost:/tmp/proxysql.sock
        - name: WORDPRESS_DB_USER
          value: wordpress
        - name: WORDPRESS_DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-pass
              key: password
        ports:
        - containerPort: 80
          name: wordpress
        volumeMounts:
        - name: wordpress-persistent-storage
          mountPath: /var/www/html
        - name: shared-data
          mountPath: /tmp

      - image: severalnines/proxysql:1.4.12
        name: proxysql
        volumeMounts:
        - name: proxysql-config
          mountPath: /etc/proxysql.cnf
          subPath: proxysql.cnf
        - name: shared-data
          mountPath: /tmp
        ports:
        - containerPort: 6033
          name: proxysql

      volumes:
      - name: wordpress-persistent-storage
        persistentVolumeClaim:
          claimName: wp-pvc
      - name: proxysql-config
        configMap:
          name: proxysql-configmap
      - name: shared-data
        emptyDir: {}

The YAML file has many lines, so let's look at the interesting parts only. The first section:

apiVersion: apps/v1
kind: Deployment

The first line is the apiVersion. Our Kubernetes cluster is running on v1.12, so we should refer to the Kubernetes v1.12 API documentation and follow the resource declaration according to this API. The next one is the kind, which tells Kubernetes what type of resource we want to deploy. Deployment, Service, ReplicaSet, DaemonSet and PersistentVolume are some examples.

The next important section is the "containers" section. Here we define all containers that we would like to run together in this pod. The first part is the Wordpress container:

      - image: wordpress:4.9-apache
        name: wordpress
        env:
        - name: WORDPRESS_DB_HOST
          value: localhost:/tmp/proxysql.sock
        - name: WORDPRESS_DB_USER
          value: wordpress
        - name: WORDPRESS_DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-pass
              key: password
        ports:
        - containerPort: 80
          name: wordpress
        volumeMounts:
        - name: wordpress-persistent-storage
          mountPath: /var/www/html
        - name: shared-data
          mountPath: /tmp

In this section, we are telling Kubernetes to deploy Wordpress 4.9 using the Apache web server, and we give the container the name "wordpress". We also want Kubernetes to pass a number of environment variables:

  • WORDPRESS_DB_HOST - The database host. Since our ProxySQL container resides in the same pod as the Wordpress container, it's more secure to use a ProxySQL socket file instead. The format for using a socket file in Wordpress is "localhost:{path to the socket file}". By default, it's located under the /tmp directory of the ProxySQL container. This /tmp path is shared between the Wordpress and ProxySQL containers by using the "shared-data" volumeMount as shown further down. Both containers have to mount this volume to share the same content under the /tmp directory.
  • WORDPRESS_DB_USER - Specify the wordpress database user.
  • WORDPRESS_DB_PASSWORD - The password for WORDPRESS_DB_USER. Since we do not want to expose the password in this file, we can hide it using Kubernetes Secrets. Here we instruct Kubernetes to read the "mysql-pass" Secret resource instead. Secrets have to be created in advance of the pod deployment, as explained further down.

We also want to publish port 80 of the container for the end user. The Wordpress content stored inside /var/www/html in the container will be mounted into our persistent storage running on NFS.

Next, we define the ProxySQL container:

      - image: severalnines/proxysql:1.4.12
        name: proxysql
        volumeMounts:
        - name: proxysql-config
          mountPath: /etc/proxysql.cnf
          subPath: proxysql.cnf
        - name: shared-data
          mountPath: /tmp
        ports:
        - containerPort: 6033
          name: proxysql

In the above section, we are telling Kubernetes to deploy ProxySQL using the severalnines/proxysql image, version 1.4.12. We also want Kubernetes to mount our custom, pre-configured configuration file and map it to /etc/proxysql.cnf inside the container. There will be a volume called "shared-data" which maps to the /tmp directory, shared with the Wordpress image - a temporary directory that shares a pod's lifetime. This allows the ProxySQL socket file (/tmp/proxysql.sock) to be used by the Wordpress container when connecting to the database, bypassing TCP/IP networking.

The last part is the "volumes" section:

      volumes:
      - name: wordpress-persistent-storage
        persistentVolumeClaim:
          claimName: wp-pvc
      - name: proxysql-config
        configMap:
          name: proxysql-configmap
      - name: shared-data
        emptyDir: {}

Kubernetes will have to create three volumes for this pod:

  • wordpress-persistent-storage - Use the PersistentVolumeClaim resource to map NFS export into the container for persistent data storage for Wordpress content.
  • proxysql-config - Use the ConfigMap resource to map the ProxySQL configuration file.
  • shared-data - Use the emptyDir resource to mount a shared directory for our containers inside the Pod. emptyDir resource is a temporary directory that shares a pod's lifetime.

Therefore, based on our YAML definition above, we have to prepare a number of Kubernetes resources before we can begin to deploy the "blog" pod:

  1. PersistentVolume and PersistentVolumeClaim - To store the web contents of our Wordpress application, so when the pod is rescheduled to another worker node, we won't lose the last changes.
  2. Secrets - To hide the Wordpress database user password inside the YAML file.
  3. ConfigMap - To map the configuration file to the ProxySQL container, so when it's rescheduled to another node, Kubernetes can automatically remount it.

PersistentVolume and PersistentVolumeClaim

A good persistent storage for Kubernetes should be accessible by all Kubernetes nodes in the cluster. For the sake of this blog post, we used NFS as the PersistentVolume (PV) provider because it's easy and supported out-of-the-box. The NFS server is located somewhere outside of our Kubernetes network and we have configured it to allow all Kubernetes nodes with the following line inside /etc/exports:

/nfs    192.168.55.*(rw,sync,no_root_squash,no_all_squash)

Take note that the NFS client package must be installed on all Kubernetes nodes; otherwise, Kubernetes won't be able to mount the NFS share correctly. On all nodes:

$ sudo apt-get install nfs-common   # Ubuntu/Debian
$ sudo yum install nfs-utils        # RHEL/CentOS

Also, make sure on the NFS server, the target directory exists:

(nfs-server)$ mkdir /nfs/kubernetes/wordpress

Then, create a file called wordpress-pv-pvc.yml and add the following lines:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: wp-pv
  labels:
    app: blog
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 3Gi
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: /nfs/kubernetes/wordpress
    server: 192.168.55.200
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: wp-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
  selector:
    matchLabels:
      app: blog
      tier: frontend

In the above definition, we would like Kubernetes to allocate 3GB of volume space on the NFS server for our Wordpress container. Take note that for production usage, NFS should be configured with an automatic provisioner and a storage class.

Create the PV and PVC resources:

$ kubectl create -f wordpress-pv-pvc.yml

Verify that those resources are created and that the status is "Bound":

$ kubectl get pv,pvc
NAME                     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM            STORAGECLASS   REASON   AGE
persistentvolume/wp-pv   3Gi        RWO            Recycle          Bound    default/wp-pvc                           22h

NAME                           STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/wp-pvc   Bound    wp-pv    3Gi        RWO                           22h

Secrets

The first step is to create a secret to be used by the Wordpress container for the WORDPRESS_DB_PASSWORD environment variable. The reason is simply that we don't want to expose the password in clear text inside the YAML file.

Create a secret resource called mysql-pass and pass the password accordingly:

$ kubectl create secret generic mysql-pass --from-literal=password=passw0rd

Verify that our secret is created:

$ kubectl get secrets mysql-pass
NAME         TYPE     DATA   AGE
mysql-pass   Opaque   1      7h12m

ConfigMap

We also need to create a ConfigMap resource for our ProxySQL container. A Kubernetes ConfigMap file holds key-value pairs of configuration data that can be consumed in pods or used to store configuration data. ConfigMaps allow you to decouple configuration artifacts from image content to keep containerized applications portable.

Since our database servers are already running on bare-metal with static hostnames and IP addresses, plus a static monitoring username and password, in this use case the ConfigMap will store pre-configured settings for the ProxySQL service that we want to use.

First create a text file called proxysql.cnf and add the following lines:

datadir="/var/lib/proxysql"
admin_variables=
{
        admin_credentials="admin:adminpassw0rd"
        mysql_ifaces="0.0.0.0:6032"
        refresh_interval=2000
}
mysql_variables=
{
        threads=4
        max_connections=2048
        default_query_delay=0
        default_query_timeout=36000000
        have_compress=true
        poll_timeout=2000
        interfaces="0.0.0.0:6033;/tmp/proxysql.sock"
        default_schema="information_schema"
        stacksize=1048576
        server_version="5.1.30"
        connect_timeout_server=10000
        monitor_history=60000
        monitor_connect_interval=200000
        monitor_ping_interval=200000
        ping_interval_server_msec=10000
        ping_timeout_server=200
        commands_stats=true
        sessions_sort=true
        monitor_username="proxysql"
        monitor_password="proxysqlpassw0rd"
}
mysql_servers =
(
        { address="192.168.55.171" , port=3306 , hostgroup=10, max_connections=100 },
        { address="192.168.55.172" , port=3306 , hostgroup=10, max_connections=100 },
        { address="192.168.55.171" , port=3306 , hostgroup=20, max_connections=100 },
        { address="192.168.55.172" , port=3306 , hostgroup=20, max_connections=100 }
)
mysql_users =
(
        { username = "wordpress" , password = "passw0rd" , default_hostgroup = 10 , active = 1 }
)
mysql_query_rules =
(
        {
                rule_id=100
                active=1
                match_pattern="^SELECT .* FOR UPDATE"
                destination_hostgroup=10
                apply=1
        },
        {
                rule_id=200
                active=1
                match_pattern="^SELECT .*"
                destination_hostgroup=20
                apply=1
        },
        {
                rule_id=300
                active=1
                match_pattern=".*"
                destination_hostgroup=10
                apply=1
        }
)
mysql_replication_hostgroups =
(
        { writer_hostgroup=10, reader_hostgroup=20, comment="MySQL Replication 5.7" }
)

Pay extra attention to the "mysql_servers" and "mysql_users" sections, where you might need to modify the values to suit your database cluster setup. In this case, we have two database servers running in MySQL Replication as summarized in the following Topology screenshot taken from ClusterControl:

All writes should go to the master node, while reads are forwarded to hostgroup 20, as defined under the "mysql_query_rules" section. That's the basics of read/write splitting, and we want to take full advantage of it.

Then, import the configuration file into ConfigMap:

$ kubectl create configmap proxysql-configmap --from-file=proxysql.cnf
configmap/proxysql-configmap created

Verify if the ConfigMap is loaded into Kubernetes:

$ kubectl get configmap
NAME                 DATA   AGE
proxysql-configmap   1      45s

Deploying the Pod

Now we should be good to deploy the blog pod. Send the deployment job to Kubernetes:

$ kubectl create -f blog-deployment.yml

Verify the pod status:

$ kubectl get pods
NAME                           READY   STATUS              RESTARTS   AGE
blog-54755cbcb5-t4cb7          2/2     Running             0          100s

It must show 2/2 under the READY column, indicating there are two containers running inside the pod. Use the -c option flag to check the Wordpress and ProxySQL containers inside the blog pod:

$ kubectl logs blog-54755cbcb5-t4cb7 -c wordpress
$ kubectl logs blog-54755cbcb5-t4cb7 -c proxysql

From the ProxySQL container log, you should see the following lines:

2018-10-20 08:57:14 [INFO] Dumping current MySQL Servers structures for hostgroup ALL
HID: 10 , address: 192.168.55.171 , port: 3306 , weight: 1 , status: ONLINE , max_connections: 100 , max_replication_lag: 0 , use_ssl: 0 , max_latency_ms: 0 , comment:
HID: 10 , address: 192.168.55.172 , port: 3306 , weight: 1 , status: OFFLINE_HARD , max_connections: 100 , max_replication_lag: 0 , use_ssl: 0 , max_latency_ms: 0 , comment:
HID: 20 , address: 192.168.55.171 , port: 3306 , weight: 1 , status: ONLINE , max_connections: 100 , max_replication_lag: 0 , use_ssl: 0 , max_latency_ms: 0 , comment:
HID: 20 , address: 192.168.55.172 , port: 3306 , weight: 1 , status: ONLINE , max_connections: 100 , max_replication_lag: 0 , use_ssl: 0 , max_latency_ms: 0 , comment:

HID 10 (the writer hostgroup) must have only one ONLINE node (indicating a single master), and the other host must be at least in OFFLINE_HARD status. For HID 20, it's expected to be ONLINE for all nodes (indicating multiple read replicas).
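
You can also confirm this state directly from the ProxySQL admin interface inside the pod, using the same kubectl exec pattern shown later in the Monitoring section (the pod name is the one reported by kubectl get pods above):

$ kubectl exec -it blog-54755cbcb5-t4cb7 -c proxysql -- mysql -uadmin -p -h127.0.0.1 -P6032 -e 'SELECT hostgroup_id, hostname, status FROM runtime_mysql_servers'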

To get a summary of the deployment, use the describe command:

$ kubectl describe deployments blog

Our blog is now running; however, we can't access it from outside of the Kubernetes network without configuring a service, as explained in the next section.

Creating the Blog Service

The last step is to attach a service to our pod. This ensures that our Wordpress blog pod is accessible from the outside world. Create a file called blog-svc.yml and paste the following lines:

apiVersion: v1
kind: Service
metadata:
  name: blog
  labels:
    app: blog
    tier: frontend
spec:
  type: NodePort
  ports:
  - name: blog
    nodePort: 30080
    port: 80
  selector:
    app: blog
    tier: frontend

Create the service:

$ kubectl create -f blog-svc.yml

Verify if the service is created correctly:

root@kube1:~/proxysql-blog# kubectl get svc
NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
blog         NodePort    10.96.140.37   <none>        80:30080/TCP   26s
kubernetes   ClusterIP   10.96.0.1      <none>        443/TCP        43h

Port 80 published by the blog pod is now mapped to the outside world via port 30080. We can access our blog at http://{any_kubernetes_host}:30080/ and should be redirected to the Wordpress installation page. If we proceed with the installation, it skips the database connection part and directly shows this page:

This indicates that our MySQL and ProxySQL settings are correctly configured inside the wp-config.php file. Otherwise, you would be redirected to the database configuration page.

Our deployment is now complete.

Managing ProxySQL Container inside a Pod

Failover and recovery are expected to be handled automatically by Kubernetes. For example, if a Kubernetes worker goes down, the pod will be recreated on the next available node after --pod-eviction-timeout (which defaults to 5 minutes). If the container crashes or is killed, Kubernetes will replace it almost instantly.

Some common management tasks are expected to be different when running within Kubernetes, as shown in the next sections.

Scaling Up and Down

In the above configuration, we deployed one replica. To scale up, simply change the spec.replicas value accordingly by using the kubectl edit command:

$ kubectl edit deployment blog

It will open up the deployment definition in a default text editor; simply change the spec.replicas value to something higher, for example, "replicas: 3". Then, save the file and immediately check the rollout status by using the following command:

$ kubectl rollout status deployment blog
Waiting for deployment "blog" rollout to finish: 1 of 3 updated replicas are available...
Waiting for deployment "blog" rollout to finish: 2 of 3 updated replicas are available...
deployment "blog" successfully rolled out

At this point, we have three blog pods (Wordpress + ProxySQL) running simultaneously in Kubernetes:

$ kubectl get pods
NAME                             READY   STATUS              RESTARTS   AGE
blog-54755cbcb5-6fnqn            2/2     Running             0          11m
blog-54755cbcb5-cwpdj            2/2     Running             0          11m
blog-54755cbcb5-jxtvc            2/2     Running             0          22m

At this point, our architecture is looking something like this:

Take note that it might require more customization than our current configuration to run Wordpress smoothly in a horizontally-scaled production environment (think of static content, session management and so on). Those topics are beyond the scope of this blog post.

The scaling-down procedure is similar.
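
Alternatively, if you don't want to edit the deployment definition by hand, kubectl scale achieves the same thing in one command (the replica count here is just an example):

$ kubectl scale deployment blog --replicas=3
$ kubectl rollout status deployment blog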

Configuration Management

Configuration management is important in ProxySQL. This is where the magic happens: you can define your own set of query rules for query caching, firewalling and rewriting. Contrary to common practice, where ProxySQL would be configured via the Admin console and pushed into persistence by using "SAVE .. TO DISK", we will stick with configuration files only to make things more portable in Kubernetes. That's the reason we are using ConfigMaps.

Since we are relying on our centralized configuration stored by Kubernetes ConfigMaps, there are a number of ways to perform configuration changes. Firstly, by using the kubectl edit command:

$ kubectl edit configmap proxysql-configmap

It will open up the configuration in a default text editor and you can directly make changes to it and save the text file once done. Otherwise, recreating the ConfigMap also works:

$ vi proxysql.cnf # edit the configuration first
$ kubectl delete configmap proxysql-configmap
$ kubectl create configmap proxysql-configmap --from-file=proxysql.cnf

After the configuration is pushed into ConfigMap, restart the pod or container as shown in the Service Control section. Configuring the container via ProxySQL admin interface (port 6032) won't make it persistent after pod rescheduling by Kubernetes.

Service Control

Since the two containers inside a pod are tightly coupled, the best way to apply the ProxySQL configuration changes is to force Kubernetes to do a pod replacement. Consider that we now have three blog pods after scaling up:

$ kubectl get pods
NAME                             READY   STATUS              RESTARTS   AGE
blog-54755cbcb5-6fnqn            2/2     Running             0          31m
blog-54755cbcb5-cwpdj            2/2     Running             0          31m
blog-54755cbcb5-jxtvc            2/2     Running             1          22m

Use the following command to replace one pod at a time:

$ kubectl get pod blog-54755cbcb5-6fnqn -n default -o yaml | kubectl replace --force -f -
pod "blog-54755cbcb5-6fnqn" deleted
pod/blog-54755cbcb5-6fnqn

Then, verify with the following:

$ kubectl get pods
NAME                             READY   STATUS              RESTARTS   AGE
blog-54755cbcb5-6fnqn            2/2     Running             0          31m
blog-54755cbcb5-cwpdj            2/2     Running             0          31m
blog-54755cbcb5-qs6jm            2/2     Running             1          2m26s

You will notice that the most recent pod has been replaced, as shown by the AGE and RESTARTS columns; it came up with a different pod name. Repeat the same steps for the remaining pods. Otherwise, you can also use the "docker kill" command to kill the ProxySQL container manually on the Kubernetes worker node. For example:

(kube-worker)$ docker kill $(docker ps | grep -i proxysql_blog | awk {'print $1'})

Kubernetes will then replace the killed ProxySQL container with a new one.

Monitoring

Use the kubectl exec command to execute an SQL statement via the mysql client. For example, to monitor query digests:

$ kubectl exec -it blog-54755cbcb5-29hqt -c proxysql -- mysql -uadmin -p -h127.0.0.1 -P6032
mysql> SELECT * FROM stats_mysql_query_digest;

Or with a one-liner:

$ kubectl exec -it blog-54755cbcb5-29hqt -c proxysql -- mysql -uadmin -p -h127.0.0.1 -P6032 -e 'SELECT * FROM stats_mysql_query_digest'

By changing the SQL statement, you can monitor other ProxySQL components or perform any administration tasks via this Admin console. Again, such changes will only persist during the ProxySQL container's lifetime and won't survive if the pod is rescheduled.

Final Thoughts

ProxySQL holds a key role if you want to scale your application containers and have an intelligent way to access a distributed database backend. There are a number of ways to deploy ProxySQL on Kubernetes to support application growth when running at scale. This blog post only covers one of them.

In an upcoming blog post, we are going to look at how to run ProxySQL in a centralized approach by using it as a Kubernetes service.

How to Cluster Your ProxySQL Load Balancers


A proxy layer between applications and databases would typically consist of multiple proxy nodes for high availability. This is no different for ProxySQL. ProxySQL, just like other modern proxies, can be used to build complex logic for routing queries. You can add query rules to send queries to particular hosts, you can cache queries, you can add and remove backend servers, or manage the users that are allowed to connect to ProxySQL and MySQL. However, numerous ProxySQL nodes in the proxy layer introduce another problem - synchronization across distributed instances. Any rules or other logic need to be synchronized across instances to ensure they behave in the same way. Even if not all of the proxies are handling traffic, the rest still work as standbys. In case they need to take over the work, you don't want any surprises if the instance used does not have the most recent configuration changes.

It is quite cumbersome to ensure this manually - making the changes by hand on all of the nodes. You can utilize tools like Ansible, Chef or Puppet to manage configurations, but the sync process has to be coded and tested. ClusterControl can help you here through an option to sync configurations between ProxySQL instances, and it can also set up and manage the other components required for high availability, e.g., the Virtual IP. But starting from version 1.4.2, ProxySQL offers a native clustering and configuration syncing mechanism. In this blog post, we will discuss how to set it up with a mix of actions taken in ClusterControl and the ProxySQL command-line admin interface.

First of all, let’s take a look at a typical replication environment deployed by ClusterControl.

As you can see from the screenshot, this is a MySQL Replication setup with three ProxySQL instances. ProxySQL high availability is implemented through Keepalived and a Virtual IP that is always assigned to one of the ProxySQL nodes. There are a couple of steps we have to take in order to configure ProxySQL clustering. First, we have to define which user ProxySQL should use to exchange information between the nodes. Let's define a new one on top of the existing administrative user:

Next, we need to define that user in the admin-cluster_username and admin-cluster_password settings.
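
If you prefer to do this step from the ProxySQL admin interface rather than through the ClusterControl UI, a rough equivalent looks like the following (the user name and password are illustrative, and the same credentials must also be present in admin-admin_credentials):

$ mysql -u admin -p -h 10.0.0.126 -P 6032 -e "UPDATE global_variables SET variable_value='cluster_admin' WHERE variable_name='admin-cluster_username'"
$ mysql -u admin -p -h 10.0.0.126 -P 6032 -e "UPDATE global_variables SET variable_value='clusterpass' WHERE variable_name='admin-cluster_password'"
$ mysql -u admin -p -h 10.0.0.126 -P 6032 -e "LOAD ADMIN VARIABLES TO RUNTIME; SAVE ADMIN VARIABLES TO DISK"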

This has been done on just one of the nodes (10.0.0.126). Let’s sync this configuration change to the remaining ProxySQL nodes.

As we stated, ClusterControl allows you to synchronize the configuration between ProxySQL nodes with just a couple of steps. Once the job has finished syncing 10.0.0.127 with 10.0.0.126, only the last node remains to be synced.

After this, we need to make a small change in the ProxySQL administrative command line interface, which is typically reachable on port 6032. We have to create entries in the 'proxysql_servers' table, which define the nodes in our ProxySQL cluster.

mysql> INSERT INTO proxysql_servers (hostname) VALUES ('10.0.0.126'), ('10.0.0.127'), ('10.0.0.128');
Query OK, 3 rows affected (0.00 sec)
mysql> LOAD PROXYSQL SERVERS TO RUNTIME;
Query OK, 0 rows affected (0.01 sec)
mysql> SAVE PROXYSQL SERVERS TO DISK;
Query OK, 0 rows affected (0.01 sec)

After loading the change to runtime, ProxySQL should start syncing the nodes. There are a couple of places where you can track the state of the cluster.

mysql> SELECT * FROM stats_proxysql_servers_checksums;
+------------+------+-------------------+---------+------------+--------------------+------------+------------+------------+
| hostname   | port | name              | version | epoch      | checksum           | changed_at | updated_at | diff_check |
+------------+------+-------------------+---------+------------+--------------------+------------+------------+------------+
| 10.0.0.128 | 6032 | admin_variables   | 0       | 0          |                    | 0          | 1539773916 | 0          |
| 10.0.0.128 | 6032 | mysql_query_rules | 2       | 1539772933 | 0x3FEC69A5C9D96848 | 1539773546 | 1539773916 | 0          |
| 10.0.0.128 | 6032 | mysql_servers     | 4       | 1539772933 | 0x3659DCF3E53498A0 | 1539773546 | 1539773916 | 0          |
| 10.0.0.128 | 6032 | mysql_users       | 2       | 1539772933 | 0xDD5F0BB01235E930 | 1539773546 | 1539773916 | 0          |
| 10.0.0.128 | 6032 | mysql_variables   | 0       | 0          |                    | 0          | 1539773916 | 0          |
| 10.0.0.128 | 6032 | proxysql_servers  | 2       | 1539773835 | 0x8EB13E2B48C3FDB0 | 1539773835 | 1539773916 | 0          |
| 10.0.0.127 | 6032 | admin_variables   | 0       | 0          |                    | 0          | 1539773916 | 0          |
| 10.0.0.127 | 6032 | mysql_query_rules | 3       | 1539773719 | 0x3FEC69A5C9D96848 | 1539773546 | 1539773916 | 0          |
| 10.0.0.127 | 6032 | mysql_servers     | 5       | 1539773719 | 0x3659DCF3E53498A0 | 1539773546 | 1539773916 | 0          |
| 10.0.0.127 | 6032 | mysql_users       | 3       | 1539773719 | 0xDD5F0BB01235E930 | 1539773546 | 1539773916 | 0          |
| 10.0.0.127 | 6032 | mysql_variables   | 0       | 0          |                    | 0          | 1539773916 | 0          |
| 10.0.0.127 | 6032 | proxysql_servers  | 2       | 1539773812 | 0x8EB13E2B48C3FDB0 | 1539773813 | 1539773916 | 0          |
| 10.0.0.126 | 6032 | admin_variables   | 0       | 0          |                    | 0          | 1539773916 | 0          |
| 10.0.0.126 | 6032 | mysql_query_rules | 1       | 1539770578 | 0x3FEC69A5C9D96848 | 1539773546 | 1539773916 | 0          |
| 10.0.0.126 | 6032 | mysql_servers     | 3       | 1539771053 | 0x3659DCF3E53498A0 | 1539773546 | 1539773916 | 0          |
| 10.0.0.126 | 6032 | mysql_users       | 1       | 1539770578 | 0xDD5F0BB01235E930 | 1539773546 | 1539773916 | 0          |
| 10.0.0.126 | 6032 | mysql_variables   | 0       | 0          |                    | 0          | 1539773916 | 0          |
| 10.0.0.126 | 6032 | proxysql_servers  | 2       | 1539773546 | 0x8EB13E2B48C3FDB0 | 1539773546 | 1539773916 | 0          |
+------------+------+-------------------+---------+------------+--------------------+------------+------------+------------+
18 rows in set (0.00 sec)

The stats_proxysql_servers_checksums table contains, among others, a list of nodes in the cluster, tables that are synced, versions and checksum of the table. If the checksum is not in line, ProxySQL will attempt to get the latest version from a cluster peer. More detailed information about the contents of this table can be found in ProxySQL documentation.

Another source of information about the process is ProxySQL’s log (by default it is located in /var/lib/proxysql/proxysql.log).

2018-10-17 11:00:25 [INFO] Cluster: detected a new checksum for mysql_query_rules from peer 10.0.0.126:6032, version 2, epoch 1539774025, checksum 0xD615D5416F61AA72 . Not syncing yet …
2018-10-17 11:00:27 [INFO] Cluster: detected a peer 10.0.0.126:6032 with mysql_query_rules version 2, epoch 1539774025, diff_check 3. Own version: 2, epoch: 1539772933. Proceeding with remote sync
2018-10-17 11:00:28 [INFO] Cluster: detected a peer 10.0.0.126:6032 with mysql_query_rules version 2, epoch 1539774025, diff_check 4. Own version: 2, epoch: 1539772933. Proceeding with remote sync
2018-10-17 11:00:28 [INFO] Cluster: detected peer 10.0.0.126:6032 with mysql_query_rules version 2, epoch 1539774025
2018-10-17 11:00:28 [INFO] Cluster: Fetching MySQL Query Rules from peer 10.0.0.126:6032 started
2018-10-17 11:00:28 [INFO] Cluster: Fetching MySQL Query Rules from peer 10.0.0.126:6032 completed
2018-10-17 11:00:28 [INFO] Cluster: Loading to runtime MySQL Query Rules from peer 10.0.0.126:6032
2018-10-17 11:00:28 [INFO] Cluster: Saving to disk MySQL Query Rules from peer 10.0.0.126:6032

As you can see, we have here information that a new checksum has been detected and the sync process is in place.

Let’s stop for a moment here and discuss how ProxySQL handles configuration updates from multiple sources. ProxySQL tracks checksums to detect when a configuration has changed. It also stores when it happened - this data is stored as a timestamp, so it has one-second resolution. ProxySQL has two variables which also impact how changes are synchronized.

cluster_check_interval_ms - determines how often ProxySQL should check for configuration changes. By default it is 1000ms.

cluster_mysql_servers_diffs_before_sync - tells ProxySQL how many consecutive checks have to detect a configuration change before it gets synced. The default setting is 3.
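Both settings are exposed as admin variables (with an "admin-cluster_" prefix) in the ProxySQL admin interface, so you can inspect or tune them there. A minimal sketch, using the default values mentioned above:

mysql> SELECT variable_name, variable_value FROM global_variables WHERE variable_name IN ('admin-cluster_check_interval_ms','admin-cluster_mysql_servers_diffs_before_sync');
mysql> UPDATE global_variables SET variable_value='1000' WHERE variable_name='admin-cluster_check_interval_ms';
mysql> UPDATE global_variables SET variable_value='3' WHERE variable_name='admin-cluster_mysql_servers_diffs_before_sync';
mysql> LOAD ADMIN VARIABLES TO RUNTIME;
mysql> SAVE ADMIN VARIABLES TO DISK;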

This means that if you make configuration changes on the same host less than 4 seconds apart, the remaining ProxySQL nodes may not be able to synchronize them because a new change will show up before the previous one has been synchronized. It also means that if you make configuration changes on multiple ProxySQL instances, you should make them with at least a 4-second break between them, as otherwise some of the changes will be lost and, as a result, configurations will diverge. For example, you add Server1 on Proxy1, and after 2 seconds you add Server2 on Proxy2. All other proxies will reject the change on Proxy1 because they will detect that Proxy2 has a newer configuration. 4 seconds after the change on Proxy2, all proxies (including Proxy1) will pull the configuration from Proxy2.

As the intra-cluster communication is not synchronous, if the ProxySQL node you made the changes on fails, the changes may not be replicated in time. The best approach is to make the same change on two ProxySQL nodes. This way, unless both fail at exactly the same time, one of them will be able to propagate the new configuration.

Also worth noting is that the ProxySQL cluster topology can be quite flexible. In our case we have three nodes, and all have three entries in the proxysql_servers table. Such nodes form the cluster where you can write to any node and the changes will be propagated. On top of that, it is possible to add external nodes which would work in a “read-only” mode, which means that they would only synchronize changes made to the “core” cluster but they won’t propagate changes that were performed directly on themselves. All you need on the new node is to have just the “core” cluster nodes configured in proxysql_servers; as a result, it will connect to those nodes and pull the configuration changes, but it will not be queried by the rest of the cluster for its own configuration changes. This setup could be used to create a source of truth with several nodes in the cluster, and other ProxySQL nodes, which just get the configuration from the main “core” cluster.
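As an illustration, on such a read-only satellite node you would configure only the three core nodes from our setup in its proxysql_servers table (without adding the satellite itself to the core nodes’ configuration), plus the same admin-cluster_username/password credentials. A minimal sketch on the satellite:

mysql> INSERT INTO proxysql_servers (hostname) VALUES ('10.0.0.126'), ('10.0.0.127'), ('10.0.0.128');
mysql> LOAD PROXYSQL SERVERS TO RUNTIME;
mysql> SAVE PROXYSQL SERVERS TO DISK;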

In addition to all of that, ProxySQL cluster supports automatic rejoining of the nodes - they will sync their configuration while starting. It can also be easily scaled out by adding more nodes.

We hope this blog post gives you an insight into how ProxySQL cluster can be configured. ProxySQL cluster will be transparent to ClusterControl and it will not impact any of the operations you may want to execute from the ClusterControl UI.

MySQL on Docker: Running ProxySQL as Kubernetes Service


When running distributed database clusters, it is quite common to front them with load balancers. The advantages are clear - load balancing, connection failover and decoupling of the application tier from the underlying database topologies. For more intelligent load balancing, a database-aware proxy like ProxySQL or MaxScale would be the way to go. In our previous blog, we showed you how to run ProxySQL as a helper container in Kubernetes. In this blog post, we’ll show you how to deploy ProxySQL as a Kubernetes service. We’ll use Wordpress as an example application, with the database backend running on a two-node MySQL Replication setup deployed using ClusterControl. The following diagram illustrates our infrastructure:

Since we are going to deploy a similar setup as in that previous blog post, expect some duplication in parts of this post; we have kept it in so this post is readable on its own.

ProxySQL on Kubernetes

Let’s start with a bit of recap. Designing a ProxySQL architecture is a subjective topic and highly dependent on the placement of the application, database containers as well as the role of ProxySQL itself. Ideally, we can configure ProxySQL to be managed by Kubernetes with two configurations:

  1. ProxySQL as a Kubernetes service (centralized deployment)
  2. ProxySQL as a helper container in a pod (distributed deployment)

Both deployments can be distinguished easily by looking at the following diagram:

This blog post will cover the first configuration - running ProxySQL as a Kubernetes service. The second configuration is already covered here. In contrast to the helper container approach, running as a service makes ProxySQL pods live independently from the applications, and they can be easily scaled and clustered together with the help of Kubernetes ConfigMap. This is definitely a different clustering approach than ProxySQL's native clustering support, which relies on configuration checksums across ProxySQL instances (a.k.a. proxysql_servers). Check out this blog post if you want to learn about ProxySQL clustering made easy with ClusterControl.

In Kubernetes, ProxySQL's multi-layer configuration system makes pod clustering possible with ConfigMap. However, there are a number of shortcomings and workarounds required to make it work as smoothly as ProxySQL's native clustering feature does. At the moment, signalling a pod upon a ConfigMap update is a feature in the works. We will cover this topic in much greater detail in an upcoming blog post.

Basically, we need to create ProxySQL pods and attach a Kubernetes service so they can be accessed by the other pods within the Kubernetes network or externally. Applications will then connect to the ProxySQL service via TCP/IP networking on the configured ports - by default 6033 for MySQL load-balanced connections and 6032 for the ProxySQL administration console. With more than one replica, the connections to the pods will be load balanced automatically by the Kubernetes kube-proxy component running on every Kubernetes node.

ProxySQL as Kubernetes Service

In this setup, we run both ProxySQL and Wordpress as pods and services. The following diagram illustrates our high-level architecture:

In this setup, we will deploy two pods and services - "wordpress" and "proxysql". We will merge Deployment and Service declaration in one YAML file per application and manage them as one unit. To keep the application containers' content persistent across multiple nodes, we have to use a clustered or remote file system, which in this case is NFS.

Deploying ProxySQL as a service brings a couple of good things over the helper container approach:

  • Using Kubernetes ConfigMap approach, ProxySQL can be clustered with immutable configuration.
  • Kubernetes handles ProxySQL recovery and balances the connections to the instances automatically.
  • Single endpoint with Kubernetes Virtual IP address implementation called ClusterIP.
  • Centralized reverse proxy tier with shared nothing architecture.
  • Can be used with external applications outside Kubernetes.

We will start the deployment as two replicas for ProxySQL and three for Wordpress to demonstrate running at scale and load-balancing capabilities that Kubernetes offers.

Preparing the Database

Create the wordpress database and user on the master and assign with correct privilege:

mysql-master> CREATE DATABASE wordpress;
mysql-master> CREATE USER wordpress@'%' IDENTIFIED BY 'passw0rd';
mysql-master> GRANT ALL PRIVILEGES ON wordpress.* TO wordpress@'%';

Also, create the ProxySQL monitoring user:

mysql-master> CREATE USER proxysql@'%' IDENTIFIED BY 'proxysqlpassw0rd';

Then, reload the grant table:

mysql-master> FLUSH PRIVILEGES;

ProxySQL Pod and Service Definition

The next one is to prepare our ProxySQL deployment. Create a file called proxysql-rs-svc.yml and add the following lines:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: proxysql
  labels:
    app: proxysql
spec:
  replicas: 2
  selector:
    matchLabels:
      app: proxysql
      tier: frontend
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: proxysql
        tier: frontend
    spec:
      restartPolicy: Always
      containers:
      - image: severalnines/proxysql:1.4.12
        name: proxysql
        volumeMounts:
        - name: proxysql-config
          mountPath: /etc/proxysql.cnf
          subPath: proxysql.cnf
        ports:
        - containerPort: 6033
          name: proxysql-mysql
        - containerPort: 6032
          name: proxysql-admin
      volumes:
      - name: proxysql-config
        configMap:
          name: proxysql-configmap
---
apiVersion: v1
kind: Service
metadata:
  name: proxysql
  labels:
    app: proxysql
    tier: frontend
spec:
  type: NodePort
  ports:
  - nodePort: 30033
    port: 6033
    name: proxysql-mysql
  - nodePort: 30032
    port: 6032
    name: proxysql-admin
  selector:
    app: proxysql
    tier: frontend

Let's see what those definitions are all about. The YAML consists of two resources combined in one file, separated by the "---" delimiter. The first resource is the Deployment, for which we define the following specification:

spec:
  replicas: 2
  selector:
    matchLabels:
      app: proxysql
      tier: frontend
  strategy:
    type: RollingUpdate

The above means we would like to deploy two ProxySQL pods as a ReplicaSet that matches containers labelled with "app=proxysql,tier=frontend". The deployment strategy specifies how old pods are replaced by new ones. In this deployment, we picked RollingUpdate, which means the pods will be updated in a rolling fashion, one pod at a time.

The next part is the container's template:

      - image: severalnines/proxysql:1.4.12
        name: proxysql
        volumeMounts:
        - name: proxysql-config
          mountPath: /etc/proxysql.cnf
          subPath: proxysql.cnf
        ports:
        - containerPort: 6033
          name: proxysql-mysql
        - containerPort: 6032
          name: proxysql-admin
      volumes:
      - name: proxysql-config
        configMap:
          name: proxysql-configmap

In the spec.template.spec.containers[*] section, we are telling Kubernetes to deploy ProxySQL using the severalnines/proxysql image, version 1.4.12. We also want Kubernetes to mount our custom, pre-configured configuration file and map it to /etc/proxysql.cnf inside the container. The running pods will publish two ports - 6033 and 6032. We also define the "volumes" section, where we instruct Kubernetes to expose the ConfigMap as a volume inside the ProxySQL pods, to be mounted via volumeMounts.

The second resource is the service. A Kubernetes service is an abstraction layer which defines the logical set of pods and a policy by which to access them. In this section, we define the following:

apiVersion: v1
kind: Service
metadata:
  name: proxysql
  labels:
    app: proxysql
    tier: frontend
spec:
  type: NodePort
  ports:
  - nodePort: 30033
    port: 6033
    name: proxysql-mysql
  - nodePort: 30032
    port: 6032
    name: proxysql-admin
  selector:
    app: proxysql
    tier: frontend

In this case, we want our ProxySQL to be accessed from the external network, thus NodePort is the chosen service type. This will publish the nodePort on every Kubernetes node in the cluster. The range of valid ports for a NodePort resource is 30000-32767. We chose port 30033 for MySQL load-balanced connections, which is mapped to port 6033 of the ProxySQL pods, and port 30032 for the ProxySQL administration port, mapped to 6032.

Therefore, based on our YAML definition above, we have to prepare the following Kubernetes resource before we can begin to deploy the "proxysql" pod:

  • ConfigMap - To store ProxySQL configuration file as a volume so it can be mounted to multiple pods and can be remounted again if the pod is being rescheduled to the other Kubernetes node.

Preparing ConfigMap for ProxySQL

Similar to the previous blog post, we are going to use the ConfigMap approach to decouple the configuration file from the container, and also for scalability purposes. Take note that in this setup, we consider our ProxySQL configuration immutable.

Firstly, create the ProxySQL configuration file, proxysql.cnf and add the following lines:

datadir="/var/lib/proxysql"
admin_variables=
{
        admin_credentials="proxysql-admin:adminpassw0rd"
        mysql_ifaces="0.0.0.0:6032"
        refresh_interval=2000
}
mysql_variables=
{
        threads=4
        max_connections=2048
        default_query_delay=0
        default_query_timeout=36000000
        have_compress=true
        poll_timeout=2000
        interfaces="0.0.0.0:6033;/tmp/proxysql.sock"
        default_schema="information_schema"
        stacksize=1048576
        server_version="5.1.30"
        connect_timeout_server=10000
        monitor_history=60000
        monitor_connect_interval=200000
        monitor_ping_interval=200000
        ping_interval_server_msec=10000
        ping_timeout_server=200
        commands_stats=true
        sessions_sort=true
        monitor_username="proxysql"
        monitor_password="proxysqlpassw0rd"
}
mysql_replication_hostgroups =
(
        { writer_hostgroup=10, reader_hostgroup=20, comment="MySQL Replication 5.7" }
)
mysql_servers =
(
        { address="192.168.55.171" , port=3306 , hostgroup=10, max_connections=100 },
        { address="192.168.55.172" , port=3306 , hostgroup=10, max_connections=100 },
        { address="192.168.55.171" , port=3306 , hostgroup=20, max_connections=100 },
        { address="192.168.55.172" , port=3306 , hostgroup=20, max_connections=100 }
)
mysql_users =
(
        { username = "wordpress" , password = "passw0rd" , default_hostgroup = 10 , active = 1 }
)
mysql_query_rules =
(
        {
                rule_id=100
                active=1
                match_pattern="^SELECT .* FOR UPDATE"
                destination_hostgroup=10
                apply=1
        },
        {
                rule_id=200
                active=1
                match_pattern="^SELECT .*"
                destination_hostgroup=20
                apply=1
        },
        {
                rule_id=300
                active=1
                match_pattern=".*"
                destination_hostgroup=10
                apply=1
        }
)

Pay attention to the admin_variables.admin_credentials variable, where we used a non-default user, "proxysql-admin". ProxySQL reserves the default "admin" user for local connections via localhost only. Therefore, we have to use other users to access the ProxySQL instance remotely. Otherwise, you would get the following error:

ERROR 1040 (42000): User 'admin' can only connect locally

Our ProxySQL configuration is based on our two database servers running in MySQL Replication as summarized in the following Topology screenshot taken from ClusterControl:

All writes should go to the master node while reads are forwarded to hostgroup 20, as defined under the "mysql_query_rules" section. That's the basis of read/write splitting, and we want to utilize it fully.

Then, import the configuration file into ConfigMap:

$ kubectl create configmap proxysql-configmap --from-file=proxysql.cnf
configmap/proxysql-configmap created

Verify if the ConfigMap is loaded into Kubernetes:

$ kubectl get configmap
NAME                 DATA   AGE
proxysql-configmap   1      45s

Wordpress Pod and Service Definition

Now, paste the following lines into a file called wordpress-rs-svc.yml on the host where kubectl is configured:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress
  labels:
    app: wordpress
spec:
  replicas: 3
  selector:
    matchLabels:
      app: wordpress
      tier: frontend
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: wordpress
        tier: frontend
    spec:
      restartPolicy: Always
      containers:
      - image: wordpress:4.9-apache
        name: wordpress
        env:
        - name: WORDPRESS_DB_HOST
          value: proxysql:6033 # proxysql.default.svc.cluster.local:6033
        - name: WORDPRESS_DB_USER
          value: wordpress
        - name: WORDPRESS_DB_NAME
          value: wordpress
        - name: WORDPRESS_DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-pass
              key: password
        ports:
        - containerPort: 80
          name: wordpress
        volumeMounts:
        - name: wordpress-persistent-storage
          mountPath: /var/www/html
      volumes:
      - name: wordpress-persistent-storage
        persistentVolumeClaim:
          claimName: wp-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: wordpress
  labels:
    app: wordpress
    tier: frontend
spec:
  type: NodePort
  ports:
  - name: wordpress
    nodePort: 30088
    port: 80
  selector:
    app: wordpress
    tier: frontend

Similar to our ProxySQL definition, the YAML consists of two resources combined in one file, separated by the "---" delimiter. The first one is the Deployment resource, which will be deployed as a ReplicaSet, as shown under the "spec.*" section:

spec:
  replicas: 3
  selector:
    matchLabels:
      app: wordpress
      tier: frontend
  strategy:
    type: RollingUpdate

This section provides the Deployment specification - 3 pods to start, matching the label "app=wordpress,tier=frontend". The deployment strategy is RollingUpdate, which means Kubernetes will replace the pods in a rolling update fashion, the same as with our ProxySQL deployment.

The next part is the "spec.template.spec.*" section:

      restartPolicy: Always
      containers:
      - image: wordpress:4.9-apache
        name: wordpress
        env:
        - name: WORDPRESS_DB_HOST
          value: proxysql:6033
        - name: WORDPRESS_DB_USER
          value: wordpress
        - name: WORDPRESS_DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-pass
              key: password
        ports:
        - containerPort: 80
          name: wordpress
        volumeMounts:
        - name: wordpress-persistent-storage
          mountPath: /var/www/html


In this section, we are telling Kubernetes to deploy Wordpress 4.9 using the Apache web server, and we gave the container the name "wordpress". The container will be restarted whenever it goes down, regardless of its exit status. We also want Kubernetes to pass a number of environment variables:

  • WORDPRESS_DB_HOST - The MySQL database host. Since we are using ProxySQL as a service, the service name will be the value of metadata.name, which is "proxysql". ProxySQL listens on port 6033 for MySQL load-balanced connections while the ProxySQL administration console is on 6032.
  • WORDPRESS_DB_USER - The wordpress database user that was created under the "Preparing the Database" section.
  • WORDPRESS_DB_PASSWORD - The password for WORDPRESS_DB_USER. Since we do not want to expose the password in this file, we hide it using Kubernetes Secrets. Here we instruct Kubernetes to read the "mysql-pass" Secret resource instead. The Secret has to be created in advance of the pod deployment, as explained further down.

We also want to publish port 80 of the pod for the end user. The Wordpress content stored inside /var/www/html in the container will be mounted into our persistent storage running on NFS. We will use the PersistentVolume and PersistentVolumeClaim resources for this purpose as shown under "Preparing Persistent Storage for Wordpress" section.

After the "---" break line, we define another resource called Service:

apiVersion: v1
kind: Service
metadata:
  name: wordpress
  labels:
    app: wordpress
    tier: frontend
spec:
  type: NodePort
  ports:
  - name: wordpress
    nodePort: 30088
    port: 80
  selector:
    app: wordpress
    tier: frontend

In this configuration, we would like Kubernetes to create a service called "wordpress", listen on port 30088 on all nodes (a.k.a. NodePort) to the external network and forward it to port 80 on all pods labelled with "app=wordpress,tier=frontend".

Therefore, based on our YAML definition above, we have to prepare a number of Kubernetes resources before we can begin to deploy the "wordpress" pod and service:

  • PersistentVolume and PersistentVolumeClaim - To store the web contents of our Wordpress application, so when the pod is being rescheduled to other worker node, we won't lose the last changes.
  • Secrets - To hide the Wordpress database user password inside the YAML file.

Preparing Persistent Storage for Wordpress

A good persistent storage for Kubernetes should be accessible by all Kubernetes nodes in the cluster. For the sake of this blog post, we used NFS as the PersistentVolume (PV) provider because it's easy and supported out-of-the-box. The NFS server is located somewhere outside of our Kubernetes network (as shown in the first architecture diagram) and we have configured it to allow all Kubernetes nodes with the following line inside /etc/exports:

/nfs    192.168.55.*(rw,sync,no_root_squash,no_all_squash)
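After editing /etc/exports, the export list has to be reloaded. Assuming the standard NFS server tooling, a quick way to re-export and verify (run on the NFS server) is:

(nfs-server)$ sudo exportfs -ra
(nfs-server)$ showmount -e localhost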

Take note that the NFS client package must be installed on all Kubernetes nodes. Otherwise, Kubernetes won't be able to mount the NFS share correctly. On all nodes:

$ sudo apt-get install nfs-common #Ubuntu/Debian
$ yum install nfs-utils #RHEL/CentOS

Also, make sure on the NFS server, the target directory exists:

(nfs-server)$ mkdir /nfs/kubernetes/wordpress

Then, create a file called wordpress-pv-pvc.yml and add the following lines:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: wp-pv
  labels:
    app: wordpress
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 3Gi
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: /nfs/kubernetes/wordpress
    server: 192.168.55.200
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: wp-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
  selector:
    matchLabels:
      app: wordpress
      tier: frontend

In the above definition, we are telling Kubernetes to allocate 3GB of volume space on the NFS server for our Wordpress container. Take note that for production usage, NFS should be configured with an automatic provisioner and a storage class.

Create the PV and PVC resources:

$ kubectl create -f wordpress-pv-pvc.yml

Verify that those resources are created and their status is "Bound":

$ kubectl get pv,pvc
NAME                     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM            STORAGECLASS   REASON   AGE
persistentvolume/wp-pv   3Gi        RWO            Recycle          Bound    default/wp-pvc                           22h


NAME                           STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/wp-pvc   Bound    wp-pv    3Gi        RWO                           22h

Preparing Secrets for Wordpress

Create a secret to be used by the Wordpress container for WORDPRESS_DB_PASSWORD environment variable. The reason is simply because we don't want to expose the password in clear text inside the YAML file.

Create a secret resource called mysql-pass and pass the password accordingly:

$ kubectl create secret generic mysql-pass --from-literal=password=passw0rd

Verify that our secret is created:

$ kubectl get secrets mysql-pass
NAME         TYPE     DATA   AGE
mysql-pass   Opaque   1      7h12m

Deploying ProxySQL and Wordpress

Finally, we can begin the deployment. Deploy ProxySQL first, followed by Wordpress:

$ kubectl create -f proxysql-rs-svc.yml
$ kubectl create -f wordpress-rs-svc.yml

We can then list out all pods and services that have been created under "frontend" tier:

$ kubectl get pods,services -l tier=frontend -o wide
NAME                             READY   STATUS    RESTARTS   AGE   IP          NODE          NOMINATED NODE
pod/proxysql-95b8d8446-qfbf2     1/1     Running   0          12m   10.36.0.2   kube2.local   <none>
pod/proxysql-95b8d8446-vljlr     1/1     Running   0          12m   10.44.0.6   kube3.local   <none>
pod/wordpress-59489d57b9-4dzvk   1/1     Running   0          37m   10.36.0.1   kube2.local   <none>
pod/wordpress-59489d57b9-7d2jb   1/1     Running   0          30m   10.44.0.4   kube3.local   <none>
pod/wordpress-59489d57b9-gw4p9   1/1     Running   0          30m   10.36.0.3   kube2.local   <none>

NAME                TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE   SELECTOR
service/proxysql    NodePort   10.108.195.54    <none>        6033:30033/TCP,6032:30032/TCP   10m   app=proxysql,tier=frontend
service/wordpress   NodePort   10.109.144.234   <none>        80:30088/TCP                    37m   app=wordpress,tier=frontend

The above output verifies our deployment architecture: we currently have three Wordpress pods exposed publicly on port 30088, as well as our ProxySQL instances exposed on ports 30033 and 30032 externally, plus 6033 and 6032 internally.
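As a quick sanity check from outside the cluster (kube2.local is one of the worker nodes from the output above; any Kubernetes node would do), you can probe the published Wordpress port:

$ curl -I http://kube2.local:30088/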

At this point, our architecture is looking something like this:

Port 80 published by the Wordpress pods is now mapped to the outside world via port 30088. We can access our blog at http://{any_kubernetes_host}:30088/ and should be redirected to the Wordpress installation page. If we proceed with the installation, it skips the database connection part and directly shows this page:

This indicates that our MySQL and ProxySQL settings are correctly configured inside the wp-config.php file. Otherwise, you would be redirected to the database configuration page.

Our deployment is now complete.

ProxySQL Pods and Service Management

Failover and recovery are expected to be handled automatically by Kubernetes. For example, if a Kubernetes worker goes down, the pod will be recreated on the next available node after --pod-eviction-timeout (defaults to 5 minutes). If the container crashes or is killed, Kubernetes will replace it almost instantly.

Some common management tasks are expected to be different when running within Kubernetes, as shown in the next sections.

Connecting to ProxySQL

While ProxySQL is exposed externally on port 30033 (MySQL) and 30032 (Admin), it is also accessible internally via the published ports, 6033 and 6032 respectively. Thus, to access the ProxySQL instances within the Kubernetes network, use the CLUSTER-IP or the service name "proxysql" as the host value. For example, from within a Wordpress pod, you may access the ProxySQL admin console by using the following command:

$ mysql -uproxysql-admin -p -hproxysql -P6032

If you want to connect externally, use the port defined under the nodePort value in the service YAML and pick any of the Kubernetes nodes as the host value:

$ mysql -uproxysql-admin -p -hkube3.local -P30032

The same applies to the MySQL load-balanced connection on port 30033 (external) and 6033 (internal).
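For example, a connection to the load-balanced MySQL port from within the Kubernetes network, using the wordpress application user defined earlier, would look like this:

$ mysql -uwordpress -p -hproxysql -P6033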

Scaling Up and Down

Scaling up is easy with Kubernetes:

$ kubectl scale deployment proxysql --replicas=5
deployment.extensions/proxysql scaled

Verify the rollout status:

$ kubectl rollout status deployment proxysql
deployment "proxysql" successfully rolled out

Scaling down is similar. Here we want to revert from 5 to 2 replicas:

$ kubectl scale deployment proxysql --replicas=2
deployment.extensions/proxysql scaled

We can also look at the deployment events for ProxySQL to get a better picture of what has happened for this deployment by using the "describe" option:

$ kubectl describe deployment proxysql
...
Events:
  Type    Reason             Age    From                   Message
  ----    ------             ----   ----                   -------
  Normal  ScalingReplicaSet  20m    deployment-controller  Scaled up replica set proxysql-769895fbf7 to 1
  Normal  ScalingReplicaSet  20m    deployment-controller  Scaled down replica set proxysql-95b8d8446 to 1
  Normal  ScalingReplicaSet  20m    deployment-controller  Scaled up replica set proxysql-769895fbf7 to 2
  Normal  ScalingReplicaSet  20m    deployment-controller  Scaled down replica set proxysql-95b8d8446 to 0
  Normal  ScalingReplicaSet  7m10s  deployment-controller  Scaled up replica set proxysql-6c55f647cb to 1
  Normal  ScalingReplicaSet  7m     deployment-controller  Scaled down replica set proxysql-769895fbf7 to 1
  Normal  ScalingReplicaSet  7m     deployment-controller  Scaled up replica set proxysql-6c55f647cb to 2
  Normal  ScalingReplicaSet  6m53s  deployment-controller  Scaled down replica set proxysql-769895fbf7 to 0
  Normal  ScalingReplicaSet  54s    deployment-controller  Scaled up replica set proxysql-6c55f647cb to 5
  Normal  ScalingReplicaSet  21s    deployment-controller  Scaled down replica set proxysql-6c55f647cb to 2

The connections to the pods will be load balanced automatically by Kubernetes.

Configuration Changes

One way to make configuration changes on our ProxySQL pods is by versioning our configuration using another ConfigMap name. Firstly, modify our configuration file directly via your favourite text editor:

$ vim /root/proxysql.cnf

Then, load it up into Kubernetes ConfigMap with a different name. In this example, we append "-v2" in the resource name:

$ kubectl create configmap proxysql-configmap-v2 --from-file=proxysql.cnf

Verify if the ConfigMap is loaded correctly:

$ kubectl get configmap
NAME                    DATA   AGE
proxysql-configmap      1      3d15h
proxysql-configmap-v2   1      19m

Open the ProxySQL deployment file, proxysql-rs-svc.yml and change the following line under configMap section to the new version:

      volumes:
      - name: proxysql-config
        configMap:
          name: proxysql-configmap-v2 #change this line

Then, apply the changes to our ProxySQL deployment:

$ kubectl apply -f proxysql-rs-svc.yml
deployment.apps/proxysql configured
service/proxysql configured

Verify the rollout by looking at the ReplicaSet events using the "describe" command:

$ kubectl describe deployment proxysql
...
Pod Template:
  Labels:  app=proxysql
           tier=frontend
  Containers:
   proxysql:
    Image:        severalnines/proxysql:1.4.12
    Ports:        6033/TCP, 6032/TCP
    Host Ports:   0/TCP, 0/TCP
    Environment:  <none>
    Mounts:
      /etc/proxysql.cnf from proxysql-config (rw)
  Volumes:
   proxysql-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      proxysql-configmap-v2
    Optional:  false
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   proxysql-769895fbf7 (2/2 replicas created)
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  53s   deployment-controller  Scaled up replica set proxysql-769895fbf7 to 1
  Normal  ScalingReplicaSet  46s   deployment-controller  Scaled down replica set proxysql-95b8d8446 to 1
  Normal  ScalingReplicaSet  46s   deployment-controller  Scaled up replica set proxysql-769895fbf7 to 2
  Normal  ScalingReplicaSet  41s   deployment-controller  Scaled down replica set proxysql-95b8d8446 to 0

Pay attention to the "Volumes" section with the new ConfigMap name. You can also see the deployment events at the bottom of the output. At this point, our new configuration has been loaded into all ProxySQL pods: Kubernetes scaled the old ProxySQL ReplicaSet down to 0 (obeying the RollingUpdate strategy) and brought the new one up to the desired state of 2 replicas.

Final Thoughts

Up until this point, we have covered a possible deployment approach for ProxySQL in Kubernetes. Running ProxySQL with the help of Kubernetes ConfigMap opens up a new possibility for ProxySQL clustering, one which is somewhat different from the native clustering support built into ProxySQL.

In the upcoming blog post, we will explore ProxySQL Clustering using Kubernetes ConfigMap and how to do it the right way. Stay tuned!

Simple Scheduling of Maintenance Windows across your Database Clusters


Maintenance is something that an operations team cannot avoid. Servers have to keep up with the latest software, hardware and technology to ensure systems are stable and running with the lowest risk possible, while making use of newer features to improve overall performance.

Undoubtedly, there is a long list of maintenance tasks that have to be performed by system administrators, especially when it comes to critical systems. Some of the tasks have to be performed at regular intervals, like daily, weekly, monthly and yearly. Some have to be done right away, urgently. Nevertheless, any maintenance operation should not lead to another bigger problem, and any maintenance has to be handled with extra care to avoid any interruption to the business.

Getting questionable states and false alarms is common while maintenance is ongoing. This is expected because, during the maintenance period, the server will not be operating as it should until the maintenance task is completed. ClusterControl, the all-inclusive management and monitoring platform for your open-source databases, can be configured to understand these circumstances to simplify your maintenance routines, without sacrificing the monitoring and automation features it offers.

Maintenance Mode

ClusterControl introduced maintenance mode in version 1.4.0, where you can put an individual node into maintenance, which prevents ClusterControl from raising alarms and sending notifications for the specified duration. Maintenance mode can be configured from the ClusterControl UI and also using the ClusterControl CLI tool called "s9s". From the UI, just go to Nodes -> pick a node -> Node Actions -> Schedule Maintenance Mode:

Here, one can set the maintenance period for a pre-defined time or schedule it accordingly. You can also write down the reason for scheduling the upgrade, useful for auditing purposes. You should see the following notification when the maintenance mode is active:

ClusterControl will not degrade the node, so the node’s state remains as is unless you perform an action that changes it. Alarms and notifications for this node will be reactivated once the maintenance period is over, or when the operator explicitly disables it by going to Node Actions -> Disable Maintenance Mode.

Take note that if automatic node recovery is enabled, ClusterControl will always recover a node regardless of the maintenance mode status. Don't forget to disable node recovery to avoid ClusterControl interfering with your maintenance tasks; this can be done from the top summary bar.

The maintenance mode can also be configured via ClusterControl CLI or "s9s". You can use the "s9s maintenance" command to list out and manipulate the maintenance periods. The following command line schedules a one-hour maintenance window for node 192.168.1.121 tomorrow:

$ s9s maintenance --create \
--nodes=192.168.1.121 \
--start="$(date -d 'now + 1 day' '+%Y-%m-%d %H:%M:%S')" \
--end="$(date -d 'now + 1 day + 1 hour' '+%Y-%m-%d %H:%M:%S')" \
--reason="Upgrading software."

For more details and examples, see s9s maintenance documentation.

Cluster-Wide Maintenance Mode

At the time of this writing, maintenance mode must be configured per managed node. For cluster-wide maintenance, one has to repeat the scheduling process for every managed node of the cluster. This can be impractical if you have a large number of nodes in your cluster, or if the maintenance interval between two tasks is very short.

Fortunately, ClusterControl CLI (a.k.a s9s) can be used as a workaround to overcome this limitation. You can use "s9s nodes" to list out and manipulate the managed nodes in a cluster. This list can be iterated over to schedule a cluster-wide maintenance mode at one given time using "s9s maintenance" command.

Let's look at an example to understand this better. Consider the following three-node Percona XtraDB Cluster that we have:

$ s9s nodes --list --cluster-name='PXC57' --long
STAT VERSION    CID CLUSTER HOST         PORT COMMENT
coC- 1.7.0.2832   1 PXC57   10.0.2.15    9500 Up and running.
go-M 5.7.23       1 PXC57   192.168.0.51 3306 Up and running.
go-- 5.7.23       1 PXC57   192.168.0.52 3306 Up and running.
go-- 5.7.23       1 PXC57   192.168.0.53 3306 Up and running.
Total: 4

The cluster has a total of 4 nodes - 3 database nodes and one ClusterControl node. The first column, STAT, shows the role and status of the node. The first character is the node's role - "c" means controller and "g" means Galera database node. Since we want to schedule only the database nodes for maintenance, we can filter the output to get the hostname or IP address of the nodes whose reported STAT starts with "g":

$ s9s nodes --list --cluster-name='PXC57' --long --batch | grep ^g | awk {'print $5'}
192.168.0.51
192.168.0.52
192.168.0.53

With a simple iteration, we can then schedule a cluster-wide maintenance window for every node in the cluster. The following command iterates over all the IP addresses found in the cluster using a for loop, where we plan to start the maintenance operation at the same time tomorrow and finish an hour later:

$ for host in $(s9s nodes --list --cluster-name='PXC57' --long --batch | grep ^g | awk {'print $5'}); do \
s9s maintenance \
--create \
--nodes=$host \
--start="$(date -d 'now + 1 day' '+%Y-%m-%d %H:%M:%S')" \
--end="$(date -d 'now + 1 day + 1 hour' '+%Y-%m-%d %H:%M:%S')" \
--reason="OS upgrade"; done
f92c5370-004d-4735-bba0-8c1bd26b9b98
9ff7dd8c-f2cb-4446-b14b-a5c2b915b853
103d715d-d0bc-4402-9326-1a053bc5d36b

You should see a printout of 3 UUIDs, the unique string that identifies every maintenance period. We can then verify with the following command:

$ s9s maintenance --list --long
ST UUID    OWNER GROUP  START               END                 HOST/CLUSTER REASON
-h f92c537 admin admins 2018-10-31 16:02:00 2018-10-31 17:02:00 192.168.0.51 OS upgrade
-h 9ff7dd8 admin admins 2018-10-31 16:02:00 2018-10-31 17:02:00 192.168.0.52 OS upgrade
-h 103d715 admin admins 2018-10-31 16:02:00 2018-10-31 17:02:00 192.168.0.53 OS upgrade
Total: 3

From the above output, we get a list of scheduled maintenance windows for every database node. During the scheduled time, ClusterControl will neither raise alarms nor send out notifications if it finds irregularities in the cluster.

Maintenance Mode Iteration

Some maintenance routines have to be done at regular intervals, e.g., backups, housekeeping and clean-up tasks. During the maintenance time, we would expect the server to behave differently. However, any service failure, temporary inaccessibility or high load would surely cause havoc in our monitoring system. For frequent and short maintenance slots, this could turn out to be very annoying, and skipping the raised false alarms might give you better sleep at night.

However, enabling maintenance mode can also expose the server to a bigger risk since strict monitoring is ignored for the period of time. Therefore, it's probably a good idea to understand the nature of the maintenance operation that we would like to perform before enabling the maintenance mode. The following checklist should help us determine our maintenance mode policy:

  • Affected nodes - Which nodes are involved in the maintenance?
  • Consequences - What happens to the node when the maintenance operation is ongoing? Will it be inaccessible, high loaded or restarted?
  • Duration - How much time does the maintenance operation take to complete?
  • Frequency - How frequently should the maintenance operation run?

Let's put this into a use case. Consider we have a three-node Percona XtraDB Cluster with a ClusterControl node. Suppose our servers are all running on virtual machines and the VM backup policy requires all VMs to be backed up every day starting from 1:00 AM, one node at a time. During this backup operation, the node will be frozen for about 10 minutes at most, and the node, which is being managed and monitored by ClusterControl, will be inaccessible until the backup finishes. From a Galera Cluster perspective, this operation does not bring the whole cluster down since the cluster remains in quorum and the primary component is not affected.

Based on the nature of the maintenance task, we can summarize it as the following:

  • Affected nodes - All nodes for cluster ID 1 (3 database nodes and 1 ClusterControl node).
  • Consequence - The VM that is being backed up will be inaccessible until completion.
  • Duration - Each VM backup operation takes about 5 to 10 minutes to complete.
  • Frequency - The VM backup is scheduled to run daily, starting from 1:00 AM on the first node.

We can then come out with an execution plan to schedule our maintenance mode:

Since we want all nodes in the cluster to be backed up by the VM manager, simply list out the nodes for the corresponding cluster ID:

$ s9s nodes --list --cluster-id=1
192.168.0.51  10.0.2.15     192.168.0.52  192.168.0.53

The above output can be used to schedule maintenance across the whole cluster. For example, if you run the following command, ClusterControl will activate maintenance mode for all nodes under cluster ID 1 from now until the next 50 minutes:

$ for host in $(s9s nodes --list --cluster-id=1); do \
s9s maintenance --create \
--nodes=$host \
--start="$(date -d 'now' '+%Y-%m-%d %H:%M:%S')" \
--end="$(date -d 'now + 50 minutes' '+%Y-%m-%d %H:%M:%S')" \
--reason="Backup VM"; done

We can turn the above command into an executable script. Create a file:

$ vim /usr/local/bin/enable_maintenance_mode

And add the following lines:

for host in $(s9s nodes --list --cluster-id=1)
do \
s9s maintenance \
--create \
--nodes=$host \
--start="$(date -d 'now' '+%Y-%m-%d %H:%M:%S')" \
--end="$(date -d 'now + 50 minutes' '+%Y-%m-%d %H:%M:%S')" \
--reason="VM Backup"
done

Save it and make sure the file permission is executable:

$ chmod 755 /usr/local/bin/enable_maintenance_mode

Then use cron to schedule the script to be running at 5 minutes to 1:00 AM daily, right before the VM backup operation starts at 1:00 AM:

$ crontab -e
55 0 * * * /usr/local/bin/enable_maintenance_mode

Reload the cron daemon to ensure our script is being queued up:

$ systemctl reload crond # or service crond reload

That's it. We can now perform our daily maintenance operation without being bugged by false alarms and mail notifications until the maintenance completes.

Bonus Maintenance Feature - Skipping Node Recovery

With automatic recovery enabled, ClusterControl is smart enough to detect a node failure and will try to recover a failed node after a 30-second grace period, regardless of the maintenance mode status. Did you know that ClusterControl can be configured to deliberately skip node recovery for a particular node? This can be very helpful when you have to perform urgent maintenance without knowing the timespan or the outcome of the maintenance.

For example, imagine a file-system corruption happened and filesystem check and repair is required after a hard reboot. It's hard to determine in advance how much time it would need to complete this operation. Thus, we can simply use a flag file to signal ClusterControl to skip recovery for the node.

Firstly, add the following line inside the /etc/cmon.d/cmon_X.cnf (where X is the cluster ID) on ClusterControl node:

node_recovery_lock_file=/root/do_not_recover

Then, restart cmon service to load the change:

$ systemctl restart cmon # service cmon restart

Finally, ensure the specified file is present on the node that we want to skip for ClusterControl recovery:

$ touch /root/do_not_recover

Regardless of automatic recovery and maintenance mode status, ClusterControl will only recover the node when this flag file does not exist. The administrator is then responsible for creating and removing the file on the database node.
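Once the maintenance is finished, simply remove the flag file on the database node so ClusterControl resumes automatic recovery for it:

$ rm -f /root/do_not_recover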

That's it, folks. Happy maintenancing!

ClusterControl Tips & Tricks - Best Practices for Database Backups


Backups - one of the most important things to take care of while managing databases. It is said there are two types of people - those who back up their data and those who will back up their data. In this blog post, we will discuss good practices around backups and show you how you can build a reliable backup system using ClusterControl.

We will see how ClusterControl provides you with centralized backup management for MySQL, MariaDB, MongoDB and PostgreSQL. It provides hot backups of large datasets, point-in-time recovery, at-rest and in-transit data encryption, data integrity via automatic restore verification, cloud backups (AWS, Google and Azure) for Disaster Recovery, retention policies to ensure compliance, and automated alerts and reporting.

Backup types

There are two main types of backup that we can do in ClusterControl:

  • Logical backup - backup of data is stored in a human-readable format like SQL
  • Physical backup - backup contains binary data

Both complement each other - logical backup allows you to (more or less easily) retrieve up to a single row of data. Physical backups would require more time to accomplish that, but, on the other hand, they allow you to restore an entire host very quickly (something which may take hours or even days when using logical backup).

ClusterControl supports backup for MySQL/MariaDB/Percona Server, PostgreSQL, and MongoDB.

Schedule Backup

Starting a backup in ClusterControl is simple and efficient using a wizard. Scheduling a backup offers user-friendliness and accessibility to other features like encryption, automatic test/verification of backup, or cloud archiving.

Scheduled backups available will be listed in the Scheduled Backups tab as seen in the image below:

As a good practice for scheduling backups, you should already have a defined backup retention policy, and a daily backup is recommended. However, the right schedule also depends on the data you need, the traffic you expect, and how quickly the data must be available whenever you need it - especially during data recovery, when data has been accidentally deleted or a disk has become corrupted, which is inevitable sooner or later. There are also situations where data loss is reproducible or the data can be recreated manually, for example report generation, thumbnails, or cached data. The question comes down to how quickly you need the data back whenever a disaster happens. When possible, you’d want to take both mysqldump and xtrabackup backups on a daily basis for MySQL, leveraging both the logical and physical backup options. To cover even more bases, you may want to schedule several incremental xtrabackup runs per day. This can save some disk space, disk I/O, or even CPU I/O compared to always taking a full backup.

In ClusterControl, you can easily schedule these different types of backups. There are a couple of settings to decide on. You can store a backup on the controller or locally, on the database node where the backup is taken. You need to decide on the location in which the backup should be stored, and which databases you’d like to back up - the whole data set or separate schemas? See the image below:

The Advanced setting would take advantage of a cron-like configuration for more granularity. See image below:

Whenever a failure occurs, ClusterControl handles these issues efficiently and produces logs for further diagnosis of the backup failure.

Depending on the backup type you’ve chosen, there are separate settings to configure. For Xtrabackup and Galera Cluster, you can choose which settings your physical backup should apply when running. See below:

  • Use Compression
  • Compression Level
  • Desync node during backup
  • Backup Locks
  • Lock DDL per Table
  • Xtrabackup Parallel Copy Threads
  • Network Streaming Throttle Rate (MB/s)
  • Use PIGZ for parallel gzip
  • Enable Encryption
  • Retention

In the image below, you can see how you can flag the options accordingly; there are also tooltip icons which provide more information about the options you would like to leverage for your backup policy.

Depending on your backup policy, ClusterControl can be tailored in accordance with current best practices for taking your backups. When defining your backup policy, you should already have the required setup in place - from hardware and software to cloud storage - covering durability, high availability and scalability.

When taking backups on a Galera Cluster, it’s a good practice to set wsrep_desync=ON on the Galera node while the backup is running. This takes the node out of Flow Control participation and protects the whole cluster from replication lag, especially if the data to be backed up is large. In ClusterControl, please keep in mind that this may also remove your target backup node from the load balancing set. This is especially true if you use HAProxy, ProxySQL, or MaxScale proxies. If you have an alert manager set up to fire when a node is desynced, you can turn it off during the period when the backup is running.
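ClusterControl can handle the desync for you via the "Desync node during backup" option listed above. If you take backups outside of ClusterControl, a minimal sketch of the manual equivalent on the target Galera node would be:

mysql> SET GLOBAL wsrep_desync=ON;
-- run the backup on this node, then re-enable synchronization:
mysql> SET GLOBAL wsrep_desync=OFF;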

Another popular way of minimizing the impact of a backup on a Galera Cluster or a replication master is to deploy a replication slave and then use it as a source of backups - this way Galera Cluster will not be affected at any point as the backup on the slave is decoupled from the cluster.

You can deploy such a slave in just a few clicks using ClusterControl. See image below:

and once you click that button, you can select which nodes to set up a slave on. Make sure that the nodes have binary logging enabled. Enabling the binary log can also be done through ClusterControl, which adds more flexibility for administering your desired master. See image below:

You can also set up an existing replication slave as well:

For PostgreSQL, you have the option to take either logical or physical backups. In ClusterControl, you can leverage your PostgreSQL backups by selecting pg_dump or pg_basebackup. pg_basebackup will not work for versions older than 9.3.

For MongoDB, ClusterControl offers mongodump or mongodb consistent backup. Take note that mongodb consistent backup does not support RHEL 7, but you might be able to install it manually.

By default, ClusterControl will list a report for all the backups that have been taken, successful or failed ones. See below:

You can check the list of backup reports that have been created or scheduled using ClusterControl. Within the list, you can view the logs for further investigation and diagnosis - for instance, whether the backup finished correctly according to your desired backup policy, whether compression and encryption are set correctly, or whether the backup data size looks right. This is a good way to do a quick sanity check - if your dataset is around 1GB in size, there’s no way a full backup can be as small as 100KB; something must have gone wrong at some point.

Disaster Recovery

Storing backups within the cluster (either directly on a database node or on the ClusterControl host) comes in handy when you want to quickly restore your data: all backup files are in place and can be decompressed and restored promptly. When it comes to Disaster Recovery (DR), this may not be the best option. Different issues may happen - servers may crash, the network may not work reliably, even entire data centers may become inaccessible due to some kind of outage. It may happen whether you work with a smaller service provider with a single data center, or a global vendor like Amazon Web Services. It is therefore not safe to keep all your eggs in a single basket - you should make sure you have a copy of your backup stored in some external location. ClusterControl supports Amazon S3, Google Storage and Azure Cloud Storage.

For those who would like to implement their own DR policies, ClusterControl backups are stored in a nicely structured directory. You also have the option to upload your backup to the cloud. See image below:

You can select and upload to Amazon Web Services, Google Cloud, and Microsoft Azure. See image below:

As a good practice when archiving your database backups, make sure that your target cloud destination is in the same region as your database servers, or at least the nearest one. Ensure that it offers high availability, durability, and scalability, as you have to consider how quickly and how often you need access to your data.

In addition to creating a logical or physical backup for your DR, creating a full snapshot of your data (e.g. using LVM snapshots, Amazon EBS snapshots, or volume snapshots if using the Veritas file system) on the particular node can speed up your recovery. You can also use WAL archiving (for PostgreSQL) or your MySQL binary logs for Point In Time Recovery (PITR). Keep in mind that you might need to set up your own archiving for PITR, so it is perfectly fine to build and deploy your own set of scripts and handle DR according to your exact requirements.
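
A minimal sketch of a home-grown MySQL binlog archiving step for PITR (the binlog naming and paths are assumptions and depend on your configuration):

mysql -e "FLUSH BINARY LOGS;"                                                   # close the current binlog
rsync -av --ignore-existing /var/lib/mysql/mysql-bin.[0-9]* /backups/binlog-archive/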

Another great way of implementing a Disaster Recovery policy is to use an asynchronous replication slave - something we mentioned earlier in this blog post. You can deploy such an asynchronous slave in a remote location, maybe another data center, and then use it to take backups and store them locally on that slave. Of course, you’d still want to take a local backup of your cluster to have it around if you need to recover the cluster. Moving data between data centers may take a long time, so having backup files available locally can save you some time. In case you lose access to your main production cluster, you may still have access to the slave. This setup is very flexible - first, you have a running MySQL host with your production data, so it shouldn’t be too hard to deploy your full application in the DR site. You’ll also have backups of your production data which you could use to scale out your DR environment.

Lastly and most importantly, a backup that has not been tested remains an unverified backup, aka a Schroedinger’s backup. To make sure you have a working backup, you need to perform a recovery test. ClusterControl offers a way to automatically verify and test your backup.

We hope this gives you enough information to build a safe and reliable backup procedure for your open source databases.


Database Security - Backup Encryption In-Transit & At-Rest


Securing your data is one of the most important tasks that we should prioritize. The emergence of regulations that require security compliance, such as GDPR, HIPAA and PCI DSS, or rules covering PHI, means that your data has to be stored securely and transmitted securely over the wire.

In ClusterControl, we value the importance of security and offer a number of features to secure database access and stored data. One of those is securing your database backups, both at-rest and in-transit. In-transit is when the backup is being transferred over the internet or network from the source to its destination, while at-rest is when the data is stored on persistent storage. In this blog, we’ll show you how you can use ClusterControl to encrypt your backup data at-rest and in-transit.

Encryption In-Transit

When managing backups through ClusterControl, mysqldump and Percona XtraBackup/Mariabackup backups are by default transmitted over the wire without encryption. However, TLS/SSL encryption of data transmission is supported by MySQL/MariaDB/Percona Server, and Percona XtraBackup also offers SSL options when invoking the command.

ClusterControl enables SSL by default when you deploy a cluster, but it cannot instantly switch your backup traffic to an encrypted channel during backup creation. You can check out our previous blogs for the manual approach when creating your cluster with ClusterControl, or for the old-fashioned way to manually set up TLS/SSL in MySQL.

In ClusterControl, when you deploy a cluster, you can check the Security tab or the corresponding icon.

Clicking the button allows you to replace the certificate which is currently used or deployed by ClusterControl for your newly provisioned cluster. You can check the blog "Updated: Become a ClusterControl DBA - SSL Key Management and Encryption of MySQL Data in Transit", in which we showed how we handle the keys. Basically, you can go through the left-side navigation of ClusterControl and click “Key Management”.

Below is an example of Key Management in which you can create and import your existing keys.

When creating a backup using mysqldump as your logical backup, make sure that you have your SSL options set accordingly so the dump connection is encrypted.
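
As a hedged example (certificate paths, host and database name are assumptions), the SSL options can also be passed directly on the command line:

mysqldump --host=127.0.0.1 \
  --ssl-ca=/var/lib/mysql/ca.pem \
  --ssl-cert=/var/lib/mysql/client-cert.pem \
  --ssl-key=/var/lib/mysql/client-key.pem \
  --single-transaction mydb > /backups/mydb.sql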

What’s next?

Since we have created our certificates, we need to enable and set up SSL accordingly. To verify this, you can check via ClusterControl or the mysql command prompt. See images below:

or you can also check under the Performance tab and click DB Variables, as seen below:

Take note that it might take some time for the values to populate under Performance -> DB Variables. If you want to check manually, you can use the mysql command prompt as shown below:
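
For example (the socket path is an assumption), you can verify the SSL-related variables and the cipher currently in use like this:

mysql -S /var/lib/mysql/mysql.sock -e "SHOW GLOBAL VARIABLES LIKE '%ssl%';"
mysql -S /var/lib/mysql/mysql.sock -e "SHOW GLOBAL STATUS LIKE 'Ssl_cipher';"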

Defining ssl_cipher can add more security hardening. There’s a list of available cipher suites if you run SHOW GLOBAL STATUS LIKE 'Ssl_cipher_list'\G, or check here. A cipher suite is a combination of authentication, encryption and message authentication code (MAC) algorithms. These are used during negotiation of security settings for a TLS/SSL connection as well as for the secure transfer of data.

Since ClusterControl runs mysqldump locally on that host, adding the following parameters to your MySQL configuration file (/etc/my.cnf, /etc/mysql/my.cnf, /usr/etc/my.cnf, ~/.my.cnf) will encrypt the connection used for the SQL dump. See below:

[mysqldump]
socket=/var/lib/mysql/mysql.sock
max_allowed_packet = 512M
host=127.0.0.1
ssl-cert=/var/lib/mysql/client-cert.pem
ssl-key=/var/lib/mysql/client-key.pem

You can monitor the traffic using tcpdump; below you can compare an unencrypted stream against one encrypted with TLS.
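
A minimal capture sketch (the interface and port are assumptions and depend on your setup):

tcpdump -i eth0 -nn -X port 3306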

With TLS/SSL:

Without TLS/SSL:

When using Percona XtraBackup or Mariabackup under ClusterControl, there is no TLS/SSL support at this time while the backup is transmitted over the network. It uses socat, which basically just opens a port on another node as a means of communication.

e.g.

[21:24:46]: 192.168.10.20: Launching ( ulimit -n 256000 && LC_ALL=C /usr/bin/mariabackup --defaults-file=/etc/my.cnf --backup --galera-info --parallel 1 --stream=xbstream --no-timestamp | gzip -6 - | socat - TCP4:192.168.10.200:9999 ) 2>&1.
[21:24:46]: 192.168.10.20: The xtrabackup version is 2.4.12 / 2004012.
[21:24:46]: 192.168.10.20:3306: Checking xtrabackup version.
[21:24:46]: 192.168.10.20: Streaming backup to 192.168.10.200
[21:24:46]: 192.168.10.200: Netcat started, error log is in '192.168.10.200:/tmp/netcat.log'.
[21:24:43]: 192.168.10.200: Starting socat -u tcp-listen:9999,reuseaddr stdout > /home/vagrant/backups/BACKUP-71/backup-full-2018-11-12_132443.xbstream.gz 2>/tmp/netcat.log.

If you need a secure layer during transport, you can do this manually using the --ssl* options during command invocation. Check the Percona XtraBackup manual for more info. We still recommend obtaining your certificate/key from a registered certificate authority.

Alternatively, using ClusterControl, you can encrypt your data before it is sent over the network, so the packets transmitted are not raw but encrypted data. Although this encryption is configured as part of the at-rest setup, we’ll discuss it in the next section.

Encryption At-Rest

In ClusterControl, when creating your backup, you have the option to make your data encrypted. See below:

When encrypted, ClusterControl will use the Advanced Encryption Standard, AES-256-CBC. See the sample log below:

[22:12:49]: 192.168.10.30: Launching ( ulimit -n 256000 && LC_ALL=C /usr/bin/mariabackup --defaults-file=/etc/my.cnf --backup --galera-info --parallel 1 --stream=xbstream --no-timestamp | gzip -6 - | openssl enc -aes-256-cbc -pass file:/var/tmp/cmon-002471-32758-24c456ca6b087837.tmp | socat - TCP4:192.168.10.200:9999 ) 2>&1.
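
To restore such a backup manually, the pipeline has to be reversed. A minimal sketch, assuming the encrypted stream was saved as backup.xbstream.gz.aes and the key file used during the backup is still available:

openssl enc -d -aes-256-cbc -pass file:/path/to/keyfile -in backup.xbstream.gz.aes \
  | gunzip \
  | xbstream -x -C /restore/datadir    # use mbstream for backups taken with Mariabackup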

If you are storing your backup in the cloud, for example on AWS S3, S3 offers three different modes of server-side encryption (SSE): SSE-S3, SSE-C and SSE-KMS. Amazon has great SSE features that handle encryption of data at rest. You can start here, which tackles how S3 uses AWS KMS. If you are moving your backup to volume or block-based storage, AWS offers EBS encryption, also using AWS KMS. You can check here for more info about this.

Microsoft Azure has good features as well for handling data at rest. Check out the page on “Azure Storage Service Encryption for data at rest”. Azure handles the keys in its Azure Key Vault, similar to AWS KMS. For Google’s encryption of data at rest, you can check here.

When storing data backups on-prem, you can use LUKS (Linux Unified Key Setup) in combination with dm-crypt. We won’t discuss this in this blog, but this is a good source to look at: MySQL Encryption at Rest – Part 1 (LUKS). For more info about disk encryption, the ArchLinux page “Disk encryption” is a great place to start.

Most importantly, while your data is encrypted at rest and in-transit, always verify that your backup works. A backup that has not been tested is not a backup if it does not work when recovery is needed. An encrypted backup must also decrypt without any problems, so your keys must be stored privately and as securely as possible. Additionally, adding encryption to the data itself, such as InnoDB encryption or SST encryption (for Galera), provides more security, but we won’t cover these in this blog.

Conclusion

Encrypting backup data at rest and in-transit is a vital component of compliance with HIPAA, PCI DSS or GDPR and of protecting PHI, ensuring that sensitive data transmitted over the wire or saved on disk is not readable by any user or application without a valid key. Some compliance regulations, such as PCI DSS and HIPAA, require that data at rest be encrypted throughout the data lifecycle.

Here, we show how ClusterControl offers these options to the end-user. Compliance is a huge task though, touching many different areas. Regulations are also updated on a frequent basis, and processes/tools also need to be updated accordingly. We are continuously improving different areas in ClusterControl to facilitate compliance. We would like to hear your thoughts on this and how we can improve.

Database Monitoring - Troubleshooting Prometheus with SCUMM Dashboards


It’s almost two months now since we released SCUMM (Severalnines ClusterControl Unified Management and Monitoring). SCUMM utilizes Prometheus as the underlying method to gather time series data from exporters running on database instances and load balancers. This blog will show you how to fix issues when Prometheus exporters aren’t running, or if the graphs aren’t displaying data, or showing “No Data Points”.

What is Prometheus?

Prometheus is an open-source monitoring system with a dimensional data model, flexible query language, efficient time series database and modern alerting approach. It is a monitoring platform that collects metrics from monitored targets by scraping metrics HTTP endpoints on these targets. It provides dimensional data, powerful queries, great visualization, efficient storage, simple operation, precise alerting, many client libraries, and many integrations.

Prometheus in action for SCUMM Dashboards

Prometheus collects metrics data from exporters, with each exporter running on a database or load balancer host. The diagram below shows you how these exporters are linked with the server hosting the Prometheus process. It shows that the ClusterControl node has Prometheus running where it also runs process_exporter and node_exporter.

The diagram shows that Prometheus is running on the ClusterControl host, and that the exporters process_exporter and node_exporter are running there as well to collect metrics from that node. Optionally, you can make your ClusterControl host a monitored target too, for example if you set up HAProxy or ProxySQL on it.

The cluster nodes above (node1, node2, and node3) can have mysqld_exporter or postgres_exporter running; these are the agents that scrape data locally on each node so the Prometheus server can collect and store it in its own data storage. You can locate the physical data under /var/lib/prometheus/data on the host where Prometheus is set up.

When you set up Prometheus, for example on the ClusterControl host, it should have the following ports opened. See below:

[root@testccnode share]# netstat -tnvlp46|egrep 'ex[p]|prometheu[s]'
tcp6       0      0 :::9100                 :::*                    LISTEN      16189/node_exporter 
tcp6       0      0 :::9011                 :::*                    LISTEN      19318/process_expor 
tcp6       0      0 :::42004                :::*                    LISTEN      16080/proxysql_expo 
tcp6       0      0 :::9090                 :::*                    LISTEN      31856/prometheus

Based on the output, I have ProxySQL running as well on the host testccnode in which ClusterControl is hosted.

Common Issues with SCUMM Dashboards using Prometheus

When Dashboards are enabled, ClusterControl will install and deploy binaries and exporters such as node_exporter, process_exporter, mysqld_exporter, postgres_exporter, and daemon. These are the common set of packages for the database nodes. Once they are set up and installed, the following daemon commands are fired up and run as seen below:

[root@testnode2 bin]# ps axufww|egrep 'exporte[r]'
prometh+  3604  0.0  0.0  10828   364 ?        S    Nov28   0:00 daemon --name=process_exporter --output=/var/log/prometheus/process_exporter.log --env=HOME=/var/lib/prometheus --env=PATH=/usr/local/bin:/usr/bin:/sbin:/bin:/usr/sbin:/usr/bin --chdir=/var/lib/prometheus --pidfile=/var/run/prometheus/process_exporter.pid --user=prometheus -- process_exporter
prometh+  3605  0.2  0.3 256300 14924 ?        Sl   Nov28   4:06  \_ process_exporter
prometh+  3838  0.0  0.0  10828   564 ?        S    Nov28   0:00 daemon --name=node_exporter --output=/var/log/prometheus/node_exporter.log --env=HOME=/var/lib/prometheus --env=PATH=/usr/local/bin:/usr/bin:/sbin:/bin:/usr/sbin:/usr/bin --chdir=/var/lib/prometheus --pidfile=/var/run/prometheus/node_exporter.pid --user=prometheus -- node_exporter
prometh+  3839  0.0  0.4  44636 15568 ?        Sl   Nov28   1:08  \_ node_exporter
prometh+  4038  0.0  0.0  10828   568 ?        S    Nov28   0:00 daemon --name=mysqld_exporter --output=/var/log/prometheus/mysqld_exporter.log --env=HOME=/var/lib/prometheus --env=PATH=/usr/local/bin:/usr/bin:/sbin:/bin:/usr/sbin:/usr/bin --chdir=/var/lib/prometheus --pidfile=/var/run/prometheus/mysqld_exporter.pid --user=prometheus -- mysqld_exporter --collect.perf_schema.eventswaits --collect.perf_schema.file_events --collect.perf_schema.file_instances --collect.perf_schema.indexiowaits --collect.perf_schema.tableiowaits --collect.perf_schema.tablelocks --collect.info_schema.tablestats --collect.info_schema.processlist --collect.binlog_size --collect.global_status --collect.global_variables --collect.info_schema.innodb_metrics --collect.slave_status
prometh+  4039  0.1  0.2  17368 11544 ?        Sl   Nov28   1:47  \_ mysqld_exporter --collect.perf_schema.eventswaits --collect.perf_schema.file_events --collect.perf_schema.file_instances --collect.perf_schema.indexiowaits --collect.perf_schema.tableiowaits --collect.perf_schema.tablelocks --collect.info_schema.tablestats --collect.info_schema.processlist --collect.binlog_size --collect.global_status --collect.global_variables --collect.info_schema.innodb_metrics --collect.slave_status

For a PostgreSQL node,

[root@testnode14 vagrant]# ps axufww|egrep 'ex[p]'
postgres  1901  0.0  0.4 1169024 8904 ?        Ss   18:00   0:04  \_ postgres: postgres_exporter postgres ::1(51118) idle
prometh+  1516  0.0  0.0  10828   360 ?        S    18:00   0:00 daemon --name=process_exporter --output=/var/log/prometheus/process_exporter.log --env=HOME=/var/lib/prometheus --env=PATH=/usr/local/bin:/usr/bin:/sbin:/bin:/usr/sbin:/usr/bin --chdir=/var/lib/prometheus --pidfile=/var/run/prometheus/process_exporter.pid --user=prometheus -- process_exporter
prometh+  1517  0.2  0.7 117032 14636 ?        Sl   18:00   0:35  \_ process_exporter
prometh+  1700  0.0  0.0  10828   572 ?        S    18:00   0:00 daemon --name=node_exporter --output=/var/log/prometheus/node_exporter.log --env=HOME=/var/lib/prometheus --env=PATH=/usr/local/bin:/usr/bin:/sbin:/bin:/usr/sbin:/usr/bin --chdir=/var/lib/prometheus --pidfile=/var/run/prometheus/node_exporter.pid --user=prometheus -- node_exporter
prometh+  1701  0.0  0.7  44380 14932 ?        Sl   18:00   0:10  \_ node_exporter
prometh+  1897  0.0  0.0  10828   568 ?        S    18:00   0:00 daemon --name=postgres_exporter --output=/var/log/prometheus/postgres_exporter.log --env=HOME=/var/lib/prometheus --env=PATH=/usr/local/bin:/usr/bin:/sbin:/bin:/usr/sbin:/usr/bin --env=DATA_SOURCE_NAME=postgresql://postgres_exporter:password@localhost:5432/postgres?sslmode=disable --chdir=/var/lib/prometheus --pidfile=/var/run/prometheus/postgres_exporter.pid --user=prometheus -- postgres_exporter
prometh+  1898  0.0  0.5  16548 11204 ?        Sl   18:00   0:06  \_ postgres_exporter

It has the same exporters as a MySQL node, and differs only in the postgres_exporter, since this is a PostgreSQL database node.

However, when a node suffers a power interruption, a system crash, or a system reboot, these exporters will stop running. Prometheus will report that an exporter is down. ClusterControl samples Prometheus itself, asks for the exporter statuses, acts upon this information, and will restart the exporter if it is down.

Note, however, that exporters that have not been installed via ClusterControl will not be restarted after a crash. The reason is that they are not monitored by systemd or by a daemon acting as a safety script that would restart the process upon a crash or abnormal shutdown. The screenshot below shows what it looks like when the exporters aren’t running. See below:

and the PostgreSQL Dashboard will show the same loading icon with a “No data points” label in the graph. See below:

These issues can be troubleshooted using the various techniques covered in the following sections.

Troubleshooting Issues With Prometheus

Prometheus agents, known as exporters, use the following ports: 9100 (node_exporter), 9011 (process_exporter), 9187 (postgres_exporter), 9104 (mysqld_exporter), 42004 (proxysql_exporter), and 9090, which is owned by the Prometheus process itself. These are the ports for the agents used by ClusterControl.

To start troubleshooting SCUMM Dashboard issues, you can begin by checking the ports open on the database node. Follow the checklist below:

  • Check if the ports are open

    e.g.

    ## Use netstat and check the ports
    [root@testnode15 vagrant]# netstat -tnvlp46|egrep 'ex[p]'
    tcp6       0      0 :::9100                 :::*                    LISTEN      5036/node_exporter  
    tcp6       0      0 :::9011                 :::*                    LISTEN      4852/process_export 
    tcp6       0      0 :::9187                 :::*                    LISTEN      5230/postgres_expor 

    The ports might not be open because a firewall (such as iptables or firewalld) is blocking them, or because the exporter daemon itself is not running.

  • Use curl from the host monitor and verify if the port is reachable and open.

    e.g.

    ## Using curl and grep mysql list of available metric names used in PromQL.
    [root@testccnode prometheus]# curl -sv mariadb_g01:9104/metrics|grep 'mysql'|head -25
    * About to connect() to mariadb_g01 port 9104 (#0)
    *   Trying 192.168.10.10...
    * Connected to mariadb_g01 (192.168.10.10) port 9104 (#0)
    > GET /metrics HTTP/1.1
    > User-Agent: curl/7.29.0
    > Host: mariadb_g01:9104
    > Accept: */*
    > 
    < HTTP/1.1 200 OK
    < Content-Length: 213633
    < Content-Type: text/plain; version=0.0.4; charset=utf-8
    < Date: Sat, 01 Dec 2018 04:23:21 GMT
    < 
    { [data not shown]
    # HELP mysql_binlog_file_number The last binlog file number.
    # TYPE mysql_binlog_file_number gauge
    mysql_binlog_file_number 114
    # HELP mysql_binlog_files Number of registered binlog files.
    # TYPE mysql_binlog_files gauge
    mysql_binlog_files 26
    # HELP mysql_binlog_size_bytes Combined size of all registered binlog files.
    # TYPE mysql_binlog_size_bytes gauge
    mysql_binlog_size_bytes 8.233181e+06
    # HELP mysql_exporter_collector_duration_seconds Collector time duration.
    # TYPE mysql_exporter_collector_duration_seconds gauge
    mysql_exporter_collector_duration_seconds{collector="collect.binlog_size"} 0.008825006
    mysql_exporter_collector_duration_seconds{collector="collect.global_status"} 0.006489491
    mysql_exporter_collector_duration_seconds{collector="collect.global_variables"} 0.00324821
    mysql_exporter_collector_duration_seconds{collector="collect.info_schema.innodb_metrics"} 0.008209824
    mysql_exporter_collector_duration_seconds{collector="collect.info_schema.processlist"} 0.007524068
    mysql_exporter_collector_duration_seconds{collector="collect.info_schema.tables"} 0.010236411
    mysql_exporter_collector_duration_seconds{collector="collect.info_schema.tablestats"} 0.000610684
    mysql_exporter_collector_duration_seconds{collector="collect.perf_schema.eventswaits"} 0.009132491
    mysql_exporter_collector_duration_seconds{collector="collect.perf_schema.file_events"} 0.009235416
    mysql_exporter_collector_duration_seconds{collector="collect.perf_schema.file_instances"} 0.009451361
    mysql_exporter_collector_duration_seconds{collector="collect.perf_schema.indexiowaits"} 0.009568397
    mysql_exporter_collector_duration_seconds{collector="collect.perf_schema.tableiowaits"} 0.008418406
    mysql_exporter_collector_duration_seconds{collector="collect.perf_schema.tablelocks"} 0.008656682
    mysql_exporter_collector_duration_seconds{collector="collect.slave_status"} 0.009924652
    * Failed writing body (96 != 14480)
    * Closing connection 0

    I find this approach practical because I can easily grep and debug from the terminal.

  • Why not use the Web UI?

    • Prometheus exposes port 9090, which is used by ClusterControl for the SCUMM Dashboards. Aside from this, the ports exposed by the exporters can also be used to troubleshoot and determine the available metric names using PromQL. On the server where Prometheus is running, you can visit http://<hostname-or-ip>:9090/targets. The screenshot below shows it in action:

      and clicking “Endpoints”, you can verify the metrics as well, just as in the screenshot below:

      Instead of using the IP address, you can also check this locally via localhost on that specific node such as visiting http://localhost:9104/metrics either in a Web UI interface or using cURL.

      Now, if we go back to the “Targets” page, you can see the list of nodes where there can be a problem with the port. The reasons that could cause this are listed below:

      • Server is down
      • Network is unreachable or ports not opened due to a firewall running
      • The exporter daemon itself is not running; for example, mysqld_exporter is not running.

If these exporters are not running, you can start the process using the daemon command. You can refer to the running processes shown in the example above, or in the previous section of this blog.
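
For instance, restarting node_exporter by hand would look roughly like the invocation already visible in the process list (run as root on the affected node; paths and flags follow the ps output above, with the PATH env trimmed for brevity):

daemon --name=node_exporter \
  --output=/var/log/prometheus/node_exporter.log \
  --env=HOME=/var/lib/prometheus \
  --chdir=/var/lib/prometheus \
  --pidfile=/var/run/prometheus/node_exporter.pid \
  --user=prometheus -- node_exporter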

What About Those “No Data Points” Graphs In My Dashboard?

SCUMM Dashboards cover the general use cases commonly seen with MySQL. However, some of the variables behind a given metric might not be available in a particular MySQL version or from a particular MySQL vendor, such as MariaDB or Percona Server.

Let me show an example below:

This graph was taken on a database server running MariaDB Server 10.3.9-MariaDB-log with wsrep_patch_version wsrep_25.23. Now the question is, why aren’t any data points loading? Well, when I queried the node for the checkpoint age status, the result was empty - no such variable was found. See below:

MariaDB [(none)]> show global status like 'Innodb_checkpoint_max_age';
Empty set (0.000 sec)

I have no idea why MariaDB does not have this variable (please let us know in the comments section of this blog if you have the answer). This is in contrast to a Percona XtraDB Cluster Server where the variable Innodb_checkpoint_max_age does exist. See below:

mysql> show global status like 'Innodb_checkpoint_max_age';
+---------------------------+-----------+
| Variable_name             | Value     |
+---------------------------+-----------+
| Innodb_checkpoint_max_age | 865244898 |
+---------------------------+-----------+
1 row in set (0.00 sec)

What this means is that graphs can lack data points because no data was harvested for that particular metric when the Prometheus query was executed.

However, a graph that has no data points does not necessarily mean that your current version of MySQL or its variant does not support it. For example, certain graphs require variables that need to be properly set up or enabled.

The following section will show what these graphs are.

Index Condition Pushdown (ICP) Graph

This graph was mentioned in my previous blog. It relies on a MySQL global variable named innodb_monitor_enable. This variable is dynamic, so you can set it without a hard restart of your MySQL database. The graph requires innodb_monitor_enable = module_icp, or you can set this global variable to innodb_monitor_enable = all. Typically, to avoid confusion about why such a graph does not show any data points, you might want to use all, but with care: there can be some overhead when this variable is turned on and set to all.
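
For instance, a hedged sketch of enabling just the ICP module dynamically would be:

mysql -e "SET GLOBAL innodb_monitor_enable = module_icp;"   # takes effect immediately, no restart needed
# To persist it across restarts, add "innodb_monitor_enable = module_icp" under [mysqld] in my.cnf.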

MySQL Performance Schema Graphs

So why are these graphs showing “No data points”? When you create a cluster using ClusterControl with our templates, it will define performance_schema variables by default. For example, the variables below are set:

performance_schema = ON
performance-schema-max-mutex-classes = 0
performance-schema-max-mutex-instances = 0

However, if performance_schema = OFF, then that’s the reason why the related graphs would display “No data points”.

But I have performance_schema enabled, so why are other graphs still an issue?

Well, there are still graphs that require multiple variables to be set. This has already been covered in our previous blog. Thus, you need to set innodb_monitor_enable = all and userstat = 1. The result would look like this:

However, I noticed that in MariaDB 10.3 (particularly 10.3.11), setting performance_schema = ON will populate the metrics needed for the MySQL Performance Schema Dashboard. This is a great advantage because it does not require setting innodb_monitor_enable, which would add extra overhead on the database server.

Advanced Troubleshooting

Is there any advanced troubleshooting I can recommend? Yes, there is! However, you do need at least some JavaScript skills. Since the SCUMM Dashboards using Prometheus rely on Highcharts, the metrics used for the PromQL requests can be determined through the app.js script, which is shown below:

So in this case, I am using Google Chrome’s DevTools and trying to look for Performance Schema Waits (Events). How does this help? Well, if you look at the targets, you’ll see:

targets: [{
expr: 'topk(5, rate(mysql_perf_schema_events_waits_total{instance="$instance"}[$interval])>0) or topk(5, irate(mysql_perf_schema_events_waits_total{instance="$instance"}[5m])>0)',
legendFormat: "{{event_name}} "
}]

Now you can take the requested metric, which is mysql_perf_schema_events_waits_total, and check, for example by going to http://<hostname-ip>:9090/graph, whether metrics have been collected for it. See below:
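
The same check can be scripted against the Prometheus HTTP API (the hostname is an assumption):

curl -s 'http://<hostname-ip>:9090/api/v1/query?query=mysql_perf_schema_events_waits_total' | head -c 500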

ClusterControl Auto-Recovery to the rescue!

Lastly, the main question is: is there an easy way to restart failed exporters? Yes! We mentioned earlier that ClusterControl watches the state of the exporters and restarts them if required. In case you notice that the SCUMM Dashboards do not load graphs normally, ensure that you have Auto Recovery enabled. See the image below:

When this is enabled, ClusterControl will ensure that the <process>_exporters are started correctly if it detects that they are not running. ClusterControl handles that for you and no further action needs to be taken.

It is also possible to re-install or re-configure the exporters.

Conclusion

In this blog, we saw how ClusterControl uses Prometheus to offer SCUMM Dashboards. It provides a powerful set of features, from high-resolution monitoring data to rich graphs. You have learned that with PromQL, you can investigate and troubleshoot the SCUMM Dashboards and aggregate time series data in real time. You can also generate graphs or view all the collected metrics through the console.

You also learned how to debug our SCUMM Dashboards especially when no data points are collected.

If you have questions, please add in your comments or let us know through our Community Forums.

How to Improve Replication Performance in a MySQL or MariaDB Galera Cluster


In the comments section of one of our blogs a reader asked about the impact of wsrep_slave_threads on Galera Cluster’s I/O performance and scalability. At that time, we couldn’t easily answer that question and back it up with more data, but finally we managed to set up the environment and run some tests.

Our reader pointed towards benchmarks that showed that increasing wsrep_slave_threads did not have any impact on the performance of the Galera cluster.

To explain what the impact of that setting is, we set up a small cluster of three nodes (m5d.xlarge). This allowed us to utilize directly attached nvme SSD for the MySQL data directory. By doing this, we minimized the chance of storage becoming the bottleneck in our setup.

We set up InnoDB buffer pool to 8GB and redo logs to two files, 1GB each. We also increased innodb_io_capacity to 2000 and innodb_io_capacity_max to 10000. This was also intended to ensure that neither of those settings would impact our performance.
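
For reference, a minimal my.cnf sketch of the settings described above (values as stated; everything else left at defaults):

[mysqld]
innodb_buffer_pool_size = 8G
innodb_log_file_size = 1G
innodb_log_files_in_group = 2
innodb_io_capacity = 2000
innodb_io_capacity_max = 10000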

The whole problem with such benchmarks is that there are so many bottlenecks that you have to eliminate them one by one. Only after doing some configuration tuning and after making sure that the hardware will not be a problem, one can have hope that some more subtle limits will show up.

We generated ~90GB of data using sysbench:

sysbench /usr/share/sysbench/oltp_write_only.lua --threads=16 --events=0 --time=600 --mysql-host=172.30.4.245 --mysql-user=sbtest --mysql-password=sbtest --mysql-port=3306 --tables=28 --report-interval=1 --skip-trx=off --table-size=10000000 --db-ps-mode=disable --mysql-db=sbtest_large prepare

Then the benchmark was executed. We tested two settings: wsrep_slave_threads=1 and wsrep_slave_threads=16. The hardware was not powerful enough to benefit from increasing this variable even further. Please also keep in mind that we did not do a detailed benchmarking in order to determine whether wsrep_slave_threads should be set to 16, 8 or maybe 4 for the best performance. We were interested to see if we can show an impact on the cluster. And yes, the impact was clearly visible. For starters, some flow control graphs.
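
A hedged sketch of how each run was driven: set wsrep_slave_threads on every Galera node (it is a dynamic variable), then launch the run phase of the same sysbench scenario used for the prepare step above:

mysql -h 172.30.4.245 -u sbtest -psbtest -e "SET GLOBAL wsrep_slave_threads=16;"   # repeat on each node
sysbench /usr/share/sysbench/oltp_write_only.lua --threads=16 --events=0 --time=600 --mysql-host=172.30.4.245 --mysql-user=sbtest --mysql-password=sbtest --mysql-port=3306 --tables=28 --report-interval=1 --skip-trx=off --table-size=10000000 --db-ps-mode=disable --mysql-db=sbtest_large run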

While running with wsrep_slave_threads=1, on average, nodes were paused due to flow control ~64% of the time.

While running with wsrep_slave_threads=16, on average, nodes were paused due to flow control ~20% of the time.

You can also compare the difference on a single graph. The drop at the end of the first part is the first attempt to run with wsrep_slave_threads=16. Servers ran out of disk space for binary logs and we had to re-run that benchmark once more at a later time.

How did this translate in performance terms? The difference is visible although definitely not that spectacular.

First, the queries per second graph. You can notice that in both cases the results are all over the place. This is mostly related to the unstable performance of the I/O storage and to flow control randomly kicking in. You can still see that the performance of the “red” result (wsrep_slave_threads=1) is quite a bit lower than the “green” one (wsrep_slave_threads=16).

A quite similar picture appears when we look at the latency. You can see more (and typically deeper) stalls for the run with wsrep_slave_threads=1.


The difference is even more visible when we calculate the average latency across all the runs: the latency with wsrep_slave_threads=1 is 27% higher than the latency with 16 slave threads, which obviously is not good, as we want latency to be lower, not higher.

The difference in throughput is also visible - around an 11% improvement when we added more wsrep_slave_threads.

As you can see, the impact is there. It is by no means 16x (even if that’s how we increased the number of slave threads in Galera) but it is definitely prominent enough so that we cannot classify it as just a statistical anomaly.

Please keep in mind that in our case we used quite small nodes. The difference should be even more significant if we are talking about large instances running on EBS volumes with thousands of provisioned IOPS.

Then we would be able to run sysbench even more aggressively, with higher number of concurrent operations. This should improve parallelization of the writesets, improving the gain from the multithreading even further. Also, faster hardware means that Galera will be able to utilize those 16 threads in more efficient way.

When running tests like this, you have to keep in mind that you need to push your setup almost to its limits. Single-threaded replication can handle quite a lot of load, and you need to run heavy traffic to actually make it unable to keep up with the task.

We hope this blog post gives you more insight into Galera Cluster’s abilities to apply writesets in parallel and the limiting factors around it.

Announcing ClusterControl 1.7.1: Support for PostgreSQL 11 and MongoDB 4.0, Enhanced Monitoring


We are excited to announce the 1.7.1 release of ClusterControl - the only management system you’ll ever need to take control of your open source database infrastructure!

ClusterControl 1.7.1 introduces the next phase of our exciting agent-based monitoring features for MySQL, Galera Cluster, PostgreSQL & ProxySQL, a suite of new features to help users fully automate and manage PostgreSQL (including support for PostgreSQL 11), support for MongoDB 4.0 ... and more!

Release Highlights

Performance Management

  • Enhanced performance dashboards for MySQL, Galera Cluster, PostgreSQL & ProxySQL
  • Enhanced query monitoring for PostgreSQL: view query statistics

Deployment & Backup Management

  • Create a cluster from backup for MySQL & PostgreSQL
  • Verify/restore backup on a standalone PostgreSQL host
  • ClusterControl Backup & Restore

Additional Highlights

  • Support for PostgreSQL 11 and MongoDB 4.0

View the ClusterControl ChangeLog for all the details!


View Release Details and Resources

Release Details

Performance Management

Enhanced performance dashboards for MySQL, Galera Cluster, PostgreSQL & ProxySQL

Since October 2018, ClusterControl users have access to a set of monitoring dashboards that have Prometheus as the data source with its flexible query language and multi-dimensional data model, where time series data is identified by metric name and key/value pairs.

The advantage of this new agent-based monitoring infrastructure is that users can enable their database clusters to use Prometheus exporters to collect metrics on their nodes and hosts, thus avoiding excessive SSH activity for monitoring and metrics collections and use SSH connectivity only for management operations.

These Prometheus exporters can now be installed or enabled on your nodes and hosts for MySQL, PostgreSQL and MongoDB based clusters. You also have the possibility to customize collector flags for the exporters, which allows you, for example, to disable collecting from MySQL's performance schema if you experience load issues on your server.

This allows for greater accuracy and customization options while monitoring your database clusters. ClusterControl takes care of installing and maintaining Prometheus as well as exporters on the monitored hosts.

With this 1.7.1 release, ClusterControl now also comes with the next iteration of the following (new) dashboards:

  • System Overview
  • Cluster Overview
  • MySQL Server - General
  • MySQL Server - Caches
  • MySQL InnoDB Metrics
  • Galera Cluster Overview
  • Galera Server Overview
  • PostgreSQL Overview
  • ProxySQL Overview
  • HAProxy Overview
  • MongoDB Cluster Overview
  • MongoDB ReplicaSet
  • MongoDB Server

Do check them out and let us know what you think!

MongoDB Cluster Overview
HAProxy Overview

Performance Management

Advanced query monitoring for PostgreSQL: view query statistics

ClusterControl 1.7.1 now comes with a whole range of new query statistics that can easily be viewed and monitored via the ClusterControl GUI. The following statistics are included in this new release:

  • Access by sequential or index scans
  • Table I/O statistics
  • Index I/O statistics
  • Database Wide Statistics
  • Table Bloat And Index Bloat
  • Top 10 largest tables
  • Database Sizes
  • Last analyzed or vacuumed
  • Unused indexes
  • Duplicate indexes
  • Exclusive lock waits

Table Bloat & Index Bloat

Deployment

Create a cluster from backup for MySQL & PostgreSQL

To be able to deliver database and application changes more quickly, several tasks must be automated. It can be a daunting job to ensure that a development team has the latest database build for the test when there is a proliferation of copies, and the production database is in use.

ClusterControl provides a single process to create a new cluster from backup with no impact on the source database system.

With this new release, you can easily create MySQL Galera or PostgreSQL including the data from backup you need.

Backup Management

ClusterControl Backup/Restore

ClusterControl users can use this new feature to migrate a setup from one controller to another controller; and backup the meta-data of an entire controller or individual clusters from the s9s CLI. The backup can then be restored on a new controller with a new hostname/IP and the restore process will automatically recreate database access privileges. Check it out!

Additional New Functionalities

View the ClusterControl ChangeLog for all the details!

Download ClusterControl today!

Happy Clustering!

Monitoring HAProxy Metrics And How They Change With Time


HAProxy is one of the most popular load balancers for MySQL and MariaDB.

Feature-wise, it cannot be compared to ProxySQL or MaxScale, but it is fast and robust and may work perfectly fine in any environment as long as the application can perform the read/write split and send SELECT queries to one backend and all writes and SELECT … FOR UPDATE to a separate backend.

Keeping track of the metrics made available by HAProxy is very important. You have to be able to know the state of your proxy, especially when you run into any issues.
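
Outside of any dashboard, the raw counters can also be pulled straight from HAProxy's stats socket (the socket path is an assumption and depends on your haproxy.cfg):

echo "show stat" | socat unix-connect:/var/run/haproxy.socket stdio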

ClusterControl has always made an HAProxy status page available, showing the state in real time. With the new, Prometheus-based SCUMM (Severalnines ClusterControl Unified Monitoring & Management) dashboards, it is now possible to track how those metrics change over time.

In this blog post we will go over the different metrics presented in the HAProxy SCUMM dashboard.

First of all, by default Prometheus and the SCUMM dashboards are disabled in ClusterControl. Deploying them for a given cluster is then just a matter of one click. If you have multiple clusters monitored by ClusterControl, you can reuse the same Prometheus instance for every cluster managed by ClusterControl.

Once deployed, we can access the HAProxy dashboard. We will go over the data shown in it.

As you can see, it starts with the information about the state of the backends. Here, please note this may depend on the cluster type and how you deployed HAProxy. In this case, it was a Galera cluster and HAProxy was deployed in a round-robin fashion; therefore you see three backends for reads and three for writes, six in total. It is also the reason why you see all backends marked as up. In case of the replication cluster, things are looking different as the HAProxy will be deployed in a read/write split, and the scripts will keep only one host (master) up and running in the writer’s backend:

This is why on the screen above, you can see two backend servers marked as “down”.

The next graph focuses on the data sent and received on both the backend (from HAProxy to the database servers) and the frontend (between HAProxy and the client hosts).

We can also check the traffic distribution between backends that are configured in HAProxy. In this case we have two backends and the queries are sent via port 3308, which acts as the round-robin access point to our Galera Cluster.

We can also find graphs showing how the traffic was distributed across all backend servers. In this case, due to round-robin access pattern, data was more or less evenly distributed across all three backend Galera servers.

Next, we can see information about sessions: how many sessions were opened from HAProxy to the backend servers, and how many times per second a new session was opened to the backend. You can also check how those metrics look on a per backend server basis.

The next two graphs show the maximum number of sessions per backend server and when connectivity issues showed up. This can be quite useful for debugging purposes when you hit a configuration error on your HAProxy instance and connections start to be dropped.

The next graph may be even more valuable, as it shows different metrics related to error handling - response errors, request errors, retries on the backend side and so on. There is also a Sessions graph, which shows an overview of the session metrics.

On the next graph we can track connection errors over time; this can also be useful for pinpointing when the issue started to develop.

Finally, two graphs related to queued requests. HAProxy queues requests to the backend if the backend servers are oversaturated. This can point to, for example, overloaded database servers that cannot cope with more traffic.

As you can see, ClusterControl tracks the most important metrics of HAProxy and can show how they change in time. This is very useful in pinpointing when an issue started and, to some extent, what could be the root cause of it. Try it out (it’s free) for yourself.

How to Optimize Performance of ClusterControl and Its Components


Monitoring and management is critical to any production environment, and performance matters. Slow user interfaces that lag or do not respond, delayed alerts, cluster job timeouts when the server is starved of resources are all things that can cause trouble. There are ways to improve performance of ClusterControl, especially if you are managing multiple clusters and each cluster contains multiple nodes. This blog provides some tuning tips. The points elaborated here are curated based on our experience dealing with performance issues reported by our users and customers.

As an introduction, ClusterControl consists of several main components: a web application (frontend) based on PHP, together with a number of daemonized processes (backend), which leverage a MySQL/MariaDB database for persistent storage. You effectively control your cluster from the web application, and your actions are translated into a series of process calls executed by the backend processes to manage and monitor your database clusters.

MySQL Database

ClusterControl components rely on a MySQL or MariaDB database as the persistent store for monitoring data collected from the managed nodes, as well as for all ClusterControl meta data (e.g. what jobs there are in the queue, backup schedules, backup statuses, etc.). By default, the installer script will install whatever version comes with the standard repository of the OS. The following are the MySQL/MariaDB versions installed by the installer:

  • CentOS/Redhat 6 - MySQL 5.1
  • CentOS/Redhat 7 - MariaDB 5.5
  • Ubuntu 18.04 (Bionic) - MySQL 5.7
  • Ubuntu 16.04 (Xenial) - MySQL 5.7
  • Ubuntu 14.04 (Trusty) - MySQL 5.5
  • Debian 9 (Stretch) - MySQL 5.5
  • Debian 8 (Jessie) - MySQL 5.5
  • Debian 7 (Wheezy) - MySQL 5.5

The installer script would do some basic tuning like configuring datadir location, MySQL port, user and also InnoDB buffer pool size at the very beginning of the installation stage. However, the tuning might not be suitable once you have imported or created more clusters/nodes. With an increased number of nodes to be monitored and managed, ClusterControl would use more resources and the database layer is commonly the first bottleneck that users encounter. Some further tuning might be needed to contain the incoming load.

ClusterControl is smart enough to detect performance anomalies by writing lines like the following inside cmon_X.log (where X is the cluster ID):

2018-11-28 01:30:00 : (INFO) CmonSheetManager at 0x3839af0 loaded 70 values for key 'diskstat' between 2018-11-23 01:30:00 - 2018-11-28 01:30:00.
2018-11-28 01:30:00 : (INFO) SQL processing: 220.0000ms
2018-11-28 01:30:00 : (INFO) Parsing       : 0.0000ms
2018-11-28 01:30:00 : (INFO) Sum           : 220.0000ms

The above simply means it took 220ms (the Sum value) to load 70 values for the component "diskstat", with most of the processing time spent at the SQL processing stage and 0 ms spent parsing the SQL result set. This shows that the SQL layer takes most of the processing time when ClusterControl queries the dataset.

We believe most of the SQL queries executed by ClusterControl are properly optimized for single MySQL instance and use proper indexing. Simply said, if you see something like the above appearing regularly in the log file, some improvements to the database layer are in order, as shown in the next sections.

Tuning InnoDB Buffer Pool Size

The buffer pool size is an important setting and has to be configured upfront to improve MySQL performance. It allows MySQL processing to happen in memory instead of hitting the disk. A simple rule of thumb is to check the InnoDB hit ratio and look for the following line under the BUFFER POOL AND MEMORY section:

Buffer pool hit rate 986 / 1000

The hit rate of 986 / 1000 indicates that out of 1000 row reads, it was able to read the row in RAM 986 times. The remaining 14 times, it had to read the row of data from disk. Simply said, 1000 / 1000 is the best value that we are trying to achieve here, which means the frequently-accessed data fits fully in RAM.
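
A quick way to pull that line out on the ClusterControl host (credentials and socket are whatever your cmon database uses) is:

mysql -e "SHOW ENGINE INNODB STATUS\G" | grep "Buffer pool hit rate"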

Increasing the innodb_buffer_pool_size value will help a lot, giving MySQL more room to work in memory. However, ensure you have sufficient RAM resources beforehand. By default, ClusterControl allocates 50% of the RAM to the buffer pool. If the host is dedicated to ClusterControl, you can even push it to a higher value like 70%, provided you spare at least 2GB of RAM for the OS processes and caches. If you can't allocate that much, increasing the RAM is the only solution.

Changing this value requires a MySQL restart (on versions older than MySQL 5.7.5), so the correct service restart ordering is:

  1. Modify innodb_buffer_pool_size value inside my.cnf.
  2. Stop the CMON service.
  3. Restart MySQL/MariaDB service.
  4. Start the CMON service.

Or simply reboot the host if you can afford a longer downtime.
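
On a systemd-based host, that ordering could look like this (the MySQL unit name - mysql, mysqld or mariadb - depends on your distribution):

systemctl stop cmon
systemctl restart mysql      # after editing innodb_buffer_pool_size under [mysqld] in my.cnf
systemctl start cmon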

Tuning max_connections

By default, the installer script will configure max_connections value up to 512. This is rather high, although sane, since ClusterControl barely reaches 200 connections in total, unless the MySQL server is shared with other applications or you have tens of MySQL nodes monitored by ClusterControl (we are talking about 30 nodes and more).

A high max_connections value wastes resources, and adjusting it affects the maximum amount of memory MySQL may use. If that exceeds the system RAM, there is a chance that the MySQL server process will be terminated with an OOM exception if all connections are used.

To check on this, simply look for max_used_connections MySQL status. The following is the maximum connections ever reached by MySQL on a ClusterControl node that monitors 2 clusters with 6 MySQL nodes in total:

mysql> SHOW STATUS like 'max_used_connections';
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| Max_used_connections | 43    |
+----------------------+-------+

A good starting value is Max_used_connections x 2, and you can gradually increase it if the value keeps growing. Modifying the max_connections variable can be done dynamically via the SET GLOBAL statement.
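
Based on the output above, a hedged example would be:

mysql -e "SET GLOBAL max_connections = 86;"   # Max_used_connections (43) x 2
# Also persist the new value under [mysqld] in my.cnf so it survives a restart.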

Using MySQL socket file

By default, the installer script will automatically configure the following host value inside every ClusterControl database configuration files:

Configuration File                                   Value
/etc/cmon.cnf                                        mysql_hostname=127.0.0.1
/etc/cmon.d/cmon_X.cnf (X is the cluster ID)         mysql_hostname=127.0.0.1
/var/www/html/clustercontrol/bootstrap.php           define('DB_HOST', '127.0.0.1');
/var/www/html/cmonapi/config/database.php            define('DB_HOST', '127.0.0.1');

The above forces the MySQL client to connect via TCP networking, just like connecting to a remote MySQL host, even though ClusterControl is running on the same server as the MySQL server. We purposely configured it this way to simplify the installation process, since almost every OS platform pre-configures the MySQL socket file differently, and this greatly reduces the installation failure rate.

Changing the value to "localhost" will force the client to use the MySQL Unix socket file instead:

Configuration File                                   Value
/etc/cmon.cnf                                        mysql_hostname=localhost
/etc/cmon.d/cmon_X.cnf (X is the cluster ID)         mysql_hostname=localhost
/var/www/html/clustercontrol/bootstrap.php           define('DB_HOST', 'localhost');
/var/www/html/cmonapi/config/database.php            define('DB_HOST', 'localhost');

On Unix based systems, MySQL programs treat the host name localhost specially, in a way that is likely different from what you expect compared to other network-based programs. For connections to localhost, MySQL programs attempt to connect to the local server by using a Unix socket file. This occurs even if a --port or -P option is given to specify a port number.

Using the MySQL UNIX socket file is much more secure and cuts off the network overhead. It is always recommended over TCP. However, you need to make sure the socket file is configured correctly. It must be defined in the following directives inside my.cnf and every MySQL option file on the ClusterControl node, for example:

[mysqld]
socket=/var/lib/mysql/mysql.sock

[client]
socket=/var/lib/mysql/mysql.sock

[mysql]
socket=/var/lib/mysql/mysql.sock

[mysqldump]
socket=/var/lib/mysql/mysql.sock

Changing the socket file will also require a MySQL and CMON restart. If you are using the "localhost", you can then add some additional configuration options like skip-networking=1, to prevent accepting remote connections. Our ClusterControl Docker image is using this approach to overcome a limitation in docker-proxy when binding on ports.

OpenSSH with SSSD

ClusterControl uses SSH protocol as its main communication channel to manage and monitor remote nodes. The default OpenSSH configurations are pretty decent and should work in most cases. However, in some environments where SSH is integrated with other security enhancement tools like SElinux or System Security Services Daemon (SSSD), it could bring significant impact to the SSH performance.

We have seen many cases where an ever-increasing number of SSH connections were established to each of the managed nodes and, eventually, both the ClusterControl server and all managed nodes maxed out their system memory with SSH connections. In some cases, only a nightly full system reboot of the ClusterControl server could work around the problem.

If you are running your infrastructure with the System Security Services Daemon (SSSD), it's advised to comment out the following line in the OpenSSH client configuration at /etc/ssh/ssh_config on the ClusterControl node:

#ProxyCommand /usr/bin/sss_ssh_knownhostsproxy -p %p %h

The above will skip using SSSD to manage the host key, which will be handled by OpenSSH client instead.

Starting from ClusterControl 1.7.0, you have the option to use agent-based monitoring with Prometheus. With agent-based monitoring, ClusterControl does not use SSH to sample host metrics, which can be excessive in some environments.

File System and Partitioning

The ClusterControl controller writes a new entry to its log file almost every second for every cluster. If you want to take advantage of these sequential writes on disk and save cost, you can use a spindle disk for this purpose. Modify the following line inside all cmon_X.cnf files:

logfile=/new/partition/log/cmon_X.log

Replace X with the cluster ID and restart CMON service to apply the changes.

If you are using ClusterControl as the backup repository, it's recommended to allocate sufficient disk space on a separate partition other than the root partition. It's even better if the partition resides on a networked or clustered file system, for easy mounting on the targeted nodes when performing a restore operation. We have seen cases where the created backups ate up all the disk space of the main partition, eventually impacting ClusterControl and its components.

Keep up to Date

ClusterControl has a short release cycle - at least one new major release every quarter of the year plus weekly maintenance patches (mostly bug fixes - if any). The reason is ClusterControl supports multiple database vendors and versions, operating systems and hardware platforms. Often there are new things being introduced and old things being deprecated from what is provided and ClusterControl has to keep up with all the changes introduced by application vendors and follow the best-practice every time.

We recommend users to always use the latest version of ClusterControl (upgrading is easy) together with the latest web browser (built and tested on Google Chrome and Mozilla Firefox), as we are very likely taking advantage of the new features available in the latest version.

Final Thoughts

Do reach out to us via our support channels if you face any problems or slowness issues when using ClusterControl. Suggestions and feedback are very much welcome.

Happy tuning!

Automate Deployment of your MySQL or Postgres Cluster from Backup


ClusterControl 1.7.1 introduces a new feature called Create Cluster from Backup, which allows you to deploy a new MySQL or Postgres-based cluster and restore data on it from a backup. This blog post shows how this new feature works, and how this type of automation can deliver improvements to your infrastructure operations.

Introducing: Create Cluster from Backup

Backup management is the most loved feature by our users, and we just lifted it to the next level. This new functionality sounds simple - ClusterControl 1.7.1 is able to deploy a new cluster from an existing backup. However, there are multiple procedures and checks involved in order to deploy a production-grade cluster directly from a backup. Restoration itself comes with its own challenges, such as:

  • Full or partial restore consequences
  • Base backup and its incremental backups ordering (for incremental backup)
  • Backup decryption (if encrypted)
  • Restoration tool options
  • Decompression (if compressed)
  • Backup streaming from the source to the destination server
  • Disk space utilization during and after the restoration
  • Restoration progress reporting

Combine the above with the complexity and repetitiveness of database cluster deployment tasks, and you can see how much time this saves and how it reduces the risk of running error-prone procedures. The hardest part from the user's perspective is to pick which backup to restore from. ClusterControl handles all the heavy lifting behind the scenes and reports the end result once it finishes.

The steps are basically simple:

  1. Set up passwordless SSH from the ClusterControl node to the new servers (see the sketch after this list).
  2. Pick one logical backup from the backup list, or create one under Backups -> Create Backup.
  3. Click Restore -> Create Cluster from Backup and follow the deployment wizard.
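
A minimal sketch of step 1, assuming root SSH access and example IP addresses for the new servers:

$ ssh-keygen -t rsa        # skip if the ClusterControl node already has a key
$ ssh-copy-id root@192.168.0.101
$ ssh-copy-id root@192.168.0.102
$ ssh-copy-id root@192.168.0.103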

This feature is currently built specifically for MySQL Galera Cluster and PostgreSQL. Here is what you would see in the UI after clicking "Restore" on an existing backup:

The bottom option is what we are looking for. Next, is the summary dialog on the chosen backup before the deployment configuration:

Next, the same database cluster deployment wizard for the respective cluster (MySQL Galera Cluster or PostgreSQL) will be shown to configure a new cluster:

Note that you must specify the same database root/admin username and password as the one stored in the backup. Otherwise, the deployment will fail halfway through when starting up the first node. In general, the restoration and deployment procedures happen in the following order:

  1. Install the necessary software and dependencies on all database nodes.
  2. Start the first node.
  3. Stream and restore backup on the first node (with auto-restart flag).
  4. Configure and add the rest of the nodes.

A new database cluster will be listed under ClusterControl cluster dashboard once the job completes.

What can you gain from it?

There are a number of things you could benefit from this feature, as explained in the following sections.

Test your dataset in various conditions

Sometimes, you might be wondering how a new database version would work or perform with your database workload, and testing it out is the only way to know. This is where this feature comes in handy. It allows you to run tests and benchmarks on the many variables that affect database stability or performance, for instance the underlying hardware, software version, vendor, and database or application workload.

For a simple example, there is a big improvement in DDL execution between MySQL 5.6 and MySQL 5.7. The following DROP operation on a 10 million row table proves it:

mysql-5.7> ALTER TABLE sbtest1 DROP COLUMN xx;
Query OK, 0 rows affected (1 min 58.12 sec)
mysql-5.6> ALTER TABLE sbtest1 DROP COLUMN xx;
Query OK, 0 rows affected (2 min 23.74 sec)

Having another cluster to compare with actually allows us to measure the improvement and justify a migration.
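
If you would like to reproduce a similar comparison, the test table can be generated with sysbench and a throwaway column added for the DDL test (a sketch using sysbench 1.0 syntax; host and credentials are assumptions):

$ sysbench oltp_read_write \
  --tables=1 --table-size=10000000 \
  --mysql-host=192.168.0.101 --mysql-user=sbtest --mysql-password=passw0rd --mysql-db=sbtest \
  prepare
mysql> ALTER TABLE sbtest.sbtest1 ADD COLUMN xx INT;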

Database migration with logical backup

Logical backups like mysqldump and pg_dumpall are the safest way to upgrade, downgrade or migrate your data from one version or vendor to another. All logical backups can be used to perform database migration. The database upgrade steps are basically simple:

  1. Create (or schedule) a logical backup - mysqldump for MySQL or pg_dumpall for PostgreSQL (see the sketch after this list).
  2. Setup passwordless SSH from ClusterControl node to the new servers.
  3. Pick one created logical backup from the backup list.
  4. Click Restore -> Create Cluster from Backup and follow the deployment wizard.
  5. Verify the data restoration on the new cluster.
  6. Point your application to the new cluster.
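
For reference, the logical backups in step 1 are equivalent to something like the following when taken manually (credentials are assumptions; ClusterControl can create and schedule these for you):

$ mysqldump -u root -p --single-transaction --routines --triggers --events --all-databases > full_dump.sql
$ pg_dumpall -U postgres > full_dump.sql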

Faster total cluster recovery time

Imagine a catastrophic failure which prevents your cluster from running, for example a centralized storage failure which affects all the virtual machines connected to it. You could get a replacement cluster almost immediately (provided the backup files are stored outside of the failed database nodes, stating the obvious). This feature can be automated via the s9s client, where you can trigger the job via the command line interface, for example:

$ s9s cluster \
--create \
--cluster-type=postgresql \
--nodes="192.168.0.101?master;192.168.0.102?slave;192.168.0.103?slave" \
--provider-version=11 \
--db-admin=postgres \
--db-admin-passwd='s3cr3tP455' \
--os-user=root \
--os-key-file=/root/.ssh/id_rsa \
--cluster-name="PostgreSQL 9.6 - Test" \
--backup-id=214 \
--log

One thing to note when using this feature is that you must use the same admin username and password as what's stored in the backup. Also, passwordless SSH to all database nodes must be configured beforehand. Otherwise, if you prefer to configure everything interactively, just use the web UI.
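
To find the value to pass to --backup-id, you can list the existing backups with the s9s client, for example (the cluster ID is an assumption):

$ s9s backup --list --cluster-id=1 --long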

Scale Out via Asynchronous Replication

For MySQL Galera Cluster, the newly created cluster has the possibility to be scaled out via MySQL asynchronous replication. Let's say we already restored a new cluster in the office based on the latest backup from the production cluster in the data center, and we would like the office cluster to continue replicating from the production cluster, as illustrated in the following diagram:

You may then set up the asynchronous replication link in the following way:

  1. Pick one node in the production and enable binary logging (if disabled). Go to Nodes -> pick a node -> Node Actions -> Enable Binary Logging.

  2. Enable binary logging on all nodes of the office cluster (a my.cnf sketch follows this list). This action requires a rolling restart, which will be performed automatically if you choose "Yes" under the "Auto-restart node" dropdown:

    Otherwise, you can perform this operation without downtime by using Manage -> Upgrade -> Rolling Restart (or manually restart one node at a time).

  3. Create a replication user on the production cluster by using Manage -> Schemas and Users -> Users -> Create New User:

  4. Then, pick one node to replicate to the master node in production cluster and setup the replication link:

    mysql> CHANGE MASTER TO MASTER_HOST = 'prod-mysql1', MASTER_USER = 'slave', MASTER_PASSWORD = 'slavepassw0rd', MASTER_AUTO_POSITION = 1;
    mysql> START SLAVE;
  5. Verify if the replication is running:

    mysql> SHOW SLAVE STATUS\G
    Make sure Slave_IO_Running and Slave_SQL_Running are reporting 'Yes'. The office cluster should start to catch up with the master node if it's lagging behind.
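
For reference, the binary logging enabled in steps 1 and 2 corresponds roughly to my.cnf settings like the ones below (a sketch only - ClusterControl applies the exact values for you, and MariaDB-based Galera uses its own GTID implementation):

[mysqld]
server_id=1001               # must be unique across all replicating servers
log_bin=binlog
log_slave_updates=1
gtid_mode=ON                 # required for MASTER_AUTO_POSITION=1 (Oracle MySQL/PXC)
enforce_gtid_consistency=ON
expire_logs_days=7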

That’s it for now folks!


Hybrid OLTP/Analytics Database Workloads in Galera Cluster Using Asynchronous Slaves


Using Galera Cluster is a great way of building a highly available environment for MySQL or MariaDB. It is a shared-nothing cluster environment which can be scaled even beyond 12-15 nodes. Galera has some limitations, though. It shines in low-latency environments, and even though it can be used across a WAN, performance is limited by network latency. Galera performance can also be impacted if one of the nodes starts to behave incorrectly. For example, excessive load on one of the nodes may slow it down, resulting in slower handling of writes, and that will impact all of the other nodes in the cluster. On the other hand, it is quite impossible to run a business without analyzing your data. Such analysis typically requires running heavy queries, which are quite different from an OLTP workload. In this blog post, we will discuss an easy way of running analytical queries on data stored in Galera Cluster for MySQL or MariaDB, in a way that does not impact the performance of the core cluster.

How to run analytical queries on Galera Cluster?

As we stated, running long-running queries directly on a Galera cluster is doable, but perhaps not such a good idea. Depending on the hardware, this can be an acceptable solution (if you use strong hardware and do not run a multi-threaded analytical workload), but even if CPU utilization is not a problem, the fact that one of the nodes has a mixed workload (OLTP and OLAP) will alone pose some performance challenges. OLAP queries will evict data required for your OLTP workload from the buffer pool, and this will slow down your OLTP queries. Luckily, there is a simple yet efficient way of separating the analytical workload from regular queries - an asynchronous replication slave.

A replication slave is a very simple solution - all you need is another host, with asynchronous replication configured from the Galera Cluster to that node. With asynchronous replication, the slave will not impact the rest of the cluster in any way. No matter if it is heavily loaded or uses different (less powerful) hardware, it will just continue replicating from the core cluster. The worst case scenario is that the replication slave starts lagging behind, but then it is up to you to implement multi-threaded replication or, eventually, to scale up the replication slave.

Once the replication slave is up and running, you should run the heavier queries on it and offload the Galera cluster. This can be done in multiple ways, depending on your setup and environment. If you use ProxySQL, you can easily direct queries to the analytical slave based on the source host, user, schema or even the query itself. Otherwise it will be up to your application to send analytical queries to the correct host.

Setting up a replication slave is not very complex, but it can still be tricky if you are not proficient with MySQL and tools like xtrabackup. The whole process consists of setting up the repository on a new server and installing the MySQL database. Then you have to provision that host using data from the Galera cluster. You can use xtrabackup for that, but other tools like mydumper/myloader or even mysqldump will work as well (as long as you execute them correctly). Once the data is there, you have to set up replication between a master Galera node and the replication slave. Finally, you have to reconfigure your proxy layer to include the new slave and route the traffic towards it, or make tweaks in how your application connects to the database in order to redirect some of the load to the replication slave.

What is important to keep in mind is that this setup is not resilient. If the “master” Galera node goes down, the replication link will be broken and it will take manual action to slave the replica off another master node in the Galera cluster.

This is not a big deal, especially if you use replication with GTID (Global Transaction ID) but you have to identify that the replication is broken and then take the manual action.

How to set up the asynchronous slave to Galera Cluster using ClusterControl?

Luckily, if you use ClusterControl, the whole process can be automated and it requires just a handful of clicks. The initial state has already been set up using ClusterControl - a 3 node Galera cluster with 2 ProxySQL nodes and 2 Keepalived nodes for high availability of both database and proxy layer.

Adding the replication slave is just a click away:

Replication, obviously, requires binary logs to be enabled. If you do not have binlogs enabled on your Galera nodes, you can do it also from the ClusterControl. Please keep in mind that enabling binary logs will require a node restart to apply the configuration changes.

Even if one node in the cluster has binary logs enabled (marked as “Master” on the screenshot above), it’s still good to enable binary log on at least one more node. ClusterControl can automatically failover the replication slave after it detects that the master Galera node crashed, but for that, another master node with binary logs enabled is required or it won’t have anything to fail over to.

As we stated, enabling binary logs requires restart. You can either perform it straight away, or just make the configuration changes and perform the restart at some other time.

After binlogs have been enabled on some of the Galera nodes, you can proceed with adding the replication slave. In the dialog you have to pick the master host, pass the hostname or IP address of the slave. If you have recent backups at hand (which you should do), you can use one to provision the slave. Otherwise ClusterControl will provision it using xtrabackup - all the recent master data will be streamed to the slave and then the replication will be configured.

After the job completes, the replication slave is added to the cluster. As stated earlier, should 10.0.0.101 die, another host in the Galera cluster will be picked as the master and ClusterControl will automatically slave 10.0.0.104 off another node.

As we use ProxySQL, we need to configure it. We’ll add a new server into ProxySQL.

We created another hostgroup (30) where we put our asynchronous slave. We also increased “Max Replication Lag” to 50 seconds from the default 10. It is up to your business requirements how far the analytics slave can lag behind before it becomes a problem.

After that, we have to configure a query rule that will match our OLAP traffic and route it to the OLAP hostgroup (30). On the screenshot above we filled in several fields - this is not mandatory. Typically you will need to use one of them, two at most. The screenshot above serves as an example so you can easily see that you can match queries using the schema (if you have a separate schema with analytical data), the hostname/IP (if OLAP queries are executed from a particular host), or the user (if the application uses a particular user for analytical queries). You can also match queries directly, either by passing a full query or by marking them with SQL comments and letting ProxySQL route all queries containing an “OLAP_QUERY” string to our analytical hostgroup.
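
For example, a comment-based rule can be added from the ProxySQL admin interface with something like this (the rule_id is arbitrary and hostgroup 30 matches the setup above):

Admin> INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply)
       VALUES (50, 1, 'OLAP_QUERY', 30, 1);
Admin> LOAD MYSQL QUERY RULES TO RUNTIME;
Admin> SAVE MYSQL QUERY RULES TO DISK;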

As you can see, thanks to ClusterControl we were able to deploy a replication slave to Galera Cluster in just a couple of clicks. Some may argue that MySQL is not the most suitable database for analytical workloads, and we tend to agree. You can easily extend this setup with ClickHouse, by setting up replication from the asynchronous slave to a ClickHouse columnar datastore for much better performance of analytical queries. We described this setup in one of our earlier blog posts.

Dealing with Unreliable Networks When Crafting an HA Solution for MySQL or MariaDB


Long gone are the days when a database was deployed as a single node or instance - a powerful, standalone server which was tasked to handle all the requests to the database. Vertical scaling was the way to go - replace the server with another, even more powerful one. During these times, one didn’t really have to be bothered by network performance. As long as the requests were coming in, all was good.

But nowadays, databases are built as clusters with nodes interconnected over a network. It is not always a fast, local network. With businesses reaching global scale, database infrastructure has also to span across the globe, to stay close to customers and to reduce latency. It comes with additional challenges that we have to face when designing a highly available database environment. In this blog post, we will look into the network issues that you may face and provide some suggestions on how to deal with them.

Two Main Options for MySQL or MariaDB HA

We covered this particular topic quite extensively in one of the whitepapers, but let’s look at the two main ways of building high availability for MySQL and MariaDB.

Galera Cluster

Galera Cluster is a shared-nothing, virtually synchronous cluster technology for MySQL. It allows you to build multi-writer setups that can span across the globe. Galera thrives in low-latency environments, but it can also be configured to work over long WAN connections. Galera has a built-in quorum mechanism which ensures that data will not be compromised in case some of the nodes become partitioned from the network.

MySQL Replication

MySQL Replication can be either asynchronous or semi-synchronous. Both are designed to build large scale replication clusters. Like in any other master-slave or primary-secondary replication setup, there can be only one writer, the master. The other nodes, the slaves, are used for failover purposes as they contain a copy of the data set from the master. Slaves can also be used for reading the data and offloading some of the workload from the master.

Both solutions have their own limits and features, both suffer from different problems. Both can be affected by unstable network connections. Let’s take a look at those limitations and how we can design the environment to minimize the impact of an unstable network infrastructure.

Galera Cluster - Network Problems

First, let’s take a look at Galera Cluster. As we discussed, it works best in a low-latency environment. One of the main latency-related problems in Galera is the way it handles writes. We will not go into all the details in this blog, but you can read further in our Galera Cluster for MySQL tutorial. The bottom line is that, due to the certification process for writes, where all nodes in the cluster have to agree on whether the write can be applied or not, your single-row write performance is strictly limited by the network roundtrip time between the writer node and the farthest node. As long as the latency is acceptable, and as long as you do not have too many hot spots in your data, WAN setups may work just fine. The problem starts when the network latency spikes from time to time. Writes will then take 3 or 4 times longer than usual and, as a result, the databases may start to be overloaded with long-running writes.

One of the great features of Galera Cluster is its ability to detect the cluster state and react to network partitioning. If a node of the cluster cannot be reached, it will be evicted from the cluster and it will not be able to perform any writes. This is crucial in maintaining the integrity of the data while the cluster is split - only the majority of the cluster will accept writes; the minority will complain. To handle this, Galera introduces a vast array of checks and configurable timeouts to avoid false alerts on very transient network issues. Unfortunately, if the network is unreliable, Galera Cluster will not be able to work correctly - nodes will start to leave the cluster and rejoin later. It will be especially problematic when Galera Cluster spans a WAN - separated pieces of the cluster may disappear randomly if the interconnecting network does not work properly.

How to Design Galera Cluster for an Unstable Network?

First things first, if you have network problems within a single datacenter, there is not much you can do unless you are able to solve those issues somehow. An unreliable local network is a no-go for Galera Cluster; you have to reconsider and use some other solution (even though, to be honest, an unreliable network will always be problematic). On the other hand, if the problems are related to WAN connections only (and this is one of the most typical cases), it may be possible to replace WAN Galera links with regular asynchronous replication (if Galera WAN tuning did not help).

There are several inherent limitations in this setup - the main issue is that the writes used to happen locally. Now, all the writes will have to head to the “master” datacenter (DC A in our case). This is not as bad as it sounds. Please keep in mind that in an all-Galera environment, writes will be slowed down by the latency between nodes located in different datacenters. Even local writes will be affected. It will be more or less the same slowdown as with an asynchronous setup in which you would send the writes across the WAN to the “master” datacenter.

Using asynchronous replication comes with all of the problems typical for asynchronous replication. Replication lag may become a problem - not that Galera would be more performant, it’s just that Galera would slow down the traffic via flow control, while replication does not have any mechanism to throttle the traffic on the master.

Another problem is failover: if the “master” Galera node (the one which acts as the master to the slaves in other datacenters) fails, some mechanism has to be created to repoint the slaves to another, working master node. It might be some sort of script; it is also possible to try something with a VIP, where the “slave” Galera cluster slaves off a Virtual IP which is always assigned to an alive Galera node in the “master” cluster.

The main advantage of such a setup is that we remove the WAN Galera link, which means that our “master” cluster will not be slowed down by the fact that some of the nodes are geographically separated. As we mentioned, we lose the ability to write in all of the datacenters, but, latency-wise, writing across the WAN is the same as writing locally to a Galera cluster which spans the WAN. As a result, the overall latency should improve. Asynchronous replication is also less vulnerable to unstable networks. Worst case scenario, the replication link will break and be recreated when the networks converge.


How to Design MySQL Replication for an Unstable Network?

In the previous section, we covered Galera Cluster, and one solution was to use asynchronous replication. What does it look like in a plain asynchronous replication setup? Let’s look at how an unstable network can cause the biggest disruptions in a replication setup.

First of all, latency - one of the main pain points for Galera Cluster. In the case of replication, it is almost a non-issue. Unless you use semi-synchronous replication, that is - in that case, increased latency will slow down writes. In asynchronous replication, latency has no impact on write performance. It may, though, have some impact on replication lag. It is not anything as significant as it was for Galera, but you may expect more lag spikes and overall less stable replication performance if the network between nodes suffers from high latency. This is mostly because the master may serve several writes before the data transfer to the slave can be initiated on a high latency network.

Network instability may definitely impact replication links, but it is, again, not that critical. MySQL slaves will attempt to reconnect to their masters and replication will resume.

The main issue with MySQL replication is actually something that Galera Cluster solves internally - network partitioning. We are talking about the network partitioning as the condition in which segments of the network are separated from each other. MySQL replication utilizes one single writer node - master. No matter how you design your environment, you have to send your writes to the master. If the master is not available (for whatever reasons), application cannot do its job unless it runs in some sort of read-only mode. Therefore there is a need to pick the new master as soon as possible. This is where the issues show up.

First, how do you tell which host is a master and which one is not? One of the usual ways is to use the “read_only” variable to distinguish slaves from the master. If a node has read_only enabled (set read_only=1), it is a slave (as slaves should not handle any direct writes). If a node has read_only disabled (set read_only=0), it is a master. To make things safer, a common approach is to set read_only=1 in the MySQL configuration - in case of a restart, it is safer if the node shows up as a slave. Such a “language” can be understood by proxies like ProxySQL or MaxScale.
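
Checking and persisting the flag is straightforward (a quick sketch):

mysql> SELECT @@global.read_only, @@global.super_read_only;
mysql> SET GLOBAL read_only = 1;

And in my.cnf, so the node comes back as a slave after a restart:

[mysqld]
read_only = 1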

Let’s take a look at an example.

We have application hosts which connect to the proxy layer. Proxies perform the read/write split, sending SELECTs to slaves and writes to the master. If the master is down, failover is performed, a new master is promoted, and the proxy layer detects that and starts sending writes to another node.

If node1 restarts, it will come up with read_only=1 and it will be detected as a slave. It is not ideal as it is not replicating but it is acceptable. Ideally, the old master should not show up at all until it is rebuilt and slaved off the new master.

Way more problematic situation is if we have to deal with network partitioning. Let’s consider the same setup: application tier, proxy tier and databases.

When the network makes the master unreachable, the application is not usable as no writes make it to their destination. A new master is promoted and writes are redirected to it. What will happen then if the network issues cease and the old master becomes reachable? It has not been stopped, therefore it is still using read_only=0:

You’ve now ended up with a split brain, where writes were directed to two nodes. This situation is pretty bad, as merging diverged datasets may take a while and is quite a complex process.

What can be done to avoid this problem? There is no silver bullet, but some actions can be taken to minimize the probability of a split brain happening.

First of all, you can be smarter in detecting the state of the master. How do the slaves see it? Can they replicate from it? Maybe some of the slaves can still connect to the master, meaning that the master is up and running or, at least, making it possible to stop it should that be necessary. What about the proxy layer? Do all of the proxy nodes see the master as unavailable? If some can still connect, then you can try to utilize those nodes to SSH into the master and stop it before the failover.

The failover management software can also be smarter in detecting the state of the network. Maybe it utilizes RAFT or some other clustering protocol to build a quorum-aware cluster. If the failover management software can detect the split brain, it can also take some actions based on this, for example setting all nodes in the partitioned segment to read_only, ensuring that the old master will not show up as writable when the networks converge.

You can also include tools like Consul or Etcd to store the state of the cluster. The proxy layer can be configured to use data from Consul, not the state of the read_only variable. It will be then up to the failover management software to make necessary changes in Consul so that all proxies will send the traffic to a correct, new master.

Some of those hints can even be combined together to make the failure detection even more reliable. All in all, it is possible to minimize the chances that the replication cluster will suffer from unreliable networks.

As you can see, no matter if we are talking about Galera or MySQL Replication, unstable networks may become a serious problem. On the other hand, if you design the environment correctly, you can still make it work. We hope this blog post helps you to create environments which will work reliably even when the networks are not.

High Availability on a Shoestring Budget - Deploying a Minimal Two Node MySQL Galera Cluster


We regularly get questions about how to set up a Galera cluster with just 2 nodes.

The documentation clearly states you should have at least 3 Galera nodes to avoid network partitioning. But there are some valid reasons for considering a 2 node deployment, e.g., if you want to achieve database high availability but have a limited budget to spend on a third database node. Or perhaps you are running Galera in a development/sandbox environment and prefer a minimal setup.

Galera implements a quorum-based algorithm to select a primary component through which it enforces consistency. The primary component needs to have a majority of votes, so in a 2 node system, there would be no majority resulting in split brain. Fortunately, it is possible to add a garbd (Galera Arbitrator Daemon), which is a lightweight stateless daemon that can act as the odd node. Arbitrator failure does not affect the cluster operations and a new instance can be reattached to the cluster at any time. There can be several arbitrators in the cluster.

ClusterControl has support for deploying garbd on non-database hosts.

Normally a Galera cluster needs at least three hosts to be fully functional, however, at deploy time, two nodes would suffice to create a primary component. Here are the steps:

  1. Deploy a Galera cluster of two nodes,
  2. After the cluster has been deployed by ClusterControl, add garbd on the ClusterControl node.

You should end up with the below setup:

Deploy the Galera Cluster

Go to the ClusterControl Deploy section to deploy the cluster.

After selecting the technology that we want to deploy, we must specify User, Key or Password and port to connect by SSH to our hosts. We also need the name for our new cluster and if we want ClusterControl to install the corresponding software and configurations for us.

After setting up the SSH access information, we must select vendor/version and we must define the database admin password, datadir and port. We can also specify which repository to use.

Even though ClusterControl warns you that a Galera cluster needs an odd number of nodes, only add two nodes to the cluster.

Deploying a Galera cluster will trigger a ClusterControl job which can be monitored at the Jobs page.


Install Garbd

Once deployment is complete, install garbd on the ClusterControl host. We have the option to deploy garbd from ClusterControl, but this option won’t work if we want to deploy it on the same ClusterControl server. This is to avoid issues related to database versions and package dependencies.

So, we must install it manually, and then import garbd to ClusterControl.

Let’s see the manual installation of Percona Garbd on CentOS 7.

Create the Percona repository file:

$ vi /etc/yum.repos.d/percona.repo
[percona-release-$basearch]
name = Percona-Release YUM repository - $basearch
baseurl = http://repo.percona.com/release/$releasever/RPMS/$basearch
enabled = 1
gpgcheck = 0
[percona-release-noarch]
name = Percona-Release YUM repository - noarch
baseurl = http://repo.percona.com/release/$releasever/RPMS/noarch
enabled = 1
gpgcheck = 0
[percona-release-source]
name = Percona-Release YUM repository - Source packages
baseurl = http://repo.percona.com/release/$releasever/SRPMS
enabled = 0
gpgcheck = 0

Then, install the Percona XtraDB Cluster garbd package:

$ yum install Percona-XtraDB-Cluster-garbd-57

Now, we need to configure garbd. For this, we need to edit the /etc/sysconfig/garb file:

$ vi /etc/sysconfig/garb
# Copyright (C) 2012 Codership Oy
# This config file is to be sourced by garb service script.
# A comma-separated list of node addresses (address[:port]) in the cluster
GALERA_NODES="192.168.100.192:4567,192.168.100.193:4567"
# Galera cluster name, should be the same as on the rest of the nodes.
GALERA_GROUP="Galera1"
# Optional Galera internal options string (e.g. SSL settings)
# see http://galeracluster.com/documentation-webpages/galeraparameters.html
# GALERA_OPTIONS=""
# Log file for garbd. Optional, by default logs to syslog
# Deprecated for CentOS7, use journalctl to query the log for garbd
# LOG_FILE=""

Change the GALERA_NODES and GALERA_GROUP parameters according to the configuration of your Galera nodes. We also need to remove the line # REMOVE THIS AFTER CONFIGURATION before starting the service.

And now, we can start the garb service:

$ service garb start
Redirecting to /bin/systemctl start garb.service
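
Before importing it, you can verify that the arbitrator has joined by checking the cluster size from either database node - it should now report 3:

mysql> SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+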

Now, we can import the new garbd into ClusterControl.

Go to ClusterControl -> Select Cluster -> Add Load Balancer.

Then, select Garbd and Import Garbd section.

Here we only need to specify the hostname or IP Address and the port of the new Garbd.

Importing garbd will trigger a ClusterControl job which can be monitored at the Jobs page. Once completed, you can verify garbd is running with a green tick icon at the top bar:

That’s it!

Our minimal two-node Galera cluster is now ready!

HA for MySQL and MariaDB - Comparing Master-Master Replication to Galera Cluster


Galera replication is relatively new if compared to MySQL replication, which is natively supported since MySQL v3.23. Although MySQL replication is designed for master-slave unidirectional replication, it can be configured as an active master-master setup with bidirectional replication. While it is easy to set up, and some use cases might benefit from this “hack”, there are a number of caveats. On the other hand, Galera cluster is a different type of technology to learn and manage. Is it worth it?

In this blog post, we are going to compare master-master replication to Galera cluster.

Replication Concepts

Before we jump into the comparison, let’s explain the basic concepts behind these two replication mechanisms.

Generally, any modification to the MySQL database generates an event in binary format. This event is transported to the other nodes depending on the replication method chosen - MySQL replication (native) or Galera replication (patched with wsrep API).

MySQL Replication

The following diagram illustrates the data flow of a successful transaction from one node to another when using MySQL replication:

The binary event is written into the master's binary log. The slave(s) via slave_IO_thread will pull the binary events from master's binary log and replicate them into its relay log. The slave_SQL_thread will then apply the event from the relay log asynchronously. Due to the asynchronous nature of replication, the slave server is not guaranteed to have the data when the master performs the change.

Ideally, MySQL replication will have the slave to be configured as a read-only server by setting read_only=ON or super_read_only=ON. This is a precaution to protect the slave from accidental writes which can lead to data inconsistency or failure during master failover (e.g., errant transactions). However, in a master-master active-active replication setup, read-only has to be disabled on the other master to allow writes to be processed simultaneously. The primary master must be configured to replicate from the secondary master by using the CHANGE MASTER statement to enable circular replication.

Galera Replication

The following diagram illustrates the data replication flow of a successful transaction from one node to another for Galera Cluster:

The event is encapsulated in a writeset and broadcasted from the originator node to the other nodes in the cluster by using Galera replication. The writeset undergoes certification on every Galera node and if it passes, the applier threads will apply the writeset asynchronously. This means that the slave server will eventually become consistent, after agreement of all participating nodes in global total ordering. It is logically synchronous, but the actual writing and committing to the tablespace happens independently, and thus asynchronously on each node with a guarantee for the change to propagate on all nodes.

Avoiding Primary Key Collision

In order to deploy MySQL replication in a master-master setup, one has to adjust the auto increment value to avoid primary key collisions for INSERTs between two or more replicating masters. This allows the primary key values on the masters to interleave with each other and prevents the same auto increment number from being used twice on either of the nodes. This behaviour must be configured manually, depending on the number of masters in the replication setup. The value of auto_increment_increment equals the number of replicating masters, and the auto_increment_offset must be unique between them. For example, the following lines should exist inside the corresponding my.cnf:

Master1:

log-slave-updates
auto_increment_increment=2
auto_increment_offset=1

Master2:

log-slave-updates
auto_increment_increment=2
auto_increment_offset=2

Likewise, Galera Cluster uses this same trick to avoid primary key collisions by controlling the auto increment value and offset automatically with the wsrep_auto_increment_control variable. If set to 1 (the default), it will automatically adjust the auto_increment_increment and auto_increment_offset variables according to the size of the cluster, and whenever the cluster size changes. This avoids replication conflicts due to auto_increment. In a master-slave environment, this variable can be set to OFF.
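
You can quickly verify what Galera has set on each node:

mysql> SHOW GLOBAL VARIABLES LIKE 'wsrep_auto_increment_control';
mysql> SHOW GLOBAL VARIABLES LIKE 'auto_increment%';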

The consequence of this configuration is that the auto increment values will not be in sequential order, as shown in the following table for a three-node Galera Cluster:

Node     auto_increment_increment   auto_increment_offset   Auto increment value
Node 1   3                          1                       1, 4, 7, 10, 13, 16...
Node 2   3                          2                       2, 5, 8, 11, 14, 17...
Node 3   3                          3                       3, 6, 9, 12, 15, 18...

If an application performs insert operations on the following nodes in the following order:

  • Node1, Node3, Node2, Node3, Node3, Node1, Node3 ..

Then the primary key value that will be stored in the table will be:

  • 1, 6, 8, 9, 12, 13, 15 ..

Simply said, when using master-master replication (MySQL replication or Galera), your application must be able to tolerate non-sequential auto-increment values in its dataset.

For ClusterControl users, take note that it supports deployment of MySQL master-master replication with a limit of two masters per replication cluster, and only for active-passive setups. Therefore, ClusterControl deliberately does not configure the masters with the auto_increment_increment and auto_increment_offset variables.

Data Consistency

Galera Cluster comes with its own flow-control mechanism, where each node in the cluster must keep up when replicating, or otherwise all other nodes will have to slow down to allow the slowest node to catch up. This basically minimizes the probability of slave lag, although it might still happen, just not as significantly as in MySQL replication. By default, Galera allows a node to be up to 16 transactions behind in applying, as controlled by the gcs.fc_limit variable, before flow control kicks in. If you want to do critical reads (a SELECT that must return the most up to date information), you probably want to use the session variable wsrep_sync_wait.
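
For example, a critical read can be wrapped like this (a sketch - the table is just an illustration; wsrep_sync_wait=1 enforces the causality check for READ statements):

mysql> SET SESSION wsrep_sync_wait = 1;
mysql> SELECT balance FROM accounts WHERE id = 100;
mysql> SET SESSION wsrep_sync_wait = 0;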

Galera Cluster, on the other hand, comes with a safeguard against data inconsistency, whereby a node will get evicted from the cluster if it fails to apply any writeset for whatever reason. For example, when a Galera node fails to apply a writeset due to an internal error in the underlying storage engine (MySQL/MariaDB), the node will pull itself out of the cluster with the following error:

150305 16:13:14 [ERROR] WSREP: Failed to apply trx 1 4 times
150305 16:13:14 [ERROR] WSREP: Node consistency compromized, aborting..

To fix the data consistency, the offending node has to be re-synced before it is allowed to join the cluster. This can be done manually or by wiping out the data directory to trigger snapshot state transfer (full syncing from a donor).

MySQL master-master replication does not enforce data consistency protection, and a slave is allowed to diverge, e.g., replicate a subset of data or lag behind, which makes the slave inconsistent with the master. It is designed to replicate data in one flow - from the master down to the slaves. Data consistency checks have to be performed manually or via external tools like Percona Toolkit pt-table-checksum or mysql-replication-check.
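
A minimal pt-table-checksum invocation, run from the master, would look something like this (host, credentials and database name are assumptions):

$ pt-table-checksum h=master1,u=checksum_user,p=s3cr3t --databases=mydb --replicate=percona.checksums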

Conflict Resolution

Generally, master-master (or multi-master, or bi-directional) replication allows more than one member in the cluster to process writes. With MySQL replication, in case of replication conflict, the slave's SQL thread simply stops applying the next query until the conflict is resolved, either by manually skipping the replication event, fixing the offending rows or resyncing the slave. Simply said, there is no automatic conflict resolution support for MySQL replication.

Galera Cluster provides a better alternative by retrying the offending transaction during replication. By using wsrep_retry_autocommit variable, one can instruct Galera to automatically retry a failed transaction due to cluster-wide conflicts, before returning an error to the client. If set to 0, no retries will be attempted, while a value of 1 (the default) or more specifies the number of retries attempted. This can be useful to assist applications using autocommit to avoid deadlocks.
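
For example, to allow up to four automatic retries (a sketch):

mysql> SET GLOBAL wsrep_retry_autocommit = 4;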


Node Consensus and Failover

Galera uses the Group Communication System (GCS) to check node consensus and availability between cluster members. If a node is unhealthy, it will be automatically evicted from the cluster after the gmcast.peer_timeout value, which defaults to 3 seconds. A healthy Galera node in "Synced" state is deemed a reliable node to serve reads and writes, while others are not. This design greatly simplifies health check procedures for the upper tiers (load balancer or application).

In MySQL replication, a master does not care about its slave(s), while a slave only has consensus with its sole master via the slave_IO_thread process when replicating the binary events from the master's binary log. If a master goes down, this will break the replication and an attempt to re-establish the link will be made every slave_net_timeout (defaults to 60 seconds). From the application or load balancer perspective, the health check procedures for a replication slave must at least involve checking the following state (a minimal check sketch follows the list):

  • Seconds_Behind_Master
  • Slave_IO_Running
  • Slave_SQL_Running
  • read_only variable
  • super_read_only variable (MySQL 5.7.8 and later)
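
A minimal check, run against the slave, could look like the sketch below; look at Seconds_Behind_Master, Slave_IO_Running and Slave_SQL_Running in the first output, and define your own lag threshold:

mysql> SHOW SLAVE STATUS\G
mysql> SELECT @@global.read_only, @@global.super_read_only;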

In terms of failover, generally, master-master replication and Galera nodes are equal. They hold the same data set (albeit you can replicate a subset of data in MySQL replication, but that's uncommon for master-master) and share the same role as masters, capable of handling reads and writes simultaneously. Therefore, there is actually no failover from the database point-of-view due to this equilibrium. Only from the application side that would require failover to skip the unoperational nodes. Keep in mind that because MySQL replication is asynchronous, it is possible that not all of the changes done on the master will have propagated to the other master.

Node Provisioning

The process of bringing a node into sync with the cluster before replication starts is known as provisioning. In MySQL replication, provisioning a new node is a manual process. One has to take a backup of the master and restore it on the new node before setting up the replication link. For an existing replication node, if the master's binary logs have been rotated away (based on expire_logs_days, which defaults to 0, meaning no automatic removal), you may have to re-provision the node using this procedure. There are also external tools like Percona Toolkit pt-table-sync and ClusterControl to help you out with this. ClusterControl supports resyncing a slave with just two clicks. You have the option to resync by taking a backup from the active master or using an existing backup.

In Galera, there are two ways of doing this - incremental state transfer (IST) or state snapshot transfer (SST). The IST process is the preferred method, where only the missing transactions are transferred from a donor's cache. The SST process is similar to taking a full backup from the donor and is usually pretty resource intensive. Galera will automatically determine which syncing process to trigger based on the joiner's state. In most cases, if a node fails to join a cluster, simply wipe out the MySQL datadir of the problematic node and start the MySQL service. The Galera provisioning process is much simpler; it comes in very handy when scaling out your cluster or re-introducing a problematic node back into the cluster.

Loosely Coupled vs Tightly Coupled

MySQL replication works very well even across slower connections, and with connections that are not continuous. It can also be used across different hardware, environment and operating systems. Most storage engines support it, including MyISAM, Aria, MEMORY and ARCHIVE. This loosely coupled setup allows MySQL master-master replication to work well in a mixed environment with less restriction.

Galera nodes are tightly-coupled, where the replication performance is as fast as the slowest node. Galera uses a flow control mechanism to control replication flow among members and eliminate any slave lag. The replication can be all fast or all slow on every node and is adjusted automatically by Galera. Thus, it's recommended to use uniform hardware specs for all Galera nodes, especially with respect to CPU, RAM, disk subsystem, network interface card and network latency between nodes in the cluster.

Conclusions

In summary, Galera Cluster is superior if compared to MySQL master-master replication due to its synchronous replication support with strong consistency, plus more advanced features like automatic membership control, automatic node provisioning and multi-threaded slaves. Ultimately, this depends on how the application interacts with the database server. Some legacy applications built for a standalone database server may not work well on a clustered setup.

To simplify our points above, the following reasons justify when to use MySQL master-master replication:

  • Things that are not supported by Galera:
    • Replication for non-InnoDB/XtraDB tables like MyISAM, Aria, MEMORY or ARCHIVE.
    • XA transactions.
    • Statement-based replication between masters (e.g, when bandwidth is very expensive).
    • Relying on explicit locking like LOCK TABLES statement.
    • The general query log and the slow query log must be directed to a table, instead of a file.
  • Loosely coupled setup where the hardware specs, software version and connection speed are significantly different on every master.
  • When you already have a MySQL replication chain and you want to add another active/backup master for redundancy to speed up failover and recovery time in case if one of the master is unavailable.
  • If your application can't be modified to work around Galera Cluster limitations and having a MySQL-aware load balancer like ProxySQL or MaxScale is not an option.

Reasons to pick Galera Cluster over MySQL master-master replication:

  • Ability to safely write to multiple masters.
  • Data consistency automatically managed (and guaranteed) across databases.
  • New database nodes easily introduced and synced.
  • Failures or inconsistencies automatically detected.
  • In general, more advanced and robust high availability features.

How to Run and Configure ProxySQL 2.0 for MySQL Galera Cluster on Docker


ProxySQL is an intelligent and high-performance SQL proxy which supports MySQL, MariaDB and ClickHouse. Recently, ProxySQL 2.0 has become GA and it comes with new exciting features such as GTID consistent reads, frontend SSL, Galera and MySQL Group Replication native support.

It is relatively easy to run ProxySQL as Docker container. We have previously written about how to run ProxySQL on Kubernetes as a helper container or as a Kubernetes service, which is based on ProxySQL 1.x. In this blog post, we are going to use the new version ProxySQL 2.x which uses a different approach for Galera Cluster configuration.

ProxySQL 2.x Docker Image

We have released a new ProxySQL 2.0 Docker image container and it's available in Docker Hub. The README provides a number of configuration examples particularly for Galera and MySQL Replication, pre and post v2.x. The configuration lines can be defined in a text file and mapped into the container's path at /etc/proxysql.cnf to be loaded into ProxySQL service.

The image "latest" tag still points to 1.x until ProxySQL 2.0 officially becomes GA (we haven't seen any official release blog/article from the ProxySQL team yet), which means that whenever you install the ProxySQL image using the latest tag from Severalnines, you will still get version 1.x. Take note that the new example configurations also enable the ProxySQL web stats (introduced in 1.4.4 but still in beta) - a simple dashboard that summarizes the overall configuration and status of ProxySQL itself.

ProxySQL 2.x Support for Galera Cluster

Let's talk about Galera Cluster native support in greater detail. The new mysql_galera_hostgroups table consists of the following fields:

  • writer_hostgroup: ID of the hostgroup that will contain all the members that are writers (read_only=0).
  • backup_writer_hostgroup: If the cluster is running in multi-writer mode (i.e. there are multiple nodes with read_only=0) and max_writers is set to a smaller number than the total number of nodes, the additional nodes are moved to this backup writer hostgroup.
  • reader_hostgroup: ID of the hostgroup that will contain all the members that are readers (i.e. nodes that have read_only=1)
  • offline_hostgroup: When ProxySQL monitoring determines a host to be OFFLINE, the host will be moved to the offline_hostgroup.
  • active: a boolean value (0 or 1) to activate a hostgroup
  • max_writers: Controls the maximum number of allowable nodes in the writer hostgroup, as mentioned previously, additional nodes will be moved to the backup_writer_hostgroup.
  • writer_is_also_reader: When 1, a node in the writer_hostgroup will also be placed in the reader_hostgroup so that it will be used for reads. When set to 2, the nodes from backup_writer_hostgroup will be placed in the reader_hostgroup, instead of the node(s) in the writer_hostgroup.
  • max_transactions_behind: determines the maximum number of writesets a node in the cluster can have queued before the node is SHUNNED to prevent stale reads (this is determined by querying the wsrep_local_recv_queue Galera variable).
  • comment: Text field that can be used for any purposes defined by the user

Here is an example configuration for mysql_galera_hostgroups in table format:

Admin> select * from mysql_galera_hostgroups\G
*************************** 1. row ***************************
       writer_hostgroup: 10
backup_writer_hostgroup: 20
       reader_hostgroup: 30
      offline_hostgroup: 9999
                 active: 1
            max_writers: 1
  writer_is_also_reader: 2
max_transactions_behind: 20
                comment: 

ProxySQL performs Galera health checks by monitoring the following MySQL status/variable:

  • read_only - If ON, then ProxySQL will group the defined host into reader_hostgroup unless writer_is_also_reader is 1.
  • wsrep_desync - If ON, ProxySQL will mark the node as unavailable, moving it to offline_hostgroup.
  • wsrep_reject_queries - If this variable is ON, ProxySQL will mark the node as unavailable, moving it to the offline_hostgroup (useful in certain maintenance situations).
  • wsrep_sst_donor_rejects_queries - If this variable is ON, ProxySQL will mark the node as unavailable while the Galera node is serving as an SST donor, moving it to the offline_hostgroup.
  • wsrep_local_state - If this status returns other than 4 (4 means Synced), ProxySQL will mark the node as unavailable and move it into offline_hostgroup.
  • wsrep_local_recv_queue - If this status is higher than max_transactions_behind, the node will be shunned.
  • wsrep_cluster_status - If this status returns other than Primary, ProxySQL will mark the node as unavailable and move it into offline_hostgroup.

Having said that, by combining these new parameters in mysql_galera_hostgroups with mysql_query_rules, ProxySQL 2.x has the flexibility to fit into many more Galera use cases. For example, one can have single-writer, multi-writer and multi-reader hostgroups defined as the destination hostgroup of a query rule, with the ability to limit the number of writers and finer control over the stale reads behaviour.

Contrast this to ProxySQL 1.x, where the user had to explicitly define a scheduler to call an external script to perform the backend health checks and update the database servers' state. This requires some customization of the script (the user has to update the ProxySQL admin user/password/port), plus it depends on an additional tool (the MySQL client) to connect to the ProxySQL admin interface.

Here is an example configuration of Galera health check script scheduler in table format for ProxySQL 1.x:

Admin> select * from scheduler\G
*************************** 1. row ***************************
         id: 1
     active: 1
interval_ms: 2000
   filename: /usr/share/proxysql/tools/proxysql_galera_checker.sh
       arg1: 10
       arg2: 20
       arg3: 1
       arg4: 1
       arg5: /var/lib/proxysql/proxysql_galera_checker.log
    comment:

Besides, since the ProxySQL scheduler thread executes any script independently, there are many versions of health check scripts available out there. All ProxySQL instances deployed by ClusterControl use the default script provided by the ProxySQL installer package.

In ProxySQL 2.x, max_writers and writer_is_also_reader variables can determine how ProxySQL dynamically groups the backend MySQL servers and will directly affect the connection distribution and query routing. For example, consider the following MySQL backend servers:

Admin> select hostgroup_id, hostname, status, weight from mysql_servers;
+--------------+--------------+--------+--------+
| hostgroup_id | hostname     | status | weight |
+--------------+--------------+--------+--------+
| 10           | DB1          | ONLINE | 1      |
| 10           | DB2          | ONLINE | 1      |
| 10           | DB3          | ONLINE | 1      |
+--------------+--------------+--------+--------+

Together with the following Galera hostgroups definition:

Admin> select * from mysql_galera_hostgroups\G
*************************** 1. row ***************************
       writer_hostgroup: 10
backup_writer_hostgroup: 20
       reader_hostgroup: 30
      offline_hostgroup: 9999
                 active: 1
            max_writers: 1
  writer_is_also_reader: 2
max_transactions_behind: 20
                comment: 

Considering all hosts are up and running, ProxySQL will most likely group the hosts as below:

Let's look at them one by one:

Configuration | Description
writer_is_also_reader=0
  • Groups the hosts into 2 hostgroups (writer and backup_writer).
  • The writer is part of the backup_writer.
  • Since the writer is not a reader, hostgroup 30 (reader) stays empty, because none of the hosts are set with read_only=1. It is not a common practice in Galera to enable the read-only flag.
writer_is_also_reader=1
  • Groups the hosts into 3 hostgroups (writer, backup_writer and reader).
  • The read_only=0 variable in Galera has no effect, so the writer is also in hostgroup 30 (reader).
  • The writer is not part of backup_writer.
writer_is_also_reader=2
  • Similar to writer_is_also_reader=1; however, the writer is part of backup_writer.

With this configuration, one can have various choices for hostgroup destination to cater for specific workloads. "Hotspot" writes can be configured to go to only one server to reduce multi-master conflicts, non-conflicting writes can be distributed equally on the other masters, most reads can be distributed evenly on all MySQL servers or non-writers, critical reads can be forwarded to the most up-to-date servers and analytical reads can be forwarded to a slave replica.

ProxySQL Deployment for Galera Cluster

In this example, suppose we already have a three-node Galera Cluster deployed by ClusterControl as shown in the following diagram:

Our Wordpress applications are running on Docker, while the Wordpress database is hosted on our Galera Cluster running on bare-metal servers. We decided to run a ProxySQL container alongside our Wordpress containers to have better control over Wordpress database query routing and to fully utilize our database cluster infrastructure. Since the read-write ratio is around 80%-20%, we want to configure ProxySQL to:

  • Forward all writes to one Galera node (less conflict, focus on write)
  • Balance all reads to the other two Galera nodes (better distribution for the majority of the workload)

Firstly, create a ProxySQL configuration file inside the Docker host so we can map it into our container:

$ mkdir /root/proxysql-docker
$ vim /root/proxysql-docker/proxysql.cnf

Then, copy the following lines (we will explain the configuration lines further down):

datadir="/var/lib/proxysql"

admin_variables=
{
    admin_credentials="admin:admin"
    mysql_ifaces="0.0.0.0:6032"
    refresh_interval=2000
    web_enabled=true
    web_port=6080
    stats_credentials="stats:admin"
}

mysql_variables=
{
    threads=4
    max_connections=2048
    default_query_delay=0
    default_query_timeout=36000000
    have_compress=true
    poll_timeout=2000
    interfaces="0.0.0.0:6033;/tmp/proxysql.sock"
    default_schema="information_schema"
    stacksize=1048576
    server_version="5.1.30"
    connect_timeout_server=10000
    monitor_history=60000
    monitor_connect_interval=200000
    monitor_ping_interval=200000
    ping_interval_server_msec=10000
    ping_timeout_server=200
    commands_stats=true
    sessions_sort=true
    monitor_username="proxysql"
    monitor_password="proxysqlpassword"
    monitor_galera_healthcheck_interval=2000
    monitor_galera_healthcheck_timeout=800
}

mysql_galera_hostgroups =
(
    {
        writer_hostgroup=10
        backup_writer_hostgroup=20
        reader_hostgroup=30
        offline_hostgroup=9999
        max_writers=1
        writer_is_also_reader=1
        max_transactions_behind=30
        active=1
    }
)

mysql_servers =
(
    { address="db1.cluster.local" , port=3306 , hostgroup=10, max_connections=100 },
    { address="db2.cluster.local" , port=3306 , hostgroup=10, max_connections=100 },
    { address="db3.cluster.local" , port=3306 , hostgroup=10, max_connections=100 }
)

mysql_query_rules =
(
    {
        rule_id=100
        active=1
        match_pattern="^SELECT .* FOR UPDATE"
        destination_hostgroup=10
        apply=1
    },
    {
        rule_id=200
        active=1
        match_pattern="^SELECT .*"
        destination_hostgroup=20
        apply=1
    },
    {
        rule_id=300
        active=1
        match_pattern=".*"
        destination_hostgroup=10
        apply=1
    }
)

mysql_users =
(
    { username = "wordpress", password = "passw0rd", default_hostgroup = 10, transaction_persistent = 0, active = 1 },
    { username = "sbtest", password = "passw0rd", default_hostgroup = 10, transaction_persistent = 0, active = 1 }
)

Now, let's take a closer look at some of the most important configuration sections. Firstly, we define the Galera hostgroups configuration as below:

mysql_galera_hostgroups =
(
    {
        writer_hostgroup=10
        backup_writer_hostgroup=20
        reader_hostgroup=30
        offline_hostgroup=9999
        max_writers=1
        writer_is_also_reader=1
        max_transactions_behind=30
        active=1
    }
)

Hostgroup 10 will be the writer_hostgroup, hostgroup 20 the backup_writer and hostgroup 30 the reader. We set max_writers to 1 so we can have a single-writer hostgroup (hostgroup 10) where all writes should be sent to. Then, we set writer_is_also_reader to 1, which makes all Galera nodes readers as well, suitable for queries that can be equally distributed to all nodes. Hostgroup 9999 is reserved as the offline_hostgroup, used if ProxySQL detects non-operational Galera nodes.
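
Once the container is up, the values that are actually in effect for this section can be checked from the ProxySQL admin interface; a quick check, assuming the admin credentials defined above:

Admin> SELECT * FROM runtime_mysql_galera_hostgroups\G

If you later adjust the mysql_galera_hostgroups table from the admin console, the changes take effect after LOAD MYSQL SERVERS TO RUNTIME (and SAVE MYSQL SERVERS TO DISK to persist them).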

Then, we configure our MySQL servers, all defaulting to hostgroup 10:

mysql_servers =
(
    { address="db1.cluster.local" , port=3306 , hostgroup=10, max_connections=100 },
    { address="db2.cluster.local" , port=3306 , hostgroup=10, max_connections=100 },
    { address="db3.cluster.local" , port=3306 , hostgroup=10, max_connections=100 }
)

With the above configurations, ProxySQL will "see" our hostgroups as below:

Then, we define the query routing through query rules. Based on our requirements, all reads should be sent to the Galera nodes other than the writer (hostgroup 20), and everything else is forwarded to hostgroup 10, the single-writer hostgroup:

mysql_query_rules =
(
    {
        rule_id=100
        active=1
        match_pattern="^SELECT .* FOR UPDATE"
        destination_hostgroup=10
        apply=1
    },
    {
        rule_id=200
        active=1
        match_pattern="^SELECT .*"
        destination_hostgroup=20
        apply=1
    },
    {
        rule_id=300
        active=1
        match_pattern=".*"
        destination_hostgroup=10
        apply=1
    }
)

Finally, we define the MySQL users that will be passed through ProxySQL:

mysql_users =
(
    { username = "wordpress", password = "passw0rd", default_hostgroup = 10, transaction_persistent = 0, active = 1 },
    { username = "sbtest", password = "passw0rd", default_hostgroup = 10, transaction_persistent = 0, active = 1 }
)

We set transaction_persistent to 0 so all connections coming from these users will respect the query rules for read and write routing. Otherwise, the connections would end up hitting one hostgroup, which defeats the purpose of load balancing. Do not forget to create those users first on all MySQL servers. If your cluster is managed by ClusterControl, you can use the Manage -> Schemas and Users feature to create them.
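
If you prefer to create them by hand, a minimal sketch run against any Galera node follows (DDL is replicated cluster-wide). The database name wordpress and the grant scopes are assumptions, so tighten them for your environment; the proxysql user matches the monitor_username/monitor_password pair defined earlier:

mysql> CREATE USER 'wordpress'@'%' IDENTIFIED BY 'passw0rd';
mysql> GRANT ALL PRIVILEGES ON wordpress.* TO 'wordpress'@'%';
mysql> CREATE USER 'proxysql'@'%' IDENTIFIED BY 'proxysqlpassword';
mysql> GRANT USAGE ON *.* TO 'proxysql'@'%';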

We are now ready to start our container. We are going to map the ProxySQL configuration file as a bind mount when starting up the ProxySQL container. Thus, the run command will be:

$ docker run -d \
--name proxysql2 \
--hostname proxysql2 \
--publish 6033:6033 \
--publish 6032:6032 \
--publish 6080:6080 \
--restart=unless-stopped \
-v /root/proxysql-docker/proxysql.cnf:/etc/proxysql.cnf \
severalnines/proxysql:2.0

Finally, point the Wordpress database to the ProxySQL container on port 6033, for instance:

$ docker run -d \
--name wordpress \
--publish 80:80 \
--restart=unless-stopped \
-e WORDPRESS_DB_HOST=proxysql2:6033 \
-e WORDPRESS_DB_USER=wordpress \
-e WORDPRESS_DB_PASSWORD=passw0rd \
wordpress
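
Note that for the proxysql2 hostname to resolve from inside the Wordpress container, both containers need to share a Docker network (the default bridge network does not provide name resolution). A minimal sketch, assuming a user-defined bridge network called db-net, which is not part of the commands above:

$ docker network create db-net
$ docker network connect db-net proxysql2
$ docker network connect db-net wordpress

Alternatively, pass --network db-net to both docker run commands when starting the containers.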

At this point, our architecture is looking something like this:

If you want the ProxySQL container to be persistent, map /var/lib/proxysql/ to a Docker volume or a bind mount, for example:

$ docker run -d \
--name proxysql2 \
--hostname proxysql2 \
--publish 6033:6033 \
--publish 6032:6032 \
--publish 6080:6080 \
--restart=unless-stopped \
-v /root/proxysql-docker/proxysql.cnf:/etc/proxysql.cnf \
-v proxysql-volume:/var/lib/proxysql \
severalnines/proxysql:2.0

Keep in mind that running with persistent storage like the above will make our /root/proxysql-docker/proxysql.cnf obsolete from the second restart onwards. This is due to ProxySQL's multi-layer configuration: if /var/lib/proxysql/proxysql.db exists, ProxySQL will skip loading options from the configuration file and load whatever is in the SQLite database instead (unless you start the proxysql service with the --initial flag). Having said that, any further ProxySQL configuration management has to be performed via the ProxySQL admin console on port 6032, instead of the configuration file.
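
For example, a change to the monitoring password would now be applied and persisted from the admin console rather than by editing proxysql.cnf; a sketch with an illustrative new value:

Admin> UPDATE global_variables SET variable_value='newpassword' WHERE variable_name='mysql-monitor_password';
Admin> LOAD MYSQL VARIABLES TO RUNTIME;
Admin> SAVE MYSQL VARIABLES TO DISK;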

Monitoring

The ProxySQL process logs to syslog by default, and you can view it by using the standard Docker commands:

$ docker ps
$ docker logs proxysql2

To verify the current hostgroup assignments, query the runtime_mysql_servers table:

$ docker exec -it proxysql2 mysql -uadmin -padmin -h127.0.0.1 -P6032 --prompt='Admin> '
Admin> select hostgroup_id,hostname,status from runtime_mysql_servers;
+--------------+--------------+--------+
| hostgroup_id | hostname     | status |
+--------------+--------------+--------+
| 10           | 192.168.0.21 | ONLINE |
| 30           | 192.168.0.21 | ONLINE |
| 30           | 192.168.0.22 | ONLINE |
| 30           | 192.168.0.23 | ONLINE |
| 20           | 192.168.0.22 | ONLINE |
| 20           | 192.168.0.23 | ONLINE |
+--------------+--------------+--------+

If the selected writer goes down, it will be transferred to the offline_hostgroup (HID 9999):

Admin> select hostgroup_id,hostname,status from runtime_mysql_servers;
+--------------+--------------+--------+
| hostgroup_id | hostname     | status |
+--------------+--------------+--------+
| 10           | 192.168.0.22 | ONLINE |
| 9999         | 192.168.0.21 | ONLINE |
| 30           | 192.168.0.22 | ONLINE |
| 30           | 192.168.0.23 | ONLINE |
| 20           | 192.168.0.23 | ONLINE |
+--------------+--------------+--------+

The above topology changes can be illustrated in the following diagram:

We have also enabled the web stats UI with admin-web_enabled=true. To access the web UI, simply go to the Docker host on port 6080, for example: http://192.168.0.200:6080, and you will be prompted with a username/password pop-up. Enter the credentials as defined under admin-stats_credentials and you should see the following page:

By monitoring the MySQL connection pool table, we can get an overview of the connection distribution across all hostgroups:

Admin> select hostgroup, srv_host, status, ConnUsed, MaxConnUsed, Queries from stats.stats_mysql_connection_pool order by srv_host;
+-----------+--------------+--------+----------+-------------+---------+
| hostgroup | srv_host     | status | ConnUsed | MaxConnUsed | Queries |
+-----------+--------------+--------+----------+-------------+---------+
| 20        | 192.168.0.23 | ONLINE | 5        | 24          | 11458   |
| 30        | 192.168.0.23 | ONLINE | 0        | 0           | 0       |
| 20        | 192.168.0.22 | ONLINE | 2        | 24          | 11485   |
| 30        | 192.168.0.22 | ONLINE | 0        | 0           | 0       |
| 10        | 192.168.0.21 | ONLINE | 32       | 32          | 9746    |
| 30        | 192.168.0.21 | ONLINE | 0        | 0           | 0       |
+-----------+--------------+--------+----------+-------------+---------+

The output above shows that hostgroup 30 does not process anything, because our query rules do not have this hostgroup configured as a destination hostgroup.
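
If you later decide to spread part of the read traffic over hostgroup 30 as well, a rule can be added on the fly through the admin console. A sketch with a hypothetical pattern for a reporting schema; rule_id 150 places it between the existing rules 100 and 200, since rules are evaluated in ascending rule_id order:

Admin> INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply)
       VALUES (150, 1, '^SELECT .* FROM report\..*', 30, 1);
Admin> LOAD MYSQL QUERY RULES TO RUNTIME;
Admin> SAVE MYSQL QUERY RULES TO DISK;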

The statistics related to the Galera nodes can be viewed in the mysql_server_galera_log table:

Admin>  select * from mysql_server_galera_log order by time_start_us desc limit 3\G
*************************** 1. row ***************************
                       hostname: 192.168.0.23
                           port: 3306
                  time_start_us: 1552992553332489
                success_time_us: 2045
              primary_partition: YES
                      read_only: NO
         wsrep_local_recv_queue: 0
              wsrep_local_state: 4
                   wsrep_desync: NO
           wsrep_reject_queries: NO
wsrep_sst_donor_rejects_queries: NO
                          error: NULL
*************************** 2. row ***************************
                       hostname: 192.168.0.22
                           port: 3306
                  time_start_us: 1552992553329653
                success_time_us: 2799
              primary_partition: YES
                      read_only: NO
         wsrep_local_recv_queue: 0
              wsrep_local_state: 4
                   wsrep_desync: NO
           wsrep_reject_queries: NO
wsrep_sst_donor_rejects_queries: NO
                          error: NULL
*************************** 3. row ***************************
                       hostname: 192.168.0.21
                           port: 3306
                  time_start_us: 1552992553329013
                success_time_us: 2715
              primary_partition: YES
                      read_only: NO
         wsrep_local_recv_queue: 0
              wsrep_local_state: 4
                   wsrep_desync: NO
           wsrep_reject_queries: NO
wsrep_sst_donor_rejects_queries: NO
                          error: NULL

The result set shows the state of the related MySQL variables/status for every Galera node at a particular timestamp. In this configuration, the Galera health check is set to run every 2 seconds (monitor_galera_healthcheck_interval=2000). Hence, the maximum failover time would be around 2 seconds if a topology change happens in the cluster.
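
The health-check settings that are currently in effect can also be listed from the admin console; a quick check:

Admin> SELECT variable_name, variable_value FROM global_variables WHERE variable_name LIKE 'mysql-monitor_galera%';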
