This chapter explains MySQL Group Replication and how to install, configure and monitor groups. MySQL Group Replication is a MySQL Server plugin that enables you to create elastic, highly-available, fault-tolerant replication topologies.
Groups can operate in a single-primary mode with automatic primary election, where only one server accepts updates at a time. Alternatively, for more advanced users, groups can be deployed in multi-primary mode, where all servers can accept updates, even if they are issued concurrently.
There is a built-in group membership service that keeps the view of the group consistent and available for all servers at any given point in time. Servers can leave and join the group and the view is updated accordingly. Sometimes servers can leave the group unexpectedly, in which case the failure detection mechanism detects this and notifies the group that the view has changed. This is all automatic.
The chapter is structured as follows:
- Section 17.1, “Group Replication Background” provides an introduction to groups and how Group Replication works.
- Section 17.2, “Getting Started” explains how to configure multiple MySQL Server instances to create a group.
- Section 17.3, “Monitoring Group Replication” explains how to monitor a group.
- Section 17.4, “Group Replication Operations” explains how to work with a group.
- Section 17.5, “Group Replication Security” explains how to secure a group.
- Upgrading Group Replication explains how to upgrade a group.
- Section 17.9, “Group Replication Technical Details” provides in-depth information about how Group Replication works.
This section provides background information on MySQL Group Replication.
The most common way to create a fault-tolerant system is to make components redundant, so that a component can be removed and the system continues to operate as expected. This creates a set of challenges that raise the complexity of such systems considerably. Specifically, replicated databases require maintenance and administration of several servers instead of just one. Moreover, because the servers cooperate to create the group, several other classic distributed systems problems have to be dealt with, such as network partitioning or split-brain scenarios.
Therefore, the ultimate challenge is to fuse the logic of the database and data replication with the logic of having several servers coordinated in a consistent and simple way. In other words, to have multiple servers agreeing on the state of the system and the data on each and every change that the system goes through. This can be summarized as having servers reach agreement on each database state transition, so that they all progress as one single database or, alternatively, eventually converge to the same state. This means that they need to operate as a (distributed) state machine.
MySQL Group Replication provides distributed state machine replication with strong coordination between servers. Servers coordinate themselves automatically when they are part of the same group. The group can operate in a single-primary mode with automatic primary election, where only one server accepts updates at a time. Alternatively, for more advanced users the group can be deployed in multi-primary mode, where all servers can accept updates, even if they are issued concurrently. This power comes at the expense of applications having to work around the limitations imposed by such deployments.
There is a built-in group membership service that keeps the view of the group consistent and available for all servers at any given point in time. Servers can leave and join the group and the view is updated accordingly. Sometimes servers can leave the group unexpectedly, in which case the failure detection mechanism detects this and notifies the group that the view has changed. This is all automatic.
For a transaction to commit, a majority of the group has to agree on the order of that transaction in the global sequence of transactions. Deciding to commit or abort a transaction is done by each server individually, but all servers make the same decision. If there is a network partition, resulting in a split where members are unable to reach agreement, then the system does not progress until this issue is resolved. Hence there is also a built-in, automatic, split-brain protection mechanism.
All of this is powered by the provided Group Communication System (GCS) protocols. These provide a failure detection mechanism, a group membership service, and safe and completely ordered message delivery. All these properties are key to creating a system which ensures that data is consistently replicated across the group of servers. At the very core of this technology lies an implementation of the Paxos algorithm. It acts as the group communication engine.
Before getting into the details of MySQL Group Replication, this section introduces some background concepts and an overview of how things work. This provides some context to help understand what is required for Group Replication and what the differences are between classic asynchronous MySQL Replication and Group Replication.
Traditional MySQL Replication provides a simple Primary-Secondary approach to replication. There is a primary (master) and there are one or more secondaries (slaves). The primary executes transactions and commits them, and they are then later (thus asynchronously) sent to the secondaries to be either re-executed (in statement-based replication) or applied (in row-based replication). It is a shared-nothing system, where all servers have a full copy of the data by default.
There is also semisynchronous replication, which adds one synchronization step to the protocol. This means that the Primary waits, at commit time, for the secondary to acknowledge that it has received the transaction. Only then does the Primary resume the commit operation.
In the two pictures above, you can see a diagram of the classic asynchronous MySQL Replication protocol (and its semisynchronous variant as well). Diagonal arrows represent messages exchanged between servers or messages exchanged between servers and the client application.
Group Replication is a technique that can be used to implement fault-tolerant systems. The replication group is a set of servers that each have their own entire copy of the data (a shared-nothing replication scheme), and interact with each other through message passing. The communication layer provides a set of guarantees such as atomic message delivery and total order message delivery. These are powerful properties that translate into useful abstractions on which more advanced database replication solutions can be built.
MySQL Group Replication builds on top of such properties and abstractions and implements a multi-master update everywhere replication protocol. A replication group is formed by multiple servers and each server in the group may execute transactions independently at any time. However, all read-write transactions commit only after they have been approved by the group. In other words, for any read-write transaction the group needs to decide whether it commits or not, so the commit operation is not a unilateral decision from the originating server. Read-only transactions need no coordination within the group and commit immediately.
When a read-write transaction is ready to commit at the originating server, the server atomically broadcasts the write values (the rows that were changed) and the corresponding write set (the unique identifiers of the rows that were updated). Because the transaction is sent through an atomic broadcast, either all servers in the group receive the transaction or none do. If they receive it, then they all receive it in the same order with respect to other transactions that were sent before. All servers therefore receive the same set of transactions in the same order, and a global total order is established for the transactions.
However, there may be conflicts between transactions that execute concurrently on different servers. Such conflicts are detected by inspecting and comparing the write sets of two different and concurrent transactions, in a process called certification. During certification, conflict detection is carried out at row level: if two concurrent transactions, that executed on different servers, update the same row, then there is a conflict. The conflict resolution procedure states that the transaction that was ordered first commits on all servers, and the transaction ordered second aborts, and is therefore rolled back on the originating server and dropped by the other servers in the group. For example, if t1 and t2 execute concurrently at different sites, both changing the same row, and t2 is ordered before t1, then t2 wins the conflict and t1 is rolled back. This is in fact a distributed first commit wins rule. Note that if two transactions are bound to conflict more often than not, then it is a good practice to start them on the same server, where they have a chance to synchronize on the local lock manager instead of being rolled back as a result of certification.
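The first-commit-wins rule can be illustrated with a toy certifier. The following Python sketch is not Group Replication's implementation. The `Txn` class, the integer `snapshot` counter, and the `certify` function are assumptions made for illustration, and the real plugin tracks versions differently. The sketch only shows the idea: transactions arrive in the agreed total order, and a transaction aborts if any row in its write set was changed by a transaction that committed after the snapshot it executed on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    """Hypothetical transaction record used only for this illustration."""
    name: str
    snapshot: int          # how many transactions had committed when this one executed
    write_set: frozenset   # identifiers of the rows this transaction updated


def certify(ordered_txns):
    """Apply a first-commit-wins rule to transactions given in total order.

    Returns a dict mapping each transaction name to "commit" or "abort".
    Every server running this on the same ordered input reaches the same result.
    """
    last_writer = {}   # row id -> commit sequence number of the last writer
    committed = 0      # number of transactions certified positively so far
    decisions = {}
    for txn in ordered_txns:
        # Conflict: some row was written by a transaction this one never saw.
        conflict = any(last_writer.get(row, 0) > txn.snapshot
                       for row in txn.write_set)
        if conflict:
            decisions[txn.name] = "abort"
        else:
            committed += 1
            for row in txn.write_set:
                last_writer[row] = committed
            decisions[txn.name] = "commit"
    return decisions


if __name__ == "__main__":
    # t1 and t2 ran concurrently on different servers and changed the same row;
    # t2 was ordered first, so t2 wins and t1 is rolled back.
    t2 = Txn("t2", snapshot=0, write_set=frozenset({"row-1"}))
    t1 = Txn("t1", snapshot=0, write_set=frozenset({"row-1"}))
    print(certify([t2, t1]))
```

Because the decision depends only on the ordered input, each server can run the check locally and still agree with every other server, which is why no extra voting round is needed for the commit or abort outcome.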
For applying and externalizing the certified transactions, Group Replication permits servers to deviate from the agreed order of the transactions if this does not break consistency and validity. Group Replication is an eventual consistency system, meaning that as soon as the incoming traffic slows down or stops, all group members have the same data content. While traffic is flowing, transactions can be externalized in a slightly different order, or externalized on some members before the others. For example, in multi-primary mode, a local transaction might be externalized immediately following certification, although a remote transaction that is earlier in the global order has not yet been applied. This is permitted when the certification process has established that there is no conflict between the transactions. In single-primary mode, on the primary server, there is a small chance that concurrent, non-conflicting local transactions might be committed and externalized in a different order from the global order agreed by Group Replication. On the secondaries, which do not accept writes from clients, transactions are always committed and externalized in the agreed order.
The following figure depicts the MySQL Group Replication protocol and by comparing it to MySQL Replication (or even MySQL semisynchronous replication) you can see some differences. Note that some underlying consensus and Paxos related messages are missing from this picture for the sake of clarity.
Group Replication enables you to create fault-tolerant systems with redundancy by replicating the system state to a set of servers. Even if some of the servers subsequently fail, as long as it is not all or a majority, the system is still available. Depending on the number of servers which fail the group might have degraded performance or scalability, but it is still available. Server failures are isolated and independent. They are tracked by a group membership service which relies on a distributed failure detector that is able to signal when any servers leave the group, either voluntarily or due to an unexpected halt. There is a distributed recovery procedure to ensure that when servers join the group they are brought up to date automatically. There is no need for server fail-over, and the multi-master update everywhere nature ensures that even updates are not blocked in the event of a single server failure. To summarize, MySQL Group Replication guarantees that the database service is continuously available.
It is important to understand that although the database service is available, in the event of a server crash, those clients connected to it must be redirected, or failed over, to a different server. This is not something Group Replication attempts to resolve. A connector, load balancer, router, or some form of middleware is more suitable to deal with this issue. For example, see MySQL Router 8.0.
To summarize, MySQL Group Replication provides a highly available, highly elastic, dependable MySQL service.
The following examples are typical use cases for Group Replication.
- Elastic Replication - Environments that require a very fluid replication infrastructure, where the number of servers has to grow or shrink dynamically and with as few side-effects as possible. For instance, database services for the cloud.
- Highly Available Shards - Sharding is a popular approach to achieve write scale-out. Use MySQL Group Replication to implement highly available shards, where each shard maps to a replication group.
- Alternative to Master-Slave replication - In certain situations, using a single master server makes it a single point of contention. Writing to an entire group may prove more scalable under certain circumstances.
- Autonomic Systems - Additionally, you can deploy MySQL Group Replication purely for the automation that is built into the replication protocol (described already in this and previous chapters).
This section presents details about some of the services that Group Replication builds on.
In MySQL Group Replication, a set of servers forms a replication group. A group has a name, which takes the form of a UUID. The group is dynamic and servers can leave (either voluntarily or involuntarily) and join it at any time. The group adjusts itself whenever servers join or leave.
If a server joins the group, it automatically brings itself up to date by fetching the missing state from an existing server. If a server leaves the group, for instance it was taken down for maintenance, the remaining servers notice that it has left and reconfigure the group automatically.
Group Replication has a group membership service that defines which servers are online and participating in the group. The list of online servers is referred to as a view. Every server in the group has a consistent view of which servers are the members participating actively in the group at a given moment in time.
Group members must agree not only on transaction commits, but also on which is the current view. If existing members agree that a new server should become part of the group, the group is reconfigured to integrate that server in it, which triggers a view change. If a server leaves the group, either voluntarily or not, the group dynamically rearranges its configuration and a view change is triggered.
In the case where a member leaves the group voluntarily, it first initiates a dynamic group reconfiguration, during which all members have to agree on a new view without the leaving server. However, if a member leaves the group involuntarily, for example because it has stopped unexpectedly or the network connection is down, it cannot initiate the reconfiguration. In this situation, Group Replication's failure detection mechanism recognizes after a short period of time that the member has left, and a reconfiguration of the group without the failed member is proposed. As with a member that leaves voluntarily, the reconfiguration requires agreement from the majority of servers in the group. However, if the group is not able to reach agreement, for example because it partitioned in such a way that there is no majority of servers online, the system is not able to dynamically change the configuration, and blocks to prevent a split-brain situation. This situation requires intervention from an administrator.
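The blocking behaviour under a partition follows from a simple majority check. The Python sketch below is an illustration only. The function name `partition_can_progress` and the use of plain sets of host names are assumptions, not part of Group Replication. It shows why at most one side of a split can ever reconfigure the group.

```python
def partition_can_progress(view, reachable):
    """Return True if the reachable members form a majority of the current view.

    view      -- members in the last agreed view
    reachable -- members that can still communicate with each other
    """
    majority = len(view) // 2 + 1
    return len(set(view) & set(reachable)) >= majority


if __name__ == "__main__":
    view = {"s1", "s2", "s3", "s4", "s5"}
    # A 3/2 split: only the side with three members can agree on a new view.
    print(partition_can_progress(view, {"s1", "s2", "s3"}))
    print(partition_can_progress(view, {"s4", "s5"}))
```

With an even split, for example two members on each side of a four-member group, neither side reaches a majority, so the whole group blocks until an administrator intervenes.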
It is possible for a member to go offline for a short time, then attempt to rejoin the group again before the failure detection mechanism has detected its failure, and before the group has been reconfigured to remove the member. In this situation, the rejoining member forgets its previous state, but if other members send it messages that are intended for its pre-crash state, this can cause issues including possible data inconsistency. If a member in this situation participates in XCom's consensus protocol, it could potentially cause XCom to deliver different values for the same consensus round, by making a different decision before and after failure.
To counter this possibility, from MySQL 5.7.22, servers are given a unique identifier when they join a group. This enables Group Replication to be aware of the situation where a new incarnation of the same server (with the same address but a new identifier) is trying to join the group while its old incarnation is still listed as a member. The new incarnation is blocked from joining the group until the old incarnation can be removed by a reconfiguration. If Group Replication is stopped and restarted on the server, the member becomes a new incarnation and cannot rejoin until the suspicion times out.
Group Replication includes a failure detection mechanism that is able to find and report which servers are silent and as such assumed to be dead. At a high level, the failure detector is a distributed service that provides information about which servers may be dead (suspicions). Suspicions are triggered when servers go mute. When server A does not receive messages from server B during a given period, a timeout occurs and a suspicion is raised. Later if the group agrees that the suspicions are probably true, then the group decides that a given server has indeed failed. This means that the remaining members in the group take a coordinated decision to exclude a given member.
If a server gets isolated from the rest of the group, then it suspects that all others have failed. Being unable to secure agreement with the group (as it cannot secure a quorum), its suspicion does not have consequences. When a server is isolated from the group in this way, it is unable to execute any local transactions.
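The interplay between individual suspicions and the group decision can be sketched as follows. This Python model is a simplification under stated assumptions: the function names, the `observations` structure, the numeric timestamps, and the rule that a member is excluded once a majority of the group suspects it are all illustrative, and the real mechanism in the group communication engine is more involved.

```python
def suspects(last_heard, now, timeout):
    """Peers from which no message arrived within `timeout` time units."""
    return {peer for peer, t in last_heard.items() if now - t > timeout}


def members_to_exclude(observations, now, timeout):
    """Return the members that a majority of the group suspects.

    observations maps each observer to a dict {peer: time of last message}.
    """
    majority = len(observations) // 2 + 1
    votes = {}
    for last_heard in observations.values():
        for peer in suspects(last_heard, now, timeout):
            votes[peer] = votes.get(peer, 0) + 1
    return {peer for peer, count in votes.items() if count >= majority}


if __name__ == "__main__":
    # C is isolated: A and B stopped hearing from C, and C stopped hearing
    # from both A and B. Times are arbitrary units chosen for the example.
    observations = {
        "A": {"B": 10, "C": 3},
        "B": {"A": 10, "C": 3},
        "C": {"A": 3, "B": 3},
    }
    print(members_to_exclude(observations, now=10, timeout=5))
```

In the example, C suspects both A and B, but a single vote is not a majority, so its suspicions have no consequence. A and B together form a majority and exclude C.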
MySQL Group Replication builds on an implementation of the Paxos distributed algorithm to provide distributed coordination between servers. As such, it requires a majority of servers to be active to reach quorum and thus make a decision. This has direct impact on the number of failures the system can tolerate without compromising itself and its overall functionality. The number of servers (n) needed to tolerate f failures is then n = 2 × f + 1.
In practice this means that to tolerate one failure the group must have three servers in it. As such if one server fails, there are still two servers to form a majority (two out of three) and allow the system to continue to make decisions automatically and progress. However, if a second server fails involuntarily, then the group (with one server left) blocks, because there is no majority to reach a decision.
The following is a small table illustrating the formula above.
Group Size | Majority | Instant Failures Tolerated |
---|---|---|
1 | 1 | 0 |
2 | 2 | 0 |
3 | 2 | 1 |
4 | 3 | 1 |
5 | 3 | 2 |
6 | 4 | 2 |
7 | 4 | 3 |
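The figures in the table above follow directly from the majority rule, and can be reproduced with a few lines of Python. The function names here are chosen for illustration only.

```python
def majority(group_size):
    """Smallest number of members that is more than half of the group."""
    return group_size // 2 + 1


def failures_tolerated(group_size):
    """Members that can fail at once while a majority remains."""
    return group_size - majority(group_size)


def servers_needed(failures):
    """Group size n required to tolerate f simultaneous failures: n = 2 x f + 1."""
    return 2 * failures + 1


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, majority(n), failures_tolerated(n))
```

Note that an even-sized group tolerates no more failures than the odd-sized group one member smaller, which is why odd group sizes are the usual choice.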
The next chapter covers technical aspects of Group Replication.
MySQL Group Replication is provided as a plugin to MySQL server, and each server in a group requires configuration and installation of the plugin. This section provides a detailed tutorial with the steps required to create a replication group with at least three members.
An alternative way to deploy multiple instances of MySQL is by using InnoDB cluster, which uses Group Replication and wraps it in a programmatic environment that enables you to easily work with groups of MySQL server instances in the MySQL Shell 8.0 (part of MySQL 8.0). In addition, InnoDB cluster interfaces seamlessly with MySQL Router and simplifies deploying MySQL with high availability. See Chapter 20, InnoDB Cluster.
Each of the MySQL server instances in a group can run on an independent physical host machine, which is the recommended way to deploy Group Replication. This section explains how to create a replication group with three MySQL Server instances, each running on a different host machine. See Section 17.2.2, “Deploying Group Replication Locally” for information about deploying multiple MySQL server instances running Group Replication on the same host machine, for example for testing purposes.
This tutorial explains how to get and deploy MySQL Server with the Group Replication plugin, how to configure each server instance before creating a group, and how to use Performance Schema monitoring to verify that everything is working correctly.
The first step is to deploy at least three instances of MySQL Server. This procedure demonstrates using multiple hosts for the instances, named s1, s2 and s3. It is assumed that MySQL Server was installed on each of the hosts; see Chapter 2, Installing and Upgrading MySQL. Group Replication is a built-in MySQL plugin provided with MySQL Server 5.7.17 and later. For more background information on MySQL plugins, see Section 5.5, “MySQL Server Plugins”.
In this example, three instances are used for the group, which is the minimum number of instances to create a group. Adding more instances increases the fault tolerance of the group. For example, if the group consists of three members, in the event of failure of one instance the group can continue. But in the event of another failure the group can no longer continue processing write transactions. By adding more instances, the number of servers which can fail while the group continues to process transactions also increases. The maximum number of instances which can be used in a group is nine. For more information see Section 17.1.3.2, “Failure Detection”.
This section explains the configuration settings required for MySQL Server instances that you want to use for Group Replication. For background information, see Section 17.7, “Requirements and Limitations”.
For Group Replication, data must be stored in the InnoDB transactional storage engine (for details of why, see Section 17.7.1, “Group Replication Requirements”). The use of other storage engines, including the temporary MEMORY storage engine, might cause errors in Group Replication. Set the disabled_storage_engines system variable as follows to prevent their use:
disabled_storage_engines="MyISAM,BLACKHOLE,FEDERATED,ARCHIVE,MEMORY"
Note that with the MyISAM storage engine disabled, when you are upgrading a MySQL instance to a release where mysql_upgrade is still used (before MySQL 8.0.16), mysql_upgrade might fail with an error. To handle this, you can re-enable that storage engine while you run mysql_upgrade, then disable it again when you restart the server. For more information, see Section 4.4.7, “mysql_upgrade — Check and Upgrade MySQL Tables”.
The following settings configure replication according to the MySQL Group Replication requirements.
server_id=1
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
binlog_checksum=NONE
log_slave_updates=ON
log_bin=binlog
binlog_format=ROW
These settings configure the server to use the unique identifier number 1, to enable global transaction identifiers and to store replication metadata in system tables instead of files. Additionally, they instruct the server to turn on binary logging, use row-based format and disable binary log event checksums. For more details see Section 17.7.1, “Group Replication Requirements”.
At this point the option file ensures that the server is configured and is instructed to instantiate the replication infrastructure under a given configuration. The following section configures the Group Replication settings for the server.
plugin_load_add='group_replication.so'
transaction_write_set_extraction=XXHASH64
group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
group_replication_start_on_boot=off
group_replication_local_address= "s1:33061"
group_replication_group_seeds= "s1:33061,s2:33061,s3:33061"
group_replication_bootstrap_group=off
- plugin-load-add adds the Group Replication plugin to the list of plugins which the server loads at startup. This is preferable in a production deployment to installing the plugin manually.
- Configuring group_replication_group_name tells the plugin that the group that it is joining, or creating, is named "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa".
  The value of group_replication_group_name must be a valid UUID. This UUID is used internally when setting GTIDs for Group Replication events in the binary log. You can use SELECT UUID() to generate a UUID.
- Configuring the group_replication_start_on_boot variable to off instructs the plugin to not start operations automatically when the server starts. This is important when setting up Group Replication as it ensures you can configure the server before manually starting the plugin. Once the member is configured you can set group_replication_start_on_boot to on so that Group Replication starts automatically upon server boot.
- Configuring group_replication_local_address sets the network address and port which the member uses for internal communication with other members in the group. Group Replication uses this address for internal member-to-member connections involving remote instances of the group communication engine (XCom, a Paxos variant).
  Important: This address must be different to the hostname and port used for SQL and it must not be used for client applications. It must only be used for internal communication between the members of the group while running Group Replication.
  The network address configured by group_replication_local_address must be resolvable by all group members. For example, if each server instance is on a different machine with a fixed network address, you could use the IP address of the machine, such as 10.0.0.1. If you use a host name, you must use a fully qualified name, and ensure it is resolvable through DNS, correctly configured /etc/hosts files, or other name resolution processes. From MySQL 8.0.14, IPv6 addresses (or host names that resolve to them) can be used as well as IPv4 addresses. A group can contain a mix of members using IPv6 and members using IPv4. For more information on Group Replication support for IPv6 networks and on mixed IPv4 and IPv6 groups, see Support For IPv6 And For Mixed IPv6 And IPv4 Groups.
  The recommended port for group_replication_local_address is 33061. group_replication_local_address is used by Group Replication as the unique identifier for a group member within the replication group. You can use the same port for all members of a replication group as long as the host names or IP addresses are all different, as demonstrated in this tutorial. Alternatively you can use the same host name or IP address for all members as long as the ports are all different, for example as shown in Section 17.2.2, “Deploying Group Replication Locally”.
- Configuring group_replication_group_seeds sets the hostname and port of the group members which are used by the new member to establish its connection to the group. These members are called the seed members. Once the connection is established, the group membership information is listed at performance_schema.replication_group_members. Usually the group_replication_group_seeds list contains the hostname:port of each group member's group_replication_local_address, but this is not obligatory and a subset of the group members can be chosen as seeds.
  Important: The hostname:port listed in group_replication_group_seeds is the seed member's internal network address, configured by group_replication_local_address, and not the SQL hostname:port used for client connections, which is shown for example in the performance_schema.replication_group_members table.
  The server that starts the group does not make use of this option, since it is the initial server and as such, it is in charge of bootstrapping the group. In other words, any existing data which is on the server bootstrapping the group is what is used as the data for the next joining member. The second server joining asks the one and only member in the group to join; any missing data on the second server is replicated from the donor data on the bootstrapping member, and then the group expands. The third server joining can ask either of these two to join; data is synchronized to the new member, and then the group expands again. Subsequent servers repeat this procedure when joining.
  Warning: When joining multiple servers at the same time, make sure that they point to seed members that are already in the group. Do not use members that are also joining the group as seeds, because they might not yet be in the group when contacted.
  It is good practice to start the bootstrap member first, and let it create the group. Then make it the seed member for the rest of the members that are joining. This ensures that there is a group formed when joining the rest of the members.
  Creating a group and joining multiple members at the same time is not supported. It might work, but chances are that the operations race and then the act of joining the group ends up in an error or a time out.
- Configuring group_replication_bootstrap_group instructs the plugin whether to bootstrap the group or not. In this case, even though s1 is the first member of the group we set this variable to off in the option file. Instead we configure group_replication_bootstrap_group when the instance is running, to ensure that only one member actually bootstraps the group.
  Important: The group_replication_bootstrap_group variable must only be enabled on one server instance belonging to a group at any time, usually the first time you bootstrap the group (or in case the entire group is brought down and back up again). If you bootstrap the group multiple times, for example when multiple server instances have this option set, then they could create an artificial split brain scenario, in which two distinct groups with the same name exist. Always set group_replication_bootstrap_group=off after the first server instance comes online.
Configuration for all servers in the group is quite similar. You need to change the specifics about each server (for example server_id, datadir, group_replication_local_address). This is illustrated later in this tutorial.
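Since only a few values differ between members, the option files can be generated from one template. The Python sketch below is a convenience illustration rather than a supported tool. The function `render_option_file` and the constant `SHARED_SETTINGS` are hypothetical names, the setting values are copied from the option file fragments shown earlier in this section, and placing them under a `[mysqld]` group header is an assumption about how the option file is organized. The `datadir` setting is left out because it depends on the installation.

```python
import uuid

# Settings that are identical on every member (values taken from this tutorial).
SHARED_SETTINGS = [
    ("disabled_storage_engines", '"MyISAM,BLACKHOLE,FEDERATED,ARCHIVE,MEMORY"'),
    ("gtid_mode", "ON"),
    ("enforce_gtid_consistency", "ON"),
    ("master_info_repository", "TABLE"),
    ("relay_log_info_repository", "TABLE"),
    ("binlog_checksum", "NONE"),
    ("log_slave_updates", "ON"),
    ("log_bin", "binlog"),
    ("binlog_format", "ROW"),
    ("plugin_load_add", "'group_replication.so'"),
    ("transaction_write_set_extraction", "XXHASH64"),
]


def render_option_file(server_id, host, seed_hosts, group_name, port=33061):
    """Build the option file text for one group member.

    Raises ValueError if group_name is not a well-formed UUID string.
    """
    uuid.UUID(group_name)  # validation only; the original string is kept as-is
    seeds = ",".join(f"{h}:{port}" for h in seed_hosts)
    lines = ["[mysqld]", f"server_id={server_id}"]
    lines += [f"{name}={value}" for name, value in SHARED_SETTINGS]
    lines += [
        f'group_replication_group_name="{group_name}"',
        "group_replication_start_on_boot=off",
        f'group_replication_local_address="{host}:{port}"',
        f'group_replication_group_seeds="{seeds}"',
        # Never enabled in the file; bootstrapping is done once, at runtime.
        "group_replication_bootstrap_group=off",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    hosts = ["s1", "s2", "s3"]
    group = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
    for server_id, host in enumerate(hosts, start=1):
        print(f"# option file for {host}")
        print(render_option_file(server_id, host, hosts, group))
```

Only server_id and group_replication_local_address change from one member to the next. The group name and the seed list stay the same on every member.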
Group Replication uses the asynchronous replication protocol to achieve distributed recovery (see Section 17.9.5, “Distributed Recovery”), synchronizing group members before joining them to the group. The distributed recovery process relies on a replication channel named group_replication_recovery, which is used to transfer transactions from donor members to members that join the group. Therefore you need to set up a replication user with the correct permissions so that Group Replication can establish direct member-to-member recovery replication channels.
Start the MySQL server instance and then connect a client to it. Create a MySQL user with the REPLICATION SLAVE privilege. This process can be captured in the binary log and then you can rely on distributed recovery to replicate the statements used to create the user. Alternatively, you can disable binary logging using SET SQL_LOG_BIN=0; and then create the user manually on each member, for example if you want to avoid the changes being propagated to other server instances. If you do decide to disable binary logging, ensure you re-enable it once you have configured the user.
In the following example the user rpl_user with the password password is shown. When configuring your servers use a suitable user name and password.
mysql> CREATE USER rpl_user@'%' IDENTIFIED BY 'password';
mysql> GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%';
mysql> FLUSH PRIVILEGES;
If binary logging was disabled, enable it again once the user has been created using SET SQL_LOG_BIN=1;.
Once the user has been configured, use the CHANGE MASTER TO statement to configure the server to use the given credentials for the group_replication_recovery replication channel the next time it needs to recover its state from another member. Issue the following, replacing rpl_user and password with the values used when creating the user.
mysql> CHANGE MASTER TO MASTER_USER='rpl_user', MASTER_PASSWORD='password' FOR CHANNEL 'group_replication_recovery';
Distributed recovery is the first step taken by a server that joins the group and does not have the same set of transactions as the group members. If these credentials are not set correctly for the group_replication_recovery replication channel and the rpl_user as shown, the server cannot connect to the donor members and run the distributed recovery process to synchronize with the other group members, and hence ultimately cannot join the group. See Section 17.9.5, “Distributed Recovery”.
Similarly, if the server cannot correctly identify the other members via the server's hostname, the recovery process can fail. It is recommended that operating systems running MySQL have a properly configured unique hostname, either using DNS or local settings. This hostname can be verified in the Member_host column of the performance_schema.replication_group_members table. If multiple group members externalize a default hostname set by the operating system, there is a chance of the member not resolving to the correct member address and not being able to join the group. In such a situation use report_host to configure a unique hostname to be externalized by each of the servers.
Once server s1 has been configured and started, install the Group Replication plugin. If you used plugin_load_add='group_replication.so' in the option file then the Group Replication plugin is installed and you can proceed to the next step. If you decide to install the plugin manually, connect to the server and issue the following:
INSTALL PLUGIN group_replication SONAME 'group_replication.so';
The mysql.session user must exist before you can load Group Replication. mysql.session was added in MySQL version 5.7.19. If your data dictionary was initialized using an earlier version you must perform the MySQL upgrade procedure (see Section 2.11, “Upgrading MySQL”). If the upgrade is not run, Group Replication fails to start with the error message "There was an error when trying to access the server with user: mysql.session@localhost. Make sure the user is present in the server and that mysql_upgrade was ran after a server update."
To check that the plugin was installed successfully, issue SHOW PLUGINS; and check the output. It should show something like this:
mysql> SHOW PLUGINS;
+----------------------------+----------+--------------------+----------------------+-------------+
| Name | Status | Type | Library | License |
+----------------------------+----------+--------------------+----------------------+-------------+
| binlog | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |
(...)
| group_replication | ACTIVE | GROUP REPLICATION | group_replication.so | PROPRIETARY |
+----------------------------+----------+--------------------+----------------------+-------------+
The process of starting a group for the first time is called bootstrapping. You use the group_replication_bootstrap_group system variable to bootstrap a group. The bootstrap should only be done by a single server, the one that starts the group, and only once. This is why the value of the group_replication_bootstrap_group option was not stored in the instance's option file. If it is saved in the option file, upon restart the server automatically bootstraps a second group with the same name. This would result in two distinct groups with the same name. The same reasoning applies to stopping and restarting the plugin with this option set to ON. Therefore to safely bootstrap the group, connect to s1 and issue:
mysql> SET GLOBAL group_replication_bootstrap_group=ON;
mysql> START GROUP_REPLICATION;
mysql> SET GLOBAL group_replication_bootstrap_group=OFF;
Once the START GROUP_REPLICATION statement returns, the group has been started. You can check that the group is now created and that there is one member in it:
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+---------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+---------------+
| group_replication_applier | ce9be252-2b71-11e6-b8f4-00212844f856 | s1 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+---------------+
The information in this table confirms that there is a member in the group with the unique identifier ce9be252-2b71-11e6-b8f4-00212844f856, that it is ONLINE, and that it is at s1 listening for client connections on port 3306.
For the purpose of demonstrating that the server is indeed in a group and that it is able to handle load, create a table and add some content to it.
mysql> CREATE DATABASE test;
mysql> USE test;
mysql> CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 TEXT NOT NULL);
mysql> INSERT INTO t1 VALUES (1, 'Luis');
Check the content of table t1 and the binary log.
mysql> SELECT * FROM t1;
+----+------+
| c1 | c2 |
+----+------+
| 1 | Luis |
+----+------+
mysql> SHOW BINLOG EVENTS;
+---------------+-----+----------------+-----------+-------------+--------------------------------------------------------------------+
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
+---------------+-----+----------------+-----------+-------------+--------------------------------------------------------------------+
| binlog.000001 | 4 | Format_desc | 1 | 123 | Server ver: 5.7.30-log, Binlog ver: 4 |
| binlog.000001 | 123 | Previous_gtids | 1 | 150 | |
| binlog.000001 | 150 | Gtid | 1 | 211 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1' |
| binlog.000001 | 211 | Query | 1 | 270 | BEGIN |
| binlog.000001 | 270 | View_change | 1 | 369 | view_id=14724817264259180:1 |
| binlog.000001 | 369 | Query | 1 | 434 | COMMIT |
| binlog.000001 | 434 | Gtid | 1 | 495 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:2' |
| binlog.000001 | 495 | Query | 1 | 585 | CREATE DATABASE test |
| binlog.000001 | 585 | Gtid | 1 | 646 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:3' |
| binlog.000001 | 646 | Query | 1 | 770 | use `test`; CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 TEXT NOT NULL) |
| binlog.000001 | 770 | Gtid | 1 | 831 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:4' |
| binlog.000001 | 831 | Query | 1 | 899 | BEGIN |
| binlog.000001 | 899 | Table_map | 1 | 942 | table_id: 108 (test.t1) |
| binlog.000001 | 942 | Write_rows | 1 | 984 | table_id: 108 flags: STMT_END_F |
| binlog.000001 | 984 | Xid | 1 | 1011 | COMMIT /* xid=38 */ |
+---------------+-----+----------------+-----------+-------------+--------------------------------------------------------------------+
As seen above, the database and the table objects were created and their corresponding DDL statements were written to the binary log. Also, the data was inserted into the table and written to the binary log. The importance of the binary log entries is illustrated in the following section when the group grows and distributed recovery is executed as new members try to catch up and become online.
At this point, the group has one member in it, server s1, which has some data in it. It is now time to expand the group by adding the other two servers configured previously.
In order to add a second instance, server s2, first create the configuration file for it. The configuration is similar to the one used for server s1, except for things such as the server_id. These different lines are highlighted in the listing below.
[mysqld]
#
# Disable other storage engines
#
disabled_storage_engines="MyISAM,BLACKHOLE,FEDERATED,ARCHIVE,MEMORY"
#
# Replication configuration parameters
#
server_id=2
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
binlog_checksum=NONE
log_slave_updates=ON
log_bin=binlog
binlog_format=ROW
#
# Group Replication configuration
#
transaction_write_set_extraction=XXHASH64
group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
group_replication_start_on_boot=off
group_replication_local_address= "s2:33061"
group_replication_group_seeds= "s1:33061,s2:33061,s3:33061"
group_replication_bootstrap_group= off
Similar to the procedure for server s1, with the option file in place you launch the server. Then configure the recovery credentials as follows. The commands are the same as used when setting up server s1 as the user is shared within the group. This member needs to have the same replication user configured in Section 17.2.1.3, “User Credentials”. If you are relying on distributed recovery to configure the user on all members, when s2 connects to the seed s1 the replication user is replicated to s2. If you did not have binary logging enabled when you configured the user credentials on s1, you must create the replication user on s2. In this case, connect to s2 and issue:
SET SQL_LOG_BIN=0;
CREATE USER rpl_user@'%' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%';
SET SQL_LOG_BIN=1;
CHANGE MASTER TO MASTER_USER='rpl_user', MASTER_PASSWORD='password' FOR CHANNEL 'group_replication_recovery';
If necessary, install the Group Replication plugin, see Section 17.2.1.4, “Launching Group Replication”.
Start Group Replication and s2 starts the process of joining the group.
mysql> START GROUP_REPLICATION;
Unlike the previous steps, which were the same as those executed on s1, here there is a difference: you do not need to bootstrap the group because the group already exists. In other words, on s2 group_replication_bootstrap_group is set to off, and you do not issue SET GLOBAL group_replication_bootstrap_group=ON; before starting Group Replication, because the group has already been created and bootstrapped by server s1. At this point server s2 only needs to be added to the already existing group.
When Group Replication starts successfully and the server joins the group, it checks the super_read_only variable. By setting super_read_only to ON in the member's configuration file, you can ensure that servers which fail when starting Group Replication for any reason do not accept transactions. If the server should join the group as a read-write instance, for example as the primary in a single-primary group or as a member of a multi-primary group, and super_read_only is set to ON, the variable is set to OFF when the server joins the group.
Checking the performance_schema.replication_group_members table again shows that there are now two ONLINE servers in the group.
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+---------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+---------------+
| group_replication_applier | 395409e1-6dfa-11e6-970b-00212844f856 | s1 | 3306 | ONLINE |
| group_replication_applier | ac39f1e6-6dfa-11e6-a69d-00212844f856 | s2 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+---------------+
When s2 attempted to join the group, distributed recovery (see Section 17.9.5, “Distributed Recovery”) ensured that s2 applied the same transactions which s1 had applied. Once this process completed, s2 could join the group as a member, and at this point it is marked as ONLINE. In other words it must have already caught up with server s1 automatically. Once s2 is ONLINE, it then begins to process transactions with the group. Verify that s2 has indeed synchronized with server s1 as follows.
mysql> SHOW DATABASES LIKE 'test';
+-----------------+
| Database (test) |
+-----------------+
| test |
+-----------------+
mysql> SELECT * FROM test.t1;
+----+------+
| c1 | c2 |
+----+------+
| 1 | Luis |
+----+------+
mysql> SHOW BINLOG EVENTS;
+---------------+------+----------------+-----------+-------------+--------------------------------------------------------------------+
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
+---------------+------+----------------+-----------+-------------+--------------------------------------------------------------------+
| binlog.000001 | 4 | Format_desc | 2 | 123 | Server ver: 5.7.30-log, Binlog ver: 4 |
| binlog.000001 | 123 | Previous_gtids | 2 | 150 | |
| binlog.000001 | 150 | Gtid | 1 | 211 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1' |
| binlog.000001 | 211 | Query | 1 | 270 | BEGIN |
| binlog.000001 | 270 | View_change | 1 | 369 | view_id=14724832985483517:1 |
| binlog.000001 | 369 | Query | 1 | 434 | COMMIT |
| binlog.000001 | 434 | Gtid | 1 | 495 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:2' |
| binlog.000001 | 495 | Query | 1 | 585 | CREATE DATABASE test |
| binlog.000001 | 585 | Gtid | 1 | 646 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:3' |
| binlog.000001 | 646 | Query | 1 | 770 | use `test`; CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 TEXT NOT NULL) |
| binlog.000001 | 770 | Gtid | 1 | 831 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:4' |
| binlog.000001 | 831 | Query | 1 | 890 | BEGIN |
| binlog.000001 | 890 | Table_map | 1 | 933 | table_id: 108 (test.t1) |
| binlog.000001 | 933 | Write_rows | 1 | 975 | table_id: 108 flags: STMT_END_F |
| binlog.000001 | 975 | Xid | 1 | 1002 | COMMIT /* xid=30 */ |
| binlog.000001 | 1002 | Gtid | 1 | 1063 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:5' |
| binlog.000001 | 1063 | Query | 1 | 1122 | BEGIN |
| binlog.000001 | 1122 | View_change | 1 | 1261 | view_id=14724832985483517:2 |
| binlog.000001 | 1261 | Query | 1 | 1326 | COMMIT |
+---------------+------+----------------+-----------+-------------+--------------------------------------------------------------------+
As seen above, the second server has been added to the group and it has replicated the changes from server s1 automatically using distributed recovery. In other words, the transactions applied on s1 up to the point in time that s2 joined the group have been replicated to s2.
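One way to double-check the catch-up is to compare the executed GTID sets of the two members. The following Python sketch is illustrative only: it assumes that you have already fetched the two strings yourself, for example with SELECT @@GLOBAL.gtid_executed on s1 and on s2, and the helper names parse_gtid_set and missing_transactions are hypothetical, not part of MySQL.

```python
def parse_gtid_set(text):
    """Parse a GTID set string such as 'uuid:1-5:7' into {uuid: set_of_ints}."""
    result = {}
    # MySQL may wrap long GTID sets over several lines; drop all whitespace first.
    compact = "".join(text.split())
    if not compact:
        return result
    for part in compact.split(","):
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid.lower(), set())
        for interval in intervals:
            first, _, last = interval.partition("-")
            start = int(first)
            end = int(last) if last else start
            numbers.update(range(start, end + 1))
    return result


def missing_transactions(donor, joiner):
    """Return {uuid: sorted transaction numbers} present on donor but not on joiner."""
    donor_set = parse_gtid_set(donor)
    joiner_set = parse_gtid_set(joiner)
    missing = {}
    for uuid, numbers in donor_set.items():
        gap = numbers - joiner_set.get(uuid, set())
        if gap:
            missing[uuid] = sorted(gap)
    return missing


# Hypothetical values, shaped like the group name used in this tutorial.
GROUP = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
s1_executed = GROUP + ":1-5"
s2_executed = GROUP + ":1-5"
print(missing_transactions(s1_executed, s2_executed))  # {}
```

An empty result suggests that s2 has every transaction that s1 reported at the time the values were read. The sketch expands every interval in memory, so it is only suitable for small sets such as the ones in this tutorial.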
Adding additional instances to the group is essentially the same sequence of steps as adding the second server, except that the configuration has to be changed as it had to be for server s2. To summarise the required commands:
1. Create the configuration file
[mysqld]
#
# Disable other storage engines
#
disabled_storage_engines="MyISAM,BLACKHOLE,FEDERATED,ARCHIVE,MEMORY"
#
# Replication configuration parameters
#
server_id=3
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
binlog_checksum=NONE
log_slave_updates=ON
log_bin=binlog
binlog_format=ROW
#
# Group Replication configuration
#
transaction_write_set_extraction=XXHASH64
group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
group_replication_start_on_boot=off
group_replication_local_address= "s3:33061"
group_replication_group_seeds= "s1:33061,s2:33061,s3:33061"
group_replication_bootstrap_group= off
2. Start the server and connect to it. Configure the recovery credentials for the group_replication_recovery channel.
SET SQL_LOG_BIN=0;
CREATE USER rpl_user@'%' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%';
FLUSH PRIVILEGES;
SET SQL_LOG_BIN=1;
CHANGE MASTER TO MASTER_USER='rpl_user', MASTER_PASSWORD='password' FOR CHANNEL 'group_replication_recovery';
3. Install the Group Replication plugin and start it.
INSTALL PLUGIN group_replication SONAME 'group_replication.so';
START GROUP_REPLICATION;
At this point server s3 is booted and running, has joined the group and caught up with the other servers in the group. Consulting the performance_schema.replication_group_members table again confirms this is the case.
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+---------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+---------------+
| group_replication_applier | 395409e1-6dfa-11e6-970b-00212844f856 | s1 | 3306 | ONLINE |
| group_replication_applier | 7eb217ff-6df3-11e6-966c-00212844f856 | s3 | 3306 | ONLINE |
| group_replication_applier | ac39f1e6-6dfa-11e6-a69d-00212844f856 | s2 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+---------------+
Issuing this same query on server s2 or server s1 yields the same result. Also, you can verify that server s3 has caught up:
mysql> SHOW DATABASES LIKE 'test';
+-----------------+
| Database (test) |
+-----------------+
| test |
+-----------------+
mysql> SELECT * FROM test.t1;
+----+------+
| c1 | c2 |
+----+------+
| 1 | Luis |
+----+------+
mysql> SHOW BINLOG EVENTS;
+---------------+------+----------------+-----------+-------------+--------------------------------------------------------------------+
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
+---------------+------+----------------+-----------+-------------+--------------------------------------------------------------------+
| binlog.000001 | 4 | Format_desc | 3 | 123 | Server ver: 5.7.30-log, Binlog ver: 4 |
| binlog.000001 | 123 | Previous_gtids | 3 | 150 | |
| binlog.000001 | 150 | Gtid | 1 | 211 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1' |
| binlog.000001 | 211 | Query | 1 | 270 | BEGIN |
| binlog.000001 | 270 | View_change | 1 | 369 | view_id=14724832985483517:1 |
| binlog.000001 | 369 | Query | 1 | 434 | COMMIT |
| binlog.000001 | 434 | Gtid | 1 | 495 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:2' |
| binlog.000001 | 495 | Query | 1 | 585 | CREATE DATABASE test |
| binlog.000001 | 585 | Gtid | 1 | 646 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:3' |
| binlog.000001 | 646 | Query | 1 | 770 | use `test`; CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 TEXT NOT NULL) |
| binlog.000001 | 770 | Gtid | 1 | 831 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:4' |
| binlog.000001 | 831 | Query | 1 | 890 | BEGIN |
| binlog.000001 | 890 | Table_map | 1 | 933 | table_id: 108 (test.t1) |
| binlog.000001 | 933 | Write_rows | 1 | 975 | table_id: 108 flags: STMT_END_F |
| binlog.000001 | 975 | Xid | 1 | 1002 | COMMIT /* xid=29 */ |
| binlog.000001 | 1002 | Gtid | 1 | 1063 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:5' |
| binlog.000001 | 1063 | Query | 1 | 1122 | BEGIN |
| binlog.000001 | 1122 | View_change | 1 | 1261 | view_id=14724832985483517:2 |
| binlog.000001 | 1261 | Query | 1 | 1326 | COMMIT |
| binlog.000001 | 1326 | Gtid | 1 | 1387 | SET @@SESSION.GTID_NEXT= 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:6' |
| binlog.000001 | 1387 | Query | 1 | 1446 | BEGIN |
| binlog.000001 | 1446 | View_change | 1 | 1585 | view_id=14724832985483517:3 |
| binlog.000001 | 1585 | Query | 1 | 1650 | COMMIT |
+---------------+------+----------------+-----------+-------------+--------------------------------------------------------------------+
The most common way to deploy Group Replication is using multiple server instances, to provide high availability. It is also possible to deploy Group Replication locally, for example for testing purposes. This section explains how you can deploy Group Replication locally.
Group Replication is usually deployed on multiple hosts because this ensures that high availability is provided. The instructions in this section are not suitable for production deployments because all MySQL server instances are running on the same single host. In the event of failure of this host, the whole group fails. Therefore this information should be used for testing purposes only and should not be used in a production environment.
This section explains how to create a replication group with three MySQL Server instances on one physical machine. This means that three data directories are needed, one per server instance, and that you need to configure each instance independently. This procedure assumes that MySQL Server was downloaded and unpacked into the directory named mysql-5.7. Each MySQL server instance requires a specific data directory. Create a directory named data, then in that directory create a subdirectory for each server instance, for example s1, s2 and s3, and initialize each one.
mysql-5.7/bin/mysqld --initialize-insecure --basedir=$PWD/mysql-5.7 --datadir=$PWD/data/s1
mysql-5.7/bin/mysqld --initialize-insecure --basedir=$PWD/mysql-5.7 --datadir=$PWD/data/s2
mysql-5.7/bin/mysqld --initialize-insecure --basedir=$PWD/mysql-5.7 --datadir=$PWD/data/s3
Inside data/s1, data/s2, data/s3 is an initialized data directory, containing the mysql system database and related tables and much more. To learn more about the initialization procedure, see Section 2.10.1, “Initializing the Data Directory”.
Do not use --initialize-insecure in production environments. It is only used here to simplify the tutorial. For more information on security settings, see Section 17.5, “Group Replication Security”.
When you are following Section 17.2.1.2, “Configuring an Instance for Group Replication”, you need to add configuration for the data directories added in the previous section. For example:
[mysqld]
# server configuration
datadir=<full_path_to_data>/data/s1
basedir=<full_path_to_bin>/mysql-5.7/
port=24801
socket=<full_path_to_sock_dir>/s1.sock
These settings configure MySQL server to use the data directory created earlier, and set the port on which the server listens for incoming connections.
The non-default port of 24801 is used because in this tutorial the three server instances use the same hostname. In a setup with three different machines this would not be required.
Group Replication requires a network connection between the members, which means that each member must be able to resolve the network address of all of the other members. For example in this tutorial all three instances run on one machine, so to ensure that the members can contact each other you could add a line to the option file such as report_host=127.0.0.1.
Then each member needs to be able to connect to the other members on their group_replication_local_address. For example in the option file of member s1 add:
group_replication_local_address= "127.0.0.1:24901"
group_replication_group_seeds= "127.0.0.1:24901,127.0.0.1:24902,127.0.0.1:24903"
This configures s1 to use port 24901 for internal group communication with seed members. For each server instance you want to add to the group, make these changes in the option file of the member. For each member you must ensure a unique address is specified, so use a unique port per instance for group_replication_local_address. Usually you want all members to be able to serve as seeds for members that are joining the group and do not yet have the transactions processed by the group. In this case, add all of the ports to group_replication_group_seeds as shown above.
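Because only a few settings differ between the local instances, they can be derived from the instance number. The following Python sketch is one possible way to do that, not an official tool. The function names are hypothetical, and the client ports 24802 and 24803 for s2 and s3 are an assumption that simply extends the 24801 used for s1 above.

```python
def local_member_options(index, member_count=3, host="127.0.0.1",
                         client_base=24800, group_base=24900):
    """Return the per-instance settings for local member number `index` (1-based)."""
    if not 1 <= index <= member_count:
        raise ValueError("index must be between 1 and member_count")
    # Every member lists all group communication addresses as seeds.
    seeds = ",".join(f"{host}:{group_base + n}" for n in range(1, member_count + 1))
    return {
        "server_id": str(index),
        "port": str(client_base + index),
        "report_host": host,
        "group_replication_local_address": f"{host}:{group_base + index}",
        "group_replication_group_seeds": seeds,
    }


def render_options(options):
    """Render the settings as name=value lines for the [mysqld] section."""
    return "\n".join(f"{name}={value}" for name, value in options.items())


for member in (1, 2, 3):
    print(f"# s{member}")
    print(render_options(local_member_options(member)))
```

The printed lines cover only the settings that vary per instance. The datadir, socket and the shared Group Replication settings still have to be added to each option file as described above.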
The remaining steps of Section 17.2.1, “Deploying Group Replication in Single-Primary Mode” apply equally to a group which you have deployed locally in this way.
Use the Performance Schema tables to monitor Group Replication, assuming that the Performance Schema is enabled. Group Replication adds the following tables:
- performance_schema.replication_group_member_stats
- performance_schema.replication_group_members
These Performance Schema replication tables also show information about Group Replication:
- performance_schema.replication_connection_status shows information regarding Group Replication, for example the transactions that have been received from the group and queued in the applier queue (the relay log).
- performance_schema.replication_applier_status shows the state of the Group Replication related channels and threads. If there are many different worker threads applying transactions, then the worker tables can also be used to monitor what each worker thread is doing.
The replication channels created by the Group Replication plugin are named:
- group_replication_recovery: This channel is used for the replication changes that are related to the distributed recovery phase.
- group_replication_applier: This channel is used for the incoming changes from the group. This is the channel used to apply transactions coming directly from the group.
The following sections describe how to interpret the information available.
There are various states that a server instance can be in. If servers are communicating properly, all report the same states for all servers. However, if there is a network partition, or a server leaves the group, then different information could be reported, depending on which server is queried. If the server has left the group then it cannot report updated information about the other servers' states. If there is a partition, such that quorum is lost, servers are not able to coordinate between themselves. As a consequence, they cannot guess what the status of different servers is. Therefore, instead of guessing their state they report that some servers are unreachable.
Table 17.1 Server State
| Field | Description | Group Synchronized |
|---|---|---|
| ONLINE | The member is ready to serve as a fully functional group member, meaning that the client can connect and start executing transactions. | Yes |
| RECOVERING | The member is in the process of becoming an active member of the group and is currently going through the recovery process, receiving state information from a donor. | No |
| OFFLINE | The plugin is loaded but the member does not belong to any group. | No |
| ERROR | The state of the member. Whenever there is an error on the recovery phase or while applying changes, the server enters this state. | No |
| UNREACHABLE | Whenever the local failure detector suspects that a given server is not reachable, because for example it was disconnected involuntarily, it shows that server's state as UNREACHABLE. | No |
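To make the remark about lost quorum more concrete, the following Python sketch checks whether the member you are querying can still see a majority of the group. It is a simplified model, not the plugin's own logic: it assumes that a majority means more than half of the members listed in the view, and that you pass in the MEMBER_STATE values exactly as one member reports them. The function name has_majority is hypothetical.

```python
def has_majority(member_states):
    """Return True if more than half of the listed members are not UNREACHABLE."""
    total = len(member_states)
    reachable = sum(1 for state in member_states if state.upper() != "UNREACHABLE")
    return total > 0 and reachable * 2 > total


# View from s1 after it has been cut off from s2 and s3.
print(has_majority(["ONLINE", "UNREACHABLE", "UNREACHABLE"]))  # False
# View from s1 when only s3 is unreachable.
print(has_majority(["ONLINE", "ONLINE", "UNREACHABLE"]))       # True
```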
Once an instance enters ERROR state, the super_read_only option is set to ON. To leave the ERROR state you must manually configure the instance with super_read_only=OFF.
Note that Group Replication is not synchronous, but eventually synchronous. More precisely, transactions are delivered to all group members in the same order, but their execution is not synchronized, meaning that after a transaction is accepted to be committed, each member commits at its own pace.
The performance_schema.replication_group_members table is used for monitoring the status of the different server instances that are members of the group. The information in the table is updated whenever there is a view change, for example when the configuration of the group is dynamically changed when a new member joins. At that point, servers exchange some of their metadata to synchronize themselves and continue to cooperate together. The information is shared between all the server instances that are members of the replication group, so information on all the group members can be queried from any member. This table can be used to get a high level view of the state of a replication group, for example by issuing:
SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| group_replication_applier | 041f26d8-f3f3-11e8-adff-080027337932 | example1 | 3306 | ONLINE |
| group_replication_applier | f60a3e10-f3f2-11e8-8258-080027337932 | example2 | 3306 | ONLINE |
| group_replication_applier | fc890014-f3f2-11e8-a9fd-080027337932 | example3 | 3306 | ONLINE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
From this result we can see that the group consists of three members. The MEMBER_ID column shows the server_uuid of each member, and the MEMBER_HOST and MEMBER_PORT columns show the host and port number which clients use to connect to the member. The MEMBER_STATE column shows one of the states described in Section 17.3.1, “Group Replication Server States”; in this case all three members in this group are ONLINE. In versions of this table that also provide the MEMBER_ROLE and MEMBER_VERSION columns, which are not part of the output shown above, MEMBER_ROLE identifies the primary and the secondaries of a group running in single-primary mode, and MEMBER_VERSION can be useful when you are upgrading a group and are combining members running different MySQL versions.
For more information about the Member_host value and its impact on the distributed recovery process, see Section 17.2.1.3, “User Credentials”.
Each member in a replication group certifies and applies transactions received by the group. Statistics regarding the certifier and applier procedures are useful to understand how the applier queue is growing, how many conflicts have been found, how many transactions were checked, which transactions are committed everywhere, and so on.
The performance_schema.replication_group_member_stats table provides group-level information related to the certification process, and also statistics for the transactions received and originated by each individual member of the replication group. The information is shared between all the server instances that are members of the replication group, so information on all the group members can be queried from any member. Note that refreshing of statistics for remote members is controlled by the message period specified in the group_replication_flow_control_period option, so these can differ slightly from the locally collected statistics for the member where the query is made. To use this table to monitor a Group Replication member, issue:
mysql> SELECT * FROM performance_schema.replication_group_member_stats\G
These fields are important for monitoring the performance of the members connected in the group. For example, suppose that one of the group’s members always reports a large number of transactions in its queue compared to other members. This means that the member is delayed and is not able to keep up to date with the other members of the group. Based on this information, you could decide to either remove the member from the group, or delay the processing of transactions on the other members of the group in order to reduce the number of queued transactions. This information can also help you to decide how to adjust the flow control of the Group Replication plugin, see Section 17.9.7.3, “Flow Control”.
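The comparison of queue sizes described above can be automated. The following Python sketch is a minimal illustration under stated assumptions: the rows are dictionaries that you have already fetched from performance_schema.replication_group_member_stats, only the MEMBER_ID and COUNT_TRANSACTIONS_IN_QUEUE columns are used, and the function name lagging_members as well as the factor and minimum thresholds are hypothetical values chosen for the example, not recommendations.

```python
from statistics import median


def lagging_members(stats_rows, factor=2.0, minimum=100):
    """Return MEMBER_IDs whose queue is at least `minimum` and above `factor` times the median."""
    queues = {row["MEMBER_ID"]: int(row["COUNT_TRANSACTIONS_IN_QUEUE"])
              for row in stats_rows}
    if not queues:
        return []
    typical = median(queues.values())
    return sorted(member for member, size in queues.items()
                  if size >= minimum and size > factor * typical)


# Hypothetical sample rows; real MEMBER_ID values are server UUIDs.
rows = [
    {"MEMBER_ID": "member-1", "COUNT_TRANSACTIONS_IN_QUEUE": 3},
    {"MEMBER_ID": "member-2", "COUNT_TRANSACTIONS_IN_QUEUE": 5},
    {"MEMBER_ID": "member-3", "COUNT_TRANSACTIONS_IN_QUEUE": 4200},
]
print(lagging_members(rows))  # ['member-3']
```

A single sample can be misleading, so in practice you would probably evaluate several consecutive samples before deciding that a member is consistently behind.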
This section describes the different modes of deploying Group Replication, explains common operations for managing groups and provides information about how to tune your groups.
Group Replication operates in the following different modes:
- single-primary mode
- multi-primary mode
The default mode is single-primary. It is not possible to have members of the group deployed in different modes, for example one configured in multi-primary mode while another one is in single-primary mode. To switch between modes, the group, and not just an individual server, needs to be restarted with a different operating configuration. Regardless of the deployed mode, Group Replication does not handle client-side fail-over. That must be handled by the application itself, a connector or a middleware framework such as a proxy or MySQL Router 8.0.
When deployed in multi-primary mode, statements are checked to ensure they are compatible with the mode. The following checks are made when Group Replication is deployed in multi-primary mode:
- If a transaction is executed under the SERIALIZABLE isolation level, then its commit fails when synchronizing itself with the group.
- If a transaction executes against a table that has foreign keys with cascading constraints, then the transaction fails to commit when synchronizing itself with the group.
These checks can be deactivated by setting the option group_replication_enforce_update_everywhere_checks to FALSE. When deploying in single-primary mode, this option must be set to FALSE.
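A minimal sketch of inspecting and changing the mode-related settings follows. It assumes that Group Replication has already been stopped on every member, because switching modes requires restarting the whole group, and that group_replication_single_primary_mode is the companion variable that selects the mode.

```sql
-- Inspect the current mode settings on this member.
SELECT @@GLOBAL.group_replication_single_primary_mode,
       @@GLOBAL.group_replication_enforce_update_everywhere_checks;

-- Sketch: prepare a member for multi-primary mode while Group Replication
-- is stopped. Turn single-primary mode off before enabling the checks.
SET GLOBAL group_replication_single_primary_mode = OFF;
SET GLOBAL group_replication_enforce_update_everywhere_checks = ON;
```

The same values would also need to be placed in each member's configuration file so that they survive a server restart.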
In single-primary mode, the group has a single primary server that is set to read-write mode. All the other members in the group are set to read-only mode (with super-read-only=ON). This happens automatically. The primary is typically the first server to bootstrap the group. All other servers that join automatically learn about the primary server and are set to read-only.
When in single-primary mode, some of the checks deployed in multi-primary mode are disabled, because the system enforces that only a single server writes to the group. For example, changes to tables that have cascading foreign keys are allowed, whereas in multi-primary mode they are not. Upon primary member failure, an automatic primary election mechanism chooses the new primary member. The election process is performed by looking at the new view and ordering the potential new primaries based on the value of group_replication_member_weight. Assuming the group is operating with all members running the same MySQL version, the member with the highest value for group_replication_member_weight is elected as the new primary. If multiple servers have the same group_replication_member_weight, they are prioritized by their server_uuid in lexicographical order, and the first one is picked. Once a new primary is elected, it is automatically set to read-write, and the other members remain secondaries and, as such, read-only.
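The following sketch shows how the election inputs described above might be adjusted and inspected. The weight of 80 is an arbitrary example value, and the sketch assumes a server version that supports group_replication_member_weight and exposes the group_replication_primary_member status variable.

```sql
-- Give this member a higher weight so that it is preferred in an election
-- (the value 80 is only an example).
SET GLOBAL group_replication_member_weight = 80;

-- Show the weight and server_uuid that the election would consider.
SELECT @@GLOBAL.group_replication_member_weight, @@GLOBAL.server_uuid;

-- Find the server_uuid of the current primary.
SHOW STATUS LIKE 'group_replication_primary_member';
```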
When a new primary is elected, it is only writable once it has processed all of the transactions that came from the old primary. This avoids possible concurrency issues between old transactions from the old primary and the new ones being executed on this member. It is good practice to wait for the new primary to apply its replication-related relay log before rerouting client applications to it.
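One possible way to check whether the new primary already accepts writes is to look at its read-only settings before rerouting clients. This is only a sketch, and it assumes that the read-only state described earlier for secondaries is lifted at the point where the member becomes writable.

```sql
-- Run on the newly elected primary. Both values are expected to be 0 (OFF)
-- once the member accepts writes.
SELECT @@GLOBAL.read_only, @@GLOBAL.super_read_only;
```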
If the group is operating with members that are running different versions of MySQL then the election process can be impacted. For example, if any member does not support group_replication_member_weight
, then the primary is chosen based on server_uuid
order from the members of the lower major version. Alternatively, if all members running different MySQL versions do support group_replication_member_weight
, the primary is chosen based on group_replication_member_weight
from the members of the lower major version.
Whenever a new member joins a replication group, it connects to a suitable donor and fetches the data that it has missed up until the point it is declared online. This critical component in Group Replication is fault tolerant and configurable. The following section explains how recovery works and how to tune the settings.
A random donor is selected from the existing online members in the group. This way there is a good chance that the same server is not selected more than once when multiple members enter the group.
If the connection to the selected donor fails, a new connection is automatically attempted to a new candidate donor. Once the connection retry limit is reached the recovery procedure terminates with an error.
A donor is picked randomly from the list of online members in the current view.
The other main point of concern in recovery as a whole is to make sure that it copes with failures. Hence, Group Replication provides robust error detection mechanisms. In earlier versions of Group Replication, when reaching out to a donor, recovery could only detect connection errors due to authentication issues or some other problem. The reaction to such problematic scenarios was to switch over to a new donor, so a new connection attempt was made to a different member.
This behavior was extended to also cover other failure scenarios:
-
Purged data scenarios - If the selected donor contains some purged data that is needed for the recovery process then an error occurs. Recovery detects this error and a new donor is selected.
-
Duplicated data - If a server joining the group already contains some data that conflicts with the data coming from the selected donor during recovery then an error occurs. This could be caused by some errant transactions present in the server joining the group.
One could argue that recovery should fail instead of switching over to another donor, but in heterogeneous groups there is a chance that some members share the conflicting transactions and others do not. For that reason, upon error, recovery selects another donor from the group.
-
Other errors - If any of the recovery threads fail (receiver or applier threads fail) then an error occurs and recovery switches over to a new donor.
In the case of persistent or even transient failures, recovery automatically retries connecting to the same or a new donor.
The recovery data transfer relies on the binary log and existing MySQL replication framework, therefore it is possible that some transient errors could cause errors in the receiver or applier threads. In such cases, the donor switch over process has retry functionality, similar to that found in regular replication.
By default, a server joining the group makes 10 attempts when trying to connect to a donor from the pool of donors. This is configured through the group_replication_recovery_retry_count plugin variable. The following command sets the maximum number of attempts to connect to a donor to 10.
mysql> SET GLOBAL group_replication_recovery_retry_count= 10;
Note that this accounts for the global number of attempts that the server joining the group makes connecting to each one of the suitable donors.
The group_replication_recovery_reconnect_interval
plugin variable defines how much time the recovery process should sleep between donor connection attempts. This variable has its default set to 60 seconds and you can change this value dynamically. The following command sets the recovery donor connection retry interval to 120 seconds.
mysql> SET GLOBAL group_replication_recovery_reconnect_interval= 120;
Note, however, that recovery does not sleep after every donor connection attempt. As the server joining the group is connecting to different servers and not to the same one over and over again, it can assume that the problem that affects server A does not affect server B. As such, recovery suspends only when it has gone through all the possible donors. Once the server joining the group has tried to connect to all the suitable donors in the group and none remains, the recovery process sleeps for the number of seconds configured by the group_replication_recovery_reconnect_interval
variable.
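To see how the two recovery settings are currently configured, and what the recovery channel is doing while a member joins, queries along the following lines can be used. This is a sketch that assumes the recovery channel is reported in performance_schema.replication_connection_status under the name group_replication_recovery.

```sql
-- Current retry settings on the joining member.
SELECT @@GLOBAL.group_replication_recovery_retry_count,
       @@GLOBAL.group_replication_recovery_reconnect_interval;

-- State and last error of the recovery channel while the member is joining.
SELECT CHANNEL_NAME, SERVICE_STATE, LAST_ERROR_NUMBER, LAST_ERROR_MESSAGE
FROM performance_schema.replication_connection_status
WHERE CHANNEL_NAME = 'group_replication_recovery';
```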
The group needs to achieve consensus whenever a change that needs to be replicated happens. This is the case for regular transactions but is also required for group membership changes and some internal messaging that keeps the group consistent. Consensus requires a majority of group members to agree on a given decision. When a majority of group members is lost, the group is unable to progress and blocks because it cannot secure majority or quorum.
Quorum may be lost when there are multiple involuntary failures, causing a majority of servers to be removed abruptly from the group. For example, in a group of 5 servers, if 3 of them become silent at once, the majority is compromised and thus no quorum can be achieved. In fact, the remaining 2 servers are not able to tell whether the other 3 servers have crashed or whether a network partition has isolated these 2 alone, and therefore the group cannot be reconfigured automatically.
On the other hand, if servers exit the group voluntarily, they instruct the group that it should reconfigure itself. In practice, this means that a server that is leaving tells others that it is going away. This means that other members can reconfigure the group properly, the consistency of the membership is maintained and the majority is recalculated. For example, in the above scenario of 5 servers where 3 leave at once, if the 3 leaving servers warn the group that they are leaving, one by one, then the membership is able to adjust itself from 5 to 2 while maintaining quorum throughout.
Loss of quorum is by itself a side-effect of bad planning. Plan the group size for the number of expected failures (regardless of whether they are consecutive, happen all at once, or are sporadic).
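The arithmetic behind this planning can be sketched with a plain query that needs no Group Replication tables. It assumes only the majority rule stated above: a group of n members needs more than half of them to agree, so it can lose n minus that majority at once and still keep quorum. The group sizes listed are example values.

```sql
-- For each example group size, compute the majority and the number of
-- simultaneous involuntary failures the group can absorb.
SELECT n                        AS group_size,
       FLOOR(n / 2) + 1         AS majority,
       n - (FLOOR(n / 2) + 1)   AS failures_tolerated
FROM (SELECT 3 AS n UNION ALL SELECT 5 UNION ALL SELECT 7 UNION ALL SELECT 9) AS sizes;
-- group_size 3 -> majority 2, failures_tolerated 1
-- group_size 5 -> majority 3, failures_tolerated 2
-- group_size 7 -> majority 4, failures_tolerated 3
-- group_size 9 -> majority 5, failures_tolerated 4
```

This matches the example in the text: a group of 5 tolerates 2 simultaneous failures, so losing 3 at once blocks the group.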
The following sections explain what to do if the system partitions in such a way that no quorum is automatically achieved by the servers in the group.
A primary that has been excluded from a group after a majority loss followed by a reconfiguration can contain extra transactions that are not included in the new group. If this happens, the attempt to add the excluded member back to the group results in an error with the message This member has more executed transactions than those present in the group.
The replication_group_members
performance schema table presents the status of each server in the current view from the perspective of this server. The majority of the time the system does not run into partitioning, and therefore the table shows information that is consistent across all servers in the group. In other words, the status of each server on this table is agreed by all in the current view. However, if there is network partitioning, and quorum is lost, then the table shows the status UNREACHABLE
for those servers that it cannot contact. This information is exported by the local failure detector built into Group Replication.
To understand this type of network partition the following section describes a scenario where there are initially 5 servers working together correctly, and the changes that then happen to the group once only 2 servers are online. The scenario is depicted in the figure.
As such, let's assume that there is a group with these 5 servers in it:
-
Server s1 with member identifier
199b2df7-4aaf-11e6-bb16-28b2bd168d07
-
Server s2 with member identifier
199bb88e-4aaf-11e6-babe-28b2bd168d07
-
Server s3 with member identifier
1999b9fb-4aaf-11e6-bb54-28b2bd168d07
-
Server s4 with member identifier
19ab72fc-4aaf-11e6-bb51-28b2bd168d07
-
Server s5 with member identifier
19b33846-4aaf-11e6-ba81-28b2bd168d07
Initially the group is running fine and the servers are happily communicating with each other. You can verify this by logging into s1 and looking at its replication_group_members
performance schema table. For example:
mysql> SELECT MEMBER_ID,MEMBER_STATE, MEMBER_ROLE FROM performance_schema.replication_group_members;
+--------------------------------------+--------------+-------------+
| MEMBER_ID | MEMBER_STATE | MEMBER_ROLE |
+--------------------------------------+--------------+-------------+
| 1999b9fb-4aaf-11e6-bb54-28b2bd168d07 | ONLINE | SECONDARY |
| 199b2df7-4aaf-11e6-bb16-28b2bd168d07 | ONLINE | PRIMARY |
| 199bb88e-4aaf-11e6-babe-28b2bd168d07 | ONLINE | SECONDARY |
| 19ab72fc-4aaf-11e6-bb51-28b2bd168d07 | ONLINE | SECONDARY |
| 19b33846-4aaf-11e6-ba81-28b2bd168d07 | ONLINE | SECONDARY |
+--------------------------------------+--------------+-------------+
However, moments later there is a catastrophic failure and servers s3, s4 and s5 stop unexpectedly. A few seconds after this, looking again at the replication_group_members table on s1 shows that it is still online, but several other members are not. In fact, as seen below, they are marked as UNREACHABLE. Moreover, the system could not reconfigure itself to change the membership, because the majority has been lost.
mysql> SELECT MEMBER_ID,MEMBER_STATE FROM performance_schema.replication_group_members;
+--------------------------------------+--------------+
| MEMBER_ID | MEMBER_STATE |
+--------------------------------------+--------------+
| 1999b9fb-4aaf-11e6-bb54-28b2bd168d07 | UNREACHABLE |
| 199b2df7-4aaf-11e6-bb16-28b2bd168d07 | ONLINE |
| 199bb88e-4aaf-11e6-babe-28b2bd168d07 | ONLINE |
| 19ab72fc-4aaf-11e6-bb51-28b2bd168d07 | UNREACHABLE |
| 19b33846-4aaf-11e6-ba81-28b2bd168d07 | UNREACHABLE |
+--------------------------------------+--------------+
The table shows that s1 is now in a group that has no means of progressing without external intervention, because a majority of the servers are unreachable. In this particular case, the group membership list needs to be reset to allow the system to proceed, which is explained in this section. Alternatively, you could also choose to stop Group Replication on s1 and s2 (or stop s1 and s2 completely), figure out what happened with s3, s4 and s5, and then restart Group Replication (or the servers).
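To see at a glance whether the members that this server can still reach form a majority, the member states can be counted. This sketch uses only the replication_group_members table shown above. In the scenario described, it would report 2 members as ONLINE and 3 as UNREACHABLE, which is less than the 3 needed for a majority of 5.

```sql
-- Count members by state as seen by the local failure detector.
SELECT MEMBER_STATE, COUNT(*) AS members
FROM performance_schema.replication_group_members
GROUP BY MEMBER_STATE;
```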
Group Replication enables you to reset the group membership list by forcing a specific configuration. For instance, in the case above, where s1 and s2 are the only servers online, you could choose to force a membership configuration consisting of only s1 and s2. This requires checking some information about s1 and s2 and then using the group_replication_force_members variable.
Suppose that you are back in the situation where s1 and s2 are the only servers left in the group. Servers s3, s4 and s5 have left the group unexpectedly. To make servers s1 and s2 continue, you want to force a membership configuration that contains only s1 and s2.
This procedure uses group_replication_force_members
and should be considered a last resort remedy. It must be used with extreme care and only for overriding loss of quorum. If misused, it could create an artificial split-brain scenario or block the entire system altogether.
Recall that the system is blocked and the current configuration is the following (as perceived by the local failure detector on s1):
mysql> SELECT MEMBER_ID,MEMBER_STATE FROM performance_schema.replication_group_members;
+--------------------------------------+--------------+
| MEMBER_ID | MEMBER_STATE |
+--------------------------------------+--------------+
| 1999b9fb-4aaf-11e6-bb54-28b2bd168d07 | UNREACHABLE |
| 199b2df7-4aaf-11e6-bb16-28b2bd168d07 | ONLINE |
| 199bb88e-4aaf-11e6-babe-28b2bd168d07 | ONLINE |
| 19ab72fc-4aaf-11e6-bb51-28b2bd168d07 | UNREACHABLE |
| 19b33846-4aaf-11e6-ba81-28b2bd168d07 | UNREACHABLE |
+--------------------------------------+--------------+
The first thing to do is to check the local address (group communication identifier) of s1 and s2. Log in to s1 and s2 and get that information as follows.
mysql> SELECT @@group_replication_local_address;
Once you know the group communication addresses of s1 (127.0.0.1:10000) and s2 (127.0.0.1:10001), you can use them on one of the two servers to inject a new membership configuration, thus overriding the existing one that has lost quorum. To do that on s1:
mysql> SET GLOBAL group_replication_force_members="127.0.0.1:10000,127.0.0.1:10001";
This unblocks the group by forcing a different configuration. Check replication_group_members
on both s1 and s2 to verify the group membership after this change. First on s1.
mysql> SELECT MEMBER_ID,MEMBER_STATE FROM performance_schema.replication_group_members;
+--------------------------------------+--------------+
| MEMBER_ID | MEMBER_STATE |
+--------------------------------------+--------------+
| b5ffe505-4ab6-11e6-b04b-28b2bd168d07 | ONLINE |
| b60907e7-4ab6-11e6-afb7-28b2bd168d07 | ONLINE |
+--------------------------------------+--------------+
And then on s2.
mysql> SELECT MEMBER_ID,MEMBER_STATE FROM performance_schema.replication_group_members;
+--------------------------------------+--------------+
| MEMBER_ID | MEMBER_STATE |
+--------------------------------------+--------------+
| b5ffe505-4ab6-11e6-b04b-28b2bd168d07 | ONLINE |
| b60907e7-4ab6-11e6-afb7-28b2bd168d07 | ONLINE |
+--------------------------------------+--------------+
When forcing a new membership configuration, make sure that any servers that are going to be forced out of the group are indeed stopped. In the scenario depicted above, if s3, s4 and s5 are not really unreachable but instead are online, they may have formed their own functional partition (they are 3 out of 5, hence they have the majority). In that case, forcing a group membership list with s1 and s2 could create an artificial split-brain situation. Therefore, before forcing a new membership configuration, it is important to ensure that the servers to be excluded are indeed shut down, and if they are not, to shut them down before proceeding.
After you have used the group_replication_force_members
system variable to successfully force a new group membership and unblock the group, ensure that you clear the system variable. group_replication_force_members
must be empty in order to issue a START GROUP_REPLICATION
statement.
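A minimal sketch of the clean-up step described above, assuming it is run on the server where the membership was forced:

```sql
-- Clear the forced membership once the group is unblocked.
SET GLOBAL group_replication_force_members = '';

-- Confirm that the variable is empty.
SELECT @@GLOBAL.group_replication_force_members;
```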
This section explains how to secure a group, either by securing the connections between members of the group or by establishing a security perimeter using IP address whitelisting.
The Group Replication plugin has a configuration option to determine from which hosts an incoming Group Communication System connection can be accepted. This option is called group_replication_ip_whitelist
. If you set this option on a server s1, then when server s2 is establishing a connection to s1 for the purpose of engaging group communication, s1 first checks the whitelist before accepting the connection from s2. If s2 is in the whitelist, then s1 accepts the connection, otherwise s1 rejects the connection attempt by s2.
If you do not specify a whitelist explicitly, the group communication engine (XCom) automatically scans active interfaces on the host, and identifies those with addresses on private subnetworks. These addresses and the localhost
IP address for IPv4 are used to create an automatic Group Replication whitelist. The automatic whitelist therefore includes any IP addresses found for the host in the following ranges:
10/8 prefix (10.0.0.0 - 10.255.255.255) - Class A
172.16/12 prefix (172.16.0.0 - 172.31.255.255) - Class B
192.168/16 prefix (192.168.0.0 - 192.168.255.255) - Class C
127.0.0.1 - localhost for IPv4
An entry is added to the error log stating the addresses that have been whitelisted automatically for the host.
The automatic whitelist of private addresses cannot be used for connections from servers outside the private network, so a server, even if it has interfaces on public IPs, does not by default allow Group Replication connections from external hosts. For Group Replication connections between server instances that are on different machines, you must provide public IP addresses and specify these as an explicit whitelist. If you specify any entries for the whitelist, the private and localhost
addresses are not added automatically, so if you use any of these, you must specify them explicitly.
To specify a whitelist manually, use the group_replication_ip_whitelist
option. You cannot change the whitelist on a server while it is an active member of a replication group. If the member is active, you must issue a STOP GROUP_REPLICATION
statement before changing the whitelist, and a START GROUP_REPLICATION
statement afterwards.
In the whitelist, you can specify any combination of the following:
-
IPv4 addresses (for example, 198.51.100.44)
-
IPv4 addresses with CIDR notation (for example, 192.0.2.21/24)
-
Host names, from MySQL 5.7.21 (for example, example.org)
-
Host names with CIDR notation, from MySQL 5.7.21 (for example, www.example.com/24)
IPv6 addresses, and host names that resolve to IPv6 addresses, are not supported in MySQL 5.7. You can use CIDR notation in combination with host names or IP addresses to whitelist a block of IP addresses with a particular network prefix, but do ensure that all the IP addresses in the specified subnet are under your control.
You must stop and restart Group Replication on a member in order to change its whitelist. A comma must separate each entry in the whitelist. For example:
mysql> STOP GROUP_REPLICATION;
mysql> SET GLOBAL group_replication_ip_whitelist="192.0.2.21/24,198.51.100.44,203.0.113.0/24,example.org,www.example.com/24";
mysql> START GROUP_REPLICATION;
The whitelist must contain the IP address or host name that is specified in each member's group_replication_local_address
system variable. This address is not the same as the MySQL server SQL protocol host and port, and is not specified in the bind_address
system variable for the server instance.
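One way to review this requirement is to display both settings side by side on each member, then confirm that the host part of every member's local address is covered by the whitelist of every other member. The comparison itself is manual in this sketch.

```sql
-- Run on each member and compare the results across the group.
SELECT @@GLOBAL.group_replication_local_address,
       @@GLOBAL.group_replication_ip_whitelist;
```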
When a replication group is reconfigured (for example, when a new primary is elected or a member joins or leaves), the group members re-establish connections between themselves. If a group member is only whitelisted by servers that are no longer part of the replication group after the reconfiguration, it is unable to reconnect to the remaining servers in the replication group that do not whitelist it. To avoid this scenario entirely, specify the same whitelist for all servers that are members of the replication group.
It is possible to configure different whitelists on different group members according to your security requirements, for example, in order to keep different subnets separate. If you need to configure different whitelists to meet your security requirements, ensure that there is sufficient overlap between the whitelists in the replication group to maximize the possibility of servers being able to reconnect in the absence of their original seed member.
For host names, name resolution takes place only when a connection request is made by another server. A host name that cannot be resolved is not considered for whitelist validation, and a warning message is written to the error log. Forward-confirmed reverse DNS (FCrDNS) verification is carried out for resolved host names.
Host names are inherently less secure than IP addresses in a whitelist. FCrDNS verification provides a good level of protection, but can be compromised by certain types of attack. Specify host names in your whitelist only when strictly necessary, and ensure that all components used for name resolution, such as DNS servers, are maintained under your control. You can also implement name resolution locally using the hosts file, to avoid the use of external components.
Group communication connections, as well as recovery connections, are secured using SSL. The following sections explain how to configure connections.
Recovery is performed through a regular asynchronous replication connection. Once the donor is selected, the server joining the group establishes an asynchronous replication connection. This is all automatic.
However, a user that requires an SSL connection must have been created before the server joining the group connects to the donor. Typically, this is set up at the time one is provisioning a server to join the group.
donor> SET SQL_LOG_BIN=0;
donor> CREATE USER 'rec_ssl_user'@'%' REQUIRE SSL;
donor> GRANT REPLICATION SLAVE ON *.* TO 'rec_ssl_user'@'%';
donor> SET SQL_LOG_BIN=1;
Assuming that all servers already in the group have a replication user set up to use SSL, you configure the server joining the group to use those credentials when connecting to the donor. That is done according to the values of the SSL options provided for the Group Replication plugin.
new_member> SET GLOBAL group_replication_recovery_use_ssl=1;
new_member> SET GLOBAL group_replication_recovery_ssl_ca= '.../cacert.pem';
new_member> SET GLOBAL group_replication_recovery_ssl_cert= '.../client-cert.pem';
new_member> SET GLOBAL group_replication_recovery_ssl_key= '.../client-key.pem';
You also configure the recovery channel to use the credentials of the user that requires an SSL connection.
new_member> CHANGE MASTER TO MASTER_USER="rec_ssl_user" FOR CHANNEL "group_replication_recovery";
new_member> START GROUP_REPLICATION;
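The setup above can be double-checked with queries such as the following. This is a sketch that assumes the rec_ssl_user account from the example and relies on the ssl_type column of the mysql.user grant table, where ANY corresponds to REQUIRE SSL.

```sql
-- On a donor: confirm that the recovery user requires SSL.
SELECT user, host, ssl_type
FROM mysql.user
WHERE user = 'rec_ssl_user';

-- On the joining member: review the recovery SSL settings.
SHOW GLOBAL VARIABLES LIKE 'group_replication_recovery%ssl%';
```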
Secure sockets can be used to establish communication between members in a group. The configuration for this depends on the server's SSL configuration. As such, if the server has SSL configured, the Group Replication plugin also has SSL configured. For more information on the options for configuring the server SSL, see Command Options for Encrypted Connections. The options which configure Group Replication are shown in the following table.
Table 17.2 SSL Options
Server Configuration | Plugin Configuration Description |
---|---|
ssl_key | Path of key file. To be used as client and server certificate. |
ssl_cert | Path of certificate file. To be used as client and server certificate. |
ssl_ca | Path of file with SSL Certificate Authorities that are trusted. |
ssl_capath | Path of directory containing certificates for SSL Certificate Authorities that are trusted. |
ssl_crl | Path of file containing the certificate revocation lists. |
ssl_crlpath | Path of directory containing revoked certificate lists. |
ssl_cipher | Permitted ciphers to use while encrypting data over the connection. |
tls_version | Secure communication will use this version and its protocols. |
These options are MySQL Server configuration options which Group Replication relies on for its configuration. In addition there is the following Group Replication specific option to configure SSL on the plugin itself.
-
group_replication_ssl_mode
- specifies the security state of the connection between Group Replication members.
Table 17.3 group_replication_ssl_mode configuration values
Value | Description |
---|---|
DISABLED | Establish an unencrypted connection (default). |
REQUIRED | Establish a secure connection if the server supports secure connections. |
VERIFY_CA | Like REQUIRED, but additionally verify the server TLS certificate against the configured Certificate Authority (CA) certificates. |
VERIFY_IDENTITY | Like VERIFY_CA, but additionally verify that the server certificate matches the host to which the connection is attempted. |
The following example shows a my.cnf file section used to configure SSL on a server and how to activate it for Group Replication.
[mysqld]
ssl_ca = "cacert.pem"
ssl_capath = "/.../ca_directory"
ssl_cert = "server-cert.pem"
ssl_cipher = "DHE-RSA-AES256-SHA"
ssl_crl = "crl-server-revoked.crl"
ssl_crlpath = "/.../crl_directory"
ssl_key = "server-key.pem"
group_replication_ssl_mode= REQUIRED
The only plugin specific configuration option that is listed is group_replication_ssl_mode
. This option activates the SSL communication between members of the group, by configuring the SSL framework with the ssl_*
parameters that are provided to the server.
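As an alternative to editing my.cnf first, the mode can be switched on a running member as sketched below. This assumes that the server's ssl_* options are already configured and that Group Replication has to be restarted on the member for the new mode to take effect. A value set with SET GLOBAL is not written to my.cnf, so it should be added there as well.

```sql
-- Sketch: enable SSL for group communication on one member.
STOP GROUP_REPLICATION;
SET GLOBAL group_replication_ssl_mode = 'REQUIRED';
START GROUP_REPLICATION;

-- Confirm the mode now in effect.
SELECT @@GLOBAL.group_replication_ssl_mode;
```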
This section lists the system variables that are specific to the Group Replication plugin. Every configuration option is prefixed with "group_replication
".
Most system variables for Group Replication are described as dynamic, and their values can be changed while the server is running. However, in most cases, the change only takes effect after you stop and restart Group Replication on the group member using a STOP GROUP_REPLICATION
statement followed by a START GROUP_REPLICATION
statement. Changes to the following system variables take effect without stopping and restarting Group Replication:
Most system variables for Group Replication can have different values on different group members. For the following system variables, it is advisable to set the same value on all members of a group in order to avoid unnecessary rollback of transactions, failure of message delivery, or failure of message recovery:
Some system variables on a Group Replication group member, including some Group Replication-specific system variables and some general system variables, are group-wide configuration settings. These system variables must have the same value on all group members, cannot be changed while Group Replication is running, and require a full reboot of the group (a bootstrap by a server with group_replication_bootstrap_group=ON
) in order for the value change to take effect. These conditions apply to the following system variables:
-
A number of system variables for Group Replication are not completely validated during server startup if they are passed as command line arguments to the server. These system variables include group_replication_group_name, group_replication_single_primary_mode, group_replication_force_members, the SSL variables, and the flow control system variables. They are only fully validated after the server has started.
-
System variables for Group Replication that specify IP addresses or host names for group members are not validated until a START GROUP_REPLICATION statement is issued. Group Replication's Group Communication System (GCS) is not available to validate the values until that point.
The system variables that are specific to the Group Replication plugin are as follows:
-
group_replication_allow_local_disjoint_gtids_join
Property | Value |
---|---|
Command-Line Format | --group-replication-allow-local-disjoint-gtids-join[={OFF|ON}] |
Introduced | 5.7.17 |
Deprecated | 5.7.21 |
System Variable | group_replication_allow_local_disjoint_gtids_join |
Scope | Global |
Dynamic | Yes |
Type | Boolean |
Default Value | OFF |
Deprecated in version 5.7.21 and scheduled for removal in a future version. Allows the server to join the group even if it has local transactions that are not present in the group.
Warning
Use caution when enabling this option as incorrect usage can lead to conflicts in the group and rollback of transactions. The option should only be enabled as a last resort method to allow a server that has local transactions to join an existing group, and then only if the local transactions do not affect the data that is handled by the group (for example, an administrative action that was written to the binary log). The option should not be left enabled on all group members.
-
group_replication_allow_local_lower_version_join
Property | Value |
---|---|
Command-Line Format | --group-replication-allow-local-lower-version-join[={OFF|ON}] |
Introduced | 5.7.17 |
System Variable | group_replication_allow_local_lower_version_join |
Scope | Global |
Dynamic | Yes |
Type | Boolean |
Default Value | OFF |
Allows the current server to join the group even if it has a lower major version than the group. With the default setting OFF, servers are not permitted to join a replication group if they have a lower major version than the existing group members. For example, a MySQL 5.7 server cannot join a group that consists of MySQL 8.0 servers. This standard policy ensures that all members of a group are able to exchange messages and apply transactions. Set group_replication_allow_local_lower_version_join to ON only in the following scenarios:
-
A server must be added to the group in an emergency in order to improve the group's fault tolerance, and only older versions are available.
-
You want to carry out a downgrade of the replication group members without shutting down the whole group and bootstrapping it again.
Warning
Setting this option to ON does not make the new member compatible with the group, and allows it to join the group without any safeguards against incompatible behaviors by the existing members. To ensure the new member's correct operation, take both of the following precautions:
-
Before the server with the lower major version joins the group, stop all writes on that server.
-
From the point where the server with the lower major version joins the group, stop all writes on the other servers in the group.
Without these precautions, the server with the lower major version is likely to experience difficulties and terminate with an error.
-
group_replication_auto_increment_increment
Property Value Command-Line Format --group-replication-auto-increment-increment=#
Introduced 5.7.17 System Variable group_replication_auto_increment_increment
Scope Global Dynamic Yes Type Integer Default Value 7
Minimum Value 1
Maximum Value 65535
Determines the interval between successive auto-increment column values for transactions that execute on this server instance. This system variable should have the same value on all group members. When Group Replication is started on a server, the value of the server system variable auto_increment_increment is changed to this value, and the value of the server system variable auto_increment_offset is changed to the server ID. These settings avoid the selection of duplicate auto-increment values for writes on group members, which would otherwise cause transactions to be rolled back. The changes are reverted when Group Replication is stopped. These changes are only made and reverted if auto_increment_increment and auto_increment_offset each have their default value of 1. If their values have already been modified from the default, Group Replication does not alter them. From MySQL 8.0, the system variables are also not modified when Group Replication is in single-primary mode, where only one server writes.

The default value of 7 represents a balance between the number of usable values and the permitted maximum size of a replication group (9 members). If your group has more or fewer members, you can set this system variable to match the expected number of group members before Group Replication is started. You cannot change the setting while Group Replication is running.
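As a minimal sketch of the behavior described above, the following statements set the interval for a three-member group and then inspect the values in effect. The value 3 is only an illustration, and the sketch assumes that Group Replication is currently stopped on the member and that the rest of its Group Replication configuration is already in place.

```sql
-- Assumption: Group Replication is stopped on this member and otherwise configured.
-- Match the interval to the expected number of group members (3 is illustrative).
SET GLOBAL group_replication_auto_increment_increment = 3;

START GROUP_REPLICATION;

-- If auto_increment_increment and auto_increment_offset both had their default
-- value of 1, they should now reflect the group setting and the server ID.
SELECT @@GLOBAL.auto_increment_increment AS increment_in_use,
       @@GLOBAL.auto_increment_offset    AS offset_in_use,
       @@GLOBAL.server_id                AS server_id;
```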
-
group_replication_bootstrap_group
Property Value Command-Line Format --group-replication-bootstrap-group[={OFF|ON}]
Introduced 5.7.17 System Variable group_replication_bootstrap_group
Scope Global Dynamic Yes Type Boolean Default Value OFF
Configure this server to bootstrap the group. This option must only be set on one server and only when starting the group for the first time or restarting the entire group. After the group has been bootstrapped, set this option to OFF. It should be set to OFF both dynamically and in the configuration files. Starting two servers or restarting one server with this option set while the group is running may lead to an artificial split brain situation, where two independent groups with the same name are bootstrapped.
-
group_replication_components_stop_timeout
Property Value Command-Line Format --group-replication-components-stop-timeout=#
Introduced 5.7.17 System Variable group_replication_components_stop_timeout
Scope Global Dynamic Yes Type Integer Default Value 31536000
Minimum Value 2
Maximum Value 31536000
Timeout, in seconds, that Group Replication waits for each of the components when shutting down.
-
group_replication_compression_threshold
Property Value Command-Line Format --group-replication-compression-threshold=#
Introduced 5.7.17 System Variable group_replication_compression_threshold
Scope Global Dynamic Yes Type Integer Default Value 1000000
Minimum Value 0
Maximum Value 4294967295
-
The threshold value in bytes above which compression is applied to messages sent between group members. If this system variable is set to zero, compression is disabled. The value of group_replication_compression_threshold should be the same on all group members.

Group Replication uses the LZ4 compression algorithm to compress messages sent in the group. Note that the maximum supported input size for the LZ4 compression algorithm is 2113929216 bytes. This limit is lower than the maximum possible value for the group_replication_compression_threshold system variable, which is matched to the maximum message size accepted by XCom. With the LZ4 compression algorithm, do not set a value greater than 2113929216 bytes for group_replication_compression_threshold, because transactions above this size cannot be committed when message compression is enabled.

For more information, see Section 17.9.7.2, “Message Compression”.
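The following sketch shows how the threshold might be adjusted or compression disabled. The 2 MiB value is an arbitrary illustration, and the sketch assumes the statements are run before Group Replication is started on the member and repeated on every group member so that the values match.

```sql
-- Compress only messages larger than 2 MiB (illustrative value, well below
-- the LZ4 input limit of 2113929216 bytes mentioned above).
SET GLOBAL group_replication_compression_threshold = 2097152;

-- Alternatively, disable message compression entirely:
-- SET GLOBAL group_replication_compression_threshold = 0;

SELECT @@GLOBAL.group_replication_compression_threshold AS compression_threshold;
```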
-
group_replication_enforce_update_everywhere_checks
Property Value Command-Line Format --group-replication-enforce-update-everywhere-checks[={OFF|ON}]
Introduced 5.7.17 System Variable group_replication_enforce_update_everywhere_checks
Scope Global Dynamic Yes Type Boolean Default Value OFF
Enable or disable strict consistency checks for multi-primary update everywhere. The default is that checks are disabled. In single-primary mode, this option must be disabled on all group members. In multi-primary mode, when this option is enabled, statements are checked as follows to ensure they are compatible with multi-primary mode:
-
If a transaction is executed under the
SERIALIZABLE
isolation level, then its commit fails when synchronizing itself with the group. -
If a transaction executes against a table that has foreign keys with cascading constraints, then the transaction fails to commit when synchronizing itself with the group.
This system variable is a group-wide configuration setting. It must have the same value on all group members, cannot be changed while Group Replication is running, and requires a full reboot of the group (a bootstrap by a server with
group_replication_bootstrap_group=ON
) in order for the value change to take effect. -
-
group_replication_exit_state_action
Property Value Command-Line Format --group-replication-exit-state-action=value
Introduced 5.7.24 System Variable group_replication_exit_state_action
Scope Global Dynamic Yes Type Enumeration Default Value READ_ONLY
Valid Values ABORT_SERVER
READ_ONLY
Configures how Group Replication behaves when a server instance leaves the group unintentionally, for example after encountering an applier error, or in the case of a loss of majority, or when another member of the group expels it due to a suspicion timing out. The timeout period for a member to leave the group in the case of a loss of majority is set by the
group_replication_unreachable_majority_timeout
system variable. Note that an expelled group member does not know that it was expelled until it reconnects to the group, so the specified action is only taken if the member manages to reconnect, or if the member raises a suspicion on itself and expels itself. -
When group_replication_exit_state_action is set to ABORT_SERVER, if the member exits the group unintentionally, the instance shuts down MySQL.

When group_replication_exit_state_action is set to READ_ONLY, if the member exits the group unintentionally, the instance switches MySQL to super read only mode (by setting the system variable super_read_only to ON). This setting is the default in MySQL 5.7.

Important: If a failure occurs before the member has successfully joined the group, the specified exit action is not taken. This is the case if there is a failure during the local configuration check, or a mismatch between the configuration of the joining member and the configuration of the group. In these situations, the super_read_only system variable is left with its original value, and the server does not shut down MySQL. To ensure that the server cannot accept updates when Group Replication did not start, we therefore recommend that super_read_only=ON is set in the server's configuration file at startup, which Group Replication will change to OFF on primary members after it has been started successfully. This safeguard is particularly important when the server is configured to start Group Replication on server boot (group_replication_start_on_boot=ON), but it is also useful when Group Replication is started manually using a START GROUP_REPLICATION command.

If a failure occurs after the member has successfully joined the group, the specified exit action is taken. This is the case if there is an applier error, if the member is expelled from the group, or if the member is set to time out in the event of an unreachable majority. In these situations, if READ_ONLY is the exit action, the super_read_only system variable is set to ON, or if ABORT_SERVER is the exit action, the server shuts down MySQL.

Table 17.4 Exit actions in Group Replication failure situations

| Failure situation | Group Replication started with START GROUP_REPLICATION | Group Replication started with group_replication_start_on_boot=ON |
|---|---|---|
| Member fails local configuration check, OR mismatch between joining member and group configuration | super_read_only unchanged. MySQL continues running. Set super_read_only=ON at startup to prevent updates. | super_read_only unchanged. MySQL continues running. Set super_read_only=ON at startup to prevent updates (Important). |
| Applier error on member, OR member expelled from group, OR unreachable majority timeout | super_read_only set to ON, OR MySQL shuts down | super_read_only set to ON, OR MySQL shuts down |
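As a hedged illustration of the recommendation above for a manually started member, the sketch below puts the server in super read only mode before starting Group Replication, so that a failure before the member joins leaves it unable to accept updates. The choice of READ_ONLY as the exit action is only an example.

```sql
-- Block client updates before attempting to join, in case the join fails
-- before the exit action can apply.
SET GLOBAL super_read_only = ON;

-- Example choice of exit action; ABORT_SERVER is the other valid value.
SET GLOBAL group_replication_exit_state_action = 'READ_ONLY';

START GROUP_REPLICATION;

-- On a primary that joined successfully, Group Replication is expected to
-- have switched super_read_only back to OFF.
SELECT @@GLOBAL.super_read_only AS super_read_only,
       @@GLOBAL.group_replication_exit_state_action AS exit_action;
```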
-
group_replication_flow_control_applier_threshold
Property Value Command-Line Format --group-replication-flow-control-applier-threshold=#
Introduced 5.7.17 System Variable group_replication_flow_control_applier_threshold
Scope Global Dynamic Yes Type Integer Default Value 25000
Minimum Value 0
Maximum Value 2147483647
Specifies the number of waiting transactions in the applier queue that trigger flow control. This variable can be changed without resetting Group Replication.
-
group_replication_flow_control_certifier_threshold
Property Value Command-Line Format --group-replication-flow-control-certifier-threshold=#
Introduced 5.7.17 System Variable group_replication_flow_control_certifier_threshold
Scope Global Dynamic Yes Type Integer Default Value 25000
Minimum Value 0
Maximum Value 2147483647
Specifies the number of waiting transactions in the certifier queue that trigger flow control. This variable can be changed without resetting Group Replication.
-
group_replication_flow_control_hold_percent
Property Value Command-Line Format --group-replication-flow-control-hold-percent=#
System Variable group_replication_flow_control_hold_percent
Scope Global Dynamic Yes Type Integer Default Value 10
Minimum Value 0
Maximum Value 100
Defines what percentage of the group quota remains unused to allow a cluster under flow control to catch up on backlog. A value of 0 implies that no part of the quota is reserved for catching up on the work backlog.
-
group_replication_flow_control_max_commit_quota
Property Value Command-Line Format --group-replication-flow-control-max-commit-quota=#
System Variable group_replication_flow_control_max_commit_quota
Scope Global Dynamic Yes Type Integer Default Value 0
Minimum Value 0
Maximum Value 2147483647
Defines the maximum flow control quota of the group, or the maximum available quota for any period while flow control is enabled. A value of 0 implies that there is no maximum quota set. Cannot be smaller than group_replication_flow_control_min_quota and group_replication_flow_control_min_recovery_quota.
-
group_replication_flow_control_member_quota_percent
Property Value Command-Line Format --group-replication-flow-control-member-quota-percent=#
System Variable group_replication_flow_control_member_quota_percent
Scope Global Dynamic Yes Type Integer Default Value 0
Minimum Value 0
Maximum Value 100
Defines the percentage of the quota that a member should assume is available for itself when calculating the quotas. A value of 0 implies that the quota should be split equally between members that were writers in the last period.
-
group_replication_flow_control_min_quota
Property Value Command-Line Format --group-replication-flow-control-min-quota=#
System Variable group_replication_flow_control_min_quota
Scope Global Dynamic Yes Type Integer Default Value 0
Minimum Value 0
Maximum Value 2147483647
Controls the lowest flow control quota that can be assigned to a member, independently of the calculated minimum quota executed in the last period. A value of 0 implies that there is no minimum quota. Cannot be larger than
group_replication_flow_control_max_commit_quota
. -
group_replication_flow_control_min_recovery_quota
Property Value Command-Line Format --group-replication-flow-control-min-recovery-quota=#
System Variable group_replication_flow_control_min_recovery_quota
Scope Global Dynamic Yes Type Integer Default Value 0
Minimum Value 0
Maximum Value 2147483647
Controls the lowest quota that can be assigned to a member because of another recovering member in the group, independently of the calculated minimum quota executed in the last period. A value of 0 implies that there is no minimum quota. Cannot be larger than
group_replication_flow_control_max_commit_quota
. -
group_replication_flow_control_mode
Property Value Command-Line Format --group-replication-flow-control-mode=value
Introduced 5.7.17 System Variable group_replication_flow_control_mode
Scope Global Dynamic Yes Type Enumeration Default Value QUOTA
Valid Values DISABLED
QUOTA
Specifies the mode used for flow control. This variable can be changed without resetting Group Replication.
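The flow control settings described above can be changed without resetting Group Replication. The sketch below uses illustrative threshold values (they are assumptions, not recommendations) and then reads the certifier queue size from the Performance Schema member statistics table.

```sql
-- Illustrative thresholds: trigger flow control at 50000 waiting transactions.
SET GLOBAL group_replication_flow_control_applier_threshold   = 50000;
SET GLOBAL group_replication_flow_control_certifier_threshold = 50000;

-- Flow control can be switched between QUOTA (the default) and DISABLED.
SET GLOBAL group_replication_flow_control_mode = 'QUOTA';

-- Transactions currently waiting for conflict detection checks on this member.
SELECT MEMBER_ID, COUNT_TRANSACTIONS_IN_QUEUE
FROM performance_schema.replication_group_member_stats;
```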
-
group_replication_force_members
Property Value Command-Line Format --group-replication-force-members=value
Introduced 5.7.17 System Variable group_replication_force_members
Scope Global Dynamic Yes Type String
A list of peer addresses as a comma separated list such as host1:port1,host2:port2. This option is used to force a new group membership, in which the excluded members do not receive a new view and are blocked. (You need to manually kill the excluded servers.) Any invalid host names in the list could cause this action to fail because they could block group membership. For a description of the procedure to follow, see Section 17.4.3, “Network Partitioning”.

You must specify the address or host name and port as they are given in the group_replication_local_address option for each member. For example:

"198.51.100.44:33061,example.org:33061"

After you have used the group_replication_force_members system variable to successfully force a new group membership and unblock the group, ensure that you clear the system variable. group_replication_force_members must be empty in order to issue a START GROUP_REPLICATION statement.
-
group_replication_group_name
Property Value Command-Line Format --group-replication-group-name=value
Introduced 5.7.17 System Variable group_replication_group_name
Scope Global Dynamic Yes Type String
The name of the group which this server instance belongs to. Must be a valid UUID. This UUID is used internally when setting GTIDs for Group Replication events in the binary log.
Important: A unique UUID must be used.
-
group_replication_group_seeds
Property Value Command-Line Format --group-replication-group-seeds=value
Introduced 5.7.17 System Variable group_replication_group_seeds
Scope Global Dynamic Yes Type String
A list of group members that provide a joining member with the data it requires to gain synchrony with the group. The list consists of the seed members' network addresses specified as a comma separated list, such as host1:port1,host2:port2.

Important: These addresses must not be the member's SQL host name and port.

Note that the value you specify for this variable is not validated until a START GROUP_REPLICATION statement is issued and the Group Communication System (GCS) is available.

Usually this list consists of all members of the group, but you can choose a subset of the group members to be seeds. The list must contain at least one valid member address. Each address is validated when starting Group Replication. If the list does not contain any valid host names, issuing START GROUP_REPLICATION fails.
-
group_replication_gtid_assignment_block_size
Property Value Command-Line Format --group-replication-gtid-assignment-block-size=#
Introduced 5.7.17 System Variable group_replication_gtid_assignment_block_size
Scope Global Dynamic Yes Type Integer Default Value 1000000
Minimum Value 1
Maximum Value (64-bit platforms) 9223372036854775807
Maximum Value (32-bit platforms) 4294967295
The number of consecutive GTIDs that are reserved for each member. Each member consumes its blocks and reserves more when needed.
This system variable is a group-wide configuration setting. It must have the same value on all group members, cannot be changed while Group Replication is running, and requires a full reboot of the group (a bootstrap by a server with
group_replication_bootstrap_group=ON
) in order for the value change to take effect. -
group_replication_ip_whitelist
Property Value Command-Line Format --group-replication-ip-whitelist=value
Introduced 5.7.17 System Variable group_replication_ip_whitelist
Scope Global Dynamic Yes Type String Default Value AUTOMATIC
Specifies which hosts are permitted to connect to the group. The address that you specify for each group member in group_replication_local_address must be whitelisted on the other servers in the replication group. Note that the value you specify for this variable is not validated until a START GROUP_REPLICATION statement is issued and the Group Communication System (GCS) is available.

By default, this system variable is set to AUTOMATIC, which permits connections from private subnetworks active on the host. The group communication engine (XCom) automatically scans active interfaces on the host, and identifies those with addresses on private subnetworks. These addresses and the localhost IP address for IPv4 are used to create the Group Replication whitelist. For a list of the ranges from which addresses are automatically whitelisted, see Section 17.5.1, “Group Replication IP Address Whitelisting”.

The automatic whitelist of private addresses cannot be used for connections from servers outside the private network. For Group Replication connections between server instances that are on different machines, you must provide public IP addresses and specify these as an explicit whitelist. If you specify any entries for the whitelist, the private and localhost addresses are not added automatically, so if you use any of these, you must specify them explicitly.

As the value of the group_replication_ip_whitelist option, you can specify any combination of the following:
-
IPv4 addresses (for example,
198.51.100.44
) -
IPv4 addresses with CIDR notation (for example,
192.0.2.21/24
) -
Host names, from MySQL 5.7.21 (for example,
example.org
) -
Host names with CIDR notation, from MySQL 5.7.21 (for example,
www.example.com/24
)
IPv6 addresses, and host names that resolve to IPv6 addresses, are not supported in MySQL 5.7. You can use CIDR notation in combination with host names or IP addresses to whitelist a block of IP addresses with a particular network prefix, but do ensure that all the IP addresses in the specified subnet are under your control.
A comma must separate each entry in the whitelist. For example:
192.0.2.22,198.51.100.0/24,example.org,www.example.com/24
It is possible to configure different whitelists on different group members according to your security requirements, for example, in order to keep different subnets separate. However, this can cause issues when a group is reconfigured. If you do not have a specific security requirement to do otherwise, use the same whitelist on all members of a group. For more details, see Section 17.5.1, “Group Replication IP Address Whitelisting”.
For host names, name resolution takes place only when a connection request is made by another server. A host name that cannot be resolved is not considered for whitelist validation, and a warning message is written to the error log. Forward-confirmed reverse DNS (FCrDNS) verification is carried out for resolved host names.
Warning: Host names are inherently less secure than IP addresses in a whitelist. FCrDNS verification provides a good level of protection, but can be compromised by certain types of attack. Specify host names in your whitelist only when strictly necessary, and ensure that all components used for name resolution, such as DNS servers, are maintained under your control. You can also implement name resolution locally using the hosts file, to avoid the use of external components.
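As a sketch of how an explicit whitelist might be applied to a running member, the statements below reuse the example entries given above. Because the value is only validated when START GROUP_REPLICATION is issued, the sketch stops and restarts Group Replication around the change; whether that is acceptable for a given member is an operational assumption.

```sql
STOP GROUP_REPLICATION;

-- Entries are the illustrative addresses from the example above.
-- Private and localhost addresses are not added automatically once an
-- explicit list is given, so include them if they are needed.
SET GLOBAL group_replication_ip_whitelist =
    '192.0.2.22,198.51.100.0/24,example.org,www.example.com/24';

START GROUP_REPLICATION;
```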
-
-
group_replication_local_address
Property Value Command-Line Format --group-replication-local-address=value
Introduced 5.7.17 System Variable group_replication_local_address
Scope Global Dynamic Yes Type String
The network address which the member provides for connections from other members, specified as a host:port formatted string. This address must be reachable by all members of the group because it is used by the group communication engine for Group Replication (XCom, a Paxos variant) for TCP communication between remote XCom instances. Communication with the local instance is over an input channel using shared memory.

Warning: Do not use this address for communication with the member.

Other Group Replication members contact this member through this host:port for all internal group communication. This is not the MySQL server SQL protocol host and port.

The address or host name that you specify in group_replication_local_address is used by Group Replication as the unique identifier for a group member within the replication group. You can use the same port for all members of a replication group as long as the host names or IP addresses are all different, and you can use the same host name or IP address for all members as long as the ports are all different. The recommended port for group_replication_local_address is 33061. Note that the value you specify for this variable is not validated until the START GROUP_REPLICATION statement is issued and the Group Communication System (GCS) is available.
-
group_replication_member_weight
Property Value Command-Line Format --group-replication-member-weight=#
Introduced 5.7.20 System Variable group_replication_member_weight
Scope Global Dynamic Yes Type Integer Default Value 50
Minimum Value 0
Maximum Value 100
A percentage weight that can be assigned to members to influence the chance of the member being elected as primary in the event of failover, for example when the existing primary leaves a single-primary group. Assign numeric weights to members to ensure that specific members are elected, for example during scheduled maintenance of the primary or to ensure certain hardware is prioritized in the event of failover.
For a group with members configured as follows:
-
member-1
: group_replication_member_weight=30, server_uuid=aaaa -
member-2
: group_replication_member_weight=40, server_uuid=bbbb -
member-3
: group_replication_member_weight=40, server_uuid=cccc -
member-4
: group_replication_member_weight=40, server_uuid=dddd
during election of a new primary the members above would be sorted as member-2, member-3, member-4, and member-1 (by descending weight, with members of equal weight ordered lexically by server_uuid). This results in member-2 being chosen as the new primary in the event of failover. For more information, see Section 17.4.1.1, “Single-Primary Mode”.
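A minimal sketch of adjusting a member's weight ahead of planned maintenance is shown below. The value 70 is an arbitrary illustration; any value from 0 to 100 is accepted.

```sql
-- Raise this member's weight so that it is preferred over members that
-- keep the default weight of 50 at the next primary election.
SET GLOBAL group_replication_member_weight = 70;

-- The server UUID is shown because it breaks ties between equal weights.
SELECT @@GLOBAL.group_replication_member_weight AS member_weight,
       @@GLOBAL.server_uuid AS server_uuid;
```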
-
group_replication_poll_spin_loops
Property Value Command-Line Format --group-replication-poll-spin-loops=#
Introduced 5.7.17 System Variable group_replication_poll_spin_loops
Scope Global Dynamic Yes Type Integer Default Value 0
Minimum Value 0
Maximum Value (64-bit platforms) 18446744073709551615
Maximum Value (32-bit platforms) 4294967295
The number of times the group communication thread waits for the communication engine mutex to be released before the thread waits for more incoming network messages.
-
group_replication_recovery_retry_count
Property Value Command-Line Format --group-replication-recovery-retry-count=#
Introduced 5.7.17 System Variable group_replication_recovery_retry_count
Scope Global Dynamic Yes Type Integer Default Value 10
Minimum Value 0
Maximum Value 31536000
The number of times that the member that is joining tries to connect to the available donors before giving up.
-
group_replication_recovery_reconnect_interval
Property Value Command-Line Format --group-replication-recovery-reconnect-interval=#
Introduced 5.7.17 System Variable group_replication_recovery_reconnect_interval
Scope Global Dynamic Yes Type Integer Default Value 60
Minimum Value 0
Maximum Value 31536000
The sleep time, in seconds, between reconnection attempts when no donor was found in the group.
-
group_replication_recovery_use_ssl
Property Value Command-Line Format --group-replication-recovery-use-ssl[={OFF|ON}]
Introduced 5.7.17 System Variable group_replication_recovery_use_ssl
Scope Global Dynamic Yes Type Boolean Default Value OFF
Whether Group Replication recovery connection should use SSL or not.
-
group_replication_recovery_ssl_ca
Property Value Command-Line Format --group-replication-recovery-ssl-ca=value
Introduced 5.7.17 System Variable group_replication_recovery_ssl_ca
Scope Global Dynamic Yes Type String
The path to a file that contains a list of trusted SSL certificate authorities.
-
group_replication_recovery_ssl_capath
Property Value Command-Line Format --group-replication-recovery-ssl-capath=value
Introduced 5.7.17 System Variable group_replication_recovery_ssl_capath
Scope Global Dynamic Yes Type String
The path to a directory that contains trusted SSL certificate authority certificates.
-
group_replication_recovery_ssl_cert
Property Value Command-Line Format --group-replication-recovery-ssl-cert=value
Introduced 5.7.17 System Variable group_replication_recovery_ssl_cert
Scope Global Dynamic Yes Type String
The name of the SSL certificate file to use for establishing a secure connection.
-
group_replication_recovery_ssl_key
Property Value Command-Line Format --group-replication-recovery-ssl-key=value
Introduced 5.7.17 System Variable group_replication_recovery_ssl_key
Scope Global Dynamic Yes Type String
The name of the SSL key file to use for establishing a secure connection.
-
group_replication_recovery_ssl_cipher
Property Value Command-Line Format --group-replication-recovery-ssl-cipher=value
Introduced 5.7.17 System Variable group_replication_recovery_ssl_cipher
Scope Global Dynamic Yes Type String
The list of permissible ciphers for SSL encryption.
-
group_replication_recovery_ssl_crl
Property Value Command-Line Format --group-replication-recovery-ssl-crl=value
Introduced 5.7.17 System Variable group_replication_recovery_ssl_crl
Scope Global Dynamic Yes Type File name
The path to a file that contains certificate revocation lists.
-
group_replication_recovery_ssl_crlpath
Property Value Command-Line Format --group-replication-recovery-ssl-crlpath=value
Introduced 5.7.17 System Variable group_replication_recovery_ssl_crlpath
Scope Global Dynamic Yes Type Directory name
The path to a directory that contains files containing certificate revocation lists.
-
group_replication_recovery_ssl_verify_server_cert
Property Value Command-Line Format --group-replication-recovery-ssl-verify-server-cert[={OFF|ON}]
Introduced 5.7.17 System Variable group_replication_recovery_ssl_verify_server_cert
Scope Global Dynamic Yes Type Boolean Default Value OFF
Makes the recovery process check the server's Common Name value in the certificate sent by the donor.
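The recovery SSL variables above are typically set together. The following is a sketch only: the certificate paths are hypothetical placeholders, and it assumes the statements are run before Group Replication is started on the joining member.

```sql
-- Hypothetical file locations; substitute the paths used on your server.
SET GLOBAL group_replication_recovery_use_ssl  = ON;
SET GLOBAL group_replication_recovery_ssl_ca   = '/etc/mysql/certs/ca.pem';
SET GLOBAL group_replication_recovery_ssl_cert = '/etc/mysql/certs/client-cert.pem';
SET GLOBAL group_replication_recovery_ssl_key  = '/etc/mysql/certs/client-key.pem';

-- Optionally check the Common Name in the donor's certificate as well.
SET GLOBAL group_replication_recovery_ssl_verify_server_cert = ON;
```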
-
group_replication_recovery_complete_at
Property Value Command-Line Format --group-replication-recovery-complete-at=value
Introduced 5.7.17 System Variable group_replication_recovery_complete_at
Scope Global Dynamic Yes Type Enumeration Default Value TRANSACTIONS_APPLIED
Valid Values TRANSACTIONS_CERTIFIED
TRANSACTIONS_APPLIED
Recovery policies when handling cached transactions after state transfer. This option specifies whether a member is marked online after it has received all transactions that it missed before it joined the group (
TRANSACTIONS_CERTIFIED
) or after it has received and applied them (TRANSACTIONS_APPLIED
). -
group_replication_single_primary_mode
Property Value Command-Line Format --group-replication-single-primary-mode[={OFF|ON}]
Introduced 5.7.17 System Variable group_replication_single_primary_mode
Scope Global Dynamic Yes Type Boolean Default Value ON
Instructs the group to automatically pick a single server to be the one that handles read/write workload. This server is the PRIMARY and all others are SECONDARIES.
This system variable is a group-wide configuration setting. It must have the same value on all group members, cannot be changed while Group Replication is running, and requires a full reboot of the group (a bootstrap by a server with
group_replication_bootstrap_group=ON
) in order for the value change to take effect. -
group_replication_ssl_mode
Property Value Command-Line Format --group-replication-ssl-mode=value
Introduced 5.7.17 System Variable group_replication_ssl_mode
Scope Global Dynamic Yes Type Enumeration Default Value DISABLED
Valid Values DISABLED
REQUIRED
VERIFY_CA
VERIFY_IDENTITY
Specifies the security state of the connection between Group Replication members.
-
group_replication_start_on_boot
Property Value Command-Line Format --group-replication-start-on-boot[={OFF|ON}]
Introduced 5.7.17 System Variable group_replication_start_on_boot
Scope Global Dynamic Yes Type Boolean Default Value ON
Whether the server should start Group Replication or not during server start.
-
group_replication_transaction_size_limit
Property Value Command-Line Format --group-replication-transaction-size-limit=#
Introduced 5.7.19 System Variable group_replication_transaction_size_limit
Scope Global Dynamic Yes Type Integer Default Value 0
Minimum Value 0
Maximum Value 2147483647
Configures the maximum transaction size in bytes which the replication group accepts. Transactions larger than this size are rolled back by the receiving member and are not broadcast to the group. Large transactions can cause problems for a replication group in terms of memory allocation, which can cause the system to slow down, or in terms of network bandwidth consumption, which can cause a member to be suspected of having failed because it is busy processing the large transaction.
When this system variable is set to 0, which is the default in MySQL 5.7, there is no limit to the size of transactions the group accepts. From MySQL 8.0, the default setting for this system variable is 150000000 bytes (approximately 143 MB). Adjust the value of this system variable depending on the maximum message size that you need the group to tolerate, bearing in mind that the time taken to process a transaction is proportional to its size. The value of
group_replication_transaction_size_limit
should be the same on all group members. For further mitigation strategies for large transactions, see Section 17.7.2, “Group Replication Limitations”. -
group_replication_unreachable_majority_timeout
Property Value Command-Line Format --group-replication-unreachable-majority-timeout=#
Introduced 5.7.19 System Variable group_replication_unreachable_majority_timeout
Scope Global Dynamic Yes Type Integer Default Value 0
Minimum Value 0
Maximum Value 31536000
Configures how long members that suffer a network partition and cannot connect to the majority wait before leaving the group.
In a group of 5 servers (S1,S2,S3,S4,S5), if there is a disconnection between (S1,S2) and (S3,S4,S5) there is a network partition. The first group (S1,S2) is now in a minority because it cannot contact more than half of the group. While the majority group (S3,S4,S5) remains running, the minority group waits for the specified time for a network reconnection. Any transactions processed by the minority group are blocked until Group Replication is stopped using STOP GROUP_REPLICATION on the members of the minority. Note that group_replication_unreachable_majority_timeout has no effect if it is set on the servers in the minority group after the loss of majority has been detected.

By default, this system variable is set to 0, which means that members that find themselves in a minority due to a network partition wait indefinitely and do not leave the group automatically. If configured to a number of seconds, members wait for this amount of time after losing contact with the majority of members before leaving the group. When the specified time elapses, all pending transactions processed by the minority are rolled back, and the servers in the minority partition move to the ERROR state. These servers then follow the action specified by the system variable group_replication_exit_state_action, which can be to set themselves to super read only mode or shut down MySQL.

Warning: When you have a symmetric group, with just two members for example (S0,S2), if there is a network partition and there is no majority, after the configured timeout all members enter ERROR state.
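As an illustration of the timeout described above, the sketch below sets a 30 second limit (an arbitrary example value) and then lists member states as seen from the local server. The setting must be in place before a loss of majority is detected for it to have any effect.

```sql
-- Leave the group 30 seconds after losing contact with the majority
-- (illustrative value; 0 means wait indefinitely).
SET GLOBAL group_replication_unreachable_majority_timeout = 30;

-- Member states as seen by this server, for example ONLINE, UNREACHABLE or ERROR.
SELECT MEMBER_HOST, MEMBER_PORT, MEMBER_STATE
FROM performance_schema.replication_group_members;
```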
This section describes the status variables which provide information about Group Replication. The variable has the following meaning:
-
group_replication_primary_member
Shows the primary member's UUID when the group is operating in single-primary mode. If the group is operating in multi-primary mode, shows an empty string. See Section 17.4.1.3, “Finding the Primary”.
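The status variable can be read through the Performance Schema. The first query below returns the primary's UUID; the second is a sketch that joins it to the group membership table to show the primary's host and port, assuming the group is running in single-primary mode.

```sql
-- UUID of the current primary (empty string in multi-primary mode).
SELECT VARIABLE_VALUE AS primary_member_uuid
FROM performance_schema.global_status
WHERE VARIABLE_NAME = 'group_replication_primary_member';

-- Host and port of the current primary.
SELECT m.MEMBER_ID, m.MEMBER_HOST, m.MEMBER_PORT
FROM performance_schema.replication_group_members AS m
JOIN performance_schema.global_status AS s
  ON s.VARIABLE_NAME = 'group_replication_primary_member'
 AND s.VARIABLE_VALUE = m.MEMBER_ID;
```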
This section lists and explains the requirements and limitations of Group Replication.
Server instances that you want to use for Group Replication must satisfy the following requirements.
-
InnoDB Storage Engine. Data must be stored in the InnoDB transactional storage engine. Transactions are executed optimistically and then, at commit time, are checked for conflicts. If there are conflicts, in order to maintain consistency across the group, some transactions are rolled back. This means that a transactional storage engine is required. Moreover, InnoDB provides some additional functionality that enables better management and handling of conflicts when operating together with Group Replication. The use of other storage engines, including the temporary MEMORY storage engine, might cause errors in Group Replication. You can prevent the use of other storage engines by setting the disabled_storage_engines system variable on group members, for example:
disabled_storage_engines="MyISAM,BLACKHOLE,FEDERATED,ARCHIVE,MEMORY"
-
Primary Keys. Every table that is to be replicated by the group must have a defined primary key, or primary key equivalent where the equivalent is a non-null unique key. Such keys are required as a unique identifier for every row within a table, enabling the system to determine which transactions conflict by identifying exactly which rows each transaction has modified.
-
IPv4 Network. The group communication engine used by MySQL Group Replication only supports IPv4. Therefore, Group Replication requires an IPv4 network infrastructure.
-
Network Performance. MySQL Group Replication is designed to be deployed in a cluster environment where server instances are very close to each other. The performance and stability of a group can be impacted by both network latency and network bandwidth. Bi-directional communication must be maintained at all times between all group members. If either inbound or outbound communication is blocked for a server instance (for example, by a firewall, or by connectivity issues), the member cannot function in the group, and the group members (including the member with issues) might not be able to report the correct member status for the affected server instance.
The following options must be configured on server instances that are members of a group.
-
Binary Log Active. Set --log-bin[=log_file_name]. MySQL Group Replication replicates binary log contents, therefore the binary log needs to be on for it to operate. This option is enabled by default. See Section 5.4.4, “The Binary Log”.
-
Slave Updates Logged. Set --log-slave-updates. Servers need to log binary logs that are applied through the replication applier. Servers in the group need to log all transactions that they receive and apply from the group. This is required because recovery is conducted by relying on binary logs from participants in the group. Therefore, copies of each transaction need to exist on every server, even for those transactions that were not initiated on the server itself.
-
Binary Log Row Format. Set --binlog-format=row. Group Replication relies on row-based replication format to propagate changes consistently among the servers in the group. It relies on row-based infrastructure to be able to extract the necessary information to detect conflicts among transactions that execute concurrently in different servers in the group. See Section 16.2.1, “Replication Formats”.
-
Binary Log Checksums Off. Set --binlog-checksum=NONE. Due to a design limitation of replication event checksums, Group Replication cannot make use of them, and they must be disabled.
-
Global Transaction Identifiers On. Set gtid_mode=ON. Group Replication uses global transaction identifiers to track exactly which transactions have been committed on every server instance and thus be able to infer which servers have executed transactions that could conflict with already committed transactions elsewhere. In other words, explicit transaction identifiers are a fundamental part of the framework to be able to determine which transactions may conflict. See Section 16.1.3, “Replication with Global Transaction Identifiers”.
-
Replication Information Repositories. Set master_info_repository=TABLE and relay_log_info_repository=TABLE. The replication applier needs to have the master information and relay log metadata written to the mysql.slave_master_info and mysql.slave_relay_log_info system tables. This ensures the Group Replication plugin has consistent recoverability and transactional management of the replication metadata. See Section 16.2.4.2, “Slave Status Logs”.
-
Transaction Write Set Extraction. Set --transaction-write-set-extraction=XXHASH64 so that while collecting rows to log them to the binary log, the server collects the write set as well. The write set is based on the primary keys of each row and is a simplified and compact view of a tag that uniquely identifies the row that was changed. This tag is then used for detecting conflicts.
-
Lower Case Table Names. Set --lower-case-table-names to the same value on all group members. A setting of 1 is correct for the use of the InnoDB storage engine, which is required for Group Replication. Note that this setting is not the default on all platforms.
-
Multithreaded Appliers. Group Replication members can be configured as multithreaded appliers, enabling transactions to be applied in parallel. Set slave_parallel_workers=N (where N is the number of parallel applier threads), slave_preserve_commit_order=1, and slave_parallel_type=LOGICAL_CLOCK. Setting slave_parallel_workers=N enables the multithreaded applier on the member. Group Replication relies on consistency mechanisms built around the guarantee that all participating members receive and apply committed transactions in the same order, so you must also set slave_preserve_commit_order=1 to ensure that the final commit of parallel transactions is in the same order as the original transactions. Finally, in order to determine which transactions can be executed in parallel, the relay log must contain transaction parent information generated with slave_parallel_type=LOGICAL_CLOCK. Attempting to add a member with slave_parallel_workers set to greater than 0 without also setting the other two options generates an error and the instance is prevented from joining.
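The option requirements listed above can be summarized as a small pre-flight check. The following is a minimal sketch, not a MySQL tool: the function check_settings and the dict-based input are hypothetical, and the input is assumed to hold variable values as strings in the form they might appear in SHOW GLOBAL VARIABLES output.

```python
# Required values taken from the list above. The dict-based input is a
# hypothetical stand-in for the output of SHOW GLOBAL VARIABLES.
REQUIRED = {
    "log_bin": "ON",
    "log_slave_updates": "ON",
    "binlog_format": "ROW",
    "binlog_checksum": "NONE",
    "gtid_mode": "ON",
    "master_info_repository": "TABLE",
    "relay_log_info_repository": "TABLE",
    "transaction_write_set_extraction": "XXHASH64",
}


def check_settings(settings):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, expected in REQUIRED.items():
        actual = str(settings.get(name, "")).upper()
        if actual != expected:
            problems.append(f"{name}: expected {expected}, found {actual or 'unset'}")

    # The multithreaded applier needs two companion settings.
    if int(settings.get("slave_parallel_workers", 0)) > 0:
        if str(settings.get("slave_preserve_commit_order", "")).upper() not in ("1", "ON"):
            problems.append("slave_preserve_commit_order must be 1")
        if str(settings.get("slave_parallel_type", "")).upper() != "LOGICAL_CLOCK":
            problems.append("slave_parallel_type must be LOGICAL_CLOCK")
    return problems


good = dict(REQUIRED, slave_parallel_workers="0")
bad = dict(REQUIRED, binlog_checksum="CRC32", slave_parallel_workers="4")
print(check_settings(good))      # []
print(len(check_settings(bad)))  # 3
```

The sketch does not cover lower_case_table_names, because that requirement compares values across members rather than against a fixed value.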
The following known limitations exist for Group Replication. Note that the limitations and issues described for multi-primary mode groups can also apply in single-primary mode clusters during a failover event, while the newly elected primary flushes out its applier queue from the old primary.
Group Replication is built on GTID based replication, therefore you should also be aware of Section 16.1.3.6, “Restrictions on Replication with GTIDs”.
-
Gap Locks. The certification process does not take into account gap locks, as information about gap locks is not available outside of InnoDB. See Gap Locks for more information.
Note: Unless you rely on REPEATABLE READ semantics in your applications, we recommend using the READ COMMITTED isolation level with Group Replication. InnoDB does not use gap locks in READ COMMITTED, which aligns the local conflict detection within InnoDB with the distributed conflict detection performed by Group Replication.
-
Table Locks and Named Locks. The certification process does not take into account table locks (see Section 13.3.5, “LOCK TABLES and UNLOCK TABLES Statements”) or named locks (see GET_LOCK()).
-
Replication Event Checksums. Due to a design limitation of replication event checksums, Group Replication cannot currently make use of them. Therefore set --binlog-checksum=NONE.
-
SERIALIZABLE Isolation Level. SERIALIZABLE isolation level is not supported in multi-primary groups by default. Setting a transaction isolation level to SERIALIZABLE configures Group Replication to refuse to commit the transaction.
-
Concurrent DDL versus DML Operations. Concurrent data definition statements and data manipulation statements executing against the same object but on different servers are not supported when using multi-primary mode. During execution of Data Definition Language (DDL) statements on an object, executing concurrent Data Manipulation Language (DML) on the same object but on a different server instance has the risk of conflicting DDL executing on different instances not being detected.
-
Foreign Keys with Cascading Constraints. Multi-primary mode groups (members all configured with group_replication_single_primary_mode=OFF) do not support tables with multi-level foreign key dependencies, specifically tables that have defined CASCADING foreign key constraints. This is because foreign key constraints that result in cascading operations executed by a multi-primary mode group can result in undetected conflicts and lead to inconsistent data across the members of the group. Therefore we recommend setting group_replication_enforce_update_everywhere_checks=ON on server instances used in multi-primary mode groups to avoid undetected conflicts.
In single-primary mode this is not a problem as it does not allow concurrent writes to multiple members of the group and thus there is no risk of undetected conflicts.
-
MySQL Enterprise Audit and MySQL Enterprise Firewall. Prior to version 5.7.21, MySQL Enterprise Audit and MySQL Enterprise Firewall use MyISAM tables in the mysql system database. Group Replication does not support MyISAM tables.
-
Multi-primary Mode Deadlock. When a group is operating in multi-primary mode, SELECT .. FOR UPDATE statements can result in a deadlock. This is because the lock is not shared across the members of the group, therefore the expectation for such a statement might not be reached.
-
Replication Filters. Replication filters cannot be used on a MySQL server instance that is configured for Group Replication, because filtering transactions on some servers would make the group unable to reach agreement on a consistent state.
The maximum number of MySQL servers that can be members of a single replication group is 9. If further members attempt to join the group, their request is refused. This limit has been identified from testing and benchmarking as a safe boundary where the group performs reliably on a stable local area network.
If an individual transaction results in message contents which are large enough that the message cannot be copied between group members over the network within a 5-second window, members can be suspected of having failed, and then expelled, just because they are busy processing the transaction. Large transactions can also cause the system to slow due to problems with memory allocation. To avoid these issues use the following mitigations:
-
Where possible, try to limit the size of your transactions. For example, split up files used with LOAD DATA into smaller chunks.
-
Use the system variable group_replication_transaction_size_limit to specify a maximum transaction size that the group will accept. In MySQL 5.7, this system variable defaults to zero, but in MySQL 8.0, it defaults to a maximum transaction size of 150000000 bytes (approximately 143 MB). Transactions above this limit are rolled back and are not sent to Group Replication's Group Communication System (GCS) for distribution to the group. Adjust the value of this variable depending on the maximum message size that you need the group to tolerate, bearing in mind that the time taken to process a transaction is proportional to its size.
-
Use the system variable group_replication_compression_threshold to specify a message size above which compression is applied. This system variable defaults to 1000000 bytes (1 MB), so large messages are automatically compressed. Compression is carried out by Group Replication's Group Communication System (GCS) when it receives a message that was permitted by the group_replication_transaction_size_limit setting but exceeds the group_replication_compression_threshold setting. If you set the system variable value to zero, compression is deactivated. For more information, see Section 17.9.7.2, “Message Compression”.
If you have deactivated message compression and do not specify a maximum transaction size, the upper size limit for a message that can be handled by the applier thread on a member of a replication group is the value of the member's slave_max_allowed_packet system variable, which has a default and maximum value of 1073741824 bytes (1 GB). A message that exceeds this limit fails when the receiving member attempts to handle it. The upper size limit for a message that a group member can originate and attempt to transmit to the group is 4294967295 bytes (approximately 4 GB). This is a hard limit on the packet size that is accepted by the group communication engine for Group Replication (XCom, a Paxos variant), which receives messages after GCS has handled them. A message that exceeds this limit fails when the originating member attempts to broadcast it.
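How the transaction size limit and the compression threshold interact can be sketched as a simple decision function. This is an illustration only: the function describe_transaction is hypothetical, its defaults are the MySQL 8.0 values quoted above, and it reads a size limit of zero as "no limit", as the text above implies.

```python
def describe_transaction(size_bytes,
                         size_limit=150000000,
                         compression_threshold=1000000):
    """Classify what happens to a transaction message of the given size.

    size_limit models group_replication_transaction_size_limit and
    compression_threshold models group_replication_compression_threshold.
    A value of 0 disables the respective check.
    """
    if size_limit > 0 and size_bytes > size_limit:
        return "rolled back"          # never sent to GCS
    if compression_threshold > 0 and size_bytes > compression_threshold:
        return "compressed"           # compressed by GCS before sending
    return "sent uncompressed"


print(describe_transaction(200_000_000))               # rolled back
print(describe_transaction(2_000_000))                 # compressed
print(describe_transaction(500))                       # sent uncompressed
print(describe_transaction(200_000_000, size_limit=0)) # compressed
```

The sketch leaves out the hard limits described above (slave_max_allowed_packet on the receiving member and the 4294967295-byte XCom packet limit), which still apply when both settings are zero.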
This section provides answers to frequently asked questions.
A group can consist of a maximum of 9 servers. Attempting to add another server to a group with 9 members causes the request to join to be refused. This limit has been identified from testing and benchmarking as a safe boundary where the group performs reliably on a stable local area network.
Servers in a group connect to the other servers in the group by opening a peer-to-peer TCP connection. These connections are only used for internal communication and message passing between servers in the group. This address is configured by the group_replication_local_address variable.
The bootstrap flag instructs a member to create a group and act as the initial seed server. The second member joining the group needs to ask the member that bootstrapped the group to dynamically change the configuration in order for it to be added to the group.
A member needs to bootstrap the group in two scenarios: when the group is originally created, or when shutting down and restarting the entire group.
You pre-configure the Group Replication recovery channel credentials using the CHANGE MASTER TO statement.
Not directly, but MySQL Group Replication is a shared-nothing full replication solution, where all servers in the group replicate the same amount of data. Therefore if one member in the group writes N bytes to storage as the result of a transaction commit operation, then roughly N bytes are written to storage on other members as well, because the transaction is replicated everywhere.
However, given that other members do not have to do the same amount of processing that the original member had to do when it originally executed the transaction, they apply the changes faster. Transactions are replicated in a format that is used to apply row transformations only, without having to re-execute transactions again (row-based format).
Furthermore, because changes are propagated and applied in row-based format, they are received in an optimized and compact format, which likely reduces the number of IO operations required compared to the originating member.
To summarize, you can scale-out processing, by spreading conflict free transactions throughout different members in the group. And you can likely scale-out a small fraction of your IO operations, since remote servers receive only the necessary changes to read-modify-write changes to stable storage.
Some additional load is expected because servers need to be constantly interacting with each other for synchronization purposes. It is difficult to quantify how much more data. It also depends on the size of the group (three servers puts less stress on the bandwidth requirements than nine servers in the group).
Also the memory and CPU footprint are larger, because more complex work is done for the server synchronization part and for the group messaging.
Yes, but the network connection between each member must be reliable and have suitable performance. Low latency, high bandwidth network connections are a requirement for optimal performance.
If network bandwidth alone is an issue, then Section 17.9.7.2, “Message Compression” can be used to lower the bandwidth required. However, if the network drops packets, leading to re-transmissions and higher end-to-end latency, throughput and latency are both negatively affected.
When the network round-trip time (RTT) between any group members is 5 seconds or more you could encounter problems as the built-in failure detection mechanism could be incorrectly triggered.
This depends on the reason for the connectivity problem. If the connectivity problem is transient and the reconnection is quick enough that the failure detector is not aware of it, then the server may not be removed from the group. If it is a "long" connectivity problem, then the failure detector eventually suspects a problem and the server is removed from the group.
Once a server is removed from the group, you need to join it back again. In other words, after a server is removed explicitly from the group you need to rejoin it manually (or have a script doing it automatically).
If the member becomes silent, the other members remove it from the group configuration. In practice this may happen when the member has crashed or there is a network disconnection.
The failure is detected after a given timeout elapses for a given member and a new configuration without the silent member in it is created.
There is no method for defining policies for when to expel members automatically from the group. You need to find out why a member is lagging behind and fix that or remove the member from the group. Otherwise, if the server is so slow that it triggers the flow control, then the entire group slows down as well. The flow control can be configured according to your needs.
No, there is no special member in the group in charge of triggering a reconfiguration.
Any member can suspect that there is a problem. All members need to (automatically) agree that a given member has failed. One member is in charge of expelling it from the group, by triggering a reconfiguration. Which member is responsible for expelling the member is not something you can control or set.
Group Replication is designed to provide highly available replica sets; data and writes are duplicated on each member in the group. For scaling beyond what a single system can provide, you need an orchestration and sharding framework built around a number of Group Replication sets, where each replica set maintains and manages a given shard or partition of your total dataset. This type of setup, often called a “sharded cluster”, allows you to scale reads and writes linearly and without limit.
If SELinux is enabled, which you can verify using sestatus -v, then you need to enable the use of the Group Replication communication port, configured by group_replication_local_address, for mysqld so that it can bind to it and listen there. To see which ports MySQL is currently allowed to use, issue semanage port -l | grep mysqld. Assuming the port configured is 33061, add the necessary port to those permitted by SELinux by issuing semanage port -a -t mysqld_port_t -p tcp 33061.
If iptables is enabled, then you need to open up the Group Replication port for communication between the machines. To see the current rules in place on each machine, issue iptables -L. Assuming the port configured is 33061, enable communication over the necessary port by issuing iptables -A INPUT -p tcp --dport 33061 -j ACCEPT.
The replication channels used by Group Replication behave in the same way as replication channels used in master to slave replication, and as such rely on the relay log. In the event of a change of the relay_log variable, or when the option is not set and the host name changes, there is a chance of errors. See Section 16.2.4.1, “The Slave Relay Log” for a recovery procedure in this situation. Alternatively, another way of fixing the issue specifically in Group Replication is to issue a STOP GROUP_REPLICATION statement and then a START GROUP_REPLICATION statement to restart the instance. The Group Replication plugin creates the group_replication_applier channel again.
Group Replication uses two bind addresses in order to split network traffic between the SQL address, used by clients to communicate with the member, and the group_replication_local_address, used internally by the group members to communicate. For example, assume a server with two network interfaces assigned to the network addresses 203.0.113.1 and 198.51.100.179. In such a situation you could use 203.0.113.1:33061 for the internal group network address by setting group_replication_local_address=203.0.113.1:33061. Then you could use 198.51.100.179 for hostname and 3306 for the port. Client SQL applications would then connect to the member at 198.51.100.179:3306. This enables you to configure different rules on the different networks. Similarly, the internal group communication can be separated from the network connection used for client applications, for increased security.
Group Replication uses network connections between members and therefore its functionality is directly impacted by how you configure hostnames and ports. For example, the Group Replication recovery procedure is based on asynchronous replication which uses the server's hostname and port. When a member joins a group it receives the group membership information, using the network address information that is listed at performance_schema.replication_group_members. One of the members listed in that table is selected as the donor of the missing data from the group to the new member.
This means that any value you configure using a hostname, such as the SQL network address or the group seeds address, must be a fully qualified name and resolvable by each member of the group. You can ensure this for example through DNS, or correctly configured /etc/hosts files, or other local processes. If you want to configure the MEMBER_HOST value on a server, specify it using the --report-host option on the server before joining it to the group.
The assigned value is used directly and is not affected by the skip_name_resolve system variable.
To configure MEMBER_PORT on a server, specify it using the report_port system variable.
When Group Replication is started on a server, the value of auto_increment_increment is changed to the value of group_replication_auto_increment_increment, which defaults to 7, and the value of auto_increment_offset is changed to the server ID. The changes are reverted when Group Replication is stopped. These settings avoid the selection of duplicate auto-increment values for writes on group members, which causes rollback of transactions. The default auto increment value of 7 for Group Replication represents a balance between the number of usable values and the permitted maximum size of a replication group (9 members).
The changes are only made and reverted if auto_increment_increment and auto_increment_offset each have their default value of 1. If their values have already been modified from the default, Group Replication does not alter them.
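The effect of these settings can be seen by generating the sequences that different members would use. The following is a simplified sketch, assuming three members with the hypothetical server IDs 1, 2 and 3 (each smaller than the increment) and tables that start out empty; the helper auto_increment_values is not a MySQL function.

```python
def auto_increment_values(offset, increment=7, count=3):
    """Return the first `count` auto-increment values for one member.

    offset models auto_increment_offset (set to the server ID) and
    increment models auto_increment_increment (7 by default under
    Group Replication).
    """
    return [offset + n * increment for n in range(count)]


# Hypothetical server IDs 1, 2 and 3.
for server_id in (1, 2, 3):
    print(server_id, auto_increment_values(server_id))
# 1 [1, 8, 15]
# 2 [2, 9, 16]
# 3 [3, 10, 17]
```

Because each member starts at a different offset and steps by the same increment, the sequences never overlap, so concurrent inserts on different members do not pick the same value.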
If the group is operating in single-primary mode, it can be useful to find out which member is the primary. See Section 17.4.1.3, “Finding the Primary”.
Source: oschina
Link: https://my.oschina.net/u/4362549/blog/4122604