Endless recovering state of secondary

I built a replica set with one primary, one secondary and one arbiter on MongoDB 3.0.2. The primary and the arbiter are on the same host, and the secondary is on another host.

3 Answers

忘掉有多难

2021-02-08 18:16
The problem (most likely)

The last operation on the primary is from "2015-05-15T02:10:56Z", whereas the last operation on the would-be secondary is from "2015-05-14T11:23:51Z", a difference of roughly 15 hours. That gap may well exceed your replication oplog window (the difference between the times of the first and the last operation entries in your oplog). Put simply, there are too many operations on the primary for the secondary to catch up.
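As a rough illustration of the comparison above, here is a minimal Python sketch that computes the gap between the two timestamps from the question and checks it against an oplog window. The 6-hour window is a made-up value for illustration only, and the function names are hypothetical; read the real window from your own deployment.

```python
from datetime import datetime, timedelta

# Timestamp format used in the replica set status output quoted above.
TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def replication_lag(primary_last_op: str, secondary_last_op: str) -> timedelta:
    """Return how far the secondary's last operation is behind the primary's."""
    primary = datetime.strptime(primary_last_op, TS_FORMAT)
    secondary = datetime.strptime(secondary_last_op, TS_FORMAT)
    return primary - secondary


def can_catch_up(lag: timedelta, oplog_window: timedelta) -> bool:
    """The secondary can only catch up if its lag fits inside the oplog window."""
    return lag <= oplog_window


lag = replication_lag("2015-05-15T02:10:56Z", "2015-05-14T11:23:51Z")
assumed_window = timedelta(hours=6)  # hypothetical value, not taken from the question

print(lag)                                # 14:47:05
print(can_catch_up(lag, assumed_window))  # False
```

With any window shorter than about 14 hours and 47 minutes, the check fails, which matches the symptom described.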

A bit more elaborated (though simplified): during an initial sync, the secondary copies the data as it existed at a given point in time. Once that copy is complete, the secondary connects to the oplog and applies the changes made between that point in time and now, according to the oplog entries. This works as long as the oplog still holds all operations since that point in time. But the oplog has a limited size (it is a so-called capped collection). So if more operations happen on the primary during the initial sync than the oplog can hold, the oldest operations "fade out". The secondary recognises that some of the operations necessary to "construct" the same data as the primary are no longer available and refuses to complete the sync, staying in RECOVERING state.
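The "fade out" behaviour can be shown with a toy model. The sketch below is not how MongoDB implements the oplog; it assumes operations are numbered 1, 2, 3, ... and uses a bounded `deque` as a stand-in for the capped collection. All names are hypothetical.

```python
from collections import deque


def make_oplog(capacity: int) -> deque:
    """A bounded deque: appending to a full one silently drops the oldest entry."""
    return deque(maxlen=capacity)


def can_resume(oplog: deque, last_applied: int) -> bool:
    """The secondary can continue only if the operation right after the last
    one it applied is still present in the primary's oplog."""
    if not oplog:
        return True
    return oplog[0] <= last_applied + 1


def simulate(capacity: int, snapshot_op: int, total_ops: int) -> bool:
    """Primary writes total_ops operations; the secondary holds a snapshot
    that reflects everything up to snapshot_op."""
    oplog = make_oplog(capacity)
    for op in range(1, total_ops + 1):
        oplog.append(op)
    return can_resume(oplog, snapshot_op)


# Small oplog: after 12 writes it holds ops 8..12, so op 4 is gone.
print(simulate(capacity=5, snapshot_op=3, total_ops=12))   # False
# Larger oplog: ops 1..12 are all still there.
print(simulate(capacity=20, snapshot_op=3, total_ops=12))  # True
```

The first case corresponds to the stuck secondary: the entries it needs next have already been overwritten.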

The solution(s)

The problem is a known one. It is not a bug, but a result of the inner workings of MongoDB and several fail-safe assumptions made by the development team. Hence, there are several ways to deal with the situation. Sadly, since you only have two data-bearing nodes, all of them involve downtime.

Option 1: Increase the oplog size

This is my preferred method, since it deals with the problem once and (kind of) for all. It's a bit more complicated than the other solutions, though. From a high-level perspective, these are the steps you take:
1. Shut down the primary
2. Create a backup of the oplog using direct access to the data files
3. Restart the mongod in standalone mode
4. Copy the current oplog to a temporary collection
5. Delete the current oplog
6. Recreate the oplog with the desired size
7. Copy back the oplog entries from the temporary collection to the shiny new oplog
8. Restart mongod as part of the replica set
Do not forget to increase the oplog of the secondary before doing the initial sync, since it may become primary at some time in the future!

For details, please read "Change the size of the oplog" in the tutorials regarding replica set maintenance.
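When picking the new size, a back-of-the-envelope estimate can help. The sketch below assumes the write rate stays roughly constant, which real workloads rarely do, so it adds a headroom factor. The function name, the headroom default and all input numbers are assumptions for illustration, not values from the question.

```python
import math


def required_oplog_mb(current_size_mb: float,
                      observed_window_hours: float,
                      target_window_hours: float,
                      headroom: float = 1.5) -> int:
    """Scale the current oplog size linearly to cover the target window,
    then multiply by a safety factor for write bursts."""
    if observed_window_hours <= 0:
        raise ValueError("observed_window_hours must be positive")
    estimate = (current_size_mb * target_window_hours * headroom
                / observed_window_hours)
    return math.ceil(estimate)


# Hypothetical numbers: a 1024 MB oplog currently covers 6 hours,
# and we would like it to cover 24 hours.
print(required_oplog_mb(1024, 6, 24))  # 6144
```

The target window should comfortably exceed the time an initial sync takes, since that is the period the oplog has to bridge.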

Option 2: Shut down the app during sync

If option 1 is not viable, the only other real solution is to shut down the application causing load on the replica set, restart the sync and wait for it to complete. Depending on the amount of data to be transferred, expect this to take several hours.

A personal note

The oplog window problem is a well-known one. While replica sets and sharded clusters are easy to set up with MongoDB, a fair amount of knowledge and a bit of experience is needed to maintain them properly. Do not run something as important as a database with a complex setup without knowing the basics. In case Something Bad (tm) happens, it might well lead to a FUBAR situation.

我寻月下人不归

2021-02-08 18:20

Add a new, fourth node to the replica set. Once it has synced, reset the stale secondary.

深忆病人

2021-02-08 18:40

Another option (assuming the primary has healthy data) is to simply delete the data in the secondary's mongo data folder and restart it. This will cause it to sync from the primary again, as if you had just added it to the replica set.
