MongoDB regional replica set - primary node in each region?

[愿得一人] 2021-02-09 19:53

Given an application with regional users (regions being North America, Europe, Asia), how do you create a replica set that allows HK users to write to the nearest nodes?

2 Answers
  • 2021-02-09 20:38

    Thanks @avanti, @MarkusWMalhberg - pondering how to respond to the comments pushed me in the right direction. This took a bit to pull together so I will be a bit verbose explaining the config.

    Overview

    Focusing on user experience, we want to create a Mongo database configuration that allows reads and writes to occur closest to the user.

    Assumptions

    • Users almost always read and write documents in their region and don't mind if infrequent reads of other regions' data are slower.
    • Each document contains a key that indicates its region (for simplicity/clarity; see the sample document below)
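
    For illustration, a hypothetical sales.users document might look like the sketch below; the numeric region id (1 = US-East, 2 = US-West in the rest of this answer) doubles as the shard key. Field names other than region are examples only.

    // Hypothetical document shape; only "region" is required by this configuration.
    db.users.insert({
        region: 1,                // shard key: routes the document to the local shard
        name: "us east user",
        createdAt: new Date()
    })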

    Much of the sharding documentation focuses on HA/DR. With user experience and regional compliance in mind, the focus here is on locality rather than load distribution.

    This example completely ignores HA/DR, read preferences, and write concerns, but these need to be addressed if the POC is matured. They are left out in favor of clarity in meeting the goal: local reads/writes.

    References

    • Operational Segmentation
    • Managing Shard Tags
    • ReplicaSet Configuration
    • Different mongos config db errors

    Tricks

    We know

    • We want one application database so all data is available
    • We want users to read/write locally so we need a database near each user group; we need a replica set
    • Writes can only be made to primary replica set nodes, so to get a primary node next to each user group we need multiple replica sets; a sharded cluster

    Within standard ReplicaSet and Sharding knowledge, there are 2 keys to this configuration:

    • Assign a priority to the regionally local ReplicaSet node to ensure it becomes primary.
    • Use location-aware shard key tagging to ensure data is written to local shards (both are sketched below).
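
    A minimal sketch of those two pieces, using hypothetical host names (the full, working script follows below):

    // 1. Give the regionally local replica set member a higher priority so it
    //    wins the primary election for its shard.
    rs.initiate({
        _id: "shard-US-East",
        members: [
            { _id: 0, host: "east-host:27017", priority: 2 },  // preferred primary
            { _id: 1, host: "west-host:27017" }                // default priority 1
        ]
    })

    // 2. Tag the shard with its region and pin the matching shard key range to it.
    sh.addShardTag("shard-US-East", "US-East")
    sh.addTagRange("sales.users", { region: 1 }, { region: 2 }, "US-East")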

    Shard keys can be anything; here we are only concerned with users being able to read and write locally, as opposed to effective load sharing.

    Each collection will have to be sharded, or writes will go to shard zero.
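
    For example, if the sales database later gains a hypothetical sales.orders collection, it must also be sharded on the region key (and its ranges tagged) or all of its writes will land on the database's primary shard:

    // Hypothetical second collection; run through a mongos, as with sales.users below.
    db.adminCommand({ shardCollection: "sales.orders", key: { region: 1 } })
    sh.addTagRange("sales.orders", { region: 1 }, { region: 2 }, "US-East")
    sh.addTagRange("sales.orders", { region: 2 }, { region: 3 }, "US-West")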

    Desired configuration

    The configuration

    #!/usr/bin/env bash
    
    echo ">>> Clean up processes and files from previous runs"
    echo ">>> killAll mongod mongos"
    killall mongod mongos
    
    echo ">>> Remove db files and logs"
    rm -rf data
    rm -rf log
    
    # Create the common log directory
    mkdir log
    
    echo ">>> Start replica set for shard US-East"
    mkdir -p data/shard-US-East/rsMemberEast data/shard-US-East/rsMemberWest
    mongod --replSet shard-US-East --logpath "log/shard-US-East-rsMemberEast.log" --dbpath data/shard-US-East/rsMemberEast --port 37017 --fork --shardsvr --smallfiles
    mongod --replSet shard-US-East --logpath "log/shard-US-East-rsMemberWest.log" --dbpath data/shard-US-East/rsMemberWest --port 37018 --fork --shardsvr --smallfiles
    
    echo ">>> Sleep 15s to allow US-East replica set to start"
    sleep 15
    
    # The US-East replica set member is assigned priority 2 so that it becomes primary
    echo ">>> Configure replica set for shard US-East"
    mongo --port 37017 << 'EOF'
    config = { _id: "shard-US-East", members:[
             { _id : 0, host : "localhost:37017", priority: 2 },
             { _id : 1, host : "localhost:37018" }]};
    rs.initiate(config)
    EOF
    
    echo ">>> Start replica set for shard-US-West"
    mkdir -p data/shard-US-West/rsMemberEast data/shard-US-West/rsMemberWest
    mongod --replSet shard-US-West --logpath "log/shard-US-West-rsMemberEast.log" --dbpath data/shard-US-West/rsMemberEast --port 47017 --fork --shardsvr --smallfiles
    mongod --replSet shard-US-West --logpath "log/shard-US-West-rsMemberWest.log" --dbpath data/shard-US-West/rsMemberWest --port 47018 --fork --shardsvr --smallfiles
    
    echo ">>> Sleep 15s to allow US-West replica set to start"
    sleep 15
    
    # The US-West replica set member is assigned priority 2 so that it becomes primary
    echo ">>> Configure replica set for shard-US-West"
    mongo --port 47017 << 'EOF'
    config = { _id: "shard-US-West", members:[
             { _id : 0, host : "localhost:47017" },
             { _id : 1, host : "localhost:47018", priority: 2 }]};
    rs.initiate(config)
    EOF
    
    # Shard config servers: should be 3 and all must be up to deploy a shard cluster
    # These are the mongos backing store for routing information
    echo ">>> Start config servers"
    mkdir -p data/config/config-us-east data/config/config-us-west data/config/config-redundant
    mongod --logpath "log/cfg-us-east.log"   --dbpath data/config/config-us-east   --port 57040 --fork --configsvr --smallfiles
    mongod --logpath "log/cfg-us-west.log"   --dbpath data/config/config-us-west   --port 57041 --fork --configsvr --smallfiles
    mongod --logpath "log/cfg-redundant.log" --dbpath data/config/config-redundant --port 57042 --fork --configsvr --smallfiles
    
    echo ">>> Sleep 5 to allow config servers to start and stabilize"
    sleep 5
    
    # All mongos instances must point at the same config servers; a coordinator dispatches writes to each of them
    echo ">>> Start mongos"
    mongos --logpath "log/mongos-us-east.log" --configdb localhost:57040,localhost:57041,localhost:57042 --port 27017 --fork
    mongos --logpath "log/mongos-us-west.log" --configdb localhost:57040,localhost:57041,localhost:57042 --port 27018 --fork
    
    echo ">>> Wait 60 seconds for the replica sets to stabilize"
    sleep 60
    
    # Enable sharding on the 'sales' database and 'sales.users' collection
    # Every collection in 'sales' must be sharded or the writes will go to shard 0
    # Add a shard tag so we can associate shard keys with the tag (region)
    # Shard tag range min and max cannot be the same, so we use a region id: US-East = 1
    # and US-West = 2. sh.addTagRange() is inclusive of minKey and exclusive of maxKey.
    # We only need to configure one mongos - the config will be propagated to all mongos
    # instances through the config servers
    echo ">>> Add shards to mongos"
    mongo --port 27017 <<'EOF'
    db.adminCommand( { addshard : "shard-US-East/localhost:37017" } );
    db.adminCommand( { addshard : "shard-US-West/localhost:47017" } );
    
    db.adminCommand({enableSharding: "sales"})
    db.adminCommand({shardCollection: "sales.users", key: {region:1}});
    
    sh.addShardTag("shard-US-East", "US-East")
    sh.addShardTag("shard-US-West", "US-West")
    sh.addTagRange("sales.users", { region: 1 }, { region: 2 }, "US-East")
    sh.addTagRange("sales.users", { region: 2 }, { region: 3 }, "US-West")
    EOF
    

    Testing

    Verify the configuration is correct with sh.status(). Note that the shards, their tags, and the regional shard key ranges are all assigned correctly.

    [starver@rakshasa RegionalSharding 14:38:50]$ mongo --port 27017 sales
    ...
    rakshasa(mongos-3.0.5)[mongos] sales> sh.status()
      sharding version: {
        "_id": 1,
        "minCompatibleVersion": 5,
        "currentVersion": 6,
        "clusterId": ObjectId("55fffffdc5746e30dc3651cda4")
      }
      shards:
        {  "_id": "shard-US-East",  "host": "shard-US-East/localhost:37017,localhost:37018",  "tags": [   "US-East" ] }
        {  "_id": "shard-US-West",  "host": "shard-US-West/localhost:47017,localhost:47018",  "tags": [   "US-West" ] }
      balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours: 
            1 : Success
      databases:
        {  "_id": "admin",  "partitioned": false,  "primary": "config" }
        {  "_id": "test",  "partitioned": false,  "primary": "shard-US-East" }
        {  "_id": "sales",  "partitioned": true,  "primary": "shard-US-East" }
        sales.users
          shard key: { "region": 1 }
          chunks:
            shard-US-East: 2
            shard-US-West: 1
            { "region": { "$minKey" : 1 } } -> { "region": 1 } on: shard-US-East Timestamp(2, 1) 
            { "region": 1 } -> { "region": 2 } on: shard-US-East Timestamp(1, 3) 
            { "region": 2 } -> { "region": { "$maxKey" : 1 } } on: shard-US-West Timestamp(2, 0) 
            tag: US-East  { "region": 1 } -> { "region": 2 }
            tag: US-West  { "region": 2 } -> { "region": 3 }
    

    Verify writes are made to the right shard and primary by creating a record in each region:

    db.users.insert({region:1, name:"us east user"})
    db.users.insert({region:2, name:"us west user"})
    

    You can log on to each member of each replica set and see the east user only on the US-East shard and the west user only on the US-West shard.
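
    For example (ports as assigned in the script above), running the same query directly against each shard's primary should return only that region's document:

    // Against the US-East primary: mongo --port 37017 sales
    db.users.find()    // expect only { region: 1, name: "us east user" }

    // Against the US-West primary: mongo --port 47018 sales
    db.users.find()    // expect only { region: 2, name: "us west user" }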

  • 2021-02-09 20:46

    Have you looked into Tag Aware Sharding?

    The balancer obeys tagged range associations, which enables the following deployment patterns:

    • isolate a specific subset of data on a specific set of shards.
    • ensure that the most relevant data reside on shards that are geographically closest to the application servers.
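
    A minimal sketch of the idea, assuming a numeric region field is the shard key and using hypothetical shard names (substitute the names reported by sh.status()):

    // Tag each shard with a region, then pin shard key ranges to those tags.
    sh.addShardTag("shard0000", "EU")
    sh.addShardTag("shard0001", "ASIA")
    sh.addTagRange("sales.users", { region: 10 }, { region: 20 }, "EU")
    sh.addTagRange("sales.users", { region: 20 }, { region: 30 }, "ASIA")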