1. Current architecture

A new secondary node S3, 192.168.1.155:27019, will be added to the replica set below.
| IP | Port | Role |
|---|---|---|
| 192.168.1.153 | 27018 | primary (M) |
| 192.168.1.154 | 27018 | secondary (S1) |
| 192.168.1.155 | 27018 | arbiter (S2) |
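
Before making any changes, it helps to confirm this topology from a mongo shell on any member. A minimal check (assuming you are already authenticated as an admin user):

```js
// List each member's host, role, and health as the replica set currently sees them.
rs.status().members.forEach(function (m) {
    print(m.name + "  " + m.stateStr + "  health=" + m.health);
});
```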
2. Option 1: Full initial sync (rs.add an empty node directly to the replica set)

2.1 Start a new, empty mongod instance

1. Create the data directory on 192.168.1.155:

```bash
mkdir -p /data/mongodb/mongodb_repl/data_27019
```

2. Generate the configuration file, adjusting the parameters to your needs:
```bash
# cat > /data/mongodb/mongodb_repl/conf/mongo_27019.conf <<EOF
systemLog:
  destination: file
  logAppend: true
  path: /data/mongodb/mongodb_repl/log/mongo_27019.log
storage:
  dbPath: /data/mongodb/mongodb_repl/data_27019
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      directoryForIndexes: true
      cacheSizeGB: 1
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /data/mongodb/mongodb_repl/mongo_27019.pid
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27019
  bindIp: 0.0.0.0
  #bindIpAll: true
  maxIncomingConnections: 5000
  unixDomainSocket:
    enabled: true
    pathPrefix: /data/mongodb/mongodb_repl/data_27019
    filePermissions: 0700
#security:
#  keyFile: /data/mongodb/mongodb_repl/auth/keyfile.key
#  authorization: enabled
#replication:
#  replSetName: repl
EOF
```
3. Set ownership on the directories:

```bash
chown -R mongodb.mongodb /data/mongodb/mongodb_repl/
```

4. Start the instance:

```bash
# /data/mongodb/mongodb_repl/bin/mongod -f /data/mongodb/mongodb_repl/conf/mongo_27019.conf
```
5. Log in and create the administrative account:

```
mongo 192.168.1.155:27019
> use admin
> db.createUser(
    {
      user: "root",
      pwd: "root123456",
      roles: [ { role: "root", db: "admin" } ]
    }
  )
```
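
At this point security.authorization is still commented out, so createUser succeeds without prior authentication. As a quick sanity check you can authenticate as the new user in the same session; a minimal sketch:

```js
// db.auth() returns 1 when the credentials are accepted.
var adminDB = db.getSiblingDB("admin");
print(adminDB.auth("root", "root123456"));
```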
6. Shut down mongod, either by killing the process:

```bash
kill 563055    # the mongod PID in this example
```

or from the mongo shell:

```
> db.shutdownServer()
```
7. Edit the config file and uncomment the security and replication sections from step 2. Note that the keyFile must be the same key file already used by the existing replica set members:

```yaml
security:
  keyFile: /data/mongodb/mongodb_repl/auth/keyfile.key
  authorization: enabled
replication:
  replSetName: repl
```
8. Start mongod again:

```bash
# /data/mongodb/mongodb_repl/bin/mongod -f /data/mongodb/mongodb_repl/conf/mongo_27019.conf
```

9. Log in with the new credentials:

```bash
/data/mongodb/mongodb_repl/bin/mongo 192.168.1.155:27019/admin -uroot -proot123456
```
2.2 Add the new node S3 to the replica set

Run on the primary M:

```
# /data/mongodb/mongodb_repl/bin/mongo 192.168.1.153:27018/admin -uroot -proot123456
repl:PRIMARY> rs.add("192.168.1.155:27019")
{
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1665991566, 1),
        "signature" : {
            "hash" : BinData(0,"eJPwZNjQJBo2M34Fp2SGbUw8bsI="),
            "keyId" : NumberLong("7155366788732026884")
        }
    },
    "operationTime" : Timestamp(1665991566, 1)
}
```
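
While the initial sync runs, the new member stays in STARTUP2 and only flips to SECONDARY once the data clone and oplog replay finish. A small polling sketch you can run on the primary (the 5-second interval is an arbitrary choice):

```js
// Poll rs.status() until the new member reports SECONDARY.
var target = "192.168.1.155:27019";
while (true) {
    var found = rs.status().members.filter(function (m) { return m.name === target; });
    var state = found.length ? found[0].stateStr : "(not reported yet)";
    print(new Date() + "  " + target + " -> " + state);
    if (state === "SECONDARY") break;
    sleep(5000);  // mongo shell built-in; argument is in milliseconds
}
```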
2.3 S3's state transitions and log output

State transitions of S3, as recorded in S3's own log:

- 1) transition to STARTUP2 from STARTUP
- 2) shard information registered
- 3) indexes built and collections cloned; data initialized: initial sync done; took 60s.
- 4) transition to RECOVERING from STARTUP2
- 5) transition to SECONDARY from RECOVERING

State transitions of S3, as recorded in the primary's log:

- 1) Member 192.168.1.155:27019 is now in state STARTUP
- 2) Member 192.168.1.155:27019 is now in state STARTUP2
- 3) Member 192.168.1.155:27019 is now in state SECONDARY
Initial sync log excerpt:

```
REPL [replexec-0] This node is 192.168.1.155:27019 in the config
REPL [replexec-0] transition to STARTUP2 from STARTUP
REPL [replexec-0] Starting replication storage threads
REPL [replexec-6] Member 192.168.1.154:27018 is now in state SECONDARY
REPL [replexec-1] Member 192.168.1.153:27018 is now in state PRIMARY
REPL [replexec-3] Member 192.168.1.155:27018 is now in state ARBITER
STORAGE [replexec-0] createCollection: local.temp_oplog_buffer with generated UUID: 2cce2a73-7133-4acb-88d3-685192f54fa0
REPL [replication-0] Starting initial sync (attempt 1 of 10)
STORAGE [replication-0] Finishing collection drop for local.temp_oplog_buffer (2cce2a73-7133-4acb-88d3-685192f54fa0).
STORAGE [replication-0] createCollection: local.temp_oplog_buffer with generated UUID: d341653c-6373-42dc-9a58-ae7d2f700887
REPL [replication-0] sync source candidate: 192.168.1.154:27018    -- 192.168.1.154:27018 was chosen as the sync source
REPL [replication-0] Initial syncer oplog truncation finished in: 0ms
REPL [replication-0] ******
REPL [replication-0] creating replication oplog of size: 10240MB...
STORAGE [replication-0] createCollection: local.oplog.rs with generated UUID: 9284f200-65fe-414d-904b-66400b916276
STORAGE [replication-0] Starting OplogTruncaterThread local.oplog.rs
STORAGE [replication-0] The size storer reports that the oplog contains 0 records totaling to 0 bytes
STORAGE [replication-0] Scanning the oplog to determine where to place markers for truncation
REPL [replication-0] ******
```
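
The `creating replication oplog of size: 10240MB` line reflects the oplog sizing in effect (by default WiredTiger allocates roughly 5% of free disk space, clamped between 990 MB and 50 GB, unless storage.oplogSizeMB overrides it; here it came out to 10 GB). Once the node is up you can inspect the oplog size and the time window it covers:

```js
// Prints the configured oplog size, the space used, and the first/last event times.
db.printReplicationInfo();
```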
To remove the S3 node again:

```
repl:PRIMARY> rs.remove("192.168.1.155:27019")
```
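
rs.remove() reconfigures the set immediately; to confirm, list the hosts that remain in the configuration (run on the primary):

```js
// The removed host should no longer appear in the member list.
printjson(rs.conf().members.map(function (m) { return m.host; }));
```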
3. Option 2: Start from an incremental sync (copy an existing node's data directory to the new node's data directory)

Add 192.168.1.154:27019 to the replica set using the incremental approach.
3.1 Shut down one of the secondary nodes, S1

Log in to the 192.168.1.154:27018 node and shut it down:

```
# /data/mongodb/mongodb_repl/bin/mongo 192.168.1.154:27018/admin -uroot -proot123456
repl:SECONDARY> db.shutdownServer()
```
3.2 Copy S1's data directory to S3's data directory

```bash
cp -r data data_27019/
cp /data/mongodb/mongodb_repl/conf/mongo_27018.conf /data/mongodb/mongodb_repl/conf/mongo_27019.conf
vim /data/mongodb/mongodb_repl/conf/mongo_27019.conf    # change data to data_27019 and 27018 to 27019
chown -R mongodb.mongodb /data/mongodb/
```
3.3 Restart the original secondary S1

```bash
# /data/mongodb/mongodb_repl/bin/mongod -f /data/mongodb/mongodb_repl/conf/mongo_27018.conf
```

3.4 Start the new node S3

```
# /data/mongodb/mongodb_repl/bin/mongod -f /data/mongodb/mongodb_repl/conf/mongo_27019.conf
# /data/mongodb/mongodb_repl/bin/mongo 192.168.1.154:27019/admin -uroot -proot123456
repl:OTHER>
```
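
The OTHER prompt is expected: the copied data files include S1's cached replica set configuration in the local database, and that configuration does not list 192.168.1.154:27019 as a member, so the node refuses to serve until it is added. You can inspect the inherited config directly; a minimal sketch:

```js
// Shows the replica set config copied over from S1's data files;
// this host:port is not in its members array yet.
printjson(db.getSiblingDB("local").system.replset.findOne());
```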
3.5 Add S3 to the replica set repl

Run on the primary:

```
repl:PRIMARY> rs.add("192.168.1.154:27019")
```

State changes of the newly added S3 (27019) node:

In S3's log:

- transition to RECOVERING from REMOVED
- transition to SECONDARY from RECOVERING

In the primary's log:

- Member 192.168.1.154:27019 is now in state RS_DOWN
- Member 192.168.1.154:27019 is now in state SECONDARY

Detailed S3 log entries showing the state transitions:
```
2021-09-24T13:58:25.371+0800 I REPL [replexec-0] transition to RECOVERING from REMOVED
2021-09-24T13:58:25.974+0800 I REPL [rsSync-0] transition to SECONDARY from RECOVERING
2021-09-24T13:58:25.371+0800 I REPL [replexec-0] New replica set config in use: { _id: "shard1", version: 2, protocolVersion: 1, writeConcernMajorityJournalDefault: true, members: [ { _id: 0, host: "192.168.1.153:27018", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1, host: "192.168.1.153:27019", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "192.168.1.153:27019", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 3, host: "192.168.1.154:27019", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, catchUpTimeoutMillis: -1, catchUpTakeoverDelayMillis: 30000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('611db90b5ab6f6f976a4e2b0') } }
2021-09-24T13:58:25.371+0800 I REPL [replexec-0] This node is 192.168.1.153:27019 in the config
2021-09-24T13:58:25.371+0800 I REPL [replexec-0] transition to RECOVERING from REMOVED
2021-09-24T13:58:25.371+0800 I REPL [replexec-0] Resetting sync source to empty, which was :27017
2021-09-24T13:58:25.371+0800 I ASIO [Replication] Connecting to 192.168.1.154:27018
2021-09-24T13:58:25.371+0800 I ASIO [Replication] Connecting to 192.168.1.155:27018
2021-09-24T13:58:25.371+0800 I REPL [replexec-0] Member 192.168.1.153:27018 is now in state PRIMARY
2021-09-24T13:58:25.372+0800 I REPL [replexec-2] Member 192.168.1.154:27018 is now in state SECONDARY
2021-09-24T13:58:25.974+0800 I REPL [rsSync-0] transition to SECONDARY from RECOVERING
```
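
Because the copied data directory already contains everything up to the point where S1 was shut down, the node skips the full initial sync and only replays the oplog delta from its sync source, which is why it moves straight from REMOVED through RECOVERING to SECONDARY. Once it reports SECONDARY you can confirm it has caught up (on 4.0-era shells the helper is rs.printSlaveReplicationInfo(); from 4.4 it is named rs.printSecondaryReplicationInfo()):

```js
// Reports how far each secondary's newest oplog entry lags behind the primary.
rs.printSlaveReplicationInfo();
```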