I. Shard Planning¶
| Role | IP | Replica set | Port |
|---|---|---|---|
| mongos | 192.168.1.153, 192.168.1.154, 192.168.1.155 | none | 27000 |
| config server | 192.168.1.153, 192.168.1.154, 192.168.1.155 | repl_config | 27100 |
| shard1 | 192.168.1.153, 192.168.1.154, 192.168.1.155 | shard1 | 27101 (1 primary, 2 secondaries) |
| shard2 | 192.168.1.153, 192.168.1.154, 192.168.1.155 | shard2 | 27102 (1 primary, 2 secondaries) |
II. Configuration Process¶
1. Download & extract
cd /data/download
wget https://fastdl.mongodb.org/linux/mongodb-linux-x86_64-rhel70-6.0.16.tgz
tar zxvf mongodb-linux-x86_64-rhel70-6.0.16.tgz
mkdir -p /data/mongodb/mongodb_shard
cp -r /data/download/mongodb-linux-x86_64-rhel70-6.0.16/bin /data/mongodb/mongodb_shard/bin
2. Directory layout
cd /data/mongodb/mongodb_shard
mkdir -p {auth,conf,shard1_27101/{data,log},shard2_27102/{data,log},config_27100/{data,log},mongos_27000/{data,log}}
3. Generate the keyfile
echo "mongodb123456" > /data/mongodb/mongodb_shard/auth/keyfile.key
cd /data/mongodb/mongodb_shard
chmod 600 auth/keyfile.key
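The fixed string above satisfies mongod's keyfile rules (6–1024 base64 characters, file not group/world readable), but a random key is safer in production. A minimal sketch of generating one, in Python for illustration; the relative path is a placeholder for /data/mongodb/mongodb_shard/auth/keyfile.key:

```python
import base64
import os
import secrets

def write_keyfile(path: str) -> None:
    """Write a random MongoDB keyfile: base64 text, 6-1024 chars, mode 600."""
    key = base64.b64encode(secrets.token_bytes(756)).decode()  # 1008 chars
    with open(path, "w") as f:
        f.write(key + "\n")
    os.chmod(path, 0o600)  # mongod refuses group/world-readable keyfiles

write_keyfile("keyfile.key")
```

The same keyfile must be copied to all three nodes, since every member authenticates with it.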
4. Configuration files
(1) Config server configuration
cat > /data/mongodb/mongodb_shard/conf/config_27100.conf <<EOF
systemLog:
  destination: file
  logAppend: true
  path: /data/mongodb/mongodb_shard/config_27100/log/config_27100.log
storage:
  dbPath: /data/mongodb/mongodb_shard/config_27100/data
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      directoryForIndexes: true
      cacheSizeGB: 1
processManagement:
  fork: true
  pidFilePath: /data/mongodb/mongodb_shard/config_27100/data/config_27100.pid
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27100
  bindIp: 0.0.0.0
  #bindIpAll: true
  maxIncomingConnections: 2000
  unixDomainSocket:
    enabled: true
    pathPrefix: /data/mongodb/mongodb_shard/config_27100/data
    filePermissions: 0700
security:
  keyFile: /data/mongodb/mongodb_shard/auth/keyfile.key
  authorization: enabled
replication:
  replSetName: repl_config
sharding:
  clusterRole: configsvr
EOF
(2) shard1 configuration
cat > /data/mongodb/mongodb_shard/conf/shard1_27101.conf <<EOF
systemLog:
  destination: file
  logAppend: true
  path: /data/mongodb/mongodb_shard/shard1_27101/log/shard1_27101.log
storage:
  dbPath: /data/mongodb/mongodb_shard/shard1_27101/data
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      directoryForIndexes: true
      cacheSizeGB: 1
processManagement:
  fork: true # fork and run in background
  pidFilePath: /data/mongodb/mongodb_shard/shard1_27101/data/shard1_27101.pid
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27101
  bindIp: 0.0.0.0
  #bindIpAll: true
  maxIncomingConnections: 5000
  unixDomainSocket:
    enabled: true
    pathPrefix: /data/mongodb/mongodb_shard/shard1_27101/data
    filePermissions: 0700
security:
  keyFile: /data/mongodb/mongodb_shard/auth/keyfile.key
  authorization: enabled
replication:
  replSetName: shard1
sharding:
  clusterRole: shardsvr
EOF
(3) shard2 configuration
cat > /data/mongodb/mongodb_shard/conf/shard2_27102.conf <<EOF
systemLog:
  destination: file
  logAppend: true
  path: /data/mongodb/mongodb_shard/shard2_27102/log/shard2_27102.log
storage:
  dbPath: /data/mongodb/mongodb_shard/shard2_27102/data
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      directoryForIndexes: true
      cacheSizeGB: 1
processManagement:
  fork: true # fork and run in background
  pidFilePath: /data/mongodb/mongodb_shard/shard2_27102/data/shard2_27102.pid
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27102
  bindIp: 0.0.0.0
  #bindIpAll: true
  maxIncomingConnections: 5000
  unixDomainSocket:
    enabled: true
    pathPrefix: /data/mongodb/mongodb_shard/shard2_27102/data
    filePermissions: 0700
security:
  keyFile: /data/mongodb/mongodb_shard/auth/keyfile.key
  authorization: enabled
replication:
  replSetName: shard2
sharding:
  clusterRole: shardsvr
EOF
(4) mongos configuration
cat > /data/mongodb/mongodb_shard/conf/mongos_27000.conf <<EOF
systemLog:
  destination: file
  logAppend: true
  path: /data/mongodb/mongodb_shard/mongos_27000/log/mongos_27000.log
processManagement:
  fork: true # fork and run in background
  pidFilePath: /data/mongodb/mongodb_shard/mongos_27000/data/mongos_27000.pid
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27000
  bindIpAll: true
  maxIncomingConnections: 2000
  unixDomainSocket:
    enabled: true
    pathPrefix: /data/mongodb/mongodb_shard/mongos_27000/data
    filePermissions: 0700
security:
  keyFile: /data/mongodb/mongodb_shard/auth/keyfile.key
sharding:
  configDB: repl_config/192.168.1.153:27100,192.168.1.154:27100,192.168.1.155:27100
EOF
5. Set ownership
useradd mongodb
chown -R mongodb:mongodb /data/mongodb/mongodb_shard
6. Initialize the config server replica set
(1) Start the config servers
sudo su - mongodb -c '/data/mongodb/mongodb_shard/bin/mongod -f /data/mongodb/mongodb_shard/conf/config_27100.conf'
(2) Initiate the replica set
mongo --port 27100
use admin
config = { _id:"repl_config",members:[ {_id:0,host:"192.168.1.153:27100"}, {_id:1,host:"192.168.1.154:27100"}, {_id:2,host:"192.168.1.155:27100"}] }
rs.initiate(config)
Output like the following means the replica set was initialized successfully; check its state with rs.status() (one node becomes primary, the other two secondaries):
{
  "ok" : 1,
  "$gleStats" : {
    "lastOpTime" : Timestamp(1557909332, 1),
    "electionId" : ObjectId("000000000000000000000000")
  },
  "lastCommittedOpTime" : Timestamp(0, 0)
}
Note:
The config server may technically be a single node, but the official recommendation is a replica set, and it must not contain an arbiter. Since MongoDB 3.4 the config server is required to be a replica set, and arbiters are still unsupported in it.
(3) Create the config server admin account
use admin
db.createUser(
  {
    user: "root",
    pwd: "000000",
    roles: [ { role: "root", db: "admin" } ]
  }
)
7. Initialize the shard1 & shard2 data nodes
(1) Start shard1 & shard2
sudo su - mongodb -c '/data/mongodb/mongodb_shard/bin/mongod -f /data/mongodb/mongodb_shard/conf/shard1_27101.conf'
sudo su - mongodb -c '/data/mongodb/mongodb_shard/bin/mongod -f /data/mongodb/mongodb_shard/conf/shard2_27102.conf'
(2) Initiate the replica sets
mongo --port 27101
use admin
config = { _id:"shard1",members:[ {_id:0,host:"192.168.1.153:27101"}, {_id:1,host:"192.168.1.154:27101"},{_id:2,host:"192.168.1.155:27101"}] }
# initiate
rs.initiate(config)
{ "ok" : 1 }
mongo --port 27102
use admin
config = { _id:"shard2",members:[ {_id:0,host:"192.168.1.153:27102"}, {_id:1,host:"192.168.1.154:27102"},{_id:2,host:"192.168.1.155:27102"}] }
# initiate
rs.initiate(config)
{ "ok" : 1 }
Output like the above means both replica sets were initialized successfully; check their state with rs.status() (one node is primary, the other two are secondaries).
(3) Create the admin accounts on the data nodes
use admin
db.createUser(
  {
    user: "root",
    pwd: "000000",
    roles: [ { role: "root", db: "admin" } ]
  }
)
8. Configure the router node
(1) Start mongos
sudo su - mongodb -c '/data/mongodb/mongodb_shard/bin/mongos -f /data/mongodb/mongodb_shard/conf/mongos_27000.conf'
(2) Check that it is running
ps -ef |grep mongos
netstat -tnlp |grep mongos
(3) Log in to mongos
mongo --port 27000
mongos> use admin
switched to db admin
mongos> db.auth('root','000000')
1
9. Sharded cluster operations
Connect to any one of the mongos instances (192.168.1.153) and do the following:
(1) Connect to the admin database on mongos
su - mongodb
mongo 192.168.1.153:27000/admin
(2) Add shards: register shard1 and shard2 through mongos
sh.addShard('shard1/192.168.1.153:27101,192.168.1.154:27101,192.168.1.155:27101')
sh.addShard('shard2/192.168.1.153:27102,192.168.1.154:27102,192.168.1.155:27102')
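The addShard argument uses the `<replSetName>/<host:port>,<host:port>,...` form. A small sketch (a hypothetical helper, in Python for illustration) that splits such a string the way it is structured:

```python
def parse_shard_url(url: str) -> tuple[str, list[str]]:
    """Split 'rsName/host1:port,host2:port,...' into (rsName, member list)."""
    rs_name, _, hosts = url.partition("/")
    return rs_name, hosts.split(",")

name, members = parse_shard_url(
    "shard1/192.168.1.153:27101,192.168.1.154:27101,192.168.1.155:27101"
)
# name is the replica set name; members are the seed host:port pairs
```

mongos only needs a seed list: it discovers the remaining members of each shard replica set on its own.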
(3) List the shards
db.runCommand( { listshards : 1 } )
(4) View the overall status
sh.status();
III. Sharding Configuration¶
1. Range sharding: configuration and testing
Manually shard the large collection vast in the test database. To observe the effect more clearly, lower the chunk size from its default of 128 MB. Switch to the config database:
mongos> use config
mongos> db.settings.updateOne(
{ _id: "chunksize" },
{ $set: { _id: "chunksize", value: 1 } },
{ upsert: true }
)
(1) Enable sharding for the database
mongo --port 27000 admin
admin> use admin
# Starting with MongoDB 6.0, sharding a collection no longer requires running sh.enableSharding() on the database first.
# In 5.0: admin> db.runCommand( { enablesharding : "test" } ) // enable sharding on the test database
mongos> use config
switched to db config
mongos> db.databases.find()
{ "_id" : "test", "primary" : "shard2", "partitioned" : true, "version" : { "uuid" : UUID("252afeae-277e-44d7-9575-1363acdcfa22"), "timestamp" : Timestamp(1665996971, 1), "lastMod" : 1 } }
(2) Shard the collection on a specified shard key
Example: a range shard key. Create the index first:
mongos> use test
mongos> db.vast.createIndex({"id": 1})
Shard vast on id:
mongos> use admin
mongos> sh.shardCollection("test.vast", { id: 1 } )
# equivalent to
db.runCommand( { shardcollection : "test.vast",key : {id: 1} } )
(3) Verify collection sharding
mongos> use test
Insert 20,000 documents:
mongos> for(i=1;i<20000;i++){
db.vast.insert({"id":i,"name":"shenzheng_abcdlkakjkjlkjkljlkjlkjkklllladfadfadadfdadfa","age":70,"date":new Date()}); }
Check the distribution: all documents sit in a single chunk on one shard
mongos> db.vast.getShardDistribution();
Shard shard2 at
shard2/192.168.1.153:27102,192.168.1.154:27102,192.168.1.155:27102
data : 2.42MiB docs : 19999 chunks : 1
estimated data per chunk : 2.42MiB
estimated docs per chunk : 19999
Totals
data : 2.42MiB docs : 19999 chunks : 1
Shard shard2 contains 100% data, 100% docs in cluster, avg obj size on shard : 127B
Insert another 10,000 documents
mongos> for(i=20000;i<30000;i++){
db.vast.insert({"id":i,"name":"shenzheng_abcdlkakjkjlkjkljlkjlkjkklllladfadfadadfdadfa","age":70,"date":new Date()}); }
Check the distribution again; the data is still uneven
db.vast.getShardDistribution();
Shard shard1 at
shard1/192.168.1.153:27101,192.168.1.154:27101,192.168.1.155:27101
data : 1023KiB docs : 8255 chunks : 1
estimated data per chunk : 1023KiB
estimated docs per chunk : 8255
Shard shard2 at
shard2/192.168.1.153:27102,192.168.1.154:27102,192.168.1.155:27102
data : 2.63MiB docs : 21744 chunks : 1
estimated data per chunk : 2.63MiB
estimated docs per chunk : 21744
Totals
data : 3.63MiB docs : 29999 chunks : 2
Shard shard1 contains 27.51% data, 27.51% docs in cluster, avg obj size on shard : 127B
Shard shard2 contains 72.48% data, 72.48% docs in cluster, avg obj size on shard : 127B
Insert another 100,000 documents, wait a few minutes, then check again
mongos> db.vast.getShardDistribution();
Shard shard1 at
shard1/192.168.1.153:27101,192.168.1.154:27101,192.168.1.155:27101
data : 4.99MiB docs : 41275 chunks : 5
estimated data per chunk : 1023KiB
estimated docs per chunk : 8255
Shard shard2 at
shard2/192.168.1.153:27102,192.168.1.154:27102,192.168.1.155:27102
data : 7.11MiB docs : 58722 chunks : 1
estimated data per chunk : 7.11MiB
estimated docs per chunk : 58722
Totals
data : 12.11MiB docs : 99997 chunks : 6
Shard shard1 contains 41.27% data, 41.27% docs in cluster, avg obj size on shard : 127B
Shard shard2 contains 58.72% data, 58.72% docs in cluster, avg obj size on shard : 127B
Relevant official documentation:
- https://www.mongodb.com/zh-cn/docs/v6.0/faq/diagnostics/#sharded-cluster-diagnostics
- https://www.mongodb.com/zh-cn/docs/manual/release-notes/6.0/#sharding
- https://www.mongodb.com/zh-cn/docs/manual/tutorial/manage-sharded-cluster-balancer/
- https://www.mongodb.com/zh-cn/docs/v6.0/tutorial/manage-sharded-cluster-balancer/#configure-default-range-size
- https://www.mongodb.com/zh-cn/docs/v6.0/core/sharding-balancer-administration/#std-label-sharding-balancing
Summary:
1. getShardDistribution() reports estimates, not exact values.
2. Since 6.0.3, automatic chunk splitting is no longer performed.
3. The cluster needs enough data before balancing starts: roughly 8 × chunkSize; below that the data stays on one shard.
4. Hot key ranges concentrate easily (e.g. a for-loop batch-writing monotonically increasing keys).
5. chunkSize only affects newly created chunks.
6. Balancing is an asynchronous background operation; 6.0 introduces a smarter shard-placement algorithm, fully controlled by the system.
7. Balancing is approximate, not exact.
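Point 4 can be illustrated with a toy simulation. This is not MongoDB's real split algorithm: the chunk model and the split-on-threshold rule are simplifying assumptions, made only to show why monotonically increasing ids pile up at the top-end chunk:

```python
def simulate_range_inserts(n_docs: int, split_at: int) -> list[dict]:
    """Toy range-sharding model: each chunk owns an [lo, hi) id range and
    splits once it holds split_at documents. With monotonically increasing
    ids, every insert lands in the open-ended max-key chunk, so only that
    'hot' chunk ever fills up and splits."""
    chunks = [{"lo": 0, "hi": float("inf"), "count": 0}]
    for i in range(n_docs):
        # find the chunk owning id=i (always the open-ended one here)
        c = next(ch for ch in chunks if ch["lo"] <= i < ch["hi"])
        c["count"] += 1
        if c["count"] >= split_at and c["hi"] == float("inf"):
            # split at the current max key: a full closed chunk plus a
            # fresh open-ended chunk holding only the newest document
            chunks.remove(c)
            chunks.append({"lo": c["lo"], "hi": i, "count": c["count"] - 1})
            chunks.append({"lo": i, "hi": float("inf"), "count": 1})
    return chunks

chunks = simulate_range_inserts(10000, 1000)
# all closed chunks are full; only the last, open-ended chunk receives writes
```

Every split happens at the top end of the key space, which is why the real cluster above kept one nearly empty "hot" chunk catching all new inserts until the balancer moved older chunks away.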
(4) Checking the sharding results
4.1 Use sh.status() to view the distribution; with this much data the output is not very readable
mongos> sh.status()
- Sharding Status -
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("66a61506264e094d6991ba63")
}
shards:
{ "_id" : "shard1", "host" : "shard1/192.168.1.153:27101,192.168.1.154:27101,192.168.1.155:27101", "state" : 1, "topologyTime" : Timestamp(1722161219, 5) }
{ "_id" : "shard2", "host" : "shard2/192.168.1.153:27102,192.168.1.154:27102,192.168.1.155:27102", "state" : 1, "topologyTime" : Timestamp(1722161390, 5) }
active mongoses:
"6.0.16" : 3
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
config.system.sessions
shard key: { "_id" : 1 }
unique: false
balancing: true
chunks:
shard1 1024 // this is the number of chunks
too many chunks to print, use verbose if you want to force print
{ "_id" : "test", "primary" : "shard2", "partitioned" : false, "version" : { "uuid" : UUID("4fa1ab60-e18d-41cf-8bff-d25f48e7aa98"), "timestamp" : Timestamp(1722161591, 1), "lastMod" : 1 } }
Use sh.status({verbose:true}) to print all the details.
4.2 getShardDistribution() reports estimates, not actual values
node1> mongo 192.168.1.153:27000/admin
mongos> use admin
mongos> db.auth('root','000000')
mongos> use test
mongos> db.vast.getShardDistribution();
4.3 Count the documents directly on each shard (accurate)
shard1:
/data/mongodb/mongodb_shard/bin/mongo 192.168.1.153:27101/admin -uroot -p000000
shard1:PRIMARY> use test
shard1:PRIMARY> db.vast.count()
267902
shard2:
/data/mongodb/mongodb_shard/bin/mongo 192.168.1.153:27102/admin -uroot -p000000
shard2:PRIMARY> use test
shard2:PRIMARY> db.vast.count()
262146
2. Hashed sharding example
Hash-shard a large collection in the test database by creating a hashed index.
(1) Enable sharding on the test database
mongo --port 27000 admin
use admin
admin> db.runCommand( { enablesharding : "test" } )
(2) Create a hashed index on the vast_h collection in the test database
use test
test> db.vast_h.createIndex( { id: "hashed" } )
(3) Shard the collection
use admin
admin > sh.shardCollection( "test.vast_h", { id: "hashed" } )
(4) Insert 10,000 rows of test data
use test
for(i=1;i<10000;i++){
db.vast_h.insert({"id":i,"name":"shenzheng","age":70,"date":new Date()}); }
(5) Check the hashed sharding results
View the data statistics with getShardDistribution():
mongos> db.vast_h.getShardDistribution();
Shard shard1 at
shard1/192.168.1.153:27101,192.168.1.154:27101,192.168.1.155:27101
data : 394KiB docs : 4992 chunks : 2
estimated data per chunk : 197KiB
estimated docs per chunk : 2496
Shard shard2 at
shard2/192.168.1.153:27102,192.168.1.154:27102,192.168.1.155:27102
data : 396KiB docs : 5007 chunks : 2
estimated data per chunk : 198KiB
estimated docs per chunk : 2503
Totals
data : 790KiB docs : 9999 chunks : 4
Shard shard1 contains 49.92% data, 49.92% docs in cluster, avg obj size on shard : 81B
Shard shard2 contains 50.07% data, 50.07% docs in cluster, avg obj size on shard : 81B
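The near 50/50 split comes from hashing: consecutive ids map to pseudo-random hash values and scatter across the hash-range chunks. A toy sketch of the effect, using Python's md5 as a stand-in (MongoDB's actual hashed shard key is a different 64-bit MD5-derived function of the BSON value):

```python
import hashlib
import struct
from collections import Counter

def hashed_key(v: int) -> int:
    """Stand-in for a hashed shard key: 64 bits taken from md5(value)."""
    digest = hashlib.md5(str(v).encode()).digest()
    return struct.unpack("<q", digest[:8])[0]

# bucket 9,999 monotonically increasing ids into 4 hash-range "chunks"
n_chunks = 4
counts = Counter(hashed_key(i) % n_chunks for i in range(1, 10000))
# each bucket ends up with roughly a quarter of the ids
```

The trade-off is the same as in the real cluster: even write distribution, but range queries on id must hit every shard.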
Count the documents on each shard:
shard1:
/data/mongodb/mongodb_shard/bin/mongo 192.168.1.153:27101/admin -uroot -p000000
shard1:PRIMARY> use test
shard1:PRIMARY> db.vast_h.count()
4992
shard2:
/data/mongodb/mongodb_shard/bin/mongo 192.168.1.153:27102/admin -uroot -p000000
shard2:PRIMARY> use test
shard2:PRIMARY> db.vast_h.count()
5007
mongos> sh.status()
- Sharding Status -
test.vast_h
shard key: { "id" : "hashed" }
unique: false
balancing: true
chunks:
shard1 2
shard2 2
IV. Shard Management¶
1. Check whether the instance is a sharded cluster
admin> db.runCommand({ isdbgrid : 1})
2. List all shard information
admin> db.runCommand({ listshards : 1})
3. List databases with sharding enabled
admin> use config
config> db.databases.find( { "partitioned": true } )
config> db.databases.find() // list the sharding status of every database
4. View the shard keys
config> db.collections.find().pretty()
{
  "_id" : "test.vast",
  "lastmodEpoch" : ObjectId("58a599f19c898bbfb818b63c"),
  "lastmod" : ISODate("1970-02-19T17:02:47.296Z"),
  "dropped" : false,
  "key" : {
    "id" : 1
  },
  "unique" : false
}
5. View detailed sharding information
admin> db.printShardingStatus()
db.printShardingStatus({verbose:true})
admin> sh.status()
6. Remove a shard node (use with caution)
Confirm whether the balancer is running:
mongos> sh.getBalancerState()
true
Remove the shard2 node (use with caution):
mongos> db.runCommand( { removeShard: "shard2" } )
Note: a removeShard operation immediately triggers the balancer to drain the shard's chunks.
7. Balancer operations
Overview: an important cluster function (run on the config server primary since MongoDB 3.4) that automatically inspects the chunk distribution on all shards and migrates chunks as needed. When does it work?
- It runs automatically, preferring times when the system is not busy
- It starts migrating immediately when a shard is being removed
- If a time window is configured, it only runs inside that preset window
The balancer can be stopped and restarted when needed (e.g. during backups):
mongos> sh.stopBalancer()
mongos> sh.startBalancer()
8. Customize the time window for automatic balancing
https://docs.mongodb.com/manual/tutorial/manage-sharded-cluster-balancer/#schedule-the-balancing-window
// connect to mongos
mongos> use config
mongos> sh.setBalancerState( true )
mongos> db.settings.update({ _id : "balancer" }, { $set : { activeWindow : { start : "3:00", stop : "5:00" } } }, true )
mongos> sh.getBalancerWindow()
{ "start" : "3:00", "stop" : "5:00" }
sh.status()
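The window logic itself is simple: the balancer is only eligible to run when the current time falls between start and stop. A sketch of that check with the HH:MM strings used above; treating stop as exclusive and the midnight wrap-around as shown are assumptions of this sketch, not the server's documented semantics:

```python
from datetime import time

def in_window(now: time, start: str, stop: str) -> bool:
    """True if `now` falls in [start, stop); supports windows past midnight."""
    def parse(s: str) -> time:
        h, m = s.split(":")
        return time(int(h), int(m))
    lo, hi = parse(start), parse(stop)
    if lo <= hi:
        return lo <= now < hi
    return now >= lo or now < hi  # e.g. a "23:00" -> "2:00" window

assert in_window(time(4, 0), "3:00", "5:00")
assert not in_window(time(6, 0), "3:00", "5:00")
```

Pick a window that covers your off-peak hours; migrations outside it are deferred, not cancelled.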
9. Common chunk commands
(1) Count the chunks on shard1 and shard2:
mongos> use config
mongos> db.chunks.find({"shard" : "shard1"}).count()
512
mongos> db.chunks.find({"shard" : "shard2"}).count()
512
(2) View the chunk size
mongos> use config
mongos> db.settings.find()
{ "_id" : "chunksize", "value" : 1 }
{ "_id" : "balancer", "mode" : "full", "stopped" : false }
{ "_id" : "autosplit", "enabled" : true }
(3) 修改chunk的大小
#切换到数据库config
use config
#设置块大小为1M
db.settings.save({"_id":"chunksize","value":1})
db.settings.updateOne(
{ _id: "chunksize" },
{ $set: { _id: "chunksize", value: 1} },
{ upsert: true }
)
(4) Check whether migrations and splits occurred
mongos> use config
switched to db config
mongos> db.changelog.find({what: "split"}).count()
0
mongos> db.changelog.find({what: "moveChunk.commit"}).count()
7
Seven migrations occurred and no splits.
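The same kind of tally can be done in one pass by grouping changelog documents on their `what` field. A sketch over sample documents; the field names match config.changelog, but the entries themselves are illustrative:

```python
from collections import Counter

# illustrative entries shaped like config.changelog documents
changelog = [
    {"what": "moveChunk.commit", "ns": "test.vast"},
    {"what": "moveChunk.start", "ns": "test.vast"},
    {"what": "moveChunk.commit", "ns": "test.vast_h"},
    {"what": "addShard", "ns": ""},
]

events = Counter(doc["what"] for doc in changelog)
# Counter returns 0 for event types that never occurred, e.g. "split"
```

In a real cluster you would fetch the documents with db.changelog.find() and tally them the same way.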