I. Shard Planning

Role          IPs                                Replica set    Port
mongos        192.168.1.153 / .154 / .155        -              27000
configServer  192.168.1.153 / .154 / .155        repl_config    27100
shard1        192.168.1.153 / .154 / .155        shard1         27101 (1 primary, 2 secondaries)
shard2        192.168.1.153 / .154 / .155        shard2         27102 (1 primary, 2 secondaries)

II. Configuration Steps

1. Download & extract

cd /data/download
wget https://fastdl.mongodb.org/linux/mongodb-linux-x86_64-rhel70-6.0.16.tgz
tar zxvf mongodb-linux-x86_64-rhel70-6.0.16.tgz
mkdir -p /data/mongodb/mongodb_shard
cp -r /data/download/mongodb-linux-x86_64-rhel70-6.0.16/bin /data/mongodb/mongodb_shard/bin

2. Directory layout

cd /data/mongodb/mongodb_shard
mkdir -p {auth,conf,shard1_27101/{data,log},shard2_27102/{data,log},config_27100/{data,log},mongos_27000/{data,log}}
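The single mkdir above relies on bash brace expansion to create the whole nested tree in one command. A minimal sketch of the same pattern under a throwaway /tmp prefix (hypothetical path, for illustration only):

```shell
#!/usr/bin/env bash
# Brace expansion creates every nested data/log directory in one command
mkdir -p /tmp/shard_demo/{auth,conf,shard1_27101/{data,log},config_27100/{data,log}}
# List what was created
find /tmp/shard_demo -type d | sort
```

Note that brace expansion is a bash feature; under a plain POSIX sh the braces would be taken literally.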

3. Generate the keyfile

echo "mongodb123456" > /data/mongodb/mongodb_shard/auth/keyfile.key
cd /data/mongodb/mongodb_shard
chmod 600 auth/keyfile.key
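A fixed password string works as keyfile content, but the official MongoDB docs recommend random base64 data generated with openssl. A sketch writing to a hypothetical /tmp path for illustration:

```shell
# Generate random base64 keyfile content, as the MongoDB docs recommend
openssl rand -base64 756 > /tmp/demo_keyfile.key
# mongod refuses to start with a keyfile that is group- or world-readable
chmod 600 /tmp/demo_keyfile.key
# Show the resulting size in bytes
wc -c < /tmp/demo_keyfile.key
```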

4. Configuration files

(1) Config server configuration

cat > /data/mongodb/mongodb_shard/conf/config_27100.conf <<EOF
systemLog:
  destination: file
  logAppend: true
  path: /data/mongodb/mongodb_shard/config_27100/log/config_27100.log
storage:
  dbPath: /data/mongodb/mongodb_shard/config_27100/data
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      directoryForIndexes: true
      cacheSizeGB: 1
processManagement:
  fork: true
  pidFilePath: /data/mongodb/mongodb_shard/config_27100/data/config_27100.pid
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27100
  bindIp: 0.0.0.0
  #bindIpAll: true
  maxIncomingConnections: 2000
  unixDomainSocket:
    enabled: true
    pathPrefix: /data/mongodb/mongodb_shard/config_27100/data
    filePermissions: 0700
security:
  keyFile: /data/mongodb/mongodb_shard/auth/keyfile.key
  authorization: enabled
replication:
  replSetName: repl_config
sharding:
  clusterRole: configsvr
EOF

(2) shard1 configuration

cat > /data/mongodb/mongodb_shard/conf/shard1_27101.conf <<EOF
systemLog:
  destination: file
  logAppend: true
  path: /data/mongodb/mongodb_shard/shard1_27101/log/shard1_27101.log
storage:
  dbPath: /data/mongodb/mongodb_shard/shard1_27101/data
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      directoryForIndexes: true
      cacheSizeGB: 1
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /data/mongodb/mongodb_shard/shard1_27101/data/shard1_27101.pid
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27101
  bindIp: 0.0.0.0
  #bindIpAll: true
  maxIncomingConnections: 5000
  unixDomainSocket:
    enabled: true
    pathPrefix: /data/mongodb/mongodb_shard/shard1_27101/data
    filePermissions: 0700
security:
  keyFile: /data/mongodb/mongodb_shard/auth/keyfile.key
  authorization: enabled
replication:
  replSetName: shard1
sharding:
  clusterRole: shardsvr
EOF

(3) shard2 configuration

cat > /data/mongodb/mongodb_shard/conf/shard2_27102.conf <<EOF
systemLog:
  destination: file
  logAppend: true
  path: /data/mongodb/mongodb_shard/shard2_27102/log/shard2_27102.log
storage:
  dbPath: /data/mongodb/mongodb_shard/shard2_27102/data
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      directoryForIndexes: true
      cacheSizeGB: 1
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /data/mongodb/mongodb_shard/shard2_27102/data/shard2_27102.pid
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27102
  bindIp: 0.0.0.0
  #bindIpAll: true
  maxIncomingConnections: 5000
  unixDomainSocket:
    enabled: true
    pathPrefix: /data/mongodb/mongodb_shard/shard2_27102/data
    filePermissions: 0700
security:
  keyFile: /data/mongodb/mongodb_shard/auth/keyfile.key
  authorization: enabled
replication:
  replSetName: shard2
sharding:
  clusterRole: shardsvr
EOF

(4) mongos configuration

cat > /data/mongodb/mongodb_shard/conf/mongos_27000.conf <<EOF
systemLog:
  destination: file
  logAppend: true
  path: /data/mongodb/mongodb_shard/mongos_27000/log/mongos_27000.log
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /data/mongodb/mongodb_shard/mongos_27000/data/mongos_27000.pid
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27000
  bindIpAll: true
  maxIncomingConnections: 2000
  unixDomainSocket:
    enabled: true
    pathPrefix: /data/mongodb/mongodb_shard/mongos_27000/data
    filePermissions: 0700
security:
  keyFile: /data/mongodb/mongodb_shard/auth/keyfile.key
sharding:
  configDB: repl_config/192.168.1.153:27100,192.168.1.154:27100,192.168.1.155:27100
EOF

5. Set ownership

useradd mongodb
chown -R mongodb:mongodb /data/mongodb/mongodb_shard

6. Initialize the config server

(1) Start the config server

sudo su - mongodb -c '/data/mongodb/mongodb_shard/bin/mongod -f /data/mongodb/mongodb_shard/conf/config_27100.conf'

(2) Initialize the replica set

mongo --port 27100
use admin
config = { _id:"repl_config",members:[ {_id:0,host:"192.168.1.153:27100"}, {_id:1,host:"192.168.1.154:27100"}, {_id:2,host:"192.168.1.155:27100"}] }
rs.initiate(config)

Output like the following indicates the replica set initialized successfully. Use rs.status() to inspect the cluster state (.153 becomes the primary, .154/.155 the secondaries):

{
  "ok" : 1,
  "$gleStats" : {
    "lastOpTime" : Timestamp(1557909332, 1),
    "electionId" : ObjectId("000000000000000000000000")
  },
  "lastCommittedOpTime" : Timestamp(0, 0)
}

Note:

The config server may technically be a single node, but the official recommendation is a replica set, and a config server replica set cannot contain an arbiter. Since MongoDB 3.4 the config servers are required to run as a replica set (CSRS), and arbiters are still not supported.

(3) Create an admin user on the config server

use admin
db.createUser(
  {
    user: "root",
    pwd: "000000",
    roles: [ { role: "root", db: "admin" } ]
  }
)

7. Initialize the shard1 & shard2 data nodes

(1) Start shard1 & shard2

sudo su - mongodb -c '/data/mongodb/mongodb_shard/bin/mongod -f /data/mongodb/mongodb_shard/conf/shard1_27101.conf'
sudo su - mongodb -c '/data/mongodb/mongodb_shard/bin/mongod -f /data/mongodb/mongodb_shard/conf/shard2_27102.conf'

(2) Initialize the replica sets

mongo --port 27101
use admin
config = { _id:"shard1",members:[ {_id:0,host:"192.168.1.153:27101"}, {_id:1,host:"192.168.1.154:27101"},{_id:2,host:"192.168.1.155:27101"}] }
# initialize
rs.initiate(config)
{ "ok" : 1 }
mongo --port 27102
use admin
config = { _id:"shard2",members:[ {_id:0,host:"192.168.1.153:27102"}, {_id:1,host:"192.168.1.154:27102"},{_id:2,host:"192.168.1.155:27102"}] }
# initialize
rs.initiate(config)
{ "ok" : 1 }

Output like the above indicates the replica sets initialized successfully. Use rs.status() to verify the state (each set has one primary and two secondaries).

(3) Create an admin user on the shard nodes

use admin
db.createUser(
  {
    user: "root",
    pwd: "000000",
    roles: [ { role: "root", db: "admin" } ]
  }
)

8. Configure the router (mongos)

(1) Start mongos

sudo su - mongodb -c '/data/mongodb/mongodb_shard/bin/mongos -f /data/mongodb/mongodb_shard/conf/mongos_27000.conf'

(2) Verify it is running

mongo --port 27000
ps -ef |grep mongos
netstat -tnlp |grep mongos

(3) Log in to mongos

mongos> use admin
switched to db admin
mongos> db.auth('root','000000')
1

9. Sharded cluster operations

Connect to any one of the mongos instances (e.g. 192.168.1.153) and perform the following configuration:

(1) Connect to the admin database on mongos

su - mongodb
mongo 192.168.1.153:27000/admin

(2) Add shards: register shard1 and shard2 with mongos

sh.addShard('shard1/192.168.1.153:27101,192.168.1.154:27101,192.168.1.155:27101')
sh.addShard('shard2/192.168.1.153:27102,192.168.1.154:27102,192.168.1.155:27102')

(3) List the shards

db.runCommand( { listshards : 1 } )

(4) View the overall status

sh.status();

III. Sharding Configuration

1. Range-based sharding: configuration and testing

We manually shard the large collection vast in the test database. To observe the effect of sharding more easily, reduce the chunk size from its default of 128 MB. Switch to the config database:

mongos> use config
mongos> db.settings.updateOne(
   { _id: "chunksize" },
   { $set: { _id: "chunksize", value: 1 } },
   { upsert: true }
)

(1) Enable sharding for the database

mongo --port 27000 admin
admin> use admin
# Starting with MongoDB 6.0, sh.enableSharding() no longer needs to be run before sharding a collection
# In 5.0: admin> db.runCommand( { enablesharding : "test" } )   // enable sharding for the test database
mongos> use config
switched to db config
mongos> db.databases.find()
{ "_id" : "test", "primary" : "shard2", "partitioned" : true, "version" : { "uuid" : UUID("252afeae-277e-44d7-9575-1363acdcfa22"), "timestamp" : Timestamp(1665996971, 1), "lastMod" : 1 } }

(2) Shard the collection on a specified shard key

Example: a range shard key. First create the index:

mongos> use test
mongos> db.vast.createIndex({"id": 1})
Shard vast on the id key:
mongos> use admin
mongos> sh.shardCollection("test.vast", { id: 1 } )
# equivalent to
db.runCommand( { shardcollection : "test.vast",key : {id: 1} } )

(3) Verify the collection sharding

mongos> use test
 Insert 20,000 documents:
mongos> for(i=1;i<20000;i++){
db.vast.insert({"id":i,"name":"shenzheng_abcdlkakjkjlkjkljlkjlkjkklllladfadfadadfdadfa","age":70,"date":new Date()}); }
 Check the distribution; all documents sit in a single chunk on one shard:
mongos> db.vast.getShardDistribution();
Shard shard2 at
shard2/192.168.1.153:27102,192.168.1.154:27102,192.168.1.155:27102
 data : 2.42MiB docs : 19999 chunks : 1
 estimated data per chunk : 2.42MiB
 estimated docs per chunk : 19999
Totals
 data : 2.42MiB docs : 19999 chunks : 1
 Shard shard2 contains 100% data, 100% docs in cluster, avg obj size on shard : 127B
 Insert another 10,000 documents:
mongos> for(i=20000;i<30000;i++){
db.vast.insert({"id":i,"name":"shenzheng_abcdlkakjkjlkjkljlkjlkjkklllladfadfadadfdadfa","age":70,"date":new Date()}); }
 Check the distribution again; the data is still uneven:
db.vast.getShardDistribution();
Shard shard1 at
shard1/192.168.1.153:27101,192.168.1.154:27101,192.168.1.155:27101
 data : 1023KiB docs : 8255 chunks : 1
 estimated data per chunk : 1023KiB
 estimated docs per chunk : 8255
Shard shard2 at
shard2/192.168.1.153:27102,192.168.1.154:27102,192.168.1.155:27102
 data : 2.63MiB docs : 21744 chunks : 1
 estimated data per chunk : 2.63MiB
 estimated docs per chunk : 21744
Totals
 data : 3.63MiB docs : 29999 chunks : 2
 Shard shard1 contains 27.51% data, 27.51% docs in cluster, avg obj size on shard : 127B
 Shard shard2 contains 72.48% data, 72.48% docs in cluster, avg obj size on shard : 127B
 Insert 100,000 more documents, wait a few minutes, then check again:
mongos> db.vast.getShardDistribution();
Shard shard1 at
shard1/192.168.1.153:27101,192.168.1.154:27101,192.168.1.155:27101
 data : 4.99MiB docs : 41275 chunks : 5
 estimated data per chunk : 1023KiB
 estimated docs per chunk : 8255
Shard shard2 at
shard2/192.168.1.153:27102,192.168.1.154:27102,192.168.1.155:27102
 data : 7.11MiB docs : 58722 chunks : 1
 estimated data per chunk : 7.11MiB
 estimated docs per chunk : 58722
Totals
 data : 12.11MiB docs : 99997 chunks : 6
 Shard shard1 contains 41.27% data, 41.27% docs in cluster, avg obj size on shard : 127B
 Shard shard2 contains 58.72% data, 58.72% docs in cluster, avg obj size on shard : 127B

Relevant official documentation:

https://www.mongodb.com/zh-cn/docs/v6.0/faq/diagnostics/#sharded-cluster-diagnostics
https://www.mongodb.com/zh-cn/docs/manual/release-notes/6.0/#sharding
https://www.mongodb.com/zh-cn/docs/manual/tutorial/manage-sharded-cluster-balancer/
https://www.mongodb.com/zh-cn/docs/v6.0/tutorial/manage-sharded-cluster-balancer/#configure-default-range-size
https://www.mongodb.com/zh-cn/docs/v6.0/core/sharding-balancer-administration/#std-label-sharding-balancing

Summary:

1. getShardDistribution() reports estimated values, not exact counts.
2. Starting with 6.0.3, automatic chunk splitting is no longer performed.
3. The cluster needs enough data before balancing starts: at least 8 × chunkSize, otherwise the data stays on one shard.
4. Hot key ranges concentrate writes on one shard (e.g. a for-loop bulk insert with a monotonically increasing key).
5. chunkSize only affects newly created chunks.
6. Balancing is an asynchronous background operation, driven entirely by the system's migration algorithm.
7. Balancing is approximate, not exact.
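Point 3 can be sanity-checked with quick arithmetic, assuming the 1 MB chunk size configured earlier (the 8 × chunkSize figure is the one quoted in the summary above):

```shell
# With chunksize = 1 MB, balancing will not start until roughly 8 MB of data exists
chunk_size_mb=1
threshold_mb=$((8 * chunk_size_mb))
echo "balancing starts at roughly ${threshold_mb} MB"
# The ~12 MB of test data inserted above is past this threshold, which is
# why chunks finally migrated after the third batch of inserts
```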

(4) Testing the sharding result

4.1 View the distribution with sh.status(); with this much data the output is not very readable

mongos> sh.status()
- Sharding Status -
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("66a61506264e094d6991ba63")
  }
  shards:
        {  "_id" : "shard1",  "host" : "shard1/192.168.1.153:27101,192.168.1.154:27101,192.168.1.155:27101",  "state" : 1,  "topologyTime" : Timestamp(1722161219, 5) }
        {  "_id" : "shard2",  "host" : "shard2/192.168.1.153:27102,192.168.1.154:27102,192.168.1.155:27102",  "state" : 1,  "topologyTime" : Timestamp(1722161390, 5) }
  active mongoses:
        "6.0.16" : 3
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled: yes
        Currently running: no
        Failed balancer rounds in last 5 attempts: 0
        Migration results for the last 24 hours:
                No recent migrations
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                shard1  1024   // number of chunks
                        too many chunks to print, use verbose if you want to force print
        {  "_id" : "test",  "primary" : "shard2",  "partitioned" : false, "version" : {  "uuid" : UUID("4fa1ab60-e18d-41cf-8bff-d25f48e7aa98"), "timestamp" : Timestamp(1722161591, 1),  "lastMod" : 1 } }

Use sh.status({verbose:true}) to print all details.

4.2 getShardDistribution() shows estimates, not exact values

node1> mongo 192.168.1.153:27000/admin
mongos> use admin
mongos> db.auth('root','000000')
mongos> use test
mongos> db.vast.getShardDistribution();

4.3 Count documents directly on each shard (accurate)

shard1:
/data/mongodb/mongodb_shard/bin/mongo 192.168.1.153:27101/admin -uroot -p000000
shard1:PRIMARY> use test
shard1:PRIMARY> db.vast.count()
267902
shard2:
/data/mongodb/mongodb_shard/bin/mongo 192.168.1.153:27102/admin -uroot -p000000
shard2:PRIMARY> use test
shard2:PRIMARY> db.vast.count()
262146

2. Hashed sharding example

Hash-shard the large vast_h collection in the test database by creating a hashed index.

(1) Enable sharding for the test database

mongo --port 27000 admin
use admin
admin> db.runCommand( { enablesharding : "test" } )

(2) Create a hashed index on test.vast_h

use test
test> db.vast_h.createIndex( { id: "hashed" } )

(3) Shard the collection

use admin
admin > sh.shardCollection( "test.vast_h", { id: "hashed" } )

(4) Insert 10,000 test documents

use test
for(i=1;i<10000;i++){
db.vast_h.insert({"id":i,"name":"shenzheng","age":70,"date":new Date()}); }

(5) Verify the hashed sharding result

Use getShardDistribution() to view the statistics:

mongos> db.vast_h.getShardDistribution();
Shard shard1 at
shard1/192.168.1.153:27101,192.168.1.154:27101,192.168.1.155:27101
 data : 394KiB docs : 4992 chunks : 2
 estimated data per chunk : 197KiB
 estimated docs per chunk : 2496
Shard shard2 at
shard2/192.168.1.153:27102,192.168.1.154:27102,192.168.1.155:27102
 data : 396KiB docs : 5007 chunks : 2
 estimated data per chunk : 198KiB
 estimated docs per chunk : 2503
Totals
 data : 790KiB docs : 9999 chunks : 4
 Shard shard1 contains 49.92% data, 49.92% docs in cluster, avg obj size on shard : 81B
 Shard shard2 contains 50.07% data, 50.07% docs in cluster, avg obj size on shard : 81B
 Count the documents on each shard:
shard1:
/data/mongodb/mongodb_shard/bin/mongo 192.168.1.153:27101/admin -uroot -p000000
shard1:PRIMARY> use test
shard1:PRIMARY> db.vast_h.count()
4992
shard2:
/data/mongodb/mongodb_shard/bin/mongo 192.168.1.153:27102/admin -uroot -p000000
shard2:PRIMARY> use test
shard2:PRIMARY> db.vast_h.count()
5007
mongos> sh.status()
- Sharding Status -
test.vast_h
                        shard key: { "id" : "hashed" }
                        unique: false
                        balancing: true
                        chunks:
                                shard1  2
                                shard2  2

IV. Shard Management

1. Check whether this is a sharded cluster

admin> db.runCommand({ isdbgrid : 1})

2. List all shards

admin> db.runCommand({ listshards : 1})

3. List databases with sharding enabled

admin> use config
config> db.databases.find( { "partitioned": true } )
config> db.databases.find() // list the sharding status of all databases

4. View the shard key of a collection

config> db.collections.find().pretty()
{
  "_id" : "test.vast",
  "lastmodEpoch" : ObjectId("58a599f19c898bbfb818b63c"),
  "lastmod" : ISODate("1970-02-19T17:02:47.296Z"),
  "dropped" : false,
  "key" : {
    "id" : 1
  },
  "unique" : false
}

5. View detailed sharding information

admin> db.printShardingStatus()
db.printShardingStatus({verbose:true})
admin> sh.status()

6. Remove a shard (use with caution)

 Confirm the balancer is running:
mongos> sh.getBalancerState()
true
 Remove the shard2 node (caution):
mongos> db.runCommand( { removeShard: "shard2" } )

Note: removing a shard immediately triggers the balancer.

7. Balancer operations

Overview: the balancer is an important background process of the sharded cluster; it continuously inspects the chunk distribution across all shards and migrates chunks automatically. When does it run?

  • Automatically, migrating when it detects the system is not busy
  • Immediately, when a shard is being removed
  • Only within the preset balancing window, if one is configured

The balancer can be stopped and restarted when needed (e.g. during backups):

mongos> sh.stopBalancer()
mongos> sh.startBalancer()

8. Schedule the balancing window

Reference: https://docs.mongodb.com/manual/tutorial/manage-sharded-cluster-balancer/#schedule-the-balancing-window

// connect to mongos

mongos> use config
mongos> sh.setBalancerState( true )
mongos> db.settings.update({ _id : "balancer" }, { $set : { activeWindow : { start : "3:00", stop : "5:00" } } }, true )
mongos> sh.getBalancerWindow()
{ "start" : "3:00", "stop" : "5:00" }
sh.status()

9. Common chunk commands

(1) Count the chunks on shard1 and shard2:

mongos> use config
mongos> db.chunks.find({"shard" : "shard1"}).count()
512
mongos> db.chunks.find({"shard" : "shard2"}).count()
512

(2) View the chunk size

mongos> use config
mongos> db.settings.find()
{ "_id" : "chunksize", "value" : 1 }
{ "_id" : "balancer", "mode" : "full", "stopped" : false }
{ "_id" : "autosplit", "enabled" : true }

(3) Change the chunk size

# switch to the config database
use config
# set the chunk size to 1 MB
db.settings.save({"_id":"chunksize","value":1})
# or, equivalently:
db.settings.updateOne(
   { _id: "chunksize" },
   { $set: { _id: "chunksize", value: 1} },
   { upsert: true }
)

(4) Check whether splits or migrations occurred

mongos> use config
switched to db config
mongos> db.changelog.find({what: "split"}).count()
0
mongos> db.changelog.find({what: "moveChunk.commit"}).count()
7

Seven migrations occurred, and no splits.