最新消息:20210917 已从crifan.com换到crifan.org

【已解决】MongoDB服务报错:pymongo.errors.ServerSelectionTimeoutError localhost Errno 111 Connection refused

MongoDB crifan 4435浏览 0评论

某台服务器中,一直正在正常运行着MongoDB的服务:

mongod

但是突然某次发现一个系统无法使用了,去看log发现Flask中报错是mongodb的问题:

<code>[2018-11-30 20:53:12,763 ERROR 21362 MainProcess 140539632912712 DummyThread-952 app.py:1761 log_exception] Exception on /storybook [GET]
Traceback (most recent call last):
  File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/flask/app.py", line 1811, in full_dispatch_request
    rv = self.preprocess_request()
  File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/flask/app.py", line 2087, in preprocess_request
    rv = func()
  File "/root/xxx/app.py", line 103, in collectApiAccessLog
    insertedLog = logCollection.insert_one(requestInfoDict)
  File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/collection.py", line 693, in insert_one
    session=session),
  File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/collection.py", line 607, in _insert
    bypass_doc_val, session)
  File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/collection.py", line 595, in _insert_one
    acknowledged, _insert_command, session)
  File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/mongo_client.py", line 1242, in _retryable_write
    with self._tmp_session(session) as s:
  File "/root/miniconda3/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/mongo_client.py", line 1571, in _tmp_session
    s = self._ensure_session(session)
  File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/mongo_client.py", line 1558, in _ensure_session
    return self.__start_session(True, causal_consistency=False)
  File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/mongo_client.py", line 1511, in __start_session
    server_session = self._get_server_session()
  File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/mongo_client.py", line 1544, in _get_server_session
    return self._topology.get_server_session()
  File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/topology.py", line 427, in get_server_session
    None)
  File "/root/.local/share/virtualenvs/xxx-se4mevDU/lib/python3.6/site-packages/pymongo/topology.py", line 199, in _select_servers_loop
    self._error_message(selector))
pymongo.errors.ServerSelectionTimeoutError: localhost:32018: [Errno 111] Connection refused
</code>

所以去找找什么原因

先去看看状态:

发现是running exited

然后尝试重启,结果报错

然后再去看状态就是failed的了:

<code>[root@xx-general-01 logs]# service mongod status
● mongod.service - SYSV: Mongo is a scalable, document-oriented database.
   Loaded: loaded (/etc/rc.d/init.d/mongod; bad; vendor preset: disabled)
   Active: failed (Result: exit-code) since Fri 2018-11-30 20:25:22 CST; 52min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 4391 ExecStop=/etc/rc.d/init.d/mongod stop (code=exited, status=0/SUCCESS)
  Process: 4408 ExecStart=/etc/rc.d/init.d/mongod start (code=exited, status=1/FAILURE)

Nov 30 20:25:22 xx-general-01 systemd[1]: Starting SYSV: Mongo is a scalable, document-oriented database....
Nov 30 20:25:22 xx-general-01 runuser[4419]: pam_unix(runuser:session): session opened for user mongod by (uid=0)
Nov 30 20:25:22 xx-general-01 mongod[4408]: Starting mongod: [FAILED]
Nov 30 20:25:22 xx-general-01 systemd[1]: mongod.service: control process exited, code=exited status=1
Nov 30 20:25:22 xx-general-01 systemd[1]: Failed to start SYSV: Mongo is a scalable, document-oriented database..
Nov 30 20:25:22 xx-general-01 systemd[1]: Unit mongod.service entered failed state.
Nov 30 20:25:22 xx-general-01 systemd[1]: mongod.service failed.
</code>

pymongo.errors.ServerSelectionTimeoutError localhost Errno 111 Connection refused

还是要去看看原因:

<code>[root@xx-general-01 logs]# service mongod restart
Restarting mongod (via systemctl):  Job for mongod.service failed because the control process exited with error code. See "systemctl status mongod.service" and "journalctl -xe" for details.
                                                           [FAILED]
[root@xx-general-01 logs]# journalctl -xe
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- Documentation: http://www.freedesktop.org/wiki/Software/systemd/multiseat
--
-- A new session with the ID 34727 has been created for the user root.
--
-- The leading process of the session is 4510.
Nov 30 21:14:59 x-general-01 systemd[1]: Started Session 34727 of user root.
-- Subject: Unit session-34727.scope has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit session-34727.scope has finished starting up.
--
-- The start-up result is done.
Nov 30 21:14:59 xx-general-01 systemd[1]: Starting Session 34727 of user root.
-- Subject: Unit session-34727.scope has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit session-34727.scope has begun starting up.
Nov 30 21:14:59 xx-general-01 sshd[4510]: pam_unix(sshd:session): session opened for user root by (uid=0)
Nov 30 21:18:45 xx-general-01 polkitd[505]: Registered Authentication Agent for unix-process:4568:1729661674 (system bus name :1.69510 [/usr/bin/pkttyagent --notify-fd 5 --fallbac
Nov 30 21:18:45 xx-general-01 systemd[1]: Starting SYSV: Mongo is a scalable, document-oriented database....
-- Subject: Unit mongod.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mongod.service has begun starting up.
Nov 30 21:18:45 xx-general-01 runuser[4586]: pam_unix(runuser:session): session opened for user mongod by (uid=0)
Nov 30 21:18:45 xx-general-01 runuser[4586]: pam_unix(runuser:session): session closed for user mongod
Nov 30 21:18:45 xx-general-01 mongod[4574]: Starting mongod: [FAILED]
Nov 30 21:18:45 xx-general-01 systemd[1]: mongod.service: control process exited, code=exited status=1
Nov 30 21:18:45 xx-general-01 systemd[1]: Failed to start SYSV: Mongo is a scalable, document-oriented database..
-- Subject: Unit mongod.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mongod.service has failed.
--
-- The result is failed.
Nov 30 21:18:45 xx-general-01 systemd[1]: Unit mongod.service entered failed state.
Nov 30 21:18:45 x-general-01 systemd[1]: mongod.service failed.
Nov 30 21:18:45 x-general-01 polkitd[505]: Unregistered Authentication Agent for unix-process:4568:1729661674 (system bus name :1.69510, object path /org/freedesktop/PolicyKit1/A
[root@xx-general-01 logs]#
</code>

搜:

Failed to start SYSV: Mongo is a scalable, document-oriented database

Can not start MongoDB 3.2.1 on CentOS 7 – Stack Overflow

configuration – MongoDB doesn’t start after changing the data directory – Database Administrators Stack Exchange

MongoDB fails starting as a service on CentOS 7 – Stack Overflow

好像是文件夹权限不对了?

Unable to start MongoDB 3.0.2 service on CentOS 7 – Stack Overflow

Can’t start MongoDB on CentOS7 – Stack Overflow

先去看看:

<code>[root@xxx-general-01 logs]# ll /var/lib/mongod
ls: cannot access /var/lib/mongod: No such file or directory
[root@xxx-general-01 logs]# ll /var/lib/mongo/
total 15010456
-rw-r--r-- 1 mongod mongod 14888894464 Aug 28 11:40 collection-0-2140692489680982673.wt
-rw-r--r-- 1 mongod mongod       20480 Aug 28 11:40 collection-0--2650623679816886832.wt
-rw-r--r-- 1 mongod mongod       40960 Aug 28 11:41 collection-0--3691121321241124749.wt
-rw-r--r-- 1 mongod mongod      397312 Nov 30 20:13 collection-10-733562134066934475.wt
-rw-r--r-- 1 mongod mongod        4096 Nov  6 14:47 collection-12-733562134066934475.wt
-rw-r--r-- 1 mongod mongod      876544 Aug 28 11:40 collection-2-2140692489680982673.wt
-rw-r--r-- 1 mongod mongod       16384 Aug 28 11:40 collection-2--3691121321241124749.wt
-rw-r--r-- 1 mongod mongod   291508224 Aug 28 11:40 collection-2--5069085438818245657.wt
-rw-r--r-- 1 mongod mongod       16384 Aug 28 11:40 collection-4--3691121321241124749.wt
-rw-r--r-- 1 mongod mongod   147787776 Nov 23 14:55 collection-4-733562134066934475.wt
-rw-r--r-- 1 mongod mongod       36864 Nov  1 18:53 collection-6--3691121321241124749.wt
-rw-r--r-- 1 mongod mongod    36298752 Oct 30 13:59 collection-6-733562134066934475.wt
drwxr-xr-x 2 mongod mongod        4096 Nov 30 18:21 diagnostic.data
-rw-r--r-- 1 mongod mongod      258048 Sep 19 11:09 index-0--2784828041412477720.wt
-rw-r--r-- 1 mongod mongod      102400 Nov 30 20:13 index-11-733562134066934475.wt
-rw-r--r-- 1 mongod mongod      684032 Sep 19 11:09 index-1-2140692489680982673.wt
-rw-r--r-- 1 mongod mongod       20480 Apr  8  2018 index-1--2650623679816886832.wt
-rw-r--r-- 1 mongod mongod       36864 Aug 28 11:41 index-1--3691121321241124749.wt
-rw-r--r-- 1 mongod mongod        4096 Nov  6 14:47 index-13-733562134066934475.wt
-rw-r--r-- 1 mongod mongod      462848 Nov 20 23:50 index-14-733562134066934475.wt
-rw-r--r-- 1 mongod mongod      294912 Nov  6 15:36 index-15-733562134066934475.wt
-rw-r--r-- 1 mongod mongod       94208 Aug 28 11:43 index-3-2140692489680982673.wt
-rw-r--r-- 1 mongod mongod       16384 Apr  2  2018 index-3--3691121321241124749.wt
-rw-r--r-- 1 mongod mongod      512000 Oct 31 11:27 index-3--5069085438818245657.wt
-rw-r--r-- 1 mongod mongod      663552 Aug 28 11:43 index-4-2140692489680982673.wt
-rw-r--r-- 1 mongod mongod       16384 Aug 28 11:40 index-5--3691121321241124749.wt
-rw-r--r-- 1 mongod mongod      479232 Oct 30 13:59 index-5-733562134066934475.wt
-rw-r--r-- 1 mongod mongod       36864 Nov  1 18:53 index-7--3691121321241124749.wt
-rw-r--r-- 1 mongod mongod      815104 Oct 30 13:59 index-7-733562134066934475.wt
-rw-r--r-- 1 mongod mongod       36864 Nov  1 18:53 index-8--3691121321241124749.wt
drwxr-xr-x 2 mongod mongod        4096 Nov 23 14:54 journal
-rw-r--r-- 1 mongod mongod       36864 Nov  6 14:54 _mdb_catalog.wt
-rw-r--r-- 1 mongod mongod           6 Aug 28 11:40 mongod.lock
-rw-r--r-- 1 mongod mongod       36864 Nov 30 17:34 sizeStorer.wt
-rw-r--r-- 1 mongod mongod          95 Mar 20  2018 storage.bson
-rw-r--r-- 1 mongod mongod          49 Mar 20  2018 WiredTiger
-rw-r--r-- 1 mongod mongod        4096 Aug 28 11:40 WiredTigerLAS.wt
-rw-r--r-- 1 mongod mongod          21 Mar 20  2018 WiredTiger.lock
-rw-r--r-- 1 mongod mongod        1011 Nov 30 17:35 WiredTiger.turtle
-rw-r--r-- 1 mongod mongod      102400 Nov 30 20:13 WiredTiger.wt
</code>

好像记得:不是用service去启动mongod的?

去找找,此处是如何运行mongod的

参考之前:

【无需解决】mongod的状态中active但却existed是什么意思

去看看log:

<code>cd /var/log/mongodb/
tail mongod.log
</code>

和:

【已解决】mongo启动失败:connection /var/lib/mongo/WiredTiger.turtle handle-open open Permission denied

中的:

<code>/etc/mongod.conf
</code>

后来发现之前是:

即可正常(在当前登录用户root的情况下,换用mongd的用户去)启动mongod:

<code>sudo -u mongod mongod -f /etc/mongod.conf
</code>

去启动mongod的

先去看看log:

/var/log/mongodb/mongod.log

然后下载下来打开后,无意间发现:

<code>2018-11-30T20:13:45.560+0800 E STORAGE [thread1] WiredTiger (28) [1543580025:560859][15915:0x7f5d0df0b700], file:WiredTiger.wt, WT_SESSION.checkpoint: /var/lib/mongo/WiredTiger.wt: handle-write: pwrite: failed to write 28672 bytes at offset 102400: No space left on device
2018-11-30T20:13:45.561+0800 E STORAGE [thread1] WiredTiger (28) [1543580025:561033][15915:0x7f5d0df0b700], checkpoint-server: checkpoint server error: No space left on device
2018-11-30T20:13:45.561+0800 E STORAGE [thread1] WiredTiger (-31804) [1543580025:561047][15915:0x7f5d0df0b700], checkpoint-server: the process must exit and restart: WT_PANIC: WiredTiger library panic
</code>

然后再去看磁盘空间,真的被用完了:

<code>[root@xx-general-01 mongodb]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        50G   47G     0 100% /
devtmpfs        7.8G     0  7.8G   0% /dev
tmpfs           7.8G     0  7.8G   0% /dev/shm
tmpfs           7.8G  532K  7.8G   1% /run
tmpfs           7.8G     0  7.8G   0% /sys/fs/cgroup
tmpfs           1.6G     0  1.6G   0% /run/user/0
</code>

然后去删了个1.5G的文件。

结果再去看的

df -h

Avail还是0,Use还是100%

后来再去删了500多MB的文件,再去看最新df,然后终于有空出来的空间了:

<code>[root@xx-general-01 origin_data]# df -lh
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        50G   47G  431M 100% /
devtmpfs        7.8G     0  7.8G   0% /dev
tmpfs           7.8G     0  7.8G   0% /dev/shm
tmpfs           7.8G  532K  7.8G   1% /run
tmpfs           7.8G     0  7.8G   0% /sys/fs/cgroup
tmpfs           1.6G     0  1.6G   0% /run/user/0
</code>

然后再去重启mongod,就可以了:

<code>[root@xx-general-01 origin_data]# service mongod restart
Restarting mongod (via systemctl):                         [  OK  ]
[root@xx-general-01 origin_data]# service mongod status
● mongod.service - SYSV: Mongo is a scalable, document-oriented database.
   Loaded: loaded (/etc/rc.d/init.d/mongod; bad; vendor preset: disabled)
   Active: active (running) since Fri 2018-11-30 21:49:51 CST; 4s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 4391 ExecStop=/etc/rc.d/init.d/mongod stop (code=exited, status=0/SUCCESS)
  Process: 4693 ExecStart=/etc/rc.d/init.d/mongod start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/mongod.service
           └─4708 /usr/bin/mongod -f /etc/mongod.conf

Nov 30 21:49:50 xx-general-01 systemd[1]: Starting SYSV: Mongo is a scalable, document-oriented database....
Nov 30 21:49:50 xx-general-01 runuser[4704]: pam_unix(runuser:session): session opened for user mongod by (uid=0)
Nov 30 21:49:51 x-general-01 runuser[4704]: pam_unix(runuser:session): session closed for user mongod
Nov 30 21:49:51 x-general-01 mongod[4693]: Starting mongod: [  OK  ]
Nov 30 21:49:51 x-general-01 systemd[1]: Started SYSV: Mongo is a scalable, document-oriented database..
[root@xx-general-01 origin_data]#
</code>

然后之前一个项目xx产生的log:

logs/development.log

太多:

所以去修改代码,去掉很多

logger.debug和logger.info的代码。

最终减少了不少log。

不过,此处还是由于之前同事设计有问题,导致django的日志中有mysql的操作的log,其中某个操作的查询操作太弱智:

会产生大量的util的log

-》导致随随便便用户使用几次,就会产生大量的log:

所以,也是无法再继续优化了。

只能想办法以后重构这种垃圾代码了。

【总结】

此处某web应用无法正常使用,表面现象看起来是,通过log查到的:

<code>pymongo.errors.ServerSelectionTimeoutError: localhost:32018: [Errno 111] Connection refused
</code>

而再去通过:

<code>service mongod restart
</code>

却无法重启MongoDB,会失败。

其表面现象看到的是:

<code>Failed to start SYSV: Mongo is a scalable, document-oriented database..
</code>

实际上找到MongoDB的log(此处的:

<code>/var/log/mongodb/mongod.log
</code>

),从而发现:

<code>E STORAGE [thread1] WiredTiger ... failed to write 28672 bytes at offset 102400: No space left on device
</code>

很明显是磁盘空间不够了。

-》其中的:

E STORAGE

也表示是存储方面的错误。

-》然后通过

<code>df -hl
</code>

看到是当前分区可用空间是0了。

-》最后是去删除掉一些无用数据,腾出2G空间

-》但是注意:

刚删除完毕后,去用df看到的结果没变化,可用空间还是0,要稍等一会,才能更新

-》之后再去:

<code>service mongod restart
</code>

就可以了。

转载请注明:在路上 » 【已解决】MongoDB服务报错:pymongo.errors.ServerSelectionTimeoutError localhost Errno 111 Connection refused

发表我的评论
取消评论

表情

Hi,您需要填写昵称和邮箱!

  • 昵称 (必填)
  • 邮箱 (必填)
  • 网址
98 queries in 0.192 seconds, using 23.38MB memory