Loomio

There was a problem with prosody, fixed by a restart of prosody

Pirate Praveen
Aug 17 08:59:27 datamanager     error   Unable to write to offline storage ('/var/lib/prosody/poddery%2ecom/offline/praveen.list: Too many open files') for user: [email protected] 
Aug 17 09:10:23 datamanager     error   Unable to write to offline storage ('/var/lib/prosody/poddery%2ecom/offline/bsc.list: Too many open files') for user: [email protected] 

We need to dig deeper and find the root cause. Is anyone interested?

Pirate Praveen Sun 27 Nov 2016 6:12PM

This keeps recurring; I had to restart prosody again today.

Pirate Praveen Sun 27 Nov 2016 6:18PM

@balasankarchelamat @jayaura @akshay @anisha @fayadfami @tvm @isaagar @mintojoseph can any of you help with this issue?

Minto Joseph Mon 28 Nov 2016 1:47AM

@praveenarimbrathod Are you sure it is not caused by hitting the open file descriptor limit? Have you checked ulimit -n? If it is not fixed by then, I will look into it when I am back on the 30th.
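
A minimal sketch of that check, assuming a Linux host where /proc is readable for the target process. Note that ulimit -n only reports the limit of the current shell, so the sketch reads the limit of the running process instead. The function name fd_usage and the PROC_ROOT override are hypothetical helpers introduced here for illustration; they are not part of prosody.

```shell
#!/bin/sh
# fd_usage PID: print "<open descriptors> <soft limit>" for a process.
# PROC_ROOT is a hypothetical override (default /proc) so the function
# can be pointed at a copy of the proc tree.
fd_usage() {
    pid="$1"
    proc="${PROC_ROOT:-/proc}"
    # Each entry in PID/fd is one open descriptor.
    open=$(ls "$proc/$pid/fd" | wc -l | tr -d ' ')
    # In the "Max open files" row, the fourth field is the soft limit.
    limit=$(awk '/^Max open files/ {print $4}' "$proc/$pid/limits")
    echo "$open $limit"
}

# Example: inspect the current shell; use the prosody PID in practice.
fd_usage "$$"
```

If the first number is close to the second, the process is about to run out of descriptors, which would match the "Too many open files" errors in the log.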

Pirate Praveen Thu 1 Dec 2016 12:38PM

We have added two cron jobs to log open files and sockets every day.

1 1 * * * lsof -u prosody > /root/debug/"lsof-`date --rfc-3339=date`"
1 1 * * * netstat -taupe > /root/debug/"netstat-`date --rfc-3339=date`"

The next time this happens, we will have better data to analyze and find the root cause.
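
One way to analyze those snapshots is to count descriptors per type and compare the totals from day to day. The sketch below assumes the saved file has the usual lsof header row containing a TYPE column; the function name type_counts is hypothetical.

```shell
#!/bin/sh
# type_counts FILE: count rows per TYPE value in a saved lsof snapshot,
# most frequent type first. The TYPE column is located from the header
# row rather than hard-coded.
type_counts() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "TYPE") col = i; next }
        col     { count[$col]++ }
        END     { for (t in count) print count[t], t }
    ' "$1" | sort -rn
}

# Usage (file name follows the cron job pattern; the date is illustrative):
#   type_counts /root/debug/lsof-2016-12-01
```

A steadily growing count for one type (for example sockets versus regular files) would hint at where the descriptors are leaking.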

Vidyut Fri 2 Dec 2016 2:36AM

Why are logged-out users allowed to view information like this?

Pirate Praveen Fri 2 Dec 2016 3:53AM

These issues are technical problems and not sensitive information. We have a private mailing list for the podmins, and sensitive information like passwords is always encrypted. We have created private discussions here in the past when we felt the contents should not be public.

Pirate Praveen Fri 2 Dec 2016 3:55AM

If you meant the addresses, only our own addresses were copied (both addresses mentioned belong to podmins).

Pirate Praveen Mon 12 Dec 2016 10:28AM

We are hitting the limits again. I was not able to log in some time back, and the log had the same error message. I think there is a problem with the prosody configuration, as it should be using the database and not the file system for offline messages.

Pirate Praveen Mon 12 Dec 2016 3:08PM

@mintojoseph analyzed the situation and figured out the issue using strace -xvtto /tmp/lua.st -p 29152 (29152 being the PID of the prosody process). We have set MAXFDS to a higher value in /etc/default/prosody and also modified /etc/prosody/prosody.cfg.lua to use the SQL backend for writing offline messages. Hopefully this fixes the issue. We are still monitoring the situation and will update here if there are more problems.
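
For anyone repeating the analysis, a rough sketch of how such a trace file can be summarized: it counts which system calls failed with EMFILE ("Too many open files"). It assumes the trace was taken as above, with -tt timestamps as the first field and a single traced process; the function name emfile_by_syscall is hypothetical.

```shell
#!/bin/sh
# emfile_by_syscall FILE: count EMFILE failures per system call in an
# strace log whose lines start with a -tt timestamp.
emfile_by_syscall() {
    awk '/ = -1 EMFILE / {
        name = $2               # field 1 is the timestamp
        sub(/\(.*/, "", name)   # keep only the syscall name
        count[name]++
    }
    END { for (n in count) print count[n], n }' "$1" | sort -rn
}

# Usage:
#   emfile_by_syscall /tmp/lua.st
```

Failures concentrated in open calls on files under the offline storage directory would point at file-based offline storage, while failures in accept would point at too many client connections.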