Probably finally solved. I have no idea why I didn't think of this earlier, but somehow it never occurred to me that modsauce and galaxy are on the same machine. While checking all the root server config adjustments, I discovered one was missing: the backups (rsync) were set to -c1, which means real-time I/O priority. Now they are on -c3 (idle), like on every other machine. I also discovered that the servers had been allowed up to 12 GB of RAM, which is too much and causes garbage collection performance issues. Depending on the modpack and player count, they are now between 4 and 8 GB. I hope this solves the problem permanently. Sorry that I didn't think of this before. I guess the fact that no reports came from bteam (which is on the same box as galaxy) led me to assume the problem was not related to the whole box.
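For anyone curious what those two adjustments look like in practice, here is a minimal sketch, not the actual server scripts. It only builds the command lines: the backup is wrapped in `ionice -c3` (idle class) and the Java heap is capped with `-Xmx`. The paths, the jar name, the `-a` rsync flag, and the function names are assumptions made up for illustration; only the `-c1`/`-c3` classes and the 4 to 8 GB range come from the post above.

```python
# Sketch only: paths, jar name and the rsync "-a" flag are hypothetical.
IONICE_REALTIME = "-c1"  # real-time I/O class (the old, problematic setting)
IONICE_IDLE = "-c3"      # idle I/O class: only gets disk time nobody else wants

MIN_HEAP_GB = 4  # lower bound mentioned in the post
MAX_HEAP_GB = 8  # upper bound mentioned in the post (12 GB was too much)


def backup_command(src, dest, io_class=IONICE_IDLE):
    """Return an rsync argv wrapped in ionice with the given I/O class."""
    return ["ionice", io_class, "rsync", "-a", src, dest]


def java_command(jar, heap_gb):
    """Return a java argv with the max heap capped to the 4-8 GB range."""
    if not MIN_HEAP_GB <= heap_gb <= MAX_HEAP_GB:
        raise ValueError(f"heap must be {MIN_HEAP_GB}-{MAX_HEAP_GB} GB, got {heap_gb}")
    return ["java", f"-Xmx{heap_gb}G", "-jar", jar]


if __name__ == "__main__":
    print(" ".join(backup_command("/srv/galaxy/", "/backup/galaxy/")))
    print(" ".join(java_command("server.jar", 6)))
```

The argv lists can be handed to `subprocess.run` as they are. The idle class means the backup only gets disk time when nothing else is asking for it, which is the point of the change.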
If you have fixed it then I love you, jks. Even if you have only reduced the number of timeouts, it's still great news.
Guess you got my last PM... lol. I just noticed yesterday/today that they happened on autosave/backups.
I guess we got to it at the same time. I haven't read it yet, as this was the first thing I did today for the server: sitting there with VisualVM, atop (as root), the server console and the game open at the same time, and just watching how things go. (Non-root accounts don't have access to disk I/O statistics, which is why I didn't discover this earlier.) If this did solve the issue, it was caused by a long list of things happening at the same time:
- Disk usage being pushed to around 100%
- Backups of 2 servers running at the same time, or the daily backup, which deletes old data when the server is 12+ days old
- Autosave of another server
- Letting the server wait until finished (normally this only causes micro to short lag spikes, but with 100% disk usage and the higher priority of the backup task, it can take ages)
- World getting loaded or unloaded
- Player login -> forced chunk load
Depending on direct chunk unloads due to client connections timing out, this can become a vicious circle on player login, until the disk usage drops.
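A rough way to watch for the "disk at around 100%" condition without atop is to sample `/proc/diskstats` twice and compare the "time spent doing I/O" counter. This is just a sketch under assumptions: it assumes a Linux box where `/proc/diskstats` is readable, the device name `sda` is a placeholder, and it only gives whole-device utilisation, not the per-process breakdown that atop shows as root.

```python
import time

IO_TICKS_FIELD = 12  # 0-based index of "ms spent doing I/O" in a diskstats line


def io_ticks_ms(diskstats_text, device):
    """Return the cumulative ms the device has spent doing I/O."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) > IO_TICKS_FIELD and fields[2] == device:
            return int(fields[IO_TICKS_FIELD])
    raise KeyError(f"device {device!r} not found")


def utilization_percent(ticks_before, ticks_after, interval_s):
    """Share of the interval during which the device was busy, in percent."""
    return 100.0 * (ticks_after - ticks_before) / (interval_s * 1000.0)


def sample(device="sda", interval_s=1.0):
    """Measure utilisation of one device over interval_s seconds."""
    with open("/proc/diskstats") as f:
        before = io_ticks_ms(f.read(), device)
    time.sleep(interval_s)
    with open("/proc/diskstats") as f:
        after = io_ticks_ms(f.read(), device)
    return utilization_percent(before, after, interval_s)


if __name__ == "__main__":
    # "sda" is a placeholder device name
    print(f"sda busy: {sample('sda'):.0f}%")
```

If this sits near 100% whenever a backup and an autosave overlap, that would match the chain of events listed above.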
Sadly the timeouts are still happening, and there doesn't seem to be a change in how often either :/ And if it was caused by a plugin, I may have just found which one :/
Well, I don't know if it's alright to say what commands we have access to here. Shall I post it here or in staff chat/section? I'll just post here and it can be removed if needed. Basically, I used the -snip- command and a second later the server crashed and hasn't come back up. That gives me the idea that maybe the plugin that provides that command doesn't like something on the server, so it could be causing the timeouts too.
This is quite unlikely. I made some further changes; let's wait another 24 hours and see if something has changed. Then we can look further. (It would be possible to disable this plugin/these plugins, but that would also impact other stuff like homes, spawn, warps...)
Yeah, I realised after I said it that it's the main plugin that provides a lot of the essential commands (might be why it's called essentials). I know it's probably not possible, but what about having a sort of testing day where we go through all the plugins, enabling them one at a time and running the server for 30-45 mins after each plugin to test whether it's the one causing all the trouble?
From what I know, the issue is not reproducible this way. Even with a lot of players online there are periods where the server runs fine for over an hour, and we also have to consider that during such a testing day only a small part of the players would be online. The thread dumps show that the server is waiting for chunk loading and saving, and the disk I/O of this system is still quite high compared to the other boxes. From profiling, most of it comes from galaxy, then modsauce, and lastly bteam. bteam was the heaviest of the 1.6.4 packs in terms of disk I/O, and the new packs top it. I still assume that it is related to world load/unload and forced chunk loads through player logins/fast travelling, or a mod messing around with chunk loading.
Please keep an eye out from now until the same time tomorrow. If you notice any differences, let me know; if not, let me know too.