I would rather have scheduled reboots every hour than the ruthless destruction of all of our most steadfast pack players' bases.
It's not happening at a consistent interval; sometimes it runs out of memory after 20 minutes, and the next time it takes 3 hours. A server restarting twice an hour isn't fun (personal experience), and it would also push the issue down the list of pressing matters, so we might not look into it any time soon. We are looking for a permanent solution, not a temporary one. The solution with the least work for us and the best experience for players would be a server reset, but again, how long would that last, and how happy would current players be?
Well, looks like the server is down for good now... it's been unreachable for about half an hour - check the server stats. Just FYI.
It's creating a heap dump, which might take another hour to finish; then it's going to be 3 hours of downloading and probably half a day to open the file, only to be able to analyze a single one. For a proper analysis this process needs to be repeated multiple times, at least once more with a heap dump taken from before the memory leak has had its way. Just to give you guys an idea of how tedious these are to analyze, even with enterprise software (Performance and Memory Java Profiler - YourKit Java Profiler), and that's only the getting-started part.
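For anyone curious what "creating a heap dump" actually involves: on a HotSpot JVM it can be triggered through the diagnostic MXBean. This is just a generic sketch, not our actual setup - the file path and class name here are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Minimal sketch: trigger a heap dump of the current JVM.
public class HeapDumpSketch {
    public static void main(String[] args) throws Exception {
        HotSpotDiagnosticMXBean diagnostics = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);

        // live = true dumps only reachable objects, which keeps the file smaller
        // and is usually enough to spot what is accumulating.
        diagnostics.dumpHeap("/tmp/server-heap.hprof", true);
    }
}
```

The resulting .hprof file is what then has to be downloaded and opened in a profiler like YourKit, which is where most of the waiting comes from.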
And the heap dump just failed. Great, now we've got to keep the server up for 2 hours (until it reaches a high memory usage point again) and do the same all over again.
Wow, we are super lucky this time. I was able to get small heap dumps, which reduced the loading time dramatically, and the cause was easy to spot since it isn't a general Java object. It gets even better: it's not even a mod causing these issues. It's Prism, the plugin we use to track all player actions. We have been using the same version for over a year, so it's time to find out why it is piling up and whether it is a compatibility issue or even something external (e.g. the database server). The most likely and worst case would be that more events are being fired than Prism can handle, hence the processing queue/RAM usage increasing.
(screenshots: comparison at server start, after 30 minutes, and the actual object after 30 minutes)
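To illustrate the suspected pattern (an illustrative sketch, not Prism's actual code): events get queued on the server thread and a background worker writes them to the database, so if events come in faster than the database can absorb them, the queue - and with it the heap - grows without bound.

```java
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch of an unbounded event queue outgrowing its consumer.
public class EventQueueSketch {
    // Unbounded queue: nothing stops it from growing if the writer falls behind.
    private final LinkedBlockingQueue<String> pending = new LinkedBlockingQueue<>();

    // Producer side: called on the server thread for every tracked event. Cheap, never blocks.
    public void record(String event) {
        pending.add(event);
    }

    // Consumer side: runs on a background thread, limited by database throughput.
    public void flushLoop() throws InterruptedException {
        while (true) {
            writeToDatabase(pending.take());
        }
    }

    private void writeToDatabase(String event) {
        // Placeholder for the slow part: the actual INSERTs.
    }
}
```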
Glad to hear it was an easy find! Here's hoping the fix turns out to be just a compatibility issue and all you need is an update.
Yeah, Prism can't keep up. During the last 3 hours it logged 3.5 million tree grows, and there are probably twice that many left in the queue that never made it into the database. Some mod is firing the tree grow event way too often. Prism no longer tracks this event, and the issue should be resolved.
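If anyone ever wants to confirm this kind of event spam directly instead of reading it off the database backlog, a tiny counter plugin would do it. This is a hypothetical helper, not something we actually run - it just counts how often the structure/tree grow event fires per minute.

```java
import java.util.concurrent.atomic.AtomicLong;
import org.bukkit.event.EventHandler;
import org.bukkit.event.Listener;
import org.bukkit.event.world.StructureGrowEvent;
import org.bukkit.plugin.java.JavaPlugin;

// Hypothetical diagnostic plugin: counts tree/structure grow events per minute.
public class GrowEventCounter extends JavaPlugin implements Listener {

    private final AtomicLong grows = new AtomicLong();

    @Override
    public void onEnable() {
        getServer().getPluginManager().registerEvents(this, this);

        // Log and reset the counter once a minute (20 ticks per second).
        long oneMinute = 20L * 60;
        getServer().getScheduler().runTaskTimer(this, () ->
                getLogger().info("StructureGrowEvent fired " + grows.getAndSet(0)
                        + " times in the last minute"), oneMinute, oneMinute);
    }

    @EventHandler
    public void onGrow(StructureGrowEvent event) {
        grows.incrementAndGet();
    }
}
```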
Am I the only one who sees "leaf-decay 2666"? In other words, leaves are satanic. Exercise caution around leaves.

But in other news... that sounds great! Dunno which maniac was out there tree murdering, but I'm glad to hear it ain't logging 3.5 million trees anymore.
I can log in right now to see how it's performing. EDIT: So far so good. Less lag as far as I can tell.
Great. Going to mark this as done, given the time and effort Mr Slind put into this and the great results. Feel free to open a new thread if the issue returns!
I was on about two hours ago, and we still had performance issues regarding lag spikes and block lag. Keep in mind the thread is about performance as a whole and not just the memory leaks and heaps, so it might not be time to mark it as done.
That issue was unrelated to the server itself. And since this performance-impacting event is done, this thread can be done =P