Creating a tool to see server stability since last restart

Discussion in 'Suggestions and Feedback' started by chugga_fan, Aug 5, 2021.

  1. chugga_fan

    chugga_fan ME 4M storage cell of knowledge, all the time

    Messages:
    5,861
    Likes Received:
    730
    Local Time:
    7:50 PM
    This suggestion is an idea I've had for a while: a system that makes it easy to gauge the overall stability of a pack over a period of time. I know there exist tools to determine when a server boots up, and it should be fairly simple to determine when the last planned restart was (the tool that performs the forced restart can log the time the restart happened, and that timestamp can serve as the reference point). So the next logical step would be a tool that tracks how many "unexpected" crashes occurred since that planned restart, so that staff, even without crash reports, can identify unstable servers and players can theoretically see which packs have the most issues.


    What I suggest involves the following:
    1. A tool that, on every server boot (including boots after an OOM kill or any other kill), writes to a database or a file that the server has booted up and "restarted" in some manner.
    2. That output is then collated against the times of the last and next planned server restarts to determine server "stability" in units of "planned restarts", allowing active staff to prioritize helping servers currently experiencing stability issues.
    While I am perfectly happy never being able to see such metrics myself, I believe they would benefit staff members who want to tackle stability issues by letting them see such issues in real time, and thus would be a great addition for the staff team.
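
To make the two steps above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not an existing tool: the function names, the plain-text log file, and the 10-minute tolerance for matching a boot to a planned restart are all hypothetical.

```python
from datetime import datetime, timedelta


def record_boot(log_path, now=None):
    """Append one ISO-8601 timestamp per boot; meant to be called from the server start script."""
    now = now or datetime.now()
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(now.isoformat() + "\n")


def load_boots(log_path):
    """Read back every boot timestamp written by record_boot."""
    with open(log_path, encoding="utf-8") as log:
        return [datetime.fromisoformat(line.strip()) for line in log if line.strip()]


def unplanned_boots_since_last_planned(boots, planned_restarts, now,
                                       tolerance=timedelta(minutes=10)):
    """Count boots since the most recent planned restart that it does not explain.

    A boot within `tolerance` (an assumed value) after the planned restart is
    treated as the planned one; every other boot in the window is counted as
    unexpected. Returns None if no planned restart has happened yet.
    """
    past = [p for p in planned_restarts if p <= now]
    if not past:
        return None
    last_planned = max(past)
    window = [b for b in boots if last_planned <= b <= now]
    return sum(1 for b in window if b - last_planned > tolerance)
```

With a planned restart at 00:00 and boots logged at 00:02, 03:15, and 05:40, the count at 06:00 would be 2, since only the 00:02 boot is explained by the planned restart. The planned restart times are assumed to come from whatever performs the forced restart.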
     
  2. sp33draft

    sp33draft Member

    Messages:
    247
    Likes Received:
    95
    Local Time:
    2:50 AM
    I think there should also be a way to see my base's effect on the server, instead of calling an admin to do that.
     
  3. MaraJade2

    MaraJade2 Moderator

    Messages:
    126
    Likes Received:
    25
    Local Time:
    4:50 PM
    I believe the watchdog has those stats already. The only problem I have with it is that it would be nice if the column descriptions were listed on the same page.
     
  4. chugga_fan

    chugga_fan ME 4M storage cell of knowledge, all the time

    Messages:
    5,861
    Likes Received:
    730
    Local Time:
    7:50 PM
    AFAICT the watchdog actually lacks this statistic, unless P30, etc. mean something other than performance percentage over the past n minutes. Maybe the S statistic has to do with it? To the best of my knowledge, though, this statistic is not available anywhere.
     
  5. MaraJade2

    MaraJade2 Moderator

    Messages:
    126
    Likes Received:
    25
    Local Time:
    4:50 PM
    As the column descriptions on the wiki state, the P30/120/300 columns are the performance over the last 30/120/300 seconds, i.e. what the TPS has been. The S6/12/24 columns are the number of restarts in the last 6/12/24 hours, which I think is the stat you're looking for. It may not be in quite the format you want, but it's pretty close.
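
As a rough illustration of what the S6/12/24 columns describe, restart counts over trailing windows could be computed like this. This is a sketch based only on the column descriptions above, not the watchdog's actual code; `restart_counts` and its inputs are hypothetical.

```python
from datetime import datetime, timedelta


def restart_counts(restart_times, now, windows_hours=(6, 12, 24)):
    """Return e.g. {"S6": n, "S12": n, "S24": n}: restarts in each trailing window."""
    counts = {}
    for hours in windows_hours:
        window = timedelta(hours=hours)
        # Count only restarts that happened at or before `now` and inside the window.
        counts[f"S{hours}"] = sum(
            1 for t in restart_times if timedelta(0) <= now - t <= window
        )
    return counts
```

Note that these counts include planned restarts as well, which is why the time of the last planned restart would still be needed to separate expected restarts from crashes.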
     
  6. chugga_fan

    chugga_fan ME 4M storage cell of knowledge, all the time

    Messages:
    5,861
    Likes Received:
    730
    Local Time:
    7:50 PM
    I guess you are correct, then: that is what I'm looking for. All I would still need in order to measure stability correctly is "when was the last planned restart executed", and my request would be fulfilled.
     
  7. ElectricLemonade

    ElectricLemonade Well-Known Member

    Messages:
    554
    Likes Received:
    552
    Local Time:
    7:50 PM
    I suppose the UPTIME measurement on the stats could give you a relatively accurate time. Though if you're looking specifically for the last 'planned restart', UPTIME may not always reflect that.
     
