
Re: [S] Site shutdowns

Posted: Thu Jul 19, 2018 12:34 am
by Unas
Sorry for the late answer.

Thanks for offering your help.
Normally, money is not much of an issue: I try to be conservative with it, but on paper the server I moved to on June 30th is more than powerful enough to run the site, and still cheap enough for me to afford without thinking twice. AAO shouldn't need more expensive hosting - and really, all my monitoring shows that the site uses only a small fraction of the server's processing power and memory.

The issue here is that I have run out of ideas as to what causes these crashes.
- Ever since they started occurring, I have never been able to find logs corresponding to a crash - everything seems perfectly normal until nothing runs anymore.
- They keep occurring on the new server, even though I changed the system architecture and the version of most of the software involved, doubled the RAM and tripled the processing power, so it shouldn't be a hardware issue - or so I'd hope.
- They occur seemingly at random: sometimes a long time after the previous reboot, sometimes a very short time after (e.g. I was forced to briefly reboot the server on July 9th, but on July 10th we had yet another severe crash... and nothing since I rebooted it on July 11th). Monitoring does not show any issue with memory usage, so it cannot be a memory leak, or more generally a RAM issue.
- The new server is a VM, not a physical machine, so it cannot be an overheating processor (otherwise it would take down the whole hypervisor, affecting dozens of other clients, and the host would reboot it).
- The kernel is configured to reboot automatically in case of a kernel panic, yet it doesn't... So the crash seems severe enough to bypass even the most basic kernel recovery mechanisms.
- I also have other sites running on other servers with the same setup (in fact, one is identical to the previous AAO server), and they don't have this issue. They have much less traffic, but aside from that they are pretty much identical... So this would seem to indicate that the problem is specific to AAO, either because of the amount of traffic or because of something in the site code deployed on AAO itself.
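
As an aside on the kernel point above: a minimal sketch, in Python, of how one might confirm that the reboot-on-panic setting is really active on a Linux host. The function names are hypothetical; the sysctl files it reads (`kernel.panic` and friends under `/proc/sys/kernel`) are standard, though which ones exist depends on the kernel build. It is an illustration only, not the check actually used on the AAO server.

```python
from pathlib import Path
from typing import Dict, Optional

# Real Linux sysctl names; some may be absent depending on the kernel build.
SETTINGS = ("panic", "panic_on_oops", "softlockup_panic", "hung_task_panic")


def read_kernel_settings(base: str = "/proc/sys/kernel") -> Dict[str, Optional[int]]:
    """Return each setting as an int, or None if the file is missing or unreadable."""
    values: Dict[str, Optional[int]] = {}
    for name in SETTINGS:
        try:
            values[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None
    return values


def reboots_on_panic(panic_value: Optional[int]) -> bool:
    """kernel.panic: 0 means wait forever; any other value means reboot."""
    return panic_value is not None and panic_value != 0


if __name__ == "__main__":
    settings = read_kernel_settings()
    for name, value in settings.items():
        print(f"kernel.{name} = {value}")
    print("reboot on panic:", reboots_on_panic(settings["panic"]))
```

If `kernel.panic` reads 0, the kernel waits indefinitely after a panic instead of rebooting. The other three settings control whether an oops, a soft lockup or a hung task is escalated to a panic at all; when they are 0, a frozen machine may never reach the panic path.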

But I'm still stuck on what kind of application bug (if it's an issue within the AAO site or forum code) could crash the server so severely that it doesn't even go through a kernel panic... And why it didn't occur on the previous server (the one before the April 2017 move), where everything was pretty much the same - and we had more traffic back then as well.

So yeah, right now I'm keeping a close eye on it, but it's still a mystery to me :-/


And no, so far it doesn't look like there is a risk of data loss. In any case, I have weekly backups being exported, so I should be able to recover most of the content if something goes wrong.

Re: [S] Site shutdowns

Posted: Thu Jul 19, 2018 1:00 am
by Kroki
Seems AAO's been mining bitcoins in secret...

Re: [S] Site shutdowns

Posted: Sun May 12, 2019 3:12 am
by Jofe
Unfortunately, it looks like it's back. :?
EDIT: Ok, so AAO hates Saturday, now. xD

Re: [S] Site shutdowns

Posted: Tue May 21, 2019 3:46 pm
by Super legenda
It was down again.

Re: [S] Site shutdowns

Posted: Wed May 22, 2019 5:05 pm
by Super legenda
Two days in a row?

Re: [S] Site shutdowns

Posted: Sat May 25, 2019 3:51 pm
by Enthalpy
Yeah... It's been bad lately. Here's the last I heard from Unas, a few days back:
Unas wrote: FYI, as I told you in a previous mail, over the last two or three weeks I received several emails informing me of issues on the physical machine that is hosting my AAO VM.
Apparently, the incident I reported was enough to convince them that it was too much, and they are going to migrate me to another physical host (it should take place on 24/05, in two days). Let's hope it works better after that...
We are hopeful that things will be a lot better now. If not... I'll let Unas know.