Tuesday, January 25, 2011

Is it an industry best practice to restart web servers periodically?

We have a web application (developed by a third party) that runs on Tomcat. We have been getting very bad performance from the application. The application developer is claiming that it is an Industry Best Practice to restart web servers every night, to free up all memory usage and start over.

From the customer perspective that alleviates their issue of the site crashing during the day, but from a SysAdmin perspective it is an awful solution.

We host 20 of these applications on different servers for different clients, and having to coordinate a restart of every one of them each night just seems wrong.

  • IMO Servers should be shut down as little as possible. It's more likely the App Developer built a shoddy application with a memory leak.

    Helvick : Absolutely - I think the OP needs to tell someone they need to find a better developer.
    Bart Silverstrim : There's a reason big companies pay big bucks for multiple nines uptime and why companies spend thousands on redundant power supplies, RAID, hot swap cages, etc., and it certainly isn't so that they only need to reboot once a day.
  • This is certainly not a best practice. While it is good to restart your servers periodically just to make sure that everything comes up correctly, needing to restart nightly points to a very serious memory leak in the application.

    einstiien : This is a very good point. If you never restart your servers, as suggested below, you might not know that certain services don't start properly. Then, in the event of a power failure or hard restart, your server may not come back up correctly.
    From ErikA
  • I have a script restart one of our webservers every night, but that's because of a poorly written Java application, not because it's an industry standard. I would say that it isn't uncommon to restart the web services, though. This might do the memory cleanup you're looking for and put less strain on the server than a full restart.

    From einstiien
  • A server should preferably never be restarted. That's one of the reasons why we have fault tolerance. If you have to restart your server because of your applications, then your applications are leaking memory and are badly constructed.

    I have worked with Tomcat before and had the same problem. The next time I work with a Java container I will look for another one, maybe JBoss or GlassFish.

    Edit: If you have to restart it every night now, then you probably have to restart it more often if/when the load increases. Be sure to have solid applications, that's the best solution.

    Zoredache : I don't think I agree when you say a server should never be restarted. Servers should be restarted to apply security fixes. They should never need to be restarted for things other than planned maintenance, though.
    Jonas : It's true that some servers have to be restarted to apply security fixes. But if you have a good enough system, then you don't have to restart the system. There are systems that run year after year. You should aim for High Availability if you are serving a service on the Internet. If you have a fault tolerant system like a cluster, you could take down the nodes one by one and update them while the service is still running.
    ErikA : If you only have a single server and/or piece of hardware, there's no such thing as High Availability. You're doing it wrong if you only have one server and your service is so critical that it can't tolerate 15 minutes of downtime every now and again to restart the server. If you do have a "zero downtime" application, then you *will* have a true HA system with multiple nodes. In this case, rebooting periodically for patches, etc. is quite easy, as you pointed out.
    Kief : "Next time ... I will look for another [Java container other than Tomcat]". I wouldn't blame Tomcat. I've been running production services on it for years, and every time I've had this problem it's turned out to be an application issue. "Be sure to have solid applications, that's the best solution" Exactly. Funnily enough, every other Java application server that I've used so far suffers similar problems when I run leaky code on it. That said, Tomcat 7 is supposed to have some kind of pro-active memory leak detection.
    From Jonas
  • The application developer is more likely claiming that it's in his own best interest for you to cover his ass by working around the unprofessional job he did. He may have stopped short of actually admitting that he wrote something with a whopping memory leak, but not very far short of it.

    From mh
  • There's a difference between "Best Practice", things that many people do for good reasons, and "Common Practice", things that many people do because they're lazy and/or ignorant.

    Applications and (worse) servers that need to be routinely restarted or rebooted to keep running well are fairly common. But it's also a clear indication that you have a critical bug.

    By making it SOP to restart an application on a regular basis, your company is sweeping a serious bug under the carpet. This is inexcusable: the bug needs to be faced down and squashed, or it will come back to bite you later.

    Ideally, your company should find a better developer. Unfortunately, this may lead to rather a lot of work to rewrite large tracts of your code. The fact that the developer either thinks that poorly written code is acceptable, or doesn't know enough to recognize the symptoms of buggy code, suggests the quality of the code is low. A good developer will be constitutionally incapable of leaving it in that state.

    Given that you may not be in a position to replace the developer, a few suggestions:

    • See if you can have a better developer review the code and report their assessment to someone who can do something about it.
    • Have a look into profiling tools. If you've got the skills and/or inclination, try profiling the code yourself to find the leak and report it.

    Even without getting into developer-oriented profiling tools, there are plenty of sysadmin-oriented tools for profiling and monitoring memory usage on Java applications. You should really set up monitoring of memory (particularly heap) on your production servers in any case. I'd recommend this even if you were running quality code. It may give you advance warning when your buggy apps are about to topple over.

    But better yet, these should help you to gather proof that there is a leak, and may even indicate where the issue is in the application. This will give you better ammunition to lobby for it to be fixed.
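
    As a rough illustration of the kind of evidence meant here, the sketch below samples heap usage through the JDK's standard `java.lang.management` API and fits a simple trend line to the samples. The class name `HeapWatch`, the sample count, and the interval are made up for the example. Note that it measures the JVM it runs in, so to watch a Tomcat instance it would have to run inside the webapp, or the same MemoryMXBean would have to be read remotely over JMX. A rising slope on its own is only a hint, since heap usage normally climbs between garbage collections; what suggests a leak is a floor that keeps rising over hours or days.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapWatch {

    /** Returns the heap currently in use by this JVM, in bytes. */
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    /**
     * Least-squares slope of the samples, in bytes per sample interval.
     * Returns 0 when there are fewer than two samples.
     */
    static double slopePerSample(long[] samples) {
        int n = samples.length;
        if (n < 2) {
            return 0.0;
        }
        double meanX = (n - 1) / 2.0;
        double meanY = 0.0;
        for (long s : samples) {
            meanY += s;
        }
        meanY /= n;

        double numerator = 0.0;
        double denominator = 0.0;
        for (int i = 0; i < n; i++) {
            numerator += (i - meanX) * (samples[i] - meanY);
            denominator += (i - meanX) * (i - meanX);
        }
        return numerator / denominator;
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical settings: a real check would sample far less often
        // and over a much longer period.
        int sampleCount = 5;
        long intervalMillis = 1000;

        long[] samples = new long[sampleCount];
        for (int i = 0; i < sampleCount; i++) {
            samples[i] = usedHeapBytes();
            System.out.println("heap used: " + samples[i] + " bytes");
            if (i < sampleCount - 1) {
                Thread.sleep(intervalMillis);
            }
        }
        System.out.println("trend: " + slopePerSample(samples)
                + " bytes per sample");
    }
}
```

    Logging these numbers over several days, with and without the nightly restart, gives you a graph to put in front of the developer rather than an opinion.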

    From Kief
