Thursday, February 3, 2011

Cost-effective way to handle high SSL traffic?

Some time in the future, I may need to build a dedicated SSL farm (as described in Making applications scalable with Load Balancing) or something similar to handle lots of SSL traffic. While it's not an immediate issue for me, I'd like to plan a little bit ahead. So my question is:

Is it more cost-effective to use dedicated hardware for this, or can I reuse application servers, maybe with a hardware add-on card? Or is it better to have this integrated into load balancers (contrary to what the above-mentioned article stated in 2006)?

A few links to specific hardware would be nice, too - I currently don't really know where to start looking.

  • The most cost-effective solution is NGINX as a reverse proxy, as its price/performance beats most hardware solutions like the F5 Networks BIG-IP 6900 (a minimal sketch of such a setup follows this answer).
    My NGINX config: http://gist.github.com/553235

    Chris Lercher : @Kristaps: How do you handle SSL, do you use an SSL card, or is the CPU performant enough? How much SSL traffic can one such server handle?
    Kristaps : CPU performance is enough, as I have 2 quad-core Xeons. It can handle about 25 kTPS
    Chris Lercher : @Kristaps: That's pretty good! I have the feeling that this may be very cost-effective. Basically, such a server doesn't need any hard disks (network boot) or enormous amounts of RAM, so I assume I can use very cheap hardware - except for a decent processor. Or will I need anything else?
    From Kristaps
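
    A minimal sketch of this kind of nginx SSL-terminating reverse proxy (the linked gist is not reproduced here; the hostname, certificate paths and backend addresses below are placeholders, not Kristaps' actual values):

        # nginx.conf - SSL termination in front of plain-HTTP app servers (sketch)
        worker_processes  8;                  # e.g. one worker per core on two quad-core Xeons

        events {
            worker_connections  4096;
        }

        http {
            upstream app_backends {
                server 10.0.0.11:8080;        # plain-HTTP application servers
                server 10.0.0.12:8080;
            }

            server {
                listen              443 ssl;
                server_name         www.example.com;

                ssl_certificate     /etc/nginx/ssl/example.crt;
                ssl_certificate_key /etc/nginx/ssl/example.key;
                ssl_session_cache   shared:SSL:10m;   # allow SSL session resumption
                ssl_session_timeout 10m;
                keepalive_timeout   65;               # HTTP keep-alives amortise handshakes

                location / {
                    proxy_pass       http://app_backends;
                    proxy_set_header Host             $host;
                    proxy_set_header X-Forwarded-For  $proxy_add_x_forwarded_for;
                }
            }
        }
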
  • I'm assuming you are talking about HTTP traffic here (there's a big difference between stateful and stateless protocols).

    The problem is that to get the best performance you want SSL session resumption to work - which favours a sticky-session approach - but if your sessions are too sticky, then you won't have any failover. The big expensive boxes from F5, Cisco et al. can cope with that, but it's difficult to do across commodity boxes running (for instance) stunnel.

    I still think that the best solution to most load balancing problems is round-robin DNS: failure detection happens in the only place a failure can be reliably detected (the client), and that is also where the failover is implemented. It provides for server affinity but still allows failover of requests (note that it does not support resumption of requests - but I've yet to come across anything which supports this for HTTP).

    One other thing to bear in mind is that Microsoft's keep-alive support for HTTP over SSL is different from that implemented by everyone else. This is not just an OpenSSL thing - other vendors give the same advice. Given the additional overhead of SSL negotiation and the huge pay-off of using keep-alives for HTTP traffic, it may be worth considering MS ISA for SSL termination - although I'm only guessing that the software can be configured that way, and I've never been impressed by the product's scalability/reliability. So if I had lots of money to spend, I'd probably look at MS ISA for SSL termination, but without Microsoft's clustering software, moving the failover elsewhere (e.g. to the client!).

    For a cheap solution, terminate the SSL on the webserver boxes with round-robin DNS (a sketch of this setup follows the answer). Add lots of webservers. Optionally, use a cryptographic accelerator card (not an SSL-capable network card) in the webserver for additional oomph.

    For a very fast solution - (possibly) multiple MS ISA nodes addressed via round-robin DNS, talking to an LVS cluster of webservers.

    HTH

    Chris S : (I didn't downvote; but I'm close to it) Round-Robin DNS is the worst load balancing/redundancy plan available. I can't imagine suggesting it to someone as an actual solution.
    symcbean : Why? Do you know of a vendor who has solved the close-notify problem for MS clients?
    From symcbean
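
    A sketch of symcbean's cheap option - each webserver terminates its own SSL, and round-robin DNS simply publishes one A record per box for the site's hostname. The server block below would sit inside the http section of a config like the one sketched after the first answer; names and paths are placeholders:

        server {
            listen              443 ssl;
            server_name         www.example.com;

            ssl_certificate     /etc/nginx/ssl/example.crt;
            ssl_certificate_key /etc/nginx/ssl/example.key;

            # Per-box session cache: resumption only helps a client that the
            # round-robin DNS answer happens to send back to this same box,
            # which is the stickiness trade-off described above.
            ssl_session_cache   shared:SSL:10m;
            ssl_session_timeout 10m;

            keepalive_timeout   65;      # keep-alives amortise the handshake cost

            root /var/www/site;          # content/application served locally
        }
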
  • Fig. 2 in your link gives the state-of-the-art way to build an SSL farm.

    How to build your farm, and its cost, will depend on your needs.

    Having SSL termination on the load balancer is probably cheaper today (even with a dedicated load balancer like Cisco CSS, Cisco ACE, F5 BIG-IP, ..., though it still depends on the load balancer manufacturer).
    The load balancer will be able to do L7 balancing, as it sees unencrypted data (see the sketch after this answer). So you will not need two layers of load balancers plus SSL reverse proxies, which can reduce the cost (less hardware to buy, less rack space, ...).

    But having SSL termination on the load balancer is not very scalable, so if your load balancer starts to be overloaded by SSL, you will have a problem. If you bought a dedicated device, you will need to upgrade it, and that will be expensive. If you built your own load balancer from a server, you will need to offload SSL onto a new dedicated server.

    Having the card in the application server can be an option if L4 load balancing is enough and if your application gives high throughput for low CPU usage.
    I mean: a hardware SSL card is expensive, so you want to use it as much as possible.
    With dedicated SSL termination hardware, you will use the card as much as possible. If the card is in the application server and the application has a low throughput, the card will sit idle much of the time. But if the application is fast - not using too much CPU but with a high throughput - having the SSL termination on the server with a dedicated card can be an option. This is generally not the case. This also reduces high availability.

    Chris Lercher : @radius: Great answer. I was thinking about using the application server machines "twice", but my idea may be silly: The traffic would go from the L4 hardware load balancer directly into the app servers, but only to do SSL (with the help of an SSL card). From there, it goes to hardware L7 load balancers (non-SSL), which redirect again to the app servers (maybe into a separate network interface) to perform the actual application handling. Would that be possible and efficient? So the SSL card would take the place of a full-blown SSL unit. It sounds quite unconventional (and maybe for a reason?)
    radius : I would not do that, because it would be hard to troubleshoot performance, find bottlenecks, etc. Client Z making an L7 request on server A while having SSL handled by server B could get a slow answer because server B is busy with client Y's L7 request. I think you could have cases where server N is very busy with the application, so its SSL handling will be slower (because even with a hardware card, the CPU still does some work); as a result, application load on all the other servers will decrease. Then the load will not be equal across servers. If your L7 balancing takes server load into account, you don't have this problem.
    radius : Also be aware that there are two parts to SSL: the handshake and then the traffic. If your sessions are short (HTTPS, ...) you will have more handshakes than if sessions are long (Citrix ICA, ...). More handshakes mean much more SSL load.
    Chris Lercher : @radius: Thanks for the input! Very valid reasons indeed.
    From radius
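
    To illustrate radius' point that terminating SSL first lets one device do L7 balancing on the decrypted requests, here is a hedged nginx sketch (pool names and addresses are invented for the example); routing on URL paths like this is impossible if the balancer only sees encrypted bytes:

        events { worker_connections 4096; }

        http {
            upstream static_pool {
                server 10.0.1.11:8080;
                server 10.0.1.12:8080;
            }
            upstream app_pool {
                server 10.0.2.11:8080;
                server 10.0.2.12:8080;
            }

            server {
                listen              443 ssl;
                server_name         www.example.com;
                ssl_certificate     /etc/nginx/ssl/example.crt;
                ssl_certificate_key /etc/nginx/ssl/example.key;

                # L7 decisions are possible only because the traffic has
                # already been decrypted at this point.
                location /static/ { proxy_pass http://static_pool; }
                location /        { proxy_pass http://app_pool; }
            }
        }
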
  • AFAIK the article still stands.

    If you really need a farm with several load-balanced SSL reverse proxies (a sketch of such a proxy follows this answer) and a fair few web/application servers behind them, I would suggest looking at a blade solution. That's not cheaper than simple 1U rackmount servers, but it will save you some rack space. Most major server manufacturers do blade solutions (Dell, HP, IBM, etc.). Some links: IBM | Dell | HP

    I would build the load balancers from Linux servers (redundant pairs connected via Heartbeat, see the LVS project), and have small dedicated networks for the proxy traffic and for the traffic from the second load balancer to the web/application servers.

    Chris Lercher : @wolfgangsz: Thanks. These blade servers seem to be quite expensive (for a start-up company). How much SSL traffic can I roughly expect to be handled by one such server?
    wolfgangsz : Plenty. If the reverse proxies only handle the SSL part and the caching, then each machine should easily handle 10-20 MBit/s. If you don't have a problem with rack space, you might want to consider 1U servers from SuperMicro. Since you have lots of them running in parallel, failure of an individual server won't make much of a difference.
    Chris Lercher : @wolfgangsz: Ok, I'm a little bit surprised. For that amount of money, 10-20 MBit/s doesn't sound like very much (this would be only 50-100 bytes/transaction, if we compare it to the 25 kTPS mentioned by Kristaps?). I looked into your links - are they really 3000 - 10000 USD, or did I look at the wrong configurations?
    wolfgangsz : A normal web server (1U, 1 quad-core CPU, 2-4GB of RAM) will output about 5-10MB/s under full load. If you go higher than that, the users experience delays. With good tuning you might get more than 20MB/s out of the proxies, but depending on the hardware of each proxy, that's probably pushing it. Yes, blade enclosures are not cheap, and the blades themselves are not cheap either. That's why I said that if you don't have a problem with rack space, you might want to look at a whole stack of cheap 1U servers from SuperMicro (TBH, the low-end servers from Dell are not far off that pricing, either).
    Chris Lercher : @wolfgangsz: Thanks for the many good links! They will be very useful in the future. I'll try to start with the lowest cost machines, because there's no data loss, if one of them fails. BTW, you used MB/s in your last comment. Did you mean MBit/s or MBytes/s?
    wolfgangsz : That would be MBit/s.
    From wolfgangsz
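
    A sketch of the reverse-proxy role wolfgangsz describes - SSL termination plus caching in front of the web/application servers. The backend address stands in for an LVS virtual IP; the Heartbeat/LVS setup itself is not shown, and all names and paths are placeholders:

        events { worker_connections 4096; }

        http {
            proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=edge:50m
                             max_size=1g inactive=30m;

            server {
                listen              443 ssl;
                server_name         www.example.com;

                ssl_certificate     /etc/nginx/ssl/example.crt;
                ssl_certificate_key /etc/nginx/ssl/example.key;
                ssl_session_cache   shared:SSL:10m;

                location / {
                    proxy_cache       edge;
                    proxy_cache_valid 200 301 302 10m;   # cache successful responses briefly
                    # The backend here would be the LVS virtual IP fronting
                    # the web/application servers.
                    proxy_pass        http://10.0.0.100:80;
                }
            }
        }
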
