Saturday, January 29, 2011

Linux servers seeing bad download performance behind Sonicwall firewall

I'm working with a pair of co-located CentOS Linux servers sitting behind a Sonicwall PRO 2040 Enhanced firewall running in transparent bridge mode.

These servers are having a strange problem downloading files more than a few megabytes in size. For example, if I try to wget or FTP a copy of the Linux kernel from kernel.org, the first ~1-2MB will download at 600+K/s, and then throughput will drop off a cliff to 1K/s.

I've reviewed all the firewall configuration settings for anything suspicious, but found nothing. More interestingly, I performed the same download with a Windows server sitting behind the same firewall, and it sailed right through at 600+K/s the whole way.

Has anyone seen this? Where should I start looking to troubleshoot this problem?

  • There are a lot of initial diagnostics left to perform here.

    Errors in /var/log/messages?

    Errors in dmesg?

    Packet loss evidenced in /sbin/ifconfig?

    Issues with link negotiation?

    Are there any differences, physical or not, between the Windows box and Linux box?

    Edit 1

    Can you reproduce the performance using different protocols and sites?

    Joshua Penix : No errors in logs or dmesg, and ifconfig shows all counters as clean. ethtool shows a proper full duplex gigabit connection. Both the Linux machines as well as the Windows box are roughly the same-generation HP hardware, though I don't have specifics right now.
    Joshua Penix : I've reproduced the problem via both FTP and HTTP and from quite an assortment of sites and large files.
    From Warner
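The counter check from this answer can be scripted. This is only a minimal sketch, assuming the kernel exposes per-interface counters under /sys/class/net and that the interface is called eth0. The function name nic_error_counters and the SYSFS_NET override are made up for illustration and are not part of any tool.

```shell
#!/bin/sh
# Print every non-zero error/drop counter for one interface.
# SYSFS_NET is a hypothetical override so the function can be pointed
# at a fake directory tree; by default it reads the real sysfs.
nic_error_counters() {
    iface="$1"
    stats_dir="${SYSFS_NET:-/sys/class/net}/$iface/statistics"
    for name in rx_errors tx_errors rx_dropped tx_dropped collisions; do
        file="$stats_dir/$name"
        # Skip counters this kernel or driver does not expose.
        [ -r "$file" ] || continue
        value=$(cat "$file")
        if [ "$value" -ne 0 ]; then
            echo "$name=$value"
        fi
    done
    return 0
}
```

Calling nic_error_counters eth0 prints nothing when all counters are clean. The remaining checks are one-liners: grep -i eth0 /var/log/messages, dmesg | grep -i eth0, and ethtool eth0 for the negotiated speed and duplex.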
  • Do you see the problems downloading to the Linux server from within the network? If not, then it must have something to do with the combination of Linux and the firewall. On the firewall, can you watch CPU usage or look for warnings? What about resetting the firewall?

    Maybe after the first MB or so an adjustment is made by Linux automatically to the TCP options (or maybe Layer 2), and the firewall doesn't like this? Looking at the various network options in /proc might give you an idea. Also, a packet dump on Linux might show some change in what is going on when the slowdown happens.

    Joshua Penix : I think this is the direction I need to investigate further. I don't see any issues with traffic to/from the Linux server inside the network, only when traffic has to go outside. I've not seen anything at all strange in the firewall's logs or utilization graphs, but I'm going to start investigating the TCP stack tuning. Any suggestion on what I would want to be looking for in a packet dump?
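As a starting point for the /proc suggestion, here is a small sketch that prints the TCP tunables most likely to change what the firewall sees on the wire. The function name show_tcp_tunables and its optional directory argument are hypothetical. The file names themselves are standard Linux sysctls, though a given kernel may not have all of them.

```shell
#!/bin/sh
# Print selected TCP tunables, one "name = value" line each.
# The optional first argument is a directory to read from (an assumption
# added for illustration); it defaults to the live /proc tree.
show_tcp_tunables() {
    dir="${1:-/proc/sys/net/ipv4}"
    for name in tcp_window_scaling tcp_sack tcp_timestamps tcp_rmem tcp_wmem; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        fi
    done
    return 0
}
```

For the packet dump, the TCP options are negotiated in the handshake, so capturing only SYN packets keeps the output short. Assuming the interface is eth0, and run as root:

    tcpdump -n -i eth0 'tcp[tcpflags] & (tcp-syn) != 0'

The options printed for each SYN and SYN-ACK (mss, sackOK, wscale and so on) can then be compared between the Linux and Windows machines.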
  • Those firewalls will bog down if you have Intrusion Prevention and/or Antivirus turned on. Especially if you have TCP Stream selected as one of the types to scan. It will try to build the whole file in its memory to scan it...
    Temporarily disable those features and see if your performance climbs back up. If so, then look at adding your servers to the exception list so you don't have to drop your pants for the whole network.

    Warner : Why would performance be different between OSes?
    Scott Lundberg : @Warner: Not sure what you mean by your question? The firewall issues don't have anything to do with the OS. It's the firewall itself that has a lack of horsepower to keep up in my experience.
    Scott Lundberg : doh! missed that part about windows. To answer your question: I don't know, doesn't make sense to me. I do know that when we used 2040s, we had to disable some of the scanning engines or we would have similar problems.
    Joshua Penix : Thanks for the suggestion. Just to be sure, I disabled the IPS, Anti-Virus and Anti-Spyware features of the Sonicwall entirely, and the problem still occurred.
  • Was there ever a resolution to this? I'm experiencing packet loss between a pair of 2040s via a VPN tunnel.

    Thanks! --mef

    Joshua Penix : No, I haven't found a resolution to the core problem, just a temporary workaround that involves turning off TCP window scaling. That said, my problem never showed up as packet loss, just throttling, so I'm not sure it's applicable to your situation.
  • Though I haven't found the root cause of this, I did find a quick workaround that lets me get file transfers through:

    sysctl -w net.ipv4.tcp_window_scaling=0

    The kernel default for TCP window scaling is on, but that command lets me temporarily disable it. I haven't persisted the setting permanently via sysctl.conf because I'm not sure about its overall performance effects, but it works in a pinch and then I can flip it back to 1 when I'm done.
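On the performance question: with window scaling off, the advertised TCP receive window cannot exceed 65535 bytes (the header field is 16 bits), so a single connection can move at most one window per round trip. A rough sketch of that ceiling follows. The function name max_throughput_kbs is made up for illustration, and the round-trip times are example values, not measurements from these servers.

```shell
#!/bin/sh
# Upper bound on single-connection throughput in KB/s for a given
# receive window (bytes) and round-trip time (milliseconds):
#   throughput <= window / RTT
max_throughput_kbs() {
    window_bytes="$1"
    rtt_ms="$2"
    echo $(( window_bytes * 1000 / rtt_ms / 1024 ))
}

max_throughput_kbs 65535 50    # prints 1279
max_throughput_kbs 65535 100   # prints 639
```

So the cost of leaving scaling off depends on latency: on short paths the 64 KB cap is rarely the bottleneck, while on long, fast paths it can hold a transfer well below line rate.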
