Why does a 1 Gbit card have output limited to 80 MiB/s?


Solution 1

80 MB/second is actually pretty good! That's about 640 Mbps, which is pretty darn close to the gigabit capacity of the NIC. If you take the TCP/IP overhead and disk speed into consideration, you're probably at your maximum speed.
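
If you want to rule the disks in or out, a quick sketch (the device name and scratch path are assumptions; adjust for your system):

# raw sequential read speed of the source disk
hdparm -t /dev/sda

# sequential write speed on the receiving side, bypassing the page cache
# (run it in the directory you are actually copying into)
dd if=/dev/zero of=./ddtest bs=1M count=2048 oflag=direct
rm ./ddtest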

Solution 2

Try putting this in your /etc/sysctl.conf:

# General 10gigabit/LFP tuning
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.ipv4.tcp_rmem=4096 87380 16777216
net.ipv4.tcp_wmem=4096 65536 16777216
net.ipv4.tcp_syncookies=1
net.ipv4.tcp_max_orphans=1048576
net.ipv4.tcp_orphan_retries=2

# Removes some internal buffering
net.ipv4.tcp_low_latency=1

# Time-wait sockets
# Do not turn on unless you know what you are doing!
#net.ipv4.tcp_tw_recycle=1
#net.ipv4.tcp_tw_reuse=1

# If PMTUD ICMP blackhole appears use
# RFC 4821, Packetization Layer Path MTU Discovery
net.ipv4.tcp_mtu_probing=1

# Netfilter's conntrack
# NB! For high-performance concerns you probably don't want to use `--state` rules at all 
#net.ipv4.netfilter.ip_conntrack_max=1048576
#net.nf_conntrack_max=1048576

# SACKs are a TCP optimization which in normal scenarios considerably improves performance.
# In Gigabit networks with no traffic competition they can have the opposite effect.
# To improve performance in that case they can be turned off with:
#net.ipv4.tcp_sack=0

# Decrease the default value for the tcp_fin_timeout timer
net.ipv4.tcp_fin_timeout=15
# Decrease the default value for the tcp_keepalive_time timer
net.ipv4.tcp_keepalive_time=1800

# Increased backlog (default: 100/1000 depending on kernel)
net.core.netdev_max_backlog=10000
net.core.somaxconn=10000

# Timestamps add an additional 12 bytes to the header and use CPU
# NB! It caused massive problems for me under benchmark load
# with a high count of concurrent connections.
# ( http://redmine.lighttpd.net/wiki/1/Docs:Performance )
#net.ipv4.tcp_timestamps=0

# Portrange for outgoing connections
# (increase the ephemeral port range)
# NB! After this tuning you probably do not want to listen on any port >= 1024
net.ipv4.ip_local_port_range=1024 65535

# Fixes 'Too many open files'; the second setting is useful on nginx+aio workloads
fs.file-max=16777216
fs.aio-max-nr=65536

# If you are under a DDoS attack you can set
kernel.panic=10
# and lower the following values
#net.ipv4.tcp_synack_retries=2
#net.ipv4.tcp_syn_retries=2
#net.ipv4.netfilter.ip_conntrack_tcp_timeout_fin_wait=15
#net.ipv4.netfilter.ip_conntrack_tcp_timeout_close_wait=15
# If you are under a ping flood
#net.ipv4.icmp_echo_ignore_all=1
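
To load the new values without a reboot (and catch any typos), you can do something like:

sysctl -p /etc/sysctl.conf
# spot-check a couple of the values
sysctl net.core.rmem_max net.ipv4.tcp_rmem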

Each connection we make requires an ephemeral port, and thus a file descriptor, and by default this is limited to 1024. To avoid the Too many open files problem you’ll need to modify the ulimit for your shell. This can be changed in /etc/security/limits.conf, but requires a logout/login. For now you can just sudo and modify the current shell (su back to your non-priv’ed user after calling ulimit if you don’t want to run as root):

ulimit -n 999999
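
To make the limit permanent, the rough equivalent in /etc/security/limits.conf (the user name here is just a placeholder; it takes effect on the next login) is:

# /etc/security/limits.conf
youruser  soft  nofile  999999
youruser  hard  nofile  999999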

Another thing you can try that may help increase TCP throughput is to increase the size of the interface queue. To do this, do the following:

ifconfig eth0 txqueuelen 1000
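
On systems where ifconfig is deprecated, the iproute2 equivalent (assuming the interface is eth0) is:

ip link set dev eth0 txqueuelen 1000
# verify
ip link show dev eth0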

You can play with congestion control:

sysctl net.ipv4.tcp_available_congestion_control
sysctl -w net.ipv4.tcp_congestion_control=htcp
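
If htcp isn't listed as available, the congestion-control module usually just needs to be loaded first; to keep the setting across reboots, add it to /etc/sysctl.conf as well. A sketch:

modprobe tcp_htcp                                # load H-TCP if it is not built into the kernel
sysctl -w net.ipv4.tcp_congestion_control=htcp   # switch now
echo 'net.ipv4.tcp_congestion_control=htcp' >> /etc/sysctl.conf   # persist it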

There is also some lower-level tuning, e.g. kernel module parameters:

# /sbin/modinfo e1000
..snip...
parm:           TxDescriptors:Number of transmit descriptors (array of int)
parm:           TxDescPower:Binary exponential size (2^X) of each transmit descriptor (array of int)
parm:           RxDescriptors:Number of receive descriptors (array of int)
parm:           Speed:Speed setting (array of int)
parm:           Duplex:Duplex setting (array of int)
parm:           AutoNeg:Advertised auto-negotiation setting (array of int)
parm:           FlowControl:Flow Control setting (array of int)
parm:           XsumRX:Disable or enable Receive Checksum offload (array of int)
parm:           TxIntDelay:Transmit Interrupt Delay (array of int)
parm:           TxAbsIntDelay:Transmit Absolute Interrupt Delay (array of int)
parm:           RxIntDelay:Receive Interrupt Delay (array of int)
parm:           RxAbsIntDelay:Receive Absolute Interrupt Delay (array of int)
parm:           InterruptThrottleRate:Interrupt Throttling Rate (array of int)
parm:           SmartPowerDownEnable:Enable PHY smart power down (array of int)
parm:           KumeranLockLoss:Enable Kumeran lock loss workaround (array of int)
parm:           copybreak:Maximum size of packet that is copied to a new buffer on receive 
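
Such parameters are normally set in a modprobe configuration file and take effect when the driver is reloaded. For example, to adjust interrupt throttling and descriptor counts (the values below are placeholders, not recommendations; check the e1000 documentation shipped with your kernel):

# /etc/modprobe.d/e1000.conf
options e1000 InterruptThrottleRate=3000 TxDescriptors=1024 RxDescriptors=1024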

And there are even lower-level hardware tunings accessible via ethtool(1).
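
For example (eth0 assumed; what is actually tunable depends on the driver):

ethtool -g eth0                            # show current RX/TX ring sizes
ethtool -G eth0 rx 4096 tx 4096            # grow the rings, if the NIC supports it
ethtool -k eth0                            # show checksum/segmentation offload settings
ethtool -c eth0                            # show interrupt coalescing settings
ethtool -S eth0 | grep -i -e drop -e err   # look for drops/errors under load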

PS. Read the kernel docs, especially Documentation/networking/scaling.txt

PPS. While tuning TCP performance you may want to consult RFC 6349.

PPPS. D-Link is not the best network hardware. Try Intel hardware with PCI-X or 64-bit PCI.

Solution 3

Your 32-bit, 33 MHz PCI bus can transfer a maximum of 1,067 megabits per second (Mbps), or 133.33 megabytes per second (MBps).

Gigabit Ethernet can carry about 116 megabytes per second (MBps) of payload once protocol overhead is accounted for.
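
Roughly where a figure in that ballpark comes from, as a back-of-the-envelope sketch assuming a standard 1500-byte MTU and TCP timestamps enabled:

# per frame on the wire: 1500 MTU + 14 Ethernet header + 4 FCS + 8 preamble/SFD + 12 interframe gap = 1538 bytes
# TCP payload per frame: 1500 - 20 IP - 20 TCP - 12 timestamp option = 1448 bytes
echo "scale=1; 125 * 1448 / 1538" | bc   # ~117.7 MB/s theoretical best case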

So although your card should be able to fully saturate the line, you'll actually only ever get about 90% utilisation because of various overheads.

Either way, if you're getting 80 megabytes per second (MBps) then you're not far off, and I would be reasonably happy with that for now.

Solution 4

Gigabit Ethernet is just over 1 billion bits per second. With 8b/10b encoding this gives you a maximum of around 100 MB per second. A 32-bit PCI bus should be able to put 133 MB/sec through and you should be able to saturate it (I can demonstrate saturation of a PCI bus with a Fibre Channel card and get a figure close to the theoretical bandwidth of the bus), so it is unlikely to be the cause of the bottleneck unless there is other bus traffic.

The bottleneck is probably somewhere else unless you have another card using bandwidth on the bus.

Solution 5

Bottlenecks at GigE speeds can come from a number of places; a quick memory-to-memory test (sketched after the list below) helps pin down which one.

  • Disk subsystem: It takes at least 3-4 hard drives in a RAID array of some sort to be able to hit GigE speeds. This is true on both the sending and receiving ends.
  • CPU: GigE can use a lot more CPU than you would think. Given that it's in a 33 MHz PCI slot, I'm going to go out on a limb here and say that this system is fairly old and may have a slower CPU.
  • TCP/IP overhead: Some of the bits sent over the wire are not data payload but protocol overhead. That said, I have had a system that consistently hit and sustained 115 MB/s with a single GigE link.
  • PCI bus: Is the NIC the only thing on that PCI bus, or is it being shared with another device?
  • Other factors: There are too many other factors to mention them all, but some of the biggest would be what other disk IO activity is happening. Is it a mix of reads and writes, lots of small IO requests, etc.?
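
To take the disks out of the picture entirely, a memory-to-memory test with iperf is the usual first step (the host name below is just a placeholder):

# on the receiving machine
iperf3 -s

# on the sending machine
iperf3 -c receiver.example.com -t 30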

Comments

  • MarkR, almost 2 years ago

    I'm trying to utilize the maximal bandwidth provided by my 1GiB network card, but it's always limited to 80MiB (real megabytes). What can be the reason? Card description (lshw output):

       description: Ethernet interface
        product: DGE-530T Gigabit Ethernet Adapter (rev 11)
        vendor: D-Link System Inc
        physical id: 0
        bus info: pci@0000:03:00.0
        logical name: eth1
        version: 11
        serial: 00:22:b0:68:70:41
        size: 1GB/s
        capacity: 1GB/s
        width: 32 bits
        clock: 66MHz
        capabilities: pm vpd bus_master cap_list rom ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt 1000bt-fd autonegotiation
    

    The card is placed in the following PCI slot:

    *-pci:2
         description: PCI bridge
         product: 82801 PCI Bridge
         vendor: Intel Corporation
         physical id: 1e
         bus info: pci@0000:00:1e.0
         version: 92
         width: 32 bits
         clock: 33MHz
         capabilities: pci subtractive_decode bus_master cap_list
    

    The PCI slot isn't PCI Express, right? It's a legacy PCI slot? So maybe this is the reason?

    The OS is Linux.

    • Dan Carley, about 15 years ago
      It would be useful to describe how you are attaining the benchmark of 80MiB.
    • Taras Chuhay, about 15 years ago
      You probably reached your maximum disk read capacity.
    • Steve-o, almost 13 years ago
      A cheap NIC on a slow bus; raw throughput should reach closer to 120 MB/s.
  • radius, about 15 years ago
    As he talks about 1GiB and 80MiB, I guess it's 80 Mbps and not 80 MB/s.
  • Russ Warren, about 15 years ago
    Though he does say "real megabytes" in his original post. Even if it is 80 Mbps, he could have slow disk performance as a bottleneck.
  • Dan Carley, about 15 years ago
    If we're talking purely network then 80MB/s might be acceptable for a D-Link card, but it is pretty abysmal for a good 1GbE card. 1GbE is 125MB/s wirespeed. TCP overhead is closer to 10%. That should leave you at around 110MB/s, worst case. A bit higher if you factor in jumbo frames.
  • David Pashley, about 15 years ago
    I assume "real megabytes" refers to mebibytes, not the fake megabytes that disk manufacturers like. (1024^2, rather than 1000^2)
  • David Pashley, about 15 years ago
    Not to mention Ethernet frame, IP and TCP packet overheads.
  • Admin, about 15 years ago
    TCP overhead is going to vary by packet size. This should be automatically tuned by the OS, but he could always try increasing MTUs manually.
  • radius, about 15 years ago
    I saw you corrected 1GiB to 1Gbps; you must not use mebibytes for networking, but megabytes. Anyway, your 80 MiB/s is about 671 Mbps. Now we have to know how you measured this. If it's the speed at the Ethernet layer, it's poor for a Gbps card; if it's the speed at the application level, it's quite good, though you could do a little better. Either way, we need to know how you measured it.
  • SaveTheRbtz, about 15 years ago
    Wikipedia says that twisted-pair-based 1000Base-T doesn't use 8b/10b. Is that right?
  • ConcernedOfTunbridgeWells, about 15 years ago
    Not sure. I was under the impression it did. Maybe I'm thinking of Fibre Channel.
  • romandas, almost 15 years ago
    +1 - It's likely the drive, particularly if it is a single drive system.
  • romandas, almost 15 years ago
    "at least 3-4 hard drives in a RAID array" -- isn't this only true for disk writes?
  • 3dinfluence, almost 15 years ago
    While writes are more expensive than reads in terms of performance, there's still no way a single hard drive is going to keep up with a gigabit network connection. Even a 15k rpm SAS drive isn't capable of sustaining gigabit speeds across the entire surface for reads or writes.
  • MDMarra, over 14 years ago
    First of all, the OP is using PCI not PCIe. Also, x1 is not a bottleneck for a single GigE card.
  • Roy, over 14 years ago
    Striping two SATA drives should be able to fill a gigabit link, even if done in software.
  • Mike Broughton, over 14 years ago
    @MarkM: Gallus is the OP. (That's why his/her name is in a beige box.)
  • Vatine, over 13 years ago
    Erm, no. If the other end was running at 10 Mbps, you'd be unable to push more than about 1.2 MB/s (slightly less, in fact) and if it was running 100 Mbps, you'd be looking at a peak somewhere in the 11.9 MB/s range. "Gig ethernet" is not "gigabyte", it's "gigabit" (and it's a "base 10" gig, at that, so 10^9, not 2^30).
  • Philip, over 12 years ago
    1000Base-T has a line rate of 125M symbols/s, the same as 100MbE. It uses all 4 pairs however, and 5 signal levels (simplified: 2 level signals per direction and a neutral), to produce 1 Gbps. The first 8 bits are encoded to 12 transmission bits to prime the DC-balance algorithm, but otherwise transmissions occur at line rate. 1000Base-SX (and other fiber variants) do use 8b/10b line coding and operate at a line rate of 1.25Gbps.
  • rackandboneman, about 12 years ago
    To rule out disk bottlenecks, use ramdisks and turn off the swap for benchmarking.