Why does a 1 Gbit card have output limited to 80 MiB/s?


Solution 1

80 MB/second is actually pretty good! That's about 640 Mbps, which is pretty darn close to the gigabit capacity of the NIC. If you take the TCP/IP overhead and disk speed into consideration, you're probably at your maximum speed.
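
If you want to rule the disks in or out, a quick sketch (the device name and scratch path are assumptions; adjust for your system):

# raw sequential read speed of the source disk
hdparm -t /dev/sda

# sequential write speed on the receiving side, bypassing the page cache
# (run it in the directory you are actually copying into)
dd if=/dev/zero of=./ddtest bs=1M count=2048 oflag=direct
rm ./ddtest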

Solution 2

Try putting this in your /etc/sysctl.conf:

# General 10gigabit/LFP tuning
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.ipv4.tcp_rmem=4096 87380 16777216
net.ipv4.tcp_wmem=4096 65536 16777216
net.ipv4.tcp_syncookies=1
net.ipv4.tcp_max_orphans=1048576
net.ipv4.tcp_orphan_retries=2

# Removes some internal buffering
net.ipv4.tcp_low_latency=1

# Time-wait sockets
# Do not turn on unless you know what you are doing!
#net.ipv4.tcp_tw_recycle=1
#net.ipv4.tcp_tw_reuse=1

# If PMTUD ICMP blackhole appears use
# RFC 4821, Packetization Layer Path MTU Discovery
net.ipv4.tcp_mtu_probing=1

# Netfilter's conntrack
# NB! For high-performance concerns you probably don't want to use `--state` rules at all 
#net.ipv4.netfilter.ip_conntrack_max=1048576
#net.nf_conntrack_max=1048576

# SACKs are a TCP optimization which in normal scenarios considerably improves performance.
# In Gigabit networks with no traffic competition they can have the opposite effect.
# To improve performance in that case they can be turned off with:
#net.ipv4.tcp_sack=0

# Decrease the default value for the tcp_fin_timeout timer
net.ipv4.tcp_fin_timeout=15
# Decrease the default value for the tcp_keepalive_time timer
net.ipv4.tcp_keepalive_time=1800

# Increased backlog (default: 100/1000 depending on kernel)
net.core.netdev_max_backlog=10000
net.core.somaxconn=10000

# Timestamps add an additional 12 bytes to the header and use CPU
# NB! It caused massive problems for me under benchmark load
# with a high count of concurrent connections.
# ( http://redmine.lighttpd.net/wiki/1/Docs:Performance )
#net.ipv4.tcp_timestamps=0

# Portrange for outgoing connections
# (increase the ephemeral port range)
# NB! After this tuning you probably do not want to listen on any port >= 1024
net.ipv4.ip_local_port_range=1024 65535

# Fixes 'Too many open files'; the second setting is useful on nginx+aio workloads
fs.file-max=16777216
fs.aio-max-nr=65536

# If you are under a DDoS attack you can set
kernel.panic=10
# and lower the following values
#net.ipv4.tcp_synack_retries=2
#net.ipv4.tcp_syn_retries=2
#net.ipv4.netfilter.ip_conntrack_tcp_timeout_fin_wait=15
#net.ipv4.netfilter.ip_conntrack_tcp_timeout_close_wait=15
# If you are under a ping flood
#net.ipv4.icmp_echo_ignore_all=1
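
To load the new values without a reboot (and catch any typos), you can do something like:

sysctl -p /etc/sysctl.conf
# spot-check a couple of the values
sysctl net.core.rmem_max net.ipv4.tcp_rmem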

Each connection we make requires an ephemeral port, and thus a file descriptor, and by default this is limited to 1024. To avoid the Too many open files problem you’ll need to modify the ulimit for your shell. This can be changed in /etc/security/limits.conf, but requires a logout/login. For now you can just sudo and modify the current shell (su back to your non-priv’ed user after calling ulimit if you don’t want to run as root):

ulimit -n 999999
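
To make the limit permanent, the rough equivalent in /etc/security/limits.conf (the user name here is just a placeholder; it takes effect on the next login) is:

# /etc/security/limits.conf
youruser  soft  nofile  999999
youruser  hard  nofile  999999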

Another thing you can try that may help increase TCP throughput is to increase the size of the interface queue. To do this, do the following:

ifconfig eth0 txqueuelen 1000
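
On systems where ifconfig is deprecated, the iproute2 equivalent (assuming the interface is eth0) is:

ip link set dev eth0 txqueuelen 1000
# verify
ip link show dev eth0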

You can play with congestion control:

sysctl net.ipv4.tcp_available_congestion_control
sysctl -w net.ipv4.tcp_congestion_control=htcp
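
If htcp isn't listed as available, the congestion-control module usually just needs to be loaded first; to keep the setting across reboots, add it to /etc/sysctl.conf as well. A sketch:

modprobe tcp_htcp                                # load H-TCP if it is not built into the kernel
sysctl -w net.ipv4.tcp_congestion_control=htcp   # switch now
echo 'net.ipv4.tcp_congestion_control=htcp' >> /etc/sysctl.conf   # persist it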

There is also some lower-level tuning, e.g. kernel module parameters:

# /sbin/modinfo e1000
..snip...
parm:           TxDescriptors:Number of transmit descriptors (array of int)
parm:           TxDescPower:Binary exponential size (2^X) of each transmit descriptor (array of int)
parm:           RxDescriptors:Number of receive descriptors (array of int)
parm:           Speed:Speed setting (array of int)
parm:           Duplex:Duplex setting (array of int)
parm:           AutoNeg:Advertised auto-negotiation setting (array of int)
parm:           FlowControl:Flow Control setting (array of int)
parm:           XsumRX:Disable or enable Receive Checksum offload (array of int)
parm:           TxIntDelay:Transmit Interrupt Delay (array of int)
parm:           TxAbsIntDelay:Transmit Absolute Interrupt Delay (array of int)
parm:           RxIntDelay:Receive Interrupt Delay (array of int)
parm:           RxAbsIntDelay:Receive Absolute Interrupt Delay (array of int)
parm:           InterruptThrottleRate:Interrupt Throttling Rate (array of int)
parm:           SmartPowerDownEnable:Enable PHY smart power down (array of int)
parm:           KumeranLockLoss:Enable Kumeran lock loss workaround (array of int)
parm:           copybreak:Maximum size of packet that is copied to a new buffer on receive 
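
Such parameters are normally set in a modprobe configuration file and take effect when the driver is reloaded. For example, to adjust interrupt throttling and descriptor counts (the values below are placeholders, not recommendations; check the e1000 documentation shipped with your kernel):

# /etc/modprobe.d/e1000.conf
options e1000 InterruptThrottleRate=3000 TxDescriptors=1024 RxDescriptors=1024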

And there are even lower-level hardware tunings accessible via ethtool(1).
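
For example (eth0 assumed; what is actually tunable depends on the driver):

ethtool -g eth0                            # show current RX/TX ring sizes
ethtool -G eth0 rx 4096 tx 4096            # grow the rings, if the NIC supports it
ethtool -k eth0                            # show checksum/segmentation offload settings
ethtool -c eth0                            # show interrupt coalescing settings
ethtool -S eth0 | grep -i -e drop -e err   # look for drops/errors under load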

PS. Read the kernel docs, especially Documentation/networking/scaling.txt

PPS. While tuning TCP performance you may want to consult RFC 6349.

PPPS. D-Link is not the best network hardware. Try Intel hardware with PCI-X or 64-bit PCI.

Solution 3

Your 32-bit, 33 MHz PCI bus can transfer a maximum of 1,067 megabits per second (Mbps), or 133.33 megabytes per second (MBps).

Gigabit Ethernet can carry about 116 megabytes per second (MBps) of payload once protocol overhead is accounted for.
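
Roughly where a figure in that ballpark comes from, as a back-of-the-envelope sketch assuming a standard 1500-byte MTU and TCP timestamps enabled:

# per frame on the wire: 1500 MTU + 14 Ethernet header + 4 FCS + 8 preamble/SFD + 12 interframe gap = 1538 bytes
# TCP payload per frame: 1500 - 20 IP - 20 TCP - 12 timestamp option = 1448 bytes
echo "scale=1; 125 * 1448 / 1538" | bc   # ~117.7 MB/s theoretical best case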

So although your card should be able to fully saturate the line, you'll actually only ever get about 90% utilisation because of various overheads.

Either way, if you're getting 80 megabytes per second (MBps) then you're not far off, and I would be reasonably happy with that for now.

Solution 4

Gigabit Ethernet is just over 1 billion bits per second. With 8b/10b encoding this gives you a maximum of around 100 MB per second. A 32-bit PCI bus should be able to put 133 MB/sec through and you should be able to saturate it (I can demonstrate saturation of a PCI bus with a Fibre Channel card and get a figure close to the theoretical bandwidth of the bus), so it is unlikely to be the cause of the bottleneck unless there is other bus traffic.

The bottleneck is probably somewhere else unless you have another card using bandwidth on the bus.

Solution 5

Bottlenecks at GigE speeds can come from a number of places; a quick memory-to-memory test (sketched after the list below) helps pin down which one.

  • Disk subsystem: It takes at least 3-4 hard drives in a RAID array of some sort to be able to hit GigE speeds. This is true on both the sending and receiving ends.
  • CPU: GigE can use a lot more CPU than you would think. Given that it's in a 33 MHz PCI slot, I'm going to go out on a limb here and say that this system is fairly old and may have a slower CPU.
  • TCP/IP overhead: Some of the bits sent over the wire are not data payload but protocol overhead. That said, I have had a system that consistently hit and sustained 115 MB/s with a single GigE link.
  • PCI bus: Is the NIC the only thing on that PCI bus, or is it being shared with another device?
  • Other factors: There are too many other factors to mention them all, but some of the biggest would be what other disk IO activity is happening. Is it a mix of reads and writes, lots of small IO requests, etc.?
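
To take the disks out of the picture entirely, a memory-to-memory test with iperf is the usual first step (the host name below is just a placeholder):

# on the receiving machine
iperf3 -s

# on the sending machine
iperf3 -c receiver.example.com -t 30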

Comments

  • MarkR, almost 2 years ago

    I'm trying to utilize the maximal bandwidth provided by my 1GiB network card, but it's always limited to 80MiB (real megabytes). What can be the reason? Card description (lshw output):

       description: Ethernet interface
        product: DGE-530T Gigabit Ethernet Adapter (rev 11)
        vendor: D-Link System Inc
        physical id: 0
        bus info: pci@0000:03:00.0
        logical name: eth1
        version: 11
        serial: 00:22:b0:68:70:41
        size: 1GB/s
        capacity: 1GB/s
        width: 32 bits
        clock: 66MHz
        capabilities: pm vpd bus_master cap_list rom ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt 1000bt-fd autonegotiation
    

    The card is placed in the following PCI slot:

    *-pci:2
         description: PCI bridge
         product: 82801 PCI Bridge
         vendor: Intel Corporation
         physical id: 1e
         bus info: pci@0000:00:1e.0
         version: 92
         width: 32 bits
         clock: 33MHz
         capabilities: pci subtractive_decode bus_master cap_list
    

    The PCI slot isn't PCI Express, right? It's a legacy PCI slot? So maybe this is the reason?

    The OS is Linux.

    • Dan Carley, about 15 years ago
      It would be useful to describe how you are attaining the benchmark of 80MiB.
    • Taras Chuhay, about 15 years ago
      You probably reached your maximum disk read capacity.
    • Steve-o, almost 13 years ago
      A cheap NIC on a slow bus; raw throughput should reach closer to 120 MB/s.
  • radius, about 15 years ago
    As he talks about 1GiB and 80MiB, I guess it's 80 Mbps and not 80 MB/s.
  • Russ Warren, about 15 years ago
    Though he does say "real megabytes" in his original post. Even if it is 80 Mbps, he could have slow disk performance as a bottleneck.
  • Dan Carley, about 15 years ago
    If we're talking purely network then 80MB/s might be acceptable for a D-Link card, but it is pretty abysmal for a good 1GbE card. 1GbE is 125MB/s wirespeed. TCP overhead is closer to 10%. That should leave you at around 110MB/s, worst case. A bit higher if you factor in jumbo frames.
  • David Pashley, about 15 years ago
    I assume "real megabytes" refers to mebibytes, not the fake megabytes that disk manufacturers like. (1024^2, rather than 1000^2)
  • David Pashley, about 15 years ago
    Not to mention Ethernet frame, IP and TCP packet overheads.
  • Admin, about 15 years ago
    TCP overhead is going to vary by packet size. This should be automatically tuned by the OS, but he could always try increasing MTUs manually.
  • radius, about 15 years ago
    I saw you corrected 1GiB to 1Gbps; you must not use mebibytes for networking, but megabytes. Anyway, your 80 MiB/s is about 671 Mbps. Now we have to know how you measured this. If it's the speed at the Ethernet layer, it's poor for a Gbps card; if it's the speed at the application level, it's quite good, though you could do a little better. Either way, we need to know how you measured it.
  • SaveTheRbtz, about 15 years ago
    Wikipedia says that twisted-pair-based 1000Base-T doesn't use 8b/10b. Is that right?
  • ConcernedOfTunbridgeWells, about 15 years ago
    Not sure. I was under the impression it did. Maybe I'm thinking of Fibre Channel.
  • romandas, almost 15 years ago
    +1 - It's likely the drive, particularly if it is a single drive system.
  • romandas, almost 15 years ago
    "at least 3-4 hard drives in a RAID array" -- isn't this only true for disk writes?
  • 3dinfluence, almost 15 years ago
    While writes are more expensive than reads in terms of performance, there's still no way a single hard drive is going to keep up with a gigabit network connection. Even a 15k rpm SAS drive isn't capable of sustaining gigabit speeds across the entire surface for reads or writes.
  • MDMarra, over 14 years ago
    First of all, the OP is using PCI not PCIe. Also, x1 is not a bottleneck for a single GigE card.
  • Roy, over 14 years ago
    Striping two SATA drives should be able to fill a gigabit link, even if done in software.
  • Mike Broughton, over 14 years ago
    @MarkM: Gallus is the OP. (That's why his/her name is in a beige box.)
  • Vatine, over 13 years ago
    Erm, no. If the other end was running at 10 Mbps, you'd be unable to push more than about 1.2 MB/s (slightly less, in fact) and if it was running 100 Mbps, you'd be looking at a peak somewhere in the 11.9 MB/s range. "Gig ethernet" is not "gigabyte", it's "gigabit" (and it's a "base 10" gig, at that, so 10^9, not 2^30).
  • Philip, over 12 years ago
    1000Base-T has a line rate of 125M symbols/s, the same as 100MbE. It uses all 4 pairs however, and 5 signal levels (simplified: 2 level signals per direction and a neutral), to produce 1 Gbps. The first 8 bits are encoded to 12 transmission bits to prime the DC-balance algorithm, but otherwise transmissions occur at line rate. 1000Base-SX (and other fiber variants) do use 8b/10b line coding and operate at a line rate of 1.25Gbps.
  • rackandboneman, about 12 years ago
    To rule out disk bottlenecks, use ramdisks and turn off the swap for benchmarking.