NIC Ring Buffer Tuning: Packet Drops, Backlog, and Performance Considerations
Introduction to NIC Ring Buffer Issues
A company was running servers that had been receiving real-time data at 100 Mbps without any major issues. However, when the incoming data rate was increased to 1 Gbps, the service began to show signs of instability.
Among several load-balanced servers, a few started falling out of sync and were excluded from the service pool. Since the system was designed with redundancy, the failure of one or two servers didn’t cause a complete outage—the remaining servers could still handle the traffic. Nevertheless, leaving this issue unresolved posed a risk of escalating into a full-scale service disruption. This prompted a detailed investigation.
The root cause was identified fairly quickly. The network team had already observed packet drops, and checking the interface with ethtool confirmed it: the drop counters were steadily increasing, so packet loss was clearly occurring.
The first step in resolving such issues is to determine where the packet drops are occurring.
Understanding NIC Ring Buffer
The packet drops were occurring in the NIC Ring Buffer, a region of memory shared between the NIC and the kernel that holds packets as they first arrive from the network cable. The name "ring buffer" comes from its circular data structure.
If the ring buffer size is set to 256, it provides 256 slots that are filled sequentially as packets arrive. Once the last slot is used, writing wraps around to slot 1 again.
     ,~~~~,~~~~,~~~~,~~~~,~~~~,~~~~,~~~~,~~~~,~~~~,~~~~,
,--> ( 1 )( 2 )( 3 )( 4 )( 5 ) ... (253)(254)(255)(256) -->,
|    '~~~~'~~~~'~~~~'~~~~'~~~~'~~~~'~~~~'~~~~'~~~~'~~~~'    |
'-----------------------------------------------------------'
If packets arrive faster than the kernel can drain them, the buffer runs out of free slots and incoming packets are dropped, causing performance issues.
Checking Packet Drops
You can check the drop count using the following command:
# ethtool -S enp1s0
NIC statistics:
rx_queue_0_packets: 178624
rx_queue_0_bytes: 151983004
rx_queue_0_drops: 437
...
The output varies depending on the NIC and driver in use, but the key is to look at the drop and error counters; these reveal whether packets are being lost inside the NIC buffer. When these numbers keep increasing, packets are arriving faster than the ring buffer can be drained, and the resulting overflow shows up as packet loss.
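If you want to watch whether these counters are still climbing in real time, a simple filter over the statistics output is enough (a minimal sketch, assuming the interface is enp1s0 as in the example above):
# watch -n 1 "ethtool -S enp1s0 | grep -iE 'drop|err'"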
Increasing Ring Buffer
Since the root cause was identified, the solution turned out to be relatively straightforward. The ring buffer for the enp1s0 interface was still at its default size of 256, which was insufficient under the new 1 Gbps traffic load.
To address this, the buffer size was increased to 1024. While there are several ways to adjust this setting, in our case we used nmcli so that the change is stored in the connection profile and persists across reboots.
We increased the ring buffer from 256 to 1024 using nmcli
# nmcli con mod enp1s0 ethtool.ring-rx 1024
# nmcli con mod enp1s0 ethtool.ring-tx 1024
# nmcli con up enp1s0
When adjusting the ring buffer, keep in mind that the NIC is typically reset while the new size is applied, which can cause a brief interruption or latency spike. For this reason, it’s strongly recommended to perform the adjustment outside of critical service hours to avoid potential disruptions.
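If the interface is not managed by NetworkManager, the same change can also be applied directly with ethtool; note that this variant does not persist across reboots (a sketch assuming the same interface name and target size):
# ethtool -G enp1s0 rx 1024 tx 1024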
Use ethtool -g to confirm the updated values:
Ring parameters for enp1s0:
Pre-set maximums:
RX:             2048
...
Current hardware settings:
RX:             1024
...
In the output, the values are divided into two sections: Pre-set Maximums and Current Hardware Settings.
- The Pre-set Maximums indicate the maximum values supported by the hardware.
- The Current Hardware Settings show the values currently applied.
In other words, the Current section reflects the settings actively in use, and you can increase them up to the limits shown in the Pre-set Maximums. Note that the maximum values vary depending on the NIC model.
After increasing the ring buffer, no further packet drops were observed.
What Happens After Ring Buffer
However, a reasonable question often comes up: “Even if we increase the NIC ring buffer, what if the backend cannot process packets quickly enough, and they still pile up faster than they’re drained? Wouldn’t the buffer size then be irrelevant?”
The answer is yes. If the packet processing rate behind the NIC is slower than the arrival rate, simply enlarging the buffer won’t solve the problem.
This is why we need to also look at complementary parameters that help avoid bottlenecks when increasing the NIC ring buffer. In our case, these settings were already tuned properly, so increasing only the ring buffer was sufficient. But if your server doesn’t yet have these optimizations, it’s worth reviewing and adjusting the following configurations as well.
Here's the basic processing flow:
[ NIC Ring Buffer ] (Hardware queue)
|
| (Packet arrival → Hard IRQ triggered: "work to do" signal)
v
[ Hard IRQ ]
|
| (Schedules SoftIRQ: NET_RX_SOFTIRQ)
v
[ SoftIRQ (NAPI Poll) ]
|
| (Processes ~300 packets per cycle by default)
| └─ If it cannot finish, → 3rd field of /proc/net/softnet_stat increases
v
[ Kernel Backlog Queue (per-CPU) ]
|
| (If the queue overflows, packets are dropped → 2nd field increases)
v
[ Kernel Network Stack ]
(L2: Ethernet → L3: IP → L4: TCP/UDP)
|
v
[ Socket Buffer (per-socket receive queue) ]
|
| (If the application does not read in time →
| TCP: Zero Window advertised / UDP: Packet drop)
v
[ User-space Application ]
(Data received when recv() / read() is called)
1. Ring Buffer → The NIC’s first hardware-level queue
2. Hard IRQ → Simply signals that “work is pending”
3. SoftIRQ (NAPI Poll) → Pulls packets from the Ring Buffer and processes them, subject to a budget limit
4. Backlog Queue → Temporary holding area for packets that SoftIRQ could not finish (per-CPU)
5. Network Stack → Processes packets through each layer: Ethernet → IP → TCP/UDP
6. Socket Buffer → Queue per connection (TCP) or per port (UDP), storing packets until the application reads them
7. Application → Ultimately receives the data when recv() (or similar system call) is invoked
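Before tuning any of these stages, it is worth confirming where packets are actually accumulating. For example, if the per-socket receive queues in step 6 are the bottleneck, established connections will show a growing Recv-Q. A quick way to list such sockets (a minimal sketch; the column positions assume the default ss -nt output):
# ss -nt | awk 'NR > 1 && $2 > 0'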
Now, aside from the Ring Buffer we initially increased, let’s examine each parameter one by one.
Network Budget
Packets arriving in the Ring Buffer are picked up and processed by SoftIRQ. By default, SoftIRQ can handle about 300 packets per polling cycle.
If packets arrive faster than they can be processed, SoftIRQ will not be able to finish within a single cycle. The remaining packets must then wait until the next polling cycle, which can introduce processing delays.
In /proc/net/softnet_stat, the third field indicates the number of times the system could not process all packets from the interface during a single polling cycle.
# cat /proc/net/softnet_stat
00071464 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0006b6c3 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001
Converting to Decimal
# awk '{for (i=1; i<=NF; i++) printf "%d%s", strtonum("0x" $i), (i==NF ? "\n" : " ")}' /proc/net/softnet_stat | column -t
463412  0  0  0  0  0  0  0  0  0  0  0  0
439367  0  0  0  0  0  0  0  0  0  0  0  1
When the third column in the output increases, it indicates that not all packets were processed during a single polling cycle.
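Because the file contains one row per CPU, a per-CPU view of the dropped (2nd) and squeezed (3rd) columns makes it easier to see which cores are affected (a minimal sketch; strtonum requires GNU awk):
# awk '{printf "cpu%-2d  dropped=%d  squeezed=%d\n", NR-1, strtonum("0x" $2), strtonum("0x" $3)}' /proc/net/softnet_stat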
In this case, you can experiment by gradually increasing the values of:
- net.core.netdev_budget
- net.core.netdev_budget_usecs
A practical approach is to double the values step by step while monitoring system performance to see if packet drops or processing delays are reduced.
# vim /etc/sysctl.d/10-netdev_budget.conf
net.core.netdev_budget = 600
net.core.netdev_budget_usecs = 4000
With this kernel setting, the system is configured to process up to 600 packets per polling cycle within a maximum of 4000 microseconds.
This represents a twofold increase from the default of 300 packets, making it a sensible initial value to test when tuning.
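To load the new values without a reboot and verify what is currently in effect, the standard sysctl commands can be used (a minimal sketch, using the file name from the example above):
# sysctl -p /etc/sysctl.d/10-netdev_budget.conf
# sysctl net.core.netdev_budget net.core.netdev_budget_usecs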
Backlog Queue
Recommendation: Apply when the second column of /proc/net/softnet_stat continues to increase
Packets passed on by SoftIRQ are placed into the kernel’s Backlog Queue, where they wait to be processed by the network stack. A Backlog Queue exists for each CPU core, and each queue can hold packets up to the limit defined by net.core.netdev_max_backlog.
If the backlog assigned to a CPU core becomes full, packet drops will begin to occur. This behavior can be monitored through the second column (drop count) in /proc/net/softnet_stat.
You can test this parameter by doubling the value step by step until the second column of /proc/net/softnet_stat either stops increasing or at least no longer grows rapidly.
This iterative approach helps you find a stable configuration where backlog drops are minimized without unnecessarily oversizing the queue.
net.core.netdev_max_backlog = 2000
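As with the budget parameters, this value can be persisted in a sysctl drop-in file and applied without a reboot (a minimal sketch; the file name 20-netdev_backlog.conf is only an example):
# vim /etc/sysctl.d/20-netdev_backlog.conf
net.core.netdev_max_backlog = 2000

# sysctl -p /etc/sysctl.d/20-netdev_backlog.conf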
“Network tuning doesn’t have a single fixed answer—it’s about finding the optimal balance for your environment. I recommend carefully monitoring metrics such as /proc/net/softnet_stat and making gradual adjustments to identify what works best for your system.”