How to monitor network interface utilization correctly with Telegraf, InfluxDB, & Grafana?
Immediately after composing the question, I realized what the problem was. (cue head on desk)
Docker provides a virtual ethernet adapter to the container which will only see its own traffic. The solution is to either:
- ...run Telegraf outside of Docker, or
- ...run the container with the --net=host flag
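A minimal sketch of the second option, assuming the official telegraf image and a config file at ./telegraf.conf (the container name, the config path, and the helper function name are all hypothetical). With host networking the container shares the host's network stack, so inputs.net should see the real eth0 counters:

```shell
#!/bin/sh
# Start Telegraf on the host's network stack so [[inputs.net]] reads the
# host's interface counters instead of the container's virtual adapter.
# Assumptions: official "telegraf" image; ./telegraf.conf is a placeholder path.
# Set DOCKER=echo to print the command instead of running it.
start_telegraf_host_net() {
    ${DOCKER:-docker} run -d \
        --name telegraf \
        --net=host \
        -v "$PWD/telegraf.conf:/etc/telegraf/telegraf.conf:ro" \
        telegraf
}

# Usage (uncomment to run):
# start_telegraf_host_net
```

Note that with --net=host the container no longer joins a Docker bridge network, so the InfluxDB URL in the Telegraf output section probably needs to point at an address reachable from the host (for example a published port) rather than a container name.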
Nathan Osman
Updated on September 18, 2022
Nathan Osman over 1 year
I have Telegraf installed on a server and it contains the following network configuration:
[[inputs.net]]
  interfaces = ["eth0"]
This feeds the following metrics into InfluxDB:
bytes_recv, bytes_sent, drop_in, drop_out, err_in, err_out,
icmp_inaddrmaskreps, icmp_inaddrmasks, icmp_incsumerrors, icmp_indestunreachs, icmp_inechoreps, icmp_inechos, icmp_inerrors, icmp_inmsgs, icmp_inparmprobs, icmp_inredirects, icmp_insrcquenchs, icmp_intimeexcds, icmp_intimestampreps, icmp_intimestamps,
icmp_outaddrmaskreps, icmp_outaddrmasks, icmp_outdestunreachs, icmp_outechoreps, icmp_outechos, icmp_outerrors, icmp_outmsgs, icmp_outparmprobs, icmp_outredirects, icmp_outsrcquenchs, icmp_outtimeexcds, icmp_outtimestampreps, icmp_outtimestamps,
ip_defaultttl, ip_forwarding, ip_forwdatagrams, ip_fragcreates, ip_fragfails, ip_fragoks, ip_inaddrerrors, ip_indelivers, ip_indiscards, ip_inhdrerrors, ip_inreceives, ip_inunknownprotos, ip_outdiscards, ip_outnoroutes, ip_outrequests, ip_reasmfails, ip_reasmoks, ip_reasmreqds, ip_reasmtimeout,
packets_recv, packets_sent,
tcp_activeopens, tcp_attemptfails, tcp_currestab, tcp_estabresets, tcp_incsumerrors, tcp_inerrs, tcp_insegs, tcp_maxconn, tcp_outrsts, tcp_outsegs, tcp_passiveopens, tcp_retranssegs, tcp_rtoalgorithm, tcp_rtomax, tcp_rtomin,
udp_ignoredmulti, udp_incsumerrors, udp_indatagrams, udp_inerrors, udp_noports, udp_outdatagrams, udp_rcvbuferrors, udp_sndbuferrors,
udplite_ignoredmulti, udplite_incsumerrors, udplite_indatagrams, udplite_inerrors, udplite_noports, udplite_outdatagrams, udplite_rcvbuferrors, udplite_sndbuferrors
I then created a panel in Grafana with the following query:
SELECT derivative(sum("bytes_sent"), 1s) AS "up", derivative(sum("bytes_recv"), 1s) AS "down" FROM "autogen"."net" WHERE "interface" = 'eth0' AND $timeFilter GROUP BY time($__interval) fill(null)
(The derivative() is necessary since bytes_recv and bytes_sent are accumulating metrics.)
My concern is that the data is not accurate. As a test, I downloaded some very large files (1GB) and confirmed (ifconfig eth0) that RX bytes was increasing by the expected amount as data was received. However, the graph looks like this:
There is no change whatsoever to the metrics being recorded. What am I doing wrong?
Details
- Host is running Ubuntu Server 16.04
- Telegraf, InfluxDB, and Grafana are running in Docker