Determining Specific File Responsible for High I/O
Solution 1
There are several aspects to this question which have been addressed partially through other tools, but there doesn't appear to be a single tool that provides all the features you're looking for.
iotop
This tool shows which processes are consuming the most I/O, but it has no option to show specific file names.
$ sudo iotop
Total DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % init
2 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kthreadd]
3 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [ksoftirqd/0]
5 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/u:0]
6 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/0]
7 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [watchdog/0]
By default it does for disk I/O what regular top does for processes vying for the CPU's time. You can coax it into giving you a 30,000-foot view with the -a switch, which shows an accumulation by process over time.
$ sudo iotop -a
Total DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
258 be/3 root 0.00 B 896.00 K 0.00 % 0.46 % [jbd2/dm-0-8]
22698 be/4 emma 0.00 B 72.00 K 0.00 % 0.00 % chrome
22712 be/4 emma 0.00 B 172.00 K 0.00 % 0.00 % chrome
1177 be/4 root 0.00 B 36.00 K 0.00 % 0.00 % cupsd -F
22711 be/4 emma 0.00 B 120.00 K 0.00 % 0.00 % chrome
22703 be/4 emma 0.00 B 32.00 K 0.00 % 0.00 % chrome
22722 be/4 emma 0.00 B 12.00 K 0.00 % 0.00 % chrome
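As a rough cross-check on the accumulated view, Linux also exposes cumulative per-process I/O counters in /proc/<pid>/io. The sketch below is only an illustration: the function name io_totals is made up here, and it assumes a kernel built with task I/O accounting. Like iotop, it reports per process, not per file.

```shell
# Print the cumulative bytes a process has read from and written to storage.
# Argument: path to an io file such as /proc/<pid>/io (Linux only;
# reading another user's process usually requires root).
io_totals() {
    awk '
        BEGIN { r = 0; w = 0 }
        $1 == "read_bytes:"  { r = $2 }   # bytes actually fetched from storage
        $1 == "write_bytes:" { w = $2 }   # bytes sent to the storage layer
        END { print "read=" r " write=" w }
    ' "$1"
}
```

For example, `io_totals /proc/1234/io` (1234 being a placeholder PID) prints a single line of the form `read=N write=N`.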
i* tools (inotify, iwatch, etc.)
These tools provide access to file access events, but they have to be pointed at specific directories or files. That makes them of little help when you're trying to track down rogue file access by an unknown process while debugging a performance issue.
Also, the inotify framework reports only the type of access, not how much data is being moved back and forth, so these tools can't tell you anything about the volume of I/O per file.
iostat
Shows overall performance (reads and writes) for a given device (hard drive) or partition, but gives no insight into which files are generating those accesses.
$ iostat -htx 1 1
Linux 3.5.0-19-generic (manny) 08/18/2013 _x86_64_ (3 CPU)
08/18/2013 10:15:38 PM
avg-cpu: %user %nice %system %iowait %steal %idle
18.41 0.00 1.98 0.11 0.00 79.49
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda
0.01 0.67 0.09 0.87 1.45 16.27 37.06 0.01 10.92 11.86 10.82 5.02 0.48
dm-0
0.00 0.00 0.09 1.42 1.42 16.21 23.41 0.01 9.95 12.22 9.81 3.19 0.48
dm-1
0.00 0.00 0.00 0.02 0.01 0.06 8.00 0.00 175.77 24.68 204.11 1.43 0.00
blktrace
This option is too low level. It shows only raw block numbers, with no visibility into which files and/or inodes are being accessed.
$ sudo blktrace -d /dev/sda -o - | blkparse -i -
8,5 0 1 0.000000000 258 A WBS 0 + 0 <- (252,0) 0
8,0 0 2 0.000001644 258 Q WBS [(null)]
8,0 0 3 0.000007636 258 G WBS [(null)]
8,0 0 4 0.000011344 258 I WBS [(null)]
8,5 2 1 1266874889.709032673 258 A WS 852117920 + 8 <- (252,0) 852115872
8,0 2 2 1266874889.709033751 258 A WS 852619680 + 8 <- (8,5) 852117920
8,0 2 3 1266874889.709034966 258 Q WS 852619680 + 8 [jbd2/dm-0-8]
8,0 2 4 1266874889.709043188 258 G WS 852619680 + 8 [jbd2/dm-0-8]
8,0 2 5 1266874889.709045444 258 P N [jbd2/dm-0-8]
8,0 2 6 1266874889.709051409 258 I WS 852619680 + 8 [jbd2/dm-0-8]
8,0 2 7 1266874889.709053080 258 U N [jbd2/dm-0-8] 1
8,0 2 8 1266874889.709056385 258 D WS 852619680 + 8 [jbd2/dm-0-8]
8,5 2 9 1266874889.709111456 258 A WS 482763752 + 8 <- (252,0) 482761704
...
^C
...
Total (8,0):
Reads Queued: 0, 0KiB Writes Queued: 7, 24KiB
Read Dispatches: 0, 0KiB Write Dispatches: 3, 24KiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 0, 0KiB Writes Completed: 5, 24KiB
Read Merges: 0, 0KiB Write Merges: 3, 12KiB
IO unplugs: 2 Timer unplugs: 0
Throughput (R/W): 0KiB/s / 510KiB/s
Events (8,0): 43 entries
Skips: 0 forward (0 - 0.0%)
fatrace
fatrace is a fairly new tool, and a welcome one. It's built on the kernel's fanotify interface, so it's only available in newer distros such as Ubuntu 12.10. My Fedora 14 system was lacking it 8-).
It provides the same kind of access events that you can get through inotify, without having to target a particular directory and/or files.
$ sudo fatrace
pickup(4910): O /var/spool/postfix/maildrop
pickup(4910): C /var/spool/postfix/maildrop
sshd(4927): CO /etc/group
sshd(4927): CO /etc/passwd
sshd(4927): RCO /var/log/lastlog
sshd(4927): CWO /var/log/wtmp
sshd(4927): CWO /var/log/lastlog
sshd(6808): RO /bin/dash
sshd(6808): RO /lib/x86_64-linux-gnu/ld-2.15.so
sh(6808): R /lib/x86_64-linux-gnu/ld-2.15.so
sh(6808): O /etc/ld.so.cache
sh(6808): O /lib/x86_64-linux-gnu/libc-2.15.so
The above shows you the ID of the process doing the file access and which file it's accessing, but it doesn't give you any overall bandwidth usage, so each access is indistinguishable from any other.
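One way to get a rough per-file ranking out of this output is to count events per path. The sketch below is an illustration, not a feature of fatrace: the function name count_events_by_file is made up, and it assumes each line has the form `name(pid): FLAGS /path` with no spaces in the process name. It counts events, not bytes, so it is only a proxy for I/O volume.

```shell
# Count fatrace events per file path, busiest file first.
# Reads fatrace-style lines ("name(pid): FLAGS /path") on stdin.
count_events_by_file() {
    awk '
        NF >= 3 {
            path = $0
            # Drop the "name(pid):" and flags fields; keep the rest so that
            # paths containing spaces stay intact
            sub(/^[^ ]+ +[^ ]+ +/, "", path)
            count[path]++
        }
        END { for (p in count) print count[p] "\t" p }
    ' | sort -rn
}
```

A 30-second sample could then be summarized with something like `sudo timeout 30 fatrace | count_events_by_file | head -n 20`.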
So what to do?
The fatrace option shows the most promise for FINALLY providing a tool that can show you aggregate usage of disk I/O based on the files being accessed, rather than the processes doing the accessing.
References
- fatrace: report system wide file access events
- fatrace - report system wide file access events
- Another new ABI for fanotify
- blktrace User Guide
Solution 2
I haven't gotten an answer yet, but I did write this script (at the end) and it seems to do what I want. I haven't tested it on other systems and it's Linux-specific.
Basically it just wraps around strace for 30 seconds, filtering for file-related system calls (those that take a file name as an argument, such as open and stat, not read or write), and makes an effort to strip out the filename. It counts the number of occurrences of each file in the strace output and presents a paginated summary to the user. It's not perfect, but the number of system calls against a particular file may have some weak correlation with how much I/O it's performing.
I haven't tested it fully, but if it doesn't work out of the box it should give people a place to start from. If it gets fleshed out any more, it may be advisable to rewrite it in a higher-level language like Python.
If I don't get an answer within a week describing a less homebrewed way of doing this (even if it's another tool that just counts I/O of a particular process), I'll accept this as my answer for posterity.
Script:
#!/bin/bash
####
# Creates files underneath /tmp
# Requires commands: timeout strace stty
####
#
# All commands are GNU unless otherwise stated
#
##########################################################

####
## Initialization
####
outputFile=/tmp/out.$RANDOM.$$
uniqueLinesFile=/tmp/unique.$RANDOM.$$
finalResults=/tmp/finalOutput.txt.$$

if [ $# -ne 1 ]; then
    echo "USAGE: traceIO [PID]" >&2
    exit 2
fi

if ! [[ "$1" =~ ^[0-9]+$ ]]; then
    echo "USAGE: traceIO [PID]" >&2
    echo -e "\nGiven Process ID is not a number." >&2
    exit 2
fi

if [ ! -e "/proc/$1" ]; then
    echo "USAGE: traceIO [PID]" >&2
    echo -e "\nThere is no process with $1 as the PID." >&2
    exit 2
fi

# Fall back to the first pager found if $PAGER is not set
if [[ -z "$PAGER" ]]; then
    for currentNeedle in less more cat; do
        if command -v "$currentNeedle" >/dev/null 2>&1; then
            PAGER=$currentNeedle
            break
        fi
    done
    if [[ -z "$PAGER" ]]; then
        echo "Please set \$PAGER appropriately and re-run" >&2
        exit 1
    fi
fi

####
## Tracing
####
echo "Tracing command for 30 seconds..."
# Keep only the first quoted string on each strace line, which is normally the path argument
timeout 30 strace -e trace=file -fvv -p "$1" 2>&1 | grep -Ev -e "detached$" -e "interrupt to quit$" | cut -f2 -d \" > "$outputFile"
# Check the status of timeout/strace, not of the last command in the pipeline.
# timeout exits with 124 when it stops strace after 30 seconds, which is the expected outcome.
traceStatus=${PIPESTATUS[0]}
if [ "$traceStatus" -ne 0 ] && [ "$traceStatus" -ne 124 ]; then
    echo -e "\nError performing Trace. Exiting" >&2
    rm -f "$outputFile" 2>/dev/null
    exit 1
fi
echo "Trace complete. Preparing Results..."

####
## Processing
####
sort "$outputFile" | uniq > "$uniqueLinesFile"
echo -e "\n-------- RESULTS --------\n\n #\t Path " > "$finalResults"
echo -e " ---\t-------" >> "$finalResults"
while IFS= read -r currentLine; do
    # -x and -F count exact, literal matches, so /etc is not also counted for /etc/passwd
    printf '%s\t%s\n' "$(grep -cxF -- "$currentLine" "$outputFile")" "$currentLine"
done < "$uniqueLinesFile" | sort -rn >> "$finalResults"

####
## Presentation
####
resultSize=$(wc -l "$finalResults" | awk '{print $1}')
currentWindowSize=$(stty size | awk '{print $1}')
# We put five literal lines in the file so if we don't have more than that, there were no results
if [ "$resultSize" -eq 5 ]; then
    echo -e "\n\n No Results found!"
elif [ "$resultSize" -ge "$currentWindowSize" ]; then
    $PAGER "$finalResults"
else
    cat "$finalResults"
fi

# Cleanup
rm -f "$uniqueLinesFile" "$outputFile" "$finalResults"
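The counting step can also be written as a single pipeline with sort and uniq -c. This is a minimal sketch, not a drop-in replacement for the script: the function name count_paths is made up here, and unlike the script it drops strace lines that contain no quoted string at all.

```shell
# Count how often each path appears in strace output read from stdin.
# The first double-quoted string on a line is taken to be the path.
count_paths() {
    # With '"' as the field separator, field 2 is the first quoted string;
    # lines without a complete quoted string are skipped
    awk -F'"' 'NF >= 3 { print $2 }' | sort | uniq -c | sort -rn
}
```

It could be fed directly from the trace, for example `timeout 30 strace -e trace=file -f -p "$pid" 2>&1 | count_paths`, where `$pid` holds the process ID to inspect.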
Solution 3
You can use iwatch.
Using iWatch
iWatch is very simple to use. Suppose you want to watch for changes in the /etc filesystem; you just need to run it in the console:
$ iwatch /etc
and iwatch will tell you if something changes in this directory. And if you want to be notified by email:
$ iwatch -m [email protected] /etc
In this case, the admin will get an email notification (maybe you can use your SMS gateway account, so you will be alerted immediately, anytime and anywhere). And if you want to monitor many different directories, you can use a configuration file. This configuration file is an XML file with an easily understandable structure.
Bratchley
Updated on September 18, 2022
Comments
-
Bratchley over 1 year
This is a simple problem but the first time I've ever had to actually fix it: finding which specific files/inodes are the targets of the most I/O. I'd like to be able to get a general system overview, but if I have to give a PID or TID I'm alright with that.
I'd like to go without having to do a strace on the program that pops up in iotop. Preferably, using a tool in the same vein as iotop but one that itemizes by file. I can use lsof to see which files mailman has open but it doesn't indicate which file is receiving I/O or how much.
I've seen elsewhere where it was suggested to use auditd but I'd prefer to not do that since it would put the information into our audit files, which we use for other purposes and this seems like an issue I ought to be able to research in this way.
The specific problem I have right now is with LVM snapshots filling too rapidly. I've since resolved the problem but would like to have been able to fix it this way rather than just doing an ls on all the open file descriptors in /proc/<pid>/fd to see which one was growing fastest.
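The manual check described above, looking at a process's open file descriptors to see which file is growing, can be sketched as a small helper. This is an illustration only: the function name open_file_sizes is made up, and it assumes Linux /proc and GNU stat.

```shell
# List the regular files a process has open as "size<TAB>path", largest first.
# Argument: the PID to inspect (another user's process needs root).
open_file_sizes() {
    for fd in /proc/"$1"/fd/*; do
        # Each entry is a symlink to whatever the descriptor refers to
        target=$(readlink "$fd") || continue
        # Skip sockets, pipes and deleted files, which have no meaningful size on disk
        [ -f "$target" ] || continue
        printf '%s\t%s\n' "$(stat -c %s "$target")" "$target"
    done | sort -rn | uniq
}
```

Running it twice a few seconds apart and comparing the two listings shows which files are growing, which is a size-based approximation and says nothing about reads or in-place rewrites.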
-
slm almost 11 years
possibly related: unix.stackexchange.com/questions/9520/…
-
Bratchley almost 11 years
Yeah, I hadn't seen that one before but most of the answers to this question were basically like that: "Well if you do things this incredibly specific way, and do something weird you can have a rough idea" versus something that directly solves the problem without requiring that the admin get too fancy. I don't mean to criticize others, and I realize now the difficulty of this problem is probably why such solutions were offered, but it seems like even if there isn't a tool like fatrace but older, something like the script I wrote should have been offered since it's more widely usable.
-
Bratchley almost 11 years
Just to be clear: I'm not criticizing the others who did offer help. Help is always better than no help. It's just frustrating when you feel the problem should have a straightforward response and all you can figure out yourself or see others suggesting are either kludgy workarounds or very manual processes (such as what I ended up doing with my mailman problem).
-
slm almost 11 years
Yeah I'm always amazed when I find answers to new Q's here buried in the site that don't show up until I dig for a while. Seems like something's broken there 8-). Hence why it's good to ask the same Q multiple ways and link it to the older ones as they're rooted out. Agreed your script is a better approach, I'm still surprised that there isn't a general purpose tool that does what you ask. Seems like a big gap in Unix.
-
slm almost 11 years
Most of the help is just extremely targeted which can get a little annoying, since when answering you're saying the same thing a lot of times over and over in different ways. But that's the nature of the SE sites. I don't know how Gilles does it. I like these longer form Q&A's better.
-
Bratchley almost 11 years
If I had to guess, it's probably because applications generally have their own load/traffic metrics. So it probably was a marginal problem, especially since multi-use servers are typically like that because they're low traffic in the first place. Mailman, though, doesn't have any metrics to speak of, which is frustrating since I now have to re-write part of the webui and introduce a custom handler just to get that going. They have standard logging but no access to metrics or tracking progress or ANYTHING. Still, this problem should be solvable at the platform level just for cases like that.
-
Bratchley almost 11 years
For cases where the application traffic monitoring sucks I mean. If that is even the logic of why it took so long to develop something like fatrace.
-
slm almost 11 years
The fatrace looks like it went through a rough path to get into the Kernel. The posts I referenced were from 2009 and it's only showing up just now (2012-2013) in the more recent Kernels. It seemed to encounter a lot of resistance but I don't truly understand why. Rolling your own "tool" was the only option using the regular cast of character tools I mentioned without it. Seems stupid that each SA/Dev. would have to make their own tool for something that seems so basic once you asked the Q.
-
-
Bratchley almost 11 years
I'm supposing this is using inotify, is that correct? I was hesitant to use anything based on inotify since you have to give it paths (which is essentially what I'm looking for) and I was worried at how much overhead there would be if I just did everything underneath /. Can this filter by PID? I might be able to tolerate temporary slowness if it's going to be easy enough to extract which program is doing it. The website also doesn't have any example command output.
-
vfbsilva almost 11 years
@JoelDavis I'm really not sure. As far as I know it consumes a huge amount of RAM, hence running it under "/" will be dangerous.
-
Bratchley almost 11 years
Sweet baby Jesus, slm. You are like the rockstar of Unix SE as far as I'm concerned. Your answers are always incredibly educational and show a lot of research all in one place. Most people (if they knew about it) would have just posted the last bit about fatrace and not developed it much past that. I really do appreciate how you go the extra mile to make sure people understand the complete picture and wish I could do more than just upvote and give bounty.
-
slm almost 11 years
@JoelDavis - thanks for your very kind words. I liked your idea of making a canonical answer so I was attempting to start that here. I've run into this problem many times as well and wished I had a resource like this so I figured we'd create it here 8-).
-
Bratchley almost 11 years
One thing I'm confused about: When I did the install, yum pulled in python3's libraries for some reason. I did a file on it and it looks like it's an ELF executable. ldd doesn't show any links to python and neither did strings. Any idea why it bothered with python3?
-
slm almost 11 years
@JoelDavis - which distro? CentOS 6? I did not see the package on Cent6, my commands were from Ubuntu 12.10.
-
slm almost 11 years
@JoelDavis - BTW I like that you ask questions that aren't just selfishly trying to solve just your problems but also leaving a path for others in the future.
-
Bratchley almost 11 years
This is on Fedora 18. And thank you, sir.
-
Bratchley almost 11 years
BTW, apparently I have to wait some time after accepting the answer to award bounty. Not that it matters to someone with roughly half of Unix SE's aggregate amount of reputation points but just an FYI.
-
slm almost 11 years
@JoelDavis - NP. Is the lack of an aggregate bandwidth from fatrace an issue for you? That implementation detail feels like a hook that was exposed for other tools to step in and expand it as needed, no?
-
Bratchley almost 11 years
Not really an issue for me, no. I can get the information I need about that via the appropriate iotop and iostat calls. Also, I figured out the python thing, it looks like (on Fedora 18 at least) there's a "power-usage-report" python script so yum was just responding to the fact that python is in the RPM's dependencies. So that particular mystery is solved.
-
Bratchley almost 11 years
Basically I'd use iostat to confirm bandwidth saturation on a particular device, use iotop to get a short list of applications using a lot of I/O, then use fatrace to confirm whether that application's I/O was related to the bandwidth saturation (i.e. the I/O iotop is returning is targeted at the device I'm concerned about).
-
slm almost 11 years
@JoelDavis - yeah it's still a blending of tools situation, I hate having to do that, esp. as a DevOp on some production system at 2am. 8-).
-
Bratchley almost 11 years
Call me a weirdo but I actually like blending tools. Mix and match helps me solve problems the developers have no way to be able to anticipate an admin encountering. It's just that you can go too far in that direction and have to perform 50 different steps just to answer a simple question. fatrace looks like it solves that problem by cutting out 47 of the steps (not to mention a single central solution for multiple environments to build procedures/skillsets around and direct bugfixes towards).
-
Bratchley almost 11 years
Not to say that knowing how much bandwidth fileX is taking up wouldn't be useful, though. As long as I can see the I/O going out in fatrace and it gets quantified by iostat I'm fine.
-
slm almost 11 years
@JoelDavis - don't get me wrong, I like the ability to cut new solutions using Unix lego blocks too, just not at 2am when I'm under the gun 8-). I was going to keep looking for some other options to see if we can mix in the bandwidth more systematically.
-
slm almost 11 years