Logrotate Successful, original file goes back to original size
Solution 1
This is probably because even though you truncate the file, the process writing to the file will continue writing at whatever offset it was at last. So what's happening is: logrotate truncates the file, its size is zero, the process writes to the file again, continuing at the offset it left off, and you now have a file with NUL bytes up to the point where you truncated it, plus the new entries written to the log.
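A minimal shell sketch of this failure mode (applog is a hypothetical filename used only for this demo): a writer holding a plain, non-append file descriptor keeps its private offset across an external truncation, so the next write recreates the gap.

```shell
# Simulate copytruncate against a writer that keeps its own file offset.
exec 3> applog                 # plain open for write: fd 3 tracks its own offset
printf 'first entry\n'  >&3    # fd 3's offset is now 12
: > applog                     # truncate the file, as logrotate's copytruncate does
printf 'second entry\n' >&3    # still written at offset 12, not at offset 0
od -c applog | head            # shows 12 NUL bytes, then "second entry"
exec 3>&-                      # close the writer's descriptor
```

The key point is that truncation changes the file's length but not the writer's offset; only the writer itself (by reopening, seeking, or using O_APPEND) can move it back to 0.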
Running od -c after the truncate + sudden growth generated output along the lines of:
0000000 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
33255657600 \0 C K B - s e r v e r [ h t t
33255657620 <more log output>
What this says is that from offset 0 to 33255657600 your file consists of null bytes, followed by some legible data. Getting into this state doesn't take the time it would take to actually write all those null bytes. The ext{2,3,4} filesystems support something called sparse files: if you seek past a region of a file that doesn't contain anything, that region is assumed to contain null bytes and won't take up space on disk. Those null bytes aren't actually written, just assumed to be there, which is why growing from 0 to 3.5GB takes hardly any time. (You can test how long it takes by doing something like dd if=${HOME}/.bashrc of=largefile.bin seek=3432343264 bs=1; this should generate a file of over 3GB in a few milliseconds.)
If you run ls -ls on your logfiles after they've been truncated and had a sudden growth again, the number at the beginning of each line represents the actual size in blocks occupied on disk, which is probably orders of magnitude smaller than the size reported by plain ls -l.
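The sparse-file behavior and the ls -l vs ls -ls difference can be reproduced directly (sparse.bin is a throwaway filename for this sketch):

```shell
# Create a ~3GB sparse file in milliseconds: seek far past the end and
# write a single byte; the skipped region allocates no blocks on disk.
dd if=/dev/zero of=sparse.bin bs=1 count=1 seek=3000000000 2>/dev/null
ls -l  sparse.bin   # apparent size: ~3GB
ls -ls sparse.bin   # first column: blocks actually allocated (tiny)
du -k  sparse.bin   # disk usage in KB, a few KB at most
rm -f  sparse.bin
```

On filesystems without sparse-file support the same dd would have to write the zeroes out, so it would be slow and actually consume ~3GB.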
Solution 2
I am extremely confident that Kjetil has hit it. Drew, you may not yet be convinced by his explanation but I urge you to read carefully what he's said.
If you accept it, the fix is either to stop and restart your application when the logs are rotated, or to use a tool like Apache's "rotatelogs", where you feed the log output to the tool via a pipe and the tool takes care of rotating the logfile every so often. For example, one of my Apache instances logs with
ErrorLog "|/usr/sbin/rotatelogs /www/logs/error_log 604800"
which causes a lot of logfiles with names like
-rw-r--r-- 1 root root 4078 Dec 21 01:04 error_log.1292457600
-rw-r--r-- 1 root root 4472 Dec 29 08:41 error_log.1293062400
-rw-r--r-- 1 root root 78630 Jan 4 12:57 error_log.1293667200
-rw-r--r-- 1 root root 15753 Jan 12 01:10 error_log.1294272000
to appear without restarting Apache; I can then compress them manually after the fact. Note how the rotation is done every week, i.e. every 604800 seconds, that being the argument passed to rotatelogs.
If you can't stop and restart the app, and it can't log via a pipe, then I think you've got a real problem. Perhaps others will have suggestions.
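For completeness: when an application *can* reopen its log on a signal, the usual alternative to copytruncate is a postrotate script in the logrotate config (the path and pid file below are hypothetical); that wasn't possible in this case, since the application cannot be touched.

```
/var/log/myapp.log {
    daily
    rotate 7
    compress
    postrotate
        /bin/kill -HUP "$(cat /var/run/myapp.pid)"
    endscript
}
```

Here logrotate renames the file first, then the HUP handler makes the application reopen its log, so the writer starts at offset 0 in a fresh file and no sparse gap can appear.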
drewrockshard
Updated on September 17, 2022

Comments
-
drewrockshard over 1 year
Has anyone had any issues with logrotate before that causes a log file to get rotated and then go back to the same size it originally was? Here's my findings:
Logrotate Script:
/var/log/mylogfile.log {
    rotate 7
    daily
    compress
    olddir /log_archives
    missingok
    notifempty
    copytruncate
}
Verbose Output of Logrotate:
copying /var/log/mylogfile.log to /log_archives/mylogfile.log.1
truncating /var/log/mylogfile.log
compressing log with: /bin/gzip
removing old log /log_archives/mylogfile.log.8.gz
Log file after truncate happens
[root@server ~]# ls -lh /var/log/mylogfile.log
-rw-rw-r-- 1 part1 part1 0 Jan 11 17:32 /var/log/mylogfile.log
Literally Seconds Later:
[root@server ~]# ls -lh /var/log/mylogfile.log
-rw-rw-r-- 1 part1 part1 3.5G Jan 11 17:32 /var/log/mylogfile.log
RHEL Version:
[root@server ~]# cat /etc/redhat-release
Red Hat Enterprise Linux ES release 4 (Nahant Update 4)
Logrotate Version:
[root@DAA21529WWW370 ~]# rpm -qa | grep logrotate
logrotate-3.7.1-10.RHEL4
Few Notes:
- Service can't be restarted on the fly, so that's why I'm using copytruncate
- Logs are rotating every night, judging by the olddir directory having log files in it from each night.
-
Kjetil Joergensen over 13 years: There should be an easy way of verifying this: inspect the file and see if there's actually anything in it. Start out by running od -c filename | head -n 100. Chances are it'll tell you in a few lines that there's a truckload of null bytes, plus the latest log entries.
-
Zubair over 13 years: If this is the case, then can we test whether cat /dev/null > /var/log/mylogfile.log truncates the file in a way that solves this issue, rather than using logrotate's built-in copytruncate?
-
Kjetil Joergensen over 13 years: The problem doesn't lie with the way the file is truncated; the problem lies with how the program writes the logfile. For truncating the logfile to be a viable approach, you'll have to get the program writing to the logfile to somehow seek to position 0. It's possible that a filesystem that doesn't support sparse files wouldn't have this problem, although I don't have any way of testing that right now.
-
drewrockshard over 13 years: These were my findings; still not sure why this is having issues. sprunge.us/LAXY - The first column goes from 33255530660 -> 0000000 -> 17362070000 -> 0000000 -> 33255657600. The command I ended up running was
while true; do od -c /home/part1/kbServer/ckbServer/CKBServer.log | head -n 100; done;
so that I could continue reading the file. -
Kjetil Joergensen over 13 years: The left column is the offset. What would be nice would be to have some delimiter between each run. The 33255530660 looks to be "dangling", that is, the beginning of that od -c | tail .. run is gone. However, if you look at the 0000000 -> 33255657600 run, what it tells you is: the first 16 bytes consist of NULs. Then there's an asterisk in the left column saying that the entries from the last offset are repeated. Then you get to 33255657600, which actually contains legible data. So, yes, the case is that whatever writes to the file continues at the offset it was at.
-
drewrockshard over 13 years: Correct, but that still does not explain how it gets from 0 bytes to 3.5GB in less than 5 minutes. Additionally, it does not explain why it goes back to the same original size. It's ALWAYS the same original size.
-
drewrockshard over 13 years: Can't use kill -HUP as this application cannot be touched in any way - it's a sensitive application that I do not own (I don't even manage it - I only manage the OS side), so I need to be able to do the log rotations this way. -
Kjetil Joergensen over 13 years: 0 -> 3.5GB in less than 5 minutes is now incorporated into the answer, as it's a bit too large for a comment. When you say "the same original size", is this as reported by ls -l or ls -lh? The latter does an awful lot of rounding, which would hide a few lines appended to a file within the rounding error.
-
drewrockshard over 13 years: This final answer actually made more sense, and the fix will probably end up being to restart the application.
-
kasperd about 8 years: There is no indication that writing to the log file is done using > from a script. Moreover, this answer is confusing because there is a major difference between > and >( ) in scripts. Finally, if the code doing the writing is updated, it would be much better to have it simply start writing to the new logfile after logrotate has done its thing. -
user578558 about 8 years: I edited my answer so it's clearer.
-
turkenh over 7 years: Please also note that opening the file in append mode is the solution here, in case you have such control over the application, as detailed in serverfault.com/a/522687/387313. O_APPEND triggers a seek to the end of the file before each write; see man7.org/linux/man-pages/man2/open.2.html
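The O_APPEND point can be checked with the same kind of shell sketch as above (applog is again a hypothetical filename): with append mode, the kernel seeks to end-of-file before every write, so an external truncation leaves no NUL gap.

```shell
# Same scenario as before, but the writer opens the log with >> (O_APPEND).
exec 3>> applog                # append mode: each write lands at end-of-file
printf 'first entry\n'  >&3
: > applog                     # truncate, as copytruncate does
printf 'second entry\n' >&3    # lands at offset 0, not at the stale offset
wc -c < applog                 # 13 bytes: just "second entry\n", no NULs
exec 3>&-
```

This is why opening the log with O_APPEND (or logging through a pipe to a rotator) makes copytruncate-style rotation safe.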