How to write repeated free-form strings to a file, as fast as 'dd'?
Solution 1
$ time perl -e \
'$count=1024*1024; while ($count>0) { print "x" x 384; $count--; }' > out
real 0m1.284s
user 0m0.316s
sys 0m0.961s
$ ls -lh out
-rw-r--r-- 1 me group 384M Apr 16 19:47 out
Replace "x" x 384 (which produces a string of 384 xs) with whatever you like.
You can optimize this further by using a bigger string in each loop iteration and bypassing the normal standard output buffering.
$ perl -e \
'$count=384; while ($count>0) {
syswrite(STDOUT, "x" x (1024*1024), 1024*1024);
$count--;
}' > out
In this case, the syswrite calls will pass down 1M at a time to the underlying write syscall, which is getting pretty good. (I'm getting around 0.940s user with this.)
Hint: make sure you call sync between each test to avoid having the previous run's flushing interfere with the current run's I/O.
For reference, I get this time:
$ time dd if=/dev/zero bs=1024 count=$((1024*384)) of=./out
393216+0 records in
393216+0 records out
402653184 bytes (403 MB) copied, 1.41404 s, 285 MB/s
real 0m1.480s
user 0m0.054s
sys 0m1.410s
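The same large-block idea can be sketched in Python. This is only an illustration, not a translation of the Perl above: the function name write_repeated and the block_bytes parameter are made up for the example, and it assumes a Unix-like system where os.write maps onto the write syscall.

```python
import os

def write_repeated(path, pattern, repeats, block_bytes=1024 * 1024):
    """Write `pattern` (bytes) `repeats` times to `path` using large write() calls.

    Returns the number of bytes written. `block_bytes` is only a target size:
    each block holds a whole number of copies of `pattern`.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    written = 0
    try:
        if not pattern or repeats <= 0:
            return 0
        per_block = max(1, block_bytes // len(pattern))
        block = pattern * per_block          # built once, reused for every call
        remaining = repeats
        while remaining > 0:
            n = min(per_block, remaining)
            data = block if n == per_block else pattern * n
            view = memoryview(data)
            while view:                      # os.write may write fewer bytes than asked
                sent = os.write(fd, view)
                view = view[sent:]
                written += sent
            remaining -= n
    finally:
        os.close(fd)
    return written
```

For instance, write_repeated("out", b"x" * 384, 1024 * 1024) would produce the same 384M file as the first Perl command, passing roughly 1M to each write call.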
Solution 2
It's generally expected that shells are slow at processing large pieces of data. For most scripts, you know in advance which bits of data are likely to be small and which bits of data are likely to be large.
- Prefer to rely on shell built-ins for small data, because forking and exec'ing an external process induces a constant overhead.
- Prefer to rely on external, special-purpose tools for large data, because special-purpose compiled tools are more efficient than an interpreted general-purpose language.
dd makes read and write calls that use the block size. You can observe this with strace (or truss, trace, … depending on your OS):
$ strace -s9 dd if=/dev/zero of=/dev/null ibs=1024k obs=2048k count=4
✄
read(0, "\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0"..., 2097152) = 2097152
read(0, "\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0"..., 2097152) = 2097152
✄
Most other tools have a much lower cap on the maximum buffer size, so they would make more syscalls, and hence take more time. But note that this is an unrealistic benchmark: if you were writing to a regular file or a pipe or a socket, the kernel would probably not write more than a few kilobytes per syscall anyway.
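To put rough numbers on the "more syscalls" point, here is a small arithmetic sketch. It assumes every write call transfers one full block (no short writes), which is a simplification; write_calls is a hypothetical helper name.

```python
def write_calls(total_bytes, block_bytes):
    """Number of write() calls needed if each call transfers one full block."""
    return -(-total_bytes // block_bytes)    # ceiling division

total = 384 * 1024 * 1024                    # the 384 MB file used in the question
for block in (1024, 64 * 1024, 1024 * 1024):
    print(block, write_calls(total, block))
# 1024 393216
# 65536 6144
# 1048576 384
```

Under that assumption, a 1 KB block size needs 393216 calls for 384 MB, while a 1 MB block size needs only 384.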
Solution 3
You can use dd for this! First write the string to the beginning of the file. Then do:
dd if=$FILE of=$FILE bs=$STRING_LENGTH seek=1 count=$REPEAT_TIMES
Note: if your $STRING_LENGTH is small, you might first fill one 1024-byte block with copies of the string, then repeat that block:
dd if=$FILE of=$FILE bs=$STRING_LENGTH seek=1 count=$((1024/$STRING_LENGTH))
dd if=$FILE of=$FILE bs=1024 seek=1 count=$(($REPEAT_TIMES*$STRING_LENGTH/1024))
(This example only works if STRING_LENGTH is a power of 2 and REPEAT_TIMES is a multiple of 1024, but you get the idea.)
If you want to use this to overwrite a file (e.g. purging) use conv=notrunc
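What makes the dd command above work is that it reads a block and writes it one block further along in the same file, so each block it reads is the one it has just written. Here is a minimal Python sketch of that access pattern. It is not dd itself; the name self_extend is invented for the illustration, and it assumes the file already holds exactly one copy of the string.

```python
def self_extend(path, string_length, repeat_times):
    """Copy block i of the file onto block i+1, repeat_times times.

    Starting from a file that holds one copy of the string, the result
    holds repeat_times + 1 copies.
    """
    with open(path, "r+b") as f:
        for i in range(repeat_times):
            f.seek(i * string_length)
            block = f.read(string_length)        # the block written in the previous step
            f.seek((i + 1) * string_length)
            f.write(block)
```

As with the dd version, larger blocks mean fewer read/write pairs, which is why the two-step variant builds a 1024-byte block first.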
Solution 4
I've finally got my idea on how to do this working... It uses a tee|tee|tee chain, which runs at close to dd's speed.
# ============================================================================
# repstr
#
# Brief:
# Make multiple (repeat) copies of a string.
# Option -e, --eval is used as in 'echo -e'
#
# Return:
# The resulting string is sent to stdout
#
# Args: Option $1 $2
# -e, --eval COUNT STRING
# repstr $((2**40)) "x" # 1 TB: xxxxxxxxx...
# eg. repstr -e 7 "AB\tC\n" # 7 lines: AB<TAB>C
# repstr 2 "ऑढळ|a" # 2 copies: ऑढळ|aऑढळ|a
#
[[ "$1" == "-e" || "$1" == "--eval" ]] && { e="-e"; shift 1; }|| e=""
count="$1"
string="$2"
[[ "${count}" == "" ]] && exit 1 # $count must be an integer
[[ "${count//[0-9]/}" != "" ]] && exit 2 # $count is not an integer
[[ "${count}" == "0" ]] && exit 0 # nothing to do
[[ "${string}" == "" ]] && exit 0 # nothing to do
#
# ========================================================================
# Find the highest 'power of 2' which, when calculated**, is <= count
# ie. check ascending 'powers of 2'
((leqXpo=0)) # Exponent which makes 2** <= count
((leqCnt=1)) # A count which is <= count
while ((count>=leqCnt)) ;do
  ((leqXpo+=1))
  ((leqCnt*=2))
done
((leqXpo-=1))
((leqCnt/=2))
#
# ======================================================================================
# Output $string to 'tee's which are daisy-chained in groups of descending 'powers of 2'
todo=$count
for ((xpo=leqXpo ;xpo>0 ;xpo--)) ;do
  tchain=""
  floor=$((2**xpo))
  if ((todo>=(2**xpo))) ; then
    for ((t=0 ;t<xpo ;t++)) ;do tchain="$tchain|tee -" ;done
    eval echo -n $e \"'$string'\" $tchain # >/dev/null
    ((todo-=floor))
  fi
done
if ((todo==1)) ;then
  eval echo -n $e \"'$string'\" # >/dev/null
fi
#
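The script above splits the count into powers of 2 and emits each power with a chain of that many tee stages. The decomposition can be sketched in Python as follows. This models only the arithmetic, not the pipes: it assumes, as the script does, that each `tee -` stage doubles its input, and the names tee_plan and repstr are illustrative.

```python
def tee_plan(count):
    """Exponents (descending) whose powers of 2 sum to count, e.g. 13 -> [3, 2, 0]."""
    plan = []
    todo = count
    for xpo in range(count.bit_length() - 1, -1, -1):
        if todo >= 2 ** xpo:
            plan.append(xpo)                 # one group of xpo chained doubling stages
            todo -= 2 ** xpo
    return plan

def repstr(count, string):
    """Build `count` copies of `string` the way the tee chains would."""
    out = []
    for xpo in tee_plan(count):
        chunk = string
        for _ in range(xpo):                 # each stage doubles, like one `tee -`
            chunk += chunk
        out.append(chunk)
    return "".join(out)
```

For a count of 13 this gives groups of 8, 4 and 1 copies, so no group ever needs more stages than the number of bits in the count.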
Here are some timing test results. I've gone to 32 GB because that's about the size of a test file I wanted to create (which is what started me off on this issue).
NOTE: (2**30), etc. refers to the number of strings (to achieve a particular GB filesize)
-----
dd method (just for reference) real/user/sys
* 8GB =================================
if=/dev/zero bs=1024 count=$(((1024**2)*8)) # 2m46.941s / 00m3.828s / 0m56.864s
tee method: fewer tests, because it didn't overflow, and the number-of-strings:time ratio is linear
tee method: count string real/user/sys
* 8GB ========== ============ =================================
tee(2**33)>stdout $((2**33)) "x" # 1m50.605s / 0m01.496s / 0m27.774s
tee(2**30)>stdout -e $((2**30)) "xxx\txxx\n" # 1m49.055s / 0m01.560s / 0m27.750s
* 32GB
tee(2**35)>stdout -e $((2**35)) "x" #
tee(2**32)>stdout -e $((2**32)) "xxx\txxx\n" # 7m34.867s / 0m06.020s / 1m52.459s
python method: '.write' uses 'file.write()'
'>stdout' uses 'sys.stdout.write()'. It handles \n in args (but I know very little python)
count string real/user/sys
* 8GB ===== =================== =================================
python(2**33)a .write 2**33 "x" # OverflowError: repeated string is too long
python(2**33)a >stdout 2**33 "x" # OverflowError: repeated string is too long
python(2**30)b .write 2**30 '"xxxxxxxX" *2**0' # 6m52.576s / 6m32.325s / 0m19.701s
python(2**30)b >stdout 2**30 '"xxxxxxxX" *2**0' # 8m11.374s / 7m49.101s / 0m19.573s
python(2**30)c .write 2**20 '"xxxxxxxX" *2**10' # 2m14.693s / 0m03.464s / 0m22.585s
python(2**30)c >stdout 2**20 '"xxxxxxxX" *2**10' # 2m32.114s / 0m03.828s / 0m22.497s
python(2**30)d .write 2**10 '"xxxxxxxX" *2**20' # 2m16.495s / 0m00.024s / 0m12.029s
python(2**30)d >stdout 2**10 '"xxxxxxxX" *2**20' # 2m24.848s / 0m00.060s / 0m11.925s
python(2**30)e .write 2**0 '"xxxxxxxX" *2**30' # OverflowError: repeated string is too long
python(2**30)e >stdout 2**0 '"xxxxxxxX" *2**30' # OverflowError: repeated string is too long
* 32GB
python(2**32)f.write 2**12 '"xxxxxxxX" *2**20' # 7m58.608s / 0m00.160s / 0m48.703s
python(2**32)f>stdout 2**12 '"xxxxxxxX" *2**20' # 7m14.858s / 0m00.136s / 0m49.087s
perl method:
count string real / user / sys
* 8GB ===== =================== =================================
perl(2**33)a .syswrite> 2**33 "a" x 2**0 # Sloooooow! It would take 24 hours. I extrapolated after 1 hour.
perl(2**33)a >stdout 2**33 "a" x 2**0 # 31m46.405s / 31m13.925s / 0m22.745s
perl(2**30)b .syswrite> 2**30 "aaaaaaaA" x 2**0 # 100m41.394s / 11m11.846s / 89m27.175s
perl(2**30)b >stdout 2**30 "aaaaaaaA" x 2**0 # 4m15.553s / 3m54.615s / 0m19.949s
perl(2**30)c .syswrite> 2**20 "aaaaaaaA" x 2**10 # 1m47.996s / 0m10.941s / 0m15.017s
perl(2**30)c >stdout 2**20 "aaaaaaaA" x 2**10 # 1m47.608s / 0m12.237s / 0m23.761s
perl(2**30)d .syswrite> 2**10 "aaaaaaaA" x 2**20 # 1m52.062s / 0m10.373s / 0m13.253s
perl(2**30)d >stdout 2**10 "aaaaaaaA" x 2**20 # 1m48.499s / 0m13.361s / 0m22.197s
perl(2**30)e .syswrite> 2**0 "aaaaaaaA" x 2**30 # Out of memory during string extend at -e line 1.
perl(2**30)e >stdout 2**0 "aaaaaaaA" x 2**30 # Out of memory during string extend at -e line 1.
* 32GB
perl(2**32)f .syswrite> 2**12 "aaaaaaaA" x 2**20 # 7m34.241s / 0m41.447s / 0m51.727s
perl(2**32)f >stdout 2**12 "aaaaaaaA" x 2**20 # 10m58.444s / 0m53.771s / 1m28.498s
Solution 5
Python version:
import sys
CHAR = sys.argv[1] if len(sys.argv) > 1 else "x"
block = CHAR * 1024
count = 1024 * 384
with open("testout.bin", "w") as outf:
    for i in xrange(count):
        outf.write(block)
python2.7 writestr.py x
0.27s user 0.69s system 99% cpu 0.963 total
dd if=/dev/zero of=testout.bin bs=1024 count=$((1024*384))
0.05s user 1.05s system 94% cpu 1.167 total
Python has a higher initialization cost, but overall beat dd on my system.
Peter.O
Updated on September 18, 2022
Comments
-
Peter.O almost 2 years
dd can write repeating \0 bytes to a file very fast, but it can't write repeating arbitrary strings.
Is there a bash-shell method to write repeating arbitrary strings equally as fast as 'dd' (including \0)?
All the suggestions I've encountered in 6 months of Linux are things like printf "%${1}s" | sed -e "s/ /${2}/g", but this is painfully slow compared to dd, as shown below, and sed crashes after approximately 384 MB (on my box) -- actually that's not bad for a single line-length :) -- but it did crash!
I suppose that wouldn't be an issue for sed, if the string contained a newline.
Speed comparison of dd vs. printf+sed:
                             real        user        sys
WRITE 384 MB: 'dd'           0m03.833s   0m00.004s   0m00.548s
WRITE 384 MB: 'printf+sed'   1m39.551s   1m34.754s   0m02.968s

# the two commands used
dd if=/dev/zero bs=1024 count=$((1024*384))
printf "%$((1024*1024*384))s" |sed -e "s/ /x/g"
I have an idea how to do this in a bash-shell script, but there's no point re-inventing the wheel. :)
-
Peter.O about 13 years
Another inciteful answer, thanks... I really like your bullet-point maxims about when to "Prefer"... I'm starting to differentiate between shell built-ins and the externals... I'm close to finishing my alternative method.. its speed is very close to dd, and seems to be rather indifferent to the string size... (I'll try to post it sometime tomorrow, once I get it ship-shape :) ...
-
Peter.O about 13 years
Interesting and useful.. As the string length reduces, the time increases.. On my box your exact command took real/user/sys **0m4.565s**/0m0.804s/0m0.904s.. with a string "x\n", it took r/u/s **0m30.227s**/0m29.202s/0m0.880s... but that's still certainly faster than printf--sed... The 384 byte string version is about the same speed as dd on my system too... (it's funny how things vary... I actually got a slower dd speed this time...)
-
asoundmove about 13 years
@fred.bear, spelling tip: I suppose you meant "insightful" rather than "inciteful" (which does not exist, but could be linked to "to incite").
-
Peter.O about 13 years
@asoundmove: Thanks. I'm quite happy with such alerts.. but I definitely(?) meant 'inciteful' :) oxforddictionaries.com/view/entry/m_en_gb0404940#m_en_gb0404940 (but not to incite to illegal actions, as the strict sense of the word implies)... I may have got the two cross wired.. I recall both sentiments; "insight" and "being spurred on"... Actually, I'll concede.. Hey, :) my excuse is: not a lot of sleep last night, too much Q&A.... I think I did mean mainly "insight".. but I definitely recall thinking of both words. (a bit off topic, but a change is as good as a holiday :)
-
asoundmove about 13 years
@fred.bear: oh it does exist! New one on me. Learn something new every day.
-
Peter.O about 13 years
In essence that's what I've done too.. (but differently)... I'll check this later (busy now), and I've "answered" the question with my "tee" version...
-
Peter.O about 13 years
@user-unknown: I've looked at it again.. I think the idea is good (but I would, as we have both used a binary doubling :).. It creates a lot of files.. which then have to be selectively cat'd again to get the final desired number of strings, e.g. 987654321 repeats of your string... and as you said it slows down a lot with larger numbers of repeating strings... It has been running for approx 40 mins to make a 32GB file, so I killed it. (I'm after a 35 GB file..) ... The tee process I've used takes 7-9 minutes... but I'm all for the binary idea.. binary splits and doublings are powerful tools
-
Peter.O about 13 years
This is looking very good.. I think that the actual number of repeats (xrange) would depend on system resources, but it can get several GB of strings from xrange alone... (easily dealt with, with a bit of bounds checking)... I've included some test times in my answer.. Both your method and my method are close to 'dd', timewise..
-
Peter.O about 13 years
I've included some test times in my answer (so that all times relate to the same hardware).
-
erik about 11 years
For better comparability you should perform all of your tests on /dev/shm to avoid interference from your hard disk's cache. Of course, only if you have enough RAM in your machine.