Handy parsing for numbers with unit suffixes?
Solution 1
Based on my answer at one of the questions you linked to:
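awk '{
    # index() gives 1 for K, 2 for M, ...; 0 means the last character is not a suffix
    ex = index("KMGTPEZY", substr($1, length($1)))
    # strip the suffix only if there is one (substr positions start at 1 in awk)
    val = ex ? substr($1, 1, length($1) - 1) : $1
    prod = val * 10^(ex * 3)
    sum += prod
}
END {print sum}'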
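A variation on the same idea, as a sketch rather than a drop-in replacement: the hypothetical function sum_suffixed below takes the multiplier base as an argument (1000 for the 10^n suffixes, 1024 for binary multiples such as du -h prints) and treats a number without a suffix as a plain count. The function name and the base variable are made up for illustration, and only uppercase suffixes are recognized.

```shell
# Hypothetical helper: sum the first field of every input line,
# expanding a trailing K/M/G/T/P/E/Z/Y suffix.
# Usage: ... | sum_suffixed [base]   (base defaults to 1000)
sum_suffixed() {
    awk -v base="${1:-1000}" '
    {
        # index() returns 0 when the last character is not a known suffix
        ex = index("KMGTPEZY", substr($1, length($1)))
        # adding 0 makes awk keep the leading number and ignore the suffix
        sum += ($1 + 0) * base ^ ex
    }
    END { printf "%.0f\n", sum }'
}

printf '2K\n3M\n10\n' | sum_suffixed        # prints 3002010
printf '2K\n3M\n10\n' | sum_suffixed 1024   # prints 3147786
```

Printing with %.0f avoids the scientific notation that a plain print can fall back to for large non-integer sums.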
Another method that's used:
sed 's/G/ * 1000 M/;s/M/ * 1000 K/;s/K/ * 1000/; s/$/ +\\/; $a0' | bc
Solution 2
You can use Perl regular expressions to do this. For example,
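$value = 0;
$amplifier = 1;   # default multiplier when the number has no suffix
# $1 is the number, $2 is an optional K/M/G suffix
if ($line =~ /(\d+\.?\d*)([KMG]?)\s+/) {
    $amplifier = 1024 if ($2 eq 'K');
    $amplifier = 1024 * 1024 if ($2 eq 'M');
    $amplifier = 1024 * 1024 * 1024 if ($2 eq 'G');
    $value = $1 * $amplifier;
}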
This is a simple script. You can use it as a starting point. Hope it helps!
Solution 3
Personally, I'd just not use the -h flag in the first place. The "human readable" version rounds off numbers, which then have to be rounded again when you convert back, making the result even less accurate. (For instance, 2.7MiB is 2831155.2 bytes. What did you do with the other 0.8th of a byte??!)
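To put a number on that loss, here is a small worked example. It assumes the suffix means a binary multiple (1M = 1024 * 1024 bytes) and that only one decimal place is displayed, so anything finer than 0.1M is gone. The function name show_rounding_step is made up for this illustration.

```shell
# Hypothetical demo: what 2.7M expands to, and how many bytes
# a single displayed decimal step (0.1M) covers.
show_rounding_step() {
    awk 'BEGIN {
        mib = 1024 * 1024
        printf "2.7M in bytes: %.1f\n", 2.7 * mib
        printf "0.1M in bytes: %.1f\n", 0.1 * mib
    }'
}

show_rounding_step
# prints:
#   2.7M in bytes: 2831155.2
#   0.1M in bytes: 104857.6
```

Under these assumptions each displayed value can be off by up to roughly 100 KiB, and the error accumulates when many lines are summed.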
Otherwise, you can ask units to convert MiB/GiB/KiB to just "B" and it'll handle this, but you'd have to do something like (assuming your output is tabbed, otherwise cut appropriately)
{your output} | cut -f1 '-d{tab}' | xargs -L 1 -I {} units -1t {}iB B | awk '{s+=$1}END{printf "%d\n",s}'
Solution 4
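VALUE=$1
# Expand each suffix into a chain of multiplications: G -> *1024m -> *1024*1024k -> ...
# (one pass is enough, so no loop is needed)
VALUE=${VALUE//[gG]/*1024m}
VALUE=${VALUE//[mM]/*1024k}
VALUE=${VALUE//[kK]/*1024}
# Integers only: bash arithmetic cannot evaluate fractional sizes such as 2.7M
[ "${VALUE//\*/}" -gt 0 ] 2>/dev/null && echo VALUE=$((VALUE)) || echo "ERROR: size invalid, please enter a correct size"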
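Since the ${VAR//pattern/replacement} expansion is a bash feature, here is a sketch of the same suffix-to-multiplier idea for a plain POSIX sh, using suffix trimming and case instead of substitution. The function name to_bytes is hypothetical, and the sketch assumes integer input with at most one k/m/g suffix and binary (1024) multiples.

```shell
# Hypothetical helper: print an integer size with an optional k/m/g suffix as bytes.
to_bytes() {
    num=${1%[kKmMgG]}    # strip one trailing suffix letter, if present
    case $num in
        # reject empty input, non-digits, and leading zeros (read as octal by the shell)
        ''|*[!0-9]*|0?*) echo "ERROR: invalid size: $1" >&2; return 1 ;;
    esac
    case $1 in
        *[kK]) mult=1024 ;;
        *[mM]) mult=$((1024 * 1024)) ;;
        *[gG]) mult=$((1024 * 1024 * 1024)) ;;
        *)     mult=1 ;;
    esac
    echo $((num * mult))
}

to_bytes 4k    # prints 4096
to_bytes 1G    # prints 1073741824
to_bytes 2.7M || echo "fractional sizes are rejected"
```

Fractional input is rejected rather than truncated, because shell arithmetic is integer-only; for values such as 2.7M an awk-based conversion is the better fit.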
Muhammad Danish
Updated on September 17, 2022
Comments
-
Muhammad Danish over 1 year
Let's say you have data with quantities in human-readable format, such as the output of du -h, and want to further operate on those numbers. Let's say you want to pipe your data through grep to do a summation of a subset of that data. You do this ad hoc on many systems you've never seen before, and have only minimal utilities. You want suffix conversions for all the standard 10^n suffixes. Is there a GNU/Linux utility to convert the suffixed numbers to real numbers within a pipeline? Do you have a bash function written to do this, or some Perl that is easy to remember, instead of a long chain of regex replacements or several sed steps?
38M /var/crazyface/courses/200909-90147
2.7M /var/crazyface/courses/200909-90157
1.1M /var/crazyface/courses/200909-90159
385M /var/crazyface/courses/200909-90161
1.3M /var/crazyface/courses/200909-90169
376M /var/crazyface/courses/200907-90171
8.0K /var/crazyface/courses/200907-90173
668K /var/crazyface/courses/200907-90175
564M /var/crazyface/courses/200907-90178
4.0K /var/crazyface/courses/200907-90179
| grep 200907 | <amazing suffix conversion> | awk '{s+=$1} END {print s}'
Relevant references:
-
Tony over 8 years
You rarely need to use grep and awk. If you are using awk, then use awk. Just add /200907/ in front of your per-line code, e.g. awk '/200907/{s+=$1} END {print s}'
-
-
Muhammad Danish about 13 years
Indeed, this is one way. I've also found stackoverflow.com/questions/2557649/….
-
Muhammad Danish about 13 years
Well noted that there is a loss of precision. Supplementing the input to units also works... but I found units missing on my minimal distro! I think we'd all do this differently if we had full control of everything.
-
djuarez almost 5 years
For the second method, what if the suffix is s?
-
Dennis Williamson almost 5 years
@djuarez: What multiplier does the s stand for?
-
djuarez almost 5 years
None, just extrapolating on other unit cases.