Bash: Fastest way of determining dimensions of image from URL
Solution 1
As you note, you don't need the whole ImageMagick package. You just need identify.
You will also need the libraries the executable links to (and the libraries those libraries link to).
> whereis identify
identify: /bin/identify /usr/bin/identify /usr/share/man/man1/identify.1.gz
> ldd /bin/identify
ldd will show a list. When I did this, it included some X libs, libjpeg, etc., and two libraries clearly from the ImageMagick package, libMagickCore and libMagickWand. Those look to be linked to the same bunch of things, so if you have that, identify should work.
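To copy identify to another system, the ldd output can be reduced to a plain list of library paths. This is a minimal sketch, assuming the usual glibc ldd output format ("name => /path (address)"); the helper name ldd_paths is made up for illustration.

```bash
# Hypothetical helper: read ldd output on stdin, print one library path per line.
ldd_paths() {
    awk '
        # Lines like: libjpeg.so.8 => /usr/lib/libjpeg.so.8 (0x...)
        $2 == "=>" && $3 ~ /^\// { print $3 }
        # Lines like: /lib64/ld-linux-x86-64.so.2 (0x...)
        $1 ~ /^\// { print $1 }
    '
}
```

It would be used as ldd /usr/bin/identify | ldd_paths | sort -u. Virtual entries such as linux-vdso and "not found" lines are skipped because they have no absolute path.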
You don't have to download an entire image in order to get the dimensions, because these are in a header at the beginning of the file and that's what identify looks at. For example, here I'm copying the first 4 kB from a complete jpeg into a new file:
dd if=real.jpg of=test.jpg bs=1024 count=4
4 kB should be more than enough to include the header -- I'm sure you could do it with 1/4 that amount. Now:
> identify test.jpg
test.jpg JPEG 893x558 893x558+0+0 8-bit DirectClass 4.1KB 0.000u 0:00.000
Those are the correct dimensions for real.jpg. Notice, however, that the size (4.1KB) is the size of the truncated file, since that information is not from the image header.
So: you only have to download the first kilobyte or so of each image.
Solution 2
You can use curl to download just part of the image. How much you need depends on how robust it has to be. A reasonable test case is the first 500 bytes, which seems to work for a lot of png and jpg files; then use identify or the like to check the size.
curl -o 500-peek -r0-500 "http://example.net/some-image.png"
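The two steps can be wrapped in one function. This is only a sketch: the name peek_dimensions and the 4096-byte default are assumptions, it needs curl, identify and mktemp, and it relies on the server honoring Range requests (a server that ignores them sends the whole file).

```bash
# Hypothetical helper: fetch the first N bytes of a URL and print WIDTHxHEIGHT.
peek_dimensions() {
    local url=$1 bytes=${2:-4096} tmp status=0
    tmp=$(mktemp) || return 1
    # -r 0-(N-1) requests only the first N bytes; -f fails on HTTP errors.
    curl -s -f -r "0-$((bytes - 1))" -o "$tmp" "$url" &&
        identify -format '%wx%h\n' "$tmp" 2>/dev/null || status=$?
    rm -f "$tmp"
    return "$status"
}
```

For example, peek_dimensions "http://example.net/some-image.png" 500 mirrors the 500-byte test above. If identify cannot find the size in the truncated data, retrying with a larger byte count is the obvious fallback.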
Edit:
It has been a long time since I wrote image parsers, but I gave it some thought and refreshed my memory.
I suspect you want to check all kinds of images (but then again, perhaps not). I'll describe some of the more common ones: PNG, JPEG (JFIF) and GIF.
PNG:
These are simple when it comes to extraction of size. A png header stores the size within the first 24 bytes. First comes a fixed header:
byte  value  description
0     0x89   Bit-check. 0x89 has bit 7 set.
1-3   PNG    The letters P, N and G
4-5   \r\n   Newline check.
6     ^z     MS-DOS won't print data beyond this using `print`
7     \n     *nix newline.
Next come chunks throughout the file. Each consists of fixed fields for length, type and checksum, plus an optional data section whose size is given by the length field.
Luckily the first chunk is always an IHDR with this layout:
byte  description
0-3   Image Width
4-7   Image Height
8     Bits per sample or per palette index
...   ...
From this, the sizes are at file bytes 16-19 (width) and 20-23 (height). You can dump the data with e.g. hexdump:
hexdump -vn29 -e '"Bit-test: " /1 "%02x" "\n" "Magic : " 3/1 "%_c" "\n" "DOS-EOL : " 2/1 "%02x" "\n" "DOS-EOF : " /1 "%02x" "\n" "NIX-EOL : " /1 "%02x" "\n" "Chunk Size: " 4/1 "%02u" "\n" "Chunk-type: " 4/1 "%_c" "\n" "Img-Width : " 4/1 "%02x" "\n" "Img-Height: " 4/1 "%02x" "\n" /1 "Depth : %u bit" "\n" /1 "Color : %u" "\n" /1 "Compr.: %u" "\n" /1 "Filter: %u" "\n" /1 "Interl: %u" "\n"' sample.png
On a Big Endian/Motorola machine one could also print the sizes directly by:
hexdump -s16 -n8 -e '1/4 "%u" "\n"' sample.png
However, on Little Endian / Intel it is not that easy, nor is it very portable.
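One way around the byte-order problem is to read the eight size bytes individually and combine them with shell arithmetic, which gives the same result on any host. The sketch below swaps hexdump for od (assuming an od that supports -An, -t, -j and -N); the name png_size is made up, and the PNG signature is not validated here.

```bash
# Hypothetical helper: print "WIDTH HEIGHT" of a PNG, independent of host endianness.
png_size() {
    local b
    # -An: no offsets, -tu1: unsigned decimal bytes, -j16: skip 16 bytes, -N8: read 8.
    read -r -a b < <(od -An -tu1 -j16 -N8 "$1" 2>/dev/null)
    [ "${#b[@]}" -eq 8 ] || return 1
    # PNG stores both values as 32-bit big-endian integers.
    echo "$(( (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | b[3] ))" \
         "$(( (b[4] << 24) | (b[5] << 16) | (b[6] << 8) | b[7] ))"
}
```

Since only bytes 16-23 are read, the first 24 bytes of the file are all this needs.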
Based on this, we could implement a bash + hexdump script such as:
png_hex='16/1 "%02x" " " 4/1 "%02x" " " 4/1 "%02x" "\n"'
png_valid="89504e470d0a1a0a0000000d49484452"
function png_wh()
{
    read -r chunk1 img_w img_h <<<$(hexdump -vn24 -e "$png_hex" "$1")
    if [[ "$chunk1" != "$png_valid" ]]; then
        printf "Not valid PNG: \`%s'\n" "$1" >&2
        return 1
    fi
    printf "%10ux%-10u\t%s\n" "0x$img_w" "0x$img_h" "$1"
    return 0
}
if [[ "$1" == "-v" ]]; then verbose=1; shift; fi
while [[ "$1" ]]; do png_wh "$1"; shift; done
But this isn't especially efficient. Though it requires a bigger chunk (75-100 bytes), identify is rather faster. Or write the routine in e.g. C, which would be faster than library calls.
JPEG:
With jpg it isn't that easy. It also starts out with a signature header, but the size chunk isn't at a fixed offset. After the header:
byte   value         description
0-1    ffd8          SOI (Start Of Image)
2-3    ffe0          JFIF marker
4-5    <block-size>  Size of this block including this number
6-10   JFIF\0        ...
11-12  <version>
13     ...
a new block comes along, introduced by a two-byte marker starting with 0xff. The one holding information about dimensions has the value 0xffc0 (Start Of Frame for baseline JPEG; progressive files use 0xffc2 with the same layout), but it can be buried quite a bit down in the data.
In other words, one skips block-size bytes, checks the marker, skips block-size bytes, reads the marker, and so on until the correct one comes along.
When found, the sizes are stored as two bytes each at offsets 3 and 5 after the marker.
byte  value         description
0-1   ffc0          SOF marker
2-3   <block-size>  Size of this block including this number
4     <bits>        Sample precision.
5-6   <Y-size>      Height
7-8   <X-size>      Width
9     <components>  Three for color baseline, one for grayscale.
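The block-skipping walk described above can be sketched in bash with od. This is an illustration under assumptions, not a robust parser: the name jpeg_size is made up, it accepts the SOF0, SOF1 and SOF2 markers (0xffc0 to 0xffc2), it assumes every block before the frame header carries a two-byte size, and it ignores fill bytes between blocks.

```bash
# Hypothetical helper: print "WIDTH HEIGHT" of a JPEG by walking its blocks.
jpeg_size() {
    local file=$1 off=2 b len
    # The file must start with the SOI marker ff d8 (255 216).
    read -r -a b < <(od -An -tu1 -N2 "$file" 2>/dev/null)
    [ "${b[0]}" = 255 ] && [ "${b[1]}" = 216 ] || return 1
    while :; do
        # marker (2 bytes), block size (2), precision (1), height (2), width (2)
        read -r -a b < <(od -An -tu1 -j"$off" -N9 "$file" 2>/dev/null)
        [ "${#b[@]}" -ge 4 ] && [ "${b[0]}" = 255 ] || return 1
        len=$(( (b[2] << 8) | b[3] ))
        case ${b[1]} in
            192|193|194)            # SOF0, SOF1, SOF2: frame header found
                [ "${#b[@]}" -eq 9 ] || return 1
                echo "$(( (b[7] << 8) | b[8] )) $(( (b[5] << 8) | b[6] ))"
                return 0 ;;
            217|218) return 1 ;;    # EOI or SOS reached without a frame header
        esac
        # Skip the two marker bytes plus the block (its size includes the size field).
        off=$(( off + 2 + len ))
    done
}
```

When run on a truncated download, a non-zero return simply means the frame header lies beyond the bytes fetched so far.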
I wrote a simple C program to check some files. Of about 10,000 jpg images, approximately 50% had the size information within the first 500 bytes, mostly between about 100 and 200. The worst was around 80,000 bytes.
GIF:
Though a gif can typically have multiple images stored within, it has a canvas size specified in the header, which is big enough to house the images. It is as easy as with PNG, and requires even fewer bytes: 10. After the magic and version we find the sizes. Example from a 364x472 image:
<byte>  <hex>   <value>
0-2     474946  GIF      Magic
3-5     383961  89a      Version (87a or 89a)
6-7     6c01    364      Logical Screen Width
8-9     d801    472      Logical Screen Height
In other words you can check the first six bytes to see if it is a gif, then read the next four for sizes.
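As a sketch of that check in bash (again using od, with the made-up name gif_size): the first six bytes are compared against "GIF87a" or "GIF89a", and the two sizes are read as 16-bit little-endian values, low byte first.

```bash
# Hypothetical helper: print "WIDTH HEIGHT" of a GIF's logical screen.
gif_size() {
    local b
    # Read the first 10 bytes as unsigned decimal values.
    read -r -a b < <(od -An -tu1 -N10 "$1" 2>/dev/null)
    [ "${#b[@]}" -eq 10 ] || return 1
    # "GIF8" is 71 73 70 56; the version is then "7a" or "9a" (55 or 57, then 97).
    [ "${b[*]:0:4}" = "71 73 70 56" ] || return 1
    [ "${b[4]}" = 55 ] || [ "${b[4]}" = 57 ] || return 1
    [ "${b[5]}" = 97 ] || return 1
    # Little-endian: low byte first, e.g. 6c 01 -> 0x016c = 364.
    echo "$(( b[6] | (b[7] << 8) )) $(( b[8] | (b[9] << 8) ))"
}
```

Only the first 10 bytes of the file are needed, so even a very small range request is enough for GIF.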
Other formats:
Could have continued, but guess I stop here for now.
Solution 3
Assumes you have identify.
Put this in a script and chmod +x <scriptname>. To run it, type <scriptname> picture.jpg and you will get the height and width of the image.
The first two sections check that an image was given and store it in the IMAGE variable. The next section makes sure the file is actually there. The last two sections take the relevant information from the identify output and display it.
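#!/bin/bash
# Print an error message to stderr and exit (die is not a bash builtin).
die() {
    echo "$*" >&2
    exit 1
}

# Require exactly one argument: the image file.
if [[ "$#" -ne 1 ]]; then
    die "Usage: $0 <image>"
fi
IMAGE="$1"

# Make sure the file actually exists.
if [[ ! -f "$IMAGE" ]]; then
    die "File not found: $IMAGE"
fi

# The third space-separated field of identify's output is WIDTHxHEIGHT
# (this breaks if the file name contains spaces).
IMG_CHARS=$(identify "$IMAGE" | cut -f 3 -d' ')
WIDTH=$(echo "$IMG_CHARS" | cut -d'x' -f 1)
HEIGHT=$(echo "$IMG_CHARS" | cut -d'x' -f 2)
echo "W: ${WIDTH} H: ${HEIGHT}"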
exvance
Updated on September 18, 2022
Comments
-
exvance almost 2 years
I'm trying to figure out a really fast method in bash of determining an image's dimensions.
I know I could wget the image and then use imagemagick to determine the height and width of the image. I'm concerned that this may not be the fastest way of doing it.
I'm also concerned with having to install imagemagick when I only need a very small subset of functionality. I'm on an embedded system that has very limited resources (CPU, RAM, storage).
Any ideas?
-
Admin over 10 years
What image types do you need to support?
-
goldilocks over 10 years
file doesn't give dimensions for, e.g., .jpg files.
-
user2914606 over 10 years
Nice script. However, it'd be nice if you could explain what it does (since Stack Exchange is about learning).
-
peterph over 10 years
I'm not sure PHP is well suited for low-resource embedded systems. Plus this seems to fetch the whole file.
-
peterph over 10 years
Still, it will load the whole PHP engine, which is a memory hog. Plus a reasonable portion of PHP would have to be installed, which might be an issue for an embedded system as well (disk space might be limited). For a regular system it might be an option, though you'd need to modify it to prevent fetching the whole image (see Sukminder's answer).