Base64 encoded images and availability of their metadata for Googlebot

5,537

Solution 1

Google does not index data URI images for Google image search. Google's John Mueller says so here and in the comments below. Because data URI images are not indexed in Google image search, the EXIF data in them is irrelevant.

You can verify that these images are not indexed. I searched Google Images for "data uri" and spot-checked the results. All of the images I viewed were regular image files, not base64-encoded data URIs. You would think that if Google were able to index data URI images, some of them would show up in the search results for that term.

If Google ever does decide to index data URI images, they should be able to get the EXIF data from them. A data URI is the entire file, base64 encoded (no spaces or newlines), with a prefix such as data:image/png;base64, in front. Any metadata in the file would still be present in the base64-encoded data URI version.
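To illustrate the claim above, here is a minimal Python sketch that wraps some bytes in a data URI and decodes them again. The helper names `to_data_uri` and `from_data_uri` are my own, and the sample bytes are a stand-in rather than a real image. It only shows that plain base64 encoding is lossless; an encoding tool that re-compresses the image first could still drop metadata.

```python
import base64


def to_data_uri(data: bytes, mime: str = "image/png") -> str:
    """Wrap raw file bytes in a base64 data URI (hypothetical helper)."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{payload}"


def from_data_uri(uri: str) -> bytes:
    """Recover the original file bytes from a base64 data URI."""
    header, _, payload = uri.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URI")
    return base64.b64decode(payload)


# Stand-in bytes (NOT a valid image): a PNG signature followed by a fake
# metadata chunk, just to show that every byte survives the round trip.
sample = b"\x89PNG\r\n\x1a\n" + b"tEXtComment\x00hypothetical metadata"

uri = to_data_uri(sample)
restored = from_data_uri(uri)
```

Since `restored` is byte-for-byte identical to `sample`, anything embedded in the file (EXIF, IPTC, XMP) is carried along unchanged.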

I use data URI images on one of my websites. I do so because users typically just view one page on the site to get all the information they need. Including all CSS, JS, and image data inline in the page improves performance dramatically. The images are all small, so the technique works particularly well.

My site gets a fair amount of traffic from Internet Explorer 7 and earlier, which don't support data URI images, so I have to serve them conditionally. I keep the images on the server as well and choose between regular image URLs and data URIs based on the User-Agent header. I treat bots (including Googlebot) the same as IE 7, i.e., I serve them the images as HTTP URLs. I do this because including data URI images dramatically increases the page size. Most bots don't need to download the images, so this is more efficient for them. I had also noticed that Google Webmaster Tools reported Googlebot crawling my site much more slowly when data URI images were enabled for it. This could technically be considered cloaking, but it would be a way of getting your data URI images indexed.
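The selection logic could look roughly like the Python sketch below. This is not the code my site runs; the function names, the regular expressions, and the example URL are all assumptions made for illustration, and real User-Agent sniffing needs a more careful bot list.

```python
import re

# Assumption: IE 7 and earlier announce themselves as "MSIE 1." .. "MSIE 7."
_LEGACY_IE = re.compile(r"MSIE [1-7]\.")
# Assumption: a crude keyword match is good enough to spot crawlers.
_BOT = re.compile(r"bot|crawl|spider|slurp", re.IGNORECASE)


def wants_plain_urls(user_agent: str) -> bool:
    """True for clients that should get regular HTTP image URLs."""
    ua = user_agent or ""
    return bool(_LEGACY_IE.search(ua) or _BOT.search(ua))


def image_src(user_agent: str, url: str, data_uri: str) -> str:
    """Pick the value for an img src attribute based on the User-Agent."""
    return url if wants_plain_urls(user_agent) else data_uri


googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
ie7 = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
chrome = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/120.0 Safari/537.36"

url = "/img/icon.png"                      # hypothetical image path
data_uri = "data:image/png;base64,AAAA"    # placeholder payload

src_for_bot = image_src(googlebot, url, data_uri)
src_for_ie7 = image_src(ie7, url, data_uri)
src_for_chrome = image_src(chrome, url, data_uri)
```

Bots and old IE get the plain URL, while everything else gets the inline data URI.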

Solution 2

While Google uses base64-encoded data URI images on its own SERP, it doesn't index such images on other websites. Thanks to @dan, who pointed me to the Google Groups discussion where John Mueller explains this issue. It also means that the question of whether EXIF data exists in such images isn't relevant.

This explanation makes clear which images this performance optimization technique is best applied to: small images such as icons, favicons, and buttons, and images that don't add any value to the site's content.

On the other hand, if one absolutely must embed an image WITH additional content value as a base64-encoded data URI, the only best practice for providing the image's metadata is to use Schema.org markup, where it is possible to express EXIF data, e.g. with this kind of markup.
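As a rough idea of what such markup can look like, the Python sketch below generates a JSON-LD snippet using Schema.org's ImageObject type and its exifData property. The helper name `image_object_jsonld` and the sample values are made up for illustration, and I can't say whether Google actually uses exifData for anything.

```python
import json


def image_object_jsonld(name: str, exif: dict) -> str:
    """Build JSON-LD for a Schema.org ImageObject (hypothetical helper).

    Each EXIF entry becomes a PropertyValue under the exifData property.
    """
    doc = {
        "@context": "https://schema.org",
        "@type": "ImageObject",
        "name": name,
        "exifData": [
            {"@type": "PropertyValue", "name": key, "value": value}
            for key, value in exif.items()
        ],
    }
    return json.dumps(doc, indent=2)


# Sample values are invented for the example.
markup = image_object_jsonld(
    "Example photo",
    {"Exposure Time": "1/659 sec.", "FNumber": "f/4.0"},
)
print(markup)
```

The resulting string would go inside a script element of type application/ld+json on the page that embeds the data URI image.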

Another promising kind of markup for expressing data of the form "property:value", as EXIF is, currently has only proposal status. However, this article from Google's blog shows structured snippets, which can be generated from the markup proposal I linked above.

MrH40XX
Author by

MrH40XX

Updated on September 18, 2022

Comments

  • MrH40XX
    MrH40XX over 1 year

    If I embed an image into a page as an img src with a base64 data URI, is the image's metadata (EXIF, IPTC, XMP) still available to Google's image bot?

  • MrH40XX
    MrH40XX over 9 years
    Look at goo.gl/F6D2a7: this base64-encoded image is indexed by G. You can try to find more here goo.gl/aHbgb1, but it isn't easy, that's true... Isn't it the case that IE does support base64-encoded images, but none bigger than 32 KB? It's also true that, when used for load-time optimization, it is only useful to encode very small images. Otherwise one gets fewer HTTP requests, but the file size becomes huge.
  • Stephen Ostermiller
    Stephen Ostermiller over 9 years
    Your first example is indexed at this URL: photos.topicshow.com/… and your second at this: images5.fanpop.com/image/photos/30600000/… In all cases I could find, there is a http URL for the image as well.
  • DisgruntledGoat
    DisgruntledGoat over 9 years
    Just as a point of note, I'm assuming that all metadata in a data URI is preserved? i.e. if I have a photo with metadata that's encoded, if I were to save the image from the website it would still have that metadata?
  • Stephen Ostermiller
    Stephen Ostermiller over 9 years
    Metadata would be preserved. A data URI is the entire file, base64 encoded (no spaces or newlines), with a prefix such as data:image/png;base64, in front. Any metadata in the file would still be present in the base64-encoded data URI version.
  • MrH40XX
    MrH40XX over 9 years
    @StephenOstermiller an encoded string may contain spaces: goo.gl/RF8r07. I will populate an image with EXIF, encode it, publish it, and see whether it gets indexed.
  • Stephen Ostermiller
    Stephen Ostermiller over 9 years
    Your experiment is the best way to prove or disprove my conjecture.
  • dan
    dan over 9 years
    John Mueller (from Google) indicates here that Google generally doesn't index images from data URIs. Many online tools used to encode these will also strip off metadata, so whether the EXIF info is maintained really depends on how the image is encoded... but given that they're not indexed anyway, it's a moot point. Let us know your results (be sure not to let the URL to the image get indexed: Google also uses image recognition, so EXIF info could be used from matched images).
  • MrH40XX
    MrH40XX over 9 years
    @dan thank you! Your link to John Mueller's answer clears up many things at once! If G doesn't index images for which it can't get a URI, then one doesn't need to worry about whether the EXIF is left inside or not.
  • John Mueller
    John Mueller over 9 years
    As linked above, we currently don't index these as images separately. That might change in the future, but at least for the moment you'd want to use separate image URLs if you want those images indexed in Image Search.
  • Stephen Ostermiller
    Stephen Ostermiller over 9 years
    I have updated the answer incorporating many items from these comments.