Where does Google Chrome save temporary pdf files?

google-chrome pdf

29,575

Solution 1

Note: This no longer works because the Chrome disk cache format has changed.

It's in that directory, all right. Chrome just uses random strings for the filenames. Test it out:

find ~/.cache/google-chrome -type f -exec file {} + | grep PDF

For example:

$ find .cache/google-chrome/ -type f -exec file {} + | grep PDF
.cache/google-chrome/Default/Cache/f_004bf0:       PDF document, version 1.5
.cache/google-chrome/Default/Cache/f_004c01:       PDF document, version 1.4

Solution 2

The Google Chrome cache directory $HOME/.cache/google-chrome/Default/Cache on Linux contains one file per cache entry named <16 char hex>_0 in "simple entry format":

20 Byte SimpleFileHeader
key (i.e. the URI)
payload (the raw file content i.e. the PDF in our case)
SimpleFileEOF record
HTTP headers
SHA256 of the key (optional)
SimpleFileEOF record

You therefore cannot simply use file to determine the file type (it will just report data), but must search for the PDF header instead. This will list all cache files that contain a PDF header:

grep -Rl '%PDF' "$HOME/.cache/google-chrome/Default/Cache"

Note: This may give false positives if the string %PDF appears in a cached file that isn't a PDF.

Note: If you're not using the default Chrome profile, replace Default with the profile name, e.g. Profile 1.
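To cut down on those false positives, a slightly stricter check can be sketched in Python: a file only counts if it contains a %PDF- header and a %%EOF marker somewhere after it. This is a rough sketch, not part of the original answer. The function names looks_like_pdf and find_cached_pdfs are made up for illustration, and the default path is simply the Linux one mentioned above.

```python
import os
import sys


def looks_like_pdf(data):
    """Return True if data has a '%PDF-' header followed later by '%%EOF'."""
    start = data.find(b'%PDF-')
    return start != -1 and data.find(b'%%EOF', start) != -1


def find_cached_pdfs(cache_dir):
    """Yield paths of files under cache_dir that appear to embed a PDF."""
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, 'rb') as f:
                    data = f.read()
            except OSError:
                continue  # unreadable, or removed by Chrome while scanning
            if looks_like_pdf(data):
                yield path


if __name__ == '__main__':
    # Assumed default: the Linux cache path of the Default profile.
    # Pass another profile's Cache directory as the first argument if needed.
    default = os.path.expanduser('~/.cache/google-chrome/Default/Cache')
    for p in find_cached_pdfs(sys.argv[1] if len(sys.argv) > 1 else default):
        print(p)
```

This still only looks at byte patterns, so an HTML page that quotes both markers would slip through, but it should filter out most accidental hits.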

evince will happily read the cache file directly, without having to strip the header.

If you do want to extract the original PDF, save the following script as extractpdf.py:

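import sys

def main(cachefile):
    # Read the whole cache entry (header, key, payload, trailer).
    with open(cachefile, 'rb') as f:
        s = f.read()
    # The PDF runs from the first '%PDF' to the end of the last '%%EOF' (5 bytes).
    start, end = s.find(b'%PDF'), s.rfind(b'%%EOF')
    if start == -1 or end < start:
        sys.exit(cachefile + ': no PDF found')
    with open(cachefile + '.pdf', 'wb') as f:
        f.write(s[start:end + 5])

if __name__ == '__main__':
    main(sys.argv[1])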

Then call it as python3 extractpdf.py <cache file>. The extracted PDF is written next to the cache file as <cache file>.pdf.

Jignesh

Updated on September 18, 2022

Comments

Jignesh over 1 year

I was wondering where Google Chrome saves PDF documents that it opens in the browser itself. I know they get deleted if we don't save them and then close the browser.

I looked in the Chrome temporary folder at ~/.cache/google-chrome, but couldn't find any PDFs there.
Ricky Robinson almost 9 years

Just wondering... didn't it all use to be stored in the /tmp folder?
muru almost 9 years

@RickyRobinson not in the past couple of years, at least, I think. I've had problems with Chrome filling up students' quotas for at least that long.
JamesBB about 7 years

Does that find command search sub-directories? My ~/.cache/google-chrome/ folder has multiple sub-folders, each with multiple sub-folders, so I was wondering how deep it would search. I didn't find any PDF files, although I had just closed one, so is there anywhere else that Chrome might store files it's had open?
muru about 7 years

@JamesBB as you can see from the output, it went at least 3 subdirectories deep. find recurses by default, unless you used the -maxdepth option.
rivu almost 4 years

Does this method work any more? I tried and all I see in chrome cache are data files. My version is 77.
kynan over 3 years

Nope, this doesn't work any more because the file format of the Chrome cache changed.