Instruct wget to download a file only if the existing local copy is older
Solution 1
Take a look at the timestamping section in the wget manual:
Time-Stamping
One of the most important aspects of mirroring information from the Internet is updating your archives.
Downloading the whole archive again and again, just to replace a few changed files is expensive, both in terms of wasted bandwidth and money, and the time to do the update. This is why all the mirroring tools offer the option of incremental updating.
Such an updating mechanism means that the remote server is scanned in search of new files. Only those new files will be downloaded in the place of the old ones.
A file is considered new if one of these two conditions is met:
A file of that name does not already exist locally.
A file of that name does exist, but the remote file was modified more recently than the local file.
To implement this, the program needs to be aware of the time of last modification of both local and remote files. We call this information the time-stamp of a file.
The time-stamping in GNU Wget is turned on using ‘--timestamping’ (‘-N’) option, or through timestamping = on directive in .wgetrc. With this option, for each file it intends to download, Wget will check whether a local file of the same name exists. If it does, and the remote file is not newer, Wget will not download it.
If the local file does not exist, or the sizes of the files do not match, Wget will download the remote file no matter what the time-stamps say.
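The rules quoted above can be sketched as a small shell function. This is only an illustrative model of the documented decision, not wget's actual implementation; the function name should_download and its argument layout are hypothetical, and the time-stamps are assumed to be given as epoch seconds.

```shell
#!/bin/sh
# Hypothetical helper (not part of wget): models the documented -N rules.
# Usage: should_download LOCAL_EXISTS LOCAL_MTIME LOCAL_SIZE REMOTE_MTIME REMOTE_SIZE
#   LOCAL_EXISTS is 1 if a local file of that name exists, 0 otherwise.
#   Times are epoch seconds, sizes are bytes.
# Returns 0 (true) when the rules say the remote file should be downloaded.
should_download() {
    local_exists=$1
    local_mtime=$2
    local_size=$3
    remote_mtime=$4
    remote_size=$5

    # No local copy: always download.
    [ "$local_exists" -eq 1 ] || return 0

    # Sizes differ: download no matter what the time-stamps say.
    [ "$local_size" -eq "$remote_size" ] || return 0

    # Otherwise download only if the remote file is strictly newer.
    [ "$remote_mtime" -gt "$local_mtime" ]
}

# Example: local copy stamped 09:00, remote copy stamped 11:00 on 2011-10-10 (UTC),
# both 2048 bytes. The remote file is newer, so it should be fetched.
if should_download 1 1318237200 2048 1318244400 2048; then
    echo "download"
else
    echo "skip"
fi
```

With the values shown, the remote file is newer and the sketch prints "download". Swapping the two time-stamps makes it print "skip".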
Solution 2
wget -N http://server/path/to/file.txt
James
Updated on December 05, 2020
Comments
-
James almost 2 years
As the question states: how do I instruct wget to download a file only if the existing local copy is older?
e.g. the local fileA has a date/time stamp of 9:00 AM 10/10/2011
e.g. fileA on the remote server has a date/time stamp of 11:00 AM 10/10/2011
so wget should download fileA from the server, as it's newer (and overwrite the local file)
Any help would be greatly appreciated. I have heard this is possible, but after looking around for a while I haven't come up with anything.
-
Prisoner 13 about 5 years
"If [...] the sizes of the files do not match, Wget will download the remote file no matter what the time-stamps say." This wasn't true for me. The file failed to download. To wit: "HTTP request sent, awaiting response... 304 Not Modified File ‘xxx.csv’ not modified on server. Omitting download." when the local file was truncated and newer. Bug or misunderstanding of the feature?