How do I only download new files from a server?

Solution 1

Have you considered creating an FTP account with access to that particular folder and then using an FTP client such as SmartFTP or FileZilla to synchronize your local folder with the remote one? That should be easy to set up and convenient to use. If absolutely necessary, you could also write an FTP command script and execute it from your Java code.

Alternatively, a message-digest algorithm such as MD5 could help, so you wouldn't have to rely on timestamps. Compute the MD5 hash of the file you already have and compare it with the hash of the file you are about to download; if the two match, there is no need to download. Note that this only saves a transfer if the server can give you the remote file's hash without sending the whole file.
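As a rough illustration of the hash comparison, here is a minimal Java sketch. The class name `FileDigest`, the method names, and the `remoteMd5Hex` parameter are made up for this example; it assumes you have some way of obtaining the remote file's MD5 as a hex string, which the answer above does not specify.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, not part of any library.
class FileDigest {

    /** Returns the MD5 digest of the given file as a lowercase hex string. */
    static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            // Feed the file to the digest in chunks so large files are not held in memory.
            while ((read = in.read(buffer)) != -1) {
                md.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /**
     * Returns true if the local file is missing or its MD5 differs from the
     * remote hash. How remoteMd5Hex is obtained is an assumption of this sketch.
     */
    static boolean needsDownload(Path localFile, String remoteMd5Hex)
            throws IOException, NoSuchAlgorithmException {
        if (!Files.exists(localFile)) {
            return true;
        }
        return !md5Hex(localFile).equalsIgnoreCase(remoteMd5Hex);
    }
}
```

A caller would do something like `FileDigest.needsDownload(Paths.get("quotes.csv"), hashFromServer)` and skip the transfer when it returns false.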

Solution 2

I might be missing something but I can't see why you would need JNI or POI to download a file. If you are downloading the file with HTTP, you can use an HttpURLConnection with the "If-Modified-Since" request header.
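A conditional download along those lines might look like the following sketch. The class and method names are invented for illustration, and it assumes the server honours "If-Modified-Since" and answers 304 when the file has not changed, which not every server does.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, not part of any library.
class ConditionalDownload {

    /**
     * Downloads url into target unless the server reports that the resource
     * has not changed since target was last modified.
     * Returns true if a download actually happened.
     */
    static boolean downloadIfModified(URL url, File target) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            if (target.exists()) {
                // Makes the request carry an If-Modified-Since header
                // based on the local file's timestamp.
                conn.setIfModifiedSince(target.lastModified());
            }
            int status = conn.getResponseCode();
            if (status == HttpURLConnection.HTTP_NOT_MODIFIED) {
                return false; // 304: the local copy is still current
            }
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status);
            }
            try (InputStream in = conn.getInputStream()) {
                Files.copy(in, target.toPath(), StandardCopyOption.REPLACE_EXISTING);
            }
            // Align the local timestamp with the server's, if the server sent one
            // (getLastModified() returns 0 when the header is absent).
            long serverTime = conn.getLastModified();
            if (serverTime > 0) {
                target.setLastModified(serverTime);
            }
            return true;
        } finally {
            conn.disconnect();
        }
    }
}
```

If the server ignores the header it will simply answer 200 every time, in which case the file is downloaded again on each call.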

Solution 3

I have a number of CSV files that I want to download from Yahoo finance each day. I want my application to read the file's creation date (on my computer, not the server). If the creation date is prior to today then the new file should be downloaded (as it will have new data).

In order to detect changes to the local file, you need the file's last modification date, which is more generic than the creation date for this kind of check (since it also shows changes to the file after it has been created).

You can get that in Java by using the

public long lastModified()

method on a File object.
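For the "older than today" check described in the question, one possible sketch is shown below. The class `Freshness` and the method `isStale` are hypothetical names, and passing in the date and time zone explicitly is a design choice of this example, not something the question prescribes.

```java
import java.io.File;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;

// Hypothetical helper class, not part of any library.
class Freshness {

    /**
     * Returns true if the file is missing or was last modified on a calendar
     * day before the given date, interpreted in the given time zone.
     */
    static boolean isStale(File file, LocalDate today, ZoneId zone) {
        if (!file.exists()) {
            return true; // nothing downloaded yet
        }
        // lastModified() returns milliseconds since the epoch.
        LocalDate modified = Instant.ofEpochMilli(file.lastModified())
                .atZone(zone)
                .toLocalDate();
        return modified.isBefore(today);
    }
}
```

A typical call would be `Freshness.isStale(file, LocalDate.now(), ZoneId.systemDefault())`; a true result means the file should be downloaded again.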

Note that there is no method to get the creation date in the File API, probably because this information is not available in all filesystems.

If you absolutely need to have a file creation date, then (if you create the files yourself or you can ask those who do) you could encode the creation date by convention in the file name, like this: myfile_2009_04_11.csv.

Then you will have to parse the file name and determine the creation date.
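Parsing a name that follows the myfile_2009_04_11.csv convention could be sketched like this. The class name, the regular expression, and the decision to return an empty `Optional` for names that do not match are all assumptions made for the example.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class FileNameDate {

    // Matches a trailing "_yyyy_MM_dd.csv", as in "myfile_2009_04_11.csv".
    private static final Pattern DATE_SUFFIX =
            Pattern.compile("_(\\d{4}_\\d{2}_\\d{2})\\.csv$");

    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuu_MM_dd");

    /** Extracts the date encoded in the file name, if there is one. */
    static Optional<LocalDate> parse(String fileName) {
        Matcher m = DATE_SUFFIX.matcher(fileName);
        if (!m.find()) {
            return Optional.empty();
        }
        try {
            return Optional.of(LocalDate.parse(m.group(1), FORMAT));
        } catch (DateTimeParseException e) {
            // Digits were present but do not form a valid date (e.g. month 13).
            return Optional.empty();
        }
    }
}
```

The result can then be compared with `LocalDate.now()` to decide whether a newer file is needed.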

I have done some googling and have found the Apache POI project. Is this the best way to go, is there a better way, what would you recommend.

The Apache POI project is a library for reading and writing MS Office files (Excel files in this case). CSV is a simple textual format, so you don't need POI to read it.

Also, the information you need (creation date or last modification date) is available as metadata on the file itself, not in the file's data, so you don't need POI to get to it.

Is JNI at all relevant here?

Theoretically, you could use a custom JNI extension (a bridge to native code) to get the file's creation date on those filesystems that support it.

However, you're best off using the portable last modification date that's already in the Java SDK API and/or the "creation date encoded in the filename" convention.

Using JNI would make your program non-portable for no real benefit.

Solution 4

JNI is definitely irrelevant, and so is Apache POI, unless the creation date is stored in the file itself (unlikely). Otherwise, it's external metadata and either accessible via the HTTP headers (possible using pure Java), or not accessible at all.

Author by

Ankur

A junior BA with some experience in the financial services industry. I program for my own personal projects, so my questions might sound trivial.

Updated on June 05, 2022

Comments

  • Ankur
    Ankur almost 2 years

    I have a number of CSV files that I want to download from Yahoo finance each day. I want my application to read the file's creation date (on my computer, not the server). If the creation date is prior to today then the new file should be downloaded (as it will have new data). If not then the new file should not be downloaded, and the correlation calculator (which is essentially what my application is), should use the last downloaded file for the particular stock code.

    I have done some googling and have found the Apache POI project.

    Is this the best way to go, is there a better way, what would you recommend. Is JNI at all relevant here?