Automate daily CSV file download from website button click
Your button most likely issues a POST request to the server. To find out which request it is:
- Open the Network tab in Chrome developer tools.
- Navigate to the page and click the button.
- Note which request led to the file download. Right-click it and choose "Copy as cURL".
- Run the copied cURL command in a terminal and check that it downloads the file.
Once you have the cURL command working, you can schedule the download with cron or Task Scheduler, depending on which operating system you are using.
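If you would rather script the request than keep a raw cURL command around, the same replay can be sketched in Node.js. This is only a sketch under assumptions: it expects Node.js 18 or later (for the built-in fetch), and the form field `format: 'csv'`, the `export` file prefix, and the helper names `datedFileName` and `downloadCsv` are hypothetical. Take the real URL, body, and any cookies or tokens from the command you copied.

```javascript
// Sketch: replay the request captured via "Copy as cURL" and save a dated copy.
// Assumes Node.js 18+ (built-in fetch). The form fields and headers here are
// placeholders; copy the real ones from your captured cURL command.
const fs = require('fs');
const path = require('path');

// Build a name like export_2022_06_04.csv from a Date (local time).
function datedFileName(date, prefix = 'export', ext = 'csv') {
  const pad = (n) => String(n).padStart(2, '0');
  const stamp = [date.getFullYear(), pad(date.getMonth() + 1), pad(date.getDate())].join('_');
  return `${prefix}_${stamp}.${ext}`;
}

// POST the form and write the response body into outDir.
// fetchImpl is injectable so the function can be exercised without a network.
async function downloadCsv(url, formFields, outDir, fetchImpl = fetch) {
  const response = await fetchImpl(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: new URLSearchParams(formFields).toString(),
  });
  if (!response.ok) {
    throw new Error(`Download failed with HTTP ${response.status}`);
  }
  const target = path.join(outDir, datedFileName(new Date()));
  fs.mkdirSync(outDir, { recursive: true });
  fs.writeFileSync(target, Buffer.from(await response.arrayBuffer()));
  return target;
}

// Usage: node download.js <url> <output-dir>
// Nothing is requested unless both arguments are given.
if (process.argv.length >= 4) {
  downloadCsv(process.argv[2], { format: 'csv' }, process.argv[3]) // hypothetical form field
    .then((file) => console.log(`Saved ${file}`))
    .catch((err) => { console.error(err.message); process.exitCode = 1; });
}
```

Saved as, say, download.js, this could be run nightly from cron or Task Scheduler in the same way as the cURL command. If the site requires a login, the cookies from the copied command will probably expire, so the script may need a login step first.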
user
Updated on June 04, 2022
user almost 2 years
I would like to automate the process of visiting a website, clicking a button, and saving the file. The only way to download the file on this site is to click a button. You can't navigate to the file using a URL.
I have been trying to use PhantomJS and CasperJS to automate this process, but haven't had any success.
I recently tried brandon's solution from "Grab the resource contents in CasperJS or PhantomJS".
Here is my code for that:
var fs = require('fs');
var cache = require('./cache');
var mimetype = require('./mimetype');
var casper = require('casper').create();

casper.start('http://www.example.com/page_with_download_button', function() {
});

casper.then(function() {
    this.click('#download_button');
});

casper.on('resource.received', function (resource) {
    "use strict";
    for(i=0;i < resource.headers.length; i++){
        if(resource.headers[i]["name"] == "Content-Type" && resource.headers[i]["value"] == "text/csv; charset-UTF-8;"){
            cache.includeResource(resource);
        }
    }
});

casper.on('load.finished', function(status) {
    for(i=0; i< cache.cachedResources.length; i++){
        var file = cache.cachedResources[i].cacheFileNoPath;
        var ext = mimetype.ext[cache.cachedResources[index].mimetype];
        var finalFile = file.replace("."+cache.cacheExtension,"."+ext);
        fs.write('downloads/'+finalFile,cache.cachedResources[i].getContents(),'b');
    }
});

casper.run();
I think the problem could be caused by my cachePath being incorrect in cache.js:
exports.cachePath = 'C:/Users/username/AppData/Local/Ofi Labs/PhantomJS';
Should I be using something in addition to the backslashes to define the path?
When I run
casperjs --disk-cache=true export_script.js
nothing is downloaded. After a little debugging, I found that cache.cachedResources is always empty.
I would also be open to solutions outside of PhantomJS/CasperJS.
UPDATE
I am no longer trying to accomplish this with CasperJS/PhantomJS. I am using the Chrome extension Tampermonkey, as suggested by dandavis. Tampermonkey was extremely easy to figure out. I installed Tampermonkey, navigated to the page with the download link, clicked New Script under Tampermonkey, and added my JavaScript code:
document.getElementById("download_button").click();
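As a complete userscript, that one-liner might look like the sketch below. The metadata values are assumptions (the `@name` is made up, and the `@match` URL is the example page used in the batch script), and the `clickDownloadButton` helper with its null check is an illustrative addition, not something Tampermonkey requires.

```javascript
// ==UserScript==
// @name         Auto-click download button (hypothetical name)
// @match        http://www.example.com/page-with-dl-button
// @run-at       document-idle
// @grant        none
// ==/UserScript==

// Click the button with the given id, if present.
// Returns true when a click was issued, false when the button was not found.
function clickDownloadButton(doc, buttonId) {
  const button = doc.getElementById(buttonId);
  if (!button) {
    return false; // the page layout changed or the button has not rendered yet
  }
  button.click();
  return true;
}

// Only runs in a browser, where `document` exists.
if (typeof document !== 'undefined') {
  clickDownloadButton(document, 'download_button');
}
```

Restricting `@match` to the one page keeps the script from clicking anything on other sites.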
Now every time I navigate to the page in my browser, the file is downloaded. I then created a batch script that looks like this:
set date=%DATE:~10,4%_%DATE:~4,2%_%DATE:~7,2%
chrome "http://www.example.com/page-with-dl-button"
timeout 10
move "C:\Users\user\Downloads\export.csv" "C:\path\to\dir\export_%date%.csv"
I set that batch script to run nightly using the Windows Task Scheduler.
Success!