BASH CURL: Don't close connection in between requests when run sequentially


Solution 1

Running curl a.html && curl b.html will necessarily use two TCP (HTTP) connections to fetch the data. Each curl invocation is its own process and opens its own connection.

However, a web site doesn't use the TCP/HTTP connection to track login information. Instead, some kind of token is placed in the session (usually using a cookie) that is passed in subsequent requests to the site. The site validates that token on subsequent requests.

Curl has an option -c to write received cookies to a file (the "cookie jar") and an option -b to send the cookies from that file with a later request, so

curl -c cookiejar -u user:pass login.php && curl -b cookiejar index.php

will be closer. I say closer for two reasons. First, many sites don't use the HTTP authentication supported by the -u option but use custom login forms instead. Second, the invocations assume the site tracks the session with a cookie (as opposed to embedding something in JavaScript or a URL path). The cookie assumption is likely to hold, but I wouldn't count on the site using HTTP authentication.
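As a rough sketch of the cookie-jar approach with one output file per page, something like the following could work. The function name fetch_with_session, its argument order and the file names in the usage line are made up for illustration, and it assumes the server really does track the login with a cookie.

```shell
# Sketch only: two separate curl processes that share one cookie jar file.
# Each call still opens its own TCP connection; only the cookies carry over.
# Arguments (all hypothetical names): jar, user:pass, login URL, first output
# file, second URL, second output file.
fetch_with_session() {
  jar=$1
  credentials=$2
  login_url=$3
  login_out=$4
  page_url=$5
  page_out=$6

  # First request: authenticate and save any cookies the server sets (-c).
  curl -sS -c "$jar" -u "$credentials" -o "$login_out" "$login_url" || return 1

  # Second request: send the saved cookies back (-b) and keep the jar updated (-c).
  curl -sS -b "$jar" -c "$jar" -o "$page_out" "$page_url"
}
```

It might be called as fetch_with_session cookiejar user:pass "${IP}:${PORT}/login.php" output1 "${IP}:${PORT}/index.php" output2. If the second page still shows the guest view, the site is probably not relying on a cookie (or not on HTTP authentication), as cautioned above.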

Solution 2

According to the curl manual, the synopsis is the following:

curl [options] [URL...]

That means you can specify several URLs one after another in the same command. Curl will reuse the connection for each subsequent URL, as the manual explains:

curl will attempt to re-use connections for multiple file transfers, so that getting many files from the same server will not do multiple connects / handshakes. This improves speed. Of course this is only done on files specified on a single command line and cannot be used between separate curl invokes.
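Since the question asks for separate output files, it may help to note that curl accepts one -o per URL, matched in order, so a single invocation can still split the two pages. A minimal sketch, with a hypothetical function name and argument order:

```shell
# Sketch only: one curl process, two URLs, two output files.
# The first -o belongs to the first URL and the second -o to the second URL.
# Arguments (hypothetical names): user:pass, first URL, first output file,
# second URL, second output file.
fetch_on_one_connection() {
  credentials=$1
  first_url=$2
  first_out=$3
  second_url=$4
  second_out=$5

  # Both transfers happen inside the same curl process, so curl can reuse
  # the connection if the server keeps it open.
  curl -sS -u "$credentials" \
    "$first_url" -o "$first_out" \
    "$second_url" -o "$second_out"
}
```

For example: fetch_on_one_connection user:pass "${IP}:${PORT}/login.php" output1 "${IP}:${PORT}/index.php" output2. Swapping -sS for -v should show in the verbose log whether the connection was actually reused. If the site tracks the login with a cookie, adding -c cookiejar to the command turns on curl's cookie handling, so a cookie set by the first response is sent with the second request.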

Solution 3

This is essentially what I made Xidel for: you can write all requests and actions in a single command, and it behaves much like a browser, keeping the cookies and the connection alive:

xidel http://${IP}/login.php --download page1.html -f '"index.php"' --download page2.html 

Or if there is a link from the first page to the second one, it can directly follow that link:

xidel http://${IP}/login.php --download page1.html -f //a --download page2.html 

However, it does not yet support HTTP authentication or ports other than 80, 8080 and 443. (The backend would support them, but a URL validation step in between rejects such URLs as invalid.)

ecbrodie
Author by

ecbrodie

Software Developer at Nulogy, Toronto, ON

Updated on July 09, 2022

Comments

  • ecbrodie
    ecbrodie almost 2 years

    I am trying to write a Bash command that uses curl to send a GET request to two different web pages over the same connection. In my case, the first GET request goes to a login page to authenticate to the server, and the second request mimics the automatic redirect to the home page that would have happened in a web browser (via a meta refresh tag). I need to chain the requests because the content of the home page (generated by the server) will be different for a guest user than for an authenticated user.

    I first tried this command, based on a recommendation from a Stack Overflow post (assume that the variables $IP and $PORT are already defined with valid values):

    curl -u user:pass ${IP}:${PORT}/login.php && curl ${IP}:${PORT}/index.php
    

    However, I always get something like this happening between the end of the first GET and the start of the second:

    * Connection #0 to host 10.0.3.153 left intact
    * Closing connection #0
    

    So was the Stack Overflow post wrong? In any case, this command does successfully keep the connection open between the two requests:

    curl -u user:pass ${IP}:${PORT}/login.php ${IP}:${PORT}/index.php
    

    However, I would really prefer a solution closer to the former command than the latter. The main reason is that I want to separate the output of the first page from the output of the second page into two different files. So I want to do something like:

    curl page1.html > output1 && curl page2.html > output2
    

    Of course, I need to reuse the same connection because the contents of page2.html depend on my having also requested page1.html in the same HTTP session.

    I am also open to solutions that use netcat or wget, BUT NOT PHP!