Why do I get "net/http: request canceled while waiting for connection" when I try to fetch some images with "net/http"?


Since you're using HTTPS, try creating an http.Client with a custom transport and configuring TLS (see http.Transport), e.g.:

package main

import (
    "crypto/tls"
    "fmt"
    "net/http"
    "time"
)

func main() {
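    //---------------------- Modification ----------------------
    // Configure TLS on a custom transport.
    // Note: InsecureSkipVerify disables certificate verification; use it for testing only.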
    tr := &http.Transport{
        TLSClientConfig: &tls.Config{
            InsecureSkipVerify: true,
        },
    }
    client := &http.Client{
        Transport: tr,
        Timeout:   3 * time.Second,
    }
    //---------------------- End of Modification ----------------

    // var imageUrl = "https://i.stack.imgur.com/tKsDb.png"  // It works well
    var imageUrl = "https://precious.jp/mwimgs/b/1/-/img_b1ec6cf54ff3a4260fb77d3d3de918a5275780.jpg" // It fails

    req, err := http.NewRequest("GET", imageUrl, nil)
    if err != nil {
        fmt.Println(err.Error())
        return
    }
    req.Header.Add("User-Agent", "My Test")

    resp, err := client.Do(req)
    if err != nil {
        fmt.Println(err.Error()) // Fails here
        return
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK {
        fmt.Printf("Failure: %d\n", resp.StatusCode)
    } else {
        fmt.Printf("Success: %d\n", resp.StatusCode)
    }

    fmt.Println("Done")
}
Author by

Sa Oh

Updated on July 02, 2022

Comments

  • Sa Oh, almost 2 years

    I'm writing a web crawler in Go to collect images from the Internet. My crawler works most of the time, but it sometimes fails to fetch an image.

    Here's my snippet:

    package main
    
    import (
        "fmt"
        "net/http"
        "time"
    )
    
    func main() {
        var client http.Client
        var resp *http.Response
    
        // var imageUrl = "https://i.stack.imgur.com/tKsDb.png"  // It works well
        var imageUrl = "https://precious.jp/mwimgs/b/1/-/img_b1ec6cf54ff3a4260fb77d3d3de918a5275780.jpg"  // It fails
    
        req, _ := http.NewRequest("GET", imageUrl, nil)
        req.Header.Add("User-Agent", "My Test")
    
        client.Timeout = 3 * time.Second
        resp, err := client.Do(req)
        if err != nil {
            fmt.Println(err.Error())  // Fails here
            return
        }
        defer resp.Body.Close()
    
        if resp.StatusCode != http.StatusOK {
            fmt.Printf("Failure: %d\n", resp.StatusCode)
        } else {
            fmt.Printf("Success: %d\n", resp.StatusCode)
        }
    
        fmt.Println("Done")
    }
    

    My snippet above works for most URLs (e.g. "https://i.stack.imgur.com/tKsDb.png"), but it fails when it tries to fetch URLs such as "https://precious.jp/mwimgs/b/1/-/img_b1ec6cf54ff3a4260fb77d3d3de918a5275780.jpg". The error message returned by err.Error() is:

    Get https://precious.jp/mwimgs/b/1/-/img_b1ec6cf54ff3a4260fb77d3d3de918a5275780.jpg: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

    My Go version is "go1.9.3 darwin/amd64". I can get the image with Google Chrome and also with the curl command, so I don't think my IP address is blocked. I've also changed the User-Agent to mimic a real browser, but still no luck.

    What's wrong with my code? Or is the administrator of precious.jp doing some magic to block my access?