Puppeteer - Protocol error (Page.navigate): Target closed

Solution 1

What "Target closed" means

When you launch a browser via puppeteer.launch, it starts a browser and connects to it. From then on, any function you execute on the opened browser (like page.goto) is sent to the browser via the Chrome DevTools Protocol. A target means a tab in this context.

The Target closed exception is thrown when you are trying to run a function, but the target (tab) was already closed.

Similar error messages

The error message was recently changed to give more meaningful information. It now gives the following message:

Error: Protocol error (Target.activateTarget): Session closed. Most likely the page has been closed.
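
If you want to treat these errors specially in code (for example, to decide whether a retry makes sense), one option is to match on the message text. The following is only a sketch: isTargetClosedError is a hypothetical helper, not part of Puppeteer, and the matched strings are taken from the messages quoted in this answer, so they may differ between Puppeteer versions.

```javascript
// Hypothetical helper (not a Puppeteer API): returns true when an error looks
// like one of the "closed target" protocol errors quoted in this answer.
function isTargetClosedError(err) {
  const message = err && typeof err.message === 'string' ? err.message : '';
  return (
    message.includes('Protocol error') &&
    (message.includes('Target closed') || message.includes('Session closed'))
  );
}
```

Matching on message text is brittle, so treat a positive result as a hint rather than a guarantee.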


Why does it happen

There are multiple reasons why this could happen.

  • You used a resource that was already closed

    Most likely, you are seeing this message because you closed the tab/browser and are still trying to use the resource. To give a simple example:

    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    
    await browser.close();
    await page.goto('http://www.google.com');
    

    In this case, the browser was closed and page.goto was called afterwards, resulting in the error message. Most of the time it will not be that obvious: maybe an error handler already closed the page during a cleanup task while your script is still crawling.

  • The browser crashed or was unable to initialize

    I also experience this every few hundred requests. There is an issue about this on the puppeteer repository as well. It seems to happen when you are using a lot of memory or CPU power. Maybe you are spawning a lot of browsers? In these cases, the browser might crash or disconnect.

    I found no "silver bullet" solution to this problem. But you might want to check out the library puppeteer-cluster (disclaimer: I'm the author), which handles these kinds of error cases and lets you retry the URL when the error happens. It can also manage a pool of browser instances and would simplify your code.
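
To illustrate both cases, here is a minimal sketch of a guard against using a closed tab and a simple retry with a fresh browser. It is not how puppeteer-cluster is implemented; gotoIfOpen and withFreshBrowser are hypothetical helper names, and the launch parameter is assumed to be a function such as () => puppeteer.launch(). The only Puppeteer calls used are browser.newPage, browser.close, page.isClosed and page.goto.

```javascript
// Hypothetical helper: skip navigation when the tab is already closed.
// page.isClosed() only reflects the state at the moment it is called, so the
// tab can still close between the check and the navigation.
async function gotoIfOpen(page, url, options) {
  if (page.isClosed()) {
    return null;
  }
  return page.goto(url, options);
}

// Hypothetical helper: run `task` in a fresh browser and retry on any error.
// `launch` is assumed to be a function such as () => puppeteer.launch().
async function withFreshBrowser(launch, task, retries = 2) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    let browser;
    try {
      browser = await launch();
      const page = await browser.newPage();
      return await task(page);
    } catch (err) {
      lastError = err; // remember the failure and try again with a new browser
    } finally {
      if (browser) {
        // Ignore errors from closing a browser that has already crashed.
        await browser.close().catch(() => {});
      }
    }
  }
  throw lastError;
}

// Usage (assumes the puppeteer package is installed):
// const puppeteer = require('puppeteer');
// withFreshBrowser(
//   () => puppeteer.launch(),
//   (page) => gotoIfOpen(page, 'http://www.google.com')
// ).then((response) => console.log(response && response.status()));
```

Launching a new browser per attempt is slow, but it avoids reusing an instance that may already have crashed.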

Solution 2

I was just experiencing the same issue every time I tried running my puppeteer script*. The above did not resolve this issue for me.

I got it to work by removing and reinstalling the puppeteer package:

npm remove puppeteer
npm i puppeteer

*I only experienced this issue when setting the headless option to false.

Solution 3

In 2021 I was getting a very similar error: Error: Error pdf creationError: Protocol error (Target.setDiscoverTargets): Target closed. I solved it by experimenting with different launch arguments. If your production server passes pipe: true in the puppeteer.launch options, that can produce this error.

Adding the --disable-dev-shm-usage flag also does the trick.

The solution below works for me:

const browser = await puppeteer.launch({
  headless: true,
  // pipe: true, <-- delete this property
  args: [
    '--no-sandbox',
    '--disable-dev-shm-usage', // <-- add this one
  ],
});

Solution 4

For me removing '--single-process' from args fixed the issue.

puppeteerOptions: {
    headless: true,
    args: [
        '--disable-gpu',
        '--disable-dev-shm-usage',
        '--disable-setuid-sandbox',
        '--no-first-run',
        '--no-sandbox',
        '--no-zygote',
        '--deterministic-fetch',
        '--disable-features=IsolateOrigins',
        '--disable-site-isolation-trials',
        // '--single-process',
    ],
}

Solution 5

I've wound up at this thread a few times, and the typical culprit is that I forgot to await a Puppeteer page call that returned a promise, causing a race condition.

Here's a minimal example of what this can look like:

const puppeteer = require("puppeteer");

let browser;
(async () => {
  browser = await puppeteer.launch();
  const [page] = await browser.pages();
  page.goto("https://www.stackoverflow.com"); // whoops, forgot await!
})()
  .catch(err => console.error(err))
  .finally(async () => await browser.close())
;

Output is:

C:\Users\foo\Desktop\puppeteer-playground\node_modules\puppeteer\lib\cjs\puppeteer\common\Connection.js:217
            this._callbacks.set(id, { resolve, reject, error: new Error(), method });
                                                              ^

Error: Protocol error (Page.navigate): Target closed.
    at C:\Users\foo\Desktop\puppeteer-playground\node_modules\puppeteer\lib\cjs\puppeteer\common\Connection.js:217:63

In this case, it seems like an unmissable error, but in a larger chunk of code where the promise is nested or inside a condition, it's easy to overlook.

You'll get a similar error if you forget to await a page.click() or another promise-returning call, for example Error: Protocol error (Runtime.callFunctionOn): Target closed., as seen in the question UnhandledPromiseRejectionWarning: Error: Protocol error (Runtime.callFunctionOn): Target closed. (Puppeteer)
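
For comparison, here's a sketch of the minimal example with the navigation awaited and the cleanup moved into a finally block. The function name visit is made up for illustration, and puppeteer is passed in as a parameter (for example, require("puppeteer")) only to keep the snippet standalone.

```javascript
// Sketch of the example above with the navigation awaited. `puppeteer` is
// passed in as a parameter (e.g. require("puppeteer")) to keep this standalone.
async function visit(puppeteer, url) {
  const browser = await puppeteer.launch();
  try {
    const [page] = await browser.pages();
    await page.goto(url); // awaited, so close() cannot run mid-navigation
    return await page.title();
  } finally {
    await browser.close(); // runs only after the awaited calls have settled
  }
}

// Usage (assumes the puppeteer package is installed):
// visit(require("puppeteer"), "https://www.stackoverflow.com")
//   .then(title => console.log(title))
//   .catch(err => console.error(err));
```

Because every page call inside the try block is awaited, browser.close() can only run once navigation has either finished or thrown.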

This is a contribution to the thread as a canonical resource for the error and may not be the solution to OP's problem, although the fundamental race condition seems to be a likely cause.

Author by

LioRz

Updated on July 09, 2022

Comments

  • LioRz
    LioRz almost 2 years

    As you can see in the sample code below, I'm using Puppeteer with a cluster of workers in Node to handle multiple requests for a screenshot of a website at a given URL:

    const cluster = require('cluster');
    const express = require('express');
    const bodyParser = require('body-parser');
    const puppeteer = require('puppeteer');
    
    async function getScreenshot(domain) {
        let screenshot;
        const browser = await puppeteer.launch({ args: ['--no-sandbox', '--disable-setuid-sandbox', '--disable-dev-shm-usage'] });
        const page = await browser.newPage();
    
        try {
            await page.goto('http://' + domain + '/', { timeout: 60000, waitUntil: 'networkidle2' });
        } catch (error) {
            try {
                await page.goto('http://' + domain + '/', { timeout: 120000, waitUntil: 'networkidle2' });
                screenshot = await page.screenshot({ type: 'png', encoding: 'base64' });
            } catch (error) {
                console.error('Connecting to: ' + domain + ' failed due to: ' + error);
            }
        }
    
        await page.close();
        await browser.close();
    
        return screenshot;
    }
    
    if (cluster.isMaster) {
        const numOfWorkers = require('os').cpus().length;
        for (let worker = 0; worker < numOfWorkers; worker++) {
            cluster.fork();
        }
    
        cluster.on('exit', function (worker, code, signal) {
            console.debug('Worker ' + worker.process.pid + ' died with code: ' + code + ', and signal: ' + signal);
            cluster.fork();
        });
    
        cluster.on('message', function (handler, msg) {
            console.debug('Worker: ' + handler.process.pid + ' has finished working on ' + msg.domain + '. Exiting...');
            if (cluster.workers[handler.id]) {
                cluster.workers[handler.id].kill('SIGTERM');
            }
        });
    } else {
        const app = express();
        app.use(bodyParser.json());
        app.listen(80, function() {
            console.debug('Worker ' + process.pid + ' is listening to incoming messages');
        });
    
        app.post('/screenshot', (req, res) => {
            const domain = req.body.domain;
    
            getScreenshot(domain)
                .then((screenshot) => {
                    try {
                        process.send({ domain: domain });
                    } catch (error) {
                        console.error('Error while exiting worker ' + process.pid + ' due to: ' + error);
                    }
    
                    res.status(200).json({ screenshot: screenshot });
                })
                .catch((error) => {
                    try {
                        process.send({ domain: domain });
                    } catch (error) {
                        console.error('Error while exiting worker ' + process.pid + ' due to: ' + error);
                    }
    
                    res.status(500).json({ error: error });
                });
        });
    }
    

    Some explanation:

    1. Each time a request arrives a worker will process it and kill itself at the end
    2. Each worker creates a new browser instance with a single page, and if a page takes more than 60 seconds to load, it retries loading it (in the same page, because some resources may already have been loaded) with a timeout of 120 seconds
    3. Once finished both the page and the browser will be closed

    My problem is that some legitimate domains get errors that I can't explain:

    Error: Protocol error (Page.navigate): Target closed.
    
    Error: Protocol error (Runtime.callFunctionOn): Session closed. Most likely the page has been closed.
    

    I read in some GitHub issue (that I can't find now) that it can happen when the page redirects and adds 'www' at the start, but I'm hoping that's not the case... Is there something I'm missing?