Looking for help to make npm/pdfjs-dist work with Webpack and Django

Solution 1

This issue seems to arise from the esModule option introduced in worker-loader@3.0.0.

The fix for this was merged in (pre-release) pdfjs-dist@2.6.347.

You can fix this by either upgrading pdfjs-dist to v2.6.347 OR downgrading worker-loader to v2.0.0
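To check whether a given pair of installed versions falls into the affected range, a small sketch like the following may help. It assumes plain x.y.z version strings and treats worker-loader 3.0.0 and pdfjs-dist 2.6.347 as the boundaries, as this answer describes; compareVersions and isAffected are hypothetical helper names, not part of either package.

```javascript
// Compare two plain "x.y.z" version strings numerically.
// Returns a negative number, zero, or a positive number.
function compareVersions(a, b) {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Assumption: the problem appears when worker-loader is at 3.0.0 or later
// (esModule option present) and pdfjs-dist is older than 2.6.347 (no fix yet).
function isAffected(workerLoaderVersion, pdfjsDistVersion) {
  return (
    compareVersions(workerLoaderVersion, "3.0.0") >= 0 &&
    compareVersions(pdfjsDistVersion, "2.6.347") < 0
  );
}

console.log(isAffected("3.0.5", "2.5.207")); // true
console.log(isAffected("2.0.0", "2.5.207")); // false
console.log(isAffected("3.0.5", "2.6.347")); // false
```

The installed versions can be listed with npm ls worker-loader pdfjs-dist.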

Solution 2

I just had to solve this issue myself...

This error

Module not found: Error: Can't resolve 'module' in '/home/giampaolo/dev/KJ_import/KJ-JS/node_modules/webpack/lib/node'

is caused by worker-loader loading NodeTargetPlugin, which in turn runs require("module"). As far as I can tell (I'm not 100% sure), module is a Node.js built-in, so it is not relevant when Webpack is targeting the web.

This can be mitigated with the following Webpack config:

{
  node: {
    module: "empty"
  }
}

After that, the build gets further, but I needed more mitigations:

import pdfjsLib from "pdfjs-dist/webpack";

This runs pdfjs-dist/webpack.js, where line 27 is:

var PdfjsWorker = require("worker-loader!./build/pdf.worker.js");

This attempts to load pdf.worker.js (which worker-loader should be packaging) and then tries to instantiate the resulting class:

pdfjs.GlobalWorkerOptions.workerPort = new PdfjsWorker();

The issue I had was that Webpack packaged pdf.worker.js as an ES module (the default for worker-loader), so the require'd value has to be unwrapped through the default property of the imported module. Said another way, the instantiation would have to be new PdfjsWorker.default().
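The difference between the two module shapes can be sketched without Webpack at all. The snippet below fakes what require() hands back in each case; FakeWorker, esModuleShape, commonJsShape and unwrapDefault are made-up names for illustration, and it assumes the usual __esModule interop flag is present on the ES module form.

```javascript
class FakeWorker {}

// With esModule enabled, require() yields a namespace-like object
// whose default property holds the constructor.
const esModuleShape = { __esModule: true, default: FakeWorker };

// With esModule=false, require() yields the constructor itself.
const commonJsShape = FakeWorker;

// Return the constructor regardless of which shape was loaded.
function unwrapDefault(mod) {
  return mod && mod.__esModule ? mod.default : mod;
}

const A = unwrapDefault(esModuleShape);
const B = unwrapDefault(commonJsShape);
console.log(new A() instanceof FakeWorker); // true
console.log(new B() instanceof FakeWorker); // true

// Calling "new" on the namespace object itself is what fails in the real build.
try {
  new esModuleShape();
} catch (err) {
  console.log(err instanceof TypeError); // true
}
```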

I was able to mitigate this with NormalModuleReplacementPlugin, which rewrites a require request based on a regex match. Here it matches the original require string and replaces it with one that sets the worker-loader option esModule=false, followed by the absolute path to the pdf.worker.js file on the local system:

new webpack.NormalModuleReplacementPlugin(
  /worker-loader!\.\/build\/pdf\.worker\.js$/,
  "worker-loader?esModule=false!" + path.join(__dirname, "../", "node_modules", "pdfjs-dist", "build", "pdf.worker.js")
)

It's important to match the complete original require string with /worker-loader!\.\/build\/pdf\.worker\.js$/ and not just the pdf.worker.js part. The replacement string also ends in pdf.worker.js, so a looser pattern would match it again and you could end up in an infinite replace loop.
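The point about the loop can be checked directly on the request strings. This is only a sketch: the absolute path is a made-up placeholder, and loose is a deliberately under-specified pattern shown for comparison.

```javascript
// The anchored pattern from the plugin call.
const strict = /worker-loader!\.\/build\/pdf\.worker\.js$/;
// A looser pattern that only looks at the file name (for comparison).
const loose = /pdf\.worker\.js$/;

const originalRequest = "worker-loader!./build/pdf.worker.js";
// Hypothetical absolute path; yours will differ.
const replacedRequest =
  "worker-loader?esModule=false!/project/node_modules/pdfjs-dist/build/pdf.worker.js";

console.log(strict.test(originalRequest)); // true
console.log(strict.test(replacedRequest)); // false: the rewritten request is left alone
console.log(loose.test(replacedRequest)); // true: this one would be rewritten again
```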

You need to adjust the replacement string so that it is a proper path for your project, which would probably be:

"worker-loader?esModule=false!" + path.join(__dirname, "node_modules", "pdfjs-dist", "build", "pdf.worker.js")

I have a ../ in my path because this code is executed inside Storybook's .storybook/ folder, so I have to go up a directory to get to node_modules/.

And with those two changes, everything for PDF.js seems to be working.
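Put together, the two changes might look like the helper below. This is a sketch under assumptions: pdfjsWorkarounds is a hypothetical name, the webpack module is passed in as an argument so the helper has no dependency of its own, and the node: { module: "empty" } form is the one used in this answer, which newer Webpack major versions may not accept.

```javascript
const path = require("path");

// Hypothetical helper: returns a config fragment containing both mitigations.
// `webpack` is the imported webpack module; `projectRoot` is the directory
// that contains node_modules/.
function pdfjsWorkarounds(webpack, projectRoot) {
  const workerPath = path.join(
    projectRoot, "node_modules", "pdfjs-dist", "build", "pdf.worker.js"
  );
  return {
    // Mitigation 1: stub out the "module" import pulled in via NodeTargetPlugin.
    node: { module: "empty" },
    plugins: [
      // Mitigation 2: rewrite the worker request so that esModule is disabled.
      new webpack.NormalModuleReplacementPlugin(
        /worker-loader!\.\/build\/pdf\.worker\.js$/,
        "worker-loader?esModule=false!" + workerPath
      ),
    ],
  };
}

module.exports = { pdfjsWorkarounds };
```

In a webpack.config.js you would call pdfjsWorkarounds(require("webpack"), __dirname) and merge the returned node and plugins entries into your existing config.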

And lastly, if you want to silence the warnings about the missing FetchCompileWasmPlugin and FetchCompileAsyncWasmPlugin modules, you can set up Webpack's IgnorePlugin to ignore these imports. My assumption is that they're WASM-related and not actually needed:

plugins: [
  new webpack.IgnorePlugin({ resourceRegExp: /FetchCompileWasmPlugin$/ }),
  new webpack.IgnorePlugin({ resourceRegExp: /FetchCompileAsyncWasmPlugin$/ })
]

I'm guessing there is a version mismatch between worker-loader and the modules in the currently installed Webpack version, but these WASM modules don't seem to be necessary for our purposes.
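Both patterns are needed because neither one matches the other module's request. The quick check below uses the request strings from the warnings quoted in the question; the combined pattern is my own assumption about how the two could be merged, not something from the answer.

```javascript
// Request strings taken from the build warnings.
const requests = [
  "webpack/lib/web/FetchCompileWasmPlugin",
  "webpack/lib/web/FetchCompileAsyncWasmPlugin",
];

// The two patterns passed to IgnorePlugin.
const syncOnly = /FetchCompileWasmPlugin$/;
const asyncOnly = /FetchCompileAsyncWasmPlugin$/;

// Assumption: a single combined pattern could stand in for both.
const combined = /FetchCompile(Async)?WasmPlugin$/;

for (const request of requests) {
  console.log(
    request,
    syncOnly.test(request),
    asyncOnly.test(request),
    combined.test(request)
  );
}
```

If the combined pattern suits you, it could be passed as resourceRegExp to a single IgnorePlugin instance instead of two.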

Solution 3

If you're fine with using a CDN, then use this:

import pdfJS from 'pdfjs-dist/build/pdf.js';
pdfJS.GlobalWorkerOptions.workerSrc = 'https://cdnjs.cloudflare.com/ajax/libs/pdf.js/2.4.456/pdf.worker.js';

Make sure to import the minified versions in production:

import pdfJS from 'pdfjs-dist/build/pdf.min.js';
pdfJS.GlobalWorkerOptions.workerSrc = 'https://cdnjs.cloudflare.com/ajax/libs/pdf.js/2.4.456/pdf.worker.min.js';

Or you can just use the minified versions all the time.
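One way to avoid hard-coding two URLs is to build the worker URL from a version and a flag. This is a sketch: cdnWorkerSrc is a hypothetical helper, the URL layout is simply copied from the cdnjs links above, and it assumes NODE_ENV is set to "production" for production builds.

```javascript
// Hypothetical helper: build the cdnjs worker URL for a given pdf.js version.
function cdnWorkerSrc(version, minified) {
  const file = minified ? "pdf.worker.min.js" : "pdf.worker.js";
  return `https://cdnjs.cloudflare.com/ajax/libs/pdf.js/${version}/${file}`;
}

// Assumption: NODE_ENV distinguishes production from development builds.
const isProduction = process.env.NODE_ENV === "production";

console.log(cdnWorkerSrc("2.4.456", isProduction));
```

The worker version has to match the version of the pdfjs-dist library you import, otherwise pdf.js reports an API/worker version mismatch, so keep the version string in sync with your installed package.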

Solution 4

It worked with:

var pdflib = require('pdfjs-dist/build/pdf.js');
import pdfjsWorker from 'pdfjs-dist/build/pdf.worker.js';
pdflib.GlobalWorkerOptions.workerPort = new pdfjsWorker();
Author: Giampaolo Ferradini

Founder @KnowledgeJuicer, 11y in #startup space after 15y in #banking including @Deutsche Bank, @Commerzbank • @Bocconi University, @MIP Politecnico di Milano. Wannabe coder :D

Updated on June 11, 2022

Comments

  • Giampaolo Ferradini
    Giampaolo Ferradini almost 2 years

    I've spent a few hours trying to replace a link-based pdf.js with an npm install of pdfjs-dist, since I noticed that my links were not meant to be used as CDNs and could become unstable, as described here.

    I could not find much documentation on how to make that work other than a few examples, and when Webpack is involved they are mostly with React, while I am simply using ES6 in a Django project (compiling static files into the desired Django directory, without using the webpack-plugin).

    After exchanging several messages with one of the people who work on pdf.js, it seemed that my compile errors were probably due to how Webpack handles the library internally. Here's what I am seeing:

    WARNING in ./node_modules/worker-loader/dist/index.js
    Module not found: Error: Can't resolve 'webpack/lib/web/FetchCompileAsyncWasmPlugin' in '/home/giampaolo/dev/KJ_import/KJ-JS/node_modules/worker-loader/dist'
     @ ./node_modules/worker-loader/dist/index.js
     @ ./node_modules/worker-loader/dist/cjs.js
     @ ./node_modules/pdfjs-dist/webpack.js
     @ ./src/js/views/pdfViews.js
     @ ./src/js/index.js
    
    WARNING in ./node_modules/worker-loader/dist/index.js
    Module not found: Error: Can't resolve 'webpack/lib/web/FetchCompileWasmPlugin' in '/home/giampaolo/dev/KJ_import/KJ-JS/node_modules/worker-loader/dist'
     @ ./node_modules/worker-loader/dist/index.js
     @ ./node_modules/worker-loader/dist/cjs.js
     @ ./node_modules/pdfjs-dist/webpack.js
     @ ./src/js/views/pdfViews.js
     @ ./src/js/index.js
    
    ERROR in (webpack)/lib/node/NodeTargetPlugin.js
    Module not found: Error: Can't resolve 'module' in '/home/giampaolo/dev/KJ_import/KJ-JS/node_modules/webpack/lib/node'
     @ (webpack)/lib/node/NodeTargetPlugin.js 11:1-18
     @ ./node_modules/worker-loader/dist/index.js
     @ ./node_modules/worker-loader/dist/cjs.js
     @ ./node_modules/pdfjs-dist/webpack.js
     @ ./src/js/views/pdfViews.js
     @ ./src/js/index.js
    Child HtmlWebpackCompiler:
         1 asset
        Entrypoint HtmlWebpackPlugin_0 = __child-HtmlWebpackPlugin_0
        [./node_modules/html-webpack-plugin/lib/loader.js!./src/src-select.html] 4.57 KiB {HtmlWebpackPlugin_0} [built]
    Child worker-loader node_modules/pdfjs-dist/build/pdf.worker.js:
                  Asset      Size      Chunks             Chunk Names
        index.worker.js  1.33 MiB  pdf.worker  [emitted]  pdf.worker
        Entrypoint pdf.worker = index.worker.js
        [./node_modules/pdfjs-dist/build/pdf.worker.js] 1.25 MiB {pdf.worker} [built]
        [./node_modules/process/browser.js] 5.29 KiB {pdf.worker} [built]
    ℹ 「wdm」: Failed to compile.
    

    In theory, pdfjs-dist should come with a zero-configuration entry file for Webpack, without even needing to set up a worker for it, so code like the one below should work:

    import pdfjsLib from 'pdfjs-dist/webpack'
    
    ////////////////////////////////////////////
    //// instantiate pdf
    export const pdfView = () => {
      // pdfjsLib.GlobalWorkerOptions.workerSrc = 'index.worker.js';
    
      // var defined through a Django template tag
      const loadingTask = pdfjsLib.getDocument(pdfData.myPdfDoc)
    
      pdfData.myPdf = loadingTask.promise.then(pdf => {
        pdfData.pdfTotalPageN = pdf.numPages;
        return pdf;
      })
    }
    
    

    but it doesn't get compiled, and I would really appreciate some pointers. Thanks in advance

  • Giampaolo Ferradini
    Giampaolo Ferradini over 3 years
    Thanks @zoran404 but I am already using the CDN approach. As the guys at Mozilla pointed out it's a bit risky for production.
  • Giampaolo Ferradini
    Giampaolo Ferradini over 3 years
    Wow, a) thanks a lot for your detailed description @Alex and b) I hope I'll be able to put that into place.
  • Giampaolo Ferradini
    Giampaolo Ferradini over 3 years
    Thanks for letting us know @Siddikesh. Plan to test it as soon as I get a moment.
  • Vikrant
    Vikrant over 3 years
    Wow I upgraded pdfjs-dist to pre-release version and that error vanished. I have put a lot of hours trying to fix that but you saved the day. Thanks.
  • ThienSuBS
    ThienSuBS over 3 years
    Thank you for your suggestion! After upgrading pdfjs to 2.6.347 via yarn upgrade pdfjs-dist@2.6.347, with my worker-loader at 3.0.5, everything worked.
  • nnyby
    nnyby almost 3 years
    It looks like the esModule=false setting has now been added to upstream pdf.js: github.com/mozilla/pdf.js/blob/master/external/dist/…
  • Luke
    Luke about 2 years
    Thanks. Even though I'm using modern syntax (building from a module/ESM/.mjs file), I still had to use the legacy build versions in order to avoid an Unexpected token '=' error when building using pdfjs-dist v2.13.216.