Verify the data of the downloaded file (PDF/Word/Excel) using Cypress commands


You need a plugin to access libraries like pdf-parse, which run in the Node.js environment (i.e. they use Node modules like fs).

The best reference for this is "Powerful cy.task".

Here is an example of adapting this pattern to your scenario.

cypress/plugins/index.js

const fs = require('fs')
const path = require('path')
const pdf = require('pdf-parse');

const repoRoot = path.join(__dirname, '..', '..') // assumes pdf at project root

const parsePdf = async (pdfName) => {
  const pdfPathname = path.join(repoRoot, pdfName)
  const dataBuffer = fs.readFileSync(pdfPathname)
  return await pdf(dataBuffer)  // use async/await since pdf returns a promise
}

module.exports = (on, config) => {
  on('task', {
    getPdfContent (pdfName) {
      return parsePdf(pdfName)
    }
  })
}

spec.js

it('tests a pdf', () => {
  cy.task('getPdfContent', 'mypdf.pdf').then(content => {
    // test your pdf content here, with expect(this and that)...
  })
})

I haven't tested this, so you may find some wrinkles to iron out.

The pdf is expected in repoRoot, which I understand to mean the project root folder two levels above /cypress/plugins. You may need to adjust the path since downloading is involved. You have not given enough info for me to understand the full test logic, so I leave it up to you to make the adjustments.
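If the pdf comes back as an API response rather than sitting on disk already, one option is to save the response bytes to a file first and then point the task at that file. Here is a minimal sketch of the saving step in plain Node. It assumes the body arrives as a binary-encoded (latin1) string, one character per byte; saveBinaryBody and the file name are hypothetical names, not part of any library.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical helper: writes a binary-encoded string to disk byte for byte.
// Assumes `body` holds one character per byte (latin1 / 'binary' encoding).
const saveBinaryBody = (body, filePath) => {
  const bytes = Buffer.from(body, 'binary') // one char -> one byte, no UTF-8 re-encoding
  fs.writeFileSync(filePath, bytes)
  return bytes.length
}

// Demo with bytes above 0x7f, which would be altered if written as UTF-8 text
const sample = Buffer.from([0x25, 0x50, 0x44, 0x46, 0xe2, 0xe3, 0xff]).toString('binary')
const target = path.join(os.tmpdir(), 'sample-download.pdf')
const written = saveBinaryBody(sample, target)
```

A helper like this could be registered as another task next to getPdfContent. On the spec side, cy.request accepts an encoding option, so requesting with encoding: 'binary' is one way to get a body in this form, though I have not tested that combination here.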

The form that the content comes back in depends on the pdf library used. It looks like pdf-parse gives a JSON object, which should be easy to test.
After cy.task('getPdfContent') is called you can chain various cy commands such as .should() and .contains(), but I would use .then() and call expect() on the content within the callback.
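As a rough sketch of what the check inside the .then() callback could look like: the helper below reports which expected phrases are missing from the parsed text. It assumes the content object has a text property holding the extracted text (pdf-parse's result does); findMissingPhrases and the sample phrases are made up for illustration.

```javascript
// Hypothetical helper: returns the phrases that are NOT found in the parsed text.
// Assumes `content.text` holds the extracted text, as pdf-parse's result does.
const findMissingPhrases = (content, phrases) => {
  // Collapse whitespace so line breaks from the pdf layout don't break matches
  const text = content.text.replace(/\s+/g, ' ')
  return phrases.filter(phrase => !text.includes(phrase))
}

// Stand-in for the object yielded by cy.task('getPdfContent', ...)
const content = { numpages: 1, text: 'Experiment\nEntity List\n\nTotal: 3' }
const missing = findMissingPhrases(content, ['Experiment Entity List', 'Total: 3', 'Draft'])
```

In the spec, the callback could then assert expect(findMissingPhrases(content, [...])).to.be.empty, which also lists the missing phrases in the failure message.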


Author: Prashant Kankhara

Updated on June 04, 2022

Comments

  • Prashant Kankhara
    Prashant Kankhara almost 2 years

    I have a scenario where I have to verify a downloaded file's data using Cypress commands. File types: pdf, word, excel. I have the URL of the server API action that gets called, and in response it returns the pdf file. I need to implement this using Cypress commands and TypeScript (plugin and typings).

    I am able to get the download status, and response.body even contains some text, but it requires a parser to parse the response body. Below is the code that I have tried.

    const oReq = new XMLHttpRequest();
        oReq.open("GET", href as string, true);
        oReq.responseType = "arraybuffer";
        oReq.onload = () => {
            if (oReq.readyState === oReq.DONE) {
                if (oReq.status === 200) {
                    // tried parsing the response.
                    // looking for any parser which can parse the given response body into text or JSON
                }
            }
        }
    
    
    cy.request(href).then((response) => {
        expect(response.status).to.equal(200);
        expect(response.body).not.to.be.null;
        const headerValue = response.headers["content-disposition"];
    
        // expect(headerValue).to.equal("attachment; filename=ExperimentEntityList.<FileExtension-PDF | XLSX | DOCX>");
    
        // Tried a YAML parser and the "fs" module; each ends up in a different console error.
        // The YAML parser gives a console error about an unidentified character "P".
        // The fs module code is shown below.
    });     
    
    import * as fs from "fs";
    
    function GetPDFContent()
    {
        // throws a console error that the fs object doesn't have readFile, and the same for readFileSync.
        fs.readFile("url")..
        fs.readFileSync("url")..
    }
    

    Requirement:
    1) Read Content of PDF File
    2) Read Content of XLS(x) file
    3) Read content of doc(x) file.

    I had no success reading the content of PDF and doc(x) files in TypeScript for the Cypress automation script. I went through various blogs on the internet and installed pdfparser, pdfreader, a YAML parser, filereader and a couple more, but none of them worked. I used the code above to read the files; see the comment written next to each command.

    For the xlsx file I found a solution by using an XLSX parser plugin that parses the response.body, which I can then iterate to get the content. I am looking for a similar parser for PDF and doc(x) files.

    If anyone knows of one, please share it!

    NOTE: Brackets or syntax are not the problem. If any are missing in the sample code above, they were lost during copy/paste.

    EDIT:

    I have found the solution to read and verify the PDF file content using Cypress commands, thanks to Richard Matsen. @Richard: the problem is that it only works when I have the full URL of the PDF file, like http://domainname/upload/files/pdf/pdfname.pdf. Then I can read the content and verify it. But if I have a URL like "http://domainname/controller/action?pdf=someid", which returns the pdf file as the response, the node command doesn't encode it properly and the pdf file is not parsed properly.

    Small Question

    Does anyone know how to create a pdf file with node/Cypress commands from the response stream of the PDF data? I have tried the Axios, http and XMLHttpRequest plugins.

  • Prashant Kankhara
    Prashant Kankhara over 5 years
    Yes, correct. Yesterday I found and tried to implement a similar concept, but using cy.exec. Thanks a lot for the support you have given. While trying, I found that it only works if the pdf is in a root or local folder. In my case I am calling an API action which returns the pdf file as a response stream. If I try to create a file using cy.writeFile(), a blank pdf gets generated. So I am looking for commands which can request the URL (http://....) and save the file to the local (root) folder.
  • Damien Monni
    Damien Monni about 3 years
    Awesome answer! I just did not cast pdf-parse returned value as a String so I can use the pdf-parse object. Thanks !
  • soccerway
    soccerway about 3 years
    May I ask how can I get the values from the content from the above answer ? In the chrome console it is showing as Yielded: [object Promise]
  • Richard Matsen
    Richard Matsen about 3 years
    @soccerway it seems maybe you forgot to make some function async and await the results. Alternatively, if you have a Promise you can add a .then() to it to get the actual value.
  • soccerway
    soccerway about 3 years
    All good now Richard. i have removed the String in the return getPdfContent (pdfName) { return (parsePdf(pdfName)) }
  • soccerway
    soccerway about 3 years
    Now When I am running the test it is printing the whole text
  • JohnPix
    JohnPix almost 3 years
    @soccerway I tried with String and without; in both cases the promise is returned as [object Promise]. Did you change something else?
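On the [object Promise] output discussed in the comments: that string is what you get when a Promise itself is cast to a string before it resolves. A small sketch in plain Node, with a stand-in parsePdf so it runs without pdf-parse (the stand-in and readText are hypothetical names):

```javascript
// Stand-in for parsePdf: any async function returns a Promise
const parsePdf = async () => ({ text: 'hello pdf' })

// Casting the Promise itself to a string does not wait for it
const castTooEarly = String(parsePdf())

// Awaiting the Promise first yields the parsed object
const readText = async () => {
  const content = await parsePdf()
  return content.text
}
```

Here castTooEarly is the string "[object Promise]", while readText() resolves to the actual text. So if that string still shows up, it may be worth checking whether something between pdf() and the task's return value uses the Promise without awaiting it.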