Copying data from multiple pdf files

40,744

Jeanno's right, if you have Acrobat then using its API library to work with the file directly is much better than the workarounds. I use this every day to convert pdf files into database entries.

Your code has a few problems, but I suspect the biggest issue is the use of SendKeys "^v" to paste into Excel. You're better off selecting the cell you want then using Selection.Paste. Or even better, transfer the contents of the clipboard to a variable, then parse it out as needed on the backend before writing to your spreadsheet--but that adds a bunch of complexity and doesn't help you a lot in this case.

To use the code below, be sure to select your 'Acrobat x.x Type Library' under Tools>References.

Sub StartAdobe1()
    Dim fName       As Variant
    Dim wbTransfer  As Excel.Workbook
    Dim wsNew       As Excel.Worksheet
    Dim dOpenCol    As Double
    Dim oPDFApp     As AcroApp
    Dim oAVDoc      As AcroAVDoc
    Dim oPDDoc      As AcroPDDoc

    'Define your spreadsheet
    Set wbTransfer = Workbooks("transfer (Autosaved).xlsm")
    Set wsNew = wbTransfer.Sheets("new")
    'Find first open column
    dOpenCol = ws.Cells(1, columns.count).End(xlToleft).Column + 1

    'Instantiate Acrobat Objects
    Set oPDFApp = CreateObject("AcroExch.App")
    Set oAVDoc = CreateObject("AcroExch.AVDoc")
    Set oPDDoc = CreateObject("AcroExch.PDDoc")

For Each fName In Range("path")

    'Open the PDF file. The AcroAVDoc.Open function returns a true/false 
    'to tell you if it worked
    If oAVDoc.Open(fName.Text, "") = True Then
        Set oPDDoc = oAVDoc.GetPDDoc
    Else
        Debug.Assert False
    End If

    'Copy all using Acrobat menu
    oPDFApp.MenuItemExecute ("SelectAll")
    oPDFApp.MenuItemExecute ("Copy")

    'Paste into open column
    wbTransfer.Activate
    wsNew.Cells(1, dOpenCol).Select
    ActiveSheet.Paste

    'Select next open column
    dOpenCol = dOpenCol + 1

    oAVDoc.Close (1)    '(1)=Do not save changes
    oPDDoc.Close

Next

    'Clean up
    Set wbTransfer = Nothing
    Set wsNew = Nothing
    Set oPDFApp = Nothing
    Set oAVDoc = Nothing
    Set oPDDoc = Nothing


End Sub

Note: 1-There is also a menu item oPDFApp.MenuItemExecute ("CopyFileToClipboard") that should do the select all and copy in one step, but I have had problems with it so I stick to the two-step method above.

2-A pdf file consists of two objects, the oAVDoc and the oPDDoc. Different aspects of the file are controlled by each. In this case you might only need the oAVDoc. Try commenting out the lines dealing with oPDDoc and see if it works without them.

Share:
40,744
Sam
Author by

Sam

Updated on July 09, 2022

Comments

  • Sam
    Sam almost 2 years

    I have pdf files from which I would like to copy all the data to a column in a spreadsheet.

    Here is the code I have. All it does is open the pdf, use control-a, then control-c to copy then activates the workbook, finds an open column and pastes the data with a control-v Sendkey.

    I have a range with path names it opens and copies data from all but only pastes the last one.

    Sub StartAdobe1()
    
    Dim AdobeApp As String
    Dim AdobeFile As String
    Dim StartAdobe
    Dim fname As Variant
    Dim iRow As Integer
    Dim Filename As String
    
    For Each fname In Range("path")
    
        AdobeApp = "C:\Program Files (x86)\Adobe\Reader 10.0\Reader\AcroRd32.exe"
        StartAdobe = Shell("" & AdobeApp & " " & fname & "", 1)
    
        Application.Wait Now + TimeValue("00:00:01")
        SendKeys "^a", True
        Application.Wait Now + TimeValue("00:00:01")
        SendKeys "^c"
        Application.Wait Now + TimeValue("00:00:01")
        SendKeys ("%{F4}")
        Windows("transfer (Autosaved).xlsm").Activate
        Worksheets("new").Activate
    
        ActiveSheet.Range("A1").Select
        Selection.End(xlToRight).Offset(0, 1).Select
    
        SendKeys "^v"
        Application.Wait Now + TimeValue("00:00:2")
    
    Next fname
    
  • Sam
    Sam about 9 years
    thank you. it just gave me the same result. I'll give you an example. Right now i have the range for just 2 file paths (for simplicity while i work out the bugs). It opens the first pdf i can see that it copies all the data (control-a) then the spreadsheet is activated and the right cell is selected. However it does not paste it just goes on to open the next pdf where all its data is selected, spreadsheet activated again and cell selected again, but this time it does paste the data.
  • OpiesDad
    OpiesDad about 9 years
    It may be worth trying more files to see if it only copies the second one, the last one, skips the first one, or some other combination of files.
  • seadoggie01
    seadoggie01 about 6 years
    I know this is old, but your "Find first open column" references ws not wsNew :)