How to read contents of a csv file inside zip file using PowerShell

Solution 1

There are multiple ways of achieving this:

1. Here's an example using the Ionic.Zip DLL (from the DotNetZip package):

clear
Add-Type -Path "E:\sw\NuGet\Packages\DotNetZip.1.9.7\lib\net20\Ionic.Zip.dll"
$zip = [Ionic.Zip.ZipFile]::Read("E:\E.zip")

$file = $zip | where-object { $_.FileName -eq "XMLSchema1.xsd"}

$stream = new-object IO.MemoryStream
$file.Extract($stream)
$stream.Position = 0

$reader = New-Object IO.StreamReader($stream)
$text = $reader.ReadToEnd()
$text

$reader.Close()
$stream.Close()
$zip.Dispose()

This picks the file by name (XMLSchema1.xsd) and extracts it into a memory stream. You then read the memory stream into whatever form you need (a string in this example).

2. In PowerShell 5, you can use Expand-Archive; see: https://technet.microsoft.com/en-us/library/dn841359.aspx?f=255&MSPPError=-2147217396

It extracts the entire archive into a folder:

Expand-Archive "E:\E.zip" "e:\t"

Keep in mind that extracting the entire archive takes time, and you then have to clean up the temporary files.
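One way to handle the cleanup is to extract into a throwaway folder and delete it in a finally block. The following is only a sketch: the function name Read-CsvViaTempFolder is made up for illustration, and it assumes the archive has a .zip extension and contains CSV files with header rows.

```powershell
# Hypothetical helper (not part of the original answer): extracts a zip into a
# temporary folder, parses every CSV found there, then deletes the folder.
function Read-CsvViaTempFolder($ZipPath) {
    $tempDir = Join-Path ([IO.Path]::GetTempPath()) ([IO.Path]::GetRandomFileName())
    try {
        # Expand-Archive (PowerShell 5+) creates the destination folder if needed.
        Expand-Archive -Path $ZipPath -DestinationPath $tempDir

        # Parse each extracted CSV file into objects, one per data row.
        Get-ChildItem -Path $tempDir -Filter *.csv -Recurse |
            ForEach-Object { Import-Csv -Path $_.FullName }
    }
    finally {
        # Runs even if extraction or parsing throws, so no files are left behind.
        if (Test-Path $tempDir) { Remove-Item -Path $tempDir -Recurse -Force }
    }
}
```

With the paths used above, it would be called as: Read-CsvViaTempFolder "E:\E.zip"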

3. Another way, which extracts just one file:

$shell = new-object -com shell.application
$zip = $shell.NameSpace("E:\E.zip")
$file =  $zip.items() | Where-Object { $_.Name -eq "XMLSchema1.xsd"}
$shell.Namespace("E:\t").copyhere($file)

4. And one more way, using the built-in .NET classes:

Add-Type -assembly "system.io.compression.filesystem"
$zip = [io.compression.zipfile]::OpenRead("e:\E.zip")
$file = $zip.Entries | where-object { $_.Name -eq "XMLSchema1.xsd"}
$stream = $file.Open()

$reader = New-Object IO.StreamReader($stream)
$text = $reader.ReadToEnd()
$text

$reader.Close()
$stream.Close()
$zip.Dispose()
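Since the question is about CSV files, the same approach can feed the text into ConvertFrom-Csv to get objects instead of one long string. This is a minimal sketch rather than a definitive implementation: the function name Get-CsvRowsFromZip and its parameters are invented here, and it assumes the CSV entry has a header row.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: returns the parsed rows of one CSV entry inside a zip,
# without writing anything to disk.
function Get-CsvRowsFromZip($ZipPath, $EntryName) {
    $zip = [IO.Compression.ZipFile]::OpenRead($ZipPath)
    try {
        # Match on the entry's file name, ignoring any folder inside the zip.
        $entry = $zip.Entries | Where-Object { $_.Name -eq $EntryName } |
            Select-Object -First 1
        if (-not $entry) { throw "Entry '$EntryName' not found in '$ZipPath'." }

        $reader = New-Object IO.StreamReader($entry.Open())
        try {
            # ReadToEnd returns the whole file as one string;
            # ConvertFrom-Csv splits it into one object per data row.
            $reader.ReadToEnd() | ConvertFrom-Csv
        }
        finally { $reader.Dispose() }
    }
    finally { $zip.Dispose() }
}
```

Pass a full path for the zip file: .NET resolves relative paths against the process working directory, which can differ from the current PowerShell location.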

Solution 2

Based on Andrey's solution 4, I propose the following function:

(Keep in mind that the "ZipFile" class exists only in .NET Framework 4.5 and later.)

Add-Type -assembly "System.IO.Compression.FileSystem"

function Read-FileInZip($ZipFilePath, $FilePathInZip) {
    try {
        if (![System.IO.File]::Exists($ZipFilePath)) {
            throw "Zip file ""$ZipFilePath"" not found."
        }

        $Zip = [System.IO.Compression.ZipFile]::OpenRead($ZipFilePath)
        $ZipEntries = [array]($Zip.Entries | where-object {
                return $_.FullName -eq $FilePathInZip
            });
        if (!$ZipEntries -or $ZipEntries.Length -lt 1) {
            throw "File ""$FilePathInZip"" couldn't be found in zip ""$ZipFilePath""."
        }
        if ($ZipEntries.Length -gt 1) {
            throw "More than one file ""$FilePathInZip"" found in zip ""$ZipFilePath""."
        }

        $ZipStream = $ZipEntries[0].Open()

        $Reader = [System.IO.StreamReader]::new($ZipStream)
        return $Reader.ReadToEnd()
    }
    finally {
        if ($Reader) { $Reader.Dispose() }
        if ($Zip) { $Zip.Dispose() }
    }
}
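The function above reads a single entry by its full path. For an archive holding several CSV files, as in the question, a loop over all entries may be more convenient. The sketch below is one possible shape under stated assumptions: the name Read-AllCsvInZip is hypothetical, every CSV entry is assumed to have a header row, and entries are selected purely by their .csv extension.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: maps each CSV entry's full path inside the zip
# to an array of its parsed rows.
function Read-AllCsvInZip($ZipFilePath) {
    $result = [ordered]@{}
    $zip = [System.IO.Compression.ZipFile]::OpenRead($ZipFilePath)
    try {
        foreach ($entry in $zip.Entries) {
            # Skip folders (their Name is empty) and non-CSV files.
            if ($entry.Name -notlike '*.csv') { continue }

            $reader = New-Object System.IO.StreamReader($entry.Open())
            try {
                # @(...) keeps the value an array even for a single data row.
                $result[$entry.FullName] = @($reader.ReadToEnd() | ConvertFrom-Csv)
            }
            finally { $reader.Dispose() }
        }
    }
    finally { $zip.Dispose() }
    return $result
}
```

The returned dictionary can then be indexed by entry path, for example $all = Read-AllCsvInZip "E:\E.zip" followed by $all['filename.csv'].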

Ishan
Author by

Ishan

Data Analyst with a development background currently looking for a job.

Updated on July 25, 2022

Comments

  • Ishan over 1 year

    I have a zip file which contains several CSV files inside it. How do I read the contents of those CSV files without extracting the zip files using PowerShell?

    I have been using the Read-Archive cmdlet, which is included in the PowerShell Community Extensions (PSCX).

    This is what I have tried so far.

    $path = "$env:USERPROFILE\Downloads\"
    $fullpath = Join-Path $path filename.zip
    
    Read-Archive $fullpath | Foreach-Object {
        Get-Content $_.Name
    }
    

    But when I run the code, I get this error message: Get-Content : An object at the specified path filename.csv does not exist, or has been filtered by the -Include or -Exclude parameter.

    However, when I run Read-Archive $fullpath, it lists all the files inside the zip file.
