How to decompress a zip file in Azure Data Factory v2


This can be achieved by setting the compression type to "ZipDeflate" in your source dataset; in the sink dataset of the Copy activity you don't need to specify any compression configuration (compression type is "None").
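For reference, a minimal sketch of what the source dataset JSON could look like. The dataset and linked-service names (ZipSourceBinary, AzureFileStorageLS), folder path, and file name here are placeholders, not values from the original post:

```json
{
    "name": "ZipSourceBinary",
    "properties": {
        "type": "Binary",
        "linkedServiceName": {
            "referenceName": "AzureFileStorageLS",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "location": {
                "type": "AzureFileStorageLocation",
                "folderPath": "input",
                "fileName": "archive.zip"
            },
            "compression": {
                "type": "ZipDeflate"
            }
        }
    }
}
```

The sink Binary dataset has the same shape, just pointing at the destination folder and without the "compression" property.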


In the Copy activity sink settings, please set the copy behavior to "Flatten Hierarchy" to unzip and write the individual files.


When the copy behavior is set to "Flatten Hierarchy", all the files inside the zipped source file are extracted and written as individual files to the destination folder specified in the sink dataset, with the files renamed to auto-generated names such as data_SomeGUID.csv.
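As a rough sketch (not the exact configuration from the screenshots), the Copy activity could look like the following, with the copy behavior set under the sink's store settings. The activity name, dataset references, and store-settings types are assumptions that depend on your actual source and sink stores:

```json
{
    "name": "UnzipCopy",
    "type": "Copy",
    "inputs": [
        { "referenceName": "ZipSourceBinary", "type": "DatasetReference" }
    ],
    "outputs": [
        { "referenceName": "UnzippedSinkBinary", "type": "DatasetReference" }
    ],
    "typeProperties": {
        "source": {
            "type": "BinarySource",
            "storeSettings": {
                "type": "AzureFileStorageReadSettings"
            }
        },
        "sink": {
            "type": "BinarySink",
            "storeSettings": {
                "type": "AzureBlobStorageWriteSettings",
                "copyBehavior": "FlattenHierarchy"
            }
        }
    }
}
```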

If you do not specify the copy behavior (it is left as "None") in the Copy activity, the ZipDeflate file(s) are still decompressed when written to a file-based sink data store, but the files are extracted to the folder `<path specified in dataset>/<folder named as source zip file>/`.
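As a side note, the Binary format also exposes a read setting on the source side that is commonly used to control whether that per-zip folder is created. The following is a hedged sketch of just the source object of the Copy activity, assuming the ZipDeflateReadSettings option is available for your connector:

```json
{
    "type": "BinarySource",
    "storeSettings": {
        "type": "AzureFileStorageReadSettings"
    },
    "formatSettings": {
        "type": "BinaryReadSettings",
        "compressionProperties": {
            "type": "ZipDeflateReadSettings",
            "preserveZipFileNameAsFolder": false
        }
    }
}
```

With "preserveZipFileNameAsFolder" set to false, the extracted files land directly in the sink folder instead of under a subfolder named after the zip file.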

Please refer to this doc to learn more about compression support in Azure Data Factory: https://docs.microsoft.com/azure/data-factory/supported-file-formats-and-compression-codecs-legacy#compression-support


Comments

  • Admin, almost 2 years ago

    I'm trying to decompress a zip file (with multiple files inside) using Azure Data Factory v2. The zip file is located in Azure File Storage. The ADF Copy task just copies the original zip file without decompressing it. Any suggestion on how to make this work?

    This is the current configuration:

    1. The zip file source was set up as a binary dataset with Compression Type = ZipDeflate.
    2. The target folder was also set up as a binary dataset, but with Compression Type = None.
    3. A pipeline with a single Copy task was created to move files from zip file to target folder.