Powershell: Merge selective columns in 2 CSV file

12,952

If the files will not be too large, I would load one into a hash table using the common column value as the key, then iterate over the second file and use the key value to look up the columns to merge from the first file. You just have to watch out for eating up too much RAM if the first file is big (what is big is dependent on how much RAM you have) because its entire contents will be loaded into memory.

#Make an empty hash table for the first file

$File1Values = @{}


#Import the first file and save the rows in the hash table indexed on "KeyColumn"

Import-Csv -Path c:\file1.csv | ForEach-Object {
  $File1Values.Add($_.KeyColumn, $_)
}


#Import the second file, using Select-Object to select all the values from file2,
#  and adding custom properties from the first file using the name/expression
#  hash tables.

Import-Csv -Path c:\file2.csv | Select-Object *,@{
  Name="ABC"; Expression={$File1Values[$_.KeyColumn].ABC}
}, @{
  #You can abbreviate Name/Expression
  N="DEF"; E={$File1Values[$_.KeyColumn].DEF}
} | Export-Csv -Path c:\OutFile.csv

For that last section, you could also use any of these techniques The Many Ways to Create a Custom Object to create the custom objects, I chose the "Select-Object" method as you only have to rebuild the bits of the object that are coming from the first file (at the expense of a more complex syntax).

If you're on V3 and want to use the new [PsCustomObject] type accelerator, that last bit would look like this (notice how you have to manually add both file 1 and file 2 properties):

#Import the second file and make a custom object with properties from both files

Import-Csv -Path c:\file2.csv | ForEach-Object {
  [PsCustomObject]@{
    ABC = $File1Values[$_.KeyColumn].ABC;
    DEF = $File1Values[$_.KeyColumn].DEF;
    UVW = $_.UVW;
    XYZ = $_.XYZ;
  }
} | Export-Csv -Path c:\OutFile.csv
Share:
12,952
Sam Song
Author by

Sam Song

Updated on June 04, 2022

Comments

  • Sam Song
    Sam Song almost 2 years

    Due to a sudden need to write a script,to combine 2 csv files with rows and columns having at least 1 common column, I am resorting to powershell. I am noob in Powershell. Can anyone advise how to read from both files, compare and combine a row with a common column, finally output to another file?

    CSV File 1

    Hosts  ABC  DEF
    =====  ===  ===
    SVR01   10  100
    SRV02   22   99
    

    CSV File 2

    Hosts  UVW   XYZ
    =====  ===   ===
    SVR01   13  10.5
    SRV02   19   8.9
    

    Expected output

    Hosts  DEF  UVW   XYZ
    =====  ===  ===   ===
    SVR01  100   13  10.5
    SRV02   99   19   8.9
    

    Hope to seek some guidance.

    Thank you.

  • Sam Song
    Sam Song almost 11 years
    Many thanks for the help rendered. However, I just realized that many of the headers in the CSV is more than a word, as follows: "Object ID","Volume","Aggregate","Storage Server","Used","Total","Used (%)" In this case, the key is "Object ID". I replaced the above "KeyColumn" to "Object ID", but it doesn't seem to work. I am prompted with the following: Supply values for the following parameters: Process[0]: Btw, I have to append -Header to the Import-Csv commands to, else it would throw out errors. Appreciate your help again.
  • gpduck
    gpduck almost 11 years
    You need to use " around the column/property names that have spaces in them: $File1Values[$_."Object ID"]."Storage Server"
  • gpduck
    gpduck almost 11 years
    I just double checked using the headers you listed and I'm able to import them just fine and the script works if I put "" around any of the complex header names in the script. ForEach-Object is the only command I'm using that takes -Process as a parameter, make sure you've got the opening { on the same line as the ForEach-Object command.