Powershell: Merge selective columns in 2 CSV file
If the files will not be too large, I would load one into a hash table using the common column value as the key, then iterate over the second file and use the key value to look up the columns to merge from the first file. You just have to watch out for eating up too much RAM if the first file is big (what is big is dependent on how much RAM you have) because its entire contents will be loaded into memory.
#Make an empty hash table for the first file
$File1Values = @{}
#Import the first file and save the rows in the hash table indexed on "KeyColumn"
Import-Csv -Path c:\file1.csv | ForEach-Object {
$File1Values.Add($_.KeyColumn, $_)
}
#Import the second file, using Select-Object to select all the values from file2,
# and adding custom properties from the first file using the name/expression
# hash tables.
Import-Csv -Path c:\file2.csv | Select-Object *,@{
Name="ABC"; Expression={$File1Values[$_.KeyColumn].ABC}
}, @{
#You can abbreviate Name/Expression
N="DEF"; E={$File1Values[$_.KeyColumn].DEF}
} | Export-Csv -Path c:\OutFile.csv
For that last section, you could also use any of these techniques The Many Ways to Create a Custom Object to create the custom objects, I chose the "Select-Object" method as you only have to rebuild the bits of the object that are coming from the first file (at the expense of a more complex syntax).
If you're on V3 and want to use the new [PsCustomObject] type accelerator, that last bit would look like this (notice how you have to manually add both file 1 and file 2 properties):
#Import the second file and make a custom object with properties from both files
Import-Csv -Path c:\file2.csv | ForEach-Object {
[PsCustomObject]@{
ABC = $File1Values[$_.KeyColumn].ABC;
DEF = $File1Values[$_.KeyColumn].DEF;
UVW = $_.UVW;
XYZ = $_.XYZ;
}
} | Export-Csv -Path c:\OutFile.csv
Sam Song
Updated on June 04, 2022Comments
-
Sam Song almost 2 years
Due to a sudden need to write a script,to combine 2 csv files with rows and columns having at least 1 common column, I am resorting to powershell. I am noob in Powershell. Can anyone advise how to read from both files, compare and combine a row with a common column, finally output to another file?
CSV File 1
Hosts ABC DEF ===== === === SVR01 10 100 SRV02 22 99
CSV File 2
Hosts UVW XYZ ===== === === SVR01 13 10.5 SRV02 19 8.9
Expected output
Hosts DEF UVW XYZ ===== === === === SVR01 100 13 10.5 SRV02 99 19 8.9
Hope to seek some guidance.
Thank you.
-
Sam Song almost 11 yearsMany thanks for the help rendered. However, I just realized that many of the headers in the CSV is more than a word, as follows: "Object ID","Volume","Aggregate","Storage Server","Used","Total","Used (%)" In this case, the key is "Object ID". I replaced the above "KeyColumn" to "Object ID", but it doesn't seem to work. I am prompted with the following: Supply values for the following parameters: Process[0]: Btw, I have to append -Header to the Import-Csv commands to, else it would throw out errors. Appreciate your help again.
-
gpduck almost 11 yearsYou need to use " around the column/property names that have spaces in them: $File1Values[$_."Object ID"]."Storage Server"
-
gpduck almost 11 yearsI just double checked using the headers you listed and I'm able to import them just fine and the script works if I put "" around any of the complex header names in the script. ForEach-Object is the only command I'm using that takes -Process as a parameter, make sure you've got the opening { on the same line as the ForEach-Object command.