Merge CSV files by first column

text-processing awk csv columns join

5,063

You can do this entirely with POSIX-specified features of join.

join -t, csv[12] | join -t, - csv3

Using your csv1, csv2 and csv3 files as posted, that gives:

$ join -t, csv[12] | join -t, - csv3
2,qwe,rty,2014-04-03,j,k,2014-04-01,a,s,d,f,g,2014-04-01
3,zxc,cvb,2014-04-05,a,s,2014-04-04,d,f,g,h,j,2014-04-06
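
One caveat worth spelling out: `join` expects each input to be sorted on the join field. The files in the question already are, but for files that might not be, a minimal sketch along the following lines should work. The scratch directory, the `.sorted` suffix and the `merged` variable are illustrative choices, not part of the original answer.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmp=$(mktemp -d)

# Recreate the three sample files from the question.
printf '%s\n' '1,aaaa,bbb,2014-04-01' '2,qwe,rty,2014-04-03' '3,zxc,cvb,2014-04-05' > "$tmp/csv1"
printf '%s\n' '2,j,k,2014-04-01' '3,a,s,2014-04-04' '5,g,h,2014-04-08' > "$tmp/csv2"
printf '%s\n' '2,a,s,d,f,g,2014-04-01' '3,d,f,g,h,j,2014-04-06' '4,c,v,b,n,m,2014-04-09' > "$tmp/csv3"

# join expects sorted input, so sort every file on its first comma-separated field.
for f in csv1 csv2 csv3; do
    sort -t, -k1,1 "$tmp/$f" > "$tmp/$f.sorted"
done

# Inner-join csv1 with csv2, then join that result with csv3.
merged=$(join -t, "$tmp/csv1.sorted" "$tmp/csv2.sorted" | join -t, - "$tmp/csv3.sorted")
printf '%s\n' "$merged"

# Clean up the scratch directory.
rm -rf "$tmp"
```

Only keys present in all three files (2 and 3 here) survive, because `join` performs an inner join by default.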

user3696932

Updated on September 18, 2022

Comments

user3696932 over 1 year
I have 3 csv files like this.

csv 1:
```
1,aaaa,bbb,2014-04-01
2,qwe,rty,2014-04-03
3,zxc,cvb,2014-04-05
```
csv 2:
```
2,j,k,2014-04-01
3,a,s,2014-04-04
5,g,h,2014-04-08
```
csv 3:
```
2,a,s,d,f,g,2014-04-01
3,d,f,g,h,j,2014-04-06
4,c,v,b,n,m,2014-04-09
```
How can I merge all by the first column?
```
SELECT * FROM csv1
JOIN csv2 ON csv1[0] = csv2[0] -- [0] is the position of the first column
JOIN csv3 ON csv1[0] = csv3[0]
```
The output should be:
```
 csv1 fields     | csv2 fields |  csv4 fields

 2,qwe,rty,2014-04-03,a,s,2014-04-04,a,s,d,f,g,2014-04-01
 3,zxc,cvb,2014-04-05,g,h,2014-04-08,d,f,g,h,j,2014-04-06  
```
- Admin almost 10 years
  
  Your desired output appears to mix up the values, e.g. the line for key 3 has g,h from the line for key 5 of csv2. Is that what you intended? And what is csv4?
- Admin almost 10 years
  
  Just found a SQL engine over CSV files: github.com/harelba/q
- Admin almost 10 years
  
  Please don't ask multiple questions in a single post. Post a separate question for each issue instead. Since I see that you have posted your 2nd question separately, I am deleting it from here.
polym almost 10 years

user3696932 posted an answer instead of a comment to respond to your answer. Please take a look at the edited question, which now contains that comment, if you haven't already done so.