How can I merge files on a line by line basis?

Solution 1

The right tool for this job is probably paste

paste -d '' file1 file2

See man paste for details.
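
On the portability question raised in the comments: POSIX defines `\0` in the `-d` list as "empty string", whereas an empty `-d ''` argument is accepted by GNU paste but may not be accepted by every implementation. A minimal sketch, assuming two input files passed as arguments (the function name `merge_paste` is made up for illustration):

```shell
# Hypothetical helper: join line N of "$1" with line N of "$2".
merge_paste() {
    # '\0' in the delimiter list means "empty string" per POSIX,
    # so the two columns are joined with nothing in between.
    paste -d '\0' "$1" "$2"
}
```

Called as `merge_paste file1 file2`, this should behave like the `paste -d ''` command above on systems where both forms are accepted.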


You could also use the pr command:

pr -TmJS"" file1 file2

where

  • -T turns off pagination
  • -mJ merge files, Joining full lines
  • -S"" separate the columns with an empty string

If you really wanted to do it using pure bash shell (not recommended), then this is what I'd suggest:

while IFS= read -u3 -r a && IFS= read -u4 -r b; do 
  printf '%s%s\n' "$a" "$b"
done 3<file1 4<file2

(Only including this because the subject came up in comments to another proposed pure-bash solution.)

Solution 2

Using awk:

awk '{getline x<"file2"; print $0 x}' file1

  • getline x<"file2" reads the next line from file2 and stores it in the variable x.
  • print $0 x prints the current line of file1 ($0) immediately followed by x, the line just read from file2.
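
The one-liner above ignores the return value of getline, so if file2 runs out of lines first, x keeps its previous contents and the last line of file2 is repeated. A slightly more defensive sketch, assuming the second file name is passed in through an awk variable (the names `merge_awk` and `other` are made up for illustration):

```shell
# Hypothetical helper: join line N of "$1" with line N of "$2" using awk.
merge_awk() {
    awk -v other="$2" '{
        # getline returns 1 on success, 0 at end of file, -1 on error;
        # clear x when nothing was read so stale data is not reused.
        if ((getline x < other) <= 0) x = ""
        print $0 x
    }' "$1"
}
```

This still only iterates over the lines of the first file, so any extra lines in the second file are dropped.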

Solution 3

paste is the way to go. If you want to see some other methods, here is a Python solution:

#!/usr/bin/env python3
import itertools
with open('/path/to/file1') as f1, open('/path/to/file2') as f2:
    # zip_longest pads the shorter file with '' so no line is dropped
    for a, b in itertools.zip_longest(f1, f2, fillvalue=''):
        # strip only the trailing newline, keeping other trailing whitespace
        print(a.rstrip('\n') + b.rstrip('\n'))

If the files have only a few lines:

#!/usr/bin/env python3
with open('/path/to/file1') as f1, open('/path/to/file2') as f2:
    # builds the whole output in memory before printing it
    print('\n'.join(a.rstrip('\n') + b.rstrip('\n') for a, b in zip(f1, f2)))

Note that if the files have different numbers of lines, this version stops at the end of the shorter file.

Solution 4

Also, with pure bash (note that this drops empty lines entirely and appends its result to a file named out):

#!/bin/bash

IFS=$'\n' GLOBIGNORE='*'
f1=($(< file1))
f2=($(< file2))
i=0
while [ "${f1[${i}]}" ] && [ "${f2[${i}]}" ]
do
    echo "${f1[${i}]}${f2[${i}]}" >> out
    ((i++))
done
while [ "${f1[${i}]}" ]
do
    echo "${f1[${i}]}" >> out
    ((i++))
done
while [ "${f2[${i}]}" ]
do
    echo "${f2[${i}]}" >> out
    ((i++))
done
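
A comment on this answer recommends mapfile over array=( $(cmd) ). A minimal sketch of that idea, assuming bash 4 or later (the function name `merge_mapfile` is made up for illustration); it keeps empty lines and never expands glob characters, and it writes to standard output instead of appending to a file:

```shell
#!/usr/bin/env bash
# Hypothetical helper: join line N of "$1" with line N of "$2".
merge_mapfile() {
    local -a f1 f2
    local i n
    # mapfile -t reads one line per array element and strips the newline;
    # no word splitting or globbing is applied to the contents.
    mapfile -t f1 < "$1"
    mapfile -t f2 < "$2"
    # Loop up to the length of the longer file.
    n=${#f1[@]}
    (( ${#f2[@]} > n )) && n=${#f2[@]}
    for (( i = 0; i < n; i++ )); do
        # A missing element expands to the empty string.
        printf '%s%s\n' "${f1[i]}" "${f2[i]}"
    done
}
```

Usage would be `merge_mapfile file1 file2 > out`. Both files are read fully into memory, so this is only sensible for reasonably small inputs.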

Solution 5

A Perl version that is easy to follow:

#!/usr/bin/perl
use strict;
use warnings;

# The two input files are given as command-line arguments.
my $filename1 = $ARGV[0];
my $filename2 = $ARGV[1];

open(my $fh1, "<", $filename1) or die "cannot open < $filename1: $!";
open(my $fh2, "<", $filename2) or die "cannot open < $filename2: $!";

my @array1;
my @array2;

while (my $line = <$fh1>) {
  chomp $line;
  push @array1, $line;
}
while (my $line = <$fh2>) {
  chomp $line;
  push @array2, $line;
}

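# Join line $i of the first file with line $i of the second.
# A single element is accessed as $array[$i], not @array[$i].
for my $i (0 .. $#array1) {
  print $array1[$i] . $array2[$i] . "\n";
}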

Run it with:

./merge file1 file2

Output:

foobar
icecream
twohundred
Author: TuxForLife

Updated on September 18, 2022

Comments

  • TuxForLife
    TuxForLife over 1 year

    cat file1

    foo
    ice
    two
    

    cat file2

    bar
    cream
    hundred
    

    Desired output:

    foobar
    icecream
    twohundred
    

    file1 and file2 will always have the same number of lines in my scenario, in case that makes things easier.

  • TuxForLife
    TuxForLife about 9 years
    Awesome, thank you for the very simple solution. Should I ever worry about portability when it comes to using paste?
  • nettux
    nettux about 9 years
    @user264974 paste is in GNU Coreutils so you're probably fairly safe.
  • geirha
    geirha about 9 years
    This is just plain wrong. It doesn't work at all. Either use mapfile to read the files into arrays, or use a while loop with two read commands, reading from each their fd.
  • kos
    kos about 9 years
    @geirha You're right, I messed up with the syntax, it's ok now.
  • geirha
    geirha about 9 years
    Not quite. With the updated code, empty lines will be ignored, and if any line contains glob characters, the line might be replaced with matching filenames. So never use array=( $(cmd) ) or array=( $var ). Use mapfile instead.
  • kos
    kos about 9 years
    @geirha You're right of course. I took care of the glob characters but left the empty lines ignored, because handling them properly would mean rewriting the whole thing into a decent solution. I've noted this in the answer and I'll leave this version up in case it's useful to somebody in the meantime. Thanks for your points so far.