Bash to join columns from multiple files


For a simple solution, paste the three files together, then pick the columns you want:

paste -d' ' file1 file2 file3  |\
awk 'BEGIN { FS = " +" } NR == 1 { printf "%-10s%-7s%-7s  %-12s  %-12s\n", $1,$2,$3,$6,$7 } NR >= 2 { printf "%-10s%-7s%-7s  %s,%s,%s  %s,%s,%s\n", $1,$2,$3,$6,$7,$8,$9,$10,$11 }'

This will have to be adapted to your files and to the output format you prefer. Explanations:

1) paste -d' ' -> merge the three files line by line (side by side), using a space as the delimiter (-d).

2) pipe the result to awk (the trailing |\ continues the command on a new line, for readability here)

2.1) BEGIN { FS = " +" } - from here on, use one or more (+) spaces as the field delimiter

2.2) for the first line (NR == 1), print fields 1,2,3,6,7 ($1,$2 ...) with the following format (in double quotes):

%-10s is a string padded to a minimum width of 10 characters (filled with spaces on the right, i.e. aligned to the left).

Then twice the same with a width of 7 characters, then two spaces, a 12-character string, two spaces, and another 12-character string. A newline \n is added at the end.

(found in the NR == 1 { printf "%-10s%-7s%-7s  %-12s  %-12s\n", $1,$2,$3,$6,$7 } part)

2.3) the data: for lines two and greater (NR >= 2), print columns $1,$2,$3,$6,$7,$8,$9,$10,$11 with the format %-10s%-7s%-7s  %s,%s,%s  %s,%s,%s\n

This is similar to the above, but now e.g. columns 6,7,8 are of arbitrary length and separated by commas: %s,%s,%s
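The pipeline described above can be tried on small made-up files. The following is only a sketch: the file contents, the header names (name, score, col5, col8) and the scratch directory are assumptions chosen so that the field numbers used in the awk program line up. They are not the asker's real data. Note that printf needs a comma between the format string and its arguments, and that NR == 1 and NR >= 2 are written as patterns in front of their { ... } actions.

```shell
# Scratch directory for the hypothetical sample files.
tmp=$(mktemp -d)

# Assumed layout: file1 carries five columns, file2 and file3 carry a
# one-word header followed by three values per data line.
printf '%s\n' 'chr start end name score' 'chr1 1000 2000 geneA 0' > "$tmp/file1"
printf '%s\n' 'col5' '1 2 3' > "$tmp/file2"
printf '%s\n' 'col8' '4 5 6' > "$tmp/file3"

# Paste the files side by side, then let awk pick and format the fields.
result=$(paste -d' ' "$tmp/file1" "$tmp/file2" "$tmp/file3" |
  awk 'BEGIN { FS = " +" }
       NR == 1 { printf "%-10s%-7s%-7s  %-12s  %-12s\n", $1, $2, $3, $6, $7 }
       NR >= 2 { printf "%-10s%-7s%-7s  %s,%s,%s  %s,%s,%s\n", $1, $2, $3, $6, $7, $8, $9, $10, $11 }')

printf '%s\n' "$result"
rm -r "$tmp"
```

With these sample files the header line shows chr, start, end, col5 and col8 in padded columns, and the data line ends in the two comma-joined groups 1,2,3 and 4,5,6.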

rezafahlevi08
Author by

rezafahlevi08

Updated on September 18, 2022

Comments

  • rezafahlevi08
    rezafahlevi08 over 1 year

    On my localhost, I created a web application to get data from a website; the file contains just one character. So I wrote this:

    $.get("http://www.website.web.id/data.txt", function(client_req) { 
    alert(client_req); 
    });
    

    But it can't load the data. Why?

    • tymeJV
      tymeJV about 10 years
      I'm willing to bet there's an exception in the console
    • Tomanow
      Tomanow about 10 years
      possible duplicate of cross domain jquery get
    • Hackerman
      Hackerman about 10 years
      No 'Access-Control-Allow-Origin' header is present on the requested resource ????
    • JF it
      JF it about 10 years
      Unless it's JSONP, you're going to have trouble getting stuff cross-domain, because of the en.wikipedia.org/wiki/Same-origin_policy ..
    • rezafahlevi08
      rezafahlevi08 about 10 years
      ohh thank you. sorry for duplicate question.
    • FelixJN
      FelixJN almost 9 years
      Are the column widths fixed? Is squeezing runs of spaces into single ones an option (e.g. tr -s '[:space:]')?
    • user3668772
      user3668772 almost 9 years
      Drav Sloan, I have not written a script for this purpose. I tried to cut the columns using the command cut -d$'\t' -f5 for all the files, did the same for column 8, and joined them using the paste command and the join command.
    • user3668772
      user3668772 almost 9 years
      fiximan, the columns are tab separated.
    • Peter.O
      Peter.O almost 9 years
      You say, "column 5 and column 8 from all the files" - Do you mean, "column 5 and column 8 from the files that contain the same first 3 fields"? ... e.g. only the first sample file contains chr1 2000 3000, but your sample output shows more than 1 comma-separated value in that row's new columns ... your sample output appears to basically not match your input samples. It seems you have generalized your sample data, when it would be better to keep it simple (so that the output actually matches the input). The generalization in your description is enough.
    • user3668772
      user3668772 almost 9 years
      Peter.O, sorry if the description was not clear. In this case the corresponding row is absent in file 2, hence no value is present, so it is zero in that place. I forgot to mention that.
    • Peter Cordes
      Peter Cordes almost 9 years
      So you need pattern-matching on the begin/end markers in columns 2 and 3. That pretty much rules out any simple text-processing. Just write it in a proper programming language like awk. Do you need the output to be sorted on column 2/3, or can it take whichever it sees first when they don't match?
    • agc
      agc almost 8 years
      Both input and output data examples are vague. We need to know what kind of data is in columns #5 and #8. We also need the names of the headers of columns #5 and #8.
  • rezafahlevi08
    rezafahlevi08 about 10 years
    Thank you. I'm changing my method now.
  • rezafahlevi08
    rezafahlevi08 about 10 years
    Sorry i'm new here, not enough reputation. :)
  • user3668772
    user3668772 almost 9 years
    Dear Fiximan, thanks for the suggestion. The 3rd file I had given is the required format for output not another input file.
  • FelixJN
    FelixJN almost 9 years
    Ah, my bad. The principle remains the same, though. But since the lines do not always match, the solution will not fit all your needs.
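Since the comments point out that rows do not always line up across files, a line-by-line paste cannot pair them reliably. One possible alternative, sketched here and not taken from the answer, is to let awk key each row on its first three fields and collect columns 5 and 8 per file, writing 0 where a file has no matching row. The sample files, their names (a.tsv, b.tsv), their header names and the assumption that every file starts with one header line are all hypothetical.

```shell
# Scratch directory with two hypothetical tab-separated input files.
tmp=$(mktemp -d)
printf 'chr\tstart\tend\tc4\tc5\tc6\tc7\tc8\nchr1\t1000\t2000\t.\t11\t.\t.\t12\nchr1\t2000\t3000\t.\t13\t.\t.\t14\n' > "$tmp/a.tsv"
printf 'chr\tstart\tend\tc4\tc5\tc6\tc7\tc8\nchr1\t1000\t2000\t.\t21\t.\t.\t22\n' > "$tmp/b.tsv"

merged=$(awk -F '\t' -v OFS='\t' '
  FNR == 1 { nfiles++; next }            # a new file starts: count it, skip its header
  {
    key = $1 OFS $2 OFS $3               # rows are matched on the first three fields
    if (!(key in seen)) { seen[key] = 1; order[++nkeys] = key }
    c5[key, nfiles] = $5                 # remember column 5 for this key and file
    c8[key, nfiles] = $8                 # remember column 8 for this key and file
  }
  END {
    for (i = 1; i <= nkeys; i++) {       # keep the order of first appearance
      key = order[i]; a = b = ""
      for (f = 1; f <= nfiles; f++) {
        sep = (f > 1 ? "," : "")
        a = a sep (((key, f) in c5) ? c5[key, f] : 0)   # 0 if the row is absent
        b = b sep (((key, f) in c8) ? c8[key, f] : 0)
      }
      print key, a, b
    }
  }' "$tmp/a.tsv" "$tmp/b.tsv")

printf '%s\n' "$merged"
rm -r "$tmp"
```

Rows are printed in order of first appearance, not sorted. If sorted output is needed, the result could be piped through sort afterwards.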