bash: Put separate columns into arrays


This is a simple script that shows how to use readarray. I kept it as close as I could to the one you posted.

#!/bin/bash

# Split data.txt into one temporary file per column
awk '{ print $1 }' data.txt > file_column1.txt
awk '{ print $2 }' data.txt > file_column2.txt
awk '{ print $3 }' data.txt > file_column3.txt
# NLines=$(wc -l data.txt | awk '{print $1}')

# Read each column file into its own array, one element per line
readarray -t column1 < file_column1.txt
readarray -t column2 < file_column2.txt
readarray -t column3 < file_column3.txt

i=0
for item in "${column1[@]}"; do
   echo "output is ${column1[$i]} bla ${column2[$i]} bla ${column3[$i]}"
   i=$((i+1))
done

# rm -f file_column1.txt file_column2.txt file_column3.txt

Comments:

  • With awk you can print the column you want ($1 for the first, $2 for the second, and so on). The script creates a separate file for each column.
  • If uncommented, the NLines line (wc -l piped to awk '{print $1}') stores the number of lines in the file, which is also the size of each array that readarray creates afterwards. It could be used to write the loop in a different way.
  • readarray reads a single file and stores its lines in a 1D array.
  • The for loop iterates over the elements of the 1D array column1. Any of the three arrays would do, because in your example they all have the same size. The loop could also be written using NLines.
  • The variable item is not used inside the loop; it always holds the same value as column1[i].
  • You access the array element you want directly by its index. (The first index is 0 and the last is NLines-1.)
  • You increase the value of i at each iteration of the for loop.
  • If needed, uncomment the last line to delete the temporary files created by the script.
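
The NLines variant mentioned in the comments could look roughly like the sketch below. It is only an illustration under a few assumptions: the sample rows come from the question without the header line, everything is written to a scratch directory (the name tmp is hypothetical) so the snippet runs on its own, and the counter is driven by a C-style for loop instead of iterating over column1.

```shell
#!/bin/bash
# Assumption: sample rows from the question, without the header line,
# written to a scratch directory so the sketch is self-contained.
tmp=$(mktemp -d)
printf '%s\n' '444 999 000' '555 888 xxx' '666 777 xxx' > "$tmp/data.txt"

# One temporary file per column, as in the script above
for n in 1 2 3; do
    awk -v col="$n" '{ print $col }' "$tmp/data.txt" > "$tmp/file_column$n.txt"
done

readarray -t column1 < "$tmp/file_column1.txt"
readarray -t column2 < "$tmp/file_column2.txt"
readarray -t column3 < "$tmp/file_column3.txt"

# Number of lines in the file, which is also the number of elements per array
NLines=$(wc -l < "$tmp/data.txt")

# Indices run from 0 to NLines-1
for (( i = 0; i < NLines; i++ )); do
    echo "output is ${column1[$i]} bla ${column2[$i]} bla ${column3[$i]}"
done

rm -rf "$tmp"
```

Driving the loop with NLines makes it explicit that all three arrays are expected to have the same length; the unused item variable disappears as a side effect.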

The output is:

 output is 444 bla 999 bla 000
 output is 555 bla 888 bla xxx
 output is 666 bla 777 bla xxx
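
If the intermediate files are not needed, the same three lines can probably be produced without them. The following is a minimal sketch, not a drop-in replacement: it assumes the sample rows from the question without the header line, uses a temporary input file (the name data is hypothetical) to stay self-contained, and relies on bash process substitution so that readarray reads awk's output directly.

```shell
#!/bin/bash
# Assumption: sample rows from the question, without the header line.
data=$(mktemp)
printf '%s\n' '444 999 000' '555 888 xxx' '666 777 xxx' > "$data"

# <(...) exposes awk's output as a readable path, so no column files are needed
readarray -t column1 < <(awk '{ print $1 }' "$data")
readarray -t column2 < <(awk '{ print $2 }' "$data")
readarray -t column3 < <(awk '{ print $3 }' "$data")

# "${!column1[@]}" expands to the indices of column1 (0 1 2 here)
for i in "${!column1[@]}"; do
    echo "output is ${column1[$i]} bla ${column2[$i]} bla ${column3[$i]}"
done

rm -f "$data"
```

Looping over the indices rather than the values also removes the need for a separate counter.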

Last comment:
If you nest three loops (one inside the other), you get every combination of the three indices: not 3 lines but 3*3*3 = 27 lines.

 0 0 0  
 0 0 1   
 0 0 2   
 0 1 0  
 ...
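
To see the effect of the nesting, the sketch below counts the lines that three nested loops print. The arrays are filled by hand with the sample values from the question (an assumption made to keep the snippet self-contained), and the count variable is added only for illustration.

```shell
#!/bin/bash
# Assumption: three-element arrays holding the sample columns from the question.
column1=(444 555 666)
column2=(999 888 777)
column3=(000 xxx xxx)

count=0
for i in "${column1[@]}"; do
    for j in "${column2[@]}"; do
        for k in "${column3[@]}"; do
            # Every element of column1 is paired with every element
            # of column2 and column3
            echo "output is $i bla $j bla $k"
            count=$((count + 1))
        done
    done
done

echo "lines printed: $count"
```

With three elements per array the inner echo runs 3*3*3 times, which is why a single loop over a shared index is needed to keep the rows aligned.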
Author by

Pheeb

Updated on September 18, 2022

Comments

  • Pheeb over 1 year

    Is it possible to put separate columns into arrays, reading the file by column rather than by line? I need to access the lines sequentially. I have a single file, but in this example the columns are split into separate files and used separately.

    example file:

    column1  column2  column3
      444      999      000                 
      555      888      xxx 
      666      777      xxx
    

    output file:

    output is 444  bla  999  bla  000                   
    output is 555  bla  888  bla  xxx   
    output is 666  bla  777  bla  xxx 
    

    What I tried is the following bash:

    readarray -t column1 <firstcolumn.txt
    for i in "${column1[@]}";  do
        readarray -t  column2 <secondcolumn.txt
        for j in "${column2[@]}"; do
            readarray -t column3 <thirdcolumn.txt
            for k in "${column3[@]}";  do
                echo "output is $i bla $j bla $k"
            done
        done
     done