Summing up an array inside of awk?


Solution 1

Setting up $temp

First be sure that you've set up the $temp variable properly:

$ temp="abc,1,2,3,4,5,6"
$ echo "$temp"
abc,1,2,3,4,5,6

Simple example

I used the following approach to do it:

$ echo "$temp" | tr ',' '\n' | grep -v abc | awk '{sum+=$1};END{print sum}'
21
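As a possible variation, the same total can be computed in awk alone, without `tr` and `grep`. This is a minimal sketch that assumes the first comma-separated field is always the name and every remaining field is numeric; the variable `total` is introduced here only for illustration.

```shell
temp="abc,1,2,3,4,5,6"

# -F',' splits the line on commas. Field 1 is the name, so summing starts at field 2.
total=$(echo "$temp" | awk -F',' '{ sum = 0; for (i = 2; i <= NF; i++) sum += $i; print sum }')
echo "$total"
```

Because this skips the name by position rather than by matching `abc`, it should keep working when the name changes.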

Your example

Regarding your approach: you accumulated the sums but never printed them. Add an END{...} block to do so:

$ echo "$temp" | awk '{split($0,a,","); name=a[1]
      for(i=2;i<=4;i++) sum1+=a[i] ; for(i=5;i<=7;i++) sum2+=a[i] }
      END{print sum1; print sum2}'
6
15

Saving for later

Awk has no way to inject results back into the parent shell that called it, so you'll have to get a bit crafty and capture its output in a Bash array.

Example

$ myarr=($(echo "$temp" | awk '{split($0,a,","); name=a[1]
      for(i=2;i<=4;i++) sum1+=a[i] ; for(i=5;i<=7;i++) sum2+=a[i] }
      END{ print sum1; print sum2}'))

The above is doing this:

$ myarr=($(...awk command...))

This will result in your values from sum1 and sum2 being saved into array $myarr.

Accessing the array $myarr

They're accessible like so:

$ echo "${myarr[@]}"
6 15

$ echo "${myarr[0]}"
6

$ echo "${myarr[1]}"
15
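If you would rather have two named variables than an array, one alternative is to let `read` split awk's output. This is a sketch, not part of the original answer: it assumes awk prints both sums on one line separated by a space, and it reuses the names `sum1` and `sum2` on the shell side purely for readability.

```shell
temp="abc,1,2,3,4,5,6"

# The here-document feeds awk's single output line to read, which runs in the
# current shell, so sum1 and sum2 are still set after the command finishes.
read -r sum1 sum2 <<EOF
$(echo "$temp" | awk -F',' '{ print $2+$3+$4, $5+$6+$7 }')
EOF

echo "sum1=$sum1 sum2=$sum2"
```

The shell variables here are still separate from anything named `sum1` inside awk; they only receive the text that awk printed.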

Solution 2

Try this:

$ awk -F',' 'BEGIN{OFS="\t";print "Name","Sum1","Sum2"}
                  {print $1,$2+$3+$4,$5+$6+$7}' sample.csv 
Name        Sum1 Sum2
abc         6    15
de          15   14
xyz         14   17

There is no need for your bash loop; you can do everything in awk. The -F option lets you define the input field separator, in this case a comma (,), so you don't need to split the line explicitly. Since awk reads files line by line, you also don't need to read the file in bash.

The BEGIN{} block is executed before reading the first line and just prints the header and sets the output separator (OFS) to a tab. Since the fields are already separated, all you need to do is sum up fields 2-4 and 5-7 and print them for each line.
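If the lines had more than six numbers, writing out every field by hand would get tedious. The following is a rough sketch of a more general form that sums consecutive groups of `n` fields with a loop. The variable `n`, the temporary file, and the `result` variable are assumptions made for this illustration; the data is the sample input from the question.

```shell
# Recreate the sample input from the question in a temporary file.
csv=$(mktemp)
printf '%s\n' 'abc,1,2,3,4,5,6' 'de,3,5,7,8,4,2' 'xyz,6,5,3,7,8,2' > "$csv"

# n is the group size. The outer loop steps through the fields n at a time,
# and the inner loop sums one group. A short final group is summed as-is.
result=$(awk -F',' -v n=3 'BEGIN { OFS = "\t" }
{
    line = $1
    for (i = 2; i <= NF; i += n) {
        sum = 0
        for (j = i; j < i + n && j <= NF; j++) sum += $j
        line = line OFS sum
    }
    print line
}' "$csv")

printf '%s\n' "$result"
rm -f "$csv"
```

With `n=3` this should print the same three data rows as the command above, without the header line.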

Solution 3

Bash

#!/usr/bin/env bash
printf "%-5s\t%s\t%s\n" Name Sum1 Sum2
while IFS=, read -a Arr
do
        (( Grp1 = Arr[1] + Arr[2] + Arr[3] ))
        (( Grp2 = Arr[4] + Arr[5] + Arr[6] ))

        printf "%-5s\t%d\t%d\n" "${Arr[0]}" "$Grp1" "$Grp2"

done < input.txt

Output

root@ubuntu:~# bash  parse.sh
Name    Sum1    Sum2
abc     6       15
de      15      14
xyz     14      17

Thanks to @1_CR for the arithmetic evaluation trick for array elements.


Author: Bob Ramsey

Updated on September 18, 2022

Comments

  • Bob Ramsey
    Bob Ramsey over 1 year

    I have the following piece of code:

    sum1=
    sum2=    
    declare -a a
    echo $temp | awk '{split($0,a,","); name=a[1] ; for(i=2;i<=4;i++) sum1+=a[i] ; for(i=5;i<=7;i++) sum2+=a[i] }'
    

    This code is not working. Here temp is a string of type:

    abc,1,2,3,4,5,6
    

    Actually I am parsing data from a file. The input file is like:

    abc,1,2,3,4,5,6
    de,3,5,7,8,4,2
    xyz,6,5,3,7,8,2
    

    I am reading it using

    while  read temp
    do
     #do something
    done < sample.csv
    

    And expected output is of the form:

    Name   Sum1  Sum2
    abc      6    15
    de      15    14
    xyz     14    17 
    
    • Admin
      Admin over 10 years
      In general, it is a good idea to explicitly state what you are trying to do. Given that this is a question and the code does not do what you want it to do, it may be hard for us to understand what your objective is. For example, right now, I have no idea what you're attempting. Your awk is not printing anything, are you trying to modify the bash variable as 1_CR is asking?
    • Admin
      Admin over 10 years
      I just want to make a table and print the first field as name and the sum of three consecutive elements in group like (1,2,3) and (4,5,6) .. . I want to store sum1 and sum2 and print it later in a tabular format. Actually I am parsing a file which contains multiple such lines. I am reading one line at a time parsing it using awk .
    • Admin
      Admin over 10 years
      In that case, please post a sample of the input file and the corresponding desired output. Do you want to modify and store the sums in awk or in bash? This sounds like an XY problem, it would be easier to help you if you explained what you are trying to achieve.
    • Admin
      Admin over 10 years
      Your example code shows a file with 7 fields - a header and 6 numbers. With just 6 digits to sum, the easiest solution is to manually reference and add them. If you're looking for a more general solution - either having a variable length line or one that's got dozens or hundreds of digits to sum, your answer changes a bit and it might be worth noting in the question you want a more robust solution.
  • Bob Ramsey
    Bob Ramsey over 10 years
    And abc is not fixed It was just an example
  • slm
    slm over 10 years
    @user2179293 - by all means, sure you can save them for later.
  • slm
    slm over 10 years
    @user2179293 - yeah modify as really needed, was just showing you one way.
  • Bob Ramsey
    Bob Ramsey over 10 years
    But it is not working I mean do I need a END
  • terdon
    terdon over 10 years
    @user2179293 how is it not working? How does the result you get differ from what you want? The code you've posted correctly sums your arguments. What is wrong with it?
  • Bob Ramsey
    Bob Ramsey over 10 years
    It is always printing 0 as sum1 and sum2 which is initial value I had assigned
  • Bob Ramsey
    Bob Ramsey over 10 years
    Yes your suggestion is working but I still need to know why my code is not working
  • terdon
    terdon over 10 years
    @user2179293 that's because you are only modifying the values within awk and bash has no knowledge of them. See my answer for another way of doing it in pure awk.
  • terdon
    terdon over 10 years
    @user2179293 it is not working because bash and awk do not share variables. Your sum1 in bash is completely independent of the sum1 in awk.
  • slm
    slm over 10 years
    @user2179293 - see updates.
  • slm
    slm over 10 years
    @user2179293 - I think you're getting confused by the fact that the variables sum1 and sum2 are not showing up in your shell. The command awk is a separate program that runs which cannot create environment variables within the Bash shell from where you ran it. This is just how Unix works. Child processes can't change their parent's environment. They can only report things back to it.
  • Mathias Begert
    Mathias Begert over 10 years
    You could probably make this more readable by using arithmetic evaluation rather than arithmetic expansion. So ((Grp1 = Arr[1] + Arr[2] + Arr[3])) rather than Grp1=$(( ${Arr[1]} + ${Arr[2]} + ${Arr[3]} ))
  • Rahul Patil
    Rahul Patil over 10 years
    hmm Good to know that.. I had faced issue when I was trying Grp=(( Arr[1] + Arr[2] + Arr[3] )) but now I learn thanks
  • Rahul Patil
    Rahul Patil over 10 years
    @1_CR I have updated that.. now seems to me more readable.. :)
  • Mathias Begert
    Mathias Begert over 10 years
    While you're at it, you can skip the Arr=( ${line//,/ } ) step by using while IFS=, read -a Arr instead of while read line