Adding new columns in a csv file

text-processing awk columns

20,764

Solution 1

This code should do it, using sed, save it as an executable file and run with the full path to the source file as the first (and only) parameter. You can save the output to a new file if you like.

#!/bin/bash
FILE=$1
# check the file exists
[[ -s $FILE ]] || { echo "Can't locate file '$FILE', aborting" >&2; exit 1; }
# get the filename without directory
NAME=$(basename "$1")
# get the first 4 characters as FIRST
FIRST=${NAME:0:4}
# get the 6th & 7th characters as SECOND
SECOND=${NAME:5:2}
# are we good to go?
read -t30 -p "About to prefix '$FIRST,$SECOND,' to all lines in $FILE - ok (y/-): "
[[ $REPLY == "y" ]] || { echo "No changes made"; exit 0; }
# do it
sed "s/^/$FIRST,$SECOND,/" "$FILE"
exit 0

file exam_02:

line1,some,stuff
line2,some,more,stuff

output:

exam,02,line1,some,stuff
exam,02,line2,some,more,stuff

Solution 2

With awk:

#!/usr/bin/awk -f
BEGIN { OFS="," };
FNR==1 {
  split(FILENAME,c,/[_.]/);
};
{ print c[1], c[2], $0 }

Or as a one-liner to run on the command-line or embed in a shell script:

awk -v OFS=',' 'FNR==1 {split(FILENAME,c,/[_.]/)}; {print c[1],c[2],$0}' *.csv

For each input file, this splits each FILENAME into an array c, using the character class [_.] as field separators. Fields 1 & 2 of array c are pre-pended to each line of the file.

Example output:

$ ./zsha.awk exam_20.csv 
exam,20,1,2,3,4
exam,20,5,6,7,8
exam,20,9,10,11,12

Input data used in the example was:

$ cat exam_20.csv 
1,2,3,4
5,6,7,8
9,10,11,12

20,764

zsha

Updated on September 18, 2022

Comments

zsha 3 months

I want to append two new columns in the beginning of a csv file using bash scripting in a way such that the first four letters of the filename are the values of the first column and the next two letters are the values of the second column. For instance, if my .csv filename is "exam_20", I want to have a new first column with all entries as "exam" and a new second column with with all entries as "20". Does anyone know how to achieve this in bash? Thanks in advance.
- terdon over 6 years
  
  Please edit your question and add an example of your input file and your desired output.
zsha over 6 years

Thanks for the explanation; I am trying to do this with a group of csv files in a folder say DIR=/home/Documents. I am using the for loop- for FILE in $DIR/*.csv; do (your answer) done but I am getting errors related to basename. Is this the correct approach?
zsha over 6 years

thanks for your solution; but the script that has to do this task is also doing other tasks also so I don't want to have an awk script!
Alessio over 6 years

i wrote it as a one-liner then converted it to a stand-alone script for readability. to reverse, delete the #! and BEGIN lines and wrap the whole thing in single-quotes as awk -v OFS=',' '.....'. All the semi-colons are in-place where needed so you can even join up all the lines to make one long line. I'll edit my answer to give an example.
gogoud over 6 years

the NAME line in my code should read: NAME=$(basename "$1") (now amended) - think it should work after this change. The basename command needs a single parameter and because of lack of quotes with a path/filename containing spaces bash was passing it as multiple parameters. Oops.