How to remove an apostrophe ( ' ) from couple of columns of a .CSV file?

text-processing sed awk csv

17,878

Solution 1

You can do it with awk, The idea is to run a substitute command on columns 3 and 4 to replace the single quote with a blank. Here \047 represents the octal code for '.

 awk -F, -v OFS=, '{sub(/\047/, "", $3); sub(/\047/, "", $4); print}' file.txt

Solution 2

Use the other quote quaracter

sed "s/'//g" file

or end quoting, escape the apostrophe and start quoting again

sed 's/'\''//g' file

in this case there are no quotes necessary except for the apostrophe that hasa special meaning for the shell

sed s/\'//g file

this can also be done using the double quotes

sed s/"'"//g file

I realized after I read the comments to this post that I did not solve the original question but corrected a command postet in a comment by the OP.

17,878

Dhruuv

Updated on September 18, 2022

Comments

Dhruuv over 1 year
I have a .CSV file with 7 fields, and the 3rd and 4th columns of the file has a number starting with an apostrophe ( ' ). Please see the example below.
```
col0,col1,col2,col3,col4,col5,col6,
1value0,1value1,'8972991766941,'8972991766941,1value4,1value5,1value6,
2value0,2value1,'8912988876583,'8912988876583,2value4,2value5,2value6,
3value,3value1,'8912981226981,'8912981226981,3value4,3value5,3value6,
2value0,4value1,'8912971783681,'8912971783681,4value4,4value5,4value6,
```
How do I get rid of the apostrophes in the 3rd and 4th columns only using either sed or awk?
- Dhruuv over 10 years
  
  I tried to use the below command but not working... sed 's/'//g' file and sed 's/,'//g' file
- Kevin over 10 years
  
  tr -d \', note this will delete all single quotes in all columns.
- Dhruuv over 10 years
  
  @Kevin Even I thought of using tr however, as I cannot pass a specific column or field position in that command, I backed off...
- Alessio over 10 years
  
  the final step is to buy a sledge-hammer and smash the keyboard of whoever gave you that broken non-CSV file so they can never do it again. say "next time, it's your hands" while doing it.
Dhruuv over 10 years

Thank you for the different answers. I tried to use the first code, and it is throwing me an error as below sed: -e expression #1, char 5: unterminated s command second code worked fine, but the code needs to be modified into sed 's/'\''//g' file Third one also worked fine, but the code needs to be modified into sed 'sed s/\'//g file
Dhruuv over 10 years

I am just a beginner in sed and awk Could you please explain the substitution part in your code above? Confused with the back slash and forward slash used. Thanks in advance.
Mathias Begert over 10 years

@Dhruuv, if it's any easier and if your requirement translates to stripping all single quotes from the file regardless of their position, I have added a tr alternative. Note the 047 represents the octal value for '
Dhruuv over 10 years

tr would make it more complex as I do not want the apostrophes in other columns to be replaced.
Mathias Begert over 10 years

@Dhruuv, replaced the original awk code with a readabl version. Here \047 represents the octal code for '. So substitute ' with a blank in fields 3 and 4 and then print the line
Dhruuv over 10 years

@miracle173, @Kevin: The code works, but it is replacing all the apostrophes in the file... Is there a way, I can replace the apostrophes only in the 3rd and 4th fields of the file using sed? Using awk it can be done as the solution stated by @1_CR...
Kevin over 10 years

@Dhruuv Yes, it can be done in sed, but it's not pretty. You're better off with awk, it was designed for delimited text files like this. In case you really want to see, sed 's/\([^,]*,[^,]*,\)'\''\([^,]*,\)'\''/\1\2/' test.in or sed -E "s/(([^,]*,){2})'([^,]*,)'/\1\2/" test.in (Depending on your sed version, you may need to change -E to -r).
Dhruuv over 10 years

@Kevin: Thanks Kevin, very helpful... I just wanted to see if we can achieve this in sed just for the columns 3 and 4... I will go ahead with awk then... :)