Removing punctuation from a text file in the Linux terminal using the 'tr' and 'awk' commands

13,381

Using tr (as Glenn Jackman has already pointed out):

cat TEXTFILE | tr -d '[:punct:]' > OUTFILE

Using awk (tested with gawk and mawk):

cat TEXTFILE | awk '{ gsub(/[[:punct:]]/, "", $0) } 1;' > OUTFILE

awk can also read the file directly, so cat is not needed:

awk '{ gsub(/[[:punct:]]/, "", $0) } 1;' TEXTFILE > OUTFILE
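
As a quick sanity check, the sketch below runs both commands on a made-up sample line. The sample text, the variable names, and the LC_ALL=C setting are assumptions for illustration, not part of the original answer; LC_ALL=C is there only so that [:punct:] means plain ASCII punctuation.

```shell
# Hypothetical sample input; any text would do.
sample='Hello, world! Is it 5 pm... (really?)'

# tr deletes every character that belongs to the punct class.
tr_out=$(printf '%s\n' "$sample" | LC_ALL=C tr -d '[:punct:]')

# awk replaces each punctuation character with the empty string,
# then the trailing 1 prints the modified line.
awk_out=$(printf '%s\n' "$sample" | LC_ALL=C awk '{ gsub(/[[:punct:]]/, "", $0) } 1;')

printf '%s\n' "$tr_out"    # Hello world Is it 5 pm really
printf '%s\n' "$awk_out"   # Hello world Is it 5 pm really
```

Both commands leave letters, digits, and whitespace untouched, so the spacing of the original line is preserved.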

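Note: TEXTFILE and OUTFILE must be different files. The shell truncates OUTFILE when it sets up the > redirection, which can happen before the input has been read, so using the same file for both usually destroys the data.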
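
If the goal is to end up with the cleaned text under the original name, one common workaround is to write to a temporary file and then move it over the original. This is a minimal sketch, assuming mktemp is available (it is not part of POSIX, though most Linux systems ship it). The scratch directory, file names, and sample text are made up for illustration:

```shell
# Work in a scratch directory so nothing real gets overwritten (illustrative setup).
workdir=$(mktemp -d)
textfile="$workdir/TEXTFILE"
printf '%s\n' 'Wait, what?!' > "$textfile"

# Write the cleaned text to a temporary file first...
tmpfile=$(mktemp "$workdir/clean.XXXXXX")

# ...and replace the original only if tr succeeded.
tr -d '[:punct:]' < "$textfile" > "$tmpfile" && mv "$tmpfile" "$textfile"

cat "$textfile"   # Wait what
```

Because the output goes to a different file, the original is still intact while tr reads it, and mv swaps in the result only afterwards.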

Author by

Admin

Updated on July 22, 2022

Comments

  • Admin
    Admin almost 2 years

    I'm currently taking a crash course in the basics of the Linux terminal and one of the tasks is to replace punctuation in a text file using 'awk' and 'tr' commands. I have tried searching around for solutions but nothing is working for me, any help?

  • kvantour
    kvantour over 4 years
    Be advised: OUTFILE can never be the same file as TEXTFILE. Have a look at Bash Pitfalls #13.
  • Andriy Makukha
    Andriy Makukha over 4 years
    @kvantour, yeah, I guess even with cat there is no guarantee that cat will finish reading before awk wants to write to it. Thanks.