Remove punctuation in pandas

Solution 1

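Using pandas str.replace with a regex (pass regex=True explicitly, since newer pandas versions treat the pattern as a literal string by default):

df["new_column"] = df['review'].str.replace(r'[^\w\s]', '', regex=True)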
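A small self-contained check of this pattern, as a sketch only: the sample rows below are made up for illustration. It also shows that a missing value passes through `.str.replace` as NaN instead of raising an error.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the second row is missing (NaN is a float).
df = pd.DataFrame({"review": ["These wipes are OK, but...", np.nan, "Love it!!!"]})

# [^\w\s] matches any character that is neither a word character nor whitespace.
df["new_column"] = df["review"].str.replace(r"[^\w\s]", "", regex=True)

print(df["new_column"].tolist())
```

Note that `\w` includes the underscore, so `_` is kept by this pattern.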

Solution 2

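You can build a regex from the string module's punctuation list (this needs import re and import string). Wrapping the list in re.escape keeps characters such as ], \ and ^ from being misread inside the character class:

df['review'].str.replace('[{}]'.format(re.escape(string.punctuation)), '', regex=True)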
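To see why escaping can matter here, the following sketch runs an escaped character class over a made-up string that contains a backslash, brackets, a caret and a hyphen. The sample value is hypothetical; only `re.escape`, `string.punctuation` and `Series.str.replace` are real API.

```python
import re
import string

import pandas as pd

# Hypothetical sample containing a backslash, brackets, a caret and a hyphen.
s = pd.Series([r"a\b [c] ^d-e"])

# re.escape backslash-escapes the characters that are special in a regex,
# so each punctuation character is matched literally inside [...].
pattern = "[{}]".format(re.escape(string.punctuation))
cleaned = s.str.replace(pattern, "", regex=True)

print(cleaned.iloc[0])
```

Without the escaping, the `\]` sequence inside `string.punctuation` is read as an escaped bracket, so a literal backslash in the data would likely be left in place.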

Solution 3

I solved the problem by looping over the characters in string.punctuation (this needs import string):

def remove_punctuations(text):
    for punctuation in string.punctuation:
        text = text.replace(punctuation, '')
    return text

You can call the function the same way you did, and it should work as long as every value in the column is a string:

df["new_column"] = df['review'].apply(remove_punctuations)
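The AttributeError in the question typically arises because missing values in a pandas column are stored as NaN, which is a float, so any string method called on them fails. As a sketch, assuming Python 3 (where `str.translate` takes a table built with `str.maketrans`), a guarded version could look like this. The sample data and the name `remove_punctuations_safe` are hypothetical.

```python
import string

import numpy as np
import pandas as pd

# Translation table that deletes every character in string.punctuation.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def remove_punctuations_safe(text):
    # Leave non-string values (such as NaN) untouched instead of raising.
    if not isinstance(text, str):
        return text
    return text.translate(PUNCT_TABLE)

# Hypothetical sample data with one missing review.
df = pd.DataFrame({"review": ["OK, but in my opinion...", np.nan]})
df["new_column"] = df["review"].apply(remove_punctuations_safe)

print(df["new_column"].tolist())
```

Whether to keep missing reviews as NaN or replace them with empty strings first (for example with `fillna("")`) depends on what the later sentiment analysis expects.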

Author: data_person

Updated on July 09, 2022

Comments

  • data_person
    data_person almost 2 years
    code: df['review'].head()
            index         review
    output: 0      These flannel wipes are OK, but in my opinion
    

    I want to remove punctuation from a column of the DataFrame and store the result in a new column.

    code: import string 
          def remove_punctuations(text):
              return text.translate(None,string.punctuation)
    
          df["new_column"] = df['review'].apply(remove_punctuations)
    
    Error:
      return text.translate(None,string.punctuation)
      AttributeError: 'float' object has no attribute 'translate'
    

    I am using Python 2.7. Any suggestions would be helpful.

    • Joe T. Boka
      Joe T. Boka over 7 years
      You want to have a new column with the same string values but without the punctuation? Why?
    • data_person
      data_person over 7 years
      @JoeR I am practising sentiment analysis on the data.
  • bernando_vialli
    bernando_vialli almost 5 years
    @Bob Haffner, thank you for this, but how would I preserve the spaces that previously existed?
  • Roy
    Roy about 2 years
    Hi @bob-haffner, I want to remove punctuation (only the dot, .) but only after the letters c and p. How can I do that?
  • The Singularity
    The Singularity almost 2 years
    Use import string