How to remove punctuation with Python Pandas?

Sometimes, we want to remove punctuation with Python Pandas.

In this article, we’ll look at how to remove punctuation with Python Pandas.

How to remove punctuation with Python Pandas?

To remove punctuation with Python Pandas, we can use the str.replace method of a string column (a Series) with a regular expression.

For instance, we write:

import pandas as pd

df = pd.DataFrame({'text': ['a..b?!??', '%hgh&12', 'abc123!!!', '$$$1234']})
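# regex=True is required: since pandas 2.0 the pattern is treated as a literal string by default
df['text'] = df['text'].str.replace(r'[^\w\s]+', '', regex=True)

print(df)

We call str.replace with a regex that matches every run of characters that are neither word characters (\w) nor whitespace (\s), and replace each match with an empty string. We pass regex=True because, since pandas 2.0, str.replace treats the pattern as a literal string unless told otherwise. Note that \w also matches the underscore, so underscores are kept.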
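One detail of the pattern is worth checking: the \w class includes the underscore, so [^\w\s] leaves underscores in place. The sketch below uses made-up sample strings (not from the example above) to show the difference, and adds _ as an extra alternative in case underscores should be removed too.

```python
import pandas as pd

# Hypothetical sample data, chosen to include underscores.
s = pd.Series(['snake_case!', 'a_b-c', 'hello, world'])

# \w matches underscores, so this pattern leaves them alone.
keeps_underscore = s.str.replace(r'[^\w\s]', '', regex=True)

# Adding "_" as an alternative strips underscores as well.
drops_underscore = s.str.replace(r'[^\w\s]|_', '', regex=True)

print(keeps_underscore.tolist())
print(drops_underscore.tolist())
```

Whether underscores count as punctuation depends on the data, so this is a choice to make per column rather than a fixed rule.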

str.replace returns a new Series rather than modifying the column in place, so we assign the result back to df['text'].

Therefore, df is:

     text
0      ab
1   hgh12
2  abc123
3    1234
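As an alternative sketch that avoids regular expressions, we could build a translation table from Python's string.punctuation and pass it to str.translate. This swaps in a different technique from the one above, and it assumes only ASCII punctuation needs removing: string.punctuation does not cover Unicode punctuation such as curly quotes, and it does include the underscore.

```python
import string

import pandas as pd

df = pd.DataFrame({'text': ['a..b?!??', '%hgh&12', 'abc123!!!', '$$$1234']})

# Map every ASCII punctuation character to None, which deletes it.
table = str.maketrans('', '', string.punctuation)

# str.translate applies the table to each string in the column.
df['text'] = df['text'].str.translate(table)

print(df)
```

For this sample data the result matches the regex version, since every removed character is ASCII punctuation.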

Conclusion

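To remove punctuation with Python Pandas, we can call str.replace on a string column with a regex that matches everything except word characters and whitespace, then assign the result back to the column.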