Project Problem Statement
So far, you have loaded the data and made word clouds. You realised that cleanup was needed, and rightly so. In the previous microproject you did some preprocessing to break the sentences into individual words, and you now have one big consolidated list with all the words. In this microproject, you will clean up those words. Cleanup is one of the most important steps in any NLP project. The two main tasks for you in this first pre-processing part are:
- Remove the punctuation and numbers from the data
- Normalize the case (convert everything to lowercase)
Hints:
- Remove everything that is not a letter; a regular expression (regex) will make this easy
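If you get stuck, here is a minimal sketch of the regex idea from the hint, shown on a toy list rather than the project data. The sample list `words` and the function name `clean_words` are made up for illustration. The sketch also assumes English text (only the letters A-Z and a-z are kept) and drops any word that ends up empty after cleaning, which the tasks above do not explicitly ask for.

```python
import re

# Hypothetical sample; in the project this would be your consolidated word list.
words = ["Hello,", "World!", "2021", "NLP", "it's", "e-mail", "42nd"]

def clean_words(tokens):
    """Keep only letters in each token, lowercase it, and drop empty results."""
    cleaned = []
    for token in tokens:
        # Remove every character that is not an ASCII letter.
        letters_only = re.sub(r"[^A-Za-z]", "", token)
        # Tokens made only of digits or punctuation become empty; skip them.
        if letters_only:
            cleaned.append(letters_only.lower())
    return cleaned

cleaned_words = clean_words(words)
print(cleaned_words)
# ['hello', 'world', 'nlp', 'its', 'email', 'nd']
```

Note the side effects: "it's" becomes "its" and "42nd" becomes "nd". Whether that is acceptable depends on your data, so check a few cleaned words before moving on.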
We would like you to try it on your own first. You will get the project solution one week after enrollment. All the best!