Free
Course

Data Science Mini Project: NLP on Reviews – Cleanup Pt. 3

This series of projects is your first step toward deriving insights from the gold mine of data that is text, and your first foray into NLP. In this project you will learn how to remove stop words from text data. This is part 4 of the series.


Project Features

Good for Intermediate Level
100 points for Enrolling in Project
500 points for Submitting Solution

5/5 (1) 7+ Enrolled Learners
2 Lessons

Project Problem Statement

In the last micro project you executed the first two steps of text cleaning. In the word cloud we saw earlier, built without any preprocessing, there were a lot of function words (‘the’, ‘of’, ‘and’, etc.). These are very common and don’t add much value for our purpose here. In this micro project, you’ll exclude these function words, a.k.a. ‘stop words’, from the text.

NLTK has a standard stop words list that you can use. You can also add or remove stop words based on the text data you are working with: add words that carry no useful meaning in your data, and remove words from the list that do carry meaning. You’ll learn more about this in our future micro projects.

Your task: Remove the stop words from a single review (pick any).

Submit your solution as a .py file or Jupyter notebook. Make sure to provide your insights as comments/markdown in the code.
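The idea of tailoring the stop word list to your data can be illustrated with plain set operations. This is only a sketch: `base_stop_words` is a tiny stand-in for NLTK's list, and the `extra_noise_words` and `words_to_keep` entries are hypothetical choices for a product-review dataset, not recommendations from the course.

```python
# Tiny stand-in for a standard stop word list (assumption: in the project
# you would start from NLTK's English list instead).
base_stop_words = {"the", "of", "and", "is", "not", "no"}

# Hypothetical words that appear in almost every review and add no insight.
extra_noise_words = {"product", "item"}

# Hypothetical words we want to keep because they change the meaning
# of a review (negations matter for sentiment).
words_to_keep = {"not", "no"}

# Add the noise words, then drop the words we decided to keep.
custom_stop_words = (base_stop_words | extra_noise_words) - words_to_keep

print(sorted(custom_stop_words))
```

Which words belong in either set depends on the dataset, so it is worth inspecting word frequencies before deciding.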

Hints:

  1. Use NLTK's English stop words list
  2. Remove all words that are a part of the list
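If you get stuck, the two hints above can be sketched roughly as follows. This is a minimal illustration, not the official solution: the `remove_stopwords` helper, the regex tokenizer, and the sample review are all assumptions made for the example, and `STOP_WORDS` is a small stand-in so the snippet runs without any downloads. The comment shows how NLTK's English list would typically be loaded instead.

```python
import re

# Small stand-in list so this sketch runs without downloads.
# In the project, use NLTK's list instead, for example:
#   import nltk
#   nltk.download("stopwords")
#   from nltk.corpus import stopwords
#   STOP_WORDS = set(stopwords.words("english"))
STOP_WORDS = {"the", "of", "and", "a", "is", "to", "in", "it", "this", "was"}


def remove_stopwords(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    # Simple tokenizer (assumption): runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in stop_words]


# Made-up review text, used only for illustration.
review = "The battery life of this phone is great and the camera was a pleasant surprise"
filtered = remove_stopwords(review)
print(filtered)
```

Lowercasing before the lookup matters because NLTK's list is lowercase, so ‘The’ would otherwise slip through the filter.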
We would like you to try it out on your own first. You will get the solution to the project one week after enrolling. All the best!
