Text Mining Online Reviews for Sentiment Analysis

This post introduces several basic text mining techniques, with sample implementations in the Scikit-learn library using Anaconda Python.


In data science and machine learning, extracting useful features from raw data is often difficult. Textual data presents an interesting challenge in this regard, especially given its abundance on the internet. Because of its complexity, natural language is often not directly suited to training a classifier or regression model. The following section discusses several simple ways to extract useful features from raw text. The dataset containing the raw text that will be used can be found here.
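To make "extracting features from raw text" concrete, here is a minimal sketch of one common approach, a bag-of-words count matrix, written with only the Python standard library. The choice of technique, the helper names `tokenize` and `bag_of_words`, and the two sample reviews are all assumptions made for illustration; they are not taken from the dataset or from Scikit-learn, which provides its own vectorizers for this job.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(documents):
    """Turn a list of raw strings into (vocabulary, count_matrix).

    Each row of count_matrix corresponds to one document, and each
    column to one word of the alphabetically sorted vocabulary.
    """
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    column = {word: i for i, word in enumerate(vocabulary)}

    matrix = []
    for doc in tokenized:
        row = [0] * len(vocabulary)
        for word, count in Counter(doc).items():
            row[column[word]] = count
        matrix.append(row)
    return vocabulary, matrix


# Hypothetical reviews, made up for this sketch.
reviews = ["Great product, great price", "Terrible product"]
vocabulary, matrix = bag_of_words(reviews)
print(vocabulary)
print(matrix)
```

Each review becomes a fixed-length vector of word counts, which is the kind of numeric input a classifier or regression model can be trained on. Word order is discarded, which is the main simplification this representation makes.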
