Sentiment Analysis Techniques and Application-Survey and Taxonomy

Sentiment analysis, Taxonomy, Tools and Techniques, Support Vector Machine, Naïve Bayes, sentiment classification level, polarity, lexicon-based approach, framework

Authors

  • Mahmood Umar, Department of Computer Science, Faculty of Science, Sokoto State University, Sokoto, Nigeria
  • Abdul-Azeez Abdullahi Bena, Department of Computer Science, Waziri Umaru Federal Polytechnic, Birnin Kebbi, Kebbi State, Nigeria
  • Buhari Wadata, Department of Computer Science, Faculty of Science, Sokoto State University, Sokoto, Nigeria
January 4, 2021


Social media platforms, blogs, and e-commerce sites are now commonly used to express opinions on politics, movies, products, and education, and those opinions are used for election forecasting, boosting business, and improving teaching and learning. As a result, data is generated easily, producing big data that requires appropriate techniques and tools to analyse it easily, accurately, and in a timely manner. This makes sentiment analysis a research area in high demand. This study investigates on what basis (sentiment classification level) or in which area of application (data source) supervised machine learning approaches, particularly the Support Vector Machine (SVM), Naïve Bayes, and Maximum Entropy algorithms, and another technique, the lexicon-based approach, give the best results in sentiment analysis. The review of the literature reveals contradictory findings on whether SVM produces the best results when analysing student sentiment at the document level. The study also finds that sentiment analysis differs from system to system depending on polarity (the types of classes to predict: positive or negative, subjective or objective), the level of classification (sentence, phrase, or document level), and the language being processed. The research produces a taxonomy that serves as a guide for choosing techniques in sentiment analysis. The taxonomy covers the sentiment classification levels and the data preprocessing stages. It also organises sentiment analysis techniques into three (3) groups: machine learning, lexicon-based, and hybrid (combined). The machine learning techniques are sub-grouped into two (2): supervised and unsupervised. The supervised techniques are organised into two (2): classification and regression. Unsupervised machine learning techniques include clustering and association. The clustering techniques include k-means. Decision trees, a classification technique under supervised machine learning, include random forest (Akinkunmi, 2019), while rule-based classifiers consist of the confidence criterion and the support criterion. The commonly used tools are Weka, Python, and R.
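
The grouping of techniques described above can be sketched as a small nested structure. The Python sketch below is illustrative only and is not taken from the paper: the names `TAXONOMY` and `paths` are hypothetical, placing the rule-based classifiers under classification is an assumption, and SVM, Naïve Bayes, and Maximum Entropy are left out because the abstract does not say where they sit in the hierarchy.

```python
# Illustrative sketch of the taxonomy as described in the abstract.
# TAXONOMY and paths() are hypothetical names, not from the paper.
TAXONOMY = {
    "machine learning": {
        "supervised": {
            "classification": {
                "decision tree": ["random forest"],
                # Assumption: rule-based classifiers sit under classification.
                "rule-based classifiers": [
                    "confidence criterion",
                    "support criterion",
                ],
            },
            "regression": {},
        },
        "unsupervised": {
            "clustering": ["k-means"],
            "association": [],
        },
    },
    "lexicon-based": {},
    "hybrid": {},
}


def paths(node, prefix=()):
    """Yield every root-to-leaf path in the taxonomy as a tuple of names."""
    if not node:
        # An empty dict or list means the abstract names no sub-items here.
        yield prefix
    elif isinstance(node, dict):
        for name, child in node.items():
            yield from paths(child, prefix + (name,))
    else:
        for leaf in node:
            yield prefix + (leaf,)


if __name__ == "__main__":
    for path in paths(TAXONOMY):
        print(" > ".join(path))
```

Running the script prints one line per branch, which makes it easy to check the grouping against the text.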