Research Article Open Access

A STING Algorithm and Multi-dimensional Vectors Used for English Sentiment Classification in a Distributed System

Vo Ngoc Phu¹ and Vo Thi Ngoc Tran²
  • ¹ Nguyen Tat Thanh University, Vietnam
  • ² Vietnam National University, Vietnam

Abstract

Sentiment classification matters in everyday life, for example in political activities, commodity production and commercial activities. Finding a fast, highly accurate way to classify sentiment remains a challenge for researchers. In this research, we propose a new model for Big Data sentiment classification in a parallel network environment: a Cloudera system with Hadoop Map (M) and Hadoop Reduce (R). The model uses the Statistical Information Grid algorithm (STING) with multi-dimensional vectors and the 2,000,000 English documents of our English training data set to perform English document-level sentiment classification. It can classify the sentiment of millions of English documents, based on many English documents, in the parallel network environment. We tested the model on our testing data set (1,000,000 English reviews: 500,000 positive and 500,000 negative) and achieved 83.92% accuracy.

American Journal of Engineering and Applied Sciences
Volume 11 No. 1, 2018, 19-37

DOI: https://doi.org/10.3844/ajeassp.2018.19.37

Submitted On: 2 November 2017
Published On: 22 December 2017

How to Cite: Phu, V. N. & Ngoc Tran, V. T. (2018). A STING Algorithm and Multi-dimensional Vectors Used for English Sentiment Classification in a Distributed System. American Journal of Engineering and Applied Sciences, 11(1), 19-37. https://doi.org/10.3844/ajeassp.2018.19.37

Keywords

  • Sentiment Classification
  • English Sentiment Classification
  • Opinion Mining
  • English Document Opinion Mining
  • Statistical Information Grid
  • STING
  • Distributed System
  • Parallel System