Accuracy for text classification

Classifier algorithms have changed little in almost 10 years. I think one good reason is that algorithms such as Naive Bayes (NB) and SVM have long been able to achieve relatively high accuracy, given optimal or near-optimal parameters.

At the same time, in my experience, a good way to bump up overall text classification accuracy is data/corpus preparation, including stopword removal, POS tagging, TF-IDF weighting, etc.
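To make the preparation step concrete, here is a minimal sketch of stopword removal followed by TF-IDF weighting, using only the Python standard library. The stopword list, the sample reviews, and the function names `tokenize` and `tfidf` are all made up for illustration, and the formula (tf = count / document length, idf = log(N / df)) is just one common variant, not necessarily what any particular library uses.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stopword list for illustration; real lists are much longer.
STOPWORDS = {"the", "a", "an", "is", "was", "and", "of", "to", "in", "it"}


def tokenize(text):
    """Lowercase, keep runs of ASCII letters, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / doc_length, idf = log(N / df).
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up sample reviews, not real data.
reviews = [
    "The room was clean and the staff was friendly",
    "The room was dirty and the service was slow",
    "Friendly staff and a great location",
]
vectors = tfidf(reviews)
```

With this weighting, a word that appears in only one review (such as "clean") gets a higher weight than a word shared across reviews (such as "room"), and a word that appears in every document would get a weight of zero. The resulting vectors are the kind of input a classifier like NB or SVM would then be trained on.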

I saw a good post on text classification accuracy that echoes this:

6 Practices to enhance the performance of a Text Classification Model

Supervised machine learning

libsvm is the first supervised machine learning library I used extensively, more than 10 years ago.

It was pretty awesome back then to see 78% text classification accuracy on more than 100,000 hotel reviews that I had crawled from ctrip.com.

Now, at version 3, the libsvm guide reports accuracy as high as 96.875% on one of its worked classification examples, after feature scaling and parameter selection. See:

https://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf

https://www.csie.ntu.edu.tw/~cjlin/libsvm/