TechRepublic: The most in-demand technologies for IT professionals

https://www.techrepublic.com/article/the-most-in-demand-technologies-for-it-professionals/

Accuracy for text classification

Classifier algorithms have been more or less unchanged for almost 10 years. I think a good reason is that algorithms such as NB (Naive Bayes) and SVM have long been able to achieve relatively high accuracy, given optimal or near-optimal parameters.

At the same time, in my experience, a good way to bump up overall text classification accuracy is data/corpus preparation, including stop word removal, POS tagging, TF-IDF weighting, etc.
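As a minimal sketch of that kind of corpus preparation, the snippet below drops stop words and computes TF-IDF weights using only the Python standard library. The stop word list and the sample reviews are made up for illustration; in practice a library such as scikit-learn (for example its TfidfVectorizer) would do this, with a slightly different weighting formula.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "is", "was", "and", "of", "to", "in"}

def tokenize(text):
    """Lowercase, split on non-letters, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def tf_idf(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Made-up sample reviews.
docs = [
    "The room was clean and the staff was friendly",
    "The room was dirty and the staff was rude",
    "Breakfast was great",
]
weights = tf_idf(docs)
```

Terms that appear in fewer documents ("clean") end up with a higher weight than terms shared across documents ("room"), which is what helps the classifier focus on the discriminating words.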

I saw a good post on text classification accuracy that echoes this:

6 Practices to enhance the performance of a Text Classification Model

Supervised machine learning

libsvm is the first supervised machine learning library I used extensively, more than 10 years ago.

It was pretty awesome back then to see 78% text classification accuracy on more than 100,000 hotel reviews that I had crawled from ctrip.com.

Now, at version 3, their guide reports an accuracy of 96.875% on one of its example datasets:

https://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf

https://www.csie.ntu.edu.tw/~cjlin/libsvm/
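For anyone trying libsvm on text: its tools read training data in a sparse text format, one example per line, written as a label followed by index:value pairs in ascending index order. The sketch below converts bag-of-words counts into that format. The helper names and the sample tokens are hypothetical, and the 1-based indices follow the usual convention.

```python
from collections import Counter

def build_vocabulary(token_lists):
    """Assign each distinct token a 1-based feature index."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab

def to_libsvm_line(label, features):
    """Format one example as 'label index:value ...' with ascending indices."""
    parts = [str(label)]
    for index in sorted(features):
        value = features[index]
        if value != 0:  # sparse format: zero-valued features are omitted
            parts.append(f"{index}:{value:g}")
    return " ".join(parts)

# Made-up, already tokenized reviews: +1 = positive, -1 = negative.
reviews = [
    (1, ["clean", "friendly", "clean"]),
    (-1, ["dirty", "rude"]),
]
vocab = build_vocabulary(tokens for _, tokens in reviews)
lines = []
for label, tokens in reviews:
    counts = Counter(vocab[tok] for tok in tokens)
    lines.append(to_libsvm_line(label, counts))
# lines == ["1 1:2 2:1", "-1 3:1 4:1"]
```

Each line can then be written to a file and passed to the libsvm training tools.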

AI for system support

I have been trying to build an AI bot for almost 3 years and finally built a prototype. Here is what went into it, in case anybody would like to do something similar:

Technologies:

Java, Spring Boot, Spring, SQLite, PostgreSQL, Scala, Python, Anaconda, scikit-learn, EWS, Bootstrap, AngularJS/jQuery/HTML/CSS, Symphony API, Cisco API

Components:

Data Set

  1. built a Scala web crawler to download all historical support issues.
  2. at the same time, manually read through and cleaned up each of the thousands of support issues, and recorded the corresponding resolution for each.
AI
  1. used Anaconda and scikit-learn for NLP: tokenize each support issue (text), remove stop words, stem each token, and remove punctuation.
  2. used Anaconda and scikit-learn to turn each text into a bag of tokens, with the tokens as features and the resolution as the class, and fed these into a linear regression classifier; also tried SLDA. So far it is working at 72% accuracy.
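The two AI steps above can be sketched in plain Python as follows. This is not the prototype's code: the stop word list, the crude suffix stemmer, and the sample issues are made up for illustration, and a small multinomial Naive Bayes is swapped in for the classifier only because it is short to write without libraries. scikit-learn offers ready-made equivalents (CountVectorizer, MultinomialNB, LogisticRegression).

```python
import math
import re
from collections import Counter, defaultdict

STOP_WORDS = {"the", "a", "is", "on", "to", "in", "and", "for", "not"}  # illustrative only

def stem(token):
    """Crude suffix stripping; a stand-in for a real stemmer such as Porter."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Tokenize (which also drops punctuation), remove stop words, then stem."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(t) for t in tokens if t not in STOP_WORDS]

class NaiveBayes:
    """Multinomial Naive Bayes over bag-of-words counts, with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = preprocess(text)
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = preprocess(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical (issue text, resolution) pairs standing in for the cleaned data set.
issues = [
    ("Login fails, password expired", "reset_password"),
    ("User cannot login after password change", "reset_password"),
    ("Disk is full on the server", "clean_disk"),
    ("Server disk space running low", "clean_disk"),
]
model = NaiveBayes().fit([t for t, _ in issues], [r for _, r in issues])
prediction = model.predict("Password expired, cannot login!")
```

With real data, accuracy would be measured on held-out issues rather than on the training set.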
AI Exposer
  1. exposed the AI as a service.
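One possible shape for such a service, sketched with Python's standard http.server module. The /predict path, the JSON field names, and the keyword rule standing in for the trained model are all assumptions for illustration, not the prototype's actual interface.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def suggest_resolution(issue_text):
    """Placeholder for the trained classifier (hypothetical keyword rule)."""
    return "reset_password" if "password" in issue_text.lower() else "escalate"

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":  # hypothetical endpoint name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
            text = str(payload["text"])
        except (ValueError, KeyError, TypeError):
            self.send_error(400, "expected a JSON body with a 'text' field")
            return
        body = json.dumps({"resolution": suggest_resolution(text)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real service would log requests

def make_server(port=8000):
    """Create (but do not start) the HTTP server on localhost."""
    return HTTPServer(("127.0.0.1", port), PredictHandler)
```

Calling make_server().serve_forever() starts the service; a client such as the issue feeder would then POST a JSON body like {"text": "..."} to /predict and read the suggested resolution from the response.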
Issue Feeder
  1. used EWS to read in all issues and post them to the AI service.
UI
  1. built a web user interface on top of HTML5 + jQuery + Bootstrap to show the support emails and the resolutions suggested by the AI.
  2. added an option in the UI for users to give feedback to the AI, to keep its intelligence updated.
Notifier
  1. used the Java Mail API, EWS, chat API, and phone API to post alerts for critical issues.
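The prototype's notifier uses the Java Mail API; the email part of the same idea can be sketched in Python with the standard smtplib and email modules. The keyword rule for "critical", the addresses, and the SMTP host below are hypothetical placeholders.

```python
import smtplib
from email.message import EmailMessage

CRITICAL_KEYWORDS = ("outage", "production down", "data loss")  # hypothetical rule

def is_critical(issue_text):
    """Flag an issue as critical if it mentions any of the keywords."""
    text = issue_text.lower()
    return any(keyword in text for keyword in CRITICAL_KEYWORDS)

def build_alert(issue_text, resolution, sender, recipient):
    """Build the alert email for one critical issue."""
    msg = EmailMessage()
    # Collapse whitespace so the subject stays on a single header line.
    msg["Subject"] = "[CRITICAL] " + " ".join(issue_text.split())[:60]
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"Issue:\n{issue_text}\n\nSuggested resolution:\n{resolution}\n"
    )
    return msg

def send_alert(msg, host="localhost", port=25):
    """Send the alert through an SMTP server (host and port are assumptions)."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)

alert = build_alert(
    "Production outage\nin region A",
    "failover",
    "bot@example.com",
    "oncall@example.com",
)
```

The chat and phone channels would follow the same pattern: decide whether the issue is critical, build the message, then hand it to the channel's own API.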