What’s the difference between an array and events?
— Erik Meijer
1. Banks will need to hire excellent data scientists who also understand how markets work
4. An army of people will be needed to acquire, clean, and assess the data
5. There are different kinds of machine learning, and they are used for different purposes
6. Supervised learning will be used to make trend-based predictions using sample data
7. Unsupervised learning will be used to identify relationships between a large number of variables
8. Deep learning systems will undertake tasks that are hard for people to define but easy to perform
9. Reinforcement learning will be used to choose a successive course of actions to maximize the final reward
10. You won’t need to be a machine learning expert, but you will need to be an excellent quant and an excellent programmer
The classifier algorithms have been more or less unchanged for almost 10 years. I think a good reason for that is that algorithms such as Naive Bayes (NB) and SVM have long been able to achieve relatively high accuracy, given optimal or near-optimal parameters.
At the same time, in my experience, a good way to bump up overall text classification accuracy is data/corpus preparation, including stopword removal, part-of-speech (POS) tagging, and TF-IDF weighting.
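To make the corpus preparation point concrete, here is a minimal sketch of stopword removal followed by TF-IDF weighting in plain Python. The stopword list, the sample reviews, and the function names `tokenize` and `tfidf` are all made up for illustration; POS tagging is left out, and a real pipeline would probably use a library vectorizer rather than this hand-rolled version.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "is", "was", "and", "of", "to", "in", "it"}


def tokenize(text):
    """Lowercase, keep runs of ASCII letters, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / number of tokens in the document
    idf = log(N / df), where df is the number of documents containing the term
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: count each term once per document.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up sample reviews, not taken from any real dataset.
reviews = [
    "The room was clean and the staff was friendly",
    "The room was dirty and the staff was rude",
    "Friendly staff and a great breakfast",
]
vectors = tfidf(reviews)
```

With these three samples, "staff" appears in every review, so its idf is log(3/3) = 0 and it carries no weight, while a word such as "clean" or "dirty" that appears in only one review gets the highest weight. That is the kind of signal sharpening that tends to help NB or SVM more than swapping the classifier itself.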
I saw a good post on text classification accuracy that echoes this point about corpus preparation:
libsvm was the first supervised machine learning library I used extensively, more than 10 years ago.
It was pretty awesome back then to see 78% text classification accuracy on more than 100,000 hotel reviews that I had crawled from ctrip.com.
Now, at version 3, they are able to achieve 96.875% accuracy on text classification, as follows: