for data mining or machine learning beginners

If you want to try some models (e.g., Naive Bayes classification), I recommend (1) Weka, (2) R, or (3) scikit-learn.
Weka and R are well-known tools, so you can easily find usage guides with a web search. In addition, both Weka and R have a GUI. Weka requires Java; R requires an R installation (R is its own language).
Scikit-learn is also a good choice, because it is easy to use and easy to customize (for Python users).
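
To give a feel for how little code the scikit-learn route needs, here is a minimal sketch of Naive Bayes classification. It assumes scikit-learn is installed and uses the iris dataset bundled with it; the function name run_naive_bayes, the 30% test split, and the random seed are illustrative choices of mine, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB


def run_naive_bayes(test_size=0.3, seed=0):
    """Train Gaussian Naive Bayes on iris and return held-out accuracy.

    run_naive_bayes, test_size and seed are hypothetical names/values
    chosen for this sketch only.
    """
    # Load the features (X) and class labels (y) of the bundled iris data.
    X, y = load_iris(return_X_y=True)

    # Hold out part of the data so the model is scored on unseen samples.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )

    # Fit the classifier, then predict labels for the held-out samples.
    model = GaussianNB()
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)

    return accuracy_score(y_test, predictions)


if __name__ == "__main__":
    print(f"Test accuracy: {run_naive_bayes():.3f}")
```

To try your own data, replace the load_iris call with code that loads your feature matrix and labels. GaussianNB suits continuous features; for count data such as word frequencies, MultinomialNB from the same module is the usual alternative.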

Or, as a fourth choice, if you want to use deep learning, you'll need another tool such as Caffe, Chainer, TensorFlow, SkFlow, CNTK, or DSSTNE. These tools can run on GPUs.

By the way, if you need a dataset, go to the UCI Machine Learning Repository.


I wrote this because an international student told me, "I'd like to try applying machine learning or something to the data I have."