There are quite a few machine learning classifiers. It is usually hard to say which one is best until each is tried on the given data and its performance is measured. However, there are a few rules of thumb:
- A linear classifier is the better choice when:
  - The data is sparse (a lot of zeroes in the feature vector)
  - Feature engineering has been performed, or the features come from deep feature learning
  - The dataset is anywhere up to large (it still fits on one machine)
- A non-linear or kernel-based classifier is the better choice when:
  - There are only a few features (up to tens)
- Big data - a lot of training examples
- Evaluation: area under the ROC or PR curve
- Negative subsampling
- Weights for imbalanced classes (also: the regularization parameter)
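
The last few items (evaluation, negative subsampling, class weights) can be sketched in plain Python. This is only an illustration under assumptions: labels are taken to be 0 (negative) and 1 (positive), the evaluation item is read as area under the ROC curve and average precision (a summary of the PR curve), and all function names and the `ratio` parameter are made up for this example. The weighting formula is one common "balanced" heuristic, not the only option.

```python
import random
from collections import Counter


def subsample_negatives(X, y, ratio=1.0, seed=0):
    """Keep every positive and at most ratio * n_positives random negatives."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    n_keep = min(len(neg), int(ratio * len(pos)))
    kept = sorted(pos + rng.sample(neg, n_keep))
    return [X[i] for i in kept], [y[i] for i in kept]


def balanced_class_weights(y):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision values measured at the rank of each positive.

    Assumes at least one positive; tied scores keep their input order.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


if __name__ == "__main__":
    # Toy imbalanced data set: 5 positives, 95 negatives.
    y = [1] * 5 + [0] * 95
    X = list(range(100))
    X_small, y_small = subsample_negatives(X, y, ratio=2.0)
    print(len(y_small), sum(y_small))
    print(balanced_class_weights(y))
    print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
    print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

Subsampling and weighting attack the same imbalance from two sides: the first shrinks the majority class before training, the second keeps all the data and makes errors on the rare class cost more. Note that subsampling negatives changes the class prior, so predicted probabilities from a model trained this way would need recalibration before being read as true rates.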