Machine Learning studies algorithms that improve their performance through data: the more data they process, the better they perform.
Bias error example: If you choose a linear model to capture non-linear relations, it does not matter how much data you use for training: the model will never fit them well.
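A minimal sketch of this idea, assuming a quadratic ground truth and scikit-learn's LinearRegression (both are illustrative choices, not from the notes):

```python
# Bias error sketch: a linear model fit to a quadratic relation underfits
# no matter how much data it sees. Dataset and model are assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(10_000, 1))          # plenty of data
y = X[:, 0] ** 2 + rng.normal(0, 0.1, 10_000)     # non-linear target

model = LinearRegression().fit(X, y)
# R^2 stays near 0 even with 10k samples: the error is bias, not lack of data.
print("R^2 with 10k samples:", model.score(X, y))
```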
Variance error example: Decision trees are high-variance, low-bias models, as they make almost no assumptions about the structure of the data. Their high variance is usually reduced through variance-reduction ensemble methods such as Bagging (further improved by Random Forests, where not only random subsets of the data are used but also random subsets of the features).
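A rough illustration of that variance, under assumed synthetic data: two unconstrained trees fit on different bootstrap resamples tend to disagree noticeably, while two Random Forests trained the same way are typically far more consistent:

```python
# Variance sketch: compare how much predictions change between bootstrap
# resamples for a single tree vs. a bagged forest. Data is synthetic.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
rng = np.random.default_rng(0)

def fit_on_bootstrap(model):
    idx = rng.integers(0, len(X), len(X))   # sample with replacement
    return model.fit(X[idx], y[idx]).predict(X)

# Two unconstrained trees: predictions typically differ between resamples.
t1 = fit_on_bootstrap(DecisionTreeClassifier())
t2 = fit_on_bootstrap(DecisionTreeClassifier())
print("tree disagreement:", np.mean(t1 != t2))

# Two forests (bagged trees + random feature subsets): far more stable.
f1 = fit_on_bootstrap(RandomForestClassifier(n_estimators=100, random_state=1))
f2 = fit_on_bootstrap(RandomForestClassifier(n_estimators=100, random_state=2))
print("forest disagreement:", np.mean(f1 != f2))
```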
Loss functions
Cross-validation
Receiver Operating Characteristic (ROC)
Plots the model's Recall (True Positive Rate) against its FPR (1 - Specificity) as the decision threshold of the studied model is varied.
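A minimal sketch using scikit-learn's roc_curve, which performs the threshold sweep; the logistic-regression model and synthetic data are assumptions for illustration:

```python
# ROC sketch: sweep the decision threshold over a classifier's scores and
# collect one (FPR, TPR) point per threshold.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Scores = predicted probability of the positive class.
scores = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
fpr, tpr, thresholds = roc_curve(y_te, scores)   # one point per threshold
print("AUC:", roc_auc_score(y_te, scores))       # area under the ROC curve
```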
IDEA: Combine multiple weak learners to improve results.
Techniques:
Mode: Simple voting mechanism: take the class that the majority of learners predict.
Average / Weighted Average: Assign a weight to each learner and compute the weighted mean of their predictions (both combination schemes are sketched after this list).
BAGGING (Bootstrap AGGregatING): Multiple models of the same type are trained, each on a random subset of the data sampled with replacement (bootstrapping). This technique is especially effective at reducing variance (see the bagging sketch after this list).
BOOSTING: Each data point is given an "importance weight" that is adjusted during the sequential training of multiple models. In addition, a "reliability weight" is assigned to each model, and a weighted average is used for the final prediction. Although it also lowers variance, it is mainly used to lower the bias of the models (see the boosting sketch after this list).
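A sketch of the Mode and Weighted Average schemes using scikit-learn's VotingClassifier; the base learners and weights are arbitrary assumptions:

```python
# Voting sketch: hard voting = mode of the learners' predicted classes;
# soft voting = weighted average of their predicted probabilities.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
learners = [("lr", LogisticRegression(max_iter=1000)),
            ("nb", GaussianNB()),
            ("dt", DecisionTreeClassifier(random_state=0))]

# Mode: take the majority class across learners.
mode_vote = VotingClassifier(learners, voting="hard").fit(X, y)

# Weighted average: here trusting logistic regression twice as much
# as the others (the weights are an arbitrary illustration).
weighted = VotingClassifier(learners, voting="soft", weights=[2, 1, 1]).fit(X, y)
print(mode_vote.predict(X[:5]), weighted.predict(X[:5]))
```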
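A bagging sketch with scikit-learn's BaggingClassifier wrapping a high-variance decision tree; the ensemble size and settings are illustrative:

```python
# Bagging sketch: each of the 50 trees is fit on a bootstrap sample
# (drawn with replacement) of the training data; predictions are aggregated.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

bagged = BaggingClassifier(
    DecisionTreeClassifier(),   # high-variance base learner
    n_estimators=50,
    bootstrap=True,             # sample the data with replacement
    random_state=0,
)
print("bagged accuracy:", cross_val_score(bagged, X, y, cv=5).mean())
```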
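A boosting sketch using AdaBoost, one concrete algorithm built on exactly these re-weighting ideas; the stump base learner and settings are assumptions:

```python
# Boosting sketch: shallow trees are trained sequentially, each round
# up-weighting the points the previous round got wrong ("importance
# weights"); each learner then casts a weighted vote ("reliability weight").
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

boosted = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),  # weak, high-bias "stump" learner
    n_estimators=100,
    random_state=0,
)
print("boosted accuracy:", cross_val_score(boosted, X, y, cv=5).mean())
```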