Applied Machine Learning

Term 2, Winter Session 2019-2020

Application of machine learning tools, with an emphasis on solving practical problems. Data cleaning, feature extraction, supervised and unsupervised machine learning, reproducible workflows, and communicating results.

An example of feature exploration done in the class: we used SHAP to visualize how much each feature contributed to the model's prediction for each data point. The plot orders features by the magnitude of their SHAP values, and `availability_365` and `minimum_nights` remain the most important. The plot suggests that lower `minimum_nights` values tend to push the predicted target higher, although some such cases still receive lower predictions. Similarly for `availability_365`: while the vast majority of low-availability listings tend to be reviewed less, there are still many listings with low availability and a high number of reviews.
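
The idea of ordering features by SHAP magnitude can be illustrated with a small sketch. It is not the analysis from the class: it assumes synthetic data, a linear model, and a hypothetical third feature named `noise_feature`, and it does not use the `shap` library. For a linear model with independent features, the SHAP value of a feature for one sample is the coefficient times that feature's deviation from its mean, which is what the code computes by hand.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data (assumption): two features named after the
# ones in the text, plus a hypothetical feature with no effect.
rng = np.random.default_rng(0)
feature_names = ["minimum_nights", "availability_365", "noise_feature"]
X = rng.normal(size=(500, 3))
y = -2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.1, size=500)

model = LinearRegression().fit(X, y)

# SHAP values for a linear model with independent features:
# coefficient * (feature value - feature mean), one value per sample and feature.
shap_values = model.coef_ * (X - X.mean(axis=0))
base_value = model.predict(X).mean()

# Rank features by mean absolute SHAP value, largest first.
mean_abs = np.abs(shap_values).mean(axis=0)
order = np.argsort(mean_abs)[::-1]
ranked = [feature_names[i] for i in order]

for name, value in zip(ranked, mean_abs[order]):
    print(f"{name}: {value:.3f}")
```

Each row of `shap_values` plus `base_value` sums to the model's prediction for that row. The negative coefficient on the first feature means that lower values of it push the prediction up, which is the kind of pattern a SHAP summary plot makes visible.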

Weekly pair assignments used Python and Jupyter Notebooks. We explored several datasets found on Kaggle in order to gain insight into real problems that could be solved with machine learning.

We also used many scikit-learn tools to evaluate the effectiveness of different classification, regression, and clustering models, including support vector machines, random forests, gradient boosting, and k-means.
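
A comparison of this kind might look like the following sketch. The dataset is synthetic (generated with `make_classification`), and the choice of 5-fold cross-validation, accuracy, and the silhouette score are assumptions made for illustration rather than the settings used in the class.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import silhouette_score
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic classification data (assumption, not a class dataset).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

models = {
    "svm": SVC(),
    "random_forest": RandomForestClassifier(random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Mean 5-fold cross-validated accuracy for each supervised model.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")

# Clustering has no labels to score against, so use an internal
# metric (silhouette score) on the k-means assignments instead.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
silhouette = silhouette_score(X, labels)
print(f"k-means silhouette: {silhouette:.3f}")
```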

Another important aspect of the class was the focus on best practices when preprocessing, training, and evaluating our models.
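
The text does not list the specific practices, so the sketch below shows just one commonly recommended pattern as an assumption: wrapping preprocessing and the model in a scikit-learn `Pipeline` and scoring on a held-out test set, so that preprocessing statistics are learned from the training data only. The synthetic data and the choice of `StandardScaler` with `LogisticRegression` are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption, not a class dataset).
X, y = make_classification(n_samples=400, n_features=8, random_state=0)

# Hold out a test set before any preprocessing is fitted.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The scaler is fitted inside the pipeline, on training data only,
# so no information from the test set leaks into preprocessing.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
pipe.fit(X_train, y_train)

test_score = pipe.score(X_test, y_test)
print(f"held-out accuracy: {test_score:.3f}")
```

Because the scaler lives inside the pipeline, the same object can be passed to `cross_val_score` or a grid search and the preprocessing will be refitted on each training fold.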