In this video from Sebastian Thrun, he says that supervised learning works with "labeled" data and unsupervised learning works with "unlabeled" data. What does he mean by this?
Labeled data, used in supervised learning, attaches a meaningful tag, label, or class to each observation (or row). These labels can come from direct observation or from asking people or specialists about the data.
Classification and regression can both be applied to labeled datasets in supervised learning: classification predicts a category, while regression predicts a numeric value.
A machine learning model is trained on the labeled data so that, when new unlabeled data is presented to it, a likely label can be predicted.
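To make the idea concrete, here is a minimal sketch in plain Python. It is not from the video or the course; the data values, the label names ("small", "large"), and the helper `predict_label` are all made up for illustration, and a 1-nearest-neighbor rule stands in for whatever model you would actually use.

```python
import math

# Labeled data: every observation (row) is paired with a label.
# These feature values and labels are invented for illustration only.
labeled_data = [
    ((1.0, 1.2), "small"),
    ((0.8, 1.0), "small"),
    ((5.0, 5.5), "large"),
    ((5.2, 4.9), "large"),
]

def predict_label(observation, training_data):
    """Return the label of the closest labeled observation (1-nearest neighbor)."""
    _, nearest_label = min(
        training_data,
        key=lambda pair: math.dist(observation, pair[0]),
    )
    return nearest_label

# A new observation arrives without a label; the model guesses one.
new_observation = (4.8, 5.1)
predicted = predict_label(new_observation, labeled_data)
print(predicted)
```

The point is only that the labels in `labeled_data` are what let the model attach a label to the new, unlabeled row.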
Unlabeled data, used in unsupervised learning, has no meaningful tags or labels associated with it. Unsupervised learning is generally harder than supervised learning, since we know little or nothing about the data or about the outcomes to expect.
Clustering is one of the most popular unsupervised machine learning techniques. It groups data points or objects that are similar in some way.
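As a rough illustration of clustering on unlabeled data, the sketch below runs a very small k-means loop in plain Python. It is only an assumption-laden example: the points are invented, the helper `k_means` is hypothetical, and the starting centroids are picked by hand (the first and last points) so the run is repeatable.

```python
import math

# Unlabeled data: only feature values, no tags attached.
# These points are invented for illustration only.
unlabeled_data = [
    (1.0, 1.2), (0.8, 1.0), (1.1, 0.9),
    (5.0, 5.5), (5.2, 4.9), (4.9, 5.1),
]

def nearest_centroid(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to centroids and recomputing centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[nearest_centroid(point, centroids)].append(point)
        # New centroid = mean of its cluster; keep the old one if the cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    assignments = [nearest_centroid(point, centroids) for point in points]
    return assignments, centroids

# Hand-picked starting centroids (an assumption, chosen for repeatability).
assignments, centroids = k_means(
    unlabeled_data, centroids=[unlabeled_data[0], unlabeled_data[-1]]
)
print(assignments)
```

Note that the output is just group numbers. The algorithm finds which points belong together, but it is still up to us to decide what each group means.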
Unsupervised learning has fewer models and fewer evaluation methods that can be used to check that the outcome of the model is accurate, because there are no known labels to compare against. As such, unsupervised learning is a less controllable environment, as the machine is creating the outcomes for us.
Picture courtesy of Coursera: Machine Learning with Python