A Beginner’s Guide to Supervised and Unsupervised Learning
Published:
Read this to gain some intuition about the difference between supervised and unsupervised learning.
A Beginner’s Guide to Supervised and Unsupervised Learning
Imagine a situation where we observe cloud formations for the first time. Looking up at the sky, we see some clouds that are large and fluffy, some thin and streaky, some pale white, and some dark grey. Thanks to the development of language and taxonomy, a cloud expert could let us know that the large fluffy clouds belong to the cumulus class, and the thin and streaky clouds belong to the cirrus class. After being taught cloud nomenclature, an assessment could then be set up containing a new set of clouds, where we are asked to assign labels to the new clouds just observed. Upon receiving the results of the test, disappointment attributed to misclassifications, and satisfaction attributed to proper classifications, would encourage learning of the proper labels. This situation encapsulates a supervised learning task, where the minimization of a loss function with externally labeled data (ground truths) guides learning of the correct labels. In this case, the loss function would be attributed to feelings of disappointment for thinking a cirrus cloud was a cumulus, and the subsequent desire to minimize disappointment in the future. However, in the absence of guidance that ground truth labels provide, the brain is still capable of segmenting and storing these various distinct structures in an unsupervised fashion. We still generate our own internal labels for patterns sensed from external stimuli. Given enough exposure to a wide distribution of clouds, we could piece together our own cloud classifications. This is unsupervised learning: the ability to recognize and learn patterns without ground truth labels.