How Do The Machines Learn?

INPUTS → MODEL → OUTPUTS

Models — Algorithm learning styles

When we talk about algorithms, there are two main things to consider: the learning style and the algorithm “type.” We will start with learning styles, which describe how an algorithm uses data to gain information. There are four major learning styles; the term ensemble learning describes the use of multiple learning styles or algorithms in combination.

Supervised

In supervised learning, the model is trained with data that has been labelled/categorized correctly. For example, we tell the model that spruce, cypress, and evergreen fall under the category of "tree", while tulip, rose, and lily fall under the category of "flower." The algorithm learns, autonomously, to make connections by mapping the input data to its correct label or category. When the model produces an incorrect answer, it makes incremental adjustments so that it is less likely to be wrong the next time. Once it is trained to an acceptable level of accuracy, it can then be applied to new data. The model extrapolates what it learned from experience (i.e. previous data) to make determinations about data it has not encountered before, such as classifying the new data or making a prediction.
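
To make this concrete, here is a minimal sketch of supervised learning in Python using scikit-learn (assumed here purely for illustration). The iris flower dataset stands in for any correctly labelled data: the model is fit on inputs paired with their known labels, then asked to classify a measurement it has not seen before.

```python
# A minimal supervised-learning sketch (assumes scikit-learn is installed).
# The iris dataset stands in for any labelled data: each row of measurements
# (the inputs) comes with a known species (the label).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)          # inputs and their correct labels

model = DecisionTreeClassifier(max_depth=3)
model.fit(X, y)                            # learn the input-to-label mapping

# Apply the trained model to data it has not encountered before.
new_flower = [[5.1, 3.5, 1.4, 0.2]]        # sepal/petal measurements in cm
print(model.predict(new_flower))           # predicted class index, e.g. [0]
```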

A brief note on model over-fitting: when a model over-fits, what it learned is so specific to the data it was trained on that it does not generalize well to new data. Over-fitting may appear as a model having extremely high accuracy when validated with the data on which it was trained (e.g. EHR records from "Hospital A"), but low accuracy when used with another dataset (e.g. EHR records from "Hospital B"). Ultimately, it is about balancing accuracy and generalizability.
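
One common way to surface over-fitting is to hold out data the model never sees during training and compare accuracies, roughly as sketched below (again assuming scikit-learn; the random split stands in for the "Hospital A" versus "Hospital B" scenario).

```python
# Sketch: detecting over-fitting by comparing accuracy on training data
# ("Hospital A") with accuracy on held-out data the model never saw
# ("Hospital B"). Assumes scikit-learn; the split stands in for two datasets.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An unconstrained tree can effectively memorize the training set.
model = DecisionTreeClassifier().fit(X_train, y_train)

print("training accuracy:", model.score(X_train, y_train))  # typically 1.0
print("held-out accuracy:", model.score(X_test, y_test))    # typically lower
```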

  • Examples: facial recognition; training a model to recognize handwritten letters; training a model with labelled x-ray images to automatically detect abnormal chest x-rays

Unsupervised

As you may have guessed, this is a type of learning in which we do not have the answers available prior to training the model. This sort of learning might be used in settings where we simply do not know the answer, or where it is not immediately apparent or available. The model independently identifies patterns and heuristics in the data it is given to determine the most suitable output.

Unsupervised learning is often used for clustering or pattern identification. It can find patterns that are difficult to identify by hand, especially when working with large datasets. For example, it can be used to define new groupings of data or to segment patients into novel “phenotypes” based on hundreds or thousands of variables per patient. It can also be used to detect anomalies, such as identifying an inpatient medication order potentially made in error because of how dissimilar it is to other orders made under similar clinical circumstances. In these examples, it is considered unsupervised learning because there is no ground truth: the model makes associations and creates implicit “rules” about the data without knowing the answer, as there is no predefined answer. Many of these activities fall under the umbrella of data mining. Unsupervised learning can also be used for dimensionality reduction, which is the process of reducing the number of features in a large dataset.
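
As an illustration, a minimal clustering sketch (assuming scikit-learn) might look like the following: the model receives only unlabelled data and invents its own groupings, with no ground truth provided.

```python
# Sketch of unsupervised clustering (assumes scikit-learn). make_blobs
# generates synthetic, unlabelled data standing in for, say, many variables
# per patient; no ground-truth labels are ever given to the model.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)  # labels discarded

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
groups = kmeans.fit_predict(X)   # the model defines its own groupings

print(groups[:10])               # cluster assignment for the first 10 samples
```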

  • Examples: market or customer segmentation; word embeddings and language model creation; learning rule associations; anomaly detection; gene clustering; data mining activities

Semi-supervised

Now we get into the grey zone somewhere in between. This type of learning employs aspects of both learning styles described above. We may have some knowledge of the correct answers, but perhaps the dataset is incomplete or unreliable. Semi-supervised learning uses both labelled and unlabelled data: the labelled data can be used to create and apply labels to the unlabelled data, thereby increasing the total amount of labelled data available for training the model. This strategy is helpful when you need a large quantity of training data but only have a small amount of labelled data.
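
A rough sketch of this idea, assuming scikit-learn's self-training wrapper, is shown below: a small labelled portion of the data seeds pseudo-labels for the rest, which is marked as unlabelled with -1.

```python
# Sketch of semi-supervised "self-training" (assumes scikit-learn).
# Samples with label -1 are treated as unlabelled; the wrapper uses the
# small labelled portion to assign pseudo-labels to the remainder.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X, y = load_iris(return_X_y=True)

# Pretend we only hand-labelled about 20% of the data; mask the rest with -1.
rng = np.random.default_rng(0)
y_partial = y.copy()
y_partial[rng.random(len(y)) > 0.2] = -1

model = SelfTrainingClassifier(LogisticRegression(max_iter=1000))
model.fit(X, y_partial)          # labelled rows seed labels for unlabelled rows

print(model.predict(X[:5]))      # predictions for the first five samples
```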

Hand-labelling data, also called annotating, is a time-consuming task. It is often a rate-limiting step in building and researching new machine learning models. The option of using a semi-supervised approach helps to scale the development of AI; however, it also requires collaboration between data scientists and subject matter experts to ensure that the rules and algorithms used to create the new labels are appropriate. Otherwise, the model may inadvertently learn the wrong information and its accuracy may suffer.

  • Examples: web content classification; speech analysis; augmentation of NLP models; protein sequence classification

Reinforcement

Reinforcement learning is a learning style used to build decision models that operate within a time series or a sequence of events. Instead of classifying what something is, the model determines what action (i.e. decision) should be taken to achieve a predefined outcome or to maximize some type of reward.

It can be thought of a little like supervised learning, except that instead of the model knowing right away whether it got the right answer, it has to wait until it has made several decisions to find out whether the sum of those decisions led to the outcome it was trying to achieve. A great example is playing a board game like chess. The ultimate goal is to win. In the process of playing the game and trying to win, multiple decisions are made without any immediate feedback on whether each move was the right decision or made winning more likely. The player does not know whether the sum of all their decisions was successful until the end of the game. This is the concept of "delayed reward." It is analogous to many healthcare processes in which the desired health outcome is not immediately known. For example, reducing cardiovascular events or preventing a heart attack is a long-term outcome that is influenced by numerous decisions and interventions over time. Similarly, many actions take place during a patient's hospitalization, but the ultimate outcome of survival to discharge is not known until the patient is discharged.
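
A toy sketch of this idea is shown below: a made-up five-state "corridor" environment (not from the source, purely illustrative) in which the agent is rewarded only upon reaching the final state, so the value of each earlier move must be learned from the delayed reward via tabular Q-learning.

```python
# Sketch of reinforcement learning with a delayed reward (numpy only).
# A hypothetical 5-state "corridor": the agent starts at state 0 and is
# rewarded only when it reaches state 4, so individual moves get no
# immediate feedback -- like only learning at the end whether the game was won.
import numpy as np

n_states, n_actions = 5, 2            # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))   # learned value of each action per state
alpha, gamma, epsilon = 0.1, 0.9, 0.3

rng = np.random.default_rng(0)
for episode in range(500):
    state = 0
    while state != n_states - 1:
        if rng.random() < epsilon:
            action = int(rng.integers(n_actions))   # explore
        else:
            action = int(np.argmax(Q[state]))       # exploit best-known move
        next_state = max(0, state - 1) if action == 0 else state + 1
        reward = 1.0 if next_state == n_states - 1 else 0.0   # delayed reward
        # Q-learning update: propagate the eventual reward back through time.
        Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state])
                                     - Q[state, action])
        state = next_state

print(np.argmax(Q, axis=1))   # learned policy: mostly 1s, i.e. "move right"
```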

Lastly, reinforcement learning draws on aspects of childhood (and human) learning through the concepts of reward and punishment. The model trains within its environment and is continually modified so as to maximize reward and minimize penalty¹. You can argue, however, that this sort of learning largely depends on the environment the model trains in: if placed in a different environment, the trained model may not behave in the desired way.

  1. Types of Artificial Intelligence: A Detailed Guide. Certes. Published December 20, 2018. https://certes.co.uk/types-of-artificial-intelligence-a-detailed-guide/