
Self-Supervised Learning

Learning from data without explicit labels, using structure inherent in the data to generate supervisory signals.
Definition

Self-supervised learning is a machine learning approach, usually classed as a form of unsupervised learning, in which a system learns to represent data by predicting one part of the data from other parts, without external labels or annotations.

This method leverages the inherent structure or context within the data to create a learning task, such as predicting missing words in a sentence, forecasting future frames in a video sequence, or filling in missing parts of an image.

The model thus generates its own supervisory signal from the input data, which lets it learn rich representations of the data's underlying patterns, features, and structures. Self-supervised learning is particularly valuable for exploiting large volumes of unlabeled data, because it reduces reliance on expensive and time-consuming manual labeling. It has produced strong results in domains including natural language processing, computer vision, and audio processing.

Examples and Use Cases

In natural language processing, a self-supervised learning task might involve masking a word in a sentence and training a model to predict the masked word from the remaining context, as in the masked language modeling objective used by models like BERT (Bidirectional Encoder Representations from Transformers). In computer vision, a model might be trained to predict the color version of a grayscale image, or to reconstruct an image with a portion removed, thereby learning about object shapes, textures, and colors.
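
The masked-word task can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not BERT's actual preprocessing: the `make_masked_example` helper is hypothetical, the `[MASK]` placeholder and the 15% default rate are chosen for the example, and whitespace splitting stands in for a real tokenizer. The point is only that the training labels come from the sentence itself.

```python
import random

MASK = "[MASK]"  # placeholder string, chosen for illustration


def make_masked_example(tokens, mask_prob=0.15, rng=None):
    """Build one self-supervised training pair from an unlabeled sentence.

    Returns (masked_tokens, labels). labels[i] holds the original token
    at each masked position and None everywhere else.
    """
    rng = rng or random.Random()
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)   # hide the token from the model
            labels.append(tok)    # the hidden token becomes the target
        else:
            masked.append(tok)
            labels.append(None)   # nothing to predict at this position

    # Make sure every example has at least one position to predict.
    if tokens and all(label is None for label in labels):
        i = rng.randrange(len(tokens))
        labels[i] = tokens[i]
        masked[i] = MASK
    return masked, labels


if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    masked, labels = make_masked_example(sentence, rng=random.Random(0))
    print(masked)
    print(labels)
```

A model would then be trained to recover the non-`None` entries of `labels` from `masked`. No human annotation is involved at any step.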

In audio processing, self-supervised learning can be used to predict the next segment of an audio clip, enabling the model to learn about the structure and progression of sounds. These tasks force the model to develop an understanding of the data's internal structure and relationships. The resulting representations provide a foundation for further learning, so the model can perform well on supervised tasks with less labeled data.
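
The next-segment task can be sketched the same way. The following is a hedged illustration rather than any particular system's pipeline: the `next_segment_pairs` helper and its `context_len`, `target_len`, and `hop` parameters are hypothetical names, and a plain list of numbers stands in for real audio samples.

```python
def next_segment_pairs(samples, context_len, target_len, hop=1):
    """Slice an unlabeled sequence into (context, target) training pairs.

    Each target is the segment that immediately follows its context,
    so the sequence supplies its own prediction targets.
    """
    pairs = []
    last_start = len(samples) - context_len - target_len
    for start in range(0, last_start + 1, hop):
        split = start + context_len
        context = samples[start:split]
        target = samples[split:split + target_len]
        pairs.append((context, target))
    return pairs


if __name__ == "__main__":
    clip = list(range(10))  # stand-in for audio samples
    for context, target in next_segment_pairs(clip, context_len=4, target_len=2, hop=2):
        print(context, "->", target)
```

A clip shorter than `context_len + target_len` yields no pairs. In practice the segments would usually be frames or learned features rather than raw sample values.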
