G06V 10/82
Definition
This place covers:
Neural networks (NN) specially adapted for image or video recognition or understanding, in particular specific architectures and specific learning tasks for this purpose.
Notes – technical background
These notes provide more information about the technical subject matter that is classified in this place:
Examples of architectures are:
- Attention-based neural networks, such as transformer architectures;
- Autoencoders consisting of encoder and decoder blocks, where the output has the same form as the input, i.e. input and output are both images, for example;
- Convolutional neural networks consisting of repeated convolutional and pooling layers;
- Pyramidal or multi-scale neural networks, mostly of the convolutional type, that process differently scaled input images, have convolutional kernels of varying sizes, and/or contain skip connections from lower-level layers to higher-level layers or the output layer;
- Recurrent neural networks, where the input data is sequential by nature: either the pixels of the input image are processed sequentially, or a plurality of image frames, such as in videos, is processed sequentially. Long short-term memory (LSTM) and gated recurrent units (GRU) are specific examples of recurrent neural networks;
- Region proposal networks, where the main task is not only to correctly classify objects in an input image but also to indicate where a specific object has been found. Example architectures are R-CNN and YOLO;
- Residual neural networks (ResNet) containing skip connections or shortcuts to jump over some layers;
- Siamese neural networks, which operate on input pairs and consist of two identical neural networks that each process one element of the pair; their outputs are then merged to provide a judgement about the input pair, such as whether both elements belong to the same class.
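The convolutional and pooling layers named in the list above can be sketched in a few lines of plain Python. This is an illustrative example only, not part of the classification definition; the function names and the toy image and kernel are chosen for illustration, and real networks use optimised libraries.

```python
# Minimal sketch of two basic CNN building blocks: a 2-D convolution
# (valid padding, stride 1, implemented as cross-correlation) followed
# by non-overlapping 2x2 max pooling. Pure Python, for illustration.

def conv2d(image, kernel):
    """Valid cross-correlation of a 2-D image with a 2-D kernel."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            row.append(sum(image[y + j][x + i] * kernel[j][i]
                           for j in range(kh) for i in range(kw)))
        out.append(row)
    return out

def max_pool2x2(fmap):
    """Non-overlapping 2x2 max pooling of a feature map."""
    return [[max(fmap[y][x], fmap[y][x + 1],
                 fmap[y + 1][x], fmap[y + 1][x + 1])
             for x in range(0, len(fmap[0]) - 1, 2)]
            for y in range(0, len(fmap) - 1, 2)]

# A toy 4x4 "image" passed through one convolution + pooling stage.
image = [[1, 0, 0, 1],
         [0, 1, 1, 0],
         [0, 1, 1, 0],
         [1, 0, 0, 1]]
edge_kernel = [[1, -1],
               [-1, 1]]
features = max_pool2x2(conv2d(image, edge_kernel))
```

Stacking several such convolution-plus-pooling stages, followed by fully connected layers, yields the repetitive structure characteristic of the CNNs covered here.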
Examples of learning tasks are:
- Adversarial learning such as in generative adversarial networks (GANs);
- Meta learning;
- Metric learning, learning a distance metric between two input objects, mostly done with a Siamese neural network;
- Reinforcement learning, learning how to take optimal actions for performing a task, e.g. deep reinforcement learning for robotics, self-driving vehicles etc.;
- Representation or feature learning, learning representations or features from raw input, mostly done with some form of encoder-decoder architecture or simply by using intermediate representations of a classification network;
- Transfer or multitask learning, reusing a network trained on task A for task B or jointly training a neural network on multiple tasks.
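The metric-learning task listed above can be illustrated with a contrastive loss, which pulls embeddings of same-class pairs together and pushes different-class pairs apart by at least a margin. A minimal pure-Python sketch; the function names, embeddings, and margin value are illustrative assumptions, not part of the definition.

```python
import math

def euclidean(a, b):
    """Distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(a, b, same_class, margin=1.0):
    """Contrastive loss on a pair of embeddings: same-class pairs are
    penalised by their squared distance, different-class pairs only
    when they are closer than `margin`."""
    d = euclidean(a, b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Two toy embeddings at distance 0.5, as a same-class and as a
# different-class pair (the latter is inside the margin, so penalised).
loss_pos = contrastive_loss([0.0, 0.0], [0.3, 0.4], same_class=True)
loss_neg = contrastive_loss([0.0, 0.0], [0.3, 0.4], same_class=False)
```

In a Siamese setting, both embeddings would come from the same network applied to the two inputs of the pair, and the loss gradient would be backpropagated through both branches.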
Examples
![G06V0010820000_0](elayer/20240101/G06V0010820000_0)
![G06V0010820000_1](elayer/20240101/G06V0010820000_1)
Siamese network showing not similar (left) and similar (right) input pairs
![G06V0010820000_2](elayer/20240101/G06V0010820000_2)
Recurrent neural network for action recognition
![G06V0010820000_3](elayer/20240101/G06V0010820000_3)
Region proposal neural network for region of interest (ROI) detection
![G06V0010820000_4](elayer/20240101/G06V0010820000_4)
Adversarial learning with a generative adversarial neural network for object recognition on different backgrounds
Cross-references
Informative references
| Feature extraction related to a temporal dimension; Pattern tracking | G06V 10/62 |
| Pattern recognition or machine learning, using clustering | G06V 10/762 |
| Pattern recognition or machine learning, using classification | G06V 10/764 |
| Pattern recognition or machine learning, using regression | G06V 10/766 |
| Pattern recognition or machine learning, fusion | G06V 10/80 |
| Information retrieval of video data; Clustering; Classification | G06F 16/75 |
| Computer systems based on biological models using neural networks | G06N 3/02 |
| Computer systems using knowledge-based models; Inference methods | G06N 5/04 |
| Machine learning | G06N 20/00 |
| Motion image analysis | G06T 7/20 |
Glossary
| AE | auto-encoder network |
| AlexNet | CNN designed by Alex Krizhevsky et al. |
| Backprop | backpropagation, an algorithm for computing the gradient of the weights of an artificial neural network |
| BERT | bidirectional encoder representations from transformers, a transformer-based artificial neural network |
| CNN | convolutional neural network, an artificial neural network that includes convolutional layers |
| DNN | deep neural network |
| FCL | fully connected layer of an artificial neural network |
| FCNN | fully convolutional neural network |
| GAN | generative adversarial network |
| GoogLeNet | deep convolutional neural network |
| Inception | convolutional neural network which concatenates several filters of different sizes at the same level of the network |
| LeNet | early CNN that first demonstrated the performance of CNNs on handwritten character recognition |
| LSTM | long short-term memory, a recurrent neural network |
| MLP | multi-layer perceptron |
| MS COCO | annotated image dataset |
| Perceptron | simple feed-forward neural network |
| RBF | radial basis function |
| R-CNN | convolutional neural network using a region proposal algorithm for object detection (variants: fast R-CNN, faster R-CNN, cascade R-CNN) |
| ResNet | residual neural network, an artificial neural network having shortcuts / skip connections between different layers |
| SOM | self-organising map, an algorithm for generating a low-dimensional representation of data while preserving the topological structure of the data |
| SSD | single shot (multibox) detector, a neural network for object detection |
| U-Net | neural network having a specific layer structure |
| YOLO | you only look once, an artificial neural network used for object detection (comes in various versions: YOLO v2, YOLO v3 etc.) |