G06V 10/82
Definition
This place covers:
Neural networks (NN) specially adapted for image or video recognition or understanding, in particular specific architectures and specific learning tasks for this purpose.
Notes – technical background
These notes provide more information about the technical subject matter that is classified in this place:
Examples of architectures are:
- Attention-based neural networks, such as transformer architectures;
- Autoencoders consisting of encoder and decoder blocks, where the output has the same form as the input, i.e. input and output are both images, for example;
- Convolutional neural networks consisting of repeated convolutional and pooling layers;
- Pyramidal or multi-scale neural networks, mostly of the convolutional type, that process differently scaled input images, have convolutional kernels of varying sizes, and/or contain skip connections from lower-level layers to higher-level layers or the output layer;
- Recurrent neural networks, where the input data is sequential by nature: either the pixels of the input image are processed sequentially, or a plurality of image frames, such as in videos, is processed sequentially. Long short-term memory (LSTM) and gated recurrent units (GRU) are specific examples of recurrent neural networks;
- Region proposal networks, where the main task is not only to correctly classify objects in an input image but also to indicate where a specific object has been found. Example architectures are R-CNN and YOLO;
- Residual neural networks (ResNet) containing skip connections or shortcuts to jump over some layers;
- Siamese neural networks, which operate on input pairs and consist of two identical neural networks that each process one element of the pair; their outputs are then merged to provide a judgement about the input pair, such as whether both elements belong to the same class.
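The convolutional and pooling layers named in the list above can be sketched in a few lines of plain Python. This is an illustrative example only, not part of the classification definition; the function names and the toy image and kernel are chosen for illustration, and real networks use optimised libraries.

```python
# Minimal sketch of two basic CNN building blocks: a 2-D convolution
# (valid padding, stride 1, implemented as cross-correlation) followed
# by non-overlapping 2x2 max pooling. Pure Python, for illustration.

def conv2d(image, kernel):
    """Valid cross-correlation of a 2-D image with a 2-D kernel."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            row.append(sum(image[y + j][x + i] * kernel[j][i]
                           for j in range(kh) for i in range(kw)))
        out.append(row)
    return out

def max_pool2x2(fmap):
    """Non-overlapping 2x2 max pooling of a feature map."""
    return [[max(fmap[y][x], fmap[y][x + 1],
                 fmap[y + 1][x], fmap[y + 1][x + 1])
             for x in range(0, len(fmap[0]) - 1, 2)]
            for y in range(0, len(fmap) - 1, 2)]

# A toy 4x4 "image" passed through one convolution + pooling stage.
image = [[1, 0, 0, 1],
         [0, 1, 1, 0],
         [0, 1, 1, 0],
         [1, 0, 0, 1]]
edge_kernel = [[1, -1],
               [-1, 1]]
features = max_pool2x2(conv2d(image, edge_kernel))
```

Stacking several such convolution-plus-pooling stages, followed by fully connected layers, yields the repetitive structure characteristic of the CNNs covered here.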
Examples of learning tasks are:
- Adversarial learning such as in generative adversarial networks (GANs);
- Meta learning;
- Metric learning, learning a distance metric between two input objects, mostly done with a Siamese neural network;
- Reinforcement learning, learning how to take optimal actions for performing a task, e.g. deep reinforcement learning for robotics, self-driving vehicles etc.;
- Representation or feature learning, learning representations or features from raw input, mostly done with some form of encoder-decoder architecture or simply by using intermediate representations of a classification network;
- Transfer or multitask learning, reusing a network trained on task A for task B or jointly training a neural network on multiple tasks.
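The metric-learning task listed above can be illustrated with a contrastive loss, which pulls embeddings of same-class pairs together and pushes different-class pairs apart by at least a margin. A minimal pure-Python sketch; the function names, embeddings, and margin value are illustrative assumptions, not part of the definition.

```python
import math

def euclidean(a, b):
    """Distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(a, b, same_class, margin=1.0):
    """Contrastive loss on a pair of embeddings: same-class pairs are
    penalised by their squared distance, different-class pairs only
    when they are closer than `margin`."""
    d = euclidean(a, b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Two toy embeddings at distance 0.5, as a same-class and as a
# different-class pair (the latter is inside the margin, so penalised).
loss_pos = contrastive_loss([0.0, 0.0], [0.3, 0.4], same_class=True)
loss_neg = contrastive_loss([0.0, 0.0], [0.3, 0.4], same_class=False)
```

In a Siamese setting, both embeddings would come from the same network applied to the two inputs of the pair, and the loss gradient would be backpropagated through both branches.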
Examples
![G06V0010820000_0](elayer/20240101/G06V0010820000_0)
![G06V0010820000_1](elayer/20240101/G06V0010820000_1)
Siamese network showing not similar (left) and similar (right) input pairs
![G06V0010820000_2](elayer/20240101/G06V0010820000_2)
Recurrent neural network for action recognition
![G06V0010820000_3](elayer/20240101/G06V0010820000_3)
Region proposal neural network for region of interest (ROI) detection
![G06V0010820000_4](elayer/20240101/G06V0010820000_4)
Adversarial learning with a generative adversarial neural network for object recognition on different backgrounds
Cross-references
Informative references
| Feature extraction related to a temporal dimension; Pattern tracking | G06V 10/62 |
| Pattern recognition or machine learning, using clustering | G06V 10/762 |
| Pattern recognition or machine learning, using classification | G06V 10/764 |
| Pattern recognition or machine learning, using regression | G06V 10/766 |
| Pattern recognition or machine learning, fusion | G06V 10/80 |
| Information retrieval of video data; Clustering; Classification | G06F 16/75 |
| Computer systems based on biological models using neural networks | G06N 3/02 |
| Computer systems using knowledge-based models; Inference methods | G06N 5/04 |
| Machine learning | G06N 20/00 |
| Motion image analysis | G06T 7/20 |
Glossary
| AE | auto-encoder network |
| AlexNet | CNN designed by Alex Krizhevsky et al. |
| Backprop | backpropagation, an algorithm for computing the gradient of the weights of an artificial neural network |
| BERT | bidirectional encoder representations from transformers, a transformer-based artificial neural network |
| CNN | convolutional neural network, an artificial neural network that includes convolutional layers |
| DNN | deep neural network |
| FCL | fully connected layer of an artificial neural network |
| FCNN | fully convolutional neural network |
| GAN | generative adversarial network |
| GoogLeNet | deep convolutional neural network |
| Inception | convolutional neural network which concatenates several filters of different sizes at the same level of the network |
| LeNet | early CNN that first demonstrated the performance of CNNs on handwritten character recognition |
| LSTM | long short-term memory, a recurrent neural network |
| MLP | multi-layer perceptron |
| MS COCO | annotated image dataset |
| Perceptron | simple feed-forward neural network |
| RBF | radial basis function |
| R-CNN | convolutional neural network using a region proposal algorithm for object detection (variants: fast R-CNN, faster R-CNN, cascade R-CNN) |
| ResNet | residual neural network, an artificial neural network having shortcuts / skip connections between different layers |
| SOM | self-organising map, an algorithm for generating a low-dimensional representation of data while preserving the topological structure of the data |
| SSD | single shot (multibox) detector, a neural network for object detection |
| U-Net | neural network having a specific layer structure |
| YOLO | you only look once, an artificial neural network used for object detection (comes in various versions: YOLO v2, YOLO v3 etc.) |