Methods or arrangements for identifying regions in two-dimensional images, or volumes in three-dimensional point cloud data sets, which contain information relevant for recognition.
Identifying regions or volumes of interest in an image, point cloud or distance map which are likely to lead to successful object recognition.
Notes – technical background
These notes provide more information about the technical subject matter that is classified in this place:
A region or volume of interest (ROI or VOI) could include, for example, a human face (in the case of a CCTV system), a vehicle or a pedestrian (in the case of a camera-based traffic monitoring system), an obstacle on the road (in the case of an advanced driver assistance system), or an item on a conveyor belt (in the case of an industrial automation system).
The determination of a region or volume of interest is in essence a task of object detection, that is to say detecting the presence of a particular kind of object in images and localising the object(s).
It is the necessity of localising an object and, in particular, of describing the position and the spatial extent of the object (e.g. by outputting a bounding box around it) that distinguishes “object detection” algorithms from “object recognition” algorithms. An “object detection” algorithm merely assesses whether a given kind of visual object is present at a given image location. It may, for example, automatically generate a bounding box (e.g. around weeds in a field of vegetables) without solving the problem of “object classification” (e.g. analysing an image of a weed to determine its species and to output its botanical name).
Algorithms for detecting ROIs or VOIs in video sequences typically use frame differencing or more advanced optical flow methods for detecting moving objects.
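The frame differencing idea can be illustrated with a minimal sketch. It assumes grayscale frames stored as nested lists of integers; the function name `frame_difference_roi` and the threshold value are hypothetical choices made for illustration, not part of any particular system.

```python
def frame_difference_roi(prev_frame, curr_frame, threshold=30):
    """Return (x_min, y_min, x_max, y_max) around changed pixels, or None."""
    # Collect every pixel whose intensity changed by more than the threshold.
    changed = [
        (x, y)
        for y, (prev_row, curr_row) in enumerate(zip(prev_frame, curr_frame))
        for x, (p, c) in enumerate(zip(prev_row, curr_row))
        if abs(c - p) > threshold
    ]
    if not changed:
        return None
    xs = [x for x, _ in changed]
    ys = [y for _, y in changed]
    return (min(xs), min(ys), max(xs), max(ys))


# Two small grayscale frames (5 rows, 6 columns); a bright 2x2 object
# appears in the second frame.
background = [[10] * 6 for _ in range(5)]
moved = [row[:] for row in background]
for y in (1, 2):
    for x in (3, 4):
        moved[y][x] = 200

roi = frame_difference_roi(background, moved)
```

In this toy case the ROI is simply the bounding box of all changed pixels. A practical system would typically add noise filtering and handle several moving objects separately.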
Algorithms that determine a region or volume of interest (ROI or VOI) may use visual cues to establish the location of a bounding box, e.g. by evaluating features such as colour distributions or local textures.
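As a rough illustration of a colour cue, the following sketch marks pixels that satisfy a colour test and returns one bounding box per connected blob. The image is assumed to be a nested list of (R, G, B) tuples; `colour_rois`, `is_greenish` and the numeric thresholds are hypothetical and chosen only for this example.

```python
from collections import deque


def colour_rois(image, is_target):
    """Return one box (x_min, y_min, x_max, y_max) per 4-connected blob
    of pixels for which is_target(pixel) is true."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or not is_target(image[y0][x0]):
                continue
            # Breadth-first flood fill over one blob, tracking its extent.
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            x_min = x_max = x0
            y_min = y_max = y0
            while queue:
                x, y = queue.popleft()
                x_min, x_max = min(x_min, x), max(x_max, x)
                y_min, y_max = min(y_min, y), max(y_max, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx] and is_target(image[ny][nx])):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x_min, y_min, x_max, y_max))
    return boxes


def is_greenish(pixel):
    """Assumed colour test: green clearly dominates red and blue."""
    r, g, b = pixel
    return g > 120 and g > r + 40 and g > b + 40


SOIL, PLANT = (120, 90, 60), (40, 180, 50)
image = [[SOIL] * 6 for _ in range(4)]
for x, y in [(0, 0), (1, 0), (0, 1), (4, 2), (5, 2), (5, 3)]:
    image[y][x] = PLANT

boxes = colour_rois(image, is_greenish)
```

The colour test is passed in as a function so that a texture-based or histogram-based test could be substituted without changing the box extraction.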
The determination of a region or volume of interest may be facilitated by using special illumination, such as casting light in a specific direction where an object is to be expected in autonomous driving, or by treating the images of specimens with special staining, as is the case in classification of objects in microscopic imagery.
More recently developed algorithms use neural networks (NN) which integrate object detection and recognition. An example is the region-based convolutional neural network (R-CNN) which uses segmentation algorithms for splitting the image into individual segments to find candidate ROIs, followed by inputting each ROI to a classifier for subsequent object recognition.
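The two-stage structure described above (candidate ROIs first, then a classifier applied to each ROI) can be sketched as follows. This is not R-CNN itself: both stages are deliberately trivial stand-ins, and the names `propose_regions`, `classify_region` and `detect_then_recognise` are hypothetical. The image is assumed to be a nested list of grayscale values.

```python
def propose_regions(image, threshold=128):
    """Stage 1 (stand-in for segmentation): one candidate box per run of
    consecutive columns that contain at least one bright pixel."""
    height, width = len(image), len(image[0])
    active = [any(image[y][x] > threshold for y in range(height))
              for x in range(width)]
    boxes, start = [], None
    for x, on in enumerate(active + [False]):  # sentinel closes the last run
        if on and start is None:
            start = x
        elif not on and start is not None:
            # Rows that contain a bright pixel within this column run.
            rows = [y for y in range(height)
                    if any(image[y][c] > threshold for c in range(start, x))]
            boxes.append((start, rows[0], x - 1, rows[-1]))
            start = None
    return boxes


def classify_region(image, box):
    """Stage 2 (stand-in for a trained classifier): label by aspect ratio."""
    x_min, y_min, x_max, y_max = box
    w, h = x_max - x_min + 1, y_max - y_min + 1
    return "tall" if h > w else "wide"


def detect_then_recognise(image):
    """Run stage 1, then feed every candidate ROI to stage 2."""
    return [(box, classify_region(image, box)) for box in propose_regions(image)]


image = [
    [0, 255, 0, 0,   0,   0,   0],
    [0, 255, 0, 0, 255, 255, 255],
    [0, 255, 0, 0,   0,   0,   0],
]
results = detect_then_recognise(image)
```

The point of the sketch is only the control flow: the classifier never sees the whole image, just the regions proposed by the first stage.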
Other solutions, such as you only look once (YOLO) networks, region-proposal networks (RPN) or single shot detector (SSD) networks, integrate the ROI detection into the actual object recognition step.
Examples
Using a mixed architecture based on region-proposal convolutional networks (R-CNN or RPN) to define a region of interest (ROI) and classifying it by another mixed convolutional neural network (CNN) using 2D and 3D information.
Determination of a ROI for character recognition is classified in group G06V 30/146.
Devices for radiation diagnosis | A61B 6/00 |
Diagnostic systems using ultrasound, sound, or infrasound | A61B 8/00 |
Computer-aided diagnosis systems | G16H 50/20 |
Region-based segmentation image analysis | G06T 7/11 |