Machine vision: What is it and how to use it?

Machine vision is a branch of artificial intelligence, and of robotics in particular, together with the related technologies for acquiring images of real-world objects, processing them, and using the resulting data to solve applied problems with little or no human involvement.

Historic breakthroughs in machine vision

  • 1955 – Oliver Selfridge. Article “Eyes and ears of the computer”.
  • 1958 – Frank Rosenblatt. Computer implementation of the perceptron.
  • 1960s – The first image processing systems.
  • 1970s – Lawrence Roberts. The concept of machine construction of three-dimensional images of objects.
  • 1979 – Hans-Helmut Nagel. Theory of analysis of dynamic scenes.
  • 1990s – The first unmanned vehicle control systems.
  • 2003 – Corporate face recognition systems.

Components of a machine vision system

  • One or more digital or analog cameras (black-and-white or color) with suitable optics for image acquisition
  • An image acquisition interface (hardware or software) that makes images available for processing; for analog cameras, this is an image digitizer
  • A processor (a modern PC with a multi-core processor, or an embedded processor such as a DSP)
  • Machine vision software that provides tools for developing individual software applications
  • I/O equipment or communication channels for reporting results
  • A smart camera: a single device that combines all of the above
  • Specialized light sources (LEDs, fluorescent and halogen lamps, etc.)
  • Application-specific software for image processing and detection of the relevant properties
  • A sensor that detects parts (often an optical or magnetic sensor) and triggers image capture and processing
  • Actuators of some form used to sort or reject defective parts

Machine vision focuses mainly on industrial applications such as autonomous robots and visual inspection and measurement systems. Image sensor technology and control theory are therefore tied to the processing of image data used to control a robot, and the incoming data is processed in real time in software or hardware.

Image processing and image analysis mainly deal with 2D images, i.e. with how to convert one image into another: for example, per-pixel contrast enhancement, edge enhancement, denoising, or geometric transformations such as rotation. These operations assume that processing and analysis work independently of the content of the images themselves.
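As a minimal sketch of two such image-to-image conversions, the Python below applies a per-pixel linear contrast stretch and a 90-degree rotation to a small grayscale image stored as a list of rows. The function names and the sample image are illustrative assumptions, not part of any particular library.

```python
def stretch_contrast(image, out_min=0, out_max=255):
    """Linearly rescale pixel values so they span [out_min, out_max]."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A flat image has no contrast to stretch; return an unchanged copy.
        return [row[:] for row in image]
    scale = (out_max - out_min) / (hi - lo)
    return [[round(out_min + (p - lo) * scale) for p in row] for row in image]


def rotate_90_clockwise(image):
    """Geometric transformation: rotate the image 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


# Hypothetical low-contrast 3x3 grayscale image (values 100..150).
image = [
    [100, 110, 120],
    [110, 120, 130],
    [120, 130, 150],
]
stretched = stretch_contrast(image)    # values now span 0..255
rotated = rotate_90_clockwise(image)   # first row becomes the last column
```

Neither function looks at what the image depicts, which is the sense in which these operations are independent of image content.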

Computer vision focuses on the processing of 3D scenes projected onto one or more images. For example, restoring the structure or other information about the 3D scene from one or more images. Computer vision often depends on more or less complex assumptions about what is represented in images.

There is also a field called imaging, which was originally concerned with the process of creating images but sometimes also deals with their processing and analysis. For example, radiography involves analyzing image data in medical applications.

Finally, pattern recognition is a field that uses various methods, mainly statistical, to extract information from image data. A significant part of this field is devoted to the practical application of these methods.

Thus, the concept of machine vision today includes computer vision, visual pattern recognition, image analysis and processing, and so on.

Machine vision tasks

  • Recognition
  • Identification
  • Detection
  • Text recognition
  • Restoring a 3D shape from 2D images
  • Motion estimation
  • Scene restoration
  • Image restoration
  • Isolation of structures of a certain type in images, image segmentation
  • Optical flow analysis


Recognition

A classic task in computer vision, image processing, and machine vision is determining whether image data contains some characteristic object, feature, or activity.

A human solves this problem reliably and easily, but in computer vision it has not yet been solved satisfactorily in the general case: arbitrary objects in arbitrary situations.

One or more predefined or learned objects or classes of objects can be recognized (usually along with their two-dimensional position in the image or three-dimensional position in the scene).


Identification

An individual instance of an object belonging to a class is recognized.
Examples: identifying a specific human face, fingerprint, or car.


Detection

The image data is checked for a specific condition.

Detection based on relatively simple and fast calculations is sometimes used to find small regions of interest in the analyzed image, which are then analyzed with more resource-intensive techniques to obtain the correct interpretation.

Text recognition

Image search by content: Finding all images in a large set of images that have content defined in various ways.

Position Estimation: Determine the position or orientation of a specific object relative to the camera.

Optical Character Recognition: Recognizing characters in images of printed or handwritten text, usually to convert it into a text format that is more convenient for editing or indexing (for example, ASCII).

Restoring a 3D shape from 2D images

A 3D shape is restored from 2D images by stereo reconstruction of a depth map, by reconstructing the normal field and depth map from the shading of a grayscale image, by reconstructing a depth map from texture, or by determining shape from displacement.

Motion estimation

These are tasks in which a sequence of images (video data) is processed to estimate the velocity of each point in the image or in the 3D scene. Examples include determining the three-dimensional motion of the camera and tracking, that is, following the movement of an object (for example, cars or people).

Scene restoration

Two or more images of a scene, or video data, are given. The task is to recreate a three-dimensional model of the scene. In the simplest case, the model can be a set of points in three-dimensional space. More sophisticated methods reproduce a complete 3D model.

Image restoration

The task of image restoration is to remove noise (sensor noise, motion blur, etc.).

The simplest approach to this problem is to apply various types of filters, such as low-pass or median filters.

Better noise removal is achieved by first analyzing the image data for structures such as lines or edges and then controlling the filtering process based on that analysis.
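A minimal sketch of the simple filtering approach, assuming a grayscale image stored as a list of rows: a 3x3 median filter replaces each pixel with the median of its neighborhood, which suppresses isolated noisy pixels. The function name and the border handling (using only the neighbors that exist) are illustrative choices.

```python
from statistics import median_low


def median_filter_3x3(image):
    """Replace each pixel with the median of its 3x3 neighborhood.

    Border pixels use only the neighbors that lie inside the image.
    median_low keeps the result an actual pixel value from the window.
    """
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[yy][xx]
                for yy in range(max(0, y - 1), min(height, y + 2))
                for xx in range(max(0, x - 1), min(width, x + 2))
            ]
            result[y][x] = median_low(window)
    return result


# Hypothetical 5x5 uniform image with one "salt" noise pixel in the center.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
cleaned = median_filter_3x3(noisy)
```

In this example the single outlier never forms the majority of any window, so the filter removes it while leaving the uniform background unchanged.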

Optical flow analysis

Optical flow analysis finds the apparent movement of pixels between two images. It is closely related to motion estimation: a sequence of images (video data) is processed to estimate the velocity of each point in the image or in the 3D scene, for example to determine the three-dimensional motion of the camera or to track an object such as a car or a person.
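One simple way to estimate such movement is block matching, sketched below for a single block. This is an illustrative technique rather than a full optical flow algorithm: it searches a small window in the second frame for the position where a block from the first frame fits best, using the sum of absolute differences (SAD) as the cost. All names and the synthetic frames are assumptions made for the example.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(frame_a[ay + j][ax + i] - frame_b[by + j][bx + i])
        for j in range(size)
        for i in range(size)
    )


def match_block(prev, curr, x, y, size=3, search=2):
    """Return the (dx, dy) shift of the block at (x, y) in prev that best fits curr."""
    height, width = len(curr), len(curr[0])
    best_shift, best_cost = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            nx, ny = x + dx, y + dy
            # Skip candidate blocks that fall outside the frame.
            if nx < 0 or ny < 0 or nx + size > width or ny + size > height:
                continue
            cost = sad(prev, curr, x, y, nx, ny, size)
            if best_cost is None or cost < best_cost:
                best_shift, best_cost = (dx, dy), cost
    return best_shift


def make_frame(px, py, patch, width=8, height=8):
    """Build a black frame with a textured patch whose top-left corner is (px, py)."""
    frame = [[0] * width for _ in range(height)]
    for j, row in enumerate(patch):
        for i, value in enumerate(row):
            frame[py + j][px + i] = value
    return frame


patch = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
prev_frame = make_frame(2, 2, patch)
curr_frame = make_frame(3, 4, patch)  # object moved 1 pixel right, 2 pixels down
shift = match_block(prev_frame, curr_frame, x=2, y=2)
```

Repeating the search for every block of the frame gives a coarse motion field; dividing the shift by the time between frames gives a velocity estimate.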

Image processing methods

  • Pixel counter
  • Binarization
  • Segmentation
  • Reading barcodes
  • Optical Character Recognition
  • Measurement
  • Edge detection
  • Pattern matching

Pixel counter

Counts the number of light or dark pixels.
Using the pixel counter, the user can select a rectangular area of interest on the screen, for example where they expect to see the faces of people passing by. The camera immediately reports the number of pixels along the sides of the rectangle.

The pixel counter lets you quickly check whether an installed camera meets regulatory or customer requirements for pixel resolution, for example for the faces of people entering a door monitored by the camera, or for license plate recognition.
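The counting of light and dark pixels inside a rectangular region of interest can be sketched as follows. The threshold of 128, the function name, and the sample image are assumptions for illustration; a real camera or vision package exposes its own interface for this.

```python
def count_pixels(image, x, y, width, height, threshold=128):
    """Count light and dark pixels inside a rectangular region of interest.

    (x, y) is the top-left corner of the rectangle. Pixels at or above
    the threshold count as light, the rest as dark.
    """
    light = dark = 0
    for row in image[y:y + height]:
        for pixel in row[x:x + width]:
            if pixel >= threshold:
                light += 1
            else:
                dark += 1
    return light, dark


# Hypothetical 4x4 grayscale image.
image = [
    [0,   0, 200, 200],
    [0, 255, 255,  10],
    [30, 180,  90,  20],
    [0,   0,   0,   0],
]
light, dark = count_pixels(image, x=1, y=0, width=3, height=3)
```

Comparing the light or dark count against an expected range is a common pass/fail check, for example to confirm that a part is present in the selected area.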


Binarization

Converts a grayscale image to a binary one (white and black pixels).
Each pixel value is conventionally encoded as “0” or “1”. The value “0” is conventionally called the background, and “1” the foreground.

Digital binary images are often stored as a bitmap, in which one bit represents one pixel.

The two colors are usually black and white, especially in the early stages of the technology, but this is not mandatory.
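A minimal sketch of binarization with a fixed threshold, followed by packing the result into a bitmap with one bit per pixel. The threshold value, the most-significant-bit-first packing order, and the function names are assumptions chosen for the example.

```python
def binarize(image, threshold=128):
    """Map each grayscale pixel to 1 (foreground) or 0 (background)."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


def pack_row(bits):
    """Pack a row of 0/1 pixels into bytes, 8 pixels per byte, MSB first."""
    packed = bytearray()
    for start in range(0, len(bits), 8):
        chunk = bits[start:start + 8]
        byte = 0
        for bit in chunk:
            byte = (byte << 1) | bit
        byte <<= 8 - len(chunk)  # pad an incomplete final byte with zeros
        packed.append(byte)
    return bytes(packed)


# Hypothetical one-row grayscale image with 8 pixels.
gray = [[12, 200, 130, 90, 255, 0, 128, 127]]
binary = binarize(gray)
bitmap_row = pack_row(binary[0])  # 8 pixels fit in a single byte
```

Storing one bit per pixel makes the packed row eight times smaller than an 8-bit grayscale row of the same width.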


Segmentation

Used to find and/or count parts.

The purpose of segmentation is to simplify and/or change the representation of an image so that it is easier to analyze.

Image segmentation is commonly used to highlight objects and boundaries (lines, curves, etc.) in images. More precisely, image segmentation is the process of assigning labels to each pixel in an image such that pixels with the same label share visual characteristics.

The result of image segmentation is a set of segments that together cover the entire image, or a set of contours extracted from the image. All pixels in a segment are similar in some characteristic or computed property, such as color, brightness, or texture. Neighboring segments differ significantly in this characteristic.
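One basic way to assign labels and count parts is connected-component labeling of a binary image, sketched below. It is only one of many segmentation techniques; the choice of 4-connectivity, the function name, and the sample image are assumptions made for illustration.

```python
from collections import deque


def label_components(binary):
    """Label 4-connected foreground regions of a binary image.

    Returns (labels, count): labels has the same shape as the input,
    with 0 for background and 1..count for each separate region.
    """
    height, width = len(binary), len(binary[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or labels[y][x] != 0:
                continue
            count += 1
            labels[y][x] = count
            queue = deque([(x, y)])
            while queue:  # breadth-first flood fill of one region
                cx, cy = queue.popleft()
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((nx, ny))
    return labels, count


# Hypothetical binary image containing three separate parts.
binary = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
labels, part_count = label_components(binary)
```

Every pixel of one region receives the same label, so the number of labels equals the number of parts found.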

Reading barcodes

A barcode is graphic information applied to the surface, marking, or packaging of a product so that it can be read by technical means: a sequence of black and white stripes or other geometric shapes.
In machine vision, barcode reading means decoding 1D and 2D codes designed to be read or scanned by machines.

Optical Character Recognition

Optical Character Recognition: Automated reading of text such as serial numbers.

OCR is used to convert books and documents into electronic form, to automate business accounting systems, or to publish text on a web page.

OCR makes it possible to edit text, search for words or phrases, store the text in a more compact form, display or print the material without loss of quality, analyze the information, and apply machine translation, formatting, or text-to-speech conversion.

My program written in LabVIEW for working with images

Computer vision is used for non-destructive quality control of superconducting materials.

Introduction. Ensuring integrated security (anti-terrorist and mechanical security of facilities as well as the technological safety of engineering systems) currently requires systematic monitoring of the current state of objects. Among the most promising monitoring methods are optical and optoelectronic methods based on processing video images from an optical source. These include programs for working with images, the latest image processing methods, and equipment for acquiring, analyzing, and processing images, that is, the set of tools and methods that make up the field of computer and machine vision. Computer vision is a general set of methods that allow computers to see and recognize three- or two-dimensional objects, whether or not they are engineering objects. Working with computer vision requires digital or analog input-output devices, as well as computer networks and IP location analyzers designed to control the production process and prepare information for making operational decisions in the shortest possible time.

Formulation of the problem. Today, the main task of the machine vision systems being designed remains the detection, recognition, identification, and qualification of potential-risk objects located at an arbitrary place within the complex's zone of operational responsibility. Existing software products aimed at these problems have a number of significant drawbacks: considerable complexity associated with the high level of detail of optical images, high power consumption, and a fairly narrow range of capabilities. Extending the detection of potential-risk objects to the search for arbitrary objects in arbitrary situations at arbitrary locations is not possible with the available software products, even with a supercomputer.

Goal. To develop a universal program for processing images from an optical source with support for streaming data analysis; that is, the program must be light and fast enough to run on a small computing device.

Tasks:


  • development of a mathematical model of the program;
  • writing a program;
  • testing the program in a laboratory experiment, with full preparation and conduct of the experiment;
  • study of the possibility of applying the program in related fields of activity.

The relevance of the program is determined by:

  • the lack of image processing programs on the software market with the output of a detailed analysis of the engineering components of objects;
  • constantly growing requirements for the quality and speed of obtaining visual information, which sharply increase the demand for image processing programs;
  • the existing need for programs that are high-performance, reliable, and simple from the user’s point of view;
  • the high cost of professional visual information processing programs.

Analysis of the relevance of the development of the program.

  • there is a need for programs that combine high performance with simple control, which is extremely difficult to achieve today. Take Adobe Photoshop as an example. This graphics editor combines functionality and ease of use well for an ordinary user, but it does not support complex image processing tools (for example, image analysis by constructing a mathematical relationship (function), or integral image processing);
  • the high cost of professional visual information processing programs. High-quality software is extremely expensive, with charges even for individual functions of a particular software suite. The graph below shows the price/quality dependence of simple analogues of the program.

To simplify solving problems of this type, I developed a mathematical model and wrote a computer program for an image analysis device that uses the simplest transformations of the original images.

The program works with transformations such as binarization and adjustment of image brightness and contrast. Its principle of operation is demonstrated using the analysis of superconducting materials as an example.

Composite superconductors based on Nb3Sn differ in the volume ratio of bronze to niobium, the size and number of niobium fibers, the uniformity of their distribution over the cross section of the bronze matrix, and the presence of diffusion barriers and stabilizing materials. For a given volume fraction of niobium in the conductor, increasing the number of fibers reduces their diameter. This noticeably increases the Nb/Cu-Sn interaction surface, which greatly accelerates the growth of the superconducting phase. The resulting increase in the amount of superconducting phase improves the critical characteristics of the superconductor. A tool is therefore needed to control the volume fraction of the superconducting phase in the final product (the composite superconductor).
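The kind of measurement such a tool performs can be sketched in Python (the original program is written in LabVIEW and is not reproduced here). The sketch assumes that the phase of interest appears brighter than the matrix in a grayscale cross-section image and that a fixed threshold separates them; both are assumptions for illustration, as are the function name and the sample data. The area fraction measured on a cross section is taken as an estimate of the volume fraction, which is reasonable only if the wire is uniform along its length.

```python
def phase_area_fraction(image, threshold):
    """Fraction of pixels at or above the threshold in a grayscale image."""
    total = sum(len(row) for row in image)
    bright = sum(1 for row in image for pixel in row if pixel >= threshold)
    return bright / total


# Hypothetical 4x5 cross-section image: bright pixels stand for the
# phase of interest, dark pixels for the surrounding matrix.
cross_section = [
    [40,  45, 210,  50, 42],
    [48, 220, 230, 215, 44],
    [41,  47, 225,  52, 46],
    [43,  49,  51, 205, 45],
]
fraction = phase_area_fraction(cross_section, threshold=128)  # 6 of 20 pixels
```

In practice the threshold would have to be calibrated against reference samples, since brightness depends on the imaging conditions.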

When creating the program, I took into account the importance of studying the materials from which superconducting cables are made: if the ratio of niobium to bronze is incorrect, the wires may explode, resulting in human casualties, financial costs, and lost time. The program makes it possible to determine the quality of the wires based on a chemical-physical analysis of the object.
