Training machine vision without the factory downtime: how digital twins are reshaping inspection

The rising cost of getting vision right
Modern production moves quickly. Setups change, lot sizes shrink and clamping errors can halt operations in an instant. Machine vision helps catch these mistakes, yet training these systems still relies heavily on real images captured on the shop floor. That means stoppages, expert support and costs that are difficult to justify in high-mix, low-volume manufacturing.

A virtual route to real accuracy
Recent research seeks to overcome this bottleneck by using digital twins to generate synthetic training images. Instead of photographing each setup, the approach renders scenes directly within the CAM environment, using basic settings rather than highly engineered, time-consuming visual models. This enables faster dataset creation and removes the need to interrupt production.

At the heart of the method is a feature extraction and classification pipeline that uses pretrained deep learning models. By leveraging these existing architectures, the system reduces the number of images required while maintaining high accuracy when distinguishing between different clamping situations.
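To make the idea concrete, here is a minimal sketch of a "frozen feature extractor plus lightweight classifier" pipeline. It is not the toolchain from the publication. The nearest-centroid classifier, the tiny hand-written images and the function names (toy_extract, fit_centroids, classify) are all assumptions chosen for illustration. The extractor in particular is only a stand-in so that the example runs without dependencies; in practice it would be a pretrained network such as SqueezeNet with its classification layer removed.

```python
from math import dist
from statistics import fmean

# Assumption: a grayscale image is a list of rows with pixel values in [0, 1].


def toy_extract(image):
    """Hypothetical stand-in for a frozen pretrained backbone.

    Returns two features: mean brightness of the left and right image halves.
    A real pipeline would return the activations of a pretrained CNN instead.
    """
    half = len(image[0]) // 2
    left = fmean(p for row in image for p in row[:half])
    right = fmean(p for row in image for p in row[half:])
    return [left, right]


def fit_centroids(samples, extract):
    """Average the feature vectors of each class (nearest-centroid training)."""
    by_label = {}
    for image, label in samples:
        by_label.setdefault(label, []).append(extract(image))
    return {
        label: [fmean(column) for column in zip(*features)]
        for label, features in by_label.items()
    }


def classify(image, centroids, extract):
    """Assign the label whose centroid is closest in feature space."""
    features = extract(image)
    return min(centroids, key=lambda label: dist(features, centroids[label]))


# Hypothetical "rendered" training images: a bright clamp on the left is
# labelled correct, a bright clamp on the right is labelled incorrect.
synthetic_set = [
    ([[0.9, 0.9, 0.1, 0.1], [0.9, 0.9, 0.1, 0.1]], "correct"),
    ([[0.8, 0.9, 0.2, 0.1], [0.9, 0.8, 0.1, 0.2]], "correct"),
    ([[0.1, 0.1, 0.9, 0.9], [0.1, 0.1, 0.9, 0.9]], "incorrect"),
    ([[0.2, 0.1, 0.8, 0.9], [0.1, 0.2, 0.9, 0.8]], "incorrect"),
]
centroids = fit_centroids(synthetic_set, toy_extract)

# Hypothetical "real camera" image: same layout as a correct setup, but with
# different lighting than the rendered scenes.
real_image = [[0.7, 0.8, 0.3, 0.2], [0.75, 0.8, 0.25, 0.3]]
prediction = classify(real_image, centroids, toy_extract)
```

Because the extractor stays fixed and only the class centroids are computed from the training images, very little data is needed to fit the classifier. This is the general reason such pipelines can work with small training sets, although the classifier used in the publication may differ from the one sketched here.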

Testing the approach in a real cell
A case study in an industrial manufacturing cell evaluated how well models trained exclusively on synthetic images performed on real camera data. Despite differences between rendered scenes and physical conditions, several deep learning models achieved perfect classification, with lightweight architectures such as SqueezeNet proving particularly efficient. Even with a minimal training set of just 21 images, the evaluator reliably detected incorrect clamping and supported automated decision-making.

Towards adaptable, human-error-resistant production
By streamlining the creation of vision training data, this digital twin toolchain cuts downtime, reduces reliance on expert knowledge and makes flexible manufacturing more viable. It aligns closely with the industry’s broader push for resilient, data-driven systems that enable quick, safe reconfiguration.

This article is based on the peer-reviewed publication “Enhancing machine vision training with digital twins: A toolchain for optimised image categorisation using synthetic training sets”, published in Procedia CIRP (Volume 134, 2025) and available via ScienceDirect.
