A Dearth of Defect Data Is No Longer a Barrier to Robust Inspection

A synthetic defect image generator based on generative adversarial network technology can create realistic data sets when real defect images are scarce, making deep learning vision inspection practical in manufacturing.

EUNSEO KIM, NEUROCLE

Deep learning vision inspection is no longer a novel approach in manufacturing. Just a few years ago, many in the machine vision industry doubted its practical utility, but adoption has since surged across the sector. Today, it is widely recognized as a cornerstone technology for quality control.

Deep learning models can identify patterns that streamline inspection on the production line. Courtesy of Neurocle.

Unlike rules-based inspection systems, which rely on manually defined thresholds and conditions, deep learning inspection adapts dynamically to variations in defect patterns, enabling more robust and accurate identification of anomalies. Additionally, manual visual inspection often suffers from inconsistency due to human error and fatigue. In contrast, deep learning models maintain high accuracy and consistency during prolonged operation, proving their indispensable value in modern manufacturing environments.


However, as more companies aim to deploy deep learning inspection systems, the focus has shifted from adopting the technology to developing high-performance inspection models.

High-performance models

There are two primary approaches to generating high-performance models: the model-centric approach and the data-centric approach.

• The model-centric approach focuses on optimizing the architecture and hyperparameters of the model itself. It involves fine-tuning layers, activation functions, and parameters to improve performance. The emphasis here is on achieving the best possible outcome by refining the model to suit the available data.

Generating high-resolution battery electrode synthetic defect images. High-resolution data can be gleaned from segments of original images. Courtesy of Neurocle.

• The data-centric approach prioritizes the quality, variety, and volume of data. Rather than relying solely on the model’s architecture, this method seeks to improve the training data set by augmenting data, reducing noise, and ensuring a balanced representation of all defect types.

Recent advancements in auto deep learning algorithms have enabled the automatic optimization of model structures and hyperparameters. This innovation allows even nonexperts to generate high-performance models.

Challenges in acquiring data

Despite these advancements, challenges remain on the data side.

A significant hurdle in this process is the lack of defect data in manufacturing, which stems from the high yield rates that are typical of industrial processes. For example, battery production often achieves a yield rate of >95%, with similar figures observed in the automotive and semiconductor industries. This results in a limited number of defect images being generated daily. Furthermore, the introduction of new products or components often brings diverse defect types with highly imbalanced frequencies, further complicating data collection.
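
To see why high yields translate into slow data collection, the arithmetic can be sketched in a few lines of Python. The throughput of 2,000 units per day, the 2% share of a rare defect type, and the target of 500 images are hypothetical values chosen for illustration; only the 95% yield figure is taken from the text above.

```python
def expected_defects_per_day(units_per_day: int, yield_rate: float) -> float:
    """Expected number of defective units per day for a given yield rate."""
    if not 0.0 <= yield_rate <= 1.0:
        raise ValueError("yield_rate must be between 0 and 1")
    return units_per_day * (1.0 - yield_rate)


def days_to_collect(target_images: int, units_per_day: int,
                    yield_rate: float, share_of_type: float = 1.0) -> float:
    """Days needed to gather target_images of a single defect type.

    share_of_type is the fraction of all defects that belong to that type.
    """
    per_day = expected_defects_per_day(units_per_day, yield_rate) * share_of_type
    if per_day == 0:
        return float("inf")
    return target_images / per_day


# Hypothetical line: 2,000 inspected units per day at a 95% yield.
print(round(expected_defects_per_day(2000, 0.95), 1))

# Hypothetical rare defect type: 2% of all defects, 500 images wanted.
print(round(days_to_collect(500, 2000, 0.95, share_of_type=0.02), 1))
```

Under these assumptions the line produces roughly 100 defective units per day overall, but only about two per day of the rare type, so gathering 500 real images of that type would take on the order of 250 production days.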

This lack of defect data presents a serious challenge for deep learning models. Such models rely on training with comprehensive data sets to learn the features of various defects. Insufficient or imbalanced data can lead to biased training and diminished inspection accuracy.

Ultimately, the issue of limited defect data not only delays the adoption of deep learning inspection systems but also impedes the creation of reliable and accurate inspection models.

Synthetic defect image generator

Does the scarcity of defect data mean that deep learning inspection cannot be implemented in manufacturing? Not at all. When data is scarce, it can be artificially generated. The spotlight here is on generative adversarial network (GAN) technology.

Detection results of high-resolution synthetic defect images. Segments, or patches, of images can be used to help train a deep learning model. Courtesy of Neurocle.

GAN, a pioneering technology in generative AI, employs two competing neural networks: a generator and a discriminator. The generator creates synthetic data, while the discriminator attempts to distinguish it from real data. Through this iterative competition, the generator learns to produce outputs that are strikingly realistic.
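
The competition between the two networks can be made concrete through the loss functions each one minimizes. The sketch below uses the standard binary cross-entropy formulation with the common non-saturating generator loss; it is a generic illustration, not a description of the specific losses used in Neurocle's software. The function names and the `eps` stabilizer are assumptions made for this example.

```python
import math


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Binary cross-entropy that the discriminator tries to minimize.

    d_real: discriminator outputs (probability of "real") on real samples.
    d_fake: discriminator outputs on generated samples.
    eps:    small constant that avoids log(0); an illustrative choice.
    """
    real_term = -sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: low when fakes are scored as real."""
    return -sum(math.log(p + eps) for p in d_fake) / len(d_fake)


# A discriminator that separates real from fake well has a low loss ...
confident = discriminator_loss(d_real=[0.99, 0.98], d_fake=[0.01, 0.02])
# ... while one that cannot tell them apart scores everything near 0.5.
confused = discriminator_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
print(confident < confused)

# The generator's loss falls as its samples fool the discriminator.
print(generator_loss([0.9]) < generator_loss([0.1]))
```

During training, the two losses are minimized in alternation: improving the discriminator makes the generator's task harder, and improving the generator pushes the discriminator's outputs back toward 0.5.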

In the manufacturing sector, GAN can be used to generate synthetic defect data, addressing the challenge of insufficient data sets. This approach significantly lowers the barriers to adopting deep learning inspection systems by supplementing the lack of real defect data.
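
How many synthetic images to generate is a design decision that the article does not specify. One simple heuristic, shown here purely as an assumption, is to top up each defect class to a common target count. The class names and counts are hypothetical.

```python
def synthetic_top_up(real_counts: dict[str, int], target_per_class: int) -> dict[str, int]:
    """Number of synthetic images to generate per defect class.

    Classes that already meet the target need no synthetic images.
    """
    return {
        name: max(0, target_per_class - count)
        for name, count in real_counts.items()
    }


# Hypothetical numbers of real defect images collected per class.
real_counts = {"scratch": 120, "dent": 35, "tab_disconnection": 4}

plan = synthetic_top_up(real_counts, target_per_class=100)
print(plan)  # {'scratch': 0, 'dent': 65, 'tab_disconnection': 96}
```

In practice the right ratio of synthetic to real images would need to be validated against held-out real defect images rather than fixed in advance.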

Industrial requirements for GAN

While GAN technology is widely used in fields such as entertainment and design, applying it directly to manufacturing requires specific adaptations. Industrial applications of GAN must meet the following criteria:

• High performance with limited data. Manufacturing often provides limited defect images, so industrial GAN must learn defect patterns effectively from small data sets.

• Preservation of background integrity. Synthetic defects must blend seamlessly into the product’s original structure without altering its background. For instance, when generating a scratch on a battery cell surface, the battery’s design and texture must remain intact.

• Accurate reproduction of unique defect patterns. Each defect type exhibits distinct patterns or characteristics that industrial GAN must precisely replicate to ensure realistic data generation.
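
The background-integrity requirement can be illustrated with a simple mask-based blend, in which only the pixels covered by a defect mask are changed. This is a minimal sketch of the general idea using plain Python lists as grayscale images; the function name and the soft-mask convention are assumptions, and the article does not describe how the actual software performs blending.

```python
def paste_defect(background, defect, mask, top, left):
    """Alpha-blend a defect patch into a copy of the background.

    background: 2D list of grayscale pixel values.
    defect:     2D list holding the defect patch.
    mask:       2D list, same shape as defect, with values in [0, 1];
                0 keeps the background pixel, 1 uses the defect pixel.
    top, left:  position of the patch's upper-left corner.
    """
    out = [row[:] for row in background]  # leave the original untouched
    for r, (defect_row, mask_row) in enumerate(zip(defect, mask)):
        for c, (d, m) in enumerate(zip(defect_row, mask_row)):
            y, x = top + r, left + c
            if 0 <= y < len(out) and 0 <= x < len(out[0]):
                out[y][x] = (1.0 - m) * out[y][x] + m * d
    return out


# Hypothetical 4x4 uniform surface with a dark 2x2 defect patch.
surface = [[100] * 4 for _ in range(4)]
scratch = [[0, 0], [0, 0]]
soft_mask = [[1.0, 0.5], [0.0, 0.0]]  # fades out toward the patch edge

result = paste_defect(surface, scratch, soft_mask, top=1, left=1)
for row in result:
    print(row)
```

Pixels where the mask is zero keep their original value, so the product's texture outside the defect is preserved, while intermediate mask values soften the transition at the defect boundary.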

Key features of GAN

To satisfy these requirements, industrial GAN employs advanced features tailored to manufacturing. These include sophisticated model architectures capable of extracting detailed defect-specific patterns from minimal data. Additionally, tools for controlled defect generation that allow users to specify precise locations for synthetic defects are essential.

Synthetic images can be incorporated into the training of the deep learning model, bridging the gap created by the scarcity of real defect images. Courtesy of Neurocle.

Generative adversarial network (GAN)-based systems can generate reliable synthetic data in situations where images are scarce. Courtesy of Neurocle.

Another critical aspect is ensuring that synthetic defects blend naturally with the product background. High-resolution image generation is indispensable to maintain realism, especially for applications in which even microscopic details are critical.

Synthetic defect data generated through industrial GAN is not merely meant to look plausible to a human viewer; it serves as the foundation for training deep learning models. Thus, the quality of this synthetic data must meet rigorous standards to ensure its utility in real-world manufacturing processes.


Workflow in deep learning software

The integration of GAN technology into deep learning vision inspection software follows a systematic and efficient workflow designed for manufacturing environments:

• Learn defect patterns with the GAN model. The first step involves training the GAN model with the available defect images. Even when data sets are small, GANs effectively learn defect patterns by leveraging advanced generative capabilities. Through iterative optimization, the model becomes capable of creating synthetic defects that closely mimic real-world characteristics.

• Generate synthetic defects in the generation center. Once the GAN model is trained, synthetic defects are produced in the generation center, a dedicated space for creating synthetic defect images for later use. Defects learned by the GAN model are stored as stamp tools, allowing users to apply them to normal images to create synthetic defect images. Depending on the defect shape, various stamps can be created for an image, and they can be applied with a simple click. After applying a stamp to a normal image, users can adjust the defect’s size, position, and angle to generate the desired defect for inspection purposes. For generating a large volume of defect images quickly, users can use the random mode to create defects on numerous normal images, varying their location, shape, size, and number. If defects need to be generated in specific areas only, users can set an ROI (region of interest) to ensure that defects are created solely within the designated area.

• Include synthetic defect images in the training data set. The generated synthetic defect images are incorporated into the training data set along with the previously acquired real data.

• Train inspection models using real and synthetic defect images. Inspection models are trained using both real images and synthetic defect images.

This streamlined workflow, powered by GAN technology and the generation center, reduces the dependency on scarce defect data, accelerates model training, and paves the way for reliable and scalable deep learning inspection systems in manufacturing.
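
The random mode with an ROI constraint described in the workflow can be sketched as sampling stamp placements whose bounding boxes stay inside a rectangle. The `Placement` class, the `random_placements` function, and the scale range are hypothetical names and values introduced for this illustration; they do not correspond to the software's actual interface.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Placement:
    x: int        # left edge of the stamp's bounding box, in pixels
    y: int        # top edge of the stamp's bounding box
    width: int    # bounding-box width after scaling and rotation
    height: int   # bounding-box height after scaling and rotation
    scale: float  # size multiplier applied to the stamp
    angle: float  # rotation in degrees


def random_placements(roi, stamp_size, count, scale_range=(0.8, 1.2), seed=None):
    """Sample stamp placements that stay fully inside a rectangular ROI.

    roi:        (x0, y0, x1, y1) with x1 and y1 exclusive.
    stamp_size: (width, height) of the unscaled, unrotated stamp.
    """
    rng = random.Random(seed)
    x0, y0, x1, y1 = roi
    placements = []
    for _ in range(count):
        scale = rng.uniform(*scale_range)
        angle = rng.uniform(0.0, 360.0)
        theta = math.radians(angle)
        w, h = stamp_size[0] * scale, stamp_size[1] * scale
        # Axis-aligned bounding box of the rotated rectangle.
        box_w = math.ceil(abs(w * math.cos(theta)) + abs(h * math.sin(theta)))
        box_h = math.ceil(abs(w * math.sin(theta)) + abs(h * math.cos(theta)))
        if box_w > x1 - x0 or box_h > y1 - y0:
            raise ValueError("stamp does not fit inside the ROI")
        placements.append(Placement(
            x=rng.randint(x0, x1 - box_w),
            y=rng.randint(y0, y1 - box_h),
            width=box_w, height=box_h, scale=scale, angle=angle,
        ))
    return placements


# Hypothetical ROI and stamp size, in pixels.
for p in random_placements(roi=(100, 50, 400, 250), stamp_size=(40, 20), count=3, seed=0):
    print(p)
```

Fixing the seed makes a batch of placements reproducible, which helps when comparing inspection models trained on the same synthetic set.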

Real-world applications

The use of GAN technology to address defect data shortages is revolutionizing the manufacturing sector. This advanced technology enables the generation of synthetic defect images, addressing data limitations and enhancing the accuracy of inspection models across industries.

• Battery manufacturing: Tackling rare lead tab disconnection issues.

A prominent battery manufacturer recently adopted GAN technology to overcome challenges related to rare lead tab disconnection defects. These defects occurred infrequently, leaving the manufacturer with an insufficient data set to train a reliable inspection model.

A relatively low number of defect images delays the training of deep learning models in manufacturing. Courtesy of Neurocle.

The team at Neurocle found it difficult to collect defect data because the defect occurred so rarely. To overcome this, they used GAN technology to generate 50 synthetic defect images that closely replicated the texture and appearance of real disconnections. Adding these images to the training data set improved model accuracy from 78% to 99%.

A key focus of the process was ensuring that synthetic defects were seamlessly blended into the background to prevent visual artifacts. Achieving realistic defect generation was crucial to producing meaningful results.

• Semiconductor manufacturing: Detecting hairline scratches on wafers.

Hairline scratches on wafer surfaces represent a common yet challenging defect for semiconductor manufacturers. These scratches require ultrahigh-resolution imagery to detect accurately, but their rarity led to a limited defect data set.

By employing GAN technology, a manufacturer generated more than 80 synthetic scratch images, carefully designed to mimic the unique patterns and characteristics of real scratches. These synthetic images enriched the data set and enhanced the model’s detection capability. The manufacturer cited the system’s ability to capture fine details such as subtle texture variations.

• Automotive manufacturing: Improving weld seam inspection.

An automotive manufacturer had trouble detecting microscopic irregularities in weld seams due to insufficient data on defective samples. Leveraging GAN technology, the company generated 60 synthetic defect images that seamlessly integrated into high-resolution weld images, which they said not only diversified the data set but also significantly helped boost precision.

These use cases demonstrate the transformative potential of GAN technology in industries in which obtaining real-world defect data is a challenge. These generated synthetic defect images offer manufacturers the ability to train robust inspection models while maintaining high resolution and realism.

As manufacturers increasingly adopt GANs to supplement their data sets, industries can expect significant improvements in defect detection accuracy and overall product quality.

Generating high-resolution images

In today’s manufacturing environments, inspecting high-resolution images is becoming increasingly important. In settings in which thousands of line-scan cameras must detect complex defects rapidly and accurately, high-resolution image data is an essential resource.

But what happens when high-resolution defect data is scarce? Can synthetic defects still be generated effectively? Yes, through techniques such as the patch method, synthetic high-resolution defect data can be generated without resizing the images. Instead of downscaling, which risks losing crucial pixels containing defects, the patch method divides high-resolution images into smaller segments (patches) for learning and reproduction. This ensures that even microscopic defects within large-scale images can be accurately captured.

When generating synthetic defects, patches are created individually while ensuring seamless integration within the overall image. This method avoids compromising the original structure or context of the image, preserving its fidelity and enabling effective model training.
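
The core of the patch idea, working at full resolution on small tiles and then restoring them to their original positions, can be sketched as follows. This example uses non-overlapping tiles and plain Python lists; the article does not state whether the actual method uses overlapping patches or how seams are blended, so those details are left out, and the function names are assumptions.

```python
def split_into_patches(image, patch_h, patch_w):
    """Cut a 2D image (list of rows) into non-overlapping patches.

    Returns a list of (top, left, patch) tuples. Edge patches are smaller
    when the image size is not a multiple of the patch size.
    """
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height, patch_h):
        for left in range(0, width, patch_w):
            patch = [row[left:left + patch_w] for row in image[top:top + patch_h]]
            patches.append((top, left, patch))
    return patches


def reassemble(patches, height, width):
    """Write patches back at their original positions to rebuild the image."""
    image = [[0] * width for _ in range(height)]
    for top, left, patch in patches:
        for r, row in enumerate(patch):
            image[top + r][left:left + len(row)] = row
    return image


# Hypothetical 5x7 "image" whose pixel values encode their position.
image = [[r * 7 + c for c in range(7)] for r in range(5)]
patches = split_into_patches(image, patch_h=2, patch_w=3)

# No pixel is resized or dropped: the round trip reproduces the input.
assert reassemble(patches, 5, 7) == image
print(len(patches), "patches")
```

Because every pixel is carried through unchanged, a defect only a few pixels wide survives the split, which would not be guaranteed if the image were downscaled first.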

Synthetic defect generation via the patch method empowers manufacturers to overcome data scarcity, especially for complex and high-resolution inspection requirements.

Enhance inspection performance

GAN technology is emerging as a transformative solution to one of manufacturing’s most pressing challenges: defect data scarcity. By enabling the creation of realistic synthetic defect data, GAN accelerates the adoption of deep learning inspection systems.

Furthermore, by generating synthetic defect data that closely resembles real-world defects, highly accurate inspection models can be created. Synthetic defect generation technology is expected not only to facilitate the adoption of deep learning vision inspection in manufacturing but also to enhance inspection performance across various environments through advanced image generation capabilities.

Meet the author

Eunseo Kim is a product marketing manager at Neurocle, a leading provider of deep learning vision inspection software. With extensive experience in developing marketing strategies for AI-based inspection solutions, Kim bridges the gap between technical innovations and market needs, ensuring customers understand the transformative potential of AI in manufacturing.


Published: March 2025

