Machine Vision Guides
Bruce Fiala
High-resolution imaging demands a combination of custom and off-the-shelf hardware and software.
Automating photonic device assembly processes can seem daunting. In a typical application,
progressive assembly operations such as locating, guiding, gluing and bonding take
place in a local region around a static package. The challenge is to integrate these
different mechanical systems precisely to fit all active components within a small
space.
Consider the basic task of inserting an optical
fiber into a ferrule for gluing. Prior to insertion, the automation system must
determine the exact location of the fiber end in three-dimensional space while faced
with positioning error related to limited repeatability in gripping and to fiber
curl.
Figure 1. The inspection strategy for this assembly operation for a transimpedance receiver combines air bearing servos, vision guidance and active vibration damping on the Z-axis.
To accurately pinpoint location, the
vision system often must access multiple views and magnifications of the target
object (Figure 1). It also may be necessary to maximize the optical working distance
to allow assembly directly beneath the camera. In the search for adequate system resolution, end users will most likely find that their application requires a combination of off-the-shelf and custom vision components and software.
Cameras and lenses
Precision vision applications require camera lenses
with high magnifications. Such optics not only have shallow depths of field and
short working distances, but also require bright light. The magnification of the
lens won’t limit system resolution, but its quality and the wavelength of
the illuminating light will. High-quality, high-magnification lenses have a high
numerical aperture. In practice, a system featuring 10x magnification, a numerical aperture of 0.28 and 640-nm light offers a resolution of 2.78 μm; the resolution improves (the minimum resolvable distance shrinks) at shorter wavelengths (Figure 2).
Figure 2. For a vision system to resolve black objects, there must be some area with a different gray level between them. The smallest distance d_min separating the two black objects is given by d_min = 1.22 λ/NA, where both d_min and λ are in microns.
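For readers who want to check the arithmetic, the short Python sketch below simply evaluates the Figure 2 formula; the wavelength list is an arbitrary illustration, not a recommendation.

# Diffraction-limited resolution per Figure 2: d_min = 1.22 * wavelength / NA,
# with wavelength and d_min in microns.
def d_min(wavelength_um, na):
    return 1.22 * wavelength_um / na

na = 0.28
for wavelength_nm in (640, 550, 450):   # red, green and blue illumination
    print(f"{wavelength_nm} nm: d_min = {d_min(wavelength_nm / 1000.0, na):.2f} um")
# 640 nm at NA 0.28 gives roughly 2.79 um, matching the 2.78-um figure quoted in the text.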
Contributing to the quest for resolution
is the camera. The camera’s photosensor captures light and assigns a gray
level to each pixel. Row by row, the system reads pixel intensity values from the
sensor and sends them to the image processing system. The sooner the system converts
sensor information to a digital signal, the better chance it has of preserving true
gray-level values.
In an analog system, the image processor
resident on the computer bus digitizes the image. Noise from the bus and peripheral
components can degrade gray-level resolution from 8 bits to 5 or 6 bits. Uncertainties
in the pixel clock also create inaccuracies in how the system assigns a gray level
to a pixel (jitter) and how gray levels of one row of pixels relate to the levels
of the next row.
There are no such problems in a digital
system. The analog-to-digital conversion takes place inside the camera, and the
resident pixel clock is digital. Resolution is 10 bits or more.
Specifying a digital system does not
guarantee the best vision system, though. Digital cameras with CMOS imagers have
some limitations because the imagers are pixel-addressable, and the conversion electronics
are adjacent to the photo site. The spacing between sites is greater than that of
a CCD imager. Therefore, for a given area, the camera is less sensitive to light.
In addition, CMOS imagers are inherently noisier because more electronics are near
the analog-to-digital conversion area.
Whether digital or analog, the pixel
size is the driver for effective system resolution. Using a 10x magnification with
a 7-μm pixel yields a theoretical resolution of 0.7 μm. Typically, the
smaller sensor formats (1/4 and 1/3 in.) have a smaller pixel size, which means that high-resolution applications are generally addressed with these formats. The smaller pixels reduce the system magnification needed, which increases depth of field and lighting uniformity across the field.
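A minimal sketch of the trade-off, using only the numbers already quoted above: the object-space pixel size is the sensor pixel pitch divided by the magnification, and the usable system resolution can be no better than the larger of that value and the diffraction limit from Figure 2.

# Effective resolution: pixel-limited figure (pixel pitch / magnification)
# versus the optics-limited figure d_min from Figure 2.
def pixel_limited(pixel_pitch_um, magnification):
    return pixel_pitch_um / magnification

def diffraction_limited(wavelength_um, na):
    return 1.22 * wavelength_um / na

pixel = pixel_limited(7.0, 10.0)           # 7-um pixel at 10x -> 0.7 um
optics = diffraction_limited(0.640, 0.28)  # ~2.79 um at 640 nm, NA 0.28
print(f"pixel-limited: {pixel:.2f} um, optics-limited: {optics:.2f} um")
print(f"effective system resolution: about {max(pixel, optics):.2f} um")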
Standard cameras are the most sensitive
to light at a wavelength of 600 nm and have 70 percent less sensitivity to 400-
and 800-nm light. Cameras sensitive to UV radiation are available at higher cost.
Shedding light on lighting
For most vision applications, backlighting is
the technique of choice. Unfortunately, because of space constraints or the need
to work inside a package, this technique is rarely usable in photonic device assembly.
The other option usually involves one of two common top-lighting methods.
Figure 3. Coaxial lighting works well when illuminating reflective
or uneven surfaces perpendicular to the camera axis. This 125-μm fiber is illuminated
with a white background (left) and without (right).
•
Through-the-lens coaxial
light integrated into the optics. This works well when illuminating reflective
or uneven surfaces perpendicular to the camera axis. The illuminated area is equal
to the primary lens’s viewable field. Because the concentrated light fits
within the field, it provides the most power (Figure 3).
•
Top lighting with a line
light or ringlight. This technique provides uniform and shadow-free images with
the light centered on the object and at the proper working distance and angle (Figure
4).
Figure 4. Top lighting provides uniform and shadow-free images with light centered on the object and at a proper working distance and angle. The 125-μm fiber is illuminated with a white background (left) and without (right).
In both instances, the light source
involves either LEDs or a halogen lamp. Output intensity limits the useful working
distance of an LED ring. Minimize working distance, however, and the coaxial technique
using an LED will work for some higher-magnification applications.
Illuminating the object using a halogen
lamp and a fiber bundle is the most common way to obtain the intense light required
by the application. Fiber bundles are bulky and restraining, however, and they may
cause unwanted motion errors when tethered to a camera lens or moving stage.
Halogen lamps also have drawbacks.
At full power, a lamp may last only 400 hours. As lamp life dwindles, so does its
output, affecting the image quality. More costly lamp houses can compensate for
the intensity decay by monitoring the output and increasing the current available
to the lamp.
To reduce distortions from nonuniform
bending of light traveling at different wavelengths through multiple lenses, systems
should use a single-color light source when available. Options include single-color
LEDs or halogen lights with a bandpass filter added to the output. Assuming the
system uses standard cameras, red light is the most efficient option because most
cameras are optimized for that color. Shorter wavelengths will produce a sharper, better-resolved image but require greater intensity because of reduced camera sensitivity.
High-power spectral line lamps and lamp houses are available for intense wavelength-specific
illumination at shorter wavelengths.
Z focusing
Most vision applications for precision photonic
device placement require Z focusing to compensate for planar differences of parts,
nests and carriers. A lens has an optimum working distance. High-magnification optics
have a shallow focus range. Z focusing is the act of moving the camera relative
to the object to bring it into the focus plane.
Pick-height repeatability may be necessary
for consistent picking of a delicate object or the precision placement of a device
in Z relative to another device. Z focusing establishes a repeatable height relationship
between the focal point of the camera and the object to be picked. Once the object
is precisely located, the mechanism moves to the object using an offset in X-Y-Z.
The focusing can be achieved in two
ways. The optical system may come with an internal motorized focusing element, or
the user can vary the distance between the camera and lens and the imaged object
by moving either the camera or the part. In either case, the software side of the focus search is, in theory, straightforward.
Typically, the system works with a set edge location within a region of interest,
taking images as the mechanics move through a focusing range. A gradient operator
on each image determines the contrast of the edge to the background. Software optimization
leads mechanics to converge on the best edge result.
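The following Python sketch illustrates that search loop. The move_z() and grab_roi() helpers are hypothetical stand-ins for the motion and acquisition layers, and a production routine would refine around the best coarse step rather than stop there.

import numpy as np

def focus_score(roi):
    """Gradient-based contrast metric: mean squared intensity gradient over the ROI."""
    gy, gx = np.gradient(roi.astype(float))
    return float(np.mean(gx**2 + gy**2))

def coarse_autofocus(move_z, grab_roi, z_positions):
    """Step through the focusing range and return the Z position with the sharpest edge."""
    scores = []
    for z in z_positions:
        move_z(z)                                 # hypothetical command to the focus axis
        scores.append(focus_score(grab_roi()))    # hypothetical ROI acquisition
    return z_positions[int(np.argmax(scores))]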
Many factors influence how a system
converges to an in-focus condition. For example, resolution, settling time and speed
are important considerations for the motion system responsible for the focusing.
A lens with a longer focal depth will allow larger step sizes, thereby reducing
the number of iterations necessary at the expense of resolution. The acquisition
time of a standard analog camera is 33 ms. Digital cameras are at least twice as
fast.
The last consideration is the vision
processing time. Processing boards with resident processors operate more quickly
than those that rely on the computer for processing. Tightly integrating the motion
and vision for on-the-fly acquisition can substantially reduce the time between
iterations. End users should consider five images per second a worst-case Z-focus
rate.
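As a rough budgeting exercise, the per-iteration arithmetic below shows how such a rate arises; the acquisition time is the 33-ms analog frame cited above, while the processing and move-and-settle figures are placeholder assumptions, not measured values.

# Rough Z-focus cycle budget in seconds. Only the 33-ms frame time comes from
# the text; the other two figures are illustrative assumptions.
acquisition = 0.033   # one standard analog frame
processing  = 0.100   # assumed host-based vision processing
move_settle = 0.067   # assumed stage step plus settling time

cycle = acquisition + processing + move_settle
print(f"about {1.0 / cycle:.0f} focus images per second")   # ~5 per second, the worst case cited above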
There is one requisite for this software
approach to work, though. The system must find an edge. For the vision sensor to
place a region of interest around an edge, it must be in a repeatable area in the
field of view. The system could search for the initial position of the edge within
the field, but it must first be at or near focus. This creates a chicken-and-egg
scenario. The edge cannot be found unless it is in focus, and the focus cannot be
discerned unless an edge can be found.
Dual magnification
When the system does not know the precise position
of a feature or object, dual magnification with Z focusing is useful. Systems integrators
can develop a routine to coarsely locate a target feature at low magnification and
then precisely find a datum at high magnification in three-dimensional space.
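The sketch below outlines such a coarse-to-fine routine. Every helper name (low_mag_find, move_stage and so on) is a hypothetical placeholder for integrator-supplied functions, and the 3x3 matrix stands in for whatever pixel-to-stage calibration the system actually uses.

import numpy as np

def coarse_to_fine_locate(low_mag_find, pixel_to_stage, move_stage, autofocus, high_mag_find):
    """Locate a feature coarsely at low magnification, then refine at high magnification.

    low_mag_find()   -- returns the (u, v) pixel location of the target in the low-mag image
    pixel_to_stage   -- 3x3 calibration matrix mapping low-mag pixels to stage X-Y
    move_stage(x, y) -- positions the feature within the high-mag field of view
    autofocus()      -- Z-focus routine, e.g. the gradient search sketched earlier
    high_mag_find()  -- returns the refined datum from the high-mag image
    """
    u, v = low_mag_find()
    x, y, w = pixel_to_stage @ np.array([u, v, 1.0])   # apply the calibrated mapping
    move_stage(x / w, y / w)
    z = autofocus()
    return high_mag_find(), z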
The setup can include separate cameras,
each with fixed magnification, in a high/low magnification scenario. This requires
a 3-D calibration routine to relate the two. Either the cameras or the object to
be inspected must move to obtain the high- and low-magnification images. When the
application demands a precise mechanical relationship between the two images, engineers
also must consider motion-system repeatability and accuracy.
Another dual-magnification technique
employs a custom-designed optical assembly with a common input lens to split optical
paths to two cameras. There are separate magnifications, as well as different optics
in each path. With the relationship between the cameras mechanically fixed, the software can characterize that relationship precisely. Lighting required for high magnification
is greater than for low magnification, so programmable lighting intensity or blocking
filters must be added to the optical path to obtain the same illumination levels
under different magnifications. Because depth of focus will be greater at lower
magnification, the whole assembly may have to move in the Z direction relative to
the imaged object when switching between magnifications.
A motorized turret can provide a way
to move lenses with different magnifications in front of the camera, but the mechanics
of obtaining constant lighting and consistent alignment are difficult to implement.
Motorized zoom-and-focus lens assemblies with high optical quality also are available.
They offer configuration and working-distance flexibility, but can take several
seconds to change from high to low magnification and are not necessarily durable.
They also won’t work well on high-throughput machines.
Although there are several ways to
accomplish a dual-magnification scheme, it is important to consider the application’s
lighting and cable management issues before selecting a technique.
Hardware and software
The vision system should support multiple standard
(640 x 480) and large-format (1k x 1k) cameras. Some applications can require four
or more cameras. For accuracy with an analog approach, the pixel jitter specification
of the acquisition board should be below ±2 ns. End users working with a digital
path, on the other hand, should note that there is no standard interface format.
They should make sure that the system supports RS-422 or EIA-644 formats and one
of the newer Camera Link or IEEE-1394 cameras.
The setup also requires robust and
repeatable image analysis tools to make the best use of the image. No matter how
good the contrast is between an object and the background, a highly magnified object
will still look fuzzy and will have edge transitions that fall across many pixels.
The challenge for vision algorithms is to determine repeatedly where the edge transition
is.
Geometric-based searching goes a step beyond simple gray-level correlation. By developing relationships between multiple edges of an object, this
search tool learns only these relationships and does not consider the gray level
or texture of the object or the background. This technique is fast and can find
objects with major differences in lighting, scale and rotation.
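Commercial geometric search engines are proprietary, but as a rough stand-in the OpenCV sketch below compares edge contours by their Hu moments, which likewise ignore gray level and tolerate scale and rotation changes; the thresholds are arbitrary illustrations.

import cv2

def find_best_shape_match(template_img, scene_img, max_dissimilarity=0.1):
    """Approximate geometric search: match the dominant template contour against
    scene contours using Hu-moment shape comparison (scale- and rotation-tolerant)."""
    def contours(img):
        edges = cv2.Canny(img, 50, 150)   # arbitrary edge thresholds
        found, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4 return convention
        return found

    model = max(contours(template_img), key=cv2.contourArea)   # dominant template contour
    best = None
    for c in contours(scene_img):
        score = cv2.matchShapes(model, c, cv2.CONTOURS_MATCH_I1, 0.0)
        if score < max_dissimilarity and (best is None or score < best[0]):
            best = (score, cv2.boundingRect(c))
    return best   # (dissimilarity, bounding box of the match) or None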
Shape-based edge-finding tools add
important capabilities. They accurately locate roughly defined edge-based features such as lines, pairs of lines, curves and circles. The tools sum gray-scale image data along one coordinate of a region of interest to create a one-dimensional projection and then extract edge data from the differentiated projection.
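A minimal numpy sketch of that projection technique, assuming the ROI has already been placed so that the edge runs roughly parallel to the summing direction; the parabolic refinement is one common way to reach subpixel precision.

import numpy as np

def edge_position_1d(roi, axis=0):
    """Sum the ROI along one coordinate, differentiate the 1-D projection and
    return the location of the strongest transition, with subpixel refinement."""
    profile = roi.astype(float).sum(axis=axis)   # one-dimensional projection
    gradient = np.diff(profile)                  # differentiated projection
    i = int(np.argmax(np.abs(gradient)))         # strongest edge transition
    offset = 0.0
    if 0 < i < len(gradient) - 1:                # parabola through the three points around the peak
        y0, y1, y2 = np.abs(gradient[i - 1:i + 2])
        denom = y0 - 2.0 * y1 + y2
        if denom != 0.0:
            offset = 0.5 * (y0 - y2) / denom
    return i + 0.5 + offset                      # the transition lies between samples i and i+1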
Cost vs. accuracy
Also helpful is the capability to build an internal
look-up table to seamlessly compensate for linearity and perspective errors of cameras.
Any nonlinear distortion in the optics will cause pixel-to-real-world unit calibration
errors. Nonperpendicularities between the camera optics and nesting plates also
can cause perspective errors.
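One common way to build such a correction, sketched below under the assumption that a grid of targets with known world coordinates has been imaged, is a low-order polynomial fit from pixel to real-world units; it absorbs mild lens distortion and small perspective errors, while larger errors call for a full projective model or a dense lookup table.

import numpy as np

def fit_pixel_to_world(pix_uv, world_xy):
    """Least-squares fit of a second-order polynomial map from pixel (u, v)
    to world (x, y) using imaged calibration targets with known positions."""
    u, v = pix_uv[:, 0], pix_uv[:, 1]
    basis = np.column_stack([np.ones_like(u), u, v, u * v, u**2, v**2])
    coeffs, *_ = np.linalg.lstsq(basis, world_xy, rcond=None)
    return coeffs                                 # 6 x 2 coefficient matrix

def pixel_to_world(coeffs, u, v):
    basis = np.array([1.0, u, v, u * v, u**2, v**2])
    return basis @ coeffs                         # corrected (x, y) in real-world units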
Ultimately, developers of precision
systems for photonic device placement face some trade-offs between cost and accuracy.
No matter how well engineered, manufactured and assembled, the device will have
inherent mechanical inaccuracies. End users can significantly reduce system errors,
though, using the machine vision system, low thermal expansion grid plates and mapping
techniques.
Meet the author
Bruce Fiala is a senior software engineer for
robotics and vision at RTS Wright Industries LLC in Nashville, Tenn.