Digital Still Cameras: The Changing Face of Imaging

Morio Onoe, Professor Emeritus, University of Tokyo

The digital camera represents an integration of optics, mechanics and electronics consisting of three layers (Figures 1a, b and c). The top and the middle layers are printed circuit boards (PCBs), which contain most of the electronic components. The bottom layer houses most of the mechanical components.

Figure 1. Internal views of a digital still camera showing two layers of PCBs (a and b) and a base for mechanical and optical parts (c).

Figure 2, a block diagram of a digital still camera, illustrates the major difference from a traditional camera: an image focused by a lens is sensed by a CCD sensor instead of by film coated with light-sensitive silver halide emulsion. The CCD’s output is then digitized and stored in memory.

The image can then be viewed immediately on a liquid crystal display (LCD) or be transferred to a computer or a network. In a computer, it can be trimmed and retouched at will. It can be printed on paper or fabric of various sizes and colors or with special effects.

Systems such as that in Figure 2, which consist of a video camera, a frame grabber and a computer, have been around for decades and have found use in image processing and parts inspection. However, the real significance of the digital still camera is that all the functions seen in Figure 2 are packed in a compact body. Digital cameras have now become appliances, and integration with personal digital assistants (PDAs) or mobile phones has further expanded their usefulness.

Figure 2. Functional diagram of a digital camera displays the sequence followed in capturing an image.

The development of CCD sensors with a few million pixels and full-color inkjet or dye-sublimation printers makes it possible for the image quality of digital output to be comparable to that of a photograph. A compact memory card or stick provides a removable memory for a few hundred images and easy connectivity with a personal computer. CD-R and CD-R/W discs of some 600 MB are available for storing huge amounts of image data, and DVDs of a few gigabytes are possible. Archiving image data on a network provides further capabilities of global dissemination and remote and on-demand printing.

Optical considerations

The photosensitive area of a CCD sensor is usually smaller than that of 35-mm film. Hence the optical system of a digital camera is a shrunken version of that of a film camera. Macro- and telephoto capabilities are easily housed within a compact body, whereas a wide-angle lens is more difficult to fit. Because the size of a CCD sensor varies, the focal length of a digital camera lens is usually converted into an equivalent focal length of a 35-mm film camera, which has the same angle of view. Optical zooming is available while further zooming is possible via digital image processing.

A typical specification for a camera using a 1/1.8-in. CCD is F2.8 to 4.5 with f = 7.3 to 21.9 mm, which is equivalent to 35 to 105 mm in a 35-mm camera.
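
The conversion to an equivalent focal length can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the sensor diagonal is inferred from the quoted focal lengths on the assumption that equal angle of view means equal ratio of focal length to image diagonal.

```python
# Diagonal of the 36 x 24 mm frame of 35-mm film (about 43.3 mm).
FILM_DIAGONAL_MM = (36**2 + 24**2) ** 0.5


def conversion_factor(actual_mm: float, equivalent_mm: float) -> float:
    """Ratio between a lens's 35-mm equivalent and its real focal length."""
    return equivalent_mm / actual_mm


def equivalent_focal_length(actual_mm: float, factor: float) -> float:
    """Focal length on 35-mm film that gives the same angle of view."""
    return actual_mm * factor


factor = conversion_factor(7.3, 35.0)              # wide end quoted above
tele = equivalent_focal_length(21.9, factor)       # tele end
implied_diagonal = FILM_DIAGONAL_MM / factor       # assumption: diagonal-based scaling

print(f"factor {factor:.2f}, tele end {tele:.0f} mm, "
      f"implied sensor diagonal {implied_diagonal:.1f} mm")
```

The same factor applies across the whole zoom range, which is why both ends of the range scale together.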

The real image size of 35-mm film is 36 × 24 mm, with an aspect ratio of 3:2. In a digital camera the most popular aspect ratio is 4:3, which is common with TV and PC displays. Sometimes a square (1:1) aspect ratio is used to eliminate the difference between images taken by horizontal and vertical camera positions.

A digital camera usually has a small LCD display, which serves as an electronic finder and as a display of exposed images and information on camera settings. An optical finder, which helps in aiming a camera at a moving object, is also available. Because an electronic finder has no parallax, it is useful in macrophotography. In addition, it can be rotated independently of the lens axis, making it convenient for taking a self-portrait or a scene above a crowd.

Image capture

An interline CCD sensor is widely used for image capture. There are a large number of cells, which correspond to pixels, on a silicon chip. Typical pitch between cells is 3 to 0.5 µm. Each cell is essentially a metal-insulator-semiconductor capacitor and, at the same time, a photodiode. Incoming light focused on the chip generates electronic charge which is stored in each capacitor. There are vertical transfer lines between photodiode (capacitor) columns (Figure 3a).

Figure 3. Block diagrams illustrate progressive scan (a) and interlaced scan (b).

A vertical transfer line is a CCD shift register whose structure is similar to a neighboring capacitor column but which is shielded from incoming light by a metal layer. The manipulation of applied voltage and gates first horizontally transfers the accumulated charge in each photodiode into the adjacent capacitor cell in a vertical transfer line. Then all the vertical transfer lines are shifted upward in parallel, so that a row of information is stored in a horizontal CCD transfer line and subsequently is shifted out to an amplifier as a serial flow of data. The process repeats to read out the next row of data.
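
The readout sequence can be mimicked with ordinary lists. The sketch below is a toy model, not a description of any actual sensor: the function name is hypothetical, charge values are plain numbers, and row 0 is assumed to be the row nearest the horizontal transfer line.

```python
def progressive_readout(photodiodes):
    """Read a 2-D grid of accumulated charges row by row.

    photodiodes: list of rows, each a list of charge values.
    Returns the serial stream delivered to the amplifier.
    """
    # Step 1: every photodiode hands its charge sideways to the shielded
    # vertical transfer line beside it, all at the same instant.
    vertical_lines = [row[:] for row in photodiodes]

    serial_stream = []
    for _ in range(len(photodiodes)):
        # Step 2: all vertical lines shift by one cell in parallel, so one
        # complete row drops into the horizontal transfer line.
        horizontal_line = vertical_lines.pop(0)
        # Step 3: the horizontal line shifts out one charge at a time.
        while horizontal_line:
            serial_stream.append(horizontal_line.pop(0))
    return serial_stream


print(progressive_readout([[1, 2], [3, 4]]))
```

Because step 1 copies every cell at once, the exposure ends at the same moment for the whole image, regardless of how long the serial readout takes.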

The output of the amplifier is converted into digital values (typically 10 bits) by an A/D converter. This row-by-row readout is called progressive mode. Because the data of an entire image are transferred all at once into cells of the vertical transfer lines, electronic shuttering is possible. A drawback of the structure shown in Figure 3a is that the area of the vertical transfer lines is large relative to the area of the photodiode columns. This increases the chip size and hence the price.

A remedy is shown in Figure 3b. The operation is similar to the previous case, but an entire image's data are read out in two sequences, the even rows first and the odd rows next. The cell length of the vertical transfer lines is doubled, so the width of the interlines can be reduced. This operation is called interlaced mode. There is a time interval between the two fields, a half image of even rows and another of odd rows. Therefore, a mechanical shutter is required to reduce motion blur.
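
A minimal sketch of the two-field readout follows, assuming rows are numbered from 0 so that "even" means rows 0, 2, 4 and so on. The function names are hypothetical.

```python
def interlaced_readout(photodiodes):
    """Split a frame into two fields: even-numbered rows, then odd-numbered rows."""
    even_field = [row[:] for row in photodiodes[0::2]]
    odd_field = [row[:] for row in photodiodes[1::2]]
    return even_field, odd_field


def weave(even_field, odd_field):
    """Reassemble a full frame by alternating rows from the two fields."""
    frame = []
    for i in range(len(even_field) + len(odd_field)):
        source = even_field if i % 2 == 0 else odd_field
        frame.append(source[i // 2])
    return frame


frame = [[0, 0], [1, 1], [2, 2], [3, 3], [4, 4]]
even, odd = interlaced_readout(frame)
print(even, odd, weave(even, odd) == frame)
```

In a real sensor the two fields are read at different times, so anything that moves in between appears displaced in alternate rows once the fields are woven together. Closing a mechanical shutter before readout avoids this.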

A guard zone surrounding each cell prevents charge spillover to the next cell. Such a spill causes a smear, which often showed up in early sensors when there was a bright spot, like a white ball, in a scene. Sensitivity or signal-to-noise ratio of a sensor is determined by shot noise and dark current.

A good design is comparable in sensitivity to conventional photographic film, with the equivalent to ISO 200 or 400 easily available. Linearity between the input light and the output is excellent (gamma = 1) and leaves ample room for image processing at a later stage. The output digital data, however, are usually compressed from 10 to 8 bits, a size convenient for computer processing. In order to preserve a dynamic range, gamma is reduced to around 0.5.
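
The effect of reducing gamma during the 10-to-8-bit compression can be illustrated as follows. This is a sketch under the assumption of a pure power-law curve; actual cameras may use different tone curves, and the function name is hypothetical.

```python
def compress_10_to_8(value_10bit: int, gamma: float = 0.5) -> int:
    """Map a linear 10-bit sample (0..1023) to 8 bits (0..255) with a power curve."""
    if not 0 <= value_10bit <= 1023:
        raise ValueError("expected a 10-bit sample")
    normalized = value_10bit / 1023
    return round(255 * normalized ** gamma)


for sample in (0, 4, 256, 1023):
    linear = sample >> 2                      # plain truncation to 8 bits
    print(sample, linear, compress_10_to_8(sample))
```

With plain truncation, a dark sample of 4 maps to 8-bit code 1. With gamma 0.5 it maps to a noticeably higher code, so shadow detail is spread over more of the 8-bit range at the cost of coarser steps in the highlights.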

The number of pixels determines an image's resolution. Increasing the number of pixels on the same area of a chip, however, reduces the fill factor and degrades the sensitivity because the wiring and guard zone cannot be proportionally reduced. In a high-end camera, 2 to 3 million pixels are available, whereas a resolution one order of magnitude lower is common in a low-end camera. Cells are usually arranged in a rectangular array as shown in Figures 3a and b. Sometimes a honeycomb array (Figure 3c) is used to improve MTF along both vertical and horizontal directions. Its fabrication is somewhat easier. Its drawback, however, is a need for resampling and interpolation to fit data in standard image formats.

Figure 3c. Structure of CCD sensors in a honeycomb array.

Resolution can be doubled by mechanically shifting a sensor a half pitch by a built-in piezoelectric actuator between the first and second exposures and combining the images. This is useful in capturing an image of a fine document.
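
Combining the two exposures amounts to interleaving their samples. The sketch below assumes the half-pitch shift is along the row direction only, which the description above does not specify, and uses a hypothetical function name.

```python
def interleave_half_pitch(first, second):
    """Merge two exposures whose sampling grids are offset by half a pixel
    along the row direction, doubling the number of samples per row."""
    merged = []
    for row_a, row_b in zip(first, second):
        merged_row = []
        for a, b in zip(row_a, row_b):
            merged_row.extend([a, b])   # unshifted sample, then shifted sample
        merged.append(merged_row)
    return merged


print(interleave_half_pitch([[1, 3], [5, 7]], [[2, 4], [6, 8]]))
```

The scene must stay still between the two exposures, which is why the technique suits documents rather than moving subjects.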

Spectral response of a CCD sensor is wide and extends beyond the visible into the infrared, making an infrared sensor possible for special purposes. In a digital camera, color information is obtained by applying a red, green or blue (RGB) filter over each cell. Human eyes are more sensitive to green. Hence the number of green cells is usually twice that of red or blue (Figure 3a-c, Bayer's array). Full RGB information for each cell is obtained by digital interpolation over neighboring cells. The results are better when neighboring cells are similar in color. Hence a digital camera yields an excellent image in macrophotography but not in wide-angle photography, where fine details of different colors often exist.
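
One simple way to carry out the interpolation is to average same-colored cells in the 3 × 3 neighborhood of each cell. The sketch below assumes an RGGB tile layout and this particular averaging rule; neither is stated in the article, and cameras generally use more elaborate schemes. The function names are hypothetical.

```python
def bayer_color(row: int, col: int) -> str:
    """Filter color over a cell, assuming an RGGB tile (an assumption)."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"


def demosaic(raw):
    """Estimate (R, G, B) at every cell from a single-channel Bayer mosaic."""
    height, width = len(raw), len(raw[0])
    if height < 2 or width < 2:
        raise ValueError("need at least a 2 x 2 mosaic")
    image = []
    for r in range(height):
        out_row = []
        for c in range(width):
            sums = {"R": 0.0, "G": 0.0, "B": 0.0}
            counts = {"R": 0, "G": 0, "B": 0}
            # Accumulate same-colored samples in the 3 x 3 window, clipped at edges.
            for rr in range(max(0, r - 1), min(height, r + 2)):
                for cc in range(max(0, c - 1), min(width, c + 2)):
                    color = bayer_color(rr, cc)
                    sums[color] += raw[rr][cc]
                    counts[color] += 1
            pixel = {k: sums[k] / counts[k] for k in "RGB"}
            pixel[bayer_color(r, c)] = float(raw[r][c])  # keep the measured sample
            out_row.append((pixel["R"], pixel["G"], pixel["B"]))
        image.append(out_row)
    return image


levels = {"R": 100, "G": 50, "B": 20}
mosaic = [[levels[bayer_color(r, c)] for c in range(4)] for r in range(4)]
print(demosaic(mosaic)[1][1])
```

On a uniformly colored patch the interpolation recovers the color exactly. Where color changes from cell to cell, the averages mix unrelated neighbors, which is the source of the artifacts in fine, multicolored detail.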

CCD sensors give excellent image quality with low noise. They are rather expensive, however, because manufacturing requires a production line separate from conventional processor and memory chip lines. A CMOS image sensor can be manufactured together with its associated transistors in a conventional production line. Electric charge or potential of each photodiode is individually read out and amplified by three transistors surrounding each cell. The complex structure, which also reduces the light-sensitive area, and relatively large noise have hindered widespread use of CMOS sensors. However, the CMOS sensor is less expensive, needs only one supply voltage, enjoys a low consumption of power and can replace CCDs in low-end digital cameras.

Memory and image formats

Most cameras have both built-in and removable memories to store images. Some eliminate removable memory to shrink size or reduce cost and instead provide a communication port for the transfer of image data to a PC or network. Some have a minimum of built-in memory for processing and depend solely on removable memory for storage of entire image data. In some, both memories have a capacity of more than a few megabytes. In this case, a limited degree of image management can be done within a camera.

A removable memory provides a convenient means of image transfer via an adapter to a PC or network. It can be erased and reused after the transfer. Storage capacity is increased by adding cards. The once popular PC card has been replaced by smaller memory cards such as SmartMedia cards, CompactFlash cards or Memory Sticks.

The choice of image size depends on intended applications. Typical available sizes are 2048 × 1536, 1024 × 768 and 640 × 480 pixels in the traditional 4:3 form factor. In low-end cameras, the size may be much smaller. For example, 640 × 480 pixels suffice for a full screen display (usually 72 dpi) on a PC as well as for printing (usually 180 dpi or more in a dye-sublimation printer) on postcard-size paper. Much smaller images are sufficient for most Web applications.
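
The relation between pixel count, dpi and physical size is simple division, as the following sketch shows. The function name is hypothetical, and the dpi values are the ones quoted above.

```python
def size_inches(width_px, height_px, dpi):
    """Physical size of an image reproduced at a given dots-per-inch setting."""
    return width_px / dpi, height_px / dpi


screen = size_inches(640, 480, 72)     # on a 72-dpi display
paper = size_inches(640, 480, 180)     # on a 180-dpi print

print(f"screen: {screen[0]:.1f} x {screen[1]:.1f} in.")
print(f"print:  {paper[0]:.2f} x {paper[1]:.2f} in.")
```

A larger print at the same dpi needs proportionally more pixels along each side.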

Image compression techniques such as JPEG are widely used for packing more images in a given memory. The memory requirement of a full-color (8 bits each for RGB) 640 × 480 pixel image is about 1 MB without compression and is reduced to 1/4 to 1/16 of that with JPEG compression, depending on the allowable degradation of image quality. The higher the compression, the lower the image quality. It should be noted that JPEG is a lossy (nonreversible) compression and the original quality cannot be recovered from a compressed image. Hence, if image processing in a later stage is anticipated, it is safer to use higher quality compression or choose an image size one order larger.
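
The arithmetic behind these figures can be checked directly. In the sketch below, the 16 MB card capacity is a hypothetical value chosen only for illustration, and the function names are likewise assumptions.

```python
def uncompressed_bytes(width_px, height_px, bits_per_channel=8, channels=3):
    """Storage needed for one image with no compression."""
    return width_px * height_px * channels * bits_per_channel // 8


def images_per_card(card_bytes, image_bytes, compression_ratio):
    """Whole images that fit when each is compressed by the given ratio."""
    return int(card_bytes // (image_bytes / compression_ratio))


raw = uncompressed_bytes(640, 480)
CARD_BYTES = 16 * 1024 * 1024   # hypothetical 16 MB card, not from the article

for ratio in (1, 4, 16):
    print(f"1/{ratio}: {raw // ratio} bytes per image, "
          f"{images_per_card(CARD_BYTES, raw, ratio)} images per card")
```

Actual JPEG file sizes vary with image content, so a fixed ratio is only a planning estimate.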

Built-in processing

A digital camera’s processor has to conduct many tasks, including controlling exposure, shutter, focus, zoom and flash — as does an electronic film camera. However, additional image processing and management such as display, resampling, interpolation, compression and data transfer are required. These usually involve pixel-by-pixel operations and take time if done sequentially by a conventional processor. To reduce time between exposures, a special processor is usually used.

Color balance is required to compensate for color temperature differences of a light source. The actual color of white paper is bluish (cooler) in daylight and reddish (warmer) in room light using an incandescent bulb. Nevertheless, human vision automatically compensates for the difference. In a film camera, this compensation is done by changing the film itself — daylight type or room-light type. In a digital camera, built-in control of color (white) balance is required. This is done either manually by looking at a built-in LCD display or automatically. In the latter, the brightest portion of an image is picked up and its color is shifted more to white. Either way requires pixel-by-pixel operations, which are usually included in the special processor.
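
The automatic method can be sketched as follows, assuming that "brightest" means the pixel with the largest R + G + B sum and that the correction is a separate gain per channel. Both are simplifying assumptions, and the function name is hypothetical.

```python
def white_patch_balance(pixels):
    """Scale each channel so that the brightest pixel becomes neutral white.

    pixels: list of (r, g, b) tuples with 8-bit values.
    """
    brightest = max(pixels, key=sum)
    # One gain per channel; a zero channel is left unscaled to avoid dividing by zero.
    gains = [255 / channel if channel else 1.0 for channel in brightest]
    return [
        tuple(min(255, round(value * gain)) for value, gain in zip(pixel, gains))
        for pixel in pixels
    ]


bluish = [(200, 220, 250), (80, 88, 100), (0, 0, 0)]
print(white_patch_balance(bluish))
```

The method fails when the brightest area is not actually white, for example a colored lamp in the frame, which is one reason manual control is also provided.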

Figure 4. A panoramic image is constructed from individual elements.

With a digital camera, a number of images are easily taken and transferred to a PC. Manipulation of plural images opens a variety of applications. Three-dimensional (3-D) information of an object can be extracted from images taken from various angles. Then the object on a display can be rotated to create a view from a different angle or stereo pairs. Plural images can be combined to form a seamless panoramic picture as shown in Figure 4.

CD-R and CD-R/W are used for image archives up to 650 MB. CD-R can be read by a conventional CD-ROM drive, which is a standard peripheral of the PC. Network archives are popular because they allow remote viewing and printing.

Network integration

Once a digital camera is connected to a PC via wire, infrared light or card media, all the applications on the Internet are at hand. Images may be attached to an e-mail, opened to the public as a Web gallery or stored in Web archives for dissemination and remote printing.

A recent trend, however, is direct connectivity to a network without using a PC. A camera with a built-in Personal Computer Memory Card International Association (PCMCIA) card slot, now called a PC card, can be connected to a network via a LAN card or to a telephone network via a modem card. A built-in processor supports communication protocols.

Figure 5. Mobile phone handset with built-in digital camera.

As the infrastructure of mobile communication advances, cameras have been combined or integrated with mobile phone handsets to attain a direct connectivity to the Internet.

In 2003, just three years after the first sha-mail (photo-mail) appeared, nearly 60 percent of the nearly 40 million mobile phones manufactured came with digital camera capability (Figure 5). Some employ megapixel CCDs with resolutions exceeding those of low-end digital cameras.

In addition to traditional picture taking and mailing, a host of new applications have emerged. For example, remote reading of two-dimensional bar codes allows reliable transactions or personal identification. These devices are so versatile that some bookstores prohibit the use of mobile phones to prevent unauthorized copying of books and magazines.

Integration with multimedia

Some cameras have a built-in microphone. With the capability of image processing, a built-in processor can also handle voice recording and processing. Voice recording is used alone or as an annotation of a captured image or as a part of video recording. A shutter or self-timer can be activated by a shout.

Most cameras have standard video output, which allows viewing on a TV screen or projection by a video projector. A limited amount of image management, such as display and selection of thumbnail images, is available without the use of a PC.

Figure 6a. Slim camera (5.3 million pixels, 91.7 × 60.2 × 14.7 mm, 136 g).

Time-lapse exposures at a fixed point of observation serve as an excellent remote surveillance tool. The combination with a global positioning system (GPS), now widely used in car navigation, is used for recording routes and environment.

Slimmer or fatter

Now that the mobile phone with a built-in camera has become a consumer commodity, digital still cameras have to explore new frontiers.

Two trends, slimmer and fatter cameras, have been visible in recent years. A mobile phone has to be housed in a limited form factor which cannot accommodate a large screen display. There is also not much space for complex circuits with sophisticated functions.

Figure 6a shows an example of a slim camera. Being the size of a credit card with a thickness of less than 1.5 cm, the camera is easily carried in a shirt pocket. The optical path of the zoom lens is folded inside the housing, so that zoom action does not protrude from the outer surface.

Figure 6b. Fat camera (12.4 million pixels, 157.5 × 149.5 × 85.5 mm, 1070 g).

Figure 6b shows an example of a fat camera. A CCD larger than 35 mm film yields a resolution greater than ten million pixels. Exchangeable lenses cover a wide range from telephoto to macro. Many additional features are available for professional photography.

More intelligence

A traditional film camera usually uses an optical viewfinder with a small eyepiece, so one must tightly grip the camera with both hands to take a stable shot. A digital camera with a (rotatable) LCD display allows more freedom to aim the camera, often with one hand, e.g., shooting a parade over the heads of a crowd or taking a self-portrait at arm's length. The practice is, however, vulnerable to shaking or swaying of the camera, which yields a blurred image.

Figure 7. Antihandshake mechanism.

Now many digital still cameras feature an antihandshake mechanism, due to the recent advent of tiny gyroscope sensors made by MEMS technology (Figure 7). A pitch sensor and a roll sensor supply a compensation signal to a movable lens (or CCD) for stabilizing a focal spot on a CCD.
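
The size of the required correction can be estimated from simple geometry. The sketch below assumes a distant subject, for which a tilt of angle θ moves the image by roughly f·tan θ. It uses the 21.9 mm tele focal length quoted earlier and a 3 µm cell pitch as illustrative inputs, and the 0.1 degree tilt is a hypothetical value.

```python
import math


def image_shift_um(focal_length_mm, tilt_deg):
    """Displacement of a distant subject's image for a small camera tilt."""
    return focal_length_mm * math.tan(math.radians(tilt_deg)) * 1000.0


def shift_in_pixels(focal_length_mm, tilt_deg, pixel_pitch_um):
    """The same displacement expressed in sensor cells."""
    return image_shift_um(focal_length_mm, tilt_deg) / pixel_pitch_um


shift = image_shift_um(21.9, 0.1)            # 0.1 degree tilt (hypothetical)
pixels = shift_in_pixels(21.9, 0.1, 3.0)     # 3 um cell pitch

print(f"{shift:.1f} um on the sensor, about {pixels:.1f} pixels")
```

Even a tilt of a fraction of a degree moves the image by many pixels, so the lens or sensor must be driven in the opposite direction by the same amount during the exposure.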

Automatic exposure control and automatic focusing have been common features in traditional film cameras. Ever increasing image processing power in a digital camera enables more sophisticated face recognition and face tracking techniques for better portrait, family and group shots. Distinct features of human faces such as positions and relationships between eyes and mouth are used for face recognition (Figure 8). Automatic focusing is done based on the nearest face or on average distances of faces. Detection of an additional face in a scene can start a self-timer.

A camera can also track landmarks. With an attached module containing a GPS receiver and a three-axis compass, GPS coordinates, as well as aiming directions of the camera, are automatically embedded with captured images. Images with attribute data can be transferred to a PC, so that positions and directions of shooting are overlaid on a map and each corresponding image is popped up with a click as shown in Figure 9.

Figure 8. Detection and tracking of faces.

More eyes

Most cameras have been cyclops (a one-eyed giant in Greek mythology). A stereo camera has two eyes in a pair. Future cameras may have more eyes similar to the compound eye of an insect.

Figure 10 shows a schematic of integral photography, invented by G. Lippmann in 1908. In recording (left half of the figure), a microlens array is placed in front of a film. Each microlens forms a small subimage of the lens aperture, which differs from lens to lens and contains information on the directional distribution of incoming light at each microlens. In reconstruction (right half of the figure), the developed film is illuminated from the back and an observer looking through a microlens array can get a three-dimensional perception. (Actually the reconstructed image is reversed in front and back. There are various ways to remedy this effect, which are left out for brevity.) This reconstruction is purely optical and uniformly applied to all subimages. Viewing angle and focus are chosen by an observer. Difficulties in making the microlens array, the alignment of film and array in reconstruction, and the need for a remedy of reversed front and back have hindered acceptance of integral photography.

Figure 9. Picture of landmark integrated with map. Courtesy of Ricoh.

In a digital camera, film is replaced by an image sensor and all subimages are captured as digital data. With enough computational power, optical reconstruction in integral photography – which is essentially geometric optics of light rays – can be mimicked by digital processing, often called computational photography. Viewing angle and focus can be specified as processing parameters. The output usually consists of multiple reconstructed two-dimensional images with various viewing angles and focus. Even an image in perfect focus at every part can be composed.
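
A common way to illustrate digital refocusing is shift-and-add: views from different directions are shifted relative to one another and averaged, and the amount of shift selects the depth that comes into focus. The one-dimensional sketch below is an illustration of that general idea under simplifying assumptions (integer shifts, views already extracted from the subimages), not the method of any particular camera. The function name is hypothetical.

```python
def refocus(views, shift_per_view):
    """Shift-and-add refocusing over 1-D views.

    views: equally long lists, one per viewing direction, ordered from one
           edge of the aperture to the other.
    shift_per_view: integer pixel shift between neighboring views; its value
                    selects which depth is brought into focus.
    """
    length = len(views[0])
    center = len(views) // 2
    output = []
    for x in range(length):
        total, count = 0.0, 0
        for k, view in enumerate(views):
            src = x + (k - center) * shift_per_view
            if 0 <= src < length:          # ignore samples that fall off the edge
                total += view[src]
                count += 1
        output.append(total / count)       # the center view always contributes
    return output


# A point object that appears one pixel further right in each successive view.
views = [
    [0, 0, 9, 0, 0, 0, 0],
    [0, 0, 0, 9, 0, 0, 0],
    [0, 0, 0, 0, 9, 0, 0],
]
print(refocus(views, 1))   # shift matches the object's disparity
print(refocus(views, 0))   # shift does not match
```

When the shift matches the object's displacement between views, the samples line up and the point stays sharp. With any other shift its energy is spread over several pixels, which is the digital counterpart of defocus.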

The camera mechanism may be greatly simplified because lens movement for focusing is not necessary. Images can be refocused after a single shot. The space between the lens and image sensor can be reduced because focal length is proportional to aperture diameter for the same F-number (focal length / aperture diameter), and each microlens has a very small aperture. This allows the overall size to be made more compact. Microlens arrays can be mapped on a sphere similar to the compound eye of a dragonfly, so that a very wide angle of view can be realized.

There are still problems to be solved before these radically new digital cameras are widely available: a uniform microlens array must be made; the image sensor requires much higher resolution to record many subimages with large redundancy; and a large amount of processing power is required. Advancements in nanotechnology and processor technology are rapidly solving these problems.

Figure 10. Comparison between integral photography and computational photography.

Power consumption and batteries

A digital camera consumes more energy than an electronic film camera because of additional components such as an LCD, a fast processor and a large memory capacity. The portability of a digital camera depends on built-in batteries.

Many digital cameras now use nickel-metal hydride or lithium-ion batteries. The latter are more expensive but have higher capacity. Both need a good charger in order to be used to their full capacity.

Acknowledgment

The author wishes to thank Mr. Akira Takahashi of Ricoh Co. Ltd. for his valuable contributions, including Figure 1.