2.7 Color Imaging Systems

A color imaging system embodies any combination of chemical, electronic and optical technologies and devices required to perform image capture, signal processing, and image formation. This document only scratches the surface of the many possible combinations. The HVS is itself a color imaging system: light is captured by the eye, signals are generated, and an image is formed in the mind. Interested readers should refer to Madden and Giorgianni (2007) for a comprehensive publication on color imaging systems.

2.7.1 Image Capture

Image formation requires that a signal be captured and processed in the first place. In section 2.3.2, spectroradiometers, spectrophotometers, and tristimulus colorimeters are presented as the instruments used to perform color stimuli measurements in Colorimetry.

While most of the time they are used to perform unrelated color measurements of a single sample, they share similar principles and components with cameras and telescopes:

- An entrance slit akin to the tiny aperture of pinhole cameras.
- Collimating lenses or mirrors akin to telescope mirrors or camera lenses, beaming and guiding the light toward the measuring portion of the instrument.
- A prism or diffraction grating that spreads the light beam so that individual components can be measured at discrete intervals.
- A Complementary Metal-Oxide-Semiconductor (CMOS) or Charge-Coupled Device (CCD) sensor similar to those of movie or astrophotography cameras.

Typical spectrometer components. Biology Wiki. (n.d.). Spectrometer. Retrieved October 14, 2018, from http://biologywiki.apps01.yorku.ca/index.php?title=File:Spectrometer.png

Optical diagram of a typical DSLR. Unknown. (n.d.). DSLR Optical Diagram. Retrieved October 14, 2018, from https://www.dpreview.com/forums/post/61693925

CMOS sensors tend to be cheaper to produce and consume less power, e.g. up to 100 times less, but usually exhibit more noise than CCD sensors. CMOS sensors are typically found in consumer and motion picture cameras, while CCD sensors are usually used in astrophotography and scientific applications requiring the lowest possible noise profile.

Both sensor types register photon hits which, upon conversion to electrons, are stored into charge packets that are then read out.

A successful and accurate color imaging system must have color reproduction characteristics following those of the HVS, and thus must be trichromatic. Image capture devices such as consumer and movie cameras attempt to be colorimetric by satisfying the Maxwell-Ives criterion, which states that their spectral responses, i.e. their sensitivities, must be a linear combination of the CMFs in order to reduce metameric failures.
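As an illustration, the criterion can be tested by fitting the camera sensitivities with a linear combination of the CMFs and measuring the residual error. The following is a minimal sketch, assuming hypothetical `sensitivities` and `cmfs` arrays sampled at the same wavelengths:

```python
# A minimal sketch of testing the Maxwell-Ives criterion; "sensitivities"
# and "cmfs" are assumed to be (n_wavelengths, 3) arrays sampled at the
# same wavelengths.
import numpy as np

def maxwell_ives_residual(sensitivities, cmfs):
    # Solve for the 3x3 matrix M minimising ||cmfs @ M - sensitivities||.
    M, _, _, _ = np.linalg.lstsq(cmfs, sensitivities, rcond=None)
    fit = cmfs @ M
    # RMS error of the best linear fit: zero for a truly colorimetric camera.
    return np.sqrt(np.mean((fit - sensitivities) ** 2))
```

A camera whose sensitivities yield a residual close to zero would, in this idealized sense, be colorimetric; real cameras deviate, as illustrated below.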

Nikon 5100 sensitivities compared to the HVS cone fundamentals for the 2 degrees observer.

In practice, most consumer and motion picture cameras are not colorimetric, and compromises are made to increase performance, such as offsetting the red response of the sensor toward longer wavelengths to reduce the noise arising from a peak centered like that of the L cones.

Traditionally, photographs and movies were captured using photographic film. However, with the digital transition and the advent of digital capture in the 2000s, this practice has become rarer; it will thus not be described here, in favor of electronic color imaging systems.

While some high-end camera systems use an expensive set of three dichroic filters, three CCD sensors and a beam-splitter prism, the vast majority of consumer and motion picture cameras adopt a Bayer Color Filter Array (CFA) with a single sensor. The Bayer CFA is an arrangement of colored filters over the sensor photosites. The exact bandpass characteristics of the filters vary between cameras, but they are designed to transmit red, green and blue light. Because the HVS sensitivity peaks at 555nm, there are twice as many green-filtered photosites as there are red or blue ones. The Bayer CFA was invented by Bryce Bayer at Eastman Kodak and patented in 1976. It enabled the production of much cheaper and more compact image capture systems at the price of reduced image quality due to the loss in spatial resolution and increased noise.
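To make the arrangement concrete, the following minimal sketch, assuming a hypothetical `rgb` floating-point image array, simulates what a single-sensor RGGB camera records, i.e. one color sample per photosite:

```python
# A minimal sketch of mosaicing an RGB image through an RGGB Bayer CFA;
# "rgb" is assumed to be a (height, width, 3) float array.
import numpy as np

def mosaic_rggb(rgb):
    h, w, _ = rgb.shape
    cfa = np.zeros((h, w), dtype=rgb.dtype)
    cfa[0::2, 0::2] = rgb[0::2, 0::2, 0]  # red photosites
    cfa[0::2, 1::2] = rgb[0::2, 1::2, 1]  # green photosites on red rows
    cfa[1::2, 0::2] = rgb[1::2, 0::2, 1]  # green photosites on blue rows
    cfa[1::2, 1::2] = rgb[1::2, 1::2, 2]  # blue photosites
    return cfa
```

Note that half of the photosites sample green, matching the doubled green count of the Bayer pattern.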

Close-up of a Bayer CFA mosaiced image on the left, the RGGB CFA mask in the middle, and the image demosaiced with Directional Filtering and a posteriori Decision from Menon (2007) on the right.

2.7.2 Signal Processing

The raw image data must be processed to be displayed. Multiple operations are performed in succession until the image is ready. However, there is no single, unique path to that end, and two cameras or raw processing applications might take different ways to a similar result. A typical process may adopt these steps:

- Linearization
- Black Level Subtraction
- White Level Scaling
- Clipping
- White Balancing
- Demosaicing
- Conversion to Working RGB Color Space

With raw image data captured by a Bayer CFA camera either already linearly encoded or linearized beforehand, the thermal black encoding level must first be subtracted. The black-subtracted image data is then rescaled to the range [0, 1], the scale factor being the inverse of the difference between the white level, i.e. the saturation encoding level past which the sensor behaves non-linearly or the camera circuitry introduces clipping, and the maximum black level for a given channel. The scaled image is clipped so that no values fall outside the range [0, 1].
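These linearization steps lend themselves to a direct implementation. The following is a minimal sketch under assumed, hypothetical encoding levels for a 12-bit sensor; the real per-channel values come from the camera metadata:

```python
# A minimal sketch of linearization: black level subtraction, white level
# scaling and clipping; "black_level" and "white_level" are hypothetical
# placeholder values for a 12-bit sensor.
import numpy as np

def linearize(raw, black_level=256, white_level=4095):
    # Subtract the thermal black encoding level.
    data = raw.astype(np.float64) - black_level
    # Rescale by the inverse of the white level minus the black level.
    data /= white_level - black_level
    # Clip so that no values fall outside the [0, 1] range.
    return np.clip(data, 0, 1)
```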

The clipped image is white balanced so that achromatic colors of the scene stay neutral once reproduced. The process is similar to the HVS chromatic adaptation ability. Viggiano (2004) compared the accuracy of various white balancing methods and found "that the illuminant-dependent characterization produced the best results, sharpened camera RGB and native camera RGB were next best, XYZ and Von Kries were often not far behind, and balancing in the -709 primaries was significantly worse."

The Von Kries chromatic adaptation method is described here because of its widespread usage in other contexts, such as conversion between RGB color spaces with different white points. It involves converting the image to an LMS color space representing the cone fundamentals sensitivities and performing a gain adjustment on each of the Long, Medium and Short components as follows:

$$L' = \frac{L}{L_w}, \quad M' = \frac{M}{M_w}, \quad S' = \frac{S}{S_w}$$

where $L'$, $M'$, $S'$ are the white-balanced LMS cone values, $L_w$, $M_w$, $S_w$ are the LMS cone values of a reference achromatic object in the scene, e.g. a perfect white diffuser, and $L$, $M$, $S$ are the non-white-balanced LMS cone values. The conversion from CIE XYZ tristimulus values to the LMS color space is performed with a chromatic adaptation matrix such as Bradford or CAT02.

$$M_{\mathrm{CAT02}} = \begin{bmatrix} 0.7328 & 0.4296 & -0.1624 \\ -0.7036 & 1.6975 & 0.0061 \\ 0.0030 & 0.0136 & 0.9834 \end{bmatrix}$$

CAT02 chromatic adaptation matrix.
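A minimal sketch of Von Kries white balancing through the CAT02 matrix follows, assuming `xyz` is an array of CIE XYZ tristimulus values and `xyz_w` holds the tristimulus values of the scene reference achromatic object:

```python
# A minimal sketch of Von Kries white balancing via the CAT02 matrix;
# "xyz" is assumed to be a (..., 3) array of CIE XYZ tristimulus values
# and "xyz_w" the (3,) values of a reference achromatic object.
import numpy as np

M_CAT02 = np.array([
    [0.7328, 0.4296, -0.1624],
    [-0.7036, 1.6975, 0.0061],
    [0.0030, 0.0136, 0.9834],
])

def von_kries_white_balance(xyz, xyz_w):
    lms = xyz @ M_CAT02.T                      # convert to LMS cone space
    lms_w = M_CAT02 @ xyz_w                    # LMS of the reference white
    lms_wb = lms / lms_w                       # per-component gain adjustment
    return lms_wb @ np.linalg.inv(M_CAT02).T   # convert back to CIE XYZ
```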

Once white balanced, the image must be demosaiced to produce RGB pixel values at all the photosites. Many different demosaicing algorithms exist to estimate the values of the two unrecorded channels at each photosite. Fast ones, such as bilinear interpolation, are used in real-time applications or for previewing and editorial purposes. Algorithms such as Adaptive Homogeneity-Directed (AHD) from Hirakawa and Parks (2005) or Demosaicing With Directional Filtering and a posteriori Decision (DDFAPD) from Menon (2007) produce much higher quality results but are slower. In the context of digital imagery that will be heavily post-processed, it is often preferable to use a demosaicing algorithm producing softer results, and thus less prone to edge artifacts, and then add sharpening selectively as part of the finishing process.
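As a reference point, bilinear interpolation can be written compactly with convolutions. The following is a minimal sketch for an RGGB pattern, assuming a single-channel float `cfa` image such as the one produced by the earlier mosaicing sketch:

```python
# A minimal sketch of bilinear demosaicing for an RGGB Bayer CFA;
# "cfa" is assumed to be a single-channel float array in range [0, 1].
import numpy as np
from scipy.ndimage import convolve

def demosaic_bilinear_rggb(cfa):
    h, w = cfa.shape
    # Boolean masks locating the R, G and B photosites of the RGGB pattern.
    r_mask = np.zeros((h, w), dtype=bool)
    g_mask = np.zeros((h, w), dtype=bool)
    b_mask = np.zeros((h, w), dtype=bool)
    r_mask[0::2, 0::2] = True
    g_mask[0::2, 1::2] = True
    g_mask[1::2, 0::2] = True
    b_mask[1::2, 1::2] = True

    # Kernels averaging the nearest recorded neighbours of each channel.
    k_g = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4

    r = convolve(cfa * r_mask, k_rb, mode="mirror")
    g = convolve(cfa * g_mask, k_g, mode="mirror")
    b = convolve(cfa * b_mask, k_rb, mode="mirror")
    return np.stack([r, g, b], axis=-1)
```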

Upon demosaicing, the resulting image must be mapped from its native color space, often referred to as Camera RGB, to the CIE XYZ color space or any relevant RGB color space. The transformation is usually performed using a 3x3 matrix applied to the linear Camera RGB values. This matrix is the result of the camera characterization process described in section 4.8 and is ordinarily provided by the camera makers or third-party organizations such as Adobe Systems with their camera .dcp profiles, or the A.M.P.A.S. with some of their ACES Input Device Transforms (IDT).
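Applying the characterization matrix is a per-pixel matrix multiplication. A minimal sketch follows; the matrix values here are hypothetical, sRGB-like placeholders, the real ones coming from the characterization process or the camera profile:

```python
# A minimal sketch of mapping linear Camera RGB to CIE XYZ with a 3x3
# characterization matrix; the values below are hypothetical, sRGB-like
# placeholders, not an actual camera characterization.
import numpy as np

M_CAMERA_RGB_TO_XYZ = np.array([
    [0.4124, 0.3576, 0.1805],
    [0.2126, 0.7152, 0.0722],
    [0.0193, 0.1192, 0.9505],
])

def camera_rgb_to_xyz(rgb):
    # Apply the matrix to every pixel of a (..., 3) linear image.
    return np.einsum('ij,...j->...i', M_CAMERA_RGB_TO_XYZ, rgb)
```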

At this stage, the image is almost ready to be displayed: it requires a Look Transform and an Output Transform, as described in sections 3.1.3 and 3.1.4, to be faithfully represented.

2.7.3 Image Formation

Image formation is the final stage of a color imaging system. The processed image signals are used to drive the display device color-forming elements: the display primaries.

As indicated in section 2.3.9, the vast majority of modern electronic display devices use an additive image formation process: from the venerable CRT to the high-end modern phosphor laser cinema projector, including LCD, plasma, OLED and DLP.

Display primaries of a consumer smartphone and a consumer tablet. Spectral data courtesy of fluxometer.com.

LCD, plasma and OLED displays form images using adjacent red, green and blue light-emitting elements. The colors are mixed when the light emitted by each element reaches the observer's HVS and overlaps on the retina to generate the final color stimuli.

On the other hand, DLP and other projectors form images by combining modulated red, green and blue light directly on the projection screen, which creates the final color stimuli and reflects them toward the observer's HVS.
