Image Processing

Our eyes are a major source of information for us. Every moment, they receive optical signals from the surroundings and process them into a coherent picture of the world. In order to process images, we need to understand what light is, how our eyes perceive it, and how our mind processes it.


What is light? Physically, light is an electromagnetic wave. Now what is that? As Michael Faraday demonstrated (and James Clerk Maxwell later formalized), electric and magnetic fields are tightly coupled to each other. A changing magnetic field creates an electric field, and a changing electric field creates a magnetic field. When there is no matter to absorb either of them, this oscillation sustains itself and propagates as an electromagnetic wave. The frequency of these oscillations can vary based on conditions at the origin. Humans have tapped a very wide range of frequencies in the electromagnetic spectrum - ranging from about 1 Hz to 10^24 Hz. They have also measured the speed of these electromagnetic waves to be around 3 x 10^5 km/s (299,792 km/s in vacuum, to be precise). The speed reduces when the waves pass through the atmosphere or other media.

When sunlight (or any other source of light or energy) excites the surface of matter, the surface reflects or emits electromagnetic waves of frequencies corresponding to its material. Not all these electromagnetic waves are visible to our eyes. What we perceive as light is a very narrow range of frequencies in the entire electromagnetic spectrum. These are the 7 colors - Violet, Indigo, Blue, Green, Yellow, Orange and Red - ranging from about 8 x 10^14 Hz (violet) down to 4 x 10^14 Hz (red).
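The frequency range above maps to the more familiar wavelength range via c = f x wavelength. A minimal sketch of that conversion (the function name and the rounded speed-of-light constant are illustrative choices, not from any library):

```python
# Convert a frequency in Hz to a wavelength in metres using c = f * wavelength.
C = 2.998e8  # speed of light in vacuum, m/s (rounded)

def wavelength_m(frequency_hz):
    return C / frequency_hz

# Red light (~4 x 10^14 Hz) and violet light (~8 x 10^14 Hz) in nanometres:
red_nm = wavelength_m(4e14) * 1e9      # roughly 750 nm
violet_nm = wavelength_m(8e14) * 1e9   # roughly 375 nm
```

This recovers the familiar statement that visible light spans roughly 400-750 nm.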

Perception of Light

When this narrow range - light - falls on the retina in our eyes, it generates impulses in the optic nerve - and that causes the perception of sight. In fact, our eyes do not sense all these colors individually. We can sense only three colors - Red, Green and Blue. When yellow light falls on the retina, it partially triggers both the Red and Green sensors. Based on the relative strengths of these responses, our mind perceives the result as yellow, and so on for the other colors.
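This trichromatic idea can be sketched as a toy model: give each sensor a peak frequency and let its response fall off away from that peak. The peak positions and bandwidth below are illustrative assumptions, not physiological data - the point is only that a single yellow frequency excites two sensors at once:

```python
import math

# Toy model of the three retinal sensors. Peak frequencies (Hz) are rough
# illustrative values; real cone responses are broad, overlapping curves.
SENSOR_PEAKS_HZ = {"red": 4.3e14, "green": 5.5e14, "blue": 6.7e14}
BANDWIDTH_HZ = 1.0e14  # assumed width of each sensor's response curve

def sensor_response(light_hz):
    """Return each sensor's (0..1) response to a single light frequency."""
    return {name: math.exp(-((light_hz - peak) / BANDWIDTH_HZ) ** 2)
            for name, peak in SENSOR_PEAKS_HZ.items()}

# Yellow light (~5.2e14 Hz) strongly triggers red and green, barely blue:
yellow = sensor_response(5.2e14)
```

The mind then infers "yellow" from the ratio of the red and green responses, even though no dedicated yellow sensor exists.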

Another important aspect of light perception is brightness. The more energy the light carries, the brighter it appears. When the energy content is low, the perception of color fades out, leaving a perception of black. In fact, humans are more sensitive to brightness than to color. This is because brightness is used to identify edges in the perceived image. An edge is a curve that marks a drastic change in the brightness of the image. Using these edges, our mind constructs a shape and compares it with the different shapes it knows - to make a guess about the object it sees.
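The "drastic change in brightness" idea translates directly into a minimal edge detector: compare each pixel's brightness with its right-hand neighbour and flag large jumps. This is a sketch, not a production algorithm (real detectors like Sobel use 2D gradients); the threshold of 50 is an arbitrary illustrative choice:

```python
def horizontal_edges(gray, threshold=50):
    """gray: 2D list of brightness values 0-255.
    Returns (row, col) positions where brightness jumps past the threshold."""
    edges = []
    for r, row in enumerate(gray):
        for c in range(len(row) - 1):
            if abs(row[c + 1] - row[c]) > threshold:
                edges.append((r, c))
    return edges

# A dark-to-bright vertical step produces one edge per row:
image = [[10, 12, 200, 205],
         [11, 13, 198, 202]]
```

Running `horizontal_edges(image)` marks the column where the dark region meets the bright one in each row.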

This understanding is important for creating an efficient software model of images. When we understand the physiology of perception, we can store only the important information and discard the redundant aspects.

Software Model

Knowing these limitations of human light perception, there is no reason to waste our computational resources on the redundant aspects of images. An image in software is a set of three 2D arrays - one each for the Red, Green and Blue values of every pixel in the image (typically one byte per color). Thus, we have R, G and B each in the range 0-255 - three bytes per pixel.
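A minimal sketch of this representation, built with plain Python lists (the helper name and the tiny 3x2 image are illustrative choices):

```python
# An image as three 2D arrays, one value (0-255, one byte) per color per pixel.
WIDTH, HEIGHT = 3, 2

def blank_channel(value=0):
    """Build a HEIGHT x WIDTH 2D array filled with a single value."""
    return [[value] * WIDTH for _ in range(HEIGHT)]

red = blank_channel()
green = blank_channel()
blue = blank_channel()

# Paint the left column pure red by setting only the red channel:
for row in range(HEIGHT):
    red[row][0] = 255

# Three channels at one byte each gives 3 bytes per pixel:
bytes_per_image = WIDTH * HEIGHT * 3
```

Any pixel's color is then the triple `(red[r][c], green[r][c], blue[r][c])`.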

In fact, there is a lot of redundancy in these RGB arrays. For example, in any image, there is very little change between consecutive pixels; the drastic change occurs only at the edges. Also, since the major information content lies in the brightness rather than the colors, we can allocate more memory to the brightness and reduce the memory spent on the individual colors. Many such tweaks are used to compress the image into the various standard formats like JPEG, GIF and PNG - they are all based on similar principles. We have many open source implementations that convert images across these formats. But most image processing algorithms are implemented as processing of these RGB arrays.
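The brightness/color split mentioned above is how JPEG-style formats work: RGB is converted into one luma value Y (brightness) and two chroma values (Cb, Cr), so more precision can be kept for Y and less for the colors. The weights below are the standard ITU-R BT.601 coefficients used in JPEG:

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel (0-255 each) to luma Y and chroma Cb, Cr
    using the ITU-R BT.601 weights; green dominates the brightness."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

# Pure white: full brightness, neutral (centred) chroma.
y, cb, cr = rgb_to_ycbcr(255, 255, 255)
```

Compressors then keep Y at full resolution and subsample Cb/Cr - exactly the "more memory for brightness" trade-off our perception allows.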