A straightforward digitized video consists of a sequence of digitized pictures, where each picture is known as a frame. However, such digitized video results in huge data files. Each pixel of a frame is stored as several bytes and a frame may contain about a million pixels.
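To get a feel for the sizes involved, here is a rough back-of-the-envelope calculation. The figures of 3 bytes per pixel and 30 frames per second are assumed purely for illustration; the text above only says "several bytes" and "about a million pixels".

```python
# Rough estimate of uncompressed video size.
# Assumed values (illustrative only): 3 bytes per pixel, 30 frames per second.
PIXELS_PER_FRAME = 1_000_000
BYTES_PER_PIXEL = 3
FRAMES_PER_SECOND = 30

def raw_video_bytes(seconds):
    """Return the number of bytes needed to store `seconds` of raw video."""
    return PIXELS_PER_FRAME * BYTES_PER_PIXEL * FRAMES_PER_SECOND * seconds

bytes_per_frame = PIXELS_PER_FRAME * BYTES_PER_PIXEL
print(bytes_per_frame)      # 3000000 bytes, about 3 MB per frame
print(raw_video_bytes(60))  # 5400000000 bytes, about 5.4 GB per minute
```

Under these assumptions, a single minute of raw video already occupies several gigabytes, which is why compression matters.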
A key technique in compressing video is to recognize that successive frames often have much similarity, so instead of sending a sequence of digitized pictures, we can send one digitized picture frame (a base frame), followed by data describing just the difference between the base frame and the next frame. We can send just the difference data for numerous frames before sending another base frame. Such a method results in some loss of quality, but as long as we send base frames frequently enough, the quality may be acceptable.
Of course, if there is much change from one frame to the next, we can't use the difference method. Video compression devices therefore need to quickly estimate the similarity between two successive digitized frames to determine whether frames can be sent using the difference method. A common way to determine the similarity of two frames is to compute what is known as the sum of absolute differences (SAD).
For each pixel in frame 1, SAD involves computing differences between that pixel and the corresponding pixel in frame 2. Each pixel is represented by a number, so difference means the difference in numbers. Suppose a pixel is represented with a byte and the pixels at the upper left of frames 1 and 2 in figure (a) are being compared. Say frame 1's upper left pixel has a value of 255. Frame 2's pixel is clearly the same, so would have a value of 255 also.
Thus, the difference of these two pixels is 255-255 = 0. SAD would compare the next pixels of both frames in that row, finding the difference to be 0 again. And so on for all the pixels in that row for both frames, as well as the next several rows. However, when computing the difference of the leftmost pixel of the middle row, where that black circle is located, we see that frame 1's pixel will be black, say with a value of 0.
On the other hand, frame 2's corresponding pixel will be white, say with a value of 255. So the difference is 255-0 = 255. Likewise, somewhere in the middle of that row, we'll find another difference, this time with frame 1's pixel white (255) and frame 2's pixel black (0); the difference is again 255-0 = 255. Note that only the size of the difference matters to SAD, not which pixel is bigger or smaller, so we are actually looking at the absolute value of the difference between frame 1 and frame 2 pixels.
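The pixel-by-pixel walk described above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the function name `sad` and the tiny 3x4 frames are hypothetical, chosen to mimic a black pixel moving along the middle row, with one byte per pixel as in the example.

```python
def sad(frame1, frame2):
    """Sum of absolute differences between two equally sized frames.

    Each frame is a list of rows; each row is a list of pixel values (0-255).
    """
    if len(frame1) != len(frame2):
        raise ValueError("frames must have the same number of rows")
    total = 0
    for row1, row2 in zip(frame1, frame2):
        if len(row1) != len(row2):
            raise ValueError("rows must have the same length")
        for p1, p2 in zip(row1, row2):
            # Only the magnitude of the difference matters, not its sign.
            total += abs(p1 - p2)
    return total

# Hypothetical 3x4 frames: a black pixel (0) moves from the left edge
# of the middle row to the third column; everything else is white (255).
frame1 = [
    [255, 255, 255, 255],
    [  0, 255, 255, 255],
    [255, 255, 255, 255],
]
frame2 = [
    [255, 255, 255, 255],
    [255, 255,   0, 255],
    [255, 255, 255, 255],
]

print(sad(frame1, frame1))  # 0, identical frames
print(sad(frame1, frame2))  # 510, two pixels each differ by 255
```

Only the two pixels that changed contribute to the sum; every unchanged pixel adds 0.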
Summing the absolute values of the differences for every pair of pixels results in a number that represents the similarity of the two frames: 0 means identical, and bigger numbers mean less similar. If the resulting sum is below some threshold, the video compression method might send just the difference data, as in figure (b); we don't explain how to compute the difference data here, as that is beyond the scope of this example. If the sum is above the threshold, then the difference between the frames is too great, so the compression method might instead send the full digitized frame for frame 2. Thus, video with similarity among frames will achieve higher compression than video with many differences.
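The threshold decision can be sketched as follows. The names `choose_encoding` and `THRESHOLD`, the threshold value of 600, and the 2x2 frames are all hypothetical; real compression methods pick thresholds suited to their frame size. The text does not say what happens when the sum exactly equals the threshold, so this sketch assumes the full frame is sent in that case.

```python
def sad(frame1, frame2):
    """Sum of absolute differences over corresponding pixels."""
    return sum(
        abs(p1 - p2)
        for row1, row2 in zip(frame1, frame2)
        for p1, p2 in zip(row1, row2)
    )

def choose_encoding(prev_frame, next_frame, threshold):
    """Return "difference" if the frames are similar enough, else "full".

    Assumption: a sum exactly equal to the threshold is treated as "full".
    """
    if sad(prev_frame, next_frame) < threshold:
        return "difference"
    return "full"

THRESHOLD = 600  # hypothetical value for these tiny frames

frame_a = [[255, 255], [0, 255]]
frame_b = [[255, 255], [255, 255]]  # one pixel changed relative to frame_a
frame_c = [[0, 0], [255, 0]]        # every pixel changed relative to frame_a

print(choose_encoding(frame_a, frame_b, THRESHOLD))  # difference (SAD = 255)
print(choose_encoding(frame_a, frame_c, THRESHOLD))  # full (SAD = 1020)
```

A lower threshold sends full frames more often, trading compression for quality; a higher one does the opposite.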