JPEG Compression

Compression

Image

$ T█

$ The given image^C

$

$ Starting compression...

R, G, B Channel Splitting

$ S█

$ Splitting of the image into the red, green and blue channels.

$

$ * JPEG compression does not support alpha channels.

$ Compressing [# ] 10%

Color Space Transformation (YCbCr)

$ T█

$ Transformation from the RGB color space to YCbCr.

$

$ Y - luma

$ Cb - chroma-blue

$ Cr - chroma-red

$

$ Y = 0.299 * R + 0.587 * G + 0.144 * B

$ Cb = 128 - 0.168 * R - 0.331 * G + 0.5 * B

$ Cr = 128 + 0.5 * R - 0.418 * G - 0.081 * B

$ Compressing [## ] 20%

Chroma Subsampling

$ E█

$ Encode images by implementing less resolution for chroma information than from luma information.

$

$ * Takes advantage of the human visual system lower acuity for color differences than for luminance

$ Compressing [### ] 30%

Block Splitting

$ S█

$ Splitting of the channels into streams of 8x8 chunks to be passed to the DCT.

$ Compressing [#### ] 40%

D

C

T

C

O

E

F

⋅

I

D

C

T

C

O

E

F

T

Discrete Cosine Transform

$ █

$ Compressing [##### ] 50%

/

Q

N

T

Z

M

T

R

X

Quantization

$ P█

$ Process of dividing the frequency matrix element-wise by a quantization matrix in order to provide more resolution to more perceivable frequency components (low frequency) and to get rid of less perceivable components (high frequency).

$

$ Reduction factor can be varied by changing the quantization matrix.

$

$ * Lossy process

$ Compressing [###### ] 60%

Entropy Encoding

35, 12, -12, 0, -5, 0, -1, 0, 0, 9, 0, 4, 0, 0, 0, 0 ...

$ O█

$ Order coefficients in a zig-zag manner to maximize the chance of a large sequence of 0s, which can be encoded more efficiently.

$

$ * Lossless process

$ Compressing [####### ] 70%

Run-Length Encoding

{ [(0, 6), 35],
  [(0, 4), 12],
  [(0, 4),-12],
  [(1, 3), -5],
  [(1, 1), -1],
  [(2, 4), 9],
  [(2, 2), 4],
  [(0, 0)]      }

$ J█

$ JPEG specific run-length encoding turns the stream of coefficients into a stream of tuples of the form:

$

$ [(r, s), c]

$ r - preceding 0s

$ s - bit size of coefficient

$ c - coefficient value

$

$ * [(0, 0)] - End of block

$ Compressing [######## ] 80%

Huffman Encoding

0010110100101010001001...

$ C█

$ Construct a Huffman tree from the tuples and encode the stream using the tree.

$ Compressing [######### ] 90%

JPEG

struct Jpeg {

	soi	  byte[2]; # start of image

	app0  App0;    # app specific metadata

	dqt   Dqt[];   # quantization tables

	dht   Dht[];   # huffman tables

	sof0  Sof0;    # start of frame

	sos   byte[2]; # start of scan

	data  byte[];  # compressed data

	eoi   byte[2]; # end of image

}

$ s█

$ soi - start of image marker

$ app0 - app specific metadata

$ dqt - specifies one or more quantization tables

$ dht - specifies one or more huffman table

$ sof0 - specifies the image dimensions, the number of components and the component subsampling

$ sos - start of scan marker

$ eoi - end of image marker

$ Compressing [##########] 100% - COMPLETE

Decompression

     00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F Decoded text

0000 FF DB FF E0 00 10 4A 46 49 46 00 01 01 01 00 48 ......JFIF.....H

0010 00 48 00 00 FF DB 00 43 00 06 04 05 06 05 04 06 .H.....C........

0020 06 05 06 07 07 06 08 0A 10 0A 0A 09 09 0A 14 0E ................

0030 0F 0C 10 17 14 18 18 17 14 16 16 1A 1D 25 1F 1A .............%..

0040 1B 23 1C 16 16 20 2C 20 23 26 27 29 2A 29 19 1F .#... , #&')*)..

     . . .

$ █

$ Starting decompression...

JPEG Data

0010110100101010001001...

$ E█

$ Extracting the huffman tables and the compressed data.

$ Decompressing [# ] 10%

Huffman Decoding

{ [(0, 6), 35],
  [(0, 4), 12],
  [(0, 4), 12],
  [(1, 3), -5],
  [(1, 1), -1],
  [(2, 4), 9],
  [(2, 2), 4],
  [(0, 0)]      }

$ D█

$ Decoding the compressed data using the huffman tables. Following the binary sequence, descending the tree until a leaf is reached. The value of the leaf is the decoded symbol.

$ Decompressing [## ] 20%

Run-Length Decoding

35, 12, -12, 0, -5, 0, -1, 0, 0, 9, 0, 4, 0, 0, 0, 0 ...

$ U█

$ Unpacking the tuples into a stream of 0s and values.

$ Decompressing [### ] 30%

Entropy Decoding

$ A█

$ Arranging the values back into a 8x8 chunk using a zig-zag pattern.

$ Decompressing [#### ] 40%

∘

Q

N

T

Z

M

T

R

X

Dequantization

$ M█

$ Multiplying each value by the corresponding value in the quantization table stored in the jpeg file.

$

$ * Since the quantization process is lossy, dequantization does not produce the same values as before quantization.

$ Decompressing [##### ] 50%

I

D

C

T

C

O

E

F

T

⋅

I

D

C

T

C

O

E

F

Inverse Discrete Cosine Transform

$ █

$ Decompressing [###### ] 60%

Block Merging

$ M█

$ Merging of the streams of 8x8 chunks into their respective channel, with dimensions of the original subsampled channels.

$ Decompressing [####### ] 70%

Upsampling

$ D█

$ Duplicaiton of pixels in the chroma channels to match the dimensions of the luminance channel.

$ Decompressing [######## ] 80%

Color Space Transformation (RGB)

$ T█

$ Transformation from the YCbCr color space to the RGB.

$

$ R = Y + 1.402 * (Cr - 128)

$ G = Y - 0.344 * (Cb - 128) - 0.714 * (Cr - 128)

$ B = Y + 1.772 * (Cb - 128)

$ Decompressing [######### ] 90%

R, G, B Channel Joining

$ J█

$ Joining the R, G, B channels to form the final image.

$ Decompressing [##########] 100% - COMPLETE

Discrete Cosine Transform

Frequency Decomposition

X₀

C₀

X₁

C₁

X₇

C₇

Frequency Component

X_k

=

c ₀

c ₁

c ₂

c ₃

c ₄

c ₅

c ₆

c ₇

∘

x ₀

x ₁

x ₂

x ₃

x ₄

x ₅

x ₆

x ₇

$ T█

$ The dot product of the input vector and a wave's coefficients measures the similarity between the two vectors, resulting in a frequency component.

Cosine Wave Sampling

9π

16

11π

16

13π

16

15π

16

π

16

3π

16

5π

16

7π

16

$ S█

$ Splitting the cosine wave into 8 equal parts, and sampling the wave at each part's midpoint

$

$ yn = cos((2n + 1)π/16)

$

$ n ∈ [0, 7]

$

$ Sampling of all 8 cosine waves at each of the 8 points gives us 64 coefficients.

$

$ * Doesn't include normalization

DCT Coefficients

DCT Coefficients - Matrix-Vector Product

X ₀

X ₁

X ₂

X ₃

X ₄

X ₅

X ₆

X ₇

=

s ₀

s ₁

s ₂

s ₃

s ₄

s ₅

s ₆

s ₇

f ₀

f ₁

f ₂

f ₃

f ₄

f ₅

f ₆

f ₇

⋅

x ₀

x ₁

x ₂

x ₃

x ₄

x ₅

x ₆

x ₇

$ T█

$ The DCT coefficient matrix is made by stacking the cosine waves' sampled points.

$ The matrix is multiplied by the input vector, so each frequency vector is multiplied by the input vector.

$

$ * Matrix multiplication is a more general form of the dot product

$

$ For a 8x8 chunk of the image, we pass the rows through the DCT matrix, and then the columns.

$ It is equivalent to multiplying the coefficient matrix to the input chunk, and them multiplying the result by the transpose of the coefficient matrix.