MPEG-4 Inside – Visual

MPEG-4 Visual provides a coding algorithm for natural video that operates from about 5 kbit/s at QCIF spatial resolution (144×176 pixels) up to bitrates of a few Mbit/s for ITU-R 601 resolution pictures (288×720 at 50 Hz and 240×720 at 59.94 Hz). In addition, the Studio Profile addresses an operating range in excess of 1 Gbit/s. MPEG-4 Visual is compatible with ITU-T H.263 in the sense that an MPEG-4 Video decoder correctly decodes a basic H.263 bitstream.
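To give a feel for what the low end of that range implies, the following sketch compares the uncompressed bitrate of a QCIF sequence with a 5 kbit/s coded stream. The 4:2:0 sampling (12 bits per pixel on average) and the frame rate of 10 frames per second are assumptions made for illustration; the text above does not state them.

```python
# Illustrative arithmetic only. The 4:2:0 sampling and the frame rate
# are assumptions, not values taken from the standard or the text.
QCIF_WIDTH, QCIF_HEIGHT = 176, 144   # pixels per line, lines
BITS_PER_PIXEL_420 = 12              # 8 bits luma + 4 bits chroma on average
ASSUMED_FPS = 10                     # hypothetical low frame rate


def raw_bitrate(width, height, fps, bits_per_pixel=BITS_PER_PIXEL_420):
    """Uncompressed bitrate in bit/s."""
    return width * height * bits_per_pixel * fps


def compression_ratio(raw_bps, coded_bps):
    """How many times smaller the coded stream is than the raw one."""
    return raw_bps / coded_bps


raw = raw_bitrate(QCIF_WIDTH, QCIF_HEIGHT, ASSUMED_FPS)
ratio = compression_ratio(raw, 5_000)
print(f"raw: {raw} bit/s, ratio at 5 kbit/s: {ratio:.0f}:1")
```

Under these assumptions the raw stream is roughly 3 Mbit/s, so coding at 5 kbit/s corresponds to a reduction of about 600 to 1.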

As mentioned before, MPEG-4 Video supports conventional rectangular images and video (upper portion of Figure 1 below) as well as images and video of arbitrary shape (lower portion of the figure).

Figure 1 – The MPEG-4 Video Core and the Generic MPEG-4 Coder

The coding of conventional images and video is similar to conventional MPEG-1/2 coding: motion prediction/compensation followed by texture coding. For content-based functionalities, where the input image sequence may be of arbitrary shape and location, shape and transparency information is encoded as well. Shape may be represented either by a binary mask or by an 8-bit transparency component, which allows transparency to be described when one Video Object (VO) is composed with other objects.
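The difference between the two shape representations can be sketched at the level of a single sample. The blending rule below is ordinary alpha compositing with rounding, chosen for illustration; it is an assumption about how a compositor might use the decoded values, not the normative MPEG-4 composition process, and the function names are hypothetical.

```python
def composite_sample(foreground, background, alpha):
    """Blend one 8-bit sample of a Video Object over a background sample.

    alpha is the 8-bit transparency value: 0 = fully transparent,
    255 = fully opaque. Integer arithmetic with rounding to nearest.
    """
    return (alpha * foreground + (255 - alpha) * background + 127) // 255


def composite_binary(foreground, background, inside_shape):
    """A binary mask is the special case where alpha is either 0 or 255."""
    return composite_sample(foreground, background, 255 if inside_shape else 0)


# An opaque sample, a transparent sample, and a roughly half-transparent one.
print(composite_sample(200, 100, 255))
print(composite_sample(200, 100, 0))
print(composite_sample(200, 100, 128))
```

With a binary mask every sample belongs either to the object or to the background; the 8-bit component additionally allows soft edges and semi-transparent objects.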

The basic coding structure is shown in Figure 2. It involves shape coding (for arbitrarily shaped VOs), motion compensation, and DCT-based texture coding (using the standard 8×8 DCT or a shape-adaptive DCT).

Figure 2 – The MPEG-4 Video coding scheme
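As a reminder of what the texture coding stage operates on, here is a minimal sketch of a forward 8×8 DCT (the orthonormal DCT-II) written directly from its definition. It is deliberately slow and unoptimized, it covers only the standard 8×8 case and not the shape-adaptive variant, and the function name is hypothetical; a real encoder would use a fast, bit-exact implementation.

```python
import math

N = 8  # block size used by the standard 8x8 DCT


def dct_2d(block):
    """Forward 2D DCT-II of an N x N block, orthonormal scaling."""
    def c(k):
        # Normalization factor for the DC basis function.
        return math.sqrt(0.5) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = (2.0 / N) * c(u) * c(v) * total
    return coeffs


# A flat block has all its energy in the DC coefficient.
flat = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))
```

For a flat block only the DC coefficient is non-zero, which is why smooth texture regions cost few bits after quantization.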

If a priori knowledge of the scene is exploited, MPEG-4 Visual can offer unexpectedly high compression ratios. Coding the top left image of Figure 3 directly would require a considerable amount of information but, if the background and the sprite (top right) can be separated, the picture below can be coded with relatively few bit/s.

Figure 3 – Background and sprites in MPEG-4 Video
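The saving comes from sending the large static image once and then only a few parameters per frame. The toy sketch below rebuilds a frame from a stored panorama, a viewport position, and a small foreground object with a binary mask. It assumes a translation-only camera and integer offsets, and all names and parameters are hypothetical; it is not the warping model used by the standard.

```python
def reconstruct_frame(panorama, cam_x, cam_y, width, height,
                      obj, mask, obj_x, obj_y):
    """Rebuild one frame from a static panorama and a foreground object.

    panorama : 2D list of samples, transmitted once
    cam_x/y  : top-left corner of the current view inside the panorama
    obj/mask : foreground samples and their binary shape mask
    obj_x/y  : position of the object inside the reconstructed frame
    """
    # Crop the visible window out of the stored panorama.
    frame = [row[cam_x:cam_x + width]
             for row in panorama[cam_y:cam_y + height]]
    # Paste the foreground object where its mask is set.
    for r, (obj_row, mask_row) in enumerate(zip(obj, mask)):
        for c, (sample, inside) in enumerate(zip(obj_row, mask_row)):
            if inside:
                frame[obj_y + r][obj_x + c] = sample
    return frame


# 4 x 6 panorama whose sample value encodes its position (row*10 + column).
panorama = [[r * 10 + c for c in range(6)] for r in range(4)]
frame = reconstruct_frame(panorama, cam_x=1, cam_y=1, width=3, height=2,
                          obj=[[99]], mask=[[1]], obj_x=2, obj_y=1)
print(frame)
```

Per frame, only the camera position, the object position, and any change in the object itself would need to be conveyed in this simplified picture.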

MPEG-4 Visual supports the three forms of scalability depicted in Figure 4: temporal, spatial and quality.

Figure 4 – Temporal, spatial and quality scalability
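Temporal scalability is the easiest of the three to sketch: a base layer carries the sequence at a reduced frame rate, and an enhancement layer carries the frames in between. The even/odd split below is an assumption chosen for simplicity, frames are stood in for by integers, and the function names are hypothetical; the sketch ignores prediction dependencies between layers.

```python
def split_temporal_layers(frames):
    """Split a sequence into a half-rate base layer and an enhancement layer."""
    base = frames[0::2]          # decodable on its own at half the frame rate
    enhancement = frames[1::2]   # only useful together with the base layer
    return base, enhancement


def merge_temporal_layers(base, enhancement):
    """Interleave both layers to recover the full frame rate."""
    merged = []
    for i, frame in enumerate(base):
        merged.append(frame)
        if i < len(enhancement):
            merged.append(enhancement[i])
    return merged


frames = list(range(7))                      # stand-ins for decoded pictures
base, enhancement = split_temporal_layers(frames)
print(base, enhancement)
print(merge_temporal_layers(base, enhancement) == frames)
```

A decoder with limited resources or bandwidth can stop at the base layer; one that receives both layers recovers the full sequence. Spatial and quality scalability follow the same layering idea, with the enhancement layer adding resolution or fidelity instead of frames.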

The MPEG-4 Visual standard also includes technologies to handle 2D and 3D graphics information; these will be introduced later.