MPEG-1 Development – Video

The Kurihama meeting in October 1989 was a watershed in many senses. Fifteen video coding proposals were received, including one from the COMIS project. Each contained D1 tapes with sequences encoded at 900 kbit/s, a description of the algorithm used, an assessment of complexity and other data. The site had been selected because JVC had been so kind as to offer their outstanding facilities for the subjective tests, with MPEG experts acting as test subjects. By the end of the meeting a rough idea of the features of the algorithm had emerged, and plans were made to continue work “by correspondence”, as this kind of work was called in those pre-internet days.

About 100 delegates attended the Kurihama meeting. With the group reaching such a size, it became necessary to put in place a formal structure. New subgroups were established and chairmen appointed: Tsuneyoshi Hidaka (JVC), who had organised the subjective tests, led the Test group, Allen Simon (Intel) led the Systems group, Colin Smith (Inmos, later acquired by ST Microelectronics) led the VLSI group and Takuyo Kogure (Matsushita) led the Digital Storage Media (DSM) group. These were in addition to the already established Audio and Video groups, chaired by Hans-Georg Musmann (University of Hannover) and Didier Le Gall (Bellcore), respectively.

In this way the main pillars of the MPEG organisation were established. The Video group and the Audio group were in charge of developing the specific compression algorithms, starting from the most promising elements contained in the submissions. The Systems group was in charge of developing the infrastructure that held the compressed audio and video information together and made it usable by applications. The Test group assessed video quality (the Audio group organised its own tests directly), the VLSI group assessed the implementation complexity of compression algorithms, and the DSM group studied storage, at that time the only application environment of MPEG standards. In its 25 years, the internal organisation of MPEG has undergone gradual changes, and there has been quite a turnover of chairs.

The subgroups had gradually become the place where heated technical discussions were the norm, while the MPEG plenary had become the place where the work done by all the groups was reviewed. The review served those who had not been able to attend other groups’ meetings but still wanted to be informed, and possibly to retain a say in other groups’ conclusions; it also served to resolve unsettled matters and to give a formal seal of approval to all decisions. There were, however, other matters of general interest that also required discussion, and it was no longer practical to have such discussions in the plenary. As a way out, I started convening representatives of the different national delegations in separate meetings at night. This was the beginning of the Heads of Delegation (HoD) group. The name lasted for a quarter of a century, until one day someone in ISO discovered that there are no delegations in working groups. From that moment the HoDs were called Convenor Advisors and everything went on as before.

It was during an HoD meeting that the structure of the MPEG standard was discussed. One possible approach was to have a single standard containing everything, the other to split the standard into parts. The former was attractive, but it would have led to a standard of monumental proportions. The approach eventually adopted was, as John Morris of Philips Research, the UK HoD at that time, put it, to make MPEG “one and trine”, i.e. a standard in three parts: Systems, Video and Audio. The Systems part would deal with the infrastructure holding together the compressed audio and video (thereby making sure that later implementors would not find holes in the standard), the Video part would deal with video compression, and the Audio part would deal with audio compression.

From a non-technical viewpoint, but quite important for a fraction of the participants, the meeting was also remarkable because during the lunch break on the first day, news appeared on television that a major earthquake had hit San Francisco, and most of the participants from the Bay Area had to leave in haste. It later became known that fortunately no one connected to a Kurihama meeting participant had been seriously affected, but the work on VLSI implementation complexity clearly suffered, as many of the participants in that activity had come from the Bay Area.

At Kurihama there was also a meeting of SC 2/WG 8. A noteworthy event was the establishment of the Multimedia and Hypermedia Experts Group (MHEG), a parallel group to JBIG, JPEG and MPEG. This was the outcome of an undertaking I had made at the Hannover meeting one year before to look into the problem of a general multimedia standard. After that meeting I had contacted Francis Kretz of CCETT, who had been spearheading an activity in France on the subject, and invited him to come to Kurihama. At that meeting Francis was appointed as chair of MHEG.

The Kurihama tests had given a clear indication that the best performing and still most promising video coding algorithm was the one that encoded pictures predictively, starting from a motion-compensated previous picture and using the Discrete Cosine Transform (DCT), à la CCITT H.261. This meant that the new standard could easily support the requirement of “compatibility with H.261”, a request made by the MPEG telco members. On the other hand, the design of the standard could enjoy more flexibility in coding tools because the target application was “storage and retrieval on DSM” and not real-time communication, where information transmission at minimum delay was at a premium. This is why MPEG-1 Video (like all subsequent video standards produced by MPEG so far) has interpolation between coded pictures as a tool that an encoder can use.
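
The difference between predicting from a previous picture only and interpolating between two coded pictures can be illustrated with a toy sketch. This is not the MPEG-1 algorithm or syntax: the synthetic pictures, the helper names (make_picture, fetch, interpolate, residual, dct_2d, energy), the hand-picked motion vectors and the rounding rule are all assumptions made for illustration. The pattern moves one pixel per picture and brightens linearly, so the motion-compensated forward prediction leaves a constant error, while the average of the forward and backward predictions matches the current block.

```python
import math

N = 8      # DCT block size shared by H.261 and MPEG-1 Video
SIZE = 16  # toy picture size (assumption of this sketch)


def make_picture(t):
    """Synthetic picture at time t: a pattern that shifts right by one
    pixel per picture while getting 10 levels brighter each time."""
    return [[(7 * (x - t) + 3 * y) % 64 + 10 * t for x in range(SIZE)]
            for y in range(SIZE)]


def fetch(picture, top, left, mv=(0, 0)):
    """Read the N x N block at (top, left) displaced by mv = (dy, dx)."""
    dy, dx = mv
    return [[picture[top + dy + y][left + dx + x] for x in range(N)]
            for y in range(N)]


def interpolate(fwd, bwd):
    """Average two predictions (rounding rule is an assumption here)."""
    return [[(p + q + 1) // 2 for p, q in zip(rf, rb)]
            for rf, rb in zip(fwd, bwd)]


def residual(current, prediction):
    """Prediction error that would be transformed and coded."""
    return [[c - p for c, p in zip(rc, rp)]
            for rc, rp in zip(current, prediction)]


def dct_2d(block):
    """Orthonormal 2-D DCT-II, written directly and not optimised."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    return [[scale(u) * scale(v) * sum(
                block[y][x]
                * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                * math.cos((2 * x + 1) * v * math.pi / (2 * N))
                for y in range(N) for x in range(N))
             for v in range(N)]
            for u in range(N)]


def energy(coeffs):
    """Sum of squared coefficients, a crude proxy for coding cost."""
    return sum(c * c for row in coeffs for c in row)


previous, current, following = (make_picture(t) for t in range(3))
top, left = 4, 4
target = fetch(current, top, left)
forward = fetch(previous, top, left, mv=(0, -1))    # content came from the left
backward = fetch(following, top, left, mv=(0, 1))   # content moves on to the right

forward_residual = residual(target, forward)
interpolated_residual = residual(target, interpolate(forward, backward))

print("forward-only residual energy:", energy(dct_2d(forward_residual)))
print("interpolated residual energy:", energy(dct_2d(interpolated_residual)))
```

In this contrived case the forward-only residual is a constant brightness offset, whereas the interpolated residual vanishes. The price is that the following picture must be available before the current one can be decoded, which is exactly the delay that a storage application can tolerate and a real-time communication application cannot.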

Philips hosted the following meeting in Eindhoven at the end of January 1990. The most important result was the drafting of the Reference Model (RM) version zero (RM0), i.e. a general description of the algorithm to be used as a test bed to carry out experiments. The Okubo group had also used a similar document with the same name for the development of H.261, but MPEG formalised the process of Core Experiments (CE) as a practical means to improve the RM in a collaborative fashion. A CE was defined as a particular instance of the RM at one stage that allowed the execution of optimisation tests performed on one feature while keeping fixed all other options in the RM. At least two companies had to perform the CE and provide comparably positive results for the CE to qualify for promotion into the standard. This method of developing standards based on CE has been a constant in MPEG since then. 

RM0 was largely based on H.261. This, and a similar decision made in 1996 to base MPEG-4 Video on ITU-T H.263, is cited by some MPEG critics as evidence that MPEG does not innovate. Those making this remark actually answer their own criticism, because innovation is not an abstract good in itself. The timely provision of good solutions that enable interoperable – as opposed to proprietary – products and services is the value added that MPEG offers to its constituency – companies large and small. People would be right to see symptoms of misplaced self-assertion if MPEG were to choose something different from what is known to do a good job just for the sake of it, but that is not the MPEG way. On the other hand, I do claim that MPEG does a lot of innovation, but at the level of transforming research results into practically implementable audio-visual communication standards, as the thousands of past and present researchers who have worked in MPEG over the last quarter of a century can testify.

The Eindhoven meeting did not adopt the Common Intermediate Format (CIF) and in its stead introduced the Source Input Format (SIF). Unlike CIF, SIF is not yet another video format. It is two formats in one, but neither is “new”, because both are obtained by subsampling two existing formats: 625 lines @25 frames/s and 525 lines @29.97 frames/s. The former yields 288×352 pixels @25 Hz and the latter 240×352 pixels @29.97 Hz. CE results could be shown and considered irrespective of whether one or the other format was used.
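
The arithmetic behind the two SIF variants can be sketched as follows. The active picture sizes assumed for the source formats (720×576 and 720×480 luma samples, as in CCIR 601) and the explanation of the 352-pixel width as the halved width trimmed to a multiple of the 16-pixel macroblock are assumptions of this sketch, not statements from the text above; VideoFormat and to_sif are hypothetical names.

```python
from dataclasses import dataclass

MACROBLOCK = 16  # luma macroblock size used by H.261 and MPEG-1 Video


@dataclass(frozen=True)
class VideoFormat:
    name: str
    width: int         # active luma samples per line
    height: int        # active lines per frame
    frame_rate: float  # frames per second

    @property
    def macroblocks(self):
        """Number of whole macroblocks per picture."""
        return (self.width // MACROBLOCK) * (self.height // MACROBLOCK)

    @property
    def luma_samples_per_second(self):
        return self.width * self.height * self.frame_rate


def to_sif(source):
    """Halve both dimensions, then trim the width to whole macroblocks."""
    width = (source.width // 2) // MACROBLOCK * MACROBLOCK
    height = source.height // 2
    return VideoFormat("SIF from " + source.name, width, height,
                       source.frame_rate)


# Assumed active picture sizes of the two source formats.
SOURCE_625 = VideoFormat("625 lines", 720, 576, 25.0)
SOURCE_525 = VideoFormat("525 lines", 720, 480, 29.97)

for source in (SOURCE_625, SOURCE_525):
    sif = to_sif(source)
    print(sif.name, "->", sif.height, "x", sif.width,
          "@", sif.frame_rate, "Hz,",
          sif.macroblocks, "macroblocks,",
          round(sif.luma_samples_per_second), "luma samples/s")
```

Under these assumptions the two variants differ in picture size and rate but carry almost the same number of luma samples per second, which is one way to see why results obtained on either format could be considered together.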

At the end of the second and last day of the meeting, a hurricane of unprecedented proportions swept across the Netherlands. Trees were uprooted, roads blocked and trains stopped midway between stations: proof that the Forces of Nature were again showing their concern for the work of MPEG.

The development of MPEG-1 Video took two full years in total starting from the Kurihama tests and involved the participation of hundreds of experts, some attending the meetings and many more working in laboratories and providing results to be considered at the next meeting. As a result, an incredible wealth of technical inputs was provided that allowed the development of an optimised algorithm. 

A major role in this effort was played by the VLSI group, chaired by Geoffrey Morrison of BT Labs, who had replaced its founder, Colin Smith. The VLSI group provided the neutral place where the impact of different proposals – both video and audio – on complexity was assessed. MPEG-1 is a standard optimised for the VLSI technology of those years, because real-time video and audio decoding – not to mention encoding – was only possible with integrated circuits. Even though attention at that time was concentrated on VLSI implementation complexity, the subgroup already considered software implementation complexity as part of its mandate.

At the following meeting in Tampa, FL, hosted by IBM, the name Reference Model was abandoned in favour of Simulation Model (SM). This tradition of sequentially changing the name of the Model at each new standard has continued: the names Test Model (TM), Verification Model (VM), eXperimentation Model (XM) and sYstem Model (YM) have been used for each of the MPEG-2, MPEG-4, MPEG-7 and MPEG-21 standards, respectively. 

Still in the area of software, an innovative proposal was made at the Santa Clara, CA meeting in September 1990 by Milton Anderson, then with Bellcore. The proposal amounted to using a slightly modified version of the C programming language to describe the more algorithmic parts of the standard. The proposal was accepted, and this marked the first time that a (pseudo-) computer programming language had been used in a standard to complement the text. The practice has now spread to virtually all environments doing work in this and similar areas, and has actually been extended to describe the entire standard in software, as we will see soon.