Last update: 2011/08/21
The events that led to the definition of the first MPEG project: MPEG-1
The target of the first MPEG work item was of interest to many: the CE industry because it could create a new product riding the success of CD Audio, the IT industry because the scope of use of the PC was growing fast along with its computing power and local interactivity with pictures was a great addition, and the telco industry because of the possibility to obtain the much-needed integrated circuits for the H.261 real-time audio-visual communication that they, as explained before, were unable to develop by themselves.
This is possibly too sweetened a representation of industry feelings at that time. The way the CE industry used to operate was that when a new product was devised, each company, possibly in combination with some trusted ally, would develop the necessary technology (and file the patents enabling that technology). The product would then be put on the market but, at about the same time, competitors would also put other products with similar features on the market. The different products would then compete until the market crowned one as the winner. At that time the company or consortium with the winning product would register some key enabling technology of the product with a standards body and would start licensing the technology to all companies, former competitors included. This had happened for the Compact Cassette (CC) when the winner was Philips against Bosch, for the CD, when the winners were Philips and Sony against RCA, and for the VHS Video Cassette Recorder (VCR) when the winner was JVC against Sony.
The project proposed by MPEG, however, was introducing something that was clearly going to upset the established modus operandi of the CE world. Participants knew that, by accepting the rules of international standardisation, they would be deprived of the rightful time-honoured "war booty", i.e. the exclusive control of the patents needed to build the product that also largely controlled its evolution, in case they were the eventual winners. The advantage for them was the avoidance of costly format wars.
Another industry had mixed feelings: broadcasting. Even though digital television was an important goal for them, at that time the bitrate of 1.5 Mbit/s was considered way too low to provide pictures that broadcasters would even remotely consider as acceptable. On the other hand, they clearly understood that the technology used for MPEG could be extended to higher bitrates, those of interest to them. A glimpse of their attitude can be seen in the letter that Mr. Kirby, then the Director of CCIR, sent to the relevant CCIR SG Chairmen upon receiving news of the establishment of MPEG. The letter requested the Chairmen to study the impact that this unknown group would have on future CCIR activities in the area.
At my instigation, between January 1988 and the first MPEG meeting in May, a group of European companies had gathered with the intention of proposing a project to the ESPRIT program. A consortium was eventually established and a proposal put together. This had the name of Coding of Moving Images for Storage (COMIS) and a dual purpose. The first was to contribute to the successful development of the new standard by pooling European resources together and the second to give European industry a time lead in exploiting that standard. At the instigation of Hiroshi Yasuda, a project with similar goals was being built in Japan with the name of Digital Audio and Picture Architecture (DAPA). Some time later, a European project funded by the newly established Eurescom Institute (an organisation established by European telcos) and called Interactive Multimedia Services at 1 Mbit/s (IMS-1) was also approved.
Therefore, by the time the first meeting of the MPEG group took place in Ottawa, ON in May 1988, the momentum was already building and indeed 29 experts attended that meeting, although some of them were just curious visitors from the JPEG meeting next door. In Ottawa the mandate of the group was established. Drafting this was an exercise in diplomacy. There were already other groups dealing with video coding in ITU-T, ITU-R and CMTT, so the mandate was explicitly confined to Storage and Retrieval on Digital Storage Media (DSM). With this came the definition of the initial 3 phases of work:
| Phase | Scope |
|---|---|
| Phase 1 | Coding of moving pictures for DSMs having a throughput of 1-1.5 Mbit/s |
| Phase 2 | Coding of moving pictures for DSMs having a throughput of 1.5-5 Mbit/s |
| Phase 3 | Coding of moving pictures for DSMs having a throughput of 5-60 Mbit/s (to be defined) |
People in the business had no doubt about our plans. We intended to start working on low-definition pictures, for which a market was expected to exist because a great carrier - the CD - existed and because of plans of the CE industry and, partly, the telco industry. The next step would then be to move to standard-definition pictures, for which we expected a market to exist because industry was ready to accept digital television, as talks of it had been ongoing for years. Eventually we would move to HDTV. These plans were in sharp contrast with those prevailing, e.g. in European broadcasting circles, where the idea was to start from HDTV and define a top-down hierarchy of compatible coding schemes. This was a technically good plan, but one that would take years to be implemented, if ever.
One meeting in Turin and one in London in September followed the Ottawa meeting. So, with the video coding work in MPEG on good foundations, I could pursue another favourite theme of mine. A body dealing with moving pictures with a wide participation of industry was good, but fell short of achieving what I considered a goal of practical value, because audio-only applications are plentiful, but appealing mass-market video-only applications are harder to find. The importance of this theme was magnified by my experience of the ISDN videophone project of the ITU, an AV application par excellence. In that project the video coding standard (H.261), an outstanding piece of work, and the multiplexing standard (H.221), a technically less excellent piece of work - but never mind - had been developed, but the audio coding part had been left unsettled. This happened because the Video Coding Experts had launched the videophone project under the auspices of SG XV, while the Audio Coding experts operated in SG XVIII, and the videotelephone team did not dare to make any decision in a field over which they had no authority.
This organisational structure of the ITU-T, and a similar one in ITU-R and IEC, was a reflection of the organisation of the R&D establishments of that time: research groups in audio and video were located in different parts of the organisation because of their different backgrounds and target products. For a manufacturer of videophone equipment, the easiest thing to do was to use one B channel for compressed video and one B channel with PCM audio, never mind the not-so-subtle irony that one channel carried a bitstream that was the result of a compression of more than 3 orders of magnitude - from 216 Mbit/s down to 64 kbit/s - while the other carried a bitstream in the form prescribed by a 30-year-old technology without any compression at all! My personal experience in the use of videoconferencing - but I may be biased in my judgment - is that the video signal is always there, while the audio signal is there only if you are lucky. This is not because the audio coding experts have done a lousy job but because the integration of audio and video has never been given the right priority.
So, besides video, the audio component was also needed, and action was required so that MPEG would not end up like videoconferencing, with an excellent video compression standard but no audio (music), or with audio of a quality not comparable with the video quality. The other concern was that integrating the audio component in a system that had not been designed for it could lead to some technical oversights that could only be solved later with some abominable hacks. Hence the idea of a "Systems" activity, conceptually similar to the function executed by H.221 for ISDN videophone - but more technically sound - whose goal was to provide the specification of the complete infrastructure, including multiplexing and synchronisation of audio and video, that allowed the building of a complete AV solution.
After the promotional efforts made in the first months of 1988 to make the industry aware of the video coding work, I undertook a similar effort to inform the industries that MPEG was going to provide a complete audio-visual solution. In this effort I also contacted Prof. Hans-Georg Mussmann, director of the Information Processing Institute at the Technical University of Hannover. Hans was well known to me because he had been part of the Steering Committee of the "Workshop on 64 kbit/s coding of moving video", an initiative that I had started in 1988 to promote the progress of low bitrate video coding research and he had actually hosted the first two events. Because of his institute's and personal standing, Hans was playing a major role in the Eureka project 147 Digital Audio Broadcasting (DAB).
The last meeting of 1988 was held at Hannover. The first two days (29 and 30 November) were dedicated to video matters and held at the old Telefunken labs, those that had developed PAL. Part of the meeting was devoted to viewing and selecting video test sequences to be used for simulation work and quality tests. The CCIR library of video sequences had been kindly made available through the good offices of Ken Davies, then with the Canadian Broadcasting Corporation (CBC). On that occasion two video sequences - "Table Tennis" and "Flower Garden" - were selected and for many years they would be used and watched by thousands of people engaged in video coding research. Another output of that meeting was the realisation that the MPEG standard had to be able to integrate "multimedia" components, if it was to be fully exploitable for the main target of interactive applications on CD-Read Only Memory (CD-ROM). Therefore I undertook to see how this request could be fulfilled.
The last two days (1 and 2 December) saw the kickoff of the audio work with the participation of some 30 experts at Hans's Institute. Gathering so many audio coding experts had been quite an achievement because, unlike video and speech coding, for which there were well established technical communities with a long tradition in the development of standards - of which I was myself a part - audio coding was a field where the researchers were fewer and scattered across a small number of places, like the research establishments of ATT, CCETT, IRT, Matsushita, Philips, Sony, Thomson and a few others. The Hannover meeting gave attending researchers the opportunity to listen to the audio coding results of their peers. So the first MPEG subgroup - Audio - was born and Prof. Mussmann was appointed as its chairman. The meeting also produced a document, intended for wide external distribution, which invited interested parties to pre-register their intention to submit proposals for video and audio coding algorithms when MPEG issued a Call for Proposals (CfP).
Bellcore, a research organisation spun off from the Bell Laboratories after the break-up of ATT, hosted the February 1989 meeting at their facilities in Livingston, NJ. The main task of the meeting was to develop the first version of the so-called Proposal Package Description (PPD), i.e. a document describing all the elements that proposers of algorithms had to submit in order to have their proposals considered. The document also contained the first ideas concerning the testing of proposals, both subjective and objective.
Worth mentioning was also the participation of Mr. Roland Zavada of Kodak. Rollie, the chairman of a high-level image-related coordination group in ISO, had come to inspect this unheard-of group dealing with Moving Pictures - which he had clearly taken to mean Motion Pictures - with a membership growing at every meeting at a pace that had never been seen before in ISO.
The July 1989 meeting in Stockholm produced a new version of the PPD where the video part was final and incorporated in the CfP. This contained operational data for carrying out subjective tests but also data to assess VLSI implementability and to weigh the importance of different features. Similar data were also beginning to populate the part concerning the audio tests. For systems aspects the document was still at the more preliminary level of requirements.
At the same meeting the second MPEG subgroup - Video - was established and Didier Le Gall, then with Bellcore, was appointed as its chairman. This particular subgroup was established as a formalisation of the most prominent of the ad hoc groups that had already been working, meeting and reporting in the areas of Video, Tests, Systems, VLSI implementation complexity and Digital Storage Media (DSM).