The development of the MPEG-2 Audio and Video standards required the best experts in digital audio and video processing, but the development of the Systems part required seasoned engineers, a species in such scarce supply today that we may no longer be able to access their unique expertise (and not because of their untimely departure, but because they are spending their days on some exotic beach). The lucky thing for MPEG at that time was that there were plenty of them – and very good ones – because so many companies were waiting for a solution to make products or offer services.
The MPEG-2 media coding parts of the standard had been designed to be “generic” (hence the title eventually given to MPEG-2, “Generic coding of moving pictures and associated audio”). Capturing the requirements that the application domains placed on the Systems part, the interface with those domains, was therefore a huge task that was again assigned to the Requirements group.
A major requirement was that digital television would be carried by delivery systems that were mostly analogue, typically Hertzian channels and CATV. Different industries and countries had plans to develop solutions to digitise them with appropriate modulation schemes. So MPEG could assume that digitisation would “happen” (as in fact it did, albeit in a very disorderly and non-uniform fashion) but there were a number of functionalities between the media coding and the physical layer, such as multiplexing of different television programs, that were roughly equivalent to an OSI “transport layer” and that were not going to be provided by modulation schemes.
A brand new “systems” layer was needed, with completely different requirements than those that had led to the definition of MPEG-1 Systems. The MPEG-1 Systems layer had adopted a packet multiplexer, which I consider a great achievement (and, as I said, a personal technical vindication). This had happened thanks to the positive interaction between a group of IT-prone members and other open-minded groups of telco and CE members. That this outcome was not a foregone conclusion can be seen from the case of DAB, a service that uses MPEG-1 Audio but a traditional frame-based solution instead of the MPEG-1 Systems layer. The reasons are that the MPEG-1 Systems layer does not provide support for adaptation to the physical layer (e.g., it assumed an error-free environment, hardly a valid assumption on a radio channel) but, more importantly, that a packet-based multiplex was anathema to Audio engineers at that time.
In the digital television domain we were talking, if not of the same engineers, of people with a similar cultural background, so the packet-based vs frame-based argument popped up again. Eventually the decision was made to adopt a fixed-length packet-based multiplexer, a choice that somehow accommodated both views of the world (what is the difference between a frame and a fixed-length packet?). This, however, only solved one half of the problem, because a multiplex laden with features designed to support transmission in a hostile environment was not the best choice for the storage applications of that time. A second solution was required, essentially along the lines of the MPEG-1 Systems standard.
The first definition of the MPEG-2 Systems layer was achieved at the Sydney meeting in March/April 1993, where it was recognised that a single solution encompassing both application domains was not feasible, at least in the very tight timeline of the project. Therefore the systems layer was defined as having two forms, one called Transport Stream (TS) and the other Program Stream (PS).
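For the technically minded reader, here is a minimal sketch (in Python, and of course not part of the standard text) of what the fixed-length packet compromise looks like in practice: every Transport Stream packet is 188 bytes long and starts with a four-byte header carrying, among other things, the 13-bit packet identifier used to multiplex different television programs.

```python
# A minimal sketch of parsing one fixed-length Transport Stream packet,
# assuming the 188-byte packet and four-byte header of ISO/IEC 13818-1.

TS_PACKET_SIZE = 188   # the fixed packet length of the Transport Stream
SYNC_BYTE = 0x47       # every packet starts with this sync byte

def parse_ts_header(packet: bytes) -> dict:
    """Extract the header fields of one Transport Stream packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid 188-byte TS packet")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return {
        "transport_error_indicator": bool(b1 & 0x80),
        "payload_unit_start_indicator": bool(b1 & 0x40),
        "transport_priority": bool(b1 & 0x20),
        "pid": ((b1 & 0x1F) << 8) | b2,           # 13-bit packet identifier
        "scrambling_control": (b3 >> 6) & 0x03,
        "adaptation_field_control": (b3 >> 4) & 0x03,
        "continuity_counter": b3 & 0x0F,
    }
```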
There is no time for regret now, but I am still consumed by my failure to bring together all the industries that had an interest in a “transport solution for real-time media”. Granted, reconciling so many conflicting requirements would have been challenging, but now PS and TS basically have no common root other than the rather evanescent Packetised Elementary Stream (PES). As a result, the industries in need of a TS or PS solution went away with their part of the booty, while the telcos looked disdainfully from a distance at the TS/PS debate without even trying to join the discussion, lost as they were in their ATM Adaptation Layer (AAL) dispute of AAL1/AAL2 vs. AAL5. My regret is heightened by the fact that MPEG did have enlightened and competent people who could have provided the unifying solution, one able to withstand the unnatural solution, designed for non-real-time data on the network, that was forced down the throats of us media people for real-time media.
The request that the US National Body had made in Paris about a non-MPEG audio codec had been rejected, but the reasons that had prompted it remained unchanged. Indeed the USA, with their ATV project, were moving ahead with plans to deploy their terrestrial digital television system (which they did in 1997) and they wanted to use MPEG-2 Systems and Video but use a non-MPEG audio codec. How was it possible for them to do so if the system did not recognise a non-MPEG audio bitstream?
The problem was solved by establishing a Registration Authority (RA), a standard ISO mechanism to cater to an evolving standard that needs new references to be added without following the rather cumbersome process of Amendments and new Editions. Those who wanted to have their proprietary streams carried by the MPEG-2 Systems layer would register that stream with the RA, which would then assign a registration number to be carried in an appropriate field of the bitstream. The Society of Motion Picture and Television Engineers (SMPTE) was eventually appointed by ISO as the RA for this so-called “format identifier”.
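To illustrate the mechanism, here is a minimal sketch of how a stream registered with the RA can be signalled; it assumes the registration descriptor of MPEG-2 Systems (descriptor_tag 0x05), whose payload starts with the 32-bit format identifier assigned by the RA, and the value b"AC-3" used in the example is purely illustrative.

```python
# A minimal sketch of building a registration descriptor whose payload
# begins with the 32-bit format_identifier assigned by the RA.

def registration_descriptor(format_identifier: bytes, extra: bytes = b"") -> bytes:
    """Build a registration descriptor carrying an RA-assigned identifier."""
    assert len(format_identifier) == 4, "format_identifier is 32 bits"
    payload = format_identifier + extra
    return bytes([0x05, len(payload)]) + payload

# e.g. registration_descriptor(b"AC-3") could mark an elementary stream as
# carrying a privately registered, non-MPEG audio format.
```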
With the same mechanism it was possible to accept a request made at the Singapore meeting in November 1994 by the Confédération Internationale des Sociétés des Auteurs et des Compositeurs (CISAC), the international confederation of the rights societies of authors and composers. The request was to provide the means to signal copyright information regarding the video stream, the audio stream and the audio-visual stream in an MPEG-2 stream. The so-called “copyright identifier” solved the problem with a two-field number, where the first field identifies the agency managing the rights to the stream and the second field the identifier assigned by that agency to the specific content item. Again, the solution requires an RA from which agencies can obtain their identifiers.
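A similar sketch can illustrate the two-field copyright identifier; it assumes the copyright descriptor of MPEG-2 Systems (descriptor_tag 0x0D), with the first, 32-bit field obtained through the RA and the second field left to the rights agency.

```python
# A minimal sketch of the two-field copyright identifier: a 32-bit field
# identifying the rights agency, followed by the agency-specific identifier
# of the content item.

def copyright_descriptor(agency_id: bytes, item_id: bytes) -> bytes:
    """Build a copyright descriptor from the agency and content identifiers."""
    assert len(agency_id) == 4, "the agency identifier is 32 bits"
    payload = agency_id + item_id
    return bytes([0x0D, len(payload)]) + payload
```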
Another, very important, component was added to MPEG-2 Systems. This was in response to the request from pay TV operators to provide an infrastructure on top of which proprietary protection schemes could be implemented. The addition of two special messages solved the problem: Entitlement Control Messages (ECM) and Entitlement Management Messages (EMM). More about this later.
All that has been described so far was sufficient for the particular, though very important, Over-The-Air (OTA) broadcasting constituency, but not for those – the telecommunication and CATV industries – that employed physical delivery means. To stay in, or to move into, the business of digital television competitively, these industries needed a standard protocol to set up a channel with the remote device and to let a receiver interact with content stored at the source. The DSM group provided the home for this important piece of work.
An incredibly active group of people started gathering, under the chairmanship first of Tom Lookabaugh and later of Chris Adams, both of DiviCom, to develop the Digital Storage Media Command and Control (DSM-CC) standard that became part 6 of MPEG-2. In the best MPEG tradition, MPEG developed a completely generic standard. So, even if the DSM, telco and CATV industries had triggered the work, the final protocol is generic in the sense that it can be used both when a return channel exists and when the channel is unidirectional. In the latter case the transmitter can use a carousel, but the receiver is presented with a single interface. Ironically, because the Video on Demand (VOD) business did not fare as expected, the carousel part of the DSM-CC standard is widely used in broadcast applications.
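Purely to illustrate the carousel idea (the names below are mine, not the DSM-CC syntax): on a unidirectional channel the transmitter simply cycles through its data modules, so that a receiver tuning in at any moment eventually acquires what it needs without a return channel.

```python
# A purely conceptual sketch of a carousel transmitter: every module is
# retransmitted cyclically, so a receiver with no return channel simply
# waits for the next repetition of whatever it missed.

def run_carousel(modules: dict, send, rounds: int) -> None:
    """Repeatedly transmit every module; `send` writes to the channel."""
    for _ in range(rounds):
        for name, data in modules.items():
            send(name, data)   # a receiver reassembles whichever module it missed
```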
The last major component of MPEG-2 is the so-called Real-Time Interface (RTI). This was developed because the MPEG-2 Systems specification assumes that packets arrive at the decoder with zero jitter, clearly an idealised assumption that holds reasonably well in most OTA broadcast, satellite and CATV environments, but is not a valid assumption for such packet-based networks as ATM and Internet Protocol (IP). The purpose of part 9 of MPEG-2 is then to provide a specification for the level of jitter that an implementation is required to withstand.
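The notion can be illustrated with a small sketch: compare the actual arrival time of each packet with its ideal delivery time as implied by the clock samples carried in the stream, and check the largest deviation against a bound; the bound used below is illustrative, not the figure specified in part 9.

```python
# A minimal sketch of the jitter notion behind the Real-Time Interface:
# the largest deviation between ideal and actual packet arrival times is
# compared against an assumed bound.

def max_jitter(ideal_times: list, arrival_times: list) -> float:
    """Largest deviation, in seconds, between ideal and actual arrival."""
    return max(abs(a - i) for i, a in zip(ideal_times, arrival_times))

def conforms(ideal_times, arrival_times, bound_s: float = 50e-6) -> bool:
    """True if the measured jitter stays within the assumed bound."""
    return max_jitter(ideal_times, arrival_times) <= bound_s
```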
The MPEG-2 Systems, Video and Audio Committee Draft (CD), the first stage of the standard issued for ballot, was approved in Seoul in November 1993, after one of the most intensive weeks in MPEG history. Some delegates worked until 6 am on Friday to produce the three drafts so that they could be photocopied and distributed to all members for approval at the afternoon plenary. It was at that meeting that the mark of one million photocopied pages was reached.
The short night did not prevent Tristan Savatier from staging another of his tricks. He convinced one of the lady delegates to lend him her stockings and shoes and, during the coffee break of the Friday afternoon plenary, hid under my desk with the stockings pulled over his hands and arms and the shoes held in his hands. When I resumed the meeting he started showing the stockings and the shoes from below the desk as if they were mine.
The following Paris meeting allowed a review of the work done at the intense Seoul meeting. The Systems part was found to need a major overhaul, so it was decided that a special meeting would be held in June in Atlanta, GA, hosted by Scientific Atlanta, just before the regular July meeting. With this additional effort MPEG gave its final approval to the standard in Singapore in November 1994, as planned.
I would like to conclude this chapter by reporting what VADIS did to promote the development of MPEG-2, and specifically CSELT’s role in it. Besides active participation in tens of CEs, VADIS carried out a thorough campaign of field trials to assess the performance of the MPEG-2 standard. Some VADIS members produced audio-visual bitstreams, others made available transmission adaptors, like one of the first modems for satellite, cable and terrestrial UHF, and still others made available their ATM networks. CSELT had continued working on its multiprocessor architecture (the third generation, using an Intel i860 RISC instead of the original 80186 and five 2901 DSPs per board) and produced two real-time MPEG-2 decoders. The two decoders were still in regular use in my lab when I left in 2003. Another achievement of the project was the support given to the development of the VLSI design of an MPEG-2 Video decoder, which enabled Philips to become the 4th worldwide supplier of such chips, before it decided to divest its semiconductor division, which became NXP.