
MPEG-21 – Inside

Part 1 of the MPEG-21 standard has the title “Vision, Technologies and Strategy”. It is not a standard but a Technical Report, because it contains a description, rather than a specification, of the content of MPEG-21.

Digital Item Declaration (DID) is part 2. This normative part defines the technology supporting DIs. The purpose of the DID standard is to describe a set of abstract terms and concepts that form a useful model for defining DIs. The Digital Item Declaration Language (DIDL) is an XML language for defining DIs; by using DIDL, the standard XML representation of a DI is obtained.
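To make the model concrete, here is a minimal sketch, in Python, of how a DIDL-style declaration could be assembled with the standard library. The element names (Item, Descriptor, Statement, Component, Resource) follow the DID model, but the namespace URI and attribute details shown are illustrative assumptions, not normative DIDL.

```python
# A minimal sketch of a DIDL-style Digital Item declaration, built with
# Python's standard library. Element names follow the DID model; the
# namespace URI and attributes are illustrative assumptions.
import xml.etree.ElementTree as ET

DIDL_NS = "urn:mpeg:mpeg21:2002:02-DIDL-NS"  # assumed namespace URI
ET.register_namespace("didl", DIDL_NS)

def q(tag):
    """Qualify a tag name with the assumed DIDL namespace."""
    return f"{{{DIDL_NS}}}{tag}"

didl = ET.Element(q("DIDL"))
item = ET.SubElement(didl, q("Item"))

descriptor = ET.SubElement(item, q("Descriptor"))
statement = ET.SubElement(descriptor, q("Statement"), {"mimeType": "text/plain"})
statement.text = "My music album"

component = ET.SubElement(item, q("Component"))
ET.SubElement(component, q("Resource"),
              {"mimeType": "audio/mpeg", "ref": "track01.mp3"})

print(ET.tostring(didl, encoding="unicode"))
```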

For each transaction we need a means to identify the object of the transaction. Part 3 of MPEG-21, called “Digital Item Identification” (DII), plays that role by providing the means to uniquely identify DIs. The role of DII is similar to the one played by the International Standard Book Number (ISBN) for books, the International Standard Serial Number (ISSN) for periodicals and the International Standard Recording Code (ISRC) for recordings. Its scope includes: 

  • How to uniquely identify DIs and parts thereof (including resources); 
  • How to uniquely identify IP related to the DIs (and parts thereof); 
  • How to use identifiers to link DIs with related information such as descriptive metadata; 
  • How to identify different types of DIs. 

Part 4 of MPEG-21, “Intellectual Property Management and Protection (IPMP) Components”, provides the means to control the flow and usage of Digital Items throughout their lifecycle by specifying how to include IPMP information and protected parts of Digital Items in a DIDL document. The IPMP DIDL encapsulates and protects a part of the hierarchy of a Digital Item, and associates appropriate identification and protection information with it. This is related to the MPEG-4 IPMP-eXtensions (IPMP-X), started in 1999 as part of MPEG-4 and completed in 2002; the same technology has also been applied to MPEG-2, of which it has become part 11. IPMP-X defines standard ways of retrieving IPMP tools from remote locations, authenticating IPMP tools and exchanging messages between the tools used to protect a piece of content and a terminal that needs to process (e.g. decrypt, decode, present) the content. 

Already in the physical world we seldom have absolute rights to an object. In the virtual world, where the disembodiment of content from its carrier augments the flexibility with which business models can be conceived and deployed, this trend is likely to continue. That is why part 5 of MPEG-21, the Rights Expression Language (REL), has been developed: to express rights about a Resource in a way that can be interpreted, and possibly acted upon, by a computer. 

The MPEG REL data model for a rights expression consists of four basic entities and their relationship. This basic relationship is defined by the MPEG REL assertion “grant”. Structurally, an MPEG REL grant consists of the following:

  • The principal to whom the grant is issued
  • The right that the grant specifies
  • The resource to which the right in the grant applies
  • The condition that must be met before the right can be exercised

This is depicted in Figure 1 and sketched in code after the figure.


Figure 1 – The MPEG-21 REL model
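As a data structure, the four-part grant can be sketched as follows; the class and field names are mine, chosen for illustration – actual REL grants are expressed in XML, not code.

```python
# A minimal sketch of the four-part MPEG REL grant model. Class and
# field names are illustrative; real grants are XML expressions.
from dataclasses import dataclass

@dataclass
class Grant:
    principal: str  # to whom the grant is issued
    right: str      # the action the grant permits
    resource: str   # what the right applies to
    condition: str  # what must be met before the right can be exercised

grant = Grant(
    principal="alice",                 # assumed identifier
    right="play",
    resource="urn:example:song:123",   # assumed resource identifier
    condition="before 2026-01-01",
)
print(grant)
```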

A right exists to perform actions on something. Today we use verbs such as “display”, “print”, “copy” or “store” and, in a given context, we humans share the semantics of these words, i.e. what they mean. But computers do not, and must be “taught” the meaning. That is why MPEG developed the Rights Data Dictionary (RDD) as part 6 of MPEG-21, to give the precise semantics of all the actions used in the REL, as well as of many more actions. The basic semantics, however, is already in the REL standard. 

The digital world should give people the means to do more than just find new ways of doing old businesses. Content and service providers used to know their customers very well. They used to know – even control – the means through which their content was delivered. Consumers used to know the meaning of well-classified services such as television, movies and music. Today we have fewer and fewer such certainties: end users are less and less predictable, and the same piece of content can reach them through a variety of delivery systems and be enjoyed on a plethora of widely differing consumption devices. How can we cope with this unpredictability of end-user features, delivery systems and consumption devices? This is where Digital Item Adaptation (DIA), part 7 of MPEG-21, comes to help, because DIA provides the means to describe how a resource and/or description in a DI should be adapted (i.e. transformed) so that it best matches the specific features of the User and/or the Network and/or the Device. 

Figure 2 – The MPEG-21 DIA model

As shown in Figure 2, a Digital Item Adaptation Description specified by part 7 of MPEG-21 can be used by the (non-normative) “resource adaptation” and “descriptor adaptation” engines to produce adapted Digital Items.

Part 8 contains the usual Reference Software of the entire MPEG-21 standard. Part 9 is the MPEG-21 File Format, the “transport” format of a DI. The MPEG-21 file format inherits several concepts of MP4, as a DI may be a complex collection of information that contains still and dynamic media, information related to the DI such as metadata, layout information, etc. 

A DID is a static declaration defined using DIDL. Digital Item Methods (DIMs) are defined in Part 10 “Digital Item Processing” (DIP) to allow DI Users (authors, publishers, distributors, etc.) to add functionality to a DID, such as specifying, at the level of the DI itself, a selection of preferred procedures by which the DI should be handled. On receipt of a DID, a list of DIMs applicable to the DI is presented to the User. The User chooses one DIM, which is then executed by the DIP Engine. As an example, a music album DI might provide an “AddTrack” DIM so that a user can add a new track in the preferred format. 
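In the standard, DIMs are expressed in ECMAScript acting on a set of standardised base operations. The toy Python sketch below only mimics the idea of an “AddTrack” method; every name in it is invented for illustration.

```python
# A toy sketch of what an "AddTrack" Digital Item Method might do.
# Real DIMs are ECMAScript programs calling standardised base
# operations; all names below are invented for illustration.
def add_track(album_item, track_ref, mime_type="audio/mpeg"):
    """Append a new Component/Resource pair to a music-album Item."""
    component = {"Resource": {"ref": track_ref, "mimeType": mime_type}}
    album_item.setdefault("Components", []).append(component)
    return album_item

album = {"Title": "My music album", "Components": []}
add_track(album, "track02.mp3")
print(album)
```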

Back to part 3: getting an identifier for a DI is important, but how are we going to put a “virtual sticker” on it to carry the identification? This is where Persistent Association Technologies may be of help. The Secure Digital Music Initiative (SDMI, of which more below) struggled with the selection of very advanced “Phase I” and “Phase II” screening technologies, and its task was made harder by the fact that no established methods existed to assess the performance of these technologies. Part 11 of MPEG-21, called “Evaluation Methods for Persistent Association Technologies”, does exactly that: it is a Technical Report, hence non-normative, similar to a “best practice” guide for those who need to assess the performance of watermarking and related technologies. 

Part 12 is called “Test Bed for MPEG-21 Resource Delivery”. It is a comprehensive environment that can be used to test the effect of different conditions on the delivery of media resources.

During the long study period that eventually led to the acquisition of the technologies required to develop a scalable video coding standard, it was thought that such a form of video coding would require novel technologies that would not fit in the MPEG-4 standard the way, e.g., Advanced Video Coding (AVC) did. Part 13, Scalable Video Coding (SVC), was originally intended to host such a standard. However, when it became clear that SVC would be an extension of AVC, as opposed to a new standard, the work was moved to MPEG-4 as an amendment to part 10 (AVC) and MPEG-21 part 13 became void.

Conformance of an implementation is of course needed for MPEG-21 technologies as well. The purpose of Part 14 Conformance is therefore to provide the test methodologies and suites needed to assess the conformance of software that creates an MPEG-21 entity (typically an XML document), and of a decoder (typically a parser), to the relevant MPEG-21 standard.

Many application domains require a technology that can generate an event every time a Digital Item is processed and an action is performed on it. The technology achieving this is specified in Part 15 Event Reporting (ER).


Figure 3 – The MPEG-21 Event Report model

A User places an Event Report Request (ERR) in a DI. When the DI is received by a device, the ERR is passed to an ERR Receiver and parsed. An Event Receiver senses all internal and external events and passes them to an ER Builder that creates a message (Event Report) and dispatches it to the address indicated in the ERR.
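A schematic sketch of this flow follows; the dictionary fields and the report format are illustrative assumptions, not the Event Reporting schema.

```python
# A schematic sketch of the Event Reporting flow described above.
# Field names and the report format are illustrative assumptions.
def handle_event(err, event):
    """Build and dispatch an Event Report if an event matches an ERR."""
    if event["action"] in err["watched_actions"]:
        report = {
            "item": err["item_id"],
            "action": event["action"],
            "time": event["time"],
        }
        dispatch(report, err["recipient"])  # address indicated in the ERR

def dispatch(report, address):
    """Stand-in for real delivery of the Event Report."""
    print(f"sending {report} to {address}")

err = {"item_id": "di:42", "watched_actions": {"play"},
       "recipient": "https://example.com/reports"}
handle_event(err, {"action": "play", "time": "2004-05-01T10:00:00"})
```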

For some years now (starting with MPEG-7), MPEG has standardised a technology that allows the lossless conversion of a typically very bulky (because of its verbosity) XML document to a binary format, while preserving the ability to efficiently parse the binarised XML. The technology was originally part of MPEG-7 Systems but was later moved to MPEG-B Part 1, Binary MPEG format for XML (BiM); today MPEG-7 Part 1 Systems and MPEG-21 Part 16 Binary Format essentially just reference it.
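A toy illustration of why schema knowledge makes XML so compressible follows: if both ends know the schema, element names can travel as small integers instead of text. This is emphatically not the BiM format itself, just the underlying intuition.

```python
# A toy illustration of schema-aware binarisation: element names are
# replaced by small integers known to both ends. Not the BiM format.
TAGS = {"DIDL": 0, "Item": 1, "Descriptor": 2, "Component": 3, "Resource": 4}

def binarise(events):
    """Encode a flat list of (tag, text) events as compact bytes."""
    out = bytearray()
    for tag, text in events:
        payload = text.encode()
        out += bytes([TAGS[tag], len(payload)]) + payload  # tag id, length, text
    return bytes(out)

encoded = binarise([("Item", ""), ("Descriptor", "My music album")])
print(len(encoded), "bytes, against the much longer verbose XML")
```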

There are cases where it is necessary to identify a specific fragment of a resource as opposed to the entire set of data. Part 17 Fragment Identification (FID) specifies a normative syntax for URI Fragment Identifiers to be used for addressing parts of a resource from a number of Internet Media Types.

While Part 9 provides a solution to transport a Digital Item in a file, Part 18 Digital Item Streaming (DIS) provides the technology to transport a DI over a streaming mechanism (e.g. in broadcasting, when transport is done using the MPEG-2 Transport Stream, or over IP networks, when RTP/UDP/IP is used).

DIS enables the incremental delivery of a Digital Item (DID, metadata, resources) in a piece-wise fashion and with temporal constraints, so that the receiver may incrementally consume the DI. This is achieved by using the Bitstream Binding Language (BBL). BBL defines the syntax and semantics of instructions used to fragment a DI and map it onto a plurality of delivery channels, each employing a specific transport protocol.


Figure 4 – The Bitstream Binding Language and Digital Item Streaming
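A toy sketch of the piece-wise delivery idea follows: a serialised DI is split into fragments and each fragment is mapped onto a delivery channel. Fragment sizes, channel names and the mapping rule are all invented for illustration.

```python
# A toy sketch of BBL-style piece-wise delivery: split a serialised DI
# into fragments and map each onto a channel. All details are invented.
def fragment(di_bytes, size):
    """Split a serialised Digital Item into fixed-size pieces."""
    return [di_bytes[i:i + size] for i in range(0, len(di_bytes), size)]

channels = {"metadata": [], "resource": []}
for seq, piece in enumerate(fragment(b"...serialised Digital Item...", 8)):
    target = "metadata" if seq == 0 else "resource"  # assumed mapping rule
    channels[target].append((seq, piece))            # (sequence number, payload)

print({name: len(pieces) for name, pieces in channels.items()})
```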

Part 19 Media Value Chain Ontology (MVCO) provides a normative core model of a knowledge domain that spans the full media value chain and that can be extended to represent further specialisations. The MVCO thus provides a common backbone for interoperable standard services and products (metadata, licenses, attribution, etc.), offering new business model opportunities to a broad set of interconnected value chains and niches.

Part 20 Contract Expression Language (CEL) provides a standard structured digital representation of complete business agreements between parties. CEL may be used to represent contracts directly related to content or services.

The CEL features include:

  1. Identification of the contract and its parties;
  2. Digital expression of the agreed permissions, obligations and prohibitions, with the associated terms and conditions (deontic expressions), addressing the rights for the exploitation of intellectual property entities, including the specification of the associated conditions, together with other contractual aspects such as payments, notifications or material delivery;
  3. The possibility to insert the textual version of the contract and/or of specific clauses, especially for the case in which the original contract is written in natural language;
  4. The possibility to add metadata related to any contract entity and to encrypt the whole contract or any sub-part of it;
  5. As the contract is in an electronic format, the possibility to prove the agreement of the parties by their digital signatures.

Part 21 Media Contract Ontology (MCO) specifies an ontology for expressing CEL contracts in a semantic representation.

Part 22 User Description (UD) provides standard descriptions of user, context, service and recommendation, as illustrated in Figure 5.


Figure 5 – User Description Model

If the input (descriptions) and output (recommendation) data formats are standardised, it is possible for a user to process, e.g. combine, different recommendations to obtain the one recommendation that is best suited to the user, as depicted in Figure 6.


Figure 6 – Combination of standard recommendations
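As a sketch of the combination step, the snippet below averages per-item scores across several recommendations; the score model and the averaging rule are assumptions – Part 22 standardises the data formats, not the combination algorithm.

```python
# A minimal sketch of combining several standard-format recommendations
# by averaging per-item scores. The score model is an assumption.
from collections import defaultdict

def combine(recommendations):
    """Average item scores across several recommendation lists."""
    totals, counts = defaultdict(float), defaultdict(int)
    for rec in recommendations:
        for item, score in rec.items():
            totals[item] += score
            counts[item] += 1
    return {item: totals[item] / counts[item] for item in totals}

print(combine([{"song:1": 0.9, "song:2": 0.4},
               {"song:1": 0.7, "song:3": 0.8}]))
```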


Open Content Protection

MP3 has shown that the combination of technology and user needs can create mass phenomena behaving like a hydra. For every head of the MP3 hydra that is cut by a court sentence, two new heads appear. How to fight the MP3 hydra, then? Some people have fought it with the force of law but a better way was to offer a legitimate alternative providing exactly what people with their use of MP3 were silently demanding: any music, any time, anywhere, on any device. Possibly for free, but not necessarily so.

Unfortunately, this is easier said than done, because once you release an MP3 file everybody can have it for free and the means to distribute files multiply by the day. The alternative could then be to release MP3 files in encrypted form, so that the rights holder retains control, but then we are back – strictly from the technology viewpoint – to the digital pay TV model with the added complexity that the web is not (but is becoming) the kind of watertight content distribution channel that pay TV can be.

One day toward the end of 1997, Cesare Mossotto, then Director General of CSELT, asked me why the telco industry had successfully proved that it was possible to run a business based on a common security standard – embodied in the Subscriber Identification Module (SIM) card of GSM – while the media industry was still struggling with the problem of protecting its content without apparently being able to reach a conclusion. Shouldn’t it be possible to do something about it?

This question came at a time when I was myself assessing the take-up of MPEG-2 by the industry, three years after approval of the standard: the take-off of satellite television, which was not growing as fast as it should have, the telcos’ failure to start VoD services, the limbo of digital terrestrial television, the still ongoing (at that time) discussions about DVD, the software environment for the STB, and so on.

During the Christmas 1997 holidays I mulled over the problem and came to a conclusion. As I wrote in a letter sent at the beginning of January 1998 to some friends:

It is my strong belief that the main reason for the unfulfilled promise (of digital media technology – my note) lies in the segmentation of the market created by proprietary Access Control systems. If the market is to take off, conditions have to be created to move away from such segmentation.

By now you should realise that this sentence was just the signal that another initiative was in the making. I gave it the preliminary name “Openit!” and convened the first meeting in Turin, attended by 40 people from 13 countries and 30 companies, where I presented my views of the problem. A definition of the goal was agreed, i.e. “a system where the consumer is able to obtain a receiver and begin to consume and pay for services, without having prior knowledge of which services would be consumed, in a simple way such as by operating a remote control device”. Convoluted as the sentence may appear, its meaning was clear to participants: if I am ready to pay, I should be able to consume – but without transforming my house into the warehouse of a CE store, please.


Figure 1 – A model of OPIMA interoperability

On that occasion the initiative was also rechristened as Open Platform for Multimedia Access (OPIMA).

The purpose of the initiative was to make consumers happy by removing hassle from their lives: the consumer happiness made possible by such a platform would maximise the willingness of end users to consume content, to the advantage of Service Providers (SP), which in turn would maximise content provisioning, to the advantage of Content Providers (CP). The result would be an overall advantage for the different actors in the delivery chain. The meeting also agreed on a work plan that foresaw the completion of specifications in 18 months, achieved through an intense schedule of meetings every other month.

The second OPIMA meeting was held in Paris, where the OPIMA CfP was issued. The submissions were received and studied at the 3rd meeting in Santa Clara, CA, where it was also decided to make OPIMA an initiative under the Industry Technical Agreement (ITA) of the IEC. This formula was adopted because it provided a framework in which representatives of different companies could work to develop a technical specification without the need to set up a new organisation, as I had done with DAVIC and FIPA before.

From this followed a sequence of meetings that produced the OPIMA 1.0 specification in October 1999. A further meeting was held in June 2000 to review comments from implementors; it produced version 1.1, which accommodated a number of the comments received.

The reader who believes it is futile to standardise protection systems, or Intellectual Property Management and Protection Systems (IPMP-S), to use the MPEG-4 terminology, should wait until I say more about the technical content of the OPIMA specification. The fact of the matter is that OPIMA does not provide a standard protection system because OPIMA is a standard just for exchanging IPMP-Ss between OPIMA peers, using an OPIMA-defined protocol for their secure download. Thus the actual choice of IPMP-Ss is outside of the standard and is completely left to the users (presumably, some rights holders). The IPMP-S need not stay the same and can be changed when external conditions so determine.

An OPIMA peer produces or consumes Protected Content. This is defined as the combination of a content set, an IPMP-S set and a rules set that apply under the given IPMP-S. The OPIMA Virtual Machine (OVM) is the place where Protected Content is acted upon. For this purpose OPIMA exposes an “Application Service API” and an “IPMP Service API”, as indicated in the figure below.


Figure 2 – An OPIMA peer

The OVM sits on top of the hardware and the native OS but the way this is achieved is not specified by OPIMA. In general, at any given time an OPIMA peer may have a number of IPMP-Ss either implemented in hardware or software and either installed or simply downloaded. 

To see how OPIMA works, let us assume that a user is interacting with an application that may have been obtained from a web site or received from a broadcast channel (see Figure 3 below).


Figure 3 – An example of operation of an OPIMA peer

When the user wants to access some protected content, e.g. by clicking a button, the application requests the functionality of the Application Service API. The OVM sets up a Secure Authenticated Channel (SAC). Through the SAC, the IPMP-S corresponding to the selected content is downloaded (in an MPEG-4 application there may be several objects, each possibly protected with its own specific IPMP-S). The content is downloaded or streamed. The OVM extracts the usage rules associated with the content. The IPMP-S analyses the usage rules and compares them with the peer entitlements, as provided e.g. by a smart card. Assuming that everything is positive, the IPMP-S instructs the OVM to decrypt and decode the content. Assuming that the content is watermarked, the OVM will extract the watermark and hand the information over to the IPMP-S for screening. Assuming that the screening result is positive, the IPMP-S will instruct the OVM to render the content.
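The control flow just described can be sketched as follows; every method name is invented, since OPIMA specifies APIs and behaviour, not this code.

```python
# A schematic sketch of the OPIMA peer flow described above. Every
# method name is invented; this is a control-flow outline, not an API.
def access_protected_content(ovm, content_ref):
    sac = ovm.open_secure_channel()                  # Secure Authenticated Channel
    ipmp_s = sac.download_ipmp_system(content_ref)   # IPMP-S for this content
    content = ovm.fetch(content_ref)                 # downloaded or streamed
    rules = ovm.extract_usage_rules(content)
    if not ipmp_s.check(rules, ovm.entitlements):    # entitlements, e.g. smart card
        return None                                  # usage not permitted
    clear = ovm.decrypt_and_decode(content, ipmp_s)
    watermark = ovm.extract_watermark(clear)
    if not ipmp_s.screen(watermark):                 # watermark screening
        return None
    return ovm.render(clear)
```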

It might be worth examining the degree of similarity between the OPIMA solution and GSM. In the latter case each subscriber is given a secret key, a copy of which is stored in the SIM card and in the service provider’s Authentication database. The GSM system goes through a number of steps to ensure secure use of services: 

  • Connection to the service network;
  • Equipment Authentication. Check that the terminal is not blacklisted by using the unique identity of the GSM Mobile Terminal; 
  • SIM Verification. Prompt the user for a Personal Identification Number (PIN), which is checked locally on the SIM. 
  • SIM Authentication. The service provider generates and sends a random number to the terminal. This and the secret key are used by both the mobile terminal and the service provider to compute, through a commonly agreed ciphering algorithm, a so called Signed Response (SRES), which the mobile terminal sends back to the service provider. Subscriber authentication succeeds if the two computed numbers are the same. 
  • Secure Payload Exchange. The same SRES is used to compute, using a second algorithm, a ciphering key that will be used for payload encryption/decryption, using a third algorithm. 

The complete process is illustrated in Figure 4 and sketched in code after the figure.


Figure 4 – Secure communication in GSM
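A minimal sketch of the challenge-response core of this process is given below. Real GSM uses operator-chosen A3/A8 algorithms on the SIM; HMAC-SHA256 is used here only as a stand-in so that the example runs.

```python
# A minimal sketch of GSM-style challenge-response authentication.
# HMAC-SHA256 is a stand-in for the operator-chosen A3/A8 algorithms.
import hashlib
import hmac
import os

SECRET_KEY = b"shared-subscriber-key"  # Ki, held by the SIM and the operator

def signed_response(key, challenge):
    """Compute a short Signed Response (SRES) from key and challenge."""
    return hmac.new(key, challenge, hashlib.sha256).digest()[:4]

challenge = os.urandom(16)                              # RAND from the network
sres_terminal = signed_response(SECRET_KEY, challenge)  # computed on the SIM
sres_network = signed_response(SECRET_KEY, challenge)   # computed by the operator
assert sres_terminal == sres_network                    # subscriber authenticated

# From the same inputs a ciphering key (Kc) is derived for payload
# encryption; the derivation below is again only a stand-in.
kc = hmac.new(SECRET_KEY, b"cipher" + challenge, hashlib.sha256).digest()[:8]
print("authenticated, ciphering key:", kc.hex())
```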

While the work carried out by FIPA, MPEG and OPIMA was moving on at the pace of one meeting every 50 days on average, I was contacted by the SDMI Foundation. SDMI had been established by the audio recording industry as a not-for-profit organisation at the end of 1998, ostensibly in reaction to the MP3 onslaught or, as the SDMI site put it, 

to develop specifications that enable the protection of the playing, storing, and distributing of digital music such that a new market for digital music may emerge.

I was asked to be Executive Director of SDMI at a time when I was already reaching my physical limits with the said three ongoing initiatives – not to mention my job at CSELT, which my employer still expected me to carry out. If I accepted the proposal I would run the risk that a fourth initiative would be one too many, but the enticement was too strong to resist: being involved with a high-profile medium, already under stress because of digital technologies; helping shepherd music from the clumsy world of today to the bright digital world of tomorrow; doing so in an organisation that was going to be the forerunner of a movement that would set the rules and the technology components of the digital world. These were my thoughts after receiving the proposal, which I eventually accepted.

The first SDMI meeting was held in Los Angeles, CA at the end of February 1999. It was a big show with more than 300 participants from all technology and media industries. The meeting agreed to develop, as a first priority, the specification of a secure Portable Device (PD). To me that first meeting signalled the beginning of an exciting period: in just 4 months of intense work, with meetings held at the pace of one every two weeks, a group of people who did not have a group “identity” a few weeks before was capable of producing the first SDMI Portable Device specification 1.0 (June 1999). All meetings but one were held in the USA, and every other meeting was of the Portable Device Working Group (PD WG), to whose chairman Jack Lacy, then with AT&T Research, goes much of the credit for the achievement. I am proud to have been the Executive Director of SDMI at that time and to have presided over the work carried out by a collection of outstanding brains in those exciting months. 

I will spend a few words on that specification. The first point to be made is that PD 1.0 is not a standard like the MPEG standards, or even OPIMA, because it does not say what a PD conforming to the SDMI PD specification does with the bits it is reading. It is more like a “requirements” document that sets levels of security performance that must be satisfied by an implementation for it to be declared “SDMI PD 1.0 compliant”. It defines the following elements: 

  • “Content”, in particular SDMI Protected Content
  • “SDMI Compliant Application”
  • “Licensing Compliant Module” (LCM), i.e. an SDMI-Compliant module that interfaces between SDMI-Compliant applications and a Portable Device
  • “Portable Device”, a device that stores SDMI Protected Content received from an LCM residing on a client platform. 

Figure 5 represents a reference model for SDMI PD 1.0.


Figure 5 – SDMI Portable Device Specification Reference Model

According to this model, content, e.g. a music track from a CD or a compressed music file downloaded from a web site, is extracted by an application and passed to an LCM, from where it can be moved to the “SDMI domain”, housed by an SDMI compliant portable device.  

So far so good, but this was – in a sense – the easy part of the job. The next step was to address the problem that, to build the bright digital future, one cannot do away with the past so easily. One cannot just start distributing all music as SDMI Protected Content, because in the preceding 20 years hundreds of millions of CD players and billions of CDs had been sold, and these were all in the clear (not to mention releases on earlier carriers). People bought those CDs in the expectation that they could “listen” to them on a device. The definition of “listen” is nowhere to be found, because until yesterday it was “obvious” that this meant enjoying the music by playing a CD on a player, or after copying it to a compact cassette, etc. Today it means enjoying the music after compressing the CD tracks with MP3, moving the files to a portable device, and so on…

So it was reasonable to conclude that SDMI specifications would apply only to “new” content. But then SDMI needed a technology that would allow a playback device to screen “new” music from “old” music. This was a policy decision taken at the London meeting in May 1999, but its implementation required a technology. The traditional MPEG tool of drafting and publishing a CfP was adopted by SDMI as well, a good tool indeed because asking people “outside” to respond requires – as a minimum – that one has a clear and shared idea of what is being asked.

The selection of what would later be called “Phase I Screening Technology” was achieved in October 1999. This was a “robust” watermark inserted in the music itself indicating that the music file is “new” content. Robust means that the watermark information is so tightly coupled with the content that even the type of sophisticated processing that is performed by compression coding is unable to remove it. Still, the music should not be affected by the presence of the watermark since we do not want to scare away customers. The PD specification amended by the selected Phase I Screening Technology was called PD 1.1. 

One would think that this specification should be good news for those involved, because everybody – from garage bands, to church choirs, to major record labels – could now be in the business of publishing and distributing digital music while retaining control of it. 

This is the sequence of steps of how it should work:

  • The music is played and recorded
  • The digital music file is created
  • Screening technology is added
  • The screened digital music file is compressed
  • The compressed music file is encrypted
  • The encrypted compressed music file is distributed
  • A transaction is performed by a consumer to acquire some rights to the content.

But imagine I am the CEO of a big record label and there are no standards for identification, audio codec, usage rules language, Digital Rights Management (DRM), etc. Secure digital music is then a good opportunity to create a “walled garden” where content will only play on certain devices. With an additional bonus, viz. that there is a high barrier to entry for any newcomer – actually much higher than it used to be, because to be in business one must make a number of technology licensing agreements, find manufacturers of devices, etc. A game that only big guys can expect to be able to play.

In June 1999, while everybody was celebrating the successful achievement of SDMI PD 1.0, I thought that SDMI should provide an alternative with at least the same degree of friendliness as MP3, and that SDMI protected content should be as interoperable as MP3, lest consumers refuse to part with their money to get less tomorrow than they can have for free today. Sony developed an SDMI player – a technology jewel – and discovered at its own expense that people will not buy a device without the assurance that there will be plenty of music in that format and that their playback device will play music from any source. 

On this point I had a fight with most people in SDMI, from the record labels to the CE and IT companies – at least those who elected to make their opinions known at that meeting. I said at that time, and I have not changed my opinion today, that SDMI should have moved forward and made the other technology decisions. But my words fell on deaf ears. 

Still, I decided to stay on, in the hope that one day people would discover the futility of developing specifications that were not based on readily available interoperable technologies, and that my original motivation of moving music from the troglodytic analogue-to-digital transitional age to the shining digital age would be fulfilled. 

I was describing the working of the robust watermark selected for Phase I. If the playback device finds no watermark, the file is played unrestricted, because it is “old” content. What if the presence of the watermark is detected, because the content is “new”? This requires another screening technology, which SDMI called Phase II, capable of answering the question of whether the piece of “new” content presented to an SDMI player is legitimate or illegitimate – in practice, whether the file had been compressed with or without the consent of the rights holders. 

A possible technology solving the problem could have been a so-called “fragile” watermark, i.e. one with features that are just the opposite of those of the Phase I screening watermark. Assume that a user buys a CD with “new” content on it. If the user wishes to compress it in MP3 for his old MP3 player, there is no problem, because that player knows of no watermark. But assume that he wants to compress it so that it can be played on the feature-laden SDMI player that he likes so much. In that case the MP3 compression will remove the fragile watermark, with the consequence that the SDMI player will detect its absence and will not play the file. 
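The resulting two-phase screening decision can be sketched as follows; the watermark detectors are stand-ins, since the real Phase I and Phase II technologies were proprietary candidates evaluated by SDMI.

```python
# A sketch of the two-phase screening decision described above. The
# detectors are stand-ins for the proprietary SDMI technologies.
def has_robust_watermark(track):   # stand-in Phase I detector
    return track.get("robust", False)

def has_fragile_watermark(track):  # stand-in Phase II detector
    return track.get("fragile", False)

def screen(track):
    if not has_robust_watermark(track):
        return "play"    # no robust mark: legacy ("old") content
    if has_fragile_watermark(track):
        return "play"    # "new" content, fragile mark intact
    return "reject"      # fragile mark destroyed, e.g. by MP3 compression

print(screen({"robust": True, "fragile": False}))  # -> reject
```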

That phase of work, another technologically exciting venture, started with a CfP in February 2000. Submissions were received at the June meeting in Montréal, QC, and a methodical study started. At the September meeting in Brussels, with only 5 proposals remaining on the table, SDMI decided to issue a “challenge” to hackers. The idea was to ask them to try and remove the screening technology without seriously affecting the sound quality (of course one can always succeed in removing the technology by setting all the samples to zero). 

The successful hackers of a proposal would jointly receive a reward of 10,000 USD. The word “challenge” is somewhat misleading for a layman, because it can be taken to mean that SDMI “challenged” the hackers. Actually the word is widely used in the security field: when an algorithm is proposed, it is submitted to “challenges” by peers to see if it withstands attacks. Two of the five proposals (not the technologies selected by SDMI) were actually broken and the promised reward given to the winners. 

Unfortunately it turned out that none of the technologies submitted could satisfy the requirements set out at the beginning, i.e. being unnoticeable by so-called “golden ears”. So SDMI decided to suspend work in this area and wait for progress in technology. This, however, happened after I had left SDMI because of my new appointment as Vice President of the Multimedia Division at the beginning of 2001, after CSELT had been renamed Telecom Italia Lab and given a new, more market-oriented mission. 

I consider it a privilege to have been in SDMI. I had the opportunity to meet new people – and smart ones, I must say – and they were not all engineers, nor computer scientists.

Let me recount a related story. After I joined SDMI my standing with the content community increased substantially, in particular with the Società Italiana Autori ed Editori (SIAE), the Italian authors’ and publishers’ society. With Eugenio Canigiani, a witty SIAE technologist, we planned an initiative designed to offer SDMI content to the public, and we gave it the name dimension@.


Figure 6 – The dimension@ service concept

Dimension@ was planned to be a service for secure content distribution supporting all players of the music value chain. We hoped Telecom Italia and SIAE would be the backers, but it should not come as a surprise that a middle manager in one of the two companies blocked the initiative.


Craft, Intellect, and Art

In primitive societies every able individual was capable of making for himself all that was needed for a living: bows, arrows, shoes, tents, pottery, etc. From early on, the social organisation, originally of a complexity far from today’s, gradually evolved towards a specialisation of functions: the stronger ones became warriors, those good at making traps or hunting game went out to get food, the wiser ones became advisors to the tribe’s chief, and the cunning ones were sent to negotiate with other tribes. Even so, individuals had to rely on just themselves for most matters, far more than is even conceivable today.

As society grew more complex, specialisation of functions affected the individual in a deeper way. While still generally capable of making most of the tools of their daily life for themselves, those in need of a sword had better go to a good blacksmith, if they did not want to run the risk of finding themselves at a disadvantage on the battlefield. To impress the beloved one, it was advisable to go to a capable craftsman and buy a nice necklace; otherwise one would risk one’s chances. If a visit from an important person was expected, hiring a good cook to offer the visitor a sumptuous meal was a necessity, if one wanted to get the much sought-after contract. 

From the earliest times and under a variety of civilisations and social orders, people have happily parted with some of their wealth for the purpose of laying their hands on physical objects, produced by the skill of some craftsman, to achieve some concrete goal in the wider world or for their own personal enjoyment. 

With the growing sophistication of daily life, there were other cases where humans became ready to part with portions of their wealth for other, less physical, but for them no less important, matters. Somebody, realising his own lack of knowledge or inferior intellectual capabilities, might decide to “borrow” the missing abilities from somebody else who has a reputation for being endowed with them. In case of illness the medicine man might be called because his potions, magic formulas, dances and prayers were said to work wonders. After many months of drought, the shaman might be offered huge rewards if he only could get rain (but he was probably clever enough to request payment before his prayers took effect). The same ritual could happen if a tribe wanted abundant game or a good harvest. The making of these “wonders” required compensation to the person who effected them. 

With the further progress of civilisation, the ability to use words skilfully gave more and more benefits to those who mastered that skill. They could persuade somebody else to do something he had otherwise no intention of doing. The person able to give good advice to others would be considered a wise man and he could get many benefits if he exploited his skill. His ability to make strong arguments would convince people in trouble with the clan elders to enlist such services and defend themselves better. Great ideas about the current form of society would prompt the head of a social organisation to enlist the holder of such ideas in his ranks so that the status quo in society would be preserved. If the ideas were radical and promoted a new social order, and if the holder of these ideas was smart enough, he could himself become the head of a new social organisation. 

History provides countless examples. Demosthenes earned a living by offering counsel to Athenians in trouble with the law, but his political ambitions were thwarted when he failed to rouse the Athenians against Alexander the Great. Cicero attacked Catiline and successfully exploited that case for his social standing and fame, but had to succumb to Antony’s wrath when his party was defeated. Confucius preached his philosophy in the hope that one of the rulers of the Chinese kingdoms of his time would hire him. Moses became the leader of his people within a reshaped religion, and Mohammed became the founder and head of a new religion. 

There is another type of intellectual ability that has existed ab immemorabili, possibly even before speech actually took shape. Humans use songs, often accompanied by rhythmic movements of the body or by some musical instrument, to convey a message that could otherwise not be expressed, or expressed poorly with words alone. This communication medium makes words more effective because rhythm heightens their effect as it allows performers to convey them with a passion that the rational product called “word” often cannot. 

Public Authorities, civil as well as religious, have always been good at getting control of those social moments when people forget their daily troubles in those opportunities of social relaxation called festivals. The authorities often lavishly remunerate those involved in the musical and scenic events because of the role that these moments have always played in preserving and fostering social cohesion and because of their functions in maintaining the essential elements of individuality of a social grouping. 

But, alongside the lucky ones, those groups of people or individuals who cannot get the support of some authority, or simply want to follow their own artistic inspiration without compromise, can likely meet with a very different fate. Artists and performers can be very good at playing a musical instrument or dancing or reciting a poem and people would even come from far away just to listen to or watch them. The artists and performers may personally be very fond of being surrounded by a crowd of people enthused by their performing ability. Those around, if they feel like that, might even happily give some coins, just to show the performers how much they like the way they sing, play, dance or recite. But others would give nothing thinking that the very fact that they have been listening to or watching them is sufficient reward to the performer. 

All this does not really matter if the artists and performers are wealthy persons and their only purpose in life is to express their creativity, possibly for their own exclusive pleasure. However, if they are poor or average persons, possibly with a family to raise, we can see a blurring of the border between an artist, receiving a reward for his creativity from the good will of onlookers, and a person who lives on the goodwill of others. 

This description of artists and performers is not something recalled from a distant past; it is still with us today. The sad fact is that people take it for granted that they must pay for the work of a blacksmith or an attorney, but they consider it an option to pay for the performance of a singer or an actor. Maybe because art and business flair seldom go together, artists have often felt the need for somebody else to take care of the promotion of their work. 

Of course there are excellent artists who have a good life, but it is a sad fact that the excellence of too many great artists is recognised only after they are dead.


List of MPEG standards

Std Pt Title
1 1 Systems
2 Video
3 Audio
4 Compliance testing
5 Software simulation
2 1 Systems
2 Video
3 Audio
4 Conformance testing
5 Software simulation
6 Extensions for DSM-CC
7 Advanced Audio Coding (AAC)
8 VOID
9 Extension for real time interface for systems decoders
10 Conformance extension – DSM-CC
11 IPMP on MPEG-2 Systems
4 1 Systems
2 Visual
3 Audio
4 Conformance testing
5 Reference software
6 Delivery Multimedia Integration Framework (DMIF)
7 Optimized reference software for coding of audio-visual objects
8 Carriage of ISO/IEC 14496 contents over IP networks
9 Reference hardware description
10 Advanced Video Coding
11 Scene description and application engine
12 ISO base media file format
13 Intellectual Property Management and Protection (IPMP) extensions
14 MP4 file format
15 Carriage of NAL unit structured video in the ISOBMFF
16 Animation Framework eXtension (AFX)
17 Streaming text format
18 Font compression and streaming
19 Synthesised texture stream
20 Lightweight Application Scene Representation (LASeR) and Simple Aggregation Format (SAF)
21 MPEG-J Graphics Framework eXtensions (GFX)
22 Open Font Format
23 Symbolic Music Representation
24 Audio and systems interaction
25 3D Graphics Compression Model
26 Audio conformance
27 3D Graphics conformance
28 Composite font representation
29 Web video coding
30 Timed text and other visual overlays in ISO base media file format
31 Video coding for browsers
32 Reference software and conformance for file formats
33 Internet Video Coding
7 1 Systems
2 Description definition language
3 Visual
4 Audio
5 Multimedia description schemes
6 Reference software
7 Conformance testing
8 Extraction and use of MPEG-7 descriptions
9 Profiles and levels
10 Schema definition
11 MPEG-7 profile schemas
12 Query format
13 Compact descriptors for visual search
14 Reference software, conformance and usage guidelines for CDVS
15 Compact descriptors for video analysis
21 1 Vision, Technologies and Strategy
2 Digital Item Declaration
3 Digital Item Identification
4 Intellectual Property Management and Protection Components
5 Rights Expression Language
6 Rights Data Dictionary
7 Digital Item Adaptation
8 Reference Software
9 File Format
10 Digital Item Processing
11 Evaluation Methods for Persistent Association Technologies
12 Test Bed for MPEG-21 Resource Delivery
13 VOID
14 Conformance Testing
15 Event Reporting
16 Binary Format
17 Fragment Identification of MPEG Resources
18 Digital Item Streaming
19 Media Value Chain Ontology
20 Contract Expression Language
21 Media Contract Ontology
22 User Description
A 1 Purpose for multimedia application formats
2 MPEG music player application format
3 MPEG photo player application format
4 Musical slide show application format
5 Media streaming application format
6 Professional archival application format
7 Open access application format
8 Portable video application format
9 Digital Multimedia Broadcasting application format
10 Surveillance application format
11 Stereoscopic video application format
12 Interactive music application format
13 Augmented reality application format
14 VOID
15 Multimedia Preservation Application Format
16 Publish/Subscribe Application Format
17 Multisensorial Media Application Format
18 Media Linking Application Format
19 Common Media Application Format
20 Visual Identity Application Format
21 Visual Identity Management Application Format
22 Multi-Image Application Format
B 1 Binary MPEG format for XML
2 Fragment Request Units
3 XML IPMP messages
4 Codec configuration representation
5 Bitstream Syntax Description Language (BSDL)
6 VOID
7 Common encryption format for ISO base media file format files
8 VOID
9 Common Encryption for MPEG-2 Transport Streams
10 Carriage of Timed Metadata Metrics of Media in ISO Base Media File Format
11 Green metadata
12 Sample Variants
13 Media Orchestration
14 Partial File Format
C 1 Accuracy requirements for implementation of integer-output 8×8 inverse discrete cosine transform
2 Fixed-point 8×8 inverse discrete cosine transform and discrete cosine transform
3 Representation of auxiliary video streams and supplemental information
4 Media tool library
5 Reconfigurable media coding conformance and reference software
6 Tools for reconfigurable media coding implementations
D 1 MPEG Surround
2 Spatial Audio Object Coding (SAOC)
3 Unified speech and audio coding
4 Dynamic Range Control
E 1 Architecture
2 Multimedia application programming interface (API)
3 Component model
4 Resource and quality management
5 Component download
6 Fault management
7 System integrity management
8 Reference software
V 1 Architecture
2 Control information
3 Sensory information
4 Virtual world object characteristics
5 Data formats for interaction devices
6 Common types and tools
7 Conformance and reference software
M 1 Architecture
2 MPEG extensible middleware (MXM) API
3 Conformance and reference software
4 Elementary services
5 Service aggregation
U 1 Widgets
2 Additional gestures and multimodal interaction
3 Conformance and reference software
H 1 MPEG Media Transport (MMT)
2 High Efficiency Video Coding
3 3D Audio
4 MMT Reference Software
5 HEVC Reference Software
6 3D Audio Reference Software
7 MMT Conformance Testing
8 HEVC Conformance Testing
9 3D Audio Conformance Testing
10 MPEG Media Transport Forward Error Correction (FEC) codes
11 MPEG Composition Information
12 Image file format
13 MMT Implementation guidelines
14 Conversion and coding practices for HDR/WCG video
15 Signalling, backward compatibility and display adaptation for HDR/WCG video
DASH 1 Media presentation description and segment formats
2 Conformance and reference software
3 Implementation guidelines
4 Segment encryption and authentication
5 Server and Network Assisted DASH
6 DASH with Server Push and WebSockets
7 Delivery of CMAF content with DASH
I 1 Immersive Media Architectures
2 Omnidirectional MediA Format
3 Immersive Video Coding
4 Immersive Audio Coding
5 Point Cloud Compression
6 Immersive Media Metrics
7 Immersive Media Metadata
8 Network Based Media Processing
CICP 1 Systems
2 Video
3 Audio
G 1 Transport and Storage of Genomic Information
2 Genomic Information Representation
3 API for Genomic Information Representation
4 Reference Software
5 Conformance
IoMT 1 IoMT Architecture
2 IoMT Discovery and Communication API
3 IoMT Media Data Formats and API
Expl 1 Advance signalling of MPEG containers content
2 Digital representation of neural networks
3 Future Video Coding
4 Hybrid Natural/Synthetic Scene Container
5 Network Distributed Video Coding

An Introduction


All living beings communicate in some form, and the living beings currently on top of the ladder – we, the humans – have the most advanced native form of communication: the word. Not content with it, they have invented and used a range of technologies that have made communication between them ever more effective:

  • Directly by them: drawing, painting, sculpture, playing music, writing
  • Through machines: printing, photography and cinematography
  • Through immaterial means: wired and wireless communication of text, audio and video
  • By recording: audio and video.

It has taken millennia to get the first and a few centuries to get the last three. Starting just some 30 years ago, however, the ability of humans to communicate has been greatly impacted by the combination of three digital technologies that have brought about the Digital Media Revolution:

  • Media – handling all information sources via the common “bit” unit;
  • Network – delivering information bits everywhere;
  • Device – processing information bits inexpensively.

The Digital Media Revolution shows no sign of abating and it is likely that we will continue riding the media bits for quite some time. Therefore I decided – almost 20 years ago – to write these pages under the title “Riding The Media Bits”, because most Digital Media Technologies have been – and more continue to be – spawned by the Moving Picture Experts Group (MPEG) that I conceived, created and led for 32 years.


My goal is to provide the knowledge necessary to understand the nature of media, how digital media came about and how technologies fostered their evolution. Technologies are seen from the perspective of my own experience – Media – but I have also covered Device and Network aspects when I found it appropriate to complete the picture.

The target reader of these pages is non-technical. The matters handled, however, typically involve sophisticated technologies and some knowledge of them will be required, if understanding is not to come out of thin air. I dare say, though, that technical readers can also benefit from being exposed to the breadth of issues treated in these pages.

In order not to scare away the readers of this first page, I guarantee that I have made every effort to reduce the technical requirements to the minimum necessary. Non-technical readers are therefore advised to exercise a minimum of perseverance (often not very much) when they find themselves confronted with technical descriptions, if they want to reap the promised results. As a last resort, they may skip the chapter that is challenging them beyond their desire to understand.

There is one last thing I would like to state before taking the reader with me for a 32-year ride on the media bits. You will find that personal pronouns are rigorously kept in the masculine form. I know this is politically incorrect, but I do think that, if a language forces people to use personal pronouns in a sentence as English does, there should be one of two choices: either the language is changed to make the use of pronouns optional, as in Italian or Japanese, or the people who expect to see a constant use of “he or she”, “him or her”, “his or hers”, etc. become less prudish. As neither of these options is within my reach, I will do as I said. After all, I would rather look like a male chauvinist and use masculine pronouns than be a male chauvinist but use politically correct expressions.

The only promise I can make is that I will use all personal pronouns in the feminine form on the next occasion (if there ever is one :-).

This page would not be complete if I did not acknowledge my English mentor – Philip Merrill. On his own initiative he has reviewed many of the original pages, providing countless invaluable suggestions. If the pages are more understandable – and readable – the credit goes to him. If they are not, the discredit goes only to me.


A Guided Tour

You can ride the digital media bits in many ways, even create your own roadmap, but the table below suggests one that combines the sequence of events with a meaningful story. The suggested reading is organised in the 25 chapters listed below, each subdivided into a variable number of sections. The structure is mostly sequential in time but also tries to accommodate the evolution of technology.

1 Why, How And For What An introduction, a guided tour and a table of contents
2 The Early Communication A brief review of communication in the history of mankind, the role of Public Authorities, why communication by digital means was preferable and why we need compression of digital media
3 The Early Digital Communication How digital communication technologies were first developed and deployed, a brief history of computing, how bits were stored and transmitted and why telecom bits are somewhat different from computer bits. Finally how a fault line in my professional life led to the creation of MPEG
4 Media Get Digital A brief tale of the events that led to MPEG-1, the development of its 3 main technologies – Video, Audio and Systems, the role of reference software and conformance, a look inside MPEG-1 and what MPEG-1 has achieved.
5 Digital Media Get Better A brief history of television and why it was so important to go digital, the development of 3 main MPEG-2 technologies – Video, Audio and Systems, a look inside MPEG-2, what MPEG-2 has achieved and how a bold global initiative tried to accelerate the deployment of digital television.
6 Standards, ISO And MPEG Why standards are important, the role of patents, the MPEG way of developing standards, how an MPEG meeting unfolds and a sample of life in an international organisation like ISO
7 Works, Rights And Exploitation Why it is difficult to achieve recognition of the value of some intellectual works, how technology helps the distribution of those works, how rights are defined and how they can be protected
8 Computers And Internet The role of software, particularly the operating system, and how it is possible to remove the dependency of applications on it; how the current Graphical User Interface was developed, how computers achieved the creation of pictures and sound, and how the internet came to pervade our lives
9 Digital Media Do More Before getting into the MPEG-4 story we have to recall how media came to meet computers, the development and the inside of the many MPEG-4 components, and what MPEG-4 has achieved
10 Software And Communication How different bytes are made out of the same bits, a short story of Open Source Software and the MPEG relationships with it, how patents and standards create new forms of communication, trying to make digital media standards without patents and the myth of real-time person-to-person audio-visual communication
11 Digital Media For Machines About adding descriptions to other data, the fascinating story of other internet technologies, the development and the inside of the MPEG-7 standard, how machines have begun to talk to other machines, and what MPEG-7 has achieved
12 More About Rights and Technologies The many ways technology changes rights and their enforcement, how content protection can be opened, facing a world that MP3 has changed forever and enquiring why, if technology changes society, laws should not change
13 Frameworks For Digital Media The development of MPEG-21, looking inside it, the story of the Digital Media Project, looking inside its specification, and opening the way for a deployment
14 Putting Digital Media Together A brief description of the first batch of MPEG-A, the digital media integration standard, followed by 3 more Application Formats: Multimedia Preservation, Publish/Subscribe and Media Linking
15 More MPEG Triadic Technologies Why there was a need for more Systems, Video and Audio standards: a short overview of what they do, and a standard to describe decoders and to build repositories of media coding tools
16 Technologies For Virtual Spaces How MPEG-4 deals with 3D Graphics, how we can establish bridges between real and virtual worlds, how we can interact with digital media in a standard way and how an application format can help kickstart Augmented Reality
17 Systems And Services How MPEG has standardised parts of the inside of devices, how I developed a business using standard technologies and how MPEG standards can be used to build a better internet of the future
18 Coping with an unreliable internet Even though very little of what is called the internet guarantees anything, our society is based on it. MPEG has developed standards that decrease the impact of internet unreliability on its media.
19 More System-wide Digital Media Standards Why we need MPEG-H, another integrated Systems-Video-Audio standard, how we can communicate over an unreliable internet, and looking inside MPEG-H for Systems, 2D and 3D Video, and Audio
20 Compression, the technology for the digital age Compression has been MPEG’s bread and butter and the propulsive force of digital media. But why should compression only apply to media? There are other digital sources that benefit from compression.
21 The future of media – immersion So far the digital media experience has been largely based on extensions of early media technologies. Technology promises to provide virtual experiences that are indistinguishable from real ones.
22 Internet of Media Things Internet of Things is a catchword that describes the ability of machines (things) to communicate and process information. MPEG is developing standards for the case when the things are media things.
23 Glimpses Of The Future About the – sometimes wild – ideas for future MPEG standards, the future of MPEG and the future of research
24 Acknowledgements Thanking the many people without whom we would be riding different media bits
25 Support Material A detailed list of acronyms, the hall of fame of those who served or are still serving as MPEG Chairs, and the complete list of all MPEG standards (so far)

 


The Roadmap of “Riding the Media Bits”

The table below provides the full list of chapters and sections on the left-hand side. The right-hand side briefly introduces the content of each section.

Table 1 – How to navigate the Riding The Media Bits pages

1 Why, How And For What
Introduction What has motivated the writing of these pages and what they hope to achieve
A guided tour A summary of each of the main areas described in these pages
The Roadmap A summary of each of the individual pages
2 The Early Communication
Communication Before Digital How the forms of communication and of the business of providing the means to communicate have evolved in analogue times
Communication And Public Authorities The role of public authorities and international organisations in communication
Digital Communication Is Good The steps that brought digital technologies within the reach of exploitation by the media industries
Compressed Digital Is Better The developments that led to the practical exploitation of digital technologies for the media
3 The Early Digital Communication
The First Digital Wailings The first sample applications of digital technologies to the media
Digital Technologies Come Of Age The first practical cases of exploitation of digital technologies for the media
Electronic Computers A succinct history of the hardware side of data processing
Carrying Bits Solving the problem of storing and transmitting bits on analogue carriers
Telecom Bits And Computer Bits Bits are bits are bits, but telecom bits are different from computer bits
A personal faultline A fault line in my professional life that led to the creation of MPEG
4 Media Get Digital
The 1st MPEG Project The events that led to the definition of the first MPEG project: MPEG-1
MPEG-1 Development-Video The development of MPEG-1: Video
MPEG-1 Development-Audio The development of MPEG-1: Audio
MPEG-1 Development-Systems The first time IT puts media together in a synchronised way
Reference Software Software and standards used to be in different worlds. How they first became two sides of the same coin
Conformance Why MPEG standards need conformance and how it can be assessed
Inside MPEG-1 An overview of the technical content of MPEG-1
The Achievements Of MPEG-1 How MPEG-1 has influenced and benefited the media industry
5 Digital Media Get Better
The Highs And Lows Of Television The importance of television, how it was deployed and how it (should have) developed
The digital television maze Why digital television is such a good idea and why using it is so difficult
MPEG-2 Development-Video The steps that led to the development of MPEG-2 Video
MPEG-2 Development-Audio The steps that led to the development of MPEG-2 Audio and AAC.
MPEG-2 Development-Systems The steps that led to the development of MPEG-2 Systems, DSM-CC and RTI.
Inside MPEG-2 An overview of the technical content of MPEG-2
The Impact Of MPEG-2 How MPEG-2 has influenced and benefited the media industry
Beyond MPEG-2 Digital Audio And Video Why there was a need for DAVIC, what it did and why it was wound up
6 Standards, ISO And MPEG 
The Need For Standards Standards are important but their role must be properly understood
Patents And Standards If standards require patented technology their use must obey some rules
The MPEG Way Of Standards Making The unique MPEG way to develop standards. 
An MPEG Meeting A virtual experience of how an MPEG meeting unfolds
Life In ISO A sample of life in an international organisation and how it affected the first phases of MPEG
7 Works, Rights And Exploitation 
Craft, Intellect And Art People agree to pay for the work of a blacksmith or an attorney, but some consider it an option to pay for the performance of a singer or an actor
Fixating Works How technology used to help the distribution of literary and artistic works
Rights The rights to a hammer are obvious, those to a book less so, those to a bunch of bits are still waiting for a solution
Protecting Content The need to protect digital content and how it can be done
8 Computers And Internet
Computer Programming The role of software, and particularly operating systems, in IT
Operating System Abstraction Is there a way to remove the dependency of applications on the operating system?
Humans Interact With Machines Brief history of how we came to the current Graphical User Interface to enable interaction with computers
Computers Create Pictures And Sound Brief history of a complex business case of IT use in the media space: humans perceive pictures and sound created not by the real world but by computers as well
Internet The fascinating story of a technology and how it changed the media landscape
9 Digital Media Do More
Media Meet Computers And Digital Networks The story of a project integrating most of the different technologies we have talked about so far – and more
MPEG-4 Development How MPEG-4 developed to become _the_ multimedia standard
Inside MPEG-4 – Systems An overview of the MPEG-4 Systems layer
Inside MPEG-4 – Visual The story of how we tried to build videos from objects
Inside MPEG-4 – Audio MP3 suggested that everything for audio was done, but AAC shows that was not the case
Inside MPEG-4 – File Format The first encounter of MPEG with files (as opposed to streams)
Inside MPEG-4 – Font An overview of a multimedia content type of critical importance
Inside MPEG-4 – Advanced Video Coding MPEG continues pushing the limits of video compression farther
The Impact Of MPEG-4 How MPEG-4 has changed the media landscape
10 Software And Communication
Bits And Bytes Bytes are made of 8 bits, but chopping a bitstream in chunks of 8 bits does not necessarily make bytes
Open Source Software Writing software may be an art and some artists have pretty special ideas about the use of the “art” they create
MPEG and Open Source Software MPEG is a group operating in an industrial environment, but the software it develops uses principles similar to those of the Open Source Software community
The Communication Workflow The role of patents and standards in the creation of new forms of communication
Type 1 Video Coding Sometimes it helps to question the foundations of the way we operate
A Fuller Form Of Communication The myth of real-time communication with pictures in addition to audio
11 Digital Media For Machines
Tagging Information A key technology to add descriptions to other data
The World Wide Web The fascinating story of other internet technologies and how they changed our lives
MPEG-7 Development The MPEG standard to describe what a piece of content is or contains
Inside MPEG-7 An overview of the technical content of MPEG-7
Machines Begin To Understand The World Searching for information out of an image
The Impact Of MPEG-7 How MPEG-7 is beginning to change the way people access content
A World Of Peers If humans can talk to humans, why should machines not talk to machines (intelligently)?
12 More About Rights and Technologies
Technology Challenging Rights Learning from MP3: the many ways technology changes rights and their enforcement
Opening Content Protection Two relevant stories teaching that it does not help to preserve the value of content by protecting it if people cannot access it
The World After MP3 MP3 has changed the media world forever. People must stop playing the game their traditions have accustomed them to play.
Technology, Society and Law If Digital Media Technologies have wrought a revolution in society, why should the laws governing it not change? Can patching the old be an adequate response?
13 Frameworks For Digital Media
MPEG-21 Development MPEG-21 contains the components of a global solution
Inside MPEG-21 The technologies that let users build reasonable digital media systems
The Digital Media Project A project to right any wrongs that users of technologies may have made
Inside The Interoperable DRM Platform A walkthrough of value chains enabled by the end-to-end Interoperable DRM Platform
Inside The Other DMP Phases The DMP mission is not over
Doing Something For My Country If MPEG and DMP provide the tools for rightful use of digital media, why should my country – or all countries – not benefit from them?
14 Putting Digital Media Together
The First Application Formats Standards for multimedia formats
Multimedia Preservation Application Format How can we cater to the long-term future of media
Publish/Subscribe Application Format The media business is about matching supply and demand. A new standard capable of disrupting the status quo
Media Linking Application Format Linking the inside of a document to the inside of another document is done billions of times a day. Let’s do the same for media.
15 More MPEG Triadic Technologies
Generic MPEG Technologies Technology matures and the audio-visual system components have achieved independent lives
Generic MPEG Systems Standards Some words about a bunch of Systems standards
Generic MPEG Video Standards Some words about a bunch of Video standards
Generic MPEG Audio Standards Some words about a bunch of Audio standards
Reconfigurable Media Coding A standard to describe decoders and to build repositories of media coding tools
16 Technologies For Virtual Spaces
Inside MPEG-4 – Graphics Adding 3D Graphics to the media tool set
Interaction Between Real And Virtual Worlds Building bridges between real and virtual worlds
Technologies To Interact With Digital Media Interacting with media – but without knobs and switches
Augmented Reality Application Format It is possible to make standards for Augmented Reality, not just buzzwords
17 Systems And Services
Inside Digital Media Devices MPEG is about media but not necessarily only about media formats
Getting Things Done My Way Using, not just developing, standard technologies for a business
Technologies For The Internet Of The Future MPEG-21 and MPEG-M show a practical path to information-centric networks
18 Coping with an unreliable internet
19 More System-wide Digital Media Standards
Multimedia Standards For An Evolving Market It is time again to provide an integrated Systems-Video-Audio standard
Coping With An Unpredictable Internet DASH – to get the most out of the internet resource
Inside MPEG-H – Systems The need for new transport technologies to cope with a variety of application contexts, especially hybrid
Inside MPEG-H – 2D Video HEVC
Inside MPEG-H – 3D Video After three quarters of a century of flat television, is it time to add a 3rd dimension?
Inside MPEG-H – 3D Audio The need for new audio technologies to cope with a variety of application contexts
20 Compression, the technology for the digital age
21 The future of media – immersion
22 Internet of Media Things
23 Glimpses Of The Future
MPEG Explorations About the – sometimes wild – ideas for future MPEG standards
End Of the MPEG Ride? MPEG has played a major role in creating the new world of Digital Media Technologies. Does it still have a role to play?
The end of MPEG may be coming, soon? If not the end, then a very substantial resizing of MPEG
The Future Of Research Research is the basis of human progress and the life blood of MPEG. Are we sure research is in the hands of people who know what research is?
24 Acknowledgements
25 Support Material
Acronyms In a field where there are just too many acronyms, this page provides the meaning of those used in these pages.
MPEG Subgroups And Chairs The hall of fame of those who served or are still serving as MPEG Chairs
MPEG standards The complete list of all MPEG standards (so far)

 


Communication Before Digital

If one sets aside some minor downsides, such as famine, floods, droughts, attacks by other tribes or death from some incurable disease, life in the Neolithic age was not necessarily so bad. If you got a smart idea – say, how to capture a deer that had been seen around – you could call on your neighbour, convince him with the force of your arguments, and then possibly the two of you would set out to convince more people and go hunting. If you were on your deathbed you would call your family and leave your last words to them, so that you could die in the hope that those around you would forward your last will to your grandsons and great-grandsons, from generation to generation.

With an increasingly sophisticated and geographically expanded society, communication had to keep up with new needs. Writing evolved from simplified forms of drawing and painting and enabled the recording of spoken words, but could also be used to send messages to individuals in remote places or times. Kings and emperors – but also republics – could even afford to set up standing networks of couriers to propagate their will to the remotest corners of their empires or territories within the constraints of the time it took to cover the distance with a series of horses. A manuscript could reach more people, but only if they happened to be all at the same place. If not, they could only read it at different times in a sequence. However, if a group of sufficiently learned people was hired, multiple copies of an original could be made and distributed to multiple places and multiple people at the same time. 

In those early times the creation of copies and the distribution of manuscripts were indeed time consuming and costly. Gutenberg’s invention of movable type printing made reproduction and distribution of written works cheaper and faster, thereby achieving an incomparably – for those times – higher productivity. Use of the invention required skilled people – with quite different skills from those needed before for copying manuscripts. Spreading an idea, until that time a process that involved a large number of people spending their time travelling and talking to other people or copying and distributing manuscripts, became a much simpler undertaking, as demonstrated by the rapid spread of Protestantism across 16th-century Europe.

It took several centuries before the invention of the typewriter made it possible to compose a written text that did not carry with it the handwriting of the person who had typed the text. 

Daguerre’s photography was the first example of a technology enabling the automatic reproduction of the visual component of a natural and static scene without necessarily requiring a particular artistic ability on the part of the person taking the picture. On the other hand, at least in the early years, considerable technical ability was required to handle a camera, take a picture, and develop and print photographs. Photography also had the advantage that multiple copies could be made from the same negative.

Similarly, Edison’s phonograph enabled the recording of the sound components of a portion of a physical space on a physical carrier, with the possibility to make multiple copies from the same master. An important difference between photographic and sound recording technologies of those times was that producing negatives and printing photographs required relatively inexpensive devices and materials, and was kind of within the reach of the man in the street, while recording and printing discs required costly professional equipment that could only be afforded by large organisations. 

Cinematography of the Lumière brothers permitted the capture not just of a snapshot of the visual component of the real world but of a series of snapshots close enough in time that the viewer’s brain could be convinced that they reproduced something that looked like real movement – if the series of snapshots was displayed in a sufficiently rapid succession using an appropriate device. The original motion pictures were later supplemented by sound to give a more complete reproduction of the real world, satisfying both the aural and visual senses.

The physical principles used by these technologies were mechanical for printing and sound recording, chemical and optical for photography, and mechanical, chemical and optical for cinematography.

In the wake of expensive line-of-sight communication systems such as those deployed by Napoleon for state use, Samuel Morse’s telegraph enabled any user to send messages in real-time to a physically separated point by exploiting the propagation of electromagnetic waves along wires, whose physical principles were barely understood at that time. The telegraph modulated an electric current with long and short signals separated by silence to enable the instantaneous transmission to a far end of a message expressed by Latin characters and, later, characters of other languages as well.
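As a toy illustration of the principle – each character mapped to a unique pattern of long and short signals separated by silences – here is a minimal sketch in Python; the dictionary below covers only a handful of letters of the real Morse alphabet.

```python
# Minimal sketch of character coding in the spirit of Morse telegraphy.
# Only a few letters are included; the real Morse alphabet is far larger.

MORSE = {"S": "...", "O": "---", "M": "--", "P": ".--.", "E": ".", "G": "--."}

def to_morse(text):
    # A space between letter patterns stands for the silence on the wire.
    return " ".join(MORSE[ch] for ch in text.upper())

print(to_morse("SOS"))   # -> ... --- ...
print(to_morse("MPEG"))  # -> -- .--. . --.
```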

The facsimile device enabled the transmission to a far end of the information present on a piece of paper placed on a scanning device.

The teletypewriter enabled transmission of characters to a far end where an electromechanical printer would print the characters of the message.

Telephony extended the basic idea of telegraphy by sending the analogue electric signal coming from a microphone, a device containing carbon granules that modulated a current when sound waves impinged on it. Telephony was designed to enable real-time two-way communication between people.

The discovery that electromagnetic waves propagate in the air over long distances led to Marconi’s invention of wireless telegraphy first and sound broadcasting, called “radio” par excellence, later. What was done for sound, however, was later done also for light, where the equivalent of the microphone was a tube made sensitive to light by a special photosensitive layer that produces current when hit by light. The equivalent of the loudspeaker was the Cathode Ray Tube, a tube with a nearly planar surface producing light at a given point of the screen at a particular time with an intensity proportional to the magnitude of the input electric signal. But unlike the time-dependent one-dimensional electric signal at the output of a microphone, the light intensity on the surface of the light-sensitive tube is a time-dependent two-dimensional signal. The mapping of such a signal into a one-dimensional time-dependent signal was achieved by reading the current generated by an electron beam scanning the tube in an agreed order (left-to-right and top-to-bottom) a sufficiently high number of images (“frames”) per second.

The purpose of scanning is similar to the one achieved with cinematography, viz. to convince the brain that the sequence of rapidly changing illuminated spots created by the electron flying spot is a continuous motion. The electric signal is then transmitted to a receiver where an electron beam, moving synchronously with the original electron beam, generates time-wise the same intensity that had hit the pick-up tube. The television scanning process produces much higher frequencies than audio because of the need to represent a two-dimensional time-dependent signal. The highest frequency is proportional to the product of the number of times per second an image is scanned (frame frequency), the number of scanning lines (vertical frequency) and the number of transitions that the eye can discern on a scan line (horizontal frequency). 
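As a back-of-the-envelope illustration of this proportionality, the sketch below estimates the highest frequency of a 625-line, 25 frames/second signal; the number of discernible transitions per line is an assumed, illustrative figure, not a value taken from any broadcast specification.

```python
# Rough estimate of the highest frequency in an analogue TV signal,
# following the proportionality described in the text.

frame_rate = 25              # frames per second (European 625-line system)
lines_per_frame = 625        # total scanning lines, including blanking
transitions_per_line = 800   # discernible light/dark transitions (assumed)

# Two transitions (dark-to-light plus light-to-dark) make one cycle of
# the electric signal, hence the division by 2.
highest_frequency_hz = frame_rate * lines_per_frame * transitions_per_line / 2

print(f"Estimated video bandwidth: {highest_frequency_hz / 1e6:.2f} MHz")
# -> about 6 MHz, in the ballpark of real 625-line systems (5-6 MHz)
```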

At the time the first television systems were introduced, the state of technology suggested the trick of subjectively “doubling” the frame frequency and the scan lines, to reduce what would otherwise be an annoying flicker effect. Therefore, a frame was composed of two “fields”, where one field had its horizontal scan lines offset by half the vertical line spacing, a process called interlacing. Originally, cathode ray tubes were only capable of producing black-and-white light and therefore television could only produce monochrome pictures. Later, it became possible to manufacture pick-up tubes and cathode ray tubes with 3 types of sensor/phosphor, each capable of sensing/producing red, green and blue light, so as to pick up and generate colour pictures that looked more natural to the human eye. It was indeed experimentally proved – and physiological bases for this were found – that any human-perceived colour can be expressed as a combination of 3 independent colours, such as Red, Green and Blue (RGB) used in television and Cyan, Magenta and Yellow, with the addition of blacK (CMYK), used in colour printing.

colour_space

Figure 1 – The colour space

The transformation of the aural and visual information into electric signals made possible by the microphone and the television pick-up tube, along with the ability to magnetise tapes covered with magnetic material moving at an appropriate speed and then to read the corresponding information, facilitated the invention of systems to record audio and video information in real-time. 

In more recent times radio has been used to offer telephone service to people on the move. A number of antennae, deployed in an area with an appropriate spacing so as to create a set of “cells”, capture the signals emitted by mobile telephone sets.

mobile_communication

Figure 2 – Mobile communication

The cellular system hands the signal over to the antenna nearest to the handset and manages the transition from one cell to the next, so that users are not even aware that they are communicating through a different antenna.


Communication and Public Authorities

The most potent driver of the establishment of civilisation has been the sharing, by the members of a community, of an understanding that certain utterances are associated with certain objects and concepts, all the way up to some shared intellectual values. Civilisation is preserved and enhanced from generation to generation because the members of a community agree that there is a mapping between certain utterances and certain graphical signs, even though the precise meaning of the mapping may slowly change with time.

It helps to understand the process leading to the creation of writing by looking at Chinese characters: some are known to derive from a simplified drawing of an object while others, often representing concepts, contain a character indicating the category and a second character whose sound indicates the pronunciation.

chinese_characters

Figure 1 – Some Chinese characters

Writing enables a human not only to communicate with his fellow humans in different parts of the globe but also to leave messages beyond his life span. A future generation will be able to revisit the experience of people who departed possibly centuries or even millennia before, often with difficulty because of the aforementioned slow drift of the mapping.

Since the earliest times Public Authorities have always had a keen interest in matters related to communication. In most civilisations the priesthood has, if not developed, certainly taken over the art of writing. In historical times it is possible to trace back to political considerations the adoption of the Latin and Cyrillic alphabets in Middle and Eastern Europe, the adoption of Chinese characters in Japan, Korea and Vietnam, the introduction of the hangul alphabet in Korea as a replacement of Chinese characters, the replacement of the Arabic alphabet with the Latin one in post-World War I Turkey, the use of the Cyrillic alphabet in the former Soviet republics in Central Asia and its recent replacement with the Latin alphabet in Turkic-speaking former Soviet republics. Beyond writing, one can recall the policy of the daimyos in medieval Japan to foster diversity of speech in their territories so as to spot intruders more easily, or the prohibition still enforced in some multi-ethnic countries dominated by one particular ethnic group that makes it a crime, sometimes even punished with the death penalty, to broadcast or even speak in a public place a language different from the “official” one.

While late 16th century Italy, with its lack of political unity (a “geographic expression”, said Metternich, the Austrian foreign minister after the Congress of Vienna up to 1848), witnessed the spontaneous formation of the “Accademia della Crusca” with its self-assigned goal of preserving the Florentine language of Dante, in France the Académie française, established a few decades later by Cardinal Richelieu, is to this day the official body in charge of the definition of the French language, a role reaffirmed in recent times by the Loi Toubon. Similarly, the German Bundestag approved a law that amended the way the German language should be written. From that time on, law-abiding Germans fond of Italian cuisine should have stopped eating spaghetti and started eating Spagetti instead. In Japan the Ministry of Education publishes a list of 1850 Chinese characters (Touyou Kanji – 当用漢字) that are taught in elementary schools and used in newspapers. These are all attempts at strengthening the ties of a community through the formal codification of verbal or written expressions. Unfortunately – or fortunately, as the case may be – sometimes the success of the implementation falls short of the intentions.

From early on, technology extended people’s communication capabilities in ways that Public Authorities did not necessarily welcome. Regulating the use of goose quills was not easy to implement because of the large supply of the necessary “raw material”, so the goal of inhibiting the dissemination of “dangerous” ideas was effectively achieved by keeping people in ignorance. It was printing, with the greater ease it gave interested people to disseminate their views, which provided Public Authorities with their first technology-induced challenge. The Catholic Church wanted to retain control of orthodoxy and decided to introduce the “imprimatur” (“let it be printed”), a seal of “absence of disapproval” meaning that there was no opposition (“nihil obstat”) on the part of the Church to the printing of a particular book. It did not take long for civil authorities to emulate the Church, so much so that “freedom of the press” became one of the first claims made by the different revolutions that affected Europe in the late 18th and all of the 19th centuries, while the Americans got it earlier – but not at the very beginning of their independence – as the First Amendment to their Constitution.

The mail service is an example of Public Authorities proactively fostering communication, but mail is a communication system largely intended to be person-to-person. The mail service started in the UK in 1840 with the introduction of prepaid letters and developed quickly in all countries soon afterwards. All countries charged a uniform rate for all letters of a certain weight within their territory, regardless of the distance involved. The conflicting web of postal services and regulations linking the different countries was overcome by the General Postal Union, established in 1874 by a number of states and renamed Universal Postal Union in 1878, when the member states succeeded in defining a single postal territory where the principle of “freedom of transit” for letters applied and mail items could be exchanged using a single rate. This did not mean that restrictions were not applied, though. Censorship has now disappeared in most countries, but Public Authorities still retain the right, in special circumstances and subject to certain rules – when this still makes sense – to open letters.

The invention of telegraphy must have caused great concern to Public Authorities, because their citizens were suddenly given the technical means to instantly communicate with anybody, first within and later even outside of their country. But theirs was a brave reaction: after the first confused restrictions, when a telegram had to be transcribed, translated and handed over on paper at the frontier between two countries before being retransmitted over the telegraph network of the neighbouring country, the Public Authorities of that time took the very forward-looking attitude of agreeing on a single “information representation” standard, i.e. a single code to represent a given character. It did take some time, but eventually they got there. All this was facilitated by the establishment in 1865 of the International Telegraph Convention, one of the first examples of sovereign states ceding part of their authority to a specific supranational organisation catering for common needs. In 1885, following the invention of the telephone and the subsequent expansion of telephony, the International Telegraph Convention began to draw up international rules for telephony as well.

In 1906, after the invention of radio, the first International Radiotelegraph Convention was signed. The International Telephone Consultative Committee (CCIF), set up in 1924, the International Telegraph Consultative Committee (CCIT), set up in 1925, and the International Radio Consultative Committee (CCIR), set up in 1927, were made responsible for drawing up international standards. In 1927, the Convention allocated frequency bands to the various radio services existing at the time (fixed, maritime and aeronautical mobile, broadcasting, amateur, and experimental). In 1934 the International Telegraph Convention of 1865 and the International Radiotelegraph Convention of 1906 were merged and became the International Telecommunication Union (ITU). In 1956, the CCIT and the CCIF were amalgamated to give rise to the International Telephone and Telegraph Consultative Committee (CCITT). In 1993 the CCITT was renamed ITU-T and the CCIR was renamed ITU-R. We now take it for granted that we can make fixed-line telephone calls and listen to analogue radio everywhere in the world, but this is the result of decades of efforts by the people who have worked in these international committees to make this happen – a unique achievement if one thinks of the belligerent attitude of countries of those times.

In the 1930s, the UK started a television broadcasting system with 405 horizontal scanning lines and 25 interlaced frames/second. The USA did the same in 1942 but with a system that had 525 lines and 30 interlaced frames/second. After the end of World War II, and each at a different time, the other European countries introduced their television systems – all with 625 horizontal scanning lines and 25 interlaced frames/second – and were followed by the UK, which had to manage a dual system (405 and 625 lines) for several decades. Maybe because the ravages of World War II were still so vivid in people’s minds, the same system was adopted over all of Europe, possibly the only example of such a large-scale agreement on both sides of the Iron Curtain.

A complex tug-of-war started when progress in pick-up tubes, displays and electronics in general made colour television possible. The United States extended their system and defined a nation-wide television standard known – after the committee that defined it – as National Television System Committee (NTSC). In doing so they had to change the original frame frequency of 30.00 Hz to 29.97 Hz (Americans call it trial and error, and apparently it works). Japan, South Korea and Taiwan, and most countries in the American continent that had already adopted the 525-line 30 Hz television standard, soon adopted NTSC.

A few years later Europe witnessed the competition between the German-originated system called Phase Alternating Line (PAL) and the French-originated system Séquentiel Couleur à Mémoire (SECAM) that spread across countries and continents, a fact reminiscent of battles of yore but, thank God, less bloody this time. PAL and SECAM extended their influence also to the American continent, the Monroe doctrine notwithstanding. Two of the three Guianas use PAL, but French Guiana uses SECAM; Argentina chose PAL but with a different colour subcarrier, and Brazil decided to add its own indigenous version of PAL on top of the original American 525-line 30.00 Hz TV system.

This bifurcation – the first major split in international telecommunication standards – was justified because the Very High Frequency (VHF) radio band – around 100 MHz – and later the Ultra High Frequency (UHF) radio band – a few hundred MHz – did not allow propagation beyond line of sight. While Short Wave and Medium Wave radio could propagate for thousands of kilometres, television could be made a strictly “national business” and managed accordingly – a blessing for the local Public Authorities. The CCIR became a place where countries would come and inform the body of their decisions, and the CCIR, much as a Notary Public, dutifully recorded the information provided. The result is that ITU-R Recommendation 624 “Television systems” is a document of over thirty pages, full of tables where minuscule and large countries alike compete in footnotes stating that they reserve the right to adopt, say, a different frequency tolerance compared to the value adopted by other countries. This is clearly not because, all of a sudden, the Maxwell equations governing the propagation of radio waves start behaving differently when a border is crossed, but because of a conscious policy decision driven by the desire to protect the national manufacturing industry or to keep foreign television programs out of the national market, or both. Interestingly, though, Frequency Modulation (FM) radio, which uses the same frequency band as television and has accordingly the same propagation constraints, is the same all over the world.

All colour television systems were based on the idea of filling some “holes” in the spectrum of the monochrome TV signal – called the Y signal – with the spectrum of two colour-difference signals, U = B(lue)-Y and V = R(ed)-Y. From these 3 signals a receiver can recover the three RGB colour primaries and drive the flying spot with the right colour information to the corresponding phosphors.
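The following minimal sketch illustrates the colour-difference idea; the luminance weights are the classic ones later standardised for digital video (ITU-R BT.601), and the additional scaling factors that NTSC, PAL and SECAM apply to U and V are omitted.

```python
# Sketch of the Y/U/V decomposition described above. Y carries the
# monochrome picture; U and V carry only colour-difference information.

def rgb_to_yuv(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luminance (monochrome) signal
    u = b - y                              # blue colour-difference signal
    v = r - y                              # red colour-difference signal
    return y, u, v

def yuv_to_rgb(y, u, v):
    b = u + y
    r = v + y
    g = (y - 0.299 * r - 0.114 * b) / 0.587  # recover green from Y, R and B
    return r, g, b

# A monochrome receiver simply ignores U and V: for any grey level
# (r == g == b) both colour-difference signals are zero.
print(rgb_to_yuv(0.5, 0.5, 0.5))               # -> (0.5, 0.0, 0.0)
print(yuv_to_rgb(*rgb_to_yuv(1.0, 0.2, 0.1)))  # -> (1.0, 0.2, 0.1), up to rounding
```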

Interestingly, Public Authorities had little concern for communication means other than telecommunication and broadcasting. For instance, formats for tapes, cassettes and discs have consistently and independently been defined by private enterprises, as shown by the cases of the Compact Disc (CD) or the Vertical Helix Scan (VHS) format for video cassette recording, universally adopted after the market had issued its verdict between competing technologies. The International Electrotechnical Commission (IEC), in charge of international standards for all electrical, electronic, and related technologies, played a role similar to the one played by ITU-R for television systems.

The International Organisation for Standardisation (ISO) deals with such communication standards as photography, cinematography and Information Technology (IT). The ISO work on photography and cinematography has ensured that anybody could buy a camera anywhere in the world and a film anywhere else in the world – choosing among a small number of formats – and be sure that there is a film that fits in the camera. Early examples of IT standards produced or ratified by ISO are character set codes, such as the 7-bit American Standard Code for Information Interchange (ASCII), also known as (aka) ISO/IEC 646, 8-bit Latin 1 (ISO/IEC 8859-1) and the 16-bit Unicode (part of ISO/IEC 10646).
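A small illustration of how these character sets nest inside one another, using Python’s built-in codecs:

```python
# "A" fits in all three character sets; "é" needs at least Latin 1;
# "漢" can only be represented in Unicode.

for ch in ["A", "é", "漢"]:
    try:
        ascii_bytes = ch.encode("ascii")     # 7-bit ISO/IEC 646 (ASCII)
    except UnicodeEncodeError:
        ascii_bytes = None
    try:
        latin1_bytes = ch.encode("latin-1")  # 8-bit ISO/IEC 8859-1 (Latin 1)
    except UnicodeEncodeError:
        latin1_bytes = None
    code_point = ord(ch)                     # Unicode / ISO/IEC 10646
    print(ch, ascii_bytes, latin1_bytes, f"U+{code_point:04X}")
```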


Digital communication is good

Both audio and video signals can be represented as waveforms. The number of waveforms corresponding to a signal is 1 for telephone, 2 for stereo music and 3 for colour television. While it is intuitively clear that an analogue waveform can be represented to any degree of accuracy by taking a sufficiently large number of samples of the waveform, it was the discovery of Harry Nyquist, a researcher at Bell Labs in the 1920s, formalised in the sampling theorem bearing his name, that a signal with a finite bandwidth of B Hz can be perfectly – in a statistical sense – reconstructed from its samples, if the number of samples per second taken on that signal is greater than 2B. Bell Labs used to be the research centre of the Bell System that included AT&T, the telephone company operating in most of the USA, and Western Electric, the manufacturing arm (actually it was the research branch of Western Electric’s engineering department that became the Bell Labs).

sampling_and_quantisation

Figure 1 – Signal sampling and quantisation
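A small numerical illustration of the theorem, assuming NumPy: a 3 Hz sine wave (bandwidth B = 3 Hz) is sampled at 8 samples/second – more than 2B – and reconstructed by summing sinc pulses weighted by the samples; the residual error comes only from truncating what would ideally be an infinite sum.

```python
import numpy as np

B = 3.0                 # signal bandwidth in Hz
fs = 8.0                # sampling frequency, chosen greater than 2B
n = np.arange(-64, 64)  # a finite window of sample indices
samples = np.sin(2 * np.pi * B * n / fs)

t = np.linspace(-2, 2, 1001)  # instants at which to reconstruct the waveform
# Ideal reconstruction: a sum of sinc pulses weighted by the samples.
reconstructed = np.array([np.sum(samples * np.sinc(fs * ti - n)) for ti in t])
original = np.sin(2 * np.pi * B * t)

print("max reconstruction error:", np.max(np.abs(reconstructed - original)))
# -> a small number, limited only by the truncation of the sinc sum
```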

Around the mid-1940s it became possible to build electronic computers. These were machines with thousands of electronic tubes designed to make any sort of calculation on numbers expressed in binary form, based on the sequence of operations described in a list of instructions called a “program”. For several decades the electronic computer was used in a growing number of fields: science, business, government, accounting, inventory etc., in spite of the guess of one IBM executive in the early days that “the world would need only 4 or 5 such machines”.

Researchers working on audio and video saw the possibility of eventually using electronic computers to process samples taken from waveforms. The Nyquist theorem establishes the conditions under which using samples is statistically equivalent to using the continuous waveforms, but samples are in general real numbers, while digital electronic computers can only operate on numbers expressed with a finite number of digits. Fortunately, another piece of research carried out at Bell Labs showed that, given a statistical distribution of samples, it is possible to calculate the minimum number of levels – called quantisation levels – that must be used to represent signal samples so that the power of the error generated by the imperfect digital representation stays below a given value.
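The sketch below, again assuming NumPy, illustrates the trade-off: every extra bit – i.e. every doubling of the number of quantisation levels – buys roughly 6 dB of signal-to-noise ratio for a full-scale sine wave.

```python
import numpy as np

t = np.linspace(0, 1, 100_000, endpoint=False)
signal = np.sin(2 * np.pi * 5 * t)  # a full-scale test waveform in [-1, 1]

for bits in (4, 8, 16):
    levels = 2 ** bits
    step = 2.0 / levels                         # quantisation step size
    quantised = np.round(signal / step) * step  # uniform (mid-tread) quantiser
    noise_power = np.mean((signal - quantised) ** 2)
    snr_db = 10 * np.log10(np.mean(signal ** 2) / noise_power)
    print(f"{bits:2d} bits: SNR = {snr_db:5.1f} dB")

# Expected: close to 6.02 * bits + 1.76 dB for a full-scale sine wave.
```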

So far so good, but this was “just” the theory. The other, practical but no less important, obstacle was the cost and clumsiness of electronic computers of that time. Here another invention of the Bell Laboratories – the transistor – created the conditions for another later invention – the integrated circuit. This is at the basis of the unstoppable progress of computing devices, also known as Moore’s law, that has allowed making ever more powerful and smaller integrated devices, including computers, by reducing the size of circuit geometry on silicon (i.e. how close a “wire” of one circuit can be to a “wire” of another circuit).

It is nice to think that Nyquist’s research was funded by enlightened Bell Labs managers who foresaw that one day there could be digital devices capable of handling digitised telephone signals. Such devices would be used for two fundamental functions of the telecommunication business. The first is moving bits over a transmission link, the second is routing (switching) bits through the nodes of a network. Both these functions – transmission and switching – were performed by the networks of that time but in analogue form. 

The motivations were less exciting but no less serious. The problem that plagued analogue telephony was its unreliability, because the performance of analogue equipment tends to drift with time, typically because electrical components deteriorate slowly. A priori, it is not particularly problematic to have a complex system like the telephone network subject to error – nothing is perfect in this world. The problem arises when a large number of small random drifts add up in unpredictable ways and the performance of the system degrades below acceptable limits without anybody being able to point the finger at a specific cause. More than by the grand design suggested above, the drive to digitisation was caused by the notion that, if signals were converted into ones and zeroes, one could create a network where devices either worked or did not work. If this was achieved, it would have been possible to put in place procedures that would make spotting the source of malfunctioning easier. Correcting the error would then easily follow: just change the faulty piece.

In the 1960s the CCITT made its first decision on the digitisation of telephone speech: a sampling frequency of 8 kHz and 8 bits/sample for a total bitrate of 64 kbit/s. Pulse Code Modulation (PCM) was the name given to the technology that digitised signals. But digitisation had an unpleasant by-product: conversion of a signal into digital form with sufficient approximation creates so much information that transmission or storage requires a much larger bandwidth or capacity than the one required by the original analogue signal. This was not just an issue for telecommunication, but also for broadcasting and Consumer Electronics, all of which used analogue signals for transmission – be it on air or cable – or storage, using some capacity-limited device. For computers this was not an issue – just yet – because at that time audio and video were data types still too remote from practical applications if in digital form.

The benefits of digitisation did not extend just to the telco industry. Broadcasting was also a possible beneficiary because the effect of distortions on analogue radio and television signals would be greatly reduced – or would become more manageable – by conversion to digital form. The problem was again the amount of information generated in the process. Digitisation of a stereo sound signal (two channels) by sampling at 48 kHz with 16 bits/sample, one form of the so-called AES/EBU interface developed by the Audio Engineering Society (AES) and the European Broadcasting Union (EBU), generates 1,536 kbit/s. Digitisation of television by sampling the luminance information at 13.5 MHz and each of the colour-difference signals at 6.75 MHz, with 8 bits/sample (this subsampling can be done because the eye is less demanding on the accuracy of colour information), generates 216 Mbit/s. Dealing with such high bitrates required special equipment that could only be used in the studio.

In the CE domain, digital made a strong inroad at the beginning of the 1980s when Philips and Sony on the one hand, and RCA on the other, began to put on the market the first equipment that carried bits with the meaning of musical audio to consumers’ homes in the form of a 12-cm optical disc. After a brief battle, the Compact Disc (CD) defined by Philips and Sony prevailed over RCA’s. The technical specification of the CD was based on the sampling and quantisation characteristics of stereo sound: 44.1 kHz sampling and 16 bits/sample for a total bitrate of 1.41 Mbit/s. For the first time end-users could have studio-quality stereo sound in their homes at an increasingly affordable cost, with the same quality no matter how many times the CD was played – something that only digital technologies could make possible.
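The arithmetic behind all the bitrates quoted in this section is simply the product of sampling rate, bits per sample and number of channels:

```python
def bitrate(samples_per_second, bits_per_sample, channels=1):
    """Uncompressed bitrate in bit/s."""
    return samples_per_second * bits_per_sample * channels

# Telephone speech (PCM): 8 kHz, 8 bits, mono
print(bitrate(8_000, 8) / 1e3, "kbit/s")       # -> 64.0 kbit/s

# AES/EBU stereo audio: 48 kHz, 16 bits, 2 channels
print(bitrate(48_000, 16, 2) / 1e3, "kbit/s")  # -> 1536.0 kbit/s

# Compact Disc: 44.1 kHz, 16 bits, 2 channels
print(bitrate(44_100, 16, 2) / 1e6, "Mbit/s")  # -> 1.4112 Mbit/s

# Studio digital television: luminance sampled at 13.5 MHz plus two
# colour-difference signals at 6.75 MHz, 8 bits per sample
tv = bitrate(13_500_000, 8) + 2 * bitrate(6_750_000, 8)
print(tv / 1e6, "Mbit/s")                      # -> 216.0 Mbit/s
```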