
Compact Disc Digital Audio: Difference between revisions

From Wikipedia, the free encyclopedia
moving pictures to CD article as per talk page
created standard section that made more sense due to change in articles name; merged duplicate info from Data encoding and Data structure sections
Line 17: Line 17:
{{optical disc authoring}}
{{optical disc authoring}}
'''Compact Disc Digital Audio''' ('''CDDA''' or '''CD-DA''') is the [[standardization|standard]] format for audio [[Compact Disc]]s. The standard is defined in the ''Red Book'', one of a series of "[[Rainbow Books]]" (named for their binding colors) that contain the [[technical specification]]s for all CD formats.
'''Compact Disc Digital Audio''' ('''CDDA''' or '''CD-DA''') is the [[standardization|standard]] format for audio [[Compact Disc]]s. The standard is defined in the ''Red Book'', one of a series of "[[Rainbow Books]]" (named for their binding colors) that contain the [[technical specification]]s for all CD formats.

The first edition of the ''Red Book'' was released in 1980 by [[Philips]] and [[Sony]];<ref name="BBC"/><ref name="Auto45-1"/> it was adopted by the Digital Audio Disc Committee and ratified as [[International Electrotechnical Commission|IEC]] 60908 (published in 1987).<ref name="Auto45-2"/> The second edition of IEC 60908 was published in 1999<ref name="Auto45-3"/> and it cancels and replaces the first edition, amendment 1 (1992) and the corrigendum to amendment 1. The standard is not freely available and must be licensed from Philips.


The ''Red Book'' specifies the physical parameters and properties of the CD, the optical "stylus" parameters, deviations and error rate, modulation system ([[eight-to-fourteen modulation]], EFM) and error correction facility ([[cross-interleaved Reed–Solomon coding]], CIRC), and the eight [[Compact disc subcode|subcode channels]].
The ''Red Book'' specifies the physical parameters and properties of the CD, the optical "stylus" parameters, deviations and error rate, modulation system ([[eight-to-fourteen modulation]], EFM) and error correction facility ([[cross-interleaved Reed–Solomon coding]], CIRC), and the eight [[Compact disc subcode|subcode channels]].


It also specifies the form of [[digital audio]] encoding: 2-channel [[Signedness|signed]] 16-[[bit]] [[Linear pulse code modulation|Linear PCM]] sampled at [[44,100 Hz]]. This [[sampling rate]] is adapted from that attained when recording digital audio on a [[PAL]] (or [[NTSC]]) [[videotape]] with a [[PCM adaptor]], an earlier way of storing digital audio.<ref name="Auto45-4"/> An audio CD can represent frequencies up to 22.05&nbsp;kHz, the [[Nyquist frequency]] of the 44.1&nbsp;kHz sample rate. Although rarely used, the specification allows for discs to be mastered with a form of [[Emphasis (telecommunications)|emphasis]].
It also specifies the form of [[digital audio]] encoding: 2-channel [[Signedness|signed]] 16-[[bit]] [[Linear pulse code modulation|Linear PCM]] sampled at [[44,100 Hz]]. This [[sampling rate]] is adapted from that attained when recording digital audio on a [[PAL]] (or [[NTSC]]) [[videotape]] with a [[PCM adaptor]], an earlier way of storing digital audio.<ref name="Auto45-4"/> An audio CD can represent frequencies up to 22.05&nbsp;kHz, the [[Nyquist frequency]] of the 44.1&nbsp;kHz sample rate. Although rarely used, the specification allows for discs to be mastered with a form of [[Emphasis (telecommunications)|emphasis]].

== Standard ==
The first edition of the ''Red Book'' was released in 1980 by [[Philips]] and [[Sony]];<ref name="BBC"/><ref name="Auto45-1"/> it was adopted by the Digital Audio Disc Committee and ratified as [[International Electrotechnical Commission|IEC]] 60908 (published in 1987).<ref name="Auto45-2"/> The second edition of IEC 60908 was published in 1999<ref name="Auto45-3"/> and it cancels and replaces the first edition, amendment 1 (1992) and the corrigendum to amendment 1.

The standard is not freely available and must be licensed. It is available from [[Philips]] and the [[International Electrotechnical Commission|IEC]]. {{As of|2004}}, the cost per the relevant Philips order form is {{US$|5,000}}.<ref name="Auto45-13"/> {{As of|2013}}, the IEC 60908 document is available as a PDF download for {{US$|372}}.<ref name="Auto45-14"/>


== Basic specifications ==
== Basic specifications ==
Line 32: Line 35:
* [[International Standard Recording Code]] (ISRC) should be included
* [[International Standard Recording Code]] (ISRC) should be included


== Data encoding ==
== Audio encoding ==
Audio in a CD is digitally encoded by [[linear pulse-code modulation]] (LPCM). Each audio sample is a [[Signedness|signed]] 16-bit [[two's complement]] [[Integer (computer science)|integer]], with sample values ranging from −32768 to +32767.
Audio in a CD is digitally encoded by [[linear pulse-code modulation]] (LPCM). Each audio sample is a [[Signedness|signed]] 16-bit [[two's complement]] [[Integer (computer science)|integer]], with sample values ranging from −32768 to +32767.


Before being written to the disc, the source audio data is divided into frames containing twelve [[Sampling (signal processing)|samples]] each (six left and six right samples, alternating). These 192-bit (24 bytes) frames are then subjected to [[Cross-interleaved Reed–Solomon coding|CIRC]] encoding, which segments and rearranges the data and expands it with [[parity bit]]s in a way that allows occasional read errors to be detected and corrected. This adds 64 bits (8 bytes) of [[error correction]] data to each frame. After this, 8 bits (1 byte) of [[subcode]] or subchannel data are added to each frame, which are used for control and addressing when playing the CD.
Before being written to the disc, the source audio data is divided into frames containing twelve [[Sampling (signal processing)|samples]] each (six left and six right samples, alternating). These 192-bit audio frames are then subjected to [[Cross-interleaved Reed–Solomon coding|CIRC]] encoding, which segments and rearranges the data and expands it with [[parity bit]]s in a way that allows occasional read errors to be detected and corrected. This adds 64 bits of [[error correction]] data to each frame. After this, 8 bits of [[subcode]] or subchannel data are added to each frame, which are used for control and addressing when playing the CD.


These resulting frames are then modulated through [[eight-to-fourteen modulation]] (EFM), where each 8-bit word is replaced with a corresponding 14-bit word designed to reduce the number of transitions between 0 and 1. This reduces the density of physical pits on the disc and provides an additional degree of error tolerance. Three "merging" bits are added before each 14-bit word for disambiguation and synchronization. A 24-bit word is added to the beginning of each frame to assist with synchronization, so the reading device can locate frames easily.
These resulting frames are then modulated through [[eight-to-fourteen modulation]] (EFM), where each 8-bit word is replaced with a corresponding 14-bit word designed to reduce the number of transitions between 0 and 1. This reduces the density of physical pits on the disc and provides an additional degree of error tolerance. Three "merging" bits are added before each 14-bit word for disambiguation and synchronization. A 27-bit word is added to the beginning of each frame to assist with synchronization, so the reading device can locate frames easily.


The EFM, merging bits, and sync words thus expand each frame to 588 bits of "channel data". The frames of channel data are written to disc physically in the form of pits and lands, with each pit or land representing a series of zeroes, and with the transition points—the edge of each pit—representing 1.
The EFM, merging bits, and sync word thus expand each frame to 588 bits of "channel data". The frames of channel data are written to disc physically in the form of pits and lands, with each pit or land representing a series of zeroes, and with the transition points—the edge of each pit—representing 1.

The disc's audio data stream is continuous, but has three parts: the main portion, which is further divided into playable audio tracks, is the ''program area''. This section is preceded by a ''lead-in'' track and followed by a ''lead-out'' track. The lead-in and lead-out tracks encode only silent audio, but all three sections contain subcode data streams. The lead-in's subcode contains repeated copies of the disc's Table Of Contents (TOC), which provides an index of the start positions of the tracks in the program area and lead-out. The track positions are referenced by absolute [[timecode]], relative to the start of the program area, in MSF format: minutes, seconds, and fractional seconds called ''frames''. Each timecode frame is one seventy-fifth of a second, and corresponds to a block of 98 channel-data frames—ultimately, a block of 588 pairs of left and right audio samples. Timecode contained in the subchannel data allows the reading device to locate the region of the disc that corresponds to the timecode in the TOC.

In the 1990s, [[CD-ROM]] and related [[ripping|Digital Audio Extraction]] (DAE) technology introduced the term ''[[CD-ROM#CD-ROM format|sector]]'' to refer to each timecode frame, with each sector being identified by a sequential [[integer]] number starting at zero, and with tracks aligned on sector boundaries. An audio CD sector corresponds to 2,352 bytes of decoded data. The ''Red Book'' does not refer to sectors, nor does it distinguish the corresponding sections of the disc's data stream except as "frames" in the MSF addressing scheme.


== Sample rate ==
== Sample rate ==
Line 50: Line 49:


There was a long debate over the use of 14-bit (Philips) or 16-bit (Sony) [[Quantization (sound processing)|quantization]], and 44,056 or 44,100 samples/s (Sony) or approximately 44,000 samples/s (Philips). When the Sony/Philips task force designed the Compact Disc, Philips had already developed a 14-bit [[Digital-to-analog converter|D/A converter]] (DAC), but Sony insisted on 16-bit. In the end, 16 bits and 44.1 kilosamples per second prevailed. Philips found a way to produce 16-bit quality using its 14-bit DAC by using four times [[oversampling]].
There was a long debate over the use of 14-bit (Philips) or 16-bit (Sony) [[Quantization (sound processing)|quantization]], and 44,056 or 44,100 samples/s (Sony) or approximately 44,000 samples/s (Philips). When the Sony/Philips task force designed the Compact Disc, Philips had already developed a 14-bit [[Digital-to-analog converter|D/A converter]] (DAC), but Sony insisted on 16-bit. In the end, 16 bits and 44.1 kilosamples per second prevailed. Philips found a way to produce 16-bit quality using its 14-bit DAC by using four times [[oversampling]].

== Bit rate ==
The audio [[bit rate]] is 1,411.2 [[Data_rate_units#Kilobit_per_second|kbit/s]] (as 2 channels × 44,100 samples per second per channel × 16 bits per sample = 1,411,200 [[Bits per second|bit/s]] = 1,411.2 kbit/s). Likewise, in a computer, audio data coming in from a CD drive is accessed by sectors, each sector being 2,352 bytes, and with 75 sectors containing 1 second of audio, for the same bit rate of 2,352 × 75 = 176,400 bytes/s, or 176.4 [[kilobyte|kB]]/s (1,411.2 kbit/s). In comparison, the bit rate of a "1×" [[CD-ROM]] is defined as 2,048 bytes per sector × 75 sectors per second = 150 [[kibibyte|KiB]]/s (1,228.8 kbit/s). The undecoded channel-data rate for a ''Red Book'' audio CD is 4.3218 [[megabit|Mbit/s]], with 2.0338 Mbit/s being the rate of the undecoded audio and subcode.


== Storage capacity and playing time ==
== Storage capacity and playing time ==
Line 63: Line 65:


== Data structure ==
== Data structure ==
=== Sessions ===<!-- This section is linked from [[Session (CD)]] -->
=== Overall structure ===<!-- This section is linked from [[Session (CD)]] -->
[[File:CD Diagra.001.jpg|thumb|This image of a [[CD-R]] demonstrates some of the visible features of an audio CD, including the lead-in, program area, and lead-out. A microscopic spiral of digital information<ref name="Auto45-5"/> begins near the disc's middle and ends near the edge. Data-free areas of the disc and silent portions of the spiral reflect light differently, sometimes allowing track boundaries to be seen]]
[[File:CD Diagra.001.jpg|thumb|This image of a [[CD-R]] demonstrates some of the visible features of an audio CD, including the lead-in, program area, and lead-out. A microscopic spiral of digital information<ref name="Auto45-5"/> begins near the disc's middle and ends near the edge. Data-free areas of the disc and silent portions of the spiral reflect light differently, sometimes allowing track boundaries to be seen]]


The audio data stream in an audio CD is continuous, but has three parts. The main portion, which is further divided into playable audio tracks, is the ''program area''. This section is preceded by a ''lead-in'' track and followed by a ''lead-out'' track. The lead-in and lead-out tracks encode only silent audio, but all three sections contain [[subcode]] data streams.
Data on an audio compact disc is laid out in ''sessions''. Each session has three areas: a ''lead-in'' containing the session's ''Table of Contents'' (TOC); a ''program'' holding several ''tracks'' (described in the next section); and a ''lead-out'' to mark the end of the session. A disc can have up to 99 tracks. Each session must have at least one track. The tracks are in the program area of the session.


In ''multisession'' discs, the lead-in areas contain addresses of the previous sessions. The TOC in the lead-in of the latest session is used to access the tracks.
If a disc supports multiple sessions, each session has this same structure (lead-in, program area, and lead-out). In ''multisession'' discs, the lead-in areas contain addresses of the previous sessions. The TOC in the lead-in of the latest session is used to access the tracks. Each session must have at least one track.


The following table shows the structure of a session:
The following table shows the structure of a CD:


{| class="wikitable"
{| class="wikitable"
Line 93: Line 95:
|}
|}


The '''Table of Contents''' ('''TOC''') is the area where the layout of the tracks on the disc is described. It is in the lead-in area of the disc session. The TOC on discs is analogous to the [[partition table]] on [[hard drive]]s. Nonstandard or corrupted TOC records are abused as a form of [[CD/DVD copy protection]], in e.g. the [[key2Audio]] scheme.
The lead-in's subcode contains repeated copies of the disc's Table Of Contents (TOC), which provides an index of the start positions of the tracks in the program area and lead-out. The track positions are referenced by absolute [[timecode]], relative to the start of the program area, in MSF format: minutes, seconds, and fractional seconds called ''frames''. Each timecode frame is one seventy-fifth of a second, and corresponds to a block of 98 channel-data frames—ultimately, a block of 588 pairs of left and right audio samples. Timecode contained in the subchannel data allows the reading device to locate the region of the disc that corresponds to the timecode in the TOC. The TOC on discs is analogous to the [[partition table]] on [[hard drive]]s. Nonstandard or corrupted TOC records are abused as a form of [[CD/DVD copy protection]], in e.g. the [[key2Audio]] scheme.


The '''lead-in''' area of a CD session is the starting part of the session. It contains the TOC for the session. If the disc is multisession, it also has the address of the next session or the next free part of the disc where a session can be added.
The lead-out area is the ending part of the CD or of a session. The first lead-out is 6,750 sectors (about 13 megabytes) long; each subsequent lead-out is 2,250 sectors (4 megabytes) long.

The '''lead-out''' area is the ending part of the CD session. When the session is closed, the lead-out area is written. The first lead-out is 6,750 sectors (about 13 megabytes) long; each subsequent lead-out is 2,250 sectors (4 megabytes) long.


=== Tracks ===
=== Tracks ===
{{main|Track (CD)}}
{{main|Track_(CD)#Audio tracks}}
The largest entity on a CD is called a track. A CD can contain up to 99 tracks (including a data track for [[Mixed Mode CD|mixed mode discs]]). Each track can in turn have up to 100 indexes, though players which handle this feature are rarely found outside of [[pro audio]], particularly radio broadcasting{{Citation needed|date=October 2012}}. The vast majority of songs are recorded under index 1, with the [[pre-gap]] being index 0. Sometimes [[hidden track]]s are placed at the end of the last track of the disc, often using index 2 or 3. This is also the case with some discs offering "101 sound effects", with 100 and 101 being indexed as two and three on track 99. The index, if used, is occasionally put on the track listing as a decimal part of the track number, such as 99.2 or 99.3. ([[Information Society (band)|Information Society]]'s ''[[Hack (album)|Hack]]'' was one of very few CD releases to do this, following a release with an equally obscure CD+G feature.) The track and index structure of the CD were carried forward to the [[DVD]] format as title and chapter, respectively.
The largest entity on a CD is called a track. A CD can contain up to 99 tracks (including a data track for [[Mixed Mode CD|mixed mode discs]]). Each track can in turn have up to 100 indexes, though players which handle this feature are rarely found outside of [[pro audio]], particularly radio broadcasting{{Citation needed|date=October 2012}}. The vast majority of songs are recorded under index 1, with the [[pre-gap]] being index 0. Sometimes [[hidden track]]s are placed at the end of the last track of the disc, often using index 2 or 3. This is also the case with some discs offering "101 sound effects", with 100 and 101 being indexed as two and three on track 99. The index, if used, is occasionally put on the track listing as a decimal part of the track number, such as 99.2 or 99.3. ([[Information Society (band)|Information Society]]'s ''[[Hack (album)|Hack]]'' was one of very few CD releases to do this, following a release with an equally obscure CD+G feature.) The track and index structure of the CD were carried forward to the [[DVD]] format as title and chapter, respectively.


Line 106: Line 106:


=== Frames and timecode frames ===
=== Frames and timecode frames ===
{{main|Track_(CD)#Audio tracks}}
{{main|Track_(CD)#Sector structure}}
The smallest entity in a CD is a channel-data ''frame'', which consists of 33 bytes and contains six complete 16-bit stereo samples: 24 bytes for the audio (two 8-bit bytes × two channels × six samples = 24 bytes), eight CIRC error-correction bytes, and one [[Compact Disc subcode|subcode]] byte, used for control and display. Each byte is translated into a 14-bit word using [[eight-to-fourteen modulation]], which alternates with three-bit merging words. In total there are 33 × (14 + 3) = 561 bits. A 27-bit unique synchronization word is added, so that the number of bits in a frame totals 588 (which are decoded to only 192 bits of music because of the error-correction and control bits).
The smallest entity in a CD is a channel-data ''frame'', which consists of 33 bytes and contains six complete 16-bit stereo samples: 24 bytes for the audio (two 8-bit bytes × two channels × six samples = 24 bytes), eight CIRC error-correction bytes, and one [[Compact Disc subcode|subcode]] byte, used for control and display. Each byte is translated into a 14-bit word using [[eight-to-fourteen modulation]], which alternates with three-bit merging words. In total there are 33 × (14 + 3) = 561 bits. A 27-bit unique synchronization word is added, so that the number of bits in a frame totals 588 (which are decoded to only 192 bits of music because of the error-correction and control bits).


On a ''Red Book'' audio CD, data is addressed using [[timecode]] expressed in minutes, seconds and another type of ''frames'' (mm:ss:ff), where one frame corresponds to 1/75th of a second of audio: 588 pairs of left and right samples. This timecode frame is distinct from the 33-byte channel-data frame described above, and is used for time display and positioning the reading laser. When editing and extracting CD audio, this timecode frame is the smallest addressable time interval for an audio CD; thus, track boundaries only occur on these frame boundaries. Each of these structures contains 98 channel-data frames, totaling 98 × 24 = 2,352 bytes of music. The CD is played at a speed of 75 frames (or sectors) per second, thus 44,100 samples or 176,400 bytes per second.
On a ''Red Book'' audio CD, data is addressed using the ''MSF scheme'', with [[timecode]]s expressed in minutes, seconds and another type of ''frames'' (mm:ss:ff), where one frame corresponds to 1/75th of a second of audio: 588 pairs of left and right samples. This timecode frame is distinct from the 33-byte channel-data frame described above, and is used for time display and positioning the reading laser. When editing and extracting CD audio, this timecode frame is the smallest addressable time interval for an audio CD; thus, track boundaries only occur on these frame boundaries. Each of these structures contains 98 channel-data frames, totaling 98 × 24 = 2,352 bytes of music. The CD is played at a speed of 75 frames (or sectors) per second, thus 44,100 samples or 176,400 bytes per second.


On a [[CD-ROM]] data disc and related technology, including Digital Audio Extraction (DAE, or ''[[ripping]]'' of audio CDs), the timecode frames are called ''sectors'' and are addressed with simple, sequential, non-negative [[integer]] numbers. Since error concealment cannot be applied to non-audio data in case the CIRC error correction fails to recover the user data, a third layer of error correction is defined, reducing the payload to 2,048 bytes per sector for the Mode 1 CD-ROM format. To increase the data-rate for [[Video CD]] (which uses the Mode 2 CD-ROM format for sectors), the third layer is omitted, increasing the payload to 2,336 user-available bytes per sector, only 16 bytes (for synchronization and header data) less than available in Red-Book audio.
In the 1990s, [[CD-ROM]] and related [[ripping|Digital Audio Extraction]] (DAE) technology introduced the term ''[[CD-ROM#CD-ROM format|sector]]'' to refer to each timecode frame, with each sector being identified by a sequential [[integer]] number starting at zero, and with tracks aligned on sector boundaries. An audio CD sector corresponds to 2,352 bytes of decoded data. The ''Red Book'' does not refer to sectors, nor does it distinguish the corresponding sections of the disc's data stream except as "frames" in the MSF addressing scheme.
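
The relation between MSF timecodes and sequential sector numbers described above can be sketched as follows. This is a minimal illustration that assumes the sector count starts at timecode 00:00:00 with 75 frames per second; the function names are hypothetical, and any fixed offset that a particular drive or operating system applies between timecode and sector number is deliberately left out.

```python
FRAMES_PER_SECOND = 75  # one timecode frame (sector) is 1/75 of a second


def msf_to_sector(minutes: int, seconds: int, frames: int) -> int:
    """Convert an mm:ss:ff timecode to a sequential sector number."""
    if minutes < 0 or not 0 <= seconds < 60 or not 0 <= frames < FRAMES_PER_SECOND:
        raise ValueError("timecode component out of range")
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames


def sector_to_msf(sector: int) -> tuple[int, int, int]:
    """Convert a sequential sector number back to (minutes, seconds, frames)."""
    if sector < 0:
        raise ValueError("sector number must be non-negative")
    total_seconds, frames = divmod(sector, FRAMES_PER_SECOND)
    minutes, seconds = divmod(total_seconds, 60)
    return minutes, seconds, frames


print(msf_to_sector(1, 0, 0))  # 4500 sectors in one minute of audio
print(sector_to_msf(4500))     # (1, 0, 0)
```

Because track boundaries fall only on timecode-frame boundaries, any track start listed in the TOC maps to a whole sector number under this scheme.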


The following table shows the relation between tracks, timecode frames (sectors) and channel-data frames:
The following table shows the relation between tracks, timecode frames (sectors) and channel-data frames:
Line 133: Line 133:
|}
|}


== Data access from computers ==
== Files ==
Unlike on a DVD or CD-ROM, there are no "[[computer file|file]]s" on a ''Red Book'' audio CD; there are only the physical pits and lands, which in turn represent a single encoded data stream, which ultimately represents one continuous stream of LPCM audio data, and a parallel, smaller set of 8 subcode data streams. Computer operating systems, however, may provide access to an audio CD as if it contains files. For example, Windows represents the CD's TOC as a set of [[Compact Disc Audio track]] (CDA) files, each file containing indexing information, not audio data.
Unlike on a DVD or CD-ROM, there are no "[[computer file|file]]s" on a ''Red Book'' audio CD; there are only the physical pits and lands, which in turn represent a single encoded data stream, which ultimately represents one continuous stream of LPCM audio data, and a parallel, smaller set of 8 subcode data streams. Computer operating systems, however, may provide access to an audio CD as if it contains files. For example, Windows represents the CD's TOC as a set of [[Compact Disc Audio track]] (CDA) files, each file containing indexing information, not audio data.


In a process called [[ripping]], Digital Audio Extraction software can be used to read ''Red Book'' audio data and store it in files. Common [[audio file format]]s for this purpose include WAV and AIFF, which simply preface the LPCM data with a short [[header (computing)|header]]; FLAC, ALAC, and Windows Media Lossless, which compress the LPCM data in ways that conserve space yet allow it to be restored without any changes; and various [[lossy compression|lossy]], [[perceptual audio coder|perceptual coding]] formats like MP3 and AAC, which modify and compress the audio data in ways that irreversibly change the audio, but that exploit features of human hearing to make the changes difficult to discern.
In a process called [[ripping]], Digital Audio Extraction software can be used to read ''Red Book'' audio data and store it in files. Common [[audio file format]]s for this purpose include WAV and AIFF, which simply preface the LPCM data with a short [[header (computing)|header]]; FLAC, ALAC, and Windows Media Lossless, which compress the LPCM data in ways that conserve space yet allow it to be restored without any changes; and various [[lossy compression|lossy]], [[perceptual audio coder|perceptual coding]] formats like MP3 and AAC, which modify and compress the audio data in ways that irreversibly change the audio, but that exploit features of human hearing to make the changes difficult to discern.
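
As a rough illustration of how a WAV file prefaces LPCM data with a short header, the sketch below wraps already-extracted samples using Python's standard <code>wave</code> module. It does not read from a disc: the input is assumed to be a list of (left, right) pairs of signed 16-bit integers, and the function name is hypothetical.

```python
import io
import struct
import wave


def lpcm_to_wav_bytes(stereo_samples):
    """Wrap (left, right) signed 16-bit sample pairs in a WAV container."""
    # Interleave the channels as little-endian 16-bit integers, as WAV expects.
    pcm = b"".join(struct.pack("<hh", left, right) for left, right in stereo_samples)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(2)       # stereo
        wav.setsampwidth(2)       # 16 bits = 2 bytes per sample
        wav.setframerate(44100)   # CD sample rate
        wav.writeframes(pcm)
    return buffer.getvalue()


# One sector's worth of silence: 588 stereo sample pairs = 2,352 bytes of LPCM.
silent_sector = [(0, 0)] * 588
wav_bytes = lpcm_to_wav_bytes(silent_sector)
print(len(wav_bytes) - 2352)  # size of the header that precedes the audio data
```

The audio payload is stored unchanged after the header, which is why formats of this kind can reproduce the disc's LPCM data exactly.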

== Bit rate ==
The audio [[bit rate]] is 1,411.2 [[Data_rate_units#Kilobit_per_second|kbit/s]] (as 2 channels × 44,100 samples per second per channel × 16 bits per sample = 1,411,200 [[Bits per second|bit/s]] = 1,411.2 kbit/s). Likewise, in a computer, audio data coming in from a CD drive is accessed by sectors, each sector being 2,352 bytes, and with 75 sectors containing 1 second of audio, for the same bit rate of 2,352 × 75 = 176,400 bytes/s, or 176.4 [[kilobyte|kB]]/s (1,411.2 kbit/s). In comparison, the bit rate of a "1×" [[CD-ROM]] is defined as 2,048 bytes per sector × 75 sectors per second = 150 [[kibibyte|KiB]]/s (1,228.8 kbit/s). The undecoded channel-data rate for a ''Red Book'' audio CD is 4.3218 [[megabit|Mbit/s]], with 2.0338 Mbit/s being the rate of the undecoded audio and subcode.


== Format deviations ==
== Format deviations ==
Line 147: Line 144:


[[DVD Audio]], an advanced version of the audio CD, emerged in 1999.<ref name="Auto45-12"/> The format was designed to feature audio of higher fidelity. It applies a higher sampling rate and uses 650&nbsp;nm lasers.
[[DVD Audio]], an advanced version of the audio CD, emerged in 1999.<ref name="Auto45-12"/> The format was designed to feature audio of higher fidelity. It applies a higher sampling rate and uses 650&nbsp;nm lasers.

== Availability ==
The standard is available from [[Philips]] and the [[International Electrotechnical Commission|IEC]]. {{As of|2004}}, the cost per the relevant Philips order form is {{US$|5,000}}.<ref name="Auto45-13"/> {{As of|2013}}, the IEC 60908 document is available as a PDF download for {{US$|372}}.<ref name="Auto45-14"/>


== See also ==
== See also ==

Revision as of 14:44, 14 May 2013

Compact Disc Digital Audio
Media type: Optical disc
Encoding: 2 channels of LPCM audio, each signed 16-bit values sampled at 44100 Hz
Capacity: up to 74–80 minutes (up to 24 minutes for mini 8 cm CD)
Read mechanism: Semiconductor laser (780 nm wavelength)
Standard: IEC 60908
Developed by: Sony & Philips
Usage: Audio storage

Compact Disc Digital Audio (CDDA or CD-DA) is the standard format for audio Compact Discs. The standard is defined in the Red Book, one of a series of "Rainbow Books" (named for their binding colors) that contain the technical specifications for all CD formats.

The Red Book specifies the physical parameters and properties of the CD, the optical "stylus" parameters, deviations and error rate, modulation system (eight-to-fourteen modulation, EFM) and error correction facility (cross-interleaved Reed–Solomon coding, CIRC), and the eight subcode channels.

It also specifies the form of digital audio encoding: 2-channel signed 16-bit Linear PCM sampled at 44,100 Hz. This sampling rate is adapted from that attained when recording digital audio on a PAL (or NTSC) videotape with a PCM adaptor, an earlier way of storing digital audio.[1] An audio CD can represent frequencies up to 22.05 kHz, the Nyquist frequency of the 44.1 kHz sample rate. Although rarely used, the specification allows for discs to be mastered with a form of emphasis.

Standard

The first edition of the Red Book was released in 1980 by Philips and Sony;[2][3] it was adopted by the Digital Audio Disc Committee and ratified as IEC 60908 (published in 1987).[4] The second edition of IEC 60908 was published in 1999[5] and it cancels and replaces the first edition, amendment 1 (1992) and the corrigendum to amendment 1.

The standard is not freely available and must be licensed. It is available from Philips and the IEC. As of 2004, the cost per the relevant Philips order form is US$5,000.[6] As of 2013, the IEC 60908 document is available as a PDF download for US$372.[7]

Basic specifications

The basic specifications state that:

  • Maximum playing time is 79.8 minutes[8]
  • Minimum duration for a track is 4 seconds (including 2-second pause)
  • Maximum number of tracks is 99
  • Maximum number of index points (subdivisions of a track) is 99 with no maximum time limit
  • International Standard Recording Code (ISRC) should be included

Audio encoding

Audio in a CD is digitally encoded by linear pulse-code modulation (LPCM). Each audio sample is a signed 16-bit two's complement integer, with sample values ranging from −32768 to +32767.

Before being written to the disc, the source audio data is divided into frames containing twelve samples each (six left and six right samples, alternating). These 192-bit audio frames are then subjected to CIRC encoding, which segments and rearranges the data and expands it with parity bits in a way that allows occasional read errors to be detected and corrected. This adds 64 bits of error correction data to each frame. After this, 8 bits of subcode or subchannel data are added to each frame, which are used for control and addressing when playing the CD.

These resulting frames are then modulated through eight-to-fourteen modulation (EFM), where each 8-bit word is replaced with a corresponding 14-bit word designed to reduce the number of transitions between 0 and 1. This reduces the density of physical pits on the disc and provides an additional degree of error tolerance. Three "merging" bits are added before each 14-bit word for disambiguation and synchronization. A 27-bit word is added to the beginning of each frame to assist with synchronization, so the reading device can locate frames easily.

The EFM, merging bits, and sync word thus expand each frame to 588 bits of "channel data". The frames of channel data are written to disc physically in the form of pits and lands, with each pit or land representing a series of zeroes, and with the transition points—the edge of each pit—representing 1.
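
The arithmetic behind the 588-bit figure can be checked with a short sketch. All sizes come from the description above; the constant and function names are illustrative only and are not taken from the standard.

```python
AUDIO_BITS = 12 * 16   # twelve 16-bit samples per frame = 192 bits
PARITY_BITS = 64       # CIRC error-correction data
SUBCODE_BITS = 8       # subcode (subchannel) data

EFM_WORD_BITS = 14     # each 8-bit word is replaced by a 14-bit word
MERGING_BITS = 3       # merging bits added before each 14-bit word
SYNC_BITS = 27         # synchronization word at the start of each frame


def channel_bits_per_frame() -> int:
    """Return the number of channel bits in one frame after EFM."""
    payload_bits = AUDIO_BITS + PARITY_BITS + SUBCODE_BITS  # 264 bits
    words = payload_bits // 8                               # 33 eight-bit words
    return words * (EFM_WORD_BITS + MERGING_BITS) + SYNC_BITS


print(channel_bits_per_frame())  # 588
```

Only 192 of those 588 channel bits carry audio, so roughly one third of what is physically written to the disc is sample data.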

Sample rate

The selection of the sample rate was based primarily on the need to reproduce the audible frequency range of 20–20,000 Hz (20 kHz). The Nyquist–Shannon sampling theorem states that a sampling rate of more than twice the maximum frequency of the signal to be recorded is needed, resulting in a required rate of at least 40 kHz. The exact sampling rate of 44.1 kHz was inherited from a method of converting digital audio into an analog video signal for storage on U-matic video tape, which was the most affordable way to transfer data from the recording studio to the CD manufacturer at the time the CD specification was being developed. The device that converts an analog audio signal into PCM audio, which in turn is changed into an analog video signal, is called a PCM adaptor. This technology could store six samples (three samples per stereo channel) in a single horizontal line. A standard NTSC video signal has 245 usable lines per field and 59.94 fields/s, which works out to 44,056 samples/s/stereo channel. Similarly, PAL has 294 usable lines per field and 50 fields/s, which gives 44,100 samples/s/stereo channel. This system could store 14-bit samples with some error correction, or 16-bit samples with almost no error correction.
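
The two video-derived rates follow directly from the line and field counts quoted above, as this small sketch shows. The function name is illustrative, and the NTSC result is rounded to the nearest whole sample.

```python
SAMPLES_PER_LINE_PER_CHANNEL = 3  # six samples per line, shared by two channels


def pcm_adaptor_rate(usable_lines_per_field, fields_per_second):
    """Samples per second per stereo channel stored on a video signal."""
    return usable_lines_per_field * fields_per_second * SAMPLES_PER_LINE_PER_CHANNEL


ntsc_rate = pcm_adaptor_rate(245, 59.94)  # about 44,055.9
pal_rate = pcm_adaptor_rate(294, 50)      # exactly 44,100

print(round(ntsc_rate))  # 44056
print(pal_rate)          # 44100
```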

There was a long debate over the use of 14-bit (Philips) or 16-bit (Sony) quantization, and 44,056 or 44,100 samples/s (Sony) or approximately 44,000 samples/s (Philips). When the Sony/Philips task force designed the Compact Disc, Philips had already developed a 14-bit D/A converter (DAC), but Sony insisted on 16-bit. In the end, 16 bits and 44.1 kilosamples per second prevailed. Philips found a way to produce 16-bit quality using its 14-bit DAC by using four times oversampling.

Bit rate

The audio bit rate is 1,411.2 kbit/s (2 channels × 44,100 samples per second per channel × 16 bits per sample = 1,411,200 bit/s). Likewise, in a computer, audio data coming in from a CD drive is accessed by sectors, each sector being 2,352 bytes, with 75 sectors containing 1 second of audio, for the same rate: 2,352 × 75 = 176,400 bytes/s (176.4 kB/s, or 1,411.2 kbit/s). In comparison, the bit rate of a "1×" CD-ROM is defined as 2,048 bytes per sector × 75 sectors per second = 153,600 bytes/s (150 KiB/s, or 1,228.8 kbit/s). The undecoded channel-data rate for a Red Book audio CD is 4.3218 Mbit/s, with 2.0338 Mbit/s being the rate of the undecoded audio and subcode.
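These rates can be checked with a short calculation. The sketch below uses only figures from this article (588 channel bits per channel-data frame and 98 such frames per sector are described under "Frames and timecode frames"); the variable names are illustrative.

```python
CHANNELS = 2
SAMPLE_RATE = 44_100           # samples per second per channel
BITS_PER_SAMPLE = 16
SECTORS_PER_SECOND = 75
AUDIO_SECTOR_BYTES = 2_352     # audio CD sector
CDROM_SECTOR_BYTES = 2_048     # "1x" CD-ROM user data per sector
CHANNEL_BITS_PER_FRAME = 588   # channel-data frame after EFM and sync
FRAMES_PER_SECTOR = 98

audio_bit_rate = CHANNELS * SAMPLE_RATE * BITS_PER_SAMPLE
audio_byte_rate = AUDIO_SECTOR_BYTES * SECTORS_PER_SECOND
cdrom_byte_rate = CDROM_SECTOR_BYTES * SECTORS_PER_SECOND
channel_bit_rate = (CHANNEL_BITS_PER_FRAME * FRAMES_PER_SECTOR
                    * SECTORS_PER_SECOND)

print(audio_bit_rate)          # 1411200 bit/s
print(audio_byte_rate)         # 176400 bytes/s
print(cdrom_byte_rate / 1024)  # 150.0 KiB/s
print(channel_bit_rate)        # 4321800 bit/s
```

The sample-based and sector-based calculations agree: 176,400 bytes/s × 8 = 1,411,200 bit/s.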

Storage capacity and playing time

The partners aimed at a playing time of 60 minutes with a disc diameter of 100 mm (Sony) or 115 mm (Philips).[9] Sony vice-president Norio Ohga suggested extending the capacity to 74 minutes to accommodate Wilhelm Furtwängler's recording of Ludwig van Beethoven's Symphony No. 9 from the 1951 Bayreuth Festival.[10][11]

The additional 14-minute playing time subsequently required changing to a 120 mm disc. Kees Immink, Philips' chief engineer, however, denies this, claiming that the increase was motivated by technical considerations, and that even after the increase in size, the Furtwängler recording would not have fit on one of the earliest CDs.[9][12] According to a Sunday Tribune interview,[13] the story is slightly more involved. In 1979, Philips owned PolyGram, one of the world's largest distributors of music. PolyGram had set up a large experimental CD plant in Hannover, Germany, which could produce huge numbers of CDs having, of course, a diameter of 115 mm. Sony did not yet have such a facility. If Sony had agreed on the 115-mm disc, Philips would have had a significant competitive edge in the market. Sony decided that something had to be done. The long playing time of Beethoven's Ninth Symphony imposed by Ohga was used to push Philips to accept 120 mm, so that Philips' PolyGram lost its edge on disc fabrication.[13]

The 74-minute playing time of a CD, which was longer than the 22 minutes per side[14][15] typical of long-playing (LP) vinyl albums, was often used to the CD's advantage during the early years when CDs and LPs vied for commercial sales. CDs would often be released with one or more bonus tracks, enticing consumers to buy the CD for the extra material. However, attempts to fit double LPs onto one CD occasionally had the opposite result: the CD offered fewer tracks than the equivalent LP. Bonus tracks were nevertheless also added to some CD re-releases of double LPs.

Playing times beyond 74 minutes are achieved by decreasing track pitch beyond the original Red Book standard. Most players can accommodate the more closely spaced data.[16] Christian Thielemann's live Deutsche Grammophon recording of Bruckner's Fifth with the Munich Philharmonic in 2004 clocks at 82:34.[17] The Kirov Orchestra recording of Pyotr Ilyich Tchaikovsky's The Nutcracker conducted by Valery Gergiev and released by Philips/PolyGram Records (catalogue number 462 114) on October 20, 1998, clocks at 81:14.[citation needed] The Mission of Burma compilation album Mission of Burma, released in 1988 by Rykodisc, previously held the record at 80:08.[18]

Current manufacturing processes allow an audio CD to contain up to 80 minutes (variable from one replication plant to another) without requiring the content creator to sign a waiver releasing the plant owner from responsibility if the CD produced is marginally or entirely unreadable by some playback equipment. Thus, in current practice, maximum CD playing time has crept higher by reducing minimum engineering tolerances; by and large, this has not unacceptably reduced reliability.[citation needed]
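As a rough illustration of what these playing times mean in terms of decoded audio data, the following sketch converts minutes into bytes using the figures given elsewhere in this article (75 sectors per second, 2,352 bytes of audio per sector). The function name is an assumption made for the example.

```python
SECTORS_PER_SECOND = 75
AUDIO_BYTES_PER_SECTOR = 2_352


def audio_capacity_bytes(minutes):
    """Decoded audio bytes needed for a given playing time in minutes."""
    sectors = minutes * 60 * SECTORS_PER_SECOND
    return sectors * AUDIO_BYTES_PER_SECTOR


print(audio_capacity_bytes(74))  # 783216000
print(audio_capacity_bytes(80))  # 846720000
```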

Data structure

Overall structure

[Image: a CD-R showing some of the visible features of an audio CD, including the lead-in, program area, and lead-out. A microscopic spiral of digital information[19] begins near the disc's middle and ends near the edge. Data-free areas of the disc and silent portions of the spiral reflect light differently, sometimes allowing track boundaries to be seen.]

The audio data stream in an audio CD is continuous, but has three parts. The main portion, which is further divided into playable audio tracks, is the program area. This section is preceded by a lead-in track and followed by a lead-out track. The lead-in and lead-out tracks encode only silent audio, but all three sections contain subcode data streams.

If a disc supports multiple sessions, each session has this same structure (lead-in, program area, and lead-out). In multisession discs, the lead-in areas contain addresses of the previous sessions. The TOC in the lead-in of the latest session is used to access the tracks. Each session must have at least one track.

The following table shows the structure of a CD:

Session level: Session 1 | Session 2 | ...
Track level: Lead in (with TOC) | Track 1 | Track 2 | ... | Lead out | Lead in (with TOC) | Track 1 | Track 2 | ... | Lead out | ...

The lead-in's subcode contains repeated copies of the disc's table of contents (TOC), which provides an index of the start positions of the tracks in the program area and lead-out. The track positions are referenced by absolute timecode, relative to the start of the program area, in MSF format: minutes, seconds, and fractional seconds called frames. Each timecode frame is one seventy-fifth of a second and corresponds to a block of 98 channel-data frames, which ultimately holds 588 pairs of left and right audio samples. Timecode contained in the subchannel data allows the reading device to locate the region of the disc that corresponds to the timecode in the TOC. The TOC on discs is analogous to the partition table on hard drives. Nonstandard or corrupted TOC records are sometimes used deliberately as a form of CD/DVD copy protection, for example in the key2Audio scheme.

The lead-out area is the ending part of the CD or of a session. The first lead-out is 6,750 sectors (about 13 megabytes) long; each subsequent lead-out is 2,250 sectors (4 megabytes) long.

Tracks

The largest entity on a CD is called a track. A CD can contain up to 99 tracks (including a data track for mixed mode discs). Each track can in turn have up to 100 indexes, though players which handle this feature are rarely found outside of pro audio, particularly radio broadcasting[citation needed]. The vast majority of songs are recorded under index 1, with the pre-gap being index 0. Sometimes hidden tracks are placed at the end of the last track of the disc, often using index 2 or 3. This is also the case with some discs offering "101 sound effects", with 100 and 101 being indexed as two and three on track 99. The index, if used, is occasionally put on the track listing as a decimal part of the track number, such as 99.2 or 99.3. (Information Society's Hack was one of very few CD releases to do this, following a release with an equally obscure CD+G feature.) The track and index structure of the CD were carried forward to the DVD format as title and chapter, respectively.

Tracks, in turn, are divided into timecode frames (or sectors), which are further subdivided into channel-data frames.

Frames and timecode frames

The smallest entity in a CD is a channel-data frame, which consists of 33 bytes and contains six complete 16-bit stereo samples: 24 bytes for the audio (two 8-bit bytes × two channels × six samples = 24 bytes), eight CIRC error-correction bytes, and one subcode byte, used for control and display. Each byte is translated into a 14-bit word using eight-to-fourteen modulation, which alternates with three-bit merging words. In total there are 33 × (14 + 3) = 561 bits. A 27-bit unique synchronization word is added, so that the number of bits in a frame totals 588 (of which only 192 bits are audio data once the error-correction and control bits are removed).
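The bit budget of a channel-data frame can be tallied as follows. This is a minimal sketch that restates the figures from the paragraph above; the constant names are chosen for the example.

```python
# Contents of one channel-data frame, in bytes, before modulation.
AUDIO_BYTES = 2 * 2 * 6   # bytes per sample x channels x samples
CIRC_BYTES = 8            # error-correction bytes
SUBCODE_BYTES = 1         # control and display
frame_bytes = AUDIO_BYTES + CIRC_BYTES + SUBCODE_BYTES

# Channel bits after eight-to-fourteen modulation.
EFM_WORD_BITS = 14
MERGING_BITS = 3
SYNC_BITS = 27
channel_bits = frame_bytes * (EFM_WORD_BITS + MERGING_BITS) + SYNC_BITS

audio_bits = AUDIO_BYTES * 8

print(frame_bytes)    # 33
print(channel_bits)   # 588
print(audio_bits)     # 192
print(round(audio_bits / channel_bits, 3))  # 0.327
```

Roughly one third of the bits written to the disc carry audio samples; the remainder is spent on error correction, subcode, modulation overhead, and synchronization.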

On a Red Book audio CD, data is addressed using the MSF scheme, with timecodes expressed in minutes, seconds and another type of frames (mm:ss:ff), where one frame corresponds to 1/75th of a second of audio: 588 pairs of left and right samples. This timecode frame is distinct from the 33-byte channel-data frame described above, and is used for time display and positioning the reading laser. When editing and extracting CD audio, this timecode frame is the smallest addressable time interval for an audio CD; thus, track boundaries only occur on these frame boundaries. Each of these structures contains 98 channel-data frames, totaling 98 × 24 = 2,352 bytes of music. The CD is played at a speed of 75 frames (or sectors) per second, thus 44,100 samples or 176,400 bytes per second.

In the 1990s, CD-ROM and related Digital Audio Extraction (DAE) technology introduced the term sector to refer to each timecode frame, with each sector being identified by a sequential integer number starting at zero, and with tracks aligned on sector boundaries. An audio CD sector corresponds to 2,352 bytes of decoded data. The Red Book does not refer to sectors, nor does it distinguish the corresponding sections of the disc's data stream except as "frames" in the MSF addressing scheme.
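Converting between an MSF timecode and a sequential frame (sector) count is simple arithmetic. The sketch below is illustrative only: the function names are assumptions, and it deliberately applies no offset, whereas some addressing conventions (such as the 150-frame, two-second offset commonly used for logical block addresses) shift the numbering.

```python
FRAMES_PER_SECOND = 75
BYTES_PER_SECTOR = 2_352


def msf_to_frames(minutes, seconds, frames):
    """Convert an mm:ss:ff timecode to a count of timecode frames."""
    if minutes < 0 or not 0 <= seconds < 60:
        raise ValueError("minutes or seconds out of range")
    if not 0 <= frames < FRAMES_PER_SECOND:
        raise ValueError("frames must be in the range 0-74")
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames


def frames_to_msf(count):
    """Convert a count of timecode frames back to (mm, ss, ff)."""
    total_seconds, frames = divmod(count, FRAMES_PER_SECOND)
    minutes, seconds = divmod(total_seconds, 60)
    return minutes, seconds, frames


start = msf_to_frames(2, 30, 45)
print(start)                     # 11295
print(frames_to_msf(start))      # (2, 30, 45)
print(start * BYTES_PER_SECTOR)  # byte offset into the decoded audio
```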

The following table shows the relation between tracks, timecode frames (sectors) and channel-data frames:

Track level: Track N
Timecode frame or sector level: Timecode frame or sector 1 (2,352 bytes of data) | Timecode frame or sector 2 (2,352 bytes of data) | ...
Channel-data frame level: Channel-data frame 1 (24 bytes of data) | ... | Channel-data frame 98 (24 bytes of data) | ... | ...

Data access from computers

Unlike on a DVD or CD-ROM, there are no "files" on a Red Book audio CD; there are only the physical pits and lands, which in turn represent a single encoded data stream, which ultimately represents one continuous stream of LPCM audio data, and a parallel, smaller set of 8 subcode data streams. Computer operating systems, however, may provide access to an audio CD as if it contains files. For example, Windows represents the CD's TOC as a set of Compact Disc Audio track (CDA) files, each file containing indexing information, not audio data.

In a process called ripping, Digital Audio Extraction software can be used to read Red Book audio data and store it in files. Common audio file formats for this purpose include WAV and AIFF, which simply preface the LPCM data with a short header; FLAC, ALAC, and Windows Media Lossless, which compress the LPCM data in ways that conserve space yet allow it to be restored without any changes; and various lossy, perceptual coding formats like MP3 and AAC, which modify and compress the audio data in ways that irreversibly change the audio, but that exploit features of human hearing to make the changes difficult to discern.
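To illustrate how a WAV file "simply prefaces the LPCM data with a short header", the following sketch wraps raw audio bytes using Python's standard `wave` module. It is not a ripping tool: reading the disc is not shown, the helper name `lpcm_to_wav_bytes` is an assumption, and the input is assumed to be 16-bit little-endian stereo samples at 44.1 kHz, as extraction software typically delivers them.

```python
import io
import wave


def lpcm_to_wav_bytes(lpcm):
    """Wrap raw 16-bit stereo 44.1 kHz LPCM bytes in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(2)       # stereo
        wav.setsampwidth(2)       # 16 bits = 2 bytes per sample
        wav.setframerate(44_100)  # samples per second per channel
        wav.writeframes(lpcm)
    return buffer.getvalue()


# One sector (2,352 bytes) of digital silence stands in for ripped audio.
sector = bytes(2_352)
wav_data = lpcm_to_wav_bytes(sector)

print(len(wav_data) - len(sector))  # size of the header in bytes
print(wav_data.endswith(sector))    # the LPCM data is stored unchanged
```

Lossless formats such as FLAC would instead compress the sample data, and lossy formats such as MP3 would alter it; neither is shown here.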

Format deviations

Some major recording publishers have begun to sell CDs that violate the Red Book standard. Some do so for the purpose of copy prevention, using systems like Copy Control.

Others do so to offer extra features, as with DualDisc, which includes both a CD layer and a DVD layer. Its CD layer is 0.9 mm thick, much thinner than the Red Book allows: the standard stipulates a nominal thickness of 1.2 mm and a minimum of 1.1 mm. Philips and many other companies have warned publishers that including the Compact Disc Digital Audio logo on such non-conforming discs may constitute trademark infringement. Either in anticipation or in response, recent[when?] copy-protected CDs bear stickers and warnings that the CD is not standard and may not play in all CD players, and no longer display the long-familiar logo.[citation needed]

DVD Audio, an advanced version of the audio CD, emerged in 1999.[20] The format was designed to feature audio of higher fidelity. It uses a higher sampling rate and 650 nm lasers.

References

  1. ^ [2-35] Why 44.1KHz? Why not 48KHz?
  2. ^ "How the CD was developed". BBC News. August 17, 2007. Retrieved 2007-08-17.
  3. ^ "Philips Compact Disc". Philips Historical Products. Retrieved 2010-10-06.
  4. ^ IEC 60908 ed 1.0 - IEC search, retrieved 2011-07-28
  5. ^ IEC, IEC 60908 ed. 2.0 - preview (PDF), retrieved 2011-07-28
  6. ^ Document no. 28/10/04-3122 783 0027 2
  7. ^ IEC 60908 Ed. 2.0 b:1999 Audio recording – Compact disc digital audio system
  8. ^ Clifford, Martin (1987). "The Complete Compact Disc Player." Prentice Hall. p. 57. ISBN 0-13-159294-7.
  9. ^ a b Kees A. Schouhamer Immink (2007). "Shannon, Beethoven, and the Compact Disc". IEEE Information Theory Newsletter: 42–46. Retrieved 2007-12-12.
  10. ^ Philips. "Beethoven's Ninth Symphony of Greater Importance than Technology". Retrieved 2007-02-09.
  11. ^ AES. "AES Oral History Project: Kees A.Schouhamer Immink". Retrieved 2008-07-29.
  12. ^ Kees A. Schouhamer Immink (1998). "The CD Story". Journal of the AES. 46: 458–465. Retrieved 2007-02-09.
  13. ^ a b Cassidy, Fergus (2005-10-23). "Great Lengths" (reprint). Sunday Tribune. Retrieved 2007-12-21.
  14. ^ Hoffmann, Frank (2005). Encyclopedia of Recorded Sound. CRC Press. p. 1289. ISBN 0-415-93835-X, 9780415938358.
  15. ^ Goldmark, Peter. Maverick inventor; My Turbulent Years at CBS. New York: Saturday Review Press, 1973.
  16. ^ Andy McFadden (2010-01-09). "CD-Recordable FAQ". Retrieved 2010-12-30.
  17. ^ "BRUCKNER: Symphony No. 5 in B flat major — Munich Philharmonic/Christian Thielemann — DGG". audaud.com. Retrieved 2011-04-06.
  18. ^ "Mission of Burma 1988 Rykodisc compilation information". discogs.com. Retrieved 2011-01-18. This Rykodisc release was the first compact disc to contain 80 minutes of music; 78 minutes had previously been the longest length possible to encode on a CD.
  19. ^ On a Red Book CD, the spiral would consist of a series of pits and lands pressed into the disc at the time it is manufactured, but on a Red Book-compatible CD-R, instead of actual pits and lands, the spiral is pit-and-land-shaped spots on a layer of organic dye; a laser created the spots by altering the reflective properties of the dye.
  20. ^ Taylor, Jim. "DVD FAQ". DVD Demystified. Retrieved 21 August 2012.[dead link]