Chapter 6: Synchronized Media Content and Players

  • 601  General
  • 602  Video or Audio Content with Interactive Elements
  • 603  Captions and Transcripts for Audio Content
  • 604  Video Description and Transcripts for Video Content
  • 605  Caption Processing Technology
  • 606  Video Description Processing Technology
  • 607  User Controls for Captions and Video Description
  • 608  Audio Track and Volume Control

601 General

601.1 Scope.  The provisions of this chapter shall apply where required by Chapter 1, or where referenced by a requirement in this document.

Exception:  Synchronized media content and players complying with the WCAG 2.0 Level AA Success Criteria and Conformance Requirements and, where applicable, 604.4, 604.5, 607, and 608 of this chapter shall not be required to comply with other requirements of this chapter.

Advisory 601.1 Scope.  Synchronized media is audio or video displayed at the same time as other time-based content that is required for understanding of the complete presentation.  The other content that the audio or video is synchronized with to meet this definition does not include equivalents such as captions, subtitles, or video description.  Examples of time-based content are video, audio, and user-action.

602 Video or Audio Content with Interactive Elements

602.1 General.  Video or audio content containing interactive elements shall provide a mode of operation that conforms to Chapter 4 (Platforms, Applications, and Interactive Content).

Advisory 602.1 General.  Examples of video or audio content with interactive elements include DVD menus and dynamic on-screen television program guides.

603 Captions and Transcripts for Audio Content

603.1 General.  Regardless of format, materials containing audio content, or video with audio content, shall conform to 603.

603.2 Pre-recorded Audio Content with No Video Content or User Interaction.  Materials containing pre-recorded audio content, and no video content or user interaction, shall provide a transcript of the content.

Advisory 603.2 Pre-recorded Audio Content with No Video Content or User Interaction.  A best practice is to provide synchronized captions for pre-recorded and real-time (live) audio-only media.  This best practice is not a requirement.  Complementing an audio-only stream with captions entails the addition of a visual stream and translation of the data.  An example where this would be a fundamental alteration is a radio broadcast.

An audio-only presentation used in a public setting may require captions or other real-time visual equivalent under sections 501 and 504 of the Rehabilitation Act.

603.2.1 Transcript.  When a separate transcript is provided, the text shall conform to Chapter 5 (Electronic Documents).

603.3 Pre-recorded Audio Content with User Interaction.  Materials containing pre-recorded audio content with user interaction shall provide synchronized captions.

Advisory 603.3 Pre-recorded Audio Content with User Interaction.  An example of pre-recorded audio with user interaction would be an on-line tutorial where spoken narration guides a user through a task.

603.4 Pre-recorded Video Content with Synchronized Audio Content.  Materials containing pre-recorded video content with synchronized audio content shall provide synchronized captions.

Advisory 603.4 Pre-recorded Video Content with Synchronized Audio Content.  Captions are either closed or open.  “Closed” means capable of being turned on and off.  “Open” means visible to all users.

A best practice is for captions to conform to Chapter 5 (Electronic Documents) to provide alternate formats for people who are deaf-blind and rely on braille.

603.5 Real-Time Video Content.  Materials that contain real-time video content with audio information shall provide synchronized captions.

Exception:  When real-time video is unattended and has the primary purpose of conveying a visual experience, and a text alternative that provides descriptive identification is provided, synchronized captions shall not be required.
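Illustrative sketch (non-normative).  The decision logic of 603.2 through 603.5 can be summarized as a small lookup function.  The `Material` class, its field names, and the returned strings are hypothetical and are not defined by this document.  The sketch assumes that any pre-recorded video with audio has that audio synchronized, and it treats real-time audio-only material as having no requirement under 603, since Advisory 603.2 describes captions there as a best practice only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Material:
    """Hypothetical description of a piece of media (field names are illustrative)."""
    has_audio: bool
    has_video: bool
    real_time: bool = False
    user_interaction: bool = False
    # The two conditions of the 603.5 exception:
    unattended_visual_feed: bool = False
    has_descriptive_text_alternative: bool = False


def audio_alternative_required(m: Material) -> str:
    """Return the alternative that 603 would call for, under the stated assumptions."""
    if not m.has_audio:
        return "none"  # 603 addresses only materials containing audio content
    if m.real_time:
        if not m.has_video:
            return "none"  # live audio-only: best practice only (Advisory 603.2)
        if m.unattended_visual_feed and m.has_descriptive_text_alternative:
            return "none"  # 603.5 exception
        return "synchronized captions"  # 603.5
    if m.has_video or m.user_interaction:
        return "synchronized captions"  # 603.4 and 603.3
    return "transcript"  # 603.2
```

For example, a pre-recorded podcast with no video and no interaction maps to a transcript, while a narrated on-line tutorial with user interaction maps to synchronized captions.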

604 Video Description and Transcripts for Video Content

604.1 General.  Regardless of format, materials containing video content, with or without audio content, shall conform to 604.

604.2 Pre-recorded Video Content with No Audio Content or User Interaction.  Materials containing pre-recorded video content, and no audio content or user interaction, shall provide either a separate transcript or an equivalent audio alternative.

Advisory 604.2 Pre-recorded Video Content with No Audio Content or User Interaction.  An example of pre-recorded video with no audio information or user interaction is a silent movie.

The purpose of the transcript is to provide an equivalent to what is presented visually.

The purpose of the audio alternative is to be an equivalent to the video. 

A text equivalent is not required for audio that is provided as an equivalent for video with no audio information.  For example, it is not required to caption video description that is provided as an alternative to a silent movie.

A video-only presentation used in a public setting may require video description or other real-time audio equivalent as a reasonable accommodation under sections 501 and 504 of the Rehabilitation Act.

604.2.1 Transcript.  When a separate transcript is provided electronically, the text shall conform to Chapter 5 (Electronic Documents).

604.3 Pre-recorded Video with Synchronized Audio Content.  Materials containing pre-recorded video with synchronized audio content shall provide video description.

604.4 Real-Time Video.  Materials that contain real-time video, with or without audio content, shall provide real-time video description.

Exception:  When real-time video is unattended and has the primary purpose of extending a visual experience, and a text alternative that provides descriptive identification is provided, video description shall not be required.

Advisory 604.4 Real-Time Video.  A best practice is for speakers to incorporate verbal descriptions of any visual information presented.  This practice is necessary for any live presentation to be accessible.  This practice may also avoid having to add video description to presentations that are recorded.

An example of real-time video is a live “on-site” news broadcast.

Advisory 604.4 Real-Time Video Exception.  An example of real-time video that is unattended and has the primary purpose of extending a sensory experience is an automated fixed camera that overlooks a national park and continuously broadcasts sights and sounds.

A best practice is to provide as much text-based information as possible, even when only descriptive identification is required.  An example of this is adding current wind speed and temperature information to the text alternative associated with an unattended “beach cam”.

604.5 Multiple Visual Areas of Focus.  Materials containing real-time or pre-recorded video content with synchronized audio content that display visual content in multiple areas of focus shall provide video description for visual content necessary for the comprehension of content.

Advisory 604.5 Multiple Visual Areas of Focus.  Video may contain more than a single visual focus.  This provision requires that video description be provided for each area of focus.  An example is in-house agency broadcast programs where scrolling event notices appear under the main program.  People who are blind have traditionally missed visual information in videos that require the viewer to follow more than one thing at the same time.  For example, a video of an interview should audibly identify each speaker when the speaker’s name is displayed visually.  In addition, streaming information, such as daily news highlights, should also be video described.

In particular, visual emergency communication, such as emergency announcements in the form of text scrolling on a screen, is covered by this requirement.  This is consistent with the Federal Communications Commission (FCC) rule requiring broadcasters and cable operators to make local emergency information accessible to persons who are deaf or hard of hearing, and to persons who are blind or have visual disabilities.  This part extends that requirement to ICT.

An alternative method for meeting this requirement is to stream the audio information in two tracks that can be fed separately into the right and left ears.  Because the video description stream can be closed (turned off), people who do not need it would not have to listen to competing audio streams.  People who are blind often have the skill to listen to two audio streams simultaneously or can be trained to do so.

605 Caption Processing Technology

605.1 General.  ICT that displays or processes video with synchronized audio content shall conform to 605.

Advisory 605.1 General.  Examples of products that display synchronized media include, but are not limited to, analog television (TV), digital television (DTV), tuners (including TV tuner cards for use in computers), digital-to-analog TV converter boxes, personal video display devices, and software players.

A TV tuner card is a computer component that allows television signals to be received by a computer.

A digital-to-analog TV converter box is a stand-alone device that receives and converts digital signals into a format for display on an analog television receiver.

Components of a system may be obtained separately and integrated.  Such a system could include a separate DVD player and projector.  This provision requires that the system, as a whole, include the necessary technology to support the display of open and closed captions.  As described in Chapter 10 (ICT Support Documentation and ICT Support Services), a best practice is for manufacturers who sell system components to explain in their product documentation how to integrate the system to support open and closed captions.

Caption technology may offer features that enhance the usability of captions.  These features include choices for background and foreground color, font selection, and contrast.

605.2 Audio-Visual Players and Displays.  Audio-visual players and displays shall conform to 605.2.1 or 605.2.2.

605.2.1 Decoding of Closed Captions to Open Captions.  Audio-visual players and displays that process video with synchronized audio information shall decode closed caption data and pass on an open-captioned signal to the video display.

605.2.2 Pass through of Closed Caption Data.  Audio-visual players and displays that process video with synchronized audio information shall pass closed caption data through to the video display for decoding as displayed text.  All cabling or ancillary equipment shall not block the passing through of closed captioning.
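Illustrative sketch (non-normative).  The two options in 605.2, together with the system-level view in Advisory 605.1, can be modeled as a chain of components between the source and the display.  The `Handling` enumeration and the function `captions_can_be_displayed` are hypothetical names introduced only for this illustration.  The model assumes the source carries closed caption data and that open captions, once decoded, are part of the picture and cannot be lost downstream.

```python
from enum import Enum


class Handling(Enum):
    """How one component (player, cable, switcher) treats closed caption data."""
    DECODE = "decode"              # 605.2.1: decode and pass on an open-captioned signal
    PASS_THROUGH = "pass through"  # 605.2.2: forward the caption data unchanged
    BLOCK = "block"                # drops the caption data (non-conforming)


def captions_can_be_displayed(chain, display_decodes):
    """Return True if captions can still reach the viewer at the end of the chain.

    chain: iterable of Handling values, ordered from source to display.
    display_decodes: whether the final display can decode closed caption data.
    """
    state = "closed"  # caption data starts out as closed captions
    for handling in chain:
        if state != "closed":
            break  # already burned in as open captions, or already lost
        if handling is Handling.DECODE:
            state = "open"
        elif handling is Handling.BLOCK:
            state = "lost"
    return state == "open" or (state == "closed" and display_decodes)
```

Under this model, a player that passes caption data through a cable that blocks it yields no captions even when the display has a decoder, which is the situation the last sentence of 605.2.2 is meant to prevent.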

606 Video Description Processing Technology

606.1 General.  ICT that displays or processes video with synchronized audio information shall conform to 606.

606.2 Audio-Visual Players and Displays.  Audio-visual players and displays shall conform to 606.2.1 or 606.2.2.

606.2.1 Audio for Video Description.  Audio-visual players and displays that process video with synchronized audio information shall play audio information associated with video description.

606.2.2 Processing of Video Description.  Audio-visual players and displays that process video with synchronized audio information shall conform to 606.2.2.1 or 606.2.2.2.

606.2.2.1 Analog Signal Tuners.  Analog signal tuners shall conform to MTS/BTSC Broadcast Television Systems Committee (BTSC) Multichannel Television Sound Standard (1984) (incorporated by reference, see “Referenced Standards or Guidelines” in 508 Chapter 1).  Analog signal tuners shall be equipped with Secondary Audio Program (SAP) process circuitry as defined by the Broadcast Television Systems Committee (BTSC) Multichannel Television Sound Standard.

606.2.2.2 Digital Television Tuners.  Digital television tuners shall conform to ATSC A/53 Digital Television Standard, Parts 1-6 (2007) (incorporated by reference, see “Referenced Standards or Guidelines” in 508 Chapter 1).  Digital television tuners shall support processing of video description when encoded as a Visually Impaired (VI) associated audio service that is provided as a complete program mix containing video description according to the A/53 standard developed by the Advanced Television Systems Committee (ATSC).

607 User Controls for Captions and Video Description

607.1 General.  ICT that displays video with synchronized audio content shall provide user controls for closed captions and video description that conform to 607 and Chapter 3 (Common Functionality).

607.2 User Controls Location.  Location of user controls for closed captions and video description shall conform to 607.2.1 through 607.2.3.

607.2.1 Caption Controls.  When controls are provided for the selection of volume, controls for the selection of captions shall be provided in at least one location that is comparable in prominence to the location of the controls for volume.

607.2.2 Dedicated Video Description Controls.  When controls are provided for the selection of channels, the controls for the selection of video description shall be provided in at least one location that is comparable in prominence to the location of the controls for channels.

Advisory 607.2.1 Caption Controls; Advisory 607.2.2 Dedicated Video Description Controls.  The user controls needed to access captioning and video description must be in at least one location that is comparable in prominence to the controls needed to control volume or program selection.  At a minimum, this requires placement of such controls on either the product’s physical apparatus or its remote control, where the ability to control volume or program selection is otherwise provided on that apparatus or remote control.

607.2.3 On-screen Menus.  When an on-screen menu is used to control the selection of volume or channels, the controls for the selection of captions and video description shall be at the same menu level as the corresponding volume and channel selection.
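Illustrative sketch (non-normative).  One way to check an on-screen menu design against 607.2.3 is to compare the depth at which each control appears.  The nested-dictionary menu representation, the control labels, and the function names below are hypothetical.  The sketch interprets "same menu level" as "same nesting depth", which is an assumption, and pairs captions with volume and video description with channels, following 607.2.1 and 607.2.2.

```python
def menu_depth(menu, target, depth=1):
    """Return the nesting depth of the entry labeled `target`, or None if absent."""
    for label, submenu in menu.items():
        if label == target:
            return depth
        if isinstance(submenu, dict):
            found = menu_depth(submenu, target, depth + 1)
            if found is not None:
                return found
    return None


# (reference control, accessibility control) pairs; labels are illustrative.
PAIRS = (("Volume", "Captions"), ("Channels", "Video Description"))


def same_menu_level(menu, pairs=PAIRS):
    """Check that each accessibility control sits at the depth of its counterpart."""
    for reference, accessible in pairs:
        reference_depth = menu_depth(menu, reference)
        if reference_depth is None:
            continue  # the menu does not control this function, so no pairing to check
        if menu_depth(menu, accessible) != reference_depth:
            return False
    return True
```

A menu that places "Captions" beside "Volume" passes this check, while one that buries "Captions" one level deeper, for example under an "Accessibility" submenu, does not.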

608 Audio Track and Volume Control

608.1 General.  ICT that displays and processes synchronized media shall conform to 608.

Advisory 608.1 General.  The intent of this provision is to provide users with a control so that they can distinguish speech from background audio.  Examples of ICT that displays and processes synchronized media are audio visual players and displays.

This provision applies to players of digital broadcast signals and players of media.  An example of this is a DVD player.

A best practice is to produce videos with speech and background sounds on separate tracks in order for users to be able to select a preferred audio track. 

Some individuals with hearing loss may find it difficult to understand speech in videos or broadcasts when there is competing background music or other sound effects.

In some videos developed under the DTV A/53 Standard, users may choose to listen to speech only, without background sound that may interfere with comprehension.

608.2 Independent Selection.  When materials contain speech content that is provided on a separate track from the other audio tracks, audio-visual players and displays shall provide users with a mode of operation to select the speech track independently from the other audio tracks.

608.3 Volume Adjustment.  When materials contain multiple audio tracks, audio-visual players and displays shall provide users with a mode of operation to adjust the volume of each audio track independently.
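Illustrative sketch (non-normative).  The modes of operation described in 608.2 and 608.3 amount to applying an independent gain to each audio track before mixing.  The functions `mix` and `speech_only`, the track names, and the representation of audio as lists of floating-point samples are assumptions made for this illustration and do not describe any particular player.

```python
def mix(tracks, gains):
    """Mix named audio tracks into one signal, scaling each by its own gain.

    tracks: dict mapping track name to a list of float samples.
    gains: dict mapping track name to a volume multiplier (default 1.0).
    """
    length = max((len(samples) for samples in tracks.values()), default=0)
    mixed = [0.0] * length
    for name, samples in tracks.items():
        gain = gains.get(name, 1.0)  # 608.3: each track is adjusted independently
        for i, sample in enumerate(samples):
            mixed[i] += gain * sample
    return mixed


def speech_only(tracks, speech_track="speech"):
    """Return gains that select the speech track alone (608.2)."""
    return {name: (1.0 if name == speech_track else 0.0) for name in tracks}
```

With a "speech" track and a "music" track, `mix(tracks, speech_only(tracks))` yields the speech samples alone, and `mix(tracks, {"music": 0.5})` halves the music while leaving speech unchanged.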