Timed text tracks

Internet Explorer 10 and Windows apps using JavaScript introduce support for the track element as described in Section 4.8.9 of the World Wide Web Consortium (W3C)'s HTML5 standard. The track element enables you to add timed text tracks, such as closed captioning, translations, or text commentary, to HTML5 video elements.

  • The track element
  • Track file formats
    • WebVTT
    • TTML
  • Multiple track files
  • Scripting the track element
    • The textTrack and textTrackList objects
    • The textTrackCueList and textTrackCue objects
  • Working with cues
    • Get the current cue
    • Get all the cues
  • API Reference
  • Samples and tutorials
  • Internet Explorer Test Drive demos
  • IEBlog posts
  • Specification
  • Related topics

The track element

The syntax for the element is as follows.

<video id="mainvideo" controls autoplay loop>
  <track src="en_track.vtt" srclang="en" label="English" kind="captions" default>
</video>

The <track> element represents a timed text file to provide users with multiple languages or commentary for videos. You can use multiple tracks, and set one as default to be used when the video starts.

The text is displayed in the lower portion of the video player. At this time the position and color can't be controlled, but you can retrieve text through script and display it in your own way.

Track file formats

Text tracks use a simplified version of the Web Video Text Track (WebVTT) or Timed Text Markup Language (TTML) timed text file formats. Internet Explorer 10 and Windows apps using JavaScript currently support only timing cues and text captions.

WebVTT

WebVTT files are text files encoded in 8-bit Unicode Transformation Format (UTF-8) that look like the following.

WEBVTT

00:00:01.878 --> 00:00:05.334
Good day everyone, my name is John Smith

00:00:08.608 --> 00:00:15.296
This video will teach you how to 
build a sand castle on any beach

The file starts with the tag WEBVTT on the first line, followed by a line feed. The timing cues are in the format HH:MM:SS.sss. The Start and End cues are separated by a space, two hyphens and a greater-than sign ( --> ), and another space. The timing cues are on a line by themselves with a line feed. Immediately following the cue is the caption text. Text captions can be one or more lines. The only restriction is that there must be no blank lines between lines of text. The MIME type for WebVTT files is "text/vtt".
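
The layout described above is simple enough to read with a few lines of script. The following is a minimal sketch, not the parser that Internet Explorer uses: it assumes the simplified format shown here (no cue identifiers or cue settings), and parseTimestamp and parseSimpleVtt are hypothetical helper names introduced for illustration.

```javascript
// Convert a timestamp in HH:MM:SS.sss form to seconds.
function parseTimestamp(stamp) {
    var match = /^(\d{2,}):(\d{2}):(\d{2})\.(\d{3})$/.exec(stamp);
    if (!match) {
        throw new Error("Bad timestamp: " + stamp);
    }
    // Work in whole milliseconds first to avoid accumulating rounding error.
    var totalMs = Number(match[1]) * 3600000 + Number(match[2]) * 60000 +
                  Number(match[3]) * 1000 + Number(match[4]);
    return totalMs / 1000;
}

// Parse the simplified WebVTT layout (timing line, one or more text lines,
// then a blank line) into an array of { startTime, endTime, text } objects.
function parseSimpleVtt(source) {
    var lines = source.replace(/\r\n?/g, "\n").split("\n");
    if (lines[0].trim() !== "WEBVTT") {
        throw new Error("File must start with WEBVTT");
    }
    var cues = [];
    var current = null;
    for (var i = 1; i < lines.length; i++) {
        var line = lines[i];
        if (line.indexOf(" --> ") !== -1) {
            // Timing line: start and end separated by " --> ".
            var parts = line.split(" --> ");
            current = {
                startTime: parseTimestamp(parts[0].trim()),
                endTime: parseTimestamp(parts[1].trim()),
                text: ""
            };
            cues.push(current);
        } else if (line.trim() === "") {
            current = null;   // a blank line ends the current cue
        } else if (current) {
            // Caption text can span several lines; keep them separated by "\n".
            current.text += (current.text ? "\n" : "") + line.trim();
        }
    }
    return cues;
}
```

Passing the sample file above to parseSimpleVtt yields two cue objects, the second of which carries two lines of text.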

TTML

Internet Explorer 10 and Windows apps using JavaScript use a subset of the TTML (Timed Text Markup Language) file format, which is defined in the TTML specification. Windows Internet Explorer and Windows apps using JavaScript support the following structure.

<?xml version='1.0' encoding='UTF-8'?>
<tt xmlns='http://www.w3.org/ns/ttml' xml:lang='en' >
<body>
<div>

<p begin="00:00:01.878" end="00:00:05.334" >Good day everyone, my name is John Smith</p>
<p begin="00:00:08.608" end="00:00:15.296" >This video will teach you how to<br/>build a sand castle on any beach</p>
</div>

</body>
</tt>

The TTML file uses a namespace declaration and the language attribute in the root element (tt). This is followed by the body and a div element. Within the div element are the timing cues. The times are set as attributes (begin, end) of the opening paragraph tag (<p>), and the caption text runs from there to the closing </p> tag. Blank lines and white space are ignored. If there are multiple lines, they are separated by <br/> tags.

The MIME type for TTML files is "application/ttml+xml", or "text/xml". See Section 5.2 of the TTML specification for more information.
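
The structure above can also be generated from script, for example when captions are stored in another form. This is a minimal sketch under stated assumptions: buildTtml, formatTimestamp, and escapeXml are hypothetical helper names, each cue is a plain object with startTime and endTime in seconds and a text string, and the language code passed in is assumed to be a plain tag such as "en".

```javascript
// Escape characters that have special meaning in XML text content.
function escapeXml(text) {
    return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Format a time in seconds as HH:MM:SS.sss.
function formatTimestamp(seconds) {
    var totalMs = Math.round(seconds * 1000);
    var ms = totalMs % 1000;
    var totalSeconds = (totalMs - ms) / 1000;
    var s = totalSeconds % 60;
    var m = Math.floor(totalSeconds / 60) % 60;
    var h = Math.floor(totalSeconds / 3600);
    function pad(n, width) {
        var str = String(n);
        while (str.length < width) { str = "0" + str; }
        return str;
    }
    return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + "." + pad(ms, 3);
}

// Build a TTML document with the tt > body > div > p structure shown above.
function buildTtml(cues, lang) {
    var out = [
        "<?xml version='1.0' encoding='UTF-8'?>",
        "<tt xmlns='http://www.w3.org/ns/ttml' xml:lang='" + lang + "'>",
        "<body>",
        "<div>"
    ];
    for (var i = 0; i < cues.length; i++) {
        // Multiple caption lines are joined with <br/> inside one <p>.
        var lines = cues[i].text.split("\n");
        var escaped = [];
        for (var j = 0; j < lines.length; j++) {
            escaped.push(escapeXml(lines[j]));
        }
        out.push('<p begin="' + formatTimestamp(cues[i].startTime) +
                 '" end="' + formatTimestamp(cues[i].endTime) + '">' +
                 escaped.join("<br/>") + "</p>");
    }
    out.push("</div>", "</body>", "</tt>");
    return out.join("\n");
}
```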

Multiple track files

More than one timed text file can be used—for instance, to provide your users with multiple languages or alternate commentary. If you're using multiple tracks, you set one as default to be used if your page doesn't specify or the user hasn't picked a language. Within the video player, the user can choose alternate tracks through a built-in user interface.

The following example shows a video element with three track elements.

<video id="mainvideo" controls autoplay loop>
  <source src="video.mp4" type="video/mp4">
  <track id="enTrack" src="engtrack.vtt" label="English" kind="subtitles" srclang="en" default>
  <track id="esTrack" src="spntrack.vtt" label="Spanish" kind="subtitles" srclang="es">
  <track id="deTrack" src="grmtrack.vtt" label="German" kind="subtitles" srclang="de">
</video>

In this example, the source element is used to define the video file, and the track elements each specify a text translation. The track elements are children of the video element. The track element accepts the following attributes.

Attribute Description

kind

Defines the type of text content. Possible values are: subtitles, captions, descriptions, chapters, metadata.

src

URL of the timed text file.

srclang

The language of the timed text file. For information purposes; not used in the player.

label

Provides a label that can be used to identify the timed text. Each track must have a unique label.

default

Specifies the default track element. If not specified, no track is displayed.
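
The rules in this table can be checked before the markup is written. The sketch below is illustrative only: checkTracks, VALID_KINDS, and the isDefault field are hypothetical names, and each track is described by a plain object rather than by a real track element.

```javascript
// The kind values listed in the attribute table.
var VALID_KINDS = ["subtitles", "captions", "descriptions", "chapters", "metadata"];

// Return a list of human-readable problems found in an array of track
// descriptions of the form { kind, src, srclang, label, isDefault }.
function checkTracks(tracks) {
    var problems = [];
    var seenLabels = {};
    var defaults = 0;
    for (var i = 0; i < tracks.length; i++) {
        var t = tracks[i];
        if (VALID_KINDS.indexOf(t.kind) === -1) {
            problems.push("track " + i + ": unknown kind '" + t.kind + "'");
        }
        if (Object.prototype.hasOwnProperty.call(seenLabels, t.label)) {
            problems.push("track " + i + ": duplicate label '" + t.label + "'");
        }
        seenLabels[t.label] = true;
        if (t.isDefault) {
            defaults++;
        }
    }
    if (defaults > 1) {
        problems.push("more than one default track");
    }
    return problems;
}
```

An empty result means the descriptions follow the table; a kind of "caption" (singular), for instance, is reported as unknown.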

 

Scripting the track element

Like most elements, the track element can be manipulated through scripting. The following objects, methods, and properties are available to manage track content and cues. A track is a collection of cues that provides times and text content related to a video.

The textTrack and textTrackList objects

Object Description

textTrack

Represents the timed text track of a track element. The track consists of a collection of cues.

var texttrack = document.getElementById("trackElement").track;

textTrackList

Represents the list of tracks associated with a specific video element.

var texttracklist = document.getElementById("videoelement").textTracks;

 

The textTrackList is an object associated with the video element that contains a list of the textTrack objects. To get a list of tracks used with a certain video (if there are any), the video object provides the textTracks property. The textTracks property is an object of type textTrackList, and is an array of the textTrack objects associated with the video.

  var oVideo = document.getElementById("videoElement");
  var oTTlist = oVideo.textTracks;   // get the textTrackList
  var oTrack = oTTlist[0];           // get the first text track on the video object
  // a track element exposes its own textTrack through the track property (TrackElement.track)

Property Description

VideoElement.textTracks.length

Returns the number of textTrack objects associated with a video element.

VideoElement.textTracks[n]

Returns the nth textTrack in the video element's list of tracks.

TrackElement.track

Returns the track element's text track.
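
Because the list exposes length and indexed access, it can be walked with an ordinary loop. The following is a small sketch rather than part of the API: findTrackByLanguage is a hypothetical helper, and it assumes each textTrack exposes a language property that reflects the srclang attribute, as in the HTML5 specification.

```javascript
// Return the first text track whose language matches, or null if none does.
// textTracks can be a real textTrackList or any object with length and indexes.
function findTrackByLanguage(textTracks, language) {
    for (var i = 0; i < textTracks.length; i++) {
        if (textTracks[i].language === language) {
            return textTracks[i];
        }
    }
    return null;
}

// In a page, usage might look like this:
//   var spanish = findTrackByLanguage(document.getElementById("mainvideo").textTracks, "es");
```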

 

The following example shows how to get a textTrackList object, which provides a list of all tracks of an element. The textTracks property on the video object returns the textTrackList object. The textTrackList object is an array of textTrack objects.

<!DOCTYPE html >

<html >
<head>
    <title>Track list example</title>
  <script type="text/javascript">
      function getTracks() {
         // get list of tracks
          var allTracks = document.getElementById("video1").textTracks;
         
         //append track label
          for (var i = 0; i < allTracks.length; i++) {
              document.getElementById("display").innerHTML += (allTracks[i].label + "<br/>");  
          }
      }
  </script>
</head>
<body>
<video id="video1" >
<source src="video.mp4">
  <track id="entrack" label="English subtitles" kind="captions" src="example.vtt" srclang="en">
  <track id="sptrack" label="Spanish subtitles" kind="captions" src="examplesp.vtt" srclang="es">
  <track id="detrack" label="German subtitles" kind="captions" src="examplede.vtt" srclang="de">
</video>
<button onclick="getTracks();">click</button>
<div id="display"></div>
</body>
</html>

The textTrackCueList and textTrackCue objects

The textTrack.cues property returns an array of textTrackCue objects. A textTrackCue object, or cue, includes an identifier, a start and end time, and other data.

Object Description

textTrackCueList

An array object that represents all the cues for a specific track.

var texttrackcuelist = trackelement.track.cues;

textTrackCue

Represents a cue in a track.

var texttrackcue = texttrackcuelist[i];  // where i == an index into the track cue list array.

 

These objects expose the following properties.

Property Description

textTrackCueList.item(index)

Returns the text track cue that corresponds to a given index.

textTrackCueList.length

Returns the number of text track cues in the list.

textTrack.cues

Returns a textTrackCueList object.

textTrack.activeCues

Returns the cues from the text track list of cues that are currently active, as a textTrackCueList object.

textTrackCue.startTime

Returns the starting time of a timed text cue.

textTrackCue.endTime

Returns the ending time of a timed text cue.

textTrackCue.id

Returns a unique identifier for an individual cue.

textTrackCue.pauseOnExit

Indicates whether the video should pause when it reaches the endTime specified.

textTrackCue.text

Returns the text value of a textTrackCue.

textTrackCue.track

Returns the textTrack object to which the textTrackCue belongs, or null otherwise.

 

These objects expose the following methods.

Method Description

textTrackCueList.getCueById(id)

Gets a cue from the textTrackCueList by its ID.

textTrackCue.getCueAsHTML()

Returns the text track cue text as a document fragment that consists of HTML elements and other Document Object Model (DOM) nodes.

 

The textTrackCue object exposes the following events.

Event Description

exit

Fires when a cue is done.

enter

Fires when a cue is active.
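
As a rough illustration of these two events, the sketch below records when a cue starts and stops. It assumes the cue accepts onenter and onexit handler properties, as described in the HTML5 specification; logCueEvents and the log array are hypothetical names used only for this example.

```javascript
// Attach enter and exit handlers to a cue and record each event in a log array.
function logCueEvents(cue, log) {
    cue.onenter = function () {
        log.push("enter " + cue.id);   // the cue has become active
    };
    cue.onexit = function () {
        log.push("exit " + cue.id);    // the cue is done
    };
}

// In a page, a cue could come from a loaded track, for example:
//   logCueEvents(document.getElementById("track").track.cues[0], []);
```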

 

Working with cues

Using the cues property of a track element's textTrack object (TrackElement.track.cues), you can get a list of all the cues on that track. The textTrack.cues property returns a textTrackCueList, an array-like list of textTrackCue objects. The textTrackCue object, or cue, includes an ID, the start and end time, and text.

Get the current cue

In contrast to the cues property, which gets all cues associated with a track, the activeCues property gets you just the ones that are currently being displayed. The following example displays the startTime and endTime of the subtitle being displayed.

<!DOCTYPE html >

<html >
<head>
    <title>Current cue example</title>
  <script type="text/javascript">
      function eID(elm) {
          return document.getElementById(elm);  // create short cut to getElementById()
      }

      // after elements are loaded, hook the cuechange event on the track element
      window.addEventListener("load", function () {
          eID("track").addEventListener("cuechange", function (e) {
              var myTrack = e.target.track;  // the target property is the track element
              var myCues = myTrack.activeCues;   // activeCues is a list of the current cues
              // display the start and end times, or clear the display when no cue is active
              eID("display").innerHTML = myCues.length > 0 ? myCues[0].startTime + " --> " + myCues[0].endTime : "";
          }, false);

      }, false);
     
  </script>
</head>
<body>
<video id="video1"  controls autoplay>
 <source src="movie.mp4"  >
 <track id='track' label='English captions' src='captions.vtt' kind='subtitles' srclang='en' default  > 
</video>
<div id="display"></div>
</body>
</html>
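
For comparison, the set of active cues can be approximated by hand from the full cues list and a playback time. This is a sketch, not a description of how activeCues is implemented: cuesActiveAt is a hypothetical helper, and treating the start time as inclusive and the end time as exclusive is an assumption.

```javascript
// Return the cues whose time range contains the given time (in seconds).
// cues can be a real textTrackCueList or any object with length and indexes.
function cuesActiveAt(cues, time) {
    var active = [];
    for (var i = 0; i < cues.length; i++) {
        if (cues[i].startTime <= time && time < cues[i].endTime) {
            active.push(cues[i]);
        }
    }
    return active;
}

// In a page, usage might look like this:
//   var video = document.getElementById("video1");
//   var now = cuesActiveAt(document.getElementById("track").track.cues, video.currentTime);
```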

Get all the cues

The following example gets all the cues associated with a track, and displays them in a select box. When you click an item in the box, that cue is played on the video. Use the search to filter the results to a specific keyword.

<!DOCTYPE html>
<html>
<head>
<title>All cues example</title>
<script>

function loadCaptions(track) {
// retrieve cues for track element
    var cues = track.track.cues;
    var list = document.getElementById('results');
    for (var i = 0; i < cues.length; i++)
    {
        var x = cues[i].getCueAsHTML(); //get the text of the cue
        var option = document.createElement("option"); // create an option in the select list
        option.text = x.textContent;        // assign the text to the option 
        option.setAttribute('data-time', cues[i].startTime);  // assign an attribute called data-time to option
        list.add(option);       // add the new option 
    }    
}

function playCaption(control) 
{
    var o = control.options[control.options.selectedIndex];  //get the option the user clicked
    var t = o.getAttribute('data-time');        // get the start time of that option
    var video = document.getElementById('video');   // get video element. 
    video.currentTime = t - 0.1;                //  move the video to start at the time we want (subtracting a fudge factor)
}

function search(text) 
{
    var cues = document.getElementById('track').track.cues;  // retrieve a list of cues from current track
    var list = document.getElementById('results');           // get the select box object
    list.innerHTML = '';                    // clear the select box 
    for (var i = 0; i < cues.length; i++) { // scan through list of cues
        var cuetext = cues[i].getCueAsHTML().textContent;  // get the text content of the current element in cue lists 
        if (cuetext.toLowerCase().indexOf(text.toLowerCase()) != -1)  // does the cue contain the key we're looking for? 
        {                                                             // if a match, create a new option to add to the select box
            var option = document.createElement("option");         
            option.text = cuetext;                                  
            option.setAttribute('data-time', cues[i].startTime);
            list.add(option);
        }
    }   
}

</script>
</head>

<body>
   <video id='video' controls autoplay loop>
     <source src='movie.mp4'>
     <track id='track' label='English captions' src='captions.vtt' kind='subtitles' srclang='en' default onload='loadCaptions(this)'>
   </video>
   <div>Search captions <input onkeyup='search(this.value)' /> </div>
   <div>Click a caption to jump to that time in the video</div>
   <div><select size='10' id='results' onchange='playCaption(this)'></select></div>

</body>
</html>
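
The matching step in search() can also be written as a function that doesn't touch the page, which makes it easier to reuse or check on its own. This is a minimal sketch: filterCues is a hypothetical helper, and it reads the text property of each cue instead of calling getCueAsHTML(), so any markup inside a cue would be matched as plain characters.

```javascript
// Return { text, startTime } entries for every cue whose text contains the
// keyword, ignoring case. An empty keyword matches every cue.
function filterCues(cues, keyword) {
    var needle = keyword.toLowerCase();
    var matches = [];
    for (var i = 0; i < cues.length; i++) {
        var cueText = cues[i].text;
        if (cueText.toLowerCase().indexOf(needle) !== -1) {
            matches.push({ text: cueText, startTime: cues[i].startTime });
        }
    }
    return matches;
}
```

Each returned entry holds what the select box needs: the caption text for the option and the start time for its data-time attribute.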

API Reference

HTML5 Audio and Video

Samples and tutorials

HTML5 Timed Text Track sample

Make your videos accessible with Timed Text Tracks

Internet Explorer Test Drive demos

HTML5 Video Caption Maker

IE10 Video Captioning

IEBlog posts

HTML5 Video Captioning

Specification

HTML5: Section 4.8.10.12.5

Related topics

Audio

Video