Simple Delivery Profile (SDP) caption styling

You can position and style captions on your HTML5 videos using Simple Delivery Profile (SDP).

In Internet Explorer 11, you can use SDP to style and position closed-caption text on your video. SDP is a new W3C specification for closed captions that lets you place text anywhere on a video frame and style it for better visibility or emphasis.

SDP is based on Timed Text Markup Language (TTML). With it, you can:

  • Change text color.
  • Create solid colored backgrounds.
  • Change fonts, font size, and font styles.
  • Animate text on the screen.

Using these new capabilities, you can:

  • Simulate speech balloons when multiple speakers are shown.
  • Set colors for text that will show up better against light or dark backgrounds.
  • Reveal text a letter at a time for more emphasis.

Writing SDP text for IE11

Writing SDP-based TTML code for IE11 isn't much different from writing TTML caption code for Internet Explorer 10.

The header information is similar to earlier versions, and includes the following elements for namespace and version:

  • The first line declares that this is an XML file: <?xml version="1.0" encoding="utf-8"?>

  • The next line opens the <tt> element, which sets the language and declares the namespaces.

    <tt xml:lang="en-us" xmlns="http://www.w3.org/ns/ttml" 
       xmlns:s="http://www.w3.org/ns/ttml#styling"  
       xmlns:p="http://www.w3.org/ns/ttml#parameter">
    

The <head> area of the document contains the profile, styling, and layout sections. The profile section uses a <p:profile> element whose use attribute names the profile:

<p:profile use="http://www.w3.org/ns/ttml/profile/sdp-us"/>
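As a rough sketch, the three sections fit together inside <head> as follows. The namespace declarations are repeated on <head> only so that the fragment is well-formed on its own; in a complete document they sit on the <tt> element, as shown earlier.

```xml
<!-- Skeleton of the <head> area: profile first, then styling, then layout. -->
<head xmlns="http://www.w3.org/ns/ttml"
      xmlns:p="http://www.w3.org/ns/ttml#parameter">
  <p:profile use="http://www.w3.org/ns/ttml/profile/sdp-us"/>
  <styling>
    <!-- style elements go here -->
  </styling>
  <layout>
    <!-- region elements go here -->
  </layout>
</head>
```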

The styling section defines the document's text styles. Styles include:

  • Id for the style.
  • Display preference (auto or none).
  • Text color.
  • Text alignment (left, center, right).
  • Outline (or stroke) color and width.
  • Background color and transparency.
  • Origin (x/y position) of text, expressed as percentages (e.g., 20% 78%).
  • Extent (width and height of the text box or background), expressed as percentages (e.g., 75% 10%).
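To make the list concrete, here is a minimal sketch of a single style that sets each of these properties. The id speakerOneStyle and all of the values are hypothetical and chosen only for illustration. The namespaces are declared on <styling> so that the fragment stands alone; in a full document they are declared on <tt>.

```xml
<styling xmlns="http://www.w3.org/ns/ttml"
         xmlns:s="http://www.w3.org/ns/ttml#styling">
  <!-- "speakerOneStyle" is a made-up id used only for this sketch -->
  <style xml:id="speakerOneStyle"
         s:display="auto"
         s:color="yellow"
         s:textAlign="center"
         s:textOutline="black 1px"
         s:backgroundColor="#00000088"
         s:origin="10% 75%"
         s:extent="75% 15%"/>
</styling>
```

The eight-digit background color follows the #rrggbbaa form, so the last two digits set the transparency of the background box.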

The layout section defines up to four regions. A region is the basic unit of location on the screen. Each region can specify:

  • An id for the region.
  • A begin and end time for roll-up text.
  • A display preference, often used to turn on specific text when the style has it turned off.
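The following sketch shows one way these region attributes might be combined. It assumes a style named hiddenStyle (hypothetical, and not defined in this fragment) that sets s:display="none" in the styling section; the region overrides that value inline and is active only between its begin and end times. How a given player treats timed regions may vary, so treat this as an illustration rather than a tested recipe.

```xml
<layout xmlns="http://www.w3.org/ns/ttml"
        xmlns:s="http://www.w3.org/ns/ttml#styling">
  <!-- "rollUp" and "hiddenStyle" are made-up ids used only for this sketch -->
  <region xml:id="rollUp"
          style="hiddenStyle"
          begin="00:00:05.000"
          end="00:00:15.000"
          s:display="auto"/>
</layout>
```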

These are the basic tags to get you started. The W3C specification TTML Simple Delivery Profile for Closed Captions (US) describes a number of other style and layout options.

The W3C spec sets a few rules that you must follow when you create styled closed captions. The most common issues are:

  • All documents must have tt, head, and body elements.
  • All documents must have both a styling element and a layout element.
  • You can't nest a div inside another div, or a span inside another span; only one level of each element is allowed. A span can still appear inside a p that is a child of a div.
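A minimal document that follows these rules might look like the sketch below. The ids baseStyle, emphasis, and bottom, and the caption text, are placeholders invented for this illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<tt xml:lang="en-us" xmlns="http://www.w3.org/ns/ttml"
    xmlns:s="http://www.w3.org/ns/ttml#styling"
    xmlns:p="http://www.w3.org/ns/ttml#parameter">
  <head>
    <p:profile use="http://www.w3.org/ns/ttml/profile/sdp-us"/>
    <!-- Both styling and layout are present, as the rules require -->
    <styling>
      <style xml:id="baseStyle" s:color="white" s:textAlign="center"
             s:origin="10% 80%" s:extent="80% 15%"/>
      <style xml:id="emphasis" s:fontStyle="italic"/>
    </styling>
    <layout>
      <region xml:id="bottom" style="baseStyle"/>
    </layout>
  </head>
  <body>
    <!-- One div, one level of span inside a p: nothing is nested in itself -->
    <div>
      <p region="bottom" begin="00:00:00.000" end="00:00:05.000">
        A caption with <span style="emphasis">one level</span> of span.
      </p>
    </div>
  </body>
</tt>
```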

Note  A document must not contain two elements for which the lexical order of the elements is different from the temporal order of the elements.

In other words, arrange your cues chronologically by begin time, and never put a cue with a later begin time before one with an earlier begin time. For example:

This is the right way:

<p region="bottomMid" begin='00:00:00.101' end='00:00:05.000'> This is the first caption</p>
<p region="topMid" begin='00:00:05.000' end='00:00:10.000'> This is the second caption</p>
<p region="topLeft" begin='00:00:10.000' end='00:00:15.000'> This is the third caption</p>
<p region="bottomRight" begin='00:00:15.000' end='00:00:20.000'> This is the fourth caption</p> 

This is the wrong way:

<p region="bottomMid" begin='00:00:00.101' end='00:00:05.000'> This is the first caption</p>
<p region="topLeft" begin='00:00:10.000' end='00:00:15.000'> This is the third caption</p>
<p region="topMid" begin='00:00:05.000' end='00:00:10.000'> This is the second caption</p>
<p region="bottomRight" begin='00:00:15.000' end='00:00:20.000'> This is the fourth caption</p> 

In this example, the third caption comes before the second one. While the full TTML spec allows the captions to be placed in any order, SDP requires the begin values to be in chronological order. The end values can be in any order.
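To illustrate the last point, the sketch below keeps the begin values in ascending order while the first cue ends after the second one. The region names are taken from the examples above, the caption text is made up, and the cues are wrapped in a <div> with a namespace declaration only so that the fragment is well-formed on its own.

```xml
<div xmlns="http://www.w3.org/ns/ttml">
  <!-- begin values ascend: 0 s, then 5 s -->
  <!-- end values do not: the first cue ends at 12 s, the second at 8 s -->
  <p region="bottomMid" begin="00:00:00.000" end="00:00:12.000">A long-running caption</p>
  <p region="topMid" begin="00:00:05.000" end="00:00:08.000">A short caption that ends first</p>
</div>
```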

Using SDP in your apps

Here's an example that uses SDP to style and position text. The example consists of two files. The first is the HTML file that shows the video and reads the track file. The second is saved as a .ttml file and contains the captions in SDP format.

This HTML example loads a generic video and the track file. The captions aren't synced to a specific video, so you can substitute any MP4 file for testing.

<!DOCTYPE html>
<html>
    <head>
        <title>SDP Test</title>
        <meta http-equiv="X-UA-Compatible" content="IE=edge" />
    </head>
    <body>
        <video src="video.mp4" controls muted autoplay width="800">
            <track src="SDPTest.ttml" label="SDP Examples" default/>
        </video>
    </body>
</html>

This example is the SDPTest.ttml file, which positions four differently colored messages on the video screen at 5-second intervals.

<?xml version="1.0" encoding="utf-8"?>
<tt xml:lang="en-us" xmlns="http://www.w3.org/ns/ttml" 
    xmlns:s="http://www.w3.org/ns/ttml#styling" 
    xmlns:p="http://www.w3.org/ns/ttml#parameter">
  <head>
    <p:profile use="http://www.w3.org/ns/ttml/profile/sdp-us"/>
    <styling>
      <!-- define styles for text color and position --> 
      <style xml:id="bottomMidStyle" s:textAlign="center" s:textOutline="red 1px" s:backgroundColor="#ff000044" 
       s:color="#ffffffff" s:origin='20% 78%' s:extent='30% 10%'/>
      <style xml:id="topMidStyle" s:textAlign="center" s:textOutline="black 1px" s:backgroundColor="#00ff0088" 
       s:color="#ff11ffff" s:origin='20% 40%' s:extent='60% 18%'/>
      <style xml:id="topLeftStyle" s:textAlign="left" s:textOutline="blue 1px" s:backgroundColor="transparent" 
       s:color="#ff11ffff" s:origin='10% 10%' s:extent='30% 10%'/>
      <style xml:id="bottomRightStyle" s:textAlign="right" s:textOutline="black 1px" s:backgroundColor="white" 
       s:color="green" s:origin='70% 70%' s:extent='30% 10%'/>
    </styling>

    <layout>
      <!-- define regions for locating text -->
      <region xml:id="bottomMid" style="bottomMidStyle" />
      <region xml:id="topMid" style="topMidStyle" />
      <region xml:id="topLeft" style="topLeftStyle" />
      <region xml:id="bottomRight" style="bottomRightStyle" />
    </layout>
  </head>
  <body>
    <div>
      <p region="bottomMid" begin='00:00:00.101' end='00:00:05.000'> This is a Pop-up caption</p>
      <p region="topMid" begin='00:00:05.000' end='00:00:10.000'> This is another Pop-up caption</p>
      <p region="topLeft" begin='00:00:10.000' end='00:00:15.000'> Hello from up top</p>
      <p region="bottomRight" begin='00:00:15.000' end='00:00:20.000'> And back down</p> 
   </div>
  </body>
</tt>

Enabling Professional-Quality Online Video: New Specifications for Interoperable Captioning

TTML Simple Delivery Profile for Closed Captions (US)