CEA-708
CEA-708 is the standard for closed captioning for ATSC digital television (DTV) streams in the United States and Canada. It was developed by the Electronic Industries Alliance.
Unlike run-length-encoded (RLE) DVB and DVD subtitles, CEA-708 captions are low-bandwidth and textual, like traditional EIA-608 captions and EBU Teletext subtitles. However, unlike EIA-608 byte pairs, CEA-708 captions cannot be modulated onto an ATSC receiver's NTSC VBI line 21 composite output and must be pre-rendered by the receiver with the digital video frames. They also include more of the Latin-1 character set, stubs to support full UTF-32 captions, and downloadable fonts. CEA-708 caption streams can also optionally encapsulate EIA-608 byte pairs internally, a fairly common usage.[1]
CEA-708 captions are injected into MPEG-2 video streams in the picture user data. The packets are in picture order, and must be rearranged just like picture frames are. This is known as the DTVCC Transport Stream. It is a fixed-bandwidth channel that has 960 bit/s typically allocated for backward compatible "encapsulated" Line 21 captions, and 8640 bit/s allocated for CEA-708 captions, for a total of 9600 bit/s.[2] The ATSC A/53 Standard contains the encoding specifics. The main form of signalling is a PSIP caption descriptor, which indicates the language of each caption service and whether it is formatted for "easy reader" (3rd grade level, for language learners). The descriptor is carried in the PSIP EIT on a per-event basis and optionally in the H.222 PMT, but only if the video always sends caption data.
CEA-708 caption decoders are required in the U.S. by FCC regulation in all 13" (33 cm) diagonal or larger digital televisions. Further, some broadcasters are required by FCC regulations to caption a percentage of their broadcasts.
Packets in CEA-708
Caption streams are transmitted inside several layers of packet wrappers. From the outside in, these are the picture user data, which contains the caption data, which contains the cc_data, which contains the Caption Channel packets, which contain the Service Blocks, which contain the caption streams. These packets are described in detail in this section; the streams themselves are described in the following sections.
This layering is based on the OSI Protocol Reference Model:
OSI Layers | DTVCC Layers | Comments |
---|---|---|
Application | Interpretation | Issuing commands and appending text to windows |
Presentation | Coding | Breaking up individual commands and characters |
Session | Service | Service Block Packets |
-- | Packet | DTVCC Packet assembly from cc_data Packets |
Transport | Injection | cc_data Packets extracted from video frames |
Network | unused | directly connected link |
Link | SMPTE 259M or H.222 or MXF | video frames split from link format |
Physical | SDI or 8VSB | link format demodulated from transmission |
This section describes the various packets; the Coding and Interpretation layers are described in the remainder of this document.
Picture User Data
These are inserted before a SMPTE 259M active video frame or video packet. Common video packets are a picture header, a picture parameter set and a Material Exchange Format essence.
Length | Name | Type | Value |
---|---|---|---|
32 bits | user_data_start_code | patterned bslbf | 0x000001B2[3] |
32 bits | user_identifier | ASCII bslbf | GA94[4] |
8 bits | user_data_type_code | uimsbf | 3 |
X*8 bits | user_data_type_structure | binary | free form |
bslbf: bit string, left bit first ; uimsbf: unsigned integer, most significant bit first
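As a minimal sketch of the table above, the following Python function checks the three fixed fields of an MPEG-2 picture user data block and returns the bytes that follow them. The function name and the choice to return `None` for non-caption user data are illustrative assumptions, not part of any standard API.

```python
import struct

def parse_mpeg2_user_data(buf: bytes):
    """Return the user_data_type_structure bytes if buf is ATSC caption
    user data (start code 0x000001B2, "GA94", type code 3), else None.
    Hypothetical helper for illustration only."""
    if len(buf) < 9:
        return None
    # 32-bit start code, 4 ASCII bytes, 8-bit type code, all big-endian.
    start_code, identifier, type_code = struct.unpack(">I4sB", buf[:9])
    if start_code != 0x000001B2 or identifier != b"GA94" or type_code != 3:
        return None
    return buf[9:]
```

A caller would typically run this on every user data block found in a picture and ignore blocks for which it returns `None`.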
Length | Name | Type | Value |
---|---|---|---|
8-16 bits | nal_unit | patterned bslbf | 6 in 8-bits for H.264 39 in 16-bits for H.265 |
8 bits | payloadType | uimsbf | 4 |
8 bits | payloadSize | uimsbf | variable |
8 bits | itu_t_t35_country_code | uimsbf | 181 |
16 bits | itu_t_t35_provider_code | uimsbf | 49 or 47 |
32 bits | ATSC_user_identifier (only if provider is 49) | ASCII bslbf | GA94 |
8 bits | ATSC1_data_user_data_type_code (only if provider is 47 or 49) | uimsbf | 3 |
8 bits | DIRECTV_user_data_length (only if provider is 47) | uimsbf | variable |
X*8 bits | user_data_type_structure | binary | free form |
bslbf: bit string, left bit first ; uimsbf: unsigned integer, most significant bit first
NOTE: the SEI depending on the encoder can contain more payloads than just the captions, so one would need to navigate all payloadTypes contained within.
When the itu_t_t35_country_code is set to 181, the itu_t_t35_provider_code defines U.S. maintained manufacturers.
The itu_t_t35_provider_code for U.S. maintained manufacturers, when set to 47 defines DirecTV user_data and set to 49 defines ATSC user_data.
The ATSC_user_identifier code for ATSC1_data is "GA94" and for EBU AFD_data is "DTG1".
If the ATSC1_user_data_type_code is not 3 for DTV CC, or 4 for SCTE EIA-608, or 5 for SCTE pulse-amplitude modulated luma samples, or 6 for EBU bar data, then the packet will be terminated with the bytes 0x0, 0x0, 0x1.
Length | Name | Type | Default |
---|---|---|---|
16 or 128 bits | ancillary_flag or ancillary_header | patterned bslbf or 7 uimsbf | 0xFFFF or varies |
8 bits | data_id | uimsbf | 97 (0x61) |
8 bits | secondary_data_id | uimsbf | 1 |
8 bits | data_count | uimsbf | 78 (0x4E) |
16 bits | cdp_id | uimsbf | 0x9669 |
8 bits | cdp_data_count | uimsbf | 78 (0x4E) |
4 bits | cdp_framing_rate (30000/1001 = 4) | uimsbf | 4 |
4 bits | cdp_reserved | uimsbf | 15 (0xF) |
1 bit | cdp_timecode_added | flag | 0 |
1 bit | cdp_data_block_added | flag | 1 |
1 bit | cdp_service_info_added | flag | 0 |
1 bit | cdp_service_info_start | flag | 0 |
1 bit | cdp_service_info_changed | flag | 0 |
1 bit | cdp_service_info_end | flag | 0 |
1 bit | cdp_contains_captions | flag | 1 |
1 bit | cdp_reserved | flag | 1 |
16 bits | cdp_counter | uimsbf | varies |
8 bits | cdp_data_section | uimsbf | 0x72 |
X*8 bits | user_data_type_structure | binary | free form |
8 bits | cdp_footer_section | uimsbf | 0x74 |
16 bits | cdp_counter | uimsbf | varies |
8 bits | cdp_checksum | uimsbf | varies |
bslbf: bit string, left bit first ; uimsbf: unsigned integer, most significant bit first
This structure was designed for any digital audio or metadata that is to be synchronized with a video frame. SDI transports every eight bits in a 10-bit aligned packet, whereas MXF is byte aligned and replaces the ancillary flag bytes with a 128-bit header. If cdp_timecode_added is true, then a five-byte SMPTE timecode section is inserted before the cdp_data_section. If cdp_service_info_added is true, then a two-byte header and seven bytes per service listing of caption services are inserted after the cdp_data_section. The cdp_framing_rate can be set to one of the following enumerations: 1 for 24000/1001, 2 for 24, 3 for 25, 4 for 30000/1001, 5 for 30, 6 for 50, 7 for 60000/1001 and 8 for 60 frames per second.
The cdp_timecode is used when the CDP data stream is discontinuous (i.e., not padded), and the cdp_service_info is used to add extra details to the PSIP broadcast metadata, such as language code, easy reader and widescreen usage.
The cdp_checksum is the value necessary to make the arithmetic sum of the entire packet (first byte of cdp_id to cdp_checksum, inclusive) modulo 256 equal zero.
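The checksum rule can be sketched in Python as follows. The function names are hypothetical; the input is assumed to be the packet bytes starting at the first byte of cdp_id.

```python
def cdp_checksum(packet_without_checksum: bytes) -> int:
    """Value that makes the byte sum of the whole packet zero modulo 256."""
    return (-sum(packet_without_checksum)) % 256

def cdp_is_valid(packet: bytes) -> bool:
    """True if the packet, including its trailing checksum byte, sums to
    zero modulo 256."""
    return sum(packet) % 256 == 0
```

For example, a body of `96 69 0B` sums to 266, which is 10 modulo 256, so the checksum byte would be 246 (0xF6).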
Length | Name | Type | Default |
---|---|---|---|
1 bit (b7) | process_em_data_flag | flag | 1 |
1 bit (b6) | process_cc_data_flag | flag | 1 |
1 bit (b5) | additional_data_flag | flag | 0 |
5 bits (b0-b4) | cc_count | uimsbf | variable |
8 bits | em_data (not in CDP data) | uimsbf | 255 |
cc_count*24 bits | cc_data_pkt's | bslbf | free form |
8 bits | marker_bits (not in CDP data) | patterned bslbf | 255 |
24+ bits | ATSC_reserved_user_data (not in CDP data) | bslbf | free form |
Marker bits and reserved bits should all be set by default. If the additional_data_flag is set, then the ATSC_reserved_user_data will be at the tail of the packet, terminated by the bytes 0x0, 0x0, 0x1. At some future time the process_em_data_flag will indicate whether to process the em_data bit string; its meaning has not yet been defined in the ATSC A/53 standard.
If the process_cc_data_flag is set, the cc_data_pkt's should be parsed as follows:
Closed Caption Data Packet (cc_data_pkt)
3 bytes total:
Length | Name | Type | Default |
---|---|---|---|
5 bits (b7-b3) | marker_bits (all 1's) | patterned bslbf | 31 |
1 bit (b2) | cc_valid | flag | 1 |
2 bits (b1-b0) | cc_type | bslbf | 0 |
8 bits | cc_data_1 | bslbf | DTVCC free form/EIA-608 byte 1 |
8 bits | cc_data_2 | bslbf | DTVCC free form/EIA-608 byte 2 |
If cc_valid is not set, the cc_data_pkt's should be considered padding and discarded. If it is set, cc_type will be one of four values: NTSC_CC_FIELD_1 = 0, NTSC_CC_FIELD_2 = 1, DTVCC_PACKET_DATA = 2, DTVCC_PACKET_START = 3. If it is either 0 or 1, the cc_data fields should be interpreted as EIA-608 captions (allowing for 4 total captions, as EIA-608 does). If cc_type is 3, then a decoder should begin assembling a Caption Channel Packet with the cc_data as described below, and if the cc_type is 2 it should append the cc_data to any Caption Channel Packet being assembled. If a DTVCC packet is already being assembled and either cc_valid is set and the cc_type is 3, or cc_valid is clear and cc_type is 2 or 3, then the packet should be considered complete.
NOTE: In a caption decoder, cc_data packets must be reassembled in the correct order to create the DTVCC packets. The standard is not clear on this, but it appears this should be in frame display order, not encoded frame order. This means that in an encoder, DTVCC Packets should probably be broken up and inserted into the picture user data as cc_data packets in display order as well.
NOTE: To avoid this bug in the CEA-708 standard some encoders encode captions only on one frame type, such as only P frames, or only I frames, since if only one frame type is used, the frame display and frame encoded order are the same.
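The cc_valid and cc_type rules above can be sketched in Python as follows. This is an illustrative sketch, not a reference decoder: the function name, the return shape and the `pending` argument (a partially assembled packet carried over from the previous frame) are assumptions, and the input is assumed to already be in display order.

```python
NTSC_CC_FIELD_1, NTSC_CC_FIELD_2, DTVCC_PACKET_DATA, DTVCC_PACKET_START = range(4)

def split_cc_data(pkts: bytes, pending=None):
    """Split a run of 3-byte cc_data_pkt's into EIA-608 byte pairs and
    completed DTVCC packets. Returns (eia608, packets, pending)."""
    eia608 = []    # (field, byte1, byte2) tuples
    packets = []   # completed DTVCC packets as bytes
    for i in range(0, len(pkts) - 2, 3):
        header, d1, d2 = pkts[i], pkts[i + 1], pkts[i + 2]
        cc_valid = bool(header & 0x04)
        cc_type = header & 0x03
        if cc_type in (NTSC_CC_FIELD_1, NTSC_CC_FIELD_2):
            if cc_valid:               # invalid pairs are padding
                eia608.append((cc_type, d1, d2))
        elif cc_valid and cc_type == DTVCC_PACKET_START:
            if pending is not None:    # a new start completes the old packet
                packets.append(bytes(pending))
            pending = bytearray((d1, d2))
        elif cc_valid:                 # DTVCC_PACKET_DATA
            if pending is not None:    # data with no start is dropped
                pending += bytes((d1, d2))
        elif pending is not None:      # invalid type 2 or 3 completes the packet
            packets.append(bytes(pending))
            pending = None
    return eia608, packets, pending
```

Passing the returned `pending` value into the next call lets a packet span several video frames.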
DTVCC packet (cc_data_1/cc_data_2)
Length | Name | Type | Default |
---|---|---|---|
2 bits | sequence_number | uimsbf | 0 |
6 bits | packet_size (if 0, packet_size is 64) | uimsbf | variable |
(packet_size * 2 - 1) * 8 bits | packet_data | binary | free form |
Within the packet_data, there is only one type of packet. This is known as the Service Block. This further subdivides the DTVCC Transport Stream into 63 substreams, each of which describes a discrete captioning service. Service 1 is designated as the Primary Caption Service, while Service 2 is the Secondary Language Service. The Caption Descriptor describes any other services offered. packet_size defines the number of two byte blocks that follow with odd blocks padded with a NULL byte.
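The header byte of a DTVCC packet can be decoded as in the following sketch, which follows the table above: a packet_size of 0 stands for 64, and the packet_data occupies packet_size * 2 - 1 bytes. The function name is hypothetical.

```python
def parse_dtvcc_header(first_byte: int):
    """Return (sequence_number, packet_data_length_in_bytes)."""
    sequence_number = first_byte >> 6     # top 2 bits
    packet_size = first_byte & 0x3F       # low 6 bits
    if packet_size == 0:
        packet_size = 64                  # 0 encodes the maximum size
    return sequence_number, packet_size * 2 - 1
```

The largest packet therefore carries 127 bytes of packet_data after the one-byte header.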
Service Block Packet (packet_data)
Length | Name | Type | Default |
---|---|---|---|
3 bits | service_number | uimsbf | 1 |
5 bits | block_size | uimsbf | variable |
2 bits | null_fill (only if service_number is 7) | byte align | 0 |
6 bits | extended_service_number (only if service_number is 7) | uimsbf | variable |
block_size*8 bits | block_data (when block_size > 0) | uimsbf | free form |
If service_number is 7, then the extended_service_number is added and used instead of the service_number. If block_size is 0, the service_number must be zero as well with no block_data present. This is known as a Null Service Block Header, which is used for padding the packet, when no captions are sent.
Note: Service Blocks may not cross Caption Channel Packet Boundaries. This means each Caption Channel Packet can be parsed without keeping any state for the Service Blocks themselves.
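A Service Block parser following the table and rules above might look like the sketch below. The function name and the decision to stop quietly on truncated input are assumptions made for illustration.

```python
def parse_service_blocks(packet_data: bytes):
    """Return a list of (service_number, block_data) from one packet_data."""
    blocks = []
    pos = 0
    while pos < len(packet_data):
        header = packet_data[pos]
        pos += 1
        service_number = header >> 5      # top 3 bits
        block_size = header & 0x1F        # low 5 bits
        if service_number == 7:           # extended header byte follows
            if pos >= len(packet_data):
                break                     # truncated input
            service_number = packet_data[pos] & 0x3F
            pos += 1
        if block_size == 0:
            break                         # Null Service Block Header: padding
        if pos + block_size > len(packet_data):
            break                         # truncated input
        blocks.append((service_number, packet_data[pos:pos + block_size]))
        pos += block_size
    return blocks
```

Because Service Blocks do not cross Caption Channel Packet boundaries, the function needs no state between calls.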
Caption Stream Encoding (block_data)
The 63 caption service sub-streams contain a mixed command and text stream, much like Telnet. There are four logical code sub-groups: CL, GL, CR, and GR. These each have single and multi-character code sets.
CL Group: C0 | 0x00-0x1F | Subset of ASCII Control Codes |
CR Group: C1 | 0x80-0x9F | Caption Control Codes |
CL Group: C2 | 0x1000-0x101F | Extended Control Code Set 1 |
CR Group: C3 | 0x1080-0x109F | Extended Control Code Set 2 |
GL Group: G0 | 0x20-0x7F | Modified version of ANSI X3.4 Printable Character Set (ASCII) |
GR Group: G1 | 0xA0-0xFF | ISO 8859-1 Latin 1 Characters |
GL Group: G2 | 0x1020-0x107F | Extended Miscellaneous Characters |
GR Group: G3 | 0x10A0-0x10FF | Future characters and icons |
Whenever a command character is seen any text accumulated in the parser should be flushed. Since text might need to be flushed when there is no command pending, there is a null command known as the ETX command in the C0 command set. There are also two special commands, the Reset and DelayCancel. These must be parsed with lookahead. A Delay command issued previously can be canceled at any time with a DelayCancel command, so once a Delay is seen a decoder must look ahead for a DelayCancel, and only look for a DelayCancel. A Reset command on the other hand is sent to break out from an unknown decoder state and all data before it must be ignored.
C0 Table
0x00 | 0x01 | 0x02 | 0x03 | 0x04 | 0x05 | 0x06 | 0x07 | 0x08 | 0x09 | 0x0a | 0x0b | 0x0c | 0x0d | 0x0e | 0x0f | |
0x00 | NUL |  |  | ETX |  |  |  |  | BS |  |  |  | FF | CR | HCR |  |
0x10 | EXT1 |  |  |  |  |  |  |  | P16 |  |  |  |  |  |  |  |
NUL, BS, FF, and CR are interpreted as they are in ASCII control codes. HCR moves the pen location to the beginning of the current line and deletes its contents. FF clears the screen and moves the pen location to (0,0). ETX is the NULL command mentioned earlier, which is used to flush text to the current window when no other command is pending. EXT1 is used to escape to the 'C2', 'C3', 'G2', and 'G3' tables for the following byte. Finally, P16 can be used to escape the next two bytes for Chinese and other large character maps.
All characters in the range 0x10-0x17, which currently includes EXT1, are followed by one byte that needs to be interpreted differently. Likewise, all characters in the range 0x18-0x1f, which currently includes P16, are followed by two bytes that need to be interpreted differently. If a decoder encounters one of these and does not know what to do, it should still skip the next byte or two, as appropriate, before continuing.
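The skipping rule for the C0 range can be expressed as a small helper. This is a sketch with a hypothetical function name; it returns the total number of bytes a C0 code occupies, including the code byte itself.

```python
def c0_code_length(code: int) -> int:
    """Total length in bytes of a C0 code, including the code byte."""
    if not 0x00 <= code <= 0x1F:
        raise ValueError("not a C0 code")
    if code <= 0x0F:
        return 1      # single-byte codes such as NUL, ETX, BS, FF, CR, HCR
    if code <= 0x17:
        return 2      # 0x10-0x17, e.g. EXT1: one following byte
    return 3          # 0x18-0x1f, e.g. P16: two following bytes
```

A decoder that does not recognise a code can advance its read position by this length and continue.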
C1 Table
0x00 | 0x01 | 0x02 | 0x03 | 0x04 | 0x05 | 0x06 | 0x07 | 0x08 | 0x09 | 0x0a | 0x0b | 0x0c | 0x0d | 0x0e | 0x0f | |
0x80 | CW0 | CW1 | CW2 | CW3 | CW4 | CW5 | CW6 | CW7 | CLW | DSW | HDW | TGW | DLW | DLY | DLC | RST |
0x90 | SPA | SPC | SPL |  |  |  |  | SWA | DF0 | DF1 | DF2 | DF3 | DF4 | DF5 | DF6 | DF7 |
The C1 Table contains all the currently defined caption commands. These will be described in detail in the next section.
C2 Table
The C2 Table contains no commands as of CEA-708 revision A. However, if a command is seen in these code sets a decoder must skip an appropriate number of the following bytes.
0x00-0x07 | +0 bytes |
0x08-0x0f | +1 byte |
0x10-0x17 | +2 bytes |
0x18-0x1f | +3 bytes |
C3 Table
The C3 Table contains no commands as of CEA-708 revision A. However, if a command is seen in these code sets, a decoder must skip an appropriate number of the following bytes.
0x80-0x87 | +4 bytes |
0x88-0x8f | +5 bytes |
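The two skip tables for the C2 and C3 code sets can be combined into one helper, sketched below. The function name is hypothetical, `code` is the byte that follows EXT1, and `None` is returned for values outside the ranges listed above.

```python
def ext1_control_skip(code: int):
    """Number of bytes to skip after an unknown C2 or C3 code that
    followed EXT1, or None if the code is outside the listed ranges."""
    if 0x00 <= code <= 0x1F:                  # C2: +0, +1, +2 or +3 bytes
        return code >> 3
    if 0x80 <= code <= 0x8F:                  # C3: +4 or +5 bytes
        return 4 + ((code - 0x80) >> 3)
    return None
```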
G0 Table
0x00 | 0x01 | 0x02 | 0x03 | 0x04 | 0x05 | 0x06 | 0x07 | 0x08 | 0x09 | 0x0a | 0x0b | 0x0c | 0x0d | 0x0e | 0x0f | |
0x20 | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
0x30 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
0x40 | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
0x50 | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
0x60 | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
0x70 | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | MN |
The G0 Table consists of ASCII characters for the most part. SP here is shorthand for Space. MN is a musical note, which replaces the Delete command code in ASCII, and can be any of "♩", "♪", "♫" or "♬", depending on the receiver manufacturer.
G1 Table
0x00 | 0x01 | 0x02 | 0x03 | 0x04 | 0x05 | 0x06 | 0x07 | 0x08 | 0x09 | 0x0a | 0x0b | 0x0c | 0x0d | 0x0e | 0x0f | |
0xa0 | NBS | ¡ | ¢ | £ | ¤ | ¥ | ¦ | § | ¨ | © | ª | « | ¬ | - | ® | ¯ |
0xb0 | ° | ± | ² | ³ | ´ | µ | ¶ | · | ¸ | ¹ | º | » | ¼ | ½ | ¾ | ¿ |
0xc0 | À | Á | Â | Ã | Ä | Å | Æ | Ç | È | É | Ê | Ë | Ì | Í | Î | Ï |
0xd0 | Ð | Ñ | Ò | Ó | Ô | Õ | Ö | × | Ø | Ù | Ú | Û | Ü | Ý | Þ | ß |
0xe0 | à | á | â | ã | ä | å | æ | ç | è | é | ê | ë | ì | í | î | ï |
0xf0 | ð | ñ | ò | ó | ô | õ | ö | ÷ | ø | ù | ú | û | ü | ý | þ | ÿ |
The G1 Table is basically the ISO 8859-1 Latin-1 character set. Note character 0xa0 is the non-breaking space, which is to be used to prevent word wrap from separating two words onto separate lines.
G2 Table
0x00 | 0x01 | 0x02 | 0x03 | 0x04 | 0x05 | 0x06 | 0x07 | 0x08 | 0x09 | 0x0a | 0x0b | 0x0c | 0x0d | 0x0e | 0x0f | |
0x20 | TSP | NBTSP |  |  |  | … |  |  |  |  | Š |  | Œ |  |  |  |
0x30 | BLK | ‘ | ’ | “ | ” | • |  |  |  | ™ | š |  | œ | ℠ |  | Ÿ |
0x40 |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
0x50 |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
0x60 |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
0x70 |  |  |  |  |  |  | ⅛ | ⅜ | ⅝ | ⅞ | │ | ┐ | └ | ─ | ┘ | ┌ |
TSP and NBTSP are the Transparent Space, and Non-Breaking Transparent Space, respectively. The G2 Table contains miscellaneous characters that may not be displayed in all browsers. BLK indicates a solid block which fills the entire character block with a solid foreground color.
G3 Table
The G3 Table contains only a single character, the [CC] Icon, with square corners. This character is at 0xa0.
Caption commands
Command | Bits | Command Name | Parameters |
---|---|---|---|
ETX 0x03 | 8 | EndOfText | |
CW0–CW7 0x80–0x87 | 8 | SetCurrentWindow0–7 | |
CLW 0x88 | 16 | ClearWindows | window bitmap |
DSW 0x89 | 16 | DisplayWindows | window bitmap |
HDW 0x8A | 16 | HideWindows | window bitmap |
TGW 0x8B | 16 | ToggleWindows | window bitmap |
DLW 0x8C | 16 | DeleteWindows | window bitmap |
DLY 0x8D | 16 | Delay | tenths of seconds |
DLC 0x8E | 8 | DelayCancel | |
RST 0x8F | 8 | Reset | |
SPA 0x90 | 24 | SetPenAttributes | pen size, font, scripting, italics, underline |
SPC 0x91 | 32 | SetPenColor | foreground color, foreground opacity, background color, background opacity, edge color, edge type |
SPL 0x92 | 24 | SetPenLocation | row, column |
SWA 0x97 | 40 | SetWindowAttributes | justify, print direction, scroll direction, word wrap, display effect, effect direction, effect rate, fill color, border color, border type, opacity |
DF0–DF7 0x98–0x9F | 56 | DefineWindow0–7 | priority, anchor number, anchor vertical, anchor horizontal, row count, column count, locked, visible, centered, style ID |
EndOfText (0x03)
The EndOfText command is a Null Command which can be used to flush any buffered text to the current window. All commands force a flush of any buffered text to the current window, so this command is only needed when no other command is pending.
SetCurrentWindow0-7 (0x80-0x87)
SetCurrentWindow tells the caption decoder which window the following commands describe: SetWindowAttributes, SetPenAttributes, SetPenColor, SetPenLocation. If the window specified has not already been created with a DefineWindow command then SetCurrentWindow and the window property commands can be safely ignored.
ClearWindows (0x88 + 1 byte)
ClearWindows clears all the windows specified in the 8 bit window bitmap.
DisplayWindows (0x89 + 1 byte)
DisplayWindows displays all the windows specified in the 8 bit window bitmap.
HideWindows (0x8A + 1 byte)
HideWindows hides all the windows specified in the 8 bit window bitmap.
ToggleWindows (0x8B + 1 byte)
ToggleWindows hides all displayed windows, and displays all hidden windows specified in the 8 bit window bitmap.
DeleteWindows (0x8C + 1 byte)
DeleteWindows deletes all the windows specified in the 8 bit window bitmap. If the current window, as specified by the last SetCurrentWindow command, is deleted then the current window becomes undefined and the window attribute commands should have no effect until after the next SetCurrentWindow or DefineWindow command.
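The five window commands above all take the same one-byte bitmap. The sketch below decodes it and applies the ToggleWindows rule to a set of visible window numbers. It assumes that bit n of the bitmap (counting from the least significant bit) selects window n, which the text above does not state explicitly; the function names are hypothetical.

```python
def windows_in_bitmap(bitmap: int) -> list:
    """Window numbers selected by an 8-bit window bitmap.
    Assumption: bit n (LSB = bit 0) selects window n."""
    return [n for n in range(8) if bitmap & (1 << n)]

def toggle_windows(visible: set, defined: set, bitmap: int) -> set:
    """Apply ToggleWindows: selected, defined windows flip visibility."""
    targets = set(windows_in_bitmap(bitmap)) & defined
    return visible ^ targets
```

DisplayWindows and HideWindows would correspondingly be a set union and a set difference with the same `targets`.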
Delay (0x8D + 1 byte)
Delay suspends all processing of the current service, except for DelayCancel and Reset scanning. The period of suspension is set by the one-byte parameter. The parameter specifies the delay in tenths of a second, so the minimum delay is 0.1 seconds, and the maximum delay is 25.5 seconds. A zero second delay can safely be ignored in a decoder, but should not be emitted from an encoder. A delay should be cancelled if the caption decoder's input buffer becomes full, a DelayCancel or Reset is received, or the specified delay time elapses.
DelayCancel (0x8E)
DelayCancel terminates any active delay and resumes normal command processing. DelayCancel should be scanned for during a Delay.
Reset (0x8F)
Reset deletes all windows, cancels any active delay, and clears the buffer before the Reset command. Reset should be scanned for during a Delay.
SetPenAttributes (0x90 + 2 bytes)
The SetPenAttributes command specifies how certain attributes of subsequent characters are to be rendered in the current window, until the next SetPenAttributes command. This command has the following parameters:

    +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+
    |TXT_TAG|OFS|PSZ|  |I|U|EDTYP|FNTAG|
    +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+
     15            8    7             0

OFS = offset ; PSZ = pen size ; I = italic toggle ; U = underline toggle ; EDTYP = edge type ; FNTAG = font tag
- pen size, 2 bits, { SMALL=0, STANDARD=1, LARGE=2, ILLEGAL_VAL=3 }
- offset, 2 bits, { SUBSCRIPT=0, NORMAL=1, SUPERSCRIPT=2, ILLEGAL_VAL=3 }
- text tag, 4 bits, { dialog=0, source_or_speaker_id=1, electronically_reproduced_voice=2, dialog_in_other_language=3, voiceover=4, audible_translation=5, subtitle_translation=6, voice_quality_description=7, song_lyrics=8, sound_effect_description=9, musical_score_description=10, oath=11, undefined_0=12, undefined_1=13, undefined_2=14, invisible=15 }
- font tag, 3 bits, { default=0, monospaced_serif=1, proportional_serif=2, monospaced_sanserif=3, proportional_sanserif=4, casual=5, cursive=6, smallcaps=7 }
- edge type, 3 bits, { NONE=0, RAISED=1, DEPRESSED=2, UNIFORM=3, LEFT_DROP_SHADOW=4, RIGHT_DROP_SHADOW=5, ILLEGAL_VAL0=6, ILLEGAL_VAL1=7 }
- underline, 1 bit, { NO=0, YES=1 }
- italic, 1 bit, { NO=0, YES=1 }
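The bit layout above can be unpacked as in the following sketch. The function name and the dictionary keys are illustrative choices; the two arguments are the parameter bytes that follow the 0x90 command code.

```python
def parse_set_pen_attributes(b1: int, b2: int) -> dict:
    """Unpack the two SetPenAttributes parameter bytes."""
    return {
        "text_tag":  b1 >> 4,           # bits 15-12
        "offset":    (b1 >> 2) & 0x03,  # bits 11-10
        "pen_size":  b1 & 0x03,         # bits 9-8
        "italic":    bool(b2 & 0x80),   # bit 7
        "underline": bool(b2 & 0x40),   # bit 6
        "edge_type": (b2 >> 3) & 0x07,  # bits 5-3
        "font_tag":  b2 & 0x07,         # bits 2-0
    }
```

For example, the bytes 0x05 and 0x9C decode to a dialog text tag, NORMAL offset, STANDARD pen size, italics on, underline off, UNIFORM edge and the proportional sans-serif font tag.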
SetPenColor (0x91 + 3 bytes)
SetPenColor sets the foreground, background, and edge color for the subsequent characters. Color is specified with 6 bits, 2 for each of blue, green and red. The lowest order bits are for blue, the next two for green and the highest order bits represent red. Opacity is represented by two bits, with the values SOLID=0, FLASH=1, TRANSLUCENT=2, and TRANSPARENT=3. The edge color is the color of the outlined edges of the text, but the outline shares its opacity with the foreground, so the two highest order bits of the third parameter byte should both be cleared. The parameters are as follows:

    +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+
    |FOP|F_R|F_G|F_B|  |BOP|B_R|B_G|B_B|  |0|0|E_R|E_G|E_B|
    +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+
     23           16    15            8    7             0

FOP = foreground opacity ; BOP = background opacity ; F_? = foreground color component ; B_? = background color component ; E_? = edge color component
- foreground color, 6 bits
- foreground opacity, 2 bits
- background color, 6 bits
- background opacity, 2 bits
- edge color, 6 bits
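A sketch of unpacking the three SetPenColor parameter bytes follows. The helper and key names are hypothetical; each color is returned as a (red, green, blue) tuple of 2-bit components, and opacities are returned as the numeric values listed above.

```python
def _rgb(byte: int) -> tuple:
    """Split the low 6 bits into 2-bit (red, green, blue) components."""
    return ((byte >> 4) & 0x03, (byte >> 2) & 0x03, byte & 0x03)

def parse_set_pen_color(b1: int, b2: int, b3: int) -> dict:
    """Unpack the three SetPenColor parameter bytes."""
    return {
        "foreground_opacity": b1 >> 6,
        "foreground_color":   _rgb(b1),
        "background_opacity": b2 >> 6,
        "background_color":   _rgb(b2),
        "edge_color":         _rgb(b3),   # top two bits of b3 are unused
    }
```

With 2 bits per component, each channel has four levels, giving 64 possible colors.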
SetPenLocation (0x92 + 2 bytes)
SetPenLocation sets the location for the next text appended to the current window. It has two parameters, row and column. If a window is not locked (see DefineWindow) and the SMALL font is in effect, the location can be outside the otherwise valid addresses.

    +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+
    |0|0|0|0|  ROW  |  |0|0|  COLUMN   |
    +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+
     15            8    7             0

- row, 4 bits, normally 0-14
- null padding, 4 bits
- column, 6 bits, normally 0-31 for 4:3 formats, and 0-41 for 16:9 formats
- null padding, 2 bits
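Decoding SetPenLocation amounts to masking off the padding bits, as in this sketch (the function name is hypothetical):

```python
def parse_set_pen_location(b1: int, b2: int) -> tuple:
    """Return (row, column) from the two SetPenLocation parameter bytes,
    ignoring the null padding bits."""
    return b1 & 0x0F, b2 & 0x3F
```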
SetWindowAttributes (0x97 + 4 bytes)
SetWindowAttributes sets the window attributes of the current window. Fill Color is specified with 6 bits, 2 for each of blue, green and red. The lowest order bits are for blue, the next two for green and the highest order bits represent red. Fill Opacity is represented by two bits, with the values SOLID=0, FLASH=1, TRANSLUCENT=2, and TRANSPARENT=3. The window's Border Color is specified the same way. However, the Border Type is split into two fields. They should be combined, with border type 01 representing the low order bits, and border type 2 the high order bit. Once combined, the Border Type has 6 valid values: NONE=0, RAISED=1, DEPRESSED=2, UNIFORM=3, SHADOW_LEFT=4, and SHADOW_RIGHT=5.

    +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+
    |FOP|F_R|F_G|F_B|  |BTP|B_R|B_G|B_B|  |W|B|PRD|SCD|JST|  |EFT_SPD|EFD|DEF|
    +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+
     31           24    23           16    15            8    7             0

FOP = fill opacity ; BTP = border type lower bits ; B = border type upper bit ; F_? = fill color component ; B_? = border color component ; W = word wrap toggle ; PRD = print direction ; SCD = scroll direction ; JST = justification ; EFT_SPD = effect speed ; EFD = effect direction ; DEF = display effect
- fill color, 6 bits. Window interior color.
- fill opacity, 2 bits. { SOLID=0, FLASH=1, TRANSLUCENT=2, and TRANSPARENT=3 }
- border color, 6 bits. Window border color.
- border type 01, 2 bits. See discussion above.
- justify, 2 bits. For Left-to-Right and Right-to-Left print directions the values are: {LEFT=0, RIGHT=1, CENTER=2, FULL=3}, for Top-to-Bottom and Bottom-to-Top print directions the values are: TOP=0, BOTTOM=1, CENTER=2, FULL=3
For Left justification, decoders should display any portion of a received row of text when it is received. For center, right, and full justification, decoders may display any portion of a received row of text when it is received, or may delay display of a received row of text until reception of a row completion indicator. A row completion indicator is defined as receipt of a CR, ETX or any other command, except SetPenColor, SetPenAttributes, or SetPenLocation where the pen relocation is within the same row.
Receipt of a character for a displayed row which already contains text with center, right or full justification will cause the row to be cleared prior to the display of the newly received character and any subsequent characters. Receipt of a justification command which changes the last received justification for a given window will cause the window to be cleared.
- scroll direction, 2 bits. This specifies which direction text will scroll when the end of a caption "line" is reached. It has one of four values: LEFT_TO_RIGHT=0, RIGHT_TO_LEFT=1, TOP_TO_BOTTOM=2, and BOTTOM_TO_TOP=3.
- print direction, 2 bits. This specifies the order in which text is added to a window. It has one of four values: LEFT_TO_RIGHT=0, RIGHT_TO_LEFT=1, TOP_TO_BOTTOM=2, and BOTTOM_TO_TOP=3.
- word wrap, 1 bit. If set, word wrapping is enabled; otherwise word wrap should not be employed.
- border type 2, 1 bit. See discussion above.
- display effect, 2 bits. This specifies an effect to be used to display or hide a window. It has one of three valid values: SNAP=0, FADE=1, and WIPE=2. SNAP means the window should assume full opacity immediately. FADE means the window should fade in or out at effect speed. Finally, WIPE means the window should fly onto or off the screen from the screen border specified in effect direction, at the rate specified in effect speed.
- effect direction, 2 bits. This specifies where a wipe effect comes from on window display. It has one of four values: LEFT_TO_RIGHT=0, RIGHT_TO_LEFT=1, TOP_TO_BOTTOM=2, and BOTTOM_TO_TOP=3. When the window is wiped off the screen it should be wiped off in the opposite direction from how it was wiped onto the screen.
- effect speed, 4 bits. This specifies in half-seconds how long a caption display or hide effect, such as FADE, and WIPE, should take. The maximum time is 7.5 seconds, and the minimum non-zero value is 0.5 seconds.
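The four SetWindowAttributes parameter bytes can be unpacked as in the sketch below, including the recombination of the split Border Type field and the conversion of effect speed from half-second units. The function name and dictionary keys are illustrative assumptions.

```python
def parse_set_window_attributes(b1: int, b2: int, b3: int, b4: int) -> dict:
    """Unpack the four SetWindowAttributes parameter bytes."""
    def rgb(byte):
        # 2-bit (red, green, blue) components from the low 6 bits
        return ((byte >> 4) & 3, (byte >> 2) & 3, byte & 3)
    return {
        "fill_opacity":     b1 >> 6,
        "fill_color":       rgb(b1),
        "border_color":     rgb(b2),
        # low two bits come from byte 2, the high bit from byte 3
        "border_type":      (b2 >> 6) | ((b3 & 0x40) >> 4),
        "word_wrap":        bool(b3 & 0x80),
        "print_direction":  (b3 >> 4) & 3,
        "scroll_direction": (b3 >> 2) & 3,
        "justify":          b3 & 3,
        "effect_seconds":   (b4 >> 4) * 0.5,   # effect speed in half-seconds
        "effect_direction": (b4 >> 2) & 3,
        "display_effect":   b4 & 3,
    }
```

A decoder would normally validate the result, for example rejecting a border type of 6 or 7 or a display effect of 3, which the lists above do not define.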
Colors, text painting, effects, and border types can be customized with the SetWindowAttributes and SetPenAttributes commands. However, the caption provider may wish to use predefined standard window styles. A set of predefined styles will be hard stored in receivers. This set will anticipate the most widely used types of caption windows in order to conserve caption channel bandwidth by eliminating the need to transmit superfluous SetWindowAttributes and SetPenAttributes commands.
Predefined window and pen styles can be specified by the window style and pen style ID parameters in the DefineWindow command.
DefineWindow0-7 (0x98-0x9F + 6 bytes)
DefineWindow0-7 creates one of the eight windows used by a caption decoder. This command should be sent periodically by a caption encoder even for pre-existing windows so that a newly tuned-in caption decoder can begin displaying captions. When issued on a pre-existing window, the pen style and window style can be left null; this tells the decoder not to change the current styles if they exist, and to initialize both to style 1 if the window does not exist in its context.

    +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+
    |0|0|V|R|C|PRIOR|  |P| VERT_ANCHOR |  |  HOR_ANCHOR   |
    +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+
     47           40    39           32    31           24

V = visible ; R = row lock toggle ; C = column lock toggle ; PRIOR = priority ; P = relative toggle

    +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+
    |ANC_ID |ROW_CNT|  |0|0| COL_COUNT |  |0|0|WNSTY|PNSTY|
    +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+  +-+-+-+-+-+-+-+-+
     23           16    15            8    7             0

WNSTY = window style ; PNSTY = pen style
The parameters are as follows:
- priority, 3 bits, 0-7. A decoder is only required to display up to four windows. If more than four displayed windows are requested, the decoder should display the four highest priority windows.
- column lock, 1 bit. If set, column lock fixes the absolute number of columns to be displayed. If not set, a caption decoder may display more columns of text when the font size permits it, and a SetPenLocation command may go to a location outside the defined window size.
- row lock, 1 bit. If set, row lock fixes the absolute number of rows to be displayed. If not set, a caption decoder may display more rows of text when the font size permits it, and a SetPenLocation command may go to a location outside the defined window size.
- visible, 1 bit. If set, this flag causes the window to be displayed upon creation, if not set, the window is initially hidden.
- null, 2 bits. Null padding.
- anchor vertical, 7 bits. Vertical position of the window's anchor point. The range is normally 0-74. When the relative positioning bit is set however the range is 0-99.
- relative positioning, 1 bit. If set, the anchor horizontal and anchor vertical represent relative coordinates, percentages, instead of regular coordinates.
- anchor horizontal, 8 bits. Horizontal position of the window's anchor point. The range is normally 0-209 when the stream's aspect ratio is 16:9, and 0-159 when the stream's aspect ratio is 4:3. When the relative positioning bit is set however the range is 0-99.
- row count, 4 bits. This is the number of rows of text, assuming the STANDARD font size, the window will hold. The range is 0-15. NOTE: In practice a decoder must add one to the number to get the intended effect. i.e. 0 -> 1, 1 -> 2, etc.
- anchor ID, 4 bits. Valid Values: { UPPER_LEFT=0, UPPER_CENTER=1, UPPER_RIGHT=2, MIDDLE_LEFT=3, MIDDLE_CENTER=4, MIDDLE_RIGHT=5, LOWER_LEFT=6, LOWER_CENTER=7, LOWER_RIGHT=8 }
- column count, 6 bits. This is the number of columns of text, assuming the STANDARD font size, the window will hold. The range is 0-31 for 4:3 streams, and 0-41 for 16:9 streams. NOTE: In practice a decoder must add one to the number to get the intended effect. i.e. 0 -> 1, 1 -> 2, etc.
- null, 2 bits. Null padding.
- pen style, 3 bits. If the value is zero and this is a new window, pen style one should be used for future characters. If the value is zero and this is an existing window, the previous pen style should continue to be used. For non-zero values the pen style should be set as if SetPenAttributes and SetPenColor were issued with the parameters in the predefined pen style list below.
- window style, 3 bits. If the value is zero and this is a new window, window style one should be used for the window. If the value is zero and this is an existing window, the previous window style should continue to be used. For non-zero values the window style should be set as if SetWindowAttributes were issued with the parameters in the predefined window style list below.
- null, 2 bits. Null padding.
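As an illustration, the fields listed above can be unpacked from the six DefineWindow parameter bytes roughly as follows. This is a minimal Python sketch, not text from the standard: it assumes the fields are packed least-significant bit first in the order listed, and that the fields listed before row lock occupy the low four bits of the first byte. The function name and dictionary keys are hypothetical.

```python
def decode_define_window(params: bytes) -> dict:
    """Unpack the DefineWindow fields described above from six parameter bytes.

    Assumption: fields are packed LSB-first in the listed order, so the
    fields before row lock fill bits 0-3 of the first byte.
    """
    if len(params) != 6:
        raise ValueError("DefineWindow is assumed to carry six parameter bytes")
    b0, b1, b2, b3, b4, b5 = params
    return {
        "row_lock": bool(b0 & 0x10),
        "visible": bool(b0 & 0x20),
        "anchor_vertical": b1 & 0x7F,
        "relative_positioning": bool(b1 & 0x80),
        "anchor_horizontal": b2,
        # Row and column counts are stored minus one (see the NOTEs above).
        "rows": (b3 & 0x0F) + 1,
        "anchor_id": b3 >> 4,
        "columns": (b4 & 0x3F) + 1,
        # Zero means "default or keep previous" for both style fields.
        "pen_style": b5 & 0x07,
        "window_style": (b5 >> 3) & 0x07,
    }
```

Under these assumptions, a stored row count of 2 decodes to a three-row window, and an anchor ID nibble of 4 decodes to MIDDLE_CENTER.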
Predefined Pen style
Unless stated otherwise, the predefined font size is standard, the offset is normal, italics and underline are not set, the edge type is none, the foreground color is white, the foreground opacity is solid, the background color is black, the background opacity is solid, and the edge color is black.
1. Default
2. Monospaced Serif
3. Proportional Serif
4. Monospaced Sans Serif
5. Proportional Sans Serif
6. Monospaced Sans Serif - background opacity is transparent
7. Proportional Sans Serif - background opacity is transparent
Predefined Window style
Unless stated otherwise, the predefined justification is left, the print direction is left-to-right, the scroll direction is bottom-to-top, word wrap is off, the display effect is snap, the effect direction and speed are not set, the fill color is black, the fill opacity is solid, and the border type is none.
1. CEA-608 Style PopUp
2. PopUp w/Transparent Background - fill opacity is transparent
3. CEA-608 Style PopUp Centered - justification is center
4. CEA-608 Style RollUp - word wrap is on
5. RollUp w/Transparent Background - word wrap is on; fill opacity is transparent
6. CEA-608 Style Centered RollUp - word wrap is on; justification is center
7. Ticker Tape - print direction is top-to-bottom; scroll direction is right-to-left
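The predefined window styles lend themselves to a "defaults plus overrides" table, combined with the rule for a zero style value. The following Python sketch is only an illustration: it assumes the styles are numbered 1 to 7 in the order listed, and the dictionary keys and the function name `resolve_window_style` are hypothetical.

```python
# Defaults shared by every predefined window style, as stated above.
DEFAULT_WINDOW_STYLE = {
    "justify": "left",
    "print_direction": "left-to-right",
    "scroll_direction": "bottom-to-top",
    "word_wrap": False,
    "display_effect": "snap",
    "fill_color": "black",
    "fill_opacity": "solid",
    "border_type": "none",
}

# Per-style differences; IDs 1-7 in listed order are an assumption.
WINDOW_STYLE_OVERRIDES = {
    1: {},
    2: {"fill_opacity": "transparent"},
    3: {"justify": "center"},
    4: {"word_wrap": True},
    5: {"word_wrap": True, "fill_opacity": "transparent"},
    6: {"word_wrap": True, "justify": "center"},
    7: {"print_direction": "top-to-bottom", "scroll_direction": "right-to-left"},
}


def resolve_window_style(style_id, current=None):
    """Return the style for a DefineWindow window style value.

    `current` is the style of an existing window, or None for a new window.
    """
    if style_id == 0:
        if current is not None:
            return current  # existing window: keep the previous style
        style_id = 1        # new window: fall back to style one
    return {**DEFAULT_WINDOW_STYLE, **WINDOW_STYLE_OVERRIDES[style_id]}
```

The same pattern would apply to the predefined pen styles, with a zero value resolved the same way.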
How to interpret the caption stream
Word wrap
It may sometimes be desirable for a caption decoder to perform word wrap. This can happen because the end user specifies a different font than the encoder requested, or wishes to see more of the caption text than is normally possible. SetWindowAttributes carries a word wrap flag; when set, it indicates that the captions were written with word wrap in mind, and the decoder may take this as a hint that word wrapping is safe. Word wrap can be performed at carriage return, space, and hyphen characters; however, neither the non-breaking space (0xA0 in the G1 table) nor the non-breaking transparent space (0x21 in the G2 table) should be treated as a safe place to break a line.
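A greedy wrapper following these break rules might look like the Python sketch below. It is an illustration under stated assumptions rather than decoder behavior taken from the standard: the text is assumed to be already decoded to Unicode (so the G1 non-breaking space appears as U+00A0), carriage returns are assumed to have been handled beforehand, the G2 transparent space is not modeled, and the function name `wrap` is hypothetical.

```python
def wrap(text: str, width: int) -> list:
    """Greedily wrap `text` into rows of at most `width` characters.

    Lines break before an ordinary space (which is dropped) or after a
    hyphen. A non-breaking space (U+00A0) is never used as a break point.
    If no safe break exists, the row is cut at `width`.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    rows = []
    while len(text) > width:
        cut, skip = width, 0  # fallback: hard break at the column limit
        for i in range(width, 0, -1):
            if text[i] == " ":       # break before a space, drop the space
                cut, skip = i, 1
                break
            if text[i - 1] == "-":   # break after a hyphen, keep the hyphen
                cut, skip = i, 0
                break
        rows.append(text[:cut])
        text = text[cut + skip:]
    rows.append(text)
    return rows
```

For example, `wrap("ROWS AND COLUMNS", 10)` breaks at the second space, while the same text written with non-breaking spaces falls through to the hard break at the column limit.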
Anchor ID
There are nine valid anchor IDs, numbered 0 (upper left) through 8 (lower right), running left to right and then top to bottom.
These tell the caption decoder how to expand the text box as text is added to a caption window. A window is assigned an anchor point (a location) and an anchor ID. If the anchor point is, say, 0,0 and the anchor ID is 0, the window expands down and to the right from the upper-left corner of the caption area. If the anchor point is 50%,50% and the anchor ID is 4, the window expands equally in all directions from the center of the caption area.
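One way to picture this is to compute where the window's upper-left corner ends up for a given anchor point, anchor ID, and window size. The Python sketch below assumes the anchor IDs 0 to 8 run left to right and top to bottom, and that positions and sizes are expressed in the same units; the function name `window_origin` is hypothetical.

```python
def window_origin(anchor_x, anchor_y, anchor_id, width, height):
    """Return the (x, y) of the window's upper-left corner.

    anchor_id % 3 selects left/center/right and anchor_id // 3 selects
    upper/middle/lower, so the window is shifted by 0, 1/2, or 1 times
    its size along each axis.
    """
    if not 0 <= anchor_id <= 8:
        raise ValueError("anchor ID must be in the range 0-8")
    column, row = anchor_id % 3, anchor_id // 3
    return (anchor_x - column * width / 2, anchor_y - row * height / 2)
```

With anchor ID 0 the corner coincides with the anchor point, so growth is down and to the right; with anchor ID 4 the window stays centered on the anchor point as it grows.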
Fonts
CEA-708 supports eight font tags: undefined, monospaced serif, proportional serif, monospaced sans serif, proportional sans serif, casual, cursive, and small capitals. The first is not defined and should probably be avoided. However these fonts are implemented, it should be possible to underline and italicize them. Bold versions are not needed, but it should be possible to draw the outline of each letter in a different color and opacity than the fill. Finally, these fonts must allow superscripts and subscripts, and must support Latin-1 plus the additional symbols in CEA-708, such as the [CC] symbol and the dozen or so Unicode characters in this standard. For font examples, such as Proportional Serif and Proportional Sans Serif, see the Wikipedia Fonts article.
Windows
The window addressable area should always be within the Safe-Title area, so that all addressable locations remain within the display window even if the monitor overscans the image onto a non-rectangular screen. If the video stream has a 16:9 aspect ratio, the addresses should be in the range 0..74 vertically and 0..209 horizontally. If the video stream has a 4:3 aspect ratio, the addresses should be in the range 0..74 vertically and 0..159 horizontally. For other aspect ratios, relative addressing should be used, and both vertical and horizontal addresses should be in the range 0..99%.
The window size should be scaled based on the font size. With this in mind, rows longer than 32 characters are discouraged even on 16:9 screens, so that the user may select fonts larger than those specified.
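The address ranges above can be checked with a small helper. The Python sketch below validates an anchor point and converts it to a fraction of the Safe-Title area; dividing by the grid size (the maximum address plus one) is an assumption made for illustration, and the function names are hypothetical.

```python
def anchor_limits(aspect_ratio: str, relative: bool):
    """Return (max_vertical, max_horizontal) for an anchor point."""
    if relative:
        return 99, 99
    if aspect_ratio == "16:9":
        return 74, 209
    if aspect_ratio == "4:3":
        return 74, 159
    raise ValueError("other aspect ratios should use relative addressing")


def anchor_fraction(vertical, horizontal, aspect_ratio, relative=False):
    """Validate an anchor point and scale it into the range [0, 1).

    Assumption: each address is divided by the size of its grid.
    """
    max_v, max_h = anchor_limits(aspect_ratio, relative)
    if not (0 <= vertical <= max_v and 0 <= horizontal <= max_h):
        raise ValueError("anchor point outside the addressable area")
    return vertical / (max_v + 1), horizontal / (max_h + 1)
```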
Row and column locking
Row and column locking are supported in the CEA-708-B standard, but the later CEA-708-C version assumes that both rows and columns are always locked. The basic behavior is described below.
In total, four combinations are possible: (1) row locked and column locked, (2) row unlocked and column locked, (3) row locked and column unlocked, and (4) row unlocked and column unlocked.
1. Row locked and column locked: If both rows and columns are locked, the window cannot be extended in either rows or columns.
Suppose a window is defined with 3 rows and 10 columns. The text 'ROWS AND COLUMNS ARE NOT LOCKED FOR EVER AND EVER AND EVER', written to row 0 with word wrapping disabled, is then displayed as follows:
1. ROWS AND C
2.
3.
Since both are locked, the text cannot extend beyond 10 columns or beyond row 0.
2. Row unlocked and column locked: In this case the window can be extended up to the maximum number of rows given in the window define command. The same text is displayed as follows:
1. ROWS AND C
2. OLUMNS ARE
3. NOT LOCKED
Because the rows are unlocked, the text can extend up to the maximum number of rows given in the window define command.
3. Row locked and column unlocked: In this case the window can be extended up to the maximum number of columns. According to the CEA-708 standard, the maximum number of columns for any window is 32. The same text is displayed as follows:
1. ROWS AND COLUMNS ARE NOT LOCKED
2.
3.
Because the columns are unlocked, the text can extend up to the maximum number of columns.
4. Row unlocked and column unlocked: In this case the window can be extended in both rows and columns. The same text is displayed as follows:
1. ROWS AND COLUMNS ARE NOT LOCKED
2. FOR EVER AND EVER AND EVER
Because both are unlocked, the text can extend up to 32 columns and up to the total number of rows.
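The four cases above can be reproduced with a toy model. The Python sketch below is only a model of the behavior as described in this section, not decoder logic taken from the standard: it assumes word wrapping is disabled, that a locked row keeps all text on row 0, that an unlocked column extends the row to 32 characters, and that spaces at the start of a continuation row are dropped (which is what the examples show). The function name `lay_out` is hypothetical.

```python
MAX_COLUMNS = 32  # maximum columns for any window, as stated above


def lay_out(text, rows, columns, row_lock, column_lock):
    """Split `text` into displayed rows for one lock combination."""
    width = columns if column_lock else MAX_COLUMNS
    max_rows = 1 if row_lock else rows
    out = []
    pos = 0
    while pos < len(text) and len(out) < max_rows:
        if out:
            # Drop spaces that would start a continuation row.
            while pos < len(text) and text[pos] == " ":
                pos += 1
            if pos >= len(text):
                break
        out.append(text[pos:pos + width].rstrip())
        pos += width
    return out


TEXT = "ROWS AND COLUMNS ARE NOT LOCKED FOR EVER AND EVER AND EVER"
```

Calling `lay_out(TEXT, 3, 10, row_lock, column_lock)` with the four lock combinations yields the four displays shown in the examples; any text that does not fit is simply discarded in this model.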
Implementation notes
- The minimum buffer size for each of the 63 possible services (Service Input Buffers) is 128 bytes.
- In a caption decoder the DelayCancel and Reset commands should be interpreted outside the buffering mechanism. It should be safe to scan just for the 0x8E and 0x8F codes.
- In a caption encoder the 0x8E and 0x8F values might need to be encoded in a parameter to another command. Commands can be split into several subcommands to avoid this problem.
- The closed caption icon in the G3 code set must not be rendered with rounded corners in a WTO country, due to trademark licensing problems.
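The scan for DelayCancel and Reset mentioned in these notes can be as simple as a byte search ahead of the service input buffer. The Python sketch below assumes 0x8E is DelayCancel and 0x8F is Reset, in the order the codes are given above, and relies on the encoder never emitting those values as parameter bytes; the function name `find_urgent_commands` is hypothetical.

```python
DELAY_CANCEL = 0x8E  # assumed to be the DelayCancel code
RESET = 0x8F         # assumed to be the Reset code


def find_urgent_commands(service_data: bytes):
    """Return (offset, name) for every 0x8E or 0x8F byte in the data.

    No command parsing is done: the scan treats every matching byte as a
    command, which is only safe if encoders keep these values out of the
    parameters of other commands.
    """
    names = {DELAY_CANCEL: "DelayCancel", RESET: "Reset"}
    return [(i, names[b]) for i, b in enumerate(service_data) if b in names]
```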
References
- https://www.adobe.com/content/dam/acom/en/devnet/video/pdfs/introduction_to_closed_captions.pdf (2015) "The majority of premium content produced for the United States today still contains 608 captions embedded in the 608 over 708 digital format."
- https://ecfsapi.fcc.gov/file/6008646915.pdf "NTSC...captions...must always be placed in the User datastream before any DTVCC caption data" and "On average, NTSC captions are allocated 960 bps, and DTVCC captions (EIA-708-A) are allocated 8640 bps" 4 captions are possible as in EIA 608
- Table A7 Picture User Data Syntax6 for 5F485C53d01
- "Archived copy" (PDF). Archived from the original (PDF) on 2010-11-20. Retrieved 2012-05-25.
External links
- Critique of CEA-708 caption fonts
- the CEA-708-D documentation (not for free)