Glossary
Section 508 Section 508 refers to a U.S.
accessibility law, which requires video broadcasts and many
webcasts to be made accessible to the deaf and hard of hearing.
Please see the official Section 508 web site for more details.
Section 508 is not directly related to CEA-608 and CEA-708, which
are technical standards.
CEA-608 CEA-608 refers to the technical
standard for captioning standard-definition NTSC video. It is
also commonly referred to as "Line 21 closed captioning".
CEA-708 CEA-708 refers to the technical standard for
captioning high-definition video. It is also commonly referred
to as VANC data or "Line 9 closed captioning".
A
Active Format Description Active Format Description
(AFD): A standard for telling a receiving device how best to
frame video. For example, without AFD, a 16:9 signal that
contains a pillarboxed 4:3 video would then be letterboxed for
display on a 4:3 TV, making a tiny image surrounded by black
bars on all sides. With AFD, the TV knows that it can crop the
video so that the actual picture fills the whole screen. For
more information, see the Wikipedia article on Active Format
Description.
ATSC ATSC
(Advanced Television Systems Committee) is the digital
television (DTV) standard used by broadcasters for HDTV and the
digital broadcast of SD in the United States and Canada. ATSC
supports closed captioning (608 and 708) in the metadata of the
video signal.
B
Burn-in Burn-in refers to a
graphic, text, or image that is superimposed on video, and thus
becomes part of the video itself. Closed captions are not burned
in, since they can be turned on and off, unlike open captions
and many subtitles, which cannot.
C
.CAP A .cap file
can refer to many different types of files, so you need to be
careful when using them. Several caption file formats use
the .cap extension, including the popular Cheetah .cap format. A
.cap file can also be a project file for CPC CaptionMaker (PC),
which must be opened in CaptionMaker before exporting to another
caption format.
Capture Card A capture card is a piece
of hardware for your computer that allows you to bring video
into your computer for editing and output back to a physical
format like tape. Many capture cards support closed captioning
for HD and SD, including ones from Matrox, AJA, and Blackmagic.
CaptionMaker CaptionMaker is closed captioning software
developed by CPC for the PC platform. CaptionMaker reads and
writes all major captioning formats and supports many
traditional workflows involving hardware encoders. In addition
to broadcast SD video, CaptionMaker encodes captions for web
formats like Quicktime, Flash, YouTube, and Windows Media, and
also tapeless workflows like MPEG-2 Program Streams.
Closed Captioning Closed Captioning is text that appears over
video that can be turned on and off using a decoder, which is
built into most consumer television sets and cable boxes. Closed
captions differ from subtitles in that they contain information
about all audio, not just dialogue.
Codec Codec stands for "coder-decoder". It
is a method of compressing video in order to strike a balance
between file size and quality. Different codecs have different
data rates, aspect ratios, and methods of closed captioning in
order to achieve this balance. Some examples of codecs are DV,
MPEG-2, WMV, H.264, Uncompressed, and ProRes. To watch a video,
your computer needs the specific codec that the video uses;
otherwise it will not play. Not all codecs are available for all
operating systems, and they may not be free to use.
Container Format A container format is a way to encapsulate
video so that it can be viewed in a video player, edited in a
non-linear editor, or processed in some other way. Examples of
container formats are Quicktime, AVI and MXF.
D
Decoder A decoder is a device that enables closed
captions to be turned on if they are present in the video
signal, essentially turning closed captions into open captions.
Typically a decoder is inside your TV or cable box and you can
turn captions on using your remote or a setting in the menus.
There are also hardware and software decoders available that
allow you to preview captions on your computer or a master tape
to ensure that they are present.
Drift "Drift" is a
term used to describe a specific type of behavior of closed
captions. It can either mean they are appearing progressively
later than they should, or progressively earlier. Most often,
this occurs slowly over the duration of a program, resulting in
a discrepancy of over three seconds by the end of an hour. Drift
is most often caused by a drop-frame / non-drop discrepancy.
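The "over three seconds by the end of an hour" figure can be checked with a little arithmetic. The following is a minimal sketch, assuming the exact NTSC rate of 30000/1001 fps (commonly written as 29.97); the function name `drift_seconds` is made up for this illustration and is not part of any CPC product.

```python
from fractions import Fraction

# Assumption: exact NTSC frame rate, usually rounded to 29.97 fps.
FRAME_RATE = Fraction(30000, 1001)

def drift_seconds(program_seconds):
    """Seconds by which a non-drop timecode label lags real time."""
    frames_shown = FRAME_RATE * program_seconds   # frames actually played
    labelled_seconds = frames_shown / 30          # non-drop counts 30 labels per second
    return float(program_seconds - labelled_seconds)

print(round(drift_seconds(3600), 2))  # one-hour program: 3.6
```

Captions timed in one mode and played back in the other would be off by roughly this amount at the end of the hour.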
Drop-Frame Drop-frame timecode refers to a method of
counting timecode in 29.97 fps video. It does not refer to
actual frames of video being dropped that would affect video
quality. Since 29.97 fps is not exactly 30 fps, when counting in
drop-frame certain numbers in the timecode counter are skipped
in order to ensure that the timecode will reflect the real-time
length of the program. The counterpart of drop-frame is non-drop
which does not skip numbers when counting timecode. To prevent
drift, it is important to timestamp in the correct mode (or
convert your captions' timecode using CPC software) when closed
captioning a 29.97 fps program.
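As a rough illustration of which numbers are skipped: under the usual drop-frame convention (SMPTE timecode, not spelled out in this glossary), frame labels 00 and 01 are skipped at the start of every minute, except minutes divisible by ten. The helper name below is hypothetical.

```python
def is_skipped_in_drop_frame(minutes, seconds, frames):
    """True if this label never appears on a drop-frame counter.

    Assumes the usual rule: labels :00 and :01 are skipped at the start
    of each minute, except every tenth minute.
    """
    return seconds == 0 and frames in (0, 1) and minutes % 10 != 0

# Count the labels skipped at each minute boundary over one hour.
skipped_per_hour = sum(
    is_skipped_in_drop_frame(m, 0, f) for m in range(60) for f in range(30)
)
print(skipped_per_hour)  # 108
```

Those 108 skipped labels per hour (about 3.6 seconds' worth) are what keep the counter in step with real time.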
DTV DTV stands for
"digital television" and is a general term encompassing digital
television around the world to distinguish it from analog
television (such as NTSC). In the US the standard for DTV is
ATSC.
E
Elementary Stream A data stream that
contains either video or audio data, but not both. Usually
associated with MPEG video files and given the extension .m2v
for video, or .m2a or .ac3 for audio. Elementary MPEG-2 video
streams can contain closed caption data. A Program or Transport
stream can be demultiplexed, or separated, into its component
Elementary streams.
Encoder The term "encoder"
typically refers to a hardware encoder, but can refer to
software encoders as well. A hardware encoder is usually a
rack-mounted device that accepts a video signal, marries it to
closed captions, and then outputs a new closed captioned video
signal, usually resulting in generation loss. A software
encoder, such as MacCaption, can add captions to video without a
hardware encoder. You can simply encode captions to video files
already present on your computer, or to file formats that will
add captions as you output from your NLE with no generation
loss.
F
G
H
High Definition (HD) High
Definition is a television standard with either 720 or 1080
lines in the video signal. Closed captioning for HD is sometimes
called Line 9 or VANC, and is codified under the 708 standard.
I
J
K
L
Line 9 Line 9 refers to the location
of the VANC closed captioning data in an HD video signal. In the
full raster, it is the 9th line from the top of the
frame.
Line 21 Line 21 refers to the location of the
VBI closed captioning data in an NTSC 720x486 signal. It
actually appears on lines 21 and 22 since line 22 is the second
field of the closed captioning data.
Live Captioning
Live captioning is a captioning process used for live webcasts or
broadcasts to add captions to video on the fly. It requires
several important tools. The first is a source of transcription
such as a stenographer or speech recognition software. Please
note, getting speech recognition software to usable levels of
accuracy still requires an individual to operate it. The second
item required for live captioning is a hardware encoder which
will accept the video signal and the closed caption data and
combine them for output. Last, you may need captioning software
to tie these two things together (especially if you're using
speech recognition software).
M
MacCaption
MacCaption is closed captioning software developed by CPC for
the Mac platform. MacCaption reads and writes all major
captioning formats and supports the latest closed captioning
workflows for Final Cut Pro. In addition to broadcast HD and SD
video, MacCaption encodes captions for web formats like
Quicktime, Flash, YouTube, and Windows Media, and also tapeless
workflows like MPEG-2 Transport Streams, DVCPRO HD and XDCAM.
.MCC A .MCC file is a MaCCaption closed captioning file, and
the only file format that supports both 608 and 708 (SD and HD)
closed captioning, unlike .SCC, which can only encode 608 (SD)
closed captions. This comprehensive format is being used by
several companies for integration into their closed captioning
workflows.
MPEG-2 MPEG-2 can refer to not only a video
codec, but also a container format. MPEG-2 can come in three
different file types: Elementary Streams, Program Streams, and
Transport Streams. MPEG-2 files are becoming a more common form
of video delivery because they allow a broadcaster to put them
directly on their server instead of ingesting from tape.
MacCaption can add captions to all three forms of MPEG-2 files.
N
Non-Drop Non-Drop timecode refers to a method of
counting timecode in 29.97 fps video. It does not refer to
actual frames of video being dropped that would affect video
quality. Since 29.97 fps is not exactly 30 fps, when counting in
non-drop, the timecode will get progressively further and
further behind "real time." For instance, after 2000 frames a
drop-frame counter will display 00:01:06:22, while a non-drop
counter will display 00:01:06:20, but the content and real-time
length of the video will be the same. The drop-frame counter is
slightly ahead because it goes straight from 00:00:59:29 to
00:01:00:02. To prevent drift, it is important to timestamp in
the correct timecode mode (or convert your captions' timecode
using CPC software) when closed captioning a 29.97 fps program.
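The 2000-frame example can be reproduced with a short sketch. It assumes the standard drop-frame rule (two labels skipped at the start of each minute except every tenth minute); the function names are illustrative only, and hours are not wrapped at 24.

```python
def non_drop_timecode(frame_count):
    """Label a frame by counting 30 labels per second with no skips."""
    frames = frame_count % 30
    total_seconds = frame_count // 30
    return "%02d:%02d:%02d:%02d" % (
        total_seconds // 3600, total_seconds // 60 % 60, total_seconds % 60, frames
    )

def drop_frame_timecode(frame_count):
    """Label a frame in drop-frame mode by adding back the skipped labels."""
    frames_per_10_min = 10 * 60 * 30 - 9 * 2    # 17982 real frames per 10 minutes
    tens, rem = divmod(frame_count, frames_per_10_min)
    skipped = 18 * tens                         # 9 skipping minutes per block, 2 labels each
    if rem >= 1800:                             # first minute of a block skips nothing
        skipped += 2 * (1 + (rem - 1800) // 1798)
    return non_drop_timecode(frame_count + skipped)

print(drop_frame_timecode(2000))  # 00:01:06:22
print(non_drop_timecode(2000))    # 00:01:06:20
```

Both counters label the same 2000th frame; only the label differs, which is why captions stamped in the wrong mode gradually drift.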
Non-linear Editor A non-linear editor (NLE) is a
piece of software that allows you to edit video by moving pieces
of it around in a timeline with multiple layers of video. This
is in contrast to linear editing, which forces you to add one
piece of video after another to tape in a linear fashion. Many
NLEs support closed captioning for HD, SD, or both. Examples of
non-linear editors are AVID, Final Cut Pro, Premiere Pro, and
Sony Vegas.
NTSC NTSC (National Television System
Committee) is the analog television standard for North America,
Japan, and some other parts of the world. NTSC supports closed
captioning (608 only) on Line 21 of the video signal.
O
Open Captions Open captions are captions that do not need to
be turned on; they are always visible. This is opposed to closed
captions, which must be turned on with a decoder. Open captions
are actually part of the image itself; they are also known as
burned-in captions.
P
Paint-on Paint-on captions
appear on the screen from left to right, one character at a
time. This mode of displaying captions is uncommon except as the
first caption of many commercial spots to reduce lag.
Pop-on Pop-on captions appear on the screen one at a time,
usually two or three lines at once. This mode of displaying
captions is typically used for pre-recorded television.
Program Stream A data stream that multiplexes, or combines, a
single video and a single audio stream together. Usually given
the extension .mpg, and used for files to be played on a PC,
some DVD authoring systems, and some tapeless distribution.
Q
R
Roll-up Roll-up captions appear from the bottom
of the screen one line at a time, usually with only three lines
visible at a time. This mode of displaying captions is typically
used by live television like news broadcasts.
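The roll-up behavior can be pictured as a fixed-size window over incoming lines. This is only a toy sketch of the display logic, not how a decoder is implemented; the name `roll_up` and the default of three visible lines are assumptions for illustration.

```python
from collections import deque

def roll_up(lines, visible=3):
    """Yield the visible caption window after each new line arrives."""
    window = deque(maxlen=visible)   # the oldest line falls off the top automatically
    for line in lines:
        window.append(line)          # a new line enters at the bottom
        yield list(window)

for screen in roll_up(["GOOD EVENING.", "OUR TOP STORY", "TONIGHT:", "WEATHER."]):
    print(screen)
```

Once three lines are showing, each new line pushes the oldest one off the screen.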
S
.SCC file SCC stands for "Scenarist Closed Caption", a file type
developed by Sonic. SCC files have become a popular standard for
many different applications of closed captions. Some programs
that use .scc files are Sonic Scenarist, DVD Studio Pro, Final
Cut Pro, and Compressor.
Shadow Speaker When using
speech recognition software, a shadow speaker is a person who
repeats everything said in a program into a microphone so that
the speech recognition software only has to interpret that
shadow speaker's voice and not the multiple voices in the
program. After training the software (about 15 minutes), it can
achieve accuracy rates up to 90-95% in a clean audio
environment.
Speech Recognition Speech recognition
software takes spoken word and translates it into text. State of
the art speech recognition technology can only achieve 60-80%
accuracy without the use of a shadow speaker. Software that uses
a shadow speaker can achieve up to 90-95% accuracy, but is
limited to recognizing one person's voice at a time and needs to
be used in a clean audio environment.
Standard Definition (SD) Standard Definition is a television
standard with (typically) 480 lines in the video signal (486
when NTSC). Closed captioning for SD is sometimes called Line 21
and is codified under the 608 standard.
Stenographer A stenographer is a person who can transcribe a
video's audio on
the fly (like a court reporter). Stenographers can dial in to a
hardware encoder remotely over a phone line so that closed
captions can be added to a video signal for live broadcast. See
also: Live Captioning.
Subtitling Subtitling is text
that appears on screen that normally only gives information
about dialogue that is spoken. With the exception of DVD and Blu-ray,
subtitles cannot be turned off, but are burned into the image.
T
Transport Stream A data stream that multiplexes, or combines, multiple
video and audio streams together with other metadata. Usually
given the extension .ts, .m2t, or .m2ts, and used for DTV
broadcast, VOD, tapeless delivery, and other systems where
multiple channels are mixed together.
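The file extensions mentioned in the Elementary, Program, and Transport Stream entries can be collected into a simple lookup. This is a hedged sketch: an extension is only a hint at what a file contains, and `guess_stream_type` is a hypothetical helper, not a function of any captioning tool.

```python
import os

# Extensions as listed in this glossary; real files may not follow them.
STREAM_TYPES = {
    ".m2v": "elementary (video)",
    ".m2a": "elementary (audio)",
    ".ac3": "elementary (audio)",
    ".mpg": "program",
    ".ts": "transport",
    ".m2t": "transport",
    ".m2ts": "transport",
}

def guess_stream_type(filename):
    """Guess the MPEG-2 stream type from the file extension alone."""
    extension = os.path.splitext(filename)[1].lower()
    return STREAM_TYPES.get(extension, "unknown")

print(guess_stream_type("promo.m2v"))   # elementary (video)
print(guess_stream_type("show.M2TS"))   # transport
```

A robust check would inspect the file's contents instead of trusting its name.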
U
V
VANC VANC stands for "Vertical ANCillary data space" and refers to
the data stored on Line 9 in HD video (outside the display area)
that holds the 708 closed captioning data while it is going over
an HD-SDI signal or on an HD tape format. VANC data appears only
on the left part of Line 9, and it can also
carry other information, like V-chip data.
VBI VBI stands for "Vertical Blanking Interval" and is the time
between the last line or field drawn in a video frame and the
first line or field of the next frame. This is usually measured
in lines; in NTSC there are 40 lines for VBI. Closed captioning
data for NTSC video is stored on Line 21 of the VBI.
Voice Recognition See: Speech Recognition.
W
X
Y
Z