Supported formats

Read:

  • DFXP/TTML

  • SAMI

  • SCC

  • SRT

  • WebVTT

Write:

  • DFXP/TTML

  • SAMI

  • SRT

  • Transcript

  • WebVTT

See the examples folder for sample caption files that the readers currently parse correctly.

SAMI Reader / Writer :: spec

Microsoft Synchronized Accessible Media Interchange. Supports multiple languages.

Supported Styling:

  • text-align

  • italics

  • font-size

  • font-family

  • color

If the SAMI file is not valid XML (e.g. it has unclosed tags), the reader will still attempt to read it.

DFXP/TTML Reader / Writer :: spec

The W3C standard. Supports multiple languages.

Supported Styling:

  • text-align

  • italics

  • font-size

  • font-family

  • color

SRT Reader / Writer :: spec

SubRip captions. If given multiple languages to write, the writer outputs all of them, joined by a ‘MULTI-LANGUAGE SRT’ line.

Supported Styling: None

The reader assumes the input language is English. To change it:

pycaps = SRTReader().read(srt_content, lang='fr')

WebVTT Reader / Writer :: spec

WebVTT is a W3C standard for displaying timed text in HTML5. As of February 2015 its specification is still in draft stage, so major players do not implement every feature, and neither does pycaption.

By default, the reader assumes the language is English and the writer returns the first language it finds in the caption set. You can specify a language using the lang parameter:

pycaps = WebVTTReader().read(content, lang='fr')

If you need to adjust all timestamps in a WebVTT file, you can use the time_shift_milliseconds parameter, which moves the timestamps forward (positive integer) or backward (negative integer) by the specified number of milliseconds:

pycaps = WebVTTReader(time_shift_milliseconds=1154).read(content)
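To make the effect of such a shift concrete, here is a minimal standalone sketch that applies a millisecond shift directly to the cue timing lines of a WebVTT string. It is not pycaption's implementation: the helper names `shift_timestamp` and `shift_cue_timings` are hypothetical, "forward" is assumed to mean "later", and clamping negative results to zero is an assumption of this sketch.

```python
import re

# Matches WebVTT timestamps such as 00:01:02.500 or 01:02.500 (hours are optional).
TIMESTAMP = re.compile(r"(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})")


def shift_timestamp(match, shift_ms):
    """Return the matched timestamp moved by shift_ms milliseconds."""
    hours, minutes, seconds, millis = match.groups()
    total = ((int(hours or 0) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)
    total = max(0, total + shift_ms)  # clamp at zero (assumption of this sketch)
    hours, rest = divmod(total, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def shift_cue_timings(vtt_text, shift_ms):
    """Shift every timestamp found on a cue timing line (one containing '-->')."""
    shifted = []
    for line in vtt_text.splitlines():
        if "-->" in line:
            line = TIMESTAMP.sub(lambda m: shift_timestamp(m, shift_ms), line)
        shifted.append(line)
    return "\n".join(shifted)


sample = "WEBVTT\n\n00:00:01.000 --> 00:00:02.500\nHello"
print(shift_cue_timings(sample, 1154))
```

With a shift of 1154, the cue that started at 00:00:01.000 now starts at 00:00:02.154; cue text and the header are left untouched.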

Styling

Styling in WebVTT can be done via inline tags (e.g. <b>, <i> etc.) or external CSS rules applied to text wrapped in class (<c>) or voice (<v>) tags.

pycaption currently only keeps voice tags on conversion.

Example:

<v Fred>Hi, my name is Fred

is converted to

Fred: Hi, my name is Fred

The following tags, which WebVTT supports, are stripped from the cue text:

  • <c>, <i>, <b>, <u>, <ruby>, <rt>, <lang> and timestamp tags (<h:mm:ss.sss>)

Unsupported tags are left unchanged, as ordinary cue text with no special meaning.
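The tag handling described above can be approximated with two regular expressions. This is only an illustrative sketch, not pycaption's code: `clean_cue_text` is a hypothetical helper, and details such as dropping the closing </v> tag and ignoring voice-tag classes are assumptions.

```python
import re

# <v Fred> or <v.loud Fred>: capture the speaker name after the whitespace.
VOICE_TAG = re.compile(r"<v(?:\.[^\s>]+)?\s+([^>]+)>")

# Opening and closing c, i, b, u, ruby, rt and lang tags (with optional
# classes or annotations), the closing </v>, and timestamp tags.
STRIPPED_TAG = re.compile(
    r"</?(?:c|i|b|u|ruby|rt|lang)(?:[.\s][^>]*)?>|</v>|<\d+:\d{2}:\d{2}\.\d{3}>"
)


def clean_cue_text(text):
    """Turn voice tags into 'Speaker: ' prefixes and strip the listed tags."""
    text = VOICE_TAG.sub(lambda m: m.group(1).strip() + ": ", text)
    return STRIPPED_TAG.sub("", text)


print(clean_cue_text("<v Fred>Hi, my name is Fred"))
print(clean_cue_text("<c.yellow>Hi</c> <00:00:05.000>there"))
print(clean_cue_text("<blink>unchanged</blink>"))
```

The last call shows a tag outside the list passing through untouched, matching the behaviour described for tags WebVTT does not define.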

Positioning

The WebVTT specification allows the position of cues to be customized through a number of cue settings. pycaption currently preserves positioning information only when writing, in which case it supports the following settings:

  • A WebVTT line position cue setting.

  • A WebVTT text position cue setting.

  • A WebVTT size cue setting.

  • A WebVTT alignment cue setting.

pycaption does not support:

  • A WebVTT vertical text cue setting.

  • A WebVTT region cue setting.

Refer to the official WebVTT specification for details about the cue settings.
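For orientation, the four supported settings appear in a WebVTT file as `name:value` pairs appended to the cue timing line. The sketch below builds such a suffix; `format_cue_settings` is a hypothetical helper and not part of pycaption, and it assumes line, position and size are all given as percentages (the specification also allows line numbers for the line setting).

```python
def format_cue_settings(line=None, position=None, size=None, align=None):
    """Build the settings suffix of a WebVTT cue timing line.

    line, position and size are percentages given as numbers; align is a
    keyword such as "start", "center" or "end". Unset values are omitted.
    """
    parts = []
    if line is not None:
        parts.append(f"line:{line:g}%")
    if position is not None:
        parts.append(f"position:{position:g}%")
    if size is not None:
        parts.append(f"size:{size:g}%")
    if align is not None:
        parts.append(f"align:{align}")
    return " ".join(parts)


timing = "00:00:01.000 --> 00:00:04.000"
settings = format_cue_settings(line=85, position=50, size=80, align="center")
print(f"{timing} {settings}")
```

The printed line is a cue timing line carrying all four settings; the vertical and region settings are deliberately left out because pycaption does not support them.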

SCC Reader :: spec

Scenarist Closed Caption format. Assumes Channel 1 input.

Supported Styling:

  • italics

By default, the SCC Reader does not simulate roll-up captions. To enable roll-ups:

pycaps = SCCReader().read(scc_content, simulate_roll_up=True)

The reader also assumes the input language is English. To change it:

pycaps = SCCReader().read(scc_content, lang='fr')

The reader also accepts an offset (measured in seconds) for the timestamps. For example, if the SCC file is 45 seconds ahead of the video:

pycaps = SCCReader().read(scc_content, offset=45)

The SCC Reader handles both drop-frame and non-drop-frame captions, and auto-detects which format the captions are in.

For debugging purposes, the SCC captions can be translated into a human-readable form as follows:
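One common way to tell the two formats apart is the separator before the frame count: drop-frame timecodes are written HH:MM:SS;FF and non-drop-frame timecodes HH:MM:SS:FF. The sketch below detects the format from the first caption line. It is a simplified illustration with hypothetical helper names, and it does not claim to mirror how the SCC Reader performs its own detection.

```python
def scc_line_timecode(line):
    """Return the leading timecode of an SCC caption line, or None otherwise."""
    fields = line.split(None, 1)  # the timecode is the first whitespace-separated field
    if not fields:
        return None  # blank line
    head = fields[0]
    parts = head.replace(";", ":").split(":")
    if len(parts) == 4 and all(p.isdigit() for p in parts):
        return head
    return None  # header line such as "Scenarist_SCC V1.0"


def detect_drop_frame(scc_content):
    """Return True if the first timecode found uses the ';' frame separator."""
    for line in scc_content.splitlines():
        timecode = scc_line_timecode(line)
        if timecode is not None:
            return ";" in timecode
    return False  # no timecode found; defaulting to non-drop-frame is an assumption


drop = "Scenarist_SCC V1.0\n\n00:00:01;00\t9420 9420"
non_drop = "Scenarist_SCC V1.0\n\n00:00:01:00\t9420 9420"
print(detect_drop_frame(drop), detect_drop_frame(non_drop))
```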

translated_scc = translate_scc(scc_content, brackets="[]")

Square brackets are used by default, but they can be replaced with other brackets or None.

Transcript Writer

Text stripped of styling, arranged in sentences.

Supported Styling: None

The transcript writer uses natural sentence boundary detection algorithms to create the transcript.
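To illustrate what arranging captions into sentences involves, the sketch below joins caption fragments and splits them with a deliberately naive regular expression. It stands in for, and is much cruder than, a real sentence boundary detector; the helper name `captions_to_transcript` and the one-sentence-per-line output are assumptions of this sketch.

```python
import re

# Naive boundary: sentence-ending punctuation, whitespace, then a capital
# letter or an opening quote. Abbreviations such as "Dr." will be split wrongly.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+(?=[A-Z\"'])")


def captions_to_transcript(caption_texts):
    """Join caption fragments and emit one sentence per line."""
    # Collapse internal whitespace so line breaks inside captions disappear.
    text = " ".join(" ".join(t.split()) for t in caption_texts if t.strip())
    return "\n".join(SENTENCE_END.split(text))


print(captions_to_transcript(["Hello there.", "How are", "you today? Fine."]))
```

Note how the question that spans two captions ends up on a single line, which is the point of working on sentences rather than on cues.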