Implementing low-latency DASH in Video.js

This article presents a proof-of-concept implementation of Low-Latency DASH (LL-DASH) streaming in Video.js, a popular open-source media player for live and VoD OTT streaming. To my knowledge, LL-DASH is supported by only a limited number of players, such as the GPAC player, Dash.js and TheoPlayer, so support in Video.js could still be of interest to the community.

With this LL-DASH implementation, Video.js achieves a latency as low as 1 second (Figure 1) and an average latency of about 2 seconds. As Figure 1 shows, the DASH media segments are downloaded using the Fetch API, while the initialization segments are still downloaded using the XMLHttpRequest API. A demo LL-DASH Video.js player is available for evaluation at https://ec2-34-218-35-44.us-west-2.compute.amazonaws.com/videojs_lowLatencyDash/http-streaming/index.html?debug=false&autoplay=false&muted=false&minified=false&liveui=false&partial=false&url=https%3A%2F%2Fakamaibroadcasteruseast.akamaized.net%2Fcmaf%2Flive%2F657078%2Fakasource%2Fout.mpd&type=application%2Fdash%2Bxml&keysystems=&buffer-water=false&low-latency-dash=true. I have only tested it in Chrome. If the video does not start, please refresh the page and try again. In my testing, the demo player works most of the time, but it can still fail to play in some timing-sensitive cases that I have yet to handle.

Figure 1: Video.js plays LL-DASH stream with 1 second latency

In the rest of this article, I will briefly introduce how LL-DASH works in general, then show how I implemented LL-DASH in Video.js.

1. Overview of LL-DASH streaming

Before LL-DASH, a live video encoder produced a sequence of video segments, each containing a few seconds of video data. Every video segment is a closed GOP starting with an IDR frame. Within a segment, the encoding of one video frame can depend on frames that follow it. As a result, a client has to receive a full segment before it can start decoding it, so the live latency is at least one full segment duration. In addition, to avoid rebuffering and improve playback stability, some players buffer a few more segments, which increases the latency further.

The key technique LL-DASH uses to achieve low latency is chunk-based video encoding and data transfer. A video chunk is a sequence of video frames that does not depend on any following chunk. When packaged in ISO-BMFF format, each chunk is carried as one “moof” box and one “mdat” box. A chunk can be as short as one video frame, but is usually several hundred milliseconds long.

As soon as the video encoder produces the first chunk of a segment, the player can start requesting that segment using HTTP/1.1 chunked transfer encoding. The player then receives the segment chunk by chunk, as the encoder produces them, until the end of the segment, which is signaled by an empty chunk as per HTTP/1.1 chunked transfer encoding. Because a downloaded chunk does not depend on any following chunk, it can be decoded and rendered right after it is downloaded. Chunk-based encoding and transfer therefore reduce the live latency from at least one segment duration (a few seconds) to about one chunk duration (several hundred milliseconds).

Every element of the live streaming workflow has to support chunk-based encoding/decoding and data transfer in order to achieve low latency end to end. This includes the video encoder, the origin server, the CDN and the players.

2. Implementing LL-DASH in Video.js

2.1. Supporting “availabilityTimeOffset”

LL-DASH requires players to support two MPD attributes, “availabilityTimeOffset” and “availabilityTimeComplete”. “availabilityTimeOffset” tells the player that it may request a segment as soon as the first video chunk of that segment becomes available. A DASH live player uses the following equation to compute the adjusted segment availability time, adjustedSegmentAvailabilityTime, which is the earliest time at which the player can request the segment:

adjustedSegmentAvailabilityTime = segmentAvailabilityTime + segmentDuration - availabilityTimeOffset

Eq. 1: calculating adjustedSegmentAvailabilityTime

Here, segmentAvailabilityTime is the MPD availability time of the segment and segmentDuration is the segment duration. According to the DASH specification, “availabilityTimeOffset” “specifies an offset to define the adjusted segment availability time”.

For regular DASH live players, availabilityTimeOffset defaults to 0, so the player can request a segment only once it is fully available, i.e., at segmentAvailabilityTime + segmentDuration. When availabilityTimeOffset is set, the player requests the segment earlier, before it is fully available, at segmentAvailabilityTime + segmentDuration - availabilityTimeOffset. An example is given in Figure 2.

Figure 2: segment request time of a LL-DASH player
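
To make Eq. 1 concrete, here is a minimal TypeScript sketch of the calculation. It is not the code used in my Video.js changes. The type and function names are made up for illustration, all times are assumed to be in milliseconds, and the numbers in the example (a 6-second segment with an availabilityTimeOffset of 5.5 seconds) are invented, not taken from Figure 2.

```typescript
// Illustrative names only; all times are milliseconds since the Unix epoch.
interface SegmentTiming {
  segmentAvailabilityTimeMs: number; // segmentAvailabilityTime in Eq. 1
  segmentDurationMs: number;         // segmentDuration in Eq. 1
}

// Eq. 1: the earliest wall-clock time at which the player may request the segment.
function adjustedSegmentAvailabilityTimeMs(
  seg: SegmentTiming,
  availabilityTimeOffsetMs: number = 0 // regular DASH: offset of 0
): number {
  return seg.segmentAvailabilityTimeMs + seg.segmentDurationMs - availabilityTimeOffsetMs;
}

// How long the player still has to wait before it may send the request
// (0 if the segment is already requestable).
function msUntilRequestable(
  seg: SegmentTiming,
  availabilityTimeOffsetMs: number,
  nowMs: number
): number {
  return Math.max(0, adjustedSegmentAvailabilityTimeMs(seg, availabilityTimeOffsetMs) - nowMs);
}

// Invented example: a 6 s segment and an availabilityTimeOffset of 5.5 s.
const exampleSegment: SegmentTiming = {
  segmentAvailabilityTimeMs: 1000000,
  segmentDurationMs: 6000,
};
const regularRequestTimeMs = adjustedSegmentAvailabilityTimeMs(exampleSegment);
const lowLatencyRequestTimeMs = adjustedSegmentAvailabilityTimeMs(exampleSegment, 5500);
console.log(regularRequestTimeMs - lowLatencyRequestTimeMs); // request is sent 5500 ms earlier
```

With these numbers, a regular player has to wait for the whole 6-second segment, while the low-latency player may send its request 0.5 seconds after the segment starts.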

2.2. Using “Fetch” API instead of “XHR” API to download chunked media segments

In low-latency mode, we use the Fetch API instead of the XHR API to download a segment chunk by chunk (as a ReadableStream). The Fetch API allows the partially downloaded data of an object to be read as a stream. For the downloaded partial data, we need to check whether it contains a valid video chunk by searching for the “moof” box and the “mdat” box.
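
The idea can be sketched in TypeScript as follows. This is a simplified illustration, not the actual Video.js code. MediaChunkExtractor, readBoxHeader, streamSegment and the onChunk callback are hypothetical names. The sketch assumes that boxes use 32-bit sizes (64-bit and open-ended sizes are rejected) and that each chunk ends with its “mdat” box. Only fetch() and the ReadableStream reader are real browser APIs.

```typescript
// Read the 32-bit big-endian size and 4-character type of the box at `offset`.
// Returns null while the 8-byte header has not been fully downloaded.
function readBoxHeader(buf: Uint8Array, offset: number): { size: number; type: string } | null {
  if (offset + 8 > buf.length) return null;
  const view = new DataView(buf.buffer, buf.byteOffset + offset, 8);
  const size = view.getUint32(0);
  const type = String.fromCharCode(
    view.getUint8(4), view.getUint8(5), view.getUint8(6), view.getUint8(7)
  );
  if (size < 8) throw new Error("64-bit or open-ended box sizes are not handled in this sketch");
  return { size, type };
}

// Accumulates downloaded bytes and returns complete chunks (moof + mdat).
class MediaChunkExtractor {
  private pending: Uint8Array = new Uint8Array(0);

  push(bytes: Uint8Array): Uint8Array[] {
    // Prepend whatever was left over from earlier reads.
    const merged = new Uint8Array(this.pending.length + bytes.length);
    merged.set(this.pending, 0);
    merged.set(bytes, this.pending.length);

    const complete: Uint8Array[] = [];
    let chunkStart = 0; // first byte not yet handed out
    let offset = 0;     // where the next box header is expected
    let sawMoof = false;
    for (;;) {
      const header = readBoxHeader(merged, offset);
      if (header === null || offset + header.size > merged.length) break; // box incomplete
      if (header.type === "moof") sawMoof = true;
      offset += header.size;
      if (header.type === "mdat" && sawMoof) {
        // A full moof + mdat pair is available: hand it out as one chunk.
        complete.push(merged.slice(chunkStart, offset));
        chunkStart = offset;
        sawMoof = false;
      }
    }
    this.pending = merged.slice(chunkStart); // keep the incomplete tail for the next read
    return complete;
  }
}

// Download one media segment with the Fetch API and report each complete chunk
// as soon as it arrives. `onChunk` is a hypothetical callback supplied by the player.
async function streamSegment(url: string, onChunk: (chunk: Uint8Array) => void): Promise<void> {
  const response = await fetch(url);
  if (!response.ok || response.body === null) {
    throw new Error(`Segment request failed with status ${response.status}`);
  }
  const reader = response.body.getReader();
  const extractor = new MediaChunkExtractor();
  for (;;) {
    const result = await reader.read();
    if (result.done) break;
    for (const chunk of extractor.push(result.value)) onChunk(chunk);
  }
}
```

Note that the pieces returned by reader.read() follow network delivery, not box boundaries, which is why the extractor buffers the incomplete tail between reads.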

2.3. Appending the media chunks as soon as they are downloaded

When a media chunk is downloaded, it should be passed to the browser’s MSE source buffer as soon as possible. Whereas regular players pass only full segments to MSE, an LL-DASH player should pass media data to MSE more frequently to take advantage of LL-DASH’s chunk-based downloading.
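
One practical detail is that an MSE SourceBuffer accepts only one append at a time: appendBuffer() must not be called while the buffer's updating flag is true, and the buffer fires an “updateend” event when an append finishes. Since chunks can arrive faster than they are appended, a small queue is needed. The TypeScript sketch below shows one way to do this. ChunkAppendQueue and SourceBufferLike are names I made up for illustration, and this is not how Video.js itself organizes its appends.

```typescript
// The small subset of the MSE SourceBuffer interface that this sketch relies on.
interface SourceBufferLike {
  readonly updating: boolean;
  appendBuffer(data: Uint8Array): void;
  addEventListener(type: "updateend", listener: () => void): void;
}

// Appends chunks in arrival order, one at a time.
class ChunkAppendQueue {
  private queue: Uint8Array[] = [];
  private sourceBuffer: SourceBufferLike;

  constructor(sourceBuffer: SourceBufferLike) {
    this.sourceBuffer = sourceBuffer;
    // When one append finishes, start the next queued one.
    sourceBuffer.addEventListener("updateend", () => this.flush());
  }

  // Call this for every downloaded chunk (moof + mdat).
  enqueue(chunk: Uint8Array): void {
    this.queue.push(chunk);
    this.flush();
  }

  // Number of chunks still waiting to be appended.
  get pendingCount(): number {
    return this.queue.length;
  }

  private flush(): void {
    if (this.sourceBuffer.updating) return; // an append is still in progress
    const next = this.queue.shift();
    if (next !== undefined) this.sourceBuffer.appendBuffer(next);
  }
}
```

In a player, enqueue() would be called from the download path for each complete chunk, after the initialization segment has been appended.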

2.4. Incorporating availabilityTimeOffset in the computation of player-side segment availability window

“availabilityTimeOffset” adjusts the availability time of segments, so it also affects the player-side segment availability window. It therefore has to be included wherever the player computes segment availability times.
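
As a rough illustration of the effect on the live edge of the window, the TypeScript sketch below inverts Eq. 1 to find the newest segment that may be requested at a given time. It assumes fixed-duration segments, with segment i starting at availabilityStartTimeMs + i * segmentDurationMs, and times in milliseconds. The function name and parameters are hypothetical, and the real calculation in a player has to handle more cases (periods, timelines, clock drift).

```typescript
// Index (0-based) of the newest segment the player may request at `nowMs`,
// or -1 if no segment is requestable yet.
function newestRequestableSegmentIndex(
  nowMs: number,
  availabilityStartTimeMs: number,
  segmentDurationMs: number,
  availabilityTimeOffsetMs: number = 0
): number {
  // From Eq. 1, segment i is requestable once
  //   nowMs >= availabilityStartTimeMs + (i + 1) * segmentDurationMs - availabilityTimeOffsetMs
  const elapsedMs = nowMs - availabilityStartTimeMs + availabilityTimeOffsetMs;
  return Math.max(-1, Math.floor(elapsedMs / segmentDurationMs) - 1);
}

// Invented example: 6 s segments, stream started at t = 0, current time t = 12.5 s.
const edgeWithoutOffset = newestRequestableSegmentIndex(12500, 0, 6000);
const edgeWithOffset = newestRequestableSegmentIndex(12500, 0, 6000, 5500);
console.log(edgeWithoutOffset, edgeWithOffset);
```

In this example the offset moves the live edge of the window forward by one segment: without it, the newest requestable segment is the one that ended at 12 s, while with a 5.5-second offset the segment that started at 12 s is already requestable.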

Finally, if you have any questions about this article, please feel free to contact me at maxutility2011@gmail.com.

About the author

Bo Zhang is currently a staff video research engineer at Brightcove Inc. His areas of expertise include online video streaming, IP networking and telecommunications. He received a Ph.D. degree from George Mason University, an M.S. degree from University of Cincinnati, and a B.S. degree from Huazhong University of Science and Technology, all in computer science. He received the best paper award from ACM MSWiM 2011, the flagship research conference of ACM SIGSIM.

Bo Zhang is currently a staff video system engineer at Brightcove. He works for the video research team on video delivery, playback and CDN.
