This article describes a server-side-only method to reduce streaming startup delay using HTTP/2 server push. In the context of OTT video streaming, startup delay is defined as the time interval between the moment a viewer clicks the “play” button on the video player UI and the moment the first video picture appears on the screen. Depending on the network connectivity between a viewer and the CDN, OTT streaming startup delay can be as high as 2 seconds for VoD streaming. For live streaming, the delay can be even higher due to the wait time to join a live stream from a stream access point.
This article aims to reduce streaming startup delay by minimizing the round trips needed to download the required data before a player can render the first video frame. This is achieved by preemptively pushing all the leading files using HTTP/2 server push, before the video player requests them. A simple h2-push video server is implemented and available on GitHub: https://github.com/maxutility2011/h2_push_video_server. No player-side change is needed. Through experiments, I observed a reduction in startup delay of about 1 second.
Optimizing streaming experience using HTTP/2
HTTP/2 (IETF RFC 7540) has been around since 2015. It proposes several major improvements over HTTP/1.1, such as:
- Multiplexing multiple requests on the same TCP connection to increase throughput,
- Using HTTP header compression and binary protocol syntax to achieve higher protocol efficiency,
- Using HTTP server push to reduce web content load time.
HTTP/2 has been an enormous success since its inception. In 2020, one third of Internet traffic was carried over HTTP/2, up from 24% in 2019, and all Google services now run on HTTP/2.
Even though HTTP/2 can dramatically improve the web browsing experience, it does not automatically improve video streaming, unless server and player implementations take advantage of the new features. For example, a player could be modified to download multiple media segments in parallel, exploiting request multiplexing to improve throughput. However, this requires non-trivial changes to the player implementation: it must handle parallel downloads while still buffering and rendering the media segments sequentially.
A much simpler optimization is to use server push to reduce streaming startup delay. The basic idea is for the server to preemptively push the init segment and the first media segment to a player when it requests the manifest file. The player then does not have to request those two files before it can start rendering the first video frame. This saves at least two round trips that would otherwise be needed to transfer the files.
One question people may ask is which rendition (containing the init and the first media segment) the server should push, since the server has no idea how much download bandwidth is available on the player side when streaming begins. I found that many player implementations, such as ExoPlayer and Video.js, start playback from the lowest-bitrate rendition. As the player obtains more accurate bandwidth measurements and buffers enough media segments, it gradually increases the bitrate. Accordingly, in my design the server pushes segments of the lowest-bitrate rendition. Even if the player begins with a higher rendition and discards the pushed segments, the overhead is low given the small segment size of the lowest rendition: a few hundred kilobytes of extra download is a good trade for a 1-second delay saving. Furthermore, if the player ends up requesting segments from a different rendition, those segments can still be requested and downloaded as usual, so there is no delay penalty.
Implementation and experiment setup
I implemented a simple video server for this purpose using Node.js, which natively supports HTTP/2 server push. The code is very simple and follows the design described in the previous section. The server assumes MPEG-DASH streaming; however, other streaming protocols such as Apple HLS can also be supported with minimal modification.
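The protocol-specific part is essentially the mapping from the requested entry-point file to the leading files to push, so supporting HLS mostly means extending that mapping. A sketch of one possible mapping, with hypothetical file paths:

```javascript
// Map a requested entry-point file to the leading files the server should
// push. All file paths here are hypothetical placeholders for illustration.
function leadingFiles(path) {
  if (path.endsWith('.mpd')) {
    // MPEG-DASH: push the init and first media segments of the
    // lowest-bitrate audio and video renditions.
    return ['/video/init.mp4', '/video/seg-1.m4f',
            '/audio/init.mp4', '/audio/seg-1.m4f'];
  }
  if (path.endsWith('master.m3u8')) {
    // Apple HLS: push the lowest-bitrate media playlist and its first segment.
    return ['/low/playlist.m3u8', '/low/seg-1.ts'];
  }
  return []; // Not an entry-point file: nothing to push.
}
```

The server then pushes whatever this function returns for the requested path, with no other change to the push machinery.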
In my experiments, I ran the server on an AWS EC2 free-tier instance located in the us-west-2 (Oregon) region. In my demo, the server serves test videos to players over DASH on port 4433. I used the Dash.js player, but other DASH players should also work. The player runs in the Chrome browser, but any browser that supports HTTP/2 should work too. The player is physically located in the Boston area.
To compare with a regular, non-optimized server implementation, I also ran an HTTP/2 Nginx server on port 443 serving the same test video. Note that I did not enable server push in Nginx, so it behaves as a traditional request/response server. I measured the startup delay using the download times reported in the network tab of Chrome's dev tools. The total time to download the DASH MPD, the audio and video init segments, and the first audio and video media segments gives a good-enough estimate of the startup delay, because the player can only render the first video frame after downloading a whole segment. Any further delay in the browser's media engine is orthogonal to my optimization and can be ignored. Figures 1 and 2 below show the startup delay for streams served from my server implementation and from Nginx. As seen in the network tab, the leading files (audio and video “init.mp4”, audio and video “seg-1.m4f”) are all “pushed” (as indicated in the “Initiator” column) to the player.
Figure 1: startup delay for the optimized stream
Figure 2: startup delay for the non-optimized stream
The measured delay is 1041 milliseconds for the non-optimized stream (Figure 2) and 104 milliseconds for the optimized stream (Figure 1), a saving of 937 milliseconds, which is easily perceptible to viewers. I ran the experiment multiple times and observed similar results.
I provide a quick demo stream on my AWS EC2 instance. To play it, you can use the Dash.js player: simply copy the stream url into the player and click “load”. You can also play the non-optimized stream served by Nginx and compare the difference in delay. Finally, both streams play for only 2 seconds because my AWS free-tier instance does not have enough space to store the whole test video; this is why Figures 1 and 2 show “404 not found” for seg-2.m4f and onward.
If you have any questions regarding this article, please contact me at firstname.lastname@example.org.
IETF RFC 7540: Hypertext Transfer Protocol Version 2 (HTTP/2), https://tools.ietf.org/html/rfc7540
About the author
Bo Zhang is currently a staff video research engineer at Brightcove Inc. His areas of expertise include online video streaming, IP networking, and telecommunications. He received his Ph.D. from George Mason University, his M.S. from the University of Cincinnati, and his B.S. from Huazhong University of Science and Technology, all in computer science. He received the best paper award at ACM MSWiM 2011, the flagship research conference of ACM SIGSIM. He is involved in organizations such as the DASH Industry Forum, MPEG (ISO/IEC JTC 1/SC 29/WG 11), and CTA-WAVE.