What Is a Streaming Media Protocol?

This is another installment in our series of "What Is...?" articles, designed to offer definitions, history, and context around significant terms and issues in the online video industry. 
Choosing a streaming technology involves multiple considerations, including an understanding of the pluses and minuses of the streaming protocol used by the technology. This article defines a streaming protocol and then discusses the relative merits of the protocols used by today’s leading streaming technologies.

What's a Communications Protocol?

Communications protocols are rules governing how data is communicated, defining elements like the syntax of file headers and data, authentication, and error handling. There are easily dozens of protocols involved in sending a simple data packet over the internet, and it’s important to understand how they work together.

Briefly, the International Organization for Standardization (ISO) created the Open Systems Interconnection model, which defines seven logical layers for communications functions. All streaming protocols are in the application layer, which means they can use any layer beneath them for plumbing functions like transmitting data packets. This enables protocols within each layer to focus on a particular function, rather than having to recreate the entire stack of functions.
OSI Model
For example, the Real Time Streaming Protocol (RTSP) is an application-level streaming protocol that can use multiple protocols in the transport layer to transmit its packets, including the User Datagram Protocol (UDP) and Transmission Control Protocol (TCP). Sometimes application-level protocols are written specifically for a particular transport protocol, like the Real-Time Transport Protocol (RTP), which is typically built on UDP transport.
Hopefully, this brief overview will help you understand where streaming protocols live and how they interact with other, lower-level protocols. This is as technical as we get, folks, so from here on out it should be smooth sailing.
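To make the layering concrete, here is a minimal Python sketch, not taken from any particular player or server. It builds a bare-bones RTSP OPTIONS request once and shows that the same application-layer bytes could be handed to either a TCP or a UDP socket. The host name, the stream URL, and the helper names (build_rtsp_options, send_request) are assumptions made for illustration, and send_request is defined but never called because the host is made up.

```python
import socket

# Transport-layer choices available to an application-layer protocol.
TRANSPORTS = {"tcp": socket.SOCK_STREAM, "udp": socket.SOCK_DGRAM}


def build_rtsp_options(url: str, cseq: int = 1) -> bytes:
    """Build a minimal RTSP OPTIONS request (an application-layer message)."""
    lines = [
        f"OPTIONS {url} RTSP/1.0",
        f"CSeq: {cseq}",
        "",  # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def send_request(request: bytes, host: str, port: int, transport: str) -> None:
    """Hand the same bytes to whichever transport was chosen (hypothetical helper)."""
    with socket.socket(socket.AF_INET, TRANSPORTS[transport]) as sock:
        if transport == "tcp":
            sock.connect((host, port))
            sock.sendall(request)
        else:
            sock.sendto(request, (host, port))


# The URL is a placeholder; the request bytes do not depend on the transport.
request = build_rtsp_options("rtsp://media.example.com/stream")
print(request.decode("ascii"))
```

The point of the sketch is only that build_rtsp_options knows nothing about TCP or UDP; that decision lives one layer down.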

In the Beginning there was HTTP, and it Was Good

With this as background, let’s start examining the application-layer protocols used to stream video, starting with the granddaddy of them all, HTTP. As you probably know, HTTP stands for Hypertext Transfer Protocol, which is the lingua franca for the web. HTTP governs the communications between web servers and browsers and is the protocol used to distribute all the content on websites to remote viewers, including HTML text, GIF and JPG graphics, PDF files and other web-based (as opposed to FTP) downloads.
Early experiments with delivering video via HTTP were less than satisfactory for a number of reasons, not the least of which was the limited bandwidth available in the 28/56Kbps modems of the day. The first video files posted on the web were delivered via download and play, which meant they had to be fully downloaded before playback began. Then Apple pioneered the concept of progressive download, where the video could start to play as it was downloaded, which helped a bit, but didn’t provide functionality like lookahead seeking or random access.
The other big negatives of HTTP-delivered video were cost and quality of service issues. HTTP delivery is accomplished as fast as available bandwidth will allow. If a viewer connected via a high-speed connection, the entire video file was sent as quickly as possible. If the viewer stopped watching after a few moments, much of that transfer was wasted.
In addition, this mode of delivery made it difficult to serve multiple viewers. When viewer A clicked on the video, the server started sending the video as quickly as possible. When viewers B, C, D, and E clicked on the video, outbound bandwidth might be insufficient to serve them any video until the transfer to viewer A was complete.

The Rise of Streaming Protocols

As streaming media increased in importance, several streaming protocols were created to address these issues, including the aforementioned RTSP, Microsoft Media Server (MMS) and Macromedia’s (and then Adobe’s) Real Time Messaging Protocol (RTMP). At a high level, these protocols shared several common elements.
First was the existence of a streaming server, a software program charged solely with delivering streaming content. These streaming servers worked in conjunction with traditional HTTP servers so that when a viewer clicked a link on the HTTP server, it initiated a connection between the streaming server and the player that persisted until the viewer stopped watching. Because of this connection, these protocols are considered “stateful,” as compared to HTTP, which is stateless and has no persistent connection between server and player.
This connection addressed most of the negatives of HTTP delivery. Streaming protocols enabled seeking to random points in the video file, and adaptive streaming, where multiple encoded files could be distributed to the player based upon available bandwidth and CPU power. The server could meter out the flow of video to the player on a just-in-time basis, so if the viewer stopped watching, little extra bandwidth was wasted. Because the outbound flow was metered, a streaming server could more effectively serve multiple users, improving overall quality of service.
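A rough back-of-the-envelope comparison shows why metering matters. The Python sketch below is an illustration only: the video bitrate, link speed, viewing time, and five-second buffer are all made-up numbers, and the two functions are simplified models rather than how any real server schedules its output.

```python
def bytes_sent_unmetered(file_bytes: int, link_bps: int, watch_seconds: int) -> int:
    """Plain HTTP download: the server pushes as fast as the link allows."""
    return min(file_bytes, link_bps // 8 * watch_seconds)


def bytes_sent_metered(file_bytes: int, video_bps: int, watch_seconds: int,
                       buffer_seconds: int = 5) -> int:
    """Streaming server: send at roughly the video bitrate plus a small buffer."""
    return min(file_bytes, video_bps // 8 * (watch_seconds + buffer_seconds))


# Hypothetical scenario: a 10-minute, 500Kbps video on a 5Mbps connection,
# and a viewer who gives up after 30 seconds.
VIDEO_BPS = 500_000
LINK_BPS = 5_000_000
FILE_BYTES = VIDEO_BPS // 8 * 600
WATCHED_BYTES = VIDEO_BPS // 8 * 30

unmetered = bytes_sent_unmetered(FILE_BYTES, LINK_BPS, 30)
metered = bytes_sent_metered(FILE_BYTES, VIDEO_BPS, 30)

print("unmetered bytes wasted:", unmetered - WATCHED_BYTES)
print("metered bytes wasted:  ", metered - WATCHED_BYTES)
```

Under these assumed numbers, the unmetered transfer has already delivered about half the file by the time the viewer leaves, while the metered transfer has sent only the watched portion plus the buffer.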

HTTP - Back to the Future

Over time, as Flash video rose to dominate the streaming video landscape, RTMP became the dominant streaming protocol, and is still widely used today. However, with the introduction of Microsoft’s HTTP-based Smooth Streaming and Apple’s HTTP Live Streaming (HLS), HTTP-based streaming technologies began a resurgence for two sets of reasons: perceived negatives of RTMP, and innovations in HTTP technologies that addressed many of HTTP’s earlier shortcomings.
Perceived shortcomings in RTMP include:
  • RTMP packets may be blocked by certain firewalls, though the Adobe Media Server has workarounds if these problems are experienced.
  • RTMP packets can’t leverage standard HTTP caching mechanisms available within the networks of ISPs, corporations, and other organizations, which can improve distribution efficiency and quality of service.
  • The persistent server-to-player connection means increased costs, because streaming servers cost money.
  • The required server may also limit scalability as compared to HTTP-based streaming, since there are many more HTTP servers than RTMP servers.
I say "perceived," because as this article is being written (August 2012), RTMP is still used by sites like Bloomberg and TheStreet.com, which tends to cast doubt on the notion that RTMP can’t get through to heavily firewalled viewers. ESPN and MTV also use RTMP, which makes you question the scalability and cost issues.
These doubts aside, there’s a general perception among technical cognoscenti that HTTP-based technologies are more effective at delivering high-quality streams. Plus, Adobe introduced HTTP Dynamic Streaming (HDS) in 2010, providing a Flash-based alternative for those seeking HTTP-based streaming to the desktop. All of a sudden, changing to HTTP-based online video delivery no longer involved a seismic shift to a totally new technology; Flash users could continue to leverage their investment in Flash development and infrastructure and still gain the benefits of HTTP streaming.
As mentioned, several innovations in HTTP streaming also addressed previous limits of the technology. As before, there is no persistent connection between the server and the player; the video resides on any HTTP server and the technology remains stateless. However, now all HTTP-based streams are broken into chunks, either separate files or segments within a larger file. Rather than retrieving a single large file with a single request, HTTP-based technologies retrieve consecutive short chunks on an as-needed basis.
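The two chunking styles can be sketched in a few lines of Python. This is a simplified illustration, not the actual request logic of Smooth Streaming, HLS, or HDS: the segment file-naming pattern and the fixed chunk size are assumptions, and real formats typically look up byte offsets in an index rather than computing them. The only standard piece is the HTTP Range header, which asks a server for an inclusive byte range of a file.

```python
def chunk_as_file(base_url: str, index: int) -> tuple:
    """Chunks stored as separate files: each chunk has its own URL.

    The naming pattern here is hypothetical.
    """
    return f"{base_url}/segment_{index:05d}.ts", {}


def chunk_as_range(url: str, index: int, chunk_bytes: int) -> tuple:
    """Chunks stored inside one larger file: request a byte range of it.

    Assumes fixed-size chunks purely to keep the arithmetic visible.
    """
    start = index * chunk_bytes
    end = start + chunk_bytes - 1  # Range end offsets are inclusive
    return url, {"Range": f"bytes={start}-{end}"}


# Each tuple is (URL, extra request headers) for one ordinary HTTP GET.
print(chunk_as_file("http://video.example.com/show", 7))
print(chunk_as_range("http://video.example.com/show.mp4", 7, 500_000))
```

Either way, every chunk is a self-contained request that any HTTP server can answer without remembering anything about the player.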

This has multiple benefits. First, there’s little waste because the video is delivered as it is watched. This effectively meters out the video, enabling a single HTTP server to efficiently serve more streams. Seeking is no problem; if the viewer drags the playhead forward, the player can just retrieve the appropriate chunks. These technologies also enable efficient switching between streams, so all the listed technologies (Smooth Streaming, HLS, and HDS) stream adaptively.
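Seeking and stream switching both reduce to simple decisions made by the player. The Python sketch below is a hedged illustration: the chunk duration, the list of bitrates, and the 0.8 safety margin are assumptions chosen for the example, and shipping players use more elaborate heuristics that also weigh buffer levels and CPU load.

```python
def chunk_index_for_time(seek_seconds: float, chunk_seconds: float) -> int:
    """Map a playhead position to the chunk that contains it."""
    return int(seek_seconds // chunk_seconds)


def pick_rendition(bitrates_bps: list, measured_bps: float,
                   safety: float = 0.8) -> int:
    """Pick the highest bitrate that fits within a fraction of measured bandwidth.

    Falls back to the lowest bitrate when nothing fits.
    """
    affordable = [b for b in sorted(bitrates_bps) if b <= measured_bps * safety]
    return affordable[-1] if affordable else min(bitrates_bps)


# Hypothetical encoding ladder and measurements.
LADDER = [400_000, 800_000, 1_500_000]

print(chunk_index_for_time(95, 10))       # chunk holding the 95-second mark
print(pick_rendition(LADDER, 1_200_000))  # bandwidth is comfortable
print(pick_rendition(LADDER, 300_000))    # bandwidth has dropped
```

Because the choice is remade before each chunk request, the player can move to a different stream at any chunk boundary without involving the server.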
Since these technologies are delivered via HTTP, they sidestep the issues faced by RTMP. HTTP-based technologies are firewall friendly and can leverage HTTP caching mechanisms. Because no streaming server is required, they are less expensive to implement and can scale more cheaply and effectively to serve available users.
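The caching benefit is easy to see in a toy model. The Python sketch below is only an illustration under stated assumptions: CachingProxy and fake_origin are hypothetical stand-ins for an ISP or corporate HTTP cache and an origin server, and real caches also honor expiry and cache-control headers, which are ignored here.

```python
class CachingProxy:
    """Toy HTTP cache: fetch each URL from the origin once, then reuse it."""

    def __init__(self, origin_fetch):
        self._origin_fetch = origin_fetch
        self._store = {}
        self.origin_requests = 0

    def get(self, url: str) -> bytes:
        if url not in self._store:
            # Cache miss: go back to the origin server and remember the result.
            self.origin_requests += 1
            self._store[url] = self._origin_fetch(url)
        return self._store[url]


def fake_origin(url: str) -> bytes:
    """Stand-in for the origin HTTP server (hypothetical)."""
    return f"data for {url}".encode("ascii")


proxy = CachingProxy(fake_origin)
chunk_urls = [f"http://video.example.com/show/segment_{i:05d}.ts" for i in range(3)]

# Five viewers behind the same cache each request the same three chunks.
for _viewer in range(5):
    for url in chunk_urls:
        proxy.get(url)

print("chunk requests from viewers:", 5 * len(chunk_urls))
print("requests reaching the origin:", proxy.origin_requests)
```

Since every viewer asks for identical chunk URLs, most requests in this model never reach the origin, which is the efficiency that a persistent per-viewer RTMP connection cannot take advantage of.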

Final Points

Again, RTMP distribution is still widely and beneficially used by many websites today. However, while there may be little impetus for some sites to change, at this point if you’re considering a streaming technology, the overwhelming sentiment is to deliver via HTTP. Of course, for adaptive delivery to Apple devices (and Android 3.0 and higher), HLS is your only option.
It’s also useful to recognize that most video content is delivered via plain old HTTP progressive download. Sure, there are limitations, like the lack of adaptive streaming, but you can’t say it’s worked out too badly for YouTube, which delivers about 70% of video over the web, exclusively via progressive download. However, this approach prevents YouTube from deploying the digital rights management (DRM) techniques available via HDS, Smooth Streaming, and HLS to protect its videos, which is a key reason sites with branded content use these technologies.
Finally, the focus of this article has been general internet streaming. Particularly for intranet use, streaming-server-based protocols like IP Multicast and applications like peer-to-peer delivery provide lots of value and even more promise. So don’t throw out the baby with the bathwater; streaming servers aren’t “bad” and HTTP isn't "good." Rather, choose the best tool for the job.
