03 June 2011

HTML5 video and DRM, part two

This is a continuation of the first part.

Let's say you (or someone else) have resolved all previously mentioned issues. You've got yourself a half-working DRM implementation for your HTML5 video player. Remember the last diagram:


The streaming looks like this:
  1. The video player has finished loading. The user clicks play.
  2. The player opens a new Web Socket connection and sends a request for a stream to the content provider.
  3. (Assuming the request is valid.) The server finds the original video file.
  4. The requested video file is split into chunks of, say, 5 seconds. Each chunk is a proper video file with more or less the same headers. All processed chunks land in a new temporary location.
  5. The server reads the (first) chunk and encodes it as a Base64 payload.
  6. The payload is sent through the Web Socket to the user's browser.
  7. (An onmessage event is fired.) The player receives the payload and decodes it.
  8. The payload's data is converted to a Data URL. This URL is then set as the video element's source.
  9. Meanwhile (probably around step 7), a request for the next chunk is sent to the server.
  10. Back to step 5. This time the server sends the next chunk in sequence.
  11. Next, at step 8, the video source is not replaced at once, as that would cause skipping. Instead, the new Data URL waits for the previous chunk to finish playing (the ended or possibly timeupdate event). The moment the player is finished with it, the replacement occurs.
  12. When the last chunk has been transferred, the Web Socket connection is closed.
Of course, video file splitting and encoding are only necessary until we get binary payload support in the Web Socket protocol and streaming capabilities for the video element.
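To make steps 7 to 11 a bit more concrete, here is a rough sketch of the client-side hand-off. The names (VideoLike, ChunkPlayer, toDataUrl) are made up for illustration, and the video/mp4 MIME type is an assumption. VideoLike only mimics the two members of a real video element that the sketch touches, so the logic can be tried without a browser.

```typescript
// Minimal stand-in for the parts of a <video> element used below (hypothetical).
interface VideoLike {
  src: string;
  play(): void;
}

// Step 8: wrap a Base64 payload in a Data URL. The MIME type is an assumption.
function toDataUrl(base64Payload: string, mimeType: string = "video/mp4"): string {
  return "data:" + mimeType + ";base64," + base64Payload;
}

// Steps 7-11: play the first chunk at once, hold later chunks until "ended".
class ChunkPlayer {
  private video: VideoLike;
  private pending: string | null = null;
  private playing = false;

  constructor(video: VideoLike) {
    this.video = video;
  }

  // Meant to be called from the Web Socket's onmessage handler.
  onPayload(base64Payload: string): void {
    const url = toDataUrl(base64Payload);
    if (this.playing) {
      // Only one chunk is requested at a time, so a single slot is enough.
      this.pending = url;
    } else {
      this.swap(url);
    }
  }

  // Meant to be called from the video element's "ended" listener.
  onEnded(): void {
    this.playing = false;
    if (this.pending !== null) {
      const next = this.pending;
      this.pending = null;
      this.swap(next);
    }
  }

  private swap(url: string): void {
    this.video.src = url;
    this.video.play();
    this.playing = true;
  }
}
```

In a real page you would pass the actual video element and wire onPayload and onEnded to the socket and the element. If a chunk arrives late (after the previous one has ended), it is played immediately, which is exactly where the pauses come from.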

Further issues

Now, where to start? 
The cost of splitting and conversion is going to be painfully high. No doubt you'd normally do it only once and then cache the result at your CDN. Now imagine doing this for several quality settings... Still, disk space is cheap, so you can do it. The really bad things happen on the client side. The browser has to keep up with the constant switching of the video element's source. And this is done in a function fired by an event listener, so expect delays. As far as I know, there isn't a way to just make it work. Skipping and pauses will occur anyway (think about mobile devices).
Another important thing is keeping the media state and the player's controls consistent. You'd have to do a lot of calculations involving the whole video's length, the current chunk number and the position on the timeline.
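The timeline part of those calculations is mostly arithmetic. A minimal sketch, assuming fixed-length chunks and a zero-based chunk index (the function and field names are mine, not part of any API):

```typescript
interface TimelineState {
  position: number; // seconds from the start of the whole video
  fraction: number; // 0..1, handy for a progress bar
}

// chunkTime is the playback position inside the current chunk, in seconds.
function timelinePosition(
  chunkIndex: number,
  chunkTime: number,
  chunkSeconds: number,
  totalSeconds: number
): TimelineState {
  const raw = chunkIndex * chunkSeconds + chunkTime;
  const position = Math.min(Math.max(raw, 0), totalSeconds);
  const fraction = totalSeconds > 0 ? position / totalSeconds : 0;
  return { position, fraction };
}

// Format whole seconds as mm:ss for the player's clock.
function formatClock(seconds: number): string {
  const whole = Math.floor(seconds);
  const m = Math.floor(whole / 60);
  const s = whole % 60;
  return (m < 10 ? "0" : "") + m + ":" + (s < 10 ? "0" : "") + s;
}
```

With 5s chunks, being 1.5s into the third chunk (index 2) of a one-minute video gives a position of 11.5s, which the clock would show as 00:11.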

Seeking

Since we now split our source files into small chunks, seeking is not much of a problem. At any time the player can request the appropriate chunk from the server, fetch (and unpack) it and finally replace the current media source. On its own this gives us only coarse granularity, which is not particularly user friendly. A nice and quite easy improvement would be seeking to the exact location within the chunk. Let's say we need to jump to 00:12s with 10s chunks. We'd request the second part of the video and then we'd seek to 00:02s in our player.
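The chunk-plus-offset lookup from that example could look roughly like this. It is a sketch assuming fixed-length chunks and a zero-based chunk index; locateSeek and SeekTarget are names I made up.

```typescript
interface SeekTarget {
  chunkIndex: number; // zero-based, so 1 is the second chunk
  offset: number;     // seconds into that chunk
}

function locateSeek(targetSeconds: number, chunkSeconds: number): SeekTarget {
  if (chunkSeconds <= 0) {
    throw new Error("chunkSeconds must be positive");
  }
  const clamped = Math.max(0, targetSeconds);
  const chunkIndex = Math.floor(clamped / chunkSeconds);
  return { chunkIndex, offset: clamped - chunkIndex * chunkSeconds };
}
```

For the jump to 00:12s with 10s chunks, locateSeek(12, 10) yields chunk index 1 (the second chunk) and an offset of 2 seconds. Once that chunk is loaded into the video element, the offset would go into its currentTime property.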
The more important problem is, again, performance and further delays. It makes no sense to fetch any chunk more than once, so some kind of caching is desirable. And our workflow will take a lot of space: a 5s chunk of 1080p H.264 video weighs around 1.5-2MiB (your mileage may vary). To give the user some (minimal) margin of flexibility, we can store the current chunk and the two previous ones (when seeking, one of them would be the chunk ahead). For sure, we cannot store 6MiB of data in the user's cookies. Fortunately, we have HTML5 Local Storage/Session Storage. Currently, in browsers that support it, we can expect a 5MiB upper limit for our whole domain. That's not much, barely enough for ten seconds of HD video. This issue remains open.
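To see how tight that budget is, here is a back-of-the-envelope sketch. It assumes chunks are stored as Base64 text, that one stored character counts as one byte against the quota (browsers may count differently), and a hypothetical three-slot window named ChunkWindow that keeps the chunks nearest to the one just stored.

```typescript
const MIB = 1024 * 1024;

// Base64 turns every 3 bytes into 4 characters (including padding).
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// Would chunkCount chunks of chunkBytes each fit under the storage limit?
function fitsInStorage(
  chunkBytes: number,
  chunkCount: number,
  limitBytes: number = 5 * MIB
): boolean {
  return base64Length(chunkBytes) * chunkCount <= limitBytes;
}

// Keeps at most maxChunks entries, dropping the one farthest from the newest.
class ChunkWindow {
  private chunks = new Map<number, string>();
  private maxChunks: number;

  constructor(maxChunks: number = 3) {
    this.maxChunks = maxChunks;
  }

  put(index: number, dataUrl: string): void {
    this.chunks.set(index, dataUrl);
    while (this.chunks.size > this.maxChunks) {
      let farthest = index;
      this.chunks.forEach((_value, key) => {
        if (Math.abs(key - index) > Math.abs(farthest - index)) {
          farthest = key;
        }
      });
      this.chunks.delete(farthest);
    }
  }

  has(index: number): boolean {
    return this.chunks.has(index);
  }

  get(index: number): string | undefined {
    return this.chunks.get(index);
  }
}
```

Under these assumptions fitsInStorage(2 * MIB, 3) is false, and so is fitsInStorage(1.5 * MIB, 3): Base64 inflates the data by a third, so even the smaller chunks overshoot the 5MiB limit.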

Content security

The most important thing for a DRM implementation is the security of the content. Our model is not yet immune to eavesdropping. Never mind the transport: we can use the SSL'd version of the Web Socket protocol (wss://). The real problem reveals itself further down the chain. We fetch our chunks and happily send them to an HTML5 video element. Anyone with some knowledge of JavaScript could attach an event listener and scrape our whole content. One can try to wrap the player in a nice opaque object and then set all access methods to undefined. This would be a poor solution: a script attached earlier in the DOM could store its own (original) copy of all these methods.
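Why the "set it to undefined" trick fails can be shown with a toy object. This is only an analogy: LockablePlayer and currentSource are hypothetical stand-ins for whatever accessor exposes the content, not a real player API.

```typescript
interface LockablePlayer {
  currentSource: (() => string) | undefined;
}

function makePlayer(secretDataUrl: string): LockablePlayer {
  return { currentSource: () => secretDataUrl };
}

// A hostile script that runs first keeps its own reference to the accessor.
function grabAccessor(player: LockablePlayer): () => string {
  const original = player.currentSource;
  if (original === undefined) {
    throw new Error("accessor already removed");
  }
  return original;
}

// The lockdown described above: remove the access method from the object.
function lockDown(player: LockablePlayer): void {
  player.currentSource = undefined;
}
```

If grabAccessor runs before lockDown, the saved function still returns the content afterwards. Removing the property only hides it from scripts that look later.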
The only solution I have in mind is to pack it all into a browser extension. Then, on your website, you'd leave only a placeholder for the movie window. The extension would replace it with a proper player and handle all communication with the content server. You could even implement a crypto infrastructure between these two endpoints. It's also important for the extension to scan your site for scripts and check their legitimacy. If something smells fishy, you could stop playback and launch your retaliation measures (reload the whole page, display a message and so on).
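What "checking legitimacy" means is up to you. One very simple policy, sketched below under my own assumptions, is to flag every inline script and every script whose origin is not on a list you trust. The input is a list of absolute script URLs, with an empty string for inline scripts; in a browser something like that could be collected from document.scripts. The function names are made up.

```typescript
// Extract scheme://host[:port] from an absolute http(s) URL, or null.
function scriptOrigin(src: string): string | null {
  const match = /^(https?:\/\/[^\/?#]+)/i.exec(src);
  const found = match ? match[1] : undefined;
  return found ? found.toLowerCase() : null;
}

// Return the sources that are inline, not absolute, or from an unknown origin.
function findSuspiciousScripts(
  scriptSources: string[],
  trustedOrigins: string[]
): string[] {
  return scriptSources.filter((src) => {
    const from = scriptOrigin(src);
    return from === null || trustedOrigins.indexOf(from) === -1;
  });
}
```

An empty result would mean "keep playing"; anything else would trigger whatever retaliation you picked. Note that this only compares origins, so it says nothing about what a trusted script actually does.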

tl;dr: HTML5 video with DRM is possible but certainly not easy.
