Frame accurate video in HTML5

Dirk Willem van Gulik | 08:44 UK time, Monday, 21 February 2011

Hello, I am Dirk-Willem van Gulik, Chief Technical Architect here at the BBC. An important part of my job is to help the BBC use the right internet and web technologies - and help the industry and open standards bodies create the internet and web technologies which are right for the BBC.

Now the BBC is a very special place to work. And one of the main things which makes it so special is "Quality". At the BBC it is a currency, it is a goal, it is a culture - and as an engineer, it is something you are tasked to deliver.

One of our roles in FM&T is to provide our creative colleagues with tools. The tools they need for broadcast and to create high quality video. This includes tools for "non linear editing" - taking short clips, cutting them to the right length, stringing them together, adding some voice overs and graphics - and then endlessly tuning the resulting video so that it tells a story perfectly.

Usually we shoot hundreds of hours of video, import it onto an editing server, painstakingly tagging or "logging" the content on the way, and then edit each clip into something that makes sense. Because the original video files are so huge (especially in HD), we actually edit low resolution "proxy" versions of each file, and we store edit decisions using timecodes rather than actually mashing up the real video all the time. Then everything can be synced up and "conformed" using the original high-quality versions later on.
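The proxy workflow above comes down to storing references rather than pictures. A minimal sketch in TypeScript, assuming frame-numbered in and out points and a simple lookup from clip name to master file; the names `EditDecision`, `conform` and the file paths are hypothetical and not taken from any real editing system:

```typescript
// One edit decision: use frames [inFrame, outFrame) of a named source clip.
// All names here are illustrative assumptions, not a real EDL format.
interface EditDecision {
  clipId: string;
  inFrame: number;  // first frame used (inclusive)
  outFrame: number; // frame after the last one used (exclusive)
}

interface ConformedCut {
  media: string;
  inFrame: number;
  outFrame: number;
}

// Total running length of the edit, in frames.
function cutLength(decisions: EditDecision[]): number {
  return decisions.reduce((total, d) => total + (d.outFrame - d.inFrame), 0);
}

// "Conform": replay decisions made on the proxies against the master files.
function conform(
  decisions: EditDecision[],
  masters: Record<string, string>
): ConformedCut[] {
  return decisions.map((d) => {
    const media = masters[d.clipId];
    if (media === undefined) {
      throw new Error("No master file known for clip " + d.clipId);
    }
    return { media: media, inFrame: d.inFrame, outFrame: d.outFrame };
  });
}

// Two 75-frame shots (3 seconds each at 25 frames/second).
const edit: EditDecision[] = [
  { clipId: "interview", inFrame: 250, outFrame: 325 },
  { clipId: "cutaway", inFrame: 40, outFrame: 115 },
];
const masterFiles: Record<string, string> = {
  interview: "masters/interview_hd.mxf",
  cutaway: "masters/cutaway_hd.mxf",
};
const conformed = conform(edit, masterFiles);
```

Because only frame numbers are stored, the same list can be replayed against either set of files, which only works if every tool agrees on what "frame 250" means.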

Throughout all of this, timecodes play a major role. They are the key 'link' to get right. They ensure that recipes done on the proxy give identical (albeit at a higher resolution) results when repeated on the raw high resolution footage at the end. They ensure that the audio tracks are perfectly synchronized with the clips, that transitions start and end at exactly the right time (and there is not some extra black frame due to a rounding error). They are also important in the creative process - as they let us communicate. We can ask each other to look at a specific frame - or discuss whether we move a cut by a few frames to achieve a particular effect.

If this sounds a bit overly perfectionistic and artistic - then consider this - a cut every 3 seconds or so is quite normal. So if you are off by 1 frame either way - that is 2 frames in a 75 frame shot at 25 frames/second - then we're already talking errors of over 2%! Even a very pragmatic engineer would have to agree that that matters!

So timecodes using exact frame references are important. Really important. And the dirty little secret is that the internet has none. NONE! None of today's open standard technologies, or even the dominant proprietary ones, do timecodes right. They are off by one; they round to the nearest half second, they jump to the nearest previous I-frame. Whatever. (In all fairness - there are highly specialist products one can buy and install, usually with special browser plugins, which are accurate, often provided they are used with specially prepared material and within a single LAN. But none of those are conducive to the 'internet' network effect by facilitating collaboration between creative people across organisational barriers.)

So at the BBC we've been struggling with this. Because creative people want to work together, over the internet, from wherever they are. From their iPad, from their laptop, from a PC in an internet cafe near Tahrir Square. Anywhere, any time. Regardless of which production house they work for (as we outsource a lot, i.e. commission from third parties) and with workflows which often span many specialist companies. So right now - we cannot create BBC quality video using internet and web technology based tools.


Because the first thing a professional needs is a rock solid way to reference each and every individual frame accurately. So they can talk about it. For us, 'video on the web' today feels a bit like that plastic 1:1 model of a Spitfire[9]. It looks like one - but it sure does not fly.

Now over the past two months that landscape has started to radically change. A few of us[1] have been working with the various open standard and open source HTML5 communities. And as of this week, after 120 emails, the bleeding edge development versions of several HTML5 implementations (as used in Safari, Chrome, Mozilla and many others) are now fully frame accurate.

First was WebKit (the basis for Safari and Chrome, among others), which as of revision r77919 has frame accurate playback!

Really. Frame Accurate. Actually even more accurate than just a frame (which is important for audio). You can jump to any point in the video (e.g. 1 hour, 3 minutes, 6 seconds and 5 frames, or to frame 178127) - and it will be exactly at that frame. Not at the nearest I-frame, rounded down to the nearest second, or off by one. No, it will be exactly at that very frame.
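To make "jump to that very frame" concrete: an HTML5 media element is positioned by setting its standard `currentTime` property, in seconds, so a timecode has to be turned into a frame count and then into a time. What follows is a rough sketch only, assuming a constant, integer frame rate such as 25 frames/second and non-drop-frame `HH:MM:SS:FF` timecodes. The function names, the small `Seekable` interface and the choice to aim at the middle of the frame are assumptions made for this illustration, not anything prescribed by the browsers or by this post.

```typescript
// Anything that can be positioned by time in seconds. An HTML5 <video>
// element fits this shape through its standard currentTime property.
interface Seekable {
  currentTime: number;
}

// Convert a non-drop-frame "HH:MM:SS:FF" timecode to an absolute frame number.
// Assumes a constant, integer frame rate (e.g. 25 or 50).
function timecodeToFrame(timecode: string, fps: number): number {
  const parts = timecode.split(":").map((p) => Number(p));
  if (parts.length !== 4 || parts.some((p) => isNaN(p))) {
    throw new Error("Expected HH:MM:SS:FF, got " + timecode);
  }
  const [hours, minutes, seconds, frames] = parts;
  if (frames >= fps) {
    throw new Error("Frame field must be below the frame rate");
  }
  return ((hours * 60 + minutes) * 60 + seconds) * fps + frames;
}

// Time in seconds at the middle of a frame. Aiming at the middle rather than
// the leading edge is an assumption made here to stay clear of rounding at
// frame boundaries.
function frameToSeconds(frame: number, fps: number): number {
  return (frame + 0.5) / fps;
}

// Position a player on an exact frame.
function seekToFrame(player: Seekable, frame: number, fps: number): void {
  player.currentTime = frameToSeconds(frame, fps);
}

// 1 hour, 3 minutes, 6 seconds and 5 frames at 25 frames/second.
const targetFrame = timecodeToFrame("01:03:06:05", 25); // 94655
```

In a page this might be used as `seekToFrame(document.querySelector("video")!, targetFrame, 25)`; whether the picture shown is then really that frame is exactly what depends on the browser fixes described here.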

So today, the HTML5 community has opened a door for us, one which will allow creative people to collaborate and edit professional video on the web.

Do know though that, while key, this is just a first step. There is a lot to still build, so we'll need many hyper creative companies and internet engineers working together to make this work. We need to create a new breed of web based production tools which can interact at the quality levels professionals and the BBC expect. And we still have issues around UMIDs (unique global references for video) to crack. And even some very basic things (like did you know that a pixel in the video world is actually rectangular, rather than square?!) will need to be universally understood between the broadcasting and internet engineers. But boy, getting timecodes, that is a big step!

Again - a big thank you to the open source folks of WebKit and Mozilla. IE9 is not quite there - (progress is tracked at ) but Microsoft has let us know that we "can expect the video-frame-accurate seeking be available when IE9 is final"!

[1] To give credit where credit is due: within the BBC, Raymond Le Gué (programme director at BBC) insisted on having frame accurate playback in the browser. Rob Coenen went above and beyond the call of duty to make this happen - patiently working with the wider developer community, explaining , why film and television production cannot live without it, proving that it was not working in browsers and helping the developers to fix it. He got help from Bas Schouten (at Mozilla), Andy Armstrong and Dirk-Willem van Gulik (both at the BBC).

But most credit should go to the open standards and open source communities around WebKit, Chrome and Mozilla which made it happen: Andrew Scherkus and the Chromium team get credit for being . The actual fixes were ultimately created by Jer Noble, Eric Carlson (both at Apple) and Chrome developer Andrew Scherkus; while Matthew Gregan and Anthony Hughes did the job for Mozilla.

Dirk-Willem van Gulik is Chief Architect, BBC Future Media & Technology

SMPTE timecode based and frame accurate metadata logging is now possible over the web with HTML5. The image above is a made up screen shot of what a prototype tool to do this might look like.

Comments

  • Comment number 1.

    Excellent work! It's great to see such positive responses from the browser developers and awesome to see HTML5 going towards true usability.

  • Comment number 2.

    looks to be a 404 to me

  • Comment number 3.

    is the correct URL.

  • Comment number 4.

    Do drop-frame timecodes work correctly? If not, then I'm glad it's you not me who's trying to get those to work...

    Though to be honest W3C should have consulted the right people in the first place and built a proper video system instead of people having to retrofit advanced features onto a mere shell of a standard which browser vendors have implemented differently.

  • Comment number 5.

    Looking fantastic, great that you've been pushing progress on this front! I'm quite interested in the web based video logger in the screen shot, this is something I've been looking into as well. Mainly interested in what you're feeding it from in terms of video and meta data and indeed what the meta data is being fed back into, is this linked to an ingex or sQ server system (I know you've just bought a few more of these from Quantel) or something else?

  • Comment number 6.

    That's lovely. Can we have HTML5 video for we mere viewers now, or are we to remain saddled with closed, proprietary, DRM crapware for ever?

    Or in other words - nice technical job; what are you actually going to use it for?

  • Comment number 7.

    mrg17/Andy Wilson - thanks for the heads up on the broken link. It's now working.

  • Comment number 8.

    @kierank1 "Do drop frames work" - the short answer is - we've not properly tested that (a rudimentary test with Apple's 'Blip Blop' sample suggests it is correct).

    The reason for this is that drop frames* are very specific to NTSC, which is dominantly used in the Americas and some parts of Asia. In the BBC we use PAL (and hence we tested with 25 and 50 frames/second). We'll leave the detailed testing of that to our American brethren (happy to include a high quality test video though on the above test site).

    Thanks, Dw.

    *: for those curious - NTSC has roughly 29.97 frames/second (30000/1001). Which is not a nice round number. To make the timecode line up with the clock again - we drop (i.e. skip) two frame 'counts' at the start of every minute (except every 10th minute). No pictures are thrown away, only numbers - pretty much the same concept as is behind the leap day on the 29th of February which we have every 4 years (and then some special refinement every 100 and 400 years).
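The drop-frame counting rule in the footnote above can be sketched as follows. This is a commonly used formulation for NTSC's nominal rate of 30000/1001 (about 29.97) frames per second, where timecode is counted as if there were 30 frames per second and two frame numbers are skipped at the start of every minute except each tenth minute. The function name is hypothetical, and as the comment itself says, none of this was properly tested against the browsers.

```typescript
// Convert an absolute frame count into a 29.97 drop-frame timecode label.
// Only labels are skipped; every picture is still counted.
function frameToDropFrameTimecode(frameNumber: number): string {
  const nominalFps = 30;              // timecode counts as if 30 frames/second
  const dropped = 2;                  // labels skipped per dropping minute
  const framesPerMinute = nominalFps * 60 - dropped;               // 1798
  const framesPerTenMinutes = nominalFps * 600 - dropped * 9;      // 17982

  const tenMinuteBlocks = Math.floor(frameNumber / framesPerTenMinutes);
  const remainder = frameNumber % framesPerTenMinutes;

  // Add back the labels skipped so far: 18 per full ten-minute block, plus
  // 2 for each dropping minute already passed inside the current block.
  let labelled = frameNumber + dropped * 9 * tenMinuteBlocks;
  if (remainder > dropped) {
    labelled += dropped * Math.floor((remainder - dropped) / framesPerMinute);
  }

  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  const frames = labelled % nominalFps;
  const seconds = Math.floor(labelled / nominalFps) % 60;
  const minutes = Math.floor(labelled / (nominalFps * 60)) % 60;
  const hours = Math.floor(labelled / (nominalFps * 3600));

  // A semicolon before the frame field conventionally marks drop-frame.
  return pad(hours) + ":" + pad(minutes) + ":" + pad(seconds) + ";" + pad(frames);
}

const lastOfFirstMinute = frameToDropFrameTimecode(1799);  // "00:00:59;29"
const firstOfSecondMinute = frameToDropFrameTimecode(1800); // "00:01:00;02"
const tenthMinute = frameToDropFrameTimecode(17982);        // "00:10:00;00"
```

Labels ;00 and ;01 never appear at the start of minute one, while the tenth minute starts at ;00 as normal, which is the "except every 10th minute" part of the rule.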

  • Comment number 9.

    @ewan: "what are you actually going to use it for".. Well - right now, today, for "Nothing".

    As these 'improvements' are not yet on the market - they've only just gone into the source code repositories of the browsers. From here it needs to go into Alphas, Betas, release candidates and then gradually become commonly available. That is not a process of weeks or months.

    However it does inform us in the timescale of years.

    It means that innovative companies can suddenly build editing tools, review tools, logging tools or perhaps a whole new class of creation. Tools which are truly 'on the internet'. And which, at least in theory, have the frame accuracy to allow professional use.

    But there is still a long way to go; audio is complex, colour is difficult to get right - performance and efficient use of bandwidth fiendishly complex. However - frame accuracy was one of the major hurdles to clear.

  • Comment number 10.

    Glad you’re trying to improve the state of web browsers, just make sure you don’t favour one more than others. You didn’t mention Opera: should I assume because they don’t have a public tracker that you didn’t link it, or they didn’t have the problem?

    Fully expecting WebM iPlayer playback soon…

  • Comment number 11.

    I've been playing with this type of thing for a side project for a little bit and have a very rough working prototype that works in Safari 5.0.3. Not 100% but has most of the tools you'd expect including keyboard navigation through the video.

  • Comment number 12.

    Excellent, the first spin off already! I'm curious to see what other web-based and frame-accurate broadcast tools will emerge.

  • Comment number 13.

    I'm planning on adding in proper timecode support now I know it's coming. The project is a collaboration tool for people who are remote working in the visual industries. SFX houses, soundtrack composers, freelance editors that kind of thing.

  • Comment number 14.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 15.

    @johndrinkwater - on " You didn’t mention Opera: should I assume because they dont have a public tracker that you didnt link it, or they didnt have the problem?"

    As you may have seen (assuming you are indeed an Opera user) - their current browser seems to jump to a nearby second mark (with their latest Presto/2.7 engine).

    And you are correct - Opera does indeed not operate a public tracker - so you'll have to await their next release and its release notes.

    The good news is that Opera's core developers are well aware of the issue and are actively working with the community (see for example the WHATWG mailing list (whatwg.org) around 2011-2-21).

  • Comment number 16.

    @Dirk

    Thanks for the response. I agree in Europe drop frames are not an issue but if you want to get people to use the HTML5 timecode support, it has to be feature complete. In my opinion leaving parts of the standard implementation-dependent or incomplete was one of the real problems with HTML5.

  • Comment number 17.

    "And even some very basic things (like did you know that a pixel in the video world is actually rectangular, rather than square?!)".

    Is that because you are working only with 720x576 and 1440x1080 video? What about 1280x720p50 and 1920x1080p50 video (surely they use square pixels - and SD video could use square pixels if it was sampled that way)? Will you be allowing use of those formats too (including 1080p60 etc)?

    What about the refresh rate of the LCD screens in use? Won't most people be using 60Hz (or is it 59.94Hz?) LCD monitors? Won't that mean pull-down judder problems with most 25Hz & 50Hz content?

    Have you done tests of European "100Hz" HDTVs (eg. LCD/Plasma) and do they actually operate at exactly 100Hz - even though their input for PC use is 60Hz? How does that affect BBC 25/50Hz programmes when the PC input to a "100Hz" TV is 60Hz (which is the rate they recommend).

    What if you are making a TV programme that incorporates 24Hz (or 23.976Hz) and 60Hz (59.94Hz) content - eg. a film review programme or the BAFTA film awards? Wouldn't it be better if the content was in its original form? eg. for something like the BAFTA film awards couldn't the film clips be shown at whatever they were shot at (eg. 24/23.976 or more for things like Avatar 2), but the rest of the program be at 50Hz? ie. allow variable frame rates?

    Seeing as you are helping set standards - couldn't you encourage video/film content to be made and encoded at integer rates (eg. in the film/US world).

    So couldn't you allow higher frame rates than 50Hz and allow frame rates of the source footage to be kept at its native rate without speeding up/slowing down or similar conversions that might involve judder/interpolation?

  • Comment number 18.

    Appreciate the work being done here and the informative article, however there was no mention of the wretched DRM policy of the BBC and how the use of open html5 video technologies and DRM will or can co-exist?

  • Comment number 19.

    @HD1080 - thanks - those are really good comments and questions. Rather than answer them here - we've been preparing another more elaborate post on exactly these topics for later in the year.

    Do not hold your breath though - it is a complex story to craft - and we want it to be exactly right so as to solicit valuable community feedback.

  • Comment number 20.

  • Comment number 21.

    Are you re-inventing the wheel? Don't some current logging applications such as CatDV already allow frame accurate logging and rough cuts to be made over the web?

  • Comment number 22.

    Dirk Willem van Gulik, I am a Web user :) I prefer to use Firefox, though it is not my sole browser.
    Sadly I worry these changes won’t be useful for BBC content for me for ages, all of my browsers are sent formats I can’t view.

  • Comment number 23.

    Hi Dirk!

    For someone with such an important position at such an august body as the BBC, you are surprisingly and seriously out of date with your implication that there are no time-coded frame accurate editing systems available on the web without "special" browser plugins, and used with specially prepared material and within a single LAN.

    In fact, you have been able to edit timecoded frame accurate video through a browser over the public internet as a professional for around six years and as a consumer for around five years. Over 1,000,000 hours of professionally shot source material has already been handled by such systems.

    Last time I tried it, anyone on the BBC desktop inside the BBC firewall could use such tools - with neither installation of special software nor configuration of their PCs - provided they had a standard web browser with the common standard plugins installed.

    Today, Android users have access to similar technology on their tablets.

    Your house rules prevent me from giving details, but I would welcome contact from you if you would like a demonstration. It could save you a few years' work - and help you avoid the blind alleys which can sometimes result when people with little practical experience of an existing working solution try to set a standard without being aware of the lessons already learnt elsewhere over many years.

  • Comment number 24.

    23. At 18:44pm on 27th Feb 2011, sbstreater
    -------------------------------------------------------------

    I thought there was just a long line of failures over the last 15 years.... producer desktop, DMI etc etc have all struggled on but not delivered the functionality required for proper roll-out.

  • Comment number 25.

    Hi sbstreater!

    I'm aware of what you are pointing at, and it's not HTML. It's using the Java plug-in, which is just another plug-in like Flash or Silverlight - and definitely not part of the HTML standard. As you probably know, the big guys like Apple and Microsoft are not so keen on this and have a policy of not installing these plug-ins into the web browsers anymore. For 'Android' the same logic applies: there are just applications running under the Android OS, or Apple's iOS for that matter - and not part of HTML.

  • Comment number 26.

    Hi Rob Coenen!

    As you have noticed, my company uses Java for its Cloud video platform. My research has shown that Java is currently the only widely available solution for a responsive frame accurate video editing system running over the internet - even including the mobile internet.

    Apple and Microsoft are of course competitors to Java - because they benefit from trapping people into their own architectures, and Java is cross platform.

    Microsoft does not decide whether Java is installed or not any more than they decide which disks are installed - this question is decided by the PC manufacturer - so this comment shows a bit of a misunderstanding of the situation. And the vast majority of Windows machines come with Java installed (and it is free to add if you need to).

    As it happens, all the Apple Macs I have used come with Java as standard, and this is currently Apple's policy I believe. As the hardware manufacturer, they decide what goes on their boxes.

    The problem with trying to do real time video in HTML is that it is simply not appropriate. The over reliance on the remote server cripples performance, and the simplistic HTML being proposed does not allow tight enough integration between the high-CPU intensive video codecs and the other real time demands on the system from a video editing system.

    Yes - Java is the obvious solution. It is a very common plug in, and can be installed for free in the rare case where it is not installed. Last time I looked, every BBC desktop and ITV desktop had Java installed, for example.

  • Comment number 27.

    Hello sbstreater,

    my research has shown that other plug-ins (Flash and Silverlight) can do the same trick. But they are proprietary third-party plug-ins. I do agree with you that 5 years ago it was probably not appropriate to use HTML for real time video - which is pretty much the reason why the Flash plug-in is installed on 99% of the internet-enabled computers. But the HTML5 community has been working hard to get HTML5 Video ready - and it does work now: just grab any of the latest nightly builds and see for yourself: HTML5 has real-time, frame-accurate, plugin-free video using just open standards.

  • Comment number 28.

    Hi Rob Coenen!

    I don't think anyone doubts the good work being done with open standards. Although it's a pity that HTML 5 supports the patented MPEG codecs - no more open than Java, wouldn't you say? - And that some major suppliers and browsers support only this patented video format.

    As I see it there are two types of standards. The first type tells you that you can only do something in a particular way - a restrictive standard. You must put your video in this particular format, which "we" have designed and optimised to do something which may not be what you want to do with your video. It may be the wrong datarate, the wrong CPU requirements, or be good for streaming but bad for editing. Or good for server side, but bad for scalable rich client services. Or wrong in ways "we" (or you) haven't even thought of. But that's just tough, because "we" are restricting what you can do with your video to what "we" understand, because in our arrogance, "we" can't imagine that we haven't thought of something and "we" think "we" know best.

    The second type is an enabling standard. This says you can write any software you like - including to handle your own video format - and it will work on any standard conforming device. Java is such an enabling standard.

    Rather than adding ideas one at a time to a standard like HTML - which takes years to ratify and roll out across the installed base - just adding one enabling standard like Java allows every standard conforming device to do everything. Of course, later improvements to the standard can make these easier and more efficient - but you are not restricted to applications people knew about when they wrote the standard.

    I can run the latest version of my frame accurate video editing software on any browser even with ten year old versions of Java - and (with this piece of specially written software), I benefit from many technical features not even being considered for HTML. How long will it be before the installed base of HTML can incorporate these new ideas, let alone work on existing PCs built since 2000 without needing to upgrade them?

    I think we can say the answer to that is Never. This constant churn of standards will always lag the innovation in the market.

    The reality is that the video in HTML 5 only looks good because what was there before was so totally lacking. But it is a poor substitute for a real innovative platform like an efficient virtual machine like Java - which is the real advance HTML is looking for.
