Category Archives: Javascript

GUIMark 2: The rise of HTML5

Posted by Sean Christmann

Introduction

Two years ago I had an itch that needed scratching. “RIA” was the future of the web and every major company seemed to have a solution to get us there. I developed the first version of GUIMark not only to get a good understanding of the respective technologies, but also to give my clients through EffectiveUI and everyone else something to actively gauge the rendering performance of the different runtimes. After releasing it I got a good response from both the tech community and several platform engineers interested in resolving problems. There were, however, two serious flaws in the test that immediately stood out. First, the test relied too heavily on text layout performance; I was barely engaging the vector and bitmap side of the rendering engines. Second, the test was too artificial, and developers tend to resist optimizing APIs against unrecognizable test cases.

Evolution

Fast forward to today and the web is a different beast. Attempt to shine a positive light on a plugin technology and you will be booed off the stage. Create something fun and silly in HTML5 and you’ll have hundreds of thousands of visitors pounding down the front door of your blog to speculate on the death of Flash. It’s undeniable that a new anchor technology is taking root in the web space, and needless to say I’ve got a new itch to scratch.

GUIMark 2

Like the first GUIMark, this new benchmark is designed for one sole purpose: to burn a hole in your CPU. I still believe that by completely saturating the rendering pipeline, we can get a better idea of which technologies are best suited for running interactive content on the web. Developers tend to focus primarily on the speed of the programming language itself, when in reality most of your CPU time is spent inside internal rendering APIs. I also firmly believe that any benchmark testing rendering performance should stick to sub-60 fps numbers. Almost all users on the web today are browsing with 60Hz LCD monitors and there’s no reason to design a test that has to throw away frame data.
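For readers who want to take their own readings, here is a minimal sketch of a frame counter. It is not the code GUIMark uses; the name `createFpsMeter` and the choice to average over the whole run are assumptions made for illustration.

```javascript
// Hypothetical helper, not taken from the GUIMark source.
// Call tick(nowMs) once per rendered frame; averageFps() reports
// frames rendered per second over the whole run.
function createFpsMeter() {
  let firstMs = null; // timestamp of the first frame
  let lastMs = null;  // timestamp of the latest frame
  let frames = 0;     // frames seen after the first one

  return {
    tick(nowMs) {
      if (firstMs === null) {
        firstMs = nowMs;
      } else {
        frames += 1;
      }
      lastMs = nowMs;
    },
    averageFps() {
      if (frames === 0 || lastMs === firstMs) return 0;
      return (frames * 1000) / (lastMs - firstMs);
    },
  };
}
```

In a page you would call `tick(Date.now())` at the end of each frame's drawing code and read `averageFps()` once the run is over.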

While the new benchmark sticks to the original in theory, this version introduces some much needed changes. First, I’ve split GUIMark into 3 separate tests: Vector, Bitmap, and Text rendering, and I’ve attempted to make the test cases as real world as possible. Second, I’ve only implemented these tests in HTML5 and Flash. I’m not opposed to adding Silverlight and JavaFX to the benchmark; it’s just that I didn’t have the time to build them right now and something tells me a much smaller percentage of the internet crowd is interested in those results anyway. (Feel free to flame me in the comments section for that one.) Lastly, I’ve added mobile versions of some of the tests. We’ve all heard inflammatory statements from certain CEOs about mobile web performance, so let’s see if the numbers back that up.

Enough already, on to the results.

Test environment

All of the tests below were performed on a 15″ unibody MacBook Pro with a 2.53 GHz Intel Core 2 Duo and an NVIDIA GeForce 9400M. On the Mac side I’m running Snow Leopard with Flash Player 10.0.45.2 installed. On the PC side I have Windows 7 32-bit with Aero turned on and again with Flash Player 10.0.45.2. For Linux I ran a Linux Mint 8 Live CD with Firefox and Flash Player 10.0.45.2. Unfortunately running off the Live CD meant no access to NVIDIA drivers.

Vector Charting Test

This benchmark is designed to stress the vector APIs by simulating a streaming stock chart. The test makes heavy use of strokes with complex alpha fills. Originally I had added gradient fills into the mix to make sure that a good majority of vector APIs were being flexed, but there was no significant difference in the results so I pulled them out to make the visuals cleaner. While the source may appear to be heavy on the Javascript side, the actual execution time of the code, excluding canvas draw calls, is less than 1 millisecond.
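To give a feel for the kind of work involved, here is a rough sketch of one frame of a streaming chart drawn with strokes and an alpha fill. It is not the benchmark's source: the function names, colors, and data layout are all assumptions, and only standard Canvas 2D context calls are used.

```javascript
// Illustrative sketch, not the GUIMark source.
// ctx is a Canvas 2D context; values is an array of numbers in [0, 1].
function drawChartFrame(ctx, values, width, height, lineWidth) {
  if (values.length < 2) return;
  const step = width / (values.length - 1);

  ctx.clearRect(0, 0, width, height);

  // Trace the series from left to right.
  ctx.beginPath();
  ctx.moveTo(0, height - values[0] * height);
  for (let i = 1; i < values.length; i++) {
    ctx.lineTo(i * step, height - values[i] * height);
  }

  // Stroke the line itself.
  ctx.lineWidth = lineWidth;
  ctx.strokeStyle = 'rgba(0, 120, 255, 1)';
  ctx.stroke();

  // Extend the same path along the bottom edge and fill it with alpha.
  ctx.lineTo(width, height);
  ctx.lineTo(0, height);
  ctx.closePath();
  ctx.fillStyle = 'rgba(0, 120, 255, 0.3)';
  ctx.fill();
}

// Streaming: drop the oldest sample and append the newest.
function pushSample(values, sample) {
  values.shift();
  values.push(sample);
  return values;
}
```

Nearly all of the cost sits in `stroke()` and `fill()`; the loop that builds the path is trivial by comparison, which is the point the paragraph above makes about script time.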

| Browser | HTML5 (fps) | Flash 10 (fps) |
|---|---|---|
| Windows 7 | | |
| Internet Explorer 8.0.7600 | N/A | 30.7 |
| Firefox 3.6.3 | 15.73 | 29.65 |
| Chrome 4.1.249 | 6.41 | 26 |
| Opera 10.53 | 24.77 | 29.9 |
| Safari 4.0.5 | Safari* | 29.5 |
| Average | 15.64 | 29.15 |
| Snow Leopard | | |
| Safari 4.0.5 | 4.04 | 20.55 |
| Firefox 3.6.3 | 3 | 23.92 |
| Chrome 5.0.342 | 2.86 | 25.48 |
| Opera 10.10 | 12.22 | 15.24 |
| Average | 5.53 | 21.29 |
| Linux Mint | | |
| Firefox 3.5.9 | 14.61 | 22.88 |

*Safari on Windows 7 will not animate the chart; it only renders one frame each time I press down on my mouse button.

Results are all over the place for this test. On the HTML5 side Opera delivers the best performance on both platforms. Flash on the PC is consistently high, but on OS X, Chrome takes the top spot. Linux pulls off great numbers despite running off a Live CD. HTML5 on the Mac side requires closer inspection though. When I first made the test and showed it to my co-worker John Blanco, he started ripping apart the code to find any mistakes I might have made. What he discovered was that by changing the stroke size on my lines from 2 pixels to 1 pixel, performance in Safari, Firefox and Chrome shot up to rates closer to Flash, while Opera stayed at the exact same FPS.

1 Pixel Stroke Results
Safari – 23.59 fps
Firefox – 17.43 fps
Chrome – 17.12 fps
Opera – 12.12 fps

Flash on the Mac, as well as HTML5 and Flash on PC, were largely unaffected by this change though, gaining maybe a single frame per second by changing to 1 pixel strokes. I’m not sure what to make of these findings. What kind of bug causes this, and what side effects might be introduced by fixing it? Will a change allow both 1 and 2 pixel strokes to run at higher speeds, or will they both settle somewhere near Opera’s numbers?

Bitmap Gaming Test

The bitmap test was designed to simulate a tower defense type game. The test stresses pushing around lots of bitmap assets that animate each frame. The entire rectangle view needs to be cleared each frame to account for all the changes happening in the scene. The test supports a minimal amount of z depth ordering but not so much as to cause user scripts to take more than 1 millisecond to execute. Both environments are using anti-aliasing to scale the bitmap images.
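The per-frame work described above could look something like the sketch below. It is an assumption-laden illustration rather than the test's actual code: the sprite fields, the horizontal sprite-strip layout, and the sort-by-`y` depth rule are all made up for the example.

```javascript
// Illustrative sketch, not the GUIMark source.
// Each sprite: { image, frames, frame, frameW, frameH, x, y, scale },
// where `image` is assumed to be a horizontal strip of animation frames.
function renderSprites(ctx, sprites, width, height) {
  // Everything moves, so wipe the whole view each frame.
  ctx.clearRect(0, 0, width, height);

  // Minimal depth ordering: sprites lower on screen are drawn last (on top).
  const ordered = sprites.slice().sort((a, b) => a.y - b.y);

  for (const s of ordered) {
    // Advance the animation by one frame, wrapping around.
    s.frame = (s.frame + 1) % s.frames;

    // Source rect picks the current frame out of the strip;
    // destination rect applies the scale.
    ctx.drawImage(
      s.image,
      s.frame * s.frameW, 0, s.frameW, s.frameH,
      s.x, s.y, s.frameW * s.scale, s.frameH * s.scale
    );
  }
}
```

The sort and the frame bookkeeping are cheap; the scaled `drawImage` calls are where the rendering engine does its work.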

| Browser | HTML5 (fps) | Flash 10 (fps) |
|---|---|---|
| Windows 7 | | |
| Internet Explorer 8.0.7600 | N/A | 17.34 |
| Firefox 3.6.3 | 5.78 | 17.7 |
| Chrome 4.1.249 | 10.1 | 15.98 |
| Opera 10.53 | 13.59 | 17.23 |
| Safari 4.0.5 | Error* | 17.29 |
| Average | 9.82 | 17.1 |
| Snow Leopard | | |
| Safari 4.0.5 | 11.76 | 13.21 |
| Firefox 3.6.3 | 7.5 | 14.09 |
| Chrome 5.0.342 | 7.4 | 19.96 |
| Opera 10.10 | 5.86 | 14.53 |
| Average | 8.13 | 15.44 |
| Linux Mint | | |
| Firefox 3.5.9 | 4.84 | 10.91 |

*Safari on PC again only renders one frame per mousedown event, so the results are impossible to verify.

These results are really surprising. Chrome on OS X manages to push Flash higher than even Windows based browsers. I was so surprised I ended up rebooting and running the test again just to make sure something wasn’t wrong. We’re starting to see a trend where HTML5 on average runs slower for Canvas based animations and I’ll explain why a bit further below. Linux takes a huge performance hit in this test but the percentage difference roughly mirrors the other platforms. With NVIDIA drivers I’d imagine the real numbers would be closer to Mac performance.

Text Column Test

This test is designed to push the text layout and rendering engine in HTML and Flash. The test utilizes custom fonts introduced with CSS3 as well as multibyte character strings. This is my least favorite test in the group because it doesn’t simulate any real world use case; however, it should provide a good estimate of how quickly a page full of text can be calculated. I call it the “iceberg” test since 80% of the hit on the CPU happens outside the renderable view. It works because although text that overflows outside the textblock doesn’t get rendered, it does have to get calculated in order to know how many lines of text can be scrolled. HTML pages do this all the time when you load a site with text below the fold.

| Browser | HTML5 (fps) | Flash 10 (fps) |
|---|---|---|
| Windows 7 | | |
| Internet Explorer 8.0.7600 | 21.79* | 1.51 |
| Firefox 3.6.3 | 24.7 | 1.5 |
| Chrome 4.1.249 | 23.58* | 1.44 |
| Opera 10.53 | 21.16 | 1.49 |
| Safari 4.0.5 | 30* | 1.46 |
| Average | 24.24 | 1.48 |
| Snow Leopard | | |
| Safari 4.0.5 | 27.26 | 16.24 |
| Firefox 3.6.3 | 23.61 | 18.71 |
| Chrome 5.0.342 | 26.07* | 22.85 |
| Opera 10.10 | 22.72 | 15.22 |
| Average | 24.91 | 18.25 |
| Linux Mint | | |
| Firefox 3.5.9 | 25.89 | 11.67 |

*Safari continues to show problems on PC. Safari reports 30 fps but it looks like it’s running at 10 fps. I’ve included the results but they’re really wrong.

*Internet Explorer renders the view, but is unable to display the custom fonts.

*Chrome on both platforms is unable to render the Jedi custom font.

I didn’t have time to investigate whether the super slow PC performance in Flash is my fault or Adobe’s, but I expect that will be uncovered soon enough. As for the general differences between HTML and Flash in the text test, this is exactly what I was expecting. HTML was built for text rendering and this is further proof that browsers do this best.

GUIMark Mobile

The Vector and Bitmap tests have been ported into miniature forms to test on mobile devices with a minimum resolution of 320×480. This is the area I imagine will see a lot of updates over the next 6 months. I’ve ordered the results by the release date of each phone tested.

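| Device | HTML5 Vector (fps) | HTML5 Bitmap (fps) | Flash Vector (fps) | Flash Bitmap (fps) |
|---|---|---|---|---|
| Palm Pre (c/o Kevin O’Shea) | 21.46 | 32.89 | | |
| iPhone 3GS | 10.79 | 12.86 | | |
| Motorola Droid | 8.95 | 12.59 | | |
| Nokia N900 (Flash 9) | 9.51 | 9.65 | 16.69 | 19.78 |
| Nexus One | 15.86 | 18.83 | | |
| HTC HD2 (c/o Matt Emory) | 10.43 | 17.59 | 29.91 | 37.62 |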

Two phones running the Flash player isn’t conclusive evidence about Flash’s performance in general in the mobile space, but it does cast immediate doubts on claims that Flash is slow on ARM based smart phones. Meanwhile, if you want the best performance in HTML5 based web content, Palm Pre and Nexus One are sitting at the top of the pile. If you have results you’d like to see added to the chart, you can email results to mech {at} craftymind dot com.

What about video comparison?

I had really hoped to add a video test to this benchmark but I quickly found out there’s no reliable way to record rendering performance for video objects. As far as I can tell, HTML5 video doesn’t provide an API to catch frame dropping events, or a way to determine the playback fps. Blindly running a Timer object on the main thread didn’t seem to help either. At that point I didn’t even bother seeing what hurdles Flash posed for testing playback.

Parsing the Results

I imagine half of the people reading this page will have one of two thoughts at this point: “Who cares if HTML5 is slower, I just want Flash to die” and “HTML5 is still brand new, it’s going to get a lot faster”. While I’m not interested in addressing the first point, developers should have context around the second point. There is a fundamental difference between the rendering models used in HTML5 Canvas and Flash which heavily influences the performance divide: Canvas uses an immediate mode renderer, while Flash uses a retained mode renderer.

When you write a line of javascript that draws a vector or bitmap to a Canvas the browser will immediately render that change before moving on to the next line of javascript. Since the browser has to block that line of javascript while rendering, it means the environment is most efficient when running on a single thread. Text rendering on the other hand occurs at the end of the event loop, behaving more like Flash.

In contrast, Flash commits all renderable changes to an internal store, and after the main event loop finishes processing user code it hands out rendering tasks to all available cores. As a result, Flash scales with both the speed of your processor cores and the number of cores available. Here’s an illustration to better understand.

[Figure: Javascript vs Flash Rendering Model]
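The contrast between the two models can also be put in code. The sketch below is purely conceptual: it does not mirror the internals of any browser or of the Flash Player, and `rasterize` is a hypothetical stand-in for the expensive paint step.

```javascript
// Conceptual sketch only; `rasterize` stands in for the costly paint step.

// Immediate mode: every draw call pays for rasterization before returning.
function createImmediateSurface(rasterize) {
  return {
    draw(shape) {
      rasterize(shape); // the script blocks here until the paint is done
    },
    endFrame() {
      // nothing left to do
    },
  };
}

// Retained mode: draw calls only record changes. Painting happens once,
// after user code has finished, where the work could be divided up.
function createRetainedSurface(rasterize) {
  let displayList = [];
  return {
    draw(shape) {
      displayList.push(shape); // cheap bookkeeping, no painting yet
    },
    endFrame() {
      const pending = displayList;
      displayList = [];
      pending.forEach(rasterize); // a real engine could fan this out per core
    },
  };
}
```

With the immediate surface, paint work is interleaved with script execution; with the retained surface, all of it is deferred to `endFrame()`, which is the point at which an engine is free to hand tasks to other cores.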

Theoretically you might achieve twice the performance in Flash on a dual core system, but in practice there is overhead that you need to take into account like z ordering, bounds checking and re-compositing, and dividing tasks between cores is never perfect. All of this might not seem like a big deal to HTML5 developers, but the truth is the next 10 years are going to be dominated by increases to CPU core count, not single threaded execution speed. You can already see the results of this on a quad core i7 2.67 GHz processor.

Windows 7, Firefox 3.6.3
HTML5 Bitmap Game – 6.07 fps
Flash 10 Bitmap Game – 30.1 fps

HTML5 Canvas performance saw virtually no increase jumping to 4 cores, while Flash performance nearly doubled. Without a major shift in execution processing, Javascript based animations and interactions are going to remain stagnant over the coming years. Unfortunately I don’t see that change coming. All the talk of multithreading coming from browser vendors right now concerns the split between the browser interface and the HTML view, not the HTML rendering model.
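For a quick sanity check on those gains, the arithmetic can be worked through with the Firefox 3.6.3 Bitmap figures reported in this post. Keep in mind that the two machines also differ in clock speed and architecture, so the ratios below are only a rough guide to the effect of the extra cores.

```javascript
// Firefox 3.6.3 on Windows 7, Bitmap test, figures taken from this post.
const dualCore = { html5: 5.78, flash: 17.7 }; // Core 2 Duo 2.53 GHz
const quadCore = { html5: 6.07, flash: 30.1 }; // Core i7 2.67 GHz

// Ratio of the new frame rate to the old one.
function speedup(before, after) {
  return after / before;
}

const html5Gain = speedup(dualCore.html5, quadCore.html5);
const flashGain = speedup(dualCore.flash, quadCore.flash);

console.log('HTML5: ' + html5Gain.toFixed(2) + 'x'); // HTML5: 1.05x
console.log('Flash: ' + flashGain.toFixed(2) + 'x'); // Flash: 1.70x
```

So Canvas gains about 5% while Flash gains about 70% on this test.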

HTML5 video is largely exempt from this problem however. While the video api exposes hooks to the main thread for playback control, all rendering and sound is processed under the hood on secondary threads. As a result, media performance increases with GPU and CPU cores.

TL;DR

There is no doubt that HTML5 adoption will grow significantly in the next 2 years, and that more and more content will be targeted to SVG and Canvas implementations. But developers need to be cautious with adopting one technology or another wholesale. HTML5 may not be fast, but it is proficient at a good number of tasks. If you need static or limited interactive content on your website, HTML5 will soon be your best option. If you need complex interactive content, you’re probably better off targeting Flash. As for me, you’ll find me abusing the hell out of both technologies and posting the results to this blog.

In the meantime, if you want the best HTML5 performance on Windows, you should be using Opera right now. On the Mac side, it’s a tossup depending on the type of content you’re interacting with. If you’re looking for good Flash performance on Windows any browser will do, whereas on the Mac side Chrome is clearly outperforming everyone else.

The sources for each test should be linked within the test itself if you want to peruse the code. I tried to make sure everything could be contained to one file whenever possible and not rely on external dependencies. You can download all the fonts and tower defense assets here. If you find any errors with the results, or feel like taking a stab at testing another technology, feel free to email me at mech {at} craftymind dot com.

Flame on!

Updates

10/05/07 – As I should have assumed, quite a few people have started sending in updated tests with their own levels of optimizations. I want to be very clear on certain improvements: the purpose of these benchmarks is to stress the graphics APIs available to developers, not cancel them out. An optimization that affects both platforms equally (like caching the grid lines behind the charting test) doesn’t further the goal of exposing how efficient the two platforms are internally. If you have a unique optimization that can only be applied to one platform and not the other, please let me know and I’ll try to incorporate the change and retest.

10/05/09 – Changed some language on the rendering model for HTML5. Canvas paints immediately; standard text paints at the end of the event loop.

Blowing up HTML5 video and mapping it into 3D space

I’ve been doing a bit of experimenting with the Canvas and Video tags in HTML5 lately, and found some cool features hiding in plain sight. One of those is the Canvas.drawImage() api call. Here is the description on the draft site.

3.10 Images

To draw images onto the canvas, the drawImage method can be used.
This method can be invoked with three different sets of arguments:
  • drawImage(image, dx, dy)
  • drawImage(image, dx, dy, dw, dh)
  • drawImage(image, sx, sy, sw, sh, dx, dy, dw, dh)
Each of those three can take either an HTMLImageElement, an HTMLCanvasElement, or an HTMLVideoElement for the image argument.

The API lets you take the contents of specific HTML elements and draw them into a canvas, and the 3rd element type in that list is just begging to be abused. Copying video into a canvas element means opening up the ability to manipulate or process video frames at runtime. Two concepts instantly came to mind that seemed like fun to try and figure out; here they are below.
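Before the demos, here is a quick sketch of the three call forms from the spec excerpt, written against any 2D context and any drawable source (image, canvas, or video element). The function name and the specific coordinates are arbitrary choices for illustration.

```javascript
// `ctx` is a Canvas 2D context; `source` is an image, canvas, or video element.
function drawThreeWays(ctx, source) {
  // 1. drawImage(image, dx, dy): copy at natural size to (0, 0).
  ctx.drawImage(source, 0, 0);

  // 2. drawImage(image, dx, dy, dw, dh): copy the whole source,
  //    scaled into a 160x90 box at (10, 10).
  ctx.drawImage(source, 10, 10, 160, 90);

  // 3. drawImage(image, sx, sy, sw, sh, dx, dy, dw, dh): copy only the
  //    100x100 region at (50, 50) of the source into a 200x200 box
  //    at (0, 100), scaling it 2x.
  ctx.drawImage(source, 50, 50, 100, 100, 0, 100, 200, 200);
}
```

The nine-argument form is the one that makes the effects below possible, since it can lift an arbitrary rectangle out of a playing video.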

Blowing apart fragments of video

Click around the video frame to blow up that part of the video, the exploded pieces will continue to play the video inside them. After a while they retract back to their original place. One feature I didn’t have time to figure out was adding depth to the explosion, so pieces that are closest to ground zero fly up into the air as they sail outward. With full shadow effects this could look really cool.
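The geometry behind an effect like this can be sketched in a few lines. This is not the demo's source: the fragment fields (`sx`, `sy`, `w`, `h`, `dx`, `dy`), the linear falloff, and the `radius` and `force` parameters are assumptions chosen to keep the example small, and the retract-over-time behavior is left out.

```javascript
// Split a frame into a grid of tiles. Each tile remembers which part of
// the video it shows (sx, sy, w, h) and where it is drawn (dx, dy).
function makeFragments(frameW, frameH, cols, rows) {
  const w = frameW / cols;
  const h = frameH / rows;
  const fragments = [];
  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) {
      fragments.push({ sx: c * w, sy: r * h, w, h, dx: c * w, dy: r * h });
    }
  }
  return fragments;
}

// Push tiles within `radius` of the click away from it; tiles nearer the
// click travel further. Source rects are untouched, so each tile keeps
// showing its own slice of the playing video.
function explode(fragments, clickX, clickY, radius, force) {
  for (const f of fragments) {
    const cx = f.dx + f.w / 2;
    const cy = f.dy + f.h / 2;
    const dist = Math.hypot(cx - clickX, cy - clickY);
    if (dist === 0 || dist >= radius) continue;
    const push = force * (1 - dist / radius);
    f.dx += ((cx - clickX) / dist) * push;
    f.dy += ((cy - clickY) / dist) * push;
  }
  return fragments;
}
```

Each frame, every tile would then be drawn with the nine-argument `drawImage` form, using its source rect and its displaced destination.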

3D Video

This demo in particular runs really well inside WebKit based browsers, but not so much in Firefox. Firefox doesn’t appear to have any hardware acceleration for Ogg decoding so I had to drop the video size in half in order to run at acceptable framerates. Even so, Firefox chokes pretty badly on my MacBook Pro.

*Update* – I’ve changed the Ogg video to be 640 x 360, prepare to see Firefox weep

Lessons learned

There are a couple of hints I picked up along the way that are good to know if you want to play around with drawing video. First, you need a bit of hackish code to get this to work efficiently, and it flows like this.

[Video playing] -> [Draw Video onto Canvas 1] -> [Draw fragments of Canvas 1 onto Canvas 2]

Don’t ask me why, but copying pixel data out of a video tag is expensive, so expensive that drawing it into a temporary canvas, and then drawing pieces of that temp canvas onto a final canvas, is faster than just referencing the video tag repeatedly within the same loop. That’s why you’ll see 2 Canvases in the source code for the demos. I’m sure there’s a technical reason for this duplication process, but it’s a lazy reason.
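The two-stage flow can be sketched as follows. This is an illustration of the idea rather than the demo code: the function name and the fragment fields (`sx`, `sy`, `w`, `h`, `dx`, `dy`) are hypothetical, and the output canvas is assumed to be the same size as the buffer canvas.

```javascript
// Stage 1: one expensive read from the video element per frame.
// Stage 2: many cheap canvas-to-canvas copies.
function renderVideoFragments(video, bufferCanvas, bufferCtx, outputCtx, fragments) {
  // Copy the current video frame into the temporary canvas once.
  bufferCtx.drawImage(video, 0, 0, bufferCanvas.width, bufferCanvas.height);

  // Assumes the output canvas matches the buffer canvas in size.
  outputCtx.clearRect(0, 0, bufferCanvas.width, bufferCanvas.height);

  for (const f of fragments) {
    // The source is the buffer canvas, never the video element itself.
    outputCtx.drawImage(bufferCanvas, f.sx, f.sy, f.w, f.h, f.dx, f.dy, f.w, f.h);
  }
}
```

However many fragments there are, the video element is only touched once per frame.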

Secondly, don’t try copying individual pixels around. You can still see the remnants of my first code attempt inside the explosion demo with getPixel() and setPixel(). This turned out to be horribly slow and completely unnecessary. Canvas.drawImage() + matrix transforms on the canvas space is far more efficient than handcrafted pixel pushing. On the other hand, pixel manipulation allows you to do things like runtime chroma keying. Get ready for a new wave of “clippy” style videos with full transparency popping over websites to help you out.
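As a rough idea of what runtime chroma keying involves, here is a minimal sketch built on the standard `getImageData()` and `putImageData()` context methods. The "green clearly dominates" rule and the `threshold` parameter are simplifying assumptions for illustration; real keyers are considerably more careful about edges and spill.

```javascript
// Make green-screen pixels transparent in an RGBA byte array, the layout
// used by ImageData.data (4 bytes per pixel: red, green, blue, alpha).
// `threshold` is an illustrative tuning knob, not a standard value.
function chromaKeyGreen(data, threshold) {
  for (let i = 0; i < data.length; i += 4) {
    const r = data[i];
    const g = data[i + 1];
    const b = data[i + 2];
    // Treat a pixel as "green screen" when green clearly dominates.
    if (g - Math.max(r, b) > threshold) {
      data[i + 3] = 0; // fully transparent
    }
  }
  return data;
}

// Apply the key to a canvas that already holds the current video frame.
function keyCanvas(ctx, width, height, threshold) {
  const frame = ctx.getImageData(0, 0, width, height);
  chromaKeyGreen(frame.data, threshold);
  ctx.putImageData(frame, 0, 0);
}
```

This touches every pixel on every frame, so it carries exactly the cost the paragraph above warns about; it only makes sense when the effect cannot be expressed with `drawImage()` and transforms.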

Lastly, I’m learning very quickly that not all browsers are created equal when it comes to performance; it’s a crapshoot when it comes to heavy video+image manipulation. Safari and Chrome work well with H.264, Firefox slogs along with Ogg Theora, and Opera is somewhere in the middle.

GUIMark

GUIMark 2 is out now! View the new tests here.

GUIMark is a benchmark test suite designed to compare the rendering systems of several popular UI runtimes. In general it should be able to give designers and developers a good indication of which technologies can draw complex interfaces at a smooth rate of motion. The test mostly addresses RIA technologies like Flash, Silverlight, HTML or Java, but was designed to be easily ported to any 2D GUI environment. The basis for this project was inspired by the Bubblemark animation test, but was designed to heavily saturate the rendering pipeline and determine what kind of visual complexity is achievable in the sub-60 fps realm.

The Test

The reference design was originally created in Flex and then ported to the technologies listed below. All results listed in the matrix as well as the detailed results page were run on the same MacBook Pro, running Leopard for OS X and Win XP under a Boot Camp partition. Each test case was run 3 times in a new browser instance and the highest framerate observed was recorded. For the HTML test, the fastest performing browser on each OS was used in the comparison matrix (Internet Explorer 7 and Safari 3 won for their respective OSes). All subsequent plugin based tests for the OS were tested in those browsers.
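The recording rule described above (best of 3 runs, then the fastest browser per OS for the HTML test) is simple enough to write down. The helper names and the shape of the `results` object are assumptions for this sketch, not part of the original test harness.

```javascript
// Each test case is run several times; the highest framerate counts.
function bestOfRuns(fpsRuns) {
  return Math.max(...fpsRuns);
}

// For the HTML test, pick the fastest browser on a given OS.
// `results` maps a browser name to its array of per-run fps readings.
function fastestBrowser(results) {
  let best = null;
  for (const [browser, runs] of Object.entries(results)) {
    const fps = bestOfRuns(runs);
    if (best === null || fps > best.fps) {
      best = { browser, fps };
    }
  }
  return best;
}
```

Taking the maximum rather than the mean favors each runtime's best case, which is worth remembering when comparing these numbers against averages reported elsewhere.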

My hope is to port the benchmark to all the other untested technologies listed below and I fully welcome any optimizations or ports that readers want to contribute. I’m really curious to see if any community experts or platform engineers are able to speed up their technology of choice. Although the code is fairly simple at a glance, there are no easy optimization paths to be found (and no cheating by turning off anti-aliasing).

Results

Results for Win XP running on Macbook Pro Intel Core 2 Duo 2.33 GHz

| Tech Base | Version | Average FPS | Source |
|---|---|---|---|
| Browser | HTML | 28.36 / IE 7 | Download |
| | SVG | - | - |
| | Canvas | - | - |
| Flash | Flex 3 | 46.08 | Download |
| | Flash 9 | - | - |
| Java | Java 5 Swing | 19.37 | Download |
| | Java 6 Swing | - | - |
| | Processing | - | - |
| | JavaFX | - | - |
| Silverlight | Silverlight 1 / Javascript | 9.12 | Download |
| | Silverlight 2 Beta / C# | 7.95 | Download |

Results for OS X 10.5 running on Macbook Pro Intel Core 2 Duo 2.33 GHz

| Tech Base | Version | Average FPS | Source |
|---|---|---|---|
| Browser | HTML | 18.20 / Safari 3 | Download |
| | SVG | - | - |
| | Canvas | - | - |
| Flash | Flex 3 | 8.01 | Download |
| | Flash 9 | - | - |
| Java | Java 5 Swing | 7.19 | Download |
| | Java 6 Swing | - | - |
| | Processing | - | - |
| | JavaFX | - | - |
| Silverlight | Silverlight 1 / Javascript | 5.25 | Download |
| | Silverlight 2 Beta / C# | 5.38 | Download |

Findings

I’ve been surprised with the results so far between WinXP and OS X. On the same machine it’s very clear which vendors take more advantage of the underlying hardware. The results for the different plugin technologies aren’t too surprising since it’s regularly admitted that most companies spend their optimization time on Windows due to its larger install base. This argument doesn’t hold any water though when comparing HTML rendering on Safari/Mac against IE/Windows, where there’s roughly a 1.6 : 1 advantage to the IE team. I can’t help but wonder if the core APIs on the Mac platform are creating any unnecessary roadblocks. I’m also extremely surprised at the rendering speed that Flash is able to pull off on Windows. I developed this benchmark under OS X and after compiling the results I considered making the test case more intensive since Flash is running so fast, but for now maybe the really poor Mac performance will give Adobe something to work on.

You can read more about rendering engine theory, the structure of the test case itself, and detailed analysis of the results on the sub pages within the site.

Updates

John Dowdell from Adobe brought up a valid point that plugin vendors are restricted by the browser environments they run in. This is true to an extent, but the limitations enforced on plugins don’t come into play with the GUIMark test. Browsers typically restrict the number of event loops available to a plugin, which caps the framerate to around 40 – 50 fps. GUIMark doesn’t come anywhere close to hitting that limit on Mac. There are also no restrictions on the amount of CPU available to a plugin running within the browser, which is why all of them peg the CPU to 100%. To illustrate the point, I created an AIR implementation of GUIMark and ran it on my Powerbook and here are the results I got. The Flash Player’s rendering engine performs no differently outside the browser than it does inside the browser. Until plugins start bumping up against the event loop or Beam Sync restrictions, Adobe, Sun, and Microsoft don’t get a free pass for slow performance on Mac.