GUIMark

 

GUIMark 2 is out now! View the new tests here.

 

Home | Detailed Analysis | Benchmark and Rendering Engine theory

GUIMark is a benchmark test suite designed to compare the rendering systems of several popular UI runtimes. In general it should be able to give designers and developers a good indication of which technologies can draw complex interfaces at a smooth rate of motion. The test mostly addresses RIA technologies like Flash, Silverlight, HTML or Java, but was designed to be easily ported to any 2D GUI environment. The basis for this project was inspired by the Bubblemark animation test, but was designed to heavily saturate the rendering pipeline and determine what kind of visual complexity is achievable in the sub-60 fps realm.

The Test

GUIMark referenceThe reference design was originally created in Flex and then ported to the technologies listed below. All results listed in the matrix as well as the detailed results page were run on the same Macbook Pro running Leopard for OS X, and running Win XP under a Boot Camp partition. Each test case was run 3 times in a new browser instance and the highest framerate observed was recorded. For the HTML test, the fastest performing browser on a each OS was used in the comparison matrix (Internet Explorer 7 and Safari 3 won for their respective OSes). All subsequent plugin based tests for the OS were tested in those browsers.

My hope is to port the benchmark to all the other untested technologies listed below and I fully welcome any optimizations or ports that readers want to contribute. I’m really curious to see if any community experts or platform engineers are able to speed up their technology of choice. Although the code is fairly simple at a glance, there are no easy optimization paths to be found (and no cheating by turning off anti-aliasing).

Results

Results for Win XP running on Macbook Pro Intel Core 2 Duo 2.33 GHz

Tech Base Version Average FPS Source
Browser HTML 28.36 / IE 7 Download
SVG - -
Canvas - -
Flash Flex 3 46.08 Download
Flash 9 - -
Java Java 5 Swing 19.37 Download
Java 6 Swing - -
Processing - -
JavaFX - -
Silverlight Silverlight 1 / Javascript 9.12 Download
Silverlight 2 Beta / C# 7.95 Download

Results for OS X 10.5 running on Macbook Pro Intel Core 2 Duo 2.33 GHz

Tech Base Version Average FPS Source
Browser HTML 18.20 / Safari 3 Download
SVG - -
Canvas - -
Flash Flex 3 8.01 Download
Flash 9 - -
Java Java 5 Swing 7.19 Download
Java 6 Swing - -
Processing - -
JavaFX - -
Silverlight Silverlight 1 / Javascript 5.25 Download
Silverlight 2 Beta / C# 5.38 Download

Findings

I’ve been surprised with the results so far between WinXP and OS X. On the same machine its very clear which vendors take more advantage of the underlying hardware. The results for the different plugin technologies aren’t too surprising since it’s regularly admitted that most companies spend their optimization time on Windows due to its larger install base. This argument doesn’t hold any water though when comparing html rendering on Safari/Mac against IE /Windows where there’s roughly a 1.6 : 1 advantage to the IE team. I can’t help but wonder if the core apis on the Mac platform are creating any unnecessary roadblocks. I’m also extremely surprised at the rendering speed that Flash is able to pull off on Windows. I developed this benchmark under OS X and after compiling the results I’m considered making the testcase more intensive since Flash is running so fast, but for now maybe the really poor Mac performance will give Adobe something to work on.

You can read more about rendering engine theory, the structure of the test case itself, and detailed analysis of the results on the sub pages within the site.

Updates

John Dowdell from Adobe brought up a valid point that plugin vendors are restricted by the browser environments they run in. This is true to an extent, but the limitations enforced on plugins don’t come into play with the GUIMark test. Browsers typically restrict the number of event loops available to a plugin which caps the framerate to around 40 – 50 fps. GUIMark doesn’t come anywhere close to hitting that limit on Mac. There are also no restrictions to the amount of cpu available to a plugin running within the browser which is why all of them peg the cpu to 100%. To illustrate the point, I created an AIR implementation of GUIMark and ran it on my Powerbook and here are the results I got. The Flash players rendering engine performs no differently outside the browser then it does inside the browser. Until plugins start bumping up against the event loop or Beam Sync restrictions, Adobe, Sun, and Microsoft don’t get a free pass for slow performance on Mac.

From AS3 to Objective-C: Flex vs iPhone development

Recently I’ve been given the opportunity to work full time on commercial iPhone development at EffectiveUI. The most intriguing thing about the platform for me is having access to non traditional user input mechanisms. When I was playing around with Wii remote integration on the desktop, the potential was exciting, but the ubiquity was limiting. In the same way that pc game companies develop for the keyboard and mouse first and then provide hooks for joysticks after the fact, I knew that serious Wii remote integration in a desktop app was limited. Knowing that I can write software for the iPhone that always has access to multitouch and accelerometer data from the outset really allows for unique gestures as a first class citizen within an app.

After working with some code and spending time with the SDK itself, I couldn’t help but naturally compare UI development on the iPhone with GUI frameworks like Flex or Swing, heres the things that stand out so far.

  1. I’m really spoiled by higher level languages. A good high level language like ECMAScript, Ruby, or Java rides the fine line of “Making things as simple as possible, but no simpler”. I’ve never felt constrained by the language features in these technologies, only by the apis exposed. Stepping *back* into Objective-C certainly provides more power and flexibility in the language, but there’s a loss in productivity for me that I just can’t shake. Some of this loss comes from Objective-C’s design itself, and some of it just comes from XCodes introspection ability. For instance, I’m not sure if I’ll ever get to the point where I can read these lines of Objective-C as effeciently as their ActionScript counterparts would probably look.
    NSString *aString = [[[NSBundle mainBundle] infoDictionary] objectForKey:@"CFBundleName"];
    UIView *contentView = [[UIView alloc] initWithFrame:[[UIScreen mainScreen] applicationFrame]];
  2. Closed source UI frameworks suck. Most of what I learned about custom UI development in Flex I learned by inspecting the source code for the bundled controls. Ripping open Containers to see how layout rules are determined, or Lists to see how delegates are passed around, or the Image control to see how different display types are handled provides invaluable gems about implementing Flash apis. It has also helped me optimize the interactions of an app knowing the intentions of the developer who created the UI controls. With the iPhone SDK, you’re given documentation for the visual components, but no source to help determine how they work.
  3. Core graphics and animation is really strong. Between Quartz and the OpenGL layer there’s alot of potential for getting easy access to some of the more complex visual hacks. Although I think 3D user interfaces are prone to usability issues, the iPhone is a much better device to explore them on then a standard keyboard and mouse interface.
  4. Data binding, event listeners, and mxml. The Flash Player and Flex model provide features on top of the ActionScript language that arguably optimize UI needs and keep development more declarative. Cocoa development could really benefit from a ‘gui compiler’ that takes Objective-C to a higher level and bakes in features that support common ui design patterns.
  5. Garbage Collection. Objective-C 2.0 provides a unique system that lets you create objects that will be automatically garbage collected, or you can continue to manually manage object allocation/deallocation yourself. The concept sounds cool, but I can imagine allowing both systems to be mixed within the same project is just begging for trouble.

So far I think that Objective-C has alot of power and some really awesome features that outclass GUI features in Flash, but compared to Flex development as a whole, I’d have to say that XCode and the included visual frameworks are not as sophisticated.

Updated ‘Elastic Racetrack’ for Flash 9 and AVM2

In 2005 Ted Patrick posted a great article on the frame execution model inside the Flash Player that he dubbed the ‘elastic racetrack‘. It’s served as a great reference for me over the years to help understand how code execution and rendering were balanced within the processing of a frame. Since the introduction of Flash Player 9 and the new AVM2, I’ve noticed a few changes to the elastic racetrack model and thought I’d share them. This information is based on research into Flash player internals as well as observations I’ve made playing around with the event and rendering model, but the full model hasn’t been confirmed by Adobe engineers.

The basic premise of the original elastic racetrack is still the same. Given a specific frame rate to operate on, the Flash player will devote the first segment of the frame to execute code, and the second segment to render display objects. Either segment can grow its part of the racetrack to accommodate more processing and effectively extend the duration of the frame.

Flash Player Elastic Racetrack

What changes from the previous model is how those segments look under a microscope and how they come together to form a ‘frame’.

AVM2 is controlled by what I’m going to call the Marshal. The Marshal is responsible for carving out time slices for the Flash Player to operate on. Its important to clarify up front that these time slices are not the same thing as the framerate compiled into a swf, but we’ll see below how the player ultimately synthesizes a framerate from these slices. Running a Flex compiled swf within Firefox under Mac OS X, the Marshal appears to be carving out 19-20 millisecond slices, but this can be different between platforms and browsers based on what I’ve observed as well as Adobe employees have hinted at. This can also change depending on how the swf was compiled, see some of the comments below. For the sake of the article lets assume we’re only talking about a 20 millisecond slice to make the math easy. This means the Marshal will attempt to generate and execute no more then 50 slices each second, and it may be less depending on the elasticity of code execution or rendering. Inside each slice, 5 possible steps are processed in the following order.

  1. Player events are dispatched – This includes events dispatched by the Timer, Mouse, ENTER_FRAMEs, URLLoader, etc…
  2. User code is executed – Any code listening to events dispatched by step 1 are executed at this stage.
  3. RENDER event is dispatched – This special event is dispatched when the user calls stage.invalidate() during normal user code operation.
  4. Final user code is executed – User code listening specifically for step 3 is executed at this point.
  5. Player renders changes to the display list.

AVM2 Marshalled Slice

The Marshal executes this 20 millisecond slice over and over and decides on the fly which actions to run. The exact actions processed within a slice will ultimately derive the 2 main racetrack segments (code execution and rendering) that constitute a ‘frame’. User actions and Invalidation actions fill up the code segment track, while Render actions fill up the render segment track. Its important to note that actions will only occur at times predetermined by the Marshal, so a if you have a short running User action, the Marshal will still wait a few milliseconds before moving on to the Invalidate and Render actions.

The best way to illustrate which actions are run and how the elastic racetrack is created, is to look at how those slices are processed on a swf running at 5 fps, 25, fps, and 50 fps.

Flash Frame Marshaling Diagram

As you can see, the elastic racetrack performs different actions per frame and requires a different visual illustration depending on the framerate that the player is trying to synthesize. So for a swf running at 5 fps, each frame processed 10 User actions, 1 Invalidation action, and 1 Render action. At 25 fps, each frame processed 2 User actions, 1 Invalidation action, and 1 Render action. At 50 fps, each frame processed 1 User action, 1 Invalidation action, and 1 Render action. Whats important to note in the above chart is that some events are only available in certain slices. For instance, the Event.ENTER_FRAME event will only ever be dispatched in a slice that occurs at the start of a frame.

So what does this all mean? Theres a couple quick ideas to take away from this.

  1. Long running code execution or render segments can extend a given slice beyond 20 milliseconds. Elasticity will be applied to that particular slice and the duration of the frame may or may not be extended as a result. The Marshal may drop the number of slices that constitute a frame in order to keep the active framerate close to the compiled framerate.
  2. A swfs real framerate won’t exceed the Marshals rate defined for the player instance. You can set your compiled framerate at 120fps, but Flash will still only process 50 slices max that generate 50 render calls (more or less depending on the system config).
  3. Code can be executed more often then the compiled framerate. A swf compiled at 1 fps can execute a Timer or Mouse event in every slice, even though it will only render in the last slice. Additionally, if you choose, you can render to the screen sooner then the compiled framerate by calling updateAfterEvent() , but only within a Mouse, Timer, or Keyboard event handler. In this instance though, the Marshal will consider that the end of the frame and start a new frame on the next slice. Lastly, Flash will force an automatic render when mousing over any Sprite that has had its visual properties (x,y,width,height,etc..) changed, naturally this still occurs at the end of the slice and any prerender logic will still run.
  4. Compiling a framerate that isn’t a multiple of the total number of slices per second for your platform will cause irregular rendering as it tries to divide up the slices. If you were to compile in a framerate of 20 on a platform executing 50 slices per second, then the player has to render 2 frames every 5 slices and would follow a 3-2-3-2-3-2 slice-to-render rate.

These 4 facts are moving targets though, since for this article we’re working on a 20 millisecond slice that’s processed 50 times per second. In reality you’ll see time slices as low as 5 milliseconds or as high at 100 milliseconds and some of the math will change as a result.

If you’d like to test this model for yourself, the easiest route is to create a swf running at 1 fps and another at 100 fps both with a Timer object set on a 1 millisecond interval. Inside the Timer event handler change the x property of a display object and hook a bunch of getTimer() traces up to different player events like Mouse, EnterFrame, and Render and watch the carnage unfold in your console. The rest of the information you can’t derive from the results comes from alot of context about the player I’ve learned over the past 2 years and so isn’t as easily visible. If anyone has any information to help add to or correct the above model, please submit it in the comments.

Thanks to several readers who have clarified some of the differences between Flex and Flash as well as how the Flash API is able to change the default behaviors described above.

Hacking away at UI development

-->