GPU renderMode for AIR mobile profile and caching clarification

Posted on October 14, 2010 | 3 comments

The mobile profile for AIR, which is most prominently demonstrated through the AIR for Android runtime, provides the renderMode options CPU, GPU, and AUTO. Currently AUTO is the same as CPU mode. GPU mode puts your application into a state where all the rendering steps, both rasterization and scene composition, are handled by the GPU. The rasterization step in the rendering phase takes all the display list objects, whether vectors or bitmaps, and creates textures to be rendered. The textures are then copied onto the buffer to make up the scene.

Caching is a separate thing and has nothing to do with GPU renderMode directly. Indirectly, though, caching allows for faster rasterization of certain types of objects, e.g. complex vectors.

Recent posts by Christian Cantrell and Lee Brimelow show off GPU renderMode with great performance gains over CPU renderMode in AIR for Android. These posts are a good generalization, but the topic is a bit more complex than it seems.

I have added some code to github here (a fork off Cantrell’s code). This code breaks up caching and renderMode into separate apps. Each app sets up 10 rotating vector squares, runs for 500 frames, and then outputs the FPS.
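The measurement itself comes down to counting frames and dividing by elapsed time. As a minimal sketch of that arithmetic in Python (this is not the ActionScript in the repository; the function name and the sample timestamps are hypothetical), an average FPS can be computed from per-frame timestamps like this:

```python
def average_fps(timestamps_ms):
    """Average frames per second from per-frame timestamps in milliseconds.

    Hypothetical helper for illustration only. N + 1 timestamps
    describe N rendered frames (the intervals between timestamps).
    """
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two frame timestamps")
    elapsed_ms = timestamps_ms[-1] - timestamps_ms[0]
    if elapsed_ms <= 0:
        raise ValueError("timestamps must be increasing")
    frames = len(timestamps_ms) - 1
    return frames * 1000.0 / elapsed_ms


# Made-up data: 500 frames rendered at a steady 20 ms per frame.
steady = [i * 20 for i in range(501)]
print(average_fps(steady))  # 50.0
```

Averaging over the whole run, rather than reading a single frame's duration, smooths out the occasional slow frame.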

My results on a Droid X are below (cABM = cacheAsBitmapMatrix). They will differ on other devices, since GPUs and other hardware vary from device to device:

Optimal FPS = 50fps
CPUTest (renderMode=”cpu” & cABM) – 15fps
CPUTestNoCABM (renderMode=”cpu” & no cABM) – 38fps

GPUTest (renderMode=”gpu” & cABM) – 32fps
GPUTestNoCABM (renderMode=”gpu” & no cABM) – 45fps

What does this tell us?

For simple rotating vector squares, GPU mode with no caching is faster than CPU mode with or without caching, CPU mode with no caching is still fast, and caching slows both modes down. It doesn’t tell us much more than that, as each set of content could be affected differently depending on how complex the vectors are, whether bitmaps are used, how objects are nested, etc.
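To put rough numbers on that, here is a small Python sketch that works out the relative cost of caching from the Droid X figures above. It is not part of the benchmark code, and the dictionary and helper names are made up for illustration:

```python
# FPS figures copied from the Droid X results above.
# Key: (renderMode, cacheAsBitmapMatrix enabled?)
results_fps = {
    ("cpu", True): 15,
    ("cpu", False): 38,
    ("gpu", True): 32,
    ("gpu", False): 45,
}


def caching_cost(mode):
    """Fraction of the uncached frame rate lost when cABM is turned on."""
    without = results_fps[(mode, False)]
    with_cabm = results_fps[(mode, True)]
    return (without - with_cabm) / without


for mode in ("cpu", "gpu"):
    print(f"{mode}: caching costs {caching_cost(mode):.0%} of the uncached frame rate")

# How much faster GPU is than CPU when neither uses caching.
gpu_speedup = results_fps[("gpu", False)] / results_fps[("cpu", False)]
print(f"gpu vs cpu, no caching: {gpu_speedup:.2f}x")
```

For this particular test, caching costs roughly 61% of the frame rate in CPU mode and 29% in GPU mode, while uncached GPU is about 1.18x uncached CPU. These ratios describe this one device and this one piece of content only.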

On top of all this, GPU renderMode on iOS devices, using the current state of PFI (Packager for iPhone), behaves quite differently. I’ll try to post more about this topic as the days go along, but currently the best way to find out more is to come see David Knight (AIR Team Engineer working on GPU and cABM) and me present at MAX on this topic.

  • derek knox

    I asked on Lee’s site if cacheAsBitmap and cacheAsBitmapMatrix had to be utilized in order for GPU acceleration to occur. He said they do, but this post seems (to me at least) to suggest that there is more going on. Can you confirm if great gains can be attained via GPU acceleration without utilizing cacheAsBitmap and cacheAsBitmapMatrix?

    • Renaun Erickson

      There is some confusion about the specifics of how these really work. We’ll be talking about it in more detail at MAX. Technically caching is not tied to GPU, but most cases rely on caching to get the best performance. The post above shows a use case where this is not necessarily true. The main point is that you need to test your content.

  • Shawn

    As developers, what we really need is Auto to actually be Auto, i.e. decide what makes a good candidate for GPU acceleration and apply it there, and leave the rest to CPU.

    To be totally honest, currently GPU rendering is just unusable for any complex application.

    It’s one thing to spike out a few rotating sprites and benchmark them. But as soon as you build anything complex, with multiple moving parts, text, and different views, and you apply GPU, your performance chokes. I’m basing this on experience building 3 fairly complex apps so far.

    Currently, the only real-world application I can see for GPU rendering mode is in simple games, where it really is just sprites moving around.

    I would love to see a demo, of an even marginally complex application, using GPU mode, and running at a faster clip than CPU, to prove me wrong!