LiveFrost: Fast, Synchronous UIView Snapshot Convolving
LiveFrost is a new thing that Nicholas and I have spent half an evening working on. It gives you fast, synchronous UIView snapshot convolution through LFFrostView, a blurring view for UIKit that you can drop into any superview you want blurred. When the app runs, LFFrostView is filled with a convolved image drawn from a snapshot of its superview.
LiveFrost is released under the MIT license and comes with a sample app.
There are many competing implementations available; FXBlurView and ios-realtimeblur are the top two hits.
iOS-blur is another one that warrants special mention. It’s an amazingly brilliant hack for iOS 7+ which simply steals a UIToolbar and has that view do the blurring.
It warrants the mention because it relies on Apple’s kindness and generosity to work. If you try to run it on an iPhone 4, where LiveFrost works smoothly, it refuses to blur. However, if you’re just looking for a blurring view for an iOS 7+ application which does not target the iPhone 4, and you’re keen on neither customization nor compatibility, this library obviously does the blurring with the least amount of code. :)
The general idea of such a blurring view is pretty simple:
1. Draw the contents of its superview into a bitmap context, like a CGContextRef created with CGBitmapContextCreate if you are using Core Graphics.
2. Blur the bitmap algorithmically. (For example, by using GPUImage’s GPUImageGaussianBlurFilter, or the Accelerate framework’s vImageConvolve_ARGB8888.)
3. Send the bitmap back onto the screen in some way.
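To make the blur step concrete, here is a minimal sketch in plain C (which also compiles unchanged inside an Objective-C file). It is a naive box blur over an 8-bit grayscale buffer, standing in for what vImage or GPUImage would do far faster. The names box_blur_gray and clamp_index are made up for this illustration and are not part of LiveFrost.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an index into [0, n - 1] so edge pixels reuse their nearest neighbour. */
static int clamp_index(int i, int n) {
    if (i < 0) return 0;
    if (i >= n) return n - 1;
    return i;
}

/* Naive box blur: each output pixel is the rounded average of the
 * (2 * radius + 1)^2 pixels around it. src and dst must not overlap. */
void box_blur_gray(const uint8_t *src, uint8_t *dst,
                   int width, int height, int radius) {
    const int side = 2 * radius + 1;
    const unsigned count = (unsigned)(side * side);
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            unsigned sum = 0;
            for (int dy = -radius; dy <= radius; dy++) {
                for (int dx = -radius; dx <= radius; dx++) {
                    int sx = clamp_index(x + dx, width);
                    int sy = clamp_index(y + dy, height);
                    sum += src[(size_t)sy * (size_t)width + (size_t)sx];
                }
            }
            /* Add half the divisor first so the division rounds to nearest. */
            dst[(size_t)y * (size_t)width + (size_t)x] =
                (uint8_t)((sum + count / 2) / count);
        }
    }
}
```

Run over a 3 by 3 image with a single white pixel in the middle and a radius of 1, this spreads the pixel evenly across all nine cells, which is the smearing you want from a blur. A flat image comes out unchanged.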
Not so simple in practice. The first thing you’d notice when running samples of these implementations on a real device is probably the sluggishness: low frame rates, or out-of-sync blurring results lagging a few frames behind the main view.
Slow Drawing Explanations
In greater detail, the jankiness (in which you lose frames) is usually caused by doing too much on the main thread, which has roughly 16.7 ms per frame (1 second / 60 frames). If you’ve ever profiled such solutions, they usually spend a lot of time drawing into a large image buffer. Once you’ve solved that by bringing the scale factor down (as the product will get convolved anyway), you’ll find the solution still spending a lot of time creating single-use image buffers.
That is neither right nor necessary. If the blurring view has not been resized (in other words, if its bounds size has not changed), it should not have to waste time throwing memory away and then reclaiming it. Reusing the buffer leaves much more time for actual work.
Once jankiness is fully understood, we can work through the causes of frame lag as well. The developer, faced with the problem of things taking too long, may try rendering asynchronously, off the main thread, to ease the burden. Now they have precisely two problems: frame lag and threads.
First of all, putting rendering in the background means the bitmap has to (conceptually) travel to the background thread, get operated on, then at a later stage be committed back to the layer, sometimes in the form of -[CALayer setContents:] with a CGImageRef. As all drawing is done by the Render Server (backboardd), the actual image may be committed several frames past its originating frame, resulting in visible lag.
Rendering views off the main thread may also not work out as intended. Some collection views driving multiple cells, usually one per represented object, compute layout in one lump. They usually hold an internal layout map that correlates objects to their presentation items, deriving bounds and other attributes from the same source. (This is exactly why infinite scrolling is so difficult to achieve with UITableView. This class expects you to know everything because it wants to use that to compute a complete layout, or at least something it can get layout information from.) Views that prefer to build interim layout states as they go still need to constantly mutate their layer trees, updating subviews to reflect content at the new offset. Even though CALayer is thread-safe, you might catch the view in the middle of mutating its own subviews as you attempt to render it from a background thread.
Practically, this results in missing cells in the final images. If you scroll really fast on an implementation which throttles the number of frames, you’ll see this happen a lot, provided you have a long enough collection view to draw from.
If the drawing or convolving itself still takes too long, the developer will have to manually drop frames. They might decide to have a demigod object which listens to CADisplayLink and implements the callback handler like this (I first learned of this technique from Brad Larson’s answer to “CADisplayLink OpenGL rendering breaks UIScrollView behaviour”):
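- (void) refresh {
    // Non-blocking wait: returns non-zero right away if the previous block is still running.
    if (dispatch_semaphore_wait(_renderSemaphore, DISPATCH_TIME_NOW) == 0) {
        dispatch_async(_renderQueue, ^{
            for (UIView<LFDisplayBridgeTriggering> *view in self.subscribedViews) {
                [view refresh];
            }
            // Let the next display link tick enqueue a new block.
            dispatch_semaphore_signal(_renderSemaphore);
        });
    }
}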
Using this technique, the developer effectively clamps the depth of the dispatch queue to one block at most. When the callback fires, it invokes dispatch_semaphore_wait with an immediate timeout, so the call returns a non-zero value right away, without blocking, if the semaphore has not been signaled because the previously queued block has not yet finished. This technique throttles the number of frames processed by the blurring code without causing main thread slowdowns.
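The same drop-instead-of-queue idea, stripped of GCD, can be sketched in portable C11. This is an illustration rather than LiveFrost’s code: frame_gate and its functions are hypothetical names, and an atomic flag plays the role of the semaphore with an immediate timeout. The two counters are assumed to be touched only by the thread that delivers the display ticks.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_bool busy;   /* true while a render pass is in flight */
    unsigned started;   /* ticks that began a render pass */
    unsigned dropped;   /* ticks skipped because a pass was still running */
} frame_gate;

void frame_gate_init(frame_gate *gate) {
    atomic_init(&gate->busy, false);
    gate->started = 0;
    gate->dropped = 0;
}

/* Call on every display tick. Never blocks: returns true if the caller
 * may start a render pass, false if this frame should be dropped. */
bool frame_gate_try_begin(frame_gate *gate) {
    /* atomic_exchange returns the previous value, so "true" means
     * somebody else is still rendering. */
    if (atomic_exchange(&gate->busy, true)) {
        gate->dropped++;
        return false;
    }
    gate->started++;
    return true;
}

/* Call from the render worker once the pass has finished. */
void frame_gate_end(frame_gate *gate) {
    atomic_store(&gate->busy, false);
}
```

A tick that arrives while a pass is running is simply counted as dropped, so at most one pass is ever pending. That is exactly the behaviour the semaphore buys you, and it has exactly the same limitation: the frames you do render can still arrive late.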
Unfortunately, fancy procrastination can’t save you from being late. You need to draw things fast on the main thread.
It’s possible that you’ve spotted the #1 time sink: disposable single-use contexts. This approach is really clean because no streams ever cross, and really slow because you’re now constantly deallocating and reallocating. Larger images need to be held in larger chunks of memory, and it’s harder to find larger chunks of memory when you’re in a tight spot.
You should therefore reuse the bitmap contexts. Create or re-create them only when you have to: when the working size of the bitmap has changed, for example because the new bounds has a different size. At other times, just draw into the context you have and don’t throw the memory away.
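The reuse rule is easy to sketch without any Core Graphics at all. The plain C below (valid in an Objective-C file too) keeps one RGBA buffer around and only goes back to malloc when the requested size changes. pixel_cache and its functions are hypothetical names for this illustration. In a real blurring view the buffer would back a bitmap context, which is assumed rather than shown here.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical cache holding one RGBA buffer and the size it was made for.
 * Start from a zeroed struct: pixel_cache cache = {0}; */
typedef struct {
    uint8_t *pixels;        /* NULL until the first request */
    size_t width, height;   /* size of the current buffer, in pixels */
    unsigned allocations;   /* how many times malloc was actually called */
} pixel_cache;

/* Return a buffer for width x height RGBA pixels, reallocating only when
 * the requested size differs from the cached one. Returns NULL if the
 * allocation fails or either dimension is zero. */
uint8_t *pixel_cache_get(pixel_cache *cache, size_t width, size_t height) {
    if (cache->pixels != NULL &&
        cache->width == width && cache->height == height) {
        return cache->pixels;   /* same size: reuse, no allocation */
    }
    free(cache->pixels);        /* free(NULL) is a no-op */
    cache->pixels = NULL;
    cache->width = 0;
    cache->height = 0;
    if (width == 0 || height == 0) {
        return NULL;
    }
    cache->pixels = malloc(width * height * 4);   /* 4 bytes per pixel */
    if (cache->pixels != NULL) {
        cache->width = width;
        cache->height = height;
        cache->allocations++;
    }
    return cache->pixels;
}

/* Give the memory back, for example when the view goes away. */
void pixel_cache_release(pixel_cache *cache) {
    free(cache->pixels);
    cache->pixels = NULL;
    cache->width = 0;
    cache->height = 0;
}
```

Asking for the same size on every frame hits the early return, so a view that never resizes allocates exactly once for its whole lifetime.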
Turns out -[CALayer renderInContext:] is really fast when drawing into a context with a 0.5f scale factor (instead of 2.0f on a Retina Display), and it’s also much faster to convolve a smaller image.
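A back-of-the-envelope sketch shows why the scale factor matters so much. The plain C below counts the pixels a snapshot has to cover at a given scale. The function names are made up for this illustration, and the 320 by 568 point view used in the note afterwards (a full 4-inch iPhone screen) is just an assumed example.

```c
#include <stddef.h>

/* Number of pixels in a snapshot of a view measured in points, drawn at
 * the given scale factor. Each side is rounded to the nearest pixel. */
size_t scaled_pixel_count(double width_pt, double height_pt, double scale) {
    size_t width_px = (size_t)(width_pt * scale + 0.5);
    size_t height_px = (size_t)(height_pt * scale + 0.5);
    return width_px * height_px;
}

/* Bytes needed for that snapshot in a 4-bytes-per-pixel format. */
size_t scaled_buffer_bytes(double width_pt, double height_pt, double scale) {
    return scaled_pixel_count(width_pt, height_pt, scale) * 4;
}
```

For the assumed 320 by 568 point view, a scale of 2.0 means 727,040 pixels (about 2.9 MB of RGBA), while a scale of 0.5 means 45,440 pixels (about 180 KB). That is 16 times less to draw, to convolve, and to hold in memory.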
LiveFrost obtains a pretty stable and high frame rate by using these simple rules.
By default, LiveFrost uses CADisplayLink to drive update notifications. Unlike an NSTimer, which fires at fixed intervals, CADisplayLink allows you to synchronize drawing with the refresh rate of the display. By using CADisplayLink, you can be sure that on every invocation you get to draw and update the exact frame, in exactly the run loop mode you specified. Not so with NSTimer, which is also scheduled on a run loop but does not care about the screen.
The only weakness is that by default, CADisplayLink does not pause. LiveFrost will be convolving the same image over and over even if the underlying view has not been updated. This is, generally speaking, a design tradeoff to avoid exposing more interface than necessary, but you can always take the LFFrostView off screen when done.
If you’re trying to do something with OpenGL ES, you can look into the LFDisplayBridgeTriggering protocol:
@protocol LFDisplayBridgeTriggering <NSObject>
- (void) refresh;
@end
By default, interfacing with the display link is done through LFDisplayBridge, which holds a mutable, unretained, unsafe set of pointers to LFFrostView instances. If you pause the display link within LFDisplayBridge, you can still control the actual refreshes by calling [[LFDisplayBridge sharedInstance] refresh]. However, if you’re not overlaying UIKit things over your OpenGL ES view, you might consider just convolving things with OpenGL ES directly, without touching LiveFrost.
Like this, if you’re feeling adventurous:
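#import <objc/runtime.h>

LFDisplayBridge *displayBridge = [LFDisplayBridge sharedInstance];
// Reach into the private "displayLink" ivar by name; this breaks if the ivar is ever renamed.
Ivar displayLinkIvar = class_getInstanceVariable([displayBridge class], "displayLink");
CADisplayLink *displayLink = object_getIvar(displayBridge, displayLinkIvar);
displayLink.paused = YES; // CADisplayLink has no -pause method, only a paused property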
Hardware Compatible Coding
This is pretty much a side note.
CGImageRef is a versatile wrapper, which means the image behind it may have to be decoded on demand. If you’ve ever profiled an app displaying a JPEG file obtained from the Internet, you’ll have seen lots of time spent decoding and converting the image to the GPU’s native format.
Fortunately, if you’re already drawing into a bitmap buffer, you have full control and the result does not require additional transcoding. It’ll be fast.