The ShadowRenderer
is an interface that can either create a shadow of an image or paint an image with a shadow. The most interesting method it includes is:
public ARGBPixels createShadow(ARGBPixels srcImage, ARGBPixels destImage, float kernelRadius, Color shadowColor);
The demo showcases 3-4 different renderers, but in production you probably are only interested in the two best-performing renderers:
BoxShadowRenderer
is the fastest, but it doesn't mimic a Gaussian blur/shadow well.DoubleBoxShadowRenderer
mimics a Gaussian blur/shadow, and it is a little bit slower.A simple invocation may resemble:
ShadowAttributes attr = new ShadowAttributes(dx, dy, kernelRadius, color); renderer.paint(graphics2D, srcImage, x, y, attr);
The renderer iterates over a source image and blurs each pixel. The actual RGB components of the source pixel don't matter: they're all replaced with the same color (usually a translucent gray or black).
The ShadowAttributes
object mentioned above is a simple data container bean for information about how to execute and position the blurred shadow.
Here is a chart showing the execution time for four renderers as the kernel (blur size) increases:
All the renderers execute in two passes: one horizontal and one vertical. As smarter people have long observed:
It turns out that a Gaussian is separable, which, if you're blurring an image, means you can blur in one dimension, then blur in the other, and get the same result as if you had done the whole thing at once.
The "Unoptimized Gaussian" is the baseline. It iterates over every pixel first in its vertical blur, and then it iterates over every pixel in its horizontal blur. It uses an int array as its kernel, and it faithfully multiplies every element of the kernel as it computes each output alpha value.
The "Optimized Gaussian" produces identical visual results, but is faster. On an older machine this was about 1/3 the execution time of the unoptimized blur. On my newer laptop (shown above, Apple M2 with 8 cores) it performs even faster. The biggest performance optimization is multithreading, which means machines will see different performance. There are a few other minor tweaks I found to squeeze out a little more performance here and there, but multithreading is the key distinction.
(I'm going to assume if you're reading this far that you're at least a little familiar with what a "box filter" is. There's lots of great articles on it, but if you go a-googling know that it is unfortunately similar in name to the CSS "box-shadow" property. In summary: a box filter applies a uniform kernel, and a formal Gaussian kernel is more of a 2D bell-curve.)
The "Double Box" is about 1/8th of the baseline time. This is also multithreaded. This is the result of applying two consecutive box filter renderers. I would argue that for most people: this is visually indistinguishable from a formal Gaussian blur. The two kernel box radii it uses to approximate a gaussian kernel are empirically calculated. (That is: there's a lookup table that tells us that a Gaussian kernel of "3.5" can be approximated by applying a box kernel of "0.7519531" and "0.7558594".)
The "Box" uses a simple box filter. This executes the fastest (at maybe 1/10th of the baseline execution time), but it is the only renderer in this profile that does NOT look like a Gaussian blur. Depending on your needs: it's probably safe to say some users won't care about this distinction. (A blur is usually a background decoration, after all.)
All of these renderers offer floating-point kernels. This is achieved by tweening two integer-based kernels. For example: if a 3px kernel is [1, 3, 1] and a 5px kernel is [1, 2, 4, 2, 1], then the shorter kernel is padded with a zero on either side and we tween two together. This feature helps us support animation so the user can smoothly fade from a smaller blur to a larger blur.
The "Box" zigzags in a saw-tooth pattern based on whether the kernel is an integer or a decimal value, because integers are simpler to calculate.