KrazyKey – pixel by pixel keying, with the goodness of CUDA

We’ve been adding CUDA support to many of our custom tools lately and finding speed-ups of 40x in some cases. This tool is an example of an idea that would be totally impractical (hence the krazy) without a highly parallel GPU-based approach. The algorithm takes each pixel in the input image, and finds the shortest distance (in RGB space) from it to the pixels in the “selection” image. This requires an every-pixel-to-every-pixel distance calculation, with the final shortest distance value compared to a threshold to determine whether or not to mask the pixel.
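To make the every-pixel-to-every-pixel idea concrete, here is a minimal CPU sketch in plain Python. It is not the plugin's code: the function names, sample colors and threshold are assumptions chosen for illustration, and a GPU version would spread the outer loop across many threads instead of running it serially.

```python
import math

def min_distance(pixel, selection):
    """Smallest Euclidean RGB distance from one pixel to any selection color."""
    return min(math.dist(pixel, s) for s in selection)

def within_threshold(input_pixels, selection, threshold):
    # Every input pixel is compared against every selection color,
    # so the work grows as len(input_pixels) * len(selection).
    return [min_distance(p, selection) <= threshold for p in input_pixels]

# Hypothetical sample data: two input pixels, two selection colors.
input_pixels = [(0.1, 0.8, 0.1), (0.9, 0.1, 0.1)]
selection = [(0.0, 1.0, 0.0), (0.2, 0.7, 0.2)]

keyed = within_threshold(input_pixels, selection, threshold=0.2)
print(keyed)
```

With a full HD Input and a Selection of the same size, that inner comparison would run roughly 4.3 trillion times per frame, which is why the serial version is impractical.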

Krazy Key, enriched with CUDA
Download KrazyKey 1.0

A keying operation masks out a subset of the RGB color cube. This subset is usually a contiguous region within the cube, with the final masking weights related in some way to the parameters of the shape’s geometry.

As an example, the ChromaKeyer makes an octahedron, but not a regular one; think of it as a box aligned to the color cube, but with 2 corners truncated along the white/black “desaturated” axis. Primatte uses a regular polyhedron, but with the vertices repositioned to enclose the appropriate bounding region. The main algorithm uses 128 faces in a closed shape. In both cases, forcing the sampling of a contiguous region may include color coordinates that you don’t want, no matter how complex the shape within the RGB cube.

KrazyKey does not create a polyhedron at all, but uses a point cloud to define the sampling region. It can thus enclose ANY color set at all, regardless of shape, and that set does not have to be contiguous. This means that any keying operation can be performed with just one KrazyKey, whereas you might need a few Primattes, or dozens of ChromaKeyers.
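The difference between a point cloud and a closed shape can be illustrated with a small Python sketch. The axis-aligned bounding box here is only a simple stand-in for "some contiguous region"; it is not how the ChromaKeyer or Primatte build their shapes, and the colors and threshold are made-up values.

```python
import math

def in_point_cloud(color, cloud, threshold):
    """True if the color lies within `threshold` of any point in the cloud."""
    return min(math.dist(color, c) for c in cloud) <= threshold

def in_bounding_box(color, cloud):
    """True if the color lies inside the box that encloses the whole cloud."""
    lo = [min(c[i] for c in cloud) for i in range(3)]
    hi = [max(c[i] for c in cloud) for i in range(3)]
    return all(lo[i] <= color[i] <= hi[i] for i in range(3))

# A non-contiguous selection: pure green and pure blue, nothing in between.
cloud = [(0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
teal = (0.0, 0.5, 0.5)        # sits between the two clusters
near_green = (0.0, 0.95, 0.0)

print(in_bounding_box(teal, cloud))            # the contiguous region swallows teal
print(in_point_cloud(teal, cloud, 0.1))        # the point cloud does not
print(in_point_cloud(near_green, cloud, 0.1))  # colors near a sample still match
```

Any single closed region that covers both green and blue also has to cover whatever lies between them; the point cloud only matches colors near an actual sample.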

Chad put together a demo comp that shows the subset of the color cube used by several common keying methods compared to what’s possible with KrazyKey.

Comparing keying geometries (red/cyan)

KrazyKey’s interface is pretty simple:

You input two images: the first (Input) is the image you want the key applied to, and the second (Selection) is the set of colors you want to compare against. Any pixels with an alpha of zero are ignored, which reduces the number of colors compared and also lets you mask out which portions of the color set to use. The Selection image is collapsed to a 1D vector, as any spatial information in the Selection image is useless.
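Collapsing the Selection image might look something like the Python sketch below. The image layout (rows of `(r, g, b, a)` tuples) and the function name are assumptions for illustration, and the sketch assumes the alpha test is applied to the Selection pixels.

```python
def flatten_selection(image):
    """Collapse a 2D RGBA image into a flat list of RGB colors.

    Pixels with an alpha of zero are dropped; pixel positions are discarded
    because only the colors matter for the distance search.
    """
    return [(r, g, b) for row in image for (r, g, b, a) in row if a != 0]

# Hypothetical 2x2 Selection image; two pixels are masked out by alpha.
selection_image = [
    [(0.0, 1.0, 0.0, 1.0), (0.5, 0.5, 0.5, 0.0)],
    [(0.1, 0.9, 0.1, 1.0), (0.0, 0.0, 1.0, 0.0)],
]

colors = flatten_selection(selection_image)
print(colors)
```

Every color removed here is one fewer comparison for each Input pixel, so masking the Selection tightly pays off directly in speed.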

The Threshold specifies the maximum distance between the Input pixel and its closest color from the Selection image.  If the closest distance is greater than this value, the pixel returns black.  Remember, this is operating in float 3D space, so the maximum distance between 2 colors in a clamped 0-1 color cube is sqrt(3).  In unclamped colors, the distances could be much larger.
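The sqrt(3) figure is just the length of the color cube's diagonal from black to white, which a few lines of Python can confirm. The unclamped value used here is an arbitrary example, not anything specific to the tool.

```python
import math

black = (0.0, 0.0, 0.0)
white = (1.0, 1.0, 1.0)

# Diagonal of the clamped 0-1 cube: sqrt(1 + 1 + 1).
max_clamped = math.dist(black, white)

# An unclamped (over-range) color can be much further away.
over_range = (4.0, 4.0, 4.0)
unclamped = math.dist(black, over_range)

print(max_clamped, unclamped)
```

So for clamped images a Threshold of sqrt(3) or more lets every pixel through, while float images with over-range values may need larger thresholds.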

Rectify Output returns the absolute value of the distances. Otherwise, the tool will output both positive and negative distances for each of the components of the vector. Without this option, integer images return black wherever a distance is negative, instead of the correct value. We strongly recommend using KrazyKey with float inputs.
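One way to picture Rectify Output is the Python sketch below. It assumes the output is the per-component offset between the pixel and its closest Selection color; the sign convention (selection minus pixel), the function name and the clamping step are all assumptions for illustration, not the tool's documented behavior.

```python
import math

def nearest_offset(pixel, selection, rectify=False):
    """Per-component offset from the pixel to its closest selection color."""
    nearest = min(selection, key=lambda s: math.dist(pixel, s))
    # Assumed sign convention: selection color minus pixel color.
    offset = tuple(n - p for p, n in zip(pixel, nearest))
    if rectify:
        offset = tuple(abs(c) for c in offset)
    return offset

pixel = (0.5, 0.5, 0.5)
selection = [(0.25, 0.75, 0.5), (1.0, 1.0, 1.0)]

signed = nearest_offset(pixel, selection)
rectified = nearest_offset(pixel, selection, rectify=True)

# An unsigned integer format cannot hold negatives, so they would clamp to 0.
clamped = tuple(max(0.0, c) for c in signed)

print(signed, rectified, clamped)
```

In this example the red component of the signed offset is negative, so an integer image would lose it, while the rectified output keeps its magnitude.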

The final option, Use cuda, is where the magic happens. If you don’t have CUDA, the tool will fall back to a CPU-only version, and will thus be slower, but that allows it to be more render-farm friendly. This toggle is also good for benchmarking your CUDA hardware.

Download Comparing keyer geometry

Download KrazyKey keying example

So give the tool a try and leave some feedback in the comments section.  We’re also curious about the performance of various CUDA hardware, so let us know what kind of speed improvements you’re getting.

5 Responses

  1. Chad Says:

    We forgot to include the CUDA dll’s in the zip, so if you downloaded earlier and it won’t work, try grabbing it again.

  2. Casey Basichis Says:

    I just stumbled upon your fantastic blog. Some really great posts here.

    Do you ever intend to develop an OFX version of this? That would be very exciting.

  3. Chad Says:

    My experience with OFX has been less than encouraging. Even The Foundry has to make special OFX plugins for each host, which sort of defeats the purpose. We also do a lot of our work in 16 bit float, so an OFX version wouldn’t be ideal. Now, if a new version of OFX comes along and becomes easier to use, then we might be more inclined, so we’ll just take the wait and see approach.

  4. Au Says:

    any plans for a Fusion 6 64bit version?

  5. Chad Says:

    At this point it might be more beneficial to switch it internally to an OpenCL plugin. There was some foolishness over at Nvidia when we had originally released this related to the 64-bit version of CUDA and what versions of VS it worked with or something like that. Things may have changed, so I’m not sure how easy/hard this is.
