Advanced tricks

Smart antialiasing

Antialiasing for free is easy in 2D shaders, as long as you can express shapes as distances: just blur borders over a pixel using smoothstep ( treated here ).
But this solution doesn’t work for 3D shaders, especially ray-tracing / casting / marching / surfing ones. Beside advanced complex techniques such as voxelization (since volumes are naturally smooth) or cone-tracing (many variants, quite involved), the basic solution is oversampling: consider a subresolution in each pixel and average (the theory of the perfect filtering kernel is a bit more complicated, but let keep it simple 🙂 ). This also applies to complex 2D shaders not simply based on distances.
The inconvenient is that it multiplies the cost of the shader by the sampling size N, with N = 3×3 being already costly and not perfectly anti-aliased. Still, most of the time the aliasing only occurs in a small proportion of the pixels in the window: at shape silhouettes, shadows border, highlights, etc. Oversampling these regions only should thus do the trick for not too expansive: this is adaptive sampling.

One way to implement it is to define your render function in Common tab, to call it once as usual in BufferA, and in Image tab to either just display it or add more samples if there is too much difference with neighbor pixels (your choice of criterion here).
Another way is to relate on the marvelous hardware derivatives again (a reminder that they apply to any variable or expression and are almost free). It’s not perfect ( 2×2 block neighborhood ) but the code is then ultra simple: just rename your main as mainImage0 and append this at the end of your code:

// === easy adaptive sampling. === https://shadertoyunofficial.wordpress.com/2021/03/09/advanced-tricks/
//                           more: https://www.shadertoy.com/results?query=easy+adaptive+sampling
void mainImage(out vec4 O, vec2 U) {
    mainImage0(O,U);
    if ( fwidth(length(O)) > .01 ) {  // difference threshold between neighbor pixels
        vec4 o;
        for (int k=0; k < 9; k+= k==3?2:1 )
          { mainImage0(o,U+vec2(k%3-1,k/3-1)/3.); O += o; }
        O /= 9.;
     // O.r++;                        // uncomment to see where the oversampling occurs
    }
}

To be adapted to higher sampling if you wish. Jittering the samples would even improve this basic scheme. You might even cascade this twice via a buffer to allow higher sampling on tough areas only.

Note that if your main was ending with a sRGB conversion pow(col,1./2.2) − as it should − , move it at the end of the new main. Antialiasing looks better with the proper final sRGB space ( a reminder that otherwise a 0.5 grey don’t perceptually appears as the mean intensity between black and white ).

Some examples.

Smart Gaussian blur

Hardware MIPmap is a cheap way to get blurred images (even on your Buffers: just activate the MIPmap filter at binding), but the filter is multi-crappy: box filter, axis-aligned, on power-of-two blocks (otherwise bilinearly interpolated), evaluated at power-of-two scales (and blending the 2 nearest levels otherwise).
Oppositely, convolving an image with a large kernel is very costly.

But many filters are smooth: you can consider them as piecewise box filters. Meaning, instead of applying them on the full resolution image, you can apply them on an intermediate MIPmap LOD: choose the quality/cost compromise you wish by setting the number of samples you want to afford, then select the LOD to use accordingly. Using LOD L means saving the cost by a factor 4L .

Second trick: Gaussians are lovely functions, and the unique family that has the separability property: a 2D Gaussian is the product of two 1D Gaussians in x and y. Meaning: you can 1D filter horizontally in a buffer, then vertically in the next buffer: cost N instead of N². On top of that, the previous trick saves by a factor 2L.

See example.

Note that hardware built-MIPmap is very buggy on non-power-of-two textures (and horribly buggy on Windows) , so very large kernels kernels might show artifacts if you use the coarsest levels. Future shadertoy implementation will allow custom image size (then chose powers-of-two) and manually built MIPmap, but for now only CubeMap(A) is MIPmap friendly up to the coarsest level.

General Purpose MIPmap

Massively parallel programming with Cuda or Compute shaders allows more complex algorithm than GLSL, for instance by cascading passes of very different resolutions. This is typically required by some not parallel-friendly operations like sorting, compacting zeros, extracting&compacting values above a level (or the local max). Managing this in parallel leads to the smart stream reduction algorithms.
In GLSL we can at least access the single built-in one: hardware built-MIPmap. The trick is trying to use that for many general purposes, considering that is allows very fast integrals on pixels blocks (axis aligned, power-of-2 blocks, for power-of-2 sizes, but still). Examples:

  • Average color of an image ( = coarsest MIPmap ).
    Or average in large local blocks ( = less coarse level in nearest mode or via texelFetch ).
  • Standard deviation or variance of colors ( whole image, or per blocks ):
    Make a buffer containing the image values squared. Then, Variance = MIPmap(I²) - MIPmap(I)²
    Other applications: auto-constrast (another), white balance (better), select frequency, filtering LUTs, height/bump map filtering
  • Center of Mass of a grey image ( whole image, or per blocks ):
    Make a buffer containing uv*grey. Center of mass = MIPmap(uv*I) / MIPmap(I)
  • Order-2 Moments / inertia matrix J / fitting ellipse / covariant matrix :
    Make another buffer containing vec3(u²,v²,u*v)*grey. Then Jxx = MIPmap(u²I)/MIPmap(I) - (MIPmap(uI)/MIPmap(I))² . Same for Jyy and Jxy → J = mat2(Jxx,Jxy,Jxy,Jyy)
    Other applications: iceberg simulator
  • Surface of an object in a mask / counting set pixels : it’s just the coarsest MIPmap level.
  • Collisions / Intersection:
    Have the 2 masks in R and G channel, make B = R*G. Coarsest MIPmap gives size of intersecting surfaces : collision is > 0.
    Some applications: packing shape with dart throwing, heat/salient map ordered algos (example)…
  • Building a min/max pyramid:
    using ( MIPmap(I-big) )big and ( MIPmap(Ibig) )-big
    Other applications: find extrema,…
  • Poor man Low-pass Filter, High-pass Filter, Differential of “Gaussians” filter (select a frequency band):
    just LOD, LOD – image, substract two LODs.
    Other applications: cartooning, …
  • Cheap high quality true Gaussian filter (see above):
    Do the convolution, but on a coarser image than the full resolution.
  • Detect high variations in neighborhood:
    Compare buffer value to MIPmapped value (LOD depending on neighbor size to consider. For 2×2 neighborhood, you can also consider hardware derivatives, i.e., fwidth() values ).
    Some applications: adaptive antialiasing (see above), …

More: All General Purpose MIPmap examples.

Note that hardware built-MIPmap is very buggy on non-power-of-two textures (and horribly buggy on Windows), which makes coarsest MIPmap levels unusable on buffers. Future shadertoy implementation will allow custom image size (then chose powers-of-two) and manually built MIPmap, but for now use CubeMapA (providing six 1024×1024 buffers in half floats ) if you need correct coarsest level.

( To be continued )

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s