Programming tricks in Shadertoy / GLSL

Many people start writing GLSL shaders as they would write C/C++ programs, not accounting for the fact that shaders are massively parallel and that the GLSL language offers many useful goodies such as vector ops and more. Others, not knowing how to deal with some basic issues, copy-paste very ugly or ill-adapted code designs, or reinvent the wheel (generally ending up with worse solutions than the ones polished by evolution 😉 ).

Here we address some basic patterns/tasks. For even more basic aspects related to the good use of the GLSL language and parallelism, please first read usual-tricks-in-shadertoy/GLSL . And for non-intuitive issues causing huge costs (or even crashes) at runtime or compilation time, read avoiding-compiler-crash-or-endless-compilation .

Normalizing coordinates

The window is a rectangle that can have various sizes (icon, sub-window in a browser tab, fullscreen) on different devices (including tablets and smartphones), and various aspect ratios (fullscreen vs sub-window, or different hardware, including long smartphone screens in landscape or portrait mode). So we usually start by normalizing coordinates. For some reason, many people use a very ugly pattern: first normalizing and distorting the window coordinates to [0,1]x[0,1] (minus 0.5 if centered), then applying an aspect ratio to undistort. Basic clean solutions are:

vec2 R = iResolution.xy,
  // U = fragCoord / R.y;                     // [0,1] vertically
     U = ( 2.*fragCoord - R ) / R.y;          // [-1,1] vertically
  // U = ( fragCoord - .5*R ) / R.y;          // [-1/2,1/2] vertically
  // U = ( 2.*fragCoord - R ) / min(R.x,R.y); // [-1,1] along the shortest side
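As a minimal sketch of how this fits into a complete Shadertoy image shader (the disc radius 0.5 and the black/white output are arbitrary choices for illustration):

```glsl
void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    vec2 R = iResolution.xy,
         U = ( 2.*fragCoord - R ) / R.y;      // [-1,1] vertically, origin at window center

    // The disc stays round whatever the window aspect ratio,
    // because both axes are divided by the same R.y.
    float inside = step( length(U), .5 );     // 1 inside the disc, 0 outside
    fragColor = vec4( vec3(inside), 1 );
}
```

Resizing the window changes how much of the horizontal axis is visible, but never the shape of the disc.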

Displaying textures and videos

Note that if you want to map an image onto the full window, thus with distortions, you do need to use fragCoord/R.
But if you want to map an undistorted rectangular image (typically, a video), things are a little more involved: see here. Since the typical video ratio happens to be not too far from the window ratio (on a regular screen), most people blindly rely on the “map to full window” above, but on smartphones it then looks totally distorted.
( Note that texelFetch avoids texture distortion in a simpler way, but then you no longer benefit from the hardware interpolation, rescaling and wrapping features. )
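One possible way to map an undistorted image is sketched below. It assumes the image or video is bound to iChannel0 and that iChannelResolution[0] already holds its size (it can be zero while a video is still loading); the "fit with black bars" policy is just one choice among others.

```glsl
void mainImage( out vec4 O, in vec2 u )
{
    vec2 R = iResolution.xy,
         T = iChannelResolution[0].xy,        // size of the image or video in iChannel0
         U = u / R;                           // [0,1]x[0,1], distorted window coordinates

    // Ratio of the window aspect to the texture aspect.
    float k = ( R.x / R.y ) / ( T.x / T.y );

    // "Fit" mode: stretch the coordinates around the center along the axis with spare room.
    U = .5 + ( U - .5 ) * ( k > 1. ? vec2( k, 1 ) : vec2( 1, 1./k ) );

    O = texture( iChannel0, U );
    // Black bars where the window extends beyond the image.
    if ( U.x < 0. || U.x > 1. || U.y < 0. || U.y > 1. ) O = vec4( 0, 0, 0, 1 );
}
```

When the window is wider than the image (k > 1) the image fills the height and bars appear left and right; otherwise it fills the width.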

Managing colors

Don’t forget sRGB / gamma !

Don’t forget that the channel intensities of image textures and videos are encoded in sRGB, and that you must re-encode the final shader color in sRGB yourself, while most synthesis and processing done in shaders is assumed to happen in linear (flat) space.
This is especially important for antialiasing, since returning 0.5 is really not perceived as mid-grey (test here), for color interpolation (see counter-example, and another), and for luminance computation on texture images and videos (NB: this encoding of intensity was historically chosen to account for the non-linear intensity distortion of CRT screens, then kept as a cheap perception-based compression, then as a normalization so that colors are understood the same way across multiple input and output devices).
Fortunately sRGB is close to a gamma 2.2 conversion: do fragColor = pow(col, vec4(1./2.2) ) at the very end of your program, and col = pow(tex,vec4(2.2)) after reading a texture image to be processed or combined (this does not apply to noise textures). Note that just doing fragColor = sqrt(col), resp. col = tex*tex, is a pretty good approximation.
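The decode / process / encode order can be sketched as follows. The helper names toLinear and toSRGB, the tint color and the use of iChannel0 are assumptions made for this illustration only.

```glsl
// Approximate sRGB <-> linear conversions, using the gamma 2.2 approximation.
vec3 toLinear( vec3 c ) { return pow( c, vec3(2.2) ); }
vec3 toSRGB  ( vec3 c ) { return pow( c, vec3(1./2.2) ); }

void mainImage( out vec4 O, in vec2 u )
{
    vec2 R = iResolution.xy, U = u / R;
    vec3 img  = toLinear( texture( iChannel0, U ).rgb );  // decode the image once read
    vec3 tint = vec3( 1, .5, .2 );                        // synthetic color, taken as linear
    vec3 col  = mix( img, tint, U.x );                    // blend in linear space
    O = vec4( toSRGB( col ), 1 );                         // encode at the very end
}
```

Commenting out the two conversions and comparing the gradient is a quick way to see how much the blend changes.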


Hue

Many people rely on a full, costly RGB2HSV conversion just to get a hue value.
This can be made a lot simpler using (see ref):

#define PI 3.14159265   // not built into GLSL
#define hue(v) ( .6 + .6 * cos( 2.*PI*(v) + vec4(0,-2.*PI/3.,2.*PI/3.,0) ) )  // looks better with a bit of saturation
// code golfed version:
// #define hue(v) ( .6 + .6 * cos( 6.3*(v) + vec4(0,23,21,0) ) )

For full RGB2HSV/HSL and back, see classical and iq references.
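As a small usage sketch of the hue macro, the shader below colors each pixel by its polar angle, producing a rotating color wheel. PI is not a GLSL built-in, so it is defined here; the rotation speed is an arbitrary choice.

```glsl
#define PI 3.14159265
#define hue(v) ( .6 + .6 * cos( 2.*PI*(v) + vec4(0,-2.*PI/3.,2.*PI/3.,0) ) )

void mainImage( out vec4 O, in vec2 u )
{
    vec2 R = iResolution.xy,
         U = ( 2.*u - R ) / R.y;
    float a = atan( U.y, U.x ) / ( 2.*PI );   // angle mapped to [-0.5,0.5]; hue is periodic
    O = hue( a + .1*iTime );
}
```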

Drawing thick bars

step( x0, x ) transitions from 0 to 1 at x0.
smoothstep( .0 , .01, x-x0 ) does the same with smooth transition.
To make a thick bar, rather than multiplying a 0-to-1 with a 1-to-0 transition, just do:

step(r/2., abs(x-x0) )
smoothstep(.0, .01 , abs(x-x0)-r/2. )  // smooth version

NB: above, 1 is outside. If you want 1 inside use 1.- above, or:

step( abs(x-x0), r/2. )
smoothstep( .01, .0,  abs(x-x0)-r/2. )  // smooth version


Antialiasing

Aliasing in space or in time is ugly and makes your shader look very newbie 😀 . Oversampling inside each pixel is very costly and gives only a modest improvement unless you use hundreds of samples per pixel. For algorithms like ray-tracing you have few alternatives (apart from complex techniques such as the screen-space, time-based denoising used in game programming). But for simple 2D shaders it’s often easy to get very good antialiasing almost for free, by using 1-pixel-wide smooth transitions at all boundaries. More generally, the idea is to return a floating point “normalized distance” rather than a binary “inside or outside”.
Typically, instead of if (x>x0) v=1.; else v=0. ( or  v = x>x0 ? 1. : 0. ), which are equivalent to v=step( x0, x ) , just use v = smoothstep( x0-pix, x0+pix, x ) where pix is the pixel width measured in your coordinates (e.g. pix=2./R.y if the vertical coord is normalized to [-1,1]). ( Or simply clamp( .5 + (x-x0)/(2.*pix) ,0., 1.) , which covers the same interval linearly. Note that smoothstep eats part of the transition interval so you need to compensate using at least pix = 1.5*pixelWidth. ). cf Example code.

// antialiased 2D ring or 1D bar of radius r around v0. (2D disc: v0 = 0 )
#define S(v,v0,r)  smoothstep( 1.5/R.y, -1.5/R.y, length(v-(v0)) - (r) )

When you see magic numbers like 0.01 in smoothsteps, tell the code author that it won’t scale (aliased as an icon, blurry in fullscreen) and that they should just use the true pixel width instead. Note that for 1-pixel-thin features, the result will look aliased if you forget the final sRGB conversion at the end of the shader.
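A minimal sketch of a resolution-independent antialiased disc follows. The radius is arbitrary, and the transition is written as 1 - smoothstep(lo, hi, d) with lo < hi, a deliberate choice here because the GLSL specification leaves smoothstep undefined when edge0 >= edge1 (even though the reversed form works on common hardware).

```glsl
void mainImage( out vec4 O, in vec2 u )
{
    vec2 R = iResolution.xy,
         U = ( 2.*u - R ) / R.y;              // [-1,1] vertically
    float pix = 2. / R.y;                     // width of one pixel in U units

    float d = length(U) - .5;                 // signed distance to a disc of radius 0.5
    float v = 1. - smoothstep( -.75*pix, .75*pix, d );  // 1.5-pixel-wide transition, 1 inside

    O = vec4( sqrt( vec3(v) ), 1 );           // cheap approximation of the sRGB encoding
}
```

The edge keeps the same apparent sharpness in icon size and in fullscreen, since the transition width is expressed in pixels rather than as a magic constant.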

Nastier functions are floor, fract and mod, since there is no simple way(*) to smooth their discontinuity the way we did for step. Still, these are often used with some final thresholding, which just has to avoid sitting right on the discontinuity: e.g., fract(x+.5)-.5 no longer has a discontinuity at x = 0 (or at any integer x).
(*) :  E.g. 1: see smoothfloor/smoothfract . E.g. 2: you might sometimes use clamp( sin(Pi*x)/Pi / pix, 0.,1. ) instead of int(x)%2 .

If the parameter value is not a simple scaling of the coordinates, it can be difficult to know the pixel size in these units. But GLSL hardware derivatives can do it for you: pix = fwidth(x) , at least if x does not oscillate wildly faster than the pixel rate. However, since this is a derivative, any discontinuity in x will cause an issue, while you were only interested in the coarse gradient. If x contains discontinuities like x=fract(x’) or x=mod(x’), then simply use x’ instead of x in fwidth, since it has the same gradient without the discontinuity. cf Example code.
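The following sketch illustrates the fwidth approach on a parameter that is a non-linear function of the coordinates (log-spaced rings). The choice of log, the scale 4 and the ring half-width 0.1 are assumptions for the illustration.

```glsl
void mainImage( out vec4 O, in vec2 u )
{
    vec2 R = iResolution.xy,
         U = ( 2.*u - R ) / R.y;

    float x   = 4. * log( max( length(U), 1e-4 ) ), // non-linear: pixel size in x units varies
          pix = fwidth( x ),                        // derivative taken BEFORE fract: no jump
          f   = fract( x + .5 ) - .5,               // periodic, continuous around integer x
          v   = clamp( .5 + ( abs(f) - .1 ) / pix, 0., 1. ); // 0 on the rings, 1 elsewhere

    O = vec4( sqrt( vec3(v) ), 1 );
}
```

Calling fwidth on f instead of x would produce a spurious spike wherever fract wraps around.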

Drawing lines

People solved this long ago, so you don’t need to reinvent the wheel 😉 .
The principle is to return the distance to a segment, then to use the “antialiased thick bar” trick above (cf #define S). Note that for a complex drawing you can first compute the min distance to all features then apply the antialiased-bar (and optional coloring) at the very end. You might even use dot(,) rather than length() so as to compute sqrt only once.

float line(vec2 p, vec2 a,vec2 b) { // --- distance to segment with caps
    p -= a, b -= a;
    float h = clamp(dot(p, b) / dot(b, b), 0., 1.);// proj coord on line
    return length(p - b * h);                      // dist to segment
    // We might directly return smoothstep( 3./R.y, 0., dist),
    //     but it's more efficient to factor all lines.
    // We can even return dot(,) and take sqrt at the end of polyline:
    // p -= b*h; return dot(p,p);
}

Depending on the use case, you might want the distance to an isolated segment (including caps at ends) or just to the capless segment.  cf Example code.
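Here is a sketch of the "min distance first, threshold once" idea applied to a three-segment polyline. The vertex positions and the half-thickness w are hypothetical values chosen for the illustration.

```glsl
float line( vec2 p, vec2 a, vec2 b ) {        // distance to segment with caps
    p -= a, b -= a;
    float h = clamp( dot(p, b) / dot(b, b), 0., 1. );
    return length( p - b * h );
}

void mainImage( out vec4 O, in vec2 u )
{
    vec2 R = iResolution.xy,
         U = ( 2.*u - R ) / R.y;

    // Hypothetical polyline A-B-C-D.
    vec2 A = vec2(-.8,-.5), B = vec2(-.3,.5), C = vec2(.3,-.5), D = vec2(.8,.5);

    // Min distance over all segments first...
    float d = min( line(U,A,B), min( line(U,B,C), line(U,C,D) ) );

    // ...then one single antialiased threshold for the whole drawing.
    float w = .02, pix = 2. / R.y;            // half-thickness, pixel width
    float v = 1. - smoothstep( -.75*pix, .75*pix, d - w );
    O = vec4( sqrt( vec3(v) ), 1 );
}
```

Because the caps are round, consecutive segments join cleanly at the shared vertices.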

Blending / compositing

When you splat semi-transparent objects, or as soon as you use antialiasing, you must composite these semi-transparent layers rather than set or add colors, or you will suffer artifacts.
Below, C is pure object color in RGB and opacity in A, O is current and final color.

Drawing assumed to be from front to back (i.e. closest first):
(which allows you to stop as soon as opacity reaches 100%, or some threshold like 99.5%)

O += (1.-O.a) * vec4( C.rgb, 1 ) *C.a;

Drawing assumed to be from back to front stage (i.e. closest last):

O = mix( O, vec4( C.rgb, 1), C.a );
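A sketch of the front-to-back formula in a loop, with the early exit mentioned above. The five overlapping discs, their colors, the 0.7 opacity and the grey background are all hypothetical.

```glsl
void mainImage( out vec4 O, in vec2 u )
{
    vec2 R = iResolution.xy,
         U = ( 2.*u - R ) / R.y;
    float pix = 2. / R.y;

    O = vec4(0);                                         // nothing accumulated yet
    for ( int i = 0; i < 5 && O.a < .995; i++ ) {        // closest layer first, early exit
        float k   = float(i);
        vec2  c   = vec2( -.6 + .3*k, .1*sin(k) );        // hypothetical disc center
        float cov = clamp( .5 - ( length(U-c) - .35 ) / pix, 0., 1. ); // antialiased coverage
        vec4  C   = vec4( .5 + .5*cos( k + vec3(0,2,4) ), .7*cov );    // color + opacity
        O += ( 1. - O.a ) * vec4( C.rgb, 1 ) * C.a;       // front-to-back compositing
    }
    O += ( 1. - O.a ) * vec4( .1, .1, .1, 1 );            // opaque background behind everything
    O = vec4( sqrt( O.rgb ), 1 );                         // cheap sRGB approximation
}
```

Note that O holds opacity-weighted (premultiplied) color during accumulation, which is why the background can be added with the very same formula.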

Vector maths

First, a reminder that GLSL natively knows about vectors, matrices, vector geometry operations and blending operations; even most ordinary math functions work on vectors: see here. Besides geometry, vectors can also be used for RGBA colors, for complex numbers, etc. Each time you want to do the same thing on x,y,z (for instance), use them ! The perf won’t be a lot better, but the readability of the code will be, which helps reasoning, bug chasing and code evolution.

In addition it’s often convenient to add some more vector constructors like:

#define CS(a)        vec2( cos(a), sin(a) )
#define cart2pol(U)  vec2( length(U), atan((U).y,(U).x) )
#define pol2cart(U) ( (U).x * CS( (U).y ) )

Some operations on complexes: ( vec2 Z  means  Z.x + i Z.y  )

// add, sub;  mul or div by float : just use +, -, *, /
#define cmod(Z)     length(Z)
#define carg(Z)     atan( (Z).y, (Z).x )
#define cmul(A,B) ( mat2( A, -(A).y, (A).x ) * (B) )  // by deMoivre formula
#define cinv(Z)   ( vec2( (Z).x, -(Z).y ) / dot(Z,Z) ) 
#define cdiv(A,B)   cmul( A, cinv(B) )
#define cpow(Z,v)   pol2cart( vec2( pow(cmod(Z),v) , (v) * carg(Z) ) )
#define cexp(Z)     pol2cart( vec2( exp((Z).x), (Z).y ) )
#define clog(Z)     vec2( log(cmod(Z)), carg(Z) )
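As a usage sketch of cmul, the classic Mandelbrot iteration becomes a one-liner. The view window, the 64-iteration cap and the grey-level coloring are arbitrary choices for this illustration.

```glsl
#define cmul(A,B) ( mat2( A, -(A).y, (A).x ) * (B) )

void mainImage( out vec4 O, in vec2 u )
{
    vec2 R = iResolution.xy,
         c = 1.3 * ( 2.*u - R ) / R.y - vec2( .6, 0 ),  // point of the complex plane
         z = vec2(0);
    int i = 0;
    for ( ; i < 64 && dot(z,z) < 4.; i++ )
        z = cmul( z, z ) + c;                            // z <- z*z + c
    O = vec4( vec3( float(i) / 64. ), 1 );               // brighter = slower escape
}
```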

For rotations, the simplest is to just return the 2D matrix (even for 3D axial rotations):

#define rot(a)      mat2( cos(a), -sin(a), sin(a), cos(a) )
// use cases:
vec2 v = ... ; v *= rot(a); // attention: left-multiply reverses angle 
vec3 p = ... ; p.xy *= rot(a.z); p.yz*= rot(a.x); ...

Note that the optimizer recognizes identical formulas and won’t evaluate sin and cos twice.

Just for fun, the code golfed version 🙂 :  mat2( cos( a + vec4(0,33,11,0)) )
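A small usage sketch of rot on window coordinates, drawing a spinning antialiased square (its half-size 0.4 is arbitrary). Since the matrix is applied to the coordinates rather than to the shape, the square appears to turn in the opposite direction of the angle.

```glsl
#define rot(a) mat2( cos(a), -sin(a), sin(a), cos(a) )

void mainImage( out vec4 O, in vec2 u )
{
    vec2 R = iResolution.xy,
         U = ( 2.*u - R ) / R.y;
    float pix = 2. / R.y;

    U *= rot( iTime );                        // rotate the coordinates, not the shape
    vec2  q = abs(U) - .4;                    // per-axis offset from the square's sides
    float d = max( q.x, q.y );                // negative inside the square
    float v = clamp( .5 - d / pix, 0., 1. );  // antialiased coverage, 1 inside
    O = vec4( sqrt( vec3(v) ), 1 );
}
```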

Computing random values

    • Sometimes we need the equivalent of drand(), i.e. a linear congruential series, which can easily be reimplemented explicitly. cf wikipedia.
    • But most of the time what we really need is a hash value, i.e. a different random value for each pixel, or grid cell, or 3D coord, or 2D+time, etc. And this hash might be a scalar or a vector.
      • For simple use cases, you might rely on the shadertoy 2D or 3D noise textures in grey or RGBA, see special-shadertoy-features . (Take care not to interpolate, and to hit texel centers, if you really want a hash, possibly using the nearest flag or texelFetch). Still, the precision is limited (8 bit textures, 64 or 256 resolution).
      • Early integer-less shading languages popularized old-school cheap float-based hashes relying on the chaotic least-significant bits after a non-linear operation. (The magic values are important and come from the dawn of the computer science age.)
        #define hash21(p) fract(sin(dot(p, vec2(12.9898, 78.233))) * 43758.5453)
        #define hash33(p) fract(sin( (p) * mat3( 127.1,311.7,74.7 , 269.5,183.3,246.1 , 113.5,271.9,124.6) ) *43758.5453123)

        see many variants here. A problem is that precision is hardware (and compiler) dependent, so random values can vary between users. Plus, p must be neither too small nor too big: on poor 16 or 24 bit hardware the random value might just always be zero.

      • Since WebGL2 we can now rely on robust, precise (but a bit costlier) integer-based hashes: see reference code , especially the GlibC or NRC refs in Integer Hash – II.
        They usually eat an unsigned, so take care when casting from floats around zero (since [u]int(-0.5) = [u]int(0.5) ).
      • Attention: the variant introduced by Perlin based on permutation tables is very inefficient in shaders since arrays and texture fetches are ultra-costly, and cascading dependent access of 3D-to-1D wrap is not pipeline-friendly as well.
    • You might not want a hash, but a continuous random noise function. Depending on your needs,
      • you might then be happy with a simple value noise (e.g. simple noise texture with interpolation, or analytic using ref codes),
      • splined value noise,
      • or more costly gradient noise (see ref codes),
      • up to full Perlin noise (gradient + spline interpolation + fractal. NB: Perlin published 3 different algorithms over time: Classical, Improved, Simplex).
        Attention: many shaders or blogs named “Perlin noise” in fact just fake a simple gradient or even value noise, with random rotations through scales to mask artifacts. This might be ok for you, but don’t mistake it for what it is not. Conversely, it’s not a good idea for performance to use the permutation tables for the hashes.
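To make the integer-hash and value-noise options above more concrete, here is a sketch. The mixing constants come from a commonly circulated 32-bit integer hash and are an assumption of this example, not the GlibC or NRC variants of the reference code; the names hashU, hash21i and valueNoise are hypothetical.

```glsl
// Integer mixing function (constants are an assumption, see above).
uint hashU( uint x ) {
    x ^= x >> 16;  x *= 0x7feb352dU;
    x ^= x >> 15;  x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

// Hash of a 2D integer cell, mapped to [0,1).
float hash21i( ivec2 p ) {
    uvec2 q = uvec2( p );                        // int -> uint keeps the bits, fine for negatives
    uint  h = hashU( q.x + hashU( q.y ) );
    return float( h >> 8 ) / 16777216.;          // keep 24 bits: exactly representable as float
}

// Simple value noise: smooth interpolation of the hash at the 4 cell corners.
float valueNoise( vec2 p ) {
    vec2  f = fract( p );
    ivec2 i = ivec2( floor( p ) );               // floor BEFORE the cast, so -0.5 and 0.5 differ
    f = f*f*( 3. - 2.*f );                       // smooth fade between corners
    return mix( mix( hash21i( i ),              hash21i( i + ivec2(1,0) ), f.x ),
                mix( hash21i( i + ivec2(0,1) ), hash21i( i + ivec2(1,1) ), f.x ), f.y );
}

void mainImage( out vec4 O, in vec2 u )
{
    vec2 R = iResolution.xy, U = u / R.y;
    float v = u.x < .5*R.x ? hash21i( ivec2(u) )           // left half: one hash per pixel
                           : valueNoise( 8.*U + iTime );   // right half: continuous noise
    O = vec4( vec3(v), 1 );
}
```

Flooring before the integer cast is what avoids the symmetric cells around zero mentioned above.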


