Usual tricks in Shadertoy / GLSL

Ever noticed the ‘?’ icon at the bottom-right of the source area? 🙂
And the “Shader Inputs” on the top-left? 🙂
These are, respectively, a GLSL ES summary and a list of the Shadertoy variables.

Don’t get hypnotized by names

  • The Shadertoy API defines fragCoord and fragColor, so don’t access GLSL’s gl_FragCoord and gl_FragColor directly, or you might break things some day. Using discard in Buffers might also have surprising side effects.
  • Also, the Image tab is not the true beginning of the GLSL shader, so don’t use pragmas and special declarations meant for the top lines of pure shaders.
  • fragColor is not initialized: be sure to do it, even if it seems to work now on your machine. That is often not the case on other machines, or once the shader gets more complex.
  • It’s your program, so these names can be changed. I personally prefer U and O, as a kind of math notation (giving shorter, more readable expressions if you are a science nerd like me 🙂 ).
  • These are variables like any others, so you can reuse them: fragCoord /= iResolution.y;  fragColor = pow(fragColor,vec4(1./2.2));
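
As an illustration of the points above, here is a minimal Image-tab sketch. The short names O and U are just the personal convention mentioned in the list, and the colors are arbitrary.

```glsl
// Shadertoy Image tab: the parameter names are yours to choose.
// O plays the role of fragColor, U the role of fragCoord.
void mainImage( out vec4 O, vec2 U )
{
    U /= iResolution.y;             // reuse the input as normalized coordinates
    O  = vec4( U, .5, 1. );         // the output is written before it is ever read
    O  = pow( O, vec4(1./2.2) );    // reuse the output: gamma encoding
}
```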

GLSL already knows vectors and matrices

  • including operations like length, distance, normalize, dot, cross and mat*vec
  • Many operations work directly on vectors (acting on each component)
  • Some special operations do boolean operations on vectors (all, any, lessThan…)
  • Many operations implicitly extend floats to vectors ( V+2., V/=2., step(0.,V), clamp(V,0.,1.), mix(V0,V1,.2)… )
  • and constructors are also casters ( e.g. V=vec4( x>y , 0 , vec2(x) ) ).
  • In particular, you can do V=vec2(0,0), or even V=vec2(0).
    Note that mat(d) sets up a diagonal matrix with d along the diagonal: mat(1) is the identity matrix.
  • GLSL already knows 3D graphics operations such as reflect, refract, faceforward
  • GLSL provides many goodies like clamp, mix (linear interpolation), smoothstep (Hermite weighting+clamp)
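
A small sketch combining several of the features listed above. The variable names (R, V, W, inside) and the colors are illustrative choices, not part of any API.

```glsl
void mainImage( out vec4 O, vec2 U )
{
    vec2  R = iResolution.xy,
          V = ( 2.*U - R ) / R.y;                        // floats and vectors mixed freely
    float d = length(V);                                 // built-in vector operation
    vec2  W = mat2( 0,-1, 1,0 ) * V;                     // mat*vec: quarter turn, W = (V.y,-V.x)
    bool  inside = all( lessThan( abs(V), vec2(.5) ) );  // per-component test, then reduction
    O = vec4(0);                                         // the int is cast and replicated
    O.rgb = mix( vec3(1), vec3(.1,.2,.4), smoothstep(.2,.8,d) );  // radial gradient
    if ( inside ) O.rg = .5 + .5*W;                      // show the rotated coordinates in the square
    O = clamp( O, 0., 1. );                              // clamp acts on each component
}
```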

NB: Complex arithmetic is easily implemented with vectors and matrices:

  • Use vec2 for the definition,  + - between complex numbers,  * / by a float (to add a real a, use vec2(a,0): z + a would add a to both components)
  • complex multiplication of z1 by z2 is mat2(z1,-z1.y,z1.x) * z2
  • complex division of z1 by z2 is z1 * mat2(z2,-z2.y,z2.x) / dot(z2,z2)
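
The two formulas above can be wrapped in small helpers. This is only a sketch: the names cmul and cdiv are hypothetical, and the function plotted in mainImage is an arbitrary test case.

```glsl
// Complex numbers stored as vec2( real part, imaginary part ).
vec2 cmul( vec2 z1, vec2 z2 ) {        // z1 * z2 ; e.g. cmul(vec2(0,1),vec2(0,1)) = vec2(-1,0)
    return mat2( z1, -z1.y, z1.x ) * z2;
}
vec2 cdiv( vec2 z1, vec2 z2 ) {        // z1 / z2 = z1 * conj(z2) / |z2|^2
    return z1 * mat2( z2, -z2.y, z2.x ) / dot( z2, z2 );
}

void mainImage( out vec4 O, vec2 U )
{
    vec2 R  = iResolution.xy,
         z  = 2.* ( 2.*U - R ) / R.y,  // pixel mapped to the complex plane
         z2 = cmul( z, z ),
         w  = cdiv( z2 - vec2(1,0), z2 + vec2(1,0) );     // (z^2-1)/(z^2+1), poles at +i and -i
    O = .5 + .5*cos( 6.283*length(w) + vec4(0,2,4,0) );   // color bands following |w|
}
```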

More about maths & complex operations on GLSL here.

C tricks still good for GLSL perfs and programming ease

  • pow(x,y) computes exp(y*log(x)): costly, not valid for x<0, and not perfectly precise.
    • for x^2, prefer x*x !
    • for 2^x, use exp2(x). The inverse, log2, also exists (both exist in most languages 🙂 ).
  • atan also comes in a 2-parameter version, atan(y,x), doing what atan2(y,x) does in C.
  • x = cond ? v1 : v0 can be useful, especially for cascaded small expressions.
  • There should be a law punishing people who write if cond then x=true else x=false. Just write x = cond directly 😉
  • Macros can be convenient substitutes for templates (e.g. expressions valid for floats, vec2, vec3, vec4).
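
A sketch gathering these tricks in one shader. The macro name sq and the colors are illustrative; note that a macro like this evaluates its argument twice, so keep the argument cheap and free of side effects.

```glsl
// Macro used as a poor man's template: valid for float, vec2, vec3 and vec4 alike.
#define sq(x)  ( (x)*(x) )

void mainImage( out vec4 O, vec2 U )
{
    vec2  R   = iResolution.xy,
          V   = ( 2.*U - R ) / R.y;
    float r2  = sq(V.x) + sq(V.y),      // x*x rather than pow(x,2.)
          a   = atan( V.y, V.x ),       // 2-parameter atan (undefined only at the exact origin)
          k   = exp2( -4.*r2 );         // 2^x via exp2 rather than pow(2.,x)
    bool  top = V.y > 0.;               // x = cond, no if/else needed
    vec3  col = top ? vec3(1,.6,.2) : vec3(.2,.6,1);   // ternary for a small choice
    O = vec4( col * k * ( .5 + .5*cos(6.*a) ), 1 );
}
```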

Perf  vs  readability tricks

You might like to optimize your code (more or less deeply). Still, know that:

  • Constant expressions are evaluated at compile time, and the optimizer knows how to turn /const into *const, so you can keep the code readable at no cost: prefer uv *= sqrt(3.)/2.; to code full of black-magic approximate (or long) constants.
  • Unlike C/C++, GLSL unrolls all the loops it can (and inlines function calls as well), so loop variables are in fact constants too at the “real” final compilation, and so are many derived expressions (or at least parts of them). E.g. vec2 P = vec2(cos(2.*Pi/3.*i),sin(2.*Pi/3.*i)) in a loop on i will in fact be evaluated at compile time.  So some cheesy optimizations lose readability for no perf gain (and sometimes even cause a perf loss).
  • Personally, I find it cumbersome to read float(int) casts all over the code. You can harmlessly loop on floats most of the time (except when integer precision is really required). You might then even loop from 0 to 1 in small steps rather than using float(i)/float(N) all over the place. The semantics are then closer to continuum geometry or integration (i.e. the loop variable is a segment parameterization).
  • IEEE 32-bit floats can store at most 6-7 significant digits. So it’s useless (and deceptive) to define Pi = 3.14159265358979323846264338327950288419716939937510582097 .
    Conversely, Pi = acos(-1.) is not guaranteed to give maximum precision.
    Then again, most of the time your algorithm does not really require a high-precision Pi. But sometimes it does (programmers should know when precision is crucial).
  • You don’t need to spell out 2.0f : 2. is just as good. And constructors are auto-casters, so vec4(0) is ok.
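
A minimal sketch of a loop on floats, with a short Pi constant and no cast anywhere. The step 1/16 is chosen because it is exactly representable in binary, so the iteration count is exact. With a step such as 1/12, rounding could add or drop an iteration, which is one of the cases where integer precision matters.

```glsl
#define PI 3.1415927     // enough digits to select the nearest 32-bit float

void mainImage( out vec4 O, vec2 U )
{
    vec2 R = iResolution.xy,
         V = ( 2.*U - R ) / R.y;
    O = vec4(0);
    // t directly parameterizes the circle: 0, 1/16, ... 15/16, i.e. 16 iterations.
    for ( float t = 0.; t < 1.; t += 1./16. ) {
        vec2 P = .7 * vec2( cos(2.*PI*t), sin(2.*PI*t) );   // point on the circle
        O += 1. - smoothstep( .05, .06, length(V-P) );      // small disc centered at P
    }
}
```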

More non-intuitive performance, compile-time (or crash) issues related to how GLSL compiles are covered here.

Incomplete integer operations

In WebGL1 many integer operations are missing. And anyway, it is sometimes simpler to do them on floats and then cast (or not). But beware of precision loss. Still,

  • integers up to 16,777,216 (2^24) are exactly represented by 32-bit IEEE floats (shown here).
  • + - * will thus be exact as long as the result stays in that range… But not the division: x/x might not be exactly 1.
  • exp2 and log2 directly read the mantissa and exponent, so they are lossless
  • In particular, << and >> can be represented by *exp2(n) and *exp2(-n)
  • mod on ints stored in floats has precision bugs.
    You can do mod directly on ints with  a % b = a-a/b*b
  • Note that you can loop on floats to avoid loads of casts.
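
The shift and modulo formulas above might be wrapped as follows. The helper names (shiftLeft, shiftRight, imod) are hypothetical, and the floor in shiftRight is an addition of this sketch, used to drop the bits shifted out. It assumes, as stated above, that exp2 is exact for integer arguments.

```glsl
// Integer arithmetic carried by floats (exact while values stay below 2^24 = 16777216).
float shiftLeft ( float x, float n ) { return x * exp2( n); }            // x << n
float shiftRight( float x, float n ) { return floor( x * exp2(-n) ); }   // x >> n

// Modulo on true ints, for GLSL versions lacking the % operator.
int imod( int a, int b ) { return a - a/b*b; }

void mainImage( out vec4 O, vec2 U )
{
    float x = floor(U.x),                           // integer-valued float: pixel column
          n = floor( 8.*U.y / iResolution.y );      // bit index 0..7, one per horizontal band
    int bit = imod( int( shiftRight(x,n) ), 2 );    // bit n of x
    O = vec4( bit );                                // the constructor casts the int to floats
}
```

Each horizontal band then shows one bit of the column index as a black and white pattern whose period doubles from one band to the next.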

GLSL runs pixels in parallel

  • So it can compute the derivative of any variable for free ! dFdx, dFdy, fwidth
    (The precision is approximate, though: uncentered finite differences within 2×2 blocks)
  • Think parallel. Long initializations and array definitions won’t be shared between pixels, since the whole shader is run at every pixel. So most of the time you save code, memory and registers by merging the initialization and action loops.
  • A reminder that local memory and the number of registers are ultra-critical resources on a GPU. In particular, arrays eat a lot of resources. Use them when they are *really* the only solution (or in small harmless cases).
  • Think procedural: to draw 1000 objects on screen, don’t draw and clamp all of them in your shader, i.e. with the full set checked at each pixel. Try to find the one(s) covering the pixel, then render only those. In particular, for regular grids use mod/fract rather than loops. Cf example.
  • Texture accesses are also costly (e.g. in big loops), especially with random access or when depending on a previous computation.
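
A sketch of the procedural approach for a regular grid of discs: no loop, each pixel finds its own cell with floor/fract, and fwidth gives the pixel size for antialiasing. The grid density, radius and colors are arbitrary choices.

```glsl
void mainImage( out vec4 O, vec2 U )
{
    vec2  V     = 10. * U / iResolution.y,     // 10 cells along the screen height
          cell  = floor(V),                    // index of the only disc that can cover this pixel
          local = fract(V) - .5;               // position relative to the cell center
    float d     = length(local) - .35,         // signed distance to that disc
          px    = fwidth(V.x),                 // size of one pixel in V units, from derivatives
          mask  = clamp( .5 - d/px, 0., 1. );  // about one pixel of antialiasing
    vec3  col   = .5 + .5*cos( cell.x + 2.*cell.y + vec3(0,2,4) );   // one color per cell
    O = vec4( col * mask, 1 );
}
```

The derivative is taken on V rather than on fract(V), since the latter jumps at cell borders and would give wrong values there.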

Texture tricks (GLSL or Shadertoy)

  • MIPmap is simply activated by switching the texture filter mode. Still,
    • You can bias it (force less or more blur) via a third parameter in the texture call.
    • At parameterization discontinuities the automatic estimation of the LOD might be very wrong => you can force it using texture…LodEXT
    • Note that MIPmap can be used to approximate integrals.
  • texture…gradEXT samples the texture using derivatives that you provide explicitly, instead of the automatically estimated ones
  • Noise color texture: G and A channels are R and B translated by (37.,17.) , if no vflip. This allows to fake interpolated 3D noise.
  • tex15 is an ordered Bayer matrix : first made for easy half-toning (just threshold it with the grey level), it also provides a permutation table in [0,63].
  • Shadertoy buffers can be used to precompute data.
    More generally, in a multi-buffer algorithm, if the result of a buffer is not expected to change, do the computation only at iFrame==0  (or up to a delay, if asynchronous data such as images are used).
  • Note that the sound texture includes the FFT of the music.
  • Check for the magic keyboard matrix 🙂
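
A sketch of a precomputing buffer using the LOD controls, written for the current (WebGL2) Shadertoy where texture, textureLod and texelFetch are available. The setup is an assumption: this code sits in the Buffer A tab, Buffer A itself is bound to iChannel0, and an image with the mipmap filter is bound to iChannel1. WARMUP is a hypothetical delay.

```glsl
#define WARMUP 30   // hypothetical delay (in frames) leaving time for the image to load

void mainImage( out vec4 O, vec2 U )
{
    if ( iFrame < WARMUP ) {                      // compute only during the first frames
        vec2 uv      = U / iResolution.xy;
        vec4 blurred = texture   ( iChannel1, uv, 3. ),   // automatic LOD biased by +3 levels
             mean    = textureLod( iChannel1, uv, 99. );  // LOD forced (clamped) to the coarsest level
        O = blurred - mean;                       // buffers store unclamped floats: negatives are kept
    }
    else
        O = texelFetch( iChannel0, ivec2(U), 0 ); // afterwards, just copy the stored result
}
```

The coarsest MIPmap level holds roughly the average color of the image, which is the "approximate integral" mentioned above.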

More “touchy” tricks (e.g. for code golfing)

  • The final alpha is ignored, so you can work directly on fragColor or do vec4(myGreyShader).
  • The final color is naturally clamped (your screen pixel won’t be negative or over-bright 🙂 ), so for the final image operation you can skip the last clamp.
    ( Of course this can be wrong for intermediate calculations, and buffers do store unclamped floats. )
  • You must initialize variables, including out parameters (such as fragColor).
    But v -= v will work 99.9999% of the time. The only theoretical issue is when the reused register happens to hold NaN by chance (which I’ve never seen occur up to now).

Advanced super-tricks

Want to use your own texture? Read the section Extending Shadertoy & more.
(But it won’t be saved, and other people won’t see it unless they do the same insert.)


See the dedicated section.


2 thoughts on “Usual tricks in Shadertoy / GLSL”

  1. I really dislike that many people define pi via its digits. I don’t see why `const acos(-1.)` should have precision problems; or one can use `radians(180.)` instead.


    • I would have preferred a rational motive rather than an “I really dislike”. 😉

      Let me develop mine:
      programming is not doing abstract maths: floats can store only limited precision, operators are numerical algorithms providing limited precision (possibly uneven, especially on a GPU), plus you use Pi (or sqrt, or acos, etc.) within a context that has its own requirements: some contexts require high precision (rarely good news BTW) and some really don’t ( 3 for Pi might sometimes be good enough 😉 , especially if the rest of the formula contains coarser approximations ).
      To me, beliefs and magical thinking are dangerous traps for programmers, a source of hard-to-find bugs, and a limit to understanding what really occurs in calculations. When calculations are involved, I prefer people to think about what happens precision-wise. I see way too many #define PI with 20-30 digits (giving an illusory impression of precision while only 6-7 digits will be stored), or people thinking that the code formula mimicking the true closed-form math formula “is somehow the same thing” and thus might give more precision than a dry explicit number.
