Classical corner cases

The following addresses more specific issues, simple or more involved:

Accessing texel center

Textures are strange arrays indexed by a float. Once rescaled by the texture size, texel centers happen to correspond to coordinates integer+0.5 , as for window pixels (but unlike iMouse coordinates). When the texture is supposed to contain non-interpolatable data, it is easy to misfetch the value. Turning the texture interpolation flag to nearest is often a bad idea since if you misfetched the precise texel location, rounding to integer can easily access to the wrong plain value. So you do have to get your formula exact !
Note that if you really want to access the texture as an array, there is now texelFetch( iChannel0, ivec2(U), lod ).  Integer index, no interpolation or wrapping (so possibly faster to), and more easy to fetch the right place in 🙂 .

Please no black icons: Demo mode on mouse-tuned shaders

Many shaders are reactive to the mouse, which is nice. But a lot of them forget that Mouse==vec4(0) at start and for icon display, leading to black or uninteresting image.
→For your mouse-tuned shader, always think about detecting the zero case and set special demo values – possibly animated – then !

Two possibilities (among many):

if (iMouse.xy==vec2(0)) set-nice-params; else set-params(mouse);
vec2 M = iMouse.xy; M = length(M)<10. ? fake-M-value(t) : normalization(M);

The second case allows the user to easily get back to demo-mode by clicking in the bottom-left corner.

One more thing about initial mouse value:
some detects that the mouse is currently clicked by the sign of Just take care that in the icon and at start == 0 as well, which is not negative despite not clicked 😉 .  More details about Shadertoy iMouse uniform here.

iFrame  vs  iTime  vs  iDate(.w)

  • When you really want the viewer experience to be smooth, i.e. synchronized with his own time (camera motion, simulation), use iTime. Using iFrame would yield a motion with some stops and changing speed, that would differs among users system, and with screen resolution.
  • Speaking of simulation: in physical simulations all equations should refer to real units of space and time to account for various window resolution and shader framerate, and thus should rely on some dx,dy and dt variables. iTimeDelta let you know the real dt since the last frame. NB: it’s usually better to use a relaxation to smooth it along several frames, for isolated micro-freezes don’t corrupt the simulation: dt = mix(dt, iTimeDelta, .05); // smooth on 20 frames
  • iDate.w, giving (micro)seconds since midnight, is useful when you need to really have unique seeds even at shader initialization – along the run-time iTime is generally sufficient since FPS is never exactly regular. If you use the fractional part of iDate.w, take care that at the end of day (~86000 sec) the precision collapses.
  • iFrame is more indicated to manage internal states of your shader: initialization time, actions to delay to next frame or more, periodic switching between 2 states, etc.

Attention: in the icon, iTime starts at 10 seconds, while iFrame is really 0. This is one more reason for not using iTime to trigger initializations. It’s also a reason to take care about what your shader looks like at 10″ time, since it will be the look of your icon. (Always check the look of your icon in the “latest shaders” view).
More details about Shadertoy special timing uniforms here.

Video sync, video time

Avoid binding the same video multiple time in different buffers: it’s very easy to get each version out of sync. → store it in the first buffer, and access only this version.

Also, you can’t be sure when the video will be loaded and start so it’s very unprecise to rely on iTime for timed video effects.  If you want your effects to be well synchronized, you can use the special Shadertoy uniform iChannelTime[0].

Loading texture at initialization

In buffers, we can proceed to some initialization doing  if ( iFrame == 0 )
Alas it won’t work if you need a texture (e.g. the noise texture), since html is asynchronous and textures need a few frames to be loaded. To handle this some people do instead if ( iFrame < 30 ), but when the network’s Gods are unhappy it can take longer, and you don’t want your shader to rigidly wait at start.
The solution is to store and compare iChannelResolution[] through frames, since this ShaderToy uniform is set to texture resolution only when loaded. Example here. Note that it also allows dynamic detection of texture binding by the user.

Detecting resolution change / waiting for fullscreen

At resolution change you might want to redraw the background or reorganize data. A typical use case is wanting to let the user go fullscreen before doing the meaningful initialization. Pausing the shader 1sec at start is a quite rigid, requiring a keypress can be not transparent enough. The solution again is to store and compare iResolution through frames, Example here.
Note that detecting going to fullscreen allows you to zoom and refit content if you wish, but there is currently no possibility to do the same on size shrink, since the buffer is already trimmed. 😦

MIPmap and dFdx/dFdy/fwidth artifacts at discontinuities

Hardware derivatives are useful for antialiasing or drawing constant-thickness curves, They are also used by the hardware to compute the LOD to apply to MIPmapped textures. But if the mapping value has discontinuity at some places, the look will be ugly there, showing segments or curvy artifacts.

A classical  situation bringing discontinuity is fract(coords) or mod . ( NB: I hope you know that textures can directly repeat without using fract, by just setting the flag 🙂 .)
A solution consists in manually computing the LOD and using either textureLod() (for simple LOD level tunning) or textureGrad() (and providing the full pixel footprint). Base API (but would reproduce the artifact if derivatives used as is):
textureLod( iChannel0,  log2(length(fwidth(U*iResolution.xy))) );
textureGrad( iChannel0, U, dFdx(U), dFdy(U) );

As for what we did for antialiasing we can use hardware derivative in the task and just skip the fract or mod, since they don’t change the gradient (but by adding discontinuities). When the scaling is simple the LOD level is easily known directly, without computing derivatives. Or in some case you might prefer computing the derivatives analytically.

Another classical situation is caused by atan(y,x) (e.g. polar coordinates), at jump 2π→ 0 or -π→π . But since the jump amount is known, and easy to detect (big value while usual gradients are small) it is easy to trim it:
da = fwidth(a); if ( da > 3.) da = abs(da-2π);  cf example.

Something more puzzling: sometime the bug doesn’t show until you resize the window or go fullscreen (or somebody see it and not you). Cf example. The reason is that hardware derivatives use an approximate tricks: pixels are organized in tiles of 2 x 2 neighborhoods, and the derivatives are evaluated only once per tile as right – left for dFdx and top – bottom for dFdy. So only discontinuities occurring within a tile are detected. If the fract(coords) 1→0 jump or the polar -π→π jump happens between 2 tiles (typically for centered polar coordinates, or for texture tiled 2 x 2, with window height multiple of 4) then the artifact won’t occur. So a hacky trick could be to center your coordinates so as to control this, e.g. offsetting screen coords by 2.*floor(iResolution.xy/4.)   🙂 .

A more nasty case is the use of MIPmap or derivatives in loops, if or switch blocks. As we have seen here, SIMD parallel computation yields such situation to correspond to divergence: the neighbor pixels might be not executing the same code (i.e. being idle while the current pixel is following his combinatory branch). Since hardware derivatives are based on comparing the values of neighbor pixels, they no longer work (and can return 0, NaN or stupid things depending on the system). A call to texture() in a loop might seems harmless, but if the loop has varying length then we are in such a case: if the current pixel is looping further than one of its neighbor, MIPmap and hardware derivatives are wrong for it.
Here again the solution is to compute the LOD manually, and if using derivatives, to evaluate them before the if or switch. Or to save the useful value in the loop and compute derivative or access MIPmap after the loop.

WIP – to be continued


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s