Maths and ShaderX8

ShaderX7 is out

ShaderX7 has now hit the press in the USA, although (at the time of writing) it looks like the UK will have to wait a bit longer. I was lucky enough to nab a copy directly from the publisher while at GDC. The first thing you notice is that it's really fat this time - definitely one of the largest books on my shelf. It's not short on material, and Amazon are currently selling it for a bargain $37.79 - rather short of the $59.99 RRP. Go grab yourself a copy.

Editing the shadows section was my first adventure into the world of publishing, and I'm glad I put in the effort. I was genuinely surprised by the amount of time involved in editing just four articles, so my hat goes off to everyone - authors and editors alike - who contributes to these industry-led books. It's not done for money; it's very definitely a labour of love, and I'm happy that the end result made the work worthwhile.

ShaderX8 - now with added maths

Which leads me on to ShaderX8! This time around there will be a new section on Mathematics, which Wolfgang is kindly allowing me to get my editing paws on. The idea here is that the complexity and quantity of maths involved in writing shaders have greatly increased in recent times. Early shader models had pretty limited capabilities, and most uses of them probably capped out at requiring a knowledge of linear algebra - say, vectors, homogeneous matrices, and so on. But we are now fast approaching being able to run typical x86 code on a GPU, and the mathematical models being run are getting correspondingly more complex. There's also more than one way of boiling your mathematical eggs, and performance matters. Part of the process of writing shaders is learning, developing and optimising mathematical models - hence the dedicated new section.

The book is still at the proposal stage, so what this section really needs now is article proposals. Please take a look at the schedule on the ShaderX website and email your proposals to Wolfgang before 17th May 09. If you have any questions about the maths section, feel free to email me as well. The writing guidelines can be found here.

Complexity of maths in shaders

Here's my 2p on the complexity of maths in modern shaders.

I expect almost all graphics programmers will now at least have heard of spherical harmonics, as they are an extremely efficient way of capturing lighting irradiance. Given their importance, there have been several excellent tutorials written to help the games industry understand how they work. But my impression of the industry is that many people's understanding of them is not yet at a "comfortable" level. Using spherical harmonics in graphics does not require a comprehensive knowledge of the maths that underpins them, but it's definitely a step up from what was required of a programmer in a younger industry.
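For the unfamiliar, the core idea is roughly two lines: project the incoming light onto a set of basis functions over the sphere and keep the first few coefficients (an order-2 truncation, i.e. nine coefficients, being the usual choice for irradiance). In rough LaTeX:

L_{lm} = \int_{S^2} L(\omega)\, Y_{lm}(\omega)\, \mathrm{d}\omega
\qquad
L(\omega) \approx \sum_{l=0}^{2} \sum_{m=-l}^{l} L_{lm}\, Y_{lm}(\omega)

The formulas are easy enough to copy down; it's the intuition about why that works - and when it breaks down - that takes the time.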

To add some context, it's worth noting that the use of spherical harmonics in lighting is now several years old - by industry standards they are by no means a new thing. I'd hazard this is evidence of their mathematical complexity reaching beyond what the industry is truly comfortable with handling.

Spherical harmonics are definitely not the most complex mathematical model in graphics. To use a more recent example: in the ShaderX7 shadows section there's a really excellent paper by Mark Colbert and Jaroslav Křivánek, "Real-time dynamic shadows for image-based lighting". I won't go into the implementation details here, but suffice it to say it's non-trivial. It requires familiarity with some fairly advanced linear algebra, a very sound knowledge of sampling, and solving some fiddly least-squares problems with interesting regularisation to prevent overfitting. It's not dissimilar to the level of complexity of your typical SIGGRAPH paper.

[Image: relighting results from Colbert and Křivánek's paper]

It's a pretty steep learning curve these days.

(Get your proposals in!)

On parsing, regex, Haskell and some other cool things

I've recently become slightly obsessed with finding ways (new or otherwise) to make parsing text really, really simple. I'm concerned there are wide gaps in the range of current parsing tools, all of which are filled by pain.

It's also a nice distraction from the C++ language proposal I was working on, which has stalled while I dig through more research. It turns out someone has already done something very similar to what I was thinking! So there will be a bit of a delay while I bottom that out properly.

Parsed the pain
Parsing with regular expressions covers a decent amount of simple low-hanging fruit. I happen to be a big fan of regex but it definitely doesn't handle parsing 'structured documents' very well. Here 'structure' means some non-trivial pattern: perhaps matching braces, nested data or maybe a recursive structure.

This is by design. Regular expressions are, or were originally, a way of describing an expression in a 'regular grammar'. A regular grammar's expressive power is actually very limited, and text doesn't need to be that complex before it exceeds what a regular expression can describe. This regex email address parser is just about readable, but kind of pushing the limits:

\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b

However, XML, HTML, pretty much all source code, every file format I've ever written - basically all the documents I care about - are not regular: anything with nesting, such as matching pairs of tags or braces, is already beyond a regular grammar.

The pain in context
The next step up from a regular grammar in the Chomsky hierarchy is a 'context-free' grammar. Parsing a context-free grammar frequently involves writing a lexer and parser combination to do the work. The lexer breaks the character stream into 'tokens', and the parser translates the token stream into a more meaningful tree structure. Parsers in particular tend to be complex and lengthy pieces of code, so you'd more often than not find yourself using a parser generator such as yacc, bison or ANTLR to generate the code for you from a separate description of the grammar. And this is all before you actually get to doing something useful with the tree the parser outputs.
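To make the split concrete - and jumping ahead to a language I'll gush about further down - here's a toy sketch of the lexing half for a tiny expression language. The names (Token, lexer) are mine and purely illustrative; the parser's job would then be to turn the token list into a tree:

-- Toy tokens for a tiny arithmetic language.
data Token = TNum Int | TPlus | TLParen | TRParen deriving Show

-- The lexer: character stream in, token stream out.
lexer :: String -> [Token]
lexer []        = []
lexer (' ':cs)  = lexer cs                    -- skip whitespace
lexer ('+':cs)  = TPlus   : lexer cs
lexer ('(':cs)  = TLParen : lexer cs
lexer (')':cs)  = TRParen : lexer cs
lexer cs@(c:_)
  | c `elem` ['0'..'9'] = let (ds, rest) = span (`elem` ['0'..'9']) cs
                          in TNum (read ds) : lexer rest
  | otherwise           = error ("unexpected character: " ++ [c])

-- lexer "(1 + 2) + 3"  ==>  [TLParen,TNum 1,TPlus,TNum 2,TRParen,TPlus,TNum 3]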

However you cut it, this is a significant step up in pain from a regular expression. Your problem has suddenly jumped from a condensed one-liner to full-on generated code. If the task you have in mind is just a bit more complex than a regex can handle, your pain increases disproportionately with that small increase in complexity.

Sadly, even context-free grammars often don't cut it in practice. There's a fair gap between the expressiveness of a context-free grammar and the real world of nasty, ambiguous, context-sensitive languages. I'm thinking mainly of the context-sensitivity of C++, where the task of writing a parser is full of painful implementation details. Not to mention that there is a further major leap to get anywhere close to parsing natural languages, such as English.

Pain relief
There is no shortage of parsing tasks in the "slightly more complex than a regex" category. Context-free grammars actually contain several sub-categories that are more restrictive but simpler to parse, such as LL and LR. So it's not really much of a surprise to discover that a typical 'regex' isn't actually a 'regular grammar expression' any more.

Perl's implementation of regex supports recursion, back-references and finite look-ahead, which allow it to handle some - maybe all - context-free documents (the classic recursive pattern for balanced parentheses, \((?:[^()]|(?R))*\), gives a flavour of it). I recently re-read the Perl regex tutorial to remind myself of it, and had some fun scraping the web for Tesco voucher codes. I think the expansion beyond supporting just regular grammars is very helpful, but I don't think it really bridges the gap to context-free parsing in a particularly manageable and re-usable way.

So, if Perl's extended regex doesn't cut it, what are the alternatives? Well, here's a couple of thoughts.

Structured grep
I thought this was quite a nice find: sgrep ("structured grep"). It's similar to, but separate from, the familiar grep. There are binaries for most platforms online, and it can also be found in Cygwin, Ubuntu and probably most other Linux distros. At least in theory, it extends regular-grammar pattern matching to support structure through the use of nested matching pairs and boolean operators.

Here's how you might scrape an HTML document for the content of all 'bold' tags:

$ cat my_webpage.html | sgrep '"<b>" .. "</b>"'

The .. infix operator matches a text region delimited by the specified start and end strings. It also supports boolean operators, like this:

$ cat my_webpage.html | sgrep '"<b>" .. ("</b>" or "</B>")'

If you dig through the manual you'll come across macros and other cool operators such as 'containing', 'join', 'outer' and so on. It all seems easy to pick up, and macros let you compose more complex expressions.

I would go on about it for longer, but sadly its current implementation has a fairly major flaw - it has no support for regex! This feels like a simultaneous step forwards and backwards. I'm not actually sure whether it's a fundamental flaw in the approach they've taken or whether the functionality is simply missing from the implementation. It's a bit of a shame because I think it looks really promising, and if you are interested I'd recommend taking a moment to read a short article on their approach. I found it an interesting read, and I have since hit upon a handful of mini-parsing problems that sgrep proved very helpful with.

Parser combinators
This was a recent discovery, and it now surprises me I hadn't come across it before. I think I didn't know of it because it's rather tightly bound to the realm of 'functional' languages, which isn't something I've spent that much time looking at until now. That's all changing though, as I think I'm becoming a convert.

It occurred to me that a parser might be easier to write in a functional language: parsing a grammar is kind of like doing algebra, and algebraic manipulation is the kind of thing functional languages are particularly good at. Googling these ideas turned up both parser combinators, an interesting parsing technique, and Haskell, a pure functional language in which a 'parser' feels almost like part of the language itself.

Parser combinators are a simple concept: you write a set of micro-parsers (my name for them) that do very basic parsing duties. Each is just a single function that, given a text string, returns a list of possible interpretations. Each interpretation is a pair of the interpreted object and the remaining text string. In Haskell, you'd write the type of all parsers (a bit like a template in C++) like this:

type Parser a = String -> [(a,String)]

For an unambiguous input string the parser will produce a list with just one item, an ambiguous input will produce a list with more than one item, and an invalid input will produce an empty list. An example micro-parser might just match a particular keyword at the start of the string.
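For instance, here's a sketch of my own (not from any library) of a keyword-matching micro-parser built on the Parser type above:

-- Match a fixed keyword at the front of the input; the keyword itself is the result.
-- One interpretation on success, none on failure.
keyword :: String -> Parser String
keyword k cs
  | take (length k) cs == k = [(k, drop (length k) cs)]
  | otherwise               = []

-- keyword "let" "let x = 1"  ==>  [("let"," x = 1")]
-- keyword "let" "in x"       ==>  []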

Since all your parsers are of the same type, it's now simple to compose them together into more complex parsers. This is modular programming at its most explicit.

It's quite surprising how tiny and general the code to compose these parsers can be - you can reduce them to one-liners. Here are a few examples, again in Haskell:


-- Here, m and n are always Parser types.
-- p is a predicate, and b is a general function.

-- parse-and-then-parse
(m # n) cs = [((a,b),cs'') | (a,cs') <- m cs, (b,cs'') <- n cs']

-- parse-or-parse
(m ! n) cs = (m cs) ++ (n cs)

-- parse-if-result
(m ? p) cs = [(a,cs') | (a,cs') <- m cs, p a]

-- parse-and-transform-result
(m >-> b) cs = [(b a, cs') | (a,cs') <- m cs]

-- parse-and-ignore-left-result
(m -# n) cs = [(a,cs'') | (_,cs') <- m cs, (a,cs'') <- n cs']

-- parse-and-ignore-right-result
(m #- n) cs = [(a,cs'') | (a,cs') <- m cs, (_,cs'') <- n cs']

I've taken these examples from "Parsing with Haskell", which is an excellent short paper and well worth a read.
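Just to show the combinators in action, here's a toy composition of my own (not from the paper), reusing the keyword micro-parser from earlier:

-- A number parser: take the leading digits and read them as an Int.
number :: Parser Int
number cs = [ (read ds, rest) | let (ds, rest) = span (`elem` ['0'..'9']) cs, not (null ds) ]

-- Two numbers separated by a '+', added together as we parse.
plusExpr :: Parser Int
plusExpr = ((number #- keyword "+") # number) >-> uncurry (+)

-- plusExpr "1+2rest"  ==>  [(3,"rest")]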

Learning Haskell has been something of a revelation. I had glanced at Objective CAML and Lisp before, but I'm actually really quite shocked at how cool Haskell is and that it took me so long to find it.

C++0x, "just about everywhere"

When I started this blog I promised myself not to post rants about C++. In something as large and complex as the C++ language there is always plenty of material to rant about, and I figured I'd quickly bore myself and everyone else. To be honest, I was planning on forgetting about C++ altogether, accepting it as a fact of life every programmer must live with and learn to love. But here I am, writing about it quite fondly.

"C++0x" is the working title for the now feature-complete upcoming ISO standard of C++, hopefully to become "C++09" this year. In any coders diary, this is a significant occasion and shouldn't go by without some pause for thought.

I'm not going to comment on the contents of the upcoming standard for a while, but if you work with C++ I highly recommend you check it out, so I'll provide some links at the end.

I'm currently more interested in the development trend of the language itself. The process is fascinating. Plus, I am cooking up a proposal for a new language project based on C++ that could be an interesting angle - but that's for another day.

Significant standards

The first standard for C++ came in 1998, almost 20 years after its conception in 1979. By 1998, C++ itself had long since 'dug in', and by many accounts the standardisation effort was a big, painful undertaking. Unsurprisingly it wasn't perfect, and "C++98", as it became known informally, had a spring clean in 2003 - but without any significant changes an end-user would notice.

C++0x represents the first major revision of the language since it was first standardised as C++98, and looks to be another dramatic undertaking. It's a pretty bold move on many levels. It has many bright innovations and the current draft appears to be very well thought through.

I find C++0x even more remarkable when considering how it has been developed. At face value, the end product could be considered fairly impenetrable and bears more similarity to a legal contract than a working design. The C++ standardisation committee is a huge international democratic effort with nearly 200 members ranging from "corporations to fanatics". It has no owner. It does no marketing. But it appears to know that the decisions it makes through its huge, painful, diligent and slow process will affect the lives of literally millions of developers. I think it's a spectacular organisational achievement and its coordinators probably possess saint-like levels of patience :).

C++ "everywhere"

Now, in theory, you could write any application in assembler - but it doesn't scale, so you'd struggle to write large, complex software with it. And you couldn't write every application in C# or Java. As managed, garbage-collected languages, they are just not always going to fit your hardware or your application's performance demands. C++ is in a sweet spot.

Even so, I have a love/hate relationship with C++. On one hand I've spent many an hour teasing out holes and swearing at my compiler, but on the other I have no doubt my job would be much more difficult without it. Despite its flaws it is a language that is unique in the range of applications it can address. And because of this it's a language that's "just about everywhere".

I'm quoting Bjarne Stroustrup, from a talk he gave at the University of Waterloo last year. As the language's father and original author he may have a bias, but he has some evidence to support his statement. It's arguable that his list of applications could well be cherry-picked, but I'd hazard a guess that he's more in touch with the users of C++ than most, and frankly I agree with him.

Endless growth

"Give a man enough rope and he will hang himself". There's even a book on C/C++ with the same title. As I published this post I discovered Bjarne is quoted as saying, "C++ makes it harder to shoot yourself in the foot; but when you do, it takes off the whole leg". His argument is really that the idiom is as true of C++ as it is of any powerful tool. I think this is a fair point but I think it is still a valid concern to have of popular language, particularly one that's growing.

So, given the language is about to expand even further, I spent a surprisingly enjoyable couple of hours one Saturday morning watching Bjarne talk through some new features in C++0x and comment on its development.

C++0x appears to me to be more than an extension in the sense of just providing some new features - although, in the literal sense, that is exactly what it does. Inspected more closely, some of the features actually help clean up some of the extraneous cruft. To me, features such as initialiser lists and concepts appear to be part of a crafty expand-but-consolidate manoeuvre. If you're worried about having too much rope, this is about the best route the committee could have taken, so I am suitably impressed.

Subtraction

The committee have one hand tied. They simply can't subtract. It's practically impossible for them to remove features without risking breaking people's existing code, and the fallout from that could be enormous.

To be fair, the standardisation committee pursue some other options as well. They look at revising the C++ standard libraries (the STL). Libraries are a far simpler way to extend a language, as they are intrinsically "turn on and off-able" in a way language extensions typically are not. Opinions on the STL tend to vary, but I don't think many people would deny that it could be considerably better.

Libraries are not immune from the subtraction problem, but they are definitely an area where the committee could do a lot of further good.

C++ is dead. Long live Java/C#/Python/etc

Although it's not specific to C++, the 'subtraction problem' is a rather fundamental problem with all language development - and perhaps much "live" software. Like many humans, software has a tendency to grow up too quickly, becoming fat and fidgety at the peak of its powers, leading to inevitable replacement by younger, leaner contenders.

Given the inevitable growth at each revision - however well constructed the revision - how far will the life of C++ extend? In the long run, is C++ condemned to expand its way to death? This is certainly the view many people like to hold.

The death of C++ was being widely touted while I was at university in the 90s, rather conspicuously around the same time the first ISO C++ standard was being agreed. C++ was an old messy language that Java would replace, we were told. This belief extended into the course structure where no C or C++ option was offered. As a consequence I was never taught the language I've used daily ever since. And to be honest, I would not be surprised if some universities continue to do the same thing now.

Despite it all, C++ lives on and does so with relative ease. C++0x should give the language a further lease of life. However, C++0x will not - and cannot - resolve all the problems with the language. Assuming we believe perfection is possible, getting there requires subtraction, and subtraction requires an alternative to the standardisation committee's approach. I have a proposal brewing on this topic, which I'll return to in a follow-up post.

Some references

My eyes! My eyes!

This is the eye-candy equivalent of munching too many wham bars. It's so simple I can't help but love it, cheesy as it is.

Before I explain what's going on - although you can probably guess - have a go at staring at the applet below. If there's a big "P" then it's still loading. If there's no applet... well, email me your browser, etc, etc and I'll try and fix it.

Instructions:

  1. Get really close to the screen
  2. Stare at the flashing square in the middle of the crazy-colour image. Don't blink or move your eyes!
  3. Count a good few flashes. 8-10 flashes should give you a good burn-in.
  4. Click the mouse button
  5. :O !
  6. Rinse and repeat

Processing-based Java applet:


r0x0r!

The effect can last for a surprising amount of time, providing you keep your eyes completely fixed on the square. The second you move them, something in your eyes and brain puts the internal windscreen wipers on and it's lost. At least for me, holding a fixed view seems more important than the length of time you stare. You can get a reasonable after-image from a pretty quick glance, providing you don't look around the screen.

Most people are familiar with "persistence of vision", as it's known, through looking at light bulbs for too long, their telly's refresh rate, helicopter blades turning into a blur, and cool gadgets like this. It's likely that the colour burn-in aspect of POV doesn't have quite as many real applications as the "things moving too quickly to see" aspect, so I expect the applet above will come as a happy surprise to at least a few people.

Question is: is it useful? Well, I reckon it might be in the right setting, but for someone working in graphics/lighting it's at least something to be aware of. I was shown this effect by Jeremy Vickery, an ex-Pixar lighting artist, while he was on a visit to Geomerics, so I'd expect the film industry is already well aware of it. He demoed it to us in a regular PowerPoint slide as one of several examples of how crazy and unpredictable your eyes/brain can be. Your eyes just lie, frankly. If you are interested, you can get further details from his DVD and likely find his GDC "Practical Light and Color" talk slides on the net.

The question is: can this effect be put to good use in games or films? I'm open-minded to the possibility, but still thinking about it. There are definitely other tricks of the eye that are relevant to film makers - in films you have robust control over what comes next. Games? Maybe. Still thinking.

A quiet renaissance of graph theory?

I freely admit to being immensely captivated by graphs. I'm certain to continue writing/thinking about them, so this post probably qualifies as a bit of a warm up.

A researcher in the field would be able to comment better than I can, but I have the impression that Graph Theory is undergoing something of a slow but steady renaissance at the moment. Your typical book on graph theory tends to be a purely theoretical text*, with remarkably few pictures given the visual nature of the subject. (I like pictures.) Even the 'applied' ones seem to cover a fairly narrow range of applications. I think it's fair to say that many books prefer to concentrate on the 'pureness' of the mathematics rather than discuss applications. My belief is that graphs are an extremely general and flexible concept, and this distance between theory and applications bothers me.

Enter the likes of Spectral Graph Theory (rather quietly and formally). The (only?) major text on this is by Fan Chung. You can find the first four chapters of her book available to download from her website, although it's not the easiest entry point to the subject.

In a nutshell, spectral graph theory is the extension of traditional graph theory with the study of a graph's 'spectrum' - its eigenvectors and eigenvalues. Chung begins her (1997) book by simply stating that spectral graph theory has "a long history". I notice that she's rather more upbeat in her latest book, "Complex Graphs and Networks":

In many ways, working on graph theory problems over the years has always seemed like fun and games. Recently, through examples of large sparse graphs in realistic networks, research in graph theory has been forging ahead into an exciting new dimension. Graph theory has emerged as a primary tool for detecting numerous hidden structures in various information networks, including Internet graphs, social networks, biological networks, or more generally, any graph representing relations in massive data sets.

I believe this is very clearly true, but don't just take my word for it. There's at least one truly killer example to offer as evidence for it: PageRank is the basis behind Google's search algorithm and is a great example of graph eigen analysis being put to good effect.

If you clicked and read the Wikipedia PageRank article above, you might have noticed the complete absence of any reference to graph theory, spectral or otherwise. Instead there's a description of probability and random surfers, leading to Markov chains and then eigengaps. But this is spectral graph theory, and I think it's actually easier to understand the process in those terms, with random walks and Markov chains as close family relations of this modern graph theory.

The ten-thousand foot overview goes something like this.

  • The web is a graph.
  • Any graph can also be thought of as a square matrix. Each row/column is a vertex and the matrix values describe the weights of the edges connecting the vertices - this is called the adjacency matrix.
  • Matrices can be decomposed into pairs of "eigenvectors" and "eigenvalues", in the same way sound can be decomposed into differently pitched waves, or light into different colours (hence the term 'spectral').
  • Examining the spectrum of the matrix/graph/web tells you things about its structure, in the same way you can discover things about an everyday object by listening to the sound it makes when you strike it.

I'll expand on it more another time, but that's the main gist of it. I'm really just trying to convey the generality of the subject as I see it. My little description links together graphs, linear algebra and the frequency domain in a major real-world example (the web). These are all huge subjects, and I find their proximity very interesting - even exciting. The subject just seems to have that "obvious" but not yet "evident" quality about it.
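To make the matrix view a touch more concrete, here's a toy sketch in Haskell - emphatically not Google's implementation, just plain power iteration with the usual 0.85 damping factor on a made-up three-page web:

type Vec = [Double]
type Mat = [[Double]]   -- row i holds the weights of links *into* page i

-- One multiply: v' = M v
step :: Mat -> Vec -> Vec
step m v = [ sum (zipWith (*) row v) | row <- m ]

-- The usual damping trick, mixing in a uniform "random surfer" term.
damp :: Double -> Int -> Vec -> Vec
damp d n = map (\x -> (1 - d) / fromIntegral n + d * x)

-- Power iteration: repeatedly apply the damped matrix to a uniform start vector.
-- The result converges on the dominant eigenvector - the PageRank scores.
pagerank :: Mat -> Vec
pagerank m = iterate (damp 0.85 n . step m) start !! 50
  where n     = length m
        start = replicate n (1 / fromIntegral n)

-- A three-page web: a -> b, b -> c, c -> a and c -> b.
-- Column j is page j's out-links, each given equal weight.
tinyWeb :: Mat
tinyWeb = [ [0, 0, 0.5]     -- links into a (half of c's vote)
          , [1, 0, 0.5]     -- links into b (all of a's vote, half of c's)
          , [0, 1, 0  ] ]   -- links into c (all of b's vote)

main :: IO ()
main = print (pagerank tinyWeb)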

If you are curious, I think PageRank is a great algorithm to get you into the subject. Personally, I found the most accessible and practical starting point to be the various course notes, slides and PDFs on Daniel Spielman's homepage - particularly the course notes. There are huge aspects of the subject in there I haven't even mentioned yet.

* That's not to say traditional graph theory books aren't worth a read anyway! FYI, you can get a PDF of "Graph Theory" by Reinhard Diestel free online - although I haven't read it! Gawd bless the interweb and all who sail in her.