OpenCL - the compute intermediate language

We are now fast approaching the annual Game Developers Conference, and around this time last year my favourite topic of conversation was the need for a "virtual ISA" that would cover current and future processor architectures, particularly GPUs. The term "virtual ISA" implied an assembly-like language and toolset that could be used to generate high-performance (most likely data-parallel) code without being tied to a specific architecture. This is much the same as the "virtual ISA" that LLVM provides for a wide variety of (primarily) CPU architectures.

Even a year on, it remains such a simple idea that, once stated, it's a wonder it still doesn't exist. The main change in the conversation is that it has become clearer that this is actually the role OpenCL should attempt to fill.

Why do we need a virtual ISA, and why not another language?

Right now, the reality is that no one knows the best way to efficiently program the emerging heterogeneous architectures. In fact, I don't think we even understand how to program "normal" GPUs yet. C++ will inevitably be crowbarred into some form of working shape, but I'd rather not settle for this as a long term solution.

A compute software stack sounds like a far more attractive option, and to build a well-functioning stack you need a solid foundation. As an illustration of what happens without one, witness the rather scary rise in the number of compiler-like software projects that generate CUDA C as their output. Being a compiler target is not something CUDA C was designed for, and as OpenCL is essentially a clone of CUDA C, you may well think this is a trend future OpenCL revisions should pay attention to.
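To see why this is scary, consider what these projects actually do. Here's a toy sketch in Haskell (every name here is my own invention, purely for illustration): a "compiler" whose only backend is CUDA C source text. It works, but the generator is just pasting strings together, with no typed intermediate representation or verifier between it and the vendor's compiler - exactly the gap a virtual ISA would fill.

    -- A toy code generator targeting CUDA C, sketching the pattern
    -- used by the compiler-like projects mentioned above. All names
    -- are hypothetical.
    module Main where

    -- Emit a CUDA C vector-addition kernel with the given name.
    vecAddKernel :: String -> String
    vecAddKernel name = unlines
      [ "__global__ void " ++ name ++ "(const float* a, const float* b,"
      , "                               float* out, int n) {"
      , "  int i = blockIdx.x * blockDim.x + threadIdx.x;"
      , "  if (i < n) out[i] = a[i] + b[i];"
      , "}"
      ]

    main :: IO ()
    main = putStr (vecAddKernel "vec_add")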

In fact, there are a few other interesting trends to note.

A change in the wind?

A significant strength of the CUDA architecture is PTX, the NVIDIA-specific assembly-like language that CUDA C compiles to. OpenCL does not yet have a parallel of this. Several independent little birdies tell me the committee felt that attempting to define this as part of the initial OpenCL standard could have derailed the process before OpenCL had a chance to find its feet. However, more birdies tell me that this view has now mostly changed, and that defining something akin to a virtual ISA could actually be back on the cards.

PTX as an unofficial standard?

As nice as PTX is, it is not a GPU version of LLVM. Current support is limited to a single PDF reference and an assembler tool. The very impressive GPU ray tracer OptiX uses PTX as its target, but the authors do have the significant advantage of residing inside NVIDIA HQ. The compiler toolkit that makes LLVM so attractive is missing. Not to mention that PTX only targets NVIDIA hardware - although this itself is an interesting area for change, and one where I feel NVIDIA could gain a lot of momentum. One project I will be keeping a close eye on is GPU-Ocelot, which appears to be going a fair way towards addressing the shortcomings of PTX. While PTX may not be an "official" standard, much like Adobe Flash, it could establish itself as one anyway.

LLVM as a data parallel ISA?

Given the close parallels, we should seriously question whether LLVM can support GPU-like architectures. As a reasonably mature project, LLVM already has a lot in its favour and can point to some successes in this area: notably its use in the Mac OS X OpenGL implementation, the Gallium 3D graphics driver project, and an experimental PTX backend. I spoke with a few people who I know have seriously considered this option and found the jury still out. I haven't used LLVM in sufficient anger to come to a decision myself.

One obvious obstacle is what LLVM itself would compile to. I would expect a serious cross-vendor virtual ISA to have direct support from all hardware manufacturers, and a level of indirection through LLVM or any other third-party ISA is unlikely to gain sufficient ground unless it's explicitly supported by all vendors through a standard such as OpenCL.

Demand and evolution

With relatively little movement in the industry for a year or more, I do occasionally consider whether I have misread the demand for a virtual ISA. But not for long! Apart from the clear advantages for code generation and domain-specific language implementations (a topic I'm very interested in), a virtual ISA should become the target bytecode for GPU and HPC languages and APIs such as OpenGL, OpenCL, DirectX, DirectCompute, CUDA and their successors. It's a long and growing list.

While uncertainty surrounds the best way to program your GPU, we can expect to see unhealthy but unavoidable amounts of biodiversity. But if we want to prevent our industries from painfully diverging, we should at least agree on a foundational virtual ISA to unify behind and build on.


Proposal for a graphics pipeline DSL

Geomerics are making a proposal to the UK Technology Strategy Board (TSB) for "disruptive technology" funding to germinate an idea we've been discussing for a long while now. The proposal is a very public one via a YouTube video submission, and the "5th judge" will be public feedback! So I'm looking for as much input as possible to improve this idea. Naturally, this is a funding proposal rather than a tech one, so apologies if it's short on tech details.

UPDATE: The video has been submitted. Thanks to all that contributed! Please check it out and leave feedback on the TSB site - you are the 5th judge so your opinion matters!

Here is the final text for the two-minute video.

Efficient Real-Time Graphics Through A Domain Specific Language

Parallel computing is the next major challenge for software development as chip manufacturing has hit physical limits that require us to go parallel to do more. However, writing parallel software is a hard problem, and requires a new approach.

Computer graphics leads parallel hardware development and is the best known parallel application. But the same hardware and software that drives the graphics in your modern PC is also used by the scientific and medical imaging communities. Advances in graphics frequently have a wider impact in these fields. But despite its apparent suitability, graphics still suffers from a serious programmability gap. There is not yet a good way of driving the hardware effectively. It is widely accepted that without further innovation, faster hardware will not deliver comparatively improved visual quality.

So our goal is to bridge the programmability gap for graphics. By doing so we will tap the unexploited potential of the new power-efficient parallel hardware, providing an important product for the games and graphics industry, while making progress into the wider parallel programming problem.

With full TSB funding, we would draw on our expertise to prototype an alternative graphics pipeline, taking the novel approach of structuring it as a domain specific language (DSL). We will show that new graphics algorithms can be developed efficiently in this language, and provide solutions to a range of outstanding problems affecting game and graphics developers that are hard or impossible to address with existing approaches.

These challenges include:
- higher fidelity images (anti-aliasing)
- cinematic lens effects (bokeh, depth of field)
- complex illumination and shadowing
- semi-transparent materials, such as glass or water
- volumetric rendering, such as fog and smoke

We intend to show a step-change in visual quality, and demonstrate a new parallel programming model with wider applicability. The TSB funding would allow us to build the foundation for a middleware product Geomerics would commercialise in the games and graphics industry. With partial funding we will scale the prototype to deliver a vertical slice focussing on a single rendering challenge.

With its worldwide reputation for game graphics, its close ties with hardware manufacturers and its relationships with academia, Geomerics is ideally placed to carry out this development.

There is some good background material on this topic in this year's Beyond Programmable Shading course from SIGGRAPH 2010. Of particular relevance is Johan Andersson's "5 Major Challenges in Interactive Rendering" - another crowd-sourced proposal.

Input and feedback appreciated! If you would prefer not to post publicly, feel free to email me directly (sam.martin@geomerics.com).


Why isn't Haskell popular in Industry?

This question was recently posed (and closed!) on Stack Overflow. The top-voted reply is so good I thought I'd replicate it here! I'm afraid all I know of the original author is their "Orphi" posting name.

It's tongue-in-cheek and broadly true, although I'd quibble with the details of a few points. I think it's interesting that it doesn't contain the main reason blocking me from using it on a daily basis. More on that in a moment.

First, here's Orphi's response:

  1. Nobody's ever heard of it. No one's going to use something they don't know exists.
  2. It's unpopular. People assume that the most popular language is the best language, because if it wasn't good, it wouldn't be popular. This is actually untrue; as you can see, the most popular language is the most popular language. To put it another way, Haskell is unpopular because it's unpopular. This is what Haskell programmers refer to as "recursion", and despite what imperative programmers tell you, it's extremely common in The Real World.
  3. It's different. People are always afraid of what's different.
  4. It's difficult. People think that Haskell is difficult to understand or difficult to learn. This is almost certainly related to point #3. It's also related to the fact that the Haskell community is populated by people who casually remark "a monad is just a monoid in the category of endofunctors, what's the problem?" and expect normal human beings to comprehend this.
  5. It's risky. Most companies don't want to be the first to use something. Haskell isn't being used by many people, so not many people want to try it. (See this recursive unpopularity argument again?)
  6. Can't hire programmers. First, by #2, there aren't many programmers who already know Haskell. Second, most people believe #4, which means you can't train programmers to use Haskell. (At least, it would if it were actually true.) A language that you can't hire programmers for is a very, very risky proposition indeed. (Which leads us back to #5.)
  7. Libraries. This is probably the big one, so I'm going to spend some time on it.
    • Quality. We have the quantity. We do not yet have the quality. Most of Hackage is one-man hobby projects with little to no documentation. Some of it is incomplete, some of it has long since bit-rotted, some of it malfunctions if used in certain ways.
    • The Outside World. If you want a binary heap tree, Hackage probably provides a dozen implementations. If you want to connect to an RPC server and fire off a few procedure calls... good luck with that. Same deal for talking to databases, accessing OS resources, manipulating binary file formats... You'll basically have to write everything yourself from scratch. This is a big deal for commercial work.
    • Multiple incompatible libraries. You can, in fact, connect to a database in Haskell. Trouble is, at the last count there's about a dozen libraries for doing this, and it's bewildering trying to figure out which ones are actively supported and which ones are zombie projects that stopped working years ago. It's also not as simple as hooking up an ODBC connection; there are different backends for each library and each DB target. Yay. :-/
    • Windows. Almost all the important libraries (for cryptography, binary file formats, network protocols, data compression, talking to databases, etc.) are Haskell wrappers around C libraries. And these all fail to build on Windows. Given that Windows is the single biggest target platform on the market, this is a big deal.
  8. Unpredictable performance. This is way, way down at #8. Most people don't know enough about Haskell to even know this. Most people just assume that "Haskell is slow". This is demonstrably untrue. What is true is that it can be hard to predict the performance of a Haskell program. Subtle, apparently irrelevant changes can sometimes make big performance differences.
  9. Correctness. Most companies don't give a **** about correctness. They don't care about quality. They just want to shovel code out the door as fast as possible and earn wads of cash. If there are bugs, they'll charge the customer money to fix them. Getting code right is of no interest; getting code fast is what counts. Haskell is a language that rewards those who sit back and deeply analyse the problem, and then produce a beautiful solution. Most companies don't care for this approach; let's just hack something together as fast as possible, and worry about fixing it later (i.e., never).

There are a few places where correctness matters. These are either safety-critical systems, or financial systems, generally. I gather Haskell tends to be quite popular here.

One final pair of data points:

  • I can still remember not so long ago hearing people cry "C++ is a toy language for n00bs! You should use a proper programming language like C." Now take a look around you and see how many large-scale C++ programs there are.
  • People have been claiming that Lisp is "the next big thing" for, what, 40 years now? Lisp is older than almost every programming language in mainstream use. Now how many large-scale Lisp programs are there?

Which fate awaits Haskell, I don't know. I rather suspect all the good ideas of Haskell will be stolen by the likes of C# and hybrids like F# or OCaml, and people still won't ever use Haskell itself. It's just too different.

But anyway, as to why industry doesn't use Haskell, see the points above. It's too rare, too unpopular, too weird, and has incomplete libraries. That's basically it, in a nutshell.

Embedding Haskell?

The missing item for me is how hard it is to embed. For my purposes, I need to embed Haskell within an existing C++ framework, not dissimilar to how you might embed Lua. I need to do so in a compiler-agnostic manner, where I retain control over how the embedding occurs (i.e. compilation and linking) and have significant influence over the operation of the runtime. In short, Haskell must fit into my existing C++ environment, not the other way round.

An interesting aspect of an "Embedded Haskell" is that you could remove most (if not all) of Haskell's IO support if it made the job of embedding significantly easier. The ability to efficiently embed 'pure' Haskell would be a very interesting and useful tool in its own right [1]. It's a simple idea, and therefore doesn't take many words to state, but I'm keen not to understate how significant this could be.
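To make the embedding problem concrete, here's a minimal sketch of what exposing a pure Haskell function to a C++ host looks like with GHC's FFI today (the module and function names are mine). Note that even this tiny example drags in the full GHC runtime: the host must link against the RTS and call hs_init() before the first call, which is precisely the machinery an "Embedded Haskell" would need to put under the host's control.

    {-# LANGUAGE ForeignFunctionInterface #-}
    -- Embed.hs: a hypothetical example of exposing pure Haskell to a
    -- C/C++ host. GHC generates an Embed_stub.h header for the host
    -- to include; the host must still initialise the runtime with
    -- hs_init() before calling embed_blend.
    module Embed where

    import Foreign.C.Types (CDouble (..))

    -- A pure function: same arguments, same answer, no side effects.
    blend :: Double -> Double -> Double -> Double
    blend t a b = (1 - t) * a + t * b

    -- The C-callable wrapper around it.
    embed_blend :: CDouble -> CDouble -> CDouble -> CDouble
    embed_blend t a b =
      realToFrac (blend (realToFrac t) (realToFrac a) (realToFrac b))

    foreign export ccall embed_blend
      :: CDouble -> CDouble -> CDouble -> CDouble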

Incremental adoption

Playing the counter argument, it could be argued that my needs are rather domain specific and an Embedded Haskell is not likely to be important enough to aid mainstream adoption of Haskell. This attitude does appear to be reflected in some parts of the Haskell community.

To start anecdotally, when I hear people propose writing something in Haskell, they generally imply the only option on the table is to write it in Haskell wholesale. Most arguments about use of Haskell I have read online focus on this either-or situation. For nearly all the reasons made in the quote above, this polarised position doesn't appeal unless you have very tailored circumstances and considerable resources.

There is also evidence of a lack of emphasis at a more fundamental level. GHC and the ecosystem of tools and libraries that surround it are not designed with embedding as a goal. GHC is primarily focused on producing stand-alone executables, which it does very well. Interoperability tools like GreenCard immediately sideline the ability to call Haskell from C as having "little demand"; most emphasis is placed on wrapping C libraries for use in Haskell. The best I can find online are some open student projects, such as the ageing GSoC ticket #1555, or the Embedded Haskell thesis topic in the PLS group. I believe this situation might be starting to change though, as the desire builds to remove the implementation obstacles surrounding much of Haskell's inspiring research, and as new developments such as the LLVM backend present additional options.

Wholesale adoption of Haskell is undeniably risky and definitely not the only way. Incremental adoption is where it's at, and in a nutshell, that's what Haskell is currently missing the most.

  [1] As an interesting historical aside, when Haskell first took up its purity banner, monadic IO did not exist and Haskell programs were pure functions of type [Response] -> [Request], where these types provided a limited set of options such as ReadFile, WriteFile, RequestFailed and ReadSucceeded for the side-effecting wrapper program to interpret. (See the Awkward Squad tutorial for further details.) Generalise this to a pure iteratee and you are probably in the right ballpark.
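To flesh out that footnote, a dialogue-style program looked roughly like the sketch below. The constructor names are approximate (Haskell 1.0's actual Response type differed slightly), and I've called the function program because a modern GHC would reject it as main:

    -- A sketch of pre-monadic, dialogue-style IO: the program is a
    -- pure function from the OS's stream of responses to a stream of
    -- requests. Laziness is essential - each response is demanded
    -- only after the corresponding request has been issued.
    data Request  = ReadFile FilePath | WriteFile FilePath String
    data Response = ReadSucceeded String | RequestFailed

    program :: [Response] -> [Request]
    program responses =
      ReadFile "input.txt" :
      case responses of
        ReadSucceeded contents : _ -> [WriteFile "output.txt" contents]
        _                          -> []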

Haskell For Games!

I've been head-down in a big stack of papers since around March this year. That was the point at which I first started to get excited about the idea of Haskell becoming a plausible language for use in games development. More recently I decided to start doing something about it and gave a talk to a group of dedicated Haskellers at AngloHaskell 2009. The event turned out to be a lot of fun and I think it's safe to say the talk went pretty well. Here's the abstract.

Functional Languages in Games: Plotting the Coup

[Slides as PDF]

As a games developer by trade, my experience of the industry leads me to suspect games development is approaching a tipping point where functional languages could enact a successful coup. The revolution would claim a chunk of C++-owned territory for the victor and mark an important milestone in the development of functional languages. It will not be easy. Games development is notoriously demanding and the successful functional language would need to meet stringent performance requirements, have clearly demonstrable 'killer apps', jump through hoops of fire and tell jokes at parties. This talk will discuss how close Haskell is to meeting these demands, the challenges that remain, evidence of functional languages already in games, and how Haskell compares against its nearest competitors.

Haskell For Games!

At first glance it sounds like a crazy idea. One to file away with the other crazy ideas to replace C++ with Java/C#/Python/etc. Most alternatives to C++ are so unlikely to succeed in practice that they appear to taint the very idea of replacing C++. I've written before about my high regard for C++, but as powerful and effective as it is for games development, it does not represent an impossible challenge, and we don't have to replace it entirely. Finding it a suitable companion would be a major step forward, and is the goal I'd choose to focus on.

Multi-core

There are powerful currents moving in modern computer hardware, pulling us inevitably into murky multi-core waters. However, this movement also makes the idea of doing games development in an alternative language more plausible. What do we do when large multi-core systems become a standard hardware platform? (A reality that I note is only a handful of years away.) I have yet to see a parallelisation option that doesn't make me think life in this new age will be rather hard in C++. And would it be any easier in C# or Java? No. Multi-core life there will likely be just as tough. However, these aren't the only options.

Functional languages

I'm far from the first to notice this, but pure functional languages - as opposed to the imperative languages most of us are used to - do at least have a theoretical advantage. Pure functional code does not have side effects: if you call it with the same parameters you will always get the same answer. It is thread-safe at a fundamental level, giving opportunities for optimisation and parallel evaluation that are either infeasible or impossible with imperative code. Such languages aren't as alien as you may immediately think; you may well already work with one without really realising it. Ignoring some syntactical obfuscations, both Cg and HLSL are essentially pure, referentially transparent languages. Neither language can wipe your hard drive or save state in global variables, and it's no coincidence that they both optimise exceptionally well.
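As a tiny concrete example of what purity buys you, here's a sketch using GHC's existing Control.Parallel.Strategies library (the shade function is my own invention): because the function is pure, evaluating the map in parallel is guaranteed not to change the answer.

    -- Pure code can be evaluated in parallel without changing its
    -- meaning. Build with: ghc -threaded -O2; run with +RTS -N.
    import Control.Parallel.Strategies (parMap, rdeepseq)

    -- A pure function: no global state, no side effects.
    shade :: Double -> Double
    shade x = sin x * cos (2 * x) + sqrt (abs x)

    main :: IO ()
    main = do
      -- parMap sparks each element for parallel evaluation; purity
      -- guarantees the result is identical to a plain map.
      let results = parMap rdeepseq shade [0, 0.001 .. 1000]
      print (sum results)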

As you can well imagine, this is not an open-and-shut success case. Achieving good parallelism, even from a functional starting point, is still hard. In the Cg/HLSL example above, the hard parallelism work is still done by the programmer in setting up the GPU pipeline, rather than magically derived from the shader code. Doing complicated, dependent operations in a GPU architecture is tricky and the subject of many GPGPU articles, although to be fair, many of these obstacles are due more to current GPU architectures than to fundamental issues in utilising parallelism.

Achieving parallel code that handles grubby details like nesting and runtime data dependencies is a hard problem. But in the long term I think it's more plausible to turn these problems into successes in functional languages than anywhere else. Compiler-parallelised code, even if partly programmer-controlled, would be a killer app for any alternative language, and one feature that C++ is unlikely to ever have. Without it, games development would still see many other benefits from adopting a functional sister-language, but they may not outweigh the cost of adoption.

Multi-core Haskell

I'm championing Haskell from the functional language pack for a variety of reasons, several of which are noted briefly in my talk and the rest I'll expand on in the future. I hope many of the benefits of Haskell will be apparent to anyone prepared to spend the time learning it, and I'd urge anyone interested to get stuck in immediately. There are several decent tutorials referenced from the Haskell Wiki, and I can highly recommend "Learn You a Haskell for Great Good!" as a great starting point. One other very notable highlight is the ongoing research into extending the language to support Nested Data Parallelism. Although not complete, this research looks very promising and is where I'm hoping some of the magic may take place.
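The canonical motivating example for Nested Data Parallelism is sparse matrix-vector multiplication, where each row contains a different, runtime-determined number of non-zero elements. Here's the nested structure in plain (sequential) Haskell - my own sketch; in Data Parallel Haskell the lists become parallel arrays and the compiler's vectoriser takes on the job of flattening and load-balancing the irregular parallelism:

    -- Sparse matrix-vector multiply. Each row stores only its
    -- non-zero entries as (column, value) pairs, so rows vary in
    -- length - exactly the nested, irregular parallelism that NDP
    -- research aims to exploit automatically.
    type SparseRow    = [(Int, Double)]
    type SparseMatrix = [SparseRow]

    smvm :: SparseMatrix -> [Double] -> [Double]
    smvm m v = [ sum [ x * (v !! i) | (i, x) <- row ] | row <- m ]

    main :: IO ()
    main = print (smvm [[(0, 2)], [(0, 1), (1, 3)]] [10, 20])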

Haskell for Games is by no means a done deal, but my enthusiasm for this project has at least withstood its first challenge - presenting these ideas to members of the Haskell community - and if anything has grown as a result.


The beauty of software development

This is all about how amazing software development really is.

Taking "X" to be a geeky subject: The belief that "X" is truly a thing of beauty but scorned, unloved and misunderstood by the masses is by no means a modern concept. But it lingers on all the same. I suppose it's no coincidence that the culmination of many geeky subjects into a sort of geeky mega-subject (software development) might attract a bit more than it's fair share of abuse. People at least have some respect for mathematicians and physicists, even if they choose to distance themselves. Tell people you develop software for a living and they promptly fall asleep, or complain that their computer never works. Unless of course, you develop games for a living at which point you become every kid's best friend. (It's a strategy I highly recommend.)

Here are a few thoughts and some of my favourite quotes on the topic of beauty and software.

Art
First up is Donald Knuth's "The Art of Computer Programming". For the non-coders out there, this book is to programmers what Stephen Hawking's "A Brief History of Time" is to most people: everyone has heard of it. Many people own a copy. Some have even attempted to read it, but few have actually completed it and fewer still have understood it. It's the kind of "compulsory reading" that most programmers skip but know they probably shouldn't have.

Knuth justifies his use of the word "Art" in the title:

Computer programming is an art, because it applies accumulated knowledge to the world, because it requires skill and ingenuity, and especially because it produces objects of beauty. A programmer who subconsciously views himself as an artist will enjoy what he does and will do it better.

You can almost hear a revolt starting.

Is coding Art? Well, I think there's one thing missing from Knuth's description that would make his assertion particularly convincing: Art can tell you something about humanity. Can your code do that? I'm not sure. But in defence of code, and the study of patterns in general, there are features and patterns of the world that are better reflected through them than through Art. I think some of these patterns are surprisingly deep and beautiful - eigenvectors are the first that spring to mind. Certainly beautiful enough that I'd hang them on my wall if I could capture them in a picture.

Expression
You can express yourself through Art. Can you express yourself through code? Certainly. The most obvious example of this is the rapidly growing cross-over world of programming visual artists. Generative art is a topic all of its own, so I'll just recommend anyone interested to check out Processing and follow links from there. I'm a fan of Robert Hodgin, especially this.

Is it possible to be defined by your creations, as many artists become defined by their output? This seems to be true of Justin Frankel, creator of several popular and sometimes controversial projects. There's a popular quote that goes with his resignation from AOL, but I'm including it with some reservations: it's second-hand and comes from a somewhat opinionated article, so it might be porky pies:

For me, coding is a form of self-expression. The company controls the most effective means of self-expression I have. This is unacceptable to me as an individual, therefore I must leave.

(I should probably also note his most recent project, REAPER, is absolutely fantastic and all you Cubase users should jump ship immediately.)

Elegance
I might be nitpicking, but I suspect the most common understanding of 'beauty' in reference to code is actually something closer to 'elegance' rather than beauty as such. Code elegance is arguably the reading-between-the-lines topic of many software engineering mailing lists.

Some noteworthy texts, from the small to the large, include a decent blog post, On Beauty in Code; a presentation on how to go about writing beautiful code (in PHP of all things!); and of course a rather interesting-looking book, Beautiful Code. I haven't read it yet, but intend to shortly. The highlight for me is an interesting review of a review of the book, entitled Code isn't beautiful:

Ideas are beautiful. Algorithms are beautiful. Well executed ideas and algorithms are even more beautiful. But the code itself is not beautiful. The beauty of code lies in the architecture, the ideas, the grander algorithms and strategies that code represents.

I think that's pretty much on the button.

Architecture
If your code was a building - an analogy that happens to be a good fit a lot of the time - you could marvel at its architecture. You could be impressed by the construction, or the balance of functionality and aesthetics. And as with the appreciation of architecture, a lot can be in the eye of the beholder!

Coventry's Belgrade Theatre.

Is it a "bold and dynamic" statement, developed through a "sculpural process" where "the spaces that it embraces, and that it implies around itself, are as important as the form itself"? Or, an unimaginative concrete cube ungracefully slapped into the middle of an already concrete-heavy town, representing little but the staggering lack of inspiration present in its creators? You decide! Comparisons with your most loved or love-to-hate software engineering projects as comments please.

Creation
Ignoring the code and algorithms for a moment, it's undeniable that the output of code can be beautiful - after all, that's a major goal of computer graphics research. And not all of it involves artists in the traditional sense. Data visualisation has become a big topic in recent years. I find the growth of this area quite fascinating, as it produces attractive, often intriguing images while apparently skipping over the role of the artist, deriving its input purely from real-world data. It's arguably an expression of humanity - although not quite in the sense I originally had in mind!

On a personal note, I still remember watching the first implementation of our radiosity algorithm emerge. The whole thing happened quite quickly and we lost several days to just playing with it: tweaking the scene, changing the lights, adding some post-processing. It was something none of us had seen before, and it took us quite by surprise. I'd had that feel-good effect from previous projects, but there's something about actually being able to see the result and play with it that makes it all the more tangible.

Process
I clearly remember my tutor at university complaining that too many people focus on process over product. In fact, he was my music tutor complaining about composers, but the point applies very well to software engineering. That's not to say there isn't beauty - even joy - to be gained from the creation of code. This leads me to my last, but perhaps favourite, quote of all time. Here's Alexander Stepanov (principal author of the C++ STL) and Mat Marcus in some lecture notes:

Programming has become a disreputable, lowly activity. More and more programmers try to become managers, product managers, architects, evangelists – anything but writing code. It is possible now to find a professor of Computer Science who never wrote a program. And it is almost impossible to find a professor who actually writes code that is used by anyone: the task of writing code is delegated to graduate students. Programming is wonderful. The best job in the world is to be a computer programmer. Code can be as beautiful as the periodic table or Bach’s Well Tempered Clavier.

It's one of my favourite quotes because it's so passionate: I too love programming! I love patterns and algorithms! The world is fantastic!

But - and it's a big but - that quote simultaneously shines light on the big elephant in the room: software development is programming, but with people. That 'people' part is vitally important, and is occasionally neglected by programmers of code, beautiful or otherwise. It mustn't be. Coding is empowering, but the power still lies with people. I suspect software development does have a thing or two to tell us about humanity.

And that's why software development really is amazing. Even if it's simultaneously one of the most mind-numbingly difficult, painful and exhilarating things I can think of.