
Why isn't Haskell popular in Industry?

This question was recently posed (and closed!) on Stack Overflow. The top-voted reply is so good I thought I'd replicate it here! I'm afraid all I know of the original author is their "Orphi" posting name.

It's tongue-in-cheek and broadly true, although I'd argue the details of a few points. I think it's interesting that it doesn't contain the main reason blocking me from using it on a daily basis. More on that in a moment.

First, here's Orphi's response:

  1. Nobody's ever heard of it. No one's going to use something they don't know exists.
  2. It's unpopular. People assume that the most popular language is the best language, because if it wasn't good, it wouldn't be popular. This is actually untrue; as you can see, the most popular language is the most popular language. To put it another way, Haskell is unpopular because it's unpopular. This is what Haskell programmers refer to as "recursion", and despite what imperative programmers tell you, it's extremely common in The Real World.
  3. It's different. People are always afraid of what's different.
  4. It's difficult. People think that Haskell is difficult to understand or difficult to learn. This is almost certainly related to point #3. It's also related to the fact that the Haskell community is populated by people who casually remark "a monad is just a monoid in the category of endofunctors, what's the problem?" and expect normal human beings to comprehend this.
  5. It's risky. Most companies don't want to be the first to use something. Haskell isn't being used by many people, so not many people want to try it. (See this recursive unpopularity argument again?)
  6. Can't hire programmers. First, by #2, there aren't many programmers who already know Haskell. Second, most people believe #4, which means you can't train programmers to use Haskell. (At least, it would if it were actually true.) A language that you can't hire programmers for is a very, very risky proposition indeed. (Which leads us back to #5.)
  7. Libraries. This is probably the big one, so I'm going to spend some time on it.
    • Quality. We have the quantity. We do not yet have the quality. Most of Hackage is one-man hobby projects with little to no documentation. Some of it is incomplete, some of it has long since bit-rotted, some of it malfunctions if used in certain ways.
    • The Outside World. If you want a binary heap tree, Hackage probably provides a dozen implementations. If you want to connect to an RPC server and fire off a few procedure calls... good luck with that. Same deal for talking to databases, accessing OS resources, manipulating binary file formats... You'll basically have to write everything yourself from scratch. This is a big deal for commercial work.
    • Multiple incompatible libraries. You can, in fact, connect to a database in Haskell. Trouble is, at the last count there's about a dozen libraries for doing this, and it's bewildering trying to figure out which ones are actively supported and which ones are zombie projects that stopped working years ago. It's also not as simple as hooking up an ODBC connection; there are different backends for each library and each DB target. Yay. :-/
    • Windows. Almost all the important libraries (for cryptography, binary file formats, network protocols, data compression, talking to databases, etc.) are Haskell wrappers around C libraries. And these all fail to build on Windows. Given that Windows is the single biggest target platform on the market, this is a big deal.
  8. Unpredictable performance. This is way, way down at #8. Most people don't know enough about Haskell to even know this. Most people just assume that "Haskell is slow". This is demonstrably untrue. What is true is that it can be hard to predict the performance of a Haskell program. Subtle, apparently irrelevant changes can sometimes make big performance differences.
  9. Correctness. Most companies don't give a **** about correctness. They don't care about quality. They just want to shovel code out the door as fast as possible and earn wads of cash. If there are bugs, they'll charge the customer money to fix them. Getting code right is of no interest; getting code fast is what counts. Haskell is a language that rewards those who sit back and deeply analyse the problem, and then produce a beautiful solution. Most companies don't care for this approach; let's just hack something together as fast as possible, and worry about fixing it later (i.e., never).

There are a few places where correctness matters. These are either safety-critical systems, or financial systems, generally. I gather Haskell tends to be quite popular here.

One final pair of data points:

  • I can still remember not so long ago hearing people cry "C++ is a toy language for n00bs! You should use a proper programming language like C." Now take a look around you and see how many large-scale C++ programs there are.
  • People have been claiming that Lisp is "the next big thing" for, what, 40 years now? Lisp is older than almost every programming language in mainstream use. And how many large-scale Lisp programs are there?

Which fate awaits Haskell, I don't know. I rather suspect all the good ideas of Haskell will be stolen by the likes of C# and hybrids like F# or OCaml, and people still won't ever use Haskell itself. It's just too different.

But anyway, as to why industry doesn't use Haskell, see the points above. It's too rare, too unpopular, too weird, and has incomplete libraries. That's basically it, in a nutshell.

Embedding Haskell?

The missing item for me is how hard it is to embed. For my purposes, I need to embed Haskell within an existing C++ framework, not dissimilar to how you might embed Lua. I need to do so in a compiler-agnostic manner, where I retain control over how the embedding occurs (i.e. compilation and linking) and have significant influence over the operation of the runtime. In short, Haskell must fit into my existing C++ environment, not the other way round.

An interesting aspect of an "Embedded Haskell" would be that you could remove most (if not all) of Haskell's IO support if it made the job of embedding Haskell significantly easier. The ability to efficiently embed 'pure' Haskell would be a very interesting and useful tool in its own right [1]. It's a simple idea, and therefore doesn't take many words to state, but I'm keen not to understate how significant this could be.
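
For a sense of how far current tooling is from this goal, the nearest thing GHC offers is the FFI's foreign export, which lets C++ call compiled Haskell but drags the full runtime along with it. A minimal sketch, with illustrative names:

{-# LANGUAGE ForeignFunctionInterface #-}
module Embed where

import Foreign.C.Types (CInt(..))

-- A pure function we'd like the C++ host to call.
triple :: CInt -> CInt
triple x = 3 * x

-- Expose it with C linkage. The host must still bracket all calls
-- with hs_init/hs_exit, which is exactly the heavyweight part.
foreign export ccall triple :: CInt -> CInt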

Incremental adoption

Playing the counter-argument: perhaps my needs are rather domain-specific, and an Embedded Haskell is not likely to be important enough to aid mainstream adoption of Haskell. This attitude does appear to be reflected in some parts of the Haskell community.

To start anecdotally, when I hear people propose writing something in Haskell, they generally imply the only option on the table is to write it in Haskell wholesale. Most arguments about the use of Haskell I have read online focus on this either-or situation. For nearly all the reasons given in the quote above, this polarised position doesn't appeal unless you have very tailored circumstances and considerable resources.

There is also evidence of a lack of emphasis at a more fundamental level. GHC and the tools and libraries ecosystem that surround it are not designed with embedding as a goal. GHC is primarily focused on producing stand-alone executables, which it does very well. Interoperability tools like Greencard immediately sideline the ability to call Haskell from C as having "little demand". Most emphasis is placed on wrapping C libraries for use in Haskell. The best I can find online are some open student projects, such as the ageing GSoC ticket #1555, or the Embedded Haskell thesis topic in the PLS group. I believe this situation might be starting to change though, as the desire builds to remove the implementation obstacles surrounding much of Haskell's inspiring research, and new developments such as the LLVM backend present additional options.

Wholesale adoption of Haskell is undeniably risky and definitely not the only way. Incremental adoption is where it's at, and in a nutshell, that's what Haskell is currently missing the most.

  [1] As an interesting historical aside, when Haskell first took up its purity banner, monadic IO did not exist and Haskell programs were pure functions of type [Response] -> [Request], where these types provided a limited set of options such as ReadFile, WriteFile, RequestFailed, ReadSucceeded for the side-effecting wrapper program to interpret. (See the Awkward Squad tutorial for further details.) Generalise this to a pure iteratee and you are probably in the right ballpark.
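
To make that concrete, here's a rough sketch of the dialogue style, using the constructor names from the footnote; treat the details as illustrative rather than the exact historical API:

-- The whole program is a pure function: it lazily consumes the list
-- of responses and produces the requests the wrapper will perform.
-- (In Haskell 1.x this was the type of main itself.)
data Request  = ReadFile String | WriteFile String String
data Response = ReadSucceeded String | WriteSucceeded | RequestFailed

program :: [Response] -> [Request]
program responses =
  ReadFile "input.txt" :
  case responses of
    ReadSucceeded contents : _ -> [WriteFile "output.txt" contents]
    _                          -> []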

Haskell For Games!

I've been head-down in a big stack of papers since around March this year. That was the point at which I first started to get excited about the idea of Haskell becoming a plausible language for use in games development. More recently I decided to start doing something about it and gave a talk to a group of dedicated Haskellers at AngloHaskell 2009. The event turned out to be a lot of fun and I think it's safe to say the talk went pretty well. Here's the abstract.

Functional Languages in Games: Plotting the Coup

[Slides as PDF]

As a games developer by trade, my experience of the industry leads me to suspect games development is approaching a tipping point where functional languages could enact a successful coup. The revolution would claim a chunk of C++-owned territory for the victor and mark an important milestone in the development of functional languages. It will not be easy. Games development is notoriously demanding and the successful functional language would need to meet stringent performance requirements, have clearly demonstrable 'killer apps', jump through hoops of fire and tell jokes at parties. This talk will discuss how close Haskell is to meeting these demands, the challenges that remain, evidence of functional languages already in games, and how Haskell compares against its nearest competitors.

Haskell For Games!

At first glance it sounds like a crazy idea. One to file away with the other crazy ideas to replace C++ with Java/C#/Python/etc. Most alternatives to C++ are so unlikely to succeed in practice that they appear to taint the very idea of replacing C++. I've written before about my high regard for C++, but as powerful and effective as it is for games development, it does not represent an impossible challenge, and we don't have to look to replace it entirely. Finding it a suitable companion would be a major step forward and is the goal I'd choose to focus on.

Multi-core

There are powerful currents moving in modern computer hardware, pulling us inevitably into murky multi-core waters. However, this movement also begins to make the idea of doing games development in an alternative language more plausible. What do we do when large multi-core systems become a standard hardware platform? (A reality that I note is only a handful of years away.) I have yet to see a parallelisation option that doesn't make me think life in this new age in C++ will be rather hard. And would it be any easier in C# or Java? No. Multi-core life there will likely be just as tough. However, these aren't the only options.

Functional languages

I'm far from the first to notice this, but pure functional languages - as opposed to the imperative languages most of us are used to - do at least have a theoretical advantage. Pure functional code does not have side effects. If you call it with the same parameters you will always get the same answer. It is thread-safe at a fundamental level, giving opportunities for optimisation and parallel evaluation that are either infeasible or impossible with imperative code. Such languages aren't as alien as you may immediately think. You may well already work with one without really realising it. Ignoring some syntactical obfuscations, both Cg and HLSL are essentially pure, referentially transparent languages. Neither language can wipe your hard drive or save state in global variables, and it's no coincidence that they both optimise exceptionally well.
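
As a tiny illustration of that advantage, here's a sketch using the par and pseq combinators from GHC's parallel package. Because both recursive calls are pure, the runtime is free to evaluate them on separate cores with no locks and no races:

import Control.Parallel (par, pseq)

-- Naive Fibonacci with the first recursive call sparked in parallel.
-- Purity guarantees the two sub-computations cannot interfere.
fib :: Int -> Int
fib n
  | n < 2     = n
  | otherwise = l `par` (r `pseq` (l + r))
  where
    l = fib (n - 1)
    r = fib (n - 2)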

As you can well imagine, this is not an open-and-shut success case. Achieving good parallelism, even from a functional starting point, is still hard. In the previous example of Cg/HLSL, the hard parallelism work is still done by the programmer in setting up the GPU pipeline, rather than magically derived from the shader code. Doing complicated, dependent operations in a GPU architecture is tricky and the subject of many GPGPU articles, although to be fair many of these obstacles are due more to current GPU architectures than to fundamental issues in utilising parallelism.

Achieving parallel code in the presence of grubby details like nesting and runtime data dependencies is a hard problem. But in the long term I think it's more plausible to turn these problems into successes in functional languages than anywhere else. Compiler-parallelised code, even if partly programmer controlled, would be a Killer App for any alternative language, and one feature that C++ is unlikely to ever have. Without this feature, there are still many other benefits for games development in adopting a functional sister-language, but they may be cancelled out by the cost of the adoption.

Multi-core Haskell

I'm championing Haskell from the functional language pack for a variety of reasons, several of which are noted briefly in my talk and the rest of which I'll expand on in the future. I hope many of the benefits of Haskell will be apparent to anyone prepared to spend the time learning it, and I'd urge anyone interested to get stuck in immediately. There are several decent tutorials referenced from the Haskell Wiki, and I can highly recommend "Learn You a Haskell for Great Good!" as a great starting point. One other very notable highlight is the ongoing research into extending the language to support Nested Data Parallelism. Although not complete, this research does look very promising and is where I'm hoping some of the magic may take place.
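
To give a flavour of Nested Data Parallelism, here is the running example from the Data Parallel Haskell papers. [:Double:] denotes a parallel array and the P-suffixed combinators are its bulk operations; the extension is still experimental, so treat the exact pragma and module names as illustrative:

{-# LANGUAGE ParallelArrays #-}
import Data.Array.Parallel   -- home of the experimental [: :] arrays

-- Dot product, DPH style: the compiler vectorises this and decomposes
-- the work across cores; no explicit threading by the programmer.
dotp :: [:Double:] -> [:Double:] -> Double
dotp xs ys = sumP (zipWithP (*) xs ys)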

Haskell for Games is by no means a done deal, but my enthusiasm for this project has at least withstood its first challenge - presenting these ideas to members of the Haskell community - and if anything has grown as a result.

On parsing, regex, haskell and some other cool things

I've recently become slightly obsessed with finding ways (new or otherwise) to make parsing text really, really simple. I'm concerned there are wide gaps in the range of current parsing tools, all of which are filled by pain.

It's also a nice distraction from the C++ language proposal I was working on which is stalled while I dig through more research. It turns out someone has already done something very similar to what I was thinking! So there will be a bit of a delay while I bottom that out properly.

Parsed the pain.
Parsing with regular expressions covers a decent amount of simple low-hanging fruit. I happen to be a big fan of regex but it definitely doesn't handle parsing 'structured documents' very well. Here 'structure' means some non-trivial pattern: perhaps matching braces, nested data or maybe a recursive structure.

This is by design. Regular expressions are, or were originally, a way of describing an expression in a 'regular grammar'. Their expressive power is actually very limited, and text doesn't need to be that complex before it exceeds the expressiveness of a regular expression. This regex email address parser is just about readable, but kind of pushing the limits:

\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b

However, XML, HTML, pretty much all source code, every file format I've ever written - basically all the documents I care about - are not describable by regular grammars.

The pain in context
The next step up from a regular grammar in the Chomsky hierarchy is a 'context-free' grammar. Parsing a context-free grammar frequently involves writing a lexer and parser combination to do the work. The lexer breaks the character stream into 'tokens' and the parser translates the token stream into a more meaningful tree structure. Parsers in particular tend to be complex and lengthy pieces of code, so you'd more often than not find yourself using a parser generator such as yacc, bison, or antlr to actually generate the code for you from a separate description of the grammar. This is all before you actually get to doing something useful with the tree the parser outputs.
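
As a small illustration of the lexer half, here's a toy Haskell lexer for a tiny arithmetic language (the token set and names are purely illustrative):

import Data.Char (isDigit, isSpace)

-- Tokens for a tiny arithmetic language.
data Token = TNum Int | TPlus | TLParen | TRParen
  deriving Show

-- The lexer's whole job: characters in, tokens out. A parser would
-- then turn the [Token] stream into a tree.
lexer :: String -> [Token]
lexer []       = []
lexer ('+':cs) = TPlus   : lexer cs
lexer ('(':cs) = TLParen : lexer cs
lexer (')':cs) = TRParen : lexer cs
lexer (c:cs)
  | isSpace c = lexer cs
  | isDigit c = let (ds, rest) = span isDigit (c:cs)
                in  TNum (read ds) : lexer rest
  | otherwise = error ("lexer: unexpected " ++ show c)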

Either way you cut it, this is a significant step up in pain from a regular expression. Your problem has suddenly jumped from a condensed one-liner to full-on generated code. If the task you have in mind is just a bit more complex than a regex can handle, your pain increases disproportionately with that increase in complexity.

Sadly, even context-free grammars often don't cut it in practice. There's a fair gap between the expressiveness of context-free grammars and the real world of nasty, ambiguous, context-sensitive languages. I'm thinking mainly of the context-sensitivity of C++, where the task of writing a parser is full of painful implementation details. Not to mention that there is a further major leap to get close to parsing the world of natural languages, such as English.

Pain relief
There is no shortage of parsing tasks in the "slightly more complex than a regex" category. Context-free grammars actually contain several sub-categories that are more restrictive but simpler to parse, such as LL and LR. So it's not really much of a surprise to discover that a typical 'regex' isn't actually a 'regular grammar expression' any more.

Perl's implementation of regex supports recursion, back references, and finite look-ahead, which allow it to handle some - maybe all - context-free documents. I recently re-read the Perl regex tutorial to remind myself of it, and had some fun scraping the web for Tesco voucher codes. I think the expansion beyond supporting just regular grammars is very helpful, but I don't think it really bridges the gap to context-free parsing in a particularly manageable and re-usable way.
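
As a taste of that extra power, Perl's (?R) construct recurses into the whole pattern, which is enough to match balanced parentheses - something no true regular expression can do:

\(([^()]|(?R))*\)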

So, if Perl's extended regex doesn't cut it, what are the alternatives? Well, here's a couple of thoughts.

Structured grep
I thought this was quite a nice find: sgrep ("structured grep"). It's similar to, but separate from, the familiar grep. There are binaries for most platforms online, and it can also be found in Cygwin, Ubuntu and probably most other Linux distros. At least in theory, it extends regular grammar pattern matching to support structure through the use of nested matching pairs and boolean operators.

Here's how you might scrape an HTML document for the content of all 'bold' tags:

$ cat my_webpage.html | sgrep '"<b>" .. "</b>"'

The .. infix operator matches the text region with the specified start and end text strings. It also supports boolean operators, like this:

$ cat my_webpage.html | sgrep '"<b>" .. ("</b>" or "</B>")'

If you dig through the manual you'll come across macros and other cool operators such as 'containing', 'join', 'outer' and so on. It seems easy to pick up and you can compose more complex expressions with macros.

I would go on about it for longer, but sadly its current implementation has a fairly major flaw - it has no support for regex! This feels like a simultaneous step forwards and backwards. I'm not actually sure whether it's a fundamental flaw in the approach they've taken or whether the functionality is simply missing from the implementation. It's a bit of a shame because I think it looks really promising, and if you are interested I'd recommend you take a moment to read a short article on their approach. I found it an interesting read and have since hit upon a handful of mini-parsing problems that I found sgrep very helpful with.

Parser combinators
This was a recent discovery, and it now surprises me I hadn't come across it before. I think I didn't know of it because it's rather tightly bound to the realm of 'functional' languages, which isn't something I've spent that much time looking at until now. That's all changing though, as I think I'm becoming a convert.

It occurred to me that a parser might be easier to write in a functional language: parsing a grammar is kind of like doing algebra, and algebraic manipulation is the kind of thing functional languages are particularly good at. Googling these ideas turned up both Parser Combinators, an interesting parsing technique, and Haskell, a pure functional language where a 'parser' is really a part of the language itself.

Parser combinators are a simple concept: you write a set of micro-parsers (my name for them) that do very basic parsing duties. Each is just a single function that, given a text string, returns a list of possible interpretations. Each interpretation is a pair of the interpreted object and the remaining text string. In Haskell, you'd write the type of all parsers (a bit like a template in C++) like this:

type Parser a = String -> [(a,String)]

For an unambiguous input string the parser will produce a list with just one item, ambiguous inputs will produce a list with more than one item, and an invalid input produces an empty list. An example micro-parser might just match a particular keyword at the start of the string.
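
For instance, such a keyword-matching micro-parser might look like this against the Parser type above (the name keyword is mine):

-- Succeeds with exactly one interpretation if the input starts with
-- the given keyword, and with none otherwise.
keyword :: String -> Parser String
keyword kw cs
  | take (length kw) cs == kw = [(kw, drop (length kw) cs)]
  | otherwise                 = []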

Since all your parsers are of the same type, it's now simple to compose them together into more complex parsers. This is modular programming at its most explicit.

It's quite surprising how tiny and general the code to compose these parsers can be. You can reduce them to one-liners. Here's a few examples, again in Haskell:


-- Here, m and n are always Parser types.
-- p is a predicate, and b is a general function.

-- parse-and-then-parse
(m # n) cs = [((a,b),cs'') | (a,cs') <- m cs, (b,cs'') <- n cs']

-- parse-or-parse
(m ! n) cs = (m cs) ++ (n cs)

-- parse-if-result
(m ? p) cs = [(a,cs') | (a,cs') <- m cs, p a]

-- parse-and-transform-result
(m >-> b) cs = [(b a, cs') | (a,cs') <- m cs]

-- parse-and-ignore-left-result
(m -# n) cs = [(a,cs'') | (_,cs') <- m cs, (a,cs'') <- n cs']

-- parse-and-ignore-right-result
(m #- n) cs = [(a,cs'') | (a,cs') <- m cs, (_,cs'') <- n cs']

I've taken these examples from "Parsing with Haskell", which is an excellent short paper and well worth a read.
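
To see the combinators in action, here's a short sketch of two further micro-parsers built on top of them; char and digit are my names, loosely following the paper's style:

-- Consume a single character, failing on empty input.
char :: Parser Char
char (c:cs) = [(c, cs)]
char []     = []

-- Restrict char with the ? combinator to accept only digits.
digit :: Parser Char
digit = char ? (\c -> c >= '0' && c <= '9')

-- digit "42rest"  evaluates to  [('4', "2rest")]
-- digit "x42"     evaluates to  []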

Learning Haskell has been something of a revelation. I had glanced at Objective Caml and Lisp before, but I'm actually really quite shocked at how cool Haskell is and that it took me so long to find it.