But now Haskell seems to be en vogue and after playing around with it, doing some toy programming and trying things out, I decided that I have to do something 'real' in it. For some time now I casually work on a system for compiler creation based on a customizable parser front end, a template engine for output, an rule based language with forward inference for doing the semantic stuff and a graphical Browser to work with the system comfortably. Because I reached a point where small incremental changes weren't enough anymore, I decided to do a bigger rewrite. It's written in Java but I thought it would be a good idea to rewrite it in Haskell to learn more about the language.
So for some weeks I've done quite a bit of Haskell programming, produced about 2000 lines working code and made some experiences with Haskell I'd like to share.
Where Haskell really shines is pattern matching: Creating tree walkers or subexpression-matchers is a piece of cake, and if you're writing a compiler, this is really useful. Also the easy and powerful list-handling is quite an asset. Things which would require tens of lines in Java are done in 2-3 lines of Haskell code. No more clumsy visitor patterns, not boring List-object creation with complicated insertions etc. All this is really smooth walking and where you can use it good, it really gives a lot productivity.
Lazy evaluation is the icing on the cake because it allows you to write code in very natural way which would lead to rather bad performance in other languages because calculations could easily calculate much stuff which isn't really required there. To avoid this you often have to create quite artificial structures instead of being able to simply reuse old code. In Haskell you simply create a infinite or big finite lists, even if you require maybe only the first two elements. So in a non-lazy language it's always necessary to have some criteria to abort the evaluation while in Haskell you simply write like it is most natural and then let the function which request the data choose how much it wants to get. Caching inclusive.
Also using a fast REPL (read-eval-print-loop) to test out new code (I used Hugs, because of it's good integration into the Windows environment) can be quite nice. That's also a big plus for languages like Lisp and with Haskell it's quite the same. One bothersome thing with Hugs is that it doesn't allows you to test code out, as long as it finds a single error in one of your modules, even if the part of code your test doesn't touch. Thats a result of the compilation model and type inference and partly understandable. But in contrast to using the REPL with Lisp it's still a bit annoying.
Also in contrast with Ocaml the Haskell type system is much more powerful and allows easy overloading of functions. In Ocaml I always had to invent lots of names or use qualified names for functions because there was no overloading possible (but don't know for sure it this is still true because the last time I used Ocaml was in 2002 and maybe they made some advance since then).
Haskell has type inference and while this can really make code more compact because the lack of necessary type annotations it also has its downside: A new line somewhere in your code can easily create compiler errors somewhere else. Type inference works by creating lots of constraints and the inferencer then tries to solve them. So if an error occurs, the compiler flags this error somewhere in the dependency chain and often at places where everything is totally correct while the real error is somewhere else. If you develop code incrementally this is not as bad because you know what you have written since the last run and can look at the right spot it to figure out the error. But if you write longer parts of code without testing the program in between, it can take you quite a bit of time to find out where the error really is.
Haskell has a solution to it: Type annotations. Simply declare the type you want a function to have and things are working much more easy. So if you want to do something bigger than 20 lines in Haskell: Use type annotations, at least for all top level functions. Luckily it's easy to put annotations into your code later, because you can query the inferenced type of a function each time your code compiles and simply paste it into your program (but sometimes you should rewrite it a bit to support a bit more general typing). The code isn't as small as without annotations, but thats better than doing lots of searching around for the source of compilation errors.
The real problems with Haskell start to show up in the moment you require some kind of state. And yes, you will need it: Many things can be done in pure functional style, but I've often encountered situations where a single function deep down in the call hierarchy suddenly needed some informations from the outside. And in some cases they even needed to modify those informations. Then things starts to get ugly.
In Ocaml you can simply declare global, mutable variables and so the problem is a non issue. But in Haskell it's impossible because it don't allows mutation. Even if our small block of code only needs to read some state from the outside, it's not directly possible in Haskell anymore. In my project I often found a nice functional solutions to a problem, but later on, after I put the (tested and good working) code into action I discovered that some function 'deep-down' suddenly required access to outside data and I had to do lots of rewrites because of this.
Haskell has a solution to this 'outside context problem': Excessive use of Monads. While a bit strange in the beginning, after using monads a bit, you get quite fast accustomed to them. But with more complex situations, the usage patterns are also getting more and more complicated.
To use outside state I often used the StateT monad. This monad allows you to create your own state data, modify it like in an imperative language and it can also wrap an other monads into it (for example the IO monad). So if you use this monad you can access your state at will, even modify it and also have access to IO functionality to write out informations while your program is executing.
(This is another big issue I had with Haskell: Simply create some debugging output. In most language you can simply write to some logging-stream to see what your program has done. This is often even more useful than using a debugger, because you can examine the history of the program execution and see what lead to a certain error. But in Haskell you need access to the IO monad to write data out and this requires you to change the way you write your programs)
Also the conciseness of Haskell stops in the moment you want to access or change state. Even Java code is more concise than Haskell code which has to change things. I know that this is probably intended to force programmers to write programs in a functional style. But compared to the otherwise clear and compact syntax of Haskell its really a pain.
Monads can encapsulate state into some kind of data. In fact all monads are simply data constructors with a few required functions. So most monads work by returning itself from functions. It requires a bit backward thinking, because the state-monad for example creates a chain of closures which are executed later by evaluating the value of the monad. So to use state you don't have to give this state to a function, the monad shows up only in the result type. While this is quite clever (and also is the reason why things like backtracking and continuations can be implemented as monads) from a practical point of view, it also means you have to rewrite your code:
Lets have two simple, small functions:
giveElements :: (a -> Bool) -> [[a]] -> [[a]]
giveElements condition = filter (detectElement condition)
detectElement :: (a -> Bool) -> [a] -> Bool
detectElement = any
The above seems a bit useless but I've extracted from it's surrounding code to show the problem. The 'giveElements' function is able to extract certain elements from a list and returns the filtered list. The 'detectElement' function does the job of deciding which elements are in and which are not. It does this by checking if any element of a list fulfills a certain condition.
So short so nice, but after some work on your program you discover that your 'condition' function required access to outside state to do it's job. Maybe the function has to do some a small modification of some data or should simply write a short message into a log. To allow this, you have to rewrite the above functions to use a monad. For this example lets simply use the IO monad here. Now the functions become:
giveElements :: (a -> IO Bool) -> [[a]] -> IO [[a]]
giveElements condition = filterM (detectElement condition)
detectElement :: (a -> IO Bool) -> [a] -> IO Bool
detectElement c els = do
r <- mapM c els
return $ or r
The first function still looks quite similar, only the 'filter' function is replaced by 'filterM' now. 'filterM' is the monadic version of filter and gives the monad back from the condition. So without filterM the state won't get up from the detectElement function (and of course there would be a type error).
detectElement looks a bit more different. The reason is that there is no monadic version for 'any'. So I use a monadic map (the mapM function) to create a list of Bools by applying the condition to each element and then check if at least one of the results it true. This doesn't requires more evaluations of c than the first version because of lazy evaluation, but it still required a complete rewrite of my code. It's still readable, but not as much as before.
And there are some other dreadfulnesses: The '$' after the 'return' is necessary because if you simply write "return or r" it would think or and r are both parameters to return (because 'return' is no part of the Haskell syntax but an ordinary function), so we need the '$' to change this. I could also write "return (or r)" but thats a bit longer and with more complex expressions can easily look like you're programming in Lisp. And you can't easily put the 'r' into the expression below. That's because the '<-' is no assignment but a special syntax which creates a closure with r as parameter.
All this has to be considered if you start to use monads. Only because I wanted to access the IO state from my condition-function, I had to rewrite my little program substantially. And if there would be no filterM if even get even more difficult. I had for example to use a similar code with sets instead of lists - but there is no filterM for sets, only a normal filter. And you can't 'lift' (Haskell slang for 'transform a function by using a combinator') a filter function to a monadic one. Map or fold functions can be lifted to monadic forms because they take a function which returns an arbitrary value, which can be used to put the monad in. But if a filter simply returns a Bool and then there is no way to return the monad from within. So I had to convert the set to a list, filter it with filterM and then build a set again - which has worse runtime characteristics than directly filter it.
And if you have more complex functions the above translation process from a non-monadic into a monadic-one is much more difficult and IMO it also destroy the clarity of the code. Also state monads have bad runtime characteristics because of their 'inside-out' evaluation which can fill up the stack quickly. You can use IORefs instead - but then you can't simply transform your program to using continuations anymore for example. And if you put a IO monad into a state-monad, you have to lift all IO-accessing functions via liftIO.
And whats the point of having a pure functional language, going thru lots of trouble to maintain the purity, only to get things like IORef which you tried to get rid of in the first place? If you design a program to use IO monads from the beginning everywhere, then you have fewer problem maintaining and expanding it - but then you're using Haskell like a normal imperative language. Whats the point of Haskell then?
Another bad one with Haskell is documentation: While you find lots of introductions all over the net, it get quite difficult to find informations if you're beyond the basics and start do do real work. The 'official' docs are simply horrible: No examples, no real descriptions how things work or should be used. And finding tutorials for the advanced stuff becomes quite difficult. You can get lucky and find a solution to a problem buried in some of the pages of the Haskell-wiki, a newsgroup or a research paper - but it will take its time to find it and sometimes there is no such solution. And this is only about 'language stuff', not how to solve a very specific situations in some obscure framework.
In retrospective: Haskell is a nice language, but if you need state handling then it gets ugly. That even kills the language for me. Whats the use of concise and compact code if you have to rewrite it later on into a much more ugly form if you discover that you need state? This is no problem for small programs, so I've never ran into those problems with my experiments before. But with a bigger program things changed. So my program still isn't implemented completely and I will probably abandon the Haskell version and do a rewrite in Java again.
Its not a problem of functional programming in itself, its a Haskell thing: While monads are quite powerful in principle IMO they are a pain to use in practice. Because monads show up in the return type of functions, the easy composition of functions which is the prime reason for the power of functional programming only works as long as all those functions use (compatible) monads, and sadly many functions simply don't exists in this form.
With Ocaml for example you won't get this kind of problem, but there you have unlimited mutation which has it's own problems. But there are other pure concepts like for example uniqueness typing (Clean) or transparently chaining state-parameters thru calls (Mercury). And maybe with more time to get accustomed to the use of monads I would maybe see things differently. But since I have no unlimited time and need to get things done in a programming language, I consider my Haskell experiment as a failure - but only a partial one because it was nonetheless fun and brought lots of insights.
So what to do with Haskell? Maybe it would be an idea to put monads automatically in every return type. If I write something like:
func :: a -> b
then the compiler would creates internally
func :: Monad m => a -> m b
and also rewrites all bodies to do/return-syntax. I don't know, if this would really always work and would really solve all the problems but I suspect that by hiding the use of monads better and letting the compiler decide where he can drop the monad constructors safely you would get you 'clear and normal' looking programs back even if they use Monads everywhere. I'm far from a Haskell expert, so take this with a grain of salt. But since Haskell already provides some syntactic support for monads, why not go one step further? If there wasn't the problem with state (and documentation - but thats only a question of effort) Haskell could be a very nice and maybe even the 'next big' thing.