Andrew Birkett's blog – Page 14 – Thoughts of a software engineer

Haskell-meet-Erlang; (incl. Agile Sumo Wrestling)

Post author By Andrew
Post date 6/30/2009

Simon Peyton-Jones compares Haskell and Erlang. A good, fun historical tour of Haskell during a recent Erlang conference. I do enjoy watching talks where the presenter isn’t just preaching to the converted. Over the past few days, I’ve been writing a haskell IRC bot in an erlang message-passing style so this haskell/erlang video makes me wonder if there’s some kind of wormhole universe link-up going on!

General

307 Temporary Redirect

Post author By Andrew
Post date 4/12/2009

This blog has been quiet recently because, for the next month, I’m focused on my imminent 1,000 mile cycle ride from Land’s End to John O’Groats. I started a separate blog to track my training and progress if you’re interested!

General

Static checking java

As a distraction from learning about denotational semantics, I took a look at ESCJava2, which is an extended static checker for Java. ESCJava2 lets you add annotations to your source code which describe invariants and pre/post-conditions which you expect to be true. Then it goes away and tries to verify that they are true. Often, you have to go back and add extra annotations elsewhere (anyone who has const-ified a C++ program knows this feeling). But eventually you get to a point where you can be fairly confident that your program is (statically) free from all sorts of badness – eg. you’ll never try to pop from an empty stack, you’ll never get a null pointer exception.

This is all done declaratively, which makes it much tastier than unit tests.

Does it work? Well, I got out the source code for my trusty java raytracer, which was literally the first java program I ever wrote .. back in 1996 or something. At about 1,000 lines of code, it’s both “real world” enough to act as a decent test case, yet simple enough to grok quickly.

Initially, I had to spend a while adding ‘non_null’ annotations everywhere – every field in every object had that property. Dull, but useful to have it all checked.

Next was various bound-checking stuff. I had an image class with width/height members and a 2d array of pixels. I ‘knew’ that the array was of the right size, but by asserting some invariants to that effect, I could get ESCJava to statically bounds-check all the code that reads/writes pixels. The invariant were moderately complex things like “forall i < height; pixels[i].length == width". These "forall" quantifiers are what makes this approach tastier than unit tests. With unit tests, you have to pick "representative" data points and then trust that you chose well. In the haskell world, QuickCheck is a big step above that. Perhaps ESCJava brings some hint of universal tastiness to Java? Well, except that it didn't work in practise. My key 'raytrace all the pixels' method couldn't be processed by ESCJava. It gave an error which effectively said "the theorem which I'd have to prove to be sure that the code is safe is just too complex". Disappointing, considering that it was little more than some nested loops plus a bit of logic. ESCJava had a few other nice features though. Since my code is old, it the plain old Vector class. ESC can add a 'phantom' elementType field to your vector, which gives you 1.5-like static type checking on pre-1.5 collections. But in the end, the final nail in the coffin was that ESCJava does not support java 1.5+ features. So it doesn't understand generics. Oh well, no use for the real code that I work with. I'd love to have the time to understand this area more deeply. I like this kind of "unsound and incomplete but useful in practise" part of the statically-checked spectrum.

Programming

HAppS-state mistake

I’m grappling with HAppS-State at the moment, and thought it useful to capture some work-in-progress notes. My toy webapp allows you to view and edit information about people, places and things. The webapp state just consists of several identifier->entity maps.

HAppS-state requires that you write your state query/update functions as normal MonadState or MonadReader computations. But you also must process each of these functions using the mkMethods template haskell function. This generates some “behind the scenes” machinery to turn your vanilla state-updating monads into something which additionally maintains a write-ahead disk log to make the change durable. If your update function was called “modifyPersonName”, the call to mkMethod generates a datatype/constructor called ModifyPersonName which, when used like “update ModifyPersonName newName” has the richer durable behaviour.

I have lots of different entities, and they all have lots of different attributes. It quickly gets boring writing seperate “modifyEntityX” functions for each attribute. Haskell’s rather lousy record syntax doesn’t help out much either.

Fortunately, there’s a nice library called data-accessor which provides a more pleasant way to handle haskell record types. The idea is that you build up a getter/setter pair for each record member. These are first class values, and are consequently much more flexible than the builtin haskell record update syntax.

This seemed to be the answer to my problem – I can make a generic “modifyPersonAttribute” function which takes one of these accessors as an argument in order to select the field to update.

Unfortunately, this doesn’t work. I get a type error effectively stating that happs-state requires that all of the arguments to update/query function must themselves be Serializable.

This confused me. I can see that the application state type (and all of its constituent subparts) need to be serializable. But I was surprised that all the arguments for state-updating functions needed to be Serializable.

Then I realized what my false assumption was. I had assumed that happs was persisting the result of running the update operation to the logfile, similar to what mysql does for redo logs. In other words, I thought the logfile consisted of things like “the new value for row 42 is ‘foo'”.

However, a quick look at the contents of the _local directory (where happs stores its state) shows that this isn’t the case. Happs stores a description of the computation itself – ie. the name of the update operation and the (serialized) arguments it took.

This has got me somewhat stuck. Firstly, my generic ‘modifyPersonAttribute’ doesn’t work because the “accessor” values are not serializable. I’m now wondering if perhaps I can bypass data-accessors and instead write some template haskell to generate the happs machinery for all my entity types and all their attribute values.

But more importantly, this means that you need to be super-careful not to change the behavior of your state-modifying functions if there are any uncheckpointed changes in the logfile. Let’s say you have a createPerson function which takes a name and stores the name straight into the application state. But some days later, you decide that you want to make names have an initial capital letter before storing them. You change the code and restart the application – but unless you were careful to checkpoint the application state, the log will be replayed and you’ll end up with a different application state from before (some existing people will have the initial-caps logic applied to their name, not just new people).

Tags happs, haskell

General

Hacking with HAppS – what type, handler?

Post author By Andrew
Post date 1/11/2009

(These posts are prettified versions of my notes that I made whilst stumbling around HAppS for the first time. I hope they’ll be of use to future HAppS explorers!)

Note 1 .. In which we figure out what ServerPartT and WebT really are.

When we start HAppS running with ..

   simpleHTTP config list_of_handlers :: IO ()

… what type do the handlers have? Let’s explore some possible universes!

In a simple world, a handler could have type

    Request -> Maybe Response

This allows handlers to be selective about which Requests they handle, which is useful. But it also means that handlers must be pure functions so there’d be no way of maintaining any ‘application state’ between requests. Not very useful!

Let’s try to remedy that. If the handlers had type:

   Request -> State S Response

.. then we’d be able to retain and modify some application state (of type S) between handler invocations. But that’s all we’d be able to do. We still couldn’t write logs to disk, read file contents or talk to databases.

To allow handlers to do that, handlers would need to have type ..

  Request -> IO Response

(As an aside, note that whilst simpleHTTP has type “IO ()” it’s up to simpleHTTP whether or not it exposes the full power of the IO monad to the handlers. However, as we’ve just seen, not doing so would pretty much cripple the handlers)

So is this really what HAppS gives you? Let’s see. In the real world HAppS handlers have type “ServerPartT IO a” which is defined to be …

newtype ServerPartT m a = ServerPartT { unServerPartT :: Request -> WebT m a }

In other words, if you have a value of type “Request -> WebT m a”, you can tag it with the ServerPartT constructor to say “this ain’t just any function of that type, it’s part of a web server, dogammit!”.

But, apart from the tag, our handlers really just have type “Request -> WebT m a”. So what’s a WebT?

newtype WebT m a = WebT { unWebT :: m (Result a) }

So, ignoring the tag, it looks kinda like our handlers can produce any monadic computation that, when run, yields a “Result a” (the “Result a” type lets us say “I handled this request, here’s my answer” or “I didn’t handle this request”)

Hmm, there’s something not right there. If that was the case, it would be possible to use, for example, both “Result -> IO (Result a)” and “Result -> State S (Result a)” as handlers. But how would simpleHTTP ‘run’ such monadic computations? To run a State computation you use “evalState”. To run an IO computation, you have to return it from main. There’s no single way to run all possible monads.

Did I miss something? Yes, sort of – I just blindly assumed ‘m’ was a monad because of the way it was used. But there’s no “Monad m =>” constraints anywhere. And, in fact, if we go back to the start ..

simpleHTTP :: ToMessage a => Conf -> [ServerPartT IO a] -> IO ()

.. we can see that, right from the top, simpleHTTP only deals with ServerPartT’s and WebT’s where the ‘m’ type parameter is IO.

So, after all that, the handlers in HAppS handlers can be anything of type: Request -> IO (Response a). The ‘a’ type needs to be something that HApps can figure out a content-type and be able to make an HTTP message from (the ToMessage type class). The “IO (Response a)” part gets dressed up as a “WebT” (think: computation that produces the response). And the whole thing things gets dressed up as a “ServerPartT” (think: request handler).

What does each bit do? The IO part gives you free reign to do side-effecting computations, which get sequentially executed by the runtime. The Response part allows you to say ‘I handled this’ or ‘didn’t handle it’ and the ‘a’ part can be your XML or HTML data type (so long as its an instance of ToMessage).

So that’s it.

Man, that was a lot of complexity for such a simple end result. I don’t understand why there’s so much generality in the types of ServerT and WebT – I’d love to know!

Tags happs, haskell