Learning OCaml with Mirage OS
What's a Mirage OS?
Early this year I came across the Mirage OS Project. It started as a research project in 2009 and had a 1.0 release last December. It's an interesting approach to cloud computing that asks the question "what do you really need to run a web service?".
In a nutshell, it provides network and block device drivers for Xen virtual machines (think AWS and Rackspace VMs), a network stack, filesystem drivers and a simple HTTP server. If you stop and think about everything that's required from a virtual machine to serve a simple web service with ASP.Net or Rails, there's a huge amount of overhead, not to mention the huge number of places vulnerabilities could be found. The Mirage OS approach is to write as much as possible in a high-level language to avoid memory mistakes, and to remove anything that isn't required. Pieces you don't need aren't compiled in. For example, the VM this blog is hosted on right now doesn't even have block device drivers, and the total image size including content is around 6 MB.
This is a huge step beyond the sort of thing Docker is trying to achieve and it opens up some really interesting possibilities. In some tests the boot time for the VM to get to serving requests is shorter than a normal round-trip time, so automatically firing up tens or hundreds of these in response to a surge of traffic could be feasible. There are some great resources on openmirage.org considering Mirage from security, maintainability and performance perspectives.
An Introduction to OCaml
I only started learning OCaml so I could play with Mirage. It's a great choice for that project - the language features and compiler really lend themselves to what the project is trying to achieve.
OCaml's influence on Haskell is very obvious from the start, but I initially found it restricting syntax-wise compared to Haskell. For example, having to repeat let for multiple bindings and the fact that data constructors can't be used as function arguments (this is actually being added to F# in the next release). But I soon got used to OCaml's way of doing things and it wasn't long before I was completely comfortable.
OCaml has all the good stuff you'd want from a modern high-level language: higher-order functions, a good type system with inference, ADTs and pattern matching. It compiles to either its own byte-code or native code. It's missing Haskell-style type classes and doesn't have operator overloading (so adding floats is done with (+.)), but neither of those shortcomings has really been an issue in my experience. It also has a killer feature - functors.
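To make the operator point concrete, here is a minimal sketch of what the lack of overloading looks like in practice. The value names are just placeholders for illustration.

```ocaml
(* Integer and float arithmetic use distinct operators. *)
let int_sum = 1 + 2            (* int addition *)
let float_sum = 1.5 +. 2.25    (* float addition needs (+.) *)

(* Mixing the two requires an explicit conversion. *)
let mixed = float_of_int int_sum +. float_sum

let () = Printf.printf "%d %.2f %.2f\n" int_sum float_sum mixed
```

Writing 1.5 + 2.25 instead would be rejected by the type checker, which is noisy at first but means the types of arithmetic expressions are never ambiguous.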
Modules and Functors
The coolest feature of OCaml I've encountered (and the one that enables Mirage OS to be so pleasant to work with) is the module system.
All code exists within a module and modules can be "opened" or dot-accessed similar to namespaces in other languages. OCaml also has interfaces, which define what gets exported from a module. Nothing super exciting about them, except you can define a signature for a dependency and have it filled at compile time by any module that matches that signature.
This gives us a nice way of doing IoC (inversion of control) and providing mocks for testing purposes. Modules are considered eligible as long as they provide everything required in the signature (they don't have to explicitly implement the interface), and you can define modules in-line.
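A rough sketch of that idea follows. Everything in it (CLOCK, Greeter, Fixed_morning and so on) is a made-up name for illustration rather than anything from Mirage, and it takes the dependency using the functor syntax described in the next paragraph.

```ocaml
(* A signature describing the dependency we need. *)
module type CLOCK = sig
  val now : unit -> float   (* hour of the day, hypothetical *)
end

(* Code written against the signature, not a concrete module. *)
module Greeter (C : CLOCK) = struct
  let greeting () =
    if C.now () < 12.0 then "Good morning" else "Good afternoon"
end

(* A mock that never mentions CLOCK; it is accepted because it
   provides everything the signature asks for. Extras are ignored. *)
module Fixed_morning = struct
  let now () = 9.0
  let unused_extra = "not part of CLOCK"
end

module Test_greeter = Greeter (Fixed_morning)

(* The dependency can also be defined in-line at the point of use. *)
module Afternoon_greeter = Greeter (struct let now () = 15.0 end)

let () =
  print_endline (Test_greeter.greeting ());
  print_endline (Afternoon_greeter.greeting ())
```

Swapping the real clock for the fixed one is resolved by the compiler, so there is no container or reflection involved.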
Things get more interesting when you move up to "functors" (not the same as the Haskell typeclass). They're effectively functions over modules. This means you can parameterise a module or library over some other type. The standard example is using the Set module from the standard library:
module Int_set = Set.Make (struct
  type t = int
  let compare = compare
end);;
The Set.Make functor takes a module that specifies a type and a comparison function and returns a module with the set operations specialised on that type. The struct block is defining a module (in-line) that has the required components for Set.Make. This example doesn't appear to give you much more than generics in C#; however, because we can specify both the types and the operations on those types, we can be a lot more flexible. For example, we could define a Map module for strings and ints (think Dictionary<string, int>) that does a case-insensitive comparison on the keys.
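Here is one way that case-insensitive map could look. It is only a sketch: Ci_string, Ci_map and counts are names I've picked for illustration, and String.lowercase_ascii assumes OCaml 4.03 or later.

```ocaml
(* Keys are strings, compared without regard to case. *)
module Ci_string = struct
  type t = string
  let compare a b =
    String.compare (String.lowercase_ascii a) (String.lowercase_ascii b)
end

(* Map.Make specialises the map operations to our key module. *)
module Ci_map = Map.Make (Ci_string)

(* A string -> int map; "Apple" and "APPLE" are the same key,
   so the second add replaces the first binding. *)
let counts =
  Ci_map.empty
  |> Ci_map.add "Apple" 1
  |> Ci_map.add "APPLE" 2
  |> Ci_map.add "pear" 5

let () =
  Printf.printf "%d bindings, apple = %d\n"
    (Ci_map.cardinal counts)
    (Ci_map.find "apple" counts)
```

The comparison lives in the key module rather than being passed to each call, so every operation on Ci_map agrees on what "the same key" means.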
We'll see how functors are essential to the Mirage OS infrastructure in my next post.
Functors are missing from F#, but (as with type classes) I don't know how they would fit into the CLR object system or, even if they could, how much complexity that would add to the language. "functor" is a reserved word in F# though, so they are keeping their options open.
The other area I was really impressed with in OCaml was the OPAM package manager. It does everything you'd expect of a package manager: it installs and upgrades packages, handles dependency resolution and can keep multiple compiler versions around (not that I've needed that yet).
On top of that it has a handy feature around "pinning" libraries to versions or directories. This allows you to build your projects against custom local versions of a library through the standard package manager. It also has support for pointing at a git or mercurial repository. This all adds up to a really nice workflow for working with open source libraries.
In the course of building this blog I had pulled in ocaml-mustache (a mustache rendering library) to do my templating. Initially I installed it with opam install mustache and from there I could reference it in my project and the standard build tools would find it. Eventually I found out it was missing a feature I needed. I cloned the repo for the library, pinned the library to my local copy in OPAM and made my changes. From there I could rebuild my main project against it without altering anything (because it goes to OPAM for the library) and test my changes. After submitting a pull request I kept ocaml-mustache pointed at my local copy until my PR got merged, and then I could un-pin it and be back on the standard distribution. It was a really easy workflow.
Less Cool Features
The standard library in OCaml is a bit poor. The two string modules (String and Str) aren't great and there are lots of other areas where simple functions you'd expect are missing, like find : ('a -> bool) -> 'a list -> 'a option. This has been addressed by the community with standard library "overlays", most notably Jane Street Core and OCaml Batteries. I haven't used either (I wasn't sure of the implications of including them in a Mirage project) but they both add a bunch of useful functionality.
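Filling a gap like that by hand is not hard, just tedious. A minimal sketch of a find with the signature above, using pred as an arbitrary name for the predicate:

```ocaml
(* find : ('a -> bool) -> 'a list -> 'a option
   Returns the first element satisfying [pred], or None. *)
let rec find pred = function
  | [] -> None
  | x :: rest -> if pred x then Some x else find pred rest

let () =
  match find (fun x -> x > 2) [1; 2; 3; 4] with
  | Some x -> Printf.printf "found %d\n" x
  | None -> print_endline "nothing found"
```

Returning an option means a missing element has to be handled by the caller in a pattern match, rather than showing up later as an exception.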
The OCaml compiler errors can be tough; they even make the Haskell ones look nice. I built up a short cheat-sheet for myself of common syntax mistakes I was making because the errors weren't clear at all, especially with the complex syntax for defining module types and functors. I guess this is something that will improve as (if?) OCaml becomes more popular and more work is put into these areas, but for now it can be pretty intimidating to beginners:
Parse error: [fun_def_cont] expected (in [fun_def_cont])
- Means you wrote
    fun f x = x * 2
  not
    fun f x -> x * 2
Parse error: [fun_binding] expected (in [fun_binding])
- Means you wrote
    let f x -> something in
  not
    let f x = something in
Parse error: [type_kind] expected after `=` (in [opt_eq_ctyp])
- Means you missed a ` in a polymorphic variant definition, e.g.
    type blah = [`Something of string | NotTicked of int]
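For comparison, this is roughly what the last definition looks like once every tag has its backtick. The describe function is only there to show the tags being matched; it is not from the blog's code.

```ocaml
(* Every tag in a polymorphic variant needs a leading backtick. *)
type blah = [ `Something of string | `NotTicked of int ]

(* Pattern matching uses the same backticked tags. *)
let describe (b : blah) =
  match b with
  | `Something s -> "something: " ^ s
  | `NotTicked n -> "not ticked: " ^ string_of_int n

let () = print_endline (describe (`Something "hello"))
```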
The de-facto standard unit testing library in OCaml seems to be OUnit. It does what you ask of it, but without type classes I had some trouble writing re-usable test components like I would with Haskell's HUnit. I'm sure it's possible to do what I wanted, but the documentation is a bit lacking and I ended up just skipping tests, which is unfortunate because the modules lend themselves to unit testing really nicely.
It's a nice language if you can get past the standard library, which the OCaml community has done a good job of working around. The build tools are mature but occasionally lacking in documentation, and the package manager is first-rate. As a language it's made me appreciate the options and tradeoffs made in F# a lot more, and I suspect I'll continue to use both for different tasks.
I also want to point out that the community seems small but inclusive; the Mirage mailing list is especially welcoming.
In part-two of this series I'll talk about my experience using Mirage and OCaml to put together this blog.