Control.Monad.Inception

I saw Inception a few months ago. At the time I was struck by how coherent the computational environment describe in the movie was. I suspect “Inception” forms a monadic environment for computation (though not necessarily a useful one), and have been meaning to try to implement it. However, time is short, so here are my notes from the time (August 23rd) on the semantics of Inception, in the hope that someone else will implement Control.Monad.Inception.

Semantics of the Inception computational environment

  • “Dream” == thread of control == forkIO
  • A thread must stay at each level (thread group) for it not to collapse (dreamer)
  • Messages can be sent down (music) – e.g. async message passing
  • A kick exits a thread up one level (“die”) – throwTo KillThread
  • Time degrades quickly 10, 3 mins , 10 mins , 10 years – so adjust scheduler time slices based on levels
  • Groups of threads brought together by the dreamer can enter a new environment together (collaborative dreaming) – shared heap
  • Arbitrary effects are allowed, (yep, it has IO at the bottom)
  • Under sedation, you can disassociate a thread from its control stack, and it is in limbo (some “init” thread group), until it remembers.
  • Tokens are (unique) depth counters (0, many).

Your challenge:

  • Improve Claessen’s “poor man’s concurrency monad” to support the Inception environment.
  • What notion of `bind` and `return` are used?
  • Show that the monad laws are satisfied.
  • import Control.Monad.Inception and win!

Refer to the published analysis on the use of the inception monad you need to be able to support.

Hackage, Cabal and the Haskell Platform: The Second Year

The Haskell Implementors Workshop was held in Baltimore, Oct 1, 2010. Duncan Coutts from Well-Typed and I presented a status report on the Haskell distribution infrastructure:

You can read the slides as PDF here, or online:

The amount of freely available Haskell code has grown exponentially in the past
two years as Hackage and Cabal have come online. Managing millions of Haskell
code, partitioned thousands of interdependent packages is a serious engineering
challenge that has received little attention from the language research community.
Meanwhile, new adopters of Haskell struggle to deal with the sheer number of
libraries and tools now available.

One pragmatic approach to managing this web is the Haskell Platform (HP), a
project to build a blessed, comprehensive set of libraries meeting objective
quality control criteria, and in doing so make expert recommendations on which
packages to use. In its first six weeks of operation the HP had over forty
thousand downloads.

The challenge with such a project is to manage the many conflict constraints
for diversity, coverage, and quality when assembling the package set. This talk
will outline the state of the Haskell Platform, the technical approaches taken
to build it, and the roadmap ahead.

Practical Haskell: scripting with types

I had the pleasure to give a new talk today, on design in functional programming — types, abstractions and monads — using the motivating example of scripting. The slides are below and a PDF version is available.

Shell scripts are often a quick, dirty way to get the job done. You glue
together external tools, maybe do a little error checking and process all data
as strings. This is great for some very simple problems but as requirements
change and more is demanded from the code shell scripts become unwieldy and
fragile. When they get large, they become slow and difficult to maintain. If
you need to write robust code then shell is not the way to go.

In this talk at an alternative: how to use Haskell as a type checked and
natively compiled language for scripting tasks. By refining the semantics of
the problem domain, employing abstraction, we produce shorter and more robust
code, that is more maintainable and scalable.

Haskell Platform 2010.2.0.0 is live!

We’re pleased to announce the fifth release of the Haskell Platform: a single, standard Haskell distribution for everyone.

The specification, along with installers (including Windows, Apple and
Unix installers for a full Haskell environment) are available.

The Haskell Platform is a single, standard Haskell distribution for every system, in the form of a blessed library and tool suite for Haskell distilled from the thousands of libraries on Hackage, along with installers for a wide variety of systems. It saves developers work picking and choosing the best Haskell libraries and tools to use for a task.

When you install the Haskell Platform, you get the latest stable compiler, an expanded set of core libraries, additional development tools, and cabal-install – so you can download anything else you need from Hackage.

What you get is specified here.

— The Platform Infrastructure Team

Engineering Large Projects in a Functional Language

I had the opportunity to speak at DevNation on July 10th in Portland, and gave the following talk, an updated version of Galois’ collective experiences developing Haskell projects over the past decade. You can download the .pdf.

Abstract

Galois has been building software systems in Haskell for the past decade. This talk describes some of what we’ve learned about in-the-large, commercial Haskell programming in that time. I’ll look at when and where we use Haskell. At correctness, productivity, scalabilty, maintainability, and what language features we like: types, purity, types, abstractions, types, concurrency, types!

We’ll also look at the Haskell toolchain: FFI, HPC, Cabal, compiler, libraries, build systems, etc, and being a commercial entity in a largely open source community.

ghc-gc-tune: Tuning Haskell GC settings for fun and profit

Inspired by a comment by Simon Marlow on Stack Overflow, about the time and space tradeoffs we make with garbage collection, particularly with a generational GCs, I wrote a small program, ghc-gc-tune, to traverse the garbage collector variable space, to see the relationship between settings and program performance. Given a program, it will show you an (optionally interactive) graph of how -A and -H flags to the garbage collector affect performance.

Previously I’ve had good success exploring multi-variable spaces for optimizations with GAs in Haskell, to find strictness flags and LLVM flag settings, so I was keen to see what the GC space looked like. In this initial GC search, however, I don’t use a GA, instead just measuring time as two variables change over the entire space.

Here’s an example for the binary-trees language shootout benchmark, where the GHC default settings are known to be suboptimal (the benchmark disallows changes to the default runtime GC settings):

Running time of the binary-trees benchmark as -A and -H vary

The flags we use are:

  • -A, the size of the initial thread allocation area for the youngest generation.
  • -H, the suggested overall heap size

ghc-gc-tune, in the style of ghc-core, wraps a compiled Haskell program, and runs it with varying values of -A and -H, recording various statistics about the program. The output can be rendered interactively, or to png, pdf or svg. It would augment use of heap profiling, ThreadScope and ghc-core for analyzing and improving Haskell program behavior.

In this case, ghc-gc-tune recommends the somewhat surprising -A64k -H32M, and binary-trees runs in 1.12s at N=16, while for the default GC settings it completes in 1.56s. So ghc-gc-tune found settings that improved performance by 28%.  Nice.

I already knew that a large -A setting helped this program (corresponding to the broad plateau for large -A values in the above graph), however, I was surprised to see the best result was with a very small -A setting, and medium sized -H setting, resulting in only 5% of time spent in GC, and 36M total allocated — the narrow valley on the far side of the graph. Very interesting! And is that my L2 cache in the square at x= 2M, y = 2M? Sure looks like it.

Here’s a video of the same graph in the tool’s interactive mode (without any -t flag):

Currently, the sampling is vary simplistic, with a fixed set of logscale values taken. A clever sampling algorithm would measure the heap used in the default case, and compute a range based on that, possibly with cutoffs for very pessimistic GC flags.

Another example: pidigits, with what I would consider far more typical behavior. Though again, a surprisingly small -A setting does well, and there’s an interesting pathological result with extremely large -H and very small -A settings.

PiDigiits GC space

You can get ghc-gc-tune from Hackage, via cabal, and note that it requires gnuplot installed. Let me know if you find it useful, and I welcome patches!

Future work will be to graph the Z axis as space, instead of time (so we can find GC settings that minimize the footprint), as well as adding other variables (such as parallel GC settings, and varying the number of generations).

Popular Haskell Packages: Q2 2010 report

Here is some data on downloads of Haskell libraries and apps on Hackage, for the first half of 2010.

The Hackage dependency graph

Hackage is the central repository of open source Haskell libraries and tools. Once they install the Haskell Platform, users get more libraries from Hackage, via “cabal install”.

Headlines

May was the most popular month for Hackage ever, breaking 150k downloads in a single month for the first time.

The 2000th Haskell package was released on April 16.

Total downloads on Hackage since 2006 have passed 2.4 million, with 780 thousand downloads in 2010 so far (double the total from the same time in 2009).

Totals

Total cabal packages: 2182. (+ 208 in Q2).

Total contributing developers: 575 (42 new developers in Q2)

90 day moving average: 12 packages per day uploaded.

Total downloads from Hackage 2007-present: 2.42 million

Average monthly downloads in 2010: 130 thousand.

Top of the Pops

The top 15 most popular libraries in the first half of 2010 were:

  1. HTTP
  2. parsec (+1)
  3. zlib (-1)
  4. binary (+1)
  5. network (+2)
  6. utf8-string (-2)
  7. Cabal (+1)
  8. QuickCheck (-2)
  9. mtl (+1)
  10. haskell-src-exts (-1)
  11. regex-base
  12. deepseq (+6)
  13. ghc-paths (+2)
  14. hslogger (+6)
  15. regex-posix (-2)

Top 15 most popular applications in the first half of 2010:

  1. cabal-install
  2. xmonad
  3. haddock (+1)
  4. cpphs (-1)
  5. happy
  6. darcs (+1)
  7. alex (+1)
  8. hscolour (-2)
  9. pandoc
  10. hlint
  11. leksah
  12. xmobar
  13. yi
  14. hint
  15. agda

Honorable Mentions

  • The Galois xml library was more popular in the first half of 2010 than HaXml, dethroning HaXml for the first time.
  • text has made it into the top 30 libraries
  • HDBC continues to be the most popular database library
  • vector has almost surpassed array in downloads (array is part of the Haskell Platform though)
  • wxHaskell is still more popular than gtk2hs on Hackage,  though gtk2hs has almost caught up.

You can read all the 2010 data for your favorite packages, and ranked by 2010 popularity.

Top Libraries by Category

  • Networking: HTTP, network, network-bytestring, curl
  • Parsing: parsec, polyparse, attoparsec
  • Compression: zlib, zip-archive
  • Binary formats: binary, cereal
  • Text formats: utf8-string, text, dataenc
  • Markup: pandoc, xhtml, tagsoup, html
  • JSON: json
  • Atom/RSS: feed
  • XML: xml, HaXml, hexpat
  • Web services:  happstack, snap
  • GUIs: wxHaskell, gtk2hs
  • Graphics: SDL, cairo, gd
  • Templates: HStringTemplate
  • Testing: QuickCheck, HUnit, testpack, hpc
  • Control: mtl, transformers, monads-fd
  • Languages: haskell-src-exts, haskell-src, HJavaScript
  • Regexes: regex-{base,posix,compat,tdfa}, pcre-light
  • Logging: hslogger
  • Generics: uniplate, syb-with-class, syb
  • 3D: OpenGL
  • Edit history: haskeline
  • Concurrency and parallelism: parallel, stm
  • Databases: HDBC
  • Arrays: array, vector, hmatrix
  • Hashing: pureMD5, SHA
  • Data structures: containers, fingertree, dlist
  • Science:  statistics
  • Benchmarking: criterion
  • Storage: hs3

Is there anything else you see in the data?

Follow

Get every new post delivered to your Inbox.

Join 68 other followers