Popular Haskell Packages: Q2 2010 report

Here is some data on downloads of Haskell libraries and apps on Hackage, for the first half of 2010.

The Hackage dependency graph

Hackage is the central repository of open source Haskell libraries and tools. Once they install the Haskell Platform, users get more libraries from Hackage, via “cabal install”.

Headlines

May was the most popular month for Hackage ever, breaking 150k downloads in a single month for the first time.

The 2000th Haskell package was released on April 16.

Total downloads on Hackage since 2006 have passed 2.4 million, with 780 thousand downloads in 2010 so far (double the total from the same time in 2009).

Totals

Total cabal packages: 2182. (+ 208 in Q2).

Total contributing developers: 575 (42 new developers in Q2)

90 day moving average: 12 packages per day uploaded.

Total downloads from Hackage 2007-present: 2.42 million

Average monthly downloads in 2010: 130 thousand.

Top of the Pops

The top 15 most popular libraries in the first half of 2010 were:

  1. HTTP
  2. parsec (+1)
  3. zlib (-1)
  4. binary (+1)
  5. network (+2)
  6. utf8-string (-2)
  7. Cabal (+1)
  8. QuickCheck (-2)
  9. mtl (+1)
  10. haskell-src-exts (-1)
  11. regex-base
  12. deepseq (+6)
  13. ghc-paths (+2)
  14. hslogger (+6)
  15. regex-posix (-2)

Top 15 most popular applications in the first half of 2010:

  1. cabal-install
  2. xmonad
  3. haddock (+1)
  4. cpphs (-1)
  5. happy
  6. darcs (+1)
  7. alex (+1)
  8. hscolour (-2)
  9. pandoc
  10. hlint
  11. leksah
  12. xmobar
  13. yi
  14. hint
  15. agda

Honorable Mentions

  • The Galois xml library was more popular in the first half of 2010 than HaXml, dethroning HaXml for the first time.
  • text has made it into the top 30 libraries
  • HDBC continues to be the most popular database library
  • vector has almost surpassed array in downloads (array is part of the Haskell Platform though)
  • wxHaskell is still more popular than gtk2hs on Hackage, though gtk2hs has almost caught up.

You can read all the 2010 data for your favorite packages, or see them ranked by 2010 popularity.

Top Libraries by Category

  • Networking: HTTP, network, network-bytestring, curl
  • Parsing: parsec, polyparse, attoparsec
  • Compression: zlib, zip-archive
  • Binary formats: binary, cereal
  • Text formats: utf8-string, text, dataenc
  • Markup: pandoc, xhtml, tagsoup, html
  • JSON: json
  • Atom/RSS: feed
  • XML: xml, HaXml, hexpat
  • Web services: happstack, snap
  • GUIs: wxHaskell, gtk2hs
  • Graphics: SDL, cairo, gd
  • Templates: HStringTemplate
  • Testing: QuickCheck, HUnit, testpack, hpc
  • Control: mtl, transformers, monads-fd
  • Languages: haskell-src-exts, haskell-src, HJavaScript
  • Regexes: regex-{base,posix,compat,tdfa}, pcre-light
  • Logging: hslogger
  • Generics: uniplate, syb-with-class, syb
  • 3D: OpenGL
  • Edit history: haskeline
  • Concurrency and parallelism: parallel, stm
  • Databases: HDBC
  • Arrays: array, vector, hmatrix
  • Hashing: pureMD5, SHA
  • Data structures: containers, fingertree, dlist
  • Science: statistics
  • Benchmarking: criterion
  • Storage: hs3

Is there anything else you see in the data?

Open Source Bridge Talk: Multicore Haskell Now

In June I had the opportunity to talk about approaches to parallel programming in Haskell at Open Source Bridge: “a new conference for developers working with open source technologies and for people interested in learning the open source way.”

Here are the slides (PDF), and the source that accompanied the tutorial.

The abstract for the session:

Haskell is a functional language built for parallel and concurrent programming. You can take an off-the-shelf copy of GHC and write high performance parallel programs right now. This tutorial will teach you how to exploit parallelism through Haskell on your commodity multicore machine, to make your code faster. We will introduce key parallel programming models, as implemented in Haskell, including:

  • semi-explicit parallelism via sparks
  • explicit parallelism via threads and shared memory
  • software transactional memory

and look at how to build faster programs using these abstractions. We will also look at the engineering considerations when writing parallel programs, and the tools Haskell provides for debugging and reasoning about parallel programs.

This is a hands on tutorial session: bring your laptops, there will be code!
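
As a taste of the second model in that list (explicit threads and shared memory), here is a minimal sketch that is not taken from the tutorial source. It uses only forkIO and MVar from base; the helper names chunksOf and parallelSum are made up for this illustration. To actually use several cores you would build with GHC's -threaded flag and run with +RTS -N.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Split a list into chunks of at most n elements (hypothetical helper).
chunksOf :: Int -> [a] -> [[a]]
chunksOf n xs
  | n <= 0    = [xs]
  | null xs   = []
  | otherwise = let (h, t) = splitAt n xs in h : chunksOf n t

-- Sum each chunk in its own lightweight thread, then combine the results.
parallelSum :: Int -> [Int] -> IO Int
parallelSum n xs = do
  boxes   <- mapM spawn (chunksOf n xs)
  results <- mapM takeMVar boxes      -- blocks until each worker has finished
  return (sum results)
  where
    spawn chunk = do
      box <- newEmptyMVar
      -- ($!) forces the sum inside the worker thread, not in the reader.
      _ <- forkIO (putMVar box $! sum chunk)
      return box

main :: IO ()
main = parallelSum 1000 [1 .. 100000] >>= print
```

Each MVar acts as a one-shot mailbox: the worker fills it, and the main thread blocks on takeMVar until the value arrives.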

There are a hell of a lot of Haskell libraries now. What are we going to do about it?

The Haskell community has reached a bit of a milestone: there are now more than 2000 open source libraries for Haskell on Hackage! However, with this also comes a problem: how do you work out which library to use? (Without learning one Haskell library a day for the next 6 years?) Which ones are robust, and supported, and which ones aren’t? This isn’t a new problem in open source: the Perl community has faced it with CPAN for a decade or more. Now Haskell is in the same situation.

In fact, it’s kind of startling to look back: in 2006, there were only a handful of open source Haskell libraries for developers to use in their projects (just HDBC, zlib, libxml, Crypto…). Today, there are 2121 (more by the time you read this) libraries for Haskell, available as source on http://hackage.haskell.org (only a “cabal install” away), and often 100s of Haskell libraries in binary form on your favorite distro. You can even follow the package flood on Twitter.

Here’s what the growth in available Haskell libraries over the last 4 years looks like:

We passed 1000 libraries in early 2009, and doubled that a year later.

So this is great for the Haskell dev community. In some areas, like database interfacing, we’ve gone from a single option (HDBC) to a full range, including new stuff like, uh, well, Cassandra, CouchDB, Amazon SimpleDB, MongoDB, Tokyo Cabinet, and pure Haskell libs like TCache, or safe, high level libs like HaskellDB.

We’re rapidly running into CPAN-like problems of just managing the weight of so much Haskell code. How do you know which one to use? Should you use, say, Galois’ xml library, or Lemmih’s xml library? Someone recently said “It is bewildering trying to figure out which ones are actively supported and which ones are zombie projects that stopped working years ago.”

So what are we doing about it?

There are four efforts underway to help Haskellers manage this work, and you can contribute!

  1. The Haskell Platform – an easy, one-click installer for the core system, including a blessed set of libraries, with a commercially friendly BSD license (like most of Hackage). At the moment, this means just these libraries, and we need developers to propose new additions to the blessed set.
  2. Google Summer of Code: Hackage 2.0 – we have Matt Gruen working this summer to finish the implementation of Hackage 2.0 – an improved Hackage that will allow for many new features to help sort out the wheat from the chaff in Haskell packages: build reports, wiki commenting, and social voting.
  3. Google Summer of Code: Cabal Test – we also have Thomas Tuegel working on “cabal test” – to allow automated testing and reporting of cabalized packages (and thus, all of Hackage). This is the second plank in solidifying the quality assurance story for Hackage.
  4. Regular regression testing of Hackage: having all that code is great – it means we can do regular regression testing of compilers and tools on a multi-million line Haskell codebase. For the 6.10 GHC release, for example, we were able to narrow breakages of all known open source Haskell to just 5% of Hackage, and post detailed instructions on how to address those changes. This gives us significant stability.

So, we have the Haskell Platform to make it simpler to install Haskell and get started with a good set of libraries (several hundred thousand downloads of the installers so far!), a better Hackage to help us rate and rank packages, regression testing against Hackage to keep things stable, and test reporting support to make quality assurance estimates easier.

What would you like to see changed in the Haskell library world? What libraries do you love? What do you hate? How do you find the packages you need?

And you don’t have to wait for others to solve this. Write tools to pick the best libs. Do your own quality ratings and share them. Write reviews of packages, and compare them, then let everyone know.

This is open source – it is up to you to help make things happen.

The 7 Haskell projects in the Google Summer of Code

Congratulations to the 7 successful applications to do Haskell projects for the Google Summer of Code 2010. The quality of proposals was extremely high this year, and we look forward to more such excellent proposals next year.

The students who will be working on projects for Haskell.org this summer are:

  1. Thomas Tuegel, Improvements to Cabal’s test support
  2. Matthew Gruen, Infrastructure for a more social Hackage 2.0
  3. Jasper Van der Jeugt, A high performance HTML generation library
  4. Alp Mestanogullari, Improvements to the GHC LLVM backend
  5. Marco Silva, Implementing the Immix Garbage Collection Algorithm
  6. Matthew Arsenault, GObject-Introspection based bindings for gtk2hs
  7. Alexey Levan, Improving Darcs’ network performance

Well done all! And you can cheer on and support these students this summer — keep track of their progress on the Haskell Reddit, check out their code, and help with feedback and support.

And thank you to Google for sponsoring Haskell.org projects for the 5th year!

Popular Haskell Packages: Q1 2010 report

Here is some data on downloads of Haskell libraries and apps on Hackage, for the first quarter of 2010.

The Hackage dependency graph

Hackage is the central repository of open source Haskell libraries and tools. Once they install the Haskell Platform, users get more libraries from Hackage, via “cabal install”.

Headlines

March was the most popular month for Hackage ever. And we’re closing in on 2000 packages, and 2 million “cabal installs” in the next month or so.

Totals

Total cabal packages: 1976. (+ 256 in Q1).

Total contributing developers: 533

90 day moving average: 11.5 packages per day uploaded (up from 10.5).

Total downloads from Hackage 2007-present: 1.88 million (up 350k in Q1)

Downloads in March 2010: 145,752 (new monthly record)

Top of the Pops

The top 15 most popular libraries in the first quarter were:

  1. HTTP
  2. zlib
  3. parsec
  4. utf8-string
  5. binary
  6. QuickCheck
  7. network
  8. Cabal
  9. haskell-src-exts
  10. mtl
  11. regex-base
  12. uniplate
  13. regex-posix
  14. X11
  15. ghc-paths

Top 15 most popular applications in Q1:

  1. cabal-install
  2. xmonad
  3. cpphs
  4. haddock
  5. happy
  6. hscolour
  7. darcs
  8. alex
  9. pandoc
  10. hlint
  11. leksah
  12. yi
  13. agda
  14. texmath
  15. gitit

Honorable Mentions

  • The deepseq library is in the top 20 packages of the year.
  • HaXml and HDBC remain the most popular XML and database libraries (though xml-light is closing in)
  • wxHaskell is rising up, as the only cabal-installable major GUI library
  • vector and text are quickly rising as the preferred array and Unicode libraries

You can read all the Q1 data for your favorite packages, or see them ranked by Q1 popularity.

And for non-Haskellers, how does your favourite open source community compare?

The 8 Most Important Haskell.org GSoC Projects

While at ZuriHac, Johan Tibell, David Anderson, Duncan Coutts and I discussed what the highest priority projects for the Haskell community are, in the context of the Google Summer of Code, for which Haskell.org is a mentoring organization for the 5th year.

Here are our top 8 most important projects, the ones we would really like to see good applications for. Some of these have tickets already, but some don’t. If you apply to work on projects like those below, you can expect strong support from the mentors, which ultimately determines if you’ll be funded.

For details on what we think you need to consider when applying to execute a project, see this earlier post.

A Package Versioning Policy Checker

Cabal relies on package version ranges to determine what Haskell software to install on your system. Version numbers are essentially “hashes” of the API of the package, and should be computed according to the package versioning policy. However, package authors don’t have a tool to automatically determine what the version number change to their package should be, when they release a new version, leading to mistakes, and needless dependency breakages.

This project would construct a tool that would be able to compute the correct package version number, given a package and an API change. As an extension, it would warn about errors in version ranges in .cabal files.
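
To make the idea concrete, here is a rough sketch of the version-bump rule such a tool would apply once it had classified an API change. It assumes the usual reading of the package versioning policy for a version A.B.C: a breaking change needs a new A.B, an addition needs a larger C, and anything else can bump a later component. The names ApiChange, component and nextVersion are invented for this illustration, and the hard part of the project (working out which kind of change happened) is not shown.

```haskell
-- Kinds of API change between two releases (hypothetical classification).
data ApiChange
  = Breaking      -- something was removed or changed incompatibly
  | Additive      -- only new exports were added
  | NoApiChange   -- the exported API is unchanged
  deriving (Eq, Show)

-- Look up a version component, treating missing components as 0.
component :: Int -> [Int] -> Int
component i version = case drop i version of
  (x : _) -> x
  []      -> 0

-- Compute the smallest version bump that is consistent with the change.
nextVersion :: ApiChange -> [Int] -> [Int]
nextVersion change version = case change of
  Breaking    -> [a, b + 1, 0]      -- new major version A.B
  Additive    -> [a, b, c + 1]      -- same A.B, larger C
  NoApiChange -> [a, b, c, d + 1]   -- same A.B.C, bump a later component
  where
    a = component 0 version
    b = component 1 version
    c = component 2 version
    d = component 3 version

main :: IO ()
main = mapM_ (print . flip nextVersion [1, 2, 3]) [Breaking, Additive, NoApiChange]
```

Versions are plain lists of Ints here to keep the sketch free of dependencies; a real tool would more likely use the version types that Cabal already provides.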

“cabal test”

Proper test support is essential for good software quality. By improving Cabal’s test support we can test all Cabal packages on continuous build machines, which should help us detect breakages earlier. Making it easier to run the tests means that more people will run them, and those who already do will run them more often.

Fast text/bytestring HTML combinators

We have Data.Binary for fast serialization of data structures to byte strings to be sent over the wire. High performance web servers need fast HTML generation too, and an approach based on Text.PrettyPrint combinators for filling unicode-friendly Data.Text buffers would be a killer app for web content generation in Haskell. This might mean working on BlazeHTML.
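
To give a flavour of the combinator style, here is a minimal sketch that uses plain String builders (ShowS from the Prelude) as a stand-in for real Text or ByteString buffers. The names Html, text, element and render are made up for this illustration; this is not the BlazeHTML API, and it makes no claim about performance.

```haskell
-- A fragment of HTML is a function that prepends its output to the
-- rest of the document, so fragments compose with (.) without copying.
type Html = ShowS

-- Escape the characters that are special in HTML text.
text :: String -> Html
text s rest = foldr escape rest s
  where
    escape '<' acc = "&lt;" ++ acc
    escape '>' acc = "&gt;" ++ acc
    escape '&' acc = "&amp;" ++ acc
    escape '"' acc = "&quot;" ++ acc
    escape c   acc = c : acc

-- Wrap a list of child fragments in an opening and closing tag.
element :: String -> [Html] -> Html
element tag children =
  showString "<" . showString tag . showString ">"
    . foldr (.) id children
    . showString "</" . showString tag . showString ">"

-- Run the builder to get the final document.
render :: Html -> String
render html = html ""

main :: IO ()
main = putStrLn (render (element "p" [text "1 < 2 & ", element "b" [text "bold"]]))
```

A high performance version would keep the same combinator interface but write into Text or byte buffers instead of building a String.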

ThreadScope with custom probes

ThreadScope is an amazing new tool in the Haskell universe for monitoring executing Haskell processes. It reveals detailed information about thread and GC performance. We’d like to extend the tool with support for new kinds of event hooks. Examples would be watching for MVar locks, STM contention, IO events, and more.

Combine ThreadScope with Heap Profiling Tools

ThreadScope lets us monitor thread execution. The Haskell Heap Profiler lets us monitor the Haskell heap live. HPC lets us monitor which code is executing and when. These should all be in an integrated tool for monitoring executing Haskell processes.

LLVM Performance Study

GHC has an LLVM backend. The next step is to look closely at the kind of code we’re generating for LLVM, and the optimizations LLVM performs on GHC’s code, in order to further improve performance of Haskell code.

LLVM Cross Compiler

LLVM has support for many new backends, such as ARM. The challenge is to use this ability to generate native code for other architectures to turn GHC into a cross-compiler (so we could produce, e.g. ARM executables on an x86/Linux box). This will involve linker and build system hacking.

Hackage 2.0 Web Services

Hackage is the central repository for Haskell code. It hosts around 2000 libraries, and is growing rapidly. It can be hard to determine which packages to use. We believe social mechanisms (comments, voting, …) can be very successful in helping to both improve the quality of Hackage, and make it easier for developers to know which library to use. This project would bring Hackage 2.0 to a deployable state, and then consider better interfaces to search and sort packages.

These are the 8 projects we felt were the most important to the community. What do you think? Are there other key projects that need to be done, that will benefit large parts of the community, or enable the use of Haskell in new areas of importance?

After 3 years, my xmonad configuration now uses GNOME

Nearly three years ago, Spencer Janssen and I started work on xmonad, a tiling window manager for Unix that would do what we want, automatically, so we could just concentrate on hacking code, without the window manager getting in the way. The project’s been quite successful: the most downloaded app on Hackage for the last couple of years, and thousands of users. It even has its own Twitter, blog, Reddit and Facebook accounts.

Originally I thought of this project as something of the anti-GNOME: small, lean, and every part does just one thing, but does it well, in the Unix tradition. And it has stayed lean: around two thousand lines of Haskell for the core system, but with the benefit of hundreds of extensions in the contributors’ library, where everyone’s config file is potentially a library module new users can import.

Over the years, GNOME and xmonad have started playing well together to the point that there’s relatively seamless interop between the two projects: you can use the full GNOME environment, and swap in xmonad as your window manager, or use a minimal environment with xmonad, adding in GNOME tools that help you.

Playing well with others is good for your open source software.

I’ve now finally switched my xmonad configuration to use a number of GNOME apps, to support the core dynamic tiling provided by xmonad. Here’s my config file:

import XMonad
import XMonad.Config.Gnome       -- provides gnomeConfig
import XMonad.Layout.NoBorders   -- provides smartBorders

main :: IO ()
main = xmonad
    gnomeConfig {
            terminal    = "term"  -- my urxvt wrapper
          , layoutHook  = smartBorders (layoutHook gnomeConfig)
    }

Yeah, that’s it. Import XMonad.Config.Gnome, add smart borders, and override the terminal to be my urxvt wrapper. xmonad is configured in Haskell, or languages Haskell can interoperate with.

My session is started up from .xinitrc as:

gitit &
gnome-panel &
gnome-power-manager &
dbus-launch --exit-with-session xmonad

I use gitit as my personal wiki, and then put a few things in the gnome-panel.

I’m really happy with how easy it now is to use xmonad with all the regular GNOME apps that people would like to see. This kind of friendliness to the dominant tools of the day is good for the project, and good for our users.
