In the next branch of the Haskell Platform we’ll be adding and removing packages from the specification for the first time. The Haskell Platform steering committee will make recommendations for additions and removals based on individual proposals to add and remove packages from the list.

It is hard to come up with “notability” criteria for why a package should be added or removed. People use the Haskell Platform for many competing reasons, and they need different packages.

The goal, though, should be an almost fully automated set of criteria for determining when a package should be added, based on objective data. Then, combined with strategic and other concerns, packages will be added or, sometimes, removed.

Possible Criteria for Notability

A quick list of possible criteria by which to evaluate whether a package is “blessed”:

  • How popular is the package in Hackage downloads?
  • How many packages depend on it?
  • Do any applications of note depend on it?
  • Does it meet a stated end-user need?
  • Do similar systems include such a library (e.g. Python)?
  • Is it portable?
  • Does it add additional C libraries?
  • Does it follow the package versioning system?
  • Is the code of good quality?
  • Does it have a good development history?
  • Is it on hackage?
  • Does it provide haddock documentation?
  • Does it come with examples?
  • Does it have a test suite?
  • Does it have a maintainer?
  • Does it in turn require new Haskell dependencies?
  • Does it have a simple/configure-based Cabal build?
  • Does it conflict/compete with existing functionality?
  • Does it reuse existing types?
  • Does it follow the hierarchical naming conventions?
  • Is it -Wall clean?
  • Does it have declared correctness or performance statements?
  • Is it BSD licensed?
  • Is it thread-safe?

A Point System

One way of determining notability for a package would be to use a points system against an agreed-upon set of such criteria.

Does anyone know of similar examples, or would like to code up some programs to experiment with these ratings?
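
As a starting point for such experiments, here is a minimal sketch of a points system in Haskell. Everything in it is an assumption made for illustration: the Package fields cover only a handful of the criteria listed above, and the weights are invented rather than agreed upon.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing, Down (..))

-- | Hypothetical facts about a candidate package. The fields mirror a
-- few of the criteria listed above and are illustrative only.
data Package = Package
  { pkgName       :: String
  , onHackage     :: Bool
  , hasHaddock    :: Bool
  , hasTestSuite  :: Bool
  , hasMaintainer :: Bool
  , bsdLicensed   :: Bool
  , reverseDeps   :: Int   -- number of packages that depend on it
  }

-- | A criterion awards some number of points to a package.
data Criterion = Criterion
  { critName :: String
  , award    :: Package -> Int
  }

-- | Award a fixed weight when a yes/no criterion holds, else nothing.
yesNo :: String -> Int -> (Package -> Bool) -> Criterion
yesNo label weight holds =
  Criterion label (\p -> if holds p then weight else 0)

-- | Assumed weights; an agreed-upon set would replace these.
criteria :: [Criterion]
criteria =
  [ yesNo "on Hackage"   1 onHackage
  , yesNo "haddock docs" 2 hasHaddock
  , yesNo "test suite"   2 hasTestSuite
  , yesNo "maintainer"   3 hasMaintainer
  , yesNo "BSD licensed" 1 bsdLicensed
    -- One point per ten reverse dependencies, capped at five points.
  , Criterion "reverse deps" (\p -> min 5 (reverseDeps p `div` 10))
  ]

-- | Total points for a package across all criteria.
score :: Package -> Int
score p = sum [award c p | c <- criteria]

-- | Package names paired with their scores, highest score first.
ranked :: [Package] -> [(String, Int)]
ranked ps =
  sortBy (comparing (Down . snd)) [(pkgName p, score p) | p <- ps]
```

Keeping the criteria in a plain list makes it easy to change the weights, or add a criterion, without touching the scoring code. Subjective criteria such as "Is the code of good quality?" would still need a human to supply the answer.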

Distro Page Rank

Another source of raw data may well be a sort of “Page Rank” across Unix distros for how often a package is used. On the Arch Linux distribution, we have three levels of support for Haskell. In the core system, some Haskell apps and tools are provided in binary form. The “community” binary repo holds yet more packages. Finally, the user-contributed repository holds around 1300 other packages (~90% of Hackage).

Does your distro have popularity statistics? Could you determine the top 100 Haskell packages by vote?

Most Popular Packages in Arch Linux

Some users install packages with the ‘yaourt’ tool, and some of those users opt in to voting when they install. Here are the top 100 packages in Arch Linux sorted by votes, with those already in the Haskell Platform marked YES in the HP column:

HP Repository Category Library/Program Version Votes Synopsis Notes
Extra darcs Decentralized replacement for CVS with roots in quantum mechanics
Extra haskell-extensible-exceptions Extensible exceptions darcs dep
Extra haskell-hashed-storage Hashed file storage support code. darcs dep
Extra haskell-haskeline A command-line interface for user input, written in Haskell. darcs dep
Extra haskell-mmap Memory mapped files for POSIX and Windows darcs dep
Extra haskell-terminfo Haskell bindings to the terminfo library. darcs dep
Extra haskell-utf8-string Support for reading and writing UTF8 Strings darcs dep
YES Extra ghc The Glasgow Haskell Compiler
Extra hugs98 Haskell 98 interpreter
YES Extra happy The Parser Generator for Haskell
YES Community alex a lexical analyser generator for Haskell
Community gtk2hs A GTK+2 binding for Haskell
YES Community haskell-http A library for client-side HTTP cabal dep
YES Community cabal-install The command-line interface for Cabal and Hackage.
Community haskell-x11 A Haskell binding to the X11 graphics library. xmonad dep
Community haskell-x11-xft Bindings to the Xft, X Free Type interface library, and some Xrender parts xmonad dep
YES Community haskell-zlib Compression and decompression in the gzip and zlib formats cabal dep
Community pandoc Haskell library and program to convert one markup format to another
Community xmonad A lightweight X11 tiled window manager written in Haskell
Community xmonad-contrib Add-ons for xmonad xmonad dep
lib haskell-binary 0.5.0.1-1 98 Binary serialisation for Haskell values using lazy ByteStrings
YES lib haskell-opengl 2.2.1.1-1 56 A binding for the OpenGL graphics system
lib haskell-hslogger 1.0.7-2 51 Versatile logging framework
lib haskell-puremd5 1.0.0.0-1 48 MD5 implementations that should become part of a ByteString Crypto package.
YES lib haskell-syb 0.1.0.0-1 48 Scrap Your Boilerplate
YES devel haddock 2.4.2-1 46 A documentation-generation tool for Haskell libraries
devel haskell-xft 0.2-2 46 Bindings to the Xft library, and some Xrender parts
lib haskell-ghc-paths 0.1.0.5-1 45 Knowledge of GHC’s installation directories
lib haskell-haxml 1.13.3-1 42 Utilities for manipulating XML documents
lib haskell-missingh 1.1.0-1 40 Large utility library
lib haskell-testpack 1.0.2-1 36 Test Utililty Pack for HUnit and QuickCheck
YES lib haskell-time 1.1.2.4-1 36 A time library
lib haskell-uniplate 1.2.0.3-1 36 Uniform type generic traversals.
lib haskell-diff 0.1.2-1 35 O(ND) diff algorithm in haskell.
YES lib haskell-mtl 1.1.0.2-1 35 Monad transformer library
YES lib haskell-regex-base 0.93.1-1 33 Replaces/Enhances Text.Regex
YES lib haskell-parsec 3.0.0-1 32 Monadic parser combinators
devel cpphs 1.7-1 31 A liberalised re-implementation of cpp, the C pre-processor.
lib haskell-curl 1.3.5-1 31 Haskell binding to libcurl
lib haskell-hinotify 0.2-1 31 Haskell binding to INotify
lib haskell-transformers 0.1.4.0-1 31 Concrete monad transformers
lib haskell-unix-compat 0.1.2.1-1 31 Portable POSIX-compatibility layer.
devel cabal2arch 0.5.3-1 30 Create Arch Linux packages from Cabal packages
lib haskell-fingertree 0.0.1.0-1 30 Generic finger-tree structure, with example instances
lib haskell-haskell-src-exts 1.0.1-1 30 Manipulating Haskell source: abstract syntax, lexer, parser, and pretty-printer
YES lib haskell-glut 2.1.1.2-1 29 A binding for the OpenGL Utility Toolkit
lib haskell-pcre-light 0.3.1-2 29 A small, efficient and portable regex library for Perl 5 compatible regular expressions
lib haskell-rosezipper 0.1-1 29 Generic zipper implementation for Data.Tree
devel hscolour 1.13-1 28 Colourise Haskell code.
lib haskell-data-accessor 0.2.0.2-1 26 Utilities for accessing and manipulating fields of records
lib haskell-data-accessor-template 0.2.1.1-1 26 Utilities for accessing and manipulating fields of records
lib haskell-regex-tdfa 1.1.2-2 26 Replaces/Enhances Text.Regex
lib haskell-xml 1.3.4-1 26 A simple XML library.
lib haskell-hsh 2.0.2-1 25 Library to mix shell scripting with Haskell programs
lib haskell-split 0.1.1-1 25 Combinator library for splitting lists.
lib haskell-utility-ht 0.0.5.1-1 25 Various small helper functions for Lists, Maybes, Tuples, Functions
lib haskell-vty 3.1.8.4-1 25 A simple terminal access library
lib haskell-syb-with-class 0.5.1-1 24 Scrap Your Boilerplate With Class
YES lib haskell-cgi 3001.1.7.1-1 23 A library for writing CGI programs
YES lib haskell-fgl 5.4.2.2-1 23 Martin Erwig’s Functional Graph Library
devel derive 0.1.4-1 22 A program and library to derive instances for data types
lib haskell-monads-fd 0.0.0.1-1 21 Monad classes, using functional dependencies
devel haskell-pandoc 1.2.1-1 21 Conversion between markup formats
lib haskell-safe 0.2-1 21 Library for safe (pattern match free) functions
lib haskell-zip-archive 0.1.1.3-1 21 Library for creating and modifying zip archives.
YES lib haskell-bytestring 0.9.1.4-1 20 Fast, packed, strict and lazy byte arrays with a list interface
lib haskell-configfile 1.0.4-2 20 Configuration file reading & writing
lib haskell-data-accessor-monads-fd 0.2-1 20 Use Accessor to access state in monads-fd State monad class
lib haskell-hstringtemplate 0.6-1 20 StringTemplate implementation in Haskell.
lib haskell-pointedlist 0.3.5-1 20 A zipper-like comonad which works as a list, tracking a position.
YES lib haskell-quickcheck 2.1.0.1-2 20 Automatic testing of Haskell programs
lib haskell-convertible 1.0.5-1 19 Typeclasses and instances for converting between types
lib haskell-digest 0.0.0.6-1 19 Various cryptographic hashes for bytestrings; CRC32 and Adler32 for now.
lib haskell-hdbc 2.1.1-1 19 Haskell Database Connectivity
network twidge 0.99.3-1 19 Unix Command-Line Twitter and Identica Client
lib haskell-hspread 0.3.3-1 18 A client library for the spread toolkit
lib haskell-readline 1.0.1.0-1 17 An interface to the GNU readline library
lib haskell-strict 0.3.2-2 17 Strict data types and String IO.
lib haskell-happs-util 0.9.3-1 16 Web framework
devel hoogle 4.0.7-1 16 Haskell API Search
editors yi 0.6.1-1 16 The Haskell-Scriptable Editor
lib haskell-findbin 0.0.2-1 15 Locate directory of original program
lib haskell-glfw 0.3-1 15 A binding for GLFW, An OpenGL Framework
lib haskell-json 0.4.3-1 15 Support for serialising Haskell to and from JSON
YES lib haskell-network 2.2.1.4-1 15 Networking-related facilities
lib haskell-stream 0.3.2-1 15 A library for manipulating infinite lists.
lib haskell-tagsoup 0.6-2 15 Parsing and extracting information from (possibly malformed) HTML documents
YES lib haskell-editline 0.2.1.0-2 14 Bindings to the editline library (libedit).
lib haskell-sdl 0.5.5-1 14 Binding to libSDL
editors leksah 0.6.1-1 14 Haskell IDE written in Haskell
devel c2hs 0.16.0-1 13 C->Haskell FFI tool that gives some cross-language type safety
lib haskell-hsx 0.5.6-1 13 HSX (Haskell Source with XML) allows literal XML syntax to be used in Haskell source code.
devel hlint 1.6.4-1 13 Source code suggestions
lib haskell-crypto 4.2.0-1 12 Collects together existing Haskell cryptographic functions into a package
lib haskell-hdbc-sqlite3 2.1.0.2-1 12 Sqlite v3 driver for HDBC
lib haskell-highlighting-kate 0.2.4-1 12 Syntax highlighting
lib haskell-hjavascript 0.4.4-1 12 HJavaScript is an abstract syntax for a typed subset of JavaScript.
lib haskell-hjscript 0.4.4-1 12 HJScript is a Haskell EDSL for writing JavaScript programs.
devel mkcabal 0.4.2-2 12 Generate cabal files for a Haskell project
lib haskell-arrows 0.4.1.1-1 11 Arrow classes and transformers
lib haskell-filemanip 0.3.2-1 11 Expressive file and directory manipulation for Haskell.
lib haskell-happs-data 0.9.3-1 11 HAppS data manipulation libraries
lib haskell-happs-ixset 0.9.3-1 11
lib haskell-happs-state 0.9.3-1 11 Event-based distributed state.
lib haskell-harp 0.4-1 11 HaRP allows pattern-matching with regular expressions
lib haskell-lazysmallcheck 0.3-2 11 A library for demand-driven testing of Haskell programs
lib haskell-typecompose 0.6.4-1 11 Type composition classes & instances
lib haskell-dataenc 0.13.0.0-1 10 Data encoding library
lib haskell-happstack-util 0.3.2-1 10 Web framework
lib haskell-hxt 8.3.1-1 10 A collection of tools for processing XML with Haskell.
lib haskell-maybet 0.1.2-1 10 MaybeT monad transformer
lib haskell-platform 2009.2.0.2-1 10 The Haskell Platform
office pdf2line 0.0.1-1 10 Simple command-line utility to convert PDF into text
lib haskell-category-extras 0.53.5-1 9 Various modules and constructs inspired by category theory
lib haskell-colour 2.2.1-1 9 A model for human colour/color perception
lib haskell-datetime 0.1-1 9 Utilities to make Data.Time.* easier to use.
lib haskell-happs-server 0.9.3-1 9 Web related tools and services.

Now, one of the other constraints on the Haskell Platform is sustainable growth. We can’t add 1000 packages tomorrow and hope to maintain quality. Instead, something like 10-20% growth per release cycle seems plausible. This would mean adding 4 to 9 new packages.

If we were to judge only on download popularity, our first 5 new packages would be:

These rank highly merely because one killer app, darcs, depends on them, and so they are widely built (they may also fail to satisfy many of the other criteria noted above).

If we ignore those packages popular for being dependencies, we get a different top 5:

Now we’re getting there. pandoc is both a library and a popular app, so we might treat it specially. gtk2hs is very popular, but not cabalised, so we might also set that aside, leaving (and I’ll ignore ghc-paths as it is used by ghc):

Which is starting to look like a plausible list. In turn however, you can find fault with all these packages in various dimensions (utf8-string may be obsoleted by Data.Text, haxml is LGPL licensed).

Coming up with an obvious list is non-trivial!
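
The filtering described above (skip what is already in the platform, set aside packages for other reasons, then rank the rest by votes) can be sketched in a few lines of Haskell. The Entry type and the topCandidates function are hypothetical names introduced here. The sample rows are copied from the Arch Linux table, and only rows with a vote count can be ranked this way.

```haskell
import Data.List (sortOn)
import Data.Ord (Down (..))

-- | One row of the vote table, reduced to the fields used here.
data Entry = Entry
  { name       :: String
  , votes      :: Int
  , inPlatform :: Bool   -- True for rows marked YES
  }

-- | A few rows copied from the Arch Linux table above.
sample :: [Entry]
sample =
  [ Entry "haskell-binary"    98 False
  , Entry "haskell-opengl"    56 True
  , Entry "haskell-hslogger"  51 False
  , Entry "haskell-puremd5"   48 False
  , Entry "haskell-syb"       48 True
  , Entry "haskell-ghc-paths" 45 False
  , Entry "haskell-haxml"     42 False
  ]

-- | Top n new candidates by votes, skipping anything already in the
-- platform and anything on a caller-supplied exclusion list.
topCandidates :: Int -> [String] -> [Entry] -> [String]
topCandidates n excluded es =
  take n
    [ name e
    | e <- sortOn (Down . votes) es
    , not (inPlatform e)
    , name e `notElem` excluded
    ]
```

With ghc-paths excluded, topCandidates 3 ["haskell-ghc-paths"] sample evaluates to haskell-binary, haskell-hslogger and haskell-puremd5. The exclusion list is where the judgment calls live: which packages are popular only as dependencies, or are set aside for licensing or build reasons, still has to be decided by hand.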

Finally, this is clearly only one very small data set, which should only have a small influence. If we step over and look at the Hackage download statistics, sorted by popularity, our top 5 new packages would be:

Popularity by Category

If instead we thought that having a comprehensive library set was the key goal, we may choose to include libraries via category, no matter how popular in the global list. This would yield, according to Hackage,

For example.

What Is The Decision Model?

So how do we decide what goes in? One model would be:

  1. Have people propose packages
  2. Sort them by category need
  3. Identify the top rank package in each category using a points system or page rank
  4. Add or remove packages based on this?
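
The steps above can be sketched as a small Haskell pipeline. This is one possible reading of the model rather than a settled design: the Proposal type, its rankScore field and both function names are assumptions, and the score could come from a points system, from distro votes, or from anything else that yields a number.

```haskell
import qualified Data.Map.Strict as Map

-- | Step 1: a proposed package, tagged with a category and a score.
data Proposal = Proposal
  { propName  :: String
  , category  :: String
  , rankScore :: Int   -- from a points system or a page-rank style measure
  } deriving (Eq, Show)

-- | Steps 2 and 3: bucket proposals by category and keep only the
-- highest-scoring proposal in each bucket. On a tie, the proposal
-- seen first wins.
bestPerCategory :: [Proposal] -> Map.Map String Proposal
bestPerCategory = Map.fromListWith better . map (\p -> (category p, p))
  where
    better new old = if rankScore new > rankScore old then new else old

-- | Step 4: the additions are the category winners that are not
-- already in the platform, listed in category order.
additions :: [String] -> [Proposal] -> [String]
additions current ps =
  [ propName p
  | p <- Map.elems (bestPerCategory ps)
  , propName p `notElem` current
  ]
```

Picking one winner per category favours breadth over raw popularity, so a modestly popular package in an uncovered category beats the second most popular package in a crowded one. Removals would need a separate rule, which this sketch does not attempt.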

What do you think? What is a good way to decide when a package is sufficiently notable to add to the Haskell Platform?

What criteria would you use to determine when a package is blessed?
