Author: Benedikt Ahrens (Université Nice Sophia-Antipolis)
Reference: arXiv preprint arXiv:1212.5668v1 [math.LO], 2012.
Comments: First, let's discuss the status of this paper.
The arXiv is an internet archive for scientific 'preprints' (draft papers not yet published in any peer-reviewed outlet). As such, this paper is probably currently undergoing peer review for some journal, so some care should be taken with the results, as they have not yet withstood that process. The point of the arXiv is to allow results to be disseminated, and claims of priority to be established, more rapidly than the often ponderous formal publication process allows. This paper will undoubtedly be changed in some way in response to reviewers' comments before it sees the 'official' light of day; it is an example of the
'grey literature' that Google Scholar tends to include, which I mentioned in my
mission statement post.
There are, however, two reasons to trust the results of this particular paper. First, they appeared in the author's
PhD thesis, which has been examined and passed. Second, this paper is an extended version of the author's 2012 conference paper
Initiality for Typed Syntax and Semantics, which was peer-reviewed before acceptance. This model, in which a short paper is published relatively quickly at a conference and a longer paper covering similar territory in more depth follows more slowly in a traditional journal, is very common in computer science (and much less so in other disciplines).
Considering the paper's contents, this is an application to computer science of an approach to mathematics called
category theory. Categories provide a way of looking at mathematics in which, instead of taking mathematical 'objects' as the primary focus, as a conventional mathematician might, we focus on the interconnections
between objects. Stated at that level of abstraction, it might seem vaguely Eastern and mystical, but it's a completely formal way of building up mathematics that is exactly as powerful as set-theoretic foundations. The advantage of this different angle on things is a bit like learning to speak a new language - different concepts tend to come to light. Category theory has been particularly widely applied in computer science, perhaps because computer scientists are quite comfortable thinking about dynamic transformations between objects, rather than the objects' static descriptions, as the most interesting thing.
The category theoretic concept that gets the most work in this paper is that of a
monad. The definition of a monad is, like many things in category theory, relatively simple but so abstract that its utility is far from obvious. If we view a category as an area of mathematics (so we have a category for set theory, a category for group theory, or, if we go down the rabbit hole, even a category for category theory itself), then a monad can be thought of as a construction that allows you to do
algebra within that area of mathematics. (In fact, monads have made a slightly surprising escape from the world of pure theory to become
a useful tool in functional programming, but that is orthogonal to the interests of this paper.)
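Orthogonal or not, a small Haskell sketch of my own (nothing like it appears in the paper) makes the 'doing algebra' intuition concrete in the flavour that matters here: the functor sends a type of variables to the terms built over those variables, return regards a variable as a term, and bind is simultaneous substitution.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- A toy expression language over a type v of variables.
data Term v
  = Var v
  | Lit Int
  | Add (Term v) (Term v)
  deriving (Show, Functor)

instance Applicative Term where
  pure      = Var
  tf <*> tx = tf >>= \f -> fmap f tx

-- The monad structure is precisely "syntax with substitution":
-- return embeds a variable as a term, and (>>=) replaces every
-- variable in a term by another term.
instance Monad Term where
  Var v   >>= s = s v
  Lit n   >>= _ = Lit n
  Add a b >>= s = Add (a >>= s) (b >>= s)

-- Example: substitute (1 + 2) for the only variable, x, in (x + x).
example :: Term String
example = Add (Var "x") (Var "x") >>= \_ -> Add (Lit 1) (Lit 2)
```

The monad laws then say exactly the obvious things about substitution: substituting each variable for itself changes nothing, and performing two substitutions in sequence is the same as performing their composite.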
In fact, this paper doesn't use monads in the conventional sense, but rather the
relative monads introduced by the 2010 paper
Monads Need Not Be Endofunctors by Altenkirch et al. There, the monad is not defined on a category, but on a translation between (possibly different) categories (such a translation is called a
functor). This generalisation collapses to the conventional definition if we set this functor to be the identity on some category. The impact of this generalisation when we use our monads to do algebra is that our
variables may live in a different mathematical universe from our
terms. In this paper, the variables reside in the category of sets, while the terms reside in the category of preorders (sets equipped with a reflexive and transitive relation).
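Again purely as an illustration of the shape of the definition (with names of my own choosing, not the paper's), a relative monad can be rendered in Haskell as a monad-like interface whose unit and bind are typed along a functor j. Haskell can't see the preorder structure Ahrens needs, so this conveys only the interface, not his categories.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}

import Data.Functor.Identity (Identity (..))

-- A relative monad on a functor j: unit and Kleisli extension go "along" j.
class RelMonad j t where
  ret  :: j a -> t a                 -- a j-image of variables becomes terms
  bind :: t a -> (j a -> t b) -> t b -- Kleisli extension, i.e. substitution

-- Taking j to be the identity functor recovers an ordinary monad;
-- here the familiar list monad, just to show the collapse.
instance RelMonad Identity [] where
  ret (Identity x) = [x]
  bind xs f        = concatMap (f . Identity) xs
```

In the paper, j is instead a functor from the category of sets to the category of preorders, which is exactly what lets variables and terms live in different universes.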
Why pull this trick? The clue is in the 'reduction rules' of the paper's title. Ahrens looks at systems like the λ-calculus where some terms can be reduced, or simplified, to other terms; the reduction relation that holds across all the terms is a preorder. This motivates the setting of terms within the category of preorders. So why not work entirely within this category, using normal monads? This would mean that the variables would also live in the category of preorders, so we might be able to reduce some variables to others. But this is not the case for most applications - variables usually form a bare set, with no meaningful reduction order on them. Therefore Ahrens keeps his variables in the category of sets, and injects them when needed into the category of preorders via the trivial embedding functor, here called Δ. Monads for systems like the λ-calculus can then be defined on the functor Δ, and the 'algebraic' view that arises (via
initiality; I'm afraid I'll have to beg off going into the details of that here) gives rise to mathematical descriptions of syntax and reduction semantics, and to a notion of reduction-preserving translation, which is here used to map the programming language
PCF into the simply typed λ-calculus.
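To give a flavour of how the λ-calculus fits this mould, here is another sketch of my own (not the paper's development, and with the preorder left implicit, since Haskell types don't carry one): untyped λ-terms over a type of variables, with substitution playing the role of the relative monad's bind and β-reduction generating the preorder on terms.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Untyped lambda terms over a variable type v, with the bound variable
-- of an abstraction represented by Nothing (a de Bruijn-style encoding).
data Lam v = V v | App (Lam v) (Lam v) | Abs (Lam (Maybe v))
  deriving Functor

-- Capture-avoiding simultaneous substitution: the bind of the
-- (relative) monad structure, with V playing the role of return.
subst :: Lam v -> (v -> Lam w) -> Lam w
subst (V x)     s = s x
subst (App t u) s = App (subst t s) (subst u s)
subst (Abs b)   s = Abs (subst b s')
  where s' Nothing  = V Nothing         -- the bound variable is untouched
        s' (Just x) = fmap Just (s x)   -- weaken substituted terms past the binder

-- Beta reduction of a top-level redex only (congruence rules omitted);
-- its reflexive-transitive closure is the preorder that makes the terms
-- an object of the category of preorders rather than of sets.
beta :: Lam v -> [Lam v]
beta (App (Abs b) u) = [subst b (maybe u V)]
beta _               = []
```

The variables themselves, by contrast, carry no reduction structure at all, which is the informal content of keeping them on the 'sets' side of Δ.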
The key word in the mapping just described is 'typed'. In fact, much of the development discussed above can be found in a
2011 paper by the same author, and the specific advance of this paper is to detail how (simple) types are incorporated into this view (types are essentially annotations on terms that restrict how they can be used, for example by preventing you from applying an arithmetical operation to a string; most programming languages have them). Technically, it means shifting from categories of sets and preorders to categories of typed families of sets and preorders. This shift apparently introduces only "a minor complication" to the major theorem of the paper (Theorem 3.32), but it is a necessary one to talk sensibly about typed languages like PCF.
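Roughly, and again only as an illustrative sketch of my own (binders are omitted for brevity), 'typed families' means that the single set of terms is replaced by a family of sets indexed by the object-language types, and substitution must respect that indexing:

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures, TypeOperators, RankNTypes #-}

import Data.Kind (Type)

-- Object-language types (a PCF-like fragment).
data Ty = TNat | Ty :-> Ty

-- A typed family of variables is a type constructor v :: Ty -> Type;
-- the terms over it again form a typed family.
data Tm (v :: Ty -> Type) (t :: Ty) where
  Var  :: v t -> Tm v t
  App  :: Tm v (a ':-> b) -> Tm v a -> Tm v b
  Zero :: Tm v 'TNat
  Succ :: Tm v 'TNat -> Tm v 'TNat

-- Substitution must now send variables of type t to terms of type t:
-- the typed analogue of the bind used in the earlier sketches.
subst :: Tm v t -> (forall s. v s -> Tm w s) -> Tm w t
subst (Var x)   f = f x
subst (App a b) f = App (subst a f) (subst b f)
subst Zero      _ = Zero
subst (Succ n)  f = Succ (subst n f)
```

The reduction preorder is then layered on top type by type, in the same way as in the untyped sketch.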
Finally, this work has been formalised and machine-checked with the proof assistant
Coq. Such formalisation is an increasingly popular way to boost confidence in the correctness of mathematical results; perhaps unsurprisingly, computer scientists have been the most eager to adopt this technology. It's not something I've dealt with in much depth myself, however, so I have no particular comments or insights into the final section of this paper, which details the formalisation process.