Tuesday 23 December 2014

Discussing 'Nominal Unification'

UPDATE - Due to upcoming teaching commitments and, after that, paternity leave, this blog will not be updated again in the near future.

(Link).

Authors: Christian Urban (University of Cambridge), Andrew M. Pitts (University of Cambridge), and Murdoch J. Gabbay (INRIA).

Reference: Theoretical Computer Science vol. 323(1-3), pp. 473-497 (2004)

Why this paper? Cited by 4 (of a possible 28) of the papers in my Google Scholar Updates recommendations.

Comments: In the first iteration of this blog I reviewed recent papers recommended to me to by Google Scholar Updates; one of the reasons I abandoned this is that many of these new papers were quite strongly linked to older work, and I ended up having to explain all that older work before writing about the new. Note that being strongly linked to older work is no bad thing - on the contrary, such incremental research is the unglamorous but essential heart of scientific progress. But my new focus for this blog, on 'classic' (which is to say, highly cited) papers, allows me to focus more on foundational work, which tends to be more original and so may be described with less reference to prior literature.

The reason for this discussion is that I have already discussed this week's paper at some length in a post for the first iteration of this blog. Please feel free to look up the post; the only correction I'd make is that is that a quadratic time (which is to say, feasibly fast) algorithm for nominal unification is in fact sketched in the conclusion of this paper, whereas I had previously thought that credit for this should go to Levy and Villaret alone.

Let me instead discuss something slightly orthogonal to this paper. The phrase 'nominal techniques', which I discussed for example here, actually refers to two things - syntax and semantics. I have previously emphasised the semantical (which is to say, mathematical) side because that is where my interests lie; but here is a paper that is clearly nominal techniques yet makes explicit use of neither Fraenkel-Mostowski set theory nor nominal sets. What brings the syntactic and semantical work together under the heading 'nominal techniques' is their commonality of jargon and notation, their treatment of names as 'first-class' entities rather than as embarassments to be hidden away, and their use of name permutations (name-for-name swappings) as the main technical engine. These links mean that for a trained reader it is simple to see how this paper should be understood semantically, although my workshop paper 'Nominal Logic with Equations Only' is the only place I know of where this is done explicitly (see Section 6).

Tuesday 16 December 2014

Discussing 'The Semantics and Proof Theory of the Logic of Bunched Implications'


Author: David J. Pym (University of Bath)

Reference: Applied Logic Series Vol. 26, Kluwer Academic Publishers, 2002.

Why this paper? Cited by 4 (of a possible 31) of the papers in my Google Scholar Updates recommendations.

This week's most cited paper from among my Google Scholar Updates recommendations is a book, which I admittedly have not read through in its entirety for this discussion, but have enjoyed the excuse to flip through.

The logic of bunched implications, abbreviated BI, is similar to the linear logic I discussed last time, combining as it does 'normal logic' with a 'substructural' part that disallows the powerful inference rules of weakening and contraction. The technical difference between the two logics is that linear logic lacks a first-class normal implication (if... then...), usually considered to be the most important logical connective, but Girard showed that it can be reconstructed out of substructural implication and the '!' connective. As ingenious as that construction is, it demands a relationship between the normal and substructural connectives that it too tight for some applications. BI instead takes the conceptually rather simpler view that the normal and substructural logical connectives should be 'ships passing in the night' with no explicitly required interaction - which is not to say that they cannot be freely mixed together to create interesting theorems, but rather that one set of connectives does not depend on the other in any fundamental way.

I mentioned applications in the previous paragraph, and it is the breadth of concrete mathematical and computational constructions that can be understood via the logic BI that make it interesting. Nominal sets, the model of names in computing applications, provide such an example. In this setting propositions depend on finite set of names. The normal 'and' is defined in the usual way - 'p and q' is true if p is true and q is true. But the substructural conjunction, usually written with a star *, can be defined as follows: p * q is true if p is true and q is true and they do not share any names. The substructual implication is a bit trickier to define, and in fact I wrote a paper mostly about this very topic. Note that with this definition of * nominal sets do not define a model of linear logic (in the sense that we cannot define a ! connective with the required properties), showing that BI really is the 'better logic' for this application. A more famous application of BI is separation logic, a prominent formal methods technique for reasoning about programs, in which the * connective permits reasoning about the content of memory.

There is a lot of material in this book which I am not mentioning, of course. I will merely briefly note the proof theoretic notion of bunches that give the logic its name. These are essentially a richer notion of 'structure' than the lists (or sets) of propositions used by traditional proof systems; the idea of using more complicated structures in proof theory is reminiscent of the display calculi which I discussed recently.

Monday 1 December 2014

Discussing 'Linear Logic'


Author: Jean-Yves Girard (Université Paris Diderot)

Reference: Theoretical Computer Science Vol. 50(1), pp 1-101 (1987).

Why this paper? Cited by 5 (of a possible 31) of the papers in my Google Scholar Updates recommendations.

Comments: This paper begins with an extraordinary disclaimer from the editor: "Because of its length and novelty this paper has not been subjected to the normal process of refereeing. The editor is prepared to share with the author any criticism that eventually will be expressed concerning this work." In other words, the paper was published in one of the top journals in theoretical computer science without review. With hindsight there is no doubt that this paper was worthy of publication - four and a half thousand citations speaks to that - but one wonders if a proper reviewing process could have helped point out Girard's rediscovery of much that was already known, particularly the work of the relevant logicians. Instead, only six papers are cited in this 101 page paper, 5 sole-authored by Girard himself! This matters for more than just the claiming of credit; for example Girard's notational innovations, which have stuck despite attempts at reform by the likes of Arnon Avron, have almost certainly harmed the ability of computer scientists and the broader logic community to communicate.

Linear logic is in part a substructural logic, meaning that less powerful 'structural rules' are available for reasoning than there would be for more standard logics. This is a technical concept, but can be explained by example: there is a substructural 'and', for which you cannot deduce 'A' from 'A and B' (weakening), nor 'A and A' from A (contraction). These are obviously not desirable restrictions to have if you really mean 'and'! But linear logic has the standard 'and' as well; the substructural one simply allows you to reason in a "resource conscious" way, in which propositions may not be thrown away with weakening, nor duplicated with contraction.

The insight that led to linear logic lay in Girard's study of coherent spaces, a mathematical model (or semantics) for intuitionistic logic (a standard logic which can be, but usually isn't, viewed as 'substructural'). In this model the meaning of the apparently basic logical notion of implication (if... then...) came apart into two. The denotation of "if A then B" involves (1) 'permission' to use A as many times as we might want to, and (2) some translation of however many As we fixed on, into a B. (2) is the the resource-conscious 'substructural' part, whereas (1) can be considered as a brand new logical connective, written !, which permits the throwing away or duplication of a proposition. Hence a new logic, expressive enough to encode intuitionistic logic (and via the well known encoding, or more directly) classical logic also.

As might be suggested by its 101 page length, there is a lot more to this paper, including an alternative semantics of 'phases', which look a lot like the 'ternary relations' already well known to relevance logicians, sequent calculus (which we discussed in general here), and 'proof nets', a style of deduction analogous in some ways to natural deduction. It is all quite engagingly written, thank goodness: necessary technical grinding is openly flagged as "boring", and the dedicated reader is rewarded on page 92 with "This paper has already been too long, and a reader not already dead at this point will have no trouble in finding out the details." Of course I already knew Girard as a flamboyant and humorous writer, exhibit A being his virtually indescribable Locus Solum, a 206 page paper that includes an almost 100 page long appendix titled 'A pure waste of paper', essentially amounting to his personal book of jokes and observations about logic ("The theory of free and bound variables is still a typical topic for the early morning lecture the day following a meeting banquet. But this is a deep subject..."). Strangely enough, that paper appears to have been reviewed in the usual way!

Monday 24 November 2014

Discussing 'Lambda calculus notation with nameless dummies, a tool for automatic formula manipulation, with application to the Church-Rosser theorem'


Author: N. G. de Bruijn (Eindhoven University of Technology)

Reference: Indagationes Mathematicae Vol. 75(5), pp 381-392 (1972).

Why this paper? Cited by 4 (of a possible 29) of the papers in my Google Scholar Updates recommendations.

Comments: As the papers I discuss for this blog are those most cited in my Google Scholar 'Updates' recommendations, it is not surprising that the 'classic papers' I've looked at so far have been very closely related to my research interests. This week's paper is slightly further away, but is a classic indeed, notable enough to have a pretty good Wikipedia article about it (Wikipedia's 'nominal techniques' article is, by comparison, a neglected stub).

De Bruijn indices are one solution to the problems of names and binding which we've discussed on this blog before. They work by eradicating names altogether. This is an apparently brutal but very well-motivated idea, because the choice of names, for example of variables in a computer program, is formally arbitrary. So why have them?

There are two main answers to this rhetorical question. The first is that names allow the 'internal book-keeping' needed to keep track of what is being referred to at various points inside a program. De Bruijn's contribution is to show that this style of book-keeping can be replaced by counting how many steps there are between the introduction of (what we'd usually consider) a name, and its use.

The second answer to my rhetorical question is that, as openly admitted by de Bruijn in the introduction, names seem essential for human-readable syntax. De Bruijn frames his alternative approach as a tool for automation, as well as programming, and indeed in these contexts his approach has flourished and remains important. But there is now much interest in formalising human proofs, and in interactive theorem proving in which human and computer work together towards a proof, and so combining human-readability with full formality is an important goal. This is the niche in which nominal techniques (and some approaches as well) have become prominent. In fact even de Bruijn's contribution can be seen in this light, as there does exist another, much older, nameless approach based on so-called 'combinators'; de Bruijn's approach sits much closer to human practice.

Monday 17 November 2014

Discussing 'Nominal logic, a first order theory of names and binding'


Author: Andrew M. Pitts (University of Cambridge)

Reference: Information and Computation vol. 186(2), pp. 165–193 (2003).

Why this paper? Cited by 5 (of a possible 29) of the papers in my Google Scholar Updates recommendations.

Comments: Two weeks ago I discussed "A New Approach...", the paper that introduced nominal techniques. In that discussion I specifically mentioned (the conference version of) this week's paper as the one that introduced nominal as the term of the art for these techniques. So this is very much a sequel paper, and as such this is very much a sequel discussion - please do read my earlier discussion before this one.

As I discussed, the innovation of "A New Approach" was the mathematical shift to a non-standard set theory. Such a move is beautiful and clever, but does run the risk of being profoundly alienating to working computer scientists who simply want to solve problems involving names and binding! This paper, without abandoning the central insights gleaned from the Fraenkel-Mostowski setting, shows how much can be accomplished without straying beyond first-order logic, a completely standard logical setting that no one could object to. Along with this, Pitts makes a notion called nominal sets the semantical heart of his developments. Nominal sets appear in "A New Approach" as a preliminary construction on the road to the set-theoretic semantics, but they are far simpler than the full non-standard set theory. Although they are less expressive, this paper makes the case that they are expressive enough for most applications, a view which has been vindicated by the hundreds of papers that have since used the nominal sets framework.

While both logic and semantics have proved very useful, there is an acknowledged mismatch in this paper between them, via an incompleteness result - first-order logic is not really powerful enough to fully describe the properties of nominal sets, even 'cut down' as they are from the full set theory. For the notion of the 'set of free names' of a term (in nominal sets jargon, the support of an element) to be specified we would need the more powerful second-order logic. However the assertion that a name is not in the set of free names of (in the jargon, is fresh for) a term can be expressed in first-order logic, and turns out to be all that is needed for most applications. James Cheney resolved this mismatch in a 2006 paper, which provided a semantics that was complete for this paper's nominal logic, although I'm not aware of any applications of this very interesting theoretical work. After all incompleteness is often unavoidable, so is not necessarily a barrier to getting some work done!

Monday 10 November 2014

Discussing 'Display Logic'


Author: Nuel D. Belnap, Jr. (University of Pittsburgh)

Reference: Journal of Philosophical Logic vol. 11(4), pp. 375-417 (1982).

Why this paper? Cited by 7 (of a possible 29) of the papers in my Google Scholar Updates recommendations.

Comments: This week's paper is perhaps best understood as standing at the crossroads of two of the biggest features of modern logic (by which I mean, roughly, logic since Gödel blew everything up).

The first is that modern logicians are interested in structured proofs. Rather than attempting to proceed from axioms to conclusion in an undirected way, it can be useful to lock ourselves in to a system which tightly constrains what steps we may make within our proof, and hence prevents us from getting lost. Such structure can in particular guide automated proof search, i.e. the question of how one writes computer programs to do proofs automatically. This is vital for proving properties of, for example, massive software, where the sheer scale would defeat any human effort even if each individual step is quite simple-minded. The godfather of this field of structural proof theory is Gerhard Gentzen, who in the mid 30s invented not one but two of the most important proof systems we work with to this day: natural deduction, which is supposed to formally capture the way humans reason, and his sequent calculus, which is particularly useful for proving certain properties of logics such as their consistency, and provides the basis for some automated provers.

Gentzen was only interested in two sorts of logic - classical and intuitionistic. Modern logicians do not have it so easy, as the second feature I want to highlight is the dizzying and constantly increasing variety of different logics out there, sporting all sorts of different notions of truth, inference, meaning, and so forth. Many logics are created to meet some specific application; computer science leads the pack here, but philosophers are also prolific inventors of new logics, and other areas like linguistics, mathematics, and physics chip in too. Still other logics are invented for entirely 'pure' reasons, with logicians twiddling with the settings of existing logics to see what pops out (and this style of logic research has seen the invention of logics that later found important concrete applications).

There is a tension between these desires for structure and novelty - the more controlled the proof system, the less likely it is that your wacky new logic can be fitted to it. This has led to a variety of extensions of Gentzen's ideas to make them flexible enough to fit a wider variety of logics, and Belnap's display calculus, introduced in this 1982 paper, is the most flexible I know of. This made it the perfect basis for the paper I wrote last year with some co-authors at Australian National University, where we used display calculus to give a structured proof system for a logic called 'Full Intuitionistic Linear Logic' whose proof theory had previously been a rather awkward business (hence, of course, the occurrences of this citation in my Google Scholar Updates recommendations).

Display calculus is an extension of Gentzen's sequent calculus. A sequent is a set of assumptions and conclusions, possibly with some structure imposed on them such as an ordering. Display calculus uses more richly structured sequents which sport the display property, a technical property which allows any assumption or conclusion you have in your current sequent to be 'displayed' so that it becomes the ongoing focus of your calculations, with all other parts of your sequent set to one side. The cool things about this setup are first, that an enormous variety of logics can be given display calculi (though by no means all; Belnap himself mentions quantum logic as something which he couldn't fit into his structure). Second, any calculus with the display property is guaranteed to have a certain property called cut-elimination which is extremely useful for proving all sorts of things about your logic, but which can often be extremely tricky to prove from scratch. So using display calculus can save a lot of work! Of course, nothing comes for free, and display calculi do have limitations, such as not being much use as a basis for automation. But they are still a useful tool to know about for the logician hoping to tame the jungle of logics that are out there.

Monday 3 November 2014

Discussing 'A New Approach to Abstract Syntax with Variable Binding'


Authors: Murdoch J. Gabbay and Andrew M. Pitts (University of Cambridge)

Reference: Formal Aspects of Computing Vol. 13(3-5), pp 341-363 (2002).

Why this paper? Cited by 9 (of a possible 29) of the papers in my Google Scholar Updates recommendations (where multiple citations from different papers by the same author(s) in my recommendations are disregarded).

Comments: My PhD and some of my subsequent research was built around something called nominal techniques; this is the journal version of the conference paper that introduced them (in fact the second author, Andrew Pitts, was my PhD supervisor). Already some clarification is necessary: computer science has for some reason developed an unusual publication culture in which peer-reviewed conferences, rather than journals, take center stage. But conference publication often imposes tight space limits on papers, which can force authors to tighten their prose, which I find no bad discpline, but more regrettably can mean some details or examples are omitted. Hence new research is often published twice, with a more detailed 'journal version' following the conference version, often years later because of the slowness of the journal review process (which itself is probably contributed to by so much work having to be reviewed twice!). Another clarification to my first sentence is that the word 'nominal', which has become the brand name under which all developments of this work by Gabbay and Pitts are grouped, does not appear in this paper. I believe it was introduced by Pitts in his 'Nominal logic' paper, which was published before this paper but written after it, thanks again to the vagaries of academic publication.

The problem this paper addresses is that of reasoning formally in the presence of names and binding, ubiquitous features of computer science applications and other formal settings like mathematics. Of course names are everywhere - think of variable names in programs, or the identifiers like x and y we use when we do maths. The precise names we use is at some level arbitrary; while we're doing our mathematical manipulations we better keep track of our xs and ys, but 'from the outside' it obviously wouldn't have mattered if we'd called our variables foo and bar instead. Binding (sometimes, scope) is simply the point where this 'outside' versus 'inside' distinction is made: some operations have the property that the choices of names inside the operation's domain are irrelevant to anything outside that operation's domain. Such structure is vital in software engineering applications where people worry about encapsulation, but is in fact virtually unavoidable in any sufficiently interesting formal setting.

Anyone with a passing interest in computing, or even logic or mathematics, would nod along to most of the previous paragraph, but perhaps would not become greatly interested. Names and binding are basic machinery, and there is no reason to spend time thinking about them, as opposed to getting on with using them! The twist comes when one tries to reason fully formally, for example to convince a computer that a certain program has a certain desirable property. This sort of reasoning, if successful, can yield guarantees about software far more robust than mere testing, but it is not easy, and names and binding structure turns out to be be one of the (many) barriers to doing this successfully. In particular, proofs involving structural induction, which are vital to programming languages applications in particular, do not play well with bound names. So there is a problem to be solved.

The solution proposed by Gabbay and Pitts (I should note that other candidate solutions exist, but I will not discuss them here) is to change the set theoretic foundations on which we build our mathematics, from the standard one to an alternative framework developed by Fraenkel and Mostowski in the 1930s for reasons completely unrelated to computer science. In this framework certain elements called atoms claim a privileged place, giving mathematics in this universe a slightly different flavour. This paper adapts this work in (to me) wildly surprising fashion to show that if we view these atoms as our names then we have just the right 'universe' in which to model, and reason in a reasonably intuitive way about, binding, up to and including the structural induction I mentioned earlier.

It should be acknowledged, of course, that doing mathematics in a 'different universe' sounds like an cumbersome (and possibly eccentric!) move. For this stuff to actually be used, the esoteric set theory needs to be at least partially hidden, and subsequent work, most notably the 'nominal logic' paper I mentioned above, has helped to do this. Still, all of the many subsequent applications of Gabbay and Pitts's work can be seen to rest on some of the most abstruse mathematics you can imagine, a story that must gladden the heart of any working theoretician!

Friday 31 October 2014

Restarting this blog, with 'classic papers'

In a fascinating recent paper a team of Google-based researchers have shown that the way academic literature is disseminated has gone through dramatic changes over the last two decades. 'Elite' journals in most fields have shrunk in relative significance, as both highly cited papers, and citations in general, are increasingly dispersed across a wide variety of outlets. Whatever other factors are driving this process, the biggest surely is the rise of internet databases as a research tool, with Google's own 'Scholar' search engine first in that class. Papers no longer need to be in the major journals that people read, or even that libraries subscribe to, to be discoverable. This could perhaps be seen as part of a broader societal trend, driven by the internet, away from monolithic and powerful broadcasters, which is sometimes referred to as the long tail.

There does seem to be a mismatch between this drift away from elite journals and the pressure I feel under as a postdoc bouncing around the world on short-term contracts, which is to publish in eye-catching places to increase my chances of impressing hiring committees and attaining something more permanent. Could having a paper in a famous and hard-to-get-into venue become more about advertising oneself in the labour market than getting one's work read? In theory, perhaps, everyone's incentives should align, as the most selective outlets accept only the most significant papers, thus sending a credible signal about the quality of their authors. But the waning of elite journals, despite the academic market being tougher than ever, seems to belie this. Perhaps attitudes simply take time to change, and the current obsession with venues will fade.

Setting this aside, it is clear that Google Scholar has changed how research is done. Last year I launched this blog to have a play with the 'Updates' tool for the Google Scholar database, which uses one's own research as a 'search term' to discover pertinent new research, wherever it is released (often it is 'published' nowhere more prestigious than the homepage of the author). My review of the tool is given here; in short it is remarkably good for discovering papers relevant to your 'old' work, but obviously useless for discovering papers relevant to your ongoing work that is not yet archived by Google. As I shifted jobs from Australian National University to Aarhus University in Denmark I have been furiously learning new stuff to catch up with the interests of my new group, which is great fun but for which the Updates tool, and my blog about it, would have been a distraction.

Now that I have got my bearings a bit in Aarhus (although there always so much more to learn!) I thought I could bring back this blog with a slightly different focus. Instead of looking at the new research recommended to me, which is sometimes very interesting but often a bit obscure outside a very narrow community, and so not necessarily great 'blog material', I would briefly review some 'classic papers' useful for a slightly wider world of theoretical computer scientists and logicians. But how do I choose the classics in question? To keep consistency with my blog's theme to date, and make sure I choose papers relevant to my corner of the research world, I will use the Updates tool again, looking at the papers most cited by those new papers recommended to me. Having done the tedious task of compiling 1133 citations from 34 papers, and hence divined the papers considered most relevant by the authors that are considered most relevant to me by Google, I will post my first review next Monday.