Monday, 27 May 2013

Instances of computational effects: an algebraic perspective


Authors: Sam Staton (University of Cambridge)

Conference: Twenty-Eighth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS 2013), June 25-28, 2013, New Orleans, USA.

Journal reference: Proceedings to appear.

Comments: Google Scholar Updates' top recommendation this week was something a bit different - my old PhD supervisor Andrew Pitts has just released the first full book ever published on the relatively new field of nominal sets, titled Nominal Sets: Names and Symmetry in Computer Science. As a recommendation of something to read it is of course very much in line with my interests, but as a recommendation of something to write about on this blog at a week's notice it is not so realistic (Google Books offers a preview if you're curious).

Let's turn instead to the second recommendation in my queue, which happens to also stem from Cambridge. In fact it is a sequel to the Staton paper I wrote about four months ago, continuing his development of parameterized algebra. Please refer to my earlier post for a reminder of what that is.

In this paper parameterized algebra is put to work to talk about computational effects. If we think of a computer program as a process that, after some computation, returns a result, a (side) effect is anything the computer does while that result is being produced. For example, the computer might write something to the screen, or change something in memory. Less obviously, accepting user input also counts as a sort of side effect. Whether such events are treated as side effects or as part of the value returned is somewhat flexible, and can differ by programming paradigm or language, but separating computations into the values they calculate and the effects they produce en route is hard to avoid entirely, even in the relatively 'pure' world of functional programming.
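
For a concrete (if deliberately simplistic) Haskell illustration - my own example, nothing to do with the paper itself - compare a pure function with a computation that performs input and output on the way to its result:

    -- A pure function: its only observable output is the value it returns.
    double :: Int -> Int
    double n = n * 2

    -- An effectful computation: it returns an Int, but it reads from the
    -- user and writes to the screen while producing that result.
    askNumber :: IO Int
    askNumber = do
      putStrLn "Please enter a number:"
      line <- getLine
      return (read line)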

A big development in functional programming that allowed effects to be handled more transparently came with Moggi's 1991 paper Notions of computation and monads. Monads are a concept we've encountered a few times on this blog, most extensively here, and essentially give a concise category-theoretic way to describe equationally defined, or algebraic, models. Monads are a core concept of category theory because of their relationship to the basic idea of adjunctions, but hardly seem candidates to impact the working practices of your average Haskell programmer. And yet, Moggi and later authors showed that monads are a nice way to 'package up' effects, giving effectful functional programs much more structure.
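
As a standard toy illustration (mine, not Moggi's, and assuming the mtl package's Control.Monad.State), here is a counter effect packaged up in Haskell's State monad; the monad keeps the bookkeeping of threading the state around out of sight:

    import Control.Monad.State

    -- A counter effect packaged in the State monad: 'tick' returns the
    -- current count as its value, and increments the count as its effect.
    tick :: State Int Int
    tick = do
      n <- get
      put (n + 1)
      return n

    -- Running three ticks from an initial count of 0 gives ([0,1,2], 3).
    example :: ([Int], Int)
    example = runState (sequence [tick, tick, tick]) 0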

The next big milestone in the intellectual history behind this paper is Plotkin and Power's 2002 Notions of computation determine monads, deliberately titled as a sequel to Moggi's classic. This paper essentially traced the practical issue - effects - back to the original theory - equational definitions - and asked how the monads used to handle effects can themselves be defined equationally. The work was in a sense experimental, giving a few examples of how such definitions are both possible and illuminating without building any general theory around them. Staton to an extent follows this example-driven approach, looking at computational effects that can be thought of as multiple instances of a general concept, such as writing to a specific memory bit being an instance of the general idea of writing to memory. However, he moves beyond the examples to argue that the effects in question are correctly captured by his theory of parameterized algebra, which thereby provides a syntactic and semantic framework in which many interesting side effects can be defined, categorised, and studied.
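
To give a flavour of this 'operations and equations' view of an effect, here is a small Haskell sketch of my own (not code from any of the papers mentioned): a single one-bit memory cell presented by a lookup operation, which branches on the stored bit, and an update operation, which overwrites it. The equations of the theory - for instance, that writing back the bit you just read is a no-op - say which syntactic programs should be identified; the run function below is one model of the theory.

    -- Signature of a single one-bit cell: 'Lookup' has two continuations
    -- (one for each possible value of the bit), 'Update' writes a bit and
    -- then continues.
    data Bit k = Lookup k k | Update Bool k

    -- Syntactic computations over this signature, returning a value of type a.
    data Prog a = Return a | Op (Bit (Prog a))

    -- Example term: read the bit and write back its negation.
    flipBit :: Prog ()
    flipBit = Op (Lookup (Op (Update True  (Return ())))
                         (Op (Update False (Return ()))))

    -- One model of the theory: interpret programs as state transformers.
    run :: Prog a -> Bool -> (a, Bool)
    run (Return a)          s = (a, s)
    run (Op (Lookup k0 k1)) s = run (if s then k1 else k0) s
    run (Op (Update b k))   _ = run k b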

Sunday, 19 May 2013

Correctness of Functionally Specified Denotational and Operational Semantics


Authors: Sjaak Smetsers, Ken Madlener (Radboud University Nijmegen), and Marko van Eekelen (Radboud University Nijmegen and Open University of the Netherlands)

Conference: Symposium on Trends in Functional Programming (TFP 2013), 14-16 May 2013, Provo, Utah, USA.

Journal reference: Lecture Notes in Computer Science, to appear.

Comments: Last week I looked at the ways Google Scholar Updates has been useful to my research; I recently read a blog post that saw the tool put to a rather more surprising use: Updates recommended a possibly relevant published paper that turned out to be highly related, in the sense that it was partly plagiarised from the academic's own work! Click here to read the story.

Moving on to this week's paper, we'll look at bialgebraic semantics of programming languages. The (only) good thing about technical jargon is that you can fit a lot of content into a small space, so we'll try to unpack this over the course of the post.

First, what is meant by "semantics of programming languages"? To give a semantics is to give a precise, formal mathematical statement of how a programming language should be understood. Such a semantics can then be used to (a) reason about properties of the language, (b) reason about programs written in that language, and (c) guide the implementation of the language, e.g. compilation. Semantics come in a variety of styles (see e.g. Wikipedia's overview of some options); the "bialgebraic" style of this paper is so called because it combines algebraic and coalgebraic approaches.

Algebraic semantics start with a specification of the syntax of the language, and then show how programs using that syntax can be interpreted within some mathematical setting. This mathematical translation is called the denotation of a program, and so algebraic semantics are a form of what is called denotational semantics. We discussed this approach here, but the mathematical setting of this paper is a lot simpler than the one used there - the category of sets and functions rather than rich and complex "domains". The advantage of denotational semantics in general is that they are 'mathematical' in character and hence can be reasoned about with familiar mathematical tools such as induction; the disadvantage is that they are abstracted away from the original language enough that the actual behaviour of programs can be obscured.
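
As a deliberately tiny illustration of the 'sets and functions' flavour (my own toy example, far simpler than anything in the paper), here is a denotational semantics for a little expression language, interpreting each program by structural recursion as an element of a plain mathematical set:

    -- Syntax of a toy expression language.
    data Expr = Lit Int | Add Expr Expr | Mul Expr Expr

    -- Denotational semantics: each expression is mapped (its 'denotation')
    -- to an integer, by structural recursion over the syntax.
    denote :: Expr -> Int
    denote (Lit n)     = n
    denote (Add e1 e2) = denote e1 + denote e2
    denote (Mul e1 e2) = denote e1 * denote e2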

Coalgebraic semantics are the category-theoretic dual of algebraic semantics; we discussed coalgebra here. Coalgebras are a good way of talking about systems with transitions; when used in semantics they view programs as transitions between states (say, memory states of the program). This is a particular style of what is called operational semantics.
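
Concretely, a coalgebra is just a 'one step of behaviour' function. A minimal Haskell sketch (again my own illustration, not the paper's): for the functor F x = Maybe (String, x), a coalgebra says that from any state we either halt, or emit some output and move to a next state.

    -- A coalgebra for the functor  F x = Maybe (String, x):
    -- from each state we either halt, or emit output and step to a new state.
    type Machine x = x -> Maybe (String, x)

    -- A tiny 'program' whose states are integers: count down, emitting as we go.
    countdown :: Machine Int
    countdown n
      | n <= 0    = Nothing
      | otherwise = Just (show n, n - 1)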

The denotational and operational semantics of a language or program should, of course, have something to do with each other; the jargon for this agreement is the adequacy property. But we'd like to say more than that: we'd like the coalgebraically specified operational behaviour of the language to flow very directly from the algebraically specified syntax. In other words we'd like the algebra to provide 'structure' for the coalgebra, in the style called structural operational semantics. It turns out this sort of structure can be imposed at the level of abstract category theory, quite independently of the attributes of any particular language, by asking that the algebra and coalgebra be connected by certain mathematical relationships called 'distributive laws'. The seminal paper behind this idea is Turi and Plotkin's 1997 Towards a mathematical operational semantics, but I'd recommend starting with Klin's nice introduction to the area.
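
In its simplest form a distributive law is just a natural transformation from 'syntax applied to behaviour' to 'behaviour applied to syntax'. Here is a stream-flavoured sketch in Haskell (my own, and much simpler than the general GSOS-style laws used in the literature): the syntax has one binary operation, pointwise sum of integer streams, and the law says how the behaviour of a sum is determined by the behaviour of its arguments.

    {-# LANGUAGE DeriveFunctor #-}

    -- Syntax functor: one binary operation on terms (pointwise sum of streams).
    data S x = Plus x x deriving Functor

    -- Behaviour functor for streams of integers: an observation is a head
    -- value together with a tail state.
    data B x = Obs Int x deriving Functor

    -- A distributive law  S . B  =>  B . S : the one-step behaviour of a
    -- 'Plus' term is computed structurally from its arguments' behaviour.
    distLaw :: S (B x) -> B (S x)
    distLaw (Plus (Obs n x) (Obs m y)) = Obs (n + m) (Plus x y)

    -- Given a coalgebra c (the operational behaviour of states x), the law
    -- lifts it to composite terms - the 'structural' in structural
    -- operational semantics.
    liftStep :: (x -> B x) -> S x -> B (S x)
    liftStep c = distLaw . fmap c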

So what does this week's paper contribute? One problem with formal semantics - the reason most languages lack them despite their obvious advantages - is that real languages are huge and complicated, and formal proofs involving them are simply too big to fit in any one human being's mind. Hence there is a need for formalisation, in the sense that machines, rather than people, should at the very least record and check proofs, and preferably do some of the grunt work themselves. This is a pretty common refrain on this blog - see here for example. This paper, then, is a formalisation of bialgebraic semantics (the general theory, not any particular language) in the specification language and theorem prover PVS. The details of the encoding are a bit beyond me, as a non-expert in mechanised theorem proving, but the appeal of PVS seems to be that it is designed and used as a working tool for real-world formal methods, and so provides a promising link between abstract category theory and application; the main impediment seems to be its weak support for certain forms of polymorphism.

Monday, 13 May 2013

Reviewing Google Scholar Updates - 12 papers in

Four months or so ago I launched this blog, experimenting with the academic paper recommendation tool Google Scholar Updates, which uses your own papers as a 'search term' to recommend recently archived papers. I've now read and written about 12 papers, from 14 recommendations (I broke the purity of the 'experiment' by skipping two papers that looked very unlikely to be of value to me); in this post I will review what I've learned from the experiment so far.

Relevance: The biggest question is whether Updates has been regularly recommending papers that are interesting and useful to me. To answer this we must first observe that there is one very obvious problem with the methodology of a tool like this, which is that its only input is the research I've done in the past, and not that which I'm doing right now but have yet to publish. This is particularly noticeable to me right now because I've been partly transitioning away from my PhD line of research to take on some new interests. These new directions are not catered for by Updates, but neither would you expect them to be; it would be foolish to use Updates as your only source of reading. Nonetheless the point should be made that Updates could only ever be useful in proportion to how much your current research interests are a continuation of your past ones.

With this caveat set aside, I have been happy with the relevance of my recommendations - happy enough to keep using the tool, and to keep maintaining this blog. There has been only one conspicuous example of a paper with very low relevance to my research interests, although a number of other papers have fairly weak links; these weakly linked papers usually contained only a brief discussion of name and binding issues in the context of some other investigation. In fact the weakly linked papers have in many ways been the most interesting to read and write about, as they help me situate my work in the broader fabric of theoretical computer science research. So while there is a concern (not one I totally buy into) that recommender systems of this sort can produce an 'echo chamber' effect, fitting people into narrow profiles and never challenging them, that has not been my experience with Updates.

Breadth of sources: One problem that might be underestimated is the sheer diversity of outlets that publish research relevant to me. A long time ago everything worth reading would appear in the Proceedings of the Royal Society, but nowadays research of interest to me might appear in all sorts of journals, conference proceedings, and online repositories. The only venue to 'publish' more than one of the twelve papers I have had recommended is arXiv, which is far too large and diverse to check regularly. In fact, 4 of the 12 papers were not formally 'published' anywhere more high-profile than the personal websites of their authors (more on such 'grey literature' below). The flip side of this diversity of outlets is that any one outlet publishes only a very small proportion of the material I am interested in, or can even understand. Therefore regularly checking particular outlets for new research borders on the useless, and is certainly inefficient. Some sort of automation, whether through explicit keyword or citation alerts or more 'intelligent' but opaque recommender systems, seems virtually compulsory if you want to stay current with relevant research.

False negatives: A tricky question to judge is whether any highly relevant papers are not being recommended to me. Following my earlier discussion, the first answer to this is "of course!" - at a minimum, Updates cannot bring to my attention work that relates only to my current unpublished research. On the rather fairer question of whether Updates is failing to recommend papers relevant to my existing body of work, one thing that helps in making that judgement is that Updates offers, alongside a selective 'top' list (currently numbering 32), a less selective 'all' list of recommendations (currently 99). Are there recommendations in the second list that I would have preferred on the first? I haven't studied this systematically, but on a brief look the answer is no - the algorithm has highlighted papers that, at least judging from their titles and authors, seem more interesting than those on its backup list. The more imponderable question is whether Updates is missing some highly relevant papers altogether; all I can say here is that I am not aware of any.

Quality: Of course a paper can be highly related to my research and still not be worth my time - here is one clear example, and as I said I entirely skipped two others (by the same authors). Updates does not really filter by quality in any direct way and so this problem is probably unavoidable. It is, however, not chronic - perhaps I'm simply lucky to be in a field not too overburdened with cranks, because the vast majority of papers I have read are genuinely good science. I wouldn't necessarily say 'excellent' science, but the reality is that most science is not groundbreaking - and that's OK! For every paradigm shift that changes everything you need a volume of 'normal' science to fill in the gaps and push at the edges; it's fair to say that most of the papers I have read for this blog fit into the latter category. In particular, many have been 'sequel' papers extending earlier work by the same authors, which are very popular in an era where quantity of publication gets (perhaps undue) emphasis.

One issue on the quality front that deserves particular emphasis is the large amount of unreviewed work I have had recommended. 7 of the 12 papers had not yet been accepted for publication at the time I wrote about them (one has subsequently been accepted). Researchers are not all of one mind on whether such unreviewed work should be exposed to public view, but enough do put this sort of work out there that it constitutes a major proportion of Updates' recommendations. The advantage is that you see new science sooner; the disadvantage is that the work tends to be less polished in its exposition and less reliable in its results. Your mileage with this tool will almost certainly depend on your tolerance for this unpublished grey literature.

Sociology: One slightly unexpected aspect of this tool is that it has given a pretty interesting big picture not just of who is producing research of interest to me, but of where. This is quite important as I look for post-postdoc work! Of the 12 papers an overwhelming 11 come from institutions in Europe, with 1 each from Asia and South America (the numbers add up to more than 12 because some papers have authors at multiple institutions). Breaking that down, there were 5 from the United Kingdom, 2 each from Germany and Poland, and 1 each from Brazil, France, Japan, Romania, and Sweden. While these statistics are only compelling in so far as one trusts the quality of Updates' recommendations, the sheer weight of European institutions (not to mention the surprising absence of the USA) says a lot about where my sort of research is being done, and emphasises the role geography still plays despite all the connections modern communications make possible - something a researcher in relatively isolated Australia can hardly fail to notice, of course.

Sunday, 5 May 2013

Nominal Monoids


Author: Mikołaj Bojańczyk (University of Warsaw)

Reference: Theory of Computing Systems, Springer Open Choice, 4 April 2013

Comments: This, the twelfth paper I've read for this blog, represents something of a full circle, as it is by one of the authors of the first paper I looked at. In that post I discussed how Bojańczyk and collaborators had generalised the nominal sets model to capture certain computational phenomena, particularly to do with automata.

Automata are mathematical machines, somewhat like flowcharts, that accept certain inputs as 'questions' and return 'yes' or 'no' as their answer (more sophisticated notions of automaton can return more complex output, and can also fail to terminate). The set of inputs to which an automaton answers 'yes' (in the jargon, the inputs it recognises) is known as the automaton's language. This paper takes a more abstract view of formal language theory by using algebraic structures called monoids, rather than automata, to recognise languages, but the intention is the same - to study certain classes of computation at a clarifying level of mathematical abstraction.
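
To see what 'a monoid recognising a language' means in the simplest classical setting (my own toy example, nothing nominal about it yet), take the language of words over the alphabet {a, b} containing an even number of a's. It is recognised by the two-element monoid Z/2: map each letter into the monoid, multiply the images together, and accept exactly when the result lands in a chosen accepting subset.

    -- The two-element monoid Z/2 (addition modulo 2).
    data Z2 = Zero | One deriving Eq

    add :: Z2 -> Z2 -> Z2
    add x y = if x == y then Zero else One

    -- The homomorphism on letters: 'a' counts, every other letter does not.
    h :: Char -> Z2
    h 'a' = One
    h _   = Zero

    -- A word is in the language (even number of 'a's) exactly when its
    -- image under the homomorphism lands in the accepting set {Zero}.
    recognise :: String -> Bool
    recognise w = foldr add Zero (map h w) == Zero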

There actually already exists a paper titled 'nominal monoids' (pdf), written in 2010 by Alexander Kurz (one of the authors of last week's paper) and collaborators, and perhaps surprisingly not cited in this paper. To be fair, the two notions of nominal monoid differ a bit in their emphasis - Kurz et al are interested in monoids equipped with binding structure, while Bojańczyk is interested in changing the 'alphabet' underlying languages to include infinitely many letters, possibly equipped with some non-trivial structure. This point of view is known as the study of languages of data words, and is claimed to have applications, e.g. to XML.

The main contribution of this paper is to adapt a result called the Schützenberger-McNaughton-Papert theorem from the setting of ordinary languages to languages of data words. This theorem makes a surprising and beautiful link between languages, the monoids that recognise them, and the logic that specifies them (specifically first-order logic, one of the basic logics you learn in a first or second year course). Bojańczyk extends the notions of language, monoid, and even first-order logic to his generalised nominal setting, and his version of the theorem is proved under the new assumption that the monoid is 'ordered' in a certain sense. Interestingly, in the 'ungeneralised' nominal sets model that I'm most familiar with, this ordering requirement becomes unnecessary, so the statement of the theorem becomes very close indeed to the standard version.