Differences between code and natural language

We often hear, and I often feel, that writing code has a lot in common with writing prose.

Recently I've been thinking about ways in which this is not true.

Time. When I write or polish English, I sometimes look up to find that far less time has passed in the world of the clock than in my mind. There is something about manipulating words in silence that slows my perception of time down to the speed of a slug. My sense of accomplishment, of time efficiently spent, is great at such a moment. That is not what it feels like when I write, debug, or refactor code. On the contrary, the clock seems to have jumped an hour or two every time I look up from what I perceive to be a few minutes' cognition.

Clearly very different processes are at work. I can imagine that habit is one factor; surely there are others.

Economy. I find programming languages very different from natural language because of the redundancy of speech. In speech we often say the same thing several times in different (or even the same) words, and choose collocations in which the elements reinforce each other, which is also a kind of redundancy. Even the level of economy normally reached in writing is not possible in natural speech, to say nothing of the semantic density of a programming language.

Classical Chinese, in which (at its best) every morpheme contributes to the meaning of the whole sentence, has a density more like that of a programming language. But to me that has always suggested that Classical Chinese is not a close representation of oral language; rather, it must be a fundamentally written entity, never mind the presence of oral elements in it or its powerful effects on modern spoken language.

Non-verbal elements. Pseudocode often bears a likeness to natural language. But few if any programming languages are very much like pseudocode. The problem is the symbolic content: brackets of different kinds, disambiguating parentheses, different kinds of quotation marks, and so on. The same factors that put distance between mathematics and natural language apply in the case of programming languages. To unwind symbolic notation into words that can be understood on a single reading or hearing is often frankly impossible. It is a commonplace that the time frame within which one reads and understands a mathematical argument is orders of magnitude away from ordinary reading. The same is true of code.
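
As a small illustration of that symbolic content, here is a sketch in Python. The language, the function name count_words, and the sample sentence are all my own assumptions, made up for the purpose. The comment at the top is the pseudocode, close to English; everything below it is what that pseudocode becomes once the brackets, parentheses, and quotation marks arrive.

```python
# Pseudocode, close to English:
#   for each word in the sentence, count how many times it occurs,
#   ignoring case and any punctuation attached to the word

def count_words(sentence):
    """Count occurrences of each word, case-insensitively."""
    counts = {}
    for word in sentence.split():
        # Strip common punctuation from both ends, then lowercase.
        key = word.strip(".,;:!?\"'").lower()
        if key:
            counts[key] = counts.get(key, 0) + 1
    return counts

# Hypothetical sample sentence, chosen only for illustration.
result = count_words("The cat saw the dog; the dog saw the cat.")
print(result)
```

Try reading the line that builds key aloud, naming every symbol, and see whether a listener could follow it on one hearing.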

And then there are the complications that nesting, control flow, and memory bring. Exams in first- and second-semester computer science courses often contain questions about the output of nested loops or passing by reference, and interpreting even a simple code block usually brings disaster to many students. Furthermore, once a variable is declared and assigned, its type and content are "known" to the computer till the end of the run. Who could imagine a computer program full of statements addressing the listener or reader as a human being with fallible memory, statements intended to help clarify a fact in a long narrative? But such things abound in natural language. Here are three at random from Robert Graves's I, Claudius, written in a chatty style:

It was at this point that my mother who, you must remember, was Livilla's mother too, interposed.

Pollio's son Gallus — hated by Tiberius because he had married Vipsania (Tiberius's first wife, you will recall, whom he had been forced to divorce on Julia's account), and because he had never given a public denial of the rumour which made him the real father of Castor, and because he had a witty tongue — this Gallus was the only senator who had dared to question the propriety of the motion.

He relieved Vonones of his treasure — you will recall that Vonones was the former king of Armenia, about whom my brother Germanicus had quarrelled with Gnaeus Piso — by sending agents to help him escape from the city in Cilicia where Germanicus had put him under guard and then having him pursued and killed.
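
By contrast, here is a sketch in Python of the sort of exam question mentioned above. The function names append_total and nested_pairs are hypothetical, and Python hands a function a reference to the same list object, which is close to, though not identical with, the pass-by-reference of other languages. Nothing in it pauses to say "you will recall"; the reader is expected to carry the state of numbers, i, and j unaided.

```python
def append_total(values):
    # Mutates the caller's list in place: the function receives a
    # reference to the same list object, not a copy of it.
    values.append(sum(values))

def nested_pairs(n):
    pairs = []
    for i in range(n):
        # The inner range depends on the current value of the outer variable.
        for j in range(i):
            pairs.append((i, j))
    return pairs

numbers = [1, 2, 3]
append_total(numbers)   # numbers has changed after this call
append_total(numbers)   # and the second call sees that change
pairs = nested_pairs(3)

print(numbers)
print(pairs)
```

Working out what the two print statements show, without running the code, means holding in mind exactly the facts that Graves takes the trouble to repeat for his reader.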

Above all, natural language normally deals with words corresponding directly to human experience. Even metaphor and other high-level natural language abstractions do this. It is hard for me to imagine a programming language dealing with human experience this way.

Basically, I do not believe that human thought processes are rigorously rational. We can sometimes emulate reason, but that is all we can do. Natural language reflects that state of things. Specialized programming languages may some day emulate human thought, but I am content with Python or LaTeX attempting to do something very different.

I was prompted to write this by a conversation last night with Sean McGovern of Hacker School.
