One Herald Layout

May 16, 2015

Layout is a source of violent disagreement in programming languages. I’ve written about it before, in the context of Epigram. But now I’m even more overwhelmed than I was then, and I’m thinking about working on several languages, which makes me less inclined to think about their individual properties and more inclined to concentrate on what I need. I’m certainly not pitching to solve everybody’s layout problems once and for all: I’ll be lucky if I can even manage my own. Let’s try to boil the issues down.

Some lines are long. I grew up amongst the paraphernalia of the punchcard era, and for the most part, I used 80-column displays. To this day, when I’m hacking, I get uncomfortable if a line of code is longer than 78 characters, and I enjoy the way keeping my code narrow allows me to put more buffers of it on my screen. But however you play it, it’s far from odd to find that a logical line of code stretches wider than your window, so that it might be visually more helpful if it made more use of the vertical dimension. Indenting ‘continuation’ lines more than the ‘header’ line is a standard way to break the latter into pieces which fit.

Some lines are subordinate. Whether they are sublists of a list, or the equations of a locally defined function, or whatever, a textual construct sometimes requires a subordinate block of lines. It’s kind of usual to indent the lines which make up a subordinate block.

How do you tell whether an indented line is a continuation line or a header line within a subordinate block?

I’m trying to find a simple way to answer that question, and what I’m thinking is that I’d like a symbol which marks the end of ‘horizontal mode’, where indented lines continue the header, and the beginning of ‘vertical mode’, where indented lines (each in their own horizontal mode) belong to a subordinate block. My candidate for this symbol is ‘-:’, just because it looks like a horizontal thing followed by some vertical things. I’m going to try to formulate sensible rules to identify the continuation and subordination structure.

An indentation level, or Dent, is an element of the set of natural numbers extended by bottom and top, with bottom < 0 < 1 < 2 < … < top. A j-Chunk is a header line at Dent j followed by zero or more lines at Dents greater than j. An i-Block is a possibly empty sequence of j-Chunks, each for some j > i. Within a given j-Chunk, each line is considered a continuation of the first (the header) until the first occurrence of -:, at which point the remainder of the j-Chunk is interpreted as a subordinated j-Block, with any text to the right of -: treated as a top-Line. A document is a bottom-Chunk.
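
The ordering on Dents is easy to model. Here is a minimal Python sketch, assuming that floating-point infinities stand in for bottom and top and that a line's Dent is its count of leading spaces; the names BOTTOM, TOP, dent_of and admits are invented for illustration.

```python
import math

# Assumption: the two extra Dents are modelled by the float infinities,
# so the usual comparison operators give bottom < 0 < 1 < 2 < ... < top.
BOTTOM = -math.inf
TOP = math.inf


def dent_of(line):
    """Dent of a physical line, taken here to be its leading space count."""
    return len(line) - len(line.lstrip(" "))


def admits(block_dent, chunk_dent):
    """An i-Block only admits j-Chunks with j strictly greater than i."""
    return chunk_dent > block_dent
```

Under this reading, a bottom-level block admits a chunk at any Dent, and every block admits a top-Line, which is what lets text to the right of -: join the block it heralds.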

And, er, that’s it. At least for the basic picture.

Higgledy piggledy
  boggle bump splat
Most of the post
  clusters close on the mat -:
  the phone bill
  the gas bill
  the lecce
  the junk
  the bags to dispose of
    old clothes from your trunk
The tide you divide
  to get into your flat
Will just gather dust
  if you leave it like that.


{Higgledy piggledy boggle bump splat; Most of the post clusters close on the mat {the phone bill; the gas bill; the lecce; the junk; the bags to dispose of old clothes from your trunk}; The tide you divide to get into your flat; Will just gather dust if you leave it like that.}
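
The rules are small enough to sketch in code. The following Python is an illustration only, under some loudly stated assumptions: the function names are invented, a Dent is the count of leading spaces, blank lines are skipped, -: is spotted as a raw substring rather than as a proper token, and the whole file is treated as the outermost block that the braces in the expansion enclose.

```python
import math

HERALD = "-:"
TOP = math.inf  # Dent given to any text to the right of a herald


def read_lines(source):
    """Pair each non-blank line with its Dent (count of leading spaces)."""
    lines = []
    for raw in source.splitlines():
        text = raw.strip()
        if text:
            lines.append((len(raw) - len(raw.lstrip(" ")), text))
    return lines


def split_chunks(lines):
    """Group a block's lines into chunks: a header plus the deeper lines after it."""
    chunks = []
    for dent, text in lines:
        if chunks and dent > chunks[-1][0][0]:
            chunks[-1].append((dent, text))  # deeper than the header: same chunk
        else:
            chunks.append([(dent, text)])    # otherwise a new chunk starts here
    return chunks


def render_chunk(chunk):
    """Join continuation lines until a herald, then render the rest as a block."""
    header = []
    for index, (_, text) in enumerate(chunk):
        before, herald, after = text.partition(HERALD)
        if before.strip():
            header.append(before.strip())
        if herald:
            rest = chunk[index + 1:]
            if after.strip():
                # Text to the right of the herald is a top-Line of the block.
                rest = [(TOP, after.strip())] + rest
            return " ".join(header + [render_block(rest)])
    return " ".join(header)


def render_block(lines):
    """Render a block as brace-delimited, semicolon-separated chunks."""
    return "{" + "; ".join(render_chunk(c) for c in split_chunks(lines)) + "}"


def to_braces(source):
    """Convert a laid-out document to its explicit brace form."""
    return render_block(read_lines(source))
```

Calling to_braces on the verse above should give back the braced expansion. A real lexer would want to decide what happens to tabs and to a -: hiding inside a string or comment, which this sketch ignores.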

(Actually, it might make sense to allow a matching :- to act as an ‘unlayout herald’. The idea is that a Block is a bunch of Chunks and a Chunk is a bunch of Components, and a Component is either a lexical token or a subordinated Block. If a -: has no matching :-, it’s a subordinated Block Component at the end of its enclosing Chunk; the matching :- indicates the end of the subordinated Block Component, after which the Chunk continues.)

By way of an afterthought, why not take Dent to be the integers extended by bottom and top? A line which looks like this (with at least 3 dashes and any amount of whitespace either side)


shifts the indentation origin to the left by number-of-dashes-plus-2, thus increasing the indentation of the leftmost physical column by the corresponding amount. A line like


shifts the origin the other way, and if you overdo it, the leftmost physical column will have negative indentation, but not as negative as bottom. That’s one way to keep your subordinates from drifting too far to the right.
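
The arithmetic of a shifted origin can be sketched as follows. This is a guess at the bookkeeping only: it takes the number of dashes as an argument rather than parsing any marker line, it assumes the shift the other way moves by the same number-of-dashes-plus-2, and the function names are invented.

```python
def dent(column, origin=0):
    """Dent of a physical column, measured from the current origin."""
    return column - origin


def _shift_amount(dashes):
    """Size of a shift; the text asks for at least 3 dashes."""
    if dashes < 3:
        raise ValueError("a shift line needs at least 3 dashes")
    return dashes + 2


def shift_origin_left(origin, dashes):
    """Move the origin left, so every physical column gains indentation."""
    return origin - _shift_amount(dashes)


def shift_origin_right(origin, dashes):
    """Move the origin right (assumed symmetric), so columns lose indentation."""
    return origin + _shift_amount(dashes)
```

With three dashes, a left shift from the default origin puts the leftmost physical column at Dent 5, and a right shift puts it at Dent -5, which is negative but still above bottom.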

2 Comments
  1. ezyang
    May 19, 2015 6:21 am

    To try to summarize, you suggest that layout shouldn’t be triggered when one of a few special keywords occurs in the source text (the way where/do/let/of function in Haskell98, as you mention in the linked text), but only when a single herald, -:, appears.

    It would certainly have solved the fiasco where MultiWayIf doesn’t introduce new layout!

  2. May 19, 2015 8:29 am

    That’s correct. I’m not necessarily saying that having a small number of special layout-inducing keywords is bad, if you’re setting up the syntax of one specific programming language. But I’m trying to engineer a textual syntax in which different regions will generate different sorts of code: those regions will be indicated by subordination, so I kind of need layout to work, at least in outer levels, in order to figure out what stuff is where.
