15 October 2019

The "Clojure Style Guide" is simply Wrong

  1. The Guide itself
  2. Why this topic?
  3. Single Space Indentation
    1. Incongruent with basic semantics of any language or markdown
    2. Not easily enforced
    3. Prior art - it’s the Scheme style
  4. 80 character line limit
    1. Lisp needs more horizontal budget
    2. Dealing with deep nesting is tricky
    3. The main reason for this restriction in 2019 is flawed tools
  5. Conclusion

The Guide itself

What is the "Clojure Style Guide"?

It can be found here:

It’s a style guide for Clojure created by a man who wrote some of the widely used Clojure tooling, but who is primarily a Ruby developer and the author of RuboCop, the most popular Ruby linter.


Why this topic?

The reason for this post is that other people have been clobbering me over the head with this style guide over and over again:

"But the Clojure Style Guide says this and that…"


This is very upsetting, because I don’t think there has ever been an agreement about Clojure style, yet this one document made by this one person somehow became the gold standard.

There was no examination of what it suggests; it was all just accepted. It’s even named Clojure Style Guide, as if it were an official thing, an ISO standard or an RFC, and not just the opinion of one person.

I’ll go into detail about why I reject it.


Single Space Indentation

This man wants us to indent function arguments with a single space.

I reject this notion because:

Incongruent with basic semantics of any language or markdown

Let’s say you have a YAML file and you want to represent an ordered sequence.

It looks like this (as per YAML 1.2):

color_list:
   - red
   - blue
   - green

All the items start in the same column.

The formatting is aligned with the semantics of this piece of YAML. It’s a data structure. It’s a list of items, which have equal standing as far as this list is concerned. The items are independent.

That is why when describing lists, vectors et al. we align all the items to start at the same column.

The same goes for a vector literal in Clojure:

[:red
 :blue
 :green]

In this case, the single space of indentation for the second and third items is there to offset the opening bracket, and the effect it achieves is columnar alignment of the items. This is fine and completely sensible, since it’s a vector of independent and equal elements.

The Clojure Style Guide suggests the same formatting for function calls and other operations, not just for vectors. This is where it goes off the rails:

(operation
 arg1
 arg2
 arg3)

This is simply wrong. It pretends that the first element of the list, the operation, does not have a special semantic, when it clearly does!

The first element of the list is special: it’s the operation or function.

  1. Lisp-2 languages like Common Lisp even have a separate namespace for functions

  2. Common Lisp is not formatted like this

  3. As we will see, even this style guide itself offers a different formatting for many operations (macros)

  4. It is simply not an ordered list of data elements


Not easily enforced

The previous argument would be reason enough not to use this formatting scheme, but to make things worse, there are many exceptions to this rule: special forms like if, any control macro, any binding macro, and so on, but not all macros.

It’s practically impossible to create an automatic formatter that would follow this code style because you’d need a database of third party macros and their correct formatting.
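
To make the problem concrete, here is a minimal sketch. `with-label` is a hypothetical macro invented for this illustration, standing in for any third-party macro that takes a body. Both calls below are the same code and only the indentation differs; a formatter can pick the first layout only if someone has told it that `with-label` takes a body.

```clojure
;; Hypothetical third-party macro (an assumption for this example):
;; evaluates the body and returns [label result].
(defmacro with-label
  [label & body]
  `[~label (do ~@body)])

;; Body-style layout, as used for `when` or `let`: two spaces.
(def body-style
  (with-label :sum
    (+ 1 2)))

;; Argument-style layout, the likely fallback when a tool treats
;; `with-label` as an ordinary function call.
(def argument-style
  (with-label :sum
              (+ 1 2)))
```

Nothing in the call site itself says which layout is the intended one; that knowledge lives in the macro definition, or in a per-macro configuration that somebody has to maintain.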

This is ironic, given that the author of the guide is also the author of a linter.


Prior art - it’s the Scheme style

The question might arise: what about other older Lisps?

  1. lisp-lang.org style guide says 2 space indent

  2. schemewiki style guide says 1 space indent

This other Scheme related style document provides the following rationale:

Rationale:  The columnar alignment allows the reader to follow the
  operands of any operation straightforwardly, simply by scanning
  downward or upward to match a common column.  Indentation dictates
  structure; confusing indentation is a burden on the reader who wishes
  to derive structure without matching parentheses manually.

I don’t think that one additional space of indentation weakens vertical scanning: the difference is too small for some other symbol to end up above the argument and confuse the issue, or to make the parentheses harder to match.

This argument is especially suspect since we know that there exist forms with 2 space indentation in the same standard.



Now for the second rule I don’t agree with:

80 character line limit

This archaic rule is a throwback to the 70s with the 80 column terminals, and yet here we are, 40 years later, with 4k screens, still having to adhere to the same principles C programmers did before desktop programming was even a thing.

Let’s examine the failings of this rule and the reasons behind it.

Lisp needs more horizontal budget

Programming Lisp with an 80 character budget is like programming C with a 60 character budget.

Some commonly used constructs don’t increase the level of indentation in Algol-family languages like C, but do in Lisp. The most obvious is introducing a local variable.

In C it would be simply:

int i = 0;
i = i + 1;

No additional indentation was introduced. Here’s Clojure:

(let [i 0]
  (inc i))

A let form increases indentation.

In general, S-expressions introduce more indentation than general C-style language code, hence they need more horizontal space.


Dealing with deep nesting is tricky

So what do you do when you reach the 80 column limit?

One way is to break the line. Under the established formatting rules, though, this really doesn’t free up much space.

Another approach is to factor out the nested expressions into functions.

This has severe shortcomings.

  • it severely pollutes the namespace’s pool of public symbols with functions used at only one call site

  • naming is a hard problem, and naming these functions is generally impossible unless you use really long function names that all but ruin any space benefit of using them

  • if the expression uses a lot of local values, the function ends up with a great many arguments

Another alternative is to break up expressions into let bindings instead of functions. This still has a problem with succinct names for similar values:

(let [transactions (get-transaction ......
      transactions-with-confirmed-date (filter ..... transactions)
      transactions-with-confirmed-date-and-domestic-currency ....]

When the expression breaks down into subexpressions that are each the previous value with a slight twist, it becomes really hard to come up with names.

The only really adequate solution is threading macros, but those don’t work that well in all situations.
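
As a sketch of what that looks like, the pipeline below threads a collection through `->>` so that none of the intermediate values needs a name. The sample data, the `domestic?` predicate, and the map keys are assumptions made up for this example, not taken from any real codebase.

```clojure
;; Hypothetical sample data, for illustration only.
(def transactions
  [{:id 1 :confirmed-date "2019-10-01" :currency :domestic :amount 100}
   {:id 2 :confirmed-date nil          :currency :domestic :amount 250}
   {:id 3 :confirmed-date "2019-10-03" :currency :foreign  :amount 75}])

;; Hypothetical predicate: true for transactions in the domestic currency.
(defn domestic? [tx]
  (= :domestic (:currency tx)))

;; Each step feeds the next, so there is no need for names like
;; transactions-with-confirmed-date-and-domestic-currency.
(def confirmed-domestic-total
  (->> transactions
       (filter :confirmed-date)
       (filter domestic?)
       (map :amount)
       (reduce + 0)))
```

This works here because every step takes the collection as its last argument. Once the steps need the value in different argument positions, or need two earlier results at once, the fit gets worse.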


The main reason for this restriction in 2019 is flawed tools

The main reason we have people advocating for this limit in 2019 is that some people work in Emacs on a single laptop monitor and then split that single monitor into multiple text buffers, which end up being so small that they need the 80 character limit.

I work with multiple large monitors and most of my screen is empty in these codebases.

But why does the industry need to adjust to the lowest common denominator? Imagine professional photographers adjusting everything to work with the smallest lenses because some people use them.


Conclusion

I know styling is a very personal topic, but at least on Java projects I could download the style of the client and have the IDE format the file on each save. I cannot do that in Clojure because this style is not lintable at all, so now everyone on the team spends time code reviewing, fixing, rebasing branches for style mistakes.

It’s not very smart.

Tags: clojure style community politics