The "Clojure Style Guide" is simply Wrong
- The Guide itself
- Why this topic?
- Single Space Indentation
- Incongruent with basic semantics of any language or markdown
- Not easily enforced
- Prior art - it’s the Scheme style
- 80 character line limit
- Lisp needs more horizontal budget
- Dealing with deep nesting is tricky
- The main reason for this restriction in 2019 is flawed tools
- Conclusion
The Guide itself
What is the "Clojure Style Guide"?
It can be found here:
It’s a style guide for Clojure created by a man who wrote some widely used
Clojure tooling, but who is primarily a Ruby developer and the author of
RuboCop, the most popular Ruby linter.
Why this topic?
The reason for this post is that other people have been clobbering me over the head with this style guide over and over again:
"But the Clojure Style Guide says this and that…"
This is very upsetting, because I don’t think there has ever been agreement on Clojure style, yet this one document written by one person somehow became the gold standard.
Nobody examined what it suggests; it was all just accepted. It’s even named Clojure Style Guide, as if it were an official thing like an ISO standard or an RFC, and not just the opinion of one person.
I’ll explain in detail why I reject it.
Single Space Indentation
This man wants us to indent function arguments with a single space.
I reject this notion because:
Incongruent with basic semantics of any language or markdown
Let’s say you have a YAML file and you want to represent an ordered sequence.
It looks like this (as per YAML 1.2):
color_list:
- red
- blue
- green
All the items start in the same column.
The formatting is aligned with the semantics of this piece of YAML. It’s a data structure. It’s a list of items, which have equal standing as far as this list is concerned. The items are independent.
That is why when describing lists, vectors et al. we align all the items to start at
the same column.
The same goes for a vector literal in Clojure:
[:red
 :blue
 :green]
In this case, the single space of indentation for the second and third items is there to offset the opening
bracket, and the effect it achieves is columnar alignment of the items. This is fine and
completely sensible, since it’s a vector of independent and equal elements.
Clojure Style Guide suggests the same formatting for function calls or operations instead of just vectors. This is where this thing goes off the rails:
(operation
 arg1
 arg2
 arg3)
This is simply wrong. It pretends that the first element of the list, the operation,
does not have special semantics, when it clearly does!
The first element of the list is special: it’s the operation or function.
- Lisp-2 languages like Common Lisp will even have a separate symbol space for the function
- Common Lisp is not formatted like this
- As we will see, even this style guide itself offers a different formatting for many operations (macros)
- It is simply not an ordered list of data elements
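To make the contrast concrete, here is a small sketch of the two indentation rules side by side, as I read the guide. The names evens and describe are made up for this illustration.

```clojure
;; Function call with no argument on the first line:
;; the guide asks for one-space indentation of the arguments.
(def evens
  (filter
   even?
   (range 1 10)))

;; Macro with a body, here `when`: the same guide asks for
;; two-space indentation of the body.
(defn describe [n]
  (when (pos? n)
    :positive))
```

Both forms are lists whose first element is an operation, yet they end up with different indentation.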
Not easily enforced
The previous argument would be reason enough not to use this formatting scheme, but to make things worse,
there are many, many exceptions to this rule: special forms like if, any control macro, any binding
macro, etc., but not all macros.
It’s practically impossible to create an automatic formatter that would follow this code style because you’d need a database of third party macros and their correct formatting.
This is ironic, given that the guide’s author also wrote a linter.
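A minimal sketch of what such a "database" amounts to, assuming a formatter that decides indentation purely from the head symbol. The set body-indent-forms, the function indent-width and the symbol my-lib/with-resource are all hypothetical and not taken from any real tool.

```clojure
;; Hypothetical rule table: heads whose following lines get a
;; two-space body indent. Every third-party macro with a body
;; would have to be listed here by hand.
(def body-indent-forms
  '#{if when let defn})

(defn indent-width
  "Returns the indentation, in spaces, for lines that follow `head`.
  Anything not in the table falls back to the one-space rule."
  [head]
  (if (contains? body-indent-forms head) 2 1))

(indent-width 'when)                 ; listed, so body indent
(indent-width 'my-lib/with-resource) ; unlisted macro, so one space
```

An unlisted macro silently gets the function-call rule, which is exactly the failure mode described above.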
Prior art - it’s the Scheme style
The question might arise: what about other older Lisps?
- lisp-lang.org style guide says 2 space indent
- schemewiki style guide says 1 space indent
This other Scheme related style document provides the following rationale:
Rationale: The columnar alignment allows the reader to follow the operands of any operation straightforwardly, simply by scanning downward or upward to match a common column. Indentation dictates structure; confusing indentation is a burden on the reader who wishes to derive structure without matching parentheses manually.
I don’t think one additional space of indentation weakens vertical scanning: the difference is too small for some other symbol to end up above an argument and confuse the reader, or to make the parentheses harder to match.
This argument is especially suspect since we know that there exist forms with 2 space indentation in the same standard.
And for the second rule I don’t agree with:
80 character line limit
This archaic rule is a throwback to the 70s with the 80 column terminals, and yet here we are, 40 years later, with 4k screens, still having to adhere to the same principles C programmers did before desktop programming was even a thing.
Let’s examine the failings of this rule and the reasons behind it.
Lisp needs more horizontal budget
Programming Lisp with an 80 character budget is like programming C with a 60 character budget.
There are commonly used constructs that don’t increase the level of indentation in Algol-family languages
like C but do in Lisp. The most obvious is introducing a local variable.
In C it would be simply:
int i = 0;
i = i + 1;
No additional indentation was introduced. Here’s Clojure:
(let [i 0]
  (inc i))
A let form increases indentation.
In general, S-expressions introduce more indentation than general C-style language code, hence they need more horizontal space.
Dealing with deep nesting is tricky
So what do you do when you reach the 80 column limit?
One way is to break the line, but under the established formatting rules this doesn’t really free up much space.
Another approach is to factor out the nested expressions into functions.
This has severe shortcomings.
- it severely pollutes namespace public symbol pool with functions used only at one call-site
- naming is a hard problem and naming these functions is generally impossible unless you use really long function names that all but ruin any space benefit to using them
- if the expression uses a lot of local values then the function has very many arguments
Another alternative is to break up expressions into let bindings instead of functions. This still has a problem with succinct names for similar values:
(let [transactions (get-transaction ......
      transactions-with-confirmed-date (filter ..... transactions)
      transactions-with-confirmed-date-and-domestic-currency ....]
When the expression breaks down into subexpressions that are each "the previous value, but with a slight twist", it becomes really hard to come up with names.
The only really adequate solution is threading macros, but those don’t work that well in all situations.
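As a rough sketch of what that looks like, here is the transaction example written with the ->> threading macro, so the intermediate values need no names. The data shape (the keys :confirmed-date, :currency and :amount) and the function name confirmed-domestic-total are assumptions made up for this illustration.

```clojure
;; Hypothetical sample data; the keys are assumptions for this sketch.
(def transactions
  [{:id 1 :confirmed-date "2019-03-01" :currency :eur :amount 10}
   {:id 2 :confirmed-date nil          :currency :eur :amount 20}
   {:id 3 :confirmed-date "2019-03-02" :currency :usd :amount 30}
   {:id 4 :confirmed-date "2019-03-05" :currency :eur :amount 40}])

(defn confirmed-domestic-total
  "Sums the amounts of confirmed transactions in the domestic currency."
  [txs domestic-currency]
  (->> txs
       (filter :confirmed-date)                       ; keep confirmed only
       (filter #(= domestic-currency (:currency %)))  ; keep domestic only
       (map :amount)
       (reduce + 0)))

(confirmed-domestic-total transactions :eur)
```

Each step stays at the same indentation level. This only works because every step takes the previous value as its last argument, which is one of the situations where threading fits well.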
The main reason for this restriction in 2019 is flawed tools
The main reason people still advocate for this limit in 2019 is that some of them work in Emacs on a single laptop monitor and split that monitor into multiple text buffers, each of which ends up so small that they need an 80 character limit.
I work with multiple large monitors and most of my screen is empty in these codebases.
But why does the industry need to adjust to the lowest common denominator? Imagine professional photographers adjusting everything to work with the smallest lenses because there are people who use those.
Conclusion
I know styling is a very personal topic, but at least on Java projects I could download the client’s style and have the IDE format the file on each save. I cannot do that in Clojure because this style is not lintable at all, so now everyone on the team spends time reviewing, fixing, and rebasing branches over style mistakes.
It’s not very smart.