clojure.spec Gotchas

Intro

Every Clojure programmer has either used or at least heard of clojure.spec, and I’m confident that many have experimented with Schema, clojure.spec, or Malli at least once. Today I won’t be comparing them. Instead, I want to highlight common Spec pitfalls and misuses, and show how to avoid them. We’ll explore how a tool designed to foster confidence can occasionally cause confusion.

Here are some examples of spec-related issues:

  • A newly added service that coerces endpoint parameters fails in production, even though everything appears to work fine locally and all tests pass.
  • Running a subset of tests fails, yet running all tests passes.
  • Changing the order of test source paths breaks the tests.
  • Introducing a spec in one part of the codebase breaks the tests and/or production in another.

Then you look at the error message and see either a missing spec or a failure to satisfy a spec. You think “This is not my spec!”, “It shouldn’t be there”, “Why did it work until now?” and simply “WTF?” In this article I attempt to answer these questions, explain what is going on, and show that the behaviour is deterministic even when it doesn’t seem to be.

Dictionary

Before I start, I want to introduce a couple of definitions so that we speak the same language.

Unqualified keyword: A keyword without any namespace. It’s just the name without any slashes. For example, :name.

Qualified keyword (or namespaced keyword): A keyword that includes a namespace. This can be either in the form of a namespace alias or the full namespace. For example, :person/name where person is the namespace part.

Fully qualified keyword: A keyword that includes its full and exact namespace. For instance, :com.company.person/name where com.company.person is the full namespace and name is the keyword.

Namespace alias qualified keyword (or alias-namespaced keyword): A keyword that is prefixed with a namespace alias, which is a shorthand reference to a full namespace. The actual keyword will resolve to its fully qualified form based on the current context or the namespace in which the alias is defined. For example, ::person/name, assuming person is an alias for a longer namespace.

Pseudo-qualified keyword (or pseudo-namespaced keyword, or conventionally-namespaced keyword, entity-namespaced keyword): These keywords look like they are qualified, but their namespace doesn’t correspond to any actual Clojure namespace. An example might be :person/name, where person isn’t necessarily a valid namespace in the codebase but serves to categorize the name keyword.

(qualified-keyword? :person/name) ;;=> true             ; pseudo-qualified keyword
(qualified-keyword? :com.company.person/name) ;;=> true ; fully-qualified keyword
(qualified-keyword? ::name) ;;=> true                   ; namespace alias qualified keyword
(qualified-keyword? :name) ;;=> false                   ; unqualified keyword

Besides the last one, all of these keywords are qualified.
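To make the dictionary a bit more tangible, here is a small REPL sketch that takes a few keywords apart. The `demo.keywords` namespace is made up for illustration, and the sketch assumes that no namespace called `person` is loaded in your process.

```clojure
(ns demo.keywords)

;; A keyword is just a namespace string plus a name string.
(namespace :person/name)          ;=> "person"
(name :person/name)               ;=> "name"
(namespace :name)                 ;=> nil

;; ::name is expanded by the reader to the current namespace.
(namespace ::name)                ;=> "demo.keywords"
(= ::name :demo.keywords/name)    ;=> true

;; Nothing checks that the namespace part names a real, loaded namespace.
(find-ns 'person)                 ;=> nil (assuming no such namespace is loaded)
(some? (find-ns 'demo.keywords))  ;=> true
```

The last two lines are the whole difference between a pseudo-qualified keyword and one qualified by an actual namespace: Clojure itself doesn’t care, only we do.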

Without further ado, let’s delve into some of the basic principles about Spec and Clojure.

Spec briefly

For the purposes of this article, I’m assuming you’re familiar with Spec. My goal isn’t to introduce it from scratch; nonetheless, I want to lay some groundwork for how it works. Here are some key points:

  • clojure.spec keeps its registry in global mutable state: a map held in an atom.
  • Qualified keywords serve as the keys in this registry map.
  • Both pseudo- and fully-qualified keywords are valid keys.
  • Specs are not known until they are registered.
  • While it’s feasible to register a spec at any stage in a process’s lifetime, the bulk of registrations occur at startup when the tree of namespaces is loaded.
  • Namespaces are loaded only when they are required, that is, when they are reachable from the tree of required namespaces.
  • A spec that refers to another spec by keyword doesn’t look that keyword up at registration time; the registry is read when the spec is put into action. This principle also applies to nested specs.
  • If a keyword has a registered spec and this keyword is also a key in a map under validation, this spec will be effective, even if s/keys doesn’t explicitly mention it.
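The first few points can be checked at the REPL. The following is a minimal sketch, assuming a scratch namespace `demo.registry` (a made-up name) in which `::total` hasn’t been registered yet.

```clojure
(ns demo.registry
  (:require [clojure.spec.alpha :as s]))

;; Before registration the key is simply absent from the registry map.
(s/get-spec ::total)               ;=> nil
(contains? (s/registry) ::total)   ;=> false

;; Using an unregistered key directly as a spec throws.
(try
  (s/valid? ::total 10)
  (catch Exception e
    (.getMessage e)))              ;=> "Unable to resolve spec: :demo.registry/total"

;; s/def adds an entry under the keyword; from now on the spec is known.
(s/def ::total pos-int?)
(contains? (s/registry) ::total)   ;=> true
(s/valid? ::total 10)              ;=> true
```

Whether that exception shows up in your process therefore depends only on whether the namespace containing the `s/def` was loaded before the spec was used.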

That was a brief, bullet-point summary, and if you like bullet points, here are a few more. The key weak points behind these issues are:

  • global mutable state,
  • overrides are possible,
  • it is easy to introduce implicit dependencies between namespaces,
  • a design flaw in Spec: it complects the keyword used as a registry key (a sort of address where our spec definition lives) with the keyword used as a key in the data under examination.

Now let’s see some examples.

Meditations

Let’s analyse some examples together.

Example 1 - registry mutation

(ns com.company.billing
  (:require [clojure.spec.alpha :as s]))

(s/def ::person (s/keys :req-un [::name]))
(comment
 (s/valid? ::person {:name "Morty"}) ;=> true
 (s/valid? ::person {:name "Rick137"})) ;=> true

;; we register a new spec and try again
(s/def ::name (s/and string? #(re-matches #"[a-zA-Z]*" %)))
(comment
 (s/valid? ::person {:name "Rick137"}); => false
 (s/explain-data ::person {:name "Rick137"})
 ; => #:clojure.spec.alpha{:problems ({:path [:name],
 ;                                     :pred (clojure.core/fn [%] (clojure.core/re-matches #"[a-zA-Z]*" %)),
 ;                                     :val "Rick137",
 ;                                     :via [:com.company.billing/person :com.company.billing/name],
 ;                                     :in [:name]}),
 ;                         :spec :com.company.billing/person,
 ;                         :value {:name "Rick137"}}
 ,)

;; then register the spec again
(s/def ::name string?)
(comment
 (s/valid? ::person {:name "Rick137"})); => true

We register the ::person spec (line 4). It is valid for both {:name "Morty"} and {:name "Rick137"} (lines 6-7), because initially there is no spec registered under ::name, so s/keys only checks that the :name key exists. Then, in line 10, we register a spec under ::name. {:name "Morty"} is still valid, but {:name "Rick137"} is now in trouble. Finally, we register ::name once more as a plain string?, and {:name "Rick137"} is valid again.

This shows that the registry mutates. Of course, redefining a fully-qualified spec like this is unlikely to happen by accident, but let’s consider a similar, slightly changed scenario.

Example 2 - spec overriding

The first namespace holds billing logic. We deal with legal matters here, and we want to ensure the person enters a valid legal name:

(ns com.company.billing
  (:require [clojure.spec.alpha :as s]))

(s/def :person/name (s/and string? #(re-matches #"[a-zA-Z]*" %)))
(s/def ::person (s/keys :req-un [:person/name]))

(defn valid-person? [person]
  (s/valid? ::person person))
(comment
 (valid-person? {:name "Morty"}) ; => true
 (valid-person? {:name "Rick137"})) ; => false

The commented calls of valid-person? show that validation works. “Morty” is a valid name, while “Rick137” is not.

Then we have a player namespace. Its purpose is entirely different. We are much more relaxed about the name of the user: a simple string? is enough.

(ns com.company.player
  (:require [clojure.spec.alpha :as s]))

(s/def :person/name string?)
(s/def ::person (s/keys :req-un [:person/name]))

(defn valid-person? [person]
  (s/valid? ::person person))
(comment
 (valid-person? {:name "Morty"}) ; => true
 (valid-person? {:name "Rick137"})) ; => true

Again in the commented code we have some samples. This time both “Morty” and “Rick137” are valid names.

Then the service is running. Both com.company.billing/valid-person? and com.company.player/valid-person? can be called interchangeably. But something strange seems to be going on: “Rick137” looks like a valid name to the billing logic. We open our REPL and dig in:

(ns user
  (:require [clojure.spec.alpha :as s]
            [com.company.billing :as billing]
            [com.company.player :as player]))

(comment
 (billing/valid-person? {:name "Morty"}) ; => true
 (billing/valid-person? {:name "Rick137"}) ; => true
 (player/valid-person? {:name "Morty"}) ; => true
 (player/valid-person? {:name "Rick137"}) ; => true

 (s/form :person/name)) ; => clojure.core/string?

It turns out billing/valid-person? no longer respects its spec. We have overridden :person/name in the player namespace, which was loaded after billing. If we check what is under the key, we see it is clearly the string? predicate alone. There are not two entries for :person/name; there is only one. All that other stuff? Doesn’t mean squat. Just the last thing registered.
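Since s/def overrides silently, one way to catch this kind of accident early is to wrap registration in a guard. The sketch below is only an illustration: `def-guarded` is a hypothetical macro of my own, not part of clojure.spec, and comparing printed forms is a crude assumption that I haven’t exercised against every kind of spec.

```clojure
(ns demo.guard
  (:require [clojure.spec.alpha :as s]))

(defmacro def-guarded
  "Hypothetical helper, not part of clojure.spec: registers k like s/def,
  but throws when k already holds a spec whose printed form differs."
  [k spec-form]
  `(let [old# (s/get-spec ~k)
         new# (s/spec ~spec-form)]
     (when (and old# (not= (pr-str (s/form old#)) (pr-str (s/form new#))))
       (throw (ex-info (str "Spec " ~k " is already registered with a different form")
                       {:key ~k :existing (s/form old#) :attempted (s/form new#)})))
     (s/def ~k ~spec-form)))

(def-guarded :person/name string?)  ; first registration
(def-guarded :person/name string?)  ; same form again (e.g. a namespace reload) is accepted

(try
  (def-guarded :person/name int?)   ; a different form under the same key
  (catch clojure.lang.ExceptionInfo e
    (:key (ex-data e))))            ;=> :person/name
```

It only protects registrations that go through the guard; a plain s/def anywhere else in the process, including in a library, still overrides without a word.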

Example 2 - fix

The fix for example 2 is very simple. Since we use an unqualified keyword (:name) to store the name in our data, we are free to use a fully-qualified spec combined with :req-un:

--- a/example2/src/com/company/billing.clj
+++ b/example2/src/com/company/billing.clj
@@ -4,2 +4,2 @@
-(s/def :person/name (s/and string? #(re-matches #"[a-zA-Z]*" %)))
-(s/def ::person (s/keys :req-un [:person/name])) ;=> :com.company.billing/person
+(s/def ::name (s/and string? #(re-matches #"[a-zA-Z]*" %)))
+(s/def ::person (s/keys :req-un [::name])) ;=> :com.company.billing/person
--- a/example2/src/com/company/player.clj
+++ b/example2/src/com/company/player.clj
@@ -4,2 +4,2 @@
-(s/def :person/name string?)
-(s/def ::person (s/keys :req-un [:person/name]))
+(s/def ::name string?)
+(s/def ::person (s/keys :req-un [::name]))

This change entirely solves the problem. But is it always that straightforward to fix? No. Let’s look at the next example.

Example 3 - pseudo-qualified key in data

What if we validate data that uses pseudo-qualified keywords? Not :name as before, but :person/name?

(ns com.company.billing
  (:require [clojure.spec.alpha :as s]))

(s/def :person/name (s/and string? #(re-matches #"[a-zA-Z]*" %)))
(s/def ::person (s/keys :req [:person/name]))

(defn valid-person? [person]
  (s/valid? ::person person))
(comment
 (valid-person? {:person/name "Morty"}) ; => true
 (valid-person? {:person/name "Rick137"})) ; => false

(ns com.company.player
  (:require [clojure.spec.alpha :as s]))

(s/def :person/name string?)
(s/def ::person (s/keys :req [:person/name]))

(defn valid-person? [person]
  (s/valid? ::person person))
(comment
 (valid-person? {:person/name "Morty"}) ; => true
 (valid-person? {:person/name "Rick137"})) ; => true

(ns user
  (:require [com.company.billing :as billing]
            [com.company.player :as player]))

(comment
 (billing/valid-person? {:person/name "Morty"}) ; => true
 (billing/valid-person? {:person/name "Rick137"}) ; => true
 (player/valid-person? {:person/name "Morty"}) ; => true
 (player/valid-person? {:person/name "Rick137"})) ; => true

This example is the same as example 2, except that we use a qualified keyword as a key in the map. It is not qualified by an actual namespace; it is a pseudo-qualified keyword. The thing is, if a map uses qualified keys, you can’t use :req-un and apply the fix as before. With :req-un, spec would look for unqualified keys. Previously that was fine; here it is not. If our data uses qualified keys, we need qualified specs, and because these keys are pseudo-qualified, the spec has to be registered under a pseudo-qualified keyword too. Looks like the universe decided to give us a break.
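If the difference between :req and :req-un feels abstract, this small sketch shows which key each of them looks for. It assumes a scratch namespace `demo.req-un` (a made-up name) and registers :person/name as a plain string?.

```clojure
(ns demo.req-un
  (:require [clojure.spec.alpha :as s]))

(s/def :person/name string?)

;; :req-un strips the namespace and looks for the key :name.
(s/valid? (s/keys :req-un [:person/name]) {:name "Morty"})         ;=> true
(s/valid? (s/keys :req-un [:person/name]) {:person/name "Morty"})  ;=> false, :name is missing

;; :req looks for the qualified key exactly as written.
(s/valid? (s/keys :req [:person/name]) {:person/name "Morty"})     ;=> true
(s/valid? (s/keys :req [:person/name]) {:name "Morty"})            ;=> false, :person/name is missing
```

With :req-un the registry key and the data key can differ, which is what made the fix in example 2 possible. With :req they are one and the same keyword.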

Solution? If we want to keep using clojure.spec, we need to either:

  • unify the :person/name spec and keep it in one place, most likely sacrificing distinctness,
  • transform data before approaching spec,
  • get rid of pseudo-qualified keys from our logic.

Each of these comes with trade-offs.

If you decide to unify :person/name and keep it in one place, you most likely lose the precision of your specs, and it still won’t prevent problems from being introduced. It is still easy to misuse by someone less experienced with spec, and we still share the registry. It may be unlikely, but it is possible that changes coming from upstream libraries cause a naming conflict with your pseudo-qualified keywords.

If you decide to transform data at the time you consume it, you end up with a massive amount of boilerplate.

Getting rid of pseudo-qualified keys in your logic is the best way to go. It might mean some refactoring, but it doesn’t have to be done everywhere at once. If you use an attribute-based database like Datomic and this is the source of your pseudo-qualified keywords, keep the pseudo-qualified specs close to the database and provide translated, properly qualified keys to the rest of the app.
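A translation step at the boundary could look roughly like this. It is a minimal sketch: `requalify-keys` is a hypothetical helper name, it only touches the top level of the map, and the target namespace string is whatever real namespace owns the specs in your codebase.

```clojure
(ns demo.translate)

(defn requalify-keys
  "Hypothetical helper: returns m with every qualified keyword key moved to
  target-ns. Unqualified keys are left as they are. Top level only."
  [m target-ns]
  (into {}
        (map (fn [[k v]]
               [(if (qualified-keyword? k)
                  (keyword target-ns (name k))
                  k)
                v]))
        m))

(requalify-keys {:person/name "Rick137" :person/age 70} "com.company.billing")
;=> {:com.company.billing/name "Rick137", :com.company.billing/age 70}
```

After this step the data carries keys that only com.company.billing registers specs for, so a :person/name spec registered somewhere else no longer has anything to match.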

Let’s move to the next example, which is even more mind-blowing.

Example 4 - s/keys helpful to a fault

This time we extracted :person/name to a separate namespace, com.company.spec:

(ns com.company.spec
  (:require [clojure.spec.alpha :as s]))

(s/def :person/name string?)

In addition to this, we added :person/age in the billing namespace to validate that the person is over 18.

(ns com.company.billing
  (:require [clojure.spec.alpha :as s]
            [com.company.spec]))

(s/def :person/age (s/and number? #(> % 18)))
(s/def ::person (s/keys :req [:person/name :person/age]))

(comment
 (s/valid? ::person {:person/name "Morty" :person/age 14}) ; => false
 (s/valid? ::person {:person/name "Rick137" :person/age 70})) ; => true

We do it only for the billing namespace. The player namespace still cares, supposedly, only about :person/name. Doesn’t it? Let’s see.

(ns com.company.player
  (:require [clojure.spec.alpha :as s]))

(s/def ::person (s/keys :req [:person/name]))

(comment
 (s/valid? ::person {:person/name "Morty" :person/age 14}) ; => false
 (s/valid? ::person {:person/name "Rick137" :person/age 70}) ; => true

 (s/explain-data ::person {:person/name "Morty" :person/age 14})
 ; => #:clojure.spec.alpha{:problems ({:path [:person/age],
 ;                                     :pred (clojure.core/fn [%] (clojure.core/> % 18)),
 ;                                     :val 14,
 ;                                     :via [:com.company.player/person :person/age],
 ;                                     :in [:person/age]}),
 ;                         :spec :com.company.player/person,
 ;                         :value #:person{:name "Morty", :age 14}}

 (s/form ::person)) ; => (clojure.spec.alpha/keys :req [:person/name])

It turns out that spec complains in the player namespace. Oh wow, how totally unexpected… Not.

We check (s/form ::person) and it returns the spec we expect to see, but spec still complains. Burp. Stuck again.

Why is this happening?

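When using s/keys, any key in the map being validated that has an associated spec in the global registry will be checked, whether or not s/keys lists it. Here :person/age was registered by the billing namespace, so once billing is loaded into the process, player validates it too. This can be considered both a feature (for keeping data consistency) and a potential pitfall, as you can see in this example.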
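You can see this behaviour in isolation with an s/keys that names no keys at all. The sketch below uses made-up :demo/... keywords and a scratch namespace `demo.keys`; the last line shows the "transform data first" option in its simplest form, narrowing the map with select-keys before validation.

```clojure
(ns demo.keys
  (:require [clojure.spec.alpha :as s]))

(s/def :demo/age (s/and number? #(> % 18)))

;; s/keys lists no keys here, yet the registered qualified key is still checked.
(s/valid? (s/keys) {:demo/age 14})                   ;=> false
(s/valid? (s/keys) {:demo/age 70})                   ;=> true

;; Unqualified keys and qualified keys without a registered spec are ignored.
(s/valid? (s/keys) {:age 14 :demo/unregistered 14})  ;=> true

;; Narrowing the map first limits validation to the keys we name ourselves.
(s/valid? (s/keys)
          (select-keys {:demo/name "Morty" :demo/age 14} [:demo/name]))  ;=> true
```

The select-keys trick works, but it has to be repeated at every validation site, which is exactly the boilerplate mentioned earlier.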

OK. What can we do about it? Basically, we have the same options as last time:

  • unify - this time we would have to unify the entire person: either enforce the age check everywhere or remove it,
  • transform data before approaching spec - boilerplate,
  • get rid of pseudo-qualified keys from your logic - recommended.

Summary

Wubba Lubba Dub Dub! clojure.spec is a powerful tool, but it has its nuances. The global, mutable nature of the spec registry means that the order of loading and the state of the registry can impact spec behavior in non-obvious ways. You let your guard down for a moment, let yourself or your team apply poor practices like copy-paste programming, and very quickly you are in a place where spec becomes a problem.

You can’t carelessly use pseudo-qualified keywords with spec. Implicit dependencies between namespaces and spec redefinitions will make your life miserable. On top of that, you need to be aware of the spec design decision that complects the name of the spec registry key with the name of the key in the data under test.

Here are a few takeaway points.

Avoid pseudo-qualified keywords: If there is one root of all evil in spec, it is pseudo-qualified keywords. If you use clojure.spec, stay away from them. If the reason for using pseudo-namespaced specs is an attribute-based database (like Datomic), introduce a translation layer. Then you can keep pseudo-qualified specs very close to the database, lower their importance, and use keywords qualified by an actual namespace in the rest of the logic. If you want to use pseudo-qualified keywords, don’t use Spec, use Malli. Malli can handle them without such problems.
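If you want to know how many pseudo-qualified specs already live in your process, the registry can be asked directly. This is a rough sketch: `pseudo-qualified-spec-keys` is a hypothetical helper, and it treats a key as pseudo-qualified whenever its namespace part doesn’t name a currently loaded namespace, which is an approximation.

```clojure
(ns demo.audit
  (:require [clojure.spec.alpha :as s]))

(defn pseudo-qualified-spec-keys
  "Hypothetical helper: the set of registry keys whose namespace part
  does not correspond to a loaded namespace."
  []
  (->> (keys (s/registry))
       (filter qualified-keyword?)
       (remove #(find-ns (symbol (namespace %))))
       set))

(s/def :person/name string?)  ; pseudo-qualified, assuming no `person` namespace is loaded
(s/def ::name string?)        ; qualified by the actual namespace demo.audit

(contains? (pseudo-qualified-spec-keys) :person/name)  ;=> true
(contains? (pseudo-qualified-spec-keys) ::name)        ;=> false
```

Run after all application namespaces are loaded, for instance from a test, it gives a list of keys worth moving behind a translation layer.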

Keep dependencies explicit: If one namespace depends on a spec from another, make that dependency explicit using :require. Prefer alias-namespaced keywords over fully-namespaced keywords. If you encounter a circular dependency between namespaces, don’t try to be smart by using a fully-namespaced keyword and skipping the require. Resolve it. With pseudo-qualified keywords it is easy to skip the require; don’t skip it. If ns A defines specs and ns B uses them, B should require A. Otherwise it is not guaranteed that the spec exists when you need it.

Know your tools: spec is great, but it has its flaws. It can’t serve everyone. If you like to use pseudo-namespaced keys, Malli will serve you way better. Give it a go. If you prefer spec, you need to be aware of where it works well, where it doesn’t, and how to deal with it.

If you found it helpful, glad to hear that. If you have a spec problem of a different nature, let me know. Happy to help.


The article can also be found on Medium. Feel free to leave a comment there or send me an email if you wish to share your thoughts.