CatFace, where cats have a face. (Diving into Clojure, part four)

Today, I’m taking a break from my mass downloader experiment in Clojure (part 1, parts 2 and 3). A discussion with a friend about a particularly intrepid cat I saw in my backyard led to speculation about the existence of a LinkedIn for cats. I thought the idea was hilarious and that it’d be a fun little web-app to roll up. I could do it in Sinatra, but since I’m trying to learn Clojure and Heroku supports it, I figured that’s the way I’d go.

This is a less complicated problem to solve than the mass downloader, and also easier for me to express in a pure functional style. I don’t want to have to use a database for this project, but I’d like people to be able to link to cat profiles. My design is to use integer ‘profile IDs’ for cats, which serve as seeds for a pseudorandom number generator. As long as the number generator is stable, I should generate the same random cat profile for the same profile ID every time.

I’m calling it CatFace, where cats have a face. After rolling up a new app in Leiningen, I started working. First, I decided to define lists of cat names, places, descriptions, etc.

(def cat-names
  ["Fluffy" "Spot" "Paco" "Curly" "Felix"])

(def cat-homes
  ["Under the porch for some damn reason"
   "Pissing on your nice new couch"
   "Trying to sleep on your face"
   "Under your feet on the stairs"])

(def cat-colors
  ["black", "white", "tuxedo", "calico", "tortoiseshell", "red tabby", "brown tabby", "siamese"])

All of these are pretty simple – just vectors with names. I’ll want many, many more names in the future. As long as I treat these strictly as seqs, I could read them as lines out of a file, load them out of a database, or whatever I like – other code will not have to change.

The ‘destiny’ (‘headline’, in LinkedIn speech) for the cat is a bit more complicated:

(def cat-destiny
  (lazy-cat (repeat 25 "just a cat")
            ["this cat is going places"]))

This behaves essentially like an array of 25 “just a cat”s and one “this cat is going places”, but it’s lazily evaluated. By concatenating repeat seqs of various lengths, I can weight the random choice to favor some results over others. As far as I understand, these numbers could be very high without actually generating a huge array, as long as I don’t realize the whole seq at once or hold onto its head while traversing it. (Neither condition really holds here, though: the def keeps a reference to the head, and the random pick later in this post uses count and nth, which walk the whole seq. Whether it would perform acceptably in terms of CPU time is another question.) (_ed: I originally used conj for this, but conj only adds a single item to a collection. lazy-cat is what glues two seqs together._)
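To make the weighting and the conj-versus-lazy-cat distinction concrete, here is a small sketch to paste into a REPL. The name weighted-destinies is made up for illustration; it simply mirrors cat-destiny above.

```clojure
;; Illustrative only: weighted-destinies mirrors cat-destiny from the post.
(def weighted-destinies
  (lazy-cat (repeat 25 "just a cat")
            ["this cat is going places"]))

;; 26 entries in total, so the rare destiny has a 1-in-26 chance.
(count weighted-destinies)
;; => 26

(frequencies weighted-destinies)
;; => {"just a cat" 25, "this cat is going places" 1}

;; conj adds its argument as ONE element instead of splicing it in.
(conj ["a" "b"] ["c" "d"])
;; => ["a" "b" ["c" "d"]]

;; concat (or lazy-cat) splices the two seqs together.
(concat ["a" "b"] ["c" "d"])
;; => ("a" "b" "c" "d")
```

Changing the 25 to a bigger number makes the rare destiny rarer without touching any of the code that picks from the seq.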

The built-in Clojure random number interface uses the shared random number generator. I could manually set the seed on each request, but that would not be pure functional and could conceivably affect other parts of the system I don’t intend to make predictable. So I created a set of functions to create and work with a seeded random number generator.

(import 'java.util.Random)

(defn rand-gen [seed]
  (Random. seed))

(defn rand-with
  ([rands] (.nextInt rands))
  ([rands n] (.nextInt rands n)))

(defn rand-int-with [rands n]
  (int (rand-with rands n)))

(defn rand-nth-with [rands coll]
  (nth coll (rand-int-with rands (count coll))))

These all correspond directly to common Clojure functions, except they take a random number generator created with rand-gen. All of this allows me to finally build a cat profile:
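The whole design rests on one property of java.util.Random: two generators created with the same seed hand back the same sequence of numbers. Here is a minimal, self-contained sketch of that property. The helper seeded-ints is hypothetical and not part of CatFace; it just wraps the same kind of calls the functions above make.

```clojure
(import 'java.util.Random)

(defn seeded-ints
  "Hypothetical helper: returns a vector of the first n ints in [0, bound)
  drawn from a fresh generator seeded with seed."
  [seed n bound]
  (let [rng (Random. (long seed))]
    ;; vec forces all n draws now, while rng is still private to this call.
    (vec (repeatedly n #(.nextInt rng (int bound))))))

;; Same seed, same draws, every time.
(= (seeded-ints 42 5 100) (seeded-ints 42 5 100))
;; => true
```

Because each call builds its own generator, nothing here touches the shared generator behind rand and rand-int.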

(defn cat-profile
  "Returns a predictably generated random cat profile based on profile-id."
  [profile-id]
  (let [rands (rand-gen profile-id)
        pick-one #(rand-nth-with rands %)
        ;; max-cat, the upper bound on profile IDs, is defined elsewhere in this namespace
        rand-int-l (partial rand-int-with rands max-cat)]
    { :id      profile-id
      :name    (pick-one cat-names)
      :home    (pick-one cat-homes)
      :color   (pick-one cat-colors)
      :destiny (pick-one cat-destiny)
      :friends (map cat-profile (distinct (repeatedly 10 rand-int-l)))}))

I allocate a new random number generator using the given profile-id, and then for convenience I define a lambda (using the #() reader macro, which refers to its arguments positionally as %, %2, and so on) which uses that generator to pluck a random item out of any seq. I also define a local version of rand-int using the same generator. As long as all of my random numbers come from this generator, they’ll be repeatable in the future given the same ID.

That’s where all the major work is done – the actual body of the function is just a literal map definition! I return the ID used for later reference, and then pick a random entry out of cat-names, cat-homes, cat-colors, and cat-destiny. To generate a list of this cat’s friends, I use repeatedly to produce a seq of ten random ints, pare out any (unlikely) repeats with distinct, and then map them to cat-profile so each entry is an actual cat profile, not just an ID.

Now, this looks like trouble. I’m calling cat-profile recursively, aren’t I? And, indeed, if I run lein repl and give it a spin, it starts printing out an endlessly recursive structure:

    user=> (use 'cat-face.core)
    user=> (cat-profile 33364)
    {:id 33364, :name "Felix", :home "Trying to sleep on your face",
    :color "calico", :destiny "just a cat", :friends ({:id 10318,
    :name "Felix", :home "Pissing on your nice new couch", :color
    "calico", :destiny "just a cat", :friends ({:id 6847, :name
    "Felix", :home "Pissing on your nice new couch", :color "white",
    :destiny "just a cat", :friends ({:id 37442, :name "Felix", :home
    "Trying to sleep on your face", :color "red tabby", :destiny
    "just a cat", :friends ({:id 84367, :name "Paco", :home "Under
    your feet on the stairs", :color "tuxedo", :destiny "just a cat",
    :friends ({:id 41317, :name "Felix", :home "Under your feet on
    the stairs", ...

and there’s nothing for it but to Ctrl-C. Calling the function doesn’t actually create this structure, though – printing it does. But it doesn’t have to.

At this point, just about anyone with Clojure experience is probably yelling at the screen. OK, I got it. map creates a lazy sequence – it doesn’t actually execute cat-profile or even generate that list of random numbers until you actually try to traverse the friends list. Printing forces every nested friends list in turn, which is what never ends; you can create a profile and dig down as far as you like without ever building the whole structure.
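If the laziness of map is new to you, this little sketch makes it visible by counting how often the mapped function actually runs. The names calls, noisy-inc, lazy-result and calls-before-realizing are invented for the example.

```clojure
;; Counts how many times noisy-inc has been invoked.
(def calls (atom 0))

(defn noisy-inc [x]
  (swap! calls inc)
  (inc x))

;; Building the lazy seq does not call noisy-inc at all.
(def lazy-result (map noisy-inc [1 2 3]))

(def calls-before-realizing @calls)
calls-before-realizing
;; => 0

;; doall walks the seq, which is what finally runs the function.
(doall lazy-result)
;; => (2 3 4)

@calls
;; => 3
```

The :friends entry in a cat profile is in the same state as lazy-result before the doall: a promise of profiles, none of which exist until something walks the list.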

To get the REPL to not go crazy when presented with an infinitely recursive sequence, you can use the *print-level* setting.

    user=> (set! *print-level* 3)
    user=> (cat-profile 33364)
    {:id 33364, :name "Felix", :home "Trying to sleep on your face",
    :color "calico", :destiny "just a cat", :friends ({:id 10318,
    :name "Felix", :home "Pissing on your nice new couch", :color
    "calico", :destiny "just a cat", :friends #} {:id 3227, :name
    "Spot", :home "Under your feet on the stairs", :color "calico",
    :destiny "just a cat", :friends #} {:id 84261, :name "Paco",
    :home "Under the porch for some damn reason", :color "calico",
    :destiny "just a cat", :friends #} {:id 27080, :name "Curly",
    :home "Keyboard", :color "brown tabby", :destiny "just a cat",
    :friends #} {:id 34010, :name "Paco", :home "Under the porch
    for some damn reason", :color "siamese", :destiny "just a cat",
    :friends #} {:id 7349, :name "Paco", :home "Pissing on your nice
    new couch", :color "black", :destiny "just a cat", :friends #}
    {:id 41653, :name "Paco", :home "Under the porch for some damn
    reason", :color "calico", :destiny "just a cat", :friends #}
    {:id 97476, :name "Curly", :home "Under the porch for some damn
    reason", :color "calico", :destiny "just a cat", :friends #}
    {:id 88174, :name "Curly", :home "Under your feet on the stairs",
    :color "black", :destiny "just a cat", :friends #} {:id 59364,
    :name "Felix", :home "Under your feet on the stairs", :color
    "brown tabby", :destiny "just a cat", :friends #})}

This prints Felix #33364’s profile and all of his friends, but none of their friends – those lists are replaced with a hash symbol.
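Outside the REPL, or when I only want the limit for one expression, *print-level* can also be bound temporarily instead of set!. A small sketch, using a made-up nested map rather than a cat profile:

```clojure
;; Hypothetical data, four maps deep.
(def nested {:a {:b {:c {:d 1}}}})

;; Inside the binding, anything nested deeper than two levels prints as #.
(binding [*print-level* 2]
  (pr-str nested))
;; => "{:a {:b #}}"

;; Outside the binding the old value is back, so everything prints
;; (assuming *print-level* is at its default of nil).
(pr-str nested)
;; => "{:a {:b {:c {:d 1}}}}"
```

The trade-off is that binding only lasts for its body, which is handy in code but less convenient than set! when poking around interactively.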

So now we have a Clojure package for creating random user profiles for cats. As a convenience, let’s finish off with a function for generating an actual pseudorandom cat (with a random ID):

(defn rando-cat []
  (cat-profile (rand-int max-cat)))

This is the only part of the code which isn’t purely functional – it depends on the current state of the global random number generator.

Well, I’m not quite done. I want to put together a little web interface for this, where you can view a cat’s profile and go to their friends’ profiles. That’s going to require some design work and digging around Wikimedia Commons for cat pictures (the meager quantity of cat pictures available on the Internet is well-known, so this should be difficult). But before I write any code relying on this, I want to be sure I can rely on it. So…

Testing! Leiningen will generate a test file and run it for you once you fill it in. It looks like this:

(ns cat-face.test.core
  (:use [cat-face.core])
  (:use [clojure.test]))

(deftest replace-me ;; FIXME: write
  (is false "No tests have been written."))

Tests are expected to live in a.test.z, where your namespace is a.z. Leiningen uses the clojure.test library, which is built around the is macro. This macro takes actual code and is smart enough to detect what you’re trying to do and tell you what’s wrong – it has some special cases which look at the predicate you’re using and format the output accordingly. RSpec is the closest thing that Ruby has to this, but it’s still very different.
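For a feel of what is accepts, here is a tiny standalone sketch that can be pasted into a REPL without Leiningen. The test name is-demo and the var summary are made up; the clojure.test functions are the real ones.

```clojure
(require '[clojure.test :refer [deftest is testing run-tests successful?]])

;; Hypothetical test, not part of CatFace.
(deftest is-demo
  (testing "is takes any expression, plus an optional message"
    (is (= 4 (+ 2 2)))
    (is (instance? String "Fluffy") "names should be strings")
    ;; thrown? is one of the special forms is understands.
    (is (thrown? ArithmeticException (/ 1 0)))))

;; With no arguments, run-tests runs every test in the current namespace
;; and returns a summary map with :test, :pass, :fail and :error counts.
(def summary (run-tests))

(successful? summary)
;; => true
```

Each is form counts as one assertion, which is where the "tests containing N assertions" line in the lein test output comes from.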

So, what do I want to test? I suppose the main thing I want to verify is the structure of the data and whether the profiles are repeatable.

(deftest profile-structure
  (testing "Profile structure"
    (let [cat (rando-cat)]
      (is (instance? java.lang.Long (cat :id)))
      (is (instance? String (cat :name)))
      (is (instance? String (cat :home)))
      (is (instance? String (cat :color)))
      (is (instance? String (cat :destiny)))
      (is (instance? clojure.lang.ISeq (cat :friends))))))

This test generates a random cat profile and verifies that all the expected keys are present and the values are of the appropriate type. It runs and passes:

    [mboeh@orz:~/projects/cat-face]  % lein test

    Testing cat-face.test.core

    Ran 1 tests containing 6 assertions.
    0 failures, 0 errors.

When it fails (as it did when I expected :id to be an Integer, not a Long), it prints out the failure nicely:

    mboeh@orz:~/projects/cat-face % lein test

    Testing cat-face.test.core

    FAIL in (profile-structure) (core.clj:8)
    Profile structure
    expected: (instance? Integer (cat :id))
      actual: java.lang.Long

    Ran 1 tests containing 6 assertions.
    1 failures, 0 errors.

This is an example of the is macro being clever – it knows that instance? is checking for type, so it reaches into the code executed and extracts the actual class found. If it was just testing for true/false like a simple assert function, that information would be lost.

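Let’s also make sure that cat-profile is repeatable given the same ID. I can’t test every ID exhaustively, and a single test run can’t prove stability between runs (it should hold, though, since java.util.Random documents that the same seed always produces the same sequence), but let’s at least test that it returns the same thing 10 times: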

(deftest repeatability
  (testing "Repeatability"
    (let [cat-id 1234
          same-cats (repeatedly 10 #(cat-profile cat-id))
          diff-cats (repeatedly 10 rando-cat)]
      (is (= 1  (count (distinct same-cats))))
      (is (= 10 (count (distinct diff-cats)))))))

distinct will remove all but one of any identical entries in a seq. All right, now let’s run this and –

    ERROR in (repeatability) (
    expected: (= 10 (count (distinct diff-cats)))
      actual: java.lang.StackOverflowError: null
     at java.lang.ClassLoader.getCallerClassLoader (
        java.lang.Class.getMethods (
        clojure.lang.Reflector.getMethods (
        clojure.lang.Reflector.invokeInstanceMethod (
        cat_face.core$rand_with.invoke (core.clj:28)

Oh, hell.

Here we have our clever indefinitely-defined cat profiles biting us in the rear. distinct has to hash and compare the profiles, which forces each nested friends list in turn, recurses endlessly and kills the stack. distinct doesn’t appear to have any way to control its recursion, so we’ll need to fix it on the other end. We still want to check the repeatability of the friends, but it should be sufficient just to validate the IDs generated. So:

(deftest repeatability
  (testing "Repeatability"
    (let [cat-id 1234
          ids-only     (partial map :id)
          safe-friends #(update-in % [:friends] ids-only)
          same-cats (map safe-friends (repeatedly 10 #(cat-profile cat-id)))
          diff-cats (map safe-friends (repeatedly 10 rando-cat))]
      (is (= 1  (count (distinct same-cats))))
      (is (= 10 (count (distinct diff-cats)))))))

We create ids-only, a function which will extract only the :id value from each map in a seq. We then create safe-friends, a function which uses update-in to replace the value under the :friends key with that list of IDs. And then we apply safe-friends to both cat seqs. Now we’re safe from infinite recursion:
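To see what safe-friends does to a single profile, here is the same update-in call applied to a small hand-written map. tiny-profile is a stand-in I made up for the example; it is not output from cat-profile.

```clojure
;; A hand-written stand-in for a cat profile, with two finite friends.
(def tiny-profile
  {:id 1
   :name "Felix"
   :friends [{:id 2 :name "Spot" :friends []}
             {:id 3 :name "Paco" :friends []}]})

(def ids-only (partial map :id))

;; update-in passes the current :friends value to ids-only and stores the result.
(update-in tiny-profile [:friends] ids-only)
;; => {:id 1, :name "Felix", :friends (2 3)}
```

With real profiles the effect is the same: each friend is realized just far enough to read its :id, and the friends' own friends lists are never touched.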

    Testing cat-face.test.core

    Ran 2 tests containing 8 assertions.
    0 failures, 0 errors.

Victory! Next step: a web interface with Noir (or possibly back to mass-download).