Thursday, April 26, 2012

Scoring a Clojurebreaker Game


As a Clojure programmer, one question you will often ask is, “Where do I need state to solve this problem?” Or, better yet, “How much of this problem can I solve without using any state?” With Clojurebreaker (and with many other games), the game logic itself is a pure function. It takes a secret and a guess and returns a score. Identifying this fact early gives us two related advantages:
·        The score function will be trivial to write and test in isolation.
·        We can comfortably proceed to implement score without even thinking about how the rest of the system will work.
Scoring itself divides into two parts: tallying the exact matches and tallying the matches that are out of order. Each of these parts can be its own function. Let’s start with the exact matches. To make things concrete, we will pick a representation for the pegs that facilitates trying things at the REPL: the four colors :r (red), :g (green), :b (blue), and :y (yellow). The function will return the count of exact matches, which we can turn into black pegs in a separate step later. Here is the shell of the function we think we need:

clojurebreaker/snippets.clj
(defn exact-matches
"Given two collections, return the number of positions where
the collections contain equal items."
[c1 c2])
Hold the phone—that doc string doesn’t say anything about games or colors or keywords. What is going on here? While some callers (e.g., the game) will eventually care about the representation of the game state, exact-matches doesn’t need to care. So, let’s keep it generic. A key component of responsible Clojure design is to think in data, rather than pouring object concrete at every opportunity.
When described as a generic function of data, exact-matches sounds like a function that might already exist. After searching through the relevant namespaces (clojure.core and clojure.data), we discover that the closest thing to exact-matches is clojure.data’s diff. diff recursively compares two data structures, returning a three-tuple of things-in-a, things-in-b, and things-in-both. The things-in-both is nothing other than the exact matches we are looking for.
Try it at the REPL:

(require '[clojure.data :as data])
(data/diff [:r :g :g :b] [:r :y :y :b])
-> [[nil :g :g] [nil :y :y] [:r nil nil :b]]
The non-nil entries in [:r nil nil :b] are the exact matches when comparing r/g/g/b and r/y/y/b. With diff in hand, the implementation of exact-matches is trivial:

clojurebreaker/src/clojurebreaker/game.clj
(defn exact-matches
"Given two collections, return the number of positions where
the collections contain equal items."

[c1 c2]
(let [[_ _ matches] (data/diff c1 c2)]
(count (remove nil? matches))))
Again, we test at the REPL against an example input:

(exact-matches [:r :g :g :b] [:r :y :y :b])
2
Now let’s turn our attention to the unordered matches. To calculate these, we need to know how many of each colored peg are in the secret and in the guess. This sounds like a job for the frequencies function:

(def example-secret [:r :g :g :b])
(frequencies example-secret)
-> {:r 1, :g 2, :b 1}
(def example-guess [:y :y :y :g])
(frequencies example-guess)
-> {:y 3, :g 1}
To turn those two frequencies into the unordered-matches, we need to do two additional things:
·        Consider only the keys that are present in both the secret and the guess
·        Count only the overlap (i.e., the minimum of the vals under each key)
Again, we hope these operations already exist, and happily they do. You can keep the keys you need with select-keys:

(select-keys (frequencies example-secret) example-guess)
-> {:g 2}
(select-keys (frequencies example-guess) example-secret)
-> {:g 1}
And you can count the overlap between two frequency maps using merge-with:

(merge-with min {:g 1} {:g 2})
-> {:g 1}
Combining frequencies and select-keys and merge-with gives the following definition for unordered-matches:

clojurebreaker/src/clojurebreaker/game.clj
(defn unordered-matches
"Given two collections, return a map where each key is an item
in both collections, and each value is the number of times the
value occurs in the collection with fewest occurrences."

[c1 c2]
(let [f1 (select-keys (frequencies c1) c2)
f2 (select-keys (frequencies c2) c1)]
(merge-with min f1 f2)))
which, of course, we should verify at the REPL:
(unordered-matches [:r :g :g :b] [:y :y :y :g])
-> {:g 1}
That’s nice, with one subtlety. unordered-matches counts matches regardless of order, while the game will want to know only the matches that are not in the right order. Even though the game doesn’t seem to need unordered-matches, writing it was a win because of the following:
·        unordered-matches does exactly one thing. To write a not-ordered match, we would have to reimplement exact-matches inside unordered-matches.
·        The two simple functions we just wrote are exactly the functions we need to compose together to get the not-ordered semantics. Just subtract the results of exact-matches from the results of unordered-matches.
With the two primitives in place, the score operation simply compounds them:

clojurebreaker/src/clojurebreaker/game.clj
(defn score
[c1 c2]
(let [exact (exact-matches c1 c2)
unordered (apply + (vals (unordered-matches c1 c2)))]
{:exact exact :unordered (- unordered exact)}))
And the REPL rejoices:
(score [:r :g :g :b] [:r :y :y :g])
-> {:exact 1, :unordered 1}
At this point, we have demonstrated a partial answer to the question, “What is a good workflow for writing code?” In summary:

·        Break apart the problem to identify pure functions.
·        Learn the standard library so you can find functions already written.
·        Pour no concrete (use data as data).
·        Test inside out from the REPL.
In our experience, programmers trying this workflow for the first time make two typical mistakes:
·        Coding too much
·        Complicating the tests
You have written too much code whenever you don’t understand the behavior of a form, but you haven’t yet tested and understood all of its subforms. Many developers have an intuition of “write X lines of code and then test,” where X is the smallest number of lines that can do something substantial. In Clojure, X is significantly smaller than one, which is why we emphasize building functions inside out at the REPL.
“Complicating the tests” is more subtle, and we will take it up in the next section.

No comments:

Post a Comment