29 May 2012

two rules about testing and a meta-rule about software engineering

Here are two very important rules to write good automatic tests for software:
  1. Don't let the tests parrot the code.
  2. Focus tests on areas where you expect failures.
Before I go on with this post, I would like you, Dear Reader, to read those rules again, think about them deeply, and answer the following questions: have you ever seen a test that was useless because it violated rule 1? Have you ever worked on a piece of software that had lots of tests, but still let lots of failures happen? In hindsight, did some or many of those failures concentrate in particular areas?
More generally, do the rules seem true to you? Do they seem helpful? Or are they so obvious that they wouldn't even need to be stated so explicitly? Or might the opposite be true: the rules are too general for an ordinary developer to apply them well without further instruction? For example, what degree of redundancy between test code and production code constitutes parroting? How would an ordinary developer know in which areas to expect the most failures?
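To make the parroting question a bit more concrete, here is a tiny made-up sketch (PriceCalculator and its discount rule are invented just for illustration, and I'm assuming plain JUnit): the first test recomputes its expectation with the very same formula as the production code, so it can never disagree with it; the second states its expectation independently and sits right at the boundary where failures are most likely.
//ParrotingVsTargetedTest.java------------------
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class ParrotingVsTargetedTest {
  // Hypothetical production code: 10% off when buying ten items or more.
  static class PriceCalculator {
    double priceFor(double unitPrice, int items) {
      return items >= 10 ? unitPrice * items * 0.9 : unitPrice * items;
    }
  }

  // Violates rule 1: the expectation is recomputed with the very same formula,
  // so this test can never disagree with the production code.
  @Test
  public void parrotsTheCode() {
    double unitPrice = 3.0;
    int items = 12;
    double expected = items >= 10 ? unitPrice * items * 0.9 : unitPrice * items;
    assertEquals(expected, new PriceCalculator().priceFor(unitPrice, items), 1e-9);
  }

  // Follows rules 1 and 2: the expected value is stated independently and the
  // test sits right at the discount boundary, where a failure is most likely.
  @Test
  public void tenthItemTriggersTheDiscount() {
    assertEquals(27.0, new PriceCalculator().priceFor(3.0, 10), 1e-9);
  }
}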
When I discovered the above two rules my attitude towards them swung wildly a couple of times. First I thought: “Wow, are those rules great. They're going to be the basis for my next book about software quality.” A bit later, I thought: “What silly rules to brag about. Any half-decent programmer is surely always following them unconsciously, just because they make so much sense.” Currently my thinking is: “Yes, yes, both are true: the good programmers follow the rules, even if they couldn't state them explicitly. But the bad programmers (or even the ordinary average wage-slave programmers of our day) will often create bad tests by ignoring just those two simple rules.”
And when I had that last thought, it occurred to me that the same is probably true about most “rules” or “guidelines” in software engineering or, for that matter, in any practical field. The best and most talented people always act in accordance with some unwritten rules and are often not aware of it. It takes other people to formulate those rules and even more people to reformulate them for specific audiences, discuss examples, do one-on-one coaching, ... until every ordinary practitioner of a particular profession has adopted the guideline into his daily routine.
That's the meta-rule:
Many good engineering (or life) rules sound simple and obvious, yet they require good judgement and/or lots of experience to be applied well.
And now, it's time for your comments!

28 May 2012

Variations on a Bowling Scorer

Instead of writing a lot about programming as I usually do, this post is made up mainly of code. The background story is that I was terribly offended by a piece of code I found on the web. So I rewrote it, documenting my process and design decisions on the side. I was very happy with the resulting code, but never had the time to rewrite the process and design description for publication. My code, however, is always written to be readable for anybody, so I am publishing it here without any comments.

You'll probably notice that the code which offended me is very clean on the surface (after all, it's written by Mr. Clean Code himself!), but what bugged me was the design of the program and the tight coupling of methods by shared variables. In fact, the coding style reminded me a lot of programs written in C. (A higher-level machine language from the 70s, for those who don't know it.)

Here's Uncle Bob's solution, copied from his long post where he and a coworker derive it TDD-style. (Code highlighting on Blogger seems really hard to do, so for the moment I need to feed you black-and-white code. Sorry for that!) Don't worry about the actual purpose of the code. If you want to learn the rules of bowling scoring, you can read them from the declarative program given below!

//Game.java----------------------------------
public class Game
{
  public int score()
  {
    return scoreForFrame(itsCurrentFrame);
  }

  public void add(int pins)
  {
    itsScorer.addThrow(pins);
    adjustCurrentFrame(pins);
  }

  private void adjustCurrentFrame(int pins)
  {
    if (firstThrowInFrame == true)
    {
      if (adjustFrameForStrike(pins) == false)
        firstThrowInFrame = false;
    }
    else
    {
      firstThrowInFrame=true;
      advanceFrame();
    }
  }

  private boolean adjustFrameForStrike(int pins)
  {
    if (pins == 10)
    {
      advanceFrame();
      return true;
    }
    return false;
  }  

  private void advanceFrame()
  {
    itsCurrentFrame = Math.min(10, itsCurrentFrame + 1);
  }

  public int scoreForFrame(int theFrame)
  {
    return itsScorer.scoreForFrame(theFrame);
  }

  private int itsCurrentFrame = 0;
  private boolean firstThrowInFrame = true;
  private Scorer itsScorer = new Scorer();
}

//Scorer.java-----------------------------------
public class Scorer
{
  public void addThrow(int pins)
  {
    itsThrows[itsCurrentThrow++] = pins;
  }

  public int scoreForFrame(int theFrame)
  {
    ball = 0;
    int score=0;
    for (int currentFrame = 0; 
         currentFrame < theFrame; 
         currentFrame++)
    {
      if (strike())
        score += 10 + nextTwoBalls();
      else if (spare())
        score += 10 + nextBall();
      else
        score += twoBallsInFrame();
    }

    return score;
  }

  private boolean strike()
  {
    if (itsThrows[ball] == 10)
    {
      ball++;
      return true;
    }
    return false;
  }

  private boolean spare()
  {
    if ((itsThrows[ball] + itsThrows[ball+1]) == 10)
    {
      ball += 2;
      return true;
    }
    return false;
  }

  private int nextTwoBalls()
  {
    return itsThrows[ball] + itsThrows[ball+1];
  }

  private int nextBall()
  {
    return itsThrows[ball];
  }

  private int twoBallsInFrame()
  {
    return itsThrows[ball++] + itsThrows[ball++];
  }

  private int ball;
  private int[] itsThrows = new int[21];
  private int itsCurrentThrow = 0;
}
Since I felt that mutable variables shouldn't be necessary in the solution at all, I first produced a solution in a purely functional programming language. Here's my solution in Haskell:
example_throws_incomplete = [1,2,3,4,10,10,1,2,3]

to_frames :: [Int] -> [([Int], [Int])]
to_frames (10:xs)  = ([10], xs) : to_frames xs
to_frames (x:y:xs) = ([x,y], xs) : to_frames xs
to_frames _        = []   -- covers empty lists and trailing bonus throws

test_to_frames = assert_equals (map fst $ to_frames example_throws_incomplete)
          [[1,2],[3,4],[10],[10],[1,2]]

-- how many bonus throws are counted into a frame?
num_bonus [10]               = 2
num_bonus [a, b] | a+b == 10 = 1
num_bonus _                  = 0

-- scoring a frame is really simple:
score_frame (frame_throws, next_throws) = sum frame_throws + sum bonus_throws
     where 
     bonus_throws = take (num_bonus frame_throws) next_throws

-- the straight-forward solution assumes that all the bonus throws are present
score_complete_game throws = scanl1 (+) $ map score_frame $ take 10 $ to_frames throws

assert_equals a b | a == b    = "ok"
                  | otherwise = "expected: " ++ show a ++ "\n"
                             ++ "but got: " ++ show b

Given how easy that was, we now have a clear picture of the solution algorithm and design, which we only need to port to Java. It's not exactly a one-to-one port, but you'll recognize the spirit of the algorithm.
Here's the code:
import java.util.List;

public class BowlingScorer {
  static class Frame {
    static final public Frame firstFrame(List<Integer> throwns) {
      return new Frame(throwns, 1, 0, 0);
    }
    public Frame nextFrame() {
      return new Frame(throwns, number+1, nextPosition(), score());
    }
    public int number() {
      return number;
    }
    public int score() {
      return score;
    }
    public boolean isStrike() {
      return throwns.get(position) == 10;
    }
    public boolean isSpare() {
      return ! isStrike() && throwsScore() == 10;
    }
    public int numThrows() {
      return isStrike() ? 1 : 2;
    }
    public int numBonus() {
      return isStrike() ? 2 : isSpare() ? 1 : 0;
    }
    private int nextPosition() {
      return position + numThrows();
    }
    private int throwsScore() {
      return sumThrowns(position, numThrows());
    }
    private int bonusScore() {
      return sumThrowns(nextPosition(), numBonus());
    }
    private int sumThrowns(int start, int count){
      int result = 0;
      for (int i = start; i < start+count; i++) {
        result += throwns.get(i);
      }
      return result;
    }
    private final List<Integer> throwns;
    private final int number; // number of frame in game, counting from 1
    private final int position; // this frame's first throw in throwns
    private final int score;
    private Frame(List<Integer> throwns, int number, int position, int previousScore) {
      this.throwns = throwns;
      this.number = number;      
      this.position = position;
      this.score = previousScore + throwsScore() + bonusScore();
    }
  }
  public int finalScore(List<Integer> throwns) {
    return getFrame(throwns, 10).score();
  }
  private Frame getFrame(List<Integer> throwns, int number) {
    // requiresThat(number, atLeast(1));
    // requiresThat(number, atMost(10));
    Frame f = Frame.firstFrame(throwns);
    while (f.number() < number) {
      f = f.nextFrame();
    }
    return f;
  }
  /*
  private static <T> void requiresThat(T thing, Matcher<T> matcher) {
    if (! matcher.matches(thing)) {
      throw new IllegalArgumentException(matcher.reason());
    }
  }  
  */
}
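If you want to try the class yourself, here's a quick sanity-check sketch (just a plain main method, nothing fancy, assuming the List<Integer> signatures above): a perfect game of twelve strikes scores 300, twenty-one throws of five pins (all spares) score 150, and a gutter game scores 0.
//BowlingScorerExample.java---------------------
import java.util.Collections;
import java.util.List;

public class BowlingScorerExample {
  public static void main(String[] args) {
    BowlingScorer scorer = new BowlingScorer();

    // A perfect game: twelve strikes score 300.
    List<Integer> perfectGame = Collections.nCopies(12, 10);
    System.out.println(scorer.finalScore(perfectGame));  // 300

    // All spares with five pins per throw: 21 throws score 150.
    List<Integer> allFives = Collections.nCopies(21, 5);
    System.out.println(scorer.finalScore(allFives));     // 150

    // A gutter game: twenty misses score 0.
    List<Integer> gutterGame = Collections.nCopies(20, 0);
    System.out.println(scorer.finalScore(gutterGame));   // 0
  }
}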
Before I start giving a lecture on why my program is better, I'll just let you read and form your own opinion. If you want some more food for thought, try to find the one concern that Uncle Bob's solution treats in two different places in two different ways without explicitly mentioning it as a concern. (Had they found the pattern and named it, they'd probably have tried to solve it in one place only.)

If you don't have the time to study the code in detail, at least have a look at that one line in Frame(..) which says: this.score = previousScore + throwsScore() + bonusScore(); I think that this is as close as we can get to formulating a really direct and concise specification of the score of a frame.


Finally, sorry for improvising some of the Matcher / Assertion stuff as I don't have all the proper tools installed right now.

5 May 2012

Shimano Nexus/Alfine Inter-8 hub gear disassembled / reverse-engineered

On a beautiful day in 2011, when I visited my favorite Berlin bike builder and dealer on the Schöneberg Island, I noticed that he used the shell of a Shimano Inter-8 gear hub as an ash tray. When I asked him about the piece, he told me that one of his customers or friends had serviced the gear hub and not reassembled it correctly, so the ball bearings got smashed when he rode it again.
Left is the reduction stage, right the 4-gear stage
(including carrier shared by both stages).
I asked for the innards of the hub, which Conrad still had under a shelf in a corner of the shop. He gave them to me, and this week I finally got around to looking at them more closely.
I already knew that this hub is logically composed of two stages of planetary gears. One has two gears, the other four. Both can be switched independently to give eight gears, and the ratios of the cogs are tuned in such a way that the eight gears have roughly equal spacing and no gear overlap. (So logically they work like the 3×8 gears of a 24-gear derailer system, except that all the gears are shiftable in sequence.) After disassembling the hub, I counted the teeth of all the cogs and used the counts to calculate the gear ratios. I was happy to find that I got the same results that are published in several places on the Interwebs. For those who can't wait to see them, here are my calculations. All others can first read on to find out how this stuff works.
Those who've never read anything about planetary gears should start with my introductory blog post on planetary gears to learn some of the terminology. Then come back to find that I am experimenting with different terminology here (after all, this is a work in progress and I need to figure out what works best). For now, let's use “cog” for the little toothed round devices that turn inside the gear hub to translate speeds. Then we can use the word “gear” for the resulting ratios of speed, just as you ordinarily do when you say “first gear, second gear” and so on.

Now, here's a summary of the hub's architecture: since both stages share the same planet carrier, the carrier serves as the one and only power transfer between the two stages. The carrier is therefore the output of the first stage and the input of the second stage, which fits nicely with the fact that the first stage reduces speed while the second stage increases speed. In both stages the sun cogs are locked and the ring cog serves as the other moving part. (In the second stage, there are three sun cogs that are locked via switchable one-way ratchets.) One-way ratchets (aka freewheels) are used in many places in the hub. Thanks to the freewheels, several gears can be switched “active” at the same time, which has two key advantages. First, when shifting gears, the mechanism only needs to engage or disengage the higher gear, while the lower one will automatically freewheel without being actuated by the shifting mechanism. Second, during shifting (or when there's a problem with the shifting cable) there is never a no-gear state, since the lower gear will always be active as a fall-back in case the higher gear doesn't engage properly.
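Before going into the two stages in detail, here is the basic relation that all the ratio calculations come down to (a sketch only, stated for a simple planet set with one sun, one ring, and plain planet cogs; the stepped planets of the reduction stage need a slightly modified version). With S and R the tooth counts of sun and ring, the rotational speeds always satisfy
\omega_{sun} \cdot S + \omega_{ring} \cdot R = \omega_{carrier} \cdot (S + R)
With the sun locked (\omega_{sun} = 0), driving the carrier from the ring gives a reduction, \omega_{carrier} / \omega_{ring} = R / (S + R) < 1, while driving the ring from the carrier gives an overdrive, \omega_{ring} / \omega_{carrier} = (S + R) / R > 1. These are exactly the roles of the first and second stage described above.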
Reduction stage with two planet sets.
A first surprise when looking at the hub innards was that the reduction stage has two planet sets to create just one reduction ratio. I had previously assumed that there was only one set, that is, one sun cog, a couple of planet cogs all engaging with the sun cog, and one ring engaging with all the planets. But in reality, each of the three planets is two cogs in one: a smaller cog that engages the sun and a slightly bigger one that engages the ring. I should actually have thought of that, because with a single planet set it is quite hard to achieve an 89% gear ratio, as this stage of the hub does. (The planet cogs would need to be too tiny, since their size approaches zero as the ratio nears 100%.)
The power from the chain and cog is transferred to the ring cog of the first stage via a freewheel. To bypass the first stage (as in gears 5 to 8), a clutch connects the incoming motion directly to the planet carrier. Since the ring gear always turns faster than the carrier, it will then be freewheeling with respect to the input motion. (Conversely, when the clutch is disengaged and the ring is driven by the bike chain, the carrier will turn with the translated speed, so it's important that the clutch is completely disengaged, since the gears would otherwise jam.) From my experience of about 10'000 km on a similar hub (and from what I've heard from others), switching this clutch is the most difficult and noticeable shift in the hub. It's the one between fourth and fifth gear, and also the one that first becomes edgy when the shifter cable is misadjusted. (That happens rarely and is easy to fix.)
Finally, let's look at the second stage, which, as I said, is driven by the carrier and transfers its motion to the hub shell (and thus the wheel (and thus the road (and thus the earth, which makes it turn))) via the ring gear or (if in direct drive) via the carrier. Both ring and carrier connect to the hub shell via a freewheel, so the faster one will be driving while the other will be in fallback mode. Interestingly, direct gear (1 and 5) is achieved by locking none of the sun cogs: they turn freely, so the planets turn freely, the ring is not driven, and the carrier wins the race to drive the wheel. One last factoid: the ring cog sits on the smallest of the planet sets (the one with the biggest sun cog), which, when locked, gives the biggest gear (4 and 8). Locking the smallest sun cog gives the smallest (non-direct) gear (2 and 6). The jump from direct drive to this gear is the biggest in the entire hub (22% compared to just 14% from gear 3 to 4). I think the reason for this is that the sun cog can't become much smaller than it already is, because the ratchet mechanism for locking it still needs to fit inside. (And the solid hub axle is still inside the ratchet mechanism.)
I actually disassembled the hub one step further than the manual explains. I only figured this out by accident when, after removing one more stop ring (shown on the right in the very first picture), the shafts holding the planets of the second stage suddenly fell out of the carrier. When I tried to make them all fall out, the shafts holding the first stage's planets also fell out of the same holes! :-D After taking out the first stage planets, the ring cog of the second stage could be removed, since it was only held by the planets, and then the planets of the second stage could be removed as well. The sun cogs are still inside the carrier. I can see and feel another stop ring inside them, but didn't try to take it out. Instead, I just counted the sun cogs' teeth by marking one tooth and then turning the cog until the marked tooth came up again. It's really interesting how some parts are purposely held by stop rings while others (such as the second stage ring gear) are just floating on other parts (second stage planets) and held sideways by yet other parts (first stage planets). I am a bit curious whether I could put that part back together again. Maybe some day... ;-)
Here are my calculations again for those whose curiosity I've sparked. I've already got a follow-up post in my head in which I'll explain the formula that the spreadsheet is using.


Misused Mock Objects

During the past year at work, I had trouble with mock objects several times. Often tests would break because the mocks were hard to program or to change. In one case, we missed a critical bug because our mock differed significantly from the real implementation.
Since I didn't have any experience with mocking, I didn't know how to improve any of the tests towards a best-practice solution. I found a lot of code smells, but only for a few of them did I know how to actually make things better. At one point I thought that mocks were more trouble than they were worth.
Fortunately, I also had some good experiences with mocks during this year, which led me to the conclusion that in the bad cases, mocks were just badly used. Thinking about it now, some of the tests that I had to deal with seemed to be written under the motto: “Mock everything but the class you want to test.” A much better motto would be: “Test every component in an environment that is as production-like as possible and mock only if there is a good reason to do so.” In fact, Wikipedia says the same and even lists valid reasons for mocking. The valid reasons I have personally experienced are the following: speed and simulating rare behavior. There are many instances where we mocked for speed: using an in-memory database instead of a remote one or stubbing calls to remote services. The rare behavior that we simulated was when testing the handling of network errors. Instead of simulating real network errors (pulling out cables or intercepting packet traffic?) we just stubbed the interface that does the networking and let it throw a SocketException. That's a perfectly valid reason to use a mock!
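To illustrate the “simulating rare behavior” case, here is a minimal sketch (RemoteQuotes and QuoteClient are invented just for illustration, and I'm assuming plain JUnit with a hand-rolled stub instead of a mocking framework): the production implementation of the interface does real networking, while the test substitutes a stub that throws a SocketException so the error-handling path gets exercised.
//QuoteClientErrorHandlingTest.java-------------
import static org.junit.Assert.assertEquals;

import java.io.IOException;
import java.net.SocketException;

import org.junit.Test;

public class QuoteClientErrorHandlingTest {
  // Hypothetical interface that hides the actual networking.
  interface RemoteQuotes {
    double fetchQuote(String symbol) throws IOException;
  }

  // Hypothetical class under test: falls back to a cached value on network errors.
  static class QuoteClient {
    private final RemoteQuotes remote;
    private final double cachedQuote;

    QuoteClient(RemoteQuotes remote, double cachedQuote) {
      this.remote = remote;
      this.cachedQuote = cachedQuote;
    }

    double currentQuote(String symbol) {
      try {
        return remote.fetchQuote(symbol);
      } catch (IOException e) {
        return cachedQuote; // the rare behavior we want to test without real network failures
      }
    }
  }

  @Test
  public void fallsBackToCachedQuoteOnNetworkError() {
    // Stub that simulates the rare network failure.
    RemoteQuotes broken = new RemoteQuotes() {
      public double fetchQuote(String symbol) throws IOException {
        throw new SocketException("simulated network error");
      }
    };
    QuoteClient client = new QuoteClient(broken, 42.0);
    assertEquals(42.0, client.currentQuote("ACME"), 1e-9);
  }
}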

While reading the Wikipedia article I noticed the words “indirect input” and “indirect output” of objects under test. This made me think that some uses of mocks might only be needed because of a bad software design. Before taking out the mocking framework, shouldn't we check if the code can be written with direct input and output? For small functionality, input is just method parameters and output is the method result. For bigger functionality, input is given via setters or constructor parameters, and output again is the result of the method under test or of getter methods called later. For even bigger functionality, some of the input (or context) will be supplied in the form of other objects that first have to be constructed. When there are lots of those objects, tests will obviously not just span the object under test, but also the classes of those other objects. Maybe you think that's too big for a unit test, but I think it's just fine for a big unit test. If those other classes have complex behaviors, they will have their own unit tests that will have run green every time before the bigger test runs. So we don't risk testing too much at once, since the lower layer of the application is already tested.
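To make “direct input and output” concrete, here's a small invented sketch (ShippingCost and its rates are made up for this example): all input arrives through the constructor and the method parameters, and the output is just the return value, so the test needs no mock at all.
//ShippingCostTest.java-------------------------
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class ShippingCostTest {
  // Hypothetical class: all input comes in directly via constructor and parameters.
  static class ShippingCost {
    private final double baseFee;
    private final double perKilogram;

    ShippingCost(double baseFee, double perKilogram) {
      this.baseFee = baseFee;
      this.perKilogram = perKilogram;
    }

    // Direct output: the result is simply returned.
    double costFor(double weightInKg) {
      return baseFee + perKilogram * weightInKg;
    }
  }

  @Test
  public void chargesBaseFeePlusWeight() {
    ShippingCost cost = new ShippingCost(2.50, 1.20);
    // 2.50 base fee plus 3 kg at 1.20 each = 6.10
    assertEquals(6.10, cost.costFor(3.0), 1e-9);
  }
}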