Tuesday, December 1, 2009

Run-up to the Spike

(See New Broom for an explanation....)


I’ve been writing Ruby in TextMate with RSpec for a couple of days now.

Yesterday I TDD’d a TagExtractor specifically for extracting PoS (part of speech) info from W1913 (the Project Gutenberg semi-marked-up Merriam-Webster dictionary from 1913). I strayed from the path and ended up with a very complicated parse method driven by multiple flags.

I decided to step back and approach the problem from the direction of “pure story” – i.e., let the story drive the test and let the test drive the implementation decisions. This resulted in BDDing the PosExtractor, which will take a character stream and return an array of triples: {:word, :sense, :PoS}.
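To make the shape of that return value concrete, here is a minimal sketch of such an extractor. It is not the actual PosExtractor: the tag names (<hw>, <pos>, <sn>), the regex-based scan, and the rule of emitting one triple per numbered sense are all assumptions made for illustration.

```ruby
require 'stringio'

# Hypothetical sketch only. Assumes headwords are wrapped in <hw>,
# parts of speech in <pos>, and sense numbers in <sn>.
class PosExtractor
  TAG = /<(hw|pos|sn)>(.*?)<\/\1>/m

  def initialize(stream)
    @stream = stream
  end

  # Returns an array of hashes, one per numbered sense:
  # { :word => ..., :sense => ..., :PoS => ... }
  def extract
    word = nil
    pos = nil
    triples = []
    @stream.read.scan(TAG) do |tag, content|
      case tag
      when 'hw'
        word = content
        pos = nil # a new headword starts a new entry
      when 'pos'
        pos = content
      when 'sn'
        triples << { :word => word, :sense => content, :PoS => pos }
      end
    end
    triples
  end
end

sample = StringIO.new(
  '<hw>Abacus</hw> <pos>n.</pos> <sn>1.</sn> <def>A table.</def> ' \
  '<sn>2.</sn> <def>A frame.</def> ' \
  '<hw>Abandon</hw> <pos>v. t.</pos> <sn>1.</sn> <def>To give up.</def>'
)
triples = PosExtractor.new(sample).extract
triples.each { |t| puts t.inspect }
```

The point of the sketch is the story-level interface: a stream goes in, triples come out, and nothing about tags or flags leaks to the caller.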

Now there’s a dialectic between the PE (PosExtractor) and the TE (TagExtractor) which is starting to drive the TE toward a simpler and hopefully more Ruby-like implementation, using Element objects that consume from a character stream.

Starting with the TE was the first mistake: a bottom-up approach that locked in an implementation detail that should have emerged from BDD. BDD done right prevents BDUF (Big Design Up Front); TDD can encourage it. Starting from the nouns and verbs of the story makes it possible to decompose the story progressively into objects and functions.

Pulled in log4r to debug the TE.

Created new class ElementExtractor (EE) from TE to eliminate the find_all_tags noise – BDD’d it.
Now the Element class, which was just a transfer object that accumulated content, has evolved under BDD pressure to detect its end tag and mark itself complete, which simplifies EE.
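A rough sketch of that division of labour, under my own assumptions about names and behaviour: the method names (consume, complete?, next_element) and the simple <name>...</name> tag format are hypothetical, not taken from the real classes.

```ruby
require 'stringio'

# Hypothetical sketch: an Element accumulates characters and notices
# for itself when its end tag has arrived.
class Element
  attr_reader :name, :content

  def initialize(name)
    @name = name
    @end_tag = "</#{name}>"
    @content = String.new
    @complete = false
  end

  def complete?
    @complete
  end

  # Accept one character; once the end tag has been seen, strip it
  # from the content and mark the element complete.
  def consume(char)
    return if complete?
    @content << char
    return unless @content.end_with?(@end_tag)
    @content = @content[0, @content.length - @end_tag.length]
    @complete = true
  end
end

# Hypothetical sketch: the extractor only finds the start tag and then
# feeds characters to the Element until the Element says it is done.
class ElementExtractor
  def initialize(stream)
    @stream = stream
  end

  # Returns the next complete element with the given name, or nil if
  # the stream runs out first.
  def next_element(name)
    start_tag = "<#{name}>"
    seen = String.new
    until seen.end_with?(start_tag)
      char = @stream.getc
      return nil if char.nil?
      seen << char
    end
    element = Element.new(name)
    until element.complete?
      char = @stream.getc
      return nil if char.nil?
      element.consume(char)
    end
    element
  end
end

ee = ElementExtractor.new(StringIO.new('noise <pos>n.</pos> more <pos>v. t.</pos>'))
first = ee.next_element('pos')
second = ee.next_element('pos')
third = ee.next_element('pos')
puts first.content, second.content
```

Because the Element owns the end-tag check, the extractor needs no flags of its own: it just loops until complete? is true.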
