FactBasedArchitectures
From Epowiki
There was an interesting thread on comp.object about how XP would code up scoring for a bowling game (http://www.xprogramming.com/xpmag/acsBowling.htm). One of the thread participants was trying to say something interesting but didn't do a very good job of it, so I think I'll take a crack at it.
The issue revolves around XP's exclusive use of evolutionary design. In XP you pick a user story and start coding. The design evolves as you code.
I am not against this idea in general. I think it is a waste of time to create low level class diagrams. I have seen a lot of people spin on working out details in UML diagrams when it would have been just as easy and more productive to work out the details in the code.
Where I disagree strongly with XP is the lack of design up front. XP says they aren't completely against up front design, but as they don't talk about it positively, show examples using it, and always push late binding of all decisions, this stance doesn't have a lot of credibility.
For example, if I was creating a compiler for C# my approach would not be to come up with a set of user stories and start coding. I have several books around me on language theory. I would first reread those to get a better understanding of the science of language theory and parsing. I would look for tools like Lex and Yacc to help me. I would do a hell of a lot of up front design so the result was correct and well performing.
Would I design every last detail? Not a chance. But I would be very clear on the type of algorithms to use and the research on how best to implement those algorithms.
This seems a very solid and reasonable approach to me. I don't think starting with a user story like "increment a variable" would yield the best results. There's a science and a set of well known practices for creating compilers. I would like to take advantage of that.
One of my jobs in the past was to run 100MB of tests against a Lisp compiler. Some of the tests were truly complex monsters. If you start with a solid theory of compilers, these tests should just pass because they are based on the ideas of a formal grammar. If you code based on implementing examples (user stories) you will miss the underlying grammar, and your code will constantly be driven by ever more complex exceptions to what you already handle.
In the same way, there are general approaches to problems even as simple as scoring a bowling game that I have found create a better, more flexible, and more robust solution than other approaches. There's no doubt in my mind TDD (test driven design) will produce a good solution. No doubt at all. But as with compilers, I think there are general architectures that will produce a better solution.
What is better is an interesting question. In the bowling game code-off, better seemed to be simplicity as measured by the number of lines of code. My idea of better is slightly more complex. Better for me is a tradeoff between a number of different driving forces. I use my global judgement, tempered by years of experience, combined with the local experience of working on the actual problem. If I am wrong I am confident I can handle any problems in the same way XP advocates refactoring. Clean code shouldn't be afraid of failure, even failure from an early decision. Making design decisions earlier doesn't require unclean, complex code that has a lot of unused stuff in it. That's a strawman.
The bowling problem can be roughly categorized into what I call a Fact Based Architecture. It's a common architecture for problems based on data from external events.
In this problem the external events are derived from a bowler bowling. A parallel can be found in how deep space explorers are architected. For an interesting overview of where spacecraft software architecture is going take a look at Mission Planning and Execution Within the Mission Data System (http://www-aig.jpl.nasa.gov/public/planning/papers/barrett_iwpss2004_missionplanning.pdf).
The brittleness of the XP generated solution can be shown using sensitivity analysis, by asking some bowling related questions. The question asked in the comp.object thread was something like "what kind of split did I bowl in the seventh frame?" This seems a basic question. But it's not one the XP generated solution could answer. This fact was not kept.
The XP solution could be changed to answer this question. Certainly. And you could say because the question was not in the original user story list we didn't have to answer it anyway. XP would say their solution was not brittle at all, it was made to do exactly what it was supposed to do and nothing more.
I understand, but don't agree. We never get all user stories up front. People always think of more. In the same way XP wants to late bind all decisions, users want to late bind their requirements. XP handles this by saying you can't do that. As a user I don't think that's reasonable. What's good for the goose is good for the gander. Especially when there is a way to create a simple solution that does a lot more.
By using a more robust architecture to start with, this question could have been answered with a very simple coding change. Or maybe not even a coding change. It's similar to storing facts in an RDBMS and then using SQL to ask the questions you want concerning the facts. The underlying code wouldn't change, just the access layer would change.
But to do this you have to record the facts of the problem. If you record the facts then you are in a good position to answer any questions related to those facts, including performing data mining and finding complex patterns in the data. If you don't record the facts you have no recourse. All your old data is useless to the new questions you want to ask. In the XP example they didn't record which pins were standing on each roll, so they can't answer any question about them. All that old scoring data is useless.
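To make the RDBMS comparison concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name pin_facts, its columns, and the record_roll helper are made up for illustration; they are not from the comp.object thread or the XP solution. The recording code is written once, and each new question is just another query over the stored facts.

```python
import sqlite3

def record_roll(conn, frame, roll, standing_after):
    """Store one fact per pin: was it still standing after this roll?"""
    conn.executemany(
        "INSERT INTO pin_facts (frame, roll, pin, standing) VALUES (?, ?, ?, ?)",
        [(frame, roll, pin, int(pin in standing_after)) for pin in range(1, 11)],
    )

# Hypothetical schema: one row per pin per roll.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pin_facts (frame INTEGER, roll INTEGER, pin INTEGER, standing INTEGER)"
)

# Seventh frame: the first ball leaves the 7 and 10 pins, the second picks up the 7.
record_roll(conn, 7, 1, {7, 10})
record_roll(conn, 7, 2, {10})

# A question nobody asked up front: what was left after the first ball of frame 7?
left = [row[0] for row in conn.execute(
    "SELECT pin FROM pin_facts "
    "WHERE frame = 7 AND roll = 1 AND standing = 1 ORDER BY pin"
)]

# The score-related fact is still derivable: pins felled by that first ball.
felled = conn.execute(
    "SELECT COUNT(*) FROM pin_facts WHERE frame = 7 AND roll = 1 AND standing = 0"
).fetchone()[0]
```

Here left comes back as the 7 and 10 pins and felled as 8, and neither query required touching record_roll.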
A strawman is that you can't record all the facts in the world so you still have the problem. This is true, but not limiting. Using sensitivity analysis, domain analysis, listening to the customer, and listening to the user, your set of facts should be a pretty good approximation. Your sensors may be the limiting factor so you don't actually have access to the facts you want.
Some bowling establishments are now displaying the speed of a ball down the alley, like the radar gun used in baseball. A lot of people really like this. This data should be added as a roll fact. A business could then do stuff like calculate highest average speed and award patches. Stuff that's never been done before. Yet if you don't collect the facts you won't be able to drive in new business this way.
A fact based architecture would record information about every roll of every frame, at the most generic level possible. It would record which pins were standing so it could determine the score and know things like what splits were thrown. If you record only the number of pins standing then you lose any information about individual pins. With the individual pins recorded you could determine whether the bowler threw a washout or a split, for example, which are first class notions in a bowling game. Patches and pins and awards and pot money are tied to this type of information.
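As a rough sketch of what such a roll fact might look like, the Python below records the set of pins standing before and after each roll. The RollFact class, the NEIGHBORS table, and looks_like_split are hypothetical names for illustration. The split test is deliberately simplified (headpin down and the remaining pins not forming one touching cluster); it is an assumption standing in for the official rule, not a statement of it.

```python
from dataclasses import dataclass

# Neighbouring pins in the standard ten-pin triangle (pin 1 is the headpin).
NEIGHBORS = {
    1: {2, 3}, 2: {1, 3, 4, 5}, 3: {1, 2, 5, 6},
    4: {2, 5, 7, 8}, 5: {2, 3, 4, 6, 8, 9}, 6: {3, 5, 9, 10},
    7: {4, 8}, 8: {4, 5, 7, 9}, 9: {5, 6, 8, 10}, 10: {6, 9},
}
ALL_PINS = frozenset(range(1, 11))

@dataclass(frozen=True)
class RollFact:
    frame: int
    roll: int
    standing_before: frozenset
    standing_after: frozenset

    @property
    def pins_knocked_down(self):
        # The score is derived from the fact, not stored in place of it.
        return len(self.standing_before - self.standing_after)

def looks_like_split(fact):
    """Simplified heuristic, NOT the official rule: first ball, headpin down,
    and the standing pins do not form a single connected cluster."""
    standing = set(fact.standing_after)
    if fact.roll != 1 or len(standing) < 2 or 1 in standing:
        return False
    seen, stack = set(), [next(iter(standing))]
    while stack:
        pin = stack.pop()
        if pin in seen:
            continue
        seen.add(pin)
        stack.extend((NEIGHBORS[pin] & standing) - seen)
    return seen != standing

seventh = RollFact(7, 1, ALL_PINS, frozenset({7, 10}))
third = RollFact(3, 1, ALL_PINS, frozenset({2, 4, 5}))
```

With these facts, seventh scores 8 pins and is flagged as a split, while third scores 7 pins and is not. A record that kept only the counts 8 and 7 could not tell the two situations apart in any pin-specific way.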
If my bowling sensors provided me more information I would record facts like the speed and the weight of the ball too. I would add in a time stamp so the data could be correlated with snack bar and game revenues. I would include information like the league, date, handedness of the bowler, and lots of other stuff.
Facts are very cheap to record, yet they provide amazing flexibility. They allow the game to be scored in real-time, extensive analysis to be done after the fact, and new features to be easily added. For example, the bowling alley might want to send a beer down to a team that has all strikes in the ninth or automatically give a patch to someone who picks up a complex split. Perhaps I want to compute the average score between leagues. Lots of stuff like that.
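A small sketch of how features like these could be layered on top of recorded facts, again in Python. The Roll record, its field names, the team and bowler names, and the speeds are all invented for illustration. The point is only that each feature is a function over the stored rolls, so it also works on rolls recorded before the feature existed.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass(frozen=True)
class Roll:
    team: str
    bowler: str
    frame: int
    roll: int
    standing_after: frozenset  # pins still up after this roll
    speed_mph: float           # assumes the lane has a speed sensor

def team_all_strikes(rolls, team, frame):
    """True if every first ball the team threw in this frame cleared the deck."""
    first_balls = [r for r in rolls
                   if r.team == team and r.frame == frame and r.roll == 1]
    return bool(first_balls) and all(not r.standing_after for r in first_balls)

def average_speed(rolls, bowler):
    """Mean ball speed over every recorded roll by this bowler."""
    return mean(r.speed_mph for r in rolls if r.bowler == bowler)

# Made-up history, recorded before anyone thought of these features.
history = [
    Roll("Pin Pals", "Ann", 9, 1, frozenset(), 16.5),
    Roll("Pin Pals", "Bob", 9, 1, frozenset(), 18.0),
    Roll("Gutter Gang", "Cy", 9, 1, frozenset({10}), 15.0),
    Roll("Gutter Gang", "Cy", 9, 2, frozenset(), 17.0),
]

send_beer = team_all_strikes(history, "Pin Pals", 9)
no_beer = team_all_strikes(history, "Gutter Gang", 9)
cy_speed = average_speed(history, "Cy")
```

Here send_beer is true, no_beer is false, and cy_speed averages Cy's two rolls, all computed from facts that were already on file.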
With the XP approach each of those new features would require a new story and a new release of the software. A software release that could not be used on older score data.
What's interesting is that fact based architectures are usually very simple. Usually data are scalar so they are cheap to store. Sensors provide them and they are easy to add into a system.
This approach initially will have more lines of code than the XP solution. But so what? I don't believe more code means more bugs because any code I write has tests. It's not going to have bugs. And fact based approaches are still extremely simple while providing robustness and flexibility.
By looking at the nature of the problem I feel I was able to peg the problem ahead of time into a solution space that is still very simple yet better along a number of different dimensions. While the XP approach isn't bad, I think the fact based approach is better, and it is knowably better ahead of time. There's no need to evolve into it. Having said that, I would still evolve most of the code that implements the fact based approach. That's what I consider a balanced approach that uses the best of both worlds.
