Monday, August 27, 2007

Serendipitous

Kyra Canoe

Barn Shadows

Shadows on the way out

Ghosts in the wood


Friday, July 27, 2007

Art in Calgary Alley

Purple Head

Reflected Wealth

Reflecting Wealth

Taken from the south side of the tracks in Calgary

Tuesday, July 10, 2007

Relax

The Guitar Player

Holidays

Electra

Wednesday, May 30, 2007

Fixing one of the things that bugs me...

One of the things that bugs me is how long it can take to set up the infrastructure for a small project. Purchase hardware, install the OS, configure the OS, CVS, Bugzilla, a webserver, CruiseControl, and, oh yeah, backups. Oh crap, don't forget security.

Why do I need to do all this work every time? True, it's kind of fun to configure this stuff and tweak it to your heart's content, but the installation and setup of it I couldn't care less about. On a lot of small projects I have just seen people go without. "Do we really need cruise control?" "Do we have a security expert?" "Ralph will do it on top of his day job."

It is simple stuff that shouldn't be a burden on a project. It bugged me.

Now my partner and I have created ITForDevelopers. You can purchase a virtual server, preconfigured for development. Root access to a full system. Start developing the day your project starts rather than setting up infrastructure.

Where have I been?

Doing too many things.

I'm PM'ing one project, leading a team on another, and then starting a new product. Learning, Learning and Learning.

Most importantly, I am heading toward a summer where I have two months off. Not something I've done since high school. I'll be spending it with my family and enjoying the sun and warmth which lasts far too briefly here in the Northern climate. We'll be returning to Ottawa to visit relatives and do some car camping around Alberta.

No big trip. Just my family. Nice.

Saturday, October 28, 2006

Which Profiler?

Jamie asks:
"Ted, what are you guys using for a profiler? If you have any advice on a profiler or, more importantly, common areas a profiler flags, I'd love to hear them."

Well we use:

  • JProfiler - for exploratory work (has the problem of dilation)
  • com.tedogrady.profiler - for precise timings (basically println)
  • The TopLink profiler - for understanding the cache and gross database numbers
  • tkprof/explain plan - for understanding how queries are working in the database
  • Math and common sense - for figuring out how to solve the problem
Performance tuning is a highly empirical thing.
  1. Establish a target time.
  2. Make sure you have a test showing the slowness. Automate it.
  3. Develop a causal theory.
  4. Test the causal theory.
  5. Develop a potential solution.
  6. Experiment with the potential solution.
  7. If 6 doesn't help, go back to 3.
  8. Implement the solution.
JProfiler helps a lot with the causal theory. You can use it to explore your code as a whole, looking for problem areas and testing out theories you may have developed about the code. But, if you are working on a database application, DO NOT BELIEVE EVERYTHING IT TELLS YOU.

It's lying to you. Remember, it slows down your application, but not the database. So the hot spots and relative timings it reports are lies where the database is concerned. For example, the current piece I am profiling is spending half its time in the database dealing with foreign keys. The profiler told me my problem was in navigating some hierarchy.

I use JProfiler for the following features:
  1. Execution stack trace - shows the order in which things were called
  2. Number of Invocations of a method - really useful (do quick math to see if the numbers make sense)
  3. Percentage time in a method - also really useful
  4. Hot spots - rank methods by amount of CPU time/calls
  5. Memory profiler - which objects are sucking back the memory

I tried the Eclipse one (TPTP) about 6 months ago, but it choked on simple stuff. Haven't tried it since.

com.tedogrady.profiler implements println with extra special fancy doodad stuff. It allows you to verify what your profiler is doing, and it lies to you a lot less frequently than JProfiler. It's about 40 lines of source code that I got tired of writing on every project, so it's now open source. It is in with my TopLink in-memory stuff, or just email me if you want a copy. Basically, it has a really small overhead, so you can use it to verify what you learned in JProfiler, or, if JProfiler really is lying to you, as a more painful exploration tool.

To profile a suspicious method you can do the following:
Profiler.start("Profiling myMethod()");
this.myMethod();
Profiler.stop("Profiling myMethod()");
Profiler.dump();
You can put as many starts and stops (each with unique tags) in your code as you want. It'll tell you nice things like the average time, cumulative time, invocation order, blah blah blah. Many of you are thinking map. You'd be right.
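For flavour, here is a minimal sketch of what a println-style profiler like this might look like (my reconstruction for illustration, not the actual com.tedogrady.profiler source):

import java.util.LinkedHashMap;
import java.util.Map;

// Minimal println-style profiler: start/stop pairs keyed by tag,
// accumulating invocation counts and cumulative time in a map.
// LinkedHashMap keeps the tags in first-invocation order.
public final class Profiler {
    private static final Map<String, long[]> stats = new LinkedHashMap<String, long[]>();
    private static final Map<String, Long> running = new LinkedHashMap<String, Long>();

    public static void start(String tag) {
        running.put(tag, Long.valueOf(System.nanoTime()));
    }

    public static void stop(String tag) {
        // Assumes start(tag) was called first.
        long elapsed = System.nanoTime() - running.remove(tag).longValue();
        long[] s = stats.get(tag);
        if (s == null) {
            s = new long[2];
            stats.put(tag, s);
        }
        s[0]++;          // invocation count
        s[1] += elapsed; // cumulative nanoseconds
    }

    public static void dump() {
        for (Map.Entry<String, long[]> e : stats.entrySet()) {
            long count = e.getValue()[0];
            long totalMs = e.getValue()[1] / 1000000L;
            System.out.println(e.getKey() + ": " + count + " calls, " + totalMs
                    + " ms total, " + (totalMs / count) + " ms average");
        }
    }
}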

TopLink probably comes with a profiler because of all the people who complain that TopLink is slow. Very smart of them to give you the tools to see where the slowness/idiocy is. The tool usually tells me, again, how stupid I really am. As an idiot, I frequently forget I'm an idiot. It's part of the condition.

The command that lets you show SQL in TopLink is also very useful. Dario has been using it to fine-tune the SQL we're throwing at Oracle. It lets you configure all those wacky options in TopLink and see what they do to the SQL (just what do batch write and batch read do? Gee, they make things a lot faster).

tkprof, explain plan, and the Oracle analyzer are useful when you have to think about the database. They tell you what the database is doing, like how long a query is taking, or the strategy the optimizer is using to evaluate your query. (Sadly, you do need to understand how Oracle implements things if you want to tune it.) asktom.oracle.com has all your answers pre-packaged.
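If you haven't used explain plan before, here is a rough sketch of pulling one over JDBC (assumes Oracle 9i or later with DBMS_XPLAN and a standard PLAN_TABLE; the connection string and the allocations query are made-up examples):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ExplainPlanExample {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@localhost:1521:orcl", "scott", "tiger");
        Statement stmt = conn.createStatement();

        // Ask the optimizer for its strategy; the query itself is not executed.
        stmt.execute("EXPLAIN PLAN FOR SELECT * FROM allocations WHERE well_id = 42");

        // DBMS_XPLAN formats the rows that EXPLAIN PLAN wrote into PLAN_TABLE.
        ResultSet rs = stmt.executeQuery("SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY)");
        while (rs.next()) {
            System.out.println(rs.getString(1));
        }
        rs.close();
        stmt.close();
        conn.close();
    }
}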

I haven't gone looking for the Oracle analyzer in Oracle 10g, but I've used it in 9i. Basically, it is a way for you to ask the Oracle optimizer what will work best. Rather than you analyzing (guessing) what the best indexes are for a query, Oracle will recommend some. 90% of them are optimal, and some recommendations slow things down. Remember that whole test thingamy.

Math and common sense are the best tools of all. This would be why you went to school. When you look at the number of invocations, and the time, does it make sense to you? What do the numbers mean? A scrap of paper, a pencil, and some simple multiplication and division can tell you a lot more about what your program is doing than any profiler.

Just as a side note, I learned most of my techniques sitting across the hall from Steve Northover at OTI when he was writing the profiler for OTI/Smalltalk. Thanks Steve.

Wednesday, October 25, 2006

Performance tuning in earnest

Up until this point we have been performance tuning for testing. Our users have been testing the application all along, and our goal has been to keep them happy with the performance of the system (not ecstatic).

Now we are nearing the end of our mammoth, huge, horrifically large project (why did I agree to help run this?) and things are starting to stabilize. We're five months away and seriously testing our system. Data structures and core internal pieces are pretty clear, and we can optimize the living crap out of them. And optimize we shall.

Production accounting has a few unique attributes to it:

First: If you give us 7 numbers, we can very quickly turn those into 7,000. For example, processing 20 wells results in 72,000 numbers (or allocations) being created. Our biggest customer has 40,000 wells. (144,000,000 allocations per month in the worst case - note this is the upper bound.) Huh.

Second: Accountants have four (important) deadlines a month. For one of those deadlines (Oil/NGL Splits) we need to generate all of the data. Best of all, the data for all their systems will probably only be available the day before the deadline. Don't ask, the production accounting business is wacky. We will have to be able to do all our allocations in about 12 hours.

We need to write 3400 numbers (allocations) a second.
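(The back-of-envelope math: 144,000,000 allocations / (12 hours × 3,600 seconds per hour) ≈ 3,333 allocations per second, call it 3,400.)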

We're out by about a factor of 10 (we do around 300-400/sec on a single Intel CPU box). That is not very far, especially as we are just beginning optimization. Better yet, we can expect between 8 and 32 cores doing the work on the boxes.

So what did I miss on my envelope?

We'll have fun, fun, fun 'til daddy takes our profiler away....

Sunday, September 17, 2006

Something to try...

When I read a book that I think is good, I like to try a few things out of it that I normally wouldn't. (Why read the book, if not to learn something new?)

I've been reading Peopleware by DeMarco & Lister and really enjoying it (it lines up with a lot of the things I've been taught and try to practice). I'd like to try the "Power of Veto": allowing our teams (development and customer) to state when their portion of the product is ready.

I think the question to ask is "When will x be ready for production?" where x is contracts, volumetrics or allocations.

Wednesday, September 06, 2006

Discovery phase timing

"Ted, how soon before production do you think a team should go into a 'discovery' phase?" - Jamie


As soon as possible, preferably at the start of the project. You want to make sure that your product is "production ready" from the day you start.

The longer you wait to start, the larger your backlog of bugs and missing features.

However, my reality has not always allowed discovery to commence with the project. Generally, customers do not want to explore a partially built system; they don't mind testing pieces of it, but have trouble seeing if all the pieces will hang/operate together. Often customers are shared between your project and another, and their time constraints do not permit them to explore the application. Quality assurance people are really good at uncovering bugs, but it takes a real user to understand how the application hangs together in their environment.

There are many mitigation strategies I have used in the past:

1) Get it into production. If you have a truly greenfield application (no predecessor) you might be able to get it into production really early. The bare minimum of function can be useful to the end user. (TeamWave was good at this.)

More often than not, there is an existing application that does at least some of the same work the new one will. In this case I try to duplicate the old application and put the new one into production once it has the same functionality as the old app. The second release is then the new enhancements.

Arguments I have faced include: You will not be able to add new functionality as it will be just as brittle as the old app. It is too expensive for us to put it into production, so we must add all the new features first. There are not enough features to be useful to a customer.

2) Get real data into the system early. As a substitute (or when getting into production is a long way away) this can work really well. Users get to see their data in the system and see what works and what doesn't. The data is meaningful, and having the users work with real life examples brings out all the complexity of real data. (Inform Alberta did this)

On a previous project we started migration a few months into the project so that data was readily available to the users to test with. They had lots of complaints, but also lots of learnings about how things worked. We got really good at migration, and did it multiple times over the remainder of the project as new functionality was added.

Arguments I have faced include: The data is too complex to migrate. We will oversimplify the problem, because the real data doesn't reveal all cases.

3) Eat your own dogfood. If you are really lucky, you can use the product as you produce it. (TeamWave and OTI did this.) OTI was actually a master of this: we built the Smalltalk language in Smalltalk, using our latest builds. You bet the code became production quality quickly. (I managed to screw up the AIX Smalltalk VM, causing it to hang every couple of hours; no one was impressed.)

Arguments I have faced include: We can't use our own product.

4) Simulated data. Make up data that you believe demonstrates that the system can handle all of your actual data. The idea is that the business changes rapidly, and meeting only the needs of current production data is not enough; you need to show that the system can handle more general cases.

This is what we are trying on our current project. It's later than I would like, but the business people are ready.

The Mythical Man Month

Just reading this classic (far too late in my career). Excellent writing; I wish I could write this clearly. He does an excellent job of describing the problems faced on a large project, and then addresses how those problems are solved by a practical waterfall approach - really, how all the pieces fit together.

It would be really interesting to take his book and overlay it with an Agile approach.

One of the great quotes:

"On another project the best engineering manager I ever saw served often as
a giant flywheel, his inertia damping the fluctuations that came from market and
management people"

Tuesday, September 05, 2006

Discussion on coupling...

This characterizes how I try to work:

http://codebetter.com/blogs/david.hayden/archive/2006/08/26/Over_2D00_Architecting-Via-Pure-Fabrication-and-Indirection-to-Reduce-Coupling.aspx

Basically, decoupling is important when you need it. If you write objects well (single responsibility, encapsulation) you can decouple quickly through refactoring when demanded by the business or implementation.

Decoupling has a cost associated with it.

Wednesday, August 30, 2006

Teamwave review?

A few years ago I helped write the conference manager portion of this product with a company called TeamWave. Interesting to see the review. Happy to see the word robust.

http://www.eweek.com/article2/0,1895,2005008,00.asp

Monday, August 28, 2006

Stepping up to the plate

Our teams are redefining the way we work together. As we enter a discovery phase in our project (what bugs remain?, what stories did we miss?) we're asking our teams to adapt and innovate.

Our goal is simple: Become production ready.

Our task is hard. We have a large complex system, and our customers need to see just how great the system is, and how quickly we can resolve problems. Our customers need to explore and discover the functionality available in the (huge) application they have been developing with us. We need to support their efforts. There are no simple processes to tell us what to do.

It will be interesting to see who on our development team steps up to the plate and changes our processes to help evaluate and stabilize the system. It is never easy to change the direction of a team; it is always interesting to watch those who lead the charge and see them flourish.

Is it you?

Thursday, July 27, 2006

Great people and Great tools

I've been working with Lu and John on our help system. It's been amazing to see what people can do when they discover a new tool to help them solve a problem and they are empowered to solve that problem.

A good (great) Alistair Cockburn article, courtesy of Joseph King:

http://alistair.cockburn.us/index.php/Characterizing_people_as_non-linear,_first-order_components_in_software_development

Wednesday, July 26, 2006

Usability Cobwebs

Having fun with Lu and John on the project, sorting out how to make our help more usable. Building paper mock-ups, figuring out our users, identifying core tasks and generally having lots o' fun.

Having done usability 20 years ago (Psych degree as well as Comp Sci degree), there are quite a few cobwebs in my brain. Oh well, just figure it out as we go along.

Smalltalkers

Having been reminded that I'm an old dude...

http://www.josephmoore.net/2006/05/visual-studio-ruby-on-rails-and-old.html


Courtesy of John Duimovich's blog.

Monday, July 17, 2006

It's a different safety net...

XP has a different safety net than traditional approaches.

Written spec -> functional tests
Written design & architecture -> refactoring & mentoring
Change control -> malleable code & visibility
Detailed writing -> oral communication
Everything thought out -> just in time
Look to the future and prepare -> look to the past and improve

Is this why combining Agile and Waterfall methods is the worst of all possible worlds? In both cases the safety nets are tightly woven (coupled?); one piece depends on all the others. Mixing and matching from both sides loses that "weaving".

Saturday, July 01, 2006

Mistakes

If you aren't making mistakes you aren't trying hard enough.

Wednesday, June 28, 2006

Speculative Solutions

are good. Unfortunately they are not the end point.

There are two primary reasons I come up with a speculative solution to a problem:

1) I haven't encountered it before
2) I think we could be doing better than our current solution

I often need to remind myself that the speculative solution is not the answer; it is the starting point. It is a chance for me to learn and figure out if this is right. If it is the wrong answer I have two options:

1) Modify what I'm doing
2) Try something else

To do this I need to be in a position where I can change. If my solution requires me to go all the way to the end before I can evaluate whether or not it worked, I'm in big trouble.

Friday, June 23, 2006

Holidays

Looking forward to some holidays. The first couple of weeks in July I'll be in Ottawa: some time at the in-laws' farm, renting a cottage with my family and, of course, a little talk on Agile development at CGI Ottawa for my brother's group.

John Duimovich

I worked for John back when computers were made of stone and little hamsters were used to power them. He was one of the key people at OTI and really, really smart. I was one of the junior developers and really, really dumb.

John is one of the brains behind IBM's (née OTI's) Smalltalk and Java VMs. He groks computers like no one I've ever met. His insight into how people program and how programs work is amazing. He's fun to be around, as his mind works in weird and wonderful ways that make you think "I wish I had said that".

Take, for example, his blog post on "Visible ink":

http://duimovich.blogspot.com/

Nice to find his blog.

Sunday, June 11, 2006

Blogger is pissing me off...

Blogger has identified my blog as SPAM. Which probably isn't far from the truth, however...

It's really annoying that I have to enter some word verification every time I want to post. I've asked them to turn it off; it's been more than a week (took me a while to figure out what had happened). It's annoying.

Vlad: On risk and performance

Vlad (or Vald:) :
"Say there's a 99% chance that performance problems can be dealt with effectively late in a project, but there is 1% chance that performance problems can't be dealt with effectively. Is it worth risking an entire project on faith that one will find ways to fix the problems?"

Yes, on some projects performance is a severe risk. For example, my brother in Palo Alto needs to grab data from the particle accelerator as fast as possible and store it in a database (vast oversimplification). He worries about performance every second he programs. He writes custom ethernet drivers, uses customized hardware, reprograms switches and even writes inline assembly in his C code. They spent 3 years making sure the software would be fast enough. Their risk was being able to do it fast enough, not whether or not they could store data in a database.

What is the risk on our project? Are we going to have to rewrite switches and introduce specialized hardware? Or just write clever Java code? Our team is full of people who can write clever Java code; it's not really a risk. I don't think it's even 999 to 1. Especially when our code is trying to make the business rules clear, and our tests form a living specification of the application. Actually, I don't see a scenario where the talent on our team couldn't make our app performant; it's what we do and do really well. It is a technical problem, and we're really really really good at technical problems.

Our risk is the business and understanding how to make it work. Codifying the rules of production accounting is really, really hard. Getting the specification right (the tests) and making it work correctly (the implementation) is a huge risk. How do we take a complicated (and not necessarily consistent) body of rules and make them into a reliable program? What are we missing that will make it hard to run properties? How does our model fail to meet the needs of our production accountants? We mitigate these risks by having production accountants test the application every day. We write tests to create a living specification of what we've learned. We focus on these things because, if we get them wrong, we're in big trouble.

Incidentally, we have done back-of-the-envelope calcs and shot tracer bullets through the application to see where we need to be, and where we actually are, in terms of performance. In our heavy processing areas, our early calcs show we need to get 2-10x faster, depending on the area. Assuming, that is, we don't turn this into a batch process system, which would make us about 3x slower in the worst case. We can deal with that.

Saturday, June 10, 2006

Vlad points out my oversimplification

"Does everyone really say that performance tuning should be saved for last? "

Well, OK, I meant deferred as long as possible, not dead last. People advocating this strategy include Kent Beck, Martin Fowler and, I believe, Ward Cunningham.

Martin Fowler's discussion
http://www.artima.com/intv/tunable.html

Kent Beck's discussion
http://www.rolemodelsoftware.com/moreAboutUs/publications/articles/lazyopt.php

My experience:

On my last application we spent the majority of our time implementing the system and deferring performance work. It took three developers six months to build, and it was 10x too slow. It took one developer three weeks to rip out the guts of the well-factored code and put it back together fast enough.

Near the dawn of time, when I worked at OTI, there was a huge panic after IBM built VisualAge on top of OTI/Smalltalk, because it was *WAY* too slow for them. One (senior) developer, Steve, with the right tools (a profiler and an open mind) went and assessed the situation. In a few hours things were all tickety-boo. It was simply the way they were initializing their objects.

In my experience, with the teams I've worked with, deferring performance work saved time and money. It's the old 80/20 rule (80% of the time is spent in 20% of the code), though I think it's way closer to a 95/5 rule.

If you change (refactor) your code a lot, that 5% will change a lot too (it will be the heart of your system). How easy would allocation be to change if it had to meet performance criteria as well? How could we experiment?

Garbage Collection

It's been close to 15 years since I worked with the garbage collector in OTI's Smalltalk. I was tasked with experimenting with parallelization for SMP boxes in OTI/Smalltalk. At the time, people were struggling with parallelizing garbage collection. The train algorithm was an interesting paper, suggesting better approaches to parallelization.

I was simply curious to see if, from a 50,000-foot view, garbage collection has changed much in the last couple of decades. Apparently not.

This was a really good overview for me.

http://www-128.ibm.com/developerworks/java/library/j-jtp11253/

Empirical Performance

Everyone tells you to do performance tuning last. Why?

Because you can make performance tuning empirical. If you've done your tests first, you'll be able to change the code to be performant without breaking it.

The following process works for me:


0) Prioritize and Instrument

Record how long things take in your app. This can be as simple as paper and pencil, or as sophisticated as automated performance test suites.

Look at what is slow. Where would you get your biggest bang for the buck? What would make your users the happiest? Where will the load be in your system (back of envelope math works well here)?

Are they having trouble with the GUI, or are the calculations taking too long? Which page? Help them.

1) Investigation and Hypothesis

You need to understand what the application is doing and how it does it. Use a profiler such as JProfiler. Eclipse has a free but slow one (TPTP).

Look at time spent in methods - can you reduce the time?
Look at who is allocating objects - can you reduce the number of objects?
Look at number of calls to a method - does it make sense?

Hypothesize where you can save time, and make a rough guess at the proportion of time it will save.

Example: Alan and I found a method was called 20x more often than we expected. It was a bug in the code that was doing unnecessary calculations.

2) Experiment

Do actual timings on the code you are running (I use println-like utilities). You can't use a profiler for this because of dilation effects (e.g. the OS and database are not being slowed down, only your Java code - the profiler may be lying to you).

Do back of envelope calculations to see if what you're doing makes sense.

Calculate: the theoretical maximum you can save, and the likely amount you can save.

For example: assume you find a method you can double in speed. Wow, that seems like a lot of savings. So let's say this method is taking 10 seconds out of a 100-second process.

Now let's assume you speed it up so it takes zero time:

Theoretical maximum: 10/100 = 10%. Will that satisfy your users?
Likely savings (doubling the speed of the method): 5/100 = 5%. Will your users notice?

3) Conclusion

Is it worth doing?

If you know your users need a lot more speed, 5% won't cut it; you'll have to go back to investigating and finding where to improve.

Otherwise implement the change and then measure your actual speed gain.


Empirical Performance improvements. I find them worth the wait.

Useful little tool

For those of you who are part-time SQLers and need to format stuff occasionally.

http://www.wangz.net/cgi-bin/pp/gsqlparser/sqlpp/sqlformat.tpl

Service what you build

I always try to help service what I build.

If you were to buy a car from Toyota, you'd expect to be able to service it there. Same with a toaster.

This analogy only goes so far, because when someone builds a toaster, they usually build thousands (or millions) of them. Servicing can be done by specially trained service staff.

When I develop software, my customers will need service. I provide that service.

Tuesday, June 06, 2006

Specification Based Factory

Cool stuff today.

Simon and James came up with a new use for the Specification pattern. They combined it with the Factory pattern.

Basically, we need to find a class under some pretty heavy conditions (OK, maybe we need a strategy pattern). They look up the class in a Map, indexed by Specification objects. They iterate over the map evaluating each specification; when one returns true, they poof up a new instance of the class contained in the value.

Nice explicit code, with each rule contained in its own specification object.
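Here is a rough sketch of the idea in Java (my names and shape, not Simon and James's actual code):

import java.util.LinkedHashMap;
import java.util.Map;

interface Specification<T> {
    boolean isSatisfiedBy(T candidate);
}

// A factory that chooses which class to instantiate by evaluating
// specifications: the first one satisfied wins.
class SpecificationFactory<T, R> {
    private final Map<Specification<T>, Class<? extends R>> registry =
            new LinkedHashMap<Specification<T>, Class<? extends R>>();

    void register(Specification<T> spec, Class<? extends R> clazz) {
        registry.put(spec, clazz);
    }

    R createFor(T candidate) throws Exception {
        for (Map.Entry<Specification<T>, Class<? extends R>> entry : registry.entrySet()) {
            if (entry.getKey().isSatisfiedBy(candidate)) {
                return entry.getValue().newInstance(); // poof up a new instance
            }
        }
        throw new IllegalArgumentException("No specification matched " + candidate);
    }
}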

I love simple stuff.

Sunday, June 04, 2006

Process and people

Had an interesting discussion with Rado the other day. We were discussing process.

Just reinforced the notion that process is there for the people, not the people for the process. As such, the process should accommodate all people and their working styles, not the other way around. When running a project I must try to make the process we have chosen work for all the people on the team. The inability of a process to adapt to an individual's working style is a failure of the process.

In general I try to adapt the process to the people. In reality, at times I am not able to adapt the process to meet the needs of all the people. This is a failing of the process and myself.

Saturday, June 03, 2006

Performance techniques

Techniques we are using to improve performance:

1) Do more at once in a transaction.

With coarser-grained transactions, we spend less time looking up stuff in caches and merging changes, and more time actually doing the important work.

For example: our importer was importing one measurement at a time. By grouping the measurements together on a per-facility basis and updating them in a single transaction, we reduced the time by 50%, despite caching.
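A sketch of the grouping idea (the names are mine for illustration; the real importer talks to TopLink):

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

interface Measurement { String getFacilityId(); }
interface UnitOfWork { void register(Measurement m); void commit(); }
interface Session { UnitOfWork begin(); }

class MeasurementImporter {
    // One transaction per facility instead of one per measurement:
    // the cache lookups and merge work get paid once per group.
    void importAll(List<Measurement> measurements, Session session) {
        Map<String, List<Measurement>> byFacility =
                new LinkedHashMap<String, List<Measurement>>();
        for (Measurement m : measurements) {
            List<Measurement> group = byFacility.get(m.getFacilityId());
            if (group == null) {
                group = new ArrayList<Measurement>();
                byFacility.put(m.getFacilityId(), group);
            }
            group.add(m);
        }
        for (List<Measurement> group : byFacility.values()) {
            UnitOfWork uow = session.begin();
            for (Measurement m : group) {
                uow.register(m);
            }
            uow.commit();
        }
    }
}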

2) Cache.

Yep. It works. In our case we cache our snapshots (www.martinfowler.com/ap2/snapshot.html) which are constructed frequently in our networks of facilities.

For example: we had one process come down from an hour to mere minutes.

3) Convert dynamically calculated results to stored results.

For example: our statuses were being calculated all the time for complex large networks. It made the GUI slow. Store the results and, poof, things sped up by a factor of 5.

4) Convert stored information to dynamically calculated.

For example: we currently store some summary numbers for one of our facilities. Every time something changes we redo our calculations. They end up being done over and over.

Instead, we make it an on demand calculation and our performance goes way up.

Many times we are pushing the time around, from a place where it hurts to one where it is OK.

Performance tuning in an agile world...

I have taken over the lead on the performance tuning of our application.

A few interesting notes on performance tuning:

1) We do performance tuning test first.

We use Watir to drive our performance tests, because they reflect the experience of the user. We can measure the responsiveness of the app. Also, we get to show the users what we are measuring and see if it matches their workflow in a month.

On our project, we have moved away from the traditional pass/fail test for performance. We have red, yellow and green. Green means we're within our target, yellow indicates we're within about 50% of it, and red, well, we're in trouble. It helps prioritize our work.
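In code the thresholds amount to something like this (a sketch, reading "within about 50%" as up to 1.5x the target):

enum Grade { GREEN, YELLOW, RED }

class PerformanceGrade {
    // Green: at or under target. Yellow: within about 50% over the target.
    // Red: anything worse.
    static Grade grade(double actualSeconds, double targetSeconds) {
        if (actualSeconds <= targetSeconds) {
            return Grade.GREEN;
        }
        if (actualSeconds <= targetSeconds * 1.5) {
            return Grade.YELLOW;
        }
        return Grade.RED;
    }
}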

As always, we're running these tests a few times a day in a CruiseControl-like process.

2) There are different areas of performance tuning.

We have GUI response time, processing time and import/export time (our application has a large component of reporting and third party application interaction in it).

We concentrate on each one, writing tests and getting feedback from our testers on how the system is doing.

3) Break down the problem. Tackle one thing at a time to make it manageable.

Performance, load and concurrency are all heavily interrelated problems. The nice thing is that we can tackle them individually; we don't have to deal with them all at once.

We use both real and imaginary data to drive our performance tests (Actually, we use real data that are part of our regression suite).

4) Thank god your developers have been refactoring code. You'll notice when you have some "original" design code versus stuff that has been worked over. It is way easier to tune the latter.

Tuesday, May 16, 2006

Prescriptive vs. Descriptive

I like descriptive writing. (e.g. Ron Jeffries/Martin Fowler)

Basically, they describe what they've done and the outcome.

I dislike prescriptive writing. (e.g. Scott Ambler).

They describe what you should do to be successful.


I read a bunch of my previous blog posts. The ones I like are descriptive; the ones I dislike are prescriptive. I'll try to be more descriptive.

Help with problems not with solutions

Are you leading an agile team?

One of my big (and hard) learnings has been that one of the key ingredients to a high performance team is how the leadership delegates responsibility.

I have found that helping people identify problems, and brainstorming with them (rather than for them), creates a great, enabled team. It also creates a great relationship.

I have found that leadership for me, means letting the people I work for (my team) do what they think is best. They make great decisions. Any mistakes they make, they recover from. I like to help with advice, if they ask for it.

Here's a great article from the Globe and Mail on the subject (sorry, you'll have to purchase it):
Caution: thinking encouraged here

Saturday, May 06, 2006

Enthusiasm in an Interview

Remember that enthusiasm is important in an interview. Why are you doing this job? Do you like building software? Are you learning new techniques? Are you going to excel on my team? Are you going to grow on my team?

These are all questions I'm trying to answer for myself during that 30 minutes I have to decide whether or not you'll work out on our team.

A big factor is how enthusiastic you are. I know that if you love what you're doing, you'll be working at getting better every day.

Thursday, May 04, 2006

For those of you that are counting...

Our project has blown through 10,000 automated functional tests. Still another year to go.

170,000 lines of business code
230,000 lines of test code

Average # of lines per method: 5

# of great people working on the project: 62
# of customers: 12
# of developers: ~32
# of QA: 5
# of teams: 5

# of sprints: 24

Time to run 10,000 tests: 380 secs. TOO LONG.

Sunday, April 30, 2006

On Performance

Vlad commented:

You mentioned that performance could be improved with this approach, but I'm not
sure how - since this approach involves asking the database for all the records
and then filtering them using the specs. Using toplink expressions, one can take
advantage of indexes on the database tables and the syntax is quite similar to
specs, so one gets the same kinds of benefits in terms of code duplication. I'd
love to see a posting about performance tuning these spec objects.


Performance for us was not about the database, but about the number of times we were looping over the MeasurementPointCollection. Many of our common calls would loop over the measurement point collections two to three times. We saw about a 5% speed-up in the tests using the Specification. Not a lot, but something.

Earlier in the project we used Home (Repository pattern) calls to load the measurement points from the database. We found loading them once and reusing them much faster than going to the database each time. (I think we halved the time the tests took.) This is fairly common. (See Fowler, PoEAA p. 40, and the TopLink performance interview at http://www.javaperformancetuning.com/news/interview030.shtml. In particular, read the question: What are the most common performance related mistakes that you have seen projects make when developing TopLink applications?)
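The load-once version is nothing fancy; a sketch (my names, not our actual Home):

import java.util.List;

// Load the measurement points once and hand out the same collection,
// instead of querying the database on every call.
abstract class MeasurementPointHome {
    private List cachedPoints;

    List findAll() {
        if (cachedPoints == null) {
            cachedPoints = loadFromDatabase(); // the one expensive query
        }
        return cachedPoints;
    }

    // The real database query (TopLink, in our case) lives here.
    protected abstract List loadFromDatabase();
}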

Specification Objects

Domain-Driven Design by Eric Evans has several patterns mixed in among its ideas about how to leverage the domain in development.

One of the patterns is the Specification pattern. For me, its primary benefit has been to reduce code duplication when you're using collections and need to find subsets of the objects in those collections, depending on various criteria.

For our project we have this problem in spades, with contracts and measurement points.

Sometimes we want to find all Sales contracts. Sometimes we want to find an Oil contract. Sometimes we want to find Oil contracts that are also Sales contracts. We have many, many dimensions on our contracts, and need to slice and dice them many different ways.

Our original implementation looked like this:

findAllSalesContracts
    List results = new List
    foreach contract in getContracts()
        if contract.isSales
            results.add(contract)
    return results

findAllOilContracts
    List results = new List
    foreach contract in getContracts()
        if contract.isOil
            results.add(contract)
    return results

findAllOilSalesContracts
    List results = new List
    foreach contract in findAllOilContracts()
        if contract.isSales
            results.add(contract)
    return results

See the duplication, the repeated code? Imagine 4 or 5 conditions. This gets unmanageable.

Now here's a Specification Implementation:

class OilSpecification
    isSatisfiedBy aContract
        return aContract.isOil

class SalesSpecification
    isSatisfiedBy aContract
        return aContract.isSales

class ContractFinder
    findAll(aCollection, aSpecification)
        List results = new List
        foreach element in aCollection
            if aSpecification.isSatisfiedBy(element)
                results.add(element)
        return results

findAllSalesContracts
    return ContractFinder.findAll(getContracts, SalesSpecification)

findAllOilContracts
    return ContractFinder.findAll(getContracts, OilSpecification)

findAllOilSalesContracts
    return ContractFinder.findAll(getContracts, SalesSpecification.and(OilSpecification))
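The and(...) above implies a small composite specification; a minimal sketch of it in Java (my naming):

interface Specification {
    boolean isSatisfiedBy(Object candidate);
}

// A composite that is satisfied only when both parts are.
class AndSpecification implements Specification {
    private final Specification left;
    private final Specification right;

    AndSpecification(Specification left, Specification right) {
        this.left = left;
        this.right = right;
    }

    public boolean isSatisfiedBy(Object candidate) {
        return left.isSatisfiedBy(candidate) && right.isSatisfiedBy(candidate);
    }
}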

Removed several hundred lines of code from our application.

Saturday, April 29, 2006

Denis and ClearStream

ClearStream was one of the few companies I have seen where the words "we're a family" were true.

Denis took care of his employees through thick and thin, and they were good friends to him. When he spoke of them to me it was always with fondness and pride. Of those who had left, he always seemed to hope they might return, while being proud of their current successes.

My sympathies go out to Greg, Gerard, Jennitta, Ralph and Shaun who all form part of the ClearStream family.

Denis and Calgary

A few years ago Tom and Mary Poppendieck came to Calgary, and in one of my conversations with them, they marvelled at what an agile community we had. It was unusual.

One of the reasons for that community is Denis. Denis created a business in new ideas and new ways of doing things. ClearStream is not a body shop; it is more. Denis' business sense made it possible to assemble some of the best people in the city and introduce innovative ways of developing on many different projects.

In the past he helped his company introduce Smalltalk, Objective-C, object-oriented development and now Agile development to the community. He brought together the people with the ability and turned it into a viable business. His company's innovation has pushed other, bigger companies like IBM, Telus and CGI to adopt new techniques and practices faster than they normally would.

Denis helped form our agile community. Thanks.

Friday, April 28, 2006

Denis Clelland

Denis was a great friend. If you knew him you very quickly thought the same thing.

My own relationship with Denis began as a business relationship. It very quickly became warm and friendly. Denis was genuinely interested in me personally. He did that with many people.

Denis was easy to trust. I could tell him lots and bounce ideas off him. He'd tell you about his experiences. Not "condescending" or "teaching" - sharing. He was interested in my ideas, thoughts, family. He shared his with me. It was a friendship.

Friendship and business went hand in hand with Denis; I don't think he could have one without the other.

I miss my friend deeply.

Tuesday, February 07, 2006

Drywalling...

I've been drywalling for the last few months. I have lots of tips on how to mud; soon I'll be back to my normal posting schedule.

Sunday, January 01, 2006

Verify imported data

Just used Watir to verify some imported data.

The problem: we had business logic in Java. It was much easier to do our imports in SQL. Our new logic had stricter controls than the old application in some areas.

How do we verify that our imported data is editable in the new system?

We wrote a simple script in Watir to try editing every single piece of data. Nothing fancy, just open and save. The big trick is that you need a simple way to bring up the data easily. Our app allowed us to specify the surrogate key as a parameter to the editing webpage. From there Watir hit the save button and then loaded the next piece of data.

Less than 50 lines of code. More than 24 hours to run. Found problems in 0.5% of the data.