How to safely change 13,000 lines of code, part 1

I cranked out a bunch of code tonight for my new nullable types approach. This is a .NET 1.1 solution, so it's pretty old school, but it was still fun.

I explained the problem with our existing methods the other day. Basically, we have a ton of helper functions for dealing with nulls and empty strings across all of the data types. The methods are confusing and we haven't been using them correctly. We need a more sensible approach.

After chatting with Marco, we came up with something like this:

NullableGuid.FromDatabase(Value as Object) As Guid

  

NullableGuid.FromUserInput(Value as String) AS Guid

I intended to create all of the new classes and methods, mark the old ones obsolete, and correct all of the existing code by hand. This would force me into doing some pretty thorough code reviews, and we just wrapped up development on a major release, so that would be good. I estimate that about 30% of our usage of the methods is incorrect, leaving tons of bugs waiting to pop up. It would be awesome to clean that stuff up as part of this release.
(30% is a VERY pessimistic estimate. It could very well be 3%, but I wanted to be prepared for the worst.)

To get an idea of how much work I was getting into, I went ahead and marked everything as obsolete Thursday afternoon. I guessed that it would create about 7000 build errors. A coworker guessed 10,000. Well, it ended up being nearly 13,000 errors. When I left work Thursday, I still had it in my head that I'd find a way to update all of that code by hand, and here's why...

If the old method of IsNullGuid is being used incorrectly, and I replace that method with NullableGuid.FromDatabase, a global search and replace would translate to the new code, but it would leave all of the usage as is, and we'd still have the same error rate--only new code would benefit, leaving the million lines of existing code the same. That won't do.

Frankly though, I knew there must be a better way. I really shouldn't have to update 13,000 lines of code by hand to clean this up. Besides, that level of monotony would surely lead to oversights and even more problems. I need this to be 100% accurate and I cannot introduce any issues.

Friday morning, on my way in to work, I thought it through again and realized that I don't need to have 2 separate methods for FromDatabase and FromUserInput. With each of the data types getting its own class (e.g. NullableGuid), I can overload the heck out of things, and make the routines much more intelligent. I can simply create NullableGuid.From with a few different overloads. Then I can replace all instances of IsNullGuid as well as IsStringEmptyGuid with NullableGuid.From.

If NullableGuid.From is implemented correctly, it will handle every possible scenario of data handed into it. This would allow me to use a global search and replace, putting NullableGuid.From in everywhere, and it will always be right. It will be impossible to get it wrong, so long as NullableGuid.From is bullet-proof. Instead of fixing 13,000 lines of code by hand, I'll just need to execute a couple dozen search and replaces. These thoughts made for an exciting drive into work.

What I say next might surprise some of you; the ones that know me best anyway. The only person I've told this too was very surprised in fact.

I used a TDD approach to make these methods bullet-proof, and to ensure compatibility with the old methods.

I haven't previously used TDD techniques on this project, and I've stated that TDD would not work on this project, for many reasons. But for what I was doing, TDD made perfect sense today; it was an unordinary task for this project. Let's break it down to see why it was such a good fit.

It's been like 4 years since anything this deep in the architecture has been changed
I was creating lots of classes that have lots of tiny methods
The methods deal strictly with scalar values and simple parameters
There are no outside dependencies, no need for DI or anything fancy
There are several different scenarios for calling each method, and each method needs to be tested against each scenario
The methods are going be be called 13,000 times, right off the bat -- they better be right!
I have existing methods to use for benchmark results

For every method, I ended up mapping it to an existing method. I need to do this anyway, in order to do my search and replace. Then I created unit tests for all possible scenarios, against all of the methods. I did this before implementing the new methods, as true TDD dictates. The unit tests compared the results of the new methods to the results of the old methods. I started at 0% passing, but at 100% passing, it's safe to do the global search and replace.

All of my NullableType classes are fully implemented. I didn't reach 100% passing, because I found a bug in the old IsNullString and IsStringEmptyString methods. I've corrected the bug in the new NullableString.From method, but this leaves a difference between the methods. I'll have to review this with the QA department on Monday to make sure they agree that we should fix the bug. We were not trimming off leading and trailing spaces from user entries, and we're supposed to be.

I have not yet processed the search and replace. I would've done it tonight, but some bozo on the project broke the build before he left for the night (he will be flogged on Monday). So I'll have to wait until next week. I'm very confident that the search and replace will safely change 13,000 lines of code, and the new nullable type methods will not be susceptible to misuse.

I'll post a follow up next week after I process the search and replace, and I'll include some samples of the code. Until then, wish me luck!