Recently, we were running into Out Of Memory issues with our software. A developer had identified that a feature which launched an embedded browser was using up quite a bit of memory, and we could move it into a separate process which would have its own memory allocation.
We followed his recommendation, so we moved the browser into a separate process and released this software (which I will denote “V1”). This caused an issue because the dll wasn’t built with the correct settings. We needed a quick fix due to the number of complaints from our users. It would be easy to fix, but due to the Testers stating it would take a few days to test it again, the managers told us to simply revert it and give the users the memory issues again (so this is “V2”).
We then fixed it, but since we had more time, we decided to implement a configuration switch so that if it went wrong, we could just disable it without having to release a new version. I made a small refactoring to get the configuration switch working, but I made a small mistake in the constructor which caused a crash in one of the places where the browser is launched. A Tester had found it late in his testing. I came up with a fix, but then we got called into a meeting to discuss what we needed to do. Unfortunately, my new fix would require a lot of the testing to be redone. The “refactoring” I did was brought up for discussion, and an experienced developer said that my way of doing it wasn’t the way he imagined. All the managers on the call trusted his judgement and felt I had gone a bit overboard, so they insisted I redo it.
So I had another look and couldn’t see how I could possibly do it without refactoring at least slightly. I did the best I could which I’d say was actually ~90% identical, apart from it was clearly worse from a code-design perspective. I showed it to the experienced developer and he said that was how he imagined it. I was baffled, but this is what the managers asked for. The release got delayed a week because the tester had to retest everything. So in the end, we had just wasted days of time to make the code worse. Brilliant. If we had gone with my original fix, it would have taken the same amount of time, but the code would be better.
Additionally, we did highlight we couldn’t test all the third-party suppliers we integrate with, so we had “low confidence”… or there’s some uncertainty at least. However, the Directors got involved and they wanted it out. So out goes “V3”.
There was a problem with just one of the third-party suppliers, but for some reason, Support left the feature enabled for several hours, when they could have reverted it with my Config Switch. This led to many complaints (yet again), and therefore panic from the management. When it came to disabling it, despite being able to disable it for particular users, they turned it off for large groups.
We rollout our changes in phases, so the Software Delivery managers wanted me to rollback the changes once more before releasing to the next group of users. I asked why we couldn’t just use the Configuration Switch, and the response I got was “we don’t trust Deployment to turn the feature off. What if they miss some users?”
Well, when those users complain, you just switch the Configuration Switch to “off”. If we rollback again, then we are back to square one; where users can encounter the memory issue.
What is the point in coding a Configuration Switch if they won’t use it? I did point out the way the Configuration Tool works is that you could set the configuration ahead of time (the previous software version will just ignore the extra config), then when the users get the new version, then the updated version would use the configuration file as intended. So with a bit of persuasion, that’s the option they went for. However, they turned it off for every user, even if it wouldn’t have caused problems for them.
When will they turn it back on?
Conclusion
It was actually a good idea to use a Configuration Switch. If the feature doesn’t work properly, you can quickly revert to the old way (well assuming your new changes don’t unintentionally break the existing feature). Obviously, you need to trust the Deployment team to actually use it correctly.
Colin went with a Configuration Switch for his recent project. How did he get on?
“The bug can’t be introduced by our project because it is switched off on live”.
Colin
5 mins later…
“This is definitely code that we changed”
One thought on “Configuration Switches”