How Not To Write Unit Tests

Introduction

Unit testing is a software testing technique where individual components of a program are tested in isolation. These components or “units” can be functions, methods, or classes.

When implemented correctly, Unit testing is a crucial practice in modern software development. It helps ensure the quality, reliability, and maintainability of your code. By incorporating unit testing into your development workflow, you can catch bugs early, improve code quality, and ultimately deliver better software to your users.

When I first heard about unit tests, they seemed awesome. But the more I used them, the more my opinion declined. It’s quite hard to explain why, though.

I think in general, to make things testable, you have to split the logic up into smaller methods. But when they are smaller, A) they are easier to understand and B) they are less likely to change. So if a developer has looked at that code, what is the chance they are going to change it and break it? If you have a unit test and it never fails in the software’s lifetime, has it provided any benefit?

Then, if you do decide to change the behaviour, you have the overhead of rewriting all the unit tests, which can basically double the development time.

For scenarios that would take ages to test manually, unit tests are very beneficial. Logic with loads of permutations or optional aspects is a prime candidate. Without unit tests, retesting every time you make a simple change is incredibly tedious; with unit tests, you just click a button and wait a few seconds.

Unit tests give you confidence that you can refactor without risk. However, they are not automatically a silver bullet. Well-written, fast, reliable tests accelerate development and save time. Poorly-written, slow, flaky tests hinder development and waste time.

Fast Tests

A test that takes a second to run doesn’t sound slow, but what if you have hundreds or thousands of tests? If the suite takes a long time to run, developers won’t run it as often, or at all; then what value does it serve?

Tests should also run as part of the build, to ensure only quality releases actually go live, but you want your release process to be fast.

There was a recent change where a developer claimed to have sped up a long-running call; however, he hadn’t carried that performance mindset over to the tests, and had actually increased their run time by 6 seconds.

Thread.Sleep can be used in threaded code to introduce an intentional delay. I’ve seen many developers add it to a unit test. Tests are supposed to be fast, so you should never add Thread.Sleep to a unit test.
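If a test genuinely needs to wait for background work, waiting on a signal with a timeout is a better option than sleeping for a fixed period. A rough sketch of the idea (the background work here is a hypothetical stand-in):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class ThreadSleepAlternative
{
    public static void Main()
    {
        // Instead of Thread.Sleep(5000) and hoping the work has finished,
        // have the code under test signal completion and wait on that signal.
        using var done = new ManualResetEventSlim(false);

        Task.Run(() =>
        {
            // ... the background work under test would happen here ...
            done.Set(); // signal completion
        });

        // Wait returns as soon as the signal fires, or fails fast after the
        // timeout, so the test only takes as long as the work actually needs.
        bool completed = done.Wait(TimeSpan.FromSeconds(5));
        Console.WriteLine(completed ? "completed" : "timed out");
    }
}
```

The key difference is that a sleep always pays the full delay, whereas a signal finishes the instant the work completes.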

Measuring Tests & ExcludeFromCodeCoverage

When people write unit tests, they want to understand how much of their code is covered by them. We have the Code Coverage metric, but the way it is measured has severe limitations. It’s often a simple check of “does this line get hit by at least one test?”, but since methods can be executed with different combinations of inputs, you can end up with 100% statement coverage without actually testing many combinations at all.
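To illustrate, a single test can execute every statement without exercising the interesting combinations. The method below is a made-up example:

```csharp
public static class Pricing
{
    public static decimal ApplyDiscount(decimal price, bool isMember, bool hasVoucher)
    {
        var discount = 0m;
        if (isMember) discount += 0.10m;
        if (hasVoucher) discount += 0.05m;
        return price * (1 - discount);
    }
}

// One test with isMember = true and hasVoucher = true hits every line,
// reporting 100% statement coverage, yet three of the four input
// combinations (including the no-discount path) are never tested.
```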

The metric is one that impresses managers, so you often see developers writing bad tests simply to game it. This is dangerous: you are misled into thinking your code changes haven’t caused any bugs, when they could have introduced something severe that the inadequate tests never caught.

I’ve seen quite a few code changes purely to increase the code coverage. So the title of the change would be like:

“Added more code coverage”

Then when I check the build output:

“There might be failed tests”

How can you add more tests and then not actually run them before submitting for review? Madness. The explanation is that their focus is purely on coverage, not quality. Maybe a bit of arrogance and laziness.

This week I worked with a team to get code coverage over 80% (a corporate minimum). The problem with this effort: Code coverage can be gamed. Sure, low code coverage means there’s a lot of untested code. But, high code coverage doesn’t mean the code is well tested.

Cory House

You can add the ExcludeFromCodeCoverage attribute to your code, which tells the code coverage tool to ignore it. It’s a simple way of reducing the number of lines flagged as untested.
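For reference, it’s an attribute from System.Diagnostics.CodeAnalysis that can be applied to a class, method, or constructor (the class below is a hypothetical example):

```csharp
using System.Diagnostics.CodeAnalysis;

[ExcludeFromCodeCoverage] // the whole class now vanishes from coverage reports
public class LegacyImporter
{
    public void Import()
    {
        // untested code, silently ignored by the coverage tool
    }
}
```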

Here’s one of our principal developer’s opinion on this attribute:

“Using ExcludeFromCodeCoverage is only good when the goal is 100% coverage. That should never be the goal. The goal should be a test suite that prevents bugs from ever going live. I’m happy never using it and just having coverage reports flag things that are not actually covered, it is a more realistic representation of what the tests cover and makes me much more cautious about changing them as I know I don’t have test coverage. Never add Exclude from Code Coverage, it’s just lying to everyone. Why suppress things that might be a problem, if they are, we need to fix them.”

Principal Developer

Personally, I think adding suppressions/attributes just clutters the code base. I’d rather treat the stats as relative to each release: the numbers have gone up or down, but why? If we can justify them, then it’s all good. Chasing zero code smells and a specific coverage figure means people can just cheat, adding the likes of ExcludeFromCodeCoverage to meet the metrics.

Another developer said:

I value a holistic set of metrics that help us understand quality in software development. Code coverage is a single metric that can be part of that set of metrics you monitor. No single metric can stand by itself, and be meaningful. Nothing is perfect, which is why we should value a toolbox. I don’t believe in gaming the system and “hiding” uncovered code to get to 100%.

You need engineering teams who are prepared and confident enough to publicly share their coverage reports. This sets the tone of the culture. Context is needed, always. There will be reasons why the coverage is as it is. Use tools that help engineering teams with confidence/delivering at pace and ultimately delivering customer satisfaction. You cannot compare reports from different teams or projects.

Useful Tests

You need to make sure your tests actually test some logic. Sometimes people seemingly end up writing tests that really test the programming language itself; I suspect it is just to fool the Code Coverage metric. Code Coverage checks whether lines of code are “covered” by tests, but the check is simplistic: it only ensures a line executed while a test was running, not that there was a meaningful assertion.

So for example:

[Fact]
public void DefaultConstructor_ReturnsInstance()
{
    var redisMgr = new RedisStateManager();
    Assert.NotNull(redisMgr);
}

So there you are instantiating an object, then checking it is not null. That’s just how objects work in C#: you instantiate an object, and then you have an object. I suppose an exception could be thrown so the object is never created, but relying on that is generally considered bad practice, and there was no other test covering such a situation anyway, so they haven’t tested all scenarios.

largeResourceDefinitionHandler.MissingResource = _missingResourceDefinitions;
 
Assert.NotEmpty(largeResourceDefinitionHandler.MissingResource);

Setting a property, then checking it is set. Unless the property contains loads of logic (which you could argue is bad design anyway), checking it is set is really testing the .NET Framework. If you think you need this, it means you don’t trust the fundamental features of the programming language you are using. You are supposed to be testing the logic of your code, not the language.

If there’s lots of setup but the Assert just checks for null, it’s likely just there to fudge the code coverage. Another classic I’ve seen is loads of setup that ends with:

Assert.IsTrue(true);

So as long as the test didn’t throw an exception along the way, it will always pass, because True is definitely equal to True.

Those ones seem intentionally malicious to me, but maybe the following example is just a clear typo:

Assert.Same(returnTrigger, returnTrigger);

Whereas this following one looks like a typo, but it’s actually comparing two different variables. You need to look closely: one has a single ‘s’ in ‘Transmission’. 🧐

Assert.Equal(organisationTransmissionStatus.Enabled, organisationTransmisionStatus.Enabled);

What goes through people’s heads? How can you write code like that and carry on like nothing is weird?
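For contrast, a meaningful test asserts on behaviour that could actually be wrong. A hypothetical example (the age-calculation helper is invented for illustration), testing the boundary around a birthday:

```csharp
using System;
using Xunit;

public class AgeCalculatorTests
{
    // Hypothetical helper: whole years between a birth date and a reference date.
    private static int CalculateAge(DateTime birthDate, DateTime today)
    {
        var age = today.Year - birthDate.Year;
        if (today < birthDate.AddYears(age)) age--;
        return age;
    }

    [Fact]
    public void DayBeforeBirthday_AgeNotYetIncremented()
    {
        var age = CalculateAge(new DateTime(1957, 1, 15), new DateTime(2024, 1, 14));
        Assert.Equal(66, age); // would fail if the boundary logic were wrong
    }
}
```

This test can genuinely fail if the logic regresses, which is the whole point.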

Sometimes tests look a bit more complicated but on analysis they still don’t really test much:

[CollectionDefinition(nameof(LoggerProviderTests), DisableParallelization = true)]
public class LoggerProviderTests : IDisposable
{
    [Theory]
    [InlineData("Verbose")]
    [InlineData("Debug")]
    [InlineData("Fatal")]
    [InlineData("Information")]
    [InlineData("InvalidLogLevel")] // Test with an invalid log level
    public void GetMinimumLevel_ReturnsCorrectLogLevel(string logLevelSetting)
    {
        // Arrange
        System.Configuration.ConfigurationManager.AppSettings["DistributedState.LogLevel"] = logLevelSetting;

        var firstInstance = LoggerProvider.Instance;
        var secondInstance = LoggerProvider.Instance;

        // Assert
        Assert.Same(firstInstance, secondInstance);
    }
}

So this sets a value in AppSettings, presumably used by the LoggerProvider. However, all the test actually verifies is that calling the Instance property twice returns the same object both times, so the different log levels are completely irrelevant. The log level could be completely wrong, but you are comparing “is the wrong value of A the same as the wrong value of B”, and it will still pass.

Another common issue arises with mocking libraries like Moq, which let you create objects and essentially say “when this code is called with these specific parameters, return this value”. The problem is when developers make that setup the actual thing they are testing; then you are testing Moq, not your own logic.

[Fact]
public void JobReposReturnsValidJobTest()
{
    // Arrange
    ScheduledJob job = new ScheduledJob() { ScheduledJobId = Guid.Parse("e5ee8f7410dc405fb2169ae2ff086310"), OrganisationGUID = Guid.Parse("fbee8f7410dc405fb2169ae2ff086310") };
    _mockJobsRepo.Object.Add(job);
    _mockJobsRepo.Setup(e => e.GetById(Guid.Parse("e5ee8f7410dc405fb2169ae2ff086310"))).Returns(job);

    // Act
    var resultJob = _unitOfWork.JobsRepo.GetById(Guid.Parse("e5ee8f7410dc405fb2169ae2ff086310"));

    // Assert
    Assert.Equal(Guid.Parse("fbee8f7410dc405fb2169ae2ff086310"), resultJob.OrganisationGUID);
}

“I think all this test is doing – is testing that JobsRepo returns an object that was passed into the constructor on line 22. The GetById is redundant, it will always work if it returns that object because the Moq was configured to return that value. That is testing Moq, and not our code. But then if you are just asserting a property returns an object, you are just testing that C# properties work.”

Me

“yes you are right , I am just testing if JobsRepo could return a value, so that it helps me in code coverage for get functionality of JobsRepo , as it is just set in the constructor of the class and there is no call for get”

Developer who wrote bad tests

So I think they are saying “I am just fudging the coverage”. They checked it in anyway.

There have been loads of tests where you could cut out large parts of the method being tested and the tests would still pass. Again, sometimes you point this out to developers and they still want to check the tests in, purely for the statistics, and not for any benefit to any developer.

“do these tests add value? a quick glance suggests this is very dependent on your mock object. It might be the case that the production code can be changed without breaking these tests.”

Me

yeah, they’re kind of meaningless. Merging code to use as templates for future, better, tests.

Developer who wrote bad tests

Here is a rant I left on a similar review:

This name implies that it should always be disabled, especially because there’s no coverage for the case where it is true. However, these tests aren’t really testing anything. I think you’re basically testing that Moq works and the default boolean is false. I think the most you can really do is call Verify on Moq to ensure the correct parameters are passed into the GetBool call.

If you replace the contents of IsRequestingFeatureEnabledForOrganisation with return false, your tests pass which illustrate the coverage isn’t complete, or you aren’t guaranteeing the configuration manager code is even called at all. Personally, I don’t think it is worth testing at all though. All your class does is call the FeatureDetails class so you aren’t testing any logic on your side.
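The Verify suggestion from that rant might look something like this (the interface, class, and setting name here are hypothetical stand-ins for the code under review):

```csharp
// Assuming an IFeatureDetails dependency that the class under test should call:
var featureDetails = new Mock<IFeatureDetails>();
featureDetails.Setup(f => f.GetBool("RequestingFeatureEnabled")).Returns(false);

var sut = new FeatureChecker(featureDetails.Object);
sut.IsRequestingFeatureEnabledForOrganisation();

// Verify fails the test if the dependency was never called with the expected
// parameter, which at least guarantees the wiring is actually exercised.
featureDetails.Verify(f => f.GetBool("RequestingFeatureEnabled"), Times.Once());
```

Even then, it only proves the call was made, not that any interesting logic exists, which is why I questioned whether the test was worth writing at all.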

I think people are too concerned about getting code coverage up, so they insist on writing tests, even if it makes things more confusing.

I suppose it is up to you and your team to decide what you want to do, but I occasionally question people just to make them think about whether it is actually adding any value. I’ve seen tests where they simply assert that an object is not null, but it could literally return an object with all the wrong values and still pass (and the method always returned an object anyway, so it could never fail). If you see a method has tests, it gives you a false sense of security that they will catch any mistake you make, but they just always pass anyway.

Always think whether your tests will add value and whether it’s worth adding them. If you need to mock everything then they’re not very valuable, or you’re testing at the wrong level (too high), and you’re better off with integration tests than unit tests. 100% code coverage is a really bad idea for complex software; there are massive diminishing returns in value the higher you try to push it. We change stuff all the time in our software too, so if everything has high-level unit tests then you spend more time fixing those tests. I tend to find you spend ages writing tests, then if you change the implementation you have to change the tests, and you can’t run them to see if you broke anything because you had to change the tests to run them.

Me

Test Driven Development (TDD)

There’s a methodology called Test Driven Development where you write a test first. It will fail if you run it, because there’s no functionality yet. Then you write the implementation and get the test to pass. Then you move on to writing the next test, and repeat. You build up a suite of tests and get feedback on whether your new changes have broken logic you wrote previously.
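A minimal sketch of one red-green cycle (the function and names here are made up for illustration):

```csharp
using Xunit;

// Step 1 (red): write the test first; it fails to compile/pass because
// Slugify doesn't exist yet.
public class SlugTests
{
    [Fact]
    public void Slugify_ReplacesSpacesAndLowercases()
    {
        Assert.Equal("hello-world", TextUtil.Slugify("Hello World"));
    }
}

// Step 2 (green): write just enough implementation to make the test pass.
public static class TextUtil
{
    public static string Slugify(string input) =>
        input.Trim().ToLowerInvariant().Replace(' ', '-');
}

// Step 3: write the next failing test (punctuation? multiple spaces?) and repeat.
```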

I was recently listening to a podcast and the guest said that he always writes code first, then adds tests after. If he can’t write tests, he will change the code just for the tests. I wasn’t sure exactly what he meant by this; maybe it’s like when people write a new constructor which is only ever called by the tests. But that’s bad design.

I thought he may as well just do TDD from the start, instead of always going through that struggle. He says TDD doesn’t often lead to good design, because you aren’t thinking about design, you are just thinking about how to make the tests pass.

But doesn’t the design organically come from TDD? And his approach of changing the design just for the tests is exactly what he criticises TDD for. TDD often slightly over-engineers the solution with the likes of interfaces, but by avoiding TDD and writing the tests afterwards, his way adds “technical debt” via extra constructors that are only used by the tests.

“I’ll add tests in a separate change later”.

 5 reasons to add tests before merge:

1. Clear memory: Before merge, everything is fresh in my mind. I know what the code is supposed to do, because I wrote it. So I also know what tests I should write to assure it works. Every minute that passes after merge, I will understand the feature less, and thus, be less equipped to add proper test coverage.

2. More effective reviews: If I write the tests before merge, then anyone reviewing my code can use my tests to help them understand the code, and to watch the feature run.

3. Faster development: If I write tests during development, I can use the tests to accelerate my development. I can “lean” on my tests as I refactor. Faster feedback loops = faster development.

4. Better design: Writing tests during dev encourages me to write code that is testable. It makes me consider accessibility too since that tends to make automated testing easier by providing well-labeled targets.

5. Changing priorities: After merge, there’s no guarantee that I’ll have time to write the tests at all. I may get pulled away for other more “urgent” tasks.

Bottom line: The proper time to add tests is *before* merge.

Cory House

I recently saw the following conversation. A developer was basically saying he didn’t have time to write the tests, and that doing so might require some drastic refactoring, which would be risky. The plan was to rely on manual testers and get the changes released. Then, although the next part probably won’t happen (because important features will be prioritised), his suggestion was to make the changes with good unit test coverage in the next release.

Senior Developer:
This domain supports unit testing, you should be able to add tests to cover the changes you made to make sure it behaves as you expect

Developer
Currently there are no unit test cases available for the changes made class, and the class is tightly coupled. I have written some draft tests and will check them next month as a priority.

Architect
IMO, given the pressure, timescales and urge to complete this, I think we can defer for now and stress the testers to pay more attention to the areas that we have low confidence.

Senior Developer:
So instead of checking if it is correct by adding tests that we can be sure exercise the code changes, we just merge it and hope that the manual testers find any bugs over the next day or so, and if they do, then it is back to the dev team and another change?

Time In Unit Tests

Tests should be deterministic. If a test passes, then with no changes made, running it again should also pass (obviously). An unreliable test doesn’t give you confidence in the code changes you make. It’s a surprisingly common occurrence: you make a change, an unrelated test breaks, and you think “how can those changes break that test?” Then you look at what it is doing, and it’s often something to do with time.

You see something like:

        "BirthDate": "1957-01-15T00:00:00"

And the test result says:

        Expected: "Age":"67y"
        Actual:   "Age":"68y"

Today is their birthday!

What you need to do is put a “wrapper” around the code that gets the current date. Instead of calling DateTime.Now directly, you create a class called something like DateTimeProvider, which in production returns DateTime.Now. Then in your unit tests, you create a MockDateTimeProvider and make it return a hard-coded date. That way, no matter when you run the test, it always gets the same date, and the test is deterministic.
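The wrapper described above might look something like this (class names are illustrative):

```csharp
using System;

public interface IDateTimeProvider
{
    DateTime Now { get; }
}

// Production implementation: just defers to the system clock.
public class SystemDateTimeProvider : IDateTimeProvider
{
    public DateTime Now => DateTime.Now;
}

// Test implementation: always returns the same hard-coded date,
// so any test using it is deterministic regardless of when it runs.
public class FixedDateTimeProvider : IDateTimeProvider
{
    private readonly DateTime _fixedDate;
    public FixedDateTimeProvider(DateTime fixedDate) => _fixedDate = fixedDate;
    public DateTime Now => _fixedDate;
}
```

Production code then takes an IDateTimeProvider dependency instead of calling DateTime.Now directly, and tests inject the fixed version.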

I recently fixed some tests that were failing between 9pm and midnight. I found that a developer had changed the MockDateTimeProvider to return DateTime.Now, completely defeating its purpose. Other parts of the test added 3 hours to the current time, and because 9pm + 3 hours is tomorrow’s date, the date comparison then failed.

public class MockDateTimeProvider : IDateTimeProvider
{
    public DateTime Now { get { return DateTime.Now; } }
}

I think another red flag in unit tests is conditional statements. Logic should be in your production code, not in your tests. Not only does the following code contain DateTime.Now, they have also put a conditional If statement in there, so if the assertion would normally fail, it executes the other branch instead and passes. So maybe the test can never fail.


[Fact]
public void ExpiryDateTest()
{
        DateTime? expiryDate = (DateTime?)Convert.ToDateTime("12-Dec-2012");
        _manageSpecialNoteViewModel = new ManageSpecialNoteViewModel(_mockApplicationContext.Object);
        _manageSpecialNoteViewModel.ExpiryDate = Convert.ToDateTime(expiryDate);
        
        if (_manageSpecialNoteViewModel.ExpiryDate < DateTime.Now.Date)
                Assert.True(_manageSpecialNoteViewModel.IsValid());
        else
                Assert.False(_manageSpecialNoteViewModel.IsValid());
}

Other Bad Unit Tests

Maybe the most obvious red flag, even to non-programmers – is testing that the feature is broken. The developer has left a code comment to say it looks wrong!

Assert.Equal("0", fileRecordResponse.Outcome); // I would have thought this should have been -1

The One Line Test

How do you even read this? Is that actually one line? 🤔🧐

_scheduledJobsRepo.Setup(r => r.GetAllAsNoTracking(It.IsAny<Expression<Func<ScheduledJob, bool>>>(),
        It.IsAny<Func<IQueryable<ScheduledJob>, IOrderedQueryable<ScheduledJob>>>(),
        It.IsAny<int>(),
        It.IsAny<int>(),
        It.IsAny<Expression<Func<ScheduledJob, object>>>()))
        .Returns((Expression<Func<ScheduledJob, bool>> expression,
        Func<IQueryable<ScheduledJob>, IOrderedQueryable<ScheduledJob>> orderBy,
        int page, int pageSize,
        Expression<Func<ScheduledJob, object>>[] includeProperties)
        =>
        {
                var result = _scheduledJobs.AsQueryable();
                if (expression != null)
                {
                        result = result.Where(expression);
                }
                result = orderBy(result);
                result = result.Skip(page * pageSize).Take(pageSize);
                return result;
        });

When it is that hard to read, I wonder how long it took to write it.

Other Common Mistakes

I think tests can be unclear when people use a unit testing library without understanding what features are available. For example, instead of using the library’s built-in exception assertion (such as ExpectedException or Assert.Throws), they come up with a convoluted solution like a try/catch block that flags the test as passed or failed.

try 
{ 
        Helper.GetInfoArticles(articleName, _httpWebRequest.Object); 
        Assert.IsFalse(true); 
} 
catch 
{ 
        Assert.IsTrue(true); 
} 
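In xUnit, the same intent can be expressed directly with Assert.Throws, which fails the test if no exception is thrown, or if the wrong type is thrown. Assuming the call above is expected to throw, and guessing ArgumentException as the expected type for illustration:

```csharp
// Fails the test if the call completes normally, or if it throws a
// different exception type than the one specified.
var ex = Assert.Throws<ArgumentException>(
    () => Helper.GetInfoArticles(articleName, _httpWebRequest.Object));
```

This is both shorter and stricter than the try/catch version, which passes for any exception at all.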

Naming tests is tricky: the name needs to make clear what the test does and to differentiate it from other tests. The worst is when the name says something completely different, most likely from a copy-and-paste mistake.

I’ve talked about how using DateTime can make tests fail at certain times. You can also end up with tests that rely on shared state, where the order in which you run the tests causes failures, because a test expects data to be set (or not set) by another.

Bank

Tests are even more important in certain domains. You know, like when money is involved. The Commercial Bank of Ethiopia allowed customers to withdraw more cash than they had.

“I’m really interested how a bank managed to deploy code that didn’t have tests for “can a user withdraw money when they don’t have enough balance.” Development teams at banks are usually conservative, process-heavy and slow-moving with changes exactly to avoid this. Wow”

Conclusion

Unit tests can be a useful and helpful tool to developers. However, there is an art to writing them and they have to be written with good intentions. If they aren’t written to be useful, fast, and reliable, then developers either won’t run them or won’t trust them.

Computers cannot handle floating point maths

I’ve been writing a game using Unity, and I’ve realised how rarely I use floating point (decimal) numbers in my day job as a software developer. I remember my Computing teacher back in college stating that computers are rubbish at calculating decimal numbers, but I never really understood why; calculators work just fine. They may only show several decimal places, but that is usually more than you need.

In game development, decimal numbers are frequently used for sizes and positions in a 3D space and therefore calculations with these objects involve lots of decimal maths.

After writing some logic to position my characters on the map, I tried writing a unit test. Unit testing this method could be beneficial when I add extra features later.

I wanted to compare a Vector, which is composed of 3 floating point numbers. Since I had already written the logic and was happy with the character’s positioning, I just grabbed the output of the method and specified it as the expected output. But my test failed!

Expected: (-34.28, 0.00, 20.63)
But was:  (-34.28, 0.00, 20.63)

So then I inspected the Vectors more closely:

"(-34.28, 0.00, 20.63)"
	magnitude: 40.00208
	normalized: "(-0.86, 0.00, 0.52)"
	sqrMagnitude: 1600.166
	x: -34.275
	y: 0
	z: 20.625
 
new Vector3(-34.28f, 0.00f, 20.63f)
"(-34.28, 0.00, 20.63)"
	magnitude: 40.00894
	normalized: "(-0.86, 0.00, 0.52)"
	sqrMagnitude: 1600.715
	x: -34.28
	y: 0
	z: 20.63

I think the numbers it shows are a text representation which rounds to 2 decimal places. But if you look at the actual X values, one is -34.275 and the other is -34.28, so there’s a 0.005 difference. A similar problem applies to Z.

What also threw me is that when I looked at the documentation, https://docs.unity3d.com/ScriptReference/Vector3.Equals.html , I saw this:

“Due to floating point inaccuracies, this might return false for vectors which are essentially (but not exactly) equal. Use the == operator to test two vectors for approximate equality.”

I was originally using Assert.Equals(expected, actual), so I switched to Assert.True(expected == actual), but it still wasn’t working.

I looked at the Unity source code, and they provide their own == operator for vectors: it does a “square magnitude” calculation, then checks whether the result is less than a certain number, a margin of error.

This is Unity’s version:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool operator ==(Vector3 lhs, Vector3 rhs)
{
   float num = lhs.x - rhs.x;
   float num2 = lhs.y - rhs.y;
   float num3 = lhs.z - rhs.z;
   float num4 = num * num + num2 * num2 + num3 * num3;
   return num4 < 9.99999944E-11f;
}

So I did something similar but increased the number. I came up with this, using Unity’s provided square magnitude function. I wasn’t sure if it made mathematical sense, but it worked for my scenarios:

private static void AssertAreApproximatelyEqualDueToFloatingPointImprecision(Vector3 expected, Vector3 actual)
{
   var difference = Vector3.SqrMagnitude(actual - expected);
   Assert.IsTrue(difference <= 5.0E-04);

/*
eg
0.0004996415
<=
0.00050
*/
}

Looking back at this, I probably overcomplicated it. I should probably just have given the expected values more decimal places, because the rounding to 2 decimal places was the actual problem in this case. In other cases, with more decimal places involved, you will hit issues due to the way computers handle floating point numbers, which is what the Unity docs are alluding to.

In other situations, you also need some leeway in your calculations. If you want to move a character to a certain position, you may find that they end up 0.1 away from their destination; then on the next frame they move their usual distance based on their movement speed, and overshoot it. So you end up needing to write code like “if distanceRemaining < 0.2f then stop”.
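In Unity terms, that leeway check might look something like this (the field names are hypothetical; as a design note, Vector3.MoveTowards clamps the step so the character never moves past the target):

```csharp
using UnityEngine;

public class Mover : MonoBehaviour
{
    public Vector3 destination;
    public float speed = 5f;
    private const float StopThreshold = 0.2f;

    void Update()
    {
        // MoveTowards clamps the step length, so it never overshoots.
        transform.position = Vector3.MoveTowards(
            transform.position, destination, speed * Time.deltaTime);

        // Treat "close enough" as arrived, rather than testing exact
        // floating point equality against the destination.
        if (Vector3.Distance(transform.position, destination) < StopThreshold)
        {
            // arrived: stop, snap to the destination, or trigger the next behaviour
        }
    }
}
```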

So floating point numbers, especially in game development, really do add extra complications to think about.

Project: Batch Approval

This long blog documents what I have been working on for the past year. I had made lots of notes with the aim of writing a blog, in addition to taking extra notes from chat logs.

We originally estimated the project would take around 5 months, plus an extra 2 months for testing and going through our slow rollout process. It actually took closer to a year. I’d say that was down to a combination of:

  1. realising the feature was more complicated than anticipated
  2. the UX team had little knowledge of the actual user experience
  3. managers changing or trying to change team members
  4. our slow release process
  5. 90/10 rule of project management

We were told the project was important, yet we were only assigned 2 developers (as in myself and one other). As the project came to a close, we were being integrated into our new team, therefore other developers could help out during the final stages.

Here is a list of all the people involved over the project’s lifetime:

Name (core team in bold)    Role
Me                          Developer (Team Lead)
Daniel                      Developer
Dean                        Developer (Temporary)
Dennis                      Developer (Temporary)
Tina                        Tester
Tim                         Tester
Colin                       Technical Manager
Mary                        Technical Manager
Olivia                      Product Owner
Owen                        Product Owner
Carl                        Customer Representative
Adam                        Architect
Andy                        Architect
Grace                       Safety & Legal Governance
Ulrika                      UX
Ursula                      UX

I’ve made the names start with a letter that represents their job title, apart from Colin, because he is a recurring person in my blogs. I’ll put reminders throughout the blog so it is easy to follow.

Current Software

To protect anonymity, I need to come up with a different theme for what the software is for. Let’s say customers request various restricted items of different severity. So a request could come in for a Hunting Rifle, and the user needs to know if they have the adequate licence to possess firearms and they are deemed medically safe in a recent time-frame. Possible warnings are shown which the user can dismiss/acknowledge, e.g. “licence is up for renewal in the next 3 months”, “recent purchase of other firearms”.

Standard users can create “Awaiting Approval” tasks and assign them to users with authority to approve. To approve them, the authorised users open the task list, view the details, then click approve. Many tasks have either no warnings or low-severity warnings, so users often just glance at the info and click Approve. The system then sends the approved request to a central system, then loads up the next task.

There’s a couple of seconds’ delay due to the “digital signing”, a couple of seconds for sending, then loading up the next record. Signing loads of tasks is a very slow and laborious process, and it’s a major source of complaints from our users.

Unsafe/Unofficial Automation

Carl [Customer Representative] sent a link to a video where someone was demoing a commercial automated tool that autocompletes the tasks. It waits for the system to load, clicks the approve button, then repeat. So you could set it running, then walk away from your desk.

I thought it seemed ridiculously irresponsible and would cause people to be sacked if they got caught using such a tool:

A) The program is now the one authorising the tasks, not the qualified user. What’s the point of having qualifications if you aren’t even going to read what is on-screen? If a task was wrongly approved, the user would still be accountable.

B) if you walk away from your desk, you are leaving your PC unlocked, along with your physical Security Key.

The creator had actually put a bit of thought into it though. If there are any Warnings that require another click to dismiss/override, then the automation is paused.

The video claimed that some users have up to 500 tasks to sign after a weekend. They charge a fixed yearly fee of £295, plus 7p per customer on the system per year.

“the robot does not get bored, does not make human errors, and crucially is a lot cheaper than the user’s hourly wage”

Promotional video for the Automation tool

Probably just makes robotic errors instead!

I said we should change the names of the buttons to screw them over, since the tool probably uses the button text to locate the button to click. It would be quite funny to make it dish out refunds instead.

The existence of the automation tool shows how much the users desire a better solution.

UX User Feedback

Given the existence of such an automated tool, it is no surprise that one frequently requested feature is Batch Approval. Our UX team put together some kind of interactive prototype and invited a few users to provide feedback on two designs. The alternative design was actually produced by Mary [Technical Manager] who has no UX qualifications. I’m not sure how that came about and why UX agreed to trial her design, but the feedback was actually extremely favourable to her design.

This caused her to be quite smug and maybe caused some animosity as we will see later. The ratings out of 5 were:

(Option A) 4.3 for Mary’s design

(Option B) 2.3 for UX Team’s design

For additional comments, one user commented:

“I prefer Option A by a country mile – Option B feels even worse than the existing system!”

Another commented:

“Option B feels more clunky, less user friendly than option A. A lot of clicking involved”

One even gave a threatening response:

“Option A or you’re gonna lose me and my franchise”

Shortly afterwards, there was a write-up from a conference where the feature was announced:

“This item is one that really did steal the show – this is something that our customers have been very eager to see us implement and are very excited to learn that we are busy developing this solution.”

“Busy developing this solution” made me laugh, because at the time, all I had was a dialog box with a couple of lines of text and a button.

Proposed Change

The general idea is that the user is presented with key details from the tasks in a data grid.

  • They can click checkboxes to select which tasks they want to approve.
  • These are added in a queue to send in the background.
  • The user can continue working as they are sending.
  • The “digital signing” has to take place on the user’s computer so a large part is done client-side.
  • The user has to remain logged in until the process is finished.
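The queueing idea can be sketched roughly like this. This is a hypothetical Python sketch, not our actual C#/WinForms code; `sign_task` and `send_to_central` are invented stand-ins for the real signing and sending steps:

```python
import queue
import threading

class ApprovalSender:
    """Hypothetical sketch of the background send queue: selected tasks
    are enqueued, then signed and sent off the UI thread, so the user
    can keep working while the batch drains."""

    def __init__(self, sign_task, send_to_central):
        self._queue = queue.Queue()
        self._sign = sign_task          # client-side "digital signing" step
        self._send = send_to_central    # upload to the central system
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def enqueue(self, task):
        """Called when the user ticks a checkbox and confirms."""
        self._queue.put(task)

    def _drain(self):
        while True:
            task = self._queue.get()
            if task is None:            # sentinel: shut down
                break
            signature = self._sign(task)   # slow, must happen client-side
            self._send(task, signature)
            self._queue.task_done()

    def finish(self):
        """Block until everything queued has been sent. The user has to
        stay logged in until this returns."""
        self._queue.join()
        self._queue.put(None)
        self._worker.join()
```

The single worker thread keeps the sends strictly in order, which matters if the central system expects approvals one at a time.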

This project had actually been discussed for years, but because there wasn’t much of a “commercial drive” for it – we would be giving users this feature for free – it was always low priority.

Product Owner: Owen

I think the initial planning was done by a different Product Owner, but when the project fully began, we were assigned a new one: Owen. He was new to the company, but he also gave me the impression that he was new to the role…and he didn’t seem very clever in general.

Here are some exchanges from various meetings (mainly Sprint Planning and Refinement).

Owen: "which work item is it?"
Me: “the one right at the top"
Owen: slowly scrolls...chooses 2nd item

Me: "it's not a Must, it is a Could"
Owen saves it with Must tag
Tim [Tester]: "No, Owen, you tagged it wrong, go back"
Owen: "Which WI is this?"

saves it with the Must tag again
Then goes back into the work item and gets confused
then goes back into it again. I think he needs rebooting

Me: "you need to set the state"
Owen clicks to close
Me: "you need to set the state, go back"
Owen is confused
Me: "left hand side Owen!"
Owen hovers over the right
Me: "left hand side Owen!"
Owen moves down

Me: "leave it as it is"
Owen "Which one shall I take out?"
I'm sure he is intentionally 30 seconds behind to wind us all up

Owen changes Story Points from 3 to a 5 without any discussion.
"shall we keep it at 5?"

For another item, I was explaining how the requirement was either obsolete, or needed a completely different approach from the initial proposal.
Owen: "So how many points shall we add?"

"The system crashes when entering incorrect PIN and clicking 'OK' on error prompt"
Owen: "what was the behaviour before we fixed this?"
team: "It crashed"

We were discussing a bug we logged a few months back but haven’t seen occur since, so it will need some investigation to try to work out what the recreation steps are.

“Assuming the bug still exists, how long will it take to fix it?”

Owen

Estimating software changes is hard, but I always think bugs are even harder to estimate. It’s only possible if there are clear recreation steps; otherwise it is stupid to ask – we can’t fix it if we don’t know what the problem even is.

“depending on Grace’s [Safety & Legal Governance] feedback, do you know how long it would take to fix?”

Owen

Translation: can you predict what Grace would say, and given that she did say it, can you come up with an estimate for it?

I logged a bug with suggestions on how to improve a dialog. It would be up to Owen or UX to decide on the approach to fix it. Owen then asked questions along the lines of: “what do we need to do for this? do we need it?” I said it would be nice, but it’s not my decision. Then he still asked “do we need it?”, “can we close it?”

What’s the point of asking me these questions, when I logged it with the aim of asking him to decide?

When the project deadline was looming, we ended up having multiple meetings to decide if there were any features we could scrap, or defer to a later release. After the first meeting where we decided scope, he may as well have said “You know those items you said we needed to do and couldn’t defer? Are you sure we can’t defer them?”, because he kept arranging subsequent meetings to go back over them. When we came up with estimates showing that we would need at least another month, he arranged yet another meeting to re-estimate them.

The Architects

An important project started around the same time ours did. Our architect, Adam [Architect], was reassigned to the new project. Andy [Architect] joined our team as a replacement. He wasn’t completely new to the company, but he wasn’t familiar with this area of the system. I don’t think he ever looked at the software, or even requested a demo.

Any question we asked him, he made an excuse that he was busy and would get back to us later. Then when he did answer, I messaged the original architect, Adam, who said that Andy had simply asked him and relayed the answer back to us. So basically Andy wasn’t doing anything: we had him officially assigned, but it was Adam [Architect] answering the questions via a middle-man.

The July Cancellation

There was a bit of disruption when our project looked to be cancelled, but there was apparently some miscommunication.

Hi All, a decision has been made by Directors to stop Batch Approval and to move resources across to pick up Project France instead. Therefore I will be cancelling the Batch Approval meetings.

Project Manager

1 day later

The directors had decided to move you to the new project so I cancelled the meetings, but then I find that there wasn’t a firm decision from the Directors.

Project Manager

Brian has asked us to proceed with Batch Approval as originally planned. Sorry about the chaos dudes. They must be smoking some good drugs upstairs.

Olivia [Product Owner]

It was off the table, then someone put it back on the table, then someone else swept it off the table, then someone picked it up off the floor and put it back on the table.

Andy [Architect]

Coding Tales

Colin [Technical Manager]: "What sprint are you in?"
Me: "I dunno"
Colin [Technical Manager]: "you are the team lead, you should know"
Me: "No one in the team knows"

Put it in a new tab but make it behave like a dialog

The original UX designs looked like it fit nicely in the existing Task Framework. The requirements were that Batch Approval had:

  1. Its own folder but is a sub-folder of Approvals
  2. Opening a task opens it in a new tab

After looking at the code though, the framework didn’t actually support a sub-folder. But we found a basic workaround to make it look like it did. However, there were quite a few features that we got “for free” which we didn’t want, because they weren’t appropriate for a sub-folder. So I had to disable those features with hacky code.

If you double-click a task, it opens in a new tab, which is what they wanted. However, they then didn’t want you to be able to navigate away into other parts of the system, and the Task Framework didn’t support that. With a bit of a workaround, I got that working, but the tab was designed to view one task only, and we were displaying a Batch of them. A few weeks went by and I managed to cobble something together, but the code was awful.

I took a step back and thought about it.

  1. We have a tab that the users surely would expect to be able to move away from to view other tabs.
  2. I’m using this “tab” which is designed for a single task, and I want multiple. So I had to make my own custom page.
  3. We have hacked a sub folder and had to basically fight against the codebase to get it all working…

So why don’t we just have a button on the main folder, and it launches a modal dialog?

  1. It would take a couple of days to get working,
  2. the code would be neat,
  3. and I think it’s what the user would expect.

After speaking to UX about it, they were happy with my proposal. I had wasted about 3 weeks trying to get it working the way they had previously wanted. Also, we were once again telling UX what a good UX design is.

Scrollbar

UX were also clear that they didn’t want a scrollbar to appear; instead we should use pagination. I didn’t see anything obvious in the standard DataGridView WinForms control, although I’m sure this is a common problem/requirement.

I ended up writing my own logic to add controls to the grid, keep track of the size, then stop adding when the size exceeds the height of the control. However, if there is only 1 very large task, we have no choice but to use a scrollbar.
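Stripped of the WinForms details, the fitting logic I wrote amounted to a greedy fill, roughly like this (a simplified Python sketch working with plain heights instead of real controls):

```python
def fill_page(task_heights, page_height):
    """Greedy page fill: add tasks in order until the next one would
    overflow the page. Returns (heights that fit, heights pushed to
    the next page). If even the first task is taller than the page,
    it is shown alone and a scrollbar is unavoidable."""
    used = 0
    fitted = []
    for i, height in enumerate(task_heights):
        # Always accept the first task, even if oversized.
        if fitted and used + height > page_height:
            return fitted, task_heights[i:]
        fitted.append(height)
        used += height
    return fitted, []
```

The hard part in practice wasn’t this loop – it was that the measured heights shifted once a scrollbar appeared and stole horizontal space, as described below.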

The problem we encountered was that sometimes a scrollbar did appear when it shouldn’t. I made some tweaks to the calculation and it seemed to work fine. But then a Tester found a combination of task sizes where it still appeared. I couldn’t work out what I was missing in the calculations but it seemed about 4 pixels off, so I just added that into the calculation. Again, all seemed fine for a few days, but then the Tester found a combination of sizes where it still appeared.

Olivia [Product Owner] suggested that we detect when there is a scrollbar then disable the Approve button until the user scrolls down.

I said if we know when the scrollbar is there, why don’t we just remove the last task and check for the scrollbar again, repeat until the scrollbar has gone. I thought the code would be messy, and I’d end up writing a stupid code comment like “mate, something has gone wrong with the calculations here, so we’re gonna have to do some jiggery pokery to get out of this mess”.

Adam [Architect] did suggest some alternatives and they were just as wildly wrong.

Dean, a developer in another team, agreed to help, and after a couple of days he said “you can just set the vertical scrollbar to be disabled”.

But if the scrollbar is appearing so you have to scroll to view the content, then surely disabling the scrollbar will mean content is off the screen?

I tested his idea, and it worked fine! What must have been happening is that the vertical scrollbar appears and takes some of the horizontal space… which causes the text to wrap and creates the need for more vertical space. Therefore the scrollbar is required and so it remains. But if you tell the scrollbar it cannot appear, the controls are added and my calculations fit the grid perfectly.

It’s a self-fulfilling prophecy!

Olivia [Product Owner]: Do we have concerns about the unknowns?
Tim [Tester]: It's just the unknowns that we don't know about
I feel like you need to know the system inside and out to be able to safely implement this

Conflict With The UX Team

UX: “We want to minimise pop-ups”
Also UX: “Add a pop up after closing the dialog”

Ulrika [UX] had to take time off to deal with some personal problems. Ursula [UX] agreed to join the meeting we arranged on the Wednesday.

“I don’t work Thursday/Friday and have to leave early on a Wednesday to get the kids. I’ll get back to you next week”.

Ursula covered for Ulrika, but then also had time off.

When she got back to us, she seemed to overlook how users access this restricted part of the system, and it turned out none of the UX team actually had this knowledge. So halfway through the project, we were discovering new requirements because they hadn’t designed the user flow.

Don’t Have Time

In early January, we were waiting for UX to give us some approved text, but they seemed to be taking their time. I asked Olivia [Product Owner] what was going on, and she said that we didn’t have time to make any more changes, so they “needed to stop requesting changes”. When I pointed out that I was the one requesting the changes, she said “we don’t have time to test” (even though testing only involved quickly checking that some text had changed on a message box). Nearly 2 months went by before we actually began to release.

After more protests from me, she says:

“The text is fine for now. We don’t have time to be changing it.”

Olivia [Product Owner]

When it came to the final review, reviewers questioned why we had dialogs with ToDo comments on them saying “ToDo: Awaiting UX approval“. Even without comments like that, I have seen developers question user-facing messages if the grammar isn’t correct or sounds unclear. It definitely wasn’t clear, because we had just written the first thing that popped into our heads at the time, knowing the text would be replaced.

I think what had happened was that Mary [Technical Manager] and Olivia [Product Owner] had fallen out with Ulrika [UX], and were refusing to authorise her changes. Remember, tensions will have been building since users criticised Ulrika’s design and wanted Mary’s, and Mary’s arrogance about it wouldn’t have gone down well.

It’s just part of the process though – all text needs to be approved by the UX team; otherwise what is the point of their team?

Conflict With The Architect

When we implemented Adam [Architect]’s suggested invalidation logic, we thought the criteria were too restrictive. Adam was on annual leave for a few weeks so we couldn’t consult him. So we made our own decision to change it, and got Carl [Customer Representative] and Grace [Safety & Legal Governance] to agree. However, when Adam saw it, he said it was unsafe. In many meetings, I got the impression Grace wasn’t really listening; she tended to agree with whatever we said. Not exactly great when your job involves telling the team what is safe and legal, only to be overruled by the Architect.

We came up with a compromise, and implemented it. Then, when it came to the Code Review, Adam suggested removing one more of the sub-rules – which I thought would be perfect – but Olivia [Product Owner] was reluctant for us to make more changes.

Then a week later, Olivia said she would arrange another meeting to discuss the rules because she felt they might be too restrictive. OMG. However, she seemed to have personal grievances with Adam, so she told me not to make the simple change, even though it would be what we wanted. She used lack of testing time as an excuse.

Adam [Architect]: "We shouldn't be knowingly introducing bugs."
Olivia [Product Owner]: "This is not a bug. It's a change to the criteria and we are not going to change it a week before we finish. I am speaking to Carl [Customer Representative] about changing the criteria, and we'll look at it then."
Adam [Architect]: "A bug is any deviation from requirements. Why are you planning on changing it if it is not a bug?"
Olivia [Product Owner]: "That's not a bug. You are right in the sense that we need to change it…we're just not changing it now. I was happy to leave it as it was to get this out of the door. That's my call to make."
Mary [Technical Manager]: "There's a lot that's not right. But how long do we keep going until we give it to the customers?"

A summary of how this situation appears to me:

  1. There is a process, but if you declare you want to move the process to the next release, then it is fine.
  2. It will take too long to change a few lines of code, so we ain’t doing it. Apart from when it is a comment on the Code Review, then we are doing it, apart from those that we aren’t.
  3. It takes longer for Olivia [Product Owner] to argue against it than to fix it.

The CEO had recently posted:

“The most important thing we do every day is keep our users and their customers safe by managing risk effectively. I know you all know this, but it warrants repeating: safety is our number 1 priority all day, every day – regardless of anything else that is going on. It trumps everything. Please always remember that.”

CEO

Our Managers are like:

“Next release”

The Technical Manager change

Colin [Technical Manager] complained that Daniel [Developer] and I hadn’t handled the project well – it had overrun by over a month at that point. A week or so later, the team was on a call with other stakeholders and he said

“you guys have done a tremendous job”,

Colin

then said the delay “was caused purely by scope-creep and nothing to do with the developers at all”.

“Mary is in charge of the team since yesterday”

Colin [Technical Manager] with his timely announcement

I got the impression that Mary just wanted to get rid of the project, because it was dragging on for far too long.

The Testers had nothing to do since we Developers were working on the last few bug fixes. Tina [Tester] said she was just re-testing old features to pass the time, and to get extra confidence that there were no remaining bugs. Mary [Technical Manager] replied:

“should we be doing testing when changes are ongoing?”

Mary

Well, by that logic, testers should only be hired for a couple of weeks right at the end of a project – since changes are constantly ongoing. I think she might have intended it to mean “you’d better not find more bugs!”, but if there are bugs, then you definitely want to find them before our users do.

On the last day of the Sprint, Tina [Tester] took annual leave. She had left her assigned items in the “To Test” column of the Kanban board. There was no evidence she had tested the items, so I don’t think it was a case of just forgetting to move them to the “PO Approval” column. Olivia [Product Owner] and Mary [Technical Manager] then decided to just close the items. No evidence, no demo – just close them so the Sprint looks good, and looks ready to release.

What annoys me is that Mary had criticised how we ran our team and suggested we didn’t follow the process. She stated that she perfectly follows the process – which leads to her successful projects. Then I see her cutting corners like that.

Just like Colin, she criticises me to my face, but then when we are in a group she states:

“I think you’ve done a fantastic job given that there’s only 4 of you”

Mary

A few days later, I had finished what I was assigned, but there was a bug on the backlog which Mary [Technical Manager] seemed to want to defer (again, she just wanted to release the project as soon as possible). I thought it couldn’t be released without this fix. I stated that I would like to look at it and she said:

“don’t do any development work”

Mary

Seems I have the day off then. What is the point in me sitting around doing nothing? If I fix it, we can decide whether it goes straight in, or is deferred to the next release. Or maybe I won’t even find a solution. She just seemed desperate to finish the project, so she wasn’t considering the seriousness of the bug, or thinking logically at all.

The Backstab

I didn’t actually sit around doing nothing. I worked hard and found a solution. I knew there was no chance Mary would accept my changes, so I needed a way of convincing her. My plan was to get the testers to informally test it; then I could say that I had a fix, and that the testers were happy there was a low risk of introducing more issues – so she would be stupid to reject it.

Testers Tim and Tina were in agreement that the fix should definitely go out in the initial release, and they agreed Mary was making a bad decision to consider releasing without it.

Tim said he would “have to check with Mary if he was allowed to spend time testing it”, since they had been told not to test anything. I said “there is no way she would approve it, that’s why we are doing this informally/secretively”. If Tim and Tina tested it and found a bug, my plan would have failed and Mary would never need to know I had attempted it.

It’s a perfect plan, or it would have been, but Tim then goes and tells Mary that I asked them to test it.

“You gotta start being better with your communications – it’s not just yours and Tim/Tina’s decision if something gets put into the release – it’s a whole team decision but ultimately mine and Olivia’s. You’ve messaged them directly asking if they can get it tested, and as much as they’ll also want to get it done, it then puts them under pressure. This is how you’ve all got to a stage of being all over the place and burning yourselves out, it’s got to stop please.”

Mary’s chastisement

I shouldn’t have to go behind people’s backs and make my own decisions, but the entire non-management side of the team thought it should go in, and only the managers thought it shouldn’t. As a team we care about quality, but managers are just focussed on deadlines.

I also didn’t appreciate her accusing my decision-making of adding stress to my team.

80% coverage

As the project got towards completion, I recalled our stupid “Merge Ready” process that no one seems to care about other than the small team who came up with it. You have to justify the likes of Code Coverage, and ours was at a low figure like 10%.

I’ll write some blog posts in the future about my reasoning on when tests are good or bad. A simple explanation is that unit tests are good when they cover requirements, but developers often write them to cover implementation, i.e. verifying that a particular method is called, but not that the method actually works. When you switch implementation, you have to rewrite the unit test, slowing you down. Unit tests are supposed to help you refactor, but in this case they are a hindrance to refactoring. We did a lot of prototyping early on and knew there would be large re-writes, so Daniel [Developer] and I decided to worry about unit tests later on.
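To illustrate the difference with a made-up example (the checkout/discount scenario is invented, not from our project): the first test pins the implementation by asserting that a particular function is called, so it breaks under refactoring even when behaviour is unchanged; the second tests the requirement itself, so any correct implementation passes.

```python
import unittest
from unittest.mock import MagicMock

def apply_discount(price, is_member):
    # Requirement: members get 10% off.
    return price * 0.9 if is_member else price

class Checkout:
    def __init__(self, discounter):
        self._discounter = discounter

    def total(self, price, is_member):
        return self._discounter(price, is_member)

class ImplementationCoupledTest(unittest.TestCase):
    def test_calls_discounter(self):
        # Verifies that a particular function is called, not that the
        # result is right. Inline the discount logic into Checkout and
        # this test breaks, even though the behaviour is unchanged.
        mock = MagicMock(return_value=90)
        Checkout(mock).total(100, True)
        mock.assert_called_once_with(100, True)

class RequirementTest(unittest.TestCase):
    def test_member_gets_ten_percent_off(self):
        # Tests the observable requirement: refactor freely, and this
        # still passes as long as the behaviour is correct.
        self.assertEqual(Checkout(apply_discount).total(100, True), 90)
        self.assertEqual(Checkout(apply_discount).total(100, False), 100)
```

The mock-based style has its place for verifying interactions with slow or external systems, but when it is the default, every refactor means rewriting tests.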

When I declared the low number of Unit Tests, Olivia ended up raising it to the Directors for some reason. Why was it their concern? Do they even know what Unit Tests are for, and what the coverage actually means?

It could jeopardise my chance of pay rises (I was correct, I got 0% this year) and tarnish my reputation.

When Mary joined the team, she berated me over this decision and made the dramatic statement:

“We can’t go on like this”

Mary

She then asked a couple of her favourite developers to write some Unit Tests for my project, completely undermining me.

The thing is, both Dean [Developer (Temporary)] and Dennis [Developer (Temporary)] spent way longer than they estimated, and they didn’t achieve as much as they hyped. Then, when it came to making the last few changes, their tests slowed us down.

We ended up at around 22%, and the managers decided that was fine.

That’s the problem with us though… Do 80% coverage because it’s important. But actually it’s not that important, so you don’t need 80%. But TRY get 80%, Why?, Dunno, but the Document says.

Tim [Tester]

On track

Dennis [Developer (Temporary)] was also asked to help out addressing the Code Review comments. In some ways, this kinda slowed us down. I told him I already had a branch with some changes in, and sent him a link to it so we could work together. When I caught up with him the next day, he said he had been working on a few of the comments I had already done, because he hadn’t looked at the link. What a waste of time.

When Mary asked for a progress report, Dennis reckoned it would take 1 day to go through the 20 comments – but he had done the 8 easy ones the day before, and we had the hard ones left. So I said it would be more like 4 days, and could take longer if they turned out to be surprisingly complicated. I was correct.

Final demo

At the final Project Demo, Carl [Customer Representative] said the sending process was far too slow. He had been in most of the demos from the start and had seen the progress across the project.

The original version I showed him was incredibly slow, but I had managed to speed it up significantly. So despite having witnessed those earlier versions months ago, he said the performance was a concern, and that users might not think it was a significant improvement.

We had all kinds of people turn up to this final demo – people from support, training etc. We should have had those guys in the early meetings. They were picking holes in the requirements and asking important questions. Although we gave good answers to most of them, I couldn’t help but think our changes might not be as useful as we thought.

If only we had got more users involved throughout the project, rather than just showing them some UX mock-ups before we started, and then a year later giving them the update and hoping for the best.

I’d like to reiterate just how hard the team has worked. They have worked their little socks off.

Olivia [Product Owner]

Conclusion

We were told the importance of the project, but because there wasn’t a direct commercial aspect to it, that importance wasn’t backed up by the number of developers assigned. With only 2 developers, and key staff like Architects and Product Owners switching throughout the project, everything slowed down and it felt to all of us like a low-priority project.

There were other morale-reducing aspects like when we were told the project was on hold, then Mary berating my decisions, and implying the failures were down to me.

There wasn’t a great understanding of the feature in many ways, illustrated by:

  1. how many requirements we discovered throughout the project,
  2. the UX team being clueless about many aspects,
  3. one Product Owner so clueless – it seemed he struggled to use a computer,
  4. then switching to a clueless Architect who just went straight to the original architect.