When migrating from on-prem servers to the cloud, the Deployment team decided to change the way servers were allocated, presumably to minimise the cost. They:
“designed separate pools for the public side and private side so that the compute and memory could be dedicated to (and protected from) other types of traffic. Due to this split we reduce the ratio of CPU cores to sites from 0.42 per site to 0.28 per site (as the cores were now dedicated to public, private all separately)”.
Deployment expert
Initially, this new way worked fine, but then during a particularly busy week, they saw slower response times. It actually led to the discovery of a problem we must have had for a while: SQL connections weren’t being properly disposed of, which created a bottleneck in the remaining available connections.
They added a temporary fix which was something to do with “Shared app pools”, rather than autoscaling the application servers, which would cost money. But this is a major advantage of the cloud: you can scale on demand.
So to no one’s surprise, when another increase in load happened, performance issues happened once again.
So now the fix should be autoscaling, right? No, they are still reluctant to do so. Instead, they added a fixed number of application servers. Surely that costs money, and increases our costs at quieter periods. I suppose I don’t know all the details, but it seems risky to choose a set number and hope that the load never exceeds it.
On Viva Engage, a manager posted a positive message stating that the AWS migration was a big success:
“I am thrilled to announce that we have completed the migration to AWS!
This is a major milestone for our cloud migration programme and has involved many team members across multiple disciplines working together.
We have migrated a whopping 505 TB of data across 4178 databases and have stood up over 1,080 application servers. There has been meticulous planning (over 130 steps in each migration), preparation and countless hours spent migrating our systems, including overnight and weekend working.
The impact of this collective effort extends beyond numbers and statistics. We have successfully improved stability and performance for our end users. The migration has enabled us to navigate the increased load challenges.”
Manager
Yet, someone shared this angry message from a customer. I’m not sure if the first sentence is sarcastic, or if they thought we had been somewhat supportive:
“Thank you for your support in what seems to be a “run” of problems for the business. After our awful experience in November when your software literally tipped over leaving us without a system, I did request that both the ombudsman and your company treated this as a significant event, looked into what went wrong and responded to me with an answer. To date I have not received any such feedback from either party.”
Sarcastic customer
I asked a Software Architect what he thought, since he is usually close to the gossip or involved directly.
“The Chief of Smoke and Mirrors will have some explaining to do. Performance improved quite a bit as a result of the 64-bit work done behind the scenes (not to the client), but now users do things faster with longer sessions, and they have plenty of infrastructure issues around the AWS changes that caused a lot of customers problems. As always, one group of people fix certain things, while one group breaks lots of things at the same time.”
Architect
So it sounds like there’s been some good work done, but also some mistakes made. Then internally, we are announcing it as a great success.
Someone also showed me this complaint where someone had visited a customer and reported back what they had witnessed:
“We visited a site yesterday displaying nearly all of the problems we have discussed to date – still having to reboot the software 10 to 15 times per day! System slowness (witnessed), documents not opening, closing when going into the orders module, first record load slow, changing an order – system crashes.”
Another reason for performance issues was due to missing config after the migration:
“some of our app servers are downloading/installing Windows Updates in core hours, which is causing poor performance for users.”
A simple workaround that sometimes happens is a “cache reset”. That sounds like a similar mindset to “turn it off and on again”, which does magically fix some problems. However, due to the migration, Support had got a bit confused about how to remote onto the new servers:
“cache resets were done on the wrong servers.”
Manager explaining why performance issues lingered for longer than expected.
Even after further tweaks to the cloud migration, updating the client to 64 bit, fixing SQL connections, and some other miscellaneous changes, the Support team were saying some sites were still having problems:
Can I confirm that things should be improving for all sites following all the brilliant work done? The customer is experiencing the below and I am visiting them tomorrow;
Customer issues
loading can take several minutes
Slowness and crashing every day, at least 9 or 10 times a day
No discernible pattern or time of day for slowness or crashing, and no particular machine is noticeably better or worse
Been experiencing performance issues for 2 years, but have gotten much worse recently (last 6 months)
experiencing significant delays when uploading records
Can take up to 1 hour to approve a small amount of external requests which can involve multiple restarts
Switching between records can lead to delays and ‘greyed out screen’ (not responding)
Constant and randomly crashes and needs restarting – staff having to partition out tasks such as viewing documents and approving tasks
Closing statement
It does seem like our performance issues are a bit of a mystery. I think we have run out of things to blame: customer internet, SQL connections, the 32-bit client, on-prem servers, a caching bug. Hopefully one day we will have a fast system.
Over the last few years, my employer has gone Cloud crazy. We are a large company so we have our own data centres. These are costly to run when you need physical space, staff, electricity, software licensing, and a plan of action when things go wrong.
I wonder if it is better to have your own servers when you are a big company. I always think Cloud is best for smaller companies that don’t have the resources to host it themselves.
“Our reasons for using the cloud are the same as others using the cloud.”
Our CTO
Not really true though is it? From what I saw quoted for the virtual machines for our test systems, I think Cloud is more expensive over time. On-prem has a massive up-front cost which is what they don’t like, but we have the capital to do it, unlike small companies that the Cloud is perfect for.
The recent drive to move away from our data centres came about because we needed to replace some old hardware and perform SQL Server upgrades.
I could imagine us moving to the cloud, managers then panicking when they see the monthly costs, then demanding we go back.
One aspect of the SQL Server upgrade was that they apparently needed to migrate the data to a new physical server. One of the tables they were concerned about was Audit, which gains a new row every time a user edits a record and which they stated held around 9 billion records. A copy of the changed data is saved as XML so that you can do a before/after comparison, and it is that particular column that is the problem.
So for the data that would remain in our data centres and move to a new server with a modern SQL Server version, the plan was to migrate the table without the XML column. In its place, a new boolean (true/false) column states whether there should be data there, and the XML itself is moved to the cloud.
So now we are paying to host the database in our own data centre, but also have certain data in AWS, which sounds like it should be more expensive. The justification is that we didn’t need to buy as much hard disk storage, which they reckoned could have cost a massive £500k! It also meant the migration to the new server in the data centre was faster.
Still, we needed to transfer the data to the AWS Cloud storage. I think the idea was that Audit data isn’t accessed much, so it’s better to move it to a cheaper but slower storage method, then request it on demand. So in our software, instead of displaying the data instantly when you view that record, there would be a “view more detail” button, and only then do we request it and show it.
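As a rough sketch of that “flag in the table, detail on demand” idea (not the real implementation): every name below, including AuditRow, HasDetail, IAuditDetailStore and AuditViewer, is hypothetical, and an in-memory dictionary stands in for whatever AWS storage was actually used.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical shape of a migrated Audit row: the XML column is gone,
// replaced by a flag saying whether detail exists in remote storage.
public sealed class AuditRow
{
    public AuditRow(long id, bool hasDetail) { Id = id; HasDetail = hasDetail; }
    public long Id { get; }
    public bool HasDetail { get; }
}

// Hypothetical abstraction over the slower, cheaper cloud storage.
public interface IAuditDetailStore
{
    string? FetchXml(long auditId);
}

// In-memory stand-in so the sketch runs without any cloud account.
public sealed class InMemoryDetailStore : IAuditDetailStore
{
    private readonly Dictionary<long, string> _blobs = new Dictionary<long, string>();
    public int FetchCount { get; private set; }

    public void Put(long auditId, string xml) => _blobs[auditId] = xml;

    public string? FetchXml(long auditId)
    {
        FetchCount++; // counts remote round trips
        return _blobs.TryGetValue(auditId, out var xml) ? xml : null;
    }
}

public sealed class AuditViewer
{
    private readonly IAuditDetailStore _store;
    public AuditViewer(IAuditDetailStore store) => _store = store;

    // Called only when the user clicks "view more detail".
    // Rows flagged as having no detail never trigger a remote fetch.
    public string? ViewMoreDetail(AuditRow row) =>
        row.HasDetail ? _store.FetchXml(row.Id) : null;
}
```

The point of the flag is that the common case (listing audit rows) never touches the remote store; only an explicit request for detail pays the slower lookup.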
I think the mindset is just to focus on the cost figures that are apparent. £500k sounds like a crazy figure, but if we look at the cost of storage over a few years, is keeping the data on our own servers really more expensive than paying Amazon to store it?
A new corporate buzzword that gets thrown around in this subject is FinOps, as in Financial Operations.
One of the challenges we have when we start to build a new service is around estimating the potential cost of that new service in AWS. This ultimately goes towards setting the budget expectation for that service and therefore how we monitor it from a FinOps perspective. Do we have any experience within the department or anything we can leverage to help us get better at understanding the potential budget expectations for a new service we’re building?
Concerned staff member
In one of the recent “Town hall” meetings, the CEO was ranting about how high our cloud costs were. He said we currently had £250k in AWS servers that are switched off (not sure if that was a yearly figure or, even more unbelievably, monthly). These were servers just for development/testing. If our testing teams are spending £250k on servers we aren’t really using, how much are we spending on ones we are actively using? Then how much does our live system cost?
Now when you see those figures, that £500k hard disk storage doesn’t sound too bad.
“FYI – Stopped instances don’t incur charges, but Elastic IP addresses or EBS volumes attached to those instances do.”
When people start talking about the cloud, they quickly start dropping in jargon. Sometimes they use multiple terms in the same sentence, and it quickly becomes hard to follow when you aren’t familiar with the cloud providers. Even if you are familiar with one particular provider, other providers use different terms for their equivalent services. I think AWS is particularly bad for its naming, which often isn’t intuitive. So when people start talking about Elastic Beanstalk, Route 53 and Redshift, it’s hard to grasp what the hell they are talking about.
Here’s an example of equivalent services by four different cloud providers.
Unit testing is a software testing technique where individual components of a program are tested in isolation. These components or “units” can be functions, methods, or classes.
When implemented correctly, unit testing is a crucial practice in modern software development. It helps ensure the quality, reliability, and maintainability of your code. By incorporating unit testing into your development workflow, you can catch bugs early, improve code quality, and ultimately deliver better software to your users.
When I first heard about unit tests, they seemed awesome. But the more I used them, the more my opinion of them declined. I find it quite hard to explain why, though.
I think in general, to make things testable, you have to split the logic up into smaller methods. But then when they are smaller, A) they are easier to understand and B) they are less likely to change. So if a developer has looked at that code, what is the chance they are gonna change it and break it? If you have a unit test and it never fails in the software’s lifetime, has it provided any benefit?
Then, if you do decide to change the behaviour, you have the overhead of rewriting all the unit tests, which can basically double the development time.
When certain scenarios would take ages to test manually, unit tests are very beneficial. Logic with loads of permutations or optional aspects is a prime candidate for unit tests. Without them, retesting every time you make a simple change is incredibly tedious. With them, you just click a button and wait a few seconds.
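To illustrate the “loads of permutations” case, here is a minimal sketch using an xUnit [Theory]. The Postage class and its pricing rule are invented purely for the example; only the xUnit attributes are real.

```csharp
using Xunit;

public static class Postage
{
    // Hypothetical rule with several optional aspects.
    public static decimal Cost(decimal weightKg, bool express, bool international)
    {
        decimal cost = weightKg <= 1m ? 3m : 5m;
        if (express) cost += 4m;
        if (international) cost *= 2m;
        return cost;
    }
}

public class PostageTests
{
    // Attribute arguments cannot be decimal, so doubles are passed and cast.
    [Theory]
    [InlineData(0.5, false, false, 3.0)]
    [InlineData(0.5, true, false, 7.0)]
    [InlineData(0.5, false, true, 6.0)]
    [InlineData(0.5, true, true, 14.0)]
    [InlineData(2.0, false, false, 5.0)]
    [InlineData(2.0, true, true, 18.0)]
    public void Cost_CoversEachCombination(double weightKg, bool express, bool international, double expected)
    {
        Assert.Equal((decimal)expected, Postage.Cost((decimal)weightKg, express, international));
    }
}
```

Each InlineData row is one permutation that would otherwise need a manual retest after every change.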
Unit tests give you confidence you can refactor without risk. However, they are not automatically a silver bullet. Well-written, fast, reliable tests accelerate development and save time. Poorly-written, slow, flaky tests hinder development and waste time.
A test that takes a second to run doesn’t sound slow, but what if you have hundreds or thousands of tests? If the tests take a long time to run, the developers won’t run them as often, or at all, then what value do they serve?
They also should run on a build to ensure only quality releases actually go live, but you want your release process to be fast.
There was a recent change where the developer claimed to have sped up a long-running call. However, he hadn’t carried that performance mindset over to the tests, and had actually increased the time to run them by 6 seconds.
“Thread.Sleep” can be used in threaded code to intentionally introduce a delay. I’ve seen many developers add this to a unit test. Tests are supposed to be fast, so you should never add it to a unit test.
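One way to keep real sleeping out of tests, sketched under the assumption that the delay lives in retry-style code: pass the wait in as a delegate, so production uses Thread.Sleep while the test merely records the requested pauses. Retrier and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using Xunit;

public sealed class Retrier
{
    private readonly Action<TimeSpan> _wait;

    // The wait is injected so tests can substitute something instant.
    public Retrier(Action<TimeSpan> wait) => _wait = wait;

    // Production wiring: really sleeps.
    public static Retrier ForProduction() => new Retrier(Thread.Sleep);

    public T Run<T>(Func<T> action, int attempts, TimeSpan pause)
    {
        for (int i = 1; ; i++)
        {
            try { return action(); }
            catch (InvalidOperationException) when (i < attempts)
            {
                _wait(pause); // pause between attempts
            }
        }
    }
}

public class RetrierTests
{
    [Fact]
    public void Run_RetriesWithoutReallySleeping()
    {
        var waits = new List<TimeSpan>();
        var retrier = new Retrier(waits.Add); // records pauses instead of sleeping
        int calls = 0;

        int result = retrier.Run(() =>
        {
            calls++;
            if (calls < 3) throw new InvalidOperationException("not ready");
            return 42;
        }, attempts: 5, pause: TimeSpan.FromSeconds(30));

        Assert.Equal(42, result);
        Assert.Equal(3, calls);
        Assert.Equal(2, waits.Count); // two 30-second pauses requested, none slept
    }
}
```

The test asks for two 30-second pauses but never actually waits, because the waiting itself is not what is being tested.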
Measuring Tests & ExcludeFromCodeCoverage
When people write unit tests, they want to try to understand how much of their code is covered by tests. We have this metric of Code Coverage but it has some severe limitations in the way that it is measured. It’s often a simple metric of “does the line get hit by at least one test”, but since methods can be executed with different combinations of variables, you can end up having 100% statement coverage but without actually testing many combinations at all.
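A small sketch of how a single test can reach 100% line coverage while leaving combinations untested. Checkout and its discount rule are made up for illustration.

```csharp
using Xunit;

public static class Checkout
{
    // Hypothetical pricing rule, invented for this example.
    public static decimal Total(decimal price, bool isMember, bool hasCoupon)
    {
        decimal discount = 0m;
        if (isMember) discount = 0.10m;
        if (hasCoupon) discount *= 2;   // a coupon on its own doubles zero
        return price * (1 - discount);
    }
}

public class CheckoutTests
{
    [Fact]
    public void Total_MemberWithCoupon()
    {
        // Executes every line of Total, so line coverage reports 100%.
        Assert.Equal(80m, Checkout.Total(100m, isMember: true, hasCoupon: true));
    }
}
```

Calling Checkout.Total(100m, false, true) returns 100: a non-member’s coupon is silently ignored. If that is a bug, no test notices, and the coverage report still looks perfect.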
The metric is one that impresses managers, so you often see developers writing bad tests simply to game it. This is bad because you are misled into believing your code changes haven’t caused any bugs, when they could have introduced something severe that the inadequate unit tests didn’t catch.
I’ve seen quite a few code changes purely to increase the code coverage. So the title of the change would be like:
“Added more code coverage”
Then when I check the build output:
“There might be failed tests”
How can you add more tests and then not actually run them before submitting for review? Madness. The explanation is that their focus is just on coverage and not on quality. Maybe a bit of arrogance and laziness.
This week I worked with a team to get code coverage over 80% (a corporate minimum). The problem with this effort: Code coverage can be gamed. Sure, low code coverage means there’s a lot of untested code. But, high code coverage doesn’t mean the code is well tested.
Cory House
You can add ExcludeFromCodeCoverage “attributes” to your code, which tell the code coverage tool to ignore it. It’s a simple way of reducing the number of lines that are flagged as untested.
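For reference, this is roughly what the attribute looks like in use. ReportService and its methods are hypothetical; the attribute itself lives in System.Diagnostics.CodeAnalysis.

```csharp
using System.Diagnostics.CodeAnalysis;

public class ReportService
{
    // Coverage tools that honour the attribute leave this method
    // out of the report, whether or not any test ever calls it.
    [ExcludeFromCodeCoverage]
    public string LegacyExport() => "legacy";

    public string Export(bool legacy) => legacy ? LegacyExport() : "modern";
}
```

The code behaves exactly the same with or without the attribute; only the coverage figure changes.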
Here’s one of our principal developer’s opinion on this attribute:
“Using ExcludeFromCodeCoverage is only good when the goal is 100% coverage. That should never be the goal. The goal should be a test suite that prevents bugs from ever going live. I’m happy never using it and just having coverage reports flag things that are not actually covered, it is a more realistic representation of what the tests cover and makes me much more cautious about changing them as I know I don’t have test coverage. Never add Exclude from Code Coverage, it’s just lying to everyone. Why suppress things that might be a problem, if they are, we need to fix them.”
Principal Developer
Personally, I think adding suppressions/attributes just clutters the code base. I’d rather just treat the stats as relative to each release. The numbers have gone up/down, but why? If we can justify them, then it’s all good. Chasing 0 code smells and a specific test coverage means you can just cheat and add the likes of ExcludeFromCodeCoverage to meet such metrics.
Another developer said:
I value a holistic set of metrics that help us understand quality in software development. Code coverage is a single metric that can be part of that set of metrics you monitor. No single metric can stand by itself, and be meaningful. Nothing is perfect, which is why we should value a toolbox. I don’t believe in gaming the system and “hiding” uncovered code to get to 100%.
You need engineering teams who are prepared and confident enough to publicly share their coverage reports. This sets the tone of the culture. Context is needed, always. There will be reasons why the coverage is as it is. Use tools that help engineering teams with confidence/delivering at pace and ultimately delivering customer satisfaction. You cannot compare reports from different teams or projects.
Useful Tests
You need to make sure your tests actually test some logic. Sometimes people end up seemingly writing tests that really test the programming language itself, but I suspect it is just so the Code Coverage metric is fooled. Code Coverage checks if lines of code are “covered” by tests, but the simplistic nature of the check just ensures that a line of code is executed whilst a test is running, rather than that there was a meaningful test.
So for example:
[Fact]
public void DefaultConstructor_ReturnsInstance()
{
    var redisMgr = new RedisStateManager();
    Assert.NotNull(redisMgr);
}
So there you are instantiating an object then checking it is not null. Now that’s how objects work in C#. You instantiate an object, and then you have an object. Now, I suppose an exception could be thrown and the object wasn’t created, but that is generally considered bad practice and also there was no other test to check a situation like that, so they haven’t tested all scenarios.
Another pattern is setting a property and then checking it is set. Unless the property has loads of logic (which you could say is bad design), checking it is set is really testing the “.NET Framework”. If you think you need this, it means you don’t trust the fundamental features of the programming language you are using. You are supposed to be testing the logic of your code, not the programming language.
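The set-then-check pattern typically looks something like this sketch. Customer is a hypothetical class with a plain auto-property, not code from the reviews described here.

```csharp
using Xunit;

public class Customer   // hypothetical class under "test"
{
    public string Surname { get; set; } = "";
}

public class CustomerTests
{
    [Fact]
    public void Surname_WhenSet_ReturnsValue()
    {
        var customer = new Customer();
        customer.Surname = "Smith";

        // Only proves that C# auto-properties store what you assign.
        Assert.Equal("Smith", customer.Surname);
    }
}
```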
If there’s lots of setup and then the Assert just checks for null, it’s likely there just to fudge the code coverage. Another classic that I’ve seen is loads of setup that then ends with:
Assert.IsTrue(true);
So as long as the test didn’t throw an exception along the way, then it would just always pass because True is definitely equal to True.
Those ones seem intentionally malicious to me, but maybe the following example is more of a case of a clear typo:
Assert.Same(returnTrigger, returnTrigger);
Whereas the following one looks like a typo, but it’s actually two different variables. You need to look closely (one has a single S in Transmission). 🧐
What goes through people’s heads? How can you write code like that and carry on like nothing is weird?
Sometimes tests look a bit more complicated but on analysis they still don’t really test much:
[CollectionDefinition(nameof(LoggerProviderTests), DisableParallelization = true)]
public class LoggerProviderTests : IDisposable
{
    [Theory]
    [InlineData("Verbose")]
    [InlineData("Debug")]
    [InlineData("Fatal")]
    [InlineData("Information")]
    [InlineData("InvalidLogLevel")] // Test with an invalid log level
    public void GetMinimumLevel_ReturnsCorrectLogLevel(string logLevelSetting)
    {
        // Arrange
        System.Configuration.ConfigurationManager.AppSettings["DistributedState.LogLevel"] = logLevelSetting;
        var firstInstance = LoggerProvider.Instance;
        var secondInstance = LoggerProvider.Instance;
        // Assert
        Assert.Same(firstInstance, secondInstance);
    }
}
So this sets a setting on AppSettings, presumably used by the “LoggerProvider”. However, all they are doing is testing that if you call the Instance property twice, it returns the same object both times. So the setting of the different log levels is completely irrelevant. I mean, the log level could be completely wrong, but you are comparing ‘is the wrong value of A the same as the wrong value of B’, and it will still pass.
Another common case is when you use a mocking library like Moq, which lets you create objects and essentially say “when I call some code with these specific parameters, then give me this value back”. The problem comes when developers make that configured value the actual thing they are testing: then you are testing Moq, and not your actual logic.
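To make the difference concrete, here is a hedged sketch contrasting a test that only exercises the mock with one that exercises logic around it. Job, IJobsRepo and JobService are hypothetical stand-ins, not the code from the review; the Moq calls (Setup, Returns, Object) are the library’s standard ones.

```csharp
using Moq;
using Xunit;

public class Job
{
    public int Id { get; set; }
    public bool Overdue { get; set; }
}

public interface IJobsRepo { Job GetById(int id); }

public class JobService
{
    private readonly IJobsRepo _repo;
    public JobService(IJobsRepo repo) => _repo = repo;

    // The only real logic in this sketch.
    public string Describe(int id) =>
        _repo.GetById(id).Overdue ? "Job " + id + " is overdue" : "Job " + id + " is on time";
}

public class JobTests
{
    [Fact]
    public void TestsOnlyTheMock()
    {
        var job = new Job { Id = 7 };
        var repo = new Mock<IJobsRepo>();
        repo.Setup(r => r.GetById(7)).Returns(job);

        // Passes because Moq was told to return `job`; no production code runs.
        Assert.Same(job, repo.Object.GetById(7));
    }

    [Fact]
    public void TestsTheLogicAroundTheMock()
    {
        var repo = new Mock<IJobsRepo>();
        repo.Setup(r => r.GetById(7)).Returns(new Job { Id = 7, Overdue = true });

        var service = new JobService(repo.Object);

        // Fails if Describe's branching or wording is broken.
        Assert.Equal("Job 7 is overdue", service.Describe(7));
    }
}
```

In the first test you could delete JobService entirely and it would still pass; in the second, the mock is only scaffolding for the code you actually care about.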
“I think all this test is doing – is testing that JobsRepo returns an object that was passed into the constructor on line 22. The GetById is redundant, it will always work if it returns that object because the Moq was configured to return that value. That is testing Moq, and not our code. But then if you are just asserting a property returns an object, you are just testing that C# properties work.”
Me
“yes you are right , I am just testing if JobsRepo could return a value, so that it helps me in code coverage for get functionality of JobsRepo , as it is just set in the constructor of the class and there is no call for get”
Developer who wrote bad tests
So I think they are saying “I am just fudging the coverage”. Checks it in anyway.
There’s been loads of tests where you could actually cut out large parts of the method they are testing and the tests still pass. Again, sometimes you point this out to developers and they still want to check it in, purely for the statistics, and not for any benefit to any developer.
“do these tests add value? a quick glance suggests this is very dependent on your mock object. It might be the case that the production code can be changed without breaking these tests.”
Me
yeah, they’re kind of meaningless. Merging code to use as templates for future, better, tests.
Developer who wrote bad tests
Here is a rant I left on a similar review:
This name implies that it should always be disabled, especially because there’s no coverage for the case where it is true. However, these tests aren’t really testing anything. I think you’re basically testing that Moq works and the default boolean is false. I think the most you can really do is call Verify on Moq to ensure the correct parameters are passed into the GetBool call.
If you replace the contents of IsRequestingFeatureEnabledForOrganisation with return false, your tests pass which illustrate the coverage isn’t complete, or you aren’t guaranteeing the configuration manager code is even called at all. Personally, I don’t think it is worth testing at all though. All your class does is call the FeatureDetails class so you aren’t testing any logic on your side.
I think people are too concerned about getting code coverage up, so they insist on writing tests, even if it makes things more confusing.
I suppose it is up to you and your team to decide what you want to do, but I occasionally question people just to make them think about whether it is actually adding any value. I’ve seen tests where they simply assert that an object is not null, but it could literally return an object with all the wrong values and still pass (and the method always returned an object anyway, so it could never fail). If you see a method has tests, it gives you a false sense of security: you think they are going to catch any mistake you make, but they just always pass anyway.
Always think about whether your tests will add value and whether it’s worth adding them. If you need to mock everything then they’re not very valuable, or you’re testing at the wrong level (too high), and you’re better off with integration tests than unit tests. 100% code coverage is a really bad idea for complex software, with massive diminishing returns in value the higher you try to push it. We change stuff all the time in our software too, so if everything has high-level unit tests then you spend more time fixing those tests. I tend to find you spend ages writing tests, then if you change the implementation you have to change the tests, and you can’t run them to see if you broke anything because you had to change the test to run it.
Me
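The Verify suggestion from that review comment could look roughly like the sketch below. Only the GetBool and IsRequestingFeatureEnabledForOrganisation names come from the comment; the IConfigurationManager interface, its signature, the FeatureGate class and the setting key are all assumptions made for illustration.

```csharp
using Moq;
using Xunit;

// Hypothetical interface; the real GetBool signature is not known here.
public interface IConfigurationManager
{
    bool GetBool(string key, int organisationId);
}

public class FeatureGate   // hypothetical class under test
{
    private readonly IConfigurationManager _config;
    public FeatureGate(IConfigurationManager config) => _config = config;

    public bool IsRequestingFeatureEnabledForOrganisation(int organisationId) =>
        _config.GetBool("RequestingFeature.Enabled", organisationId);
}

public class FeatureGateTests
{
    [Fact]
    public void AsksConfigurationForTheRightKeyAndOrganisation()
    {
        var config = new Mock<IConfigurationManager>();
        config.Setup(c => c.GetBool("RequestingFeature.Enabled", 42)).Returns(true);

        bool enabled = new FeatureGate(config.Object).IsRequestingFeatureEnabledForOrganisation(42);

        // Covers the "true" case, so replacing the body with `return false` now fails.
        Assert.True(enabled);
        // Proves the configuration manager was really called, with these parameters.
        config.Verify(c => c.GetBool("RequestingFeature.Enabled", 42), Times.Once());
    }
}
```

Even so, this mostly checks wiring rather than logic, which is why the comment questions whether the method is worth unit testing at all.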
Test Driven Development (TDD)
There’s a methodology called Test Driven Development where you write a test first. It will then fail if you run it because there’s no functionality to run. Then you write the implementation and get it to pass. Then move onto writing the next test, and repeat. So you build up your suite of tests and get feedback if your new changes have broken previous logic you wrote.
I was recently listening to a podcast and the guest said that he always writes code first, then adds tests after. If he can’t write tests, then he will make a change to bypass code just for the tests. I wasn’t sure what he meant by this, maybe it’s like when people write a new constructor which is only ever called by the tests. But that’s bad design.
I thought he may as well just do TDD from the start, instead of always going through that struggle. He says TDD doesn’t often lead to good design because you aren’t thinking about design, you just think of how to make the tests pass.
But doesn’t the design organically come from TDD? And changing the design just for the tests is exactly what he criticises TDD for. TDD often slightly over-engineers the solution with the likes of interfaces. So he is avoiding TDD and writing the tests afterwards instead, but his way adds “Technical Debt” through extra constructors that are only used by the tests.
“I’ll add tests in a separate change later”.
5 reasons to add tests before merge:
1. Clear memory: Before merge, everything is fresh in my mind. I know what the code is supposed to do, because I wrote it. So I also know what tests I should write to assure it works. Every minute that passes after merge, I will understand the feature less, and thus, be less equipped to add proper test coverage.
2. More effective reviews: If I write the tests before merge, then anyone reviewing my code can use my tests to help them understand the code, and to watch the feature run.
3. Faster development: If I write tests during development, I can use the tests to accelerate my development. I can “lean” on my tests as I refactor. Faster feedback loops = faster development.
4. Better design: Writing tests during dev encourages me to write code that is testable. It makes me consider accessibility too since that tends to make automated testing easier by providing well-labeled targets.
5. Changing priorities: After merge, there’s no guarantee that I’ll have time to write the tests at all. I may get pulled away for other more “urgent” tasks.
Bottom line: The proper time to add tests is *before* merge.
Cory House
I recently saw the following conversation. A developer was basically saying he didn’t have time to write the tests, and it might end up in some drastic refactoring which would be risky. Then the plan is to rely on manual testers and get the changes released. Then the next part probably won’t happen (because important features will be prioritised), but his suggestion is that he then makes the changes for the next release with good unit test coverage.
Senior Developer: This domain supports unit testing, you should be able to add tests to cover the changes you made to make sure it behaves as you expect
Developer: Currently there are no unit test cases available for the changes made class, and the class is tightly coupled. I have written some draft tests and will check them next month as a priority.
Architect: IMO, given the pressure, timescales and urge to complete this, I think we can defer for now and stress the testers to pay more attention to the areas that we have low confidence.
Senior Developer: So instead of checking if it is correct by adding tests that we can be sure exercise the code changes, we just merge it and hope that the manual testers find any bugs over the next day or so, and if they do, then it is back to the dev team and another change?
Time In Unit Tests
Tests should be deterministic. If a test is run and passes, then if no changes have been made and we run it again, it should also pass (obviously). An unreliable test doesn’t give you confidence in the code changes you make. It’s a surprisingly common occurrence that you make a change and an unrelated test breaks, and you are thinking “how can those changes break the test?” Then you look at what it is doing, and it’s often something to do with time.
You see something like this, where the data is:
"BirthDate":"1957-01-15T00:00:00"
And the test result says:
Expected: "Age":"67y"
Actual: "Age":"68y"
Today is their birthday!
What you need to do is put a “wrapper” around the code that gets the current date. So instead of simply DateTime.Now, you create a class called something like DateTimeProvider, and in the production code, the class returns DateTime.Now. Then in your Unit Tests, you then create a MockDateTimeProvider and make it return a hard-coded date. That way, no matter when you run the test, it always returns the same date, and is a deterministic test.
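A minimal sketch of that wrapper, assuming an IDateTimeProvider interface like the one the tests here implement. AgeCalculator is a hypothetical consumer added so there is something to test, and the fixed dates are chosen to mirror the birthday example.

```csharp
using System;
using Xunit;

public interface IDateTimeProvider { DateTime Now { get; } }

// Production implementation: the only place that touches the real clock.
public sealed class DateTimeProvider : IDateTimeProvider
{
    public DateTime Now => DateTime.Now;
}

// Test double: always returns the date it was constructed with.
public sealed class MockDateTimeProvider : IDateTimeProvider
{
    public MockDateTimeProvider(DateTime fixedNow) => Now = fixedNow;
    public DateTime Now { get; }
}

public sealed class AgeCalculator   // hypothetical consumer of the clock
{
    private readonly IDateTimeProvider _clock;
    public AgeCalculator(IDateTimeProvider clock) => _clock = clock;

    public int AgeInYears(DateTime birthDate)
    {
        DateTime today = _clock.Now.Date;
        int age = today.Year - birthDate.Year;
        if (birthDate.Date > today.AddYears(-age)) age--; // birthday not reached yet this year
        return age;
    }
}

public class AgeCalculatorTests
{
    private static readonly DateTime BirthDate = new DateTime(1957, 1, 15);

    [Fact]
    public void DayBeforeBirthday_IsStillTheYoungerAge()
    {
        var calc = new AgeCalculator(new MockDateTimeProvider(new DateTime(2025, 1, 14)));
        Assert.Equal(67, calc.AgeInYears(BirthDate));
    }

    [Fact]
    public void OnBirthday_AgeIncreases()
    {
        var calc = new AgeCalculator(new MockDateTimeProvider(new DateTime(2025, 1, 15)));
        Assert.Equal(68, calc.AgeInYears(BirthDate));
    }
}
```

Because the “current” date is pinned, both tests give the same result whenever they run, and the birthday boundary becomes an explicit test case instead of a once-a-year surprise.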
I recently fixed some tests that were failing between 9pm-12am. I found that a developer had changed the MockDateTimeProvider to return DateTime.Now, completely rendering it pointless. Other parts of the test were adding 3 hours to the current time, and because 9pm+3 hours is tomorrow’s date, the date comparison it was doing then failed.
public class MockDateTimeProvider : IDateTimeProvider
{
    public DateTime Now { get { return DateTime.Now; } }
}
I think another red flag in unit tests is conditional statements. Logic should be in your production code, and not in tests. Not only does this following code have a DateTime.Now in it, it looks like they have put a conditional If statement in there, so if it would normally fail, it will now execute the other branch instead and pass. So maybe the test can never fail.
[Fact]
public void ExpiryDateTest()
{
    DateTime? expiryDate = (DateTime?)Convert.ToDateTime("12-Dec-2012");
    _manageSpecialNoteViewModel = new ManageSpecialNoteViewModel(_mockApplicationContext.Object);
    _manageSpecialNoteViewModel.ExpiryDate = Convert.ToDateTime(expiryDate);
    if (_manageSpecialNoteViewModel.ExpiryDate < DateTime.Now.Date)
        Assert.True(_manageSpecialNoteViewModel.IsValid());
    else
        Assert.False(_manageSpecialNoteViewModel.IsValid());
}
Other Bad Unit Tests
Maybe the most obvious red flag, even to non-programmers – is testing that the feature is broken. The developer has left a code comment to say it looks wrong!
Assert.Equal("0", fileRecordResponse.Outcome); // I would have thought this should have been -1
The One Line Test
How do you even read this? Is that actually one line? 🤔🧐
_scheduledJobsRepo.Setup(r => r.GetAllAsNoTracking(It.IsAny<Expression<Func<ScheduledJob, bool>>>(),
        It.IsAny<Func<IQueryable<ScheduledJob>, IOrderedQueryable<ScheduledJob>>>(),
        It.IsAny<int>(),
        It.IsAny<int>(),
        It.IsAny<Expression<Func<ScheduledJob, object>>>()))
    .Returns((Expression<Func<ScheduledJob, bool>> expression,
        Func<IQueryable<ScheduledJob>, IOrderedQueryable<ScheduledJob>> orderBy,
        int page, int pageSize,
        Expression<Func<ScheduledJob, object>>[] includeProperties)
    =>
    {
        var result = _scheduledJobs.AsQueryable();
        if (expression != null)
        {
            result = result.Where(expression);
        }
        result = orderBy(result);
        result = result.Skip(page * pageSize).Take(pageSize);
        return result;
    });
When it is that hard to read, I wonder how long it took to write it.
Other Common Mistakes
I think tests can be unclear if you use a unit testing library without understanding what features are available. For example, instead of using the ExpectedException check, someone may come up with a convoluted solution like a try/catch block that then flags the test as passed or failed.
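As an illustration of the two styles: the [Fact] snippets in this post look like xUnit, where the built-in check is Assert.Throws rather than an ExpectedException attribute. Percentage is a hypothetical class invented for the example.

```csharp
using System;
using Xunit;

public static class Percentage   // hypothetical code under test
{
    public static int Parse(string text)
    {
        int value = int.Parse(text);
        if (value < 0 || value > 100)
            throw new ArgumentOutOfRangeException(nameof(text), "Must be between 0 and 100.");
        return value;
    }
}

public class PercentageTests
{
    [Fact]
    public void Parse_OutOfRange_Convoluted()
    {
        // Hand-rolled version: works, but the intent is buried in plumbing.
        bool thrown = false;
        try { Percentage.Parse("150"); }
        catch (ArgumentOutOfRangeException) { thrown = true; }
        Assert.True(thrown);
    }

    [Fact]
    public void Parse_OutOfRange_UsingTheLibrary()
    {
        // Same check in one line, with a clearer failure message.
        Assert.Throws<ArgumentOutOfRangeException>(() => Percentage.Parse("150"));
    }
}
```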
Naming the tests can be tricky to make it clear what it does and to differentiate it from other tests. The worst is when the name says something completely different, most likely from a “copy and paste” mistake.
I’ve talked about how using DateTime can make tests fail at certain times. You can also end up with tests that rely on some shared state, where the order in which the tests run causes failures because a test expects data to be set (or not set).
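A sketch of the shared-state problem and one way around it in xUnit. SharedBasket and BasketTests are hypothetical; the comments describe what would happen if the tests used the static list instead of their own.

```csharp
using System.Collections.Generic;
using Xunit;

public static class SharedBasket   // hypothetical static state, visible to every test
{
    public static readonly List<string> Items = new List<string>();
}

public class BasketTests
{
    // xUnit constructs a new instance of the test class for each test,
    // so an instance field gives every test its own fresh list.
    private readonly List<string> _items = new List<string>();

    [Fact]
    public void AddItem_LeavesOneItem()
    {
        _items.Add("apple");
        Assert.Single(_items);   // against SharedBasket.Items this would depend on run order
    }

    [Fact]
    public void NewBasket_IsEmpty()
    {
        Assert.Empty(_items);    // against SharedBasket.Items this would fail once an item had been added
    }
}
```

If both tests used SharedBasket.Items, whichever ran second would see the other’s leftovers, and the result would change with the run order.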
Bank
Tests are even more important in certain domains. You know, like when money is involved. Commercial Bank Of Ethiopia allowed customers to withdraw more cash than they had.
“I’m really interested how a bank managed to deploy code that didn’t have tests for “can a user withdraw money when they don’t have enough balance.” Development teams at banks are usually conservative, process-heavy and slow-moving with changes exactly to avoid this. Wow”
Conclusion
Unit tests can be a useful and helpful tool to developers. However, there is an art to writing them and they have to be written with good intentions. If they aren’t written to be useful, fast, and reliable, then developers either won’t run them or won’t trust them.
One of the latest buzzwords to be thrown around is “Customer experience”. My understanding is that it’s a focus on customer interactions, from awareness of the product to purchase. This covers brand perception, sales process, and customer service.
Customer Experience is shortened to the acronym CX, because using the letter X is always cooler. For some reason, we went a bit further and put a hyphen in there for good measure; “C-X Experience Centre”.
The weird thing is that it kinda looks like a letter is missing, as if you are supposed to pronounce it like SEX; and a Sex Experience Centre is a different thing entirely. Does it even make sense, or sound sensible, to call it the Customer Experience Experience Centre?
“The Customer-Xcellence Programme is all about putting our customers and users at the heart of everything we do. It directly supports our strategic priority of delighting our customers and partners. But we can only do that if we really put ourselves in their shoes and truly understand what day-to-day working life is like for them. By doing so, we can ensure the products and solutions we design, enhance and implement are directly informed by their experiences.”
We lost even more office space to create this C-X Experience Centre. Since we started working from home, they have made the desks more spacious for those who do go into the office, then over time have reassigned meeting rooms to nonsense like this.
To make it more pretentious, we invited a local politician for the grand opening:
“The C-X experience Centre is a real gamechanger in how we immerse ourselves in the experiences of our customers and users.”
I think all it is is a few computers in a room decorated to look like a customer’s office.
“This will help everyone learn about the challenges our customers and users face, and how our solutions help them provide a better service.”
As well as showcasing our solutions to customers and key stakeholders, it will be used for:
onboarding new starters
supporting sales enablement training
launching and testing new solutions and products
“Thank you to the whole Customer-Xcellence team for turning this vision into reality – it will make such a difference in how we understand our customer’s and user’s challenges.”
Honey is a browser extension now owned by PayPal. It promised cheap deals to the user by automatically searching for vouchers and applying them at checkout. However, there seems to be some possible foul play in the way that it worked.
Honey was adding itself as a referrer, which sounds logical if the user has made their own way to the shop. Referral links give a financial kickback to the referrer, so it would be fine to give Honey some credit for making sure the end user completes the purchase.
The end user uses Honey with the promise of searching for valid voucher codes to save further money. However, even when Honey couldn’t find anything, they still stole the referral. This didn’t affect the end user, because it was the original referrer that lost out. So all those YouTubers that had affiliate links will have lost out on money, or on future affiliate deals and sponsorships.
The ironic thing is that Honey gained a lot of new users from YouTube partnerships themselves. A YouTuber’s audience would install the Honey extension, then any future affiliate links from that YouTuber (and any other YouTuber) would be hijacked by Honey. So the YouTuber has been completely scammed but would be unaware it was happening at all.
There was another suggestion that Honey even did deals with shops to limit the discounts offered. So if there was a voucher available for 20% off, they would lie and say they have found 10% off. So Honey promised to find the best deal for you without you making any effort, but they were just finding mediocre deals for you and you could have got a better deal if you did put the effort in.
For some sales, you could say that the value proposition to retailers is dubious since they are giving customers discounts on products they were already about to buy.
Legal Eagle is filing a lawsuit against them, and it is going to be interesting to see the outcome: “I’m Suing Honey”.
When it comes to software, the concept of time can cause problems. There are actually some really interesting scenarios, but even in simple applications, some developers really struggle with simple concepts.
In terms of standard problems, the client and server times can be out of sync. This can be because they are set incorrectly, or because they are using different timezones. As a developer, if you are comparing times in log files across the client and server, mismatched timestamps cause confusion. A common thing I have seen is that some servers don’t observe the Daylight Saving Time we have in the UK, but the clients often do. So the server can be an hour out.
Daylight Saving Time is interesting as the clock shifts forwards or backwards by one hour. So local time isn’t linear.
I recall reading a blog post about time by Jon Skeet, which discussed how historical dates can also suddenly change. For example, if a country switches to a different calendar system entirely, moving forward one day could suddenly jump the date by years to align with the new system. Computerphile have a discussion on this: “The Problem with Time & Timezones”.
Leap Years
We once had a leap year bug because someone created a new date using the current day and month, and added a year. So when it was 29th Feb, it tried to create a date of 29th Feb for the next year, which wasn’t a valid date. So the feature crashed. Everyone was panicking trying to rush out a fix, but then we realised we could only get the fix out to our customers the following day, by which point the bug would no longer happen. Not for another 4 years anyway. It was hilarious.
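As a rough illustration of that bug, here is a minimal sketch. The class and method names are hypothetical; only the DateTime constructor and DateTime.AddYears are real .NET APIs. Building the date from its parts throws on 29th February, whereas AddYears falls back to 28th February when the target year is not a leap year.

```csharp
using System;

public static class NextYear
{
    // Reproduces the bug: throws ArgumentOutOfRangeException when 'today' is
    // 29 February, because the following year has no such date.
    public static DateTime Naive(DateTime today) =>
        new DateTime(today.Year + 1, today.Month, today.Day);

    // DateTime.AddYears returns 28 February when the source date is
    // 29 February and the target year is not a leap year.
    public static DateTime Safe(DateTime today) => today.AddYears(1);
}
```

Whether 28th February or 1st March is the right answer for “a year after 29th February” is a business decision; the point is only that the result should be a valid date rather than a crash.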
-1
One weird mistake I saw recently was a developer defining a variable and setting it to 5. The code they wrote was supposed to make sure that we never make an API call more than once every 5 minutes. However, they then subtracted 1, so the minimum gap was actually 4 minutes (240 seconds) instead.
var minimumNumberOfSecondsRequiredBetweenEachAPICall = (NumberOfMinutesRequiredBetweenEachAPICall - 1) * 60;
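For comparison, a minimal sketch of the same check without the off-by-one, using TimeSpan so that no manual conversion to seconds is needed. The ApiThrottle class and CanCall method are hypothetical names; the 5 minute interval is the one described above.

```csharp
using System;

public static class ApiThrottle
{
    // The required gap, stated once and in its natural unit.
    private static readonly TimeSpan MinimumInterval = TimeSpan.FromMinutes(5);

    // True only when at least the full interval has elapsed since the last call.
    public static bool CanCall(DateTime lastCallUtc, DateTime nowUtc) =>
        nowUtc - lastCallUtc >= MinimumInterval;
}
```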
Ages
You would think everyone would understand the concept of ages since everyone has an age and it increases by 1 every time you have a birthday. However, many developers seem to struggle with the concept. The last implementation I saw had the following:
int age = DateTime.Now.Year - dateOfBirth.Year;
So it can be one year out because it basically assumes your birthday is 1st January.
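A minimal sketch of a whole-years calculation that accounts for whether the birthday has happened yet this year. AgeCalculator and WholeYears are hypothetical names, and this version treats someone born on 29th February as having their birthday on 1st March in non-leap years, which is an assumption rather than a universal rule.

```csharp
using System;

public static class AgeCalculator
{
    public static int WholeYears(DateTime dateOfBirth, DateTime today)
    {
        int age = today.Year - dateOfBirth.Year;

        // If this year's birthday has not happened yet, the plain year
        // subtraction is one too high, so take it back off.
        if (dateOfBirth.Date > today.Date.AddYears(-age))
        {
            age--;
        }

        return age;
    }
}
```

For example, someone born on 15th June 1990 is 33 on 14th June 2024 and 34 the day after, whereas the year subtraction alone gives 34 for the whole of 2024.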
It reminds me of an exchange on Twitter that I saw years ago. It was in the context of football.
PA: Why are Arsenal paying £25m for a 29 year old striker?
G: he’s 28 btw
PA: He’s a lot nearer to 29 than 28, that’s a fact
G: He’s 28, that’s a fact
PA: Why am I not surprised that fractions are beyond you. The day after his birthday, he is no longer 28.
G: He’s 28 until he becomes 29. That’s how it works
PA: Perhaps if you had paid more attention in Maths lessons? You might remember “round up or down to the nearest whole number”
G: He’s 28. That’s a fact.
PA: No, it is not. £1.75 is not one pound. You don’t even understand what a fact is now.
G: Until he is 29, he is 28.
On the day after your birth, are you 1 day old? Technically, you could be just a minute old but claim you are 1 day old.
My instinct to perform mathematics on dates would be to use an existing date library. Another developer tried to make something themselves. This seemed a bit complex to me, but I think it actually worked, or at least seemed reasonable for how they wanted to use it.
public static double AgeInYearsAtDate(DateTime effectiveDate, DateTime dateOfBirth)
{
double daysInYear = 365.25;
int completeYears = Age.GetYears(dateOfBirth, effectiveDate);
dateOfBirth = dateOfBirth.AddYears(completeYears);
double proportion = effectiveDate == dateOfBirth ? 0 : Age.GetDays(dateOfBirth, effectiveDate) / daysInYear;
return completeYears + proportion;
}
public static string ConvertCurrentAgeToYearsAndMonths(double age)
{
int monthsInYear = 12;
int years = (int)age;
int months = (int)Math.Round((age - (int)age) * monthsInYear);
return $"{years} year{(years == 1 ? String.Empty : "s")} and {months} month{(months == 1 ? String.Empty : "s")}";
}
Ages Part 2
Another developer was testing his age code and wrote this:
new object[]
{
new DateTime(2010, 05, 31),
new DateTime(2009, 06, 01),
AgeRange.UpToOneYear,
"52 weeks and 0 days"
},
If there are 52 weeks in a year, then is that 52 weeks? Kinda looks 1 day short of a year to me. Time is mental, isn’t it?
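Working the arithmetic through helps here. The sketch below uses hypothetical names (AgeInWeeks, Describe) and is not the developer’s actual implementation. From 1st June 2009 to 31st May 2010 is 364 days, which is exactly 52 weeks, so the expected string is right; it is the year itself (365 days) that is 52 weeks and 1 day.

```csharp
using System;

public static class AgeInWeeks
{
    public static string Describe(DateTime dateOfBirth, DateTime effectiveDate)
    {
        // Whole days between the two dates, ignoring any time-of-day part.
        int totalDays = (effectiveDate.Date - dateOfBirth.Date).Days;

        return $"{totalDays / 7} weeks and {totalDays % 7} days";
    }
}
```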
Although Junior developers can be useful, I have been against my employer’s over-reliance on them. If you get someone cheap with high potential, as long as you reward them, you end up with someone that knows your system, loves the company and is a great developer.
The problem is that we love hiring them but not rewarding them, which means the best ones leave and the bad ones remain. The focus on cheap wages has led to more and more offshoring, which has led to the rapid expansion of our Indian office. How do you hire so many people quickly? Lower the standards. So now you have a large number of incompetent people, but they are cheap.
It’s not the fact they are Indian that is the problem, it is the fact the demand is high and the standard we had in hiring is low. The problem this has in the work culture is that it is easy to see a discrepancy in quality between the UK and Indian developers as a whole; which means you end up seeing them as inferior, despite some of them actually being good. The good ones tend to be the ones we hired in Senior positions, so they would naturally have higher wages anyway.
So, building on my recent blog about one particular developer, here’s a collection of things other people have done:
Rollbacks
James has just rolled back the changes of someone who merged into the wrong folders. Didn’t they think something wasn’t right when it was showing [add] next to the main Database folder? Looks like they copied the folders up one level, so now it is re-adding everything as a duplicate.
Dean 16:27: haha
Me 16:28: that change by Portia is mad when you look at the changesets original patch change to patch, fix db patching errors, rollback, rollback other changes, rollback from xml file, then Chris comes into undo the rollback
Dean 16:30: it's just wrong that we've got people who don't know what they're doing
Me 16:30: but it's cheap Portia went wild on that second “rollback” and manually reverted the files. removed 8 patches and added 1, instead of removing 1 it's amazing how many rollbacks happen these days
Dean 16:40: Rollbacks worry me in general
“used for identification”
In general, SQL databases are designed to reduce redundancy. So if you have a table storing a list of “job roles”, and another table references this information, you can link them together via the ID of the row. What you shouldn’t do is copy the data into the other table. Copying means that if the data needs to be updated, you need to remember to update both copies, and it doubles the storage space for that data too.
I saw that a developer was doing this. It was only one column of text, but why were they copying it over into their new table instead of just referencing it?
Me Is there a reason why this isn't being taken from JobCategory? It is never returned in a stored proc call so there is no need for it
Vignesh JobCategoryName used for identification of JobCategoryID and not used in stored proc. Thanks
Me Regardless if the system uses the data, or if it is there for our Staff to read in the database; you would just write a query that joins onto the JobCategory table. what if the JobCategoryName in JobCategory is updated? The names in your new table won't be accurate
Vignesh JogCategoryID only used in stored proc/code, JobCategoryName is just an identification for JogCategoryID in the table. Thanks.
Me So it needs to be removed?
Vignesh JobCategoryName was removed since it is used in code or stored proc. Thanks.
Refresh Link
private void llbl_RefreshList_LinkClicked(Object sender, LinkLabelLinkClickedEventArgs e)
{
IEnumerable<ExecutionScheduleDetail> runningItems = _service.GetAllScheduleSearches(AuthenticatedUser.Organisation.Guid);
int count = (from runningitem in runningItems select runningitem).Count();
if (count>0)
{
LoadRunningSearches();
}
else
{
MessageBox.Show(
"This run has just completed and details can no longer be viewed.",
"Running Searches",
MessageBoxButtons.OK,
MessageBoxIcon.Error);
LoadRunningSearches();
}
}
Instead of just getting the count from the IEnumerable, they are selecting all the items they already have. Then, since they are calling the LoadRunningSearches method in both parts of the IF, it may as well be moved out. So the code would just be:
if count == 0, show message box; then call LoadRunningSearches.
When a review comment was left about not specifying the method call twice, he then just moved the GetAllScheduleSearches call to a field, which meant it would no longer get the latest searches every time you clicked. Since the link is labelled “Refresh”, it wasn’t doing what it was supposed to.
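The simplification described above could be sketched as follows. To keep the example standalone, the WinForms and service dependencies are replaced with hypothetical delegates, so RunningSearchesRefresher and its constructor parameters are assumptions for illustration rather than the real form’s members.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class RunningSearchesRefresher
{
    private readonly Func<IEnumerable<string>> _getRunningSearches; // stand-in for _service.GetAllScheduleSearches(...)
    private readonly Action<string> _showMessage;                   // stand-in for MessageBox.Show(...)
    private readonly Action _loadRunningSearches;                   // stand-in for LoadRunningSearches()

    public RunningSearchesRefresher(
        Func<IEnumerable<string>> getRunningSearches,
        Action<string> showMessage,
        Action loadRunningSearches)
    {
        _getRunningSearches = getRunningSearches;
        _showMessage = showMessage;
        _loadRunningSearches = loadRunningSearches;
    }

    public void Refresh()
    {
        // Ask the service on every click, so the result is never stale.
        if (!_getRunningSearches().Any())
        {
            _showMessage("This run has just completed and details can no longer be viewed.");
        }

        // The list is reloaded in both cases, so the call appears only once.
        _loadRunningSearches();
    }
}
```

Using Any() also avoids counting every item when all we need to know is whether there is at least one.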
If Constraint Exists
if exists (Select * from sys.check_constraints where name = 'DF_Templates_AgeRange_AgeTo ')
alter table Templates.AgeRange drop constraint DF_Templates_AgeRange_AgeTo;
I noticed there was a redundant space at the end of the constraint name so I thought it was likely that the check would never be true (unless SQL ignores it automatically?)
Me 4/28/2022 Is this space intentional?
Vignesh 4/28/2022 Removed
Joel 4/28/2022 Do you actually need the select check? I think you should be able to use the dependent patch mechanism instead.
Vignesh 4/28/2022 I think it is not needed, we checked the condition to avoid any error. if its true it will execute
Joel 4/28/2022 The constraint is added in patch 7.809, on which you've marked this patch as dependent. So this patch will literally only run if the constraint was created successfully.
We have an attribute in the xml where you can state dependent patches, so a patch will only run if the prerequisite patch has run. Vignesh was aware of it because he had used it, but then he also had this guard clause that possibly didn’t even work.
When told he didn’t need it, he agreed that it wasn’t needed, yet he had put it in there so it would “avoid error”. Does he mean there was an error? Or was he just being overly cautious?
Sometimes, it’s the little things like this that annoy me. How can you write “inActive”, and not realise that you either:
have spelt “inactive” wrong,
or alternatively – someone else has when they created this enum
Therefore why did they not fix it? There’s clearly an inconsistency there.
In a similar fashion, I saw this recently:
//Recieved and Transalted
Both words are spelt wrong. It was also copied and pasted from another file. It does pose a good question though: if you copy and paste, should you correct the spelling or leave it for maximum laziness? I guess the advantage of leaving it is that if you search for that text to try and find the original code, it’s better to match it as closely as possible.
throw new NotSupportedException("Can't able to fetch template consultation details!");
Indians always seem to write “Can able” and “Can’t able” instead of just “can” and “unable”.
The logic was consistently backwards. It wasn’t a case that they typed it wrong and didn’t bother testing it. There were several files with the same type of logic. I pointed it out and they rewrote the entire thing.
A while ago, I wrote a blog about the Merge Ready Checklist, which was a process we have to prove we have followed to be able to complete and release our software project.
The process was created by some experienced former Software Testers, now basically Quality Assurance Managers.
As part of the checklist, they then insist on having Test Coverage of 80% which I always think is an unreasonable ask. When I joined Development, we had Visual Studio Enterprise licences which have a Test Coverage tool available. However, we have since downgraded to Professional. So I asked what Test Coverage tools we can use because it needs to be something that IT have approved to download, and that we have a licence for. We were told we could use AxoCover, but I found it wasn’t compatible with Visual Studio 2019 or above, which was an inconvenience.
Ok, so let’s run it and see what happens. Firstly, you are greeted with a phallic symbol.
It’s supposed to look like their logo, which looks more like a fidget-spinner.
Here are the metrics it produces, before and after.
            Coverage   Uncovered   Total
Classes     11.4%      3040        3433
Methods     14.4%      17784       20770
Branches    12.7%      29085       33308
Lines       13.4%      98844       114114
Baseline figures. Code from the Main branch before my changes.
            Coverage   Uncovered   Total
Classes     11.9%      3034        3443
Methods     13.3%      17747       20473
Branches    12.0%      29020       32985
Lines       13.4%      98786       114118
Codebase with my Project merged in.
I can’t even make sense of the statistics. Is it saying that I have removed methods (the total has gone down!)? I have added a few classes with several methods each. These obviously contain lines of code and conditional statements, so I expected all the totals to be higher. Yet despite there being more classes, the number of methods and branches has decreased, and the line total has gone up by just 4. I had also added some unit tests, and would have expected maybe 30% coverage on the new code.
I asked Tech QA to explain the figures to me, and they were like “we dunno, we aren’t developers. We just look for the 80% number“. Then I point out that they were supposed to be judging the 80% coverage on NEW CODE only. This is for the entire solution file. So this doesn’t give them the evidence they want, and it’s not accurate either and cannot be trusted.
After running it several times and adding/removing code to see how the numbers changed, I was suddenly low on disc space. Turns out AxoCover reports are 252MB each! Yikes.
Testing AxoCover
Since the numbers were nonsensical, I decided to create a simple test project and run it on simple examples. Let’s see how it judges what is a line/branch/method/class.
namespace Axo
{
public class ExampleClass
{
}
}
0 methods 0 branches 0 lines
So a class definition with no methods and no actual lines of code results in zeroes all around. It must ignore boilerplate class code.
namespace Axo
{
public class ExampleClass
{
public ExampleClass()
{
}
}
}
1 method 1 branch 3 lines
So now I have a method but it is empty. Seems the idea of ignoring boilerplate code doesn’t apply to methods. It must count the method definition plus braces inside the method for the line count, but it doesn’t make sense to count braces since that’s just to group related code. 1 branch is weird too, that should be for IF statements which we will test soon.
namespace Axo
{
public class ExampleClass
{
public ExampleClass()
{
var a = 3;
var b = 4;
var result = a * b;
}
}
}
1 method 1 branch 6 lines
So now I have added 3 lines to the method. The line count has increased by 3, so that seems to make sense.
namespace Axo
{
public class ExampleClass
{
public ExampleClass()
{
var a = 3;
var b = 4;
var result = a * b;
if (result > 10)
result = 0;
}
}
}
1 method 3 branches 8 lines
I’ve added 2 lines, where 1 is an If Statement. So now we have increased the branches, but the count has increased by 2. This must be the “implicit else”, where the “result” is either greater than 10 or it isn’t, so there are 2 paths. I’d still say that is 1 branch though.
namespace Axo
{
public class ExampleClass
{
public ExampleClass()
{
var a = 3;
var b = 4;
var result = a * b;
if (result > 10)
MethodA();
}
private void MethodA()
{
}
}
}
2 methods 4 branches 10 lines
I’ve replaced one line with a method call to a new method. Method count increasing by 1 makes sense. Given the previous examples, adding a new method adds 1 to the branch count for some reason. In the empty method example, we got +3 to the line count, but now we only get +2, so that seems wrong. I don’t even think an empty method should increase the line count or the branch count, so the figures are becoming increasingly nonsensical.
namespace Axo
{
public class ExampleClass
{
public ExampleClass()
{
var a = 3;
var b = 4;
var result = a * b;
if (result > 10)
MethodA();
else
MethodB();
}
private void MethodB()
{
}
private void MethodA()
{
}
}
}
3 methods 5 branches 13 lines
So now instead of an implicit else, I’ve made it explicit, and created another method. The method count makes sense. The branch count has increased by 1, which I think will be for the new method and not the else. We have +3 to the line count, but shouldn’t we have +2 for the else and its method call, then up to +3 more for the new method?
namespace Axo
{
public class ExampleClass
{
public ExampleClass()
{
var a = 3;
var b = 4;
var result = a * b;
if (result > 10)
MethodA();
else
MethodB();
}
private void MethodB()
{
}
private void MethodA()
{
{ }
}
}
}
3 methods 5 branches 15 lines
I was intrigued as to whether it really was including braces. Some random braces give +2.
public class ExampleClass
{
public ExampleClass()
{
}
public ExampleClass(string test)
{
}
}
2 methods 2 branches 6 lines
I thought I’d reset and try again. So we have 2 methods, which as we have discovered means 2 branches with AxoCover’s metrics. It seems to count each method as +3 lines.
namespace Axo
{
public class ExampleClass
{
public void ExampleClassMethodA()
{
}
public void ExampleClassMethodB(string test)
{
}
}
}
2 methods 2 branches 4 lines
Looking back through the examples, I wondered if it is actually counting an empty Constructor as +3, but an empty method is +2. So this example has actual methods rather than constructors, and it seems to confirm my theory.
Discussion
When it comes to counting metrics in code, I think there is some degree of subjectivity to it. What even is a line of code? You could technically add many “statements” and put them all on one physical line. So you could look at the line numbers and see 1 line, but when actually reading the line of code, you can read multiple instructions. The analogy could be that you expect a list to be one item per line but someone could write a list on paper as a comma-separated list on one line. One is more readable than the other but it’s still valid either way. If someone asked you how many items were on the list, you could count them either way and end up with the same number. Therefore, I think the line count should actually be “statement” count.
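As a small illustration of lines versus statements, here is the same multiplication from the test examples squeezed onto a single physical line. OneLiner is a hypothetical name. A line-based metric sees one line here, whereas a statement-based metric would see three.

```csharp
public static class OneLiner
{
    // One physical line, three statements.
    public static int Multiply() { var a = 3; var b = 4; return a * b; }
}
```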
I do think AxoCover’s definition of a branch seems wrong, and what they interpret as lines seems inconsistent and a possible bug in their logic.
On a larger and more complex codebase, the statistics it produces seem really nonsensical. So I think I have proved this tool isn’t worth using, and we definitely shouldn’t be using it to gatekeep on our Merge Ready Checklist.
Even though our UX Team has been around for a while, they never seem to understand what is possible in our software, so end up designing something that we cannot accurately recreate from their Figma designs.
I often think their standards change over time so it’s hard to predict what they would come up with.
The UX Team asked what kind of formatting is possible in a Tooltip. You’d think they would know what is possible, and have plenty of old designs to refer to.
They said they had some upcoming projects that required tooltips containing large amounts of text, often with legal statements. They shared an example which had 3 sentences, then a Name, ID, Phone number, and address. So it was a large amount of text in a tooltip, and some words were formatted.
I thought it wasn’t good UX to have loads of info inside a tooltip. Also, wouldn’t it be better to have that address somewhere where the user can copy and paste it? Seemed like a useful thing to have.
I often think it is good to evaluate the UX designs and give your own opinion on what’s possible to implement, but also suggest what would improve the user experience. You’d think the person employed in the UX team is the expert on user experience, but it’s best to not blindly accept it.
Cory House also seems to share this thought:
As a developer, I know I’m not a designer. But that doesn’t mean I should blindly implement designs.
I push back on designs that are:
Insecure
Confusing
Incomplete
Inaccessible
Inconsistent
Not performant
Assuring a good user experience is everyone’s job.
Cory House
More specifically, if we have an existing dialog, and the UX team decides to change what it says; you would think this is the simplest change possible. However, there could be a bit more to it than you would think.
I was explaining this concept to a Junior Developer. I was saying how I loved working with a Product Owner called Rob, who always asked “is that a hard thing to do?” no matter how trivial something sounded. He understood that there could be all kinds of crazy designs in the code.
In theory, it should never be hard. But sometimes adding more words means the words need to wrap onto the next line, and if the dialog hasn’t been coded to resize, then it might be a manual resize job. But if the “design view” is broken, then that makes it even more complicated.
The text might not just be set to specific words in the file where the control is. It could be dynamically generated then passed into another method, or maybe it even is set and read from a database. It’s still easy to change, but if you tried to search the source code for a specific word/words then you might not find it if it is dynamic or in the database instead.
I’m sure there have been times where, after investigation, you are like
“Rob, can’t we just keep the words as they are, I don’t have the skills to add a few more words!”.