Skin-tone plasters

Big changes coming from the Employee Forum. We are getting a variety of skin-tone plasters (band-aid) for the first aid kit. What sort of insane social justice warrior asked for that?

If anything, the default plaster is brown so us white folk need lighter ones.

This is the most extreme woke thing I have ever heard of. I don’t think we will beat it.

Skin Tone Plasters.  A great shout and a big thank you for the lack of variety being highlighted to the Employee Forum.  Aligning to our environmental credentials, the incumbent plasters will remain where they’re within a 3yr use-by lifespan.  As we move to replace, this will be done with a wide variety of skin colour matching plasters. 

I asked a friend what he thought of this:

Jack: I've heard of this before, so dumb

Me: Bet we get sacked for using the black plaster

Jack: Haha I would, just to make a point. Whoever came up with this has too much time on their hands, and whoever gets upset about wearing a wrong coloured plaster is a melt

Me: I should swap 'em with kids ones with cartoon characters on them

I do raise a good point there. If you did use the wrong colour plaster, would people get offended? What happens if you took the last dark-skinned one and someone saw you and they wanted it?

How often do you need a plaster when you are in the office? If the plaster is in a visible part of your body, is the presence of the plaster uncomfortable/embarrassing anyway regardless of colour? I think the default brown one is probably a good compromise for all skin types anyway, but I suppose modern ones can be white or transparent. 

This surely has to be a case of a white person suggesting this, using their wokeness to raise an injustice against darker skinned people, even though no dark-skinned person was actually offended. However, if you now take the plaster that is reserved for them, then they will be offended.

Do we have the same policy with regard to bandages? They are usually white too.

Incompetent Developer Tales

We had an Indian Developer who wasn’t very good and he wasn’t that great at communicating in English. I don’t mind people starting out as a Junior and improving over time, but sometimes I think you need to have a mindset for quality and care about your work, and Manikandan didn’t seem to have that.

Text

Making a change which involves updating text is one of the easiest changes in software development. This particular area of the code had unit tests to check the message we were displaying to the user. So if the requirement changed, you would have to update the unit tests, then change the actual logic. Manikandan changes the code, but for some reason puts an extra space at the end, and doesn’t update the unit tests.

“As suggested the changes were made and PR got patched.” 

Manikandan

Another change he did involved changing the file path of where a specific file (sqlite) was meant to be. He ends up changing the file path for some other file instead. I pointed it out and he claimed he had tested it.

“I have dev tested this yesterday, it looks like i have missed sqlite folder path.

Mistakenly i have taken up the cache folder.

Will update and patch the PR. Apologies as i have misplaced the path location mistakenly and have corrected it now and done dev testing”

Manikandan

Corrupt Files

When viewing files that users have attached, some types we support natively, like text and images, whereas specialist files can be opened externally. Some we blocked completely.

A new change would allow files to be saved regardless of whether they were unsupported or we suspected them to be corrupt. We would still inform the user, though, so these were the 2 requirements:

When you attach:

This file “xxxxx.tif” has been successfully attached. However the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file.

When you select “open file”:

This file “xxx.tif” cannot be opened because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file.

He writes the following note in the bug report:

Made a proper message information need to be displayed to the user as well as restricted further attachment of the corrupted file into user record history.

Manikandan

So that sounds like he is blocking suspected corrupted files, which is not the requirement.  When I looked at his code changes, it seemed like he had done the wrong thing.

So I ask him:

“Have the requirements changed since the product owner’s comment? It looks like you have the message that should be displayed, and it sounds like the attachment should still be saved.”

Me

He replies:

“Corrupted Document will not be attached to the record. Once again checked in the database to confirm”.

Manikandan

So I reply again:

“Who has decided that it shouldn’t be attached to the record? The Product Owner said it should be attached and there should be a specific message shown to the user.”

Me

“Apologies for that, will check once again with product owner”

Manikandan

Got the update and have made the changes as per Document gets attached to the record irrespective of the document was corrupt or not. Will show a message to the user regarding the file corruption but allows the user to attach the doc irrespective of its correctness

Manikandan

So it seems he finally understands the requirement. However, when he submitted his change for review, it showed a message on Loading (but not the one the Product Owner asked for), and still crashed on Save, so he hadn’t really changed anything.

Finishing my work

I had made loads of changes for a new set of validation, but didn’t have time to finish them off as I got moved onto a different project, so I had to hand my work over to him. He made some changes and submitted them to me for review. In one file I had already written the code to show the new message, but he had added another IF statement with the exact same message, except it had some extra spaces in it! So it was redundant code, and it was wrong.

Another requirement was:

“This Warning should be available irrespective of Admin Org flag...”. Yet he had an If statement to check for the flag.

else if(IsAdminOrg)
 {

There should be no IF statement because it should happen regardless of whether the flag is on or off.

In another code block, his logic was based on a check:

specialisedCode.Equals(SpecialisedCode.HDS3)

This equals check would never actually evaluate to “true” because it compared objects of different types. And even if it did work, wouldn’t this display the same text twice? He had already added code to add it to the list, then had some code to add it again if the If statement was true:

 if (matchedItems != null)
{
	var displayName = matchedItems.GetAttribute("displayName");
	var specialisedCode = matchedItems.GetAttribute("code"); ;
	if (!allComponentNames.Contains(displayName))
	{
		allComponentNames.Add(displayName);

		if (specialisedCode.Equals(SpecialisedCode.HDS3) || specialisedCode.Equals(SpecialisedCode.HDS4))
		{
			allComponentNames.Add(displayName);
		}
	}
}
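To illustrate why a check like this can never pass, here is a minimal sketch. It assumes GetAttribute returns a string and that SpecialisedCode is an enum; neither is confirmed by the snippet above, and the EqualsDemo class and its method names are made up for illustration.

```csharp
using System;

// Assumption: SpecialisedCode is an enum along these lines.
public enum SpecialisedCode { HDS3, HDS4 }

public static class EqualsDemo
{
    // Mirrors the original check: string.Equals(object) is called with a boxed enum,
    // so the types never match and the result is always false.
    public static bool NaiveEquals(string code) =>
        code.Equals(SpecialisedCode.HDS3);

    // One possible alternative: parse the string into the enum first, then compare like with like.
    public static bool ParsedEquals(string code) =>
        Enum.TryParse<SpecialisedCode>(code, out var parsed) && parsed == SpecialisedCode.HDS3;
}
```

With these assumptions, EqualsDemo.NaiveEquals("HDS3") returns false while EqualsDemo.ParsedEquals("HDS3") returns true, which is the sort of thing a single unit test would have caught.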

In another part of the code, he had this code

var hasExceptionalComponents = HasExceptionalComponents(sectionDefinition);

if(!data.IsSystemLibraryItem || (data.IsSystemLibraryItem && hasExceptionalComponents))

So if the first part of the if statement isn’t true (“!data.IsSystemLibraryItem”), then it evaluates the second part; but at that point data.IsSystemLibraryItem must be true (the first part only fails when it is true), so checking it again adds nothing. That bit could be excluded; yet more redundant code from Manikandan. This would mean you could write:

if(!data.IsSystemLibraryItem || hasExceptionalComponents)

but what was he doing if this statement was true or false? Here is more of the code:

if (!data.IsSystemLibraryItem || hasExceptionalComponents)
	FormListFromTemplateSectionDefinition(sectionDefinition, allComponentNames);
else
	FormListFromTemplateSectionDefinition(sectionDefinition, allComponentNames);

That’s right, he was doing the exact same thing, so that code block can actually be one line:

FormListFromTemplateSectionDefinition(sectionDefinition, allComponentNames);
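The earlier claim that the two conditions are interchangeable is easy to check exhaustively, since there are only four combinations of two booleans. This is just a sketch; the BooleanCheck class and its parameter names are invented for illustration and are not part of our codebase.

```csharp
public static class BooleanCheck
{
    // The condition as originally written.
    public static bool Original(bool isSystemLibraryItem, bool hasExceptionalComponents) =>
        !isSystemLibraryItem || (isSystemLibraryItem && hasExceptionalComponents);

    // The shorter condition with the repeated check removed.
    public static bool Simplified(bool isSystemLibraryItem, bool hasExceptionalComponents) =>
        !isSystemLibraryItem || hasExceptionalComponents;

    // Compares both versions for every combination of inputs.
    public static bool AreEquivalent()
    {
        foreach (var a in new[] { false, true })
            foreach (var b in new[] { false, true })
                if (Original(a, b) != Simplified(a, b))
                    return false;
        return true;
    }
}
```

BooleanCheck.AreEquivalent() returns true, so dropping the repeated check does not change behaviour.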

So the majority of the code he writes is just redundant. How can you be so incompetent?

When I flagged it to Manikandan, he said

or i am i missing it somewhere, sorry. lil bit confused on this, as i am alone working on this project.

Manikandan

This isn’t a misunderstanding of the requirements, it’s just a lack of care and thought.

Email Regex

When it comes to data entry, you can perform validation using Regular Expressions, which define a pattern/structure. Certain identifiers have rules; for example, a UK postcode (I think) is 1 or 2 letters, then 1 or 2 numbers, usually formatted with a space, then a number, followed by 2 letters, e.g. HX1 1QG.

Software developers often come up with convoluted rules to try and define what an email address looks like, but I think the rules are inconsistent even between email providers, with some excluding certain special characters.

In our software, a developer had decided to change the regular expression in our email address validation. It was complex before, and it looked more complex after the new developer’s changes.

This was what it was:

return (Regex.IsMatch(emailAddress, @"^[A-Za-z0-9_\.\-&\+~]+@((\w+\-+)|(\w+\.))*\w{1,63}\.[a-zA-Z]{2,6}$"));

And he changed it to this:

return (Regex.IsMatch(emailAddress, @"^([\w-]+(?:\.[\w-]+)*(?:\'[\w-]+)*)@((?:[\w-]+\.)*\w[\w-]{0,66})\.([a-z]{2,6}(?:\.[a-z]{2})?)$"));

Now, glancing at that, I have no idea what the advantage is. Some developers just assumed it was a good change and approved the change. What I noticed is that the end part was changed from [a-zA-Z] to [a-z], which means the last part of the email eg “.com” can only be written in lowercase, but previously we accepted “.COM“. As far as I am aware, email addresses don’t care about casing, so although it would be nice if users only entered them in lowercase, it seems an unnecessary restriction. Was it intentional? What else in this validation is different?

If he had unit tests to show the valid emails, and prove his change wasn’t flagging valid emails as invalid; then I would be more comfortable approving it. But there were no unit tests, and I didn’t understand the change.
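As a rough idea of what such tests could look like, here is a sketch that wraps the two patterns quoted above (copied verbatim) so they can be run against the same inputs. The EmailRegexComparison class, its method names and the sample addresses are my own and were not part of the actual change.

```csharp
using System.Text.RegularExpressions;

public static class EmailRegexComparison
{
    // The expression before the change, as quoted above.
    private const string OldPattern =
        @"^[A-Za-z0-9_\.\-&\+~]+@((\w+\-+)|(\w+\.))*\w{1,63}\.[a-zA-Z]{2,6}$";

    // The expression after the change, as quoted above.
    private const string NewPattern =
        @"^([\w-]+(?:\.[\w-]+)*(?:\'[\w-]+)*)@((?:[\w-]+\.)*\w[\w-]{0,66})\.([a-z]{2,6}(?:\.[a-z]{2})?)$";

    public static bool OldIsValid(string email) => Regex.IsMatch(email, OldPattern);

    public static bool NewIsValid(string email) => Regex.IsMatch(email, NewPattern);
}
```

Running a handful of sample addresses through both shows the change is not just cosmetic: "user@example.COM" and "first.last+tag@example.com" pass the old pattern but fail the new one, while "o'brien@example.com" fails the old pattern and passes the new one. A table of cases like that in the PR would have made the intent much easier to review.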

Another developer pointed out that the usual advice for emails is just to go for the simple *@*.* validation, which translates to: “it must have some characters, followed by an @ sign, more characters, then a period followed by some characters”.

Email validation is surprisingly complicated, so it’s best having a simple restriction just to filter out data-entry errors. Having complex restrictions can then exclude people’s valid email addresses and you definitely don’t want to do that!
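A simple check along those lines might look like the sketch below. The exact pattern is my own choice (no whitespace, exactly one @, and at least one dot after it) rather than anything we actually shipped, and SimpleEmailCheck is a made-up name.

```csharp
using System.Text.RegularExpressions;

public static class SimpleEmailCheck
{
    // Some characters, an @ sign, more characters, a period, then some more characters.
    // Whitespace and extra @ signs are excluded; nothing else is restricted.
    private static readonly Regex Pattern = new Regex(@"^[^@\s]+@[^@\s]+\.[^@\s]+$");

    public static bool LooksLikeEmail(string input) =>
        input != null && Pattern.IsMatch(input);
}
```

This only aims to catch obvious data-entry slips such as a missing @ or a missing domain suffix. It makes no attempt to decide whether the address really exists, which only sending an email can tell you.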

The Curious Case of Paul’s Holiday: A Software Saga

This is a story of how you can find a bug before users report it, but don’t end up fixing it due to other priorities or communication breakdown.

I was trying to investigate a bug with our software, and ventured into a related module to try and configure it. However, it crashed. After looking at the code where the crash occurred, I found the developer that had likely caused it, and it was a change he made in the previous month. I sent a message directly to Paul to alert him to it. There was a chance the change hadn’t been released yet, and he could fix it in time. Or if it had been released, then it probably needed to be fixed urgently because the feature looked unusable as far as I could tell.

Paul replied stating that he had realised his change had caused a bug elsewhere and this was essentially the same mistake, so he would add it to his list of things to fix. However, he was going on annual leave in a couple of days so would need to hand it over to his team.

He sent an email to his team and copied me in.

Recently my change had caused a bug in Resource Manager. If you archive and unarchive a template it causes a crash. It became evident that there is the potential for two more crashes to occur that are related.

Schedules (clicking on any folder, demonstrated by Praveen)
Assessment (found today, but we believe this only is used by 1 site)

Root cause is that the control that I changed is the same in these other areas, however these “resources” do not have a file size.

I have created a bug for it, and raised a draft change which I believe would fix the issues (I have done some local testing but more required).

This could be picked up by someone else in my absence if it needs to be progressed asap (code is live and not behind a feature flag).

These discoveries have been through our own efforts, not through Support.

Needless to say, this feature has had its fair share of issues, and I will write up a report/lessons learned document upon my return.

A few days later, Paul returned from his annual leave.

[15:13] Paul Marshall 
you know that bug in assessments module you highlighted the other day?
Major Incident raised. Only 2 complaints so far, but we have found more occurrences when browsing the error logs.

[15:35] Paul Marshall
I raised the PR (draft) but did not progress it as I was literally going on holiday that day. I emailed a bunch of folks and gave the full picture to them, including all the evidence and PR but it wasn’t moved on.
When I was back I raised it and it sat there not reviewed by the team even though I asked repeatedly to have it reviewed.

Sorry, bit of a rant there....
its moving now

[15:40] Me
let's write a blog post for Viva Engage about how we came close to preventing a Major Incident
others foiled our attempt

Emails were flying, managers were sighing, and somewhere, a developer was crying.

AI
The Curious Case of Paul’s Holiday: A Software Saga

Once upon a time, in the not-so-far-away land of Software Developmentia, there was a developer named Paul. Paul was a diligent worker, known for his sharp eye for detail and his love for datagrids. One fateful day, Paul made a change to a datagrid that was nothing short of genius… for a certain group of people. For another group, however, it was a disaster waiting to happen.

Paul, with his developer senses tingling, caught the potential calamity before it unleashed chaos upon the unsuspecting second group. He crafted a fix as elegant as a ballroom dance and sent out an email to his manager and team, detailing the issue, the solution, and the testing needed. It was a masterpiece of communication, a symphony of technical jargon and urgent pleas. And with that, Paul packed his bags and set off on a well-deserved holiday, visions of bug-free code dancing in his head.

But alas, in the twist of all twists, Paul’s team, perhaps too mesmerized by his email’s eloquence or too entranced by their own holiday plans, did nothing. The fix sat, lonely and unimplemented, as Paul’s toes wiggled in the sand and his out-of-office reply cheerfully told people he’d be back soon.

Upon his return, Paul found his inbox flooded with the digital cries of users caught in the datagrid debacle. The issue had gone live, and chaos reigned. Emails were flying, managers were sighing, and somewhere, a developer was crying.

What can we learn from this tale of holiday woe? Perhaps it’s that communication, no matter how clear, is only as good as the action it inspires. Or maybe it’s that one should never go on holiday without ensuring their work is done… or at least assigned to someone who isn’t equally holiday-bound.

So let us raise our mugs of coffee to Paul, the developer whose holiday will forever be remembered in the annals of Software Developmentia. May his next vacation be free of datagrid disasters and full of completed tasks.

And to all the Pauls out there, a word of advice: double-check your team’s to-do list before you hit that out-of-office button. It might just save your holiday… and your datagrids.

Early Morning Meetings

We had our daily “Stand-up” meetings at 9:45, where the team states what they did yesterday and what they will aim to complete today. In our “Retrospective” meeting, where the team reflects on what went well and what didn’t go well over the last 2 weeks, and also what improvements could be made, one developer questioned why we have our meetings at 9:45 rather than 9:00.

Most people in the team start their day at 9:00, so doing a bit of work and then having to join the meeting seemed like a distraction. After thinking about it, the longest serving team members said they might have set that time due to the previous Product Owner having several teams and it was the best slot for them to attend. Then no one questioned it and so it remained.

I suggested that we start at 9:05, given that if we turn on our laptops at 9:00, we will end up being late. When team members are asked for help at the start of the day, they often reply “let me get my coffee”, so it made sense to allow people to finish their breakfast and get a drink. People seemed to think it was a good idea.

However, the next week, the meeting invite was set at 9:00. I questioned some of the team members.

[Monday 09:19] Me
has Sam ignored my idea of starting at 9:05 instead

[Monday 09:20] Dennis
must've

[Monday 09:24] Dean
what was the 9:05 idea? can't remember that

[Monday 09:25] Me
I said if we start at 9, it gives us a chance to get logged in, and you get your coffee
then everyone was like, "yeah dude that is sick idea mate", and you kicked off about being falsely accused of wanting coffee

[Monday 09:26] Dean
haha i don't fully understand
i get logged in and get my coffee based on what time the call is
T - 5 mins
doesn't matter whether that's 9am or 9:07

Dean had previously criticised me for being late to a meeting that was arranged for 8:30 (before my 9:00 start), and I had missed another that was arranged with less than an hour’s notice. I felt like he was using this as another opportunity to question my attitude towards meetings.

On the third day, Dean then says he is finding it really hard to get used to starting at 9:00 and might consider asking for it to be moved back to 9:45. He finds it hard to wake up and be alert for the meeting. Some days he will be late from childcare because it’s a bit of a rush back and can be hard with the traffic.

Quite interesting how he made out he always starts working before 9:00, and made sure he is ready for any meeting 5 minutes before it starts. Then when it comes down to it, he admits he starts late some days due to childcare, and other days he isn’t fully alert to work efficiently. So when we had our meeting at 9:45, it seemed like he was never really working at 9:00 like he claimed, whereas I would be ready at 9:05.

Self-Assessment

Recently, we were filling in our forms for the end of year performance reviews. We have tried all kinds in the past, but have settled on something simplistic in recent years. It’s basically structured around open questions of “what went well?”, “what didn’t go well?”, “What have you learned?”, “What would you like to learn?”. 

Since we had only just evaluated ourselves, it was a surprise to get an email directly from the CTO wanting us to evaluate ourselves.

Hope you are well. 

We are currently conducting an assessment exercise across our Portfolio to establish our strengths and areas for improvement, with the goal of growing our capability as a department and to drive to our long term vision of moving to development on our new product.

To facilitate this process, we have developed an assessment questionnaire that will help us understand your capabilities and your career trajectory.

Could you please complete this form by selecting the option that best reflects your current capability or skill.

It’s an unexpected email, states urgency, and contains a suspicious link. All the hallmarks of a phishing email. I waited for a colleague to click the link before clicking mine. Given that it asks similar questions to what is on our performance review, as well as many others that are specific for our job role; why wouldn’t they just standardise the review process in order to get the information?

Clicking the link loads up a Microsoft form with Employee ID and Name filled in with editable fields but the question says “Please do not change this”. My name had double spaces in it which was really annoying. What would happen if I did correct it? Does Microsoft Forms not allow you to have non-editable fields? Seems a weird limitation regardless.

The questions were labelled with the following categories:

Delivery, Code Quality, Problem Solving, Accountability, Technical Proficiency, Domain Proficiency, Cloud Knowledge, New Gen Tech Stack Proficiency, Joined Up, Process and Communication, Innovation. 

I really didn’t like the way the questions were written. There are 5 answers labelled A-E, but C is often written to sound like a brilliant option when you would expect that to be average. B and A just sound like behaviour reserved for the Architects/Engineering Managers/Principal Developers.

Given that the answers just seem to link directly to your job role, then it reminded me of those online quizzes where it is gonna decide what TV Character/Superhero you are, but you can easily bias your answers because you can see exactly where it is going. In this case, this assessment just seems like it is gonna rank you Architect, Expert, Senior, Junior based on your answers.

Some of the wording for the lowest answers seems like a strange thing to admit.

“Only engages in innovation efforts when directly instructed, showing a complete lack of accountability. “

Why would you admit to showing a complete lack of accountability? Most people probably don’t “innovate” but selecting an answer with “showing a complete lack of accountability” seems crazy.

So given that some answers are never gonna be selected because it’s a difficult thing to admit, and given some answers were clearly based on your job description; then people would just select answers based on what they SHOULD be doing, rather than what they ACTUALLY do. So therefore, it’s a pretty pointless survey. Also there is bias that it was given during the review period so people would suspect it would be used to decide pay-rises and promotions rather than just for some team reshuffle. 

This one on Code Quality is weird because B and C seem similar in standard, but then when you read D, it sounds like you admit you are an incompetent Software Developer.

Code Quality 
(cq.a) Established as code guru and plays a key role in shaping optimal code quality in the team through effective reviews, acting on insights from tools, identifying and resolving inefficiencies in the software and process.
(cq.b) Effectively uses insights from tools like Sonarcloud and influences team quality positively by enforcing standards and showing an upward trend of improved quality and reduced rework.
(cq.c) Upholds the highest standards of unit testing, coding practices, and software quality in self-delivery and ensuring the same from the team through effective code reviews.
(cq.d) Rarely identifies refactoring opportunities, misses critical issues in code reviews, and struggles to positively influence the team's approach to code quality.
(cq.e) Engages minimally in code reviews, allowing issues to slip through; unit tests are skipped and/or yet to begin influencing the code quality of the team.

This one seems applicable to only the top people, or ones that love the limelight and want attention from the managers.

Joined-up
(ju.a) Designs personalised learning paths for team members to ensure comprehensive skill development.
(ju.b) Takes ownership of training needs, seeking opportunities for personal growth. Takes the initiative to identify advanced training opportunities for skill enhancement.
(ju.c) Demonstrate robust team communication, encourage team to contribute in weekly Lunch and Learn sessions, actively recognising peers, support juniors wherever needed. Be active in recruitment.
(ju.d) While excelling as an individual contributor, there is an opportunity to engage more with team members by sharing ideas, seeking input, recognition and offering support in team/organisation initiatives
(ju.e) Need to start taking on a mentoring role by sharing knowledge, providing guidance, and offering constructive feedback to the juniors help them grow and succeed.

I think it is difficult to make meaningful surveys or assessments, but you need to put some thought into the value, and the accuracy of the results.

DataGrid & The Custom Row Selection

We have one module where you select rows from a grid, then can perform actions on multiple entries. The first column in the grid contains checkboxes. So whatever is ticked is what the actions should apply to. However, if you check the box, it also highlights the row. So there are actually 2 things visually that show the user what is selected.

Clicking anywhere on the row should highlight the row, but since both the highlighting and the checkbox show what is selected, we need some custom code to also check the box when the row is highlighted.

My team had made a tweak to the underlying control, which had broken the checkbox being ticked when the row is selected.

When we recreated the bug, we realised that – because there’s 2 ways of selecting (highlight and checkbox), when you click buttons to perform actions, it actually checks against what is highlighted. So even though we had introduced a bug that it no longer checks the box, it doesn’t actually cause any problems at all because the checkboxes are merely a visual thing.

A simple fix would be to just remove the column since it is redundant. Then a lot of code could be cleaned up.

“One thing we have noticed is the tick box isn’t actually needed, highlighting the column gives you all the same functionality as selecting the tick box.” 

Tester

However, our Product team said that even though multiselect behaviour has always existed, many users weren’t aware and so clicked each checkbox one by one. So it sounds like some users like clicking rows, some like using checkboxes, and some like using keyboard shortcuts.

The keyboard behaviour seemed rather strange too and caused extra complications with the code. You can press down/up which selects/deselects the row below/above the row that is currently focussed. However, there is no visual indicator of which row actually has the focus. Other shortcuts include pressing Spacebar which toggles the checkbox on/off. Pressing ctrl+up/down jumps to the top/bottom respectively and checks the box (but if it is already checked, then it doesn’t uncheck it; which is inconsistent with up/down without the Control key). You can also press ctrl+a which selects all rows, but you cannot deselect all rows.

It really illustrates how complicated something basic can become when you try to make all users happy. Then the more custom code you add, the more likely there are bugs and other inconsistencies.

I was noticing inconsistencies with the shortcuts. So when I had implemented my fix, I was convinced I had broken them.

private void ListData_KeyUp(Object sender, KeyEventArgs e)
{
	if (listData.CurrentCell == null)
		return;
	
	if (e.KeyData == Keys.Down || e.KeyData == Keys.Up)
		HandleGridRowSelection(mouseActionOrSpaceKey: false);
	
	if (e.KeyData == Keys.Space)
	{
		_isSpaceKeyPressOnCheckBox = listData.CurrentCell.ColumnIndex.Equals(_checkBoxColumnIndex);
		HandleGridRowSelection(mouseActionOrSpaceKey: true);
		_isSpaceKeyPressOnCheckBox = false;
	}
	
	if ((e.Modifiers == Keys.Control && (e.KeyCode == Keys.A || e.KeyCode == Keys.Up || e.KeyCode == Keys.Down)))
		HandleCheckBoxHeaderClick(combinationKeys: true);
}

I thought I had broken the functionality because sometimes I saw it highlight all rows but it didn’t select the checkbox. Then when I was looking at the code again (which I hadn’t modified), I noticed it was called from Key Up. The ListData_KeyUp code would expect you to let go of A whilst still holding down Control. Although most people would do that due to their hand positioning, sometimes I was finding I was basically “stabbing” the keys so was releasing them at roughly the same time. So on the occasions I had actually released Control first, the IF statement didn’t pass and therefore the checkboxes weren’t selected. I think the standard implementation is to handle KeyDown rather than KeyUp.
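The timing problem can be shown without any UI at all. The sketch below replays a sequence of key events and checks for Ctrl+A either when A goes down or when A comes up. DemoKey and ShortcutDemo are hypothetical stand-ins I made up so it runs on its own; they are not the WinForms Keys or KeyEventArgs types our real handler uses.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the two keys involved in the shortcut.
public enum DemoKey { Control, A }

public static class ShortcutDemo
{
    // Ctrl down, A down, Ctrl up, A up: the "stabbing" case where Control is released first.
    public static readonly (DemoKey Key, bool IsDown)[] ControlReleasedFirst =
    {
        (DemoKey.Control, true),
        (DemoKey.A, true),
        (DemoKey.Control, false),
        (DemoKey.A, false),
    };

    // Returns true if Ctrl+A is detected, testing the modifier either on A's key-down
    // (checkOnKeyDown = true) or on A's key-up (checkOnKeyDown = false).
    public static bool DetectsCtrlA(IEnumerable<(DemoKey Key, bool IsDown)> events, bool checkOnKeyDown)
    {
        bool controlHeld = false;
        foreach (var e in events)
        {
            if (e.Key == DemoKey.Control)
            {
                controlHeld = e.IsDown; // track whether Control is currently held
            }
            else if (e.IsDown == checkOnKeyDown && controlHeld)
            {
                return true; // A event of the kind we listen for, with Control still held
            }
        }
        return false;
    }
}
```

For the ControlReleasedFirst sequence, checking on key-down detects the shortcut and checking on key-up does not, which matches what I was seeing in the grid.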

The Feature Flag Rant

When adding new features to software, you can add a Feature Flag. If set to true, it uses the new feature, false and it doesn’t. This allows a quick roll-back feature by tweaking this value rather than releasing a new software update. However, it makes the code more complicated due to branching paths.
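In its simplest form, a flag is just a boolean lookup that picks between the old and new code path. The sketch below is illustrative only: the "NewGreeting" flag, the Greeting class and the dictionary used as a flag store are all made up, and a real system would read the value from configuration.

```csharp
using System.Collections.Generic;

public static class Greeting
{
    // Hypothetical flag name; a missing flag is treated as "off".
    private const string NewGreetingFlag = "NewGreeting";

    public static string Build(string name, IReadOnlyDictionary<string, bool> flags)
    {
        bool useNewGreeting = flags.TryGetValue(NewGreetingFlag, out var enabled) && enabled;

        // Both branches have to be kept working and tested for as long as the flag exists.
        return useNewGreeting
            ? $"Welcome back, {name}!"
            : $"Hello {name}";
    }
}
```

Even in a toy example like this there are already two behaviours to maintain, and every extra flag in the same area multiplies that.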

When all users are now using the new feature, when do you remove the code? Obviously it should be removed once all users are switched over and happy with the new functionality, but the work needs to be planned in, and what is the urgency? Project Managers will want new projects that add value, not deleting redundant code.

One of our most experienced developers posted a rant about feature flags. He pointed out there was no guidance on when to use feature flags. Do all new features get feature flags? What if it depends on a feature that already has a feature flag? Do Software Testers test each combination to make sure all code paths are supported? Is it clear which configurations are deployed on live since this should have priority when it comes to testing? By default, our Test Environments should match the config of a typical Live Environment. However, we often find that the default is some configuration that is invalid/not used.

It’s not always possible to “roll back” by switching the feature flag off. This is because to implement the change, you may have needed to refactor the code, or add new database columns. Changing the feature flag back to “off/false” just stops some new code being called, but not all new code changes (the refactored parts). So if the bug is in changes that run even with the flag off, then it is still a problem.

It was also discussed that some people used our Configuration Tool for actual configuration and others were using them as Feature flags, and maybe we should have separate tools for Configuration and Features.

Feature flags cause maintenance problems. A flag needs to be tested on/off when implemented, and if you want to remove it, that needs to be tested too. If you leave it in, then it’s always going to be questioned if code in that area is used/needs testing. How do you prioritise removing the code; does it belong with the team that initially created the feature? What if the team has moved on, or split?

Another developer brought up an example of how a bug existed in two places but the developer that fixed the issue was only aware of one path, and didn’t know about the other which required a feature flag to enable.

He also questioned if it is more of a problem with our process. Other companies may have quicker releases and are more flexible to rollback using ideas like Canary Deployment. Our process is slow and relies on “fix-forward” rather than rollback.

Things to consider:

  • What actually gets feature flagged?
  • When is the conditional code removed from the codebase?
  • Effect of the “Cartesian Explosion” of combination of flags on unit tests and test environments
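To put a number on that last point: n independent on/off flags give 2^n possible configurations. The sketch below enumerates them; FlagCombinations and the flag names in the usage note are hypothetical, not our real flags.

```csharp
using System.Collections.Generic;

public static class FlagCombinations
{
    // Yields every on/off combination for the given flag names (2^n dictionaries).
    public static IEnumerable<Dictionary<string, bool>> All(IReadOnlyList<string> flags)
    {
        int total = 1 << flags.Count; // 2^n; fine for the small counts used in a demo
        for (int mask = 0; mask < total; mask++)
        {
            var combination = new Dictionary<string, bool>();
            for (int i = 0; i < flags.Count; i++)
            {
                // Bit i of the mask decides whether flag i is on in this combination.
                combination[flags[i]] = (mask & (1 << i)) != 0;
            }
            yield return combination;
        }
    }
}
```

Calling FlagCombinations.All(new[] { "A", "B", "C" }) yields 8 combinations; ten flags would yield 1,024. That is why it matters to know which configurations are actually deployed on live, so testing can focus on those.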

Crowdstrike Struck The World

I heard from a few security podcasts that Microsoft wanted to have exclusive rights to manage the security of the kernel on Windows machines. However, the EU’s competition laws don’t favour monopolies, so regulators want an open market of security software. In most cases, competition is good, but this could actually be one area where you do want a closed system. The more companies that have control in something as fundamental as the kernel, the greater the risk of threats.

A kernel driver has very intimate access to the system’s innermost workings. If anything goes wrong with a kernel driver, the system must blue screen to prevent further damage to the user’s settings, files, security and so on.

Crowdstrike released a faulty software update, which caused the infamous blue screen of death on many Windows systems across the globe. Microsoft must have been fuming, because they knew this wouldn’t have happened with a closed system, and the media kept on reporting on it as if it was a Windows problem. Sure, it only affected Windows PCs, but it had nothing to do with Microsoft.

If I understand correctly, the driver was signed off by Microsoft but the update involved a “channel file” which just contained loads of zeros. So when the driver used it, it had no choice but to blue-screen. It makes you wonder what kind of testing processes they have at Crowdstrike if they can release an update like that.

When I logged in at work, our Group IT announced that some colleagues would be affected by a Crowdstrike problem and that they would be acting quickly to get people back up and running. It was only a bit later, when someone sent me a screenshot of some of our users complaining on X, that I realised it wasn’t just an internal problem. When I went on X, I saw reports of the problem affecting banks, airlines, supermarkets and more, and the BBC had a live news page on it. I still didn’t understand the severity of the problem until I saw that Troy Hunt had declared it one of the severest problems we have ever seen.

Group IT made it sound easy to restore, but when I heard others talk about it, I got the impression that reverting the update on a single computer was fairly straightforward, while doing it on hundreds of computers was a real problem. In companies that only have a few IT staff, it is crippling. You may think that people could fix the problem themselves, but many people aren’t tech-savvy, and many companies lock down access so you don’t have any advanced features like Administrator mode.

Furthermore, it sounded like servers “in the cloud” were even more difficult to restore, or at least more cumbersome.

Ironically, in recent years, we have moved a lot of our live infrastructure from our own data centres into the cloud, citing the benefits of reliability. However, this problem meant our users were impacted for a day or so, when we could have got them up and running within an hour or so if the servers were still internally hosted.

Crowdstrike released an update to prevent more machines from being brought down, and sent customers mitigation steps and tools to identify impacted hosts. The new update wouldn’t fix the broken machines though; that required a manual fix involving booting into safe mode, locating the dodgy file, and removing it.

Companies purchase security software to prevent system outages, so causing a global system outage is a massive PR blunder for Crowdstrike and for security software in general. It’s gonna be tough rebuilding trust, but many everyday people will probably blame Microsoft because that’s the name that was initially stated in the media.

It must have been brutal for the upper management, and a disaster when they turn up fatigued and under pressure on live TV.

Troy Hunt documented the story as he learned more:

Strange keyboards

Recently, I’ve come across some interesting keyboard designs.

Optimus

Optimus Maximus Keyboard review (Cherry ML)

This one has customisable screens under the main keys. Very pricey and very gimmicky.

Emoji

Logitech Pop Keys keyboard review: form over function – The Verge

This one has a selection of emoji keys. For someone that loves emojis, it sounds like a great idea in theory. However, at work, I tend to react to messages on Slack/Teams and I think this would only add them when you are writing text. Also, they have chosen very generic ones which they think are popular. Although I might use a laughing emoji, I like using more obscure ones based on inside-jokes. So it wouldn’t really work for me.

Lego

There is this Lego-like keyboard which looks bizarre, and the straight layout and limited keys make it much harder to type on. A total gimmick.

Keyboards Should Have Been Like This

The Future

Remember when we stopped using keyboards? Basically just examples of zany ideas which aren’t that practical for most people.

2002: COMPUTER KEYBOARDs of the FUTURE | Tomorrow’s World | Retro Tech | BBC Archive