Process

I suck at estimating! But it seems that our industry considers estimating an important skill. I’ve always been optimistic with my estimates, and being aware of this hasn’t made them more accurate. (Apparently there’s even a law for this – Hofstadter’s law.) The truth is, I never wanted to get better at it. And I still don’t.

Why?

First of all, estimates are waste (from a lean point of view). If you ask a customer, “Do you want more estimates?”, they’ll say no. By that definition, estimates are waste.

Of course, there are different types of estimates. There are rough order-of-magnitude estimates for large chunks of work. Let’s call this Estimating a project. Then there’s estimating smaller pieces of work, for example Estimating stories or features.

Continue Reading
Architecture, Clean Code, Quality

Last week was a good week for the IT community in Iasi thanks to Codecamp – 2018 autumn edition. One of their masterclasses caught my eye – Crafting Code by Sandro Mancuso. I have been following Sandro’s work for a while now, so this was a great opportunity for me to put the theory into practice. This blog post contains some of the things I’ve learned during the training.

This was a two-day, hands-on course focused on TDD, using mocks as a design tool through Outside-In TDD, and working with Legacy Code. All exercises required pairing, which was a good opportunity to meet and learn from other people.

TDD

The focus of the first day was to learn the basics of TDD. Here are some of the highlights:

  • Think of tests as specifications for the unit under test.
  • How to name a test. Always try to make your code read well in English. If you’re testing an Account class, name the test class AccountShould. Then each test should continue from there – e.g.: Increase_Current_Balance_When_Making_A_Deposit. This reads nicely, contains terms used by the business (ubiquitous language), and specifies clearly what the test does (see the sketch after this list).
  • The order in which to write the Given, When, Then is important. Start with Then, since this should be obvious from the test name. Then write the When and the Given. Implementing the steps in this order will keep the test focused and ensure we’re not doing too much in the Given step.
  • If the test that you’ve just written goes immediately to Green, then maybe the previous test took too big of a leap. TDD is about Red, Green, Refactor, not Red, Green, Green,…Green, Big Refactor.
  • Do not treat exceptional cases and the happy path at the same time. First flesh out the happy path, then add edge cases. This will usually get you to the solution faster.
  • Try to avoid the False Sense of Progress – writing lots of tests that pass quickly without helping you identify the solution. You should write the smallest test that points you in the right direction (i.e. the solution).
  • How to test a method that returns void – look for side effects without breaking encapsulation.
  • Don’t believe the single assert myth. A test should contain a single logical assert. We can have more than one assert statement in a test, but they need to be logically grouped together.
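To make the naming and ordering advice concrete, here is a minimal sketch of the Account example from the bullets above, using NUnit (the course did not prescribe a framework, and the Account implementation is my own illustration):

using NUnit.Framework;

public class Account
{
	public decimal CurrentBalance { get; private set; }

	public Account(decimal initialBalance) => CurrentBalance = initialBalance;

	public void Deposit(decimal amount) => CurrentBalance += amount;
}

[TestFixture]
public class AccountShould
{
	[Test]
	public void Increase_Current_Balance_When_Making_A_Deposit()
	{
		// Given (written last): an account with a known balance
		var account = new Account(initialBalance: 100m);

		// When (written second): the business action under test
		account.Deposit(50m);

		// Then (written first, driven by the test name): a single logical assert
		Assert.That(account.CurrentBalance, Is.EqualTo(150m));
	}
}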

After that, we focused on the two main styles of TDD: classicist and outside-in. (Sandro also mentioned a more extreme style – TDD as if you meant it. If you want to check it out, have a look at Adrian Bolboaca’s blog.)

Classicist (Chicago school)

  • This is a good way to test drive an algorithm, data manipulation or conversion, when you know the inputs and outputs, but you don’t know anything about the implementation.
  • The design happens in the Refactor step. Because of this, it can be harder to get to a good design if the unit under test touches many domains (e.g. Payment, Shipping).
  • Use the transformation priority premise to get from Red to Green. This can help you avoid writing test code that duplicates production code.
  • As the tests get more specific, the code gets more generic. So look for ways to move data out of the algorithm (see the sketch after this list).
  • You cannot refactor a switch statement step by step – you need to rewrite the whole thing. So try to avoid them when test driving an algorithm.
  • Recommended book: Test Driven Development: By Example by Kent Beck.
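As an illustration of tests getting more specific while the code gets more generic (the leap-year example is mine, not from the course), each new test forces a generalization instead of another special case:

using NUnit.Framework;

public static class LeapYear
{
	// Final, generalized implementation. Earlier TDD steps were simpler:
	// "return true" (faked), then "year % 4 == 0", then the century exception, and so on.
	public static bool IsLeap(int year) =>
		year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
}

[TestFixture]
public class LeapYearShould
{
	[TestCase(2020, true)]   // drove the faked "return true"
	[TestCase(2019, false)]  // drove "year % 4 == 0"
	[TestCase(1900, false)]  // drove the "% 100" exception
	[TestCase(2000, true)]   // drove the "% 400" rule
	public void Classify_Years(int year, bool expected) =>
		Assert.That(LeapYear.IsLeap(year), Is.EqualTo(expected));
}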

Outside-In (London school)

  • Use this when you have an idea about the implementation and the internals of the unit under test.
  • Use mocks as a design tool. Mocks get a bad name because many people misuse them. They can be a powerful tool when they are used correctly.
  • Most use cases don’t require strict mocking. Some really high-risk apps (for health care, rockets, nuclear plants) might benefit from it.
  • Don’t mock private methods, even if the framework allows it. Even though you would write more tests, it would not lead to a better design.
  • Don’t use Argument.Any when verifying method calls. The arguments are part of the contract, so they should be checked (see the sketch after this list).
  • Recommended book: Growing Object-Oriented Software, Guided by Tests by Steve Freeman and Nat Pryce.
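A minimal sketch of the argument-verification point, using Moq (the framework choice and the Notifier/IMailSender names are my own assumptions, not from the course):

using Moq;
using NUnit.Framework;

public interface IMailSender
{
	void Send(string to, string subject);
}

public class Notifier
{
	private readonly IMailSender mailSender;

	public Notifier(IMailSender mailSender) => this.mailSender = mailSender;

	public void NotifyPaymentReceived(string customerEmail) =>
		mailSender.Send(customerEmail, "Payment received");
}

[TestFixture]
public class NotifierShould
{
	[Test]
	public void Send_A_Payment_Received_Email_To_The_Customer()
	{
		var mailSender = new Mock<IMailSender>();
		var notifier = new Notifier(mailSender.Object);

		notifier.NotifyPaymentReceived("jane@example.com");

		// Verify the exact arguments - they are part of the contract.
		// Using It.IsAny<string>() here would hide a wrong recipient or subject.
		mailSender.Verify(m => m.Send("jane@example.com", "Payment received"), Times.Once());
	}
}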

Using Outside-In TDD to implement a business feature

We started the second day with an ATDD exercise. Sandro took this opportunity to talk about Outside-In Design:

Architecture vs. Design 

  • Architecture – These are the systems that are part of the product and the way they interact. Each one should be treated as a black box. Simon Brown‘s container view (part of the C4 model) came to mind.
  • Macro Design – the architecture of each system. This is where you choose MVC, layers, ports and adapters, clean architecture (Simon Brown has an interesting post on the different styles).
  • Micro Design – how classes collaborate and what modules you need.

When practicing Outside-In TDD, it is recommended to think about the application’s architecture and macro design beforehand. Then you can use TDD to drive the micro design. When you start thinking about how to make the first Acceptance Test pass, you’ll need to make lots of design decisions before writing any code.

Test Types

There are a lot of conflicting definitions for test types. What’s important is for your team to know exactly what you mean when you say, for example, Integration Test or Component Test. Sandro briefly described a potential test classification:

  • Acceptance Test – tests a behavior of the system. The entry point is usually the Application Service (from DDD; the Use Case in Clean Architecture; the Action in Interaction-Driven Development). External dependencies (e.g. databases) can be mocked (white box testing) or we could use the real implementation (black box testing)
  • Unit test – the unit under test is a single class or a small group of classes
  • Component Test – the unit under test is the Domain Model
  • Feature Test – the unit under test is the Application Service and the Domain Model
  • Integration Test – testing classes at the system boundaries (e.g. testing the SQL implementation of a Repository)
  • User Journey Test – the unit under test is the UI; the backend is mocked

You start with an Acceptance Test, then move to the other test types, as needed, while mocking collaborators.

Testing and Refactoring Legacy Code

This is the part that really impressed many of us in the audience. I had seen Sandro’s session on Testing and Refactoring Legacy Code back in 2013, but I enjoyed seeing it live. This is one of the most useful presentations I’ve seen because it was immediately applicable to the work I was doing. It also led me to Michael Feathers’ Working Effectively with Legacy Code. If you’re working with legacy code, you need to read this book. It will help you when you get stuck.

Some tips from the session:

  • Use Dependency Breaking techniques (e.g. Subclass and Override Method) in order to write tests for legacy code (see the sketch after this list).
  • Test from the shallowest branch, since it contains the lowest number of dependencies.
  • Refactor from the deepest branch.
  • Use Test Data Builders to make tests more readable.
  • Use Guard Clauses to make the happy path more visible.
  • Use the Balanced Abstraction Principle to make sure that everything in a method is at the same level of abstraction. Public methods should tell a story.
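Here is the sketch referenced in the first tip: Subclass and Override Method applied to a class with a hard-wired dependency (the OrderService/UserSession names are my own illustration, not the exercise code from the course):

public class User
{
	public bool IsPremium { get; set; }
}

// A legacy singleton that cannot be used from a unit test.
public class UserSession
{
	private static readonly UserSession Instance = new UserSession();

	public static UserSession GetInstance() => Instance;

	public User GetLoggedUser() =>
		throw new System.InvalidOperationException("Not available outside a web request");
}

public class OrderService
{
	public decimal CalculateDiscount(decimal total)
	{
		var user = GetLoggedUser();
		return user.IsPremium ? total * 0.1m : 0m;
	}

	// The seam: the hard-wired call is extracted into a protected virtual method.
	protected virtual User GetLoggedUser() => UserSession.GetInstance().GetLoggedUser();
}

// In the test project: the dependency is broken by overriding the seam.
public class TestableOrderService : OrderService
{
	private readonly User user;

	public TestableOrderService(User user) => this.user = user;

	protected override User GetLoggedUser() => user;
}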

Conclusion

As I said, I was aware of Sandro’s work. Things made sense while reading the blog posts but only “clicked” during the course. This is because the course relied on coding exercises, pairing and on Sandro critiquing our code (which he did a lot!). And we all know that there is no learning without experimentation and playing around.

At the end of the course, my only complaint was that it ended just as we were starting to delve deeper into more advanced topics: design and architecture. Fortunately, there is another course that tackles these subjects – Crafted Design. So hopefully I’ll attend that one soon!

In conclusion, this was the best training I’ve attended. Sandro’s passion and experience were visible from the get-go. The advice was pragmatic. The discussion about the different options he considered while designing also gave us a glimpse into his train of thought. It was great to have the opportunity to learn from a software craftsman. And, as a bonus, we also talked a bit about BDD and DDD, which helped me confirm some of my ideas and see other things in a new light.

So don’t miss the chance to attend this course!

Architecture, NServiceBus

In the previous two posts in this series, we’ve seen some examples of long running processes and how to model them. In this article we’ll see where to store the state of a long running process. This is an important topic when talking about long running processes because long running means stateful. We’ll discuss three patterns: storing the state in the domain entity, in the message or in a process instance. To better explain these patterns, we’ll implement subflows from the Order Fulfillment enterprise process.

Order Fulfillment

You can find the code on my GitHub account.

Store the state in the Domain Entity

This is probably the most used approach of the three, although it’s rarely the best choice. It gets overused because it’s simple: you just store the state in the domain entity.

Requirement

Let’s start with what Finance needs to do when it receives the OrderPlaced event: charge the customer. To do that, it will integrate with a 3rd party payment provider. The long running process in this case handles two messages:

  • the OrderPlaced event – in which case it will send a ChargeCreditCardRequest
  • the ChargeCreditCardResponse

Implementation

Since we only have two transitions, we could store the state in the Order entity.

(Diagram: Entities example)

Let’s have a look at the code. We’ll use NServiceBus, but the code is readable even if you don’t know NServiceBus or .Net.
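The excerpt stops before the code, so here is only a rough sketch of the pattern (the handler, repository and status names are assumptions, not the exact code from the GitHub repository): each handler loads the Order, reacts to the message and records the new state on the entity itself.

using System.Threading.Tasks;
using NServiceBus;

public class OrderPlaced : IEvent { public int OrderId { get; set; } }
public class ChargeCreditCardRequest : ICommand { public int OrderId { get; set; } }
public class ChargeCreditCardResponse : IMessage { public int OrderId { get; set; } public bool Success { get; set; } }

public enum OrderStatus { Placed, ChargeRequested, Paid, PaymentFailed }

public class Order
{
	public int Id { get; set; }
	public OrderStatus Status { get; set; }
}

public interface IOrderRepository
{
	Task<Order> Get(int orderId);
	Task Save(Order order);
}

public class FinanceOrderPolicy :
	IHandleMessages<OrderPlaced>,
	IHandleMessages<ChargeCreditCardResponse>
{
	private readonly IOrderRepository orders;

	public FinanceOrderPolicy(IOrderRepository orders) => this.orders = orders;

	public async Task Handle(OrderPlaced message, IMessageHandlerContext context)
	{
		// The Order entity itself records where we are in the long running process.
		var order = await orders.Get(message.OrderId);
		order.Status = OrderStatus.ChargeRequested;
		await orders.Save(order);

		await context.Send(new ChargeCreditCardRequest { OrderId = message.OrderId });
	}

	public async Task Handle(ChargeCreditCardResponse message, IMessageHandlerContext context)
	{
		var order = await orders.Get(message.OrderId);
		order.Status = message.Success ? OrderStatus.Paid : OrderStatus.PaymentFailed;
		await orders.Save(order);
	}
}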

Continue Reading

Architecture, NServiceBus

In the previous article, we saw some examples of long running processes. The purpose of this blog post is to show how to model long running processes using choreography or orchestration.

Requirement

To better understand the differences between these two approaches, let’s take a long running process and implement it with both. Since we already talked about the Order Fulfillment enterprise process in the last post, let’s use that.

Order Fulfillment

When a customer places an order, we need to approve it, charge the customer’s credit card, pack the order and ship it.

Choreography

Let’s first implement this requirement with choreography. Choreography is all about distributed decision making. When something important happens in a service (or bounded context), the service will publish an event. Other services can subscribe to that event and make decisions based on it.
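As a rough sketch (the event and handler names are mine, not necessarily those from the repository): Sales publishes OrderPlaced, and Finance decides on its own, inside its own handler, to charge the customer.

using System.Threading.Tasks;
using NServiceBus;

// Published by Sales when something important happens in its bounded context.
public class OrderPlaced : IEvent
{
	public int OrderId { get; set; }
}

public class ChargeCreditCard : ICommand
{
	public int OrderId { get; set; }
}

// Finance subscribes to the event and makes a local decision - nobody tells it what to do.
public class ChargeCustomerWhenOrderPlaced : IHandleMessages<OrderPlaced>
{
	public Task Handle(OrderPlaced message, IMessageHandlerContext context) =>
		context.Send(new ChargeCreditCard { OrderId = message.OrderId });
}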


Continue Reading

Architecture

Most of us are working on distributed systems. Most of us are implementing long running processes. Of course we would like all our long running processes to be:

  • simple
  • fast
  • decoupled
  • reliable
  • easy to implement
  • easy to understand
  • easy to change
  • easy to monitor

But this is impossible, so you need to make trade-offs. This is why it’s important to have the right tool for the job. But much of the information out there describes one tool – RPC-style integration (e.g. services calling each other over the web, through HTTP). And although this is a good tool, it’s not the best tool in every situation. The purpose of this blog post series is to present some message-based patterns that are useful when designing and implementing long running processes.

What is a long running process

First, let’s start with what a process is. A process is a set of operations that are executed in a given order as a result of a trigger.

public Task Handle(PlaceOrder message, IMessageHandlerContext context)
{
	Data.OrderId = message.OrderId;
	Data.TotalValue = message.TotalValue;

	Log.Info($"Placing Order with Id {message.OrderId}");

	RequestTimeout(context, TimeSpan.FromSeconds(1), new BuyersRemorseTimeout());

	return Task.CompletedTask;
}

In this example, the trigger is the PlaceOrder message, and the instructions are in the body of the method.

A long running process is a process that needs to handle more than one message.

// A typical NServiceBus saga declaration for this pair of handlers (names assumed;
// the saga data class and the ConfigureHowToFindSaga mapping are omitted here):
public class PlaceOrderPolicy : Saga<PlaceOrderPolicyData>,
	IAmStartedByMessages<PlaceOrder>,
	IHandleTimeouts<BuyersRemorseTimeout>
{
	public Task Handle(PlaceOrder message, IMessageHandlerContext context)
	{
		Data.OrderId = message.OrderId;
		Data.TotalValue = message.TotalValue;

		Log.Info($"Placing Order with Id {message.OrderId}");

		RequestTimeout(context, TimeSpan.FromSeconds(1), new BuyersRemorseTimeout());

		return Task.CompletedTask;
	}

	public Task Timeout(BuyersRemorseTimeout state, IMessageHandlerContext context)
	{
		context.Publish<IOrderPlaced>(
			o =>
				{
					o.OrderId = Data.OrderId;
					o.TotalValue = Data.TotalValue;
				});

		MarkAsComplete();

		return Task.CompletedTask;
	}
}

As you can see, in the handler of the PlaceOrder message, we set some state (the OrderId and TotalValue) and we request a timeout. In the second handler, when we receive the BuyersRemorseTimeout, we read the state that we saved in the first handler and publish an event.

Long running means that the same process instance will handle multiple messages. That’s it! Long running doesn’t mean long in the sense of time – at least not for people. Such a process could complete in microseconds. Also, a long running process does not need to be actively processing for its entire lifetime. Most of the time, it will probably just wait for the next trigger.

Continue Reading

Quirks

Today I found an interesting quirk of sp_rename: renaming a view, stored procedure, function or trigger will not update the object’s definition that is returned by the OBJECT_DEFINITION function. This is documented, but I think it might take people by surprise.

Example

So, if you first create the following view:

CREATE VIEW dbo.InitialViewName
AS
SELECT Id
FROM dbo.SampleTable

Then, you update its name:

EXEC sp_rename 'InitialViewName', 'UpdatedViewName'

Now, when you get the view’s definition:

SELECT OBJECT_DEFINITION(OBJECT_ID('UpdatedViewName'));

The result might surprise you:

CREATE VIEW dbo.InitialViewName  AS  SELECT        Id  FROM            dbo.SampleTable

Notice it’s still InitialViewName. So, if you rely on OBJECT_DEFINITION in your SQL scripts, you’d better stick to dropping and re-creating these objects instead of renaming them.

Architecture, MSMQ, NServiceBus

Are you working on a distributed system? Microservices, Web APIs, SOA, web server, application server, database server, cache server, load balancer – if these describe components in your system’s design, then the answer is yes. Distributed systems are comprised of many computers that coordinate to achieve a common goal.

More than 20 years ago, Peter Deutsch and James Gosling defined the 8 fallacies of distributed computing. These are false assumptions that many developers make about distributed systems. They are usually proven wrong in the long run, leading to hard-to-fix bugs.

The 8 fallacies are:

  1. The network is reliable
  2. Latency is zero
  3. Bandwidth is infinite
  4. The network is secure
  5. Topology doesn’t change
  6. There is one administrator
  7. Transport cost is zero
  8. The network is homogeneous

Let’s go through each fallacy, discussing the problem and potential solutions.

1. The network is reliable

Problem

Calls over a network will fail.

Most of the systems today make calls to other systems. Are you integrating with 3rd party systems (payment gateways, accounting systems, CRMs)? Are you doing web service calls? What happens if a call fails? If you’re querying data, a simple retry will do. But what happens if you’re sending a command? Let’s take a simple example:

var creditCardProcessor = new CreditCardPaymentService();
creditCardProcessor.Charge(chargeRequest);

What happens if we receive an HTTP timeout exception? If the server did not process the request, then we can retry. But, if it did process the request, we need to make sure we are not double charging the customer. You can do this by making the server idempotent. This means that if you call it 10 times with the same charge request, the customer will be charged only once. If you’re not properly handling these errors, your system is nondeterministic. Handling all these cases can get quite complex really fast.
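A minimal sketch of an idempotent Charge (the ChargeId field and the in-memory store are my own simplifications; real payment providers typically expose an idempotency key for the same purpose):

using System.Collections.Concurrent;

public class ChargeRequest
{
	public string ChargeId { get; set; }   // generated by the client and reused on every retry
	public decimal Amount { get; set; }
}

public class CreditCardPaymentService
{
	// In production this would be a durable store shared by all server instances.
	private readonly ConcurrentDictionary<string, bool> processedCharges =
		new ConcurrentDictionary<string, bool>();

	public void Charge(ChargeRequest request)
	{
		// Already processed? Then the retry is a no-op: the customer is charged at most once,
		// no matter how many times the call is repeated after a timeout.
		if (!processedCharges.TryAdd(request.ChargeId, true))
		{
			return;
		}

		// ... call the card network exactly once for this ChargeId ...
	}
}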

Solutions

So, if calls over a network can fail, what can we do? Well, we could automatically retry. Queuing systems are very good at this. They usually use a pattern called store and forward. They store a message locally, before forwarding it to the recipient. If the recipient is offline, the queuing system will retry sending the message. MSMQ is an example of such a queuing system.

But this change will have a big impact on the design of your system. You are moving from a request/response model to fire and forget. Since you are not waiting for a response anymore, you need to change the user journeys through your system. You cannot just replace each web service call with a queue send.
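For example (a sketch with assumed names, using NServiceBus’s IMessageSession), the web endpoint no longer waits for the payment provider; it sends a command and tells the user the payment is being processed:

using System.Threading.Tasks;
using NServiceBus;

public class ChargeCreditCard : ICommand
{
	public string ChargeId { get; set; }
	public decimal Amount { get; set; }
}

public class CheckoutEndpoint
{
	private readonly IMessageSession messageSession;

	public CheckoutEndpoint(IMessageSession messageSession) => this.messageSession = messageSession;

	public async Task<string> Pay(string chargeId, decimal amount)
	{
		// Fire and forget: store and forward takes care of retries if the recipient is offline.
		await messageSession.Send(new ChargeCreditCard { ChargeId = chargeId, Amount = amount });

		// We don't know the outcome yet, so the user journey has to change accordingly.
		return "Your payment is being processed. We'll send a confirmation shortly.";
	}
}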

Conclusion

You might say that networks are more reliable these days – and they are. But stuff happens. Hardware and software can fail – power supplies, routers, failed updates or patches, weak wireless signals, network congestion, rodents or sharks. Yes, sharks: Google is reinforcing undersea data cables with Kevlar after a series of shark bites.

And there’s also the people side. People can start DDOS attacks or they can sabotage physical equipment.

Does this mean that you need to drop your current technology stack and use a messaging system? Probably not! You need to weigh the risk of failure against the investment that you need to make. You can minimize the chance of failure by investing in infrastructure and software. In many cases, failure is an option. But you do need to consider failure when designing distributed systems.

Continue Reading

Soft Skills

Software development is all about trade-offs. I was watching Dan North’s fantastic presentation, Decisions, decisions, in which he talks about the fact that every decision is a trade-off. This got me thinking about my own decision making process. Every decision we make has its pros and cons. But sometimes we seem to overstate the advantages while understating the disadvantages. I am guilty of this myself. In this blog post I’ll explain some of the mistakes I’ve made while making decisions and provide some tips on how to avoid them.

Common Mistakes in the decision making process

In this section I’ll list some of the flaws that I’ve recognized in my own decision making process at one point or another.

The Any Benefit approach

I first read about the Any Benefit mindset in Cal Newport’s book Deep Work (although I have been guilty of it way before that). This way of thinking leads to justifying a decision with a possible small benefit, ignoring all the negatives. I have been guilty of the any benefit approach many times. It’s easy to read an article on the web and immediately jump on the new trend bandwagon, without proper thought. How many times have you wanted to immediately update to the newest major version of a framework, only for that pesky senior to say it’s too risky? Or to use an alpha release of a new library because it’s cool, and then have to work around breaking changes? I’m not saying these are the wrong decisions, but that you should carefully analyze the trade-offs. Do the pros outweigh the cons?

Rich Hickey stated that “Programmers know the benefits of everything and the trade-offs of nothing”. This is because we focus too much on ourselves, when we should focus more on the quality of the software, its maintainability and its fitness for purpose. This is why I try to think of the main pros and cons of every important decision that I make. Before proposing a new solution, I try to play the devil’s advocate and think about its main disadvantages. This is a great way to show other people that you have thought it through. So, before making your next decision, think about the main reason why you shouldn’t do it.

Relying too much on Best Practices

Some of today’s antipatterns were yesterday’s best practices. Singleton and lazy loading come to mind, but there are many others. The problem with best practices is that they are applicable only in a given context. The Cynefin framework suggests that best practices are a good strategy only in the obvious domain, where the relationship between cause and effect is clear. The problem is that we aren’t always in the obvious domain, so we should change our decision making strategy depending on the domain. Let’s take, for example, the Complicated Domain. This is the domain of “known unknowns”. A good decision making strategy in this domain is Sense – Analyze – Respond. This means that we should assess the facts, analyze the situation and respond by following good practices. This strategy acknowledges that there might be many good options and that a single best practice does not exist.

I’ve been guilty of over-relying on best practices too. Best practices like pair programming, automated testing and code review have their own downsides. I think they are valuable in most contexts, but that doesn’t mean you shouldn’t think about your context before applying them. Sometimes I don’t know the answer to questions like “Why do we use X and not Y?”, even though I helped make that decision. At the time, the choice seemed obvious, so I didn’t really think it through. Even if it’s the right choice, you should still have the right arguments. Answers like “this is a best practice” can seem arbitrary and don’t usually convince people.

Ignoring Cognitive Biases

Although we would like to think that we make only rational decisions, cognitive biases prove otherwise. A cognitive bias is a flaw in our judgment that can lead to irrational decisions. Let’s take a look at two different types of biases: decision making biases and social biases.

Decision making biases

The Confirmation bias is the tendency to interpret information in a way that confirms our preconceptions. As an example, if you want to switch to microservices and do a google search on the advantages of microservices architecture, you’ll find plenty of compelling reasons. But before making the move, you should also search for its drawbacks.

Another bias that you should be aware of is loss aversion. This bias states that people feel losses more deeply than gains of the same value. It is also related to the sunk cost effect – we keep investing in a decision so we don’t lose what we have already invested in it. Basically, we ignore the negative outcomes of a choice because of our fear of losing what we have already invested in that decision. It’s always better to try to be objective and cut your losses early.

I fell victim to this bias while I was researching how to put coarse-grained locks on NServiceBus Saga instances when using NHibernate persistence. After several hours of researching how to implement coarse-grained locks using NHibernate listeners and dirty checking, I noticed something in the NServiceBus documentation: pessimistic locking is enabled by default on saga instances. So, if you use the default settings, you can’t get into an inconsistent state. But, after investing so much time in researching a general solution to the NHibernate coarse-grained lock problem, I was almost inclined to use this alternative, more complex solution with no extra benefits. Fortunately, I noticed that I was a victim of the sunk cost fallacy and deleted the extra code.

Social biases

Social biases can also play an important role in our decision making process. In Your Code as a Crime Scene, Adam Tornhill mentions two social biases that might lead to bad decisions: Pluralistic Ignorance and Inferring the popularity of an opinion based on its familiarity.

Pluralistic ignorance happens when everyone in a group privately rejects a norm, but assumes that the majority accepts it, so they go along with it. An example will probably explain this bias better. Let’s say that you have a large suite of brittle broad-stack tests that fail often. These tests rarely catch production bugs and usually fail because of the test code. Each team member knows that the return on investment for these tests has become negative. But, because they think that everyone else finds these tests valuable, they accept the norm and carry on fixing the tests, without solving the real problem. This is also an example of the sunk cost effect – because of the effort invested in implementing the tests, you don’t want to delete them.

Inferring the popularity of an opinion based on its familiarity is another common social bias. If someone in the team keeps repeating a strong opinion, we come to think that the opinion is more common than it actually is. Most of us work in teams with people who have strong opinions. This is a good thing, as long as we do our own research and support our decisions with data. Your own experience might lead you to make the wrong decision. If you’ve been working in the same context for a long time, groupthink might lead to everyone having the same opinions. This can make team members think that their opinions are more widespread than they really are, because of their familiarity inside the team.

Tips for improving your decision making process

After seeing these common mistakes, what can we do to make better decisions? Here are some tips that I find useful:

  • Be aware of the most common cognitive biases. Knowing them is the first step in overcoming them.
  • Support your decisions with data. Always do your research. Make pros and cons lists. Don’t base decisions only on instinct. Data is much harder to argue against.
  • Play the devil’s advocate. Before making a decision, play the devil’s advocate. Think of the top three reasons why your idea won’t work.
  • Know your context. Be aware of context. The consultant’s answer – “it depends” – is many times the correct answer. A good decision in one context can become a bad decision when applied in a different context. The Cynefin framework is a good place to start if you want to make sense of your context. It can help you use the correct decision making process, based on your domain.
  • Use guiding principles. Every time you make a decision you should consider the high level principles that govern the product. Examples of these are business drivers or architectural principles. You might make a different decision if you’re optimizing for time to market than if you’re optimizing for maintainability.

What are the most common flaws in your decision making process and how do you overcome them?


Books, Quality

In the last couple of months I’ve been learning about what information I can extract from a codebase. I’ve written some articles on how to use NDepend to extract a static view of the system’s quality. But this view is based only on the current state of the codebase. What about source code history? What can it tell us? How has the code changed? These are exactly the kind of questions that Adam Tornhill’s book, Your Code as a Crime Scene: Use Forensic Techniques to Arrest Defects, Bottlenecks, and Bad Design in Your Programs, tries to answer.

Continue Reading

Clean Code

This article recaps how to identify some of the most common code smells using NDepend. The basis of this series is Object-Oriented Metrics in Practice, by Michele Lanza and Radu Marinescu. This book describes (among other things) how you can use several targeted metrics to implement detection strategies for identifying design disharmonies (code smells). You can read a summary of the book and my review in this article.

Detection Strategies

The design disharmonies are split into three categories: Identity Disharmonies, Collaboration Disharmonies and Classification Disharmonies. A detection strategy is a composed logical condition, based on a set of metrics, used to filter suspect code elements.
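As a hedged illustration (the thresholds are arbitrary and the rule is a simplification of the God Class strategy from the book), a detection strategy translates into a CQLinq query – NDepend’s C#/LINQ-based query language – roughly like this:

// <Name>God Class candidates (simplified detection strategy)</Name>
warnif count > 0
from t in JustMyCode.Types
where t.NbLinesOfCode > 200   // very large
   && t.NbMethods > 20        // does too many things
   && t.LCOM > 0.8            // low cohesion between its methods
select new { t, t.NbLinesOfCode, t.NbMethods, t.LCOM }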

Identity Disharmonies

Identity disharmonies affect methods and classes. These can be identified by looking at an element in isolation.

Collaboration Disharmonies

Collaboration Disharmonies affect the way several entities collaborate to perform a specific functionality.

Classification Disharmonies

Classification Disharmonies affect hierarchies of classes.

Conclusion

These detection strategies identify potential culprits. You need to analyze each candidate and decide whether it’s an issue or just a false positive. I ended up adding some more project-specific filters to ignore most of the false positives. Adding some basic where clauses that exclude certain namespace or class name patterns can get you a long way. But, of course, these depend on your specific project and conventions. The beauty of NDepend is that you can update the queries as you wish: add filters, play with the thresholds or add more conditions.

Analyzing a suspect can be done in code, but you can also use other tools. NDepend has some views that can help you with the investigation: Treemaps, the Dependency Graph, the Dependency Structure Matrix, query results. In Object-Oriented Metrics in Practice the authors use Class Blueprints, but I don’t know of a tool that can generate these views for .Net code.

After identifying the issues, you can start refactoring. For some strategies on how to tackle each disharmony or how to prioritize them, I recommend reading the book.