Just Enough Complexity

DirectSolution

How to avoid failure when building large software systems, and the principle of “Just Enough” complexity.

Introduction

One of the challenges for technical, logical, detail-oriented people is accepting that having contrasting objectives is not a failure in planning, but is often necessary for maintaining a balance between conflicting aspects of the business of software. An example is the balance between delivering software features in time to meet demand and producing stable, maintainable software. If either of these equally valid needs outweighs the other by too much, the results are disastrous.

If you ship really bad software, you will lose your customers and be out of business. If you plan the best software ever but never finish and ship it, you will also lose your customers and be out of business.

As technical and intelligent people, we must avoid becoming so focused on the task at hand that we fail to understand the balance to be maintained between productivity and quality, tactical and strategic features, functionality and simplicity, flexibility and supportability, and many more.

Technical and engineering training teaches us that we must consider all of the objectives and factors in planning or engineering a solution, as there is always a tradeoff between the benefit and cost in every choice. There are no cost free benefits in any technical solution, so to ignore the cost and only focus on the benefit leads to failure.

Many frameworks and software patterns greatly increase the complexity of writing line-of-business (LOB) application-specific code, while automating fast generation of partial functionality or “system plumbing” code. These frameworks are often intended to abstract and simplify the basic aspects of building a software application, yet they add layers and abstractions to the process of implementing the application-specific logic. These factors help explain why large software projects so often fail from a business perspective, coming in over budget and under-featured, even though they were developed with flawless engineering “best practices”.

The Standish Group reports that between 2003 and 2012 only 6% of very large software projects were successful, while 42% failed and 52% were challenged. These statistics are staggering, although we must consider that challenged projects might still be successful from a business ROI standpoint.

How can so many smart technologists and engineers fail so often while “doing everything right”?

Years of observing both successful and failed software projects have made obvious the need for lean architecture and engineering practices; otherwise we become too burdened by complexity and dependencies to make easy progress on LOB functionality. Approaches that protect against low-probability outcomes at the cost of great complexity result in inadequately featured and difficult-to-enhance software. We do not question that a sculptor can create faster in clay than in stone, and it is time to realize that the complexity of our technology patterns is the medium of a software creator.

Overview

“…you’re not here to write code; you’re here to ship products” – Jamie Zawinski (Netscape creator)

An ISV (independent software vendor) develops, sells, and supports software in its chosen industry. Our job as software engineers is to produce software that will be delivered to existing and new customers. This software should meet the customers’ needs as simply and directly as possible, be as stable, supportable, and extensible as possible, and be delivered and deployed as quickly as possible. Every one of those objectives is important, and none can be dismissed or minimized in favor of the others.

There are a couple of truisms that we should consider about our role in a software company.

If we have made it harder to create, extend, or review application functionality we are doing something wrong. All of our technology patterns should be simplifying this process, not complicating it, as this is what we are here to do.

We develop code once and revisit it many, many times. It would be no exaggeration to assert that the core source files of a mature LOB application will be opened and examined several thousand times, and altered hundreds or thousands of times, across an expected lifetime of 10+ years.

Consider the implications of complicated and/or confusing code with a simple thought experiment…

Imagine that Mark and Ted each create a new reservation module. Mark’s is direct and obvious code with minimal layers and interfaces; Ted’s is abstract and esoteric with many interfaces and layers. Both provide the same functionality to the user, but working in Ted’s code requires more effort to maintain the inheritance and propagate interface changes to all descendant classes, resulting in an extra hour or two for a given project or defect escalation. On the other hand, Mark’s code was not designed around reuse and is not as technically pleasing.

Which is better? Although Mark’s code was not designed around reuse, he used business objects that can be reused, perhaps at the expense of another couple of weeks when the need arises. So the drawback of Mark’s code is measured in days if the code needs to be refactored for reuse.

The cost of Ted’s code, given a lifetime of 10 years and 2000 enhancement and escalation code reviews at, let’s say, 60 minutes of extra effort each, is 2000 hours, or 50 weeks of developer time at 40 hours per week. Given our business model of supporting and extending our software for a decade or more, complexity is multiplied and becomes unsustainable.
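The arithmetic behind that estimate can be checked with a short sketch. The review count and the extra minutes come from the thought experiment; the 40-hour working week is an assumption used only to convert hours into weeks.

```python
# Figures taken from the thought experiment.
reviews_over_lifetime = 2000     # enhancement and escalation code reviews in 10 years
extra_minutes_per_review = 60    # extra effort caused by the added abstraction

# Assumption for illustration: a 40-hour working week.
hours_per_week = 40

extra_hours = reviews_over_lifetime * extra_minutes_per_review / 60
extra_weeks = extra_hours / hours_per_week

print(f"{extra_hours:.0f} extra hours = {extra_weeks:.0f} developer-weeks")
```

Changing either input shows how sensitive the total is: even 15 extra minutes per review still adds up to more than 12 developer-weeks over the same lifetime.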

“The logic should be straightforward to make it hard for bugs to hide, the dependencies minimal to ease maintenance…” – Bjarne Stroustrup (C++ creator)

Oftentimes complexity comes from applying a pattern or framework as a method to deal with the complexity or uncertainty of the problem domain for which we are attempting to create a software solution. The intent is to decompose the pieces until they become isolated from the whole of a system that we have not yet completely understood or conceived. In truth we are usually only moving complexity around, or deferring it until later. The less we convolute or abstract the LOB application logic we are creating, the easier it is to revisit and extend it later.

“If you can’t explain it simply, you don’t understand it well enough.” – Albert Einstein

Specific Lean Principles and Practices

  • Complexity is your worst enemy

According to recent studies done by Gartner and the Standish Group, only 6% of complex software projects were successfully completed on time, within budget, and delivering the functionality requested. Even ignoring budget overruns, missed deadlines, and reduced features, 42% of complex software projects were written off as a loss. How does this occur with smart people following best practices? The primary answer is unmanageable complexity. We break complex problems into smaller and more manageable pieces. But the complexity of a system is the product of the number of components and the number of dependencies, in addition to the cumulative complexity of the components themselves. The human mind is only capable of holding 4 to 9 elements (or concept “chunks”) in working memory at a time, which is why we decompose complex problems to solve them. But there are classes of problems that can’t be decomposed without losing critical understanding of the systemic nature of the whole.

  • Code for Clarity and Simplicity

We write code once and visit it many times, so it should be self-evident what it is doing. Simple code leads to fewer bugs over time. Clarity and simplicity are more important than elegance, re-usability, or any other abstract software engineering principle. The cost of obscurity or excessive complexity in code is calculated in developer-years and is not corrected by documentation. Our job is to create code that anyone can pick up and work with, and that does not require specialized knowledge to understand. You can change more lines of simple code in less time than fewer lines of complex code.

  • Keep application code simplest

Application code (LOB functionality) is touched an order of magnitude (or two!) more times than framework, tool, and infrastructure code. The application code should be a direct and visible representation of the algorithm fulfilling the business process, and should not be obscured by framework and tool use. Tools, frameworks, patterns, and infrastructure glue/communication code must not get in the way of expressing the application code.

  • All other code serves the application

All other code in the system exists to serve creation of the LOB functionality code. None of the other code exists for its own benefit. Tools, frameworks, components, and infrastructure code should make it easier to express the LOB functionality code. If these other layers are not created with specific usability tuned to how the LOB functionality code will look at things, they are wrong. A perfect abstract representation of the class of problems or functionality to be implemented by our support code is not the goal. Simple and direct support for our specific LOB application is our goal.

  • Common/generic code is sometimes wrong!

When code in the LOB application domain becomes so generic that it is unclear what it will do, or requires expertise and specialized knowledge to work with, then it is wrong. When elaborate or excessive compensation code in the LOB application domain becomes necessary to fill a generic object or structure for common code, it is wrong. When LOB application domain specific business rules must be encoded in common code with different behaviors driven implicitly by data elements, it is wrong. When using common code complicates or significantly extends the coding time required for developing or supporting LOB application functionality, it is wrong.

  • Fewer moving parts is better.

Simplicity at the pattern and architectural level has great benefit to the functional code, which should be no more complex than necessary for the task at hand. Less dependency means fewer permutations are possible and less to go wrong. The faster you can get to the code that does the actual LOB work the better, as you will need to do this over and over during the years of a product’s lifetime.

  • Loosely couple whenever possible.

Tight coupling generates dependency between components in a system. This includes parameterization and returned results as well as versioning. The complexity of a system is a product of the number of components and the number of dependencies between them. Because the number of possible dependencies itself grows with the number of components, complexity increases far faster than linearly, which is why many systems must be managed monolithically despite being divided into components based upon separation of concerns. Loosely coupled components reduce dependency and remove the need to consider other components while working in the code of our target component.
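To make the growth concrete, here is a minimal sketch that uses the components-times-dependencies product as a rough score. It assumes a worst case in which every pair of components shares a dependency, and a loosely coupled case in which each component depends on only one neighbour. The function names are hypothetical, and the score is illustrative rather than a formal metric.

```python
def max_dependencies(components: int) -> int:
    """Worst case: every pair of components shares one dependency."""
    return components * (components - 1) // 2


def complexity_score(components: int, dependencies: int) -> int:
    """Rough score: number of components multiplied by number of dependencies."""
    return components * dependencies


for n in (5, 10, 20):
    tight = complexity_score(n, max_dependencies(n))
    # Loosely coupled assumption: each component depends on one neighbour only.
    loose = complexity_score(n, n - 1)
    print(f"{n:>2} components: tightly coupled {tight:>5}, loosely coupled {loose:>4}")
```

Under these assumptions, going from 10 to 20 components multiplies the tightly coupled score by more than eight, while the loosely coupled score grows roughly fourfold.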

  • Follow an additive only pattern to messages, objects, and methods where possible.

Our goal should be to not break backward compatibility unless absolutely necessary. This allows us to change a producer-level bit of code without having to change every consumer-level bit of code that uses it. Sometimes the consumer will care about the change to the producer, and sometimes not. The goal is to only validate or require the data elements or parameters that we care about as a consumer. If we are intermediary code, we should never fail simply because someone is passing something new through us to an endpoint bit of code elsewhere. This is how we reduce dependencies within the fabric of our platform. Fewer dependencies mean less complexity. Only care about what you care about and let the rest of the fabric mind itself.
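A minimal sketch of this idea follows, using plain dictionaries as messages. The field names, the `billing_consumer` function, and the `intermediary` function are all hypothetical and chosen only for illustration.

```python
from typing import Any

# Hypothetical: the only fields this particular consumer cares about.
REQUIRED_BY_BILLING = ("reservation_id", "amount")


def billing_consumer(message: dict[str, Any]) -> str:
    """Validate only the fields this consumer needs and ignore the rest."""
    missing = [name for name in REQUIRED_BY_BILLING if name not in message]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return f"billed {message['amount']} for {message['reservation_id']}"


def intermediary(message: dict[str, Any]) -> dict[str, Any]:
    """Pass the message through untouched, including fields unknown to us."""
    return dict(message)


# Version 1 of the producer's message.
v1 = {"reservation_id": "R-100", "amount": 120.0}
# Version 2 only adds a field; nothing is renamed or removed.
v2 = {**v1, "loyalty_tier": "gold"}

print(billing_consumer(intermediary(v1)))
print(billing_consumer(intermediary(v2)))  # the new field is simply ignored
```

Because the producer only added a field, neither the intermediary nor the consumer needed to change, while a consumer that does care about `loyalty_tier` can still read it at the endpoint.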

  • Keep domain logic and validation as close to the source as possible.

When creating shared code and lower level provider code we often run into the need to process differently based upon a specific domain based case or data state. Case in point would be applying different rules based upon the customer profile (brand specific rules, etc.). The “smarts” about these things should be kept as close to the source of the domain as possible, and the provider level code should at most accept explicit directives about how to perform its job. For example, the calculate ADR function should take a directive “include maintenance rooms” rather than examine the data received to determine whether or not to include them based upon customer.

Don’t confuse the benefit of separation of concerns with creating non-domain knowledge in our service layers. The further you push domain logic away from its point of origin the more convoluted it becomes trying to serve all possible domains and consumers. The best rule of thumb is to always drive out implicit behavior in your code layers. All sub-LOB domain code should be “I do X because the caller told me to” and never “I do X because Y exists in my data and when Y I should do X”. We want our LOB application level code to be deterministic in its expression.
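The ADR example can be sketched as follows. The `RoomNight` type is hypothetical, and computing ADR as a simple mean of nightly rates is a simplifying assumption made for illustration; the point is only that the caller passes an explicit directive rather than the function inferring behavior from customer data.

```python
from dataclasses import dataclass


@dataclass
class RoomNight:
    """Hypothetical record of one room for one night."""
    rate: float
    is_maintenance: bool = False


def calculate_adr(room_nights: list[RoomNight], include_maintenance_rooms: bool) -> float:
    """Average the nightly rates, following the caller's explicit directive.

    The function never inspects customer or brand data to decide for itself.
    """
    counted = [
        night for night in room_nights
        if include_maintenance_rooms or not night.is_maintenance
    ]
    if not counted:
        return 0.0
    return sum(night.rate for night in counted) / len(counted)


nights = [RoomNight(100.0), RoomNight(200.0), RoomNight(0.0, is_maintenance=True)]

# The LOB-level caller owns the brand rule and states it explicitly.
print(calculate_adr(nights, include_maintenance_rooms=False))
print(calculate_adr(nights, include_maintenance_rooms=True))
```

With this shape, reading the call site is enough to know what the provider will do, which is the “I do X because the caller told me to” behavior described above.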

  • Allow consumer code to manage error states from shared lower level code.

The consumer-level code is typically more likely than the lower-level provider code to know how to respond to a failure at the lower level. Even in cases where the lower-level code knows how to retry, this isn’t necessarily what should occur for the use case. When there are multiple consumers of shared lower-level code this becomes more important, as some applications may need to report and stop where others can safely retry. A UI-based application often needs to deal with errors differently than an automated process hidden on a server. Trying to build recovery or complex permutations of error state management into low-level components quickly becomes complex and non-deterministic from the view of the consumer-level code.
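As a rough sketch of this division of responsibility, the provider below only raises an error, and two different consumers decide what to do with it. All class and function names are hypothetical, and the provider is a stand-in that fails a fixed number of times so the example is self-contained.

```python
class ProviderError(Exception):
    """Raised by shared low-level code, which makes no recovery decisions."""


class RateProvider:
    """Hypothetical provider that fails a fixed number of times, then succeeds."""

    def __init__(self, failures_before_success: int) -> None:
        self.failures_left = failures_before_success

    def fetch_rates(self) -> str:
        if self.failures_left > 0:
            self.failures_left -= 1
            raise ProviderError("rate service unavailable")
        return "rates loaded"


def background_job(provider: RateProvider, max_attempts: int = 3) -> str:
    """Server-side consumer: retrying is safe here, so it retries."""
    for attempt in range(1, max_attempts + 1):
        try:
            return provider.fetch_rates()
        except ProviderError:
            if attempt == max_attempts:
                raise  # out of attempts: let the failure surface
    raise ProviderError("no attempts were made")


def ui_action(provider: RateProvider) -> str:
    """UI consumer: report the failure to the user and stop."""
    try:
        return provider.fetch_rates()
    except ProviderError as error:
        return f"Could not load rates: {error}. Please try again."


print(background_job(RateProvider(failures_before_success=2)))
print(ui_action(RateProvider(failures_before_success=1)))
```

The provider stays simple and predictable, and each consumer applies the policy that fits its own use case.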

  • Internet software must be written as such.

Transmission of data across the internet is much slower than across a local network, and there is latency added to each transaction at the web protocols layer. Minimize data moved in each direction, keeping in mind that direction often determines the data rate. Minimize the number of transactions because of cumulative transaction latency and cumulative risk of transaction failure. You must minimize both the number of transactions and the size of each transaction. Find the best balance possible between consolidating calls and minimizing data transmitted in both directions. Wherever possible, initiate internet transactions only upon a user action that creates the expectation of one. Asynchronous calls can improve perceived performance, but they do not remove the need for these guidelines. Often the user will not be able to complete their task until the call completes, whether it is asynchronous or sequential.
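The trade-off between the number of calls and the size of each call can be illustrated with a deliberately simple model in which every call pays a fixed latency plus its transfer time. The function name and all the numbers are assumptions chosen for illustration, not measurements.

```python
def total_seconds(calls: int, payload_kb_per_call: float,
                  latency_s: float, bandwidth_kb_per_s: float) -> float:
    """Simple model: each call pays a fixed latency plus its transfer time."""
    return calls * (latency_s + payload_kb_per_call / bandwidth_kb_per_s)


# Illustrative assumptions only.
LATENCY_S = 0.2
BANDWIDTH_KB_PER_S = 500.0

# Twenty small calls, one consolidated call, and one call that fetches far too much.
chatty = total_seconds(20, 5, LATENCY_S, BANDWIDTH_KB_PER_S)
consolidated = total_seconds(1, 100, LATENCY_S, BANDWIDTH_KB_PER_S)
bloated = total_seconds(1, 2000, LATENCY_S, BANDWIDTH_KB_PER_S)

print(f"chatty: {chatty:.1f}s, consolidated: {consolidated:.1f}s, bloated: {bloated:.1f}s")
```

In this model the chatty and bloated designs are both roughly ten times slower than the consolidated one, which is why neither the call count nor the payload size can be minimized in isolation.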

  • Code in context.

Not all code is created equal. Shared code should not be refactored unless a better reason exists than an abstract sense of “correctness”. Common code should not be altered unless the implications of the change are understood for all consumers. You can’t anticipate every permutation of bad data or incorrect use of your code, so unless you are writing avionics software, don’t try to code against unlikely possibilities.

  • Apply a new pattern only if it:

    • Fits within the lean philosophy presented in the entirety of this document.
    • Adds complexity or dependency that is outweighed by superior benefit to the specific project objectives.
    • Will make the code simpler and clearer.
    • Has more benefit than simple preference, familiarity, or “cool factor”.
    • Has team consensus that all of the above conditions are met, and the approval of the “Lean Advocate”.

  • Our code belongs to our employer and team

We are writing this code for the next person, not for ourselves. Our employer needs to be profitable and our role is to support this, even when that means making technical decisions based on ROI (Return On Investment) rather than what we consider the best engineering practices in an ideal world. Our job is not to protect ourselves from any possibility of making a mistake; our job is to invest in protecting against mistakes to the degree that makes sense to our employer. We are taught as engineers to consider every possible failure we can imagine. But we must also consider the probability of each possible failure and participate in an ROI discussion about how, or whether, it is addressed.

  • Software should accommodate both beginner and expert users.

Oftentimes the design of our user experience is built around a beginner or an expert exclusively. If we build for beginners, this helps with sales and training, but will make the use of our system tedious and frustrating to experienced users. If we build for experienced users, our software will be confusing to new users, and often to the decision makers at potential customers, who do not actually use the software on a regular basis. To be successful there must be simplicity and a shortest path to functionality, even if this means more than one use path.

  • Software UI should model business process.

Our UI and functionality should model the business process we are automating, rather than model our database or internal implementation details. We should base our design on meeting a customer need in as simple and natural a manner as possible for the business domain. Our target audience is not computer operators or technical analysts; they are non-technical staff working in our target industry.

  • Software should be as simple and natural to use as possible.

Good software does what the user expects. Bad software surprises and confuses the user. Simple software doesn’t surprise users, and does not lead to confusion and support calls, because users feel comfortable when there is an obvious and natural path to their objective. The software should give users enough information to find their way through to their objective, but not overwhelm them with details.