Our Software is Full of Bugs

Scope of this Report

  • What are the potential causes of defects?
  • Process, Culture or Estimation Issue?
  • The Value Tetrahedron
  • Tracking and monitoring defects
  • Conclusions

What are the potential causes of defects?

Various types of defect are discovered in live software, and we need to consider the root cause of each. While most people assume that all defects are the developer’s responsibility, this isn’t always true. Typical causes include:

  • Poor coding which could be caused by:
    • Unclear or incomplete requirements – Process
    • Poor or insufficient unit testing – Culture or Poor estimation
    • Lack of peer review – Culture or Poor estimation
    • Poor estimation leading to rushed development – Poor estimation
    • Tight development window – Culture or Poor estimation
  • Insufficient Testing which could be caused by:
    • Insufficient test coverage – Process, Culture or Poor estimation
    • Lack of regression testing – Process, Culture or Poor estimation
    • Unclear or incomplete requirements – Process
    • Squeezed testing window – Culture or Poor estimation
  • Unexpected Data in the “Live” system which could be caused by:
    • Corrupted live data
    • Unexpected scenarios
  • Change which could be caused by:
    • Volatile requirements – Process, Culture
    • Late change – Process, Culture

Of the four, only “Unexpected data in the Live System” defects can really be considered outside the development project’s capacity to capture or prevent. Even in that case, there are often disagreements between the developers and the business about whether certain scenarios should have been unexpected. Three common themes – Process, Culture and Estimation – run through the remainder.

Process, Culture or Estimation Issue?

As we can see above, any of the recurring themes – Process, Culture or Estimation – can have a significant impact on the project’s performance and therefore on the quality of the release. The tendency is always to blame the developer or the tester for poor quality, but it is often a combination of all three issues.

Process

We need to look at the failure from two perspectives – business and development. From the business viewpoint, we should ask if the following were controlled:

  • Business Change Description – Did the business describe what the new business state will be?
  • The Business Case – Was it clear, quantified and current, and did it match the business change?
  • Communication – Did the business help the development team to understand the business change so that both could articulate which aspects of the backlog provided the highest business value?
  • Backlog Prioritisation and Change Control – Was there a product backlog, or its equivalent, effectively managed and prioritised using techniques such as Value Visualisation? (A simple prioritisation sketch follows this list.)
  • The Risk Management Process – Were risks documented, prioritised and tracked to resolution?
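
As a minimal illustration of the prioritisation idea, the sketch below ranks backlog items by a simple value-to-effort ratio. The item names and figures are invented for this example, and Value Visualisation in practice covers richer techniques than a single ratio.

```python
# A minimal sketch: rank backlog items by value per unit of effort.
# All data below is invented for illustration.
backlog = [
    {"item": "Customer self-service portal", "business_value": 80, "effort": 20},
    {"item": "Internal reporting tweak",     "business_value": 15, "effort": 10},
    {"item": "Regulatory compliance change", "business_value": 60, "effort": 30},
]

# Highest value per unit of effort first.
for entry in sorted(backlog, key=lambda e: e["business_value"] / e["effort"], reverse=True):
    ratio = entry["business_value"] / entry["effort"]
    print(f'{entry["item"]}: {ratio:.1f} value/effort')
```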

Constant change always has a negative impact on quality in waterfall projects. Even in Agile, which is designed to accommodate change, excessive re-work of features can inject defects. Agile is a lean process and aims to eliminate defects at source or, at least, at the earliest opportunity (e.g. before the end of a sprint). While this may appear to constrain productivity at the front end, the end-to-end benefits of early defect detection and removal are huge.

In the technical processes, we need to look at how the code was delivered:

  • Were agile principles adhered to, with constant communication to end users through the business owner?
  • Is adequate time given for an integration sprint if such an activity is planned or needed?
  • Were peer code reviews performed?
  • Was the team’s unit testing robust (see the sketch after this list)?
  • Was there any systems integration testing?
  • Are agile code releases managed within a robust architecture, with continuous integration managed effectively?
  • Is adequate time given for integration testing?
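
As a minimal sketch of what “robust” unit testing can mean, the example below exercises boundary and error cases as well as the happy path. The function under test is hypothetical, invented purely for illustration.

```python
# A minimal sketch of robust unit testing: cover the happy path,
# the boundaries and the invalid inputs, not just one typical case.
import unittest

def apply_discount(price: float, percent: float) -> float:
    """Apply a percentage discount; reject nonsensical inputs."""
    if price < 0 or not (0 <= percent <= 100):
        raise ValueError("invalid price or discount")
    return round(price * (1 - percent / 100), 2)

class TestApplyDiscount(unittest.TestCase):
    def test_typical_case(self):
        self.assertEqual(apply_discount(100.0, 20), 80.0)

    def test_boundaries(self):
        self.assertEqual(apply_discount(100.0, 0), 100.0)
        self.assertEqual(apply_discount(100.0, 100), 0.0)

    def test_invalid_inputs_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(-1.0, 10)
        with self.assertRaises(ValueError):
            apply_discount(100.0, 101)

if __name__ == "__main__":
    unittest.main()
```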

Culture

Now we need to consider the development project culture. With constant demands for everything to be delivered yesterday and at lower cost, ultimately something suffers and, unfortunately, it tends to be quality.

Unless Service Level Agreements or Key Performance Indicators are set with development teams/suppliers around defect rates, the developer (quite rightly) cares more about meeting promised delivery dates than about rigorously testing the code. In Agile, where this problem should not manifest itself, we sometimes see the symptom of continually developing the same thing across multiple sprints as the product owners try to get closer to what is actually needed. One simple defect-rate KPI is sketched below.
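
A minimal sketch of one such KPI is the defect “escape rate”, the share of all defects that reach the live system. The figures are invented for illustration.

```python
# A minimal sketch of a defect-rate KPI that could back an SLA:
# the share of defects that escape into the live system.
# All counts below are invented for illustration.
found_pre_release = 118   # defects caught in unit, system and acceptance testing
found_in_live = 7         # defects reported after go-live

escape_rate = found_in_live / (found_pre_release + found_in_live)
print(f"Defect escape rate: {escape_rate:.1%}")  # e.g. an SLA might target < 5%
```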

Estimating

Unless strong and robust estimating procedures are followed, projects are likely to face schedule pressure, with development teams cutting corners as they are driven to meet (unrealistic) cost and/or schedule targets.

Even an Agile team has only finite resources, and unless the estimate is sound, the chances of delivering the minimum viable scope without cutting corners are small. Put another way, if the product owner’s and the business’s expectations are not carefully managed in an Agile project, then their expectations of what they will receive after x iterations may be unrealistic enough to cause the team to cut validation and verification code rather than functionality.

The Value Tetrahedron

In our careers, we have all come across managers and clients who are obsessed by the “Time, Cost, Quality” triangle when they consider software development and, rightly, insist that it should be possible to strike a balance between the three. There has to be intelligence applied during planning and estimating to get that balance correct.

A great deal of time is spent looking at how one can maintain quality while also reducing cost and shortening duration. For example, costs are often driven down by project teams being made to work unpaid overtime, or else to cut corners in order to deliver by unrealistic dates. The software goes live and stays up (more or less). The PM happily moves on to his next project. The commissioning (client) manager moves on to the next step on her career ladder. The maintenance team is paid to try to keep the defect backlog down to a manageable size (usually defined as a reasonable level of complaints from the customers) for less cost and in less time! In short, in many organisations, the people who cause the bugs aren’t held accountable as the defect backlog and the problems in the code mount up. One manifestation of this is Technical Debt.

Technical debt is the dimension missed so frequently when we look at software development. Somehow the downstream consequences of business-driven decisions are overlooked in the heat of “getting it done”, often tinged with the knowledge that “it won’t be my problem!” when the decision is made to go for it and to hell with the consequences.

If the volume of the tetrahedron is the TCO (total cost of ownership), then controlling that growth has to be the focus of system support. Effective control only happens where programme teams and senior commissioning managers make sure that any decisions they make about enhancements will not inflate the TCO beyond controllable limits. The toy model below shows how quickly unmanaged growth compounds.
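
As a toy illustration (all figures invented), the sketch below compounds a yearly “debt interest” on maintenance cost to show how unmanaged technical debt can inflate TCO over a system’s life.

```python
# A toy model, with invented figures, of TCO growth under unmanaged
# technical debt: each year's maintenance cost grows by a "debt interest"
# factor unless effort is spent paying the debt down.
build_cost = 500_000
base_maintenance = 60_000
debt_interest = 0.15   # yearly growth in maintenance effort from accumulated debt
years = 5

tco = build_cost
maintenance = base_maintenance
for year in range(1, years + 1):
    tco += maintenance
    maintenance *= (1 + debt_interest)

print(f"Projected {years}-year TCO: £{tco:,.0f}")
```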

The price of defects hitting the live system is a high TCO and unhappy clients.

Tracking and monitoring defects

The final area to consider is effective tracking and control of the project, and that includes defects. Continuous review of the number of defects discovered and the number still outstanding helps determine the efficacy of testing and the quality of the deliverables before the software goes “live”.

Consider the Defect Discovery Curve.

Figure 1: Defect Discovery Curve

Figure 1 shows the typical S-curve shape (albeit with fictional data) that we would expect for defect discovery in a delivery project. It is a key metric to monitor and is often a strong indicator of when the code is fit to release; the earlier you release, of course, the more defects you should expect to surface in live. Estimation toolsets will generate predictive models based on historical data and can predict when a project is likely to have discovered 95% or 99% of all defects introduced. The sketch below shows the idea behind such a model.
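
As a minimal sketch of how such a prediction can work, the example below fits a logistic S-curve to invented weekly cumulative defect counts and estimates the week by which 95% of all defects should have been discovered. Real toolsets use richer models and calibrated historical data.

```python
# A minimal sketch: fit a logistic S-curve to cumulative defect counts
# and estimate when 95% of all defects will have been discovered.
# The weekly counts below are invented for illustration.
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, r, t0):
    """Cumulative defects: K = total defects, r = discovery rate, t0 = midpoint."""
    return K / (1.0 + np.exp(-r * (t - t0)))

weeks = np.arange(1, 13)
cumulative_defects = np.array([3, 7, 15, 28, 47, 70, 90, 104, 113, 118, 121, 123])

# Fit the curve; p0 is a rough initial guess to help convergence.
(K, r, t0), _ = curve_fit(logistic, weeks, cumulative_defects, p0=[130, 0.5, 6])

# Solve logistic(t) = 0.95 * K for t: the week by which we expect
# 95% of all introduced defects to have been discovered.
week_95 = t0 + np.log(0.95 / 0.05) / r
print(f"Estimated total defects: {K:.0f}")
print(f"95% expected to be found by week {week_95:.1f}")
```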

Finally, it is important to investigate where a defect was introduced and where it was discovered: a design defect not found until acceptance testing is much more costly to fix than one caught in design review (see the comments on Agile above). Defect root cause analysis can highlight process issues in specific areas, which can improve future performance. A simple sketch of this kind of analysis follows.
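
As a minimal sketch (the sample data and cost multipliers are invented, though they echo the common finding that late-found defects cost far more to fix), the example below tallies defects by origin and weights each by the phase in which it was found.

```python
# A minimal sketch of weighting defects by where they were introduced
# versus where they were found. All data and multipliers are illustrative.
from collections import Counter

# (phase_introduced, phase_found) pairs; invented sample data.
defects = [
    ("design", "design_review"), ("design", "acceptance_test"),
    ("coding", "unit_test"), ("coding", "system_test"),
    ("requirements", "live"),
]

# Rough relative cost of fixing a defect by the phase in which it is found.
relative_cost = {"design_review": 1, "unit_test": 5, "system_test": 10,
                 "acceptance_test": 15, "live": 30}

by_origin = Counter(introduced for introduced, _ in defects)
total_cost = sum(relative_cost[found] for _, found in defects)
print("Defects by origin:", dict(by_origin))
print("Total relative fix cost:", total_cost)
```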

Conclusions

Improving software project quality requires an organisation to commit resources to the development and execution of a well-defined software practice.

Strong causal analysis of defects and their origin helps prevent future problems.

Governance of change is important: if you know what the business is going to look like at the implementation of the project, then the project can control change and is much more likely to succeed.

The collection of key metrics and the review of data quality will help reduce project failure.

Strong governance, realistic expectations and close communication between the client and supplier will help ensure success.
