We had done it: We had built a testable system. We achieved high observability through monitoring, logging and alerting. We instituted controllability by using feature flags. We understood the build through pairing and cross-discipline-generated acceptance test automation. We aimed for decomposability by using small, well-formed services. But our pipeline was still failing — and failing often.
So what went wrong?
We had a horrible external dependency. The service suffered from frequent downtime and slow recovery. Most painful of all, creating test data required testers to raise a manual request through a ticketing system. We were dependent on a system with low testability, which undermined our own testability. And this had consequences for our flow of work to our customers.
In this article, I will cover how to address such dependencies and engage with the teams that maintain them. In particular, you can:
- Enhance observability by adding key events from your dependencies to your own logging, monitoring and alerting.
- Add controllability to your applications and share it, to foster a culture of collaboration with your dependencies.
- Build greater empathy with the teams that provide your dependencies. They have their own problems to deal with, and greater understanding will bring teams closer together.
How testability affects flow
Testability has a tangible relationship with the flow of work. If a system and its dependencies are easy to test, work will usually flow smoothly through its pipeline. The key difference is that all disciplines are likely to get involved in testing when testing is easy. But if a system and its dependencies are hard to test, you're likely to see a queue of tickets in a "Ready to Test" column, or even a testing crunch before release.
To achieve smooth flow, treat the testability of your dependencies as seriously as your own.
What is adjacent testability?
This term refers to how testable the systems you depend upon to provide customer value are. These are the systems you need to integrate with to complete a customer journey. For example, if your system relies on a payment gateway that suffers from low testability, your end-to-end tests may fail often, making release decisions problematic for stakeholders. Most systems integrate with other internal and external systems. Value generation occurs when those systems work in concert, allowing customers to achieve their goals.
When considering flow of work, I often reference Eli Goldratt's Theory of Constraints. Goldratt discusses two types of optimization that apply to testability:
- Local - changes that optimize one part of the system without improving the whole system.
- Global - changes that improve the flow of the entire system.
If you optimize your own testability but neglect your dependencies, you have only a local optimization. That means your flow rate can be no higher than your tightest bottleneck allows. With the horrible dependency I described above, new test data took a week to create. This created other challenges. For example, we had to schedule the creation of the data in advance of the work. And when that work was no longer the highest priority, we had wasted time and energy.
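The bottleneck point can be made concrete with a little arithmetic. The sketch below assumes each stage of a pipeline has a rate in work items per week; the stage names and numbers are invented for illustration, apart from test data taking a week to create.

```python
def pipeline_throughput(stage_rates: dict[str, float]) -> tuple[str, float]:
    """Return the slowest stage and its rate; the pipeline cannot flow faster."""
    bottleneck = min(stage_rates, key=stage_rates.get)
    return bottleneck, stage_rates[bottleneck]


# Hypothetical rates, in work items per week.
rates = {
    "develop": 5.0,
    "create test data": 1.0,  # the dependency: one request takes a week
    "test": 4.0,
    "deploy": 10.0,
}

print(pipeline_throughput(rates))

# A local optimization: doubling our own test speed changes nothing overall.
rates["test"] = 8.0
print(pipeline_throughput(rates))
```

Both calls report the test data stage as the limit, which is why improving only your own stages leaves the overall flow unchanged.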
How to improve adjacent testability
Establishing that you may have an adjacent testability challenge is one thing; determining what to do about it is another. You could argue that if a dependency is hard to test, it's not your problem. External dependencies might have contractual constraints for reliability, such as Service Level Agreements. But contracts and reality can be far apart, and in my experience service level agreements are not very effective change agents. Instead, try engaging in the following ways:
Observability and information flow
Enhance observability to provide real feedback about your interactions with dependencies, rather than logging only your own system events. Interactions with dependencies are part of a journey through your system. Both internal events and dependency interactions should be written to your application logs, exposing the full journey. Replicate this pattern for both production and test environments. The key benefit: You'll provide context-rich information that the people who maintain that dependency can act upon.
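One way to get dependency interactions into the same log as your internal events is to route every outbound call through a small wrapper. This is a minimal sketch, not a prescribed implementation: the logger name, the `call_dependency` helper and the key=value message format are all assumptions.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Hypothetical logger name; use whichever logger your application already has.
logger = logging.getLogger("app.journey")


def call_dependency(name: str, operation: str, func: Callable[[], T]) -> T:
    """Run func (a call to an external dependency) and log how it went."""
    started = time.monotonic()
    try:
        result = func()
    except Exception as exc:
        elapsed_ms = (time.monotonic() - started) * 1000
        logger.error(
            "dependency=%s operation=%s outcome=error error=%r elapsed_ms=%.1f",
            name, operation, exc, elapsed_ms,
        )
        raise  # the caller still sees the original failure
    elapsed_ms = (time.monotonic() - started) * 1000
    logger.info(
        "dependency=%s operation=%s outcome=ok elapsed_ms=%.1f",
        name, operation, elapsed_ms,
    )
    return result
```

A call might look like `call_dependency("payment-gateway", "charge", lambda: client.charge(order))`, where `client` and `order` stand in for your own objects. Because the wrapper uses the standard logging module, the same code works unchanged in test and production environments.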
For example, after integrating an internal application with an external content delivery API, we had issues with triggering our request rate limit. We believed the rate limit was triggering too early, as only requests that missed the cache should have counted toward it. We added the external interactions to our internal application logs, noted that certain more frequent requests needed a longer cache expiry, and worked with the external team to solve the problem.
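Once dependency interactions are in your logs, finding requests like these can be as simple as counting cache misses per resource. The sketch below assumes each logged interaction records a path and a cache-hit flag; the `Interaction` type, its fields and the function name are hypothetical and do not reflect the API from the story.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Interaction(NamedTuple):
    path: str        # requested resource (hypothetical field)
    cache_hit: bool  # whether the dependency served it from cache


def cache_miss_hotspots(
    events: Iterable[Interaction], top: int = 3
) -> list[tuple[str, int]]:
    """Return the paths with the most cache misses, highest first."""
    misses = Counter(e.path for e in events if not e.cache_hit)
    return misses.most_common(top)
```

Paths at the top of that list are candidates for a conversation with the dependency's team about cache expiry, backed by data they can act on.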
Controllability and collaboration
Controllability is at its best when it is a shared resource. Sharing it encourages early integration between services and, importantly, a dialogue between teams. Feature toggles for new or changed services allow for early consumption of new features, without threatening current functionality. Earlier integration testing between systems addresses risks earlier and builds trust.
As an example, when upgrading a large-scale web service by two major versions of PHP, our test approach included a feature toggle that redirected traffic to a small pool of servers running the latest version of PHP for that service. Normal traffic went to the old version, while our clients tested their integrations on the new one. We provided an early, transparent view of a major change, and clients integrated with it while we also tested for changes internally.
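A toggle of this kind can be very small. The sketch below shows the routing decision only, under the assumption that clients opt in by identifier; the class, the pool names and the opt-in mechanism are illustrative and are not a description of how the original service did it.

```python
from dataclasses import dataclass, field


@dataclass
class UpgradeToggle:
    """Routes opted-in clients to the upgraded server pool."""

    old_pool: str = "pool-current"   # hypothetical pool names
    new_pool: str = "pool-upgraded"
    opted_in: set[str] = field(default_factory=set)

    def enable_for(self, client_id: str) -> None:
        self.opted_in.add(client_id)

    def disable_for(self, client_id: str) -> None:
        # discard() is a no-op if the client never opted in
        self.opted_in.discard(client_id)

    def pool_for(self, client_id: str) -> str:
        # Everyone stays on the old pool unless they have opted in.
        return self.new_pool if client_id in self.opted_in else self.old_pool
```

Defaulting to the old pool keeps current functionality safe, and `disable_for` gives a client a way back if their integration test uncovers a problem.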
Empathy and understanding
Systems are not the only interfaces that need to improve in order to improve adjacent testability. How you empathize with the other teams you depend on needs attention too. Consuming the monitoring, alerts and logs they receive into your own monitoring, alerting and logging setup helps a great deal.
The instance I often reflect on comes from a platform migration project. A Database Administrator I worked with was often obstructive, insisting on tickets being raised for every action. So I added myself to the service disruption alerts email list for that team. It turned out that batch jobs we had set up often failed because we had not considered disk space for temporary files, and the failures were waking him up at night with alerts. It was a small fix for us, but huge for him. I never had a problem getting data moved or created after that.
Taking a collaborative approach to improving the testability of your dependencies will result in a significant testability improvement for your own system. Keep these principles in mind:
- Observability and information flow where the whole journey is the aim, including dependencies.
- Controllability and collaboration to encourage early integration and risk mitigation.
- Empathy to understand the problems and pain of those who maintain your dependencies.
As a first step, try to build a relationship with the teams that you depend upon. Understanding their challenges and how you might be able to assist can unlock large testability gains for you and your team.
This post was originally published by the great people at TechBeacon here: https://techbeacon.com/testers-guide-overcoming-painful-dependencies