The State of AI in Software Testing
Examining the Progress, Pitfalls, and Potential of Artificial Intelligence in 2024
Over the past 24 months, interest in artificial intelligence and software testing has surged, with a 3x increase in Google searches for related topics. Despite this spike in curiosity, many organizations struggle to harness AI effectively, whether for testing software or for testing AI-driven systems themselves. According to a recent Leapwork report, while 79% of respondents use AI-augmented testing tools and 74% anticipate increased investments in AI testing tools, only 16% believe they have an efficient testing process for these tools. Moreover, 68% of respondents using AI systems reported issues with performance, accuracy, and reliability, while 78% agree that AI applications need better testing. Why does this disconnect exist?
AI Testing Tools: Do They Solve Our Problems Today?
AI testing tools undoubtedly have the potential to revolutionize our approach to testing, but do they solve the problems that many organizations experience with software testing, and are these tools really ready right now?
1. Inefficient Testing Processes Pre-AI Adoption
The adoption of AI doesn't automatically solve fundamental problems in software testing. Many organizations struggle because they lack an efficient process before incorporating AI. As Leapwork's study suggests, simply adding AI tools to the mix cannot make up for inadequate testing methodologies or skill deficiencies. Without a robust foundation, these tools may amplify inefficiencies rather than resolve them.
2. Misrepresentation of “AI” in Tools
The hype around AI has led some vendors to brand their tools as “AI-powered” when they offer only basic automation or rudimentary machine learning capabilities. Our team at TTC Global conducted a thorough analysis of the AI testing tools landscape and found that many tools exaggerated their AI functionalities. While some genuinely incorporate advanced AI/ML, many provide only shallow implementations, resulting in underwhelming performance in real-world scenarios.
3. Mismatch Between AI Use Cases and Tools
Even when AI tools are adopted, their benefits vary depending on how well they align with specific use cases. For instance, a McKinsey report highlights how generative AI tools like Co-Pilot can reduce development time by up to 50% for certain tasks. Similarly, TTC Global's research found that commercial co-pilots boosted productivity by 10-30% when building test automation with open-source tools. However, these gains are often limited to specific tasks. In our landscape analysis, we identified over ten use cases where we believe AI can benefit software testing, but found that more than half lacked enterprise-ready solutions.
Testing AI Systems: A New Frontier
AI systems present unique challenges for testing that go beyond traditional approaches, necessitating a tailored strategy and toolset.
1. Testing AI Systems is Fundamentally Different
Large machine learning models often exhibit emergent behavior, where small input changes lead to unexpected and complex outcomes. This makes it difficult to isolate the effects of changes, while the lack of transparency in AI decision-making increases the risk of unintended consequences. Additionally, AI systems frequently display non-determinism, meaning that they may not consistently produce the same output for identical inputs. Traditional pass/fail testing becomes inadequate, requiring repeated tests and additional safeguards to identify critical defects. Lastly, qualitative assessments are essential, as AI systems must be evaluated for clarity, maintainability, and efficiency—factors that go beyond accuracy and require more nuanced evaluation metrics.
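To make the non-determinism point concrete, here is a minimal sketch (in Python, written as a pytest-style test) of one way to adapt pass/fail testing: invoke the system under test repeatedly and assert a pass-rate threshold rather than a single exact result. The `generate_summary` function and the keyword-based `meets_expectations` check are hypothetical placeholders, not a specific tool's API; a real team would substitute its own AI component and evaluation criteria.

```python
import random

# Minimal sketch: testing a non-deterministic AI component against a
# pass-rate threshold instead of a single pass/fail assertion.

def generate_summary(text: str) -> str:
    """Hypothetical stand-in for a non-deterministic model call (e.g. an LLM API).
    Replace with the real system under test; the randomness here only simulates
    harmless wording variation between runs."""
    phrasing = random.choice(["full refund", "complete refund"])
    return f"The refund policy allows returns within 30 days for a {phrasing}."

def meets_expectations(summary: str) -> bool:
    """Hypothetical evaluation criterion: a simple keyword and length check.
    In practice this could be a rubric score, a similarity metric, or human review."""
    return "refund policy" in summary.lower() and len(summary) < 500

def test_summary_pass_rate():
    source_text = "Customers may return items within 30 days for a full refund..."
    runs = 20  # repeat the check because identical inputs can produce different outputs
    passes = sum(meets_expectations(generate_summary(source_text)) for _ in range(runs))
    pass_rate = passes / runs
    # Assert a threshold rather than expecting every single run to succeed.
    assert pass_rate >= 0.9, f"Only {pass_rate:.0%} of runs met expectations"
```

The same structure extends to the qualitative assessments mentioned above: swap the keyword check for a rubric score or similarity metric, and report the distribution of scores rather than a binary outcome.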
2. AI Testing Skills Gaps
A significant barrier to successful AI testing is the skills gap. While 81% of IT professionals believe they can use AI, only 12% have the expertise to do so effectively (AI Skills Report 2024). Testing AI systems requires a unique set of skills beyond traditional software testing. Professionals must understand AI/ML models, emergent behavior, and how to craft appropriate tests for non-deterministic systems. Furthermore, testers must be adept at both quantitative and qualitative assessments, as they need to evaluate the broader context of AI performance, including transparency and ethical implications. Without these skills, organizations struggle to develop and implement effective AI testing strategies.
3. Unrealistic Expectations
Many organizations also face unrealistic expectations when it comes to AI’s capabilities. AI is often seen as a silver bullet, leading to disappointment when systems fail to meet exaggerated expectations. Misalignment between what AI tools can realistically provide and what organizations expect from them often results in underperformance and unmet objectives (https://www.thoughtworks.com/insights/blog/generative-ai/gen-ai-mismatched-expectations). For example, while AI can help automate routine tasks or provide recommendations, it cannot replace the need for human oversight or guarantee perfect accuracy in complex, evolving environments. Organizations must set realistic goals and invest in understanding AI’s true potential before expecting substantial returns on their AI investments.
The Path Forward
To truly harness the power of AI in software testing, organizations must first lay a strong foundation for their testing processes. AI tools are not a panacea; they require careful planning, appropriate use cases, and the necessary skills to fully unlock their potential. Likewise, to test AI systems effectively, organizations must carefully consider the unique challenges involved and plan their approach accordingly.
I’m currently leading an effort at TTC Global to continue exploring the AI testing landscape and ensure that tools and approaches deliver real value. I would love to hear from you about your successes and challenges.