Who is this for? SDETs, developers, and DevOps engineers integrating ContextQA with AI coding assistants (Claude, Cursor) or CI/CD pipelines.
This guide explains how AI agents should use the ContextQA MCP tools effectively — what order to call tools in, how to handle asynchronous execution, how to debug failures, and how to orchestrate complex multi-step workflows.
Whether you are building a custom agent, prompting Claude or Cursor, or writing an automated CI script, the patterns in this guide will help you get reliable results.
Core Workflow Pattern
Every test automation workflow with ContextQA follows the same fundamental loop:
```
1. create_test_case(url, task_description, name)
     ↓ returns test_case_id
2. execute_test_case(test_case_id)
     ↓ returns execution portal URL + number_of_executions
3. get_execution_status(test_case_id, number_of_executions)
     ↓ poll every 15-30 seconds until PASSED or FAILED
4. (if PASSED)
     → report success, retrieve results for evidence
5. (if FAILED)
     → get_test_case_results(execution_id)
     → get_root_cause(execution_id)
     → decide: fix_and_apply, create_defect_ticket, or update the test
```
The execution step is asynchronous — ContextQA runs the test in a cloud browser environment, which may take anywhere from a few seconds (for short tests) to several minutes (for complex multi-page flows). Your agent must poll for completion rather than expecting an immediate result.
Example: Simple Test Creation and Execution
Consider a natural-language request such as: "Create a test for https://myapp.com that logs in, opens the Reports section from the sidebar, and verifies at least one report exists. Name it 'Reports - Basic Smoke', execute it, and tell me if it passed." An agent using Claude with the ContextQA MCP server can handle this end to end.
The agent will internally:
1. Call create_test_case with the URL, task description, and name
2. Receive a test_case_id in the response
3. Call execute_test_case with that ID
4. Poll get_execution_status until the status is PASSED or FAILED
5. Report the result back to you with a screenshot or step summary
You do not need to specify any of the underlying API calls — the agent handles all of that based on your natural language request.
Decision Tree: Choosing the Right Tool
Use this reference when building or prompting an agent to know which tool to invoke for a given scenario.
Creating Tests
| Goal | Tool to Use |
| --- | --- |
| New test from a task description | create_test_case |
| Tests from a Jira or Azure DevOps ticket | generate_tests_from_jira_ticket |
| Tests from a Figma design file | generate_tests_from_figma |
| Tests from an OpenAPI / Swagger spec | generate_tests_from_swagger |
| Tests from a video screen recording | generate_tests_from_video |
| Tests from an Excel or CSV file | generate_tests_from_excel |
| Tests from a requirements document | generate_tests_from_requirements |
| Tests for a specific git diff / PR | generate_tests_from_code_change |
| Tests for an n8n workflow | generate_contextqa_tests_from_n8n |
| Edge cases for an existing feature | generate_edge_cases |
| Close an identified coverage gap | generate_tests_from_analytics_gap |
Finding Existing Tests
| Goal | Tool to Use |
| --- | --- |
| Search for tests by feature name | query_contextqa |
| List all test cases | get_test_cases |
| Get steps for a specific test | get_test_case_steps |
| Find tests impacted by a code change | analyze_test_impact |
Running Tests
| Goal | Tool to Use |
| --- | --- |
| Run a single test case | execute_test_case |
| Run an entire test suite | execute_test_suite |
| Run a full test plan | execute_test_plan |
| Re-run a test plan (after fixes) | rerun_test_plan |
| Run a performance load test | execute_performance_test |
| Run a DAST security scan | execute_security_dast_scan |
Analyzing Results
| Goal | Tool to Use |
| --- | --- |
| Check if an execution completed | get_execution_status |
| Get the full result object | get_test_case_results |
| Get per-step details with screenshots | get_test_step_results |
| Get AI root cause analysis | get_root_cause |
| Get step-level browser screenshots | get_execution_step_details |
| Get network traffic for a run | get_network_logs |
| Get browser console errors | get_console_logs |
| Get Playwright trace viewer URL | get_trace_url |
| Get AI confidence scores per step | get_ai_reasoning |
Handling Failures
| Goal | Tool to Use |
| --- | --- |
| Push failure to Jira/ADO | create_defect_ticket |
| Get locator fix suggestions | get_auto_healing_suggestions |
| Apply a healing suggestion | approve_auto_healing |
| Apply a code-level fix | fix_and_apply |
| Reproduce a bug from a ticket | reproduce_from_ticket |
| Deep investigation of a failure | investigate_failure |
Migrating from Other Frameworks
| Goal | Tool to Use |
| --- | --- |
| Analyze an existing test repo | analyze_test_repo |
| Migrate tests to ContextQA | migrate_repo_to_contextqa |
| Export ContextQA tests to Playwright | export_to_playwright |
Polling for Execution Completion
Executions are asynchronous. When you call execute_test_case, the response gives you enough information to poll for the result — do not assume the test is complete immediately.
Recommended polling strategy: call get_execution_status on a fixed interval with a hard overall timeout, so the agent never waits indefinitely (a complete Python example appears at the end of this guide). Recommended polling intervals:
Simple tests (1-5 steps): poll every 15 seconds
Medium tests (5-20 steps): poll every 30 seconds
Complex tests (20+ steps) or mobile tests: poll every 60 seconds
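The interval guidance above can be encoded as a small helper. The thresholds come directly from the list; the function itself is an illustration, not part of the ContextQA API:

```python
def polling_interval(step_count: int, is_mobile: bool = False) -> int:
    """Poll interval in seconds, per the guidance above."""
    if is_mobile or step_count > 20:
        return 60  # complex or mobile tests
    if step_count > 5:
        return 30  # medium tests
    return 15      # simple tests

print(polling_interval(3), polling_interval(12), polling_interval(25))
# → 15 30 60
```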
For test plan executions, use get_test_plan_execution_status and apply the same polling pattern.
Multi-Step Orchestration: The Bug Fix Loop
When a test fails in production or CI, an agent can orchestrate a complete bug triage and fix workflow:
Step 1: Retrieve the Failure Details
Call get_test_case_results with the execution ID. This returns the complete result object: which steps passed, which step failed, the error message, and the screenshot URL for the failing step.
Step 2: Get AI Root Cause Analysis
Call get_root_cause with the execution ID. The AI analyzes the screenshots, video, DOM state, and network logs, then returns a plain-English explanation of why the test failed (an example appears at the end of this guide).
Step 3: Create a Defect Ticket
Call create_defect_ticket with the execution ID. ContextQA creates a Jira (or Azure DevOps) issue with:
The failure screenshot attached
The step that failed and the error message in the description
A link to the ContextQA execution for the full video and trace
Step 4: Check for Self-Healing Suggestions
Call get_auto_healing_suggestions. If the failure was caused by a changed UI element (the button moved, was renamed, or had its locator modified), ContextQA proposes an automatic fix with a confidence score.
Step 5: Apply the Healing
Call approve_auto_healing to accept the suggestion. The locator is updated in the test case definition automatically; no manual editing is required.
Step 6: Re-Run to Verify
Call execute_test_case again to confirm the fix resolved the failure.
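Steps 2 through 6 can be composed into a single triage routine. The sketch below injects the tool calls as plain functions so it can run with stubs; the payload shapes (confidence, id, summary keys) are assumptions, not the documented ContextQA response format.

```python
def triage_failure(execution_id, tools, min_confidence=0.8):
    """Heal automatically when confidence is high; otherwise file a ticket."""
    cause = tools["get_root_cause"](execution_id)
    suggestions = tools["get_auto_healing_suggestions"](execution_id)
    best = max(suggestions, key=lambda s: s["confidence"], default=None)
    if best is not None and best["confidence"] >= min_confidence:
        tools["approve_auto_healing"](best["id"])
        return "healed"
    tools["create_defect_ticket"](execution_id, summary=cause["summary"])
    return "ticketed"

# Example run with stubbed tools:
stub_tools = {
    "get_root_cause": lambda eid: {"summary": "Submit button disabled"},
    "get_auto_healing_suggestions": lambda eid: [{"id": "h1", "confidence": 0.92}],
    "approve_auto_healing": lambda hid: None,
    "create_defect_ticket": lambda eid, summary: None,
}
print(triage_failure("5678", stub_tools))
# → healed
```

The confidence threshold is the key design choice: below it, the agent escalates to a human via a defect ticket instead of silently rewriting the test.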
Deep Telemetry for Debugging
Every test execution produces a complete evidence package. When an AI agent is investigating a failure, it should pull all available telemetry before drawing conclusions.
Screenshots per Step
Calling get_execution_step_details returns a list of step objects, each containing:
Step number and description
Pass/fail status
Screenshot URL (hosted in S3, publicly accessible)
Execution duration in milliseconds
Whether auto-healing was applied
Network Traffic
Calling get_network_logs returns the full HAR-format log of every HTTP request and response during the test run. Use it to identify:
Failed API calls (4xx, 5xx responses)
Missing requests (a POST that should have fired but did not)
Slow responses that caused timeouts
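Assuming get_network_logs returns standard HAR JSON (a log object with entries, each carrying request and response fields), the failed-call check can be sketched as:

```python
def failed_requests(har: dict) -> list:
    """Return (url, status) for every 4xx/5xx response in a HAR log."""
    return [
        (entry["request"]["url"], entry["response"]["status"])
        for entry in har["log"]["entries"]
        if entry["response"]["status"] >= 400
    ]

# Minimal fabricated HAR for illustration:
sample = {"log": {"entries": [
    {"request": {"url": "https://api.example.test/ok"}, "response": {"status": 200}},
    {"request": {"url": "https://api.example.test/boom"}, "response": {"status": 500}},
]}}
print(failed_requests(sample))
# → [('https://api.example.test/boom', 500)]
```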
Browser Console
Calling get_console_logs returns all browser console output: errors, warnings, and console.log statements. This is useful for catching JavaScript exceptions that caused UI state failures.
Playwright Trace
Calling get_trace_url returns a URL for viewing the Playwright trace at trace.playwright.dev. The trace shows:
Exact DOM snapshot at every step (before and after each action)
Full network waterfall timeline
Console output synchronized with steps
Screenshots at each action point
This is the most detailed debugging artifact available and should be used when other telemetry does not reveal the root cause.
AI Reasoning
Calling get_ai_reasoning returns the AI's internal reasoning for each step: how confident it was in locating the element, which locator strategy it used, and why it made specific decisions. This is useful for understanding flaky tests where the AI sometimes finds an element and sometimes does not.
Generating Tests Automatically in CI
A common pattern is to run ContextQA test generation as part of a pull request review workflow. When a developer opens a PR:
1. Extract the git diff
2. Call generate_tests_from_code_change with the diff and the staging URL
3. ContextQA generates tests targeting the changed flows
4. Call execute_test_case on each generated test
5. Report results back to the PR as a comment
A GitHub Actions pseudocode step implementing this sequence appears at the end of this guide.
Migrating an Existing Test Suite
If you have an existing Playwright, Cypress, or Selenium test repository and want to migrate it to ContextQA:
Step 1: Analyze the Repository
Run analyze_test_repo against the repository. The analysis returns:
Total number of test files and test cases found
Test framework detected (Playwright, Cypress, Selenium, etc.)
Estimated complexity
Any patterns that may require special handling during migration
Step 2: Migrate
Call migrate_repo_to_contextqa. ContextQA reads each test file, converts the code-based steps to natural-language NLP steps, and creates the corresponding test cases in your workspace. The migration report shows how many tests were imported successfully and flags any that required manual review.
Step 3: Verify
After migration, run the imported suite with execute_test_suite and review the results to confirm the migrated tests reproduce the behavior of the originals.
Custom Agents and Knowledge Bases
For teams that need consistent behavior across many tests — for example, always dismissing a cookie consent banner, or always using specific test credentials on a payment page — ContextQA provides custom agents and knowledge bases.
Creating a Custom Agent
Call create_custom_agent with a name and a system prompt containing the standing instructions; an example call appears at the end of this guide.
Creating a Knowledge Base
Call create_knowledge_base with a name and the page-specific instructions; an example call appears at the end of this guide.
Once created, assign the custom agent or knowledge base to a test case or suite. The AI execution engine will apply these instructions for every run.
Error Patterns and Recovery
Tool Returns an Empty Result
Some tools return empty arrays when no data matches. Before concluding there is a problem, verify the workspace context:
Ensure you are querying the correct workspace
Check that the resource actually exists in ContextQA via the UI
Use query_contextqa to search before assuming a test case does not exist
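The search-before-assume step can be sketched as a guard function. query_contextqa is injected here as a plain callable, and its response shape (a list of matches) is an assumption:

```python
def find_test_case(query_contextqa, feature_name):
    """Search before concluding a test case does not exist."""
    matches = query_contextqa(query=feature_name)
    if not matches:
        return None  # genuinely absent; safe to create a new test
    return matches[0]

# Stubbed searches for illustration:
print(find_test_case(lambda query: [], "Checkout"))
# → None
print(find_test_case(lambda query: [{"id": "tc-9"}], "Login"))
# → {'id': 'tc-9'}
```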
Execution Stuck in RUNNING State
If get_execution_status keeps returning RUNNING for more than 15 minutes, the execution may have encountered an infrastructure timeout. In this case:
Stop polling
Retrieve whatever partial results are available with get_test_case_results
Re-trigger with execute_test_case
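That recovery sequence might look like the following sketch. The tool calls are injected as plain functions whose names mirror the tools above; their behavior and the 15-minute cutoff are illustrative, not a documented guarantee:

```python
import time

def poll_with_recovery(test_case_id, tools, timeout_s=900, interval_s=30):
    """Poll to a terminal state; on timeout, salvage partials and re-trigger."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = tools["get_execution_status"](test_case_id)
        if status["result"] != "RUNNING":
            return status["result"]
        time.sleep(interval_s)
    tools["get_test_case_results"](test_case_id)  # salvage partial evidence
    tools["execute_test_case"](test_case_id)      # re-trigger once
    return "RETRIED"

# Stub that never leaves RUNNING, with tiny timings for demonstration:
stuck = {
    "get_execution_status": lambda tid: {"result": "RUNNING"},
    "get_test_case_results": lambda tid: {},
    "execute_test_case": lambda tid: None,
}
print(poll_with_recovery("1234", stuck, timeout_s=0.02, interval_s=0.01))
# → RETRIED
```

An agent should cap the number of re-triggers (e.g. one retry) so an infrastructure outage does not turn into an infinite execution loop.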
Authentication Error on Every Call
This means credentials are not being resolved. Check:
Environment variables are set and exported correctly
The .env file is in the project root (not a subdirectory)
There are no leading/trailing spaces in the email address
The ContextQA account password has not been changed since the server started
Test Generation Returns No Steps
If a generation tool returns a test case with zero or very few steps, the AI may not have understood the input. Try:
Adding more context to the task_description parameter
Breaking a complex workflow into multiple smaller test cases
Using generate_edge_cases after the main test is created to expand coverage
Next Steps
Tool Reference — complete parameter documentation for all 67 tools
Example prompt (see "Example: Simple Test Creation and Execution" above):

```
Create a test for https://myapp.com that:
1. Logs in as [email protected] with password Pass123!
2. Opens the Reports section from the sidebar
3. Verifies at least one report exists in the list
Name it "Reports - Basic Smoke". Execute it and tell me if it passed.
```
Polling example (see "Polling for Execution Completion" above):

```python
import time

# Step 1: Start execution
result = execute_test_case(test_case_id="1234")
execution_count = result["number_of_executions"]

# Step 2: Poll until done
max_attempts = 40  # 40 * 30s = 20 minutes max wait
for attempt in range(max_attempts):
    status = get_execution_status(
        test_case_id="1234",
        number_of_executions=execution_count,
    )
    if status["result"] in ["PASSED", "FAILED", "ERROR"]:
        break
    # Not done yet; wait before the next poll
    time.sleep(30)
else:
    raise TimeoutError("Execution did not complete within 20 minutes")

# Step 3: Handle result
if status["result"] == "PASSED":
    print("Test passed")
elif status["result"] == "FAILED":
    # Proceed to failure analysis
    pass
```
Failure-analysis calls (see "Multi-Step Orchestration: The Bug Fix Loop", Steps 1-2 above):

```python
get_test_case_results(execution_id="5678")
get_root_cause(execution_id="5678")
```
Example root-cause output (see Step 2 of the bug fix loop above):

```
The test failed at Step 4: "Click the Submit button".
Root cause: The Submit button was disabled when the click was attempted because
the form validation required the 'Terms and Conditions' checkbox to be checked first.
The test steps do not include checking this checkbox.
Suggested fix: Add a step before Step 4 to check the Terms and Conditions checkbox.
```
GitHub Actions pseudocode (see "Generating Tests Automatically in CI" above):

```yaml
- name: Generate and run tests for PR
  run: |
    python - <<'PYEOF'
    import subprocess, os
    # Get the PR diff
    diff = subprocess.check_output(["git", "diff", "main...HEAD"]).decode()
    # Generate tests
    from app.contextqa_client import ContextQAClient
    client = ContextQAClient(
        username=os.environ["CONTEXTQA_USERNAME"],
        password=os.environ["CONTEXTQA_PASSWORD"],
    )
    tests = client.generate_tests_from_code_change(
        diff_text=diff,
        app_url="https://staging.myapp.com",
    )
    # Execute each generated test
    for test in tests["test_cases"]:
        client.execute_test_case(test_case_id=test["id"])
    PYEOF
```
Custom agent example (see "Creating a Custom Agent" above):

```python
create_custom_agent(
    name="E-commerce Test Agent",
    system_prompt=(
        "When you encounter a cookie consent banner, always click 'Accept All'. "
        "On the payment page, always use test card number 4111111111111111, "
        "expiry 12/26, CVV 123."
    ),
)
```
Knowledge base example (see "Creating a Knowledge Base" above):

```python
create_knowledge_base(
    name="Login Instructions",
    content=(
        "The login page is at /auth/login. "
        "Use the email field with id 'email' and password field with id 'password'. "
        "After clicking Login, wait for the dashboard URL to contain '/dashboard'."
    ),
)
```