Agent Integration Guide


Who is this for? SDETs, developers, and DevOps engineers integrating ContextQA with AI coding assistants (Claude, Cursor) or CI/CD pipelines.

This guide explains how AI agents should use the ContextQA MCP tools effectively — what order to call tools in, how to handle asynchronous execution, how to debug failures, and how to orchestrate complex multi-step workflows.

Whether you are building a custom agent, prompting Claude or Cursor, or writing an automated CI script, the patterns in this guide will help you get reliable results.


Core Workflow Pattern

Every test automation workflow with ContextQA follows the same fundamental loop:

1. create_test_case(url, task_description, name)
       ↓  returns test_case_id

2. execute_test_case(test_case_id)
       ↓  returns execution portal URL + number_of_executions

3. get_execution_status(test_case_id, number_of_executions)
       ↓  poll every 15-30 seconds until PASSED or FAILED

4. (if PASSED)
       → Report success, retrieve results for evidence

5. (if FAILED)
       → get_test_case_results(execution_id)
       → get_root_cause(execution_id)
       → Decide: fix_and_apply, create_defect_ticket, or update test

The execution step is asynchronous — ContextQA runs the test in a cloud browser environment, which may take anywhere from a few seconds (for short tests) to several minutes (for complex multi-page flows). Your agent must poll for completion rather than expecting an immediate result.


Example: Simple Test Creation and Execution

Consider a natural language request such as "Create a test that logs into our staging site and verifies the dashboard loads, then run it and report the result." An agent using Claude with the ContextQA MCP server can handle this end to end.

The agent will internally:

  1. Call create_test_case with the URL, task description, and name

  2. Receive a test_case_id in the response

  3. Call execute_test_case with that ID

  4. Poll get_execution_status until the status is PASSED or FAILED

  5. Report the result back to you with a screenshot or step summary

You do not need to specify any of the underlying API calls — the agent handles all of that based on your natural language request.


Decision Tree: Choosing the Right Tool

Use this reference when building or prompting an agent to know which tool to invoke for a given scenario.

Creating Tests

| Goal | Tool to Use |
| --- | --- |
| New test from a task description | create_test_case |
| Tests from a Jira or Azure DevOps ticket | generate_tests_from_jira_ticket |
| Tests from a Figma design file | generate_tests_from_figma |
| Tests from an OpenAPI / Swagger spec | generate_tests_from_swagger |
| Tests from a video screen recording | generate_tests_from_video |
| Tests from an Excel or CSV file | generate_tests_from_excel |
| Tests from a requirements document | generate_tests_from_requirements |
| Tests for a specific git diff / PR | generate_tests_from_code_change |
| Tests for an n8n workflow | generate_contextqa_tests_from_n8n |
| Edge cases for an existing feature | generate_edge_cases |
| Close an identified coverage gap | generate_tests_from_analytics_gap |

Finding Existing Tests

| Goal | Tool to Use |
| --- | --- |
| Search for tests by feature name | query_contextqa |
| List all test cases | get_test_cases |
| Get steps for a specific test | get_test_case_steps |
| Find tests impacted by a code change | analyze_test_impact |

Running Tests

| Goal | Tool to Use |
| --- | --- |
| Run a single test case | execute_test_case |
| Run an entire test suite | execute_test_suite |
| Run a full test plan | execute_test_plan |
| Re-run a test plan (after fixes) | rerun_test_plan |
| Run a performance load test | execute_performance_test |
| Run a DAST security scan | execute_security_dast_scan |

Analyzing Results

| Goal | Tool to Use |
| --- | --- |
| Check if an execution completed | get_execution_status |
| Get the full result object | get_test_case_results |
| Get per-step details with screenshots | get_test_step_results |
| Get AI root cause analysis | get_root_cause |
| Get step-level browser screenshots | get_execution_step_details |
| Get network traffic for a run | get_network_logs |
| Get browser console errors | get_console_logs |
| Get Playwright trace viewer URL | get_trace_url |
| Get AI confidence scores per step | get_ai_reasoning |

Handling Failures

| Goal | Tool to Use |
| --- | --- |
| Push failure to Jira/ADO | create_defect_ticket |
| Get locator fix suggestions | get_auto_healing_suggestions |
| Apply a healing suggestion | approve_auto_healing |
| Apply a code-level fix | fix_and_apply |
| Reproduce a bug from a ticket | reproduce_from_ticket |
| Deep investigation of a failure | investigate_failure |

Migrating from Other Frameworks

| Goal | Tool to Use |
| --- | --- |
| Analyze an existing test repo | analyze_test_repo |
| Migrate tests to ContextQA | migrate_repo_to_contextqa |
| Export ContextQA tests to Playwright | export_to_playwright |


Polling for Execution Completion

Executions are asynchronous. When you call execute_test_case, the response gives you enough information to poll for the result — do not assume the test is complete immediately.

Recommended polling intervals:

  • Simple tests (1-5 steps): poll every 15 seconds

  • Medium tests (5-20 steps): poll every 30 seconds

  • Complex tests (20+ steps) or mobile tests: poll every 60 seconds

For test plan executions, use get_test_plan_execution_status and apply the same polling pattern.
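These intervals can be encoded in a small helper, together with an overall deadline so the agent never polls forever. The helpers below are sketches, not ContextQA APIs; the 15-minute default timeout matches the stuck-execution guidance in the Error Patterns and Recovery section:

```python
import time

def poll_interval(step_count, is_mobile=False):
    # Map test size to the recommended intervals above.
    if is_mobile or step_count > 20:
        return 60
    if step_count > 5:
        return 30
    return 15

def poll_until_done(get_status, interval, timeout=15 * 60,
                    clock=time.monotonic, sleep=time.sleep):
    # get_status wraps get_execution_status and returns the status string.
    # The timeout guards against executions stuck in RUNNING.
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in ("PASSED", "FAILED"):
            return status
        if clock() >= deadline:
            return "TIMED_OUT"
        sleep(interval)
```

The injectable clock and sleep parameters also make the loop easy to unit test.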


Multi-Step Orchestration: The Bug Fix Loop

When a test fails in production or CI, an agent can orchestrate a complete bug triage and fix workflow:

Step 1: Retrieve the Failure Details

Call get_test_case_results with the execution_id. This returns the complete result object: which steps passed, which step failed, the error message, and the screenshot URL for the failing step.

Step 2: Get AI Root Cause Analysis

Call get_root_cause with the execution_id. The AI analyzes the screenshots, video, DOM state, and network logs and returns a plain-English explanation of why the test failed.

Step 3: Create a Defect Ticket

Call create_defect_ticket. ContextQA creates a Jira (or Azure DevOps) issue with:

  • The failure screenshot attached

  • The step that failed and the error message in the description

  • A link to the ContextQA execution for the full video and trace

Step 4: Check for Self-Healing Suggestions

Call get_auto_healing_suggestions. If the failure was caused by a changed UI element (the button moved, was renamed, or had its locator modified), ContextQA proposes an automatic fix with a confidence score.

Step 5: Apply the Healing

Call approve_auto_healing to accept the suggestion. The locator is updated in the test case definition automatically; no manual editing is required.

Step 6: Re-Run to Verify

Call execute_test_case again and poll get_execution_status as before to confirm the fix resolved the failure.
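Stitched together, the six steps above reduce to a single triage routine. call_tool is a hypothetical stand-in for your MCP client, and the 0.8 auto-approval threshold and the suggestion field names (confidence, id) are assumptions rather than ContextQA defaults:

```python
CONFIDENCE_THRESHOLD = 0.8  # assumed policy for auto-approving healing fixes

def triage_failure(call_tool, execution_id, test_case_id):
    results = call_tool("get_test_case_results", execution_id=execution_id)  # Step 1
    root_cause = call_tool("get_root_cause", execution_id=execution_id)      # Step 2
    ticket = call_tool("create_defect_ticket", execution_id=execution_id)    # Step 3

    suggestions = call_tool("get_auto_healing_suggestions",                  # Step 4
                            execution_id=execution_id)
    for suggestion in suggestions:
        if suggestion.get("confidence", 0.0) >= CONFIDENCE_THRESHOLD:
            call_tool("approve_auto_healing",                                # Step 5
                      suggestion_id=suggestion["id"])

    rerun = call_tool("execute_test_case", test_case_id=test_case_id)        # Step 6
    return {"results": results, "root_cause": root_cause,
            "ticket": ticket, "rerun": rerun}
```

Lower-confidence suggestions fall through to the defect ticket for human review rather than being applied blindly.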


Deep Telemetry for Debugging

Every test execution produces a complete evidence package. When an AI agent is investigating a failure, it should pull all available telemetry before drawing conclusions.

Screenshots per Step

Retrieve per-step details with get_test_step_results. Each step object contains:

  • Step number and description

  • Pass/fail status

  • Screenshot URL (hosted in S3, publicly accessible)

  • Execution duration in milliseconds

  • Whether auto-healing was applied

Network Traffic

get_network_logs returns the full HAR-format log of every HTTP request and response during the test run. Use this to identify:

  • Failed API calls (4xx, 5xx responses)

  • Missing requests (a POST that should have fired but did not)

  • Slow responses that caused timeouts
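An agent can scan the returned log for exactly those three signals. The sketch below assumes the standard HAR 1.2 shape (log.entries[].request / response); ContextQA's exact envelope may differ, and the 10-second slow threshold is an arbitrary choice:

```python
def find_network_problems(har, expected_post_urls=(), slow_ms=10_000):
    entries = har["log"]["entries"]
    # Failed API calls: any 4xx/5xx response.
    failed = [e for e in entries if e["response"]["status"] >= 400]
    # Missing requests: expected POSTs that never fired.
    fired = {e["request"]["url"] for e in entries
             if e["request"]["method"] == "POST"}
    missing = [u for u in expected_post_urls if u not in fired]
    # Slow responses that may have caused timeouts.
    slow = [e for e in entries if e.get("time", 0) > slow_ms]
    return {"failed": failed, "missing_posts": missing, "slow": slow}
```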

Browser Console

get_console_logs returns all browser console output: errors, warnings, and console.log statements. Useful for catching JavaScript exceptions that caused UI state failures.

Playwright Trace

get_trace_url returns a URL to view the Playwright trace at trace.playwright.dev. The trace shows:

  • Exact DOM snapshot at every step (before and after each action)

  • Full network waterfall timeline

  • Console output synchronized with steps

  • Screenshots at each action point

This is the most detailed debugging artifact available and should be used when other telemetry does not reveal the root cause.

AI Reasoning

get_ai_reasoning returns the AI's internal reasoning for each step: how confident it was in locating the element, which locator strategy it used, and why it made specific decisions. Useful for understanding flaky tests where the AI sometimes finds an element and sometimes does not.
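A simple use of this data is to flag the steps most likely to flake. The field names (step, confidence) and the 0.7 threshold below are assumptions about the response shape, not documented values:

```python
def low_confidence_steps(reasoning_steps, threshold=0.7):
    # Steps where the AI was unsure about locating the element are the
    # usual suspects when a test passes on some runs and fails on others.
    return [s for s in reasoning_steps
            if s.get("confidence", 1.0) < threshold]
```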


Generating Tests Automatically in CI

A common pattern is to run ContextQA test generation as part of a pull request review workflow. When a developer opens a PR:

  1. Extract the git diff

  2. Call generate_tests_from_code_change with the diff and the staging URL

  3. ContextQA generates tests targeting the changed flows

  4. Call execute_test_case on each generated test

  5. Report results back to the PR as a comment

In GitHub Actions pseudocode:
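The script step at the core of such a workflow might look like the sketch below. The diff is assumed to be extracted by an earlier CI step (for example, git diff against the base branch) and passed in; call_tool is a hypothetical MCP-client helper, and the test_cases / test_case_id field names are assumptions:

```python
def run_pr_tests(call_tool, diff, staging_url):
    # Steps 2-4 from the list above; posting the PR comment (step 5) is
    # left to your CI tooling.
    generated = call_tool("generate_tests_from_code_change",
                          diff=diff, url=staging_url)
    triggered = []
    for test in generated["test_cases"]:
        call_tool("execute_test_case", test_case_id=test["test_case_id"])
        triggered.append(test["test_case_id"])
    return triggered
```

Each triggered execution should then be polled to completion using the pattern from the polling section before results are reported back to the PR.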


Migrating an Existing Test Suite

If you have an existing Playwright, Cypress, or Selenium test repository and want to migrate it to ContextQA:

Step 1: Analyze the Repository

Call analyze_test_repo on the repository. The analysis returns:

  • Total number of test files and test cases found

  • Test framework detected (Playwright, Cypress, Selenium, etc.)

  • Estimated complexity

  • Any patterns that may require special handling during migration

Step 2: Migrate

Call migrate_repo_to_contextqa. ContextQA reads each test file, converts the code-based steps to natural-language steps, and creates the corresponding test cases in your workspace. The migration report shows how many tests were imported successfully and flags any that required manual review.

Step 3: Verify

After migration, run the imported test suite with execute_test_suite.

Review the results to confirm the migrated tests produce the same behavior as the originals.
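The three steps reduce to a short script. call_tool below is a hypothetical stand-in for your MCP client, and the suite_id field is an assumption about the migration report's shape:

```python
def migrate_and_verify(call_tool, repo_path):
    analysis = call_tool("analyze_test_repo", repo_path=repo_path)          # Step 1
    report = call_tool("migrate_repo_to_contextqa", repo_path=repo_path)    # Step 2
    run = call_tool("execute_test_suite", suite_id=report["suite_id"])      # Step 3
    return {"analysis": analysis, "report": report, "run": run}
```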


Custom Agents and Knowledge Bases

For teams that need consistent behavior across many tests — for example, always dismissing a cookie consent banner, or always using specific test credentials on a payment page — ContextQA provides custom agents and knowledge bases.

A custom agent carries standing behavioral instructions (for example, always dismiss the cookie consent banner before interacting with the page), while a knowledge base stores shared facts such as test credentials. Once created, assign the custom agent or knowledge base to a test case or suite; the AI execution engine will apply these instructions for every run.


Error Patterns and Recovery

Tool Returns an Empty Result

Some tools return empty arrays when no data matches. Before concluding there is a problem, verify the workspace context:

  • Ensure you are querying the correct workspace

  • Check that the resource actually exists in ContextQA via the UI

  • Use query_contextqa to search before assuming a test case does not exist

Execution Stuck in RUNNING State

If get_execution_status keeps returning RUNNING for more than 15 minutes, the execution may have encountered an infrastructure timeout. In this case:

  1. Stop polling

  2. Retrieve whatever partial results are available with get_test_case_results

  3. Re-trigger with execute_test_case
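That recovery sequence can be scripted as well; call_tool is again a hypothetical MCP-client stand-in:

```python
def recover_stuck_execution(call_tool, test_case_id, execution_id):
    # Salvage whatever partial results exist (step 2), then re-trigger
    # a fresh execution (step 3). Stop polling before calling this.
    partial = call_tool("get_test_case_results", execution_id=execution_id)
    retry = call_tool("execute_test_case", test_case_id=test_case_id)
    return {"partial": partial, "retry": retry}
```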

Authentication Error on Every Call

This means credentials are not being resolved. Check:

  1. Environment variables are set and exported correctly

  2. The .env file is in the project root (not a subdirectory)

  3. There are no leading/trailing spaces in the email address

  4. The ContextQA account password has not been changed since the server started

Test Generation Returns No Steps

If a generation tool returns a test case with zero or very few steps, the AI may not have understood the input. Try:

  1. Adding more context to the task_description parameter

  2. Breaking a complex workflow into multiple smaller test cases

  3. Using generate_edge_cases after the main test is created to expand coverage


