a2a-testing

Test A2A implementations — unit tests, integration tests, mock agents, protocol conformance, and end-to-end multi-agent testing. Use when building test suites for A2A servers, clients, or multi-agent systems.

17 stars

byOrcaQubits

View on GitHub Installation ↓

Best use case

a2a-testing is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using a2a-testing should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/a2a-testing/SKILL.md --create-dirs "https://raw.githubusercontent.com/OrcaQubits/agentic-commerce-skills-plugins/main/a2a-multi-agent/skills/a2a-testing/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/a2a-testing/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How a2a-testing Compares

Feature / Agent	a2a-testing	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# A2A Testing

## Before writing code

**Fetch live docs**:
1. Fetch `https://a2a-protocol.org/latest/specification/` for protocol requirements to test against
2. Web-search `site:github.com a2aproject A2A testing conformance` for testing tools and guidance
3. Web-search `site:github.com a2aproject a2a-samples test` for test examples
4. Fetch SDK docs for test utilities, mock classes, and test helpers

## Conceptual Architecture

### Testing Levels

A2A implementations need testing at multiple levels:

#### 1. Unit Tests
Test individual components in isolation:
- Agent handler logic (given input message, verify output)
- Task state management (verify state transitions)
- Message/Part construction and parsing
- Agent Card validation
- Error code generation

#### 2. Integration Tests
Test the A2A server with a real HTTP stack:
- Send JSON-RPC requests, verify responses
- Test all supported methods (message/send, tasks/get, etc.)
- Verify Agent Card serving at `/.well-known/agent-card.json`
- Test authentication flows
- Test streaming with SSE client

#### 3. Protocol Conformance Tests
Verify compliance with the A2A specification:
- All required methods are implemented
- Request/response schemas match the spec
- Error codes are correct
- State transitions follow the rules
- Agent Card schema is valid

#### 4. End-to-End Tests
Test full multi-agent workflows:
- Client discovers agent via Agent Card
- Client sends task, agent processes, client receives result
- Multi-turn conversations complete successfully
- Error scenarios are handled gracefully
- Multiple agents collaborate on a complex task

### Mock Agents

For testing clients, create mock A2A servers that:
- Return predefined responses
- Simulate different task states
- Simulate errors and failures
- Simulate slow responses and timeouts
- Support streaming with controlled events

For testing servers, create mock A2A clients that:
- Send various message types and parts
- Exercise multi-turn flows
- Send invalid requests to test error handling
- Simulate disconnections during streaming

### What to Test

**Server tests:**
- Agent Card is valid and accessible
- All declared methods work correctly
- Task state transitions are correct
- Error codes are appropriate
- Streaming events are properly formatted
- Push notifications are delivered
- Authentication is enforced
- Concurrent requests are handled

**Client tests:**
- Agent Card discovery and parsing
- Request construction (correct JSON-RPC format)
- Response handling for all task states
- Error handling and retry logic
- Multi-turn conversation flow
- Streaming event processing
- Authentication credential handling

### Testing Patterns

**Stateful server tests**: Use an in-memory task store, send a sequence of requests, verify the accumulated state.

**Snapshot testing**: Capture JSON-RPC request/response pairs and verify they match expected schemas.

**Chaos testing**: Introduce random failures, slow responses, and disconnections to test resilience.

**Contract testing**: Verify that client and server agree on the message schemas (both sides validate against the spec).

### Best Practices

- Test all 9 task states, not just the happy path
- Test invalid state transitions (should return errors, not crash)
- Test with all Part types (TextPart, FilePart, DataPart)
- Test streaming with slow consumers and fast producers
- Test authentication with valid and invalid credentials
- Use deterministic agent logic in tests (not real LLMs) for reproducibility
- Validate all JSON-RPC responses against the schema
- Run tests in CI/CD pipelines
- Test Agent Card changes (schema validation)

Fetch any official conformance test suites or testing utilities from the A2A project before building custom test infrastructure.

Related Skills

woo-testing

from OrcaQubits/agentic-commerce-skills-plugins

Test WooCommerce extensions — PHPUnit unit/integration tests, WP test suite, WooCommerce test helpers, E2E with Playwright, and WP-CLI test scaffolding. Use when writing tests for WooCommerce plugins or setting up a test environment.

webmcp-testing

from OrcaQubits/agentic-commerce-skills-plugins

Test WebMCP tools with AI agents — Chrome DevTools integration, agent testing workflows, tool discovery verification, and end-to-end commerce flow testing. Use when validating that tools work correctly with real AI agents.

spree-testing

from OrcaQubits/agentic-commerce-skills-plugins

Test Spree applications and extensions with RSpec — `spree_dev_tools` gem (v5.2+) for factories and helpers, FactoryBot patterns (prefer `build` over `create`), Capybara feature specs, controller/request specs, testing decorators and subscribers, dummy app for extension testing, system specs for the Hotwire admin, and CI patterns. Use when writing Spree tests, setting up CI, or refactoring slow specs.

shopify-testing

from OrcaQubits/agentic-commerce-skills-plugins

Test Shopify applications — app testing with Vitest and Playwright, theme testing with Theme Check, Function testing, webhook testing, extension testing, and CI/CD pipelines. Use when writing tests for Shopify projects.

sf-testing

from OrcaQubits/agentic-commerce-skills-plugins

Test Salesforce Commerce code — B2C (Node.js unit testing, sfcc-ci CI/CD, sandbox management, linting) and B2B (Apex test classes with 75% coverage minimum, Jest for LWC, sf CLI deployment and validation). Use when writing tests or setting up CI/CD.

saleor-testing

from OrcaQubits/agentic-commerce-skills-plugins

Test Saleor applications — pytest setup, Django test client, GraphQL test patterns, App testing, factory_boy fixtures, and webhook testing. Use when writing tests for Saleor projects.

medusa-testing

from OrcaQubits/agentic-commerce-skills-plugins

Test Medusa v2 applications — Jest setup, module unit tests, workflow integration tests, API route tests, medusaIntegrationTestRunner, and mock patterns. Use when writing tests for Medusa projects.

php-testing

from OrcaQubits/agentic-commerce-skills-plugins

Write PHP tests with PHPUnit — unit tests, mocking, data providers, test doubles, assertions, and TDD practices. Use when writing tests for PHP code, whether in Magento or standalone PHP applications.

magento-testing

from OrcaQubits/agentic-commerce-skills-plugins

Write tests for Magento 2 — PHPUnit unit tests, integration tests, MFTF functional tests, and API tests. Use when implementing test coverage for modules, debugging, or setting up CI/CD test pipelines.

bc-testing

from OrcaQubits/agentic-commerce-skills-plugins

Test BigCommerce integrations — API testing, Stencil theme testing, Cypress/Playwright E2E tests, webhook testing, and sandbox stores. Use when writing tests for BigCommerce apps, themes, or integrations.

woo-shipping

from OrcaQubits/agentic-commerce-skills-plugins

Build WooCommerce shipping methods — WC_Shipping_Method, shipping zones, shipping classes, rate calculation, tracking, and integration with carriers. Use when creating custom shipping integrations or configuring shipping logic.

woo-setup

from OrcaQubits/agentic-commerce-skills-plugins

Install WooCommerce, configure the development stack, and set up a local dev environment with WP-CLI, Docker, or wp-env. Use when setting up a new WooCommerce project or development environment.