assessing-external-test-risk

Assesses whether branch or PR changes are high-risk for externally hosted or embedded Streamlit usage and recommends whether external e2e coverage with `@pytest.mark.external_test` is needed. Use during code review, PR triage, or test planning when changes touch routing, auth, websocket/session behavior, embedding, assets, cross-origin behavior, SiS/Snowflake runtime, storage, or security headers.

44,152 stars

by streamlit

Complexity: easy

About this skill

This skill helps Streamlit maintainers and contributors evaluate whether a code change or pull request puts externally hosted or embedded Streamlit applications at risk. It checks whether the change touches routing, authentication, websocket or session behavior, embedding, asset loading, cross-origin interactions, security headers, or behavior specific to Streamlit-in-Snowflake (SiS) and the Snowflake runtime. Based on that assessment, it recommends whether additional end-to-end (e2e) coverage with the `@pytest.mark.external_test` decorator is needed. External deployments often run behind proxies, inside iframes, and under Content Security Policy (CSP) constraints and other browser security boundaries, so their behavior can diverge from local runs. The skill is intended for code review, PR triage, and test planning.

Best use case

Use this skill to determine whether a proposed change to the Streamlit framework is high-risk for externally hosted or embedded Streamlit applications, and therefore whether additional external end-to-end test coverage is needed to prevent regressions and security vulnerabilities.

The expected result is a clear, justified recommendation on whether the current branch or pull request needs external e2e coverage with `@pytest.mark.external_test`, based on an analysis of its potential impact on externally hosted or embedded Streamlit usage.

Practical example

Example input

Assess the external test risk for the current branch (feature/websocket-auth-refactor). Do these changes require `@pytest.mark.external_test` coverage?

Example output

**Assessment for branch `feature/websocket-auth-refactor`:**

**Risk Areas Identified:**
*   **Auth:** Direct modifications to authentication mechanisms.
*   **Websocket/Session behavior:** Changes impacting how client sessions are managed over websockets.

**Recommendation:**
**YES**, this branch **requires** `@pytest.mark.external_test` coverage.

**Justification:**
The proposed changes directly impact core authentication and websocket session handling. These are critical components for secure and stable external Streamlit deployments, especially in scenarios involving proxies or custom authentication flows. Ensuring `@pytest.mark.external_test` coverage will validate these interactions under conditions relevant to externally hosted applications, mitigating risks of regressions or security vulnerabilities in real-world environments.

When to use this skill

Use this skill during code review, pull request (PR) triage, or test planning when changes touch core areas like routing, authentication, websocket or session behavior, embedding logic, asset management, cross-origin behavior, Streamlit-in-Snowflake (SiS)/Snowflake runtime specifics, storage interactions, or security headers. It's particularly useful when protecting deployments that involve proxies, embedded iframe contexts, CSP constraints, and other browser security boundaries.

When not to use this skill

Do not use this skill for assessing general code quality, performance issues unrelated to external integration, or changes that are purely internal to Streamlit's UI components or backend logic that have no bearing on how Streamlit interacts with its hosting environment or other origins. It is not intended for generating tests, but rather for assessing the *need* for specific external tests.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/assessing-external-test-risk/SKILL.md --create-dirs "https://raw.githubusercontent.com/streamlit/streamlit/main/.claude/skills/assessing-external-test-risk/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/assessing-external-test-risk/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How assessing-external-test-risk Compares

Feature / Agent	assessing-external-test-risk	Standard Approach
Platform Support	Claude	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	easy	N/A

Frequently Asked Questions

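What does this skill do?

It assesses whether branch or PR changes are high-risk for externally hosted or embedded Streamlit usage and recommends whether external e2e coverage with `@pytest.mark.external_test` is needed.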

Which AI agents support this skill?

This skill is designed for Claude.

How difficult is it to install?

The installation complexity is rated as easy. You can find the installation instructions above.

Where can I find the source code?

The source code is in the streamlit/streamlit repository on GitHub, under .claude/skills/assessing-external-test-risk/.

SKILL.md Source

# Assessing external test risk

Use this skill to decide whether a branch or PR should include external e2e coverage using `@pytest.mark.external_test`.

This helps protect deployments that commonly involve proxies, embedded iframe contexts, CSP constraints, and other browser security boundaries.

This skill is for **risk assessment and recommendation**. It does not auto-mark tests unless explicitly requested.

## Decision rule

Use an **any-hit** policy:

- If any checklist category is hit, output **Recommend external_test: Yes**
- If no categories are hit, output **Recommend external_test: No**
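
The any-hit policy can be sketched in a few lines of Python. This is only an illustration, not part of the skill itself; the function name `recommend_external_test` and the dictionary shape are assumptions made for the example.

```python
def recommend_external_test(category_hits: dict[str, bool]) -> str:
    """Apply the any-hit policy: a single hit is enough for a Yes."""
    decision = "Yes" if any(category_hits.values()) else "No"
    return f"Recommend external_test: {decision}"


# Hypothetical assessment result: one category hit, one not hit.
hits = {
    "Routing and URL behavior": False,
    "Client storage behavior": True,
}
print(recommend_external_test(hits))  # prints "Recommend external_test: Yes"
```

There is no weighting or threshold: the number of hits affects the focus areas, not the Yes/No decision.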

## Inputs to review

- Branch or PR diff against its base branch
- Changed files and related tests
- PR description (if available)

## Assessment workflow

1. Gather the changed files and full diff against the base branch.
2. Evaluate each checklist category below as hit or not hit.
3. Record concrete evidence from file paths and diff snippets.
4. Produce a recommendation and specific external-test focus areas.
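
Step 1 of the workflow could be scripted roughly as follows. This is a minimal sketch, assuming the base branch is available locally as `origin/main` (a placeholder; substitute the PR's actual base). The helper names `diff_commands` and `gather_changes` are hypothetical; `git diff --name-only` and the three-dot range are standard git.

```python
import subprocess


def diff_commands(base: str = "origin/main") -> dict[str, list[str]]:
    """Build the git commands for step 1 (changed files and full diff)."""
    # Three-dot syntax compares HEAD against its merge base with `base`,
    # so unrelated commits that landed on the base branch are excluded.
    rev_range = f"{base}...HEAD"
    return {
        "files": ["git", "diff", "--name-only", rev_range],
        "diff": ["git", "diff", rev_range],
    }


def gather_changes(base: str = "origin/main") -> tuple[list[str], str]:
    """Run the commands and return (changed file paths, full diff text)."""
    outputs = {
        name: subprocess.run(
            cmd, capture_output=True, text=True, check=True
        ).stdout
        for name, cmd in diff_commands(base).items()
    }
    return outputs["files"].splitlines(), outputs["diff"]
```

Calling `gather_changes()` inside a checkout returns the inputs that steps 2 and 3 evaluate; it raises `subprocess.CalledProcessError` if git fails, for example when the base ref does not exist.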

## Checklist categories

Evaluate all categories. A single hit is enough to recommend external coverage.

1. **Routing and URL behavior**
   - Hit when changes introduce or modify Starlette routes, `server.baseUrlPath`, catch-alls, request methods, URL resolution, redirects, or status codes.

2. **Auth, cookies, CSRF, and identity binding**
   - Hit when changes touch login/logout or OAuth flows, `_streamlit_user`, `_streamlit_xsrf`, CSRF/XSRF handling, `server.trustedUserHeaders`, or session-to-identity binding.

3. **Websocket handshake and session transport**
   - Hit when changes affect websocket handshake or subprotocols, session affinity, reconnect behavior, ping or timeout behavior, message size limits, or fragmentation.

4. **Embedding and iframe boundary**
   - Hit when changes modify host-to-guest communication (`postMessage`), iframe sizing or resize behavior, iframe sandbox or allow attributes, or permissions policy behavior in embedded contexts.

5. **Static and component asset serving**
   - Hit when changes alter asset handlers, cache headers, size limits, base paths (including `server.customComponentBaseUrlPath`), or proxying rules for static/component assets.

6. **Service worker, uploads, and downloads**
   - Hit when changes modify service worker registration, scope, or caching strategy; upload/download endpoints; JWT or CSRF wrapping; or download attribute behavior.

7. **Cross-origin behavior and external networking**
   - Hit when changes alter CORS allowlists, `crossOrigin` usage, external-origin fetches or external network behavior, or backend URL discovery via `window.__streamlit.*`.

8. **Cross-origin theming and resource discovery**
   - Hit when changes introduce or modify theme/resource loading across origins (fonts, images, theme globals), CSS isolation with host pages, or manifest/asset discovery when HTML is not served by Starlette.

9. **SiS and Snowflake runtime dependencies**
   - Hit when changes rely on or modify SiS/Snowflake runtime behavior, including `running_in_sis()`, `get_active_session()`, Snowflake connection/session semantics, or SiS-specific environment flags.

10. **Client storage behavior**
    - Hit when changes introduce or modify cookies, `localStorage`, or `sessionStorage` usage that may differ in embedded or third-party contexts.

11. **Security headers and browser policies**
    - Hit when changes adjust CSP, Referrer-Policy, Permissions-Policy, or related headers that can impact embedding or resource loading.
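
As a rough first pass before reading the diff by hand, a few of the categories above could be flagged with keyword matching. The sketch below is illustrative only: the identifiers come from the checklist, but the regular expressions, the `CATEGORY_PATTERNS` table, the `triggered_categories` helper, and the sample file path are all assumptions, and a keyword match is no substitute for the evidence-based review the skill calls for.

```python
import re

# Hypothetical keyword heuristics for a subset of the checklist.
# The patterns are illustrative and far from exhaustive.
CATEGORY_PATTERNS: dict[str, str] = {
    "1. Routing and URL behavior": r"baseUrlPath|starlette_routes|redirect",
    "2. Auth, cookies, CSRF, and identity binding": (
        r"_streamlit_user|_streamlit_xsrf|trustedUserHeaders"
    ),
    "4. Embedding and iframe boundary": r"postMessage|sandbox",
    "10. Client storage behavior": r"localStorage|sessionStorage",
    "11. Security headers and browser policies": (
        r"Content-Security-Policy|Referrer-Policy|Permissions-Policy"
    ),
}


def triggered_categories(diff_text: str) -> list[str]:
    """Return the categories whose pattern appears anywhere in the diff."""
    return [
        name
        for name, pattern in CATEGORY_PATTERNS.items()
        if re.search(pattern, diff_text)
    ]


# Made-up diff fragment; the file path is not a real Streamlit path.
sample_diff = """\
+++ b/frontend/app/src/storage.ts
+    window.localStorage.setItem("theme", value)
"""
print(triggered_categories(sample_diff))
```

A match here only nominates a category for closer inspection; the hit itself still needs concrete evidence from the diff.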

## Output format

Use this exact structure:

```markdown
## External test recommendation

- Recommend external_test: [Yes/No]
- Triggered categories: [List category numbers and names, or "None"]
- Evidence:
  - `<path>`: [short reason from diff]
  - `<path>`: [short reason from diff]
- Suggested external_test focus areas:
  - [Concrete scenario to validate externally]
  - [Concrete scenario to validate externally]
- Confidence: [High/Medium/Low]
- Assumptions and gaps: [Unknowns, missing context, or why confidence is reduced]
```
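
If the report is produced by a script rather than written by hand, the structure above could be rendered along these lines. This is a sketch under assumptions: `render_report` and its parameters are hypothetical names, and the two-space indentation of the nested bullets is a guess at the intended Markdown nesting.

```python
def render_report(
    triggered: list[str],
    evidence: dict[str, str],
    focus_areas: list[str],
    confidence: str,
    gaps: str,
) -> str:
    """Render the recommendation in the skill's fixed output structure."""
    lines = [
        "## External test recommendation",
        "",
        # Any-hit policy: a non-empty list of triggered categories means Yes.
        f"- Recommend external_test: {'Yes' if triggered else 'No'}",
        f"- Triggered categories: {', '.join(triggered) if triggered else 'None'}",
        "- Evidence:",
        *[f"  - `{path}`: {reason}" for path, reason in evidence.items()],
        "- Suggested external_test focus areas:",
        *[f"  - {area}" for area in focus_areas],
        f"- Confidence: {confidence}",
        f"- Assumptions and gaps: {gaps}",
    ]
    return "\n".join(lines)
```

Deriving the Yes/No line from the list of triggered categories keeps the report consistent with the decision rule, so the two cannot disagree.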

## Interpretation guidance

- Prefer evidence over intuition. Tie each hit to concrete diff details.
- When in doubt, err toward **Yes** if externally hosted or embedded behavior could diverge from local runs.
- Keep focus areas specific and testable (route, auth handshake, iframe boundary, asset loading, SiS runtime behavior).

## Examples

### Example yes recommendation

Diff includes:

- `lib/streamlit/web/server/starlette/starlette_routes.py` route changes
- Cookie/XSRF handling updates in request auth middleware
- Frontend embed code changing iframe `allow` attributes

Expected output:

- `Recommend external_test: Yes`
- Triggered categories include routing, auth/cookies/CSRF, and embedding boundary
- Focus areas include external host iframe embedding + auth/session continuity checks

### Example no recommendation

Diff includes:
