mcp-server-best-practices
Production-ready patterns and best practices for MCP servers - architecture, security, performance, and maintenance
Best use case
mcp-server-best-practices is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams working in multi. Production-ready patterns and best practices for MCP servers - architecture, security, performance, and maintenance
Production-ready patterns and best practices for MCP servers - architecture, security, performance, and maintenance
Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.
Practical example
Example input
Use the "mcp-server-best-practices" skill to help with this workflow task. Context: Production-ready patterns and best practices for MCP servers - architecture, security, performance, and maintenance
Example output
A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.
When to use this skill
- Use this skill when you want a reusable workflow rather than writing the same prompt again and again.
When not to use this skill
- Do not use this when you only need a one-off answer and do not need a reusable workflow.
- Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/mcp-best-practices/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How mcp-server-best-practices Compares
| Feature / Agent | mcp-server-best-practices | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Production-ready patterns and best practices for MCP servers - architecture, security, performance, and maintenance
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
You are an expert in MCP server best practices, with comprehensive knowledge of production patterns, security, performance optimization, testing strategies, and maintainability.
## Your Expertise
You guide developers on:
- Architecture and design patterns
- Security best practices
- Performance optimization
- Error handling strategies
- Testing and quality assurance
- Deployment and operations
- Monitoring and observability
- Maintenance and evolution
## Architecture Patterns
### Pattern 1: Layered Architecture
```rust
// Layer 1: Transport (handled by rmcp)
// Layer 2: Service (your business logic)
// Layer 3: Domain (core logic)
// Layer 4: Infrastructure (external services)
mod transport {
// Transport configuration
}
mod service {
// MCP service implementation
use crate::domain::*;
use crate::infrastructure::*;
#[tool(tool_box)]
pub struct McpService {
domain: Arc<DomainService>,
repo: Arc<dyn Repository>,
}
}
mod domain {
// Core business logic
pub struct DomainService {
// Pure business logic
}
}
mod infrastructure {
// External integrations
pub trait Repository: Send + Sync {
async fn get(&self, id: &str) -> Result<Data>;
}
}
```
### Pattern 2: Hexagonal Architecture (Ports and Adapters)
```rust
// Core domain (no external dependencies)
mod core {
pub struct McpCore {
// Business rules
}
// Ports (interfaces)
pub trait DataPort: Send + Sync {
async fn fetch(&self, id: &str) -> Result<Data>;
}
pub trait CachePort: Send + Sync {
async fn get(&self, key: &str) -> Option<String>;
async fn set(&self, key: &str, value: String);
}
}
// Adapters (implementations)
mod adapters {
use super::core::*;
pub struct PostgresAdapter {
pool: PgPool,
}
impl DataPort for PostgresAdapter {
async fn fetch(&self, id: &str) -> Result<Data> {
// Database implementation
}
}
pub struct RedisAdapter {
client: redis::Client,
}
impl CachePort for RedisAdapter {
async fn get(&self, key: &str) -> Option<String> {
// Redis implementation
}
}
}
// MCP Service uses ports, not concrete adapters
#[tool(tool_box)]
struct McpService {
core: Arc<McpCore>,
data: Arc<dyn DataPort>,
cache: Arc<dyn CachePort>,
}
```
### Pattern 3: Repository Pattern
```rust
use async_trait::async_trait;
#[async_trait]
trait Repository<T>: Send + Sync {
async fn get(&self, id: &str) -> Result<Option<T>>;
async fn list(&self) -> Result<Vec<T>>;
async fn create(&self, entity: &T) -> Result<String>;
async fn update(&self, id: &str, entity: &T) -> Result<()>;
async fn delete(&self, id: &str) -> Result<()>;
}
struct UserRepository {
pool: PgPool,
}
#[async_trait]
impl Repository<User> for UserRepository {
async fn get(&self, id: &str) -> Result<Option<User>> {
sqlx::query_as!(User, "SELECT * FROM users WHERE id = $1", id)
.fetch_optional(&self.pool)
.await
.map_err(Into::into)
}
// ... other methods
}
#[tool(tool_box)]
struct UserService {
repo: Arc<UserRepository>,
}
#[tool(tool_box)]
impl UserService {
#[tool(description = "Get user by ID")]
async fn get_user(&self, id: String) -> Result<User, ServiceError> {
self.repo.get(&id).await?
.ok_or_else(|| ServiceError::NotFound(format!("User {}", id)))
}
}
```
## Error Handling
### Comprehensive Error Types
```rust
use thiserror::Error;
#[derive(Debug, Error)]
pub enum ServiceError {
#[error("Resource not found: {0}")]
NotFound(String),
#[error("Invalid input: {field} - {message}")]
InvalidInput { field: String, message: String },
#[error("Permission denied: {0}")]
PermissionDenied(String),
#[error("Rate limit exceeded: {0}")]
RateLimitExceeded(String),
#[error("External service error: {service} - {message}")]
ExternalServiceError { service: String, message: String },
#[error("Database error: {0}")]
DatabaseError(#[from] sqlx::Error),
#[error("Serialization error: {0}")]
SerializationError(#[from] serde_json::Error),
#[error("Internal error: {0}")]
Internal(String),
}
impl ServiceError {
pub fn error_code(&self) -> &'static str {
match self {
Self::NotFound(_) => "NOT_FOUND",
Self::InvalidInput { .. } => "INVALID_INPUT",
Self::PermissionDenied(_) => "PERMISSION_DENIED",
Self::RateLimitExceeded(_) => "RATE_LIMIT",
Self::ExternalServiceError { .. } => "EXTERNAL_ERROR",
Self::DatabaseError(_) => "DATABASE_ERROR",
Self::SerializationError(_) => "SERIALIZATION_ERROR",
Self::Internal(_) => "INTERNAL_ERROR",
}
}
pub fn is_retryable(&self) -> bool {
matches!(
self,
Self::ExternalServiceError { .. } | Self::DatabaseError(_)
)
}
}
```
### Error Context and Recovery
```rust
use anyhow::Context;
#[tool(tool_box)]
impl MyService {
#[tool(description = "Fetch data with retry")]
async fn fetch_with_retry(&self, id: String) -> Result<Data, ServiceError> {
let mut attempts = 0;
let max_attempts = 3;
loop {
attempts += 1;
match self.fetch_data(&id).await {
Ok(data) => return Ok(data),
Err(e) if e.is_retryable() && attempts < max_attempts => {
tracing::warn!(
"Attempt {} failed: {}. Retrying...",
attempts,
e
);
tokio::time::sleep(Duration::from_millis(100 * attempts)).await;
continue;
}
Err(e) => {
return Err(e)
.context(format!("Failed after {} attempts", attempts))?;
}
}
}
}
}
```
## Security
### Input Validation
```rust
use validator::{Validate, ValidationError};
#[derive(Debug, Deserialize, Validate, JsonSchema)]
struct CreateUserRequest {
#[validate(length(min = 1, max = 100))]
name: String,
#[validate(email)]
email: String,
#[validate(length(min = 8))]
password: String,
#[validate(range(min = 18, max = 120))]
age: u32,
}
#[tool(tool_box)]
impl UserService {
#[tool(description = "Create user with validation")]
async fn create_user(
&self,
#[tool(aggr)] req: CreateUserRequest,
) -> Result<User, ServiceError> {
// Validate input
req.validate()
.map_err(|e| ServiceError::InvalidInput {
field: "request".to_string(),
message: e.to_string(),
})?;
// Additional business validation
if self.repo.exists_by_email(&req.email).await? {
return Err(ServiceError::InvalidInput {
field: "email".to_string(),
message: "Email already exists".to_string(),
});
}
// Hash password
let password_hash = hash_password(&req.password)?;
// Create user
let user = User {
id: Uuid::new_v4().to_string(),
name: req.name,
email: req.email,
password_hash,
age: req.age,
};
self.repo.create(&user).await?;
Ok(user)
}
}
```
### Authentication and Authorization
```rust
use jsonwebtoken::{decode, DecodingKey, Validation};
#[derive(Debug, Deserialize)]
struct Claims {
sub: String, // user ID
role: String,
exp: usize,
}
struct AuthContext {
user_id: String,
role: String,
}
impl AuthContext {
fn from_token(token: &str) -> Result<Self, ServiceError> {
let key = DecodingKey::from_secret(SECRET.as_ref());
let token_data = decode::<Claims>(token, &key, &Validation::default())
.map_err(|_| ServiceError::PermissionDenied("Invalid token".to_string()))?;
Ok(Self {
user_id: token_data.claims.sub,
role: token_data.claims.role,
})
}
fn require_admin(&self) -> Result<(), ServiceError> {
if self.role != "admin" {
return Err(ServiceError::PermissionDenied(
"Admin role required".to_string()
));
}
Ok(())
}
}
#[tool(tool_box)]
struct SecureService {
repo: Arc<UserRepository>,
}
#[tool(tool_box)]
impl SecureService {
#[tool(description = "Delete user (admin only)")]
async fn delete_user(
&self,
auth_token: String,
user_id: String,
) -> Result<(), ServiceError> {
let auth = AuthContext::from_token(&auth_token)?;
auth.require_admin()?;
self.repo.delete(&user_id).await?;
Ok(())
}
}
```
### SQL Injection Prevention
```rust
// ✅ Good: Use parameterized queries
async fn get_user(&self, id: &str) -> Result<User> {
sqlx::query_as!(User, "SELECT * FROM users WHERE id = $1", id)
.fetch_one(&self.pool)
.await
.map_err(Into::into)
}
// ❌ Bad: String concatenation
async fn get_user_unsafe(&self, id: &str) -> Result<User> {
let query = format!("SELECT * FROM users WHERE id = '{}'", id); // VULNERABLE!
sqlx::query_as(&query)
.fetch_one(&self.pool)
.await
.map_err(Into::into)
}
```
## Performance Optimization
### Connection Pooling
```rust
use sqlx::postgres::PgPoolOptions;
async fn create_db_pool() -> Result<PgPool> {
PgPoolOptions::new()
.max_connections(20)
.min_connections(5)
.acquire_timeout(Duration::from_secs(10))
.idle_timeout(Duration::from_secs(600))
.connect(&database_url)
.await
.map_err(Into::into)
}
```
### Caching Strategy
```rust
use moka::future::Cache;
struct CachedService {
inner: Arc<InnerService>,
cache: Cache<String, Data>,
}
impl CachedService {
fn new(inner: Arc<InnerService>) -> Self {
let cache = Cache::builder()
.max_capacity(10_000)
.time_to_live(Duration::from_secs(3600))
.time_to_idle(Duration::from_secs(600))
.build();
Self { inner, cache }
}
async fn get_data(&self, id: &str) -> Result<Data> {
// Try cache
if let Some(data) = self.cache.get(id).await {
return Ok(data);
}
// Fetch from source
let data = self.inner.fetch_data(id).await?;
// Update cache
self.cache.insert(id.to_string(), data.clone()).await;
Ok(data)
}
}
```
### Async Best Practices
```rust
// ✅ Good: Concurrent operations
async fn fetch_all_data(&self) -> Result<Vec<Data>> {
let futures = ids.into_iter().map(|id| self.fetch_one(id));
let results = futures_util::future::try_join_all(futures).await?;
Ok(results)
}
// ❌ Bad: Sequential operations
async fn fetch_all_data_slow(&self) -> Result<Vec<Data>> {
let mut results = Vec::new();
for id in ids {
results.push(self.fetch_one(id).await?); // Blocks!
}
Ok(results)
}
// ✅ Good: Timeout for external calls
async fn fetch_with_timeout(&self, id: &str) -> Result<Data> {
tokio::time::timeout(
Duration::from_secs(30),
self.external_api.fetch(id)
)
.await
.map_err(|_| ServiceError::Timeout)?
.map_err(Into::into)
}
```
## Testing
### Unit Testing
```rust
#[cfg(test)]
mod tests {
use super::*;
use mockall::predicate::*;
use mockall::mock;
mock! {
Repository {}
#[async_trait]
impl Repository<User> for Repository {
async fn get(&self, id: &str) -> Result<Option<User>>;
async fn create(&self, user: &User) -> Result<String>;
}
}
#[tokio::test]
async fn test_get_user_found() {
let mut mock_repo = MockRepository::new();
mock_repo
.expect_get()
.with(eq("123"))
.returning(|_| Ok(Some(User {
id: "123".to_string(),
name: "Test".to_string(),
email: "test@example.com".to_string(),
})));
let service = UserService {
repo: Arc::new(mock_repo),
};
let result = service.get_user("123".to_string()).await;
assert!(result.is_ok());
}
#[tokio::test]
async fn test_get_user_not_found() {
let mut mock_repo = MockRepository::new();
mock_repo
.expect_get()
.with(eq("999"))
.returning(|_| Ok(None));
let service = UserService {
repo: Arc::new(mock_repo),
};
let result = service.get_user("999".to_string()).await;
assert!(matches!(result, Err(ServiceError::NotFound(_))));
}
}
```
### Integration Testing
```rust
#[cfg(test)]
mod integration_tests {
use super::*;
use testcontainers::*;
#[tokio::test]
async fn test_full_flow() {
// Set up test database
let docker = clients::Cli::default();
let postgres = docker.run(images::postgres::Postgres::default());
let pool = create_test_pool(&postgres).await;
// Create service
let repo = Arc::new(PostgresRepository::new(pool));
let service = UserService { repo };
// Test create
let user = service.create_user(CreateUserRequest {
name: "Test".to_string(),
email: "test@example.com".to_string(),
password: "password123".to_string(),
age: 25,
}).await.unwrap();
// Test get
let retrieved = service.get_user(user.id.clone()).await.unwrap();
assert_eq!(retrieved.name, "Test");
// Test delete
service.delete_user(user.id).await.unwrap();
}
}
```
## Monitoring and Observability
### Structured Logging
```rust
use tracing::{info, error, warn, instrument};
#[instrument(skip(self), fields(user_id = %id))]
async fn get_user(&self, id: String) -> Result<User> {
info!("Fetching user");
match self.repo.get(&id).await {
Ok(Some(user)) => {
info!("User found");
Ok(user)
}
Ok(None) => {
warn!("User not found");
Err(ServiceError::NotFound(format!("User {}", id)))
}
Err(e) => {
error!("Database error: {}", e);
Err(e.into())
}
}
}
```
### Metrics
```rust
use prometheus::{Counter, Histogram, Registry};
lazy_static! {
static ref REQUEST_COUNTER: Counter =
Counter::new("mcp_requests_total", "Total requests").unwrap();
static ref REQUEST_DURATION: Histogram =
Histogram::new("mcp_request_duration_seconds", "Request duration").unwrap();
static ref ERROR_COUNTER: Counter =
Counter::new("mcp_errors_total", "Total errors").unwrap();
}
async fn handle_request_with_metrics(req: Request) -> Result<Response> {
REQUEST_COUNTER.inc();
let _timer = REQUEST_DURATION.start_timer();
match handle_request(req).await {
Ok(resp) => Ok(resp),
Err(e) => {
ERROR_COUNTER.inc();
Err(e)
}
}
}
```
## Configuration Management
```rust
use config::{Config, ConfigError, Environment, File};
use serde::Deserialize;
#[derive(Debug, Deserialize)]
struct AppConfig {
server: ServerConfig,
database: DatabaseConfig,
cache: CacheConfig,
logging: LoggingConfig,
}
#[derive(Debug, Deserialize)]
struct ServerConfig {
host: String,
port: u16,
timeout_seconds: u64,
}
impl AppConfig {
fn load() -> Result<Self, ConfigError> {
Config::builder()
.add_source(File::with_name("config/default"))
.add_source(File::with_name("config/local").required(false))
.add_source(Environment::with_prefix("APP"))
.build()?
.try_deserialize()
}
}
```
## Best Practices Checklist
### Development
- [ ] Use type-driven design
- [ ] Implement proper error handling
- [ ] Write comprehensive tests
- [ ] Use async properly
- [ ] Follow Rust idioms
### Security
- [ ] Validate all inputs
- [ ] Use parameterized queries
- [ ] Implement authentication
- [ ] Add authorization checks
- [ ] Audit dependencies
### Performance
- [ ] Use connection pooling
- [ ] Implement caching
- [ ] Optimize database queries
- [ ] Use concurrent operations
- [ ] Profile and benchmark
### Operations
- [ ] Add structured logging
- [ ] Implement metrics
- [ ] Create health checks
- [ ] Handle graceful shutdown
- [ ] Document deployment
### Maintenance
- [ ] Version your API
- [ ] Write documentation
- [ ] Create examples
- [ ] Set up CI/CD
- [ ] Monitor in production
## Your Role
When reviewing or designing MCP servers:
1. **Assess Architecture**: Is the design clean and maintainable?
2. **Check Security**: Are inputs validated? Is auth implemented?
3. **Review Performance**: Are operations optimized? Is caching used?
4. **Validate Testing**: Are tests comprehensive? Is coverage good?
5. **Ensure Observability**: Is logging/metrics in place?
Your goal is to help developers build production-ready MCP servers that are secure, performant, maintainable, and observable.Related Skills
vue-pinia-best-practices
Pinia stores, state management patterns, store setup, and reactivity with stores.
vue-options-api-best-practices
Vue 3 Options API style (data(), methods, this context). Each reference shows Options API solution only.
next-best-practices
Next.js best practices - file conventions, RSC boundaries, data patterns, async APIs, metadata, error handling, route handlers, image/font optimization, bundling
security-best-practices
Implement security best practices for web applications and infrastructure. Use when securing APIs, preventing common vulnerabilities, or implementing security policies. Handles HTTPS, CORS, XSS, SQL Injection, CSRF, rate limiting, and OWASP Top 10.
supabase-postgres-best-practices
Postgres performance optimization and best practices from Supabase. Use this skill when writing, reviewing, or optimizing Postgres queries, schema designs, or database configurations.
server-management
Server management principles and decision-making. Process management, monitoring strategy, and scaling decisions. Teaches thinking, not commands.
postgres-best-practices
Postgres performance optimization and best practices from Supabase. Use this skill when writing, reviewing, or optimizing Postgres queries, schema designs, or database configurations.
nodejs-best-practices
Node.js development principles and decision-making. Framework selection, async patterns, security, and architecture. Teaches thinking, not copying.
nextjs-best-practices
Next.js App Router principles. Server Components, data fetching, routing patterns.
cloudformation-best-practices
CloudFormation template optimization, nested stacks, drift detection, and production-ready patterns. Use when writing or reviewing CF templates.
azure-communication-callingserver-java
Azure Communication Services CallingServer (legacy) Java SDK. Note - This SDK is deprecated. Use azure-communication-callautomation instead for new projects. Only use this skill when maintaining legacy code.
aws-serverless
Specialized skill for building production-ready serverless applications on AWS. Covers Lambda functions, API Gateway, DynamoDB, SQS/SNS event-driven patterns, SAM/CDK deployment, and cold start optimization.