pinecone
Managed vector database for production AI applications. Fully managed, auto-scaling, with hybrid search (dense + sparse), metadata filtering, and namespaces. Low latency (<100ms p95). Use for production RAG, recommendation systems, or semantic search at scale. Best for serverless, managed infrastructure.
Best use case
pinecone is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using pinecone should expect more consistent output, faster repeated execution, less prompt rewriting, and better workflow continuity with their supporting tools.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
- You already have the supporting tools or dependencies needed by this skill.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/rag-pinecone/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How pinecone Compares
| Feature / Agent | pinecone | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Managed vector database for production AI applications. Fully managed, auto-scaling, with hybrid search (dense + sparse), metadata filtering, and namespaces. Low latency (<100ms p95). Use for production RAG, recommendation systems, or semantic search at scale. Best for serverless, managed infrastructure.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Pinecone - Managed Vector Database
The vector database for production AI applications.
## When to use Pinecone
**Use when:**
- Need managed, serverless vector database
- Production RAG applications
- Auto-scaling required
- Low latency critical (<100ms)
- Don't want to manage infrastructure
- Need hybrid search (dense + sparse vectors)
**Metrics**:
- Fully managed SaaS
- Auto-scales to billions of vectors
- **p95 latency <100ms**
- 99.9% uptime SLA
**Use alternatives instead**:
- **Chroma**: Self-hosted, open-source
- **FAISS**: Offline, pure similarity search
- **Weaviate**: Self-hosted with more features
## Quick start
### Installation
```bash
pip install pinecone  # formerly published as pinecone-client
```
### Basic usage
```python
from pinecone import Pinecone, ServerlessSpec

# Initialize
pc = Pinecone(api_key="your-api-key")

# Create index
pc.create_index(
    name="my-index",
    dimension=1536,  # Must match embedding dimension
    metric="cosine",  # or "euclidean", "dotproduct"
    spec=ServerlessSpec(cloud="aws", region="us-east-1")
)

# Connect to index
index = pc.Index("my-index")

# Upsert vectors ("..." stands in for the remaining values of each embedding)
index.upsert(vectors=[
    {"id": "vec1", "values": [0.1, 0.2, ...], "metadata": {"category": "A"}},
    {"id": "vec2", "values": [0.3, 0.4, ...], "metadata": {"category": "B"}}
])

# Query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    include_metadata=True
)
print(results["matches"])
```
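The `metric` argument decides how two vectors are compared. As a rough illustration of the three options, here is a pure-Python sketch that runs without a Pinecone account. The function names are made up for this example, and the exact scores Pinecone returns may be scaled differently.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by both vector lengths; compares direction only."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 0.0]
doc = [2.0, 0.0]  # same direction as the query, twice the length

print(cosine_similarity(query, doc))   # ignores the length difference
print(dot_product(query, doc))         # grows with vector length
print(euclidean_distance(query, doc))  # penalizes the length difference
```

Cosine is a common default for text embeddings because it ignores vector length. Whichever metric you pick, it is fixed when the index is created.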
## Core operations
### Create index
```python
# Serverless (recommended)
pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",  # or "gcp", "azure"
        region="us-east-1"
    )
)

# Pod-based (for consistent performance)
# Index names must be unique, so this uses a different name than above
from pinecone import PodSpec

pc.create_index(
    name="my-pod-index",
    dimension=1536,
    metric="cosine",
    spec=PodSpec(
        environment="us-east1-gcp",
        pod_type="p1.x1"
    )
)
```
### Upsert vectors
```python
# Single upsert
index.upsert(vectors=[
    {
        "id": "doc1",
        "values": [0.1, 0.2, ...],  # 1536 dimensions
        "metadata": {
            "text": "Document content",
            "category": "tutorial",
            "timestamp": "2025-01-01"
        }
    }
])

# Batch upsert (recommended)
# `embeddings` and `metadatas` are parallel lists prepared elsewhere
vectors = [
    {"id": f"vec{i}", "values": embedding, "metadata": metadata}
    for i, (embedding, metadata) in enumerate(zip(embeddings, metadatas))
]
index.upsert(vectors=vectors, batch_size=100)
```
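The `batch_size` argument above splits a list that is already in memory. If the vectors come from a generator or a file, a small chunking helper can do the same job without building the full list first. This is a minimal sketch: `chunked` and `vector_stream` are hypothetical names, not part of the Pinecone SDK.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int = 100) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Hypothetical usage against an existing index (not executed here):
# for batch in chunked(vector_stream, size=100):
#     index.upsert(vectors=batch)

print([len(batch) for batch in chunked(range(250), size=100)])
```

Each batch is sent as its own request, so a failure partway through affects only that batch and the ones after it.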
### Query vectors
```python
# Basic query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=10,
    include_metadata=True,
    include_values=False
)

# With metadata filtering
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    filter={"category": {"$eq": "tutorial"}}
)

# Namespace query (include_metadata=True so the loop below can read metadata)
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    namespace="production",
    include_metadata=True
)

# Access results
for match in results["matches"]:
    print(f"ID: {match['id']}")
    print(f"Score: {match['score']}")
    print(f"Metadata: {match['metadata']}")
```
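In a RAG pipeline, the matches are usually turned into a context string for the model. The sketch below shows one way to do that. It uses plain dicts shaped like the entries of `results["matches"]`, and it assumes each record stores its passage under a `text` metadata key, as in the upsert example. `build_context` and `min_score` are hypothetical names, and the score threshold only makes sense for metrics where higher means more similar (cosine, dot product).

```python
from typing import Any, Iterable, Mapping


def build_context(
    matches: Iterable[Mapping[str, Any]],
    min_score: float = 0.0,
    text_key: str = "text",
) -> str:
    """Join the text of sufficiently similar matches, best match first."""
    kept = [
        m for m in matches
        if m["score"] >= min_score and text_key in m["metadata"]
    ]
    kept.sort(key=lambda m: m["score"], reverse=True)
    return "\n\n".join(m["metadata"][text_key] for m in kept)


# Plain dicts standing in for results["matches"]
sample_matches = [
    {"id": "doc2", "score": 0.71, "metadata": {"text": "Second passage"}},
    {"id": "doc1", "score": 0.93, "metadata": {"text": "First passage"}},
    {"id": "doc3", "score": 0.40, "metadata": {"text": "Weak match"}},
]

print(build_context(sample_matches, min_score=0.5))
```

Dropping low-scoring matches keeps weakly related passages out of the prompt. A suitable threshold depends on the embedding model and has to be tuned on real queries.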
### Metadata filtering
```python
# Exact match
filter = {"category": "tutorial"}

# Comparison
filter = {"price": {"$gte": 100}}  # $gt, $gte, $lt, $lte, $ne

# Logical operators
filter = {
    "$and": [
        {"category": "tutorial"},
        {"difficulty": {"$lte": 3}}
    ]
}  # Also: $or

# In operator
filter = {"tags": {"$in": ["python", "ml"]}}
```
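To build intuition for how these operators combine, the following sketch evaluates a filter against a metadata dict locally. It is an approximation written for this guide, not Pinecone's implementation. The `matches` function is hypothetical, and edge cases such as missing fields may behave differently on the server.

```python
from typing import Any, Mapping

_COMPARATORS = {
    "$eq": lambda field, arg: field == arg,
    "$ne": lambda field, arg: field != arg,
    "$gt": lambda field, arg: field > arg,
    "$gte": lambda field, arg: field >= arg,
    "$lt": lambda field, arg: field < arg,
    "$lte": lambda field, arg: field <= arg,
}


def matches(metadata: Mapping[str, Any], flt: Mapping[str, Any]) -> bool:
    """Return True if `metadata` satisfies every condition in `flt`."""
    for key, condition in flt.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        else:
            if key not in metadata:
                return False  # assumption: a missing field never matches
            value = metadata[key]
            if not isinstance(condition, Mapping):
                condition = {"$eq": condition}  # {"field": x} is shorthand
            for op, arg in condition.items():
                if op == "$in":
                    values = value if isinstance(value, list) else [value]
                    ok = any(v in arg for v in values)
                elif op in _COMPARATORS:
                    ok = _COMPARATORS[op](value, arg)
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
    return True


record = {"category": "tutorial", "difficulty": 2, "tags": ["python", "web"]}
print(matches(record, {"$and": [{"category": "tutorial"},
                                {"difficulty": {"$lte": 3}}]}))
print(matches(record, {"tags": {"$in": ["python", "ml"]}}))
```

A helper like this can be handy in unit tests, where filters are checked against fixture data without calling the service.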
## Namespaces
```python
# Partition data by namespace
index.upsert(
    vectors=[{"id": "vec1", "values": [...]}],
    namespace="user-123"
)

# Query specific namespace
results = index.query(
    vector=[...],
    namespace="user-123",
    top_k=5
)

# List namespaces
stats = index.describe_index_stats()
print(stats['namespaces'])
```
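When records for many tenants arrive in one stream, they have to be grouped before upserting, because each upsert call targets a single namespace. A minimal sketch of that grouping step follows. The `tenant_id` field, the `user-` prefix, and the `group_by_namespace` helper are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Mapping


def group_by_namespace(
    records: Iterable[Mapping[str, Any]],
    tenant_key: str = "tenant_id",
) -> Dict[str, List[Dict[str, Any]]]:
    """Group records into per-tenant lists keyed by namespace name."""
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for record in records:
        namespace = f"user-{record[tenant_key]}"
        grouped[namespace].append(
            {"id": record["id"], "values": record["values"]}
        )
    return dict(grouped)


records = [
    {"id": "a", "values": [0.1, 0.2], "tenant_id": 123},
    {"id": "b", "values": [0.3, 0.4], "tenant_id": 456},
    {"id": "c", "values": [0.5, 0.6], "tenant_id": 123},
]
grouped = group_by_namespace(records)
print(sorted(grouped))

# Hypothetical usage against an existing index (not executed here):
# for namespace, vectors in grouped.items():
#     index.upsert(vectors=vectors, namespace=namespace)
```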
## Hybrid search (dense + sparse)
```python
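# Note: sparse-dense vectors need an index created with metric="dotproduct"

# Upsert with sparse vectors
index.upsert(vectors=[
    {
        "id": "doc1",
        "values": [0.1, 0.2, ...],  # Dense vector
        "sparse_values": {
            "indices": [10, 45, 123],  # Token IDs
            "values": [0.5, 0.3, 0.8]  # TF-IDF scores
        },
        "metadata": {"text": "..."}
    }
])

# Hybrid query. index.query() takes no alpha argument; to weight dense
# against sparse, scale the values client-side before querying
# (dense * alpha, sparse * (1 - alpha); 0 = sparse only, 1 = dense only)
results = index.query(
    vector=[0.1, 0.2, ...],
    sparse_vector={
        "indices": [10, 45],
        "values": [0.5, 0.3]
    },
    top_k=5
)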
```
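A dense/sparse weighting (alpha) is typically applied on the client by scaling both vectors before the query is sent. The sketch below shows that convex combination. `hybrid_scale` is a hypothetical helper name, and the sample vectors are shortened for readability.

```python
from typing import Dict, List, Sequence, Tuple


def hybrid_scale(
    dense: Sequence[float],
    sparse: Dict[str, list],
    alpha: float,
) -> Tuple[List[float], Dict[str, list]]:
    """Weight dense values by alpha and sparse values by (1 - alpha).

    alpha=1.0 keeps only the dense signal, alpha=0.0 only the sparse one.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": list(sparse["indices"]),
        "values": [v * (1.0 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse


dense, sparse = hybrid_scale(
    [1.0, 2.0],
    {"indices": [10, 45], "values": [0.5, 0.25]},
    alpha=0.5,
)
print(dense, sparse)

# Hypothetical usage against an existing index (not executed here):
# results = index.query(vector=dense, sparse_vector=sparse, top_k=5)
```

Because the dot product is linear, scaling the query this way is equivalent to scoring with `alpha * dense_score + (1 - alpha) * sparse_score`.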
## LangChain integration
```python
from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings

# Assumes PINECONE_API_KEY and OPENAI_API_KEY are set in the environment
# and `docs` is a list of LangChain Document objects

# Create vector store
vectorstore = PineconeVectorStore.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    index_name="my-index"
)

# Query
results = vectorstore.similarity_search("query", k=5)

# With metadata filter
results = vectorstore.similarity_search(
    "query",
    k=5,
    filter={"category": "tutorial"}
)

# As retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 10})
```
## LlamaIndex integration
```python
from pinecone import Pinecone
from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.vector_stores.pinecone import PineconeVectorStore

# Connect to Pinecone
pc = Pinecone(api_key="your-key")
pinecone_index = pc.Index("my-index")

# Create vector store
vector_store = PineconeVectorStore(pinecone_index=pinecone_index)

# Use in LlamaIndex (`documents` is a list of LlamaIndex Document objects)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
```
## Index management
```python
# List indexes
indexes = pc.list_indexes()
# Describe index
index_info = pc.describe_index("my-index")
print(index_info)
# Get index stats
stats = index.describe_index_stats()
print(f"Total vectors: {stats['total_vector_count']}")
print(f"Namespaces: {stats['namespaces']}")
# Delete index
pc.delete_index("my-index")
```
## Delete vectors
```python
# Delete by ID
index.delete(ids=["vec1", "vec2"])
# Delete by filter
index.delete(filter={"category": "old"})
# Delete all vectors in a namespace
index.delete(delete_all=True, namespace="test")
# Delete all vectors in the default namespace
# (the index itself remains; use pc.delete_index() to remove it)
index.delete(delete_all=True)
```
## Best practices
1. **Use serverless** - Auto-scaling, cost-effective
2. **Batch upserts** - More efficient (100-200 per batch)
3. **Add metadata** - Enable filtering
4. **Use namespaces** - Isolate data by user/tenant
5. **Monitor usage** - Check Pinecone dashboard
6. **Optimize filters** - Index frequently filtered fields
7. **Test with free tier** - 1 index, 100K vectors free
8. **Use hybrid search** - Better quality
9. **Set appropriate dimensions** - Match embedding model
10. **Regular backups** - Export important data
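For practice 9, a cheap guard is to check vector lengths before sending anything to the index, since a dimension mismatch otherwise only surfaces as a failed request. A minimal sketch, where `find_dimension_mismatches` is a hypothetical helper name:

```python
from typing import Any, Iterable, List, Mapping


def find_dimension_mismatches(
    vectors: Iterable[Mapping[str, Any]],
    expected_dim: int,
) -> List[str]:
    """Return the ids of vectors whose length differs from expected_dim."""
    return [v["id"] for v in vectors if len(v["values"]) != expected_dim]


batch = [
    {"id": "ok", "values": [0.0] * 1536},
    {"id": "too-short", "values": [0.0] * 768},
]
bad_ids = find_dimension_mismatches(batch, expected_dim=1536)
print(bad_ids)
```

Running this check once per batch makes it easy to log or skip the offending records instead of losing the whole batch.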
## Performance
| Operation | Latency | Notes |
|-----------|---------|-------|
| Upsert | ~50-100ms | Per batch |
| Query (p50) | ~50ms | Depends on index size |
| Query (p95) | ~100ms | SLA target |
| Metadata filter | ~+10-20ms | Additional overhead |
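These figures depend on index size, region, and network distance, so it is worth measuring them against your own workload. The sketch below computes p50 and p95 from a list of latency samples using the nearest-rank method. The `percentile` helper and the sample numbers are illustrative only.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up latencies in milliseconds, standing in for timed queries
latencies_ms = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print("p50:", percentile(latencies_ms, 50))
print("p95:", percentile(latencies_ms, 95))

# Hypothetical way to collect one sample (not executed here):
# start = time.perf_counter()
# index.query(vector=query_vector, top_k=5)
# latencies_ms.append((time.perf_counter() - start) * 1000)
```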
## Pricing (as of 2025)
**Serverless**:
- $0.096 per million read units
- $0.06 per million write units
- $0.06 per GB storage/month
**Free tier**:
- 1 serverless index
- 100K vectors (1536 dimensions)
- Great for prototyping
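As a worked example of the serverless arithmetic, the sketch below adds up a monthly bill from read units, write units, and storage. The default rates are simply the figures listed in this section, and they may be out of date, so check the pricing page before relying on them. `estimate_monthly_cost` is a hypothetical helper.

```python
def estimate_monthly_cost(
    read_units: float,
    write_units: float,
    storage_gb: float,
    read_rate: float = 0.096,    # USD per million read units (from this section)
    write_rate: float = 0.06,    # USD per million write units (from this section)
    storage_rate: float = 0.06,  # USD per GB per month (from this section)
) -> float:
    """Estimate a monthly serverless bill in USD from usage totals."""
    return (
        read_units / 1_000_000 * read_rate
        + write_units / 1_000_000 * write_rate
        + storage_gb * storage_rate
    )


# 10M read units, 2M write units, 5 GB stored
cost = estimate_monthly_cost(10_000_000, 2_000_000, 5)
print(f"${cost:.2f}")
```

Passing the rates as arguments makes it easy to rerun the estimate when prices change.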
## Resources
- **Website**: https://www.pinecone.io
- **Docs**: https://docs.pinecone.io
- **Console**: https://app.pinecone.io
- **Pricing**: https://www.pinecone.io/pricing