project-knowledge
CEI architecture, modules, data flows, conventions, tech stack decisions
Best use case
project-knowledge is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
CEI architecture, modules, data flows, conventions, tech stack decisions
Teams using project-knowledge should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/project-knowledge/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How project-knowledge Compares
| Feature / Agent | project-knowledge | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
CEI architecture, modules, data flows, conventions, tech stack decisions
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Project Knowledge — CEI-001
## Project Overview
**Name:** CEI-001 — Guide Interactif Pré-Projet ERP
**Purpose:** Evaluate ERP implementation readiness for small manufacturing enterprises
**Users:** SME manufacturers, CEI consultants, admins
**Timeline:** 50 hours forfait
**Budget:** Free access for users, admin requires auth
## Architecture Decisions
| Decision | Rationale |
|----------|-----------|
| **Chat + Evaluation hybrid** | Chat for exploration, Evaluation for structured assessment |
| **OpenAI GPT-4** | Quality > cost for strategic consulting |
| **Weaviate RAG** | Open source, semantic search, admin-friendly |
| **PostgreSQL** | Relational, JSON support, proven reliability |
| **FastAPI** | Async native, auto-docs, type safety |
| **React + TypeScript** | Type safety, ecosystem maturity |
| **JWT auth** | Stateless, simple for admin-only protection |
| **Docker Compose** | Easy deployment, local development |
## Core Modules (8)
1. **Vision & Objectives** — Why ERP? Strategic alignment
2. **Organizational Prep** — Stakeholders, roles, change management
3. **Data & Processes** — Inventory, quality, documentation
4. **Technical Infrastructure** — Current setup, connectivity needs
5. **Resources & Budget** — Costs, availability, timeline
6. **Pitfalls to Avoid** — Common failures, risks
7. **Implementation Process** — Phases, deliverables, success criteria
8. **Post-Implementation** — Training, support, optimization
## Data Flows
### Chat Flow
```
User input → Frontend
→ POST /api/chat/message
→ Save message (PostgreSQL)
→ Query Weaviate (semantic search)
→ Build RAG context
→ Call OpenAI API (with context)
→ Stream response back
→ Save assistant message
→ Frontend displays with sources
```
### Evaluation Flow
```
User starts evaluation → Load questions (8 modules)
→ User answers module by module
→ Answers saved to PostgreSQL
→ On completion:
→ Scoring engine calculates scores
→ Generate recommendations
→ Create report
→ Return PDF
```
### Admin Document Flow
```
Admin uploads document → Upload to server
→ Save metadata (PostgreSQL)
→ Start pipeline:
→ Anonymize (OpenAI)
→ Whitelabel (OpenAI)
→ Normalize (OpenAI)
→ Enrich with summary (OpenAI)
→ Generate Q&A (OpenAI)
→ Chunk for RAG
→ Index into Weaviate
→ Publish
```
## Key Entities
- **User:** Email-based, role-based access
- **Conversation:** Chat history, multi-turn context
- **Message:** User/assistant messages with sources
- **Evaluation:** User's assessment session
- **Answer:** User's response to each question
- **Question:** Pre-defined evaluation questions (8 modules)
- **Document:** Knowledge base documents
- **DocumentChunk:** Indexed document sections (Weaviate)
## Naming Conventions
- **Routes:** `/api/[resource]/[action]`
- **Tables:** `lowercase_plural`
- **Columns:** `snake_case`
- **Models:** `PascalCase`
- **Functions:** `camelCase` (Python: `snake_case`)
- **Components:** `PascalCase.tsx`
- **Hooks:** `useXxx`
## Configuration
```python
# Core
DEBUG = False
ENVIRONMENT = "production"
# Database
DATABASE_URL = "postgresql+asyncpg://user:pass@localhost:5432/cei"
# Weaviate
WEAVIATE_HOST = "weaviate:8080"
WEAVIATE_SCHEME = "http"
# OpenAI
OPENAI_API_KEY = "sk-..."
OPENAI_MODEL = "gpt-4-turbo-preview"
OPENAI_EMBEDDING_MODEL = "text-embedding-3-small"
# Auth
JWT_SECRET = "your-secret-key-32-chars-min"
JWT_EXPIRE_HOURS = 24
# Frontend
VITE_API_URL = "https://api.yourdomain.com"
```
## Tech Stack Summary
| Layer | Technology | Why |
|-------|-----------|-----|
| Frontend | React 18 + TS | Type safety, ecosystem |
| Styling | TailwindCSS 3 | Rapid, consistent UI |
| Build | Vite 5 | Fast HMR, modern |
| Backend | FastAPI 0.109 | Async, auto-docs |
| Database | PostgreSQL 16 | Relational, JSON |
| ORM | SQLAlchemy 2.0 | Async support, mature |
| Vector DB | Weaviate 1.24 | Open source, semantic |
| LLM | OpenAI API | Quality responses |
| Auth | JWT + bcrypt | Standard, simple |
| Container | Docker Compose | Multi-service |
---
---
name: rag-weaviate
description: Document indexing, semantic search, RAG pipelines, chunking, Weaviate integration
---
# RAG & Weaviate — CEI-001
## Weaviate Schema
```python
# app/services/rag_service.py
from weaviate import Client
import weaviate.classes as wvc
class RAGService:
def __init__(self, weaviate_url: str):
self.client = Client(f"http://{weaviate_url}")
self._ensure_schema()
def _ensure_schema(self):
"""Create Weaviate schema if not exists"""
# Document class for indexed documents
self.client.collections.create(
name="Document",
description="CEI knowledge base documents",
vectorizer_config=wvc.Configure.Vectorizer.text2vec_openai(),
properties=[
wvc.Property(
name="title",
data_type=wvc.DataType.TEXT,
description="Document title"
),
wvc.Property(
name="content",
data_type=wvc.DataType.TEXT,
description="Document chunk content"
),
wvc.Property(
name="section",
data_type=wvc.DataType.TEXT,
description="Section title"
),
wvc.Property(
name="module",
data_type=wvc.DataType.TEXT,
description="Evaluation module (vision, org, data, etc.)"
),
wvc.Property(
name="document_id",
data_type=wvc.DataType.UUID,
description="PostgreSQL document ID"
),
wvc.Property(
name="chunk_index",
data_type=wvc.DataType.INT,
description="Chunk position in document"
),
]
)
```
## Indexing Pipeline
```python
async def index_document(self, doc_id: str, chunks: List[str]):
"""Index document chunks into Weaviate"""
collection = self.client.collections.get("Document")
# Prepare objects
objects = []
for idx, chunk in enumerate(chunks):
obj = wvc.DataObject(
properties={
"title": f"Document {doc_id}",
"content": chunk,
"section": "unknown",
"module": "general",
"document_id": doc_id,
"chunk_index": idx,
}
)
objects.append(obj)
# Batch import
uuids = collection.data.insert_multiple(objects)
return uuids
async def search(self, query: str, limit: int = 3):
"""Semantic search in Weaviate"""
collection = self.client.collections.get("Document")
results = collection.query.near_text(
query=query,
limit=limit,
where_filter=wvc.Filter.by_property("module").not_equal("archived")
).objects
return [
{
"title": obj.properties["title"],
"content": obj.properties["content"],
"section": obj.properties["section"],
"module": obj.properties["module"],
"score": obj.metadata.score
}
for obj in results
]
async def reindex_document(self, doc_id: str):
"""Remove old chunks and reindex"""
collection = self.client.collections.get("Document")
# Delete old chunks
collection.data.delete_many(
where=wvc.Filter.by_property("document_id").equal(doc_id)
)
```
## Chunking Strategy
```python
def chunk_text(
content: str,
chunk_size: int = 800,
chunk_overlap: int = 100
) -> List[str]:
"""Smart chunking: split by paragraphs, then sentences"""
chunks = []
paragraphs = content.split('\n\n')
current_chunk = ""
for para in paragraphs:
if len(current_chunk) + len(para) < chunk_size:
current_chunk += para + "\n\n"
else:
if current_chunk:
chunks.append(current_chunk.strip())
# Handle overlap
if len(para) > chunk_overlap:
current_chunk = para
else:
current_chunk = para
if current_chunk:
chunks.append(current_chunk.strip())
return chunks
```
## RAG Response Generation
```python
async def generate_rag_response(
self,
user_query: str,
chat_history: List[Dict],
openai_client: AsyncOpenAI
) -> Tuple[str, List[Dict]]:
"""Generate response with RAG context"""
# 1. Search knowledge base
context_docs = await self.search(user_query, limit=3)
# 2. Build context
context_text = "\n\n".join([
f"Source: {doc['title']}\n{doc['content']}"
for doc in context_docs
])
# 3. Build prompt
system_prompt = f"""Tu es un expert ERP pour PME manufacturières.
Contexte de connaissances:
{context_text}
Réponds en utilisant ce contexte. Cite les sources quand pertinent.
Sois concis et pratique."""
# 4. Call OpenAI
response = await openai_client.messages.create(
model="gpt-4-turbo-preview",
max_tokens=1024,
system=system_prompt,
messages=chat_history
)
return response.content[0].text, context_docs
```
## Embedding Configuration
```python
# app/config.py
OPENAI_EMBEDDING_MODEL = "text-embedding-3-small"
OPENAI_EMBEDDING_DIMENSION = 1536
# Cost optimization: use smaller embeddings
# text-embedding-3-small: 1536 dimensions, cheap
# text-embedding-3-large: 3072 dimensions, more precise
```
## Similarity Threshold
```python
# Search with confidence threshold
async def search_with_confidence(self, query: str, min_score: float = 0.5):
"""Only return results above confidence threshold"""
results = await self.search(query, limit=5)
return [
r for r in results
if r["score"] >= min_score
]
```
## Conventions
- Chunk size: 800 tokens (good for context windows)
- Chunk overlap: 100 tokens (preserve context)
- Min similarity: 0.5 (high confidence)
- Update frequency: on document publish
- Archive old versions (don't delete)Related Skills
startup-business-analyst-financial-projections
Create detailed 3-5 year financial model with revenue, costs, cash flow, and scenarios
project-specification-writer
Generate a complete software specification document for the current project/repo, including architecture, data model, key processes, pseudocode, and Mermaid diagrams (context, container/deployment, module relations, sequence, ER, class, flowchart, state).
project-scaffolding
Project type detection matrix, template recommendations per project type, post-scaffolding checklist, Harness integration patterns, and testing recommendations
[PROJECT]-deployment-patterns
[PROJECT] CI/CD pipeline and deployment automation patterns
maintain-project-rules
Audit and maintain project rules in .cursor/rules/. Use when auditing project rules, checking prefix convention, syncing doc/rules.md, or when the user asks about .cursor/rules or prefix convention.
fiber-logging-and-project-structure
Applies best practices for logging, project structure, and environment variable usage specifically to the main application file.
azure-ai-projects-java
Azure AI Projects SDK for Java. High-level SDK for Azure AI Foundry project management including connections, datasets, indexes, and evaluations.
azure-ai-projects-dotnet
Azure AI Projects SDK for .NET. High-level client for Azure AI Foundry projects including agents, connections, datasets, deployments, evaluations, and indexes.
acc-outbox-pattern-knowledge
Outbox Pattern knowledge base. Provides patterns, antipatterns, and PHP-specific guidelines for transactional outbox, polling publisher, and reliable messaging audits.
acc-event-sourcing-knowledge
Event Sourcing knowledge base. Provides patterns, antipatterns, and PHP-specific guidelines for Event Sourcing architecture audits.
Tero Voice Project Context
Load full project context, tech stack, status, and guidelines for the AI Receptionist SaaS project
systems-programming-rust-project
You are a Rust project architecture expert specializing in scaffolding production-ready Rust applications. Generate complete project structures with cargo tooling, proper module organization, testing