Best use case
sem-guide is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Structural equation modeling with latent variables guide
Teams using sem-guide should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/sem-guide/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How sem-guide Compares
| Feature / Agent | sem-guide | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
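It is a guide to structural equation modeling with latent variables, covering confirmatory factor analysis, path analysis, and model fit assessment in R (lavaan) and Python (semopy).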
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Structural Equation Modeling Guide
Build, estimate, and evaluate structural equation models (SEM) with latent variables using Python (semopy) and R (lavaan), including confirmatory factor analysis and path analysis.
## What Is SEM?
Structural Equation Modeling is a multivariate statistical framework that combines factor analysis and path analysis to test complex theoretical models involving:
- **Observed (manifest) variables**: Directly measured (e.g., survey items, test scores)
- **Latent (unobserved) variables**: Theoretical constructs measured indirectly through observed indicators (e.g., "motivation," "intelligence")
- **Structural paths**: Directional relationships between variables (regression-like)
- **Measurement model**: How latent variables relate to their indicators (CFA)
- **Structural model**: How latent variables relate to each other (path analysis)
## SEM Components
| Component | Description | Diagram Symbol |
|-----------|-------------|---------------|
| Observed variable | Measured directly | Rectangle |
| Latent variable | Inferred from indicators | Oval/circle |
| Regression path | Directional relationship | Single-headed arrow |
| Covariance | Non-directional association | Double-headed arrow |
| Error/residual | Unexplained variance | Small circle with arrow |
## Step 1: Confirmatory Factor Analysis (CFA)
CFA tests whether observed variables load onto hypothesized latent factors.
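To make "load onto" concrete, the following minimal sketch computes the covariance matrix that a one-factor measurement model implies for its indicators. Estimation searches for loadings and variances that bring this implied matrix close to the sample covariance matrix. The function name and all numeric values are hypothetical and chosen only for illustration.

```python
def implied_covariance(loadings, factor_variance, residual_variances):
    """Model-implied covariance matrix for a one-factor CFA.

    Off-diagonal: loading_i * loading_j * factor_variance.
    Diagonal: loading_i**2 * factor_variance + residual_variance_i.
    """
    n = len(loadings)
    sigma = [
        [loadings[i] * loadings[j] * factor_variance for j in range(n)]
        for i in range(n)
    ]
    for i in range(n):
        sigma[i][i] += residual_variances[i]
    return sigma


# Hypothetical values; the first loading is fixed to 1.0 to set the factor's scale.
sigma = implied_covariance(
    loadings=[1.0, 0.8, 0.6],
    factor_variance=2.0,
    residual_variances=[0.5, 0.7, 0.9],
)
for row in sigma:
    print([round(value, 2) for value in row])
```

Indicators with larger loadings share more covariance with each other, which is the pattern CFA tests against the observed data.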
### In R (lavaan)
```r
library(lavaan)
# Define the measurement model
# =~ means "is measured by"
cfa_model <- '
# Latent variable definitions
Motivation =~ mot1 + mot2 + mot3 + mot4
SelfEfficacy =~ se1 + se2 + se3
Performance =~ perf1 + perf2 + perf3 + perf4
# Covariances between latent variables (estimated by default in CFA)
'
# Fit the model
fit <- cfa(cfa_model, data = mydata, estimator = "MLR")
# View results
summary(fit, fit.measures = TRUE, standardized = TRUE)
# Key output to examine:
# - Factor loadings (standardized > 0.5 is desirable)
# - Model fit indices (see table below)
# - Modification indices (for model improvement)
modindices(fit, sort = TRUE, minimum.value = 10)
```
### In Python (semopy)
```python
import semopy
import pandas as pd
# Define model in lavaan-like syntax
model_spec = """
Motivation =~ mot1 + mot2 + mot3 + mot4
SelfEfficacy =~ se1 + se2 + se3
Performance =~ perf1 + perf2 + perf3 + perf4
"""
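# Fit the model
# `data` is assumed to be a pandas DataFrame loaded beforehand, with one
# column per indicator (mot1..mot4, se1..se3, perf1..perf4)
model = semopy.Model(model_spec)
result = model.fit(data)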
# View parameter estimates
print(model.inspect())
# Get fit statistics
stats = semopy.calc_stats(model)
print(stats.T)
```
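Before fitting, it can help to check that the model is over-identified. The sketch below counts degrees of freedom for the three-factor CFA above, assuming marker-variable identification (first loading of each factor fixed to 1), no cross-loadings, no residual covariances, and no mean structure. The function name is hypothetical; the counting rule itself is the standard "unique covariance elements minus free parameters".

```python
def cfa_degrees_of_freedom(indicators_per_factor):
    """Return (known elements, free parameters, df) for a simple-structure CFA.

    Assumes correlated factors, one loading fixed per factor, no
    cross-loadings, no residual covariances, and no mean structure.
    """
    n_factors = len(indicators_per_factor)
    p = sum(indicators_per_factor)                 # number of observed variables
    known = p * (p + 1) // 2                       # unique variances and covariances
    free_loadings = p - n_factors                  # one loading fixed per factor
    residual_variances = p
    factor_variances = n_factors
    factor_covariances = n_factors * (n_factors - 1) // 2
    free = (free_loadings + residual_variances
            + factor_variances + factor_covariances)
    return known, free, known - free


# Motivation (4 items), SelfEfficacy (3 items), Performance (4 items)
known, free, df = cfa_degrees_of_freedom([4, 3, 4])
print(known, free, df)  # 66 25 41
```

A saturated mean structure would add as many parameters as sample means, so it leaves the degrees of freedom unchanged.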
## Step 2: Full Structural Model
After confirming the measurement model, add structural (regression) paths.
### In R (lavaan)
```r
sem_model <- '
# Measurement model
Motivation =~ mot1 + mot2 + mot3 + mot4
SelfEfficacy =~ se1 + se2 + se3
Performance =~ perf1 + perf2 + perf3 + perf4
# Structural model (regressions)
# ~ means "is regressed on"
Performance ~ Motivation + SelfEfficacy
SelfEfficacy ~ Motivation
# Optional: define an indirect effect (requires labeled paths such as a* and b*;
# see Mediation Analysis below)
# indirect := a * b
'
fit <- sem(sem_model, data = mydata, estimator = "MLR")
summary(fit, fit.measures = TRUE, standardized = TRUE, rsquare = TRUE)
```
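When the same model is specified repeatedly, the syntax string can be assembled from plain data structures. This is a small sketch, not part of lavaan or semopy: `build_sem_syntax` is a hypothetical helper that only produces lavaan-style text using the `=~` and `~` operators shown above.

```python
def build_sem_syntax(measurement, structural):
    """Assemble lavaan-style model syntax from two dictionaries.

    measurement: latent variable name -> list of indicator names (uses =~)
    structural:  outcome name -> list of predictor names (uses ~)
    """
    lines = []
    for latent, indicators in measurement.items():
        lines.append(f"{latent} =~ " + " + ".join(indicators))
    for outcome, predictors in structural.items():
        lines.append(f"{outcome} ~ " + " + ".join(predictors))
    return "\n".join(lines)


syntax = build_sem_syntax(
    measurement={
        "Motivation": ["mot1", "mot2", "mot3", "mot4"],
        "SelfEfficacy": ["se1", "se2", "se3"],
        "Performance": ["perf1", "perf2", "perf3", "perf4"],
    },
    structural={
        "Performance": ["Motivation", "SelfEfficacy"],
        "SelfEfficacy": ["Motivation"],
    },
)
print(syntax)
```

Keeping the measurement and structural parts in separate dictionaries mirrors the two-step approach: the measurement part can be fitted alone as a CFA before the regressions are added.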
### Mediation Analysis
```r
mediation_model <- '
# Measurement model
X =~ x1 + x2 + x3
M =~ m1 + m2 + m3
Y =~ y1 + y2 + y3
# Structural model
M ~ a*X # a path
Y ~ b*M + c*X # b path + direct effect c
# Define indirect and total effects
indirect := a * b
total := c + a * b
'
fit <- sem(mediation_model, data = mydata, se = "bootstrap", bootstrap = 1000)
summary(fit, standardized = TRUE)
# Bootstrap confidence intervals for indirect effect
parameterEstimates(fit, boot.ci.type = "bca.simple", standardized = TRUE)
```
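The `:=` definitions in the model above are simple arithmetic on the labeled path coefficients. The sketch below reproduces that arithmetic with hypothetical coefficient values (they are not output from any fitted model) and adds the proportion mediated as a derived quantity.

```python
def mediation_effects(a, b, c):
    """Indirect, direct, and total effects from labeled path coefficients.

    a: X -> M, b: M -> Y, c: direct X -> Y (same labels as the lavaan model).
    """
    indirect = a * b
    total = c + indirect
    proportion = indirect / total if total != 0 else float("nan")
    return {
        "indirect": indirect,
        "direct": c,
        "total": total,
        "proportion_mediated": proportion,
    }


# Hypothetical standardized coefficients
effects = mediation_effects(a=0.5, b=0.4, c=0.2)
print(effects)
```

Because the indirect effect is a product of two estimates, its sampling distribution is typically skewed, which is why the lavaan call above requests bootstrap standard errors and confidence intervals rather than relying on a normal approximation.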
## Model Fit Assessment
### Fit Index Reference Table
| Index | Good Fit | Acceptable | What It Measures |
|-------|----------|------------|-----------------|
| Chi-square (p) | p > 0.05 | Sensitive to N; use with other indices | Exact fit test |
| Chi-square/df | < 2 | < 3 | Parsimony-adjusted exact fit |
| CFI | > 0.95 | > 0.90 | Comparative fit vs. null model |
| TLI | > 0.95 | > 0.90 | Comparative fit vs. null model, penalizing model complexity |
| RMSEA | < 0.06 | < 0.08 | Approximate fit per df |
| SRMR | < 0.08 | < 0.10 | Average residual correlation |
| AIC/BIC | Lower = better | -- | Model comparison (not absolute) |
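Several indices in the table are functions of the model chi-square and the chi-square of the baseline (null) model, in which all indicators are uncorrelated. The sketch below uses the common textbook formulas; software packages differ in details (for example, `n` versus `n - 1` in the RMSEA denominator, or robust corrections), so results may not match lavaan or semopy output exactly. The input values are hypothetical.

```python
import math


def fit_indices(chisq, df, chisq_base, df_base, n):
    """RMSEA, CFI, and TLI from model and baseline chi-square statistics."""
    # RMSEA: misfit in excess of df, per degree of freedom and per observation.
    rmsea = math.sqrt(max(chisq - df, 0.0) / (df * n))

    # CFI: reduction in noncentrality relative to the baseline model.
    d_model = max(chisq - df, 0.0)
    d_base = max(chisq_base - df_base, d_model)
    cfi = 1.0 - d_model / d_base if d_base > 0 else 1.0

    # TLI: compares chi-square/df ratios, so extra parameters are penalized.
    ratio_model = chisq / df
    ratio_base = chisq_base / df_base
    tli = (ratio_base - ratio_model) / (ratio_base - 1.0)

    return {"rmsea": rmsea, "cfi": cfi, "tli": tli}


# Hypothetical values for an 11-indicator model (baseline df = 55)
idx = fit_indices(chisq=82.0, df=41, chisq_base=1055.0, df_base=55, n=400)
print({name: round(value, 3) for name, value in idx.items()})
```

With these inputs the chi-square/df ratio is 2.0, which shows how a model can sit at the edge of one cutoff while clearly passing others.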
### Interpreting Fit
```r
# Extract fit measures in lavaan
fitMeasures(fit, c("chisq", "df", "pvalue", "cfi", "tli", "rmsea",
                   "rmsea.ci.lower", "rmsea.ci.upper", "srmr"))
```
**Reporting template:**
```
The structural equation model demonstrated adequate fit to the data:
chi-square(df) = X.XX, p = .XXX; CFI = .XX; TLI = .XX; RMSEA = .XXX
[90% CI: .XXX, .XXX]; SRMR = .XXX.
```
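Checking each index against the reference table by hand is error-prone, so a small lookup can help. This sketch encodes the CFI, TLI, RMSEA, and SRMR cutoffs from the table above; the function and dictionary names are hypothetical, and the cutoffs are rules of thumb rather than strict tests.

```python
# (good cutoff, acceptable cutoff, higher_is_better), copied from the table above
CUTOFFS = {
    "cfi": (0.95, 0.90, True),
    "tli": (0.95, 0.90, True),
    "rmsea": (0.06, 0.08, False),
    "srmr": (0.08, 0.10, False),
}


def classify_fit(measures):
    """Label each index as 'good', 'acceptable', or 'poor' using CUTOFFS."""
    labels = {}
    for name, value in measures.items():
        good, acceptable, higher_is_better = CUTOFFS[name]
        if higher_is_better:
            passes_good, passes_ok = value > good, value > acceptable
        else:
            passes_good, passes_ok = value < good, value < acceptable
        labels[name] = "good" if passes_good else "acceptable" if passes_ok else "poor"
    return labels


# Hypothetical fit measures
labels = classify_fit({"cfi": 0.959, "tli": 0.945, "rmsea": 0.05, "srmr": 0.09})
print(labels)
```

Mixed labels such as these are common; the wording "adequate fit" in the reporting template should reflect the overall pattern, not the single best index.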
## Model Modification and Comparison
### Modification Indices
```r
# Show top modification indices
mi <- modindices(fit, sort = TRUE)
head(mi, 10)
# Common modifications:
# - Allow error covariances between similarly-worded items
# - Add cross-loadings (if theoretically justified)
# - Remove non-significant paths
```
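A modification index approximates how much the model chi-square would drop if one fixed parameter were freed (a 1-df change), so 3.84 corresponds to the .05 critical value; the CFA example earlier uses a stricter threshold of 10. The sketch below filters a table of candidates by threshold and operator. The rows are hypothetical, the keys follow the column names that `modindices()` returns (`lhs`, `op`, `rhs`, `mi`), and `shortlist_modifications` is an illustrative helper, not a library function.

```python
def shortlist_modifications(rows, min_mi=10.0, allowed_ops=("~~", "=~")):
    """Keep candidates with mi >= min_mi and an allowed operator, largest first."""
    keep = [r for r in rows if r["mi"] >= min_mi and r["op"] in allowed_ops]
    return sorted(keep, key=lambda r: r["mi"], reverse=True)


# Hypothetical modification-index rows
rows = [
    {"lhs": "SelfEfficacy", "op": "=~", "rhs": "mot4", "mi": 11.2},
    {"lhs": "mot1", "op": "~~", "rhs": "mot2", "mi": 24.6},
    {"lhs": "se1", "op": "~~", "rhs": "perf3", "mi": 4.1},
]
shortlist = shortlist_modifications(rows)
for r in shortlist:
    print(r["lhs"], r["op"], r["rhs"], r["mi"])
```

The shortlist is only a starting point: each candidate still needs a theoretical justification before it is added to the model.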
### Model Comparison
```r
# Compare nested models using chi-square difference test
fit1 <- sem(model1, data = mydata) # More constrained
fit2 <- sem(model2, data = mydata) # Less constrained
anova(fit1, fit2) # Chi-square difference test
# For non-nested models, compare AIC/BIC
fitMeasures(fit1, c("aic", "bic"))
fitMeasures(fit2, c("aic", "bic"))
```
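The chi-square difference test behind `anova()` can be sketched in a few lines for ordinary maximum likelihood chi-squares. This is a minimal illustration using only the standard library: `chi2_sf` is a hand-rolled upper-tail probability intended for small integer df, and the chi-square values are hypothetical. Robust (scaled) statistics such as those from MLR need a scaled difference test instead of this plain subtraction.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (small integer df >= 1)."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed forms for df = 1 and df = 2, then step df up by 2 using
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1), with a = df / 2.
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    while 2 * a < df:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chisq_difference(chisq_constrained, df_constrained, chisq_free, df_free):
    """Chi-square difference test for two nested models."""
    delta_chisq = chisq_constrained - chisq_free
    delta_df = df_constrained - df_free
    return delta_chisq, delta_df, chi2_sf(delta_chisq, delta_df)


# Hypothetical nested models: the constrained model fixes one extra parameter
delta, ddf, p = chisq_difference(88.0, 42, 82.0, 41)
print(delta, ddf, round(p, 4))
```

A small p-value indicates that the added constraint significantly worsens fit, which favors the less constrained model.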
## Common Pitfalls
| Issue | Problem | Solution |
|-------|---------|----------|
| Small sample size | Unstable estimates, poor fit | Aim for N >= 200, or 10-20 observations per estimated parameter |
| Too many parameters | Overfitting, non-convergence | Simplify model, use parceling |
| Non-normal data | Biased standard errors | Use MLR estimator or bootstrapping |
| Ignoring missing data | Biased results | Use FIML (full information maximum likelihood; `missing = "fiml"` in lavaan) |
| Data-driven respecification | Capitalizing on chance | Cross-validate with holdout sample |
| Conflating fit with truth | Good fit does not mean correct model | Consider equivalent/alternative models |
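The sample size guideline in the table can be checked before data collection or model fitting. The sketch below applies the N >= 200 and 10-20 observations per parameter rules of thumb; the function name and the example numbers are hypothetical, and these rules are rough heuristics rather than a substitute for a power analysis.

```python
def sample_size_check(n, n_parameters, min_n=200, ratio_low=10, ratio_high=20):
    """Compare a sample size against common SEM rules of thumb."""
    ratio = n / n_parameters
    return {
        "ratio": ratio,
        "meets_min_n": n >= min_n,
        "meets_low_ratio": ratio >= ratio_low,    # at least 10 per parameter
        "meets_high_ratio": ratio >= ratio_high,  # at least 20 per parameter
    }


# Hypothetical study: 300 respondents, 25 free parameters
check = sample_size_check(n=300, n_parameters=25)
print(check)
```

When the lower ratio is met but the higher one is not, simplifying the model reduces the parameter count and improves the ratio without new data.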
## Assumptions and Diagnostics
1. **Multivariate normality**: Check with Mardia's test; use robust estimators (MLR) if violated
2. **Linearity**: SEM assumes linear relationships between variables
3. **No multicollinearity**: Correlations between latent variables should not exceed 0.85
4. **Sufficient sample size**: Rule of thumb: N >= 200 or 10-20 observations per estimated parameter
5. **Correct model specification**: Omitted variables can bias all estimates
```r
# Check multivariate normality
library(MVN)
mvn(mydata[, c("mot1", "mot2", "mot3", "se1", "se2", "se3")],
    mvnTest = "mardia")
# Use robust estimation if non-normal
fit_robust <- sem(sem_model, data = mydata, estimator = "MLR")
```