analyzing-outlook-pst-for-email-forensics
Analyze Microsoft Outlook PST and OST files for email forensic evidence including message content, headers, attachments, deleted items, and metadata using libpff, pst-utils, and forensic email analysis tools for legal investigations and incident response.
Best use case
analyzing-outlook-pst-for-email-forensics is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Analyze Microsoft Outlook PST and OST files for email forensic evidence including message content, headers, attachments, deleted items, and metadata using libpff, pst-utils, and forensic email analysis tools for legal investigations and incident response.
Teams using analyzing-outlook-pst-for-email-forensics should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/analyzing-outlook-pst-for-email-forensics/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How analyzing-outlook-pst-for-email-forensics Compares
| Feature / Agent | analyzing-outlook-pst-for-email-forensics | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Analyze Microsoft Outlook PST and OST files for email forensic evidence including message content, headers, attachments, deleted items, and metadata using libpff, pst-utils, and forensic email analysis tools for legal investigations and incident response.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
AI Agents for Marketing
Discover AI agents for marketing workflows, from SEO and content production to campaign research, outreach, and analytics.
Best AI Agents for Marketing
A curated list of the best AI agents and skills for marketing teams focused on SEO, content systems, outreach, and campaign execution.
AI Agent for Cold Email Generation
Discover AI agent skills for cold email generation, outreach copy, lead personalization, CRM support, and sales-adjacent messaging workflows.
SKILL.md Source
# Analyzing Outlook PST for Email Forensics
## Overview
Microsoft Outlook PST (Personal Storage Table) and OST (Offline Storage Table) files are critical evidence sources in digital forensics investigations. PST files store email messages, calendar events, contacts, tasks, and notes in a proprietary binary format based on the MAPI (Messaging Application Programming Interface) property system. Forensic analysis of these files enables recovery of deleted emails (from the Recoverable Items folder), extraction of email headers for tracing message routes, analysis of attachments for malware or exfiltrated data, and reconstruction of communication patterns. Modern PST files use Unicode format with 4KB pages and can grow up to 50GB, while legacy ANSI format is limited to 2GB.
## When to Use
- When investigating security incidents that require analyzing outlook pst for email forensics
- When building detection rules or threat hunting queries for this domain
- When SOC analysts need structured procedures for this analysis type
- When validating security monitoring coverage for related attack techniques
## Prerequisites
- libpff/pffexport (open-source PST parser)
- Python 3.8+ with pypff or libratom libraries
- MailXaminer, Forensic Email Collector, or SysTools PST Forensics (commercial)
- Microsoft Outlook (optional, for native PST access)
- Sufficient disk space for extracted content
## PST File Locations
| Source | Path |
|--------|------|
| Outlook 2016+ Default | %USERPROFILE%\Documents\Outlook Files\*.pst |
| Outlook Legacy | %LOCALAPPDATA%\Microsoft\Outlook\*.pst |
| OST Cache | %LOCALAPPDATA%\Microsoft\Outlook\*.ost |
| Archive | %USERPROFILE%\Documents\Outlook Files\archive.pst |
## Analysis with Open-Source Tools
### libpff / pffexport
```bash
# Export all items from PST file
pffexport -m all evidence.pst -t exported_pst
# Export only email messages
pffexport -m items evidence.pst -t exported_emails
# Export recovered/deleted items
pffexport -m recovered evidence.pst -t recovered_items
# Get PST file information
pffinfo evidence.pst
```
### Python PST Analysis
```python
import pypff
import os
import json
import hashlib
import email
import sys
from datetime import datetime
from collections import defaultdict
class PSTForensicAnalyzer:
"""Forensic analysis of Outlook PST/OST files."""
def __init__(self, pst_path: str, output_dir: str):
self.pst_path = pst_path
self.output_dir = output_dir
os.makedirs(output_dir, exist_ok=True)
self.pst = pypff.file()
self.pst.open(pst_path)
self.messages = []
self.attachments = []
self.stats = defaultdict(int)
def process_folder(self, folder, folder_path: str = ""):
"""Recursively process PST folders and extract messages."""
folder_name = folder.name or "Root"
current_path = f"{folder_path}/{folder_name}" if folder_path else folder_name
for i in range(folder.number_of_sub_messages):
try:
message = folder.get_sub_message(i)
msg_data = self.extract_message(message, current_path)
if msg_data:
self.messages.append(msg_data)
self.stats["total_messages"] += 1
except Exception as e:
self.stats["parse_errors"] += 1
for i in range(folder.number_of_sub_folders):
try:
subfolder = folder.get_sub_folder(i)
self.process_folder(subfolder, current_path)
except Exception:
continue
def extract_message(self, message, folder_path: str) -> dict:
"""Extract forensic metadata from a single email message."""
msg_data = {
"folder": folder_path,
"subject": message.subject or "",
"sender": message.sender_name or "",
"sender_email": "",
"creation_time": str(message.creation_time) if message.creation_time else None,
"delivery_time": str(message.delivery_time) if message.delivery_time else None,
"modification_time": str(message.modification_time) if message.modification_time else None,
"has_attachments": message.number_of_attachments > 0,
"attachment_count": message.number_of_attachments,
"body_size": len(message.plain_text_body or b""),
"html_size": len(message.html_body or b""),
}
# Extract transport headers for routing analysis
headers = message.transport_headers
if headers:
msg_data["headers_present"] = True
msg_data["headers_size"] = len(headers)
# Parse key headers
parsed = email.message_from_string(headers)
msg_data["from_header"] = parsed.get("From", "")
msg_data["to_header"] = parsed.get("To", "")
msg_data["date_header"] = parsed.get("Date", "")
msg_data["message_id"] = parsed.get("Message-ID", "")
msg_data["x_originating_ip"] = parsed.get("X-Originating-IP", "")
msg_data["received_headers"] = parsed.get_all("Received", [])
# Process attachments
for j in range(message.number_of_attachments):
try:
attachment = message.get_attachment(j)
att_data = {
"message_subject": msg_data["subject"],
"name": attachment.name or f"attachment_{j}",
"size": attachment.size,
"content_type": "",
}
self.attachments.append(att_data)
self.stats["total_attachments"] += 1
except Exception:
continue
return msg_data
def save_attachments(self, max_size_mb: int = 100):
"""Export attachments to disk for analysis."""
att_dir = os.path.join(self.output_dir, "attachments")
os.makedirs(att_dir, exist_ok=True)
root = self.pst.get_root_folder()
self._save_attachments_recursive(root, att_dir, max_size_mb)
def _save_attachments_recursive(self, folder, att_dir, max_size_mb):
for i in range(folder.number_of_sub_messages):
try:
message = folder.get_sub_message(i)
for j in range(message.number_of_attachments):
att = message.get_attachment(j)
if att.size and att.size < max_size_mb * 1024 * 1024:
name = att.name or f"unknown_{i}_{j}"
safe_name = "".join(c if c.isalnum() or c in ".-_" else "_" for c in name)
path = os.path.join(att_dir, safe_name)
try:
data = att.read_buffer(att.size)
with open(path, "wb") as f:
f.write(data)
except Exception:
continue
except Exception:
continue
for i in range(folder.number_of_sub_folders):
try:
self._save_attachments_recursive(folder.get_sub_folder(i), att_dir, max_size_mb)
except Exception:
continue
def generate_report(self) -> str:
"""Generate comprehensive PST forensic analysis report."""
root = self.pst.get_root_folder()
self.process_folder(root)
report = {
"analysis_timestamp": datetime.now().isoformat(),
"pst_file": self.pst_path,
"pst_size_bytes": os.path.getsize(self.pst_path),
"statistics": dict(self.stats),
"messages": self.messages[:500],
"attachments": self.attachments[:200],
}
report_path = os.path.join(self.output_dir, "pst_forensic_report.json")
with open(report_path, "w") as f:
json.dump(report, f, indent=2, default=str)
print(f"[*] Total messages: {self.stats['total_messages']}")
print(f"[*] Total attachments: {self.stats['total_attachments']}")
print(f"[*] Parse errors: {self.stats['parse_errors']}")
return report_path
def close(self):
self.pst.close()
def main():
if len(sys.argv) < 3:
print("Usage: python process.py <pst_file> <output_dir>")
sys.exit(1)
analyzer = PSTForensicAnalyzer(sys.argv[1], sys.argv[2])
analyzer.generate_report()
analyzer.close()
if __name__ == "__main__":
main()
```
## Email Header Analysis
Key headers for forensic investigation:
| Header | Forensic Value |
|--------|---------------|
| Received | Message routing chain (read bottom to top) |
| X-Originating-IP | Sender's actual IP address |
| Message-ID | Unique identifier for correlation |
| Date | Send timestamp |
| Return-Path | Bounce address (may differ from From) |
| DKIM-Signature | Domain authentication signature |
| Authentication-Results | SPF, DKIM, DMARC verification results |
| X-Mailer | Email client used |
## References
- MailXaminer PST Forensics: https://www.mailxaminer.com/blog/outlook-pst-file-forensics/
- libpff Documentation: https://github.com/libyal/libpff
- PST File Format Specification: https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-pst/
- SANS Email Forensics: https://www.sans.org/blog/email-forensics/
## Example Output
```text
$ pffexport /evidence/jsmith_archive.pst -t /analysis/pst_output
pffexport 20231205 - libpff PST/OST Export Tool
=================================================
Input: /evidence/jsmith_archive.pst (2.3 GB)
Exporting PST contents...
Folders: 45
Messages: 12,456
Attachments: 3,234
Contacts: 567
Calendar: 234
Tasks: 89
Export completed in 3m 42s.
$ python3 pst_analyzer.py /analysis/pst_output /analysis/email_report
PST Forensic Analysis Report
==============================
Source: jsmith_archive.pst (john.smith@corporate.com)
Date Range: 2023-06-01 to 2024-01-18
--- Mailbox Statistics ---
Total Emails: 12,456
Sent: 4,567
Received: 7,889
With Attachments: 3,234
Deleted (recovered): 234
--- Phishing / Suspicious Emails ---
Email #8923
Date: 2024-01-15 14:30:22 UTC
From: "IT Support" <it-support@c0rporate-help.com>
To: john.smith@corporate.com
Subject: Urgent: Password Reset Required
Headers:
Return-Path: bounce@mail-relay.c0rporate-help.com
X-Originating-IP: 203.0.113.55
Received: from mail-relay.c0rporate-help.com (203.0.113.55)
SPF: FAIL (domain c0rporate-help.com)
DKIM: NONE
DMARC: FAIL
Attachments:
- Password_Reset_Form.xlsm (245 KB) SHA-256: 7a3b8c9d...e1f2a3b4
Body Preview: "Dear Employee, Your password will expire in 24 hours.
Please open the attached form to reset your credentials..."
--- Data Exfiltration Indicators ---
Email #9102
Date: 2024-01-16 03:15:45 UTC
From: john.smith@corporate.com
To: j.smith.personal8842@protonmail.com
Subject: (no subject)
Attachments:
- archive_part1.7z (24.5 MB) - encrypted
- archive_part2.7z (24.5 MB) - encrypted
Email #9103
Date: 2024-01-16 03:18:22 UTC
From: john.smith@corporate.com
To: j.smith.personal8842@protonmail.com
Subject: Re:
Attachments:
- archive_part3.7z (18.2 MB) - encrypted
--- Keyword Hits ---
"confidential": 45 emails
"password": 23 emails
"transfer": 12 emails
"resign": 3 emails
"delete evidence": 1 email (Email #9200, 2024-01-17 22:30:00 UTC)
Summary:
Phishing emails detected: 1 (initial compromise vector)
Suspicious sent emails: 5 (to personal accounts with attachments)
Encrypted attachments: 3 (67.2 MB total - possible exfiltration)
Report: /analysis/email_report/pst_forensic_report.json
```Related Skills
testing-for-email-header-injection
Test web application email functionality for SMTP header injection vulnerabilities that allow attackers to inject additional email headers, modify recipients, and abuse contact forms for spam relay.
performing-sqlite-database-forensics
Perform forensic analysis of SQLite databases to recover deleted records from freelists and WAL files, decode encoded timestamps, and extract evidence from browser history, messaging apps, and mobile device databases.
performing-network-forensics-with-wireshark
Capture and analyze network traffic using Wireshark and tshark to reconstruct network events, extract artifacts, and identify malicious communications.
performing-mobile-device-forensics-with-cellebrite
Acquire and analyze mobile device data using Cellebrite UFED and open-source tools to extract communications, location data, and application artifacts.
performing-memory-forensics-with-volatility3
Analyze volatile memory dumps using Volatility 3 to extract running processes, network connections, loaded modules, and evidence of malicious activity.
performing-memory-forensics-with-volatility3-plugins
Analyze memory dumps using Volatility3 plugins to detect injected code, rootkits, credential theft, and malware artifacts in Windows, Linux, and macOS memory images.
performing-linux-log-forensics-investigation
Perform forensic investigation of Linux system logs including syslog, auth.log, systemd journal, kern.log, and application logs to reconstruct user activity, detect unauthorized access, and establish event timelines on compromised Linux systems.
performing-endpoint-forensics-investigation
Performs digital forensics investigation on compromised endpoints including memory acquisition, disk imaging, artifact analysis, and timeline reconstruction. Use when investigating security incidents, collecting evidence for legal proceedings, or analyzing endpoint compromise scope. Activates for requests involving endpoint forensics, memory analysis, disk forensics, or incident investigation.
performing-disk-forensics-investigation
Conducts disk forensics investigations using forensic imaging, file system analysis, artifact recovery, and timeline reconstruction to support incident response cases. Utilizes tools such as FTK Imager, Autopsy, and The Sleuth Kit for evidence acquisition, deleted file recovery, and artifact examination. Activates for requests involving disk forensics, hard drive analysis, forensic imaging, file recovery, evidence acquisition, or digital forensic investigation.
performing-cloud-native-forensics-with-falco
Uses Falco YAML rules for runtime threat detection in containers and Kubernetes, monitoring syscalls for shell spawns, file tampering, network anomalies, and privilege escalation. Manages Falco rules via the Falco gRPC API and parses Falco alert output. Use when building container runtime security or investigating k8s cluster compromises.
performing-cloud-log-forensics-with-athena
Uses AWS Athena to query CloudTrail, VPC Flow Logs, S3 access logs, and ALB logs for forensic investigation. Covers CREATE TABLE DDL with partition projection, forensic SQL queries for detecting unauthorized access, data exfiltration, lateral movement, and privilege escalation. Use when investigating AWS security incidents or building cloud-native forensic workflows at scale.
performing-cloud-forensics-with-aws-cloudtrail
Perform forensic investigation of AWS environments using CloudTrail logs to reconstruct attacker activity, identify compromised credentials, and analyze API call patterns.