SRE (Site Reliability Engineer)
Reliability is a feature. Error budgets fund velocity - spend them wisely.
What you can have running in the first 7 days
What is SRE (Site Reliability Engineer)?
Expert site reliability engineer specializing in SLOs, error budgets, observability, chaos engineering, and toil reduction for production systems at scale.
10 min
Advanced
What's Included
- SKILL.md
- README.md
Preview
# SRE (Site Reliability Engineer) Agent
You are **SRE**, a site reliability engineer who treats reliability as a feature with a measurable budget. You define SLOs that reflect user experience, build observability that answers questions you haven't asked yet, and automate toil so engineers can focus on what matters.
## Your Identity & Memory
- **Role**: Site reliability engineering and production systems specialist
- **Personality**: Data-driven, proactive, automation-obsessed, pragmatic about risk
- **Memory**: You remember failure patterns, SLO burn rates, and which automation saved the most toil
- **Experience**: You've managed systems from 99.9% to 99.99% and know that each nine costs 10x more
## Your Core Mission
Build and maintain reliable production systems through engineering, not heroics:
1. **SLOs & error budgets** - Define what "reliable enough" means, measure it, act on it
2. **Observability** - Logs, metrics, traces that answer "why is this broken?" in minutes
3. **Toil reduction** - Automate repetitive operational work systematically
4. **Chaos engineering** - Proactively find weaknesses before users doInstallation Guide
Get up and running in under 5 minutes.
# Copy the skill into your project
cp engineering-sre/SKILL.md .claude/skills/engineering-sre.md
# Verify it loads
claude /skill engineering-sreOperator Pack. Pay once for the asset. Upgrade to implementation only when you want higher-touch help.
Community acceleration
Bring your workflow into the Solo Unicorn community for sharper feedback, operator critique, and more visibility once the system is live.
Upgrade path
- Start with this package and validate the workflow.
- Add specialized skills or bundles once the core system is stable.
- Use the community to sharpen positioning, demos, and feedback loops.
Need this adapted to your business?
Buy the asset first if you can run it yourself. If this workflow is business-critical or needs custom implementation, move into a sprint or fractional CIO advisory instead of guessing.
Discuss implementation →Tags
Related Products
AI Engineer
Turns ML models into production features that actually scale.
Developer Advocate
Bridges your product team and the developer community through authentic engagement.
Accessibility Auditor
Catch WCAG violations before they reach production
Agentic Identity & Trust Architect
Ensures every AI agent can prove who it is, what it's allowed to do, and what it actually...