Internal · Production-readiness

Pod operations decision brief

Pre-infrastructure document. What pod tier and operating mode to commit to before building Docker images, cron schedulers, and per-client deploy scripts. Numbers grounded in observed RunPod pricing as of 2026-05-05.

Atlas Minds Co. · Signl Select · 2026-05-05

Section 1

Pod tiers — Community vs Secure

Both run identical RTX 4090 hardware (24GB VRAM). The difference is who operates the machine.

Community Cloud

Vetted third-party hosts

$0.34 / hr · $250 / mo @ 24/7

Marketplace-style hosting from vetted operators. About 2× the compute per dollar vs Secure ($0.34 vs $0.69 per hour). Supply is variable; at times no 4090s are available. Not suitable for HIPAA / regulated industries. Uptime ~98%. Works well for most B2B SaaS workloads.

Secure Cloud

RunPod-owned datacenters

$0.69 / hr · $504 / mo @ 24/7

Enterprise-grade datacenters with SLA. SOC2-compliant, suitable for regulated industries. Uptime ~99.5%. Better network. Reliable supply. Has a host-lockout issue on stop/resume: roughly 50% of resume attempts fail on a busy host.

Attribute	Community	Secure
Hourly rate (4090)	$0.34	$0.69
Monthly @ 24/7	$250	$504
Monthly @ 8–6 M–F	$75	$152
Uptime (approx.)	~98%	~99.5%
Supply availability	Variable	Reliable
Compliance fit	Non-regulated	HIPAA / SOC2
Network	Variable, mostly gigabit	Multi-gigabit, low latency
Best for	Most B2B SaaS	Regulated / customer-facing / demos

Section 2

Operating modes — 24/7 vs 8–6 M–F

24/7 = 730 GPU-hrs/month. 8–6 M–F = 220 GPU-hrs/month (10 hr × 22 weekdays). 8–6 therefore costs about 30% of 24/7 (220 ÷ 730).
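The per-pod monthly figures used throughout this brief follow from the hourly rate times the GPU-hours above. A minimal sketch, using only the rates and hour counts quoted here; the function and dictionary names are illustrative, and the brief rounds the results (for example $248.20 to $250).

```python
# Hourly rates and hour counts are the figures quoted in this brief.
RATES_PER_HOUR = {"community": 0.34, "secure": 0.69}
HOURS_PER_MONTH = {
    "24/7": 730,         # average hours in a month
    "8-6 M-F": 10 * 22,  # 10 hr/day x 22 weekdays = 220
}


def monthly_cost(tier: str, mode: str) -> float:
    """Monthly cost of one pod in USD, before volume and platform fees."""
    return RATES_PER_HOUR[tier] * HOURS_PER_MONTH[mode]


def cost_ratio(mode: str, baseline: str = "24/7") -> float:
    """GPU-hours of `mode` as a fraction of the baseline mode."""
    return HOURS_PER_MONTH[mode] / HOURS_PER_MONTH[baseline]


if __name__ == "__main__":
    for tier in RATES_PER_HOUR:
        for mode in HOURS_PER_MONTH:
            print(f"{tier:9s} {mode:8s} ${monthly_cost(tier, mode):7.2f}")
    print(f"8-6 share of 24/7: {cost_ratio('8-6 M-F'):.0%}")
```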

Attribute	24/7 Always-On	8–6 Business Hours
GPU-hrs / month	730	220
Cost vs 24/7	100%	30%
Operational complexity	Low	High
Cold start risk	None	Daily (mitigated by Docker image)
Required infrastructure	Pod + monitoring	Pod + Docker image + cron + monitoring + state-off-pod
Daily failure modes	Hardware failure (rare)	Cron failure, supply gap, redeploy issues
Best for	<10 pods, customer-facing, regulated	≥10 pods, sales-team-only, cost-tight

Critical: 8–6 mode does not work without a pre-built Docker image. Without one, each morning's reprovisioning takes about 10 minutes, which defeats the purpose. The savings come from fewer billed pod hours, not from "free morning provisioning."
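On the scheduling side, 8–6 mode reduces to a window check that a deploy/terminate scheduler could call. A minimal sketch, assuming the window is 08:00 to 18:00 in the client's local time and that the caller passes a datetime already in that timezone; `pod_should_run` is a hypothetical helper, not part of any existing script.

```python
from datetime import datetime, time

OPEN = time(8, 0)    # 08:00 local time
CLOSE = time(18, 0)  # 18:00 local time (the "6" in 8-6)


def pod_should_run(now: datetime) -> bool:
    """True during 8-6 on Monday-Friday, in the timezone of `now`."""
    is_weekday = now.weekday() < 5  # Monday=0 ... Friday=4
    return is_weekday and OPEN <= now.time() < CLOSE
```

A scheduler that deploys a few minutes before `OPEN` would absorb image pull and startup time; how many minutes is a tuning choice this sketch does not make.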

Section 3

Cost scenarios — two realistic client profiles

Scenario A — Standard client (250 users, 6 locations, $3,500/loc)

Revenue: $21,000 / mo. Sizing: one pod per location for tenant isolation = 6 pods total. Normal call volume (~50K calls/month).

Mode	Per pod / mo	× 6 pods	+ Vol+Platform	Total / mo	Margin %
Secure 24/7	$504	$3,024	$68	$3,092	85.3%
Community 24/7	$250	$1,500	$68	$1,568	92.5%
Secure 8–6	$152	$912	$68	$980	95.3%
Community 8–6	$75	$450	$68	$518	97.5%
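The margin column can be reproduced from the other columns. A small sketch, assuming margin is defined as one minus total pod cost (including the $68 volume and platform fee) over revenue; the function name is illustrative.

```python
def gross_margin(revenue: float, pods: int, per_pod: float, fees: float) -> float:
    """Fraction of revenue left after pod cost and volume/platform fees."""
    total_cost = pods * per_pod + fees
    return 1 - total_cost / revenue


if __name__ == "__main__":
    # Scenario A: $21,000 revenue, 6 pods, $68 volume + platform.
    for label, per_pod in [
        ("Secure 24/7", 504),
        ("Community 24/7", 250),
        ("Secure 8-6", 152),
        ("Community 8-6", 75),
    ]:
        print(f"{label:15s} {gross_margin(21_000, 6, per_pod, 68):.1%}")
```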

Scenario B — High-volume client (4,000 calls/hour, 800K calls/mo)

Volume: 4,000 calls/hour × 10 hrs × 20 days = 800,000 calls/month. The constraint flips from users per pod to processing throughput. With pipeline parallelism added, 30 pods are required to keep up with peak. At flat $3,500/location pricing, the same $21K revenue must cover 5× the compute (30 pods vs 6), and Secure 24/7 margin falls to about 27%.

Mode	Per pod / mo	× 30 pods	+ Vol+Platform	Total / mo	Margin @ $21K
Secure 24/7	$504	$15,120	$152	$15,272	27%
Community 24/7	$250	$7,500	$152	$7,652	64%
Secure 8–6	$152	$4,560	$152	$4,712	78%
Community 8–6	$75	$2,250	$152	$2,402	89%

Per-call cost at Community 8–6 (Scenario B): $2,402 ÷ 800,000 = $0.003 per call. Compare to OpenAI Whisper + GPT-4o-mini at ~$0.05/call → 800K calls would cost $40,000/mo in metered API fees. Local-first saves ~$37K/mo at this volume.
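The comparison above is a straight division and multiplication. A short sketch using the figures quoted in this brief; the ~$0.05/call metered price is the brief's own estimate, and the function names are illustrative.

```python
def per_call_cost(monthly_total: float, calls: int) -> float:
    """Local-first cost per call: monthly infrastructure total over call count."""
    return monthly_total / calls


def metered_cost(calls: int, price_per_call: float) -> float:
    """Monthly bill if every call is paid for at a metered API price."""
    return calls * price_per_call


if __name__ == "__main__":
    calls = 800_000
    local_total = 2_402                  # Community 8-6, Scenario B
    api_total = metered_cost(calls, 0.05)
    print(f"local per call: ${per_call_cost(local_total, calls):.4f}")
    print(f"metered total:  ${api_total:,.0f}")
    print(f"difference:     ${api_total - local_total:,.0f}")
```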

Section 4

The hybrid mode — Community-primary, Secure failover

Production recommendation. Captures Community's cost savings with Secure's reliability backstop.

Metric	Value	Basis
Blended rate	~$0.40/hr	~85% Community + ~15% Secure failover when supply gaps occur
Cost at 30 pods 24/7	$8,800/mo	vs $7,500 pure Community or $15,120 pure Secure
Reliability	Effective SLA	Supply gaps absorbed automatically; no client-facing impact
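The blended rate is a weighted average of the two hourly rates. A minimal sketch, assuming the ~15% Secure share quoted above; the unrounded result is $0.3925/hr, which the brief rounds to ~$0.40, and the fleet figure below uses that rounded rate.

```python
def blended_rate(community_rate: float, secure_rate: float, secure_share: float) -> float:
    """Weighted hourly rate when `secure_share` of pod hours run on Secure."""
    return (1 - secure_share) * community_rate + secure_share * secure_rate


def fleet_monthly(rate_per_hour: float, pods: int, hours: int = 730) -> float:
    """Monthly cost of `pods` pods running `hours` hours each."""
    return rate_per_hour * hours * pods


if __name__ == "__main__":
    rate = blended_rate(0.34, 0.69, secure_share=0.15)
    print(f"blended rate: ${rate:.4f}/hr")
    print(f"30 pods 24/7 at $0.40/hr: ${fleet_monthly(0.40, 30):,.0f}/mo")
```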

Section 5

Pricing model — flat tier breaks at high volume

Current enterprise tier is $3,500/location flat. Works for moderate-volume clients. Breaks down at Scenario-B volumes where 30+ pods are needed. The fix is volume-tiered pricing.

Tier	Base / location / mo	Calls included	Overage
Starter	$1,500	25,000	$0.02 / call
Mid	$3,500	100,000	$0.02 / call
Enterprise	$7,500	300,000	$0.015 / call
Unlimited	$12,000	Unlimited	—

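At Scenario B volume on the Enterprise tier, billing all 800K calls against a single 300K allowance: $7,500 + (500K × $0.015) = $15,000 per location, or $90,000/mo across 6 locations. This assumes each location is billed on the full 800K; if the 800K calls are split across the 6 locations (~133K each), no location exceeds its allowance and the bill is 6 × $7,500 = $45,000/mo. Either figure reflects value delivered better than $21K flat. At $90,000/mo, margin ranges from about 83% (Secure 24/7) to about 97% (Community 8–6).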
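The tier table can be expressed as a small bill calculator, which makes the per-location assumption explicit. A sketch only: the tier values are copied from the table above, while the `Tier` class and `location_bill` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    base: float               # USD per location per month
    included: Optional[int]   # calls included; None means unlimited
    overage: float            # USD per call beyond the allowance


TIERS = {
    "Starter": Tier(1_500, 25_000, 0.02),
    "Mid": Tier(3_500, 100_000, 0.02),
    "Enterprise": Tier(7_500, 300_000, 0.015),
    "Unlimited": Tier(12_000, None, 0.0),
}


def location_bill(tier_name: str, calls: int) -> float:
    """Monthly bill for one location making `calls` calls on the given tier."""
    tier = TIERS[tier_name]
    if tier.included is None:
        return tier.base
    extra_calls = max(0, calls - tier.included)
    return tier.base + extra_calls * tier.overage
```

Calling `location_bill("Enterprise", 800_000)` gives the $15,000 per-location figure, while a location with roughly 133,000 calls stays at the $7,500 base.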

Section 6

Recommended phased rollout

Phase 1 · 1–3 clients

Community Cloud 24/7

~$250/pod/mo · 92.5% margin
No cron infrastructure needed
Operational simplicity wins
Accept rare supply gaps

Phase 2 · 4–10 clients

Hybrid Community + Secure failover, 24/7

~$300/pod/mo blended · 91% margin
Automatic Secure failover on supply gap
Eliminates client-facing supply risk
Failover script ~1 hr to build

Phase 3 · 10+ clients

Hybrid + per-client 8–6 option

Full Docker / cron / monitoring stack
Offer 8–6 to cost-sensitive sales clients
Keep 24/7 for customer-facing / regulated
Mode chosen per-client at onboarding
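The Phase 2 failover step amounts to trying Community first and falling back to Secure when no capacity is returned. A minimal sketch under stated assumptions: `provision` is a hypothetical callable wrapping whatever deploy mechanism is actually used, and it is assumed to return a pod id on success or `None` on a supply gap. No real RunPod API call is shown.

```python
from typing import Callable, Optional, Sequence, Tuple


def deploy_with_failover(
    provision: Callable[[str], Optional[str]],
    tiers: Sequence[str] = ("community", "secure"),
) -> Tuple[str, str]:
    """Try each tier in order; return (tier, pod_id) for the first success.

    `provision` is a hypothetical hook: it takes a tier name and returns a
    pod id, or None when that tier has no capacity.
    """
    for tier in tiers:
        pod_id = provision(tier)
        if pod_id is not None:
            return tier, pod_id
    raise RuntimeError("no capacity available in any tier")
```

Keeping the tier order as a parameter lets a regulated client be pinned to `("secure",)` without a separate code path.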

Section 7

Infrastructure checklist — what unlocks each phase

#	Item	Required for	Build time
1	Docker image (Node + Ollama + model + repo)	All modes	1–2 hr
2	Per-client network volume	All modes	30 min
3	Poller state → Supabase (off pod disk)	8–6 mode	30 min
4	`deploy-client.sh` script	All modes	1 hr
5	GitHub Actions cron (deploy/terminate)	8–6 mode	30 min
6	Health monitor + auto-redeploy	All modes	1 hr
7	Community→Secure failover script	Hybrid mode	1 hr
8	Pipeline parallelism (concurrent processing)	High-volume clients	4–8 hr

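To reach Phase 2 (Hybrid 24/7): ~5 hours total (items 1, 2, 4, 6, 7). To reach Phase 3 (full 8–6 capability): ~6 hours (adds items 3 and 5). Item 8 is needed only for high-volume clients and adds 4–8 hours.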
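The phase totals can be sanity-checked by summing checklist rows. A sketch that treats "1–2 hr" as a (low, high) range and assumes Phase 2 needs the "All modes" and "Hybrid mode" rows while Phase 3 adds the "8–6 mode" rows; that mapping is an interpretation of the table, and item 8 is left out because it applies only to high-volume clients.

```python
# (item number, low hours, high hours, required-for), from the checklist above
CHECKLIST = [
    (1, 1.0, 2.0, "all"),
    (2, 0.5, 0.5, "all"),
    (3, 0.5, 0.5, "8-6"),
    (4, 1.0, 1.0, "all"),
    (5, 0.5, 0.5, "8-6"),
    (6, 1.0, 1.0, "all"),
    (7, 1.0, 1.0, "hybrid"),
    (8, 4.0, 8.0, "high-volume"),
]


def build_hours(required: set) -> tuple:
    """Return (low, high) total hours for rows whose category is in `required`."""
    rows = [row for row in CHECKLIST if row[3] in required]
    return sum(row[1] for row in rows), sum(row[2] for row in rows)


if __name__ == "__main__":
    print("Phase 2:", build_hours({"all", "hybrid"}))
    print("Phase 3:", build_hours({"all", "hybrid", "8-6"}))
```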

Section 8

Bottom line

Today

Community Cloud 24/7 is the right production default. ~$250/pod/mo, 92.5% margin on standard clients, no cron infrastructure to build or maintain. Accept rare supply gaps with manual intervention.

Tomorrow (at scale)

Hybrid Community-primary + Secure failover from Phase 2 (4+ clients). At 10+ clients, add 8–6 mode as a per-client option for cost-sensitive deployments once Docker/cron/monitoring infrastructure is in place. Volume-tiered pricing for any client north of 100K calls/month.

Decisions required before infrastructure work begins:
1. Pricing model — flat tier or volume-tiered?
2. Phase 1 mode commit — Community 24/7 confirmed, or alternate?
3. Infrastructure greenlight — 5 hours of build to reach Phase 2 (Hybrid 24/7)?
