Internal · Production-readiness

Pod operations decision brief.

Pre-infrastructure document. What pod tier and operating mode to commit to before building Docker images, cron schedulers, and per-client deploy scripts. Numbers grounded in observed RunPod pricing as of 2026-05-05.

Atlas Minds Co. · Signl Select · 2026-05-05

Section 1

Pod tiers — Community vs Secure

Both run identical RTX 4090 hardware (24GB VRAM). The difference is who operates the machine.

Community Cloud

Vetted third-party hosts

$0.34 / hr · $250 / mo @ 24/7
Marketplace-style hosting from vetted operators. About 2× the compute per dollar vs Secure. Supply is variable (at times no 4090s are available). Not suitable for HIPAA / regulated industries. Uptime ~98%. Works well for most B2B SaaS workloads.

Secure Cloud

RunPod-owned datacenters

$0.69 / hr · $504 / mo @ 24/7
Enterprise-grade datacenters with SLA. SOC2-compliant, suitable for regulated industries. Uptime ~99.5%. Better network. Reliable supply. Known issue: host lockout on stop/resume, where ~50% of resume attempts fail on a busy host.
Attribute | Community | Secure
Hourly rate (4090) | $0.34 | $0.69
Monthly @ 24/7 | $250 | $504
Monthly @ 8–6 M–F | $75 | $152
Uptime SLA | ~98% | ~99.5%
Supply availability | Variable | Reliable
Compliance fit | Non-regulated | HIPAA / SOC2
Network | Variable, mostly gigabit | Multi-gigabit, low latency
Best for | Most B2B SaaS | Regulated / customer-facing / demos

Section 2

Operating modes — 24/7 vs 8–6 M–F

24/7 = 730 GPU-hrs/month. 8–6 M–F = 220 GPU-hrs/month (10 hr × 22 weekdays). 8–6 costs 30% of 24/7.
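The hour counts and per-pod monthly figures can be reproduced with a few lines of arithmetic. The sketch below uses only the rates and hours quoted in this brief; the names `HOURLY_RATE`, `HOURS_PER_MONTH`, and `monthly_cost` are illustrative. The exact products come out slightly below the rounded figures used in the tables (for example $248.20 against $250 for Community 24/7).

```python
# Hourly rates are the RTX 4090 figures quoted in Section 1.
HOURLY_RATE = {"community": 0.34, "secure": 0.69}

# GPU-hours per month for each operating mode, as defined in this section.
HOURS_PER_MONTH = {
    "24/7": 730,          # average hours in a month
    "8-6 M-F": 10 * 22,   # 10 hr/day x 22 weekdays = 220
}


def monthly_cost(tier: str, mode: str) -> float:
    """Monthly cost in dollars of one pod for a given tier and operating mode."""
    return HOURLY_RATE[tier] * HOURS_PER_MONTH[mode]


for tier in HOURLY_RATE:
    for mode in HOURS_PER_MONTH:
        print(f"{tier:9} {mode:8} ${monthly_cost(tier, mode):7.2f}")

# Share of 24/7 hours consumed by the business-hours schedule.
ratio = HOURS_PER_MONTH["8-6 M-F"] / HOURS_PER_MONTH["24/7"]
print(f"8-6 share of 24/7 hours: {ratio:.0%}")
```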

Attribute | 24/7 Always-On | 8–6 Business Hours
GPU-hrs / month | 730 | 220
Cost vs 24/7 | 100% | 30%
Operational complexity | Low | High
Cold start risk | None | Daily (mitigated by Docker image)
Required infrastructure | Pod + monitoring | Pod + Docker image + cron + monitoring + state-off-pod
Daily failure modes | Hardware failure (rare) | Cron failure, supply gap, redeploy issues
Best for | <10 pods, customer-facing, regulated | ≥10 pods, sales-team-only, cost-tight
Critical: 8–6 mode does not work without a pre-built Docker image. Without one, every morning starts with roughly 10 minutes of reprovisioning, which defeats the purpose. The savings come from the pod hours not billed overnight and on weekends, not from "free morning provisioning."
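The 8–6 schedule itself reduces to a simple time-window check. The sketch below is a hypothetical helper, not part of any existing script: `pod_should_run` and the `OPEN`/`CLOSE` constants are assumed names, and the window is assumed to be in the client's local time. GitHub Actions schedules are evaluated in UTC, so a cron-based deploy/terminate job would need the window converted accordingly.

```python
from datetime import datetime, time

# 8-6 window boundaries in local time (an assumption for this sketch).
OPEN, CLOSE = time(8, 0), time(18, 0)


def pod_should_run(now: datetime) -> bool:
    """True Monday-Friday from 08:00 (inclusive) to 18:00 (exclusive)."""
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and OPEN <= now.time() < CLOSE
```

A deploy job would act when this flips to true and a terminate job when it flips to false; the check says nothing about whether Community supply is available at that moment.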

Section 3

Cost scenarios — two realistic client profiles

Scenario A — Standard client (250 users, 6 locations, $3,500/loc)

Revenue: $21,000 / mo. Sizing: one pod per location for tenant isolation = 6 pods total. Normal call volume (~50K calls/month).

Mode | Per pod / mo | × 6 pods | + Vol+Platform | Total / mo | Margin %
Secure 24/7 | $504 | $3,024 | $68 | $3,092 | 85.3%
Community 24/7 | $250 | $1,500 | $68 | $1,568 | 92.5%
Secure 8–6 | $152 | $912 | $68 | $980 | 95.3%
Community 8–6 | $75 | $450 | $68 | $518 | 97.5%
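The margin column follows directly from revenue, pod count, per-pod cost, and the fixed Vol+Platform charge. A minimal sketch using the Scenario A inputs; the function and constant names are illustrative.

```python
REVENUE = 6 * 3_500   # 6 locations at $3,500 each = $21,000/mo
PODS = 6              # one pod per location
VOL_PLATFORM = 68     # "Vol+Platform" column, $/mo

# Rounded per-pod monthly costs from the table.
PER_POD = {
    "Secure 24/7": 504,
    "Community 24/7": 250,
    "Secure 8-6": 152,
    "Community 8-6": 75,
}


def total_cost(per_pod: float, pods: int, fixed: float) -> float:
    """Monthly infrastructure cost: pods plus the fixed volume/platform charge."""
    return per_pod * pods + fixed


def margin_pct(revenue: float, cost: float) -> float:
    """Gross margin as a percentage of revenue."""
    return (revenue - cost) / revenue * 100


for mode, per_pod in PER_POD.items():
    cost = total_cost(per_pod, PODS, VOL_PLATFORM)
    print(f"{mode:15} ${cost:>6,.0f}  {margin_pct(REVENUE, cost):.1f}%")
```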

Scenario B — High-volume client (4,000 calls/hour, 800K calls/mo)

Volume: 4,000 calls/hour × 10 hrs × 20 days = 800,000 calls/month. The constraint flips from users per pod to processing throughput. With pipeline parallelism added, 30 pods are required to keep up with peak. At flat $3,500/location pricing, the same $21K revenue must cover 5× more compute, and Secure 24/7 margin falls to 27%.

Mode | Per pod / mo | × 30 pods | + Vol+Platform | Total / mo | Margin @ $21K
Secure 24/7 | $504 | $15,120 | $152 | $15,272 | 27%
Community 24/7 | $250 | $7,500 | $152 | $7,652 | 64%
Secure 8–6 | $152 | $4,560 | $152 | $4,712 | 78%
Community 8–6 | $75 | $2,250 | $152 | $2,402 | 89%
Per-call cost at Community 8–6 (Scenario B): $2,402 ÷ 800,000 = $0.003 per call. Compare to OpenAI Whisper + GPT-4o-mini at ~$0.05/call → 800K calls would cost $40,000/mo in metered API fees. Local-first saves ~$37K/mo at this volume.
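The per-call comparison can be checked with the figures above. This sketch takes the ~$0.05/call metered rate as quoted in this brief rather than from any published price list; the variable names are illustrative.

```python
CALLS_PER_MONTH = 4_000 * 10 * 20   # calls/hr x hrs/day x days = 800,000
LOCAL_COST = 30 * 75 + 152          # 30 Community 8-6 pods + Vol+Platform = $2,402
API_COST_PER_CALL = 0.05            # approximate metered figure quoted above

local_per_call = LOCAL_COST / CALLS_PER_MONTH
api_monthly = API_COST_PER_CALL * CALLS_PER_MONTH
savings = api_monthly - LOCAL_COST

print(f"local: ${local_per_call:.4f}/call, ${LOCAL_COST:,}/mo")
print(f"metered API: ${api_monthly:,.0f}/mo")
print(f"difference: ${savings:,.0f}/mo")
```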

Section 4

The hybrid mode — Community-primary, Secure failover

Production recommendation. Captures Community's cost savings with Secure's reliability backstop.
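The failover logic can be sketched independently of any provider API. In the example below, `Deployer` is a hypothetical callable that returns a pod id, or `None` when the tier has no capacity; it stands in for whatever the real deploy script does and is not a RunPod interface.

```python
from typing import Callable, Optional

# Hypothetical deployer: takes a client id, returns a pod id,
# or None when the tier has no capacity. Not a RunPod API.
Deployer = Callable[[str], Optional[str]]


def deploy_with_failover(client_id: str,
                         deploy_community: Deployer,
                         deploy_secure: Deployer) -> tuple[str, str]:
    """Try Community first; fall back to Secure on a supply gap.

    Returns (tier, pod_id). Raises RuntimeError if both tiers fail.
    """
    pod_id = deploy_community(client_id)
    if pod_id is not None:
        return "community", pod_id
    pod_id = deploy_secure(client_id)
    if pod_id is not None:
        return "secure", pod_id
    raise RuntimeError(f"no capacity on either tier for {client_id}")
```

Passing the deployers in as arguments keeps the decision logic testable without touching real infrastructure.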

Blended rate: ~$0.40/hr. ~85% Community + ~15% Secure failover when supply gaps occur.
At 30 pods 24/7: $8,800/mo, vs $7,500 pure Community or $15,120 pure Secure.
Reliability: Effective SLA. Supply gaps absorbed automatically, with no client-facing impact.
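The blended rate is a weighted average of the two hourly rates. A minimal sketch, assuming the ~85% / ~15% split stated above holds over the month; the exact result is a little under the rounded $0.40/hr, so the 30-pod total lands between about $8,600 (unrounded) and $8,760 (at $0.40/hr).

```python
COMMUNITY_RATE, SECURE_RATE = 0.34, 0.69   # $/hr, from Section 1
SECURE_SHARE = 0.15                        # ~15% of hours on Secure failover

# Weighted average hourly rate across both tiers.
blended = (1 - SECURE_SHARE) * COMMUNITY_RATE + SECURE_SHARE * SECURE_RATE

# 30 pods running 24/7 (730 hours/month each).
monthly_30_pods = blended * 730 * 30
monthly_30_pods_rounded_rate = 0.40 * 730 * 30

print(f"blended rate: ${blended:.4f}/hr")
print(f"30 pods 24/7: ${monthly_30_pods:,.0f}/mo")
print(f"30 pods 24/7 at $0.40/hr: ${monthly_30_pods_rounded_rate:,.0f}/mo")
```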

Section 5

Pricing model — flat tier breaks at high volume

Current enterprise tier is $3,500/location flat. Works for moderate-volume clients. Breaks down at Scenario-B volumes where 30+ pods are needed. The fix is volume-tiered pricing.

Tier | Base / location / mo | Calls included | Overage
Starter | $1,500 | 25,000 | $0.02 / call
Mid | $3,500 | 100,000 | $0.02 / call
Enterprise | $7,500 | 300,000 | $0.015 / call
Unlimited | $12,000 | Unlimited | n/a
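At 800K calls (Scenario B) on the Enterprise tier: $7,500 + (500K × $0.015) = $15,000 per location, or $90,000/mo across 6 locations. This arithmetic applies the full 800K volume to each location's 300K allowance. If the 800K calls are instead spread evenly across the 6 locations (~133K each), every location stays within its allowance and the total is 6 × $7,500 = $45,000/mo. Either figure reflects value delivered better than the $21K flat price. Against Scenario B costs, margin ranges from 83% (Secure 24/7) to 97% (Community 8–6) at $90,000, and from 66% to 95% at $45,000.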
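The tier table translates into a base-plus-overage calculation per location. The sketch below assumes overage applies only to calls beyond the included allowance and that the allowance is per location, which is how the table's "Base / location / mo" column reads; `TIERS` and `location_bill` are illustrative names.

```python
# (base $/location/mo, included calls, overage $/call); None means unlimited.
TIERS = {
    "Starter":    (1_500, 25_000, 0.02),
    "Mid":        (3_500, 100_000, 0.02),
    "Enterprise": (7_500, 300_000, 0.015),
    "Unlimited":  (12_000, None, None),
}


def location_bill(tier: str, calls: int) -> float:
    """Monthly bill for one location: base plus overage beyond the allowance."""
    base, included, overage = TIERS[tier]
    if included is None:
        return float(base)
    return base + max(0, calls - included) * overage


# 800K calls billed against a single Enterprise allowance.
print(location_bill("Enterprise", 800_000))
# 800K calls spread evenly over 6 locations (~133K each).
print(location_bill("Enterprise", 800_000 // 6))
```

The two calls show how sensitive the bill is to whether volume is counted per location or in aggregate, which is worth settling before the pricing model is finalized.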

Section 6

Recommended phased rollout

Phase 1 · 1–3 clients

Community Cloud 24/7

  • ~$250/pod/mo · 92.5% margin
  • No cron infrastructure needed
  • Operational simplicity wins
  • Accept rare supply gaps

Phase 2 · 4–10 clients

Hybrid Community + Secure failover, 24/7

  • ~$300/pod/mo blended · 91% margin
  • Automatic Secure failover on supply gap
  • Eliminates client-facing supply risk
  • Failover script ~1 hr to build

Phase 3 · 10+ clients

Hybrid + per-client 8–6 option

  • Full Docker / cron / monitoring stack
  • Offer 8–6 to cost-sensitive sales clients
  • Keep 24/7 for customer-facing / regulated
  • Mode chosen per-client at onboarding

Section 7

Infrastructure checklist — what unlocks each phase

# | Item | Required for | Build time
1 | Docker image (Node + Ollama + model + repo) | All modes | 1–2 hr
2 | Per-client network volume | All modes | 30 min
3 | Poller state → Supabase (off pod disk) | 8–6 mode | 30 min
4 | deploy-client.sh script | All modes | 1 hr
5 | GitHub Actions cron (deploy/terminate) | 8–6 mode | 30 min
6 | Health monitor + auto-redeploy | All modes | 1 hr
7 | Community→Secure failover script | Hybrid mode | 1 hr
8 | Pipeline parallelism (concurrent processing) | High-volume clients | 4–8 hr

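To reach Phase 2 (Hybrid 24/7): items 1, 2, 4, 6, and 7, ~5 hours total. To reach Phase 3 (full 8–6 capability): add items 3 and 5, ~6 hours total by the estimates above. Item 8 (pipeline parallelism, 4–8 hr) is separate and only needed for high-volume clients.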
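The phase totals can be cross-checked by summing the checklist's build-time estimates. The sketch below transcribes the table with a low and high estimate per item; the short `required_for` labels and the `build_hours` helper are illustrative.

```python
# (item, required_for, min_hours, max_hours), transcribed from the checklist.
CHECKLIST = [
    ("Docker image",              "all",         1.0, 2.0),
    ("Per-client network volume", "all",         0.5, 0.5),
    ("Poller state to Supabase",  "8-6",         0.5, 0.5),
    ("deploy-client.sh",          "all",         1.0, 1.0),
    ("GitHub Actions cron",       "8-6",         0.5, 0.5),
    ("Health monitor",            "all",         1.0, 1.0),
    ("Failover script",           "hybrid",      1.0, 1.0),
    ("Pipeline parallelism",      "high-volume", 4.0, 8.0),
]


def build_hours(required: set[str]) -> tuple[float, float]:
    """Sum (low, high) build-time estimates for items in the given categories."""
    low = sum(lo for _, req, lo, _ in CHECKLIST if req in required)
    high = sum(hi for _, req, _, hi in CHECKLIST if req in required)
    return low, high


print("Phase 2:", build_hours({"all", "hybrid"}))
print("Phase 3:", build_hours({"all", "hybrid", "8-6"}))
print("Phase 3 + high-volume:", build_hours({"all", "hybrid", "8-6", "high-volume"}))
```

These sums cover build time only; testing and debugging time is not part of the checklist estimates.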

Section 8

Bottom line

Today

Community Cloud 24/7 is the right production default. ~$250/pod/mo, 92.5% margin on standard clients, no cron infrastructure to build or maintain. Accept rare supply gaps with manual intervention.

Tomorrow (at scale)

Hybrid Community-primary + Secure failover from 4 clients onward (Phase 2). At 10+ clients (Phase 3), add 8–6 mode as a per-client option for cost-sensitive deployments once Docker/cron/monitoring infrastructure is in place. Volume-tiered pricing for any client north of 100K calls/month.

Decisions required before infrastructure work begins:
1. Pricing model — flat tier or volume-tiered?
2. Phase 1 mode commit — Community 24/7 confirmed, or alternate?
3. Infrastructure greenlight — 5 hours of build to reach Phase 2 (Hybrid 24/7)?