InterviewStack.io

Capacity Planning and Resource Optimization Questions

Covers forecasting, provisioning, and operating compute, memory, storage, and network resources efficiently to meet demand and service level objectives (SLOs). Key skills include monitoring resource utilization metrics such as CPU usage, memory consumption, storage I/O, and network throughput; analyzing historical trends and workload patterns to predict future demand; and planning capacity additions, safety margins, and buffer sizing. Candidates should understand vertical versus horizontal scaling, autoscaling policy design and cooldowns, right-sizing instances or containers, workload placement and isolation, load balancing algorithms, and using spot or preemptible capacity for interruptible workloads. Practical topics include storage planning and archival strategies, database memory tuning and buffer sizing, batching and off-peak processing, model compression and inference optimization for machine learning workloads, alerts and dashboards, stress and validation testing of planned changes, and methods for verifying that capacity decisions meet both performance and cost objectives.
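As a rough illustration of trend-based forecasting with a safety margin, the sketch below fits a least-squares line to evenly spaced peak-demand samples and converts the projection into an instance count. The function names, the linear-trend assumption, and the default 30% margin are illustrative choices, not something the topic description prescribes; real workloads often need seasonality-aware models.

```python
import math


def forecast_required_capacity(history, periods_ahead, safety_margin=0.3):
    """Project demand with a least-squares line, then add a safety margin.

    history: evenly spaced peak-demand samples (e.g. weekly peak requests/s).
    periods_ahead: how many sample periods past the last one to project.
    safety_margin: assumed headroom fraction (0.3 means 30% extra).
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    projected = intercept + slope * (n - 1 + periods_ahead)
    return projected * (1 + safety_margin)


def instances_needed(required_capacity, per_instance_capacity):
    """Round up so the fleet never falls short of the required capacity."""
    return math.ceil(required_capacity / per_instance_capacity)


if __name__ == "__main__":
    weekly_peak_rps = [100, 110, 120, 130]  # made-up sample data
    required = forecast_required_capacity(weekly_peak_rps, periods_ahead=2,
                                          safety_margin=0.2)
    print(required, instances_needed(required, per_instance_capacity=50))
```

The margin is applied to the projection rather than to current demand, so the buffer grows with the forecast instead of lagging behind it.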

Medium | Technical
Describe metrics and experiments you would run to determine whether a recent capacity change (e.g., resizing instances, changing instance counts) met both performance and cost objectives over a 30-day evaluation period. Which statistical tests, dashboards, and KPIs (SLO adherence, cost per request, latency percentiles, utilization) would you use to declare success and detect regressions?
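One way to make the KPIs in this question concrete is sketched below: a nearest-rank latency percentile, SLO adherence as the fraction of requests under a threshold, cost per request, and a two-sided permutation test on the difference in mean latency before and after the change. All function names are hypothetical, and the permutation test is just one reasonable choice of statistical test; it assumes the samples are independent, which raw request latencies often are not.

```python
import math
import random


def percentile(values, pct):
    """Nearest-rank percentile (pct in 0..100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def slo_adherence(latencies_ms, threshold_ms):
    """Fraction of requests that finished within the latency threshold."""
    return sum(1 for v in latencies_ms if v <= threshold_ms) / len(latencies_ms)


def cost_per_request(total_cost, request_count):
    return total_cost / request_count


def permutation_p_value(before, after, trials=2000, seed=0):
    """Two-sided permutation test on the difference in means."""
    rng = random.Random(seed)
    observed = abs(sum(after) / len(after) - sum(before) / len(before))
    pooled = list(before) + list(after)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        a = pooled[:len(before)]
        b = pooled[len(before):]
        if abs(sum(b) / len(b) - sum(a) / len(a)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (trials + 1)
```

A small p-value only says the latency distributions differ; whether the change counts as a success still depends on the direction of the shift and on the cost-per-request and SLO numbers alongside it.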
Hard | Technical
You run critical end-of-day batch pipelines on preemptible VMs to save cost. Preemptions currently make completion time 20% longer, which violates SLAs during peak months. Design an orchestration and retry strategy (including checkpointing frequency, task redundancy or speculative execution, and cost trade-offs) to meet SLAs despite preemptions.
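For the checkpointing-frequency part of this question, a common starting point is Young's first-order approximation, which balances time spent writing checkpoints against work lost at each preemption. The sketch below is an assumption-laden illustration, not the expected answer: it treats preemptions as random with a known mean time between them, and the function names are hypothetical.

```python
import math


def young_checkpoint_interval(checkpoint_cost_s, mean_time_between_preemptions_s):
    """Young's approximation: interval = sqrt(2 * checkpoint cost * MTBF)."""
    return math.sqrt(2 * checkpoint_cost_s * mean_time_between_preemptions_s)


def expected_overhead_fraction(interval_s, checkpoint_cost_s,
                               mean_time_between_preemptions_s):
    """First-order overhead estimate.

    The first term is the time spent writing checkpoints; the second is the
    rework after a preemption, assuming on average half an interval is lost.
    """
    return (checkpoint_cost_s / interval_s
            + interval_s / (2 * mean_time_between_preemptions_s))


if __name__ == "__main__":
    cost_s, mtbp_s = 30, 6000  # made-up numbers: 30 s checkpoint, 100 min MTBF
    interval = young_checkpoint_interval(cost_s, mtbp_s)
    print(interval, expected_overhead_fraction(interval, cost_s, mtbp_s))
```

Comparing the estimated overhead against the 20% slowdown mentioned in the question gives a quick sense of whether checkpoint tuning alone can close the gap or whether redundancy or on-demand fallback is also needed.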
Medium | System Design
Design a mixed-instance autoscaling strategy that allows up to 50% of web-service capacity to come from spot instances. Describe policies for diversified spot pools, on-demand fallback, graceful handling of spot termination notifications, connection draining, checkpointing, and scaling gracefully when spot capacity disappears suddenly.
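The 50% cap, pool diversification, and on-demand fallback can be sketched as a small allocation routine. Everything here is hypothetical (the function names, the pool names, and the idea that per-pool availability is known up front); real cloud autoscalers expose this through their own mixed-instance policies rather than code like this.

```python
import math


def spread_across_pools(target, pool_available):
    """Allocate spot instances one at a time to the least-used pool with room.

    pool_available: dict of pool name -> instances currently obtainable.
    Returns a dict of pool name -> instances allocated.
    """
    allocation = {name: 0 for name in pool_available}
    remaining = target
    while remaining > 0:
        candidates = [n for n in pool_available
                      if allocation[n] < pool_available[n]]
        if not candidates:
            break  # spot capacity exhausted; the caller falls back to on-demand
        pick = min(candidates, key=lambda n: (allocation[n], n))
        allocation[pick] += 1
        remaining -= 1
    return allocation


def plan_mixed_capacity(desired, pool_available, max_spot_fraction=0.5):
    """Return (spot allocation per pool, on-demand count) for a desired fleet size."""
    spot_target = math.floor(desired * max_spot_fraction)
    spot = spread_across_pools(spot_target, pool_available)
    on_demand = desired - sum(spot.values())
    return spot, on_demand


if __name__ == "__main__":
    pools = {"pool-a": 2, "pool-b": 10, "pool-c": 10}  # made-up availability
    print(plan_mixed_capacity(20, pools))
```

Because the on-demand count is computed as whatever spot could not supply, rerunning the plan after a spot pool disappears automatically shifts the shortfall to on-demand capacity.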
Hard | System Design
Design capacity and failover for a hybrid-cloud deployment where primary production runs in cloud provider A and a constrained on-prem cluster serves as backup. Discuss data replication choices, network egress costs, capacity buffering on-prem, service prioritization for failover, and how to meet RPO/RTO targets under limited on-prem capacity.
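Service prioritization under limited on-prem capacity can be illustrated with a greedy admission pass: admit services in priority order until the backup cluster is full, and shed the rest. This is a minimal sketch under assumed inputs (a single abstract "capacity unit" per service and a numeric priority where lower means more critical); it ignores dependencies between services, which a real plan would need to model.

```python
def select_services_for_failover(services, onprem_capacity):
    """Greedy admission by priority.

    services: list of (name, priority, required_units); a lower priority
    number means more critical. Ties are broken by smaller footprint first.
    Returns (admitted names, shed names, leftover capacity units).
    """
    admitted, shed = [], []
    remaining = onprem_capacity
    for name, _priority, units in sorted(services, key=lambda s: (s[1], s[2])):
        if units <= remaining:
            admitted.append(name)
            remaining -= units
        else:
            shed.append(name)
    return admitted, shed, remaining


if __name__ == "__main__":
    catalog = [  # made-up services and sizes
        ("checkout", 0, 40),
        ("search", 1, 50),
        ("reports", 2, 30),
        ("recommendations", 2, 10),
    ]
    print(select_services_for_failover(catalog, onprem_capacity=100))
```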
Hard | System Design
Case: You're building capacity for a global ML inference platform serving multiple customers with variable loads and strict p99 latency SLAs. Decide multi-tenant isolation strategy, autoscaling granularity (per model vs per host), GPU vs CPU allocation, model packing vs single-model containers, cold-start mitigation, headroom policy, and per-customer cost allocation. Provide a high-level capacity plan and justification.
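For the headroom-policy part of this case, a first-order replica estimate can be derived from Little's law (mean concurrency equals arrival rate times mean service time) and a target utilization that leaves room for bursts. The sketch below is only a starting point under assumed inputs; it says nothing about p99 latency by itself, which would need load testing or a queueing model to validate. The function name and parameters are hypothetical.

```python
import math


def replicas_for_model(arrival_rate_rps, mean_service_time_s,
                       per_replica_concurrency, target_utilization=0.6):
    """Estimate replicas for one model.

    arrival_rate_rps: peak request rate for the model.
    mean_service_time_s: mean inference time per request.
    per_replica_concurrency: requests one replica can serve at once.
    target_utilization: assumed ceiling (0 < value <= 1); the remainder is headroom.
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Little's law: mean number of in-flight requests.
    mean_concurrency = arrival_rate_rps * mean_service_time_s
    usable_per_replica = per_replica_concurrency * target_utilization
    return max(1, math.ceil(mean_concurrency / usable_per_replica))


if __name__ == "__main__":
    # Made-up numbers: 40 requests/s, 250 ms per inference, 4 concurrent slots.
    print(replicas_for_model(40, 0.25, 4, target_utilization=0.5))
```

Running the estimate per model rather than per host makes the per-customer cost allocation straightforward, since each model's replica count maps directly to its own traffic.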
