InterviewStack.io

Capacity Planning and Resource Optimization Questions

Covers forecasting, provisioning, and operating compute, memory, storage, and network resources efficiently to meet demand and service level objectives. Key skills include monitoring resource utilization metrics such as central processing unit (CPU) usage, memory consumption, storage input/output, and network throughput; analyzing historical trends and workload patterns to predict future demand; and planning capacity additions, safety margins, and buffer sizing. Candidates should understand vertical versus horizontal scaling, autoscaling policy design and cooldowns, right-sizing instances or containers, workload placement and isolation, load balancing algorithms, and the use of spot or preemptible capacity for interruptible workloads. Practical topics include storage planning and archival strategies, database memory tuning and buffer sizing, batching and off-peak processing, model compression and inference optimization for machine learning workloads, alerts and dashboards, stress and validation testing of planned changes, and methods for verifying that capacity decisions meet both performance and cost objectives.

Medium · Technical
For serving a transformer-based NLP model at scale, propose optimizations: quantization, knowledge distillation, operator fusion, batching strategies, and hardware choices. Explain the expected impact of each on memory, latency, and accuracy and how to measure trade-offs.
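As a starting point for the memory side of this question, the following is a minimal sketch of how weight storage scales with numeric precision. It assumes memory is dominated by model weights and ignores activations, KV caches, and quantization metadata such as scales and zero points; the function names and the parameter count are illustrative, not taken from any particular model or library.

```python
# Rough weight-memory estimate per precision (assumption: weights dominate;
# activations, caches, and per-tensor quantization metadata are ignored).
BITS_PER_DTYPE = {"fp32": 32, "fp16": 16, "int8": 8, "int4": 4}


def weight_memory_bytes(num_params: int, dtype: str) -> float:
    """Bytes needed to store `num_params` weights at the given precision."""
    return num_params * BITS_PER_DTYPE[dtype] / 8


def precision_report(num_params: int) -> dict:
    """Map each precision to its size in GiB and its reduction versus fp32."""
    baseline = weight_memory_bytes(num_params, "fp32")
    report = {}
    for dtype in BITS_PER_DTYPE:
        size = weight_memory_bytes(num_params, dtype)
        report[dtype] = {
            "gib": size / 2**30,
            "reduction_vs_fp32": baseline / size,
        }
    return report


if __name__ == "__main__":
    # Hypothetical model size, chosen only for illustration.
    for dtype, row in precision_report(1_000_000_000).items():
        print(f"{dtype}: {row['gib']:.2f} GiB, {row['reduction_vs_fp32']:.0f}x smaller than fp32")
```

An estimate like this covers only memory. Latency and accuracy still have to be measured empirically, for example by comparing tail latency and a task metric on a held-out set before and after each optimization.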
Hard · Technical
Given probabilistic demand forecasts that provide P50 and P95 estimates for peak capacity over the next quarter, describe how you would set safety margins and procurement decisions to balance cost vs risk. Explain scenario-based provisioning (e.g., P50 for normal, P95 for holiday), hedging mechanisms, and how to re-evaluate as new data arrives.
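One way to reason about the numbers in this question is sketched below. It assumes, purely for illustration, that peak demand is approximately normally distributed, so that the gap between P50 and P95 determines a standard deviation; real demand is often skewed, in which case the forecast's own quantiles should be used instead. The function names and the committed/burst split are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def capacity_at_quantile(p50: float, p95: float, quantile: float) -> float:
    """Capacity needed to cover `quantile` of forecast peaks.

    Assumption: demand ~ Normal(mean=p50, sigma) with sigma inferred from
    the P50-to-P95 spread.
    """
    sigma = (p95 - p50) / _STD_NORMAL.inv_cdf(0.95)
    return p50 + sigma * _STD_NORMAL.inv_cdf(quantile)


def split_commitment(p50: float, p95: float) -> dict:
    """Hedge: commit (reserve) up to P50, keep the P50-to-P95 gap as
    on-demand or burst capacity that is only paid for when used."""
    return {"committed": p50, "burst": p95 - p50}


if __name__ == "__main__":
    # Hypothetical forecast values in arbitrary capacity units.
    p50, p95 = 1000.0, 1400.0
    for q in (0.50, 0.90, 0.95, 0.99):
        print(f"P{int(q * 100)}: {capacity_at_quantile(p50, p95, q):.0f}")
    print(split_commitment(p50, p95))
```

Re-running the calculation whenever the forecast is refreshed gives a simple re-evaluation loop: if the updated P95 exceeds committed plus burst headroom, that is a signal to revisit procurement.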
Medium · System Design
A relational database shows high read latency during peaks. Compare vertical scaling, read replicas, sharding, and caching (e.g., Redis) as approaches to reduce read latency. For each option discuss capacity implications, cost, and operational complexity.
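To compare the capacity implications of caching and read replicas, a back-of-the-envelope model can help. The sketch below assumes that cache hits never reach the database and that the remaining reads are spread evenly across the primary and its replicas; both are simplifications, and the function name and inputs are hypothetical.

```python
def per_node_read_qps(
    total_read_qps: float,
    cache_hit_ratio: float,
    read_replicas: int,
    primary_serves_reads: bool = True,
) -> float:
    """Estimated read QPS each database node must absorb.

    Assumptions: cache hits bypass the database entirely, and cache misses
    are balanced evenly across all nodes that serve reads.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    nodes = read_replicas + (1 if primary_serves_reads else 0)
    if nodes < 1:
        raise ValueError("at least one node must serve reads")
    misses = total_read_qps * (1.0 - cache_hit_ratio)
    return misses / nodes


if __name__ == "__main__":
    # Hypothetical peak: 10,000 reads/s, 75% cache hit ratio, 4 replicas.
    print(per_node_read_qps(10_000, 0.75, 4))  # 500.0
```

The model makes one trade-off visible: raising the hit ratio reduces load on every node at once, while adding replicas only divides the misses, and each replica adds cost and replication lag to manage.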
Hard · Technical
You're experiencing high cross-region egress costs and intermittent packet loss on inter-region microservice calls. Propose a plan to optimize network capacity, reduce egress, and improve reliability while satisfying data residency constraints. Include options like replication, caching, API gateway consolidation, and traffic shaping.
Hard · System Design
You need to roll out a major schema migration that increases database load while it runs (backfills, joins). Design a capacity-aware CI/CD deployment strategy to avoid overload: include canary rollout plans, throttling and batching of migration jobs, online schema migration techniques, and rollback triggers.
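The throttling and batching part of such a strategy could look roughly like the sketch below. It assumes rows can be backfilled by contiguous integer ID ranges and that some load signal (for example replica lag or CPU utilization, normalized to 0..1) is available. `apply_batch` and `get_load` are hypothetical callables supplied by the caller, not part of any migration tool.

```python
import time
from typing import Callable


def run_backfill(
    first_id: int,
    last_id: int,
    batch_size: int,
    apply_batch: Callable[[int, int], None],  # hypothetical: backfills ids [start, end]
    get_load: Callable[[], float],            # hypothetical: current load in 0..1
    max_load: float = 0.7,
    pause_s: float = 1.0,
    max_consecutive_pauses: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Backfill ids in small batches, pausing while the database is busy.

    Returns the next unprocessed id (last_id + 1 on completion). Raises
    RuntimeError if load stays high for too long, which a pipeline could
    treat as a trigger to halt or roll back.
    """
    next_id = first_id
    pauses = 0
    while next_id <= last_id:
        if get_load() > max_load:
            pauses += 1
            if pauses > max_consecutive_pauses:
                raise RuntimeError(
                    f"load stayed above {max_load}; stopped before id {next_id}"
                )
            sleep(pause_s)  # back off and re-check load before continuing
            continue
        pauses = 0
        end = min(next_id + batch_size - 1, last_id)
        apply_batch(next_id, end)
        next_id = end + 1
    return next_id
```

Keeping each batch small bounds lock time and replication lag per step. The ID in the error message lets a retry resume where the previous run stopped rather than restart from the beginning.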
