InterviewStack.io

Machine Learning Algorithms and Theory Questions

Core supervised and unsupervised machine learning algorithms and the theoretical principles that guide their selection and use. Covers linear regression, logistic regression, decision trees, random forests, gradient boosting, support vector machines, k-means clustering, hierarchical clustering, principal component analysis, and anomaly detection. Topics include model selection, the bias-variance trade-off, regularization, overfitting and underfitting, ensemble methods and why they reduce variance, computational complexity and scaling considerations, interpretability versus predictive power, common hyperparameters and tuning strategies, and practical guidance on when each algorithm is appropriate given data size, feature types, noise, and explainability requirements.

Easy · Technical
Explain principal component analysis (PCA): its objective, the steps to compute principal components using covariance eigendecomposition or SVD, assumptions and when PCA is appropriate. Describe necessary preprocessing (centering, scaling), and practical methods to choose the number of components (explained variance, scree plot).
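A candidate answer could be sketched numerically. The following is a minimal illustration of the SVD route on synthetic data (all values fabricated); in practice you would also standardize columns when features are on different scales.

```python
import numpy as np

# Hedged sketch: PCA via SVD of the centered data matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))          # synthetic data for illustration

Xc = X - X.mean(axis=0)                # 1. center each feature
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)  # 2. SVD of centered data
components = Vt                        # rows are the principal directions
explained_var = S**2 / (X.shape[0] - 1)  # eigenvalues of the covariance matrix
ratio = explained_var / explained_var.sum()

# 3. choose k by cumulative explained variance (90% threshold here)
cum = np.cumsum(ratio)
k = int(np.searchsorted(cum, 0.90) + 1)
Z = Xc @ Vt[:k].T                      # 4. project onto the top k components
```

The same `ratio` values, plotted against component index, give the scree plot mentioned in the question.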
Easy · Technical
Explain the assumptions of ordinary least squares linear regression (linearity, independence, homoscedasticity, normality of errors, lack of multicollinearity). For each assumption, describe a practical diagnostic you would run on a real dataset and one remediation technique if the assumption is violated. Be specific about steps and tools (e.g., residual plots, Durbin-Watson, variance inflation factor).
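Two of the diagnostics named in the question can be sketched from first principles. This is a minimal numpy-only illustration on synthetic data (in practice you would more likely reach for statsmodels); the formulas are the standard ones for the Durbin-Watson statistic and the variance inflation factor.

```python
import numpy as np

# Synthetic regression data for illustration only.
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)

# Fit OLS via least squares, with an explicit intercept column.
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta

# Durbin-Watson: values near 2 suggest no first-order autocorrelation.
dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

# VIF for feature j: 1 / (1 - R^2) from regressing it on the other features.
def vif(X, j):
    others = np.delete(X, j, axis=1)
    B = np.column_stack([np.ones(len(X)), others])
    coef, *_ = np.linalg.lstsq(B, X[:, j], rcond=None)
    r = X[:, j] - B @ coef           # residuals of the auxiliary regression
    r2 = 1 - r.var() / X[:, j].var()
    return 1.0 / (1.0 - r2)

vifs = [vif(X, j) for j in range(X.shape[1])]  # VIF > ~5-10 flags collinearity
```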
Hard · System Design
Design an end-to-end machine learning system for real-time fraud prediction that must serve predictions at <50ms latency at 10,000 requests/second for 1M active users. Cover offline training pipeline, feature store (online vs offline), feature freshness, model serving (batch vs streaming vs approximate), A/B testing, monitoring, and considerations to avoid training-serving skew.
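One small piece of the serving path can be sketched concretely: an online feature lookup with a freshness guard, which ties the feature-freshness and training-serving-skew points together. All names here are hypothetical; a production system would back this with a low-latency store such as Redis rather than an in-process dict.

```python
import time

class OnlineFeatureStore:
    """Hedged sketch of an in-memory online feature cache with a max-age guard."""

    def __init__(self, max_age_s: float = 60.0):
        self.max_age_s = max_age_s
        self._cache: dict[str, tuple[float, dict]] = {}

    def put(self, user_id: str, features: dict) -> None:
        self._cache[user_id] = (time.monotonic(), features)

    def get(self, user_id: str, defaults: dict) -> dict:
        entry = self._cache.get(user_id)
        if entry is None:
            return defaults  # cold user: fall back to safe defaults
        ts, features = entry
        if time.monotonic() - ts > self.max_age_s:
            return defaults  # stale features: refuse to serve them
        return features

store = OnlineFeatureStore(max_age_s=60.0)
store.put("u1", {"txn_count_1h": 3, "avg_amount_24h": 42.0})
fresh = store.get("u1", defaults={"txn_count_1h": 0, "avg_amount_24h": 0.0})
```

Serving the same defaults that the offline pipeline imputes for missing features is one concrete way to keep the online and offline paths consistent.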
Hard · Technical
Provide a theoretical explanation for why bagging reduces variance of unstable learners. Derive the expected variance of the average of B identically distributed base learners with pairwise correlation rho and base learner variance sigma^2. Explain practical implications for ensembling.
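The identity the question asks for, Var(average) = rho * sigma^2 + (1 - rho) * sigma^2 / B, can be checked numerically by building the implied covariance matrix and computing the variance of the weighted sum directly:

```python
import numpy as np

# Numeric check of the bagging variance identity for B identically
# distributed learners with variance sigma^2 and pairwise correlation rho.
B, sigma2, rho = 25, 4.0, 0.3

# Covariance matrix: sigma^2 on the diagonal, rho*sigma^2 off-diagonal.
Sigma = sigma2 * ((1 - rho) * np.eye(B) + rho * np.ones((B, B)))
w = np.full(B, 1.0 / B)                  # equal averaging weights
var_avg = w @ Sigma @ w                  # Var(mean) = w^T Sigma w
closed_form = rho * sigma2 + (1 - rho) * sigma2 / B
```

As B grows, the second term vanishes but the floor rho * sigma^2 remains, which is why random forests decorrelate trees (reducing rho) rather than only adding more of them.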
Easy · Technical
You have ~10,000 labeled examples and 20 features (mix of categorical and numeric). Regulators require model explanations. Compare logistic regression and a shallow decision tree for this binary classification problem. Which would you choose and why? Include data preparation and pros/cons relative to interpretability and predictive power.
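A head-to-head comparison of the two candidate models could be sketched as follows. This assumes scikit-learn is available, uses fabricated numeric-only data at the scale the question describes (real data with categorical features would need encoding first), and is illustration rather than a recommendation.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Fabricated stand-in for the ~10,000 x 20 dataset in the question.
X, y = make_classification(n_samples=10_000, n_features=20,
                           n_informative=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Logistic regression: coefficients give per-feature log-odds effects,
# which regulators can audit directly.
logit = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Shallow tree: the depth limit keeps the decision rules human-readable.
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_tr, y_tr)

acc_logit = logit.score(X_te, y_te)
acc_tree = tree.score(X_te, y_te)
```

Held-out accuracy alone does not settle the question; the regulatory requirement makes the form of the explanation (global coefficients vs. readable rules) part of the selection criterion.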
