InterviewStack.io

Probability and Statistical Inference Questions

Covers fundamental probability theory and statistical inference, from first principles to practical applications.

Core probability concepts include sample spaces and events, independence, conditional probability, Bayes' theorem, expected value, variance, and standard deviation. Reviews common probability distributions such as the normal, binomial, Poisson, uniform, and exponential: their parameters, typical use cases, computation of probabilities, and approximation methods.

Explains sampling distributions, the Central Limit Theorem, and their implications for estimation and confidence intervals. Presents descriptive statistics and data summary measures including mean, median, variance, and standard deviation.

Details the hypothesis testing workflow: null and alternative hypotheses, p-values, statistical significance, Type I and Type II errors, power, effect size, and interpretation of results. Reviews commonly used tests and methods, with guidance on test selection and assumption checking, including z-tests, t-tests, chi-square tests, analysis of variance, and basic nonparametric alternatives.

Emphasizes practical issues such as correlation versus causation, the impact of sample size and data quality, validation of assumptions, reasoning about rare events and tail risks, and communicating uncertainty. At more advanced levels, expect experimental design and interpretation at scale, including A/B tests, sample size and power calculations, multiple testing and false discovery rate adjustment, and design choices for robust inference in real-world systems.
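As a warm-up for the estimation and confidence-interval material above, here is a minimal sketch (synthetic data, standard library only) of the Central Limit Theorem in action: even for a skewed population, the sample mean is approximately normal, which justifies the usual mean ± 1.96·s/√n interval.

```python
import math
import random

random.seed(0)

def mean_ci(samples, z=1.96):
    """Return (mean, lower, upper) for an approximate 95% CI on the mean."""
    n = len(samples)
    m = sum(samples) / n
    var = sum((x - m) ** 2 for x in samples) / (n - 1)  # sample variance
    half = z * math.sqrt(var / n)                       # CLT-based half-width
    return m, m - half, m + half

# Draws from an exponential(1) population (true mean 1.0). Despite the skew,
# at n = 500 the normal-approximation CI for the mean is well calibrated.
data = [random.expovariate(1.0) for _ in range(500)]
m, lo, hi = mean_ci(data)
```

Note that the approximation is for the *mean*; intervals for tail quantiles of a skewed distribution need different machinery.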

Hard · System Design
Design a production pipeline to detect feature distribution drift for an online ML model. Include which statistical tests you would use per feature (e.g., KS test for continuous, χ² for categorical, KL divergence), how to select sample windows and thresholds, strategies to reduce false positives due to seasonality, and how to alert and take automated vs. manual actions.
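A minimal sketch of the per-feature checks this question names, assuming SciPy is available and using synthetic reference/current windows (the shift size, window lengths, and threshold are illustrative assumptions, not a production recipe):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Reference window = data the model was trained/validated on;
# current window = recent serving traffic.
reference = rng.normal(0.0, 1.0, size=5000)   # continuous feature, baseline
current = rng.normal(0.3, 1.0, size=5000)     # mean shifted: drift present

# Continuous feature: two-sample Kolmogorov-Smirnov test.
ks_stat, ks_p = stats.ks_2samp(reference, current)

# Categorical feature: chi-square test on per-category counts
# from the two windows (rows = window, columns = category).
ref_counts = np.array([500, 300, 200])
cur_counts = np.array([450, 320, 230])
chi2, chi_p, dof, _ = stats.chi2_contingency(np.vstack([ref_counts, cur_counts]))

# With large windows, tiny shifts become "significant"; tightening the
# threshold (and pairing it with an effect-size floor) cuts false positives.
drifted = ks_p < 0.01
```

In practice you would also compare against a same-season historical window (not just the immediately preceding one) to avoid flagging weekly or seasonal cycles as drift.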
Hard · Technical
Your training data over-represents highly active users compared to the target population. Describe weighting strategies to correct for sample bias when estimating population-level quantities: inverse probability weighting, post-stratification, raking (iterative proportional fitting), and how to compute variance estimates that account for weights.
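A hypothetical sketch of the simplest of these corrections, inverse probability weighting: each sampled unit is weighted by 1 / P(selected), so over-sampled active users count for less. The selection probabilities and group means below are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated population: 20% "active" users (outcome mean 5.0) and 80%
# "casual" users (mean 2.0), so the true population mean is
# 0.2 * 5 + 0.8 * 2 = 2.6. Assumed known selection probabilities:
# P(select | active) = 0.9, P(select | casual) = 0.1.
y_active = rng.normal(5.0, 1.0, size=900)   # 0.9 * 1000 active users sampled
y_casual = rng.normal(2.0, 1.0, size=400)   # 0.1 * 4000 casual users sampled

y = np.concatenate([y_active, y_casual])
p_select = np.concatenate([np.full(900, 0.9), np.full(400, 0.1)])
w = 1.0 / p_select                          # inverse probability weights

naive_mean = y.mean()                       # biased toward active users
ipw_mean = np.sum(w * y) / np.sum(w)        # Hájek (weighted-mean) estimator
```

Weighting inflates variance: a few casual users carry weight 10, so variance estimates must account for the weights (e.g. via the effective sample size (Σw)²/Σw², or design-based/bootstrap standard errors), which is exactly the last part of the question.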
Medium · Technical
As an AI engineer, contrast Bayesian and frequentist inference. Give an example in anomaly detection where a Bayesian approach with an informative prior helps stabilize rare-event rate estimates, and explain how to choose/assess priors in practice.
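A minimal sketch of the anomaly-detection example (all counts and prior parameters are assumed numbers): with only 2 anomalies in 1,000 events, the frequentist MLE k/n is noisy, while a conjugate Beta prior encoding "rates near 0.1% are typical" shrinks the estimate toward that prior mean.

```python
k, n = 2, 1000          # observed anomalies / total events
a, b = 1.0, 999.0       # informative Beta prior: mean a/(a+b) = 0.1%

mle = k / n                             # frequentist point estimate: 0.2%
post_a, post_b = a + k, b + n - k       # Beta prior is conjugate to Binomial
post_mean = post_a / (post_a + post_b)  # posterior mean, shrunk toward prior
```

Here the posterior mean lands between the prior mean (0.1%) and the MLE (0.2%); as n grows, the data dominate and the two estimates converge. Assessing the prior in practice usually means prior predictive checks and sensitivity analysis across plausible (a, b).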
Hard · Technical
Derive the Fisher information I(μ) and the Cramér-Rao lower bound for estimating the mean μ of a Gaussian N(μ, σ^2) with known variance σ^2. Show derivation via the score function and explain implications for estimator variance and asymptotic efficiency of the sample mean.
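For reference, the derivation the question asks for is short. With known σ², the per-observation log-likelihood, score, Fisher information, and the resulting Cramér-Rao bound for n i.i.d. observations are:

```latex
\ell(\mu; x) = -\frac{(x-\mu)^2}{2\sigma^2} + \text{const},
\qquad
s(\mu; x) = \frac{\partial \ell}{\partial \mu} = \frac{x-\mu}{\sigma^2},
\]
\[
I(\mu) = \mathbb{E}\!\left[s(\mu; X)^2\right]
       = \frac{\mathbb{E}\!\left[(X-\mu)^2\right]}{\sigma^4}
       = \frac{1}{\sigma^2},
\qquad
\operatorname{Var}(\hat\mu) \ \ge\ \frac{1}{n\,I(\mu)} = \frac{\sigma^2}{n}.
```

Since the sample mean has variance exactly σ²/n, it attains the bound for every n, which is the efficiency claim the question wants you to state.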
Hard · Technical
You train a deep generative model but some training features are missing at random. Compare approaches for handling missingness: marginalizing missing values via Monte Carlo EM, multiple imputation followed by training, and variational inference that treats missing entries as latent variables. Discuss computational trade-offs and convergence diagnostics for each.
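A toy sketch of the multiple-imputation branch of this question (the "model" here is just estimating a mean, and the hot-deck imputation scheme is an assumed simplification): draw M completed datasets, fit on each, then pool estimates and track between-imputation variance, in the spirit of Rubin's rules.

```python
import numpy as np

rng = np.random.default_rng(1)

x = rng.normal(10.0, 2.0, size=1000)
mask = rng.random(1000) < 0.2      # 20% of entries missing at random
obs = x[~mask]                     # observed values only

M = 20
estimates = []
for _ in range(M):
    # Impute each missing entry by resampling from observed values
    # (simple hot-deck; plain mean imputation would understate variance).
    imputed = rng.choice(obs, size=int(mask.sum()), replace=True)
    full = np.concatenate([obs, imputed])
    estimates.append(full.mean())  # "model fit" = estimate the mean

pooled = float(np.mean(estimates))             # pooled point estimate
between_var = float(np.var(estimates, ddof=1)) # spread across imputations
```

The between-imputation variance is the quantity the Monte Carlo EM and variational alternatives handle internally; watching it (or the ELBO / observed-data log-likelihood) across iterations is the convergence diagnostic the question alludes to.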
