InterviewStack.io

Linear and Logistic Regression Implementation Questions

Covers the fundamentals and implementation details of linear regression for continuous prediction and logistic regression for binary or multiclass classification. Candidates should understand model formulation, hypothesis functions, and the intuition behind fitting a line or hyperplane for regression and applying a sigmoid or softmax function for classification. Topics include loss functions such as mean squared error for regression and cross-entropy loss for classification; optimization methods including gradient descent and its variants; regularization techniques; feature engineering and scaling; evaluation metrics such as mean absolute error, accuracy, and area under the ROC curve (AUC); and hyperparameter selection and validation strategies. Expect discussion of practical implementation using numerical libraries and machine learning toolkits, the trade-offs and limitations of each approach, numerical stability, and common pitfalls such as underfitting and overfitting.
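As a warm-up for the formulation and optimization topics above, here is a minimal sketch of logistic regression trained by batch gradient descent on the mean cross-entropy loss, with an optional L2 penalty and a clipped sigmoid for numerical stability. All function and parameter names are illustrative, not from any particular library.

```python
import numpy as np

def sigmoid(z):
    # Clip the logits so np.exp never overflows for large |z| (numerical stability).
    z = np.clip(z, -30.0, 30.0)
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=500, l2=0.0):
    """Batch gradient descent on mean cross-entropy loss with optional L2 penalty."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)                 # predicted probabilities
        grad_w = X.T @ (p - y) / n + l2 * w    # gradient of the regularized loss
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

The same skeleton becomes linear regression by dropping the sigmoid and using the squared-error gradient; in practice a toolkit implementation with a robust solver is preferable to hand-rolled gradient descent.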

Hard · Technical
A logistic regression model for 90-day customer retention shows degrading performance due to temporal drift. Propose step-by-step remediation covering data partitioning for time-aware validation, feature engineering (lags, rolling statistics, exponential decay), retraining frequency, and how to evaluate whether retraining improved production performance.
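Two of the pieces this question asks for can be sketched compactly: time-aware validation splits where training data always precedes test data, and a trailing rolling-mean feature that uses only past values. This is a minimal illustration with made-up helper names, not a production pipeline; in practice one would reach for a library splitter and vectorized rolling windows.

```python
import numpy as np

def expanding_window_splits(n_samples, n_splits=3, test_size=20):
    """Yield (train_idx, test_idx) pairs where the training window always
    ends before the test window begins - no future rows leak into training."""
    for i in range(n_splits):
        test_end = n_samples - (n_splits - 1 - i) * test_size
        test_start = test_end - test_size
        yield np.arange(0, test_start), np.arange(test_start, test_end)

def rolling_mean(x, window):
    """Trailing rolling mean over strictly past values: out[t] averages
    x[t-window:t] and never includes x[t] itself."""
    out = np.full(len(x), np.nan)
    for t in range(window, len(x)):
        out[t] = x[t - window:t].mean()
    return out
```

Retraining can then be evaluated by comparing the new and old models on the most recent held-out window, which mimics how the model will actually be used in production.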
Medium · Technical
Give concrete examples of data leakage when building regression or classification models (e.g., using future features in forecasting, target-derived features, leakage via preprocessing). Describe diagnostic steps to find leakage and prevention techniques to adopt in feature engineering and cross-validation pipelines.
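The preprocessing-leakage case mentioned here is easy to demonstrate: fitting a scaler on the full dataset lets test rows influence the statistics applied to the training data. A minimal sketch of the wrong and right orderings, with hypothetical helper names:

```python
import numpy as np

def standardize_fit(X):
    """Compute per-column mean and standard deviation."""
    return X.mean(axis=0), X.std(axis=0)

def standardize_apply(X, mu, sigma):
    return (X - mu) / sigma

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
train, test = X[:80], X[80:]

# WRONG: statistics computed on the full dataset, so test rows leak
# into the transformation applied to the training data.
mu_leaky, sd_leaky = standardize_fit(X)

# RIGHT: fit on the training split only, then apply to both splits.
mu, sd = standardize_fit(train)
train_s = standardize_apply(train, mu, sd)
test_s = standardize_apply(test, mu, sd)
```

The same fit-on-train-only discipline applies inside every cross-validation fold, which is why pipeline objects that refit preprocessing per fold are the standard prevention technique.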
Hard · Technical
Your logistic regression classifier shows disparate false positive rates across protected groups. Design a rigorous evaluation workflow to quantify fairness issues (metrics to gather, statistical tests), and describe remediation techniques across preprocessing (reweighing), in-processing (fair regularizers), and post-processing (threshold adjustments). Discuss trade-offs with accuracy and calibration.
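The measurement and post-processing parts of this workflow can be sketched directly: compute the false positive rate per group at a shared threshold, then pick group-specific thresholds that target a common FPR. Function names and the quantile-based threshold rule are illustrative choices, not a standard API.

```python
import numpy as np

def false_positive_rate(y_true, y_pred):
    """Fraction of true negatives predicted positive."""
    negatives = (y_true == 0)
    return np.mean(y_pred[negatives] == 1) if negatives.any() else 0.0

def group_fprs(y_true, scores, groups, threshold=0.5):
    """Per-group FPR at one shared decision threshold."""
    return {g: false_positive_rate(y_true[groups == g],
                                   (scores[groups == g] >= threshold).astype(int))
            for g in np.unique(groups)}

def equalize_fpr_thresholds(y_true, scores, groups, target_fpr=0.1):
    """Post-processing sketch: set each group's threshold at the
    (1 - target_fpr) quantile of its negative-class scores, so every
    group's FPR lands near target_fpr."""
    thresholds = {}
    for g in np.unique(groups):
        neg_scores = scores[(groups == g) & (y_true == 0)]
        thresholds[g] = np.quantile(neg_scores, 1 - target_fpr)
    return thresholds
```

Group-wise thresholds trade shared-threshold accuracy for equalized error rates and can distort calibration, which is exactly the trade-off the question asks candidates to discuss.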
Easy · Technical
Compare mean squared error (MSE) and mean absolute error (MAE) as loss functions for regression. Provide mathematical definitions, discuss sensitivity to outliers, differentiability/smoothness and implications for gradient-based optimization, and business scenarios where MAE is preferred over MSE or vice versa.
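The outlier-sensitivity and gradient points can be made concrete with a few lines. With one large residual, MSE blows up while MAE grows only linearly, and the gradients below show why MSE is smooth everywhere while MAE has a kink at zero residual.

```python
import numpy as np

def mse(y, yhat):
    return np.mean((y - yhat) ** 2)

def mae(y, yhat):
    return np.mean(np.abs(y - yhat))

# Gradients with respect to the predictions:
def mse_grad(y, yhat):
    return 2 * (yhat - y) / len(y)        # smooth everywhere

def mae_grad(y, yhat):
    return np.sign(yhat - y) / len(y)     # undefined (kinked) at zero residual

# One outlier: a single residual of 96 dominates MSE but not MAE.
y    = np.array([1.0, 2.0, 3.0, 100.0])
yhat = np.array([1.0, 2.0, 3.0, 4.0])
# mse -> 96**2 / 4 = 2304.0, mae -> 96 / 4 = 24.0
```

This is the usual business intuition as well: MAE reports a typical error in the target's own units, while MSE penalizes rare large misses far more heavily.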
Medium · Technical
Compare one-hot encoding and target (mean) encoding for categorical variables when used with linear or logistic regression. Explain advantages (dimensionality, representational power), risks (target leakage, overfitting), smoothing strategies to prevent overfitting, and safe implementation via fold-based target encoding.
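The safe implementation this question asks about can be sketched as out-of-fold target encoding with additive smoothing toward the global mean, so no row's encoding ever uses its own target. The function name, smoothing rule, and fold assignment are illustrative choices for this sketch.

```python
import numpy as np

def target_encode_oof(categories, y, n_folds=5, smoothing=10.0, seed=0):
    """Out-of-fold target encoding: each row is encoded with category means
    computed only on the other folds, smoothed toward the global mean to
    tame rare categories. Rows of categories unseen in the training folds
    fall back to the global mean."""
    rng = np.random.default_rng(seed)
    n = len(y)
    fold = rng.integers(0, n_folds, size=n)
    global_mean = y.mean()
    enc = np.full(n, global_mean)
    for f in range(n_folds):
        tr = fold != f                       # training rows for this fold
        for c in np.unique(categories):
            mask = tr & (categories == c)
            cnt = mask.sum()
            if cnt:
                mean_c = y[mask].mean()
                smoothed = (cnt * mean_c + smoothing * global_mean) / (cnt + smoothing)
                enc[(~tr) & (categories == c)] = smoothed
    return enc
```

Larger `smoothing` pulls rare categories harder toward the global mean, trading representational power for protection against overfitting, which mirrors the one-hot versus target-encoding trade-off in the prompt.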
