InterviewStack.io

Python Programming & ML Libraries Questions

Python programming language fundamentals (syntax, data structures, control flow, error handling) with practical usage of machine learning libraries such as NumPy, pandas, scikit-learn, TensorFlow, and PyTorch for data manipulation, model development, training, evaluation, and lightweight ML tasks.

Medium, Technical
Explain how to implement a custom scikit-learn Transformer in Python (subclassing TransformerMixin and BaseEstimator) that creates polynomial interaction features for selected columns, integrate it into a Pipeline, and ensure it works with GridSearchCV. Provide code sketches and explain get_params/set_params behavior required for hyperparameter search.
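One possible answer sketch, under stated assumptions: the class name `InteractionFeatures` and its parameters `columns` and `degree` are hypothetical, and the input is assumed to be a numeric array with integer column positions. The key point for hyperparameter search is that `__init__` stores each argument unchanged under the same attribute name, because `BaseEstimator.get_params` and `set_params` inspect the constructor signature and read or write those attributes when `GridSearchCV` clones the estimator.

```python
from itertools import combinations

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline


class InteractionFeatures(BaseEstimator, TransformerMixin):
    """Append products of the selected columns up to a given order (hypothetical helper)."""

    def __init__(self, columns=None, degree=2):
        # Store arguments as-is, with no validation or copying here;
        # get_params/set_params and clone() depend on this convention.
        self.columns = columns
        self.degree = degree

    def fit(self, X, y=None):
        X = np.asarray(X)
        cols = range(X.shape[1]) if self.columns is None else self.columns
        # Learned state carries a trailing underscore.
        self.combos_ = [
            combo
            for order in range(2, self.degree + 1)
            for combo in combinations(cols, order)
        ]
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        extra = [np.prod(X[:, list(combo)], axis=1) for combo in self.combos_]
        if not extra:
            return X
        return np.column_stack([X] + extra)


# Synthetic data where the target is a pure interaction of columns 0 and 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = X[:, 0] * X[:, 1]

pipe = Pipeline([
    ("interact", InteractionFeatures(columns=[0, 1, 2])),
    ("model", Ridge()),
])

# Pipeline parameters are addressed as <step name>__<parameter name>.
param_grid = {"interact__degree": [1, 2, 3], "model__alpha": [0.1, 1.0]}
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)
```

Because the transformer inherits from `BaseEstimator`, no hand-written `get_params` or `set_params` is needed; overriding them is only necessary when the constructor takes `*args` or mutates its inputs, which is better avoided.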
Medium, Technical
Implement in Python an efficient, vectorized function to compute pairwise cosine similarity between rows of a 2D NumPy array X with shape (m, d) and return an (m, m) float32 similarity matrix. Provide code that avoids explicit Python loops, explain memory/time complexity, and describe strategies (chunking, approximate nearest neighbors) to handle very large m (e.g., m = 200k) where a full m x m matrix is infeasible.
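A minimal sketch of one way to answer this, assuming zero-norm rows should get similarity 0 (the `eps` guard is an assumption, not part of the question). The function names are hypothetical. The dense version costs O(m²·d) time and O(m²) memory; at m = 200k a float32 matrix would need about 160 GB, so the second function keeps only the top-k neighbors per row and processes row blocks of `chunk_size`, bounding the working set to `chunk_size × m` floats.

```python
import numpy as np


def cosine_similarity_matrix(X, eps=1e-12):
    """Dense (m, m) float32 cosine similarity with no Python loops."""
    X = np.asarray(X, dtype=np.float32)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, eps)  # unit-length rows; zero rows stay zero
    return Xn @ Xn.T


def top_k_cosine_chunked(X, k=10, chunk_size=1024, eps=1e-12):
    """Top-k neighbors per row (excluding the row itself), computed block by block.

    Requires k < m. Returns (indices, similarities), each of shape (m, k),
    sorted by decreasing similarity.
    """
    X = np.asarray(X, dtype=np.float32)
    Xn = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), eps)
    m = Xn.shape[0]
    idx_out = np.empty((m, k), dtype=np.int64)
    sim_out = np.empty((m, k), dtype=np.float32)
    # The only loop is over row blocks, not over individual rows.
    for start in range(0, m, chunk_size):
        stop = min(start + chunk_size, m)
        block = Xn[start:stop] @ Xn.T  # shape (stop - start, m)
        rows = np.arange(stop - start)
        block[rows, start + rows] = -np.inf  # mask self-similarity
        top = np.argpartition(block, -k, axis=1)[:, -k:]  # unordered top-k
        top_sims = np.take_along_axis(block, top, axis=1)
        order = np.argsort(-top_sims, axis=1)
        idx_out[start:stop] = np.take_along_axis(top, order, axis=1)
        sim_out[start:stop] = np.take_along_axis(top_sims, order, axis=1)
    return idx_out, sim_out


X_demo = np.array([[1, 0], [0, 1], [1, 1], [2, 0]])
S = cosine_similarity_matrix(X_demo)
neighbors, sims = top_k_cosine_chunked(X_demo, k=1, chunk_size=2)
```

The chunked version is still exact and still O(m²·d) in time. When that is too slow, an approximate nearest neighbor index over the normalized rows trades a small amount of recall for sub-quadratic query cost.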
Medium, Technical
Explain and demonstrate in Python how to save and load models safely and reproducibly in PyTorch and scikit-learn. Show code for saving a PyTorch model state_dict and optimizer state, loading them on CPU vs GPU, and for scikit-learn use joblib.dump/load. Discuss pros and cons of saving pickled full objects vs state_dict plus code, and how you would version and validate model artifacts.
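A hedged sketch of one answer, assuming a recent PyTorch release in which `torch.load` accepts `weights_only=True`. The helper names `save_checkpoint`, `load_checkpoint`, and `file_sha256` are hypothetical. Saving a `state_dict` keeps the artifact independent of the class's import path but requires the model code at load time; pickling the full object (as `joblib.dump` does for scikit-learn) is simpler but ties the file to the exact library versions and module layout, and unpickling untrusted files can execute arbitrary code.

```python
import hashlib
import os
import tempfile

import joblib
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from torch import nn


def save_checkpoint(path, model, optimizer, epoch):
    """Save only tensors and plain Python values, not the pickled model object."""
    torch.save(
        {
            "epoch": epoch,
            "model_state": model.state_dict(),
            "optimizer_state": optimizer.state_dict(),
        },
        path,
    )


def load_checkpoint(path, model, optimizer=None, device="cpu"):
    """Restore states onto `device`; map_location remaps GPU-saved tensors."""
    ckpt = torch.load(path, map_location=device, weights_only=True)
    model.load_state_dict(ckpt["model_state"])
    model.to(device)
    if optimizer is not None:
        optimizer.load_state_dict(ckpt["optimizer_state"])
    return ckpt["epoch"]


def file_sha256(path):
    """Content hash to record next to the artifact for later validation."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


model = nn.Linear(4, 2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

with tempfile.TemporaryDirectory() as tmp:
    ckpt_path = os.path.join(tmp, "ckpt.pt")
    save_checkpoint(ckpt_path, model, opt, epoch=3)
    digest = file_sha256(ckpt_path)

    # Rebuild the architecture from code, then load the saved state into it.
    restored = nn.Linear(4, 2)
    restored_opt = torch.optim.Adam(restored.parameters(), lr=1e-3)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    epoch = load_checkpoint(ckpt_path, restored, restored_opt, device=device)

    # scikit-learn: joblib pickles the whole fitted estimator.
    X_small = np.array([[0.0], [1.0], [2.0], [3.0]])
    clf = LogisticRegression().fit(X_small, [0, 0, 1, 1])
    sk_path = os.path.join(tmp, "clf.joblib")
    joblib.dump(clf, sk_path)
    clf_loaded = joblib.load(sk_path)
```

For versioning, one option is to store the hash, the library versions, and a small set of reference inputs with expected predictions alongside each artifact, and to re-check those predictions after every load.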
Easy, Technical
Using pandas, write a Python snippet that computes the rolling 7-day average of column 'value' per user in a DataFrame with columns ['user_id', 'timestamp', 'value']. Show how to: (1) ensure rolling uses a time window (7 days) not a fixed row count, (2) handle missing days per user, and (3) return the result aligned to the original timestamps. Include a brief example DataFrame and the code you would run.
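A short sketch of one approach, using a made-up five-row DataFrame. It shows two readings of "handle missing days": the time-based window `rolling("7D")` averages only the rows that exist inside each 7-day window, while the second variant (an assumption about the desired semantics) first resamples to a daily grid so that missing days count as zero. The output column names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2],
    "timestamp": pd.to_datetime(
        ["2024-01-01", "2024-01-03", "2024-01-09", "2024-01-01", "2024-01-02"]
    ),
    "value": [10.0, 20.0, 30.0, 5.0, 15.0],
})

# Time-based windows need a sorted datetime index within each group.
df = df.sort_values(["user_id", "timestamp"]).reset_index(drop=True)

# (1) "7D" is a time offset, so each window covers (t - 7 days, t],
#     regardless of how many rows fall inside it.
rolled = (
    df.set_index("timestamp")
      .groupby("user_id")["value"]
      .rolling("7D")
      .mean()
)

# (3) The result is ordered by user_id, then timestamp, which matches the
#     sorted frame, so positional assignment keeps the original timestamps.
df["value_7d_avg"] = rolled.to_numpy()

# (2) Alternative: treat missing days as zero by building a daily grid first.
daily = (
    df.set_index("timestamp")
      .groupby("user_id")["value"]
      .resample("D")
      .sum()
)
daily_avg = daily.groupby(level="user_id").transform(
    lambda s: s.rolling(7, min_periods=1).mean()
)

# Join the dense result back onto the original (user_id, timestamp) rows.
out = df.merge(
    daily_avg.rename("value_7d_avg_dense").reset_index(),
    on=["user_id", "timestamp"],
    how="left",
)
```

The dense variant assumes timestamps fall on day boundaries, as in this example; with intraday timestamps the join key would need to be the normalized date instead.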
Medium, Technical
Implement a full PyTorch training loop in Python for a given model (nn.Module), train_loader (DataLoader), optimizer, and criterion. Include the essential parts: device placement, model.train(), forward pass, loss computation, loss.backward(), gradient clipping, optimizer.step(), scheduler.step() if present, optimizer.zero_grad(), and a validation pass using torch.no_grad(). Optionally include support for mixed-precision training using torch.cuda.amp. Provide clear, concise code and explain what each step does.
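A compact sketch of such a loop, with assumptions labeled: the function name `train`, its `val_loader`, `max_norm`, and `use_amp` parameters, and the toy regression data are all hypothetical, and the scheduler is assumed to step once per epoch. Mixed precision is only enabled when a CUDA device is present; on CPU the scaler and autocast context are disabled and the loop behaves like a plain full-precision loop. Newer PyTorch releases expose the same scaler under `torch.amp`.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train(model, train_loader, val_loader, optimizer, criterion,
          epochs=5, scheduler=None, max_norm=1.0, use_amp=False):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)  # move parameters to the device in place
    use_amp = use_amp and device.type == "cuda"
    scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
    history = []
    for _ in range(epochs):
        model.train()  # enable dropout / batch-norm updates
        train_loss, seen = 0.0, 0
        for xb, yb in train_loader:
            xb, yb = xb.to(device), yb.to(device)
            optimizer.zero_grad(set_to_none=True)  # clear stale gradients
            with torch.autocast(device_type=device.type, enabled=use_amp):
                loss = criterion(model(xb), yb)  # forward pass and loss
            scaler.scale(loss).backward()  # backprop (scaled under AMP)
            scaler.unscale_(optimizer)  # unscale before clipping
            torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
            scaler.step(optimizer)  # optimizer.step(), skipped on inf/nan
            scaler.update()
            train_loss += loss.item() * xb.size(0)
            seen += xb.size(0)
        if scheduler is not None:
            scheduler.step()  # per-epoch learning-rate schedule

        model.eval()  # freeze dropout / batch-norm statistics
        val_loss, val_seen = 0.0, 0
        with torch.no_grad():  # no autograd graph during validation
            for xb, yb in val_loader:
                xb, yb = xb.to(device), yb.to(device)
                val_loss += criterion(model(xb), yb).item() * xb.size(0)
                val_seen += xb.size(0)
        history.append((train_loss / seen, val_loss / val_seen))
    return history


# Toy linear-regression data to exercise the loop.
torch.manual_seed(0)
X_all = torch.randn(96, 3)
y_all = X_all @ torch.tensor([[1.0], [-2.0], [0.5]])
train_loader = DataLoader(TensorDataset(X_all[:64], y_all[:64]), batch_size=16, shuffle=True)
val_loader = DataLoader(TensorDataset(X_all[64:], y_all[64:]), batch_size=32)

net = nn.Linear(3, 1)
sgd = torch.optim.SGD(net.parameters(), lr=0.1)
sched = torch.optim.lr_scheduler.StepLR(sgd, step_size=2, gamma=0.5)
history = train(net, train_loader, val_loader, sgd, nn.MSELoss(),
                epochs=5, scheduler=sched)
```

Calling `zero_grad` at the top of each iteration rather than after `step` is equivalent in effect; what matters is that gradients are cleared exactly once between consecutive backward passes.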
