ML Interview Questions 2025
Complete machine learning interview prep: 150+ real questions from Google, Meta, OpenAI, and top AI companies. Includes detailed answers, coding examples, and salary data.
Question Categories
Organized by topic with real examples from top companies
ML Algorithms & Theory
Bias is error from overly simple or wrong assumptions (underfitting). Variance is error from sensitivity to the particular training set (overfitting). Trade-off: a model that is too simple has high bias; a model that is too complex has high variance. The best model minimizes total error by balancing the two.
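A quick way to see the trade-off is to fit polynomials of increasing degree to noisy data. The sketch below is illustrative only: the sine target, noise level, sample sizes, and degrees are assumptions chosen for the demo, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical ground truth: a sine curve plus Gaussian noise.
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.sin(3.0 * x) + rng.normal(0.0, 0.2, size=n)
    return x, y

x_train, y_train = make_data(30)
x_test, y_test = make_data(200)

def fit_and_score(degree):
    """Least-squares polynomial fit; returns (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    test_mse = float(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))
    return train_mse, test_mse

for degree in (1, 4, 12):
    train_mse, test_mse = fit_and_score(degree)
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

Training error can only go down as the degree grows, because each model contains the previous one. The gap between training and test error is what signals variance.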
L1 (Lasso) adds the sum of the absolute values of the coefficients to the loss, which drives some weights to exactly zero and creates sparse models (feature selection). L2 (Ridge) adds the sum of squared coefficients, which shrinks large weights without zeroing them. Use L1 for feature selection, L2 for preventing overfitting when all features matter.
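The sparsity difference is easiest to see in the special case of orthonormal features, where both penalties have closed-form solutions. This is a sketch under that assumption; the function names and the example weights are made up for illustration.

```python
import numpy as np

def ridge_shrink(w_ols, lam):
    """Minimizer of 0.5*||y - Xw||^2 + 0.5*lam*||w||^2 when X^T X = I:
    every weight is scaled by the same factor 1 / (1 + lam)."""
    return w_ols / (1.0 + lam)

def lasso_shrink(w_ols, lam):
    """Minimizer of 0.5*||y - Xw||^2 + lam*||w||_1 when X^T X = I:
    soft-thresholding, which sets weights with |w| <= lam to zero."""
    return np.sign(w_ols) * np.maximum(np.abs(w_ols) - lam, 0.0)

w_ols = np.array([3.0, -0.4, 0.2, -2.0])  # hypothetical unregularized weights
w_l2 = ridge_shrink(w_ols, lam=0.5)
w_l1 = lasso_shrink(w_ols, lam=0.5)
print("L2:", w_l2)
print("L1:", w_l1)
```

With these inputs, L2 keeps all four weights nonzero, while L1 zeroes the two small ones and shifts the others toward zero by `lam`.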
Gradient descent iteratively adjusts weights in the direction of the negative gradient to minimize the loss. SGD: one sample per update (noisy). Mini-batch: small batches (balances speed and gradient accuracy). Adam: adaptive learning rates per parameter (a widely used default).
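To make "adaptive learning rates per parameter" concrete, here is a minimal NumPy sketch of the standard Adam update applied to a toy quadratic. The helper name `adam_minimize`, the toy objective, and the hyperparameter values are assumptions for illustration.

```python
import numpy as np

def adam_minimize(grad_fn, w0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    w = np.asarray(w0, dtype=float).copy()
    m = np.zeros_like(w)  # running mean of gradients (first moment)
    v = np.zeros_like(w)  # running mean of squared gradients (second moment)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero initialization
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter step size
    return w

# Toy loss: 0.5 * sum(scales * (w - target)^2), with very different curvatures.
target = np.array([1.0, -2.0])
scales = np.array([1.0, 100.0])
w_final = adam_minimize(lambda w: scales * (w - target), w0=[0.0, 0.0])
print(w_final)
```

Dividing by the root of the second moment makes the step size roughly independent of each coordinate's gradient scale, which is why the two coordinates progress at similar rates despite a 100x difference in curvature.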
Deep Learning
Backpropagation computes gradients of the loss w.r.t. the weights using the chain rule. Forward pass: compute predictions. Backward pass: calculate gradients layer by layer from output to input. Update weights using an optimizer.
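The forward and backward passes can be sketched for a tiny two-layer network. The architecture (tanh hidden layer, linear output, half mean squared error) and all names are assumptions chosen for illustration; a finite-difference check is included because it is a common way to validate a hand-written backward pass.

```python
import numpy as np

def forward(params, X):
    W1, b1, W2, b2 = params
    h = np.tanh(X @ W1 + b1)   # hidden activations
    y_hat = h @ W2 + b2        # linear output layer
    return h, y_hat

def loss_fn(params, X, y):
    _, y_hat = forward(params, X)
    return 0.5 * np.mean((y_hat - y) ** 2)

def backward(params, X, y):
    """Chain rule applied from the output back to the input layer."""
    W1, b1, W2, b2 = params
    h, y_hat = forward(params, X)
    d_yhat = (y_hat - y) / X.shape[0]  # dL/dy_hat for the half-MSE loss
    dW2 = h.T @ d_yhat
    db2 = d_yhat.sum(axis=0)
    d_h = d_yhat @ W2.T                # push the gradient through the output layer
    d_z = d_h * (1.0 - h ** 2)         # derivative of tanh
    dW1 = X.T @ d_z
    db1 = d_z.sum(axis=0)
    return dW1, db1, dW2, db2

def numeric_grad(params, X, y, idx, pos, eps=1e-5):
    """Central finite difference for one entry of params[idx]."""
    plus = [p.copy() for p in params]
    minus = [p.copy() for p in params]
    plus[idx][pos] += eps
    minus[idx][pos] -= eps
    return (loss_fn(plus, X, y) - loss_fn(minus, X, y)) / (2 * eps)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = rng.normal(size=(8, 1))
params = [rng.normal(size=(3, 4)) * 0.5, np.zeros(4),
          rng.normal(size=(4, 1)) * 0.5, np.zeros(1)]
grads = backward(params, X, y)
print(grads[0][0, 0], numeric_grad(params, X, y, idx=0, pos=(0, 0)))
```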
Vanishing: gradients shrink toward zero as they propagate back through many layers (common with sigmoid/tanh). Exploding: gradients grow exponentially. Solutions: ReLU activation, batch normalization, residual connections, and proper weight initialization; gradient clipping specifically targets exploding gradients.
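Two of these points are easy to show numerically. The sketch below clips a set of gradients by their global norm and computes a rough bound that illustrates why sigmoid layers can vanish. The function name and example values are assumptions, and the bound ignores the weight matrices.

```python
import numpy as np

def clip_by_global_norm(grads, max_norm):
    """Rescale all gradients together so their combined L2 norm is at most max_norm."""
    total_norm = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
    scale = min(1.0, max_norm / (total_norm + 1e-12))
    return [g * scale for g in grads], total_norm

grads = [np.array([3.0, 4.0]), np.array([12.0])]  # hypothetical exploding gradients
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = np.sqrt(sum(float(np.sum(g ** 2)) for g in clipped))
print(norm_before, norm_after)

# The sigmoid derivative is at most 0.25, so the activation part of the chain
# rule through 20 sigmoid layers is bounded by 0.25 ** 20 (weights ignored).
sigmoid_chain_bound = 0.25 ** 20
print(sigmoid_chain_bound)
```

Clipping rescales the whole gradient rather than each entry separately, so the update direction is preserved.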
CNNs: spatial data (images) with convolutions. RNNs: sequential data (time series) with hidden states. Transformers: parallel processing with attention (NLP, vision). Transformers are now the dominant architecture in NLP and are widely used in vision.
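The attention operation at the core of Transformers is short enough to write out. This is a minimal single-head sketch of scaled dot-product attention in NumPy, without masking, batching, or learned projections; the shapes and random inputs are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Q: (n_queries, d_k), K: (n_keys, d_k), V: (n_keys, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # similarity of every query to every key
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights         # weighted average of the values

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))
K = rng.normal(size=(5, 4))
V = rng.normal(size=(5, 3))
out, weights = scaled_dot_product_attention(Q, K, V)
print(out.shape, weights.sum(axis=1))
```

Every query attends to every key in a single matrix product, which is what allows the parallel processing that RNNs lack.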
ML System Design
Approach: 1) Candidate generation (collaborative filtering, content-based) 2) Ranking (deep learning model with user/video features) 3) Re-ranking (diversity, freshness) 4) A/B testing. Scale: distributed training, feature stores, real-time serving.
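The three-stage funnel can be sketched in a few lines. Everything here is a toy stand-in: the item fields (`vec`, `category`, `freshness`), the scoring formula, and the per-category cap are hypothetical, and a real ranking stage would use a trained model rather than a hand-written score.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def recommend(user_vec, items, n_candidates=4, n_final=3, max_per_category=1):
    # 1) Candidate generation: cheap similarity search over the whole catalog.
    candidates = sorted(items, key=lambda it: dot(user_vec, it["vec"]),
                        reverse=True)[:n_candidates]

    # 2) Ranking: stand-in for a heavier model; mixes similarity and freshness.
    ranked = sorted(candidates,
                    key=lambda it: dot(user_vec, it["vec"]) + 0.1 * it["freshness"],
                    reverse=True)

    # 3) Re-ranking: enforce diversity with a per-category cap.
    final, per_category = [], {}
    for it in ranked:
        count = per_category.get(it["category"], 0)
        if count < max_per_category:
            final.append(it["id"])
            per_category[it["category"]] = count + 1
        if len(final) == n_final:
            break
    return final

items = [  # hypothetical catalog
    {"id": "a", "vec": [1.0, 0.0], "category": "music", "freshness": 0.0},
    {"id": "b", "vec": [0.9, 0.1], "category": "music", "freshness": 1.0},
    {"id": "c", "vec": [0.5, 0.5], "category": "sports", "freshness": 0.5},
    {"id": "d", "vec": [0.0, 1.0], "category": "news", "freshness": 1.0},
    {"id": "e", "vec": [0.2, 0.1], "category": "cooking", "freshness": 0.0},
]
result = recommend([1.0, 0.2], items)
print(result)
```

The design point is cost: the cheap stage sees every item, while the expensive stage only sees the short list.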
Architecture: 1) Query understanding (intent, entities) 2) Retrieval (inverted index, semantic search) 3) Ranking (learning to rank model) 4) Personalization. Scale: sharding, caching, distributed serving, incremental indexing.
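The retrieval stage can be illustrated with a toy inverted index. This sketch covers boolean AND retrieval only, with whitespace tokenization and made-up documents. Query understanding, semantic search, and learning-to-rank are out of scope here.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def retrieve(index, query):
    """Return ids of documents containing every query token (boolean AND)."""
    postings = [index.get(token, set()) for token in query.lower().split()]
    if not postings:
        return set()
    return set.intersection(*postings)

docs = {  # hypothetical corpus
    1: "gradient descent tutorial",
    2: "stochastic gradient descent in practice",
    3: "transformer attention tutorial",
}
index = build_inverted_index(docs)
print(retrieve(index, "gradient descent"))
print(retrieve(index, "attention tutorial"))
```

Lookups touch only the posting lists for the query terms, never the full corpus, which is why the structure shards and caches well.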
Features: transaction amount, location, time, user history, device fingerprint. Model: ensemble (Random Forest + Neural Net). Real-time: streaming pipeline with Kafka. Handle imbalance: SMOTE, focal loss. Monitor: concept drift detection.
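Of the imbalance techniques mentioned, focal loss is simple to write down. Below is a minimal NumPy sketch of the binary form; the `gamma` and `alpha` defaults are common choices rather than values from the original answer, and the probabilities are invented for the example.

```python
import numpy as np

def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Per-example focal loss; p is the predicted P(y = 1), y is 0 or 1."""
    p = np.clip(p, eps, 1.0 - eps)
    p_t = np.where(y == 1, p, 1.0 - p)              # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)  # class weighting
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

p = np.array([0.9, 0.1])  # an easy and a hard prediction for positive examples
y = np.array([1, 1])
focal = binary_focal_loss(p, y)
cross_entropy = -np.log(p)
print("focal hard/easy ratio:", focal[1] / focal[0])
print("cross-entropy hard/easy ratio:", cross_entropy[1] / cross_entropy[0])
```

The `(1 - p_t) ** gamma` factor shrinks the loss on well-classified examples, so the rare, hard cases (often the fraud cases) dominate the gradient. With `gamma=0` the loss reduces to class-weighted cross-entropy.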
Coding (Python/SQL)
1) Initialize k centroids randomly 2) Assign each point to nearest centroid 3) Update centroids as mean of assigned points 4) Repeat until convergence. Key: Euclidean distance, handle empty clusters, track iterations.
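The steps above can be sketched in NumPy as follows. Two details are assumptions not specified in the answer: centroids are initialized from randomly chosen data points, and an empty cluster is re-seeded with a random data point. The synthetic two-blob data is only for demonstration.

```python
import numpy as np

def kmeans(X, k, max_iters=100, tol=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    # 1) Initialize centroids from k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for iteration in range(1, max_iters + 1):
        # 2) Assign each point to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3) Update each centroid to the mean of its assigned points.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)
            else:
                new_centroids[j] = X[rng.integers(len(X))]  # re-seed an empty cluster
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        # 4) Stop once the centroids no longer move.
        if shift < tol:
            break
    return centroids, labels, iteration

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, size=(50, 2)),
               rng.normal(10.0, 0.5, size=(50, 2))])
centroids, labels, n_iters = kmeans(X, k=2)
print(centroids, n_iters)
```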
SELECT user_id, SUM(amount) as total FROM purchases WHERE purchase_date >= CURRENT_DATE - INTERVAL '30 days' GROUP BY user_id ORDER BY total DESC LIMIT 10;
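import numpy as np

def gradient_descent(X, y, lr=0.01, epochs=1000):
    w = np.zeros(X.shape[1])  # start from zero weights
    for _ in range(epochs):
        pred = X @ w  # linear predictions
        grad = X.T @ (pred - y) / len(y)  # gradient of (1/2n) * ||Xw - y||^2
        w -= lr * grad  # step against the gradient
    return w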
Statistics & Probability
Hypothesis test: H0: p = 0.5 (fair), H1: p ≠ 0.5 (biased). Flip n times, then use a binomial test or chi-square test. If p-value < 0.05, reject H0. For large n: the CLT justifies a Z-test on the sample proportion (normal approximation).
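Both tests fit in a few lines of standard-library Python. This sketch makes two assumptions: the exact two-sided p-value uses the common convention of summing all outcomes that are no more likely than the observed one, and the 60-heads-in-100 example is invented for illustration.

```python
from math import comb, erfc, sqrt

def binomial_two_sided_p(heads, n, p0=0.5):
    """Exact two-sided p-value: total probability of outcomes no more likely
    than the observed count under H0."""
    pmf = [comb(n, k) * p0 ** k * (1 - p0) ** (n - k) for k in range(n + 1)]
    observed = pmf[heads]
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))

def z_test_p(heads, n, p0=0.5):
    """Normal approximation (CLT), reasonable when n is large."""
    z = (heads - n * p0) / sqrt(n * p0 * (1 - p0))
    return erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))

heads, n = 60, 100  # hypothetical experiment
p_exact = binomial_two_sided_p(heads, n)
p_approx = z_test_p(heads, n)
print(f"exact p = {p_exact:.4f}, z-test p = {p_approx:.4f}")
```

For this example the two p-values fall on opposite sides of 0.05, a reminder that the normal approximation without a continuity correction can change the decision near the threshold.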
Correlation: variables move together. Causation: one causes the other. Establish: randomized controlled trials, A/B tests, causal inference methods (propensity score matching, instrumental variables, difference-in-differences).
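Of the causal inference methods listed, difference-in-differences reduces to simple arithmetic on group means. The numbers below are made up, and the estimate is only valid under the parallel-trends assumption (both groups would have changed equally without the treatment).

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    return (treated_post - treated_pre) - (control_post - control_pre)

# Hypothetical average outcomes before and after a feature launch.
effect = diff_in_diff(treated_pre=10.0, treated_post=14.0,
                      control_pre=9.0, control_post=11.0)
print(effect)  # 2.0
```

The treated group rose by 4 and the control group by 2, so only 2 of the 4 points are attributed to the treatment. A naive before/after comparison would have claimed all 4.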
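P-value: probability of observing data at least as extreme as what was observed, assuming H0 is true. CI: a range produced by a procedure that covers the true parameter at the stated rate (e.g., 95% of the time). Type I: false positive (reject a true H0). Type II: false negative (fail to reject a false H0). α is the Type I error rate and β is the Type II error rate; power = 1 − β.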
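A small simulation makes the link between α and Type I errors tangible: when H0 is true and we reject at p < α, roughly a fraction α of experiments are false positives. The sketch assumes normally distributed data with known standard deviation so that a Z-test applies exactly; the sample size, trial count, and seed are arbitrary.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def z_test_p_value(sample, mu0, sigma):
    """Two-sided Z-test for the mean with known standard deviation."""
    z = (fmean(sample) - mu0) / (sigma / sqrt(len(sample)))
    return 2 * (1 - NormalDist().cdf(abs(z)))

def simulate_type_i_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Fraction of experiments that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # H0 (mean = 0) holds
        if z_test_p_value(sample, mu0=0.0, sigma=1.0) < alpha:
            rejections += 1
    return rejections / trials

rate = simulate_type_i_rate()
print(f"empirical Type I error rate: {rate:.3f}")
```

The empirical rate should land close to 0.05, up to simulation noise. By the usual duality, a 95% confidence interval built from the same statistic would miss the true mean in about the same fraction of experiments.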
Interview Focus by Company
- ML algorithms depth
- System design at scale
- Coding efficiency
Meta
- Production ML
- A/B testing
- Feature engineering
OpenAI
- Research background
- Deep learning
- Transformers
Amazon
- Behavioral (LPs)
- System design
- ML at scale
8-Week Interview Prep Timeline
ML Fundamentals Review
- Review supervised/unsupervised algorithms
- Practice bias-variance, regularization
- Understand evaluation metrics
- Refresh linear algebra, statistics
Deep Learning Deep Dive
- Study neural network architectures
- Practice backpropagation calculations
- Understand CNNs, RNNs, Transformers
- Learn optimization techniques
Coding Practice
- LeetCode medium/hard problems (30+)
- Implement ML algorithms from scratch
- Practice pandas, numpy operations
- SQL query optimization
System Design
- Study ML system design patterns
- Practice whiteboard design
- Learn feature stores, model serving
- Understand A/B testing frameworks
📚 Recommended Resources:
- Cracking the ML Interview (book) - $35
- LeetCode Premium - $35/month
- Grokking ML System Design (course) - $79
- Mock interviews (interviewing.io) - $150-300
- Total investment: ~$300-500 for complete prep