150+ Real Questions from FAANG Companies

ML Interview Questions 2025

Complete machine learning interview prep: 150+ real questions from Google, Meta, OpenAI, and top AI companies. Includes detailed answers, coding examples, and salary data.

Total Questions: 150+
Companies Covered: 20+
Avg Prep Time: 8 weeks

Question Categories

Organized by topic with real examples from top companies

ML Algorithms & Theory

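45 questions | Difficulty: Medium-Hard | Avg time: 30-45 min
Q: Explain the bias-variance tradeoff. How does it relate to overfitting and underfitting?
Google | Medium
Answer:

Bias is error from overly simple or wrong assumptions: a high-bias model underfits and misses real structure in the data. Variance is error from sensitivity to the particular training sample: a high-variance model overfits and learns noise that does not generalize. Increasing model complexity typically lowers bias and raises variance, so the goal is the level of complexity that minimizes total error on held-out data.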
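For squared-error loss the tradeoff can be written as an exact decomposition. The notation below is introduced here for illustration and is not part of the original answer: it assumes y = f(x) + ε with zero-mean noise of variance σ², and the expectation is taken over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Only the first two terms depend on the model, which is why reducing one at the expense of the other can still lower total error.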

Q: What's the difference between L1 and L2 regularization? When would you use each?
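Meta | Medium
Answer:

L1 (Lasso) adds a penalty proportional to the sum of the absolute values of the coefficients, which drives some of them to exactly zero and yields sparse models (implicit feature selection). L2 (Ridge) adds a penalty proportional to the sum of the squared coefficients, which shrinks all weights toward zero without eliminating any. Use L1 when you expect only a few features to matter; use L2 when most features carry signal or when features are correlated.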
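The difference in behavior is easiest to see for a single coefficient. The sketch below assumes a simplified one-dimensional problem (the function names and the penalty scaling are illustrative choices, not from the original answer): given an unregularized estimate `w_ols`, each function returns the value that minimizes the penalized objective stated in its docstring.

```python
def lasso_1d(w_ols: float, lam: float) -> float:
    """Minimizer of 0.5*(w - w_ols)**2 + lam*abs(w): soft-thresholding."""
    if w_ols > lam:
        return w_ols - lam
    if w_ols < -lam:
        return w_ols + lam
    return 0.0  # small coefficients are set exactly to zero


def ridge_1d(w_ols: float, lam: float) -> float:
    """Minimizer of 0.5*(w - w_ols)**2 + 0.5*lam*w**2: proportional shrinkage."""
    return w_ols / (1.0 + lam)


for w in (2.0, 0.3, -0.1):
    print(w, lasso_1d(w, 0.5), ridge_1d(w, 0.5))
```

With `lam = 0.5`, the L1 version zeroes out the two small coefficients, while the L2 version only scales them down.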

Q: How does gradient descent work? Explain variants (SGD, mini-batch, Adam).
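OpenAI | Medium
Answer:

Gradient descent iteratively moves the weights in the direction of the negative gradient of the loss, scaled by a learning rate. SGD: one sample per update (cheap but noisy). Mini-batch: a small batch per update (balances gradient noise against hardware throughput; the usual default). Adam: keeps running averages of the gradient and of its square to give each parameter its own adaptive step size; a widely used default in deep learning.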
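As a rough illustration of the Adam update, here is a minimal scalar sketch. The function name `adam_minimize`, the toy objective, and the choice `lr=0.1` are assumptions made for this example; the `beta1`, `beta2`, and `eps` defaults are the commonly used ones.

```python
import math


def adam_minimize(grad, w0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    """Minimize a scalar function, given its gradient, with the Adam update rule."""
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero init
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Toy objective (an assumption for illustration): f(w) = (w - 3)^2, gradient 2*(w - 3).
w_star = adam_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_star)
```

Replacing the body of the loop with `w -= lr * g` gives plain gradient descent, which makes the role of the two running averages easy to compare.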

Deep Learning

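38 questions | Difficulty: Hard | Avg time: 45-60 min
Q: Explain how backpropagation works in neural networks.
Google | Hard
Answer:

Backpropagation computes the gradient of the loss with respect to every weight using the chain rule. Forward pass: compute and cache each layer's activations and the final prediction. Backward pass: propagate gradients layer by layer from output to input, reusing the cached activations. The optimizer then uses those gradients to update the weights; the update itself is a separate step from backpropagation.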
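The chain rule is easiest to follow on a network small enough to differentiate by hand. The sketch below assumes a hypothetical two-weight model, y = w2 * tanh(w1 * x) with squared-error loss, and checks the hand-derived gradients against finite differences; all names and input values are illustrative.

```python
import math


def forward_backward(w1, w2, x, target):
    """One forward and backward pass for y = w2 * tanh(w1 * x), loss = 0.5*(y - target)^2."""
    # Forward pass: cache intermediate values needed by the backward pass.
    z = w1 * x
    h = math.tanh(z)
    y = w2 * h
    loss = 0.5 * (y - target) ** 2

    # Backward pass: apply the chain rule from the loss back to each weight.
    dy = y - target            # dL/dy
    dw2 = dy * h               # dL/dw2 = dL/dy * dy/dw2
    dh = dy * w2               # dL/dh
    dz = dh * (1.0 - h * h)    # tanh'(z) = 1 - tanh(z)^2
    dw1 = dz * x               # dL/dw1
    return loss, dw1, dw2


def numeric_grad(f, w, step=1e-6):
    """Central finite difference, used only to sanity-check the analytic gradient."""
    return (f(w + step) - f(w - step)) / (2 * step)


loss, dw1, dw2 = forward_backward(0.5, -1.2, x=2.0, target=1.0)
check_dw1 = numeric_grad(lambda w: forward_backward(w, -1.2, 2.0, 1.0)[0], 0.5)
check_dw2 = numeric_grad(lambda w: forward_backward(0.5, w, 2.0, 1.0)[0], -1.2)
print(dw1, check_dw1, dw2, check_dw2)
```

A gradient check like this is a common debugging step when implementing a backward pass by hand.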

Q: What causes vanishing/exploding gradients? How do you prevent them?
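Meta | Hard
Answer:

Vanishing: gradients shrink toward zero as they are multiplied through many layers, especially with saturating activations (sigmoid/tanh), so early layers learn very slowly. Exploding: the same repeated multiplication makes gradients grow exponentially, causing unstable updates. Mitigations: ReLU-family activations, proper weight initialization (Xavier/He), batch normalization, and residual connections mainly address vanishing gradients; gradient clipping addresses exploding gradients.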
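Of these, gradient clipping is the simplest to show in code. The following is a minimal sketch of clipping by global L2 norm on a flat list of gradient values; the function name and the flat-list representation are simplifications chosen for illustration, not a specific framework's API.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient values so their L2 norm is at most max_norm."""
    total = math.sqrt(sum(g * g for g in grads))
    if total <= max_norm or total == 0.0:
        return list(grads)             # already small enough: leave untouched
    scale = max_norm / total
    return [g * scale for g in grads]  # same direction, shorter length


clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)    # norm 5 is scaled down to 1
unchanged = clip_by_global_norm([0.3, 0.4], max_norm=1.0)  # norm 0.5 is kept as is
print(clipped, unchanged)
```

Clipping preserves the gradient's direction and only limits the step length, so it guards against exploding gradients without helping with vanishing ones.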

Q: Compare CNNs, RNNs, and Transformers. When would you use each?
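OpenAI | Medium
Answer:

CNNs: grid-structured data such as images, using convolutions that share weights across positions. RNNs: sequential data such as time series, processed step by step with a hidden state. Transformers: attention over the whole sequence, which parallelizes well during training (NLP, increasingly vision). Transformers now dominate NLP and are competitive in many other domains, though CNNs and RNNs remain reasonable choices for smaller datasets or tight latency budgets.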

ML System Design

25 questions | Difficulty: Hard | Avg time: 60 min
Q: Design a recommendation system for a video streaming platform (like YouTube).
Google | Hard
Answer:

Approach: 1) Candidate generation (collaborative filtering, content-based) 2) Ranking (deep learning model with user/video features) 3) Re-ranking (diversity, freshness) 4) A/B testing. Scale: distributed training, feature stores, real-time serving.

Q: Design a search ranking system. How would you handle billions of queries daily?
Meta | Hard
Answer:

Architecture: 1) Query understanding (intent, entities) 2) Retrieval (inverted index, semantic search) 3) Ranking (learning to rank model) 4) Personalization. Scale: sharding, caching, distributed serving, incremental indexing.

Q: How would you build a fraud detection system for financial transactions?
Stripe | Hard
Answer:

Features: transaction amount, location, time, user history, device fingerprint. Model: ensemble (Random Forest + Neural Net). Real-time: streaming pipeline with Kafka. Handle imbalance: SMOTE, focal loss. Monitor: concept drift detection.
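Focal loss, one of the imbalance techniques mentioned above, down-weights examples the model already classifies confidently so that training focuses on the rare, hard cases. The sketch below is a single-example binary version; the function name and the `gamma` and `alpha` defaults are illustrative assumptions and would need tuning for a real fraud dataset.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one example.

    p: predicted probability of the positive (fraud) class, 0 < p < 1
    y: true label, 1 for fraud and 0 for legitimate
    """
    p_t = p if y == 1 else 1.0 - p              # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.05, 0)  # confident and correct "legitimate": almost no loss
hard = focal_loss(0.05, 1)  # missed fraud: large loss
print(easy, hard)
```

Setting `gamma=0` removes the focusing term and reduces the expression to class-weighted cross-entropy.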

Coding (Python/SQL)

32 questions | Difficulty: Medium | Avg time: 30-45 min
Q: Implement k-means clustering from scratch in Python.
Google | Medium
Answer:

1) Initialize k centroids randomly 2) Assign each point to nearest centroid 3) Update centroids as mean of assigned points 4) Repeat until convergence. Key: Euclidean distance, handle empty clusters, track iterations.
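The four steps above can be sketched in plain Python as follows. This is a minimal version under stated assumptions: points are equal-length tuples, initial centroids are sampled from the data with a fixed seed, convergence means the assignments stopped changing, and an empty cluster simply keeps its previous centroid (one of several reasonable policies).

```python
import random


def kmeans(points, k, max_iters=100, seed=0):
    """Plain k-means on a list of equal-length tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # 1) random initial centroids
    labels = None
    for _ in range(max_iters):
        # 2) assign each point to the nearest centroid (squared Euclidean distance)
        new_labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_labels == labels:               # 4) converged: assignments stopped changing
            break
        labels = new_labels
        # 3) move each centroid to the mean of its assigned points
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:                        # empty cluster: keep the old centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(data, k=2)
print(centroids, labels)
```

In an interview it is worth mentioning that the result depends on initialization, which is why k-means++ seeding or multiple restarts are common in practice.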

Q: Write SQL to find top 10 users by purchase amount in last 30 days.
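Meta | Easy
Answer:

SELECT user_id, SUM(amount) AS total
FROM purchases
WHERE purchase_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY user_id
ORDER BY total DESC
LIMIT 10;

The interval expression is PostgreSQL syntax; other dialects write the date arithmetic differently.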

Q: Implement gradient descent for linear regression.
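OpenAI | Medium
Answer:

import numpy as np

def gradient_descent(X, y, lr=0.01, epochs=1000):
    # X: (n_samples, n_features) array, y: (n_samples,) array
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        pred = X @ w
        grad = X.T @ (pred - y) / len(y)  # gradient of 0.5 * mean squared error
        w -= lr * grad
    return w

This fits weights only; add a column of ones to X to learn an intercept.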

Statistics & Probability

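22 questions | Difficulty: Medium | Avg time: 20-30 min
Q: How would you determine if a coin is biased using statistical testing?
Google | Medium
Answer:

Hypothesis test: H0: p = 0.5 (fair) vs. H1: p ≠ 0.5 (biased). Flip the coin n times and count heads. Use an exact binomial test, or for large n a chi-square goodness-of-fit test or a Z-test on the proportion (the normal approximation justified by the CLT). Reject H0 if the p-value falls below a significance level chosen in advance (commonly 0.05).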
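A minimal sketch of the exact two-sided binomial test, using only the standard library. The function name is hypothetical, and the two-sided rule used here (sum the probability of every outcome no more likely than the observed one) is one common convention among several.

```python
from math import comb


def binomial_test_two_sided(heads, n, p0=0.5):
    """Exact two-sided binomial test p-value for H0: P(heads) = p0."""
    def pmf(k):
        return comb(n, k) * p0 ** k * (1 - p0) ** (n - k)

    observed = pmf(heads)
    # Sum the probability of every outcome at least as unlikely as the observed one.
    # The small relative tolerance guards against floating-point ties.
    total = sum(pmf(k) for k in range(n + 1) if pmf(k) <= observed * (1 + 1e-9))
    return min(1.0, total)


p_fair = binomial_test_two_sided(5, 10)     # exactly half heads
p_60 = binomial_test_two_sided(60, 100)     # 60 heads in 100 flips
p_extreme = binomial_test_two_sided(0, 10)  # no heads in 10 flips
print(p_fair, p_60, p_extreme)
```

Note that 60 heads in 100 flips lands just above the conventional 0.05 threshold, a useful reminder that a seemingly lopsided count is not automatically significant.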

Q: What's the difference between correlation and causation? How do you establish causation?
Meta | Medium
Answer:

Correlation: variables move together. Causation: one causes the other. Establish: randomized controlled trials, A/B tests, causal inference methods (propensity score matching, instrumental variables, difference-in-differences).

Q: Explain p-value, confidence intervals, and Type I/II errors.
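Airbnb | Medium
Answer:

P-value: the probability of observing data at least as extreme as what was actually observed, assuming H0 is true; it is not the probability that H0 is true. CI: a range produced by a procedure that captures the true parameter in a stated fraction (e.g. 95%) of repeated samples. Type I error: false positive (rejecting a true H0). Type II error: false negative (failing to reject a false H0). α is the Type I error rate you accept in advance, β is the Type II error rate, and 1 - β is the power of the test.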
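To make the confidence interval concrete, here is a minimal sketch of the normal-approximation (Wald) interval for a proportion. The function name and the example counts are illustrative; the Wald interval is the simplest choice and is known to behave poorly for small n or proportions near 0 or 1.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    alpha = 1.0 - confidence                     # allowed Type I error rate
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for 95%
    margin = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = proportion_ci(60, 100)
print(low, high)
```

Raising `confidence` lowers α and widens the interval, which is the same tradeoff that links a stricter significance level to lower power.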

Interview Focus by Company

Google

5-6 interview rounds
Avg Total Comp
$265K
Interview Focus Areas:
  • ML algorithms depth
  • System design at scale
  • Coding efficiency
Difficulty: Very Hard

Meta

4-5 interview rounds
Avg Total Comp
$270K
Interview Focus Areas:
  • Production ML
  • A/B testing
  • Feature engineering
Difficulty: Hard

OpenAI

4-5 interview rounds
Avg Total Comp
$295K
Interview Focus Areas:
  • Research background
  • Deep learning
  • Transformers
Difficulty: Very Hard

Amazon

5-6 interview rounds
Avg Total Comp
$245K
Interview Focus Areas:
  • Behavioral (LPs)
  • System design
  • ML at scale
Difficulty: Medium-Hard

8-Week Interview Prep Timeline

1. Weeks 1-2: ML Fundamentals Review

  • Review supervised/unsupervised algorithms
  • Practice bias-variance, regularization
  • Understand evaluation metrics
  • Refresh linear algebra, statistics
2. Weeks 3-4: Deep Learning Deep Dive

  • Study neural network architectures
  • Practice backpropagation calculations
  • Understand CNNs, RNNs, Transformers
  • Learn optimization techniques
3. Weeks 5-6: Coding Practice

  • LeetCode medium/hard problems (30+)
  • Implement ML algorithms from scratch
  • Practice pandas, numpy operations
  • SQL query optimization
4. Weeks 7-8: System Design

  • Study ML system design patterns
  • Practice whiteboard design
  • Learn feature stores, model serving
  • Understand A/B testing frameworks

📚 Recommended Resources:

  • Cracking the ML Interview (book) - $35
  • LeetCode Premium - $35/month
  • Grokking ML System Design (course) - $79
  • Mock interviews (interviewing.io) - $150-300
  • Total investment: ~$300-500 for complete prep
