150+ Real Questions from FAANG Companies

ML Interview Questions 2025

Complete machine learning interview prep: 150+ real questions from Google, Meta, OpenAI, and top AI companies. Includes detailed answers, coding examples, and salary data.

Total Questions: 150+

Companies Covered: 20+

Avg Prep Time: 8 weeks

Question Categories

Organized by topic with real examples from top companies

ML Algorithms & Theory

45 questions • Difficulty: Medium-Hard • Avg time: 30-45 min

Q: Explain the bias-variance tradeoff. How does it relate to overfitting and underfitting?

Google • Medium

Answer:

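Bias is error from overly simple or wrong modeling assumptions; high bias leads to underfitting. Variance is error from sensitivity to the particular training sample; high variance leads to overfitting. Expected squared test error decomposes into bias² + variance + irreducible noise, so making a model more flexible typically lowers bias but raises variance. The best model is the one whose complexity minimizes the sum, usually chosen by validation error.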
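The tradeoff can be seen in a small experiment. The sketch below uses k-nearest-neighbor regression as a stand-in for model complexity (k=1 is very flexible, k=n is a constant predictor); the sin-plus-noise ground truth and all names are assumptions made for illustration.

```python
import math
import random

def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(train, data, k):
    """Mean squared error of k-NN predictions on `data`."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)

def make_data(n, rng):
    # Hypothetical ground truth: y = sin(x) plus Gaussian noise
    data = []
    for _ in range(n):
        x = rng.uniform(0, 6)
        data.append((x, math.sin(x) + rng.gauss(0, 0.3)))
    return data

rng = random.Random(0)
train, test = make_data(100, rng), make_data(100, rng)
for k in (1, 10, 100):
    # columns: k, training error, test error
    print(k, round(mse(train, train, k), 3), round(mse(train, test, k), 3))
```

With k=1 the training error is zero because every point is its own nearest neighbor (high variance); with k=100 both errors are large because the model ignores x (high bias). A middle value of k usually gives the lowest test error.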

Q: What's the difference between L1 and L2 regularization? When would you use each?

Meta • Medium

Answer:

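L1 (Lasso) adds a penalty proportional to the sum of absolute coefficient values, which drives some coefficients to exactly zero and so yields sparse models (implicit feature selection). L2 (Ridge) adds a penalty proportional to the sum of squared coefficients, which shrinks all weights toward zero without eliminating any. Use L1 when you expect only a few features to matter; use L2 to prevent overfitting when most features carry some signal.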
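Why L1 produces exact zeros and L2 does not is easiest to see in a simplified case. Assuming an orthonormal design matrix and the objectives ½‖y − Xw‖² + λ‖w‖₁ (Lasso) and ½‖y − Xw‖² + (λ/2)‖w‖² (Ridge), each solution is a coordinate-wise shrinkage of the unregularized coefficients. The coefficient values below are made up for illustration.

```python
def soft_threshold(w, lam):
    """L1 shrinkage: move w toward zero by lam, clamping at exactly zero."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0

def ridge_shrink(w, lam):
    """L2 shrinkage: scale w toward zero; never reaches zero for w != 0."""
    return w / (1.0 + lam)

ols = [3.0, -0.4, 0.2, -2.0]   # hypothetical unregularized coefficients
lam = 0.5
lasso = [soft_threshold(w, lam) for w in ols]
ridge = [ridge_shrink(w, lam) for w in ols]
print(lasso)  # [2.5, 0.0, 0.0, -1.5]
print(ridge)  # every entry is smaller in magnitude, none is zero
```

For general (non-orthonormal) features there is no such closed form for Lasso, but the same soft-thresholding step appears inside coordinate-descent solvers.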

Q: How does gradient descent work? Explain variants (SGD, mini-batch, Adam).

OpenAI • Medium

Answer:

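Gradient descent iteratively moves the weights in the direction of the negative gradient of the loss, scaled by a learning rate. Batch gradient descent uses the full dataset for every step. SGD: one sample per update (cheap but noisy). Mini-batch: a small batch per update (balances gradient noise against throughput). Adam: keeps running averages of the gradient and its square to give each parameter its own adaptive step size; it is a common default in deep learning.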
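A minimal sketch of the variants, under stated assumptions: a one-feature linear model fit with squared error, where `batch_size` selects between SGD (1), mini-batch, and full-batch descent, plus a scalar Adam loop using the commonly cited default hyperparameters. Function names and the toy data are hypothetical.

```python
import math
import random

def minibatch_sgd(xs, ys, batch_size=1, lr=0.05, epochs=500, seed=0):
    """Fit y ~ w*x + b. batch_size=1 is SGD; batch_size=len(xs) is full-batch."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    idx = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(idx)                      # new sample order every epoch
        for start in range(0, len(idx), batch_size):
            batch = idx[start:start + batch_size]
            # Average gradient of half the squared error over the batch
            gw = sum((w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            gb = sum((w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= lr * gw
            b -= lr * gb
    return w, b

def adam_minimize(grad, w=0.0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a scalar function given its gradient, using Adam updates."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]                   # noise-free line y = 2x + 1
print(minibatch_sgd(xs, ys, batch_size=1))     # close to (2, 1)
print(adam_minimize(lambda w: 2 * (w - 3)))    # minimizes (w - 3)^2, close to 3
```

In an interview it is worth stating that the learning rate here was picked by hand for this toy problem; real training usually needs tuning or a schedule.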

Deep Learning

38 questions • Difficulty: Hard • Avg time: 45-60 min

Q: Explain how backpropagation works in neural networks.

Google • Hard

Answer:

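Backpropagation computes the gradient of the loss with respect to every weight using the chain rule. Forward pass: compute and cache each layer's activations and the loss. Backward pass: propagate gradients layer by layer from output to input, reusing the cached activations. The optimizer then uses these gradients to update the weights; the update itself is not part of backpropagation.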
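The chain rule steps can be written out by hand for a tiny network. This sketch assumes a hypothetical scalar model, h = tanh(w1·x), ŷ = w2·h, with loss ½(ŷ − y)², and checks the analytic gradients against finite differences.

```python
import math

def forward(x, w1, w2):
    h = math.tanh(w1 * x)      # hidden activation
    y_hat = w2 * h             # prediction
    return h, y_hat

def loss(x, y, w1, w2):
    _, y_hat = forward(x, w1, w2)
    return 0.5 * (y_hat - y) ** 2

def backward(x, y, w1, w2):
    h, y_hat = forward(x, w1, w2)      # forward pass, caching h
    d_yhat = y_hat - y                 # dL/dy_hat
    d_w2 = d_yhat * h                  # dL/dw2 = dL/dy_hat * dy_hat/dw2
    d_h = d_yhat * w2                  # dL/dh, passed back to the hidden layer
    d_w1 = d_h * (1 - h * h) * x       # tanh'(z) = 1 - tanh(z)^2
    return d_w1, d_w2

x, y, w1, w2 = 0.5, 1.0, 0.8, -0.3
g1, g2 = backward(x, y, w1, w2)

# Numerical gradient check with central differences
eps = 1e-6
n1 = (loss(x, y, w1 + eps, w2) - loss(x, y, w1 - eps, w2)) / (2 * eps)
n2 = (loss(x, y, w1, w2 + eps) - loss(x, y, w1, w2 - eps)) / (2 * eps)
print(abs(g1 - n1) < 1e-6, abs(g2 - n2) < 1e-6)
```

A gradient check like this is a standard way to validate a hand-written backward pass before trusting it in training.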

Q: What causes vanishing/exploding gradients? How do you prevent them?

Meta • Hard

Answer:

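Vanishing: the backpropagated gradient is a product of per-layer factors, so it shrinks toward zero in deep networks when those factors are below 1 (for example, the sigmoid derivative is at most 0.25, and tanh saturates for large inputs). Exploding: the same product grows exponentially when the factors exceed 1. Solutions: ReLU-family activations, batch normalization, residual connections, and proper weight initialization (Xavier/He) keep the factors near 1; gradient clipping specifically addresses exploding gradients.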
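A rough numeric illustration, assuming a simplified chain of sigmoid layers that all share one scalar weight and sit at pre-activation z = 0 (real networks are not this uniform). The helper names are hypothetical; `clip_by_norm` sketches gradient clipping by L2 norm.

```python
import math

def sigmoid_grad(z):
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)               # peaks at 0.25 when z = 0

def chain_factor(depth, weight=1.0, z=0.0):
    """Gradient scale after `depth` sigmoid layers with one shared scalar weight."""
    return (weight * sigmoid_grad(z)) ** depth

def clip_by_norm(grad, max_norm):
    """Rescale grad so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)
    return [g * max_norm / norm for g in grad]

print(chain_factor(10))                # 0.25 ** 10, about 9.5e-07 (vanishing)
print(chain_factor(10, weight=8.0))    # 2.0 ** 10 = 1024.0 (exploding)
print(clip_by_norm([3.0, 4.0], 1.0))   # norm 5 rescaled to norm 1
```

Clipping keeps the gradient direction and only limits its length, which is why it helps with exploding gradients but does nothing for vanishing ones.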

Q: Compare CNNs, RNNs, and Transformers. When would you use each?

OpenAI • Medium

Answer:

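CNNs: grid-structured data such as images, using convolutions with shared local filters. RNNs: sequential data such as time series, processed step by step with a hidden state. Transformers: self-attention relates every position to every other position and processes them in parallel (NLP, vision). Transformers are now the default for most large-scale NLP and many vision tasks; CNNs and RNNs remain practical when data or compute is limited.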
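The attention mechanism mentioned above can be sketched in a few lines. This is a single-head scaled dot-product attention in plain Python, without the learned projections, masking, or multi-head structure a real Transformer adds; the toy keys and values are made up.

```python
import math

def softmax(scores):
    m = max(scores)                                   # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
result = attention([[5.0, 0.0]], keys, values)
print(result)   # output is dominated by the first value vector
```

Every query can be processed independently of the others, which is the source of the parallelism that RNNs lack.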

ML System Design

25 questions • Difficulty: Hard • Avg time: 60 min

Q: Design a recommendation system for a video streaming platform (like YouTube).

Google • Hard

Answer:

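Approach: 1) Candidate generation: narrow the full catalog to a small candidate set (collaborative filtering, content-based retrieval). 2) Ranking: score the candidates with a deep learning model over user and video features. 3) Re-ranking: adjust the ordered list for diversity and freshness. 4) Evaluation: validate changes with A/B tests. Scale: distributed training, feature stores, real-time serving.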
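The staged funnel can be illustrated with a toy pipeline. Everything here is hypothetical: the embeddings, the freshness bonus standing in for a learned ranking model, and the one-video-per-category diversity rule.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def recommend(user_vec, videos, n_candidates=3, n_final=2):
    """videos: dict id -> (embedding, freshness in [0, 1], category)."""
    # 1) Candidate generation: cheap similarity over the whole catalog
    candidates = sorted(videos, key=lambda v: dot(user_vec, videos[v][0]),
                        reverse=True)[:n_candidates]
    # 2) Ranking: a richer score (stand-in for a learned model)
    ranked = sorted(candidates,
                    key=lambda v: dot(user_vec, videos[v][0]) + 0.5 * videos[v][1],
                    reverse=True)
    # 3) Re-ranking: greedy diversity, at most one video per category
    final, seen = [], set()
    for v in ranked:
        if videos[v][2] not in seen:
            final.append(v)
            seen.add(videos[v][2])
        if len(final) == n_final:
            break
    return final

catalog = {
    "a": ([1.0, 0.0], 0.1, "music"),
    "b": ([0.9, 0.1], 0.9, "music"),
    "c": ([0.5, 0.5], 0.5, "sports"),
    "d": ([0.0, 1.0], 0.2, "news"),
}
print(recommend([1.0, 0.0], catalog))  # ['b', 'c']
```

Video "a" is the closest match but loses to the fresher "b" in ranking and is then dropped by the diversity rule, which shows how each stage can change the final list.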

Q: Design a search ranking system. How would you handle billions of queries daily?

Meta • Hard

Answer:

Architecture: 1) Query understanding (intent, entities). 2) Retrieval (inverted index, semantic search). 3) Ranking (learning-to-rank model). 4) Personalization. Scale: index sharding, result caching, distributed serving, incremental indexing.
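A minimal sketch of the retrieval and ranking stages, assuming whitespace tokenization and a matched-term count as a stand-in for a learning-to-rank model. The documents and function names are hypothetical.

```python
from collections import defaultdict

def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, docs, query, top_k=2):
    terms = query.lower().split()
    # Retrieval: union of the postings for each query term
    candidates = set().union(*(index.get(t, set()) for t in terms))

    # Ranking: number of matched query terms (stand-in for a learned ranker)
    def score(doc_id):
        words = set(docs[doc_id].lower().split())
        return sum(t in words for t in terms)

    # Sort by score descending, then by id for a stable order
    return sorted(candidates, key=lambda d: (-score(d), d))[:top_k]

docs = {
    1: "machine learning interview questions",
    2: "system design interview",
    3: "cooking recipes",
}
index = build_index(docs)
print(search(index, docs, "learning interview"))  # [1, 2]
```

The inverted index is what makes retrieval cheap: only documents that share a term with the query are ever scored, and the index can be sharded by term or by document.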

Q: How would you build a fraud detection system for financial transactions?

Stripe • Hard

Answer:

Features: transaction amount, location, time, user history, device fingerprint. Model: ensemble (Random Forest + Neural Net). Real-time: streaming pipeline with Kafka. Handle class imbalance (fraud is rare): SMOTE oversampling, focal loss. Monitor: concept drift detection.
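Of the imbalance techniques listed, focal loss is compact enough to sketch. The version below is the binary form, with gamma=2 and alpha=0.25 assumed as defaults; treat them as starting points to tune, not fixed values.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for predicted fraud probability p and label y in {0, 1}."""
    p = min(max(p, 1e-12), 1 - 1e-12)          # avoid log(0)
    p_t = p if y == 1 else 1 - p               # probability given to the true class
    alpha_t = alpha if y == 1 else 1 - alpha   # per-class weight
    # (1 - p_t) ** gamma down-weights examples the model already gets right
    return -alpha_t * (1 - p_t) ** gamma * math.log(p_t)

easy = focal_loss(0.05, 0)   # confident, correct score on a legitimate transaction
hard = focal_loss(0.05, 1)   # missed fraud
print(easy, hard)
```

With gamma=0 the modulating factor disappears and the loss reduces to class-weighted cross-entropy, which is a useful sanity check.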

Coding (Python/SQL)

32 questions • Difficulty: Medium • Avg time: 30-45 min

Q: Implement k-means clustering from scratch in Python.

Google • Medium

Answer:

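1) Initialize k centroids randomly. 2) Assign each point to its nearest centroid. 3) Update each centroid to the mean of its assigned points. 4) Repeat steps 2-3 until assignments stop changing or a maximum iteration count is reached. Key points: use Euclidean distance, handle empty clusters (for example, by re-seeding the centroid), and cap the number of iterations.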
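The steps above can be sketched in plain Python as follows. Re-seeding an empty cluster from a random data point is one common choice among several, and the toy points are made up; `math.dist` needs Python 3.8 or later.

```python
import math
import random

def kmeans(points, k, max_iters=100, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)              # 1) random initial centroids
    assignment = None
    for _ in range(max_iters):
        # 2) assign each point to its nearest centroid (Euclidean distance)
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:           # 4) converged: no point moved
            break
        assignment = new_assignment
        # 3) move each centroid to the mean of its assigned points
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if not members:                        # empty cluster: re-seed it
                centroids[c] = rng.choice(points)
                continue
            centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignment

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, k=2)
print(sorted(centroids), labels)
```

K-means only finds a local optimum, so a good follow-up point in an interview is running several random restarts or using k-means++ initialization.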

Q: Write SQL to find top 10 users by purchase amount in last 30 days.

Meta • Easy

Answer:

SELECT user_id, SUM(amount) AS total
FROM purchases
WHERE purchase_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY user_id
ORDER BY total DESC
LIMIT 10;
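The query uses PostgreSQL-style interval arithmetic. To check the logic locally, the sketch below runs the same aggregation against an in-memory SQLite database from Python, passing the 30-day cutoff as a parameter instead; the sample rows are made up, and dates are assumed to be stored as ISO-8601 text.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (user_id INTEGER, amount REAL, purchase_date TEXT)")
today = date.today()
rows = [
    (1, 50.0, (today - timedelta(days=5)).isoformat()),
    (1, 25.0, (today - timedelta(days=10)).isoformat()),
    (2, 60.0, (today - timedelta(days=3)).isoformat()),
    (3, 500.0, (today - timedelta(days=60)).isoformat()),  # outside the window
]
conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", rows)

# ISO dates compare correctly as strings, so a text cutoff works here
cutoff = (today - timedelta(days=30)).isoformat()
top = conn.execute(
    """
    SELECT user_id, SUM(amount) AS total
    FROM purchases
    WHERE purchase_date >= ?
    GROUP BY user_id
    ORDER BY total DESC
    LIMIT 10
    """,
    (cutoff,),
).fetchall()
print(top)  # [(1, 75.0), (2, 60.0)]
```

User 3 has the largest single purchase but is excluded because it falls outside the 30-day window, which is the edge case interviewers usually probe.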

Q: Implement gradient descent for linear regression.

OpenAI • Medium

Answer:

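import numpy as np

def gradient_descent(X, y, lr=0.01, epochs=1000):
    w = np.zeros(X.shape[1])  # one weight per feature; add a column of ones to X for an intercept
    for _ in range(epochs):
        pred = X @ w
        grad = X.T @ (pred - y) / len(y)  # gradient of half the mean squared error
        w -= lr * grad
    return w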

Statistics & Probability

22 questions • Difficulty: Medium • Avg time: 20-30 min

Q: How would you determine if a coin is biased using statistical testing?

Google • Medium

Answer:

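Hypothesis test: H0: p = 0.5 (fair), H1: p ≠ 0.5 (biased). Flip the coin n times and count heads. Use an exact binomial test, or a chi-square goodness-of-fit test when n is large. If the p-value falls below the significance level chosen in advance (commonly 0.05), reject H0. For large n, a z-test on the sample proportion (normal approximation via the CLT) gives nearly the same result.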
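The exact binomial test is short enough to write from scratch. This sketch uses one common two-sided definition (sum the probabilities of all outcomes no more likely than the observed one); the function name is hypothetical.

```python
from math import comb

def binomial_test_two_sided(heads, n, p=0.5):
    """Exact two-sided p-value under H0: P(heads) = p."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[heads]
    # Small relative tolerance guards against floating-point ties
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))

p_value = binomial_test_two_sided(60, 100)
print(round(p_value, 4))  # about 0.057, so H0 is not rejected at the 0.05 level
```

Sixty heads in 100 flips looks suspicious but does not clear the conventional 0.05 threshold, a useful reminder that moderate deviations need larger samples to detect.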

Q: What's the difference between correlation and causation? How do you establish causation?

Meta • Medium

Answer:

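Correlation: two variables move together. Causation: changing one variable changes the other. Correlation alone can arise from confounding, reverse causation, or chance. Establish causation with randomized controlled trials or A/B tests where possible; with observational data, use causal inference methods (propensity score matching, instrumental variables, difference-in-differences), each of which rests on its own assumptions.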
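A small simulation shows why randomization matters. In this made-up setup a hidden confounder z drives both x and y, while x itself has no effect on y; all variable names and distributions are assumptions chosen for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

rng = random.Random(0)
n = 5000
# Observational data: confounder z drives both x and y; x has no effect on y
z = [rng.gauss(0, 1) for _ in range(n)]
x_obs = [zi + rng.gauss(0, 1) for zi in z]
y_obs = [zi + rng.gauss(0, 1) for zi in z]
# Randomized experiment: x is assigned by coin flip, independent of z
x_rct = [rng.choice([0, 1]) for _ in range(n)]
y_rct = [zi + rng.gauss(0, 1) for zi in z]

corr_obs = pearson(x_obs, y_obs)
corr_rct = pearson(x_rct, y_rct)
print(round(corr_obs, 2), round(corr_rct, 2))
```

The observational correlation comes out near 0.5 even though x does nothing, while the randomized version is near zero because random assignment breaks the link between x and the confounder.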

Q: Explain p-value, confidence intervals, and Type I/II errors.

Airbnb • Medium

Answer:

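P-value: probability of observing data at least as extreme as what was seen, assuming H0 is true (it is not the probability that H0 is true). CI: a range produced by a procedure that captures the true parameter in a stated fraction (for example, 95%) of repeated samples. Type I error: false positive (rejecting a true H0), with probability α. Type II error: false negative (failing to reject a false H0), with probability β; power is 1 - β.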
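As a concrete case, here is a sketch of a confidence interval for a proportion using the normal approximation (the Wald interval), which assumes a reasonably large sample and a proportion not too close to 0 or 1. The counts are made up.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

low, high = proportion_ci(540, 1000)
print(round(low, 3), round(high, 3))   # roughly 0.509 to 0.571
```

Because this 95% interval excludes 0.5, a two-sided z-test of H0: p = 0.5 at α = 0.05 would reject on the same data; the interval and the test are two views of the same calculation.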

Interview Focus by Company

Google

5-6 interview rounds

Avg Total Comp: $265K

Interview Focus Areas:

ML algorithms depth
System design at scale
Coding efficiency

Difficulty: Very Hard

OpenAI

4-5 interview rounds

Avg Total Comp: $295K

Interview Focus Areas:

Research background
Deep learning
Transformers

Difficulty: Very Hard

Amazon

5-6 interview rounds

Avg Total Comp: $245K

Interview Focus Areas:

Behavioral (LPs)
System design
ML at scale

Difficulty: Medium-Hard

8-Week Interview Prep Timeline

Weeks 1-2

ML Fundamentals Review

Review supervised/unsupervised algorithms
Practice bias-variance, regularization
Understand evaluation metrics
Refresh linear algebra, statistics

Weeks 3-4

Deep Learning Deep Dive

Study neural network architectures
Practice backpropagation calculations
Understand CNNs, RNNs, Transformers
Learn optimization techniques

Weeks 5-6

Coding Practice

LeetCode medium/hard problems (30+)
Implement ML algorithms from scratch
Practice pandas, numpy operations
SQL query optimization

Weeks 7-8

System Design

Study ML system design patterns
Practice whiteboard design
Learn feature stores, model serving
Understand A/B testing frameworks

📚 Recommended Resources:

• Cracking the ML Interview (book) - $35
• LeetCode Premium - $35/month
• Grokking ML System Design (course) - $79
• Mock interviews (interviewing.io) - $150-300
Total investment: ~$300-500 for complete prep
