Pre-test: Introduction to Machine Learning

Part 1

1) Who developed the Perceptron, one of the earliest neural network models?

Frank Rosenblatt introduced the Perceptron (1957), an early neural network model, which inspired later deep learning research.

2) Who coined the term “Artificial Intelligence” at the Dartmouth Conference in 1956?

John McCarthy is credited with coining the term Artificial Intelligence and organizing the Dartmouth Conference in 1956.

3) Who created the Support Vector Machine (SVM) algorithm?

Vladimir Vapnik and Alexey Chervonenkis developed the theoretical foundations of SVMs in the 1960s; the modern kernel/soft-margin SVM was introduced by Vapnik and colleagues in the 1990s and became a powerful tool for classification.

4) Who is the founder of DeepMind, the team behind AlphaGo?

Demis Hassabis co-founded DeepMind, which developed AlphaGo and AlphaZero using deep reinforcement learning.

5) Why is Arthur Samuel’s checkers program considered a milestone?

Arthur Samuel’s checkers program (1950s) was the first successful self-learning program, a landmark in machine learning history.

6) Which example shows applied machine learning?

Applied ML means building and training models from data. Theory alone is not applied practice.

7) How does machine learning relate to AI?

AI is the broad field; ML is a subset focused on learning from data.

8) What best characterizes supervised learning?

Supervised learning uses input–output pairs (labeled data) to train models.

9) Which statement best defines machine learning?

ML is about improving from data automatically rather than coding every rule.

10) What is the key purpose of classification?

Classification models predict classes such as spam/ham or positive/negative.

11) What is the objective of unsupervised learning?

Unsupervised learning discovers structure (clusters, associations) without labels.

12) What is the goal of association rule learning?

Association rule mining (e.g., market basket analysis) finds frequent item relationships such as Milk → Bread.
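
As a rough sketch in plain Python (hypothetical toy transactions), the support and confidence of a rule like Milk → Bread can be counted directly:

```python
# Toy market-basket data (hypothetical) illustrating support and
# confidence for the rule Milk -> Bread.
transactions = [
    {"Milk", "Bread", "Eggs"},
    {"Milk", "Bread"},
    {"Milk", "Butter"},
    {"Bread", "Butter"},
]

n = len(transactions)
milk = sum(1 for t in transactions if "Milk" in t)
milk_and_bread = sum(1 for t in transactions if {"Milk", "Bread"} <= t)

support = milk_and_bread / n          # P(Milk and Bread)
confidence = milk_and_bread / milk    # P(Bread | Milk)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```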

13) Which algorithm repeatedly assigns data points into k clusters?

K-means iteratively assigns points to centroids and updates until convergence.
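
A minimal sketch with scikit-learn, assuming a small hypothetical 2-D dataset; KMeans handles the assign-and-update loop internally:

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy 2-D points (hypothetical); k must be chosen up front.
X = np.array([[1, 1], [1.5, 2], [8, 8], [8, 9], [0.5, 1.5], [9, 8.5]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # cluster assignment per point
print(kmeans.cluster_centers_)  # final centroids after convergence
```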

14) Which is a typical use case of reinforcement learning?

Reinforcement learning solves sequential decision-making problems with rewards, such as autonomous driving.

15) Which learning paradigm uses both labeled and unlabeled data?

Semi-supervised methods leverage a small labeled set plus a large unlabeled set to improve learning.

16) Which of the following is an evaluation metric for classification?

Accuracy = (correct predictions) / (total predictions), a standard classification metric.
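
A quick illustration of the formula on hypothetical predictions:

```python
# Accuracy = correct predictions / total predictions (toy labels).
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]

correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)
print(accuracy)  # 5/6 ≈ 0.83
```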

17) Which algorithm is a common baseline for linear classification?

Logistic regression models class probabilities for binary/multiclass settings.
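
A minimal scikit-learn sketch on a hypothetical one-feature binary dataset:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy binary classification data (hypothetical): one feature, two classes.
X = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)
print(clf.predict([[2.0]]))        # predicted class
print(clf.predict_proba([[2.0]]))  # class probabilities
```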

18) Which technique is best for grouping customers by similar behavior?

Clustering groups observations by similarity without labels (e.g., k-means, hierarchical).

19) What is overfitting in machine learning?

Overfitting = low bias, high variance; excellent on training data but weak generalization on unseen data.

20) Which technique helps reduce overfitting in machine learning?

Cross-validation evaluates the model on multiple data splits, helping detect and reduce overfitting.
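
A sketch of 5-fold cross-validation with scikit-learn, using the built-in Iris dataset for illustration; the model is scored on several held-out splits instead of a single one:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validation
print(scores, scores.mean())
```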

21) Which of the following is an example of regression in ML?

Regression predicts continuous values, such as house prices or temperature.

22) Which of these is an unsupervised learning algorithm?

K-means is unsupervised because it finds groups in unlabeled data.

23) What does a confusion matrix show in classification?

A confusion matrix summarizes results into TP, TN, FP, FN, giving more detail than accuracy alone.
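
A minimal example with scikit-learn on hypothetical labels:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1]

# Rows are actual classes, columns are predictions: [[TN, FP], [FN, TP]].
print(confusion_matrix(y_true, y_pred))
```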

24) Which optimization algorithm is most common in training neural networks?

Gradient descent updates model weights iteratively to minimize loss.
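
A minimal NumPy sketch of batch gradient descent for a one-weight linear model (toy data and learning rate are assumed):

```python
import numpy as np

# Toy data roughly following y = 2x (hypothetical).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.1])

w = 0.0            # single weight to learn
lr = 0.01          # learning rate (assumed)
for _ in range(500):
    y_hat = w * x
    grad = 2 * np.mean((y_hat - y) * x)  # d(MSE)/dw
    w -= lr * grad                       # step against the gradient
print(w)  # close to 2
```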

25) Which ML model is inspired by the human brain?

Neural networks are modeled after biological neurons connected in layers.

26) Which task is NOT supervised learning?

Clustering is unsupervised; the other tasks use labeled data, so they’re supervised.

27) What is the purpose of feature scaling?

Scaling prevents features with larger numeric ranges from dominating distance-based or gradient-based models.
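
A minimal scikit-learn sketch; StandardScaler rescales each feature to zero mean and unit variance:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two features on very different scales (hypothetical): age vs. income.
X = np.array([[25, 40_000], [32, 85_000], [47, 120_000]])

X_scaled = StandardScaler().fit_transform(X)
print(X_scaled)  # each column now has mean 0 and unit variance
```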

28) Which distance metric is most common in KNN?

Euclidean distance is the standard metric in KNN to measure closeness.
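
With NumPy, the Euclidean (L2) distance between two points is a one-liner:

```python
import numpy as np

a = np.array([1.0, 2.0])
b = np.array([4.0, 6.0])

# sqrt of the sum of squared differences = Euclidean (L2) distance.
print(np.linalg.norm(a - b))  # 5.0
```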

29) What does PCA (Principal Component Analysis) do?

PCA compresses data into fewer dimensions while keeping most variance.
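
A minimal scikit-learn sketch projecting hypothetical 3-D data onto two principal components:

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(100, 3)          # hypothetical 3-D data
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)    # 100 x 2 projection

print(X_reduced.shape)
print(pca.explained_variance_ratio_)  # variance kept per component
```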

30) Which is true about Random Forest?

Random Forest is an ensemble of many decision trees that improves stability and accuracy.
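
A minimal scikit-learn sketch on the built-in Iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 trees vote; averaging reduces the variance of single decision trees.
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print(rf.score(X_test, y_test))
```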

31) Which task is most suitable for reinforcement learning?

RL handles sequential decisions like a robot learning to walk through trial and error.

32) Which is a common loss function for regression models?

MSE is the standard regression loss measuring average squared differences.
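
Computed directly on hypothetical predictions:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.5, 8.0])

mse = np.mean((y_true - y_pred) ** 2)  # average squared error
print(mse)  # (0.25 + 0.25 + 1.0) / 3 = 0.5
```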

33) Which dataset split is used to test generalization?

The test set is held back until the end to check how well the model generalizes to unseen data.
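
A typical split with scikit-learn, holding out 20% of the data (an assumed ratio) as the test set:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)  # hypothetical features
y = np.arange(10)                 # hypothetical targets

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)  # (8, 2) (2, 2)
```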

34) What is a limitation of K-means clustering?

K-means requires the number of clusters k to be specified in advance; the right k is often unknown and must be found by experimentation (e.g., the elbow method).

Part 2

Q1: Predicting house prices from historical data can be solved using Supervised Learning.

House-price prediction maps features (e.g., size, location) to a numeric target (price). This is supervised regression because the model learns from labeled examples (x, y).
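
A minimal sketch of supervised regression with scikit-learn, assuming a hypothetical toy dataset of house sizes and prices:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical (size in m^2, price) pairs: labeled examples (x, y).
X = np.array([[50], [70], [90], [120]])
y = np.array([150_000, 200_000, 260_000, 330_000])

model = LinearRegression().fit(X, y)
print(model.predict([[100]]))  # predicted price for a 100 m^2 house
```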

Q2: Which algorithm is commonly used for classification in Supervised Learning?

Logistic Regression is a classic supervised classification algorithm.

  • K-means = unsupervised clustering
  • Q-learning = reinforcement learning
  • Apriori = association rule mining (unsupervised)

Q3: Sentiment analysis of customer reviews is an example of Supervised Learning.

Sentiment analysis uses labeled text (e.g., positive/negative/neutral). Labels make it supervised classification.

Q4: Which is required for Supervised Learning?

Supervised learning needs labeled data (inputs with known outputs).

  • Unlabeled → typically unsupervised
  • Rewards/penalties → reinforcement learning

Q5: Supervised Learning can be applied to spam email detection.

Spam detection is binary classification trained on emails labeled spam/not spam.
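
A minimal text-classification sketch with scikit-learn on a hypothetical tiny corpus, using bag-of-words features and Naive Bayes:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical labeled emails: 1 = spam, 0 = not spam.
texts = ["win a free prize now", "meeting at 10 tomorrow",
         "free money click here", "project report attached"]
labels = [1, 0, 1, 0]

vec = CountVectorizer()
X = vec.fit_transform(texts)          # bag-of-words features
clf = MultinomialNB().fit(X, labels)

print(clf.predict(vec.transform(["claim your free prize"])))  # likely 1 (spam)
```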

Q6: Market segmentation using customer purchase data is an example of Unsupervised Learning.

Segmentation often uses clustering (e.g., K-means) to discover groups without labels → unsupervised.

Q7: Which algorithm belongs to Unsupervised Learning?

K-means is an unsupervised clustering method; Linear Regression and Decision Trees are supervised, while Q-learning is a reinforcement learning algorithm.

Q8: In Unsupervised Learning, the algorithm is trained with both inputs and labeled outputs.

Unsupervised learning uses inputs only (no target labels) to find structure (clusters, components).

Q9: Which of the following is NOT an application of Unsupervised Learning?

House price prediction is supervised regression. Image compression & dimensionality reduction (e.g., PCA) and segmentation are unsupervised.

Q10: Association rule mining (e.g., “Customers who buy bread also buy butter”) is part of Unsupervised Learning.

Association rules (e.g., Apriori) learn co-occurrence patterns from unlabeled transaction data.

🔹 Reinforcement Learning

Q11: A robot learning to walk by trial and error is an example of Reinforcement Learning.

In RL, an agent interacts with an environment, receives rewards/penalties, and iteratively improves its policy.
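
A minimal sketch of the tabular Q-learning update such an agent might use (states, actions, and rewards here are hypothetical):

```python
import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))   # value of each (state, action) pair
alpha, gamma = 0.1, 0.9               # learning rate, discount (assumed)

# One hypothetical transition: in state 0, action 1 yields reward 1.0
# and lands in state 2. The update moves Q toward the bootstrapped target.
s, a, r, s_next = 0, 1, 1.0, 2
Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
print(Q[s, a])  # 0.1 after one update
```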

Q12: In Reinforcement Learning, the agent learns by:

RL optimizes behavior via reward signals, not labels or grouping.

Q13: Reinforcement Learning is suitable for training self-driving cars.

Self-driving involves sequential decisions with delayed outcomes—well-suited to RL (often combined with supervised/perception models).

Q14: Which of the following best describes Reinforcement Learning?

RL = interaction + feedback (rewards). It’s neither standard labeled learning nor pure clustering.

Q15: Game-playing AI like AlphaGo uses Reinforcement Learning.

AlphaGo/AlphaZero use RL (self-play, policy/value networks) to maximize winning probability.

Q16: Which learning type would you use for fraud detection in banking?

Fraud detection is commonly supervised classification with labeled fraud/non-fraud examples. Note: unsupervised anomaly detection can complement it when labels are scarce.

Q17: Which learning type would you use for grouping students by similar learning styles?

Grouping without known labels → unsupervised clustering (e.g., K-means, hierarchical clustering).

Q18: Which learning type would you use for a robot navigating a maze?

Maze navigation requires sequential decision-making with rewards (e.g., reaching the goal) → RL.

Q19: Predicting tomorrow’s temperature using past weather data is an application of Unsupervised Learning.

Forecasting a numeric target from historical features is supervised regression (often time-series models).

Q20: Teaching an AI to play chess without providing the rules but only giving rewards for wins is Reinforcement Learning.

The agent learns policies from reward signals (win/loss) through environment interaction, i.e., reinforcement learning.