This set of Machine Learning (ML) Multiple Choice Questions & Answers (MCQs) focuses on Machine Learning Set 9.

Q1 | This clustering algorithm terminates when the mean values computed for the current iteration of the algorithm are identical to the mean values computed for the previous iteration.
  • k-means clustering
  • conceptual clustering
  • expectation maximization
  • agglomerative clustering
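
This termination criterion is the classic k-means stopping rule. A minimal NumPy sketch (the function name, seed, and initialization scheme are our own illustration, and it assumes no cluster ever becomes empty):

```python
import numpy as np

def kmeans(X, k, seed=0):
    """Minimal k-means: stop when the means stop changing."""
    rng = np.random.default_rng(seed)
    # Initialize the k means with randomly chosen data points.
    means = X[rng.choice(len(X), size=k, replace=False)]
    while True:
        # Assign each point to its nearest mean (squared Euclidean distance).
        labels = ((X[:, None, :] - means[None, :, :]) ** 2).sum(-1).argmin(1)
        # Recompute each cluster mean (simplification: no empty clusters).
        new_means = np.array([X[labels == j].mean(0) for j in range(k)])
        # Terminate when the means are identical to the previous iteration's.
        if np.array_equal(new_means, means):
            return labels, means
        means = new_means
```

For example, `kmeans(np.random.rand(200, 2), k=3)` returns one label per point plus the final means.
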
Q2 | Which one of the following is the main reason for pruning a Decision Tree?
  • to save computing time during testing
  • to save space for storing the decision tree
  • to make the training set error smaller
  • to avoid overfitting the training set
Q3 | You've just finished training a decision tree for spam classification, and it is getting abnormally bad performance on both your training and test sets. You know that your implementation has no bugs, so what could be causing the problem?
  • your decision trees are too shallow.
  • you need to increase the learning rate.
  • you are overfitting.
  • incorrect data
Q4 | The K-means algorithm:
  • requires the dimension of the feature space to be no bigger than the number of samples
  • has the smallest value of the objective function when k = 1
  • minimizes the within class variance for a given number of clusters
  • converges to the global optimum if and only if the initial means are chosen as some of the samples themselves
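
For a fixed number of clusters k, the quantity k-means minimizes is the within-cluster (within-class) variance

$$J = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2.$$

Note that J can only decrease as k grows (it reaches zero when every point is its own cluster), and random initialization generally yields only a local optimum.
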
Q5 | Which of the following metrics do we have for finding dissimilarity between two clusters in hierarchical clustering?
  1. Single-link
  2. Complete-link
  3. Average-link
  • 1 and 2
  • 1 and 3
  • 2 and 3
  • 1, 2 and 3
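
For clusters A and B and a point-level distance d, all three are standard linkage criteria:

$$d_{\mathrm{single}}(A,B) = \min_{a \in A,\, b \in B} d(a,b), \qquad d_{\mathrm{complete}}(A,B) = \max_{a \in A,\, b \in B} d(a,b),$$
$$d_{\mathrm{average}}(A,B) = \frac{1}{|A|\,|B|} \sum_{a \in A} \sum_{b \in B} d(a,b).$$
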
Q6 | In which of the following cases will K-means clustering fail to give good results?
  1. Data points with outliers
  2. Data points with different densities
  3. Data points with round shapes
  4. Data points with non-convex shapes
  • 1 and 2
  • 2 and 3
  • 2 and 4
  • 1, 2 and 4
Q7 | Hierarchical clustering is slower than non-hierarchical clustering.
  • true
  • false
  • depends on data
  • cannot say
Q8 | High entropy means that the partitions in classification are
  • pure
  • not pure
  • useful
  • useless
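
For a partition with class proportions $p_1, \dots, p_c$, entropy is

$$H = -\sum_{i=1}^{c} p_i \log_2 p_i,$$

which is maximal when the classes are evenly mixed (an impure partition) and zero when a single class remains (a pure one).
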
Q9 | Suppose we would like to perform clustering on spatial data such as the geometrical locations of houses. We wish to produce clusters of many different sizes and shapes. Which of the following methods is the most appropriate?
  • decision trees
  • density-based clustering
  • model-based clustering
  • k-means clustering
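
Density-based methods can recover clusters of arbitrary shape and size. A small illustration with scikit-learn's DBSCAN (the coordinates below are made up for the sketch):

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical 2-D coordinates of houses.
locations = np.array([[0.0, 0.0], [0.1, 0.1], [0.2, 0.0],
                      [5.0, 5.0], [5.1, 5.2], [9.0, 0.5]])

# eps: neighborhood radius; min_samples: density threshold.
model = DBSCAN(eps=0.5, min_samples=2).fit(locations)
print(model.labels_)  # cluster ids per point; -1 marks noise/outliers
```
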
Q10 | The main disadvantage of maximum likelihood methods is that they are _____
  • mathematically less folded
  • mathematically less complex
  • computationally intense
Q11 | The maximum likelihood method can be used to explore relationships among more diverse sequences, conditions that are not well handled by maximum parsimony methods.
  • true
  • false
Q12 | Which of the following statements is not true?
  • k-means clustering is a linear clustering algorithm.
  • k-means clustering aims to partition n observations into k clusters
  • k-nearest neighbors is the same as k-means
  • k-means is sensitive to outliers
Q13 | In Bayes' theorem, the probability of hypothesis H, specified by P(H), is referred to as
  • a conditional probability
  • an a priori probability
  • a bidirectional probability
  • a posterior probability
Q14 | The probability that a person owns a sports car, given that they subscribe to an automotive magazine, is 40%. We also know that 3% of the adult population subscribes to an automotive magazine. The probability that a person owns a sports car, given that they don't subscribe to an automotive magazine, is 30%. Use this information to compute the probability that a person subscribes to an automotive magazine given that they own a sports car.
  • 0.0398
  • 0.0389
  • 0.0368
  • 0.0396
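
Writing M for "subscribes to the magazine" and S for "owns a sports car" (our notation), Bayes' theorem gives

$$P(M \mid S) = \frac{P(S \mid M)\,P(M)}{P(S \mid M)\,P(M) + P(S \mid \neg M)\,P(\neg M)} = \frac{0.40 \times 0.03}{0.40 \times 0.03 + 0.30 \times 0.97} = \frac{0.012}{0.303} \approx 0.0396.$$
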
Q15 | What is the naïve assumption in a Naïve Bayes Classifier?
  • all the classes are independent of each other
  • all the features of a class are independent of each other
  • the most probable feature for a class is the most important feature to be considered for classification
  • all the features of a class are conditionally dependent on each other
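
Formally, writing $x_1, \dots, x_n$ for the features and $C$ for the class, the naïve assumption is that the features are conditionally independent given the class, so the class-conditional likelihood factorizes:

$$P(x_1, \dots, x_n \mid C) = \prod_{i=1}^{n} P(x_i \mid C).$$
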
Q16 | Based on a survey, it was found that the probability that a person likes to watch serials is 0.25 and the probability that a person likes to watch Netflix series is 0.43. The probability that a person likes to watch both serials and Netflix series is 0.12. What is the probability that a person doesn't like to watch either?
  • 0.32
  • 0.2
  • 0.44
  • 0.56
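
Writing S for "watches serials" and N for "watches Netflix series" (our notation), inclusion-exclusion gives

$$P(\overline{S \cup N}) = 1 - \bigl(P(S) + P(N) - P(S \cap N)\bigr) = 1 - (0.25 + 0.43 - 0.12) = 0.44.$$
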
Q17 | What is the number of independent parameters that need to be estimated in a p-dimensional Gaussian distribution model?
  • p
  • 2p
  • p(p+1)/2
  • p(p+3)/2
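
Counting parameters: the mean vector contributes $p$ and the symmetric covariance matrix contributes $p(p+1)/2$, giving

$$p + \frac{p(p+1)}{2} = \frac{p(p+3)}{2}.$$
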
Q18 | Indicate whether each of the following statements is true or false.
  1. It is important to perform feature normalization before using the Gaussian kernel.
  2. The maximum value of the Gaussian kernel is 1.
  • 1 is true, 2 is false
  • 1 is false, 2 is true
  • 1 is true, 2 is true
  • 1 is false, 2 is false
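
For reference, the Gaussian (RBF) kernel is

$$K(x, x') = \exp\!\left(-\frac{\lVert x - x' \rVert^2}{2\sigma^2}\right) \le 1,$$

with equality exactly when $x = x'$. Because it depends on Euclidean distance, features on very different scales should be normalized first.
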
Q19 | Which of the following quantities are minimized directly or indirectly during parameter estimation in a Gaussian distribution model?
  • negative log-likelihood
  • log-likelihood
  • cross entropy
  • residual sum of squares
Q20 | Consider the following dataset, where x, y, z are the features and T is the class (1/0). Classify the test instance (0, 0, 1), where the values correspond to x, y, z respectively.
  • 0
  • 1
  • 0.1
  • 0.9
Q21 | Given a rule of the form IF X THEN Y, rule confidence is defined as the conditional probability that
  • y is false when x is known to be false.
  • y is true when x is known to be true.
  • x is true when y is known to be true
  • x is false when y is known to be false.
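
In association-rule terms, the confidence of IF X THEN Y is the conditional probability of Y given X, estimated from the data as

$$\mathrm{conf}(X \Rightarrow Y) = P(Y \mid X) = \frac{\mathrm{support}(X \cup Y)}{\mathrm{support}(X)}.$$
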
Q22 | Which of the following statements about Naive Bayes is incorrect?
  • attributes are equally important.
  • attributes are statistically dependent on one another given the class value.
  • attributes are statistically independent of one another given the class value.
  • attributes can be nominal or numeric
Q23 | How can the entries in the full joint probability distribution be calculated?
  • using variables
  • using information
  • both using variables & information
  • none of the mentioned
Q24 | How many terms are required for building a Bayes model?
  • 1
  • 2
  • 3
  • 4