
This set of Machine Learning (ML) Multiple Choice Questions & Answers (MCQs) focuses on Machine Learning Set 19.

Q1 | How many coefficients do you need to estimate in a simple linear regression model (one independent variable)?
  • 1
  • 2
  • 3
  • 4
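For Q1, a minimal sketch (toy numbers invented here) fitting a one-variable linear model with scikit-learn; the two estimated coefficients are the intercept and the slope:

```python
# Minimal sketch: fitting y = b0 + b1*x with scikit-learn and counting
# the estimated coefficients (one intercept + one slope = 2).
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])   # one independent variable
y = np.array([2.1, 4.0, 6.2, 7.9])

model = LinearRegression().fit(X, y)
print(model.intercept_)   # b0 (intercept)
print(model.coef_)        # b1 (slope), array of length 1
```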
Q2 | In a real problem, you should check whether the data is separable and then include slack variables if it is not separable.
  • true
  • false
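For Q2, note that in scikit-learn the slack variables are not added by hand: the soft-margin penalty C of SVC controls how much slack is tolerated. A minimal sketch, with toy data invented here:

```python
# Minimal sketch: soft-margin SVM. In scikit-learn, slack variables are
# handled implicitly through the penalty parameter C of SVC; a smaller C
# tolerates more margin violations (more slack).
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, cluster_std=3.0, random_state=0)
clf = SVC(kernel="linear", C=0.1).fit(X, y)   # soft margin, slack allowed
print(clf.score(X, y))
```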
Q3 | Which of the following are real-world applications of SVMs?
  • text and hypertext categorization
  • image classification
  • clustering of news articles
  • all of the above
Q4 | 100 people are at a party. The given data records how many of them wear pink and whether each guest is a man. Imagine a pink-wearing guest leaves; was it a man?
  • true
  • false
Q5 | For the given weather data, calculate the probability of playing.
  • 0.4
  • 0.64
  • 0.29
  • 0.75
Q6 | In SVR we try to fit the error within a certain threshold.
  • true
  • false
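For Q6, a minimal sketch (toy data invented here): SVR ignores errors that fall inside a tube of width epsilon around the fitted function, which is the threshold the question refers to.

```python
# Minimal sketch: SVR fits a tube of width epsilon around the regression
# function; errors inside the tube are not penalized (the "threshold" in Q6).
import numpy as np
from sklearn.svm import SVR

rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 5, 40)).reshape(-1, 1)
y = np.sin(X).ravel() + 0.1 * rng.randn(40)

svr = SVR(kernel="rbf", epsilon=0.1, C=1.0).fit(X, y)  # epsilon = error threshold
print(svr.predict(X[:3]))
```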
Q7 | In reinforcement learning, this feedback is usually called ________.
  • overfitting
  • overlearning
  • reward
  • none of above
Q8 | Which of the following sentences is correct?
  • machine learning relates with the study, design and development of the algorithms that give computers the capability to learn without being explicitly programmed.
  • data mining can be defined as the process of extracting knowledge or unknown interesting patterns from unstructured data.
  • both a & b
  • none of the above
Q9 | Reinforcement learning is particularly efficient when ________.
  • the environment is not completely deterministic
  • it's often very dynamic
  • it's impossible to have a precise error measure
  • all above
Q10 | Let's say you are working with categorical feature(s) and you have not looked at the distribution of the categorical variable in the test data. You want to apply one-hot encoding (OHE) on the categorical feature(s). What challenges may you face if you have applied OHE on a categorical variable of the train dataset?
  • all categories of categorical variable are not present in the test dataset.
  • frequency distribution of categories is different in train as compared to the test dataset.
  • train and test always have same distribution.
  • both a and b
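For Q10, a minimal sketch (toy categories invented here) showing why a category seen only in the test set is a problem for a one-hot encoder fitted on the train set; handle_unknown='ignore' is one way to avoid a hard failure:

```python
# Minimal sketch (toy data invented here): a category seen only in the
# test set breaks a naive one-hot encoding fitted on the train set.
import numpy as np
from sklearn.preprocessing import OneHotEncoder

train = np.array([["red"], ["green"], ["red"]])
test = np.array([["green"], ["blue"]])          # "blue" never seen in train

enc = OneHotEncoder(handle_unknown="ignore")    # avoids an error at transform time
enc.fit(train)
print(enc.transform(test).toarray())            # "blue" becomes an all-zero row
```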
Q11 | Which of the following sentences is FALSE regarding regression?
  • it relates inputs to outputs.
  • it is used for prediction.
  • it may be used for interpretation.
  • it discovers causal relationships.
Q12 | Which of the following methods is used to find the optimal features for cluster analysis?
  • k-means
  • density-based spatial clustering
  • spectral clustering
  • all above
Q13 | scikit-learn also provides functions for creating dummy datasets from scratch:
  • make_classification()
  • make_regression()
  • make_blobs()
  • all above
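For Q13, a minimal sketch calling the three dummy-dataset generators named in the options (parameter values chosen arbitrarily):

```python
# Minimal sketch: the three dummy-dataset helpers mentioned above.
from sklearn.datasets import make_classification, make_regression, make_blobs

Xc, yc = make_classification(n_samples=100, n_features=5, random_state=1)
Xr, yr = make_regression(n_samples=100, n_features=3, noise=0.5, random_state=1)
Xb, yb = make_blobs(n_samples=100, centers=3, random_state=1)
print(Xc.shape, Xr.shape, Xb.shape)
```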
Q14 | ________, which can accept a NumPy RandomState generator or an integer seed.
  • make_blobs
  • random_state
  • test_size
  • training_size
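For Q14, a minimal sketch (arbitrary toy data): the random_state parameter, shown here on train_test_split, accepts either an integer seed or a NumPy RandomState generator.

```python
# Minimal sketch: random_state accepts either an integer seed or a
# numpy RandomState generator, here passed to train_test_split.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split

X, y = make_blobs(n_samples=20, centers=2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=42)
X_tr2, X_te2, *_ = train_test_split(X, y, test_size=0.25,
                                    random_state=np.random.RandomState(42))
print(X_tr.shape, X_te.shape)
```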
Q15 | In many classification problems, the target dataset is made up of categorical labels which cannot immediately be processed by any algorithm. An encoding is needed and scikit-learn offers at least ________ valid options.
  • 1
  • 2
  • 3
  • 4
Q16 | In which of the following is each categorical label first turned into a positive integer and then transformed into a vector where only one feature is 1 while all the others are 0?
  • labelencoder class
  • dictvectorizer
  • labelbinarizer class
  • featurehasher
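For Q15/Q16, a minimal sketch (toy labels invented here): LabelEncoder maps each label to an integer code, while LabelBinarizer turns each label into a vector in which only one feature is 1.

```python
# Minimal sketch: LabelEncoder assigns an integer code to each label,
# while LabelBinarizer produces one-hot rows (a single 1 per label),
# which is the behaviour described in Q16.
from sklearn.preprocessing import LabelEncoder, LabelBinarizer

labels = ["cat", "dog", "bird", "dog"]
print(LabelEncoder().fit_transform(labels))   # e.g. [1 2 0 2]
print(LabelBinarizer().fit_transform(labels)) # one-hot rows
```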
Q17 | ________ is the most drastic one and should be considered only when the dataset is quite large, the number of missing features is high, and any prediction could be risky.
  • removing the whole line
  • creating sub-model to predict those features
  • using an automatic strategy to impute them according to the other known values
  • all above
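For Q17, a minimal sketch (toy data invented here) contrasting the drastic option, removing the whole line, with automatic imputation from the other known values; SimpleImputer is used here as just one possible imputation tool.

```python
# Minimal sketch of two of the strategies above: dropping rows with
# missing values versus imputing them from the other known values.
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, np.nan]})
print(df.dropna())                                       # remove the whole line
print(SimpleImputer(strategy="mean").fit_transform(df))  # automatic imputation
```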
Q18 | It's possible to specify whether the scaling process must include both mean and standard deviation using the parameters ________.
  • with_mean=true/false
  • with_std=true/false
  • both a & b
  • none of the mentioned
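For Q18, a minimal sketch (toy data invented here): StandardScaler exposes exactly the with_mean and with_std switches the question asks about.

```python
# Minimal sketch: StandardScaler's with_mean / with_std parameters.
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
scaler = StandardScaler(with_mean=True, with_std=False)  # centre only
print(scaler.fit_transform(X))
```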
Q19 | Which of the following selects the best K high-score features?
  • selectpercentile
  • featurehasher
  • selectkbest
  • all above
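For Q19, a minimal sketch on arbitrary synthetic data: SelectKBest keeps the K highest-scoring features, while SelectPercentile keeps a given percentage of them.

```python
# Minimal sketch: SelectKBest keeps the K highest-scoring features.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=3, random_state=0)
X_new = SelectKBest(score_func=f_classif, k=3).fit_transform(X, y)
print(X_new.shape)   # (200, 3)
```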
Q20 | How does the number of observations influence overfitting? Choose the correct answer(s). Note: all other parameters are the same.
  1. In case of fewer observations, it is easy to overfit the data.
  2. In case of fewer observations, it is hard to overfit the data.
  3. In case of more observations, it is easy to overfit the data.
  4. In case of more observations, it is hard to overfit the data.
  • 1 and 4
  • 2 and 3
  • 1 and 3
  • none of these
Q21 | Suppose you have fitted a complex regression model on a dataset. Now you are using ridge regression with tuning parameter lambda to reduce its complexity. Choose the option(s) below which describe the relationship of bias and variance with lambda.
  • in case of very large lambda; bias is low, variance is low
  • in case of very large lambda; bias is low, variance is high
  • in case of very large lambda; bias is high, variance is low
  • in case of very large lambda; bias is high, variance is high
Q22 | What is/are true about ridge regression?
  1. When lambda is 0, the model works like a linear regression model.
  2. When lambda is 0, the model does not work like a linear regression model.
  3. When lambda goes to infinity, we get very, very small coefficients approaching 0.
  4. When lambda goes to infinity, we get very, very large coefficients approaching infinity.
  • 1 and 3
  • 1 and 4
  • 2 and 3
  • 2 and 4
Q23 | Which of the following method(s) does not have a closed-form solution for its coefficients?
  • ridge regression
  • lasso
  • both ridge and lasso
  • neither of them
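For Q21–Q23, a minimal sketch on arbitrary synthetic data: a near-zero alpha (lambda) makes Ridge behave like plain linear regression, a very large alpha shrinks the coefficients towards zero (high bias, low variance), and Lasso, which has no closed-form solution, is solved iteratively and can push some coefficients exactly to zero.

```python
# Minimal sketch: effect of the regularization strength alpha (lambda)
# on Ridge coefficients, plus Lasso, which is fitted by coordinate
# descent because it has no closed-form solution.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso

rng = np.random.RandomState(0)
X = rng.randn(100, 5)
y = X @ np.array([3.0, -2.0, 0.0, 1.5, 0.0]) + 0.1 * rng.randn(100)

print(LinearRegression().fit(X, y).coef_)
print(Ridge(alpha=1e-8).fit(X, y).coef_)   # ~ same as plain linear regression
print(Ridge(alpha=1e4).fit(X, y).coef_)    # coefficients pushed towards 0
print(Lasso(alpha=0.5).fit(X, y).coef_)    # some coefficients exactly 0
```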
Q24 | Which function is used for linear regression in R?
  • lm(formula, data)
  • lr(formula, data)
  • lrm(formula, data)
  • regression.linear(formula, data)
Q25 | In the mathematical equation of linear regression Y = β1 + β2X + ε, (β1, β2) refers to ________.
  • (x-intercept, slope)
  • (slope, x-intercept)
  • (y-intercept, slope)
  • (slope, y-intercept)