Machine Learning Set 25 MCQs

Q1 | For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of determination is

0.25
4.00
0.75
none of the above

Answer: 0.75
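
The arithmetic behind Q1 uses the standard definition R^2 = 1 - SSE/SST. A minimal sketch, where the function name r_squared is only illustrative:

```python
def r_squared(sst: float, sse: float) -> float:
    """Coefficient of determination from the total (SST) and error (SSE) sums of squares."""
    if sst <= 0:
        raise ValueError("SST must be positive")
    return 1.0 - sse / sst


# Values from Q1: SST = 200, SSE = 50
print(r_squared(200.0, 50.0))  # 0.75
```

A value of 4.00 is SST/SSE, which is not a valid R^2; for a least-squares fit with an intercept, R^2 lies between 0 and 1.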

Q2 | A nearest neighbor approach is best used

with large-sized datasets.
when irrelevant attributes have been removed from the data.
when a generalized model of the data is desirable.
when an explanation of what has been found is of primary importance.

Answer: when irrelevant attributes have been removed from the data.

Q3 | Another name for an output attribute.

predictive variable
independent variable
estimated variable
dependent variable

Answer: dependent variable

Q4 | Classification problems are distinguished from estimation problems in that

classification problems require the output attribute to be numeric.
classification problems require the output attribute to be categorical.
classification problems do not allow an output attribute.
classification problems are designed to predict future outcome.

Answer: classification problems require the output attribute to be categorical.

Q5 | Which statement is true about prediction problems?

the output attribute must be categorical.
the output attribute must be numeric.
the resultant model is designed to determine future outcomes.
the resultant model is designed to classify current behavior.

Answer: the resultant model is designed to determine future outcomes.

Q6 | Which of the following is a common use of unsupervised clustering?

detect outliers
determine a best set of input attributes for supervised learning
evaluate the likely performance of a supervised learner model
determine if meaningful relationships can be found in a dataset

Answer: detect outliers

Q7 | The average positive difference between computed and desired outcome values.

root mean squared error
mean squared error
mean absolute error
mean positive error

Answer: mean absolute error

Q8 | Selecting data so as to ensure that each class is properly represented in both the training and test set.

cross validation
stratification
verification
bootstrapping

Answer: stratification
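
Stratification as described in Q8 can be sketched by splitting each class separately so that the test set keeps the class proportions. This is an illustrative sketch only; stratified_split, test_fraction, and seed are hypothetical names, not part of any library.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx), splitting every class in the same proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# 8 instances of class "a" and 4 of class "b": the 2:1 ratio is kept in both sets
labels = ["a"] * 8 + ["b"] * 4
train_idx, test_idx = stratified_split(labels)
```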

Q9 | The standard error is defined as the square root of this computation.

the sample variance divided by the total number of sample instances.
the population variance divided by the total number of sample instances.
the sample variance divided by the sample mean.
the population variance divided by the sample mean.

Answer: the sample variance divided by the total number of sample instances.
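
The definition in Q9 translates directly into code. A minimal sketch using Python's statistics module; the sample values are made up for illustration.

```python
import math
import statistics


def standard_error(sample):
    """Square root of (sample variance / number of sample instances)."""
    if len(sample) < 2:
        raise ValueError("need at least two instances for a sample variance")
    return math.sqrt(statistics.variance(sample) / len(sample))


# Illustrative data only
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
se = standard_error(sample)
```

statistics.variance uses the n - 1 denominator, so it is the sample variance the question refers to, not the population variance.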

Q10 | Data used to optimize the parameter settings of a supervised learner model.

training
test
verification
validation

Answer: validation

Q11 | Bootstrapping allows us to

choose the same training instance several times.
choose the same test set instance several times.
build models with alternative subsets of the training data several times.
test a model with alternative subsets of the test data several times.

Answer: choose the same training instance several times.
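
Bootstrapping in Q11 means sampling the training set with replacement, which is why the same instance can be chosen several times. A rough sketch, assuming the common convention that instances never drawn form the test set; bootstrap_sample is a hypothetical helper name.

```python
import random


def bootstrap_sample(n, seed=0):
    """Draw n training indices with replacement; never-drawn indices become the test set."""
    rng = random.Random(seed)
    train = [rng.randrange(n) for _ in range(n)]  # duplicates are allowed
    chosen = set(train)
    test = [i for i in range(n) if i not in chosen]
    return train, test


train, test = bootstrap_sample(100)
# Number of distinct training instances plus test instances always equals n
print(len(set(train)), len(test))
```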

Q12 | The correlation coefficient for two real-valued attributes is –0.85. What does this value tell you?

the attributes are not linearly related.
as the value of one attribute increases the value of the second attribute also increases.
as the value of one attribute decreases the value of the second attribute increases.
the attributes show a curvilinear relationship.

Answer: as the value of one attribute decreases the value of the second attribute increases.
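
The sign of the correlation coefficient in Q12 can be checked numerically. The sketch below computes Pearson's r by hand; the data are invented so that one attribute falls as the other rises.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Illustrative data: y tends to decrease as x increases, so r is negative
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [10.0, 8.0, 7.0, 4.0, 2.0]
r = pearson_r(xs, ys)
```

A value near -1, such as -0.85, indicates a strong negative linear relationship rather than the absence of one.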

Q13 | The average squared difference between classifier predicted output and actual output.

mean squared error
root mean squared error
mean absolute error
mean relative error

Answer: mean squared error
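
The error measures named in Q7 and Q13 differ only in how each difference is treated before averaging. A minimal sketch with made-up values:

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of the absolute (positive) differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average of the squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error, in the units of the output."""
    return math.sqrt(mean_squared_error(actual, predicted))


# Illustrative data only
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]
print(mean_absolute_error(actual, predicted))  # 1.0
print(mean_squared_error(actual, predicted))   # 1.5
```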

Q14 | Simple regression assumes a __________ relationship between the input attribute and output attribute.

linear
quadratic
reciprocal
inverse

Answer: linear

Q15 | Regression trees are often used to model _______ data.

linear
nonlinear
categorical
symmetrical

Answer: nonlinear

Q16 | The leaf nodes of a model tree are

averages of numeric output attribute values.
nonlinear regression equations.
linear regression equations.
sums of numeric output attribute values.

Answer: linear regression equations.

Q17 | Logistic regression is a ________ regression technique that is used to model data having a _____ outcome.

linear, numeric
linear, binary
nonlinear, numeric
nonlinear, binary

Answer: nonlinear, binary

Q18 | This technique associates a conditional probability value with each data instance.

linear regression
logistic regression
simple regression
multiple linear regression

Answer: logistic regression
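
The conditional probability in Q18 comes from passing a linear combination of the inputs through the logistic function. A sketch with hypothetical weights and bias, not a fitted model:

```python
import math


def logistic_probability(inputs, weights, bias):
    """P(outcome = 1 | inputs) under a logistic regression model."""
    z = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, chosen only for illustration
weights = [1.5, -2.0]
bias = 0.0
p = logistic_probability([0.4, 0.1], weights, bias)
```

The output always lies strictly between 0 and 1, which is what allows it to be read as a probability for a binary outcome.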

Q19 | This supervised learning technique can process both numeric and categorical input attributes.

linear regression
bayes classifier
logistic regression
backpropagation learning

Answer: bayes classifier

Q20 | With Bayes classifier, missing data items are

treated as equal compares.
treated as unequal compares.
replaced with a default value.
ignored.

Answer: ignored.

Q21 | This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.

agglomerative clustering
expectation maximization
conceptual clustering
k-means clustering

Answer: conceptual clustering

Q22 | This clustering algorithm initially assumes that each data instance represents a single cluster.

agglomerative clustering
conceptual clustering
k-means clustering
expectation maximization

Answer: agglomerative clustering
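
The starting assumption in Q22, one cluster per data instance, is the defining feature of agglomerative clustering. A small sketch on one-dimensional data, assuming single-link (closest pair) distance as the merge criterion; other linkage choices are equally valid.

```python
def agglomerative_1d(points, target_clusters):
    """Merge the two closest clusters repeatedly until target_clusters remain."""
    clusters = [[p] for p in points]  # every instance starts as its own cluster
    while len(clusters) > target_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single-link distance: closest pair across the two clusters
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Illustrative data only
result = agglomerative_1d([1.0, 2.0, 9.0, 10.0, 20.0], target_clusters=2)
```

K-means, by contrast, starts from a fixed number of cluster centers rather than from singleton clusters.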

Q23 | This unsupervised clustering algorithm terminates when mean values computed for the current iteration of the algorithm are identical to the computed mean values for the previous iteration.

agglomerative clustering
conceptual clustering
k-means clustering
expectation maximization

Answer: k-means clustering
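
The termination rule in Q23 is easiest to see in code. A minimal one-dimensional k-means sketch; the initial centers and data are arbitrary choices for illustration.

```python
def kmeans_1d(points, centers, max_iter=100):
    """Iterate assignment and mean update until the means stop changing."""
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: abs(p - centers[j]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center
        new_centers = [
            sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)
        ]
        if new_centers == centers:  # means identical to the previous iteration: stop
            break
        centers = new_centers
    return centers, clusters


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(points, [1.0, 2.0])
print(centers)  # [2.0, 11.0]
```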

Q24 | Machine learning techniques differ from statistical techniques in that machine learning methods

typically assume an underlying distribution for the data.
are better able to deal with missing and noisy data.
are not able to explain their behavior.
have trouble with large-sized datasets.

Answer: are better able to deal with missing and noisy data.

Q25 | In reinforcement learning, if the feedback is negative, it is defined as ____.

Penalty
Overlearning
Reward
None of above

Answer: Penalty