Machine Learning (ML) MCQ Set 03: Sample Test, Sample Questions

Question:
 A database has 5 transactions. Of these, 4 transactions include milk and bread. Further, of the given 4 transactions, 2 transactions include cheese. Find the support percentage for the following association rule “if milk and bread are purchased, then cheese is also purchased”.

1. 0.4

2. 0.6

3. 0.8

4. 0.42
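
The arithmetic behind this question can be sketched as follows. This is a minimal illustration, assuming the usual definition of rule support as the fraction of all transactions that contain both the antecedent and the consequent; the variable names are illustrative and not part of the question.

```python
# Counts taken from the question text.
total_transactions = 5        # transactions in the database
with_milk_bread = 4           # transactions containing milk and bread
with_milk_bread_cheese = 2    # of those, transactions that also contain cheese

# Assumed definition: support(A -> C) = count(A and C) / total transactions.
rule_support = with_milk_bread_cheese / total_transactions

print(f"support = {rule_support:.2f}")
```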


Question:
 Hierarchical agglomerative clustering is typically visualized as?

1.dendrogram

2.binary trees

3.block diagram

4.graph


Question:
 In a rule-based classifier, if there is a rule for each combination of attribute values, what do you call that rule set R?

1.exhaustive

2.inclusive

3.comprehensive

4.mutually exclusive


Question:
 In the Apriori algorithm, if there are 100 frequent 1-itemsets, then the number of candidate 2-itemsets is

1. 100

2. 200

3. 4950

4. 5000
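
The count can be checked with a short sketch. It assumes that all 100 1-itemsets are frequent and that candidate 2-itemsets are formed by pairing every two distinct frequent items, before any support counting; the helper name `candidate_pairs` is hypothetical.

```python
from itertools import combinations
from math import comb


def candidate_pairs(frequent_items):
    """Return every unordered pair of distinct frequent items."""
    return list(combinations(sorted(frequent_items), 2))


# Number of ways to choose 2 items out of 100.
n_candidates = comb(100, 2)

# The same count, obtained by actually enumerating the pairs.
n_enumerated = len(candidate_pairs(range(100)))

print(n_candidates, n_enumerated)
```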


Question:
 KDD represents extraction of

1.data

2.knowledge

3.rules

4.model


Question:
 What are tree based classifiers?

1.classifiers which form a tree with each attribute at one level

2. classifiers which perform a series of condition checks with one attribute at a time

3. both of the above options

4.none of the options


Question:
 What is the final resultant cluster size in the divisive algorithm, which is one of the hierarchical clustering approaches?

1.zero

2.three

3.singleton

4.two


Question:
 In which of the following cases will K-Means clustering give poor results?
1. Data points with outliers
2. Data points with different densities
3. Data points with round shapes
4. Data points with non-convex shapes

1. 1 and 2

2. 2 and 3

3. 2 and 4

4. 1, 2 and 4


Question:
 Which of the following properties are characteristic of decision trees?
(a) High bias
(b) High variance
(c) Lack of smoothness of prediction surfaces
(d) Unbounded parameter set

1.a and b

2.a and d

3.b, c and d

4.all of the above


Question:
 Which of the following sentences are correct in reference to information gain?
a. It is biased towards single-valued attributes
b. It is biased towards multi-valued attributes
c. ID3 makes use of information gain
d. The approach used by ID3 is greedy

1.a and b

2.a and d

3.b, c and d

4.all of the above


Question:
 Which one of these is not a tree based learner?

1. CART

2. ID3

3.bayesian classifier

4.random forest


Question:
A good clustering method will produce high quality clusters with

1.high inter class similarity

2.low intra class similarity

3.high intra class similarity

4.no inter class similarity


Question:
Assume that you are given a data set and a neural network model trained on the data set. You
are asked to build a decision tree model with the sole purpose of understanding/interpreting
the built neural network model. In such a scenario, which among the following measures would
you concentrate most on optimising?

1.accuracy of the decision tree model on the given data set

2.f1 measure of the decision tree model on the given data set

3. fidelity of the decision tree model, which is the fraction of instances on which the neural network and the decision tree give the same output

4.comprehensibility of the decision tree model, measured in terms of the size of the corresponding rule set


Question:
Choose the correct statement with respect to ‘confidence’ metric in association rules

1. it is the conditional probability that a randomly selected transaction will include all the items in the consequent given that the transaction includes all the items in the antecedent.

2.a high value of confidence suggests a weak association rule

3.it is the probability that a randomly selected transaction will include all the items in the consequent as well as all the items in the antecedent.

4. confidence is not measured in terms of (estimated) conditional probability.
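
Support and confidence can be estimated from a list of transactions as sketched below. The transactions are made-up illustration data, and the function names `support` and `confidence` are hypothetical helpers, not a library API.

```python
# Toy transaction database (illustrative data only).
transactions = [
    {"milk", "bread", "cheese"},
    {"milk", "bread"},
    {"milk", "bread", "cheese"},
    {"milk", "bread"},
    {"eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) over the transactions."""
    joint = support(antecedent | consequent, transactions)
    return joint / support(antecedent, transactions)


rule_confidence = confidence({"milk", "bread"}, {"cheese"}, transactions)
print(f"confidence = {rule_confidence:.2f}")
```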


Question:
Classification rules are extracted from _____________

1.decision tree

2.root node

3.branches

4.siblings


Question:
Clustering is ___________ and is example of ____________learning

1. predictive and supervised

2. predictive and unsupervised

3.descriptive and supervised

4.descriptive and unsupervised


Question:
Frequent item sets is

1.superset of only closed frequent item sets

2.superset of only maximal frequent item sets

3.subset of maximal frequent item sets

4.superset of both closed frequent item sets and maximal frequent item sets


Question:
Given a frequent itemset L, If |L| = k, then there are

1. 2^k - 1 candidate association rules

2. 2^k candidate association rules

3. 2^k - 2 candidate association rules

4. 2^(k-2) candidate association rules
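
The number of candidate rules can be explored by enumeration. The sketch below assumes that a candidate rule splits the frequent itemset L into a non-empty antecedent and a non-empty consequent that together cover L; `candidate_rules` is a hypothetical helper name.

```python
from itertools import combinations


def candidate_rules(itemset):
    """List (antecedent, consequent) pairs that partition the itemset.

    Both sides must be non-empty, so the empty set and the full
    itemset are excluded as antecedents.
    """
    items = sorted(itemset)
    rules = []
    for size in range(1, len(items)):
        for antecedent in combinations(items, size):
            consequent = tuple(i for i in items if i not in antecedent)
            rules.append((antecedent, consequent))
    return rules


for k in range(1, 6):
    print(k, len(candidate_rules(range(k))))
```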


Question:
Having built a decision tree, we are using reduced error pruning to reduce the size of the
tree. We select a node to collapse. For this particular node, on the left branch, there are 3
training data points with the following outputs: 5, 7, 9.6 and for the right branch, there are
four training data points with the following outputs: 8.7, 9.8, 10.5, 11. What were the original
responses for data points along the two branches (left & right respectively) and what is the
new response after collapsing the node?

1. 10.8, 13.33, 14.48

2. 10.8, 13.33, 12.06

3. 7.2, 10, 8.8

4. 7.2, 10, 8.6
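
The responses in this question can be worked out numerically. The sketch assumes a regression tree in which each leaf predicts the mean of the training outputs it contains, so collapsing a node replaces the two leaf means with the mean over all points under that node.

```python
from statistics import mean

# Training outputs along each branch, as given in the question.
left_outputs = [5, 7, 9.6]
right_outputs = [8.7, 9.8, 10.5, 11]

# Assumed leaf response: mean of the training outputs in the leaf.
left_response = mean(left_outputs)
right_response = mean(right_outputs)

# After collapsing, the node becomes one leaf over all seven points.
collapsed_response = mean(left_outputs + right_outputs)

print(round(left_response, 2), round(right_response, 2),
      round(collapsed_response, 2))
```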


Question:
This clustering approach initially assumes that each data instance represents a single cluster.

1. expectation maximization

2. k-means clustering

3. agglomerative clustering

4. conceptual clustering


Question:
How can we best represent ‘support’ for the following association rule: “If X and Y, then Z”.

1. {X,Y}/(total number of transactions)

2. {Z}/(total number of transactions)

3. {Z}/{X,Y}

4. {X,Y,Z}/(total number of transactions)


Question:
How to select best hyperparameters in tree based models?

1.measure performance over training data

2.measure performance over validation data

3.both of these

4.random selection of hyper parameters


Question:
How will you counter over-fitting in decision tree?

1. by pruning the longer rules

2.by creating new rules

3. both ‘by pruning the longer rules’ and ‘by creating new rules’

4. none of the options


Question:
If an item set ‘XYZ’ is a frequent item set, then all subsets of that frequent item set are

1.undefined

2.not frequent

3.frequent

4.can not say


Question:
If {A,B,C,D} is a frequent itemset, the candidate rule which is not possible is

1. C -> A

2. D -> ABCD

3. A -> BC

4. B -> ADC


Question:
Lasso can be interpreted as least-squares linear regression where

1. weights are regularized with the L1 norm

2. the weights have a Gaussian prior

3. weights are regularized with the L2 norm

4. the solution algorithm is simpler


Question:
Machine learning techniques differ from statistical techniques in that machine learning methods

1.are better able to deal with missing and noisy data

2.typically assume an underlying distribution for the data

3.have trouble with large-sized datasets

4. are not able to explain their behavior


Question:
Suppose on performing reduced error pruning, we collapsed a node and observed an improvement in the prediction accuracy on the validation set. Which among the following statements are possible in light of the performance improvement observed?
(a) The collapsed node helped overcome the effect of one or more noise affected data points in the training set
(b) The validation set had one or more noise affected data points in the region corresponding to the collapsed node
(c) The validation set did not have any data points along at least one of the collapsed branches
(d) The validation set did have data points adversely affected by the collapsed node

1.a and b

2.a and d

3.b, c and d

4.all of the above


Question:
The apriori property means

1.if a set cannot pass a test, its supersets will also fail the same test

2.to decrease the efficiency, do level-wise generation of frequent item sets

3. to improve the efficiency, do level-wise generation of frequent item sets

4.if a set can pass a test, its supersets will fail the same test


Question:
The distance between two points calculated using Pythagoras theorem is

1. supremum distance

2. Euclidean distance

3. linear distance

4. Manhattan distance


Question:
The most general form of distance is

1. Manhattan

2. Euclidean

3. mean

4. Minkowski
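
The relationship between these distances can be illustrated with a small sketch. It assumes the standard Minkowski definition with order r, where r = 1 gives the Manhattan distance and r = 2 gives the Euclidean distance, and treats the supremum distance as the limiting case for large r. The function names are illustrative.

```python
def minkowski(p, q, r):
    """Minkowski distance of order r between two equal-length points."""
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1 / r)


def supremum(p, q):
    """Largest absolute difference over any single coordinate."""
    return max(abs(a - b) for a, b in zip(p, q))


p, q = (0, 0), (3, 4)
manhattan = minkowski(p, q, 1)   # sum of coordinate differences
euclidean = minkowski(p, q, 2)   # Pythagoras on the differences
print(manhattan, euclidean, supremum(p, q))
```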


Question:
The number of iterations in Apriori ___________

1. increases with the size of the data

2.decreases with the increase in size of the data

3. increases with the size of the maximum frequent set

4.decreases with increase in size of the maximum frequent set


Question:
The probability that a person owns a sports car given that they subscribe to an automotive magazine is 40%. We also know that 3% of the adult population subscribes to an automotive magazine. The probability of a person owning a sports car given that they don’t subscribe to an automotive magazine is 30%. Use this information to compute the probability that a person subscribes to an automotive magazine given that they own a sports car.

1. 0.0368

2. 0.0396

3. 0.0389

4. 0.0398
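
The computation is an application of Bayes' theorem together with the law of total probability. A minimal sketch using only the figures stated in the question (variable names are illustrative):

```python
# Figures from the question.
p_car_given_sub = 0.40      # P(sports car | subscribes)
p_sub = 0.03                # P(subscribes)
p_car_given_not_sub = 0.30  # P(sports car | does not subscribe)

# Law of total probability: P(sports car).
p_car = p_car_given_sub * p_sub + p_car_given_not_sub * (1 - p_sub)

# Bayes' theorem: P(subscribes | sports car).
p_sub_given_car = p_car_given_sub * p_sub / p_car

print(f"P(subscribes | sports car) = {p_sub_given_car:.4f}")
```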


Question:
The _______ step eliminates the extensions of (k-1)-itemsets which are not found to be frequent, from being considered for counting support

1. partitioning candidate generation

2.candidate generation

3.itemset eliminations

4.pruning


Question:
This clustering algorithm terminates when mean values computed for the current iteration of the algorithm are identical to the computed mean values for the previous iteration

1.k-means clustering

2.conceptual clustering

3.expectation maximization

4.agglomerative clustering
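
The stopping rule described in this question can be sketched in a few lines. This is a simplified illustration in plain Python, not a reference implementation: it assumes squared Euclidean distance, caller-supplied initial centers, and that an empty cluster keeps its previous center. The function name `kmeans` and its parameters are illustrative.

```python
def kmeans(points, centers, max_iter=100):
    """Iterate assignment and mean updates until the means stop changing.

    Returns the final centers and the number of iterations performed.
    """
    centers = list(centers)
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            clusters[nearest].append(p)

        # Update step: recompute each center as the mean of its cluster.
        new_centers = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]

        # Termination: means identical to the previous iteration.
        if new_centers == centers:
            return centers, iteration
        centers = new_centers
    return centers, max_iter


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
final_centers, iterations = kmeans(points, [(0, 0), (10, 0)])
print(final_centers, iterations)
```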


Question:
Time Complexity of k-means is given by

1. O(mn)

2. O(tkn)

3. O(kn)

4. O(t^2 kn)


Question:
To control the size of the tree, we need to control the number of regions. One approach to do this would be to split tree nodes only if the resultant decrease in the sum of squares error exceeds some threshold. For the described method, which among the following are true?
(a) It would, in general, help restrict the size of the trees
(b) It has the potential to affect the performance of the resultant regression/classification model
(c) It is computationally infeasible

1.a and b

2.a and d

3.b, c and d

4.all of the above


Question:
To determine association rules from frequent item sets

1.only minimum confidence needed

2. neither support nor confidence needed

3.both minimum support and confidence are needed

4. minimum support is needed


Question:
What are the two approaches to tree pruning?

1.pessimistic pruning and optimistic pruning

2.postpruning and prepruning

3.cost complexity pruning and time complexity pruning

4.none of the options


Question:
What does K refer to in the K-Means algorithm, which is a non-hierarchical clustering approach?

1.complexity

2. fixed value

3.no of iterations

4.number of clusters


Question:
What is Decision Tree?

1.flow-chart

2.structure in which internal node represents test on an attribute, each branch represents outcome of test and each leaf node represents class label

3.flow-chart like structure in which internal node represents test on an attribute, each branch represents outcome of test and each leaf node represents class label

4.None of the above


Question:
What is gini index?

1. it is a type of index structure

2. it is a measure of purity

3. both of the above options

4.none of the options
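
For context, the Gini index used in decision tree learning can be computed as sketched below. The sketch assumes the common definition, one minus the sum of squared class proportions, under which a node containing a single class scores 0; the function name `gini` is illustrative.

```python
from collections import Counter


def gini(labels):
    """Gini index of a collection of class labels.

    Assumed definition: 1 - sum over classes of (class proportion)^2.
    """
    n = len(labels)
    return 1 - sum((count / n) ** 2 for count in Counter(labels).values())


pure_node = gini(["yes", "yes", "yes", "yes"])
mixed_node = gini(["yes", "yes", "no", "no"])
print(pure_node, mixed_node)
```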


Question:
What is the approach of the basic algorithm for decision tree induction?

1. greedy

2. top down

3. procedural

4. step by step


Question:
What is true about K-Means clustering?
1. K-means is extremely sensitive to cluster center initializations
2. Bad initialization can lead to poor convergence speed
3. Bad initialization can lead to bad overall clustering

1. 1 and 3

2. 1 and 2

3. 2 and 3

4. 1, 2 and 3


Question:
Which among the following statements best describes our approach to learning decision trees?

1. identify the best partition of the input space and response per partition to minimise sum of squares error

2. identify the best approximation of the above by the greedy approach (to identifying the partitions)

3. identify the model which gives the best performance using the greedy approximation (option 2) with the smallest partition scheme

4. identify the model which gives performance close to the best greedy approximation performance (option 2) with the smallest partition scheme


Question:
Which Association Rule would you prefer

1.high support and medium confidence

2. high support and low confidence

3. low support and high confidence

4.low support and low confidence


Question:
Which Association Rule would you prefer

1.high support and low confidence

2.low support and high confidence

3.low support and low confidence

4. high support and medium confidence


Question:
Which of the following algorithms comes under classification?

1.apriori

2.brute force

3.dbscan

4.k-nearest neighbor


Question:
Which of the following classifications would best suit the student performance classification systems?

1. if...then... analysis

2. market-basket analysis

3.regression analysis

4.cluster analysis


Question:
Which of the following option is true about k-NN algorithm?

1. it can be used for classification

2. it can be used for regression

3. it can be used in both classification and regression

4. not useful in ML algorithms


Question:
Which of the following sentences are true?

1.in pre-pruning a tree is pruned by halting its construction early

2. a pruning set of class labelled tuples is used to estimate cost complexity

3. the best pruned tree is the one that minimizes the number of encoding bits

4.All of the above


Question:
Which one of these is a tree based learner?

1.rule based

2.bayesian belief network

3.bayesian classifier

4.random forest


Question:
Which statement is not true?

1. k-means clustering is a linear clustering algorithm.

2. k-means clustering aims to partition n observations into k clusters

3. k-nearest neighbor is same as k-means

4. k-means is sensitive to outliers


Question:
Which statement is true about the K-Means algorithm?

1. the output attribute must be categorical

2. all attribute values must be categorical

3.all attributes must be numeric

4.attribute values may be either categorical or numeric

