A database query is used to uncover which type of knowledge?
1.deep
2.hidden
3.shallow
4.multidimensional
If a machine learning model's output does not involve a target variable, the model is called a
1.descriptive model
2. predictive model
3.reinforcement learning
4. all of the above
In which type of feature selection method do we start with an empty feature set?
1.forward feature selection
2.both forward and backward feature selection
3.backward feature selection
4.None of the above
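As an illustration of greedy feature selection that starts from an empty set, here is a minimal sketch. The `score` callback, the toy feature names, and their weights are all hypothetical; in practice the score would usually come from cross-validated model performance.

```python
def forward_select(features, score, max_features):
    """Greedy forward selection: start empty, add one feature per round."""
    selected = []
    remaining = list(features)
    best_score = score(frozenset())  # score of the empty feature set
    while remaining and len(selected) < max_features:
        # Try adding each remaining feature and keep the one that scores best.
        candidate, candidate_score = max(
            ((f, score(frozenset(selected + [f]))) for f in remaining),
            key=lambda pair: pair[1],
        )
        if candidate_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
    return selected


# Hypothetical additive usefulness of each feature (made-up numbers).
TOY_WEIGHTS = {"age": 0.5, "income": 0.3, "noise": -0.1}


def toy_score(subset):
    return sum(TOY_WEIGHTS[f] for f in subset)


print(forward_select(["noise", "age", "income"], toy_score, 3))
```

With these made-up weights the procedure picks "age" first, then "income", and stops because adding "noise" would lower the score.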
In multiclass classification, the number of classes must be
1. less than two
2.equal to two
3.greater than two
4.option 1 and option 2
In the example of predicting the number of babies based on the stork population, the number of babies is the
1.outcome
2.feature
3.observation
4.attribute
PCA works better if: 1. there is a linear structure in the data; 2. the data lies on a curved surface and not on a flat surface; 3. the variables are scaled in the same unit
1.1 and 2
2.2 and 3
3.1 and 3
4.1,2 and 3
A feature F1 can take the values A, B, C, D, E, and F, and represents the grades of students from a college. The feature type here is
1.nominal
2.ordinal
3.categorical
4.boolean
A measurable property or parameter of the dataset is a
1.training data
2.feature
3.test data
4.validation data
A person trained to interact with a human expert in order to capture their knowledge.
1. knowledge programmer
2.knowledge developer
3.knowledge engineer
4.knowledge extractor
A student's grade is a variable F1 that takes a value from A, B, C, and D. Which of the following is true in this case?
1.variable F1 is an example of a nominal variable
2.variable F1 is an example of an ordinal variable
3.it doesn't belong to any of the mentioned categories
4.it belongs to both the ordinal and nominal categories
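When letter grades are treated as ranked values, one common way to handle them is to map each grade to an integer that preserves the order. The sketch below assumes the ranking D < C < B < A; the names `GRADE_ORDER`, `encode_grades`, and `is_better` are illustrative, not taken from any library.

```python
# Assumed ranking, from lowest to highest grade.
GRADE_ORDER = ["D", "C", "B", "A"]
GRADE_RANK = {grade: rank for rank, grade in enumerate(GRADE_ORDER)}


def encode_grades(grades):
    """Replace each letter grade with its integer rank."""
    return [GRADE_RANK[g] for g in grades]


def is_better(a, b):
    """Return True if grade a ranks strictly above grade b."""
    return GRADE_RANK[a] > GRADE_RANK[b]


print(encode_grades(["A", "C", "D"]))  # [3, 1, 0]
print(is_better("A", "B"))             # True
```

The integers carry order only; the gap between ranks is not meant to represent a measured distance between grades.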
Application of machine learning methods to large databases is called
1.data mining
2.artificial intelligence
3.big data computing
4.internet of things
Data used to build a data mining model.
1.training data
2.validation data
3.test data
4.hidden data
Which of the following is not a learning method?
1.memorization
2.analogy
3.deduction
4.introduction
A feature can be used as a
1.binary split
2.predictor
3.both 1 and 2
4.None of the above
Which of the following are descriptive models?
1.clustering
2.classification
3.association rule
4.both 1 and 3
Which of the following are types of supervised learning?
1.classification
2.regression
3.subgroup discovery
4.all of the above
If a machine learning model's output involves a target variable, the model is called a
1.descriptive model
2.predictive model
3.reinforcement learning
4.All of the above
Imagine a newborn learning to walk. It tries to find a suitable policy for walking through repeated falling and getting up. Which type of machine learning is best suited to this?
1.classification
2.regression
3.k-means algorithm
4.reinforcement learning
What is the impact of high variance on the training set?
1. overfitting
2.underfitting
3.both underfitting & overfitting
4.depends upon the dataset
In simple terms, machine learning is
1.training based on historical data
2.prediction to answer a query
3.both 1 and 2
4.automation of complex tasks
In which type of learning is labelled training data used?
1.unsupervised learning
2.supervised learning
3.reinforcement learning
4.active learning
Like the probabilistic view, the ________ view allows us to associate a probability of membership with each classification.
1.exemplar
2.deductive
3.classical
4.inductive
Which of the following examples would you address using a supervised learning algorithm?
1.given emails labeled as spam or not spam, learn a spam filter
2.given a set of news articles found on the web, group them into sets of articles about the same story.
3.given a database of customer data, automatically discover market segments and group customers into different market segments.
4.find the patterns in market basket analysis
Which of the following is a powerful distance metric used by geometric models?
1.euclidean distance
2.manhattan distance
3.both 1 and 2
4.square distance
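For reference, the Euclidean and Manhattan distances between two points can be computed as follows. This is a minimal standard-library sketch; the function names are illustrative.

```python
import math


def euclidean(p, q):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """City-block distance: sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


print(euclidean((0, 0), (3, 4)))  # 5.0
print(manhattan((0, 0), (3, 4)))  # 7
```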
PCA is
1.forward feature selection
2.backward feature selection
3.feature extraction
4.all of the above
Prediction is
1. the result of application of specific theory or rule in a specific case
2.discipline in statistics used to find projections in multidimensional data
3.value entered in database by expert
4.independent of data
Select the correct answer for the following statements. 1. Filter methods are much faster than wrapper methods. 2. Wrapper methods use statistical methods to evaluate a subset of features, while filter methods use cross-validation.
1.both are true
2.1 is true and 2 is false
3.both are false
4.1 is false and 2 is true
A telecommunications company wants to segment its customers into distinct groups. This is an example of
1.supervised learning
2.reinforcement learning
3.unsupervised learning
4.data extraction
Supervised learning and unsupervised clustering both require which of the following?
1.output attribute
2.hidden attribute
3. input attribute
4.categorical attribute
Support Vector Machine is
1.logical model
2.probabilistic model
3.geometric model
4.none of the above
The "curse of dimensionality" refers to
1.all the problems that arise when working with data in the higher dimensions, that did not exist in the lower dimensions.
2.all the problems that arise when working with data in the lower dimensions, that did not exist in the higher dimensions.
3.all the problems that arise when working with data in the lower dimensions, that did not exist in the lower dimensions.
4.all the problems that arise when working with data in the higher dimensions, that did not exist in the higher dimensions.
The effectiveness of an SVM depends upon:
1.selection of kernel
2.kernel parameters
3.soft margin parameter c
4.all of the above
The most popularly used dimensionality reduction algorithm is Principal Component Analysis (PCA). Which of the following is/are true about PCA? 1. PCA is an unsupervised method. 2. It searches for the directions in which the data have the largest variance. 3. Maximum number of principal components <= number of features. 4. All principal components are orthogonal to each other.
1.1 & 2
2.2 & 3
3.3 & 4
4.all of the above
The output of the training process in machine learning is
1. machine learning model
2.machine learning algorithm
3.null
4.accuracy
The problem of finding hidden structure in unlabeled data is called…
1.supervised learning
2.unsupervised learning
3.reinforcement learning
4.None of the above
The matrix decomposition model is a type of
1.descriptive model
2.logical model
3.logical model
4.none of the above
What can be a major issue in Leave-One-Out Cross-Validation (LOOCV)?
1.low variance
2.high variance
3. faster runtime compared to k-fold cross validation
4. slower runtime compared to normal validation
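To make the cost of LOOCV concrete, the sketch below generates index splits for leave-one-out and for k-fold validation and counts how many model fits each would need. The helper names are illustrative and no model is actually trained.

```python
def leave_one_out_splits(n_samples):
    """Yield (train, test) index lists, holding out one sample per split."""
    indices = list(range(n_samples))
    for held_out in indices:
        train = [i for i in indices if i != held_out]
        yield train, [held_out]


def k_fold_splits(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


n = 100
print(len(list(leave_one_out_splits(n))))  # 100 splits, so 100 model fits
print(len(list(k_fold_splits(n, 5))))      # 5 splits, so 5 model fits
```

LOOCV needs one fit per sample, so the number of fits grows linearly with the dataset size, while k-fold stays at k fits.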
What characterizes a hyperplane in a geometric model of machine learning?
1.a plane with 1 dimension fewer than the number of input attributes
2.a plane with 2 dimensions fewer than the number of input attributes
3.a plane with 1 dimension more than the number of input attributes
4.a plane with 2 dimensions more than the number of input attributes
What characterizes unlabeled examples in machine learning?
1.there is no prior knowledge
2.there is no confusing knowledge
3.there is prior knowledge
4.there is plenty of confusing knowledge
What do you mean by a hard margin?
1.the svm allows very low error in classification
2. the svm allows high amount of error in classification
3.both 1 & 2
4.none of the above
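In the usual formulation, a hard-margin linear SVM requires every training point to satisfy y * (w . x + b) >= 1, with labels y in {-1, +1}. The sketch below only checks that constraint for a given, hand-picked hyperplane; it does not train an SVM, and the points and weights are made up.

```python
def satisfies_hard_margin(points, labels, w, b):
    """Check y * (w . x + b) >= 1 for every (x, y) pair."""
    return all(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) >= 1
        for x, y in zip(points, labels)
    )


# Made-up 2D points separated by the vertical line x1 = 0.
points = [(2, 0), (3, 1), (-2, 0), (-3, -1)]
labels = [1, 1, -1, -1]

print(satisfies_hard_margin(points, labels, w=(1, 0), b=0))  # True

# A positive point inside the margin violates the constraint.
print(satisfies_hard_margin(points + [(0.5, 0)], labels + [1], w=(1, 0), b=0))  # False
```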
What does dimensionality reduction reduce?
1.stochastics
2.collinearity
3.performance
4.entropy
Which type of learning requires self-assessment to identify patterns within data?
1.unsupervised learning
2.supervised learning
3.semisupervised learning
4.reinforcement learning
Which of the following is an example of feature extraction?
1.constructing a bag of words from an email
2.applying pca to project high dimensional data
3.removing stop words
4.forward selection
Which of the following can only be used when the training data are linearly separable?
1.linear hard-margin svm
2. linear logistic regression
3.linear soft margin svm
4.the centroid method
Which of the following is a good test dataset characteristic?
1.large enough to yield meaningful results
2. is representative of the dataset as a whole
3.both 1 and 2
4.none of the above
Which of the following is a reasonable way to select the number of principal components "k"?
1.choose k to be the smallest value so that at least 99% of the variance is retained.
2.choose k to be 99% of m (k = 0.99*m, rounded to the nearest integer).
3.choose k to be the largest value so that 99% of the variance is retained.
4. use the elbow method
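One way to turn a variance-retention rule into code is to accumulate the sorted eigenvalues of the covariance matrix until the retained fraction reaches a target. The sketch below assumes the eigenvalues are already available; the numbers and the function name are made up for illustration.

```python
def smallest_k_for_variance(eigenvalues, target=0.99):
    """Return the smallest k whose top-k eigenvalues retain `target` of the variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return k
    return len(ordered)


# Made-up eigenvalues of a covariance matrix with five features.
eigenvalues = [6.0, 3.0, 0.95, 0.04, 0.01]
print(smallest_k_for_variance(eigenvalues))  # 3
```

Here the first three components retain about 99.5% of the total variance, so k = 3 is the smallest value that meets the 99% target.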
Which of the following characterizes the best machine learning method?
1.scalable
2.accurate
3.fast
4.All of the above
Which of the following techniques would perform better for reducing dimensions of a data set?
1.removing columns which have too many missing values
2.removing columns which have high variance in data
3.removing columns with dissimilar data trends
4.None of these
You are given reviews of a few Netflix series marked as positive, negative, and neutral. Classifying reviews of a new Netflix series is an example of
1.supervised learning
2.unsupervised learning
3.semisupervised learning
4.reinforcement learning
You are given seismic data and want to predict the next earthquake. This is an example of
1.supervised learning
2. reinforcement learning
3.unsupervised learning
4.dimensionality reduction