High Performance Computing (HPC) MCQ Set 1

Question:

Cost-optimal parallel systems have an efficiency of ___

1. 1

2. n

3. log n

4. complex

Posted Date:-2022-07-14 11:46:22

Question:

The CUDA hardware programming model supports:
a) Fully general data-parallel architecture;
b) General thread launch;
c) Global load-store;
d) Parallel data cache;
e) Scalar architecture;
f) Integers, bit operations

1. a, c, d, f

2. b, c, d, e

3. a, d, e, f

4. a, b, c, d, e, f

Posted Date:-2022-07-14 08:23:31

Question:

CUDA supports ____________, in which code in a single thread is executed by all other threads.

1. thread division

2. thread termination

3. thread abstraction

4. None of the above

Posted Date:-2022-07-14 07:50:59

Question:

FADD, FMAD, FMIN, FMAX are ----- supported by the scalar processors of NVIDIA GPUs.

1. 32-bit IEEE floating-point instructions

2. 32-bit integer instructions

3. both

4. none of the above

Posted Date:-2022-07-14 08:04:42

Question:

If a is a host variable and dev_a is a device (GPU) variable, select the correct statement to allocate memory for dev_a:

1. cudaMalloc( &dev_a, sizeof( int ) )

2. malloc( &dev_a, sizeof( int ) )

3. cudaMalloc( (void**) &dev_a, sizeof( int ) )

4. malloc( (void**) &dev_a, sizeof( int ) )

Posted Date:-2022-07-14 09:28:24

Question:

If a is a host variable and dev_a is a device (GPU) variable, select the correct statement to copy input from a to dev_a:

1. memcpy( dev_a, &a, size );

2. cudaMemcpy( dev_a, &a, size, cudaMemcpyHostToDevice );

3. memcpy( (void*) dev_a, &a, size );

4. cudaMemcpy( (void*) &dev_a, &a, size, cudaMemcpyDeviceToHost );

Posted Date:-2022-07-14 09:29:32

Question:

In ___________, the number of elements to be sorted is small enough to fit into the process's main memory.

1. internal sorting

2. internal searching

3. external sorting

4. external searching

Posted Date:-2022-07-14 09:34:31

Question:

The fundamental operation of comparison-based sorting is ________.

1. compare-exchange

2. searching

3. sorting

4. swapping

Posted Date:-2022-07-14 09:37:14
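
As a study aid, the compare-exchange step can be sketched in Python. The sketch below uses it inside an odd-even transposition sort, which is one common textbook sorting network and is chosen here only for illustration. The function names are assumptions, not part of the question set, and the loop runs sequentially even though each phase could run its compare-exchanges in parallel.

```python
def compare_exchange(a, i, j):
    """Leave the smaller of a[i], a[j] at index i and the larger at index j."""
    if a[i] > a[j]:
        a[i], a[j] = a[j], a[i]


def odd_even_transposition_sort(values):
    """Sort a copy of values using n phases of neighbouring compare-exchanges."""
    a = list(values)
    n = len(a)
    for phase in range(n):
        # Even phases pair (0,1), (2,3), ...; odd phases pair (1,2), (3,4), ...
        start = phase % 2
        for i in range(start, n - 1, 2):
            compare_exchange(a, i, i + 1)
    return a


print(odd_even_transposition_sort([5, 2, 9, 1, 5, 6]))
```

Within one phase the pairs are disjoint, which is why the compare-exchanges of a phase are independent of each other.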

Question:

The kernel code is identified by the ________ qualifier with a void return type.

1. __host__

2. __global__

3. __device__

4. void

Posted Date:-2022-07-14 07:11:49

Question:

The n × n matrix is partitioned among n² processors such that each processor owns a _____ element.

1. n

2. 2n

3. single

4. double

Posted Date:-2022-07-14 11:47:15

Question:

What makes CUDA code run in parallel?

1. __global__ indicates parallel execution of code

2. main() function indicates parallel execution of code

3. kernel name outside the triple angle brackets indicates execution of the kernel n times in parallel

4. first parameter value inside the triple angle brackets (n) indicates execution of the kernel n times in parallel

Posted Date:-2022-07-14 09:33:30

Question:

In CUDA, a single invoked kernel is referred to as a _____.

1. block

2. thread

3. grid

4. None of the above

Posted Date:-2022-07-14 07:52:40

Question:

A block is comprised of multiple _______.

1. threads

2. bunch

3. host

4. None of the above

Posted Date:-2022-07-14 07:54:39

Question:

A CUDA program is comprised of two primary components: a host and a _____.

1. GPU kernel

2. CPU kernel

3. OS

4. none of the above

Posted Date:-2022-07-14 07:11:12

Question:

A grid is comprised of ________ of threads.

1. block

2. bunch

3. host

4. None of the above

Posted Date:-2022-07-14 07:53:46

Question:

A parallel algorithm is evaluated by its runtime as a function of

1. the input size

2. the number of processors

3. the communication parameters

4. all

Posted Date:-2022-07-14 11:56:25

Question:

A solution to the problem of representing parallelism in an algorithm is

1. CUD

2. PTA

3. CDA

4. CUDA

Posted Date:-2022-07-14 07:55:14

Question:

Any condition that causes a processor to stall is called a _____.

1. hazard

2. page fault

3. system error

4. none of the above

Posted Date:-2022-07-14 07:56:53

Question:

Breadth-first search is equivalent to which traversal of a binary tree?

1. pre-order traversal

2. post-order traversal

3. level-order traversal

4. in-order traversal

Posted Date:-2022-07-14 09:44:50
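
A minimal Python sketch of breadth-first search may help with the BFS questions in this set. The adjacency-dictionary representation, the function name `bfs_order`, and the sample tree are assumptions made for illustration. Each vertex is enqueued at most once and each adjacency list is scanned once.

```python
from collections import deque


def bfs_order(adjacency, source):
    """Return vertices in the order BFS dequeues them, starting at source."""
    visited = {source}          # vertices that have already been enqueued
    order = []
    queue = deque([source])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in adjacency.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return order


# A small binary tree: 1 is the root, 2 and 3 its children, and so on.
tree = {1: [2, 3], 2: [4, 5], 3: [6, 7]}
print(bfs_order(tree, 1))
```

On the sample tree the nodes come out one level at a time, from the root downward.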

Question:

C(W) __ Θ(W) for optimality (necessary condition).

1. >

2. <

3. <=

4. equals

Posted Date:-2022-07-14 11:58:26

Question:

Calling a kernel is typically referred to as _________.

1. kernel thread

2. kernel initialization

3. kernel termination

4. kernel invocation

Posted Date:-2022-07-14 07:13:31

Question:

The cost of a parallel system is sometimes referred to as ____.

1. work

2. processor-time product

3. both

4. None of the above

Posted Date:-2022-07-14 09:53:21

Question:

CUDA provides ------- warp and thread scheduling. Also, the overhead of thread creation is on the order of ----.

1. "programming-overhead", 2 clock

2. "zero-overhead", 1 clock

3. 64, 2 clock

4. 32, 1 clock

Posted Date:-2022-07-14 08:07:36

Question:

CUDA stands for --------, designed by NVIDIA.

1. common union discrete architecture

2. complex unidentified device architecture

3. compute unified device architecture

4. complex unstructured distributed architecture

Posted Date:-2022-07-14 08:14:05

Question:

CUDA supports programming in ....

1. C or C++ only

2. Java, Python, and more

3. C, C++, third-party wrappers for Java, Python, and more

4. Pascal

Posted Date:-2022-07-14 08:03:31

Question:

Data items must be combined piece-wise and the result made available at

1. target processor finally

2. target variable finally

3. both (a) and (b)

4. None of these

Posted Date:-2022-07-14 12:07:53

Question:

Each NVIDIA GPU has ------ Streaming Multiprocessors

1. 8

2. 1024

3. 512

4. 16

Posted Date:-2022-07-14 08:06:45

Question:

Each streaming multiprocessor (SM) of CUDA hardware has ------ scalar processors (SP).

1. 1024

2. 128

3. 512

4. 8

Posted Date:-2022-07-14 08:05:52

Question:

Each warp of a GPU receives a single instruction and "broadcasts" it to all of its threads. It is a ---- operation.

1. SIMD (single instruction, multiple data)

2. SIMT (single instruction, multiple thread)

3. SISD (single instruction, single data)

4. SIST (single instruction, single thread)

Posted Date:-2022-07-14 08:08:27

Question:

Efficient implementation of basic communication operations can improve

1. performance

2. communication

3. algorithm

4. all

Posted Date:-2022-07-14 12:02:23

Question:

Efficient use of basic communication operations can reduce

1. development effort

2. software quality

3. both

4. none

Posted Date:-2022-07-14 12:03:35

Question:

For a problem consisting of W units of work, p __ W processors can be used optimally.

1. <=

2. >=

3. <

4. >

Posted Date:-2022-07-14 11:57:22

Question:

Graph search involves a closed list, where the major operation is a _______

1. sorting

2. searching

3. lookup

4. None of the above

Posted Date:-2022-07-14 09:44:02

Question:

Group communication operations are built using _____ messaging primitives.

1. point-to-point

2. one-to-all

3. all-to-one

4. none

Posted Date:-2022-07-14 12:04:53

Question:

Host code in a CUDA application cannot reset a device.

1. true

2. false

3. all

4. None of these

Posted Date:-2022-07-14 07:56:07

Question:

How many basic communication operations are used in matrix-vector multiplication?

1. 1

2. 2

3. 3

4. 4

Posted Date:-2022-07-14 11:49:44

Question:

IADD, IMUL24, IMAD24, IMIN, IMAX are ----------- supported by the scalar processors of NVIDIA GPUs.

1. 32-bit IEEE floating-point instructions

2. 32-bit integer instructions

3. both

4. none of the above

Posted Date:-2022-07-14 08:22:52

Question:

In BFS, how many times is a node visited?

1. once

2. twice

3. equal to the in-degree of the node

4. thrice

Posted Date:-2022-07-14 09:49:07

Question:

The CUDA memory model provides the following memory types:
a) Registers;
b) Local Memory;
c) Shared Memory;
d) Global Memory;
e) Constant Memory;
f) Texture Memory.

1. a, b, d, f

2. a, c, d, e, f

3. a, b, c, d, e, f

4. b, c, e, f

Posted Date:-2022-07-14 08:24:40

Question:

The DNS algorithm for matrix multiplication uses

1. 1D partition

2. 2D partition

3. 3D partition

4. both a and b

Posted Date:-2022-07-14 11:50:31

Question:

In pipelined execution, the steps include

1. normalization

2. communication

3. elimination

4. all

Posted Date:-2022-07-14 11:51:23

Question:

Limitations of a CUDA kernel:

1. recursion, call stack, static variable declaration

2. no recursion, no call stack, no static variable declarations

3. recursion, no call stack, static variable declaration

4. no recursion, call stack, no static variable declarations

Posted Date:-2022-07-14 08:09:55

Question:

Many interactions in practical parallel programs occur in _____ patterns.

1. well-defined

2. zig-zag

3. reverse

4. straight

Posted Date:-2022-07-14 11:59:14

Question:

Mathematically, efficiency is

1. E = S/p

2. E = p/S

3. E*S = p/2

4. E = p + E/E

Posted Date:-2022-07-14 09:52:40
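
For reference, the usual textbook definitions of speedup, efficiency, and cost can be written as a short Python sketch. The function names and the sample timings are illustrative assumptions and do not come from the question set.

```python
def speedup(t_serial, t_parallel):
    """Speedup S: serial run time divided by parallel run time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, p):
    """Efficiency E: speedup divided by the number of processing elements p."""
    return speedup(t_serial, t_parallel) / p


def cost(t_parallel, p):
    """Cost (processor-time product): p multiplied by the parallel run time."""
    return p * t_parallel


# Hypothetical timings: 100 s serially, 30 s on 4 processing elements.
print(speedup(100.0, 30.0), efficiency(100.0, 30.0, 4), cost(30.0, 4))
```

With these made-up numbers the cost (120 processor-seconds) exceeds the serial time (100 s), so the efficiency is below 1.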

Question:

NVIDIA 8-series GPUs offer --------.

1. 50-200 GFLOPS

2. 200-400 GFLOPS

3. 400-800 GFLOPS

4. 800-1000 GFLOPS

Posted Date:-2022-07-14 08:21:55

Question:

An NVIDIA CUDA warp is made up of how many threads?

1. 512

2. 1024

3. 312

4. 32

Posted Date:-2022-07-14 08:01:26

Question:

One processor has a piece of data that it needs to send to every other processor. This operation is

1. one-to-all

2. all-to-one

3. point-to-point

4. all of the above

Posted Date:-2022-07-14 12:05:51

Question:

Out-of-order execution of instructions is not possible on GPUs.

1. true

2. false

3. --

4. --

Posted Date:-2022-07-14 08:02:32

Question:

Scaling characteristics of parallel programs: Ts is

1. increases

2. constant

3. decreases

4. none

Posted Date:-2022-07-14 09:54:50

Question:

Speedup obtained when the problem size is _______ linearly with the number of processing elements.

1. increased

2. constant

3. decreased

4. depends on problem size

Posted Date:-2022-07-14 09:57:33

Question:

Speedup tends to saturate and efficiency _____ as a consequence of Amdahl's law.

1. increases

2. constant

3. decreases

4. none

Posted Date:-2022-07-14 09:56:16
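
Amdahl's law can be explored numerically with a few lines of Python. This is a sketch under the usual assumption that a fixed fraction of the work is strictly serial. The function names and the 10% serial fraction are chosen for illustration only.

```python
def amdahl_speedup(serial_fraction, p):
    """Speedup on p processing elements when serial_fraction of the work cannot be parallelised."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)


def amdahl_efficiency(serial_fraction, p):
    """Efficiency is the speedup divided by p."""
    return amdahl_speedup(serial_fraction, p) / p


# Assumed serial fraction of 10%: speedup is bounded above by 1 / 0.1 = 10.
for p in (1, 2, 4, 8, 16, 64):
    print(p, round(amdahl_speedup(0.1, p), 3), round(amdahl_efficiency(0.1, p), 3))
```

Printing the table for growing p shows how the two quantities behave once the serial part dominates the run time.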

Question:

The BlockPerGrid and ThreadPerBlock parameters are related to the ________ model supported by CUDA.

1. host

2. kernel

3. thread abstraction

4. None of the above

Posted Date:-2022-07-14 07:14:13
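
The role of the two launch parameters can be pictured with a sequential, pure-Python emulation. This is only a sketch: `launch`, `vector_add`, and their argument order are hypothetical names chosen for illustration, and a real GPU runs the threads concurrently rather than in nested loops.

```python
def launch(kernel, blocks_per_grid, threads_per_block, *args):
    """Sequentially emulate kernel<<<blocks_per_grid, threads_per_block>>>(*args)."""
    for block_idx in range(blocks_per_grid):
        for thread_idx in range(threads_per_block):
            kernel(block_idx, threads_per_block, thread_idx, *args)


def vector_add(block_idx, block_dim, thread_idx, a, b, out):
    """Each emulated thread handles one element, selected by its global index."""
    i = block_idx * block_dim + thread_idx
    if i < len(out):            # guard: the last block may have spare threads
        out[i] = a[i] + b[i]


n = 10
a = list(range(n))
b = [10 * v for v in a]
out = [0] * n
threads_per_block = 4
# Round up so that every element is covered by some thread.
blocks_per_grid = (n + threads_per_block - 1) // threads_per_block
launch(vector_add, blocks_per_grid, threads_per_block, a, b, out)
print(blocks_per_grid, out)
```

The bounds check matters here because 3 blocks of 4 threads give 12 emulated threads for only 10 elements.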

Question:

The computer cluster architecture emerged as an alternative for ____.

1. ISA

2. workstation

3. supercomputers

4. distributed systems

Posted Date:-2022-07-14 08:00:33

Question:

The cost of the parallel algorithm is higher than the sequential run time by a factor of __

1. 3/2

2. 2/3

3. 3*2

4. 2/3+3/2

Posted Date:-2022-07-14 11:52:32

Question:

The CUDA architecture consists of --------- for parallel computing kernels and functions.

1. RISC instruction set architecture

2. CISC instruction set architecture

3. ZISC instruction set architecture

4. PTX instruction set architecture

Posted Date:-2022-07-14 08:13:02

Question:

The dual of one-to-all is

1. all-to-one reduction

2. one-to-all reduction

3. point-to-point reduction

4. none

Posted Date:-2022-07-14 12:06:51
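
The two collective operations named in this question can be simulated in Python using nothing but pairwise transfers. The sketch below assumes the number of processes is a power of two and uses a recursive-doubling pattern, which is one common textbook scheme. The function names are illustrative, and each list index stands in for one process.

```python
import operator


def one_to_all_broadcast(p, value):
    """Process 0 holds value; after log2(p) steps every process holds it."""
    data = [None] * p
    data[0] = value
    steps = 0
    stride = p // 2
    while stride >= 1:
        for src in range(0, p, 2 * stride):
            data[src + stride] = data[src]      # one point-to-point send
        stride //= 2
        steps += 1
    return data, steps


def all_to_one_reduction(values, op):
    """Same pattern in reverse: partial results are combined towards process 0."""
    data = list(values)
    p = len(data)
    steps = 0
    stride = 1
    while stride < p:
        for dst in range(0, p, 2 * stride):
            data[dst] = op(data[dst], data[dst + stride])   # receive and combine
        stride *= 2
        steps += 1
    return data[0], steps


print(one_to_all_broadcast(8, "x"))
print(all_to_one_reduction(range(8), operator.add))
```

Both functions take three steps for eight processes, and the message directions in one are the reverse of those in the other.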

Question:

The host processor spawns multithread tasks (or kernels as they are known in CUDA) onto the GPU device. State true or false.

1. true

2. false

3. ---

4. ---

Posted Date:-2022-07-14 08:14:40

Question:

The load imbalance problem in parallel Gaussian elimination can be alleviated by using a ____ mapping.

1. acyclic

2. cyclic

3. both

4. none

Posted Date:-2022-07-14 11:53:35
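
To see where the load imbalance comes from, the sketch below counts how many still-active rows each process owns at elimination step k under two row-to-process mappings. It assumes a row-wise distribution in which rows k+1 to n-1 are updated at step k, and that p divides n. The helper names are illustrative only.

```python
def block_owner(row, n, p):
    """Block mapping: consecutive groups of n // p rows go to the same process."""
    return row // (n // p)


def cyclic_owner(row, n, p):
    """Cyclic mapping: rows are dealt out to processes round-robin."""
    return row % p


def active_rows_per_process(n, p, k, owner):
    """Count the rows below pivot row k that each process must still update."""
    counts = [0] * p
    for row in range(k + 1, n):
        counts[owner(row, n, p)] += 1
    return counts


n, p, k = 8, 4, 3
print(active_rows_per_process(n, p, k, block_owner))   # [0, 0, 2, 2]
print(active_rows_per_process(n, p, k, cyclic_owner))  # [1, 1, 1, 1]
```

With the block mapping, the processes that own the top rows run out of work as k grows, while the other mapping keeps the remaining rows spread across all processes.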

Question:

The main advantage of ______ is that its storage requirement is linear in the depth of the state space being searched.

1. BFS

2. DFS

3. a and b

4. None of the above

Posted Date:-2022-07-14 09:41:10
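
The storage behaviour of the two search strategies can be compared with a small experiment. The sketch below searches an implicit complete binary tree of a given depth and records the largest number of nodes ever held in the frontier. The tree shape, the node encoding, and the function name are assumptions made for illustration.

```python
from collections import deque


def max_frontier(depth, use_stack):
    """Largest frontier size while exhaustively searching a complete binary tree.

    Nodes are (level, index) pairs; use_stack=True gives depth-first order,
    use_stack=False gives breadth-first order.
    """
    frontier = deque([(0, 0)])
    largest = 1
    while frontier:
        level, index = frontier.pop() if use_stack else frontier.popleft()
        if level < depth:
            frontier.append((level + 1, 2 * index))
            frontier.append((level + 1, 2 * index + 1))
        largest = max(largest, len(frontier))
    return largest


for d in (3, 6, 10):
    print(d, max_frontier(d, use_stack=True), max_frontier(d, use_stack=False))
```

For this tree the stack-based search never holds more than depth + 1 nodes, whereas the queue-based search ends up holding the whole bottom level.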

Question:

The n × n matrix is partitioned among n processors, with each processor storing a complete ___ of the matrix.

1. row

2. column

3. both

4. depends on the processor

Posted Date:-2022-07-14 09:58:22
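
A one-dimensional partitioning of matrix-vector multiplication can be mimicked in Python. The sketch assumes the common textbook setup in which process i starts with one strip of the matrix and one element of the vector, and that an all-to-all broadcast gives every process the full vector before the local dot products. The function name and the data are illustrative.

```python
def rowwise_matvec(A, x):
    """Simulate y = A x with n processes, where process i owns A[i] and x[i]."""
    n = len(A)
    # Initial distribution: each process holds a single vector element.
    local_x = [[x[i]] for i in range(n)]
    # All-to-all broadcast: every process gathers all the pieces.
    full_x = [value for piece in local_x for value in piece]
    gathered = [list(full_x) for _ in range(n)]
    # Local computation: one dot product per process, no further communication.
    return [sum(a * b for a, b in zip(A[i], gathered[i])) for i in range(n)]


A = [[1, 2], [3, 4]]
x = [1, 1]
print(rowwise_matvec(A, x))
```

After the gather step, each process can compute its element of the result independently.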

Question:

The NVIDIA G80 is a ---- CUDA core device, the NVIDIA G200 is a ---- CUDA core device, and the NVIDIA Fermi is a ---- CUDA core device.

1. 128, 256, 512

2. 32, 64, 128

3. 64, 128, 256

4. 256, 512, 1024

Posted Date:-2022-07-14 08:15:21

Question:

The performance of quicksort depends critically on the quality of the ______.

1. non-pivot

2. pivot

3. center element

4. length of the array

Posted Date:-2022-07-14 09:39:09
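
The effect of the splitting element on quicksort can be measured directly. The sketch below is a simple list-based quicksort that only counts comparisons; it is not an in-place implementation, and the two selection rules (`first_element`, `middle_element`) are illustrative choices rather than anything prescribed by the question.

```python
def quicksort_comparisons(values, choose):
    """Count element comparisons, charging one per element split at each level."""
    if len(values) <= 1:
        return 0
    split_index = choose(values)
    split_value = values[split_index]
    rest = values[:split_index] + values[split_index + 1:]
    smaller = [v for v in rest if v < split_value]
    larger = [v for v in rest if v >= split_value]
    return (len(rest)
            + quicksort_comparisons(smaller, choose)
            + quicksort_comparisons(larger, choose))


def first_element(values):
    return 0


def middle_element(values):
    return len(values) // 2


data = list(range(64))   # already sorted input
print(quicksort_comparisons(data, first_element))
print(quicksort_comparisons(data, middle_element))
```

On already sorted input the first rule produces maximally unbalanced splits and n(n-1)/2 comparisons, while the second keeps the two sides nearly equal.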

Question:

The time lost due to a branch instruction is often referred to as _____.

1. latency

2. delay

3. branch penalty

4. None of the above

Posted Date:-2022-07-14 07:57:36

Question:

What is the time complexity of breadth-first search? (V = number of vertices, E = number of edges)

1. O(V + E)

2. O(V)

3. O(E)

4. O(V*E)

Posted Date:-2022-07-14 09:45:43

Question:

What do triple angle brackets in a statement inside the main function indicate?

1. a call from host code to device code

2. a call from device code to host code

3. less-than comparison

4. greater-than comparison

Posted Date:-2022-07-14 09:30:32

Question:

What is the CUDA C equivalent of this general C program: int main(void) { printf("Hello, World!\n"); return 0; }

1. int main ( void ) { kernel <<<1,1>>>(); printf("Hello, World!\n"); return 0; }

2. __global__ void kernel( void ) { } int main ( void ) { kernel <<<1,1>>>(); printf("Hello, World!\n"); return 0; }

3. __global__ void kernel( void ) { kernel <<<1,1>>>(); printf("Hello, World!\n"); return 0; }

4. __global__ int main ( void ) { kernel <<<1,1>>>(); printf("Hello, World!\n"); return 0; }

Posted Date:-2022-07-14 08:25:21

Question:

What is Unified Virtual Machine?

1. It is a technique that allows both the CPU and the GPU to read from a single virtual machine simultaneously.

2. It is a technique for managing separate host and device memory spaces.

3. It is a technique for executing device code on the host and host code on the device.

4. It is a technique for executing general-purpose programs on the device instead of the host.

Posted Date:-2022-07-14 08:10:54

Question:

Which function runs on the device (i.e. GPU): a) __global__ void kernel ( void ) { } b) int main ( void ) { ... return 0; }

1. a

2. b

3. both a and b

4. ---

Posted Date:-2022-07-14 08:26:38

Question:

Which of the following is not a stable sorting algorithm in its typical implementation?

1. insertion sort

2. merge sort

3. quick sort

4. bubble sort

Posted Date:-2022-07-14 09:50:49
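
Stability is easiest to see on records that share a key. The sketch below sorts (key, label) pairs with a plain Lomuto-partition quicksort, one typical textbook variant, and then repeats the sort with the original position added as a tie-breaker. The function name and the sample records are assumptions made for illustration.

```python
def quicksort_by_key(records, key):
    """Return a sorted copy of records using quicksort with Lomuto partitioning."""
    a = list(records)

    def partition(lo, hi):
        pivot = key(a[hi])                  # last element of the range is the pivot
        i = lo - 1
        for j in range(lo, hi):
            if key(a[j]) <= pivot:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[hi] = a[hi], a[i + 1]
        return i + 1

    def sort(lo, hi):
        if lo < hi:
            mid = partition(lo, hi)
            sort(lo, mid - 1)
            sort(mid + 1, hi)

    sort(0, len(a) - 1)
    return a


records = [(2, "a"), (2, "b"), (1, "c")]

# Compare on the numeric key only: equal keys may change their relative order.
plain = quicksort_by_key(records, key=lambda r: r[0])

# Compare on (key, original position): ties are resolved by input order.
indexed = list(enumerate(records))
tie_broken = [r for _, r in quicksort_by_key(indexed, key=lambda ir: (ir[1][0], ir[0]))]

print(plain)
print(tie_broken)
```

In the first result the two records with key 2 appear as "b" then "a"; in the second they keep their input order, matching what Python's built-in stable `sorted` returns.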

Question:

Which of the following is not an application of breadth-first search?

1. when the graph is a binary tree

2. when the graph is a linked list

3. when the graph is an n-ary tree

4. when the graph is a ternary tree

Posted Date:-2022-07-14 09:46:49

Question:

Which of the following is not true about comparison-based sorting algorithms?

1. the minimum possible time complexity of a comparison-based sorting algorithm is O(n log n) for a random input array

2. any comparison-based sorting algorithm can be made stable by using position as a criterion when two elements are compared

3. counting sort is not a comparison-based sorting algorithm

4. heap sort is not a comparison-based sorting algorithm

Posted Date:-2022-07-14 09:51:48

Question:

___ algorithms use a heuristic to guide search.

1. BFS

2. DFS

3. a and b

4. none of the above

Posted Date:-2022-07-14 09:43:20

Question:

___ method is used in centralized systems to perform out-of-order execution.

1. scorecard

2. scoreboarding

3. optimizing

4. redundancy

Posted Date:-2022-07-14 07:58:37

Question:

____ can be comparison-based or noncomparison-based.

1. searching

2. sorting

3. both a and b

4. none of the above

Posted Date:-2022-07-14 09:36:18

Question:

____ is callable from the host.

1. __host__

2. __global__

3. __device__

4. none of the above

Posted Date:-2022-07-14 07:16:23

Question:

_____ became the first language specifically designed by a GPU company to facilitate general-purpose computing on ____.

1. Python, GPUs.

2. C, CPUs.

3. CUDA C, GPUs.

4. Java, CPUs.

Posted Date:-2022-07-14 08:11:35

Question:

______ is callable from the host.

1. __host__

2. __global__

3. __device__

4. None of the above

Posted Date:-2022-07-14 07:50:15

Question:

_______ is callable from the device only.

1. __host__

2. __global__

3. __device__

4. None of the above

Posted Date:-2022-07-14 07:15:09

Question:

_____________ algorithms use auxiliary storage (such as tapes and hard disks) for sorting because the number of elements to be sorted is too large to fit into memory.

1. internal sorting

2. internal searching

3. external sorting

4. external searching

Posted Date:-2022-07-14 09:35:22
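
The idea of sorting with auxiliary storage can be sketched in Python as a two-phase merge sort: sorted runs are written to temporary files, then merged. This is a minimal illustration, assuming integer input and a made-up `chunk_size` limit standing in for available memory. For simplicity the merged result is collected into a list, whereas a real implementation would stream it to an output file.

```python
import heapq
import tempfile


def external_sort(numbers, chunk_size):
    """Sort integers while buffering at most chunk_size of them per run."""
    run_files = []
    chunk = []

    def flush():
        # Sort the in-memory chunk and write it out as one run, one value per line.
        f = tempfile.TemporaryFile(mode="w+")
        f.writelines(f"{v}\n" for v in sorted(chunk))
        f.seek(0)
        run_files.append(f)
        chunk.clear()

    for v in numbers:
        chunk.append(v)
        if len(chunk) == chunk_size:
            flush()
    if chunk:
        flush()

    # Merge phase: read the runs lazily and combine them with a k-way merge.
    runs = [(int(line) for line in f) for f in run_files]
    merged = list(heapq.merge(*runs))
    for f in run_files:
        f.close()
    return merged


print(external_sort([5, 3, 9, 1, 4, 8, 2], chunk_size=3))
```

Only one chunk is sorted in memory at a time, and `heapq.merge` needs just the current head of each run while merging.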

Posted on by R4R Team
