Sail E0 Webinar

MCQs

Total Questions: 10
Question 1. InputFormat class calls the ________ function and computes splits for each file and then sends them to the jobtracker.
  1.    puts
  2.    gets
  3.    getSplits
  4.    All of the mentioned
Answer: Option C. -> getSplits


The jobtracker then uses the splits' storage locations to schedule map tasks to process them on the tasktrackers.
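For context, a minimal sketch of this contract using the org.apache.hadoop.mapreduce API. The class name SimpleTextInputFormat is a hypothetical placeholder; it mirrors what the stock TextInputFormat does, with getSplits() inherited from FileInputFormat.

```java
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.LineRecordReader;

// Hypothetical InputFormat mirroring TextInputFormat: the framework calls
// getSplits() (inherited from FileInputFormat, one or more splits per file)
// when the job is submitted, then createRecordReader() once per split.
public class SimpleTextInputFormat extends FileInputFormat<LongWritable, Text> {
    @Override
    public RecordReader<LongWritable, Text> createRecordReader(
            InputSplit split, TaskAttemptContext context) {
        // Reads each split as <byte offset, line of text> pairs.
        return new LineRecordReader();
    }
}
```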


Question 2. _________ identifies filesystem pathnames which work as usual with regular expressions.
  1.    -archiveName
  2.    source
  3.    destination
  4.    None of the mentioned
Answer: Option D. -> None of the mentioned


The destination parameter, by contrast, identifies the destination directory that would contain the archive (see the archive example under Question 5).


Question 3. On a tasktracker, the map task passes the split to the createRecordReader() method on InputFormat to obtain a _________ for that split.
  1.    InputReader
  2.    RecordReader
  3.    OutputReader
  4.    None of the mentioned
Answer: Option B. -> RecordReader


The RecordReader loads data from its source and converts it into key-value pairs suitable for reading by the mapper.
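As a rough sketch of how a map task drives a RecordReader (the helper class and method below are hypothetical; the RecordReader calls themselves are the real org.apache.hadoop.mapreduce API):

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

// Hypothetical helper showing the loop a map task runs over its RecordReader.
public class RecordReaderLoop {
    static void drain(RecordReader<LongWritable, Text> reader,
                      InputSplit split, TaskAttemptContext context)
            throws IOException, InterruptedException {
        reader.initialize(split, context);
        while (reader.nextKeyValue()) {                  // advance to the next record
            LongWritable key = reader.getCurrentKey();   // e.g. byte offset in the file
            Text value = reader.getCurrentValue();       // e.g. the line of text
            // the mapper's map(key, value, context) would be invoked here
        }
        reader.close();
    }
}
```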


Question 4. _________ is a pluggable Map/Reduce scheduler for Hadoop which provides a way to share large clusters.
  1.    Flow Scheduler
  2.    Data Scheduler
  3.    Capacity Scheduler
  4.    None of the mentioned
Answer: Option C. -> Capacity Scheduler


The Capacity Scheduler supports multiple queues; a job is submitted to a particular queue.
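For illustration, a minimal sketch of submitting a job to a named queue. The queue name "reports" is hypothetical; queues themselves are defined by the cluster administrator, and the property name varies by version (mapred.job.queue.name on older jobtracker-based clusters, mapreduce.job.queuename on newer ones).

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Submitting a job to a specific scheduler queue ("reports" is hypothetical).
public class QueueSubmit {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapreduce.job.queuename", "reports"); // "default" if unset
        Job job = Job.getInstance(conf, "report-rollup");
        // ... set mapper, reducer, and input/output paths as usual, then:
        // System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```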


Question 5. Which of the following parameters describes the destination directory that would contain the archive?
  1.    -archiveName
  2.    source
  3.    destination
  4.    None of the mentioned
Answer: Option C. -> destination


-archiveName is the name of the archive to be created.
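For illustration, a hedged sketch of creating an archive programmatically, assuming the hadoop-archives tool classes are on the classpath; the paths and archive name are hypothetical. It is equivalent to the command line: hadoop archive -archiveName foo.har -p /user/hadoop dir1 dir2 /user/zoo

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.tools.HadoopArchives;
import org.apache.hadoop.util.ToolRunner;

// Builds /user/zoo/foo.har from dir1 and dir2 under the parent /user/hadoop.
public class MakeArchive {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        int rc = ToolRunner.run(conf, new HadoopArchives(conf), new String[] {
            "-archiveName", "foo.har", // name of the archive to be created
            "-p", "/user/hadoop",      // parent of the source paths
            "dir1", "dir2",            // sources, relative to the parent
            "/user/zoo"                // destination directory for the archive
        });
        System.exit(rc);
    }
}
```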


Question 6. Which of the following scenarios may not be a good fit for HDFS?
  1.    HDFS is not suitable for scenarios requiring multiple/simultaneous writes to the same file
  2.    HDFS is suitable for storing data related to applications requiring low latency data access
  3.    HDFS is suitable for storing data related to applications requiring low latency data access
  4.    None of the mentioned
Answer: Option A. -> HDFS is not suitable for scenarios requiring multiple/simultaneous writes to the same file


HDFS is, however, well suited to storing archive data cheaply: it stores data on low-cost commodity hardware while ensuring a high degree of fault tolerance.


Question 7. The need for data replication can arise in various scenarios, such as:
  1.    Replication Factor is changed
  2.    DataNode goes down
  3.    Data Blocks get corrupted
  4.    All of the mentioned
Answer: Option D. -> All of the mentioned


Data is replicated across different DataNodes to ensure a high degree of fault-tolerance.
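For example, changing the replication factor of an existing file triggers re-replication; a minimal sketch (the path and factor below are hypothetical):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Requests a new replication factor for a file; the NameNode then schedules
// replication of under-replicated blocks (or deletion of excess replicas).
public class SetReplication {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        boolean ok = fs.setReplication(new Path("/user/hadoop/data.txt"), (short) 5);
        System.out.println("replication change requested: " + ok);
    }
}
```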


Question 8. ________ is the slave/worker node and holds the user data in the form of Data Blocks.
  1.    DataNode
  2.    NameNode
  3.    Data block
  4.    Replication
Answer: Option A. -> DataNode


A DataNode stores data in the Hadoop file system. A functional filesystem has more than one DataNode, with data replicated across them.


Question 9. The Reducer receives as input the grouped output of a:
  1.    Mapper
  2.    Reducer
  3.    Writable
  4.    Readable
Answer: Option A. -> Mapper


In the shuffle phase, the framework fetches, for each Reducer, the relevant partition of the output of all the Mappers via HTTP.
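Which partition a record lands in (and therefore which Reducer fetches it) is decided by the Partitioner; the class below is a hypothetical sketch that mirrors the default HashPartitioner.

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Assigns each map-output record to one of numReduceTasks partitions; during
// the shuffle, each Reducer fetches its own partition from every map output.
public class WordPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numReduceTasks) {
        // Mask the sign bit so the result is non-negative, then bucket by key.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
}
```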


Question 10. Interface ____________ reduces a set of intermediate values which share a key to a smaller set of values.
  1.    Mapper
  2.    Reducer
  3.    Writable
  4.    Readable
Answer: Option B. -> Reducer


Reducer implementations can access the JobConf for the job.
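A word-count style sketch of a Reducer using the newer org.apache.hadoop.mapreduce API (where Configuration plays the role the JobConf does in the old API); the class name and the wordcount.min property are hypothetical.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Collapses all intermediate values sharing a key into a single sum.
public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private int minCount;

    @Override
    protected void setup(Context context) {
        // Reading job configuration, analogous to consulting the JobConf.
        minCount = context.getConfiguration().getInt("wordcount.min", 1);
    }

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        if (sum >= minCount) {
            context.write(key, new IntWritable(sum)); // smaller set of values out
        }
    }
}
```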

