MCQs
The framework uses the storage locations of the input splits, reported by the InputFormat, to schedule map tasks to process them on the tasktrackers.
The destination identifies the directory that will contain the archive.
The RecordReader loads data from its source and converts it into key-value pairs suitable for reading by the Mapper.
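A minimal sketch of this idea (a toy illustration, not the Hadoop API): a line-based reader that, like Hadoop's LineRecordReader, turns a byte stream into (byte-offset, line) key-value pairs for a mapper to consume.

```python
# Toy stand-in for a RecordReader: yields (key, value) pairs from a
# text source, where the key is the line's byte offset (as in
# Hadoop's LineRecordReader) and the value is the line itself.
import io

def line_records(stream):
    """Yield (byte_offset, line) pairs from a text stream."""
    offset = 0
    for line in stream:
        yield offset, line.rstrip("\n")
        offset += len(line.encode("utf-8"))

def word_count_mapper(key, value):
    """A mapper consuming the reader's key-value pairs."""
    for word in value.split():
        yield word, 1

source = io.StringIO("hello world\nhello hdfs\n")
pairs = list(line_records(source))
# pairs == [(0, "hello world"), (12, "hello hdfs")]
```

The mapper never touches the raw byte stream; it sees only the key-value pairs the reader produced.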
The Capacity Scheduler supports multiple queues; each job is submitted to one of them.
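The queue model can be sketched as follows (a toy simplification, not the actual Capacity Scheduler): jobs are submitted to named queues, and the scheduler picks the next job from the queue that is furthest below its configured capacity share.

```python
# Toy multi-queue scheduler: each queue has a capacity fraction, and
# next_job() dequeues from the non-empty queue with the lowest
# usage/capacity ratio. (Hypothetical names; for illustration only.)
from collections import deque

class ToyCapacityScheduler:
    def __init__(self, capacities):
        # capacities: queue name -> fraction of cluster capacity
        self.capacities = capacities
        self.queues = {name: deque() for name in capacities}
        self.running = {name: 0 for name in capacities}  # slots in use

    def submit(self, queue, job):
        self.queues[queue].append(job)

    def next_job(self):
        # Choose the non-empty queue furthest below its capacity share.
        candidates = [q for q in self.queues if self.queues[q]]
        if not candidates:
            return None
        q = min(candidates,
                key=lambda q: self.running[q] / self.capacities[q])
        self.running[q] += 1
        return q, self.queues[q].popleft()

sched = ToyCapacityScheduler({"prod": 0.7, "dev": 0.3})
sched.submit("prod", "etl-job")
sched.submit("dev", "test-job")
```

The real scheduler adds per-queue ACLs, user limits, and elasticity, but the core idea is the same: capacity is divided among queues, and jobs compete only within their queue's share.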
The -archiveName option specifies the name of the archive to be created.
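Putting the two arguments together, a typical invocation looks like this (paths here are illustrative):

```shell
# Create foo.har from dir1 and dir2 (resolved relative to the -p parent
# path), writing the archive into the destination directory /user/zoo.
hadoop archive -archiveName foo.har -p /user/hadoop dir1 dir2 /user/zoo
```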
HDFS is well suited to storing archival data: it is cheap because it allows storing the data on low-cost commodity hardware, while still ensuring a high degree of fault tolerance.
Data is replicated across different DataNodes to ensure a high degree of fault-tolerance.
A DataNode stores data in the Hadoop File System (HDFS). A functional filesystem has more than one DataNode, with data replicated across them.
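The effect of replication can be sketched with a toy placement model (illustrative only; HDFS's real placement policy is rack-aware and more involved):

```python
# Toy sketch of HDFS-style replication: each block is placed on
# `replication` datanodes, so losing a single node loses no data.
import itertools

def place_blocks(blocks, datanodes, replication=3):
    """Round-robin placement of each block onto `replication` nodes."""
    placement = {}
    nodes = itertools.cycle(datanodes)
    for block in blocks:
        placement[block] = {next(nodes) for _ in range(replication)}
    return placement

def readable_after_failure(placement, failed_node):
    """Blocks that still have at least one live replica."""
    return {b for b, nodes in placement.items() if nodes - {failed_node}}

placement = place_blocks(["blk_1", "blk_2"],
                         ["dn1", "dn2", "dn3", "dn4"])
```

With a replication factor of 3, every block survives any single DataNode failure, which is the fault-tolerance property the two statements above describe.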
In the Shuffle phase, the framework fetches, for each Reducer, the relevant partition of the output of all the Mappers, via HTTP.
Reducer implementations can access the JobConf for the job.