Hadoop MultipleInputs sample usage

MultipleInputs is a Hadoop feature that lets a single MapReduce job read inputs in different formats.

For example, we have two files with different formats:

(1) First file format:

VALUE

(2) Second file format:

VALUE ADDITIONAL

To read these custom formats, we need to write a Record class, a RecordReader, and an InputFormat for each one.
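As a rough illustration of the Record class part only, here is a minimal plain-Java sketch with no Hadoop dependency. The class names `ValueRecord` and `ValueAdditionalRecord` are hypothetical, and the sketch assumes that both fields are plain strings and that the second format separates its two fields with whitespace; the post does not state the delimiter.

```java
// Hypothetical record for the first format: one VALUE per line.
final class ValueRecord {
    final String value;

    ValueRecord(String value) {
        this.value = value;
    }

    // Parse one input line; surrounding whitespace is ignored.
    static ValueRecord parse(String line) {
        return new ValueRecord(line.trim());
    }
}

// Hypothetical record for the second format: VALUE followed by ADDITIONAL.
final class ValueAdditionalRecord {
    final String value;
    final String additional;

    ValueAdditionalRecord(String value, String additional) {
        this.value = value;
        this.additional = additional;
    }

    // Assumption: the two fields are separated by whitespace.
    // Everything after the first run of whitespace is kept as ADDITIONAL.
    static ValueAdditionalRecord parse(String line) {
        String[] parts = line.trim().split("\\s+", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected 'VALUE ADDITIONAL': " + line);
        }
        return new ValueAdditionalRecord(parts[0], parts[1]);
    }
}
```

In a real job these classes would usually also implement Hadoop's `Writable` interface so they can be serialized between tasks.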

MultipleInputs takes an InputFormat for each input path. The InputFormat uses a RecordReader to read the file, and each value the RecordReader returns is an instance of the Record class.
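The chain described above (InputFormat, RecordReader, Record) could be sketched roughly as follows, using the `org.apache.hadoop.mapreduce` API. This is not the post's own implementation. All class names (`SecondRecord`, `SecondRecordReader`, `SecondInputFormat`, the two mappers, `MultipleInputsSketch`) are hypothetical, the whitespace delimiter is an assumption, and for brevity the first file is read with the built-in `TextInputFormat` instead of its own custom trio. It needs the Hadoop client libraries on the classpath.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.LineRecordReader;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical Record class for the second format: "VALUE ADDITIONAL".
class SecondRecord implements Writable {
    String value = "";
    String additional = "";

    @Override public void write(DataOutput out) throws IOException {
        out.writeUTF(value);
        out.writeUTF(additional);
    }
    @Override public void readFields(DataInput in) throws IOException {
        value = in.readUTF();
        additional = in.readUTF();
    }
}

// RecordReader: delegates line splitting to LineRecordReader, then parses each line.
class SecondRecordReader extends RecordReader<LongWritable, SecondRecord> {
    private final LineRecordReader lines = new LineRecordReader();
    private final SecondRecord current = new SecondRecord();

    @Override public void initialize(InputSplit split, TaskAttemptContext ctx) throws IOException {
        lines.initialize(split, ctx);
    }
    @Override public boolean nextKeyValue() throws IOException {
        if (!lines.nextKeyValue()) return false;
        // Assumption: fields are whitespace-separated; a missing ADDITIONAL becomes "".
        String[] parts = lines.getCurrentValue().toString().trim().split("\\s+", 2);
        current.value = parts[0];
        current.additional = parts.length > 1 ? parts[1] : "";
        return true;
    }
    @Override public LongWritable getCurrentKey() { return lines.getCurrentKey(); }
    @Override public SecondRecord getCurrentValue() { return current; }
    @Override public float getProgress() throws IOException { return lines.getProgress(); }
    @Override public void close() throws IOException { lines.close(); }
}

// InputFormat: hands out the RecordReader above for every split.
class SecondInputFormat extends FileInputFormat<LongWritable, SecondRecord> {
    @Override
    public RecordReader<LongWritable, SecondRecord> createRecordReader(InputSplit split, TaskAttemptContext ctx) {
        return new SecondRecordReader();
    }
}

// One mapper per input format; both emit (VALUE, ADDITIONAL-or-empty) as Text pairs.
class FirstMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override protected void map(LongWritable key, Text line, Context ctx) throws IOException, InterruptedException {
        ctx.write(new Text(line.toString().trim()), new Text(""));
    }
}

class SecondMapper extends Mapper<LongWritable, SecondRecord, Text, Text> {
    @Override protected void map(LongWritable key, SecondRecord rec, Context ctx) throws IOException, InterruptedException {
        ctx.write(new Text(rec.value), new Text(rec.additional));
    }
}

class MultipleInputsSketch {
    // Builds a job in which each input path is bound to its own InputFormat and Mapper.
    static Job buildJob(Configuration conf, Path first, Path second, Path out) throws IOException {
        Job job = Job.getInstance(conf, "multiple-inputs-sketch");
        job.setJarByClass(MultipleInputsSketch.class);
        MultipleInputs.addInputPath(job, first, TextInputFormat.class, FirstMapper.class);
        MultipleInputs.addInputPath(job, second, SecondInputFormat.class, SecondMapper.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(job, out);
        return job;
    }
}
```

A driver's `main` method would call `buildJob(...)` with the two input paths and an output path, then `job.waitForCompletion(true)`. In a real project each class would normally be public and live in its own file.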

Here is the implementation: