hadoop - How does MapReduce ensure that all the data of a single XML record split across different files goes to the same mapper?
I have a large set of XML records spread across different files. Suppose a record starts in one file but does not end there; instead, it continues in another file. How does the MapReduce framework ensure that the remaining parts of the record are processed by a single mapper?