Sqoop is not using all the specified mappers
I am importing data from Oracle into Hadoop using Sqoop. The Oracle table has around 2 million records with a primary key, which I am providing as the split-by field.
My Sqoop job completes and I get the correct data; the job runs for 30, so far so good.
When I look at the output files, the first file is around 1.4 GB, the second is approximately 157.2 MB, and the last (20th) file is approximately 10.4 MB, while all the other files, from the 3rd to the 19th, are 0 bytes.
I am setting -m 20 because I want to run 20 mappers for my job.
Here is the sqoop command:
sqoop import --connect "CONNECTION_STRING" --query "SELECT * FROM AND \$CONDITIONS" --split-by .id --target-dir /Output_data -m 20
Note: My
Any ideas?
to ...
If the actual values of the primary key are not uniformly distributed across its range, this can result in unbalanced tasks.
--split-by can be used to choose a column with a better distribution. In general, this will vary by data type.
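
To illustrate why a skewed key leaves most mappers with nothing to do, here is a rough sketch in Python. It is a simplified model that assumes the range between the smallest and largest split-by value is cut into equal-width slices, one per mapper. It is not Sqoop's actual code, and the key values and function name are made up for the illustration.

```python
def rows_per_split(keys, num_splits):
    """Count how many keys land in each equal-width slice of [min, max].

    Simplified model (an assumption, not Sqoop's implementation): the
    key range is divided into num_splits slices of equal width, and
    each slice is handed to one mapper.
    """
    lo, hi = min(keys), max(keys)
    counts = [0] * num_splits
    span = hi - lo
    for k in keys:
        if span == 0:
            index = 0  # every key is identical, so one slice holds them all
        else:
            # Integer arithmetic avoids floating-point rounding; the
            # maximum key is clamped into the last slice.
            index = min((k - lo) * num_splits // span, num_splits - 1)
        counts[index] += 1
    return counts


# Hypothetical skewed keys: a dense block of low ids plus a few
# outliers close to the maximum.
keys = list(range(1, 1101)) + list(range(19991, 20001))
counts = rows_per_split(keys, 20)
print(counts)
```

With these made-up keys, the first, second and last slices receive 1000, 100 and 10 rows and the other 17 slices receive none, which is the same shape as the file sizes in the question. A column whose values are spread evenly over its range would give every slice a similar share.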