Sqoop is not using all the specified mappers

- January 15, 2014

I am importing data from Oracle into Hadoop using Sqoop. The Oracle table has around 2 million records and a primary key, which I am providing as the split-by field.

My job completes and I get the correct data; the job runs for 30, so far so good.

When I look at the output files, the first file is around 1.4 GB, the second is approximately 157.2 MB, and the last (20th) file is approximately 10.4 MB, while all the other files, 3 through 19, are 0 bytes.

I'm setting -m 20 because I want to run 20 mappers for my job.

Here is the sqoop command:

sqoop import --connect "CONNECTION_STRING" --query "SELECT * FROM <table> WHERE <condition> AND \$CONDITIONS" --split-by <table>.id --target-dir /Output_data -m 20

Note: My

Any ideas?

to ...

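If the actual values of the primary key are not uniformly distributed across its range, the result can be unbalanced tasks: Sqoop divides the range between the column's minimum and maximum into equal-width slices, one per mapper, so slices that contain no keys produce empty files.

--split-by can be used to explicitly choose a column with a better distribution. Which column works best will generally depend on the data and its type.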
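To see why skewed keys leave most mappers idle, here is a rough Python sketch of equal-width range splitting. It is a simplified model, not Sqoop's actual splitter code, and the key values are made up for illustration (they are not the asker's data): a dense block of low ids, a small block right after it, and a handful of ids at the far end of the range.

```python
def split_ranges(lo, hi, num_splits):
    """Cut [lo, hi] into num_splits contiguous, equal-width integer ranges."""
    width = (hi - lo + 1) / num_splits
    bounds = [lo + round(i * width) for i in range(num_splits)] + [hi + 1]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(num_splits)]


def rows_per_split(keys, num_splits):
    """Count how many keys land in each equal-width range (one per mapper)."""
    ranges = split_ranges(min(keys), max(keys), num_splits)
    return [sum(1 for k in keys if a <= k <= b) for a, b in ranges]


# Hypothetical skewed ids: the splitter only sees min=1 and max=20000.
skewed_keys = (
    list(range(1, 1001))            # 1000 ids in the first slice
    + list(range(1001, 1101))       # 100 ids in the second slice
    + list(range(19991, 20001))     # 10 ids at the very end
)
skewed_counts = rows_per_split(skewed_keys, 20)

# Hypothetical evenly spread ids over the same range, for comparison.
even_counts = rows_per_split(list(range(1, 20001)), 20)

print("skewed:", skewed_counts)
print("even:  ", even_counts)
```

With the skewed ids, only the first, second, and last slices receive any rows, which matches the pattern of two large files, one small file, and empty files in between. With evenly spread ids, every slice gets the same share.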
