Sqoop Hive table import, Table dataType doesn't match with database


Solution 1

In addition to the other answers, it helps to observe when the error occurs. For example:

In my case, two column types caused errors: json and binary.

For the json columns, the error appeared while the generated Java class was being built, at the very beginning of the import process:

16/04/19 09:37:58 ERROR orm.ClassWriter: Cannot resolve SQL type

For the binary column, the error was thrown while importing into the Hive table (after the data had already been imported and written to HDFS files):

16/04/19 09:51:22 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Hive does not support the SQL type for column featured_binary

To get rid of these two errors, I had to provide the following options:

--map-column-java column1_json=String,column2_json=String,featured_binary=String --map-column-hive column1_json=STRING,column2_json=STRING,featured_binary=STRING
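Put together, a full import command using both options might look like the following sketch. The connection URL, table name, and credentials are placeholders and not from the original post; only the `--map-column-*` options come from the answer above.

```shell
# Hypothetical connection details; adjust for your environment.
sqoop import \
  --connect jdbc:postgresql://dbhost:5432/mydb \
  --username myuser \
  --password-file /user/secure/db.password \
  --table my_table \
  --hive-import \
  --map-column-java column1_json=String,column2_json=String,featured_binary=String \
  --map-column-hive column1_json=STRING,column2_json=STRING,featured_binary=STRING
```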

In summary, you may have to provide --map-column-java, --map-column-hive, or both, depending on where the failure occurs.

Solution 2

A new feature was added with SQOOP-2103 (Sqoop 1.4.5) that lets you specify the decimal precision and scale with the --map-column-hive parameter. Example:

--map-column-hive 'TESTDOLLAR_AMT=DECIMAL(20%2C2)'

This syntax defines the field as a DECIMAL(20,2). The %2C is a URL-encoded comma, and the parameter needs to be in single quotes when submitted from the bash shell.

When I tried DECIMAL with no precision or scale specified, I got DECIMAL(10,0) as the default.
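The %2C escaping can be sanity-checked outside of Sqoop. Assuming python3 is available on the shell, the following one-liner (an illustration, not part of the original answer) confirms that %2C URL-decodes to a comma:

```shell
# Sqoop splits the --map-column-hive list on raw commas, so the comma
# inside DECIMAL(20,2) is URL-encoded as %2C. Decoding it back:
python3 -c "from urllib.parse import unquote; print(unquote('DECIMAL(20%2C2)'))"
# prints: DECIMAL(20,2)
```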

Solution 3

You can use the --map-column-hive parameter to override the default mapping. It expects a comma-separated list of key=value pairs specifying which column should be mapped to which Hive type.

sqoop import \
  ...
  --hive-import \
  --map-column-hive id=STRING,price=DECIMAL
Author by Admin

Updated on June 23, 2022

Comments

  • Admin
    Admin almost 2 years

Using Sqoop to import data from Oracle to Hive works fine, but it creates the table in Hive with only two data types, String and Double. I want to use Timestamp as the data type for some columns. How can I do that?

    bin/sqoop import --table TEST_TABLE --connect jdbc:oracle:thin:@HOST:PORT:orcl --username USER1 -password password -hive-import --hive-home /user/lib/Hive/
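For the question above, Solution 3's --map-column-hive option can be applied to request TIMESTAMP explicitly. A minimal sketch, assuming the Oracle table has date/time columns named CREATED_AT and UPDATED_AT (hypothetical names, not from the question):

```shell
# Hypothetical column names; the Hive type TIMESTAMP is requested
# explicitly instead of Sqoop's default String mapping.
bin/sqoop import \
  --table TEST_TABLE \
  --connect jdbc:oracle:thin:@HOST:PORT:orcl \
  --username USER1 \
  --password password \
  --hive-import \
  --hive-home /user/lib/Hive/ \
  --map-column-hive CREATED_AT=TIMESTAMP,UPDATED_AT=TIMESTAMP
```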