loading data to hive static partition table using load command
Solution 1
Your problem is this:
load data inpath '/test.txt' into table temp partition(city='hyd');
All the data you loaded into this partitions are with city = 'hyd'. If you are doing static partitioning is your responsibility put into the partition the correct values.
Just remove the last line from your txt file, put it into test2.txt and execute:
load data inpath '/test.txt' into table temp partition(city='hyd');
load data inpath '/test2.txt' into table temp partition(city='sec');
Yes, not so much comfortable, but static partitioning works in this way.
Solution 2
I hope partition will not work properly with load statement for a single file.
Instead we need to write to temp table (stat_parti
) in hive and from there we need to another partition table (stat_test
)
ex :
create external table stat_test(id int, name string, sal int)
partitioned by(city string)
row format delimited fields
terminated by ' '
location '/user/test/stat_test';
and can give static or dynamic partition.
1) Static partition
insert into table stat_test partition(city='hyd') select id,name,sal from stat_parti where city='hyd';
insert into table stat_test partition(city='sec') select id,name,sal from stat_parti where city='sec';
2) Dynamic partition
Here we need to enable
set hive.exec.dynamic.partition=true
set hive.exec.dynamic.partition.mode=nonstrict
insert overwrite table stat_test partition(city) select id,name,sal from stat_parti;
Raj
Updated on June 27, 2022Comments
-
Raj almost 2 years
Please dont mind if it is a very basic :
test.txt
1 ravi 100 hyd
2 krishna 200 hyd
3 fff 300 seci have created a table in hive with partition on city and loaded the data as below :
create external table temp(id int, name string, sal int) partitioned by(city string) location '/testing';
load data inpath '/test.txt' into table temp partition(city='hyd');
In HDFS the structure is /testing/temp/city=hyd/test.txt
when i Query the table as "select * from temp";
output :
temp.id temp.name temp.sal temp.city 1 ravi 100 hyd 2 krishna 200 hyd 3 fff 300 hyd
here my question is why for the third row the city name from "sec" is changing to "hyd" in the output?
is anything wrong from my side?
Thanks in Advance !!!
-
Raj over 7 yearsThankyou ozwlz5rd, my requirement is suppose if i have a large file and i want to do static partition on city. Like the above file. how we proceed?
-
ozw1z5rd over 7 yearsWhat you are asking looks very close to dynamic partitioning. Using this you get what you want. If you have to use static partitioning, you can process the file before to add it to the partition or you can create a temporary external table which allows you to select the records to stick into the correct partition. "city" looks like a low cardinality field, dynamic partitioning will work fine on it.