Adding column headers to hive result set


Solution 1

Exactly what does your hive script look like?

Does the output from your hive script have the header data in it? Is it then being lost when you copy the output to your s3 bucket?

If you could provide some more details about exactly what you are doing that would be helpful.

Without knowing those details, here is something that you could try.

Create your hive script as follows:

USE dbase_name;
SET hive.cli.print.header=true;
SELECT some_columns FROM some_table WHERE some_condition;

Then run your script:

$ hive -f hive_script.hql > hive_output

Then copy your output to your S3 bucket:

$ aws s3 cp ./hive_output s3://some_bucket_name/foo/hive_output
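One thing to watch for: depending on your Hive version and settings, the header line may come out with each column qualified by the table name (for example some_table.some_column). I believe SET hive.resultset.use.unique.column.names=false; turns that off, but if it does not in your setup, here is a rough sketch of a shell helper that strips the prefix from the first line only. The function name strip_header_prefix is made up for illustration, and it assumes the output is tab-separated.

```shell
# Hypothetical helper: remove a leading "table." prefix from every field
# of the header (first) line; all other lines are passed through unchanged.
# Assumes tab-separated output, as written by the hive CLI.
strip_header_prefix() {
  awk 'BEGIN { FS = OFS = "\t" }
       NR == 1 { for (i = 1; i <= NF; i++) sub(/^[^.]*\./, "", $i) }
       { print }' "$1"
}
```

You could run it as strip_header_prefix hive_output > hive_output_clean before the aws s3 cp step.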

Solution 2

As far as I know, there is still no direct way to do this (see "Hive: writing column headers to local file?"). One workaround is to export the result of DESCRIBE table_name to a file:

$ hive -e 'DESCRIBE table_name' > file

Then write a script that adds the column names to your data file. Good luck!
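A minimal sketch of what such a script could look like, assuming the data file is tab-separated and that each column line in the saved DESCRIBE output starts with the column name followed by whitespace. The function names make_header and prepend_header are hypothetical, and the awk filter stops at the first blank line or '#' line so that any partition information section is skipped.

```shell
# Hypothetical helper: build one tab-separated header line from the
# saved output of `hive -e 'DESCRIBE table_name'`.
# Takes the first word of each line as the column name and stops at the
# first blank line or line starting with '#'.
make_header() {
  awk 'NF == 0 || $1 ~ /^#/ { exit }
       { printf "%s%s", sep, $1; sep = "\t" }
       END { print "" }' "$1"
}

# Hypothetical helper: write the header followed by the data to a new file.
# Arguments: describe output file, data file, output file.
prepend_header() {
  { make_header "$1"; cat "$2"; } > "$3"
}
```

With the DESCRIBE output saved in file as above, something like prepend_header file data_file data_file_with_header would produce a copy of the data with the column names on the first line.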

Solution 3

I ran into this problem today and was able to get what I needed by doing a UNION ALL between the original query and a new dummy query that creates the header row. I added a sort column to each part, set to 0 for the header and 1 for the data, so I could sort by that field and ensure the header row came out on top.

create table new_table as
select 
  field1,
  field2,
  field3
from
(
  select
    0 as sort_col,  --header row gets lowest number
    'field1_name' as field1,
    'field2_name' as field2,
    'field3_name' as field3
  from
    some_small_table  --table needs at least 1 row
  limit 1  --only need 1 header row
  union all
  select
    1 as sort_col,  --original query goes here
    field1,
    field2,
    field3
  from
    main_table
) a
order by 
  sort_col  --make sure header row is first

It's a little bulky, but at least you can get what you need with a single query.

Hope this helps!

Author by

Sam


Updated on July 21, 2022

Comments

  • Sam, almost 2 years

    I am using a hive script on Amazon EMR to analyze some data.

    I am transferring the output to an Amazon S3 bucket. However, the results of the Hive script do not contain column headers.

    I have also tried using this:

     set hive.cli.print.header=true;
    

    But it does not help. Can you help me out?

  • Venu A Positive, about 8 years
    Hi, I am using Sqoop to pull data from Oracle, but the output does not include the schema/headers. I want the schema as a header row, for example: name,age,location, then venu,31,Bangalore, then srinu,32,Hyderabad, and so on. How can I get the schema in the form of headers? The transfer is Oracle to S3 (via Sqoop), not to local disk, and the data should land in S3 in the format above.