Create a Hive table to read Parquet files from a Parquet/Avro schema

Try the following, which builds the table definition from the Avro schema:

-- Helper table whose columns are derived from the Avro schema file.
CREATE TABLE avro_test
  ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
  STORED AS AVRO
  TBLPROPERTIES ('avro.schema.url'='myHost/myAvroSchema.avsc');

-- External table with the same columns, reading the existing Parquet files.
CREATE EXTERNAL TABLE parquet_test LIKE avro_test
  STORED AS PARQUET
  LOCATION 'hdfs://myParquetFilesPath';
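
Once both statements have run, it may be worth checking that the external table picked up the expected columns and can read the files. The following is a minimal sketch that assumes the table names used above; the row limit of 10 is arbitrary.

```sql
-- Show the columns, storage format, and location recorded for the new table.
DESCRIBE FORMATTED parquet_test;

-- Read a few rows to check that the Parquet files decode against these columns.
SELECT * FROM parquet_test LIMIT 10;
```

If the column list in the `DESCRIBE FORMATTED` output does not match the Avro schema, the `avro.schema.url` value is the first thing to check.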

The same question is asked in "Dynamically create Hive external table with Avro schema on Parquet Data".

Author: Mehdi TAZI

Updated on July 05, 2022

Comments

  • Mehdi TAZI, almost 2 years ago

    We are looking for a way to create an external Hive table that reads data from Parquet files according to a Parquet/Avro schema.

    In other words, how can we generate a Hive table from a Parquet/Avro schema?

    Thanks :)

  • Gary Gauh, about 7 years ago
    Can I create a table directly from a Parquet file? Or how can I get the Avro schema from a specific Parquet file?
  • JKC, over 6 years ago
    @GaryGauh, regarding your second question: with parquet-tools you can extract the Avro schema of a particular Parquet file. See this link for more details: kitesdk.org/docs/0.17.1/labs/…
  • Vikram Gulia, over 5 years ago
    It worked for me, but can I use a Parquet schema (org.apache.parquet.schema.MessageType) to create tables?