Need to access Hive metadata tables using Jdbc program

11,709

Solution 1

Metastore, well, stores meta information on your Hive tables - names, partitions, columns, SSNs, SerDes, etc.

Metastore connection parameters are stored in hive-site.xml (relevant properties are named javax.jdo.option.ConnectionURL, javax.jdo.option.ConnectionUserName, and javax.jdo.option.ConnectionPassword)

You can connect to the Postgres database using the connection parameters and run, say, these commands:

-- that'll get you the names of all Hive tables
SELECT tbl_name FROM TBLS;

-- that'll list all Metastore tables
\d

Solution 2

You can query the metastore DB through JDBC.
E.g: list table names and their location on HDFS:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.conf.HiveConf.ConfVars;

public class HiveMetastoreJDBCTest {

    public static void main(String[] args) throws Exception {

        Connection conn = null;
        try {
            HiveConf conf = new HiveConf();
            conf.addResource(new Path("file:///path/to/hive-site.xml"));
            Class.forName(conf.getVar(ConfVars.METASTORE_CONNECTION_DRIVER));
            conn = DriverManager.getConnection(
                    conf.getVar(ConfVars.METASTORECONNECTURLKEY),
                    conf.getVar(ConfVars.METASTORE_CONNECTION_USER_NAME),
                    conf.getVar(ConfVars.METASTOREPWD));

            Statement st = conn.createStatement();
            ResultSet rs = st.executeQuery(
                "select t.tbl_name, s.location from tbls t " +
                "join sds s on t.sd_id = s.sd_id");
            while (rs.next()) {
                System.out.println(rs.getString(1) + " : " + rs.getString(2));
            }
        }
        finally {
            if (conn != null) {
                conn.close();
            }
        }

    }
}

There exists an ER diagram about the metastore, but it might not be up-to-date, therefore I'd suggest you to run the metastore DDL script (Hive 0.12), in a test schema, and create the new ER diagram from these tables. (E.g with PowerArchitect)

Share:
11,709
user2176576
Author by

user2176576

Updated on June 05, 2022

Comments

  • user2176576
    user2176576 almost 2 years

    Need to access Hive metadata tables using Jdbc program.What exactly does the Metastore actually store and how can I access it?

    I tried doing this:

        sql="show tables";
    Statement stmt = con.createStatement();
    
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    while (res.next()) {
    System.out.println(res.getString(1));
    }
    

    So I get the list of tables, but I want to know which table is this information stored in so that I can Directly Select from that table instead of firing a Hive command.

    Also my Metastore is configured in PostGreSQL.(if that helps!) Thanks Hitz

  • user2176576
    user2176576 over 10 years
    ok thanks will try this out..one more question, I am trying to create a table in hive using primary key but am getting the following error as below in eclipse:Running: CREATE TABLE sample_Hive_Table_2 (PERSON_ID INT PRIMARY KEY, PERSON VARCHAR(26)) Exception in thread "main" java.sql.SQLException: Query returned non-zero code: 40000, cause: FAILED: ParseException line 1:51 mismatched input 'PRIMARY' expecting ) near 'INT' in create table statement at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveS‌​tatement.java:194) at Hive_Client_17.main(Hive_Client_17.java:48)
  • Nigel Tufnel
    Nigel Tufnel over 10 years
    Hive doesn't support primary keys.
  • user2176576
    user2176576 over 10 years
    thanks for clarifying my doubt. But I am sure we can add some kind of constraints to the columns. I tried adding NOT NULL constraint to it.. and this is the error I am getting:Running: CREATE TABLE sample_Hive_Table_2 (PERSON_ID INT NOT NULL, PERSON VARCHAR(26)) Exception in thread "main" java.sql.SQLException: Query returned non-zero code: 40000, cause: FAILED: ParseException line 1:50 mismatched input 'NOT' expecting ) near 'INT' in create table statement at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveS‌​tatement.java:194) at Hive_Client_17.main(Hive_Client_17.java:48)