FAILED: NullPointerException null in HIVE QUERY

22,448

Your UDF does not appear to protect against null values in the input table. Specifically, examine what happens when location is null: evaluate calls key.equalsIgnoreCase(...), which throws a NullPointerException when key is null.
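As a rough illustration of that point, here is a minimal sketch of a null-safe version of the comparison logic. It is an assumption-laden sketch, not a confirmed fix: the class name NullSafeRank and the helper sameKey are hypothetical, the class is written standalone so it runs without Hive on the classpath, and in Hive it would need to be public and extend org.apache.hadoop.hive.ql.exec.UDF like the Rank class in the question. It also only covers null inputs; whether that is the actual cause of the reported failure is not verified here.

```java
// Hypothetical standalone sketch; in Hive this would be declared as
// "public final class ... extends UDF" like the Rank class in the question.
final class NullSafeRank {
    private int counter;      // rank within the current run of equal keys
    private String lastKey;   // key seen on the previous call (may be null)

    // Null-safe, case-insensitive comparison (hypothetical helper).
    // Two nulls count as the same key; null never equals a non-null key.
    private static boolean sameKey(final String a, final String b) {
        if (a == null || b == null) {
            return a == b;
        }
        return a.equalsIgnoreCase(b);
    }

    public int evaluate(final String key) {
        // Reset the counter whenever the key changes, without ever
        // dereferencing a possibly-null key.
        if (!sameKey(key, this.lastKey)) {
            this.counter = 0;
            this.lastKey = key;
        }
        return this.counter++;
    }
}
```

With this sketch, rows whose location is null are ranked as one group of their own rather than causing an exception. If null locations should be excluded instead, filtering them out in the inner query is an alternative design choice.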

Author by

patz

Updated on June 13, 2020

Comments

  • patz
    patz almost 4 years

    The following is the Hive query I am using. It also uses a custom Rank function. I am running this on my local machine.

    SELECT numeric_id, location, Rank(location), followers_count
    FROM (
    SELECT  numeric_id, location, followers_count
    FROM twitter_data
    DISTRIBUTE BY numeric_id, location
    SORT BY numeric_id, location, followers_count desc
    ) a
    WHERE Rank(location)<10;
    

    My Rank function is as follows:

    package org.apache.hadoop.hive.contrib.udaf.ex;
    
    import org.apache.hadoop.hive.ql.exec.UDF;
    
    
    
    public final class Rank extends UDF{
        private int  counter;
        private String last_key;
        public int evaluate(final String key){
          if ( !key.equalsIgnoreCase(this.last_key) ) {
             this.counter = 0;
             this.last_key = key;
          }
          return this.counter++;
        }
    }
    

    I build a JAR from the class above and then run the following steps before running the Hive query. I tried this with both a runnable JAR and a plain JAR.

    ADD JAR /home/adminpc/Downloads/Project_input/Rank.jar;
    CREATE TEMPORARY FUNCTION Rank AS 'org.apache.hadoop.hive.contrib.udaf.ex.Rank';
    

    This is what I get after executing the Hive query:

    hive> SELECT numeric_id, location, Rank(location), followers_count
        > FROM (
        > SELECT  numeric_id, location, followers_count
        > FROM twitter_data
        > DISTRIBUTE BY numeric_id, location
        > SORT BY numeric_id, location, followers_count desc
        > ) a
        > WHERE Rank(location)<1;
    FAILED: NullPointerException null