Hive User Defined Aggregate Functions (UDAF) Java Example

Step 1 - Add the following jar files to the build path of your Java project.
$HIVE_HOME/lib/hive-exec*.jar
$HIVE_HOME/lib/*.jar
$HADOOP_HOME/share/hadoop/mapreduce/*.jar
$HADOOP_HOME/share/hadoop/common/*.jar
Max.java
import org.apache.hadoop.hive.ql.exec.UDAF;
import org.apache.hadoop.hive.ql.exec.UDAFEvaluator;
import org.apache.hadoop.io.IntWritable;

@SuppressWarnings("deprecation")
public class Max extends UDAF {

    public static class MaxIntUDAFEvaluator implements UDAFEvaluator {

        private IntWritable output;

        // Resets the aggregation state; called once per evaluator instance.
        public void init() {
            output = null;
        }

        // Called once per input row; keeps the running maximum.
        public boolean iterate(IntWritable maxvalue) {
            if (maxvalue == null) {
                return true;
            }
            if (output == null) {
                output = new IntWritable(maxvalue.get());
            } else {
                output.set(Math.max(output.get(), maxvalue.get()));
            }
            return true;
        }

        // Returns the partial aggregation state at the end of a map task.
        public IntWritable terminatePartial() {
            return output;
        }

        // Folds a partial result from another task into this evaluator.
        public boolean merge(IntWritable other) {
            return iterate(other);
        }

        // Returns the final result of the aggregation.
        public IntWritable terminate() {
            return output;
        }
    }
}
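To see how Hive drives this evaluator, the lifecycle can be sketched as a standalone program: each "map task" calls init and then iterate per row, ships its terminatePartial state, and the reduce side folds those partials in with merge before terminate produces the final answer. This is a minimal sketch, not Hive itself: the class and its Evaluator are stand-ins, and plain Integer replaces IntWritable so it runs without the Hadoop jars.

```java
// Standalone sketch of the UDAF lifecycle Hive drives for MaxIntUDAFEvaluator.
// Integer stands in for IntWritable so the sketch runs without Hadoop jars.
public class MaxLifecycleSketch {

    static class Evaluator {
        private Integer output;

        // Reset aggregation state.
        void init() { output = null; }

        // One call per input row (map side); keep the running maximum.
        boolean iterate(Integer value) {
            if (value == null) return true;
            output = (output == null) ? value : Math.max(output, value);
            return true;
        }

        // Partial state shipped from the map side to the reduce side.
        Integer terminatePartial() { return output; }

        // Reduce side folds in each partial result.
        boolean merge(Integer other) { return iterate(other); }

        // Final aggregated result.
        Integer terminate() { return output; }
    }

    public static void main(String[] args) {
        // Split the tutorial's Numbers_List.txt values across two "map tasks".
        int[][] splits = { {10, 12, 23, 55, 66, 77}, {88, 99, 22, 13, 16} };

        Evaluator reducer = new Evaluator();
        reducer.init();
        for (int[] split : splits) {
            Evaluator mapper = new Evaluator();
            mapper.init();
            for (int v : split) mapper.iterate(v);
            reducer.merge(mapper.terminatePartial());
        }
        System.out.println(reducer.terminate()); // prints 99
    }
}
```

Note that merge simply reuses iterate: taking the maximum of partial maxima is the same operation as taking the maximum of rows, which is what makes MAX safe to compute in parallel.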
Step 2 - Compile Max.java and package the compiled class into a jar file (MaxUDAF.jar in this example).
Step 3 - Create a Numbers_List.txt file
Numbers_List.txt
Step 4 - Add the following numbers to the Numbers_List.txt file, one per line. Save and close.
10
12
23
55
66
77
88
99
22
13
16
Step 5 - Change the directory to /usr/local/hive/bin
$ cd $HIVE_HOME/bin
Step 6 - Enter the hive shell.
$ hive
Step 7 - Create a table Num_list, load the Numbers_List.txt data into it, and verify the rows.
hive> CREATE TABLE Num_list(Num int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\n';
hive> LOAD DATA LOCAL INPATH '/home/hduser/Desktop/HIVE/Numbers_List.txt' OVERWRITE INTO TABLE Num_list;
hive> SELECT * FROM Num_list;
Step 8 - Add the jar file to the distributed cache, create a temporary function, and execute the UDAF. Note that the temporary function named max shadows Hive's built-in MAX for this session.
hive> ADD JAR /home/hduser/Desktop/HIVE/MaxUDAF.jar;
hive> CREATE TEMPORARY FUNCTION max AS 'Max';
hive> SELECT max(Num) FROM Num_list;
