Apache Hive is a data warehouse infrastructure built on top of
Hadoop for data summarization, query, and analysis. Hive provides an
SQL-like interface for querying data stored in the various databases
and file systems that integrate with Hadoop. Without Hive, SQL-style
queries over distributed data have to be implemented against the
MapReduce Java API. Hive supplies the SQL abstraction (HiveQL) that
maps such queries onto the underlying Java API, so they do not have to
be written in the low-level API by hand. Since most data warehousing
applications work with SQL-based query languages, Hive makes it easier
to port SQL-based applications to Hadoop.
Prerequisites
1) A machine running the Ubuntu 14.04 LTS operating system
2) Apache Hadoop 2.6.4 preinstalled (see How to install Hadoop on Ubuntu 14.04)
3) The Apache Hive 2.1.0 download (apache-hive-2.1.0-bin.tar.gz)
Hive Installation With MySQL Database Metastore
NOTE
Hive versions 1.2 onward require Java 1.7 or newer. Hive versions 0.14 to 1.1 also work with Java 1.6.
Hive works with Hadoop 2.x (preferred). Hadoop 1.x is not supported from Hive 2.0.0 onward. Hive versions up to 0.13 also supported Hadoop 0.20.x and 0.23.x.
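The Java requirement in the note can be checked from the terminal before you start. The snippet below is only a rough sketch: java_version_ok is a hypothetical helper name, and it assumes version strings of the form 1.6.0_45, 1.7.0_80 or 11.0.2.

```shell
# Hypothetical helper: succeeds if a Java version string is 1.7 or newer.
# Assumes strings such as "1.6.0_45", "1.7.0_80" or "11.0.2".
java_version_ok() {
  ver="$1"
  major="${ver%%.*}"      # text before the first dot
  rest="${ver#*.}"        # text after the first dot
  minor="${rest%%.*}"     # second component
  if [ "$major" -eq 1 ]; then
    [ "$minor" -ge 7 ]    # legacy 1.x numbering: need 1.7 or later
  else
    [ "$major" -ge 2 ]    # newer numbering (9, 11, ...) is newer than 1.7
  fi
}

# Usage: `java -version` prints to stderr, so redirect it before parsing.
if command -v java >/dev/null 2>&1; then
  installed=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  if java_version_ok "$installed"; then
    echo "Java $installed is new enough for Hive 1.2 and later"
  else
    echo "Java $installed is too old for Hive 1.2 and later"
  fi
fi
```

If java is not installed at all, the usage part prints nothing, which is itself a sign that Java still has to be set up.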
Hive Installation Steps
Step 1 - Install MySQL Server. Open a terminal (CTRL + ALT + T) and install the mysql-server package using sudo.
During the mysql-server installation you will be asked to set a password for the MySQL root user. Enter 'root' or any password of your choice. I used root so that it is easy to remember.
Step 2 - Install the MySQL Java Connector. This installs the connector library (mysql-connector-java.jar) in the /usr/share/java/ folder, which lets Java programs connect to MySQL.
Step 3 - Enter the MySQL command line interface (CLI). Open a terminal (CTRL + ALT + T), start the mysql client as root, and enter the root password when prompted.
Step 4 - Create a new user.
Step 5 - Grant all privileges to the new user.
Step 6 - Flush privileges.

Step 7 - Create the hive directory. Open a new terminal (CTRL + ALT + T) and create /usr/local/hive.
Step 8 - Change the ownership and permissions of the directory /usr/local/hive. Here 'hduser' is an Ubuntu username.
Step 9 - Switch user. The su command lets you execute commands with the privileges of another user account, here hduser.
Step 10 - Change the directory to /home/hduser/Desktop. In my case the downloaded apache-hive-2.1.0-bin.tar.gz file is in the /home/hduser/Desktop folder. On your machine it may be in the Downloads folder, so check first.
Step 11 - Untar the apache-hive-2.1.0-bin.tar.gz file.
Step 12 - Move the contents of the apache-hive-2.1.0-bin folder to /usr/local/hive.
Step 13 - Edit the $HOME/.bashrc file and add the Hive path by appending the HIVE_HOME and PATH export lines.
Step 14 - Reload your changed $HOME/.bashrc settings.
Step 15 - Change the directory to /usr/local/hive/conf
Step 16 - Copy the default hive-env.sh.template to hive-env.sh
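Step 17 - Edit the hive-env.sh file.
Step 18 - Add the HADOOP_HOME, HIVE_CONF_DIR, and HIVE_AUX_JARS_PATH export lines to the hive-env.sh file. Save and close.
Step 19 - Copy the default hive-default.xml.template to hive-site.xml.
Step 20 - Edit the hive-site.xml file.
Step 21 - Add or update the MySQL connection and warehouse properties in the hive-site.xml file.
Step 22 - Remove the default javax.jdo.option.ConnectionUserName property (value APP) from the hive-site.xml file. Save and close.
Step 23 - Copy mysql-connector-java-5.1.28.jar from /usr/share/java/ to the $HIVE_HOME/lib/ folder.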
Step 24 - Change the directory to /usr/local/hadoop/sbin
Step 25 - Start all hadoop daemons.
Step 26 - Before you can create a table in Hive, you must use HDFS commands to create /tmp and /user/hive/warehouse (aka hive.metastore.warehouse.dir) and make them group-writable (chmod g+w).
Step 27 - Change the directory to /usr/local/hive/bin
Step 28 - Run the schematool command as an initialization step, using "mysql" as the db type.
Step 29 - Start the Hive command line interface (CLI) from the shell.
Step 30 - List the tables from the Hive CLI.
Step 31 - Enter the MySQL command line interface (CLI). Open a terminal (CTRL + ALT + T), start the mysql client as hduser, and enter the password when prompted.
Step 32 - Use the metastore database.
Step 33 - List the Hive tables recorded in the MySQL metastore database by querying its TBLS table.
The commands for the steps above are as follows.
Step 1 - Install MySQL Server.
$ sudo apt-get install mysql-server
Step 2 - Install the MySQL Java Connector.
$ sudo apt-get install libmysql-java
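Step 3 - Enter the MySQL CLI.
$ mysql -u root -p
Enter password :****
Step 4 - Create a new user.
mysql> CREATE USER 'hduser'@'%' IDENTIFIED BY 'hduser';
Step 5 - Grant all privileges to the new user.
mysql> GRANT ALL ON *.* TO 'hduser'@'localhost' IDENTIFIED BY 'hduser';
Step 6 - Flush privileges.
mysql> FLUSH PRIVILEGES;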
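If you prefer not to type the three MySQL statements interactively, they can be collected in a file and fed to the mysql client in one go. This is a minimal sketch, assuming the MySQL 5.x server that Ubuntu 14.04 installs (newer MySQL releases no longer accept IDENTIFIED BY inside GRANT). The file name hive_user.sql is a hypothetical choice.

```shell
# Write the statements from Steps 4 to 6 to a file.
# hive_user.sql is a hypothetical file name; any name works.
sql_file=hive_user.sql
cat > "$sql_file" <<'SQL'
CREATE USER 'hduser'@'%' IDENTIFIED BY 'hduser';
GRANT ALL ON *.* TO 'hduser'@'localhost' IDENTIFIED BY 'hduser';
FLUSH PRIVILEGES;
SQL

# Show what will be executed.
cat "$sql_file"

# To run it, uncomment the next line; mysql prompts for the root password.
# mysql -u root -p < "$sql_file"
```

The mysql call is left commented out so that the block itself only creates and prints the file.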
Step 7 - Create the hive directory.
$ sudo mkdir /usr/local/hive
Step 8 - Change the ownership and permissions of /usr/local/hive.
$ sudo chown -R hduser /usr/local/hive
$ sudo chmod -R 755 /usr/local/hive
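Step 9 - Switch user.
$ su hduser
Step 10 - Change to the folder that holds the download.
$ cd /home/hduser/Desktop/
Step 11 - Untar the archive.
$ tar xzf apache-hive-2.1.0-bin.tar.gz
Step 12 - Move the contents to /usr/local/hive.
$ mv apache-hive-2.1.0-bin/* /usr/local/hive
Step 13 - Edit $HOME/.bashrc and add the two export lines. They must be on separate lines so that HIVE_HOME is set before PATH uses it.
$ gedit $HOME/.bashrc
export HIVE_HOME=/usr/local/hive
export PATH=$HIVE_HOME/bin:$HIVE_HOME/lib:$PATH
Step 14 - Reload the settings.
$ source $HOME/.bashrc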
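Editing .bashrc by hand works, but repeating the step can leave duplicate export lines. As a rough alternative, the sketch below appends a line only if that exact line is not already in the file. add_line_once is a hypothetical helper name, and the demo writes to a scratch file rather than the real $HOME/.bashrc.

```shell
# Hypothetical helper: append a line to a file only if that exact line is absent.
add_line_once() {
  line="$1"
  file="$2"
  touch "$file"
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demo against a scratch file; set rc_file="$HOME/.bashrc" to apply it for real.
rc_file=$(mktemp)
add_line_once 'export HIVE_HOME=/usr/local/hive' "$rc_file"
add_line_once 'export PATH=$HIVE_HOME/bin:$HIVE_HOME/lib:$PATH' "$rc_file"
# Calling it again with the same line adds nothing.
add_line_once 'export HIVE_HOME=/usr/local/hive' "$rc_file"
cat "$rc_file"
```

The single quotes keep $HIVE_HOME and $PATH unexpanded, so they are written literally and resolved each time the file is sourced.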
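Step 15 - Change to the conf directory.
$ cd $HIVE_HOME/conf
Step 16 - Copy the hive-env.sh template.
$ cp hive-env.sh.template hive-env.sh
Step 17 - Edit hive-env.sh.
$ gedit hive-env.sh
Step 18 - Add the lines below to hive-env.sh. Save and close.
export HADOOP_HOME=/usr/local/hadoop
export HIVE_CONF_DIR=$HIVE_CONF_DIR
export HIVE_AUX_JARS_PATH=$HIVE_AUX_JARS_PATH
Step 19 - Copy the hive-site.xml template.
$ cp hive-default.xml.template hive-site.xml
Step 20 - Edit hive-site.xml.
$ gedit hive-site.xml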
Step 21 - Add or update the properties below in hive-site.xml.
Put the following at the beginning of hive-site.xml, inside the <configuration> element:
<property>
  <name>system:java.io.tmpdir</name>
  <value>/tmp/hive/java</value>
</property>
<property>
  <name>system:user.name</name>
  <value>${user.name}</value>
</property>
Then add or update these properties:
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true</value>
  <description>metadata is stored in a MySQL server</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>MySQL JDBC driver class</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hduser</value>
  <description>user name for connecting to mysql server</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hduser</value>
  <description>password for connecting to mysql server</description>
</property>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>hdfs://localhost:9000/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>
Step 22 - Remove the property below from hive-site.xml. Save and close.
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>APP</value>
  <description>Username to use against metastore database</description>
</property>
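Because hive-site.xml is large, it is easy to miss one of the metastore properties while editing. The sketch below lists any of the five property names from this tutorial that do not appear in a given file. missing_props is a hypothetical helper, and it is a plain text search, not an XML parser, so it only confirms that the names are present, not that the values are right.

```shell
# Hypothetical helper: print each metastore property name that is absent from the file.
missing_props() {
  file="$1"
  for name in \
    javax.jdo.option.ConnectionURL \
    javax.jdo.option.ConnectionDriverName \
    javax.jdo.option.ConnectionUserName \
    javax.jdo.option.ConnectionPassword \
    hive.metastore.warehouse.dir
  do
    # -F: fixed string, so the dots in the name are matched literally
    grep -qF "<name>$name</name>" "$file" || echo "$name"
  done
}

# Usage with the path from this tutorial (skipped if the file does not exist yet).
site=/usr/local/hive/conf/hive-site.xml
if [ -f "$site" ]; then
  missing=$(missing_props "$site")
  if [ -z "$missing" ]; then
    echo "all metastore properties present"
  else
    echo "missing properties:"
    echo "$missing"
  fi
fi
```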
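Step 23 - Copy the MySQL connector jar to the Hive lib folder.
$ cp /usr/share/java/mysql-connector-java-5.1.28.jar $HIVE_HOME/lib/
Step 24 - Change to the Hadoop sbin directory.
$ cd /usr/local/hadoop/sbin
Step 25 - Start all Hadoop daemons.
$ start-all.sh
Step 26 - Create the HDFS directories and make them group-writable. The -p option creates missing parent directories and does not fail if the directory already exists.
$ hdfs dfs -mkdir -p /tmp
$ hdfs dfs -chmod 777 /tmp
$ hdfs dfs -mkdir -p /user/hive/warehouse
$ hdfs dfs -chmod g+w /tmp
$ hdfs dfs -chmod g+w /user/hive/warehouse
Step 27 - Change to the Hive bin directory.
$ cd $HIVE_HOME/bin
Step 28 - Initialize the metastore schema.
$ schematool -initSchema -dbType mysql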
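Step 29 - Start the Hive command line interface (CLI) from the shell.
$ hive
Step 30 - List the tables from the Hive CLI.
hive> show tables;
Step 31 - Enter the MySQL CLI as hduser.
$ mysql -u hduser -p
Enter password: hduser
Step 32 - Use the metastore database.
mysql> use metastore;
Step 33 - List the Hive tables recorded in the metastore.
mysql> select * from TBLS;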