HBase Shell Usage

Apache HBase is an open-source, non-relational, distributed database modeled after Google's Bigtable and written in Java. It is developed as part of the Apache Software Foundation's Apache Hadoop project and runs on top of HDFS (Hadoop Distributed File System), providing Bigtable-like capabilities for Hadoop. That is, it provides a fault-tolerant way of storing large quantities of sparse data (small amounts of information caught within a large collection of empty or unimportant data, such as finding the 50 largest items in a group of 2 billion records, or finding the non-zero items representing less than 0.1% of a huge collection).
Prerequisites
1) A machine with Ubuntu 14.04 LTS operating system.
2) Apache Hadoop pre-installed (How to install Hadoop on Ubuntu 14.04)
3) Apache HBase pre-installed (How to install HBase on Ubuntu 14.04)
HBase Shell Usage
HBase ships with an interactive shell through which you can communicate with HBase. HBase uses the Hadoop Distributed File System to store its data and runs a master server together with a number of region servers. Each table is split into regions, and these regions are distributed across the region servers.
The master server manages these region servers, and all of this storage lives on HDFS. Given below are some of the commands supported by the HBase shell.
Step 1 - Change the directory to /usr/local/hbase/bin
$ cd /usr/local/hbase/bin
Step 2 - Start all HBase daemons.
$ ./start-hbase.sh
Step 3 - Use jps (the Java Virtual Machine Process Status Tool) to verify that the HBase daemons are running. In a typical pseudo-distributed setup you should see HMaster, HRegionServer, and HQuorumPeer listed alongside the Hadoop daemons.
$ jps
Once HBase is up and running, check the web UI of the HBase master at the following URL:
http://localhost:16010
Step 4 - Start the HBase interactive shell using the "hbase shell" command as shown below.
$ ./hbase shell
The list command returns the names of all the tables in HBase.
hbase> list
The status command returns the status of the cluster, including details of the servers running in it. It accepts an optional verbosity argument:
hbase> status
hbase> status 'simple'
hbase> status 'summary'
hbase> status 'detailed'
The version command returns the version of HBase running on your system.
hbase> version
The table_help command explains how to use the table-referenced commands.
hbase> table_help
The whoami command returns the current HBase user details.
hbase> whoami
Create table You can create a table using the create command. You must specify the table name and at least one column family name.
hbase> create 'emp','personal data','professional data'
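Once the table exists, you can do a quick round trip with put and get. The row key 'row1' and the column qualifier 'name' below are illustrative values, not part of the original example:

hbase> put 'emp', 'row1', 'personal data:name', 'Alice'
hbase> get 'emp', 'row1'
hbase> scan 'emp'

put writes a single cell (table, row key, 'column family:qualifier', value), get retrieves all cells of one row, and scan walks the whole table.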
Verify the creation
hbase> list
Drop table Using the drop command, you can delete a table. Before dropping a table, you must disable it.
hbase> disable 'emp'

hbase> drop 'emp'
drop_all This command drops all the tables matching the "regex" given in the command. All matching tables must be disabled first, and the shell asks for confirmation before dropping them.
hbase> drop_all 'e.*'
Verify
hbase> list
Disable Table To delete a table or change its settings, you need to first disable the table using the disable command. You can re-enable it using the enable command.
hbase> disable 'emp'
Verification After disabling the table, it still shows up in the output of the list and exists commands, but you can no longer scan it; attempting to scan a disabled table raises a TableNotEnabledException.
hbase> scan 'emp'
is_disabled This command is used to find whether a table is disabled.
hbase> is_disabled 'emp'
disable_all This command is used to disable all the tables matching the given regex.
hbase> disable_all 'e.*'
Enable Table You can re-enable a disabled table using the enable command.
hbase> enable 'emp'
Verification After enabling the table, scan it. If the scan completes without an error (returning the table's rows, if any), the table is successfully enabled.
hbase> scan 'emp'
is_enabled This command is used to find whether a table is enabled.
hbase> is_enabled 'emp'
enable_all This command is used to enable all the tables matching the given regex.
hbase> enable_all 'e.*'
describe This command returns the description of the table, including its column families and their settings.
hbase> describe 'emp'
alter Alter is the command used to make changes to an existing table. Using this command, you can change the maximum number of versions kept for a column family, set and delete table-scope operators, and delete a column family from a table.
hbase> alter 'emp', NAME => 'personal data', VERSIONS => 5
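With VERSIONS raised to 5, HBase keeps up to five timestamped versions of each cell in that column family, which you can read back with a versioned get. The row key 'row1' and qualifier 'city' below are illustrative values:

hbase> put 'emp', 'row1', 'personal data:city', 'London'
hbase> put 'emp', 'row1', 'personal data:city', 'Paris'
hbase> get 'emp', 'row1', {COLUMN => 'personal data:city', VERSIONS => 3}

The get should return both stored values, newest first, each with its own timestamp.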
Table Scope Operator Using alter, you can set and remove table scope operators such as MAX_FILESIZE, READONLY, MEMSTORE_FLUSHSIZE, DEFERRED_LOG_FLUSH, etc.
hbase> alter 'emp', READONLY => 'true'
Deleting a column family Using alter, you can also delete a column family. Given below is the syntax to delete a column family using alter.
hbase> alter 'emp', 'delete' => 'professional data'
Verification Now verify the data in the table after the alteration. Observe that the column family 'professional data' is gone, since we deleted it.
hbase> scan 'emp'
Exists You can verify the existence of a table using the exists command.
hbase> exists 'emp'
Exit from the HBase shell.
hbase> exit
Don't forget to stop the HBase daemons.
$ ./stop-hbase.sh
