Hive comes with HiveServer2 which is a server interface and has its own Command Line Interface(CLI) called Beeline which is used to connect to Hive running on Local or Remove server and run HiveQL queries. Beeline is a JDBC client that is based on the SQLLine CLI. In this article, you will learn how to connect to Hive using Beeline with several examples.
Using Beeline we can connect to Hive running on Local or Remote server using IP address and port.
Start Beeline to Connect to Hive
To start Beeline, run beeline shell
which is located at $HIVE_HOME/bin
directory.
This prompts you to an interactive Hive Beeline CLI Shell where you can run HiveQL commands.
You can enter !help
on CLI to get all commands that are supported.
beeline>!help
Beeline can be run in Embedded mode and Remote mode, in Embedded mode it also runs HiveServer2 embedded internally similar to Hive CLI, and in Remote mode, it connects to remote HiveServer2 service over Thrift.
Now let’s connect to HiveServer2 using !connect and learn how to use embedded and remote modes with examples.
Using Beeline in Embedded Mode
Running in embedded mode is a quick way to connect to Hive using Beeline and run some HiveQL queries, this is similar to Hive CLI (older version). In embedded mode, it launches Hive service (HiveServer2) internally hence it’s not recommended for production use.
To start Beeline in embedded mode and connect to Hive using a connection string !connect jdbc:hive2://
, By running this command it prompts for user name and password. HiveServer2 by default provides user scott and password tiger, so let’s use these default credentials.
Note: One of the main differences between Hive CLI and HiveServer2 (HS2) is HS2 provides authentication, You cannot connect to Hive without authenticating.
beeline>!connect jdbc:hive2://
connecting to jdbc:hive2://
Enter username for jdbc:hive2://: scott
Enter password for jdbc:hive2://: *******
0: jdbc:hive2://>
You may get below errors
- If you get a connection error User: is not allowed to impersonate, set property
hive.server2.enable.doAs
tofalse
on$HIVE_HOME/conf/hive-site.xml
file, restart the Hive Beeline to connect again. - Follow the link if you get this error HiveException java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient,
Alternatively, you can also supply the embedded URL and credentials in one statement.
beeline>!connect jdbc:hive2:// -n scott -p tiger
(or)
bin/beeline -u jdbc:hive2:// scott tiger
Using Beeline to Connect to Remote Hive
In Remote mode, the HiveServer2 process is running on a remote cluster and we can connect to remote Hive from Beeline by providing a Remote IP address and port on the JDBC connection URL string.
In order to use Beeline connecting to remove server, first, you need to have the HiveServer2 service running on remote server, if it’s not already running, Start HiveServer2
Note: By default, HiveServer2 runs on port 10000.
Using separate HiveSever2 service enables us to configure different types of authentications LDAP and Kerberos hence, connecting to Hive in Remote mode is recommended to use for production.
This is the most secure way of connecting to Hive and protecting production data by not giving users access to cluster.
To connect to Hive running on remote cluster, just pass the IP address and Port on JDBC connection string.
beeline>!connect jdbc:hive2://192.168.1.1:10000 scott tiger
(or)
beeline>!connect jdbc:hive2://192.168.1.1:10000 -n scott -p tiger
By not providing a username and password, it prompts for the credentials to enter.
beeline> !connect jdbc:hive2://192.168.1.1:10000 scott tiger
Connecting to jdbc:hive2://192.168.1.1:10000
Connected to: Apache Hive (version 3.1.2)
Driver: Hive JDBC (version 3.1.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://192.168.1.1:10000>
In case if you are running on LOCAL, you can also try with the localhost, hostname, or 127.0.0.1 instead of remote IP address.
Upon successfully staring Beeline, enter the show databases;
to list the database, by default hive provides default
database.
0: jdbc:hive2://192.168.1.1:10000> show databases;
+----------------+
| database_name |
+----------------+
| default |
+----------------+
1 rows selected (1.775 seconds)
0: jdbc:hive2://192.168.1.1:10000>
You can also directly issues a Beeline command from the Unix shell, For more command option please refer Beeline Options
prabha@namenode:~/hive$ bin/beeline -u jdbc:hive2://192.168.1.1:10000 scott tiger
Happy Learning !!