Hive

What is Apache Hive?

Apache Hive is an open-source data warehouse solution built on Hadoop infrastructure. It processes large structured datasets and lets you query them with HiveQL, a SQL-like query language.

What Hive is not

Hive is not designed for OLTP processing.
It is not a relational database (RDBMS).
It is not meant for row-level updates in real-time systems.

Apache Hive Advantages

Supports large datasets
Runs on Hadoop infrastructure, which uses commodity hardware
Supports SQL-like syntax (HiveQL)
Provides the Beeline command-line client, and HiveServer2 accepts JDBC/ODBC connections from Java, Scala, C#, Python, and many other languages
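
To make the Beeline point concrete, here is a minimal Python sketch that assembles a Beeline command line for running one HiveQL statement against HiveServer2. It only builds the command and does not execute it. The host, the user name `hive`, and the table `employees` are hypothetical placeholders; port 10000 is the usual HiveServer2 default, but your cluster may differ.

```python
import shlex


def hive_jdbc_url(host="localhost", port=10000, database="default"):
    """Build a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database."""
    return f"jdbc:hive2://{host}:{port}/{database}"


def beeline_command(query, user, host="localhost", port=10000, database="default"):
    """Return the argument list for running one HiveQL statement through Beeline."""
    return [
        "beeline",
        "-u", hive_jdbc_url(host, port, database),  # JDBC connection URL
        "-n", user,                                  # user name to connect as
        "-e", query,                                 # HiveQL statement to execute
    ]


if __name__ == "__main__":
    # Hypothetical table and user, for illustration only.
    cmd = beeline_command("SELECT COUNT(*) FROM employees", user="hive")
    # Print a shell-safe version of the command instead of running it.
    print(shlex.join(cmd))
```

On a machine where Beeline is installed and HiveServer2 is reachable, the returned list could be passed to `subprocess.run`. Other languages typically connect through the same `jdbc:hive2://` URL with a JDBC or ODBC driver rather than through Beeline itself.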

Hive Installation

Apache Hive Installation on Hadoop HDFS

Start HiveServer2 & Connect Beeline

Hive Clients

Hive CLI (deprecated in newer Hive versions)
Connect to Hive using Beeline

HiveQL DDL Commands

HiveQL DML Commands

Hive Partition and Bucket

Hive Java Examples

Hive Scala Examples

Hive Spark Examples

Spark Union Hive Tables from different Databases

Hive PySpark Examples

Hive Errors and Exceptions

Hive – HiveException java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Why Hive Tables Load with Null Values

© 2024 Nidi Soft. All Rights Reserved.