What is Apache Hive?
Apache Hive is data warehouse software that runs on top of Apache Hadoop and enables querying and analysis of the data stored there. It lets you run SQL-like queries against data in distributed Hadoop storage, which supports reporting, analysis and tasks such as ETL, none of which the Hadoop Distributed File System (HDFS) offers on its own. Through HiveQL, a query language similar to SQL, it lets you query data spread across large distributed stores and streamlines the analysis of extremely large datasets.
Loome Integrate Apache Hive Connection
With the Apache Hive connector, you can run SQL commands on your Hadoop cluster directly from Loome Integrate. This opens up new options when orchestrating your data tasks. Scheduling SQL queries to run when you need them, and in a specific order, gives you more control over your Hadoop data environment, combining the power of a large distributed Hadoop architecture with the accessibility of simple querying.
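To make the idea of running queries "in a specific order" concrete, here is a minimal Python sketch of dependency-ordered execution. It is not Loome Integrate's API or implementation. The task names, table names, HiveQL statements and the `run_in_order` helper are all hypothetical, chosen only to illustrate the concept.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task names and HiveQL statements, for illustration only.
tasks = {
    "create_staging": "CREATE TABLE IF NOT EXISTS staging_sales (id INT, amount DOUBLE)",
    "load_staging": "INSERT INTO TABLE staging_sales SELECT id, amount FROM raw_sales",
    "build_summary": (
        "CREATE TABLE sales_summary AS "
        "SELECT COUNT(*) AS n, SUM(amount) AS total FROM staging_sales"
    ),
}

# Each task maps to the set of tasks that must finish before it starts.
dependencies = {
    "create_staging": set(),
    "load_staging": {"create_staging"},
    "build_summary": {"load_staging"},
}


def run_in_order(tasks, dependencies, execute):
    """Pass each statement to `execute`, prerequisites first.

    `execute` is any callable that accepts a SQL string. Returns the
    list of task names in the order they were run.
    """
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        execute(tasks[name])
    return order


# Stand-in executor: record the statements instead of contacting a cluster.
executed = []
order = run_in_order(tasks, dependencies, executed.append)
```

In this sketch the executor only records statements. Against a real cluster, `execute` could instead be the `execute` method of a cursor from a DB-API style Hive client; that substitution is an assumption and is not something the connector requires.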
Learn more about Loome Integrate and how it can optimise the way you query your big data systems.
Apache Hive Connector Solution Scenarios
Running SQL commands on Hadoop clusters has numerous real-world applications, some of which are covered by the scenarios we have written up in our Resources section.
Publish Apache Hive to These Systems
Once you have performed the necessary data transformation tasks, you will usually want to turn to data visualisation and reporting tools. Loome Publish allows you to bring together all your reports from various systems into a single consolidated reporting portal.