Hadoop Hive

Topics 

  • What is Hive, and why use it?
  • Hive Architecture 
  • HiveQL 
  • Physical Layout 
  • Loading data into Hive tables 
  • Partitions 
  • Joining 
  • Buckets

What is Hive, and why use it?

What is Hive? 

  • Provides a data warehousing solution built on top of Hadoop
    • Facilitates querying and managing large data sets residing in Hadoop's distributed storage (HDFS or HBase)
  • Comes with a SQL-like language called HiveQL
    • SQL has a huge developer base
    • Brings Hadoop capabilities (querying, analyzing, and summarizing large amounts of data) to developers familiar with SQL
  • Allows you to project structure onto data in any format
    • Can handle unstructured data
  • Open source Apache project 
    • Started at Facebook
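
The points about HiveQL and projecting structure onto data can be sketched with a short HiveQL example. This is only an illustration under stated assumptions: the table name page_views, its columns, and the HDFS path /data/page_views are hypothetical and do not come from the course material.

```sql
-- Hypothetical example: the table name, columns, and HDFS path below are
-- illustrative assumptions, not part of the original course material.

-- Project a tabular structure onto tab-delimited text files that already
-- sit in HDFS. EXTERNAL means Hive records only the metadata; the files
-- stay where they are and are interpreted when the table is read.
CREATE EXTERNAL TABLE page_views (
  view_time STRING,
  user_id   BIGINT,
  page_url  STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/page_views';

-- Query the files with familiar SQL constructs; Hive turns the query
-- into jobs that run on the Hadoop cluster.
SELECT page_url, COUNT(*) AS views
FROM page_views
GROUP BY page_url
ORDER BY views DESC
LIMIT 10;
```

A developer who knows SQL can read both statements without learning the underlying Hadoop APIs, which is the main appeal described above.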
