Hadoop HDFS


  • What is and Why HDFS? 
  • HDFS Architecture 
  • HDFS Features 
  • HDFS Commands 
  • HDFS Web UI 
  • Hue web UI

What is and Why HDFS?

What is HDFS? 

  • HDFS is a virtual FS (File System) built on top of the local FS 
    • Data written to HDFS is ultimately stored on the local FS of the machines in the cluster 
  • You can't browse HDFS directly with local FS tools 
    • Use the HDFS commands (which resemble local FS commands), 
    • the HDFS Web UI, 
    • or the available APIs 
  • HDFS splits each file into blocks and stores multiple replicas of every block 
    • Management and replication of blocks are handled by HDFS 
  • HDFS is the primary distributed storage used by Hadoop applications 
    • It provides scalability, reliability, and automatic distribution of data
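
To make the block and replication idea concrete, here is a minimal Python sketch of the arithmetic. It is a toy model, not HDFS code: the function name block_layout is hypothetical, and the block size (128 MiB) and replication factor (3) are assumptions based on commonly used values for the dfs.blocksize and dfs.replication settings. Your cluster may be configured differently.

```python
# Toy model of how a file maps onto replicated blocks.
# BLOCK_SIZE and REPLICATION are assumed values, not read from any cluster.
BLOCK_SIZE = 128 * 1024 * 1024   # assumed block size: 128 MiB
REPLICATION = 3                  # assumed replication factor

def block_layout(file_size, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return (number of blocks, size of the last block, raw bytes stored).

    The last block only occupies as many bytes as it actually holds,
    so raw storage is simply file_size * replication.
    """
    if file_size == 0:
        return 0, 0, 0
    # Integer ceiling division avoids floating-point rounding.
    num_blocks = (file_size + block_size - 1) // block_size
    last_block = file_size - (num_blocks - 1) * block_size
    raw_bytes = file_size * replication
    return num_blocks, last_block, raw_bytes

MIB = 1024 * 1024
num_blocks, last_block, raw_bytes = block_layout(300 * MIB)
print(num_blocks, last_block // MIB, raw_bytes // MIB)  # prints: 3 44 900
```

Under these assumptions, a 300 MiB file becomes two full 128 MiB blocks plus one 44 MiB block, and the cluster stores 900 MiB in total across the local file systems of its machines.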
