written 8.6 years ago by |
The physical architecture lays out where you install and execute various components.Figure shows an example of a Hadoop physical architecture involving Hadoop and its ecosystem, and how they would be distributed across physical hosts. ZooKeeper requires an odd-numbered quorum,7 so the recommended practice is to have at least three of them in any reasonably sized cluster physical architecture includes CPU, RAM, disk, and network, because they all have an impact on the throughput and performance of your cluster.
The term commodity hardware is often used to describe Hadoop hardware requirements. It’s true that Hadoop can run on any old servers you can dig up, but you still want your cluster to perform well, and you don’t want to swamp your operations department with diagnosing and fixing hardware issues. Therefore, commodity refers to mid-level rack servers with dual sockets, as much error-correcting RAM as is affordable, and SATA drives optimized for RAID storage. Using RAID, however, is strongly discouraged on the DataNodes, because HDFS already has replication and error-checking built-in; but on the NameNode it’s strongly recommended for additional reliability.
Figure: Hadoop’s Physical Architecture