Hadoop and Datameer
Hadoop provides scalable data storage using the Hadoop Distributed File System (HDFS) and fast parallel data processing on a fault-tolerant cluster of computers.
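As a rough illustration of the parallel processing model, the sketch below imitates the map, shuffle, and reduce steps of a word count in plain Python. It does not use Hadoop or any Hadoop API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, the input strings stand in for blocks of a file stored in HDFS, and a local thread pool stands in for cluster nodes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(block):
    """Emit a (word, 1) pair for every word in one input block."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(mapped_blocks):
    """Group the emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for pairs in mapped_blocks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(blocks, workers=2):
    """Run the map step over all blocks concurrently, then shuffle and reduce."""
    # Assumption: a thread pool is only a stand-in for the cluster's worker nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, blocks))
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    # Each string plays the role of one block of a larger file.
    blocks = ["hadoop stores data", "hadoop processes data in parallel"]
    print(word_count(blocks))
```

On a real cluster the framework also handles distributing the blocks, scheduling tasks near the data, and rerunning failed tasks; none of that is modeled here.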
If you are a system administrator responsible for setting up or configuring a Hadoop cluster to use with Datameer, this topic provides some links you might find useful.
If you are setting up a new Hadoop system for use with Datameer, see System Requirements for details on hardware and software requirements.
Getting Started with Hadoop
Learn about Hadoop.
Learn more about the HDFS architecture.
Here are some additional links where you can learn more about Hadoop:
- Hadoop wiki (a useful general starting point)
- Cluster setup:
- Yahoo Hadoop tutorial
- Tuning Hadoop for performance
- MapReduce tutorial
Additionally, see Monitoring Hadoop and Datameer to learn more about monitoring Hadoop, and see Hadoop Cluster Configuration Tips to learn how to optimize Hadoop for use with Datameer.