...
...
...
...
Info |
---|
This page describes only the minimal set of steps needed for Datameer X to operate properly in your Hadoop environment. For a more complete guide to Hadoop cluster design, configuration, and tuning, see Hadoop Cluster Configuration Tips or the additional resources section below. |
...
Datameer X accesses your Hadoop cluster as a particular Hadoop user. By default, this is the UNIX user who launched the Datameer X application (the name returned by the UNIX command 'whoami'). For this to work properly, create a user of the same name within Hadoop's HDFS for Datameer X to use exclusively for scheduling and configuration. This ensures that the proper permissions are set and that the user is recognized whenever the Datameer X application interacts with your Hadoop cluster.
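As a rough sketch of that setup, the shell function below prints the HDFS commands an administrator might run to give the launching user a home directory. The function name `print_hdfs_user_setup` and the `/user/<name>` home-directory layout are assumptions for illustration and do not come from the Datameer X documentation. The function only prints the commands, so they can be reviewed before being run with HDFS superuser rights.

```shell
# Hypothetical helper: print (do not run) the HDFS commands that would
# create a home directory for the user that launches Datameer X.
# Assumes the conventional /user/<name> layout; adjust for your cluster.
print_hdfs_user_setup() {
    dm_user="${1:-$(whoami)}"   # default to the current UNIX user
    echo "hdfs dfs -mkdir -p /user/${dm_user}"
    echo "hdfs dfs -chown ${dm_user} /user/${dm_user}"
}
```

Calling `print_hdfs_user_setup` with no argument uses the name reported by `whoami`; passing a name (for example `print_hdfs_user_setup datameer`) prints the commands for that user instead.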
The username used to launch the Datameer X application can be configured in <Datameer X folder>/etc/das-env.sh.
...
For standard Hadoop compression algorithms, you can choose the algorithm Datameer X should use. However, if your Hadoop cluster uses a non-standard compression algorithm such as LZO, you need to install its libraries on the Datameer X machine. This is necessary so that Datameer X can read the files it writes to HDFS and decompress files residing on HDFS that you wish to import. Libraries that use native compression require both a Java component (a JAR) and a native code component (UNIX packages). The JAR file needs to be placed in <Datameer X folder>/etc/custom-jars. See Frequently Asked Hadoop Questions#Q. How do I configure Datameer/Hadoop to use native compression? for more details.
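The JAR placement step can be sketched as a small shell function. This is only an illustration: the function name `install_codec_jar` is hypothetical, and both the JAR path and the Datameer X folder are supplied by the caller rather than assumed. It covers only the Java component; the native packages still have to be installed separately.

```shell
# Hypothetical helper: copy a compression codec JAR into the
# etc/custom-jars directory of a Datameer X installation.
install_codec_jar() {
    jar_path="$1"        # path to the codec JAR for your cluster
    datameer_home="$2"   # the <Datameer X folder> of your installation
    target="${datameer_home}/etc/custom-jars"
    # Fail early with a message instead of copying to a wrong place.
    [ -f "$jar_path" ] || { echo "JAR not found: $jar_path" >&2; return 1; }
    [ -d "$target" ] || { echo "Missing directory: $target" >&2; return 1; }
    cp "$jar_path" "$target/"
}
```

It would be called as `install_codec_jar <path to JAR> <Datameer X folder>`, and it returns a non-zero status if either the JAR or the target directory is missing.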
Note |
---|
The configuration of Hadoop compression can drastically affect Datameer X performance. See Hadoop Cluster Configuration Tips for more information. |
Connecting Datameer X to Your Hadoop Cluster
By default, Datameer X isn't connected to any Hadoop cluster and operates in local mode, in which all analytics and other functions are performed by a local instance of Hadoop. Local mode is useful for prototyping with small data sets, but not for high-volume testing or production. To connect Datameer X to a Hadoop cluster, change the settings on the Admin > Hadoop Cluster page: click the Admin tab at the top of the page, then click the Hadoop Cluster tab in the left column. Click Edit and change the Mode setting.
...