This topic provides tips on configuring Hadoop and Datameer X for use in a shared cluster.
...
Hadoop assumes that the user who launches Datameer X is also the user to act as on the Hadoop cluster. To set the user name up correctly, create a user on the Hadoop cluster with the same name as the user who launches Datameer X. That cluster user must have permission to read from and write to the Datameer X private folder on the cluster.
...
- You might experience HDFS permissions errors when you attempt to run jobs (as Datameer X tries unsuccessfully to manipulate files in the Datameer X private folder, or incorrectly attempts to manipulate the directory of the HDFS root user).
- Jobs can be submitted into the wrong work queue (meaning that your job won't have the appropriate priority), or can be rejected.
...
    hadoop fs -chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...
    hadoop fs -chown [-R] [OWNER][:[GROUP]] PATH...
    hadoop fs -chgrp [-R] GROUP PATH...
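As an illustration of the permission commands above, the following is a minimal sketch of how the Datameer X private folder might be handed to the launching user. The user name `datameer`, the group `hadoop`, and the path `/user/datameer` are hypothetical placeholders, not values taken from this documentation, and the `750` mode is just one reasonable choice. By default the script only prints the commands so that they can be reviewed first.

```shell
#!/usr/bin/env bash
# Sketch only: the three values below are hypothetical placeholders.
DATAMEER_USER="datameer"        # assumed name of the user who launches Datameer X
DATAMEER_GROUP="hadoop"         # assumed group on the cluster
PRIVATE_DIR="/user/datameer"    # assumed path of the Datameer X private folder

# Echo each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

grant_private_folder() {
  # Hand ownership of the private folder (recursively) to the Datameer X user.
  run hadoop fs -chown -R "${DATAMEER_USER}:${DATAMEER_GROUP}" "${PRIVATE_DIR}"
  # Owner: read/write/execute, group: read/execute, others: no access.
  run hadoop fs -chmod -R 750 "${PRIVATE_DIR}"
}

grant_private_folder
```

Running the script with `DRY_RUN=0` as an HDFS user that is allowed to change ownership would apply the changes instead of printing them.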
Datameer X export jobs to HDFS
Datameer X allows you to export data to remote filesystems. In the New Export Job > Data Details step, you can enable the clear output directory option.
...
If you have configured a job scheduler on your cluster, you can easily configure which queue or pool Datameer X should use. For example, if you have set up a fair share scheduler (see Apache FairScheduler), you can set this up by doing the following:
- Check which property needs to be set to configure the pool. Its name is given by the value of the Hadoop property mapred.fairscheduler.poolnameproperty, configured in conf/mapred-site.xml of your Hadoop installation.
- Set this property in Datameer X at Administration > Hadoop Cluster > Custom Property, using the pool that Datameer X should use as its value.
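The two steps above can be sketched as a small shell helper. This is an illustration under assumptions, not a Datameer X tool: the function names are hypothetical, the lookup is a plain text match rather than a real XML parser, and the `name=value` output format is assumed to be what the Custom Property field accepts.

```shell
#!/usr/bin/env bash
# Print the <value> that follows a given <name> in a Hadoop *-site.xml file.
# Simple text match (not a full XML parser): it expects <value> to come
# directly after <name>, separated only by whitespace.
hadoop_property() {
  local file="$1" name="$2"
  tr -d '\n' < "$file" \
    | sed -n "s:.*<name>${name}</name>[[:space:]]*<value>\([^<]*\)</value>.*:\1:p"
}

# Print the line to enter as a custom property (assumed name=value format).
# Arguments: path to mapred-site.xml, pool that Datameer X should use.
datameer_pool_property() {
  local file="$1" pool="$2" prop
  prop="$(hadoop_property "$file" mapred.fairscheduler.poolnameproperty)"
  if [ -z "$prop" ]; then
    echo "mapred.fairscheduler.poolnameproperty is not set in $file" >&2
    return 1
  fi
  echo "${prop}=${pool}"
}
```

For example, `datameer_pool_property conf/mapred-site.xml datameer` (with `datameer` as a hypothetical pool name) would print a single `name=value` line built from whatever property name the cluster configuration specifies.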
Using Data from Other Applications
Datameer X can read files on HDFS generated by other MapReduce jobs and applications on top of Hadoop. Depending on the format of the files produced by other jobs, writing a plug-in for Datameer X might or might not be required.
See the custom import plug-in definition and the Datameer X Plug-in Tutorial for additional information on creating a custom plug-in.
...