Frequently Asked Hadoop Questions

Q. Where can I go to learn more about Hadoop?

A. See: Hadoop Tutorials and Extra Features

Q. How can I optimize my Hadoop installation for use with Datameer?

A. See: Hadoop Cluster Configuration Tips

Q. How can I choose the job queue/pool to which Datameer submits jobs?

A:

  1. First, determine which Hadoop job property selects the queue/pool on your cluster; this depends on the scheduler you are using. With the Fair Scheduler, the property used for pool selection is named by the Hadoop property mapred.fairscheduler.poolnameproperty, which is configured in conf/mapred-site.xml of your Hadoop installation.
  2. In the Datameer UI, add that property under Administration > Hadoop Cluster > Custom Properties and set its value to the name of the pool Datameer should use (see the example below).
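
For example, assume the cluster's Fair Scheduler is configured to read the pool name from a job property called pool.name (both the property name pool.name and the pool "production" below are illustrative; use the values that match your cluster's configuration). The relevant entry in conf/mapred-site.xml would look like this:

    <!-- tells the Fair Scheduler which job property names the pool -->
    <property>
       <name>mapred.fairscheduler.poolnameproperty</name>
       <value>pool.name</value>
    </property>

The custom property to add in Datameer would then be:

    pool.name=production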

Q. How do I configure Datameer/Hadoop to use native compression?

A: See Using Compression with Hadoop and Datameer.

Q. How do I configure Datameer/Hadoop to use LZO native compression?

A: See Using Compression with Hadoop and Datameer.

Q. How do I configure Datameer/Hadoop to use Snappy native compression?

A: See Using Compression with Hadoop and Datameer.

Q. How can I use a custom codec for Avro file export?

A: You can add and define a custom codec if the default codecs (listed below) don't meet your requirements.

  1. Determine if a custom codec for Avro file export is needed.
  2. Copy the JAR file containing your custom codec into the etc/custom-jars folder of your Datameer installation.
  3. Add the following property to the custom Hadoop properties, found under the Administration tab by selecting Hadoop Cluster:

    das.avro.io.compression.codecs=<your custom avro codec>
  4. Start or restart Datameer.

By default, Datameer supports the following codecs: "deflatecodec", "snappycodec", "bzip2codec", "gzipcodec", and "defaultcodec" for Avro export. 
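
For example, if your custom codec is registered under the identifier mycustomcodec (purely illustrative; use whatever identifier your codec actually registers under), the custom property would be:

    das.avro.io.compression.codecs=mycustomcodec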

Q. Why does the plugin registry fail to resolve the dependency plugin-das-extension-points?

A: If you observe a message similar to:

WARN [2011-07-13 17:58:11] (PluginRegistryImpl.java:374) - Missing dependency plugin-das-extension-points for plugin <XYZ>

This means that the plug-in extension points must be updated in all custom Datameer plug-ins when moving to a new Datameer version.

As a temporary workaround, copy the old plug-in file <Datameer old version>/plugins/plugin-das-extension-points-1.2.x.zip to the new installation folder and restart the Datameer application; this allows the custom extensions to be loaded (see the example below).
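
A minimal sketch of that copy step (assuming the plug-ins of the new installation also live in its plugins folder; the version placeholders are kept as-is):

    cp <Datameer old version>/plugins/plugin-das-extension-points-1.2.x.zip <Datameer new version>/plugins/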

However, the proper fix is to maintain your plug-in code. Change your plugin.xml as follows, which removes the requirement on plugin-das-extension-points:

<?xml version="1.0" encoding="UTF-8"?>
<plugin id="plugin-xml" name="XML Plugin" version="1.0" provider-name="Datameer, Inc.">
   <requires>
      <import plugin="das.sdk" />
   </requires>
   ...
</plugin>

Q. Why do I receive the error “IllegalAccessError” when trying to access an HBase table with HBase 0.96.1 - 0.98.0?

A: This error is caused by https://issues.apache.org/jira/browse/HBASE-10304.

To fix this problem on the cluster:

  • Copy the hbase-protocol jar into Hadoop's root classpath (the HADOOP_CLASSPATH environment variable), as shown in the sketch below.
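
A minimal sketch of what this could look like in conf/hadoop-env.sh on the cluster nodes (the path and jar version below are illustrative; use the hbase-protocol jar that ships with your HBase version):

    # prepend the hbase-protocol jar to Hadoop's classpath
    export HADOOP_CLASSPATH=/usr/lib/hbase/lib/hbase-protocol-0.98.0-hadoop2.jar:$HADOOP_CLASSPATH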

To fix this problem on the Datameer server:

  • Copy the same hbase-protocol jar into the etc/custom-jars folder of your Datameer installation and restart the Datameer server.