Table of Contents |
---|
...
Name | Default Value | Description |
---|---|---|
fs.AbstractFileSystem.hdfs.impl | org.apache.hadoop.fs.Hdfs | Required to start the YARN application successfully, because the DAS framework strips this property out while setting up the overlay. |
io.compression.codecs.addition | datameer.dap.common.util.ZipCodec,datameer.dap.common.util.LzwCodec | Required to start the YARN application successfully, because the DAS framework strips this property out while setting up the overlay. Lists additional comma-separated compression codecs that are added to io.compression.codecs. |
das.yarn.available-node-vcores | auto | This property sets the number of available node vcores and is used in order to calculate the optimal number of splits and tasks (numberOfNodes * das.yarn.available-node-vcores). It can be set either to the number of free CPUs per node, or it can be set to auto. In auto mode, Datameer X fetches information about the vCores from every node and sets das.yarn.available-node-vcores to the average of the available vCores. |
das.job.health.check | true | Turns on the DAS job configuration health check for the cluster configuration. |
das.local.exec.map.count | 5 | DAS property to influence the number of mappers in Local Execution mode. |
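
The description of das.yarn.available-node-vcores above gives a formula (numberOfNodes * das.yarn.available-node-vcores) and an auto mode that averages the vCores reported by the nodes. The following Python sketch only illustrates that arithmetic. The function names and the sample cluster are hypothetical, and how Datameer X rounds the average or turns the product into a final number of splits is not stated in this table, so the sketch does not model it.

```python
def resolve_node_vcores(setting, vcores_per_node):
    """Resolve a das.yarn.available-node-vcores value to a number.

    `setting` is the raw property value: either "auto" or a number of
    free CPUs per node. `vcores_per_node` is a hypothetical list with
    the available vCores reported by each node.
    """
    if setting.strip().lower() == "auto":
        # auto mode: average of the available vCores across all nodes
        return sum(vcores_per_node) / len(vcores_per_node)
    return int(setting)


def parallelism_estimate(setting, vcores_per_node):
    """numberOfNodes * das.yarn.available-node-vcores, as in the table."""
    number_of_nodes = len(vcores_per_node)
    return number_of_nodes * resolve_node_vcores(setting, vcores_per_node)


if __name__ == "__main__":
    cluster = [8, 8, 4, 12]  # hypothetical per-node vCore counts
    print(parallelism_estimate("auto", cluster))
    print(parallelism_estimate("6", cluster))
```

With an explicit value the result scales linearly with the node count. In auto mode the product equals the total number of available vCores in the cluster.
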
Other Properties
Name | Default Value | Description |
---|---|---|
YARN MR2 Framework Properties | ||
das.yarn.base-counter-wait-time | 40000 | YARN counters take a while to appear at the Job History Server. Datameer X therefore waits for this base time plus a small factor multiplied by the number of executed tasks for the counters to show up. |
TEZ Execution Framework Properties | ||
das.tez.session-pool.max-cached-sessions | auto | Maximum number of idle sessions held in the cache. 0 disables session pooling; auto uses the number of nodes in the cluster as the maximum number of sessions. |
das.tez.session-pool.max-idle-time | 2m | How long a session held in the pool waits for a DAG to be submitted before it times out. |
das.tez.session-pool.max-time-to-live | 2h | Maximum time a Tez session stays alive, measured from its start time, regardless of whether it is idle or active. |
tez.runtime.compress | true | Enables compression of intermediate data with Tez. |
tez.runtime.compress.codec | org.apache.hadoop.io.compress.SnappyCodec | Codec used for intermediate compression with Tez. |
framework.local.das.parquet-storage.max-parquet-block-size | 67108864 | Sets the maximum Parquet block size for the local execution framework. The default is 64 MB; the value can be lowered but not raised beyond 64 MB. |
tez.shuffle-vertex-manager.desired-task-input-size | 52428800 | Sets the input size for each TEZ sub-task. |
das.cluster.plugin.resources.fixer.enabled | true | Intercepts the YARN errors YARN-3591 and YARN-6641 and rebuilds the missing plug-in class files on the fly. Setting the value to 'false' disables this behavior. |
Hive Properties | ||
dap.hive.use-datameer-file-splitter | false | Determines whether Datameer X uses Datameer's FileSplitter logic to generate RewindableFileSplits or Hive's own splitting logic. |
Windows Properties | ||
windows.das.join.disabled-strategies | MEMORY_BACKED_MAP_SIDE | By default, memory-backed joins are enabled on the Windows platform. To disable them, uncomment this property. |
mapreduce.input.linerecordreader.line.maxlength | 2097152 | Controls the maximum line size (in characters) allowed before the line is filtered out on read. The default value corresponds to a maximum of roughly 4 MB per line. |
Error Handling Properties | ||
workbook.error-handling.default | IGNORE | Legal values for workbook error handling are IGNORE, DROP_RECORD, ABORT_JOB. The equivalent error handling modes in the UI are Ignore, Skip and Abort. |
export-job.error-handling.default | DROP_RECORD | Legal values for export job error handling are IGNORE, DROP_RECORD, ABORT_JOB. The equivalent error handling modes in the UI are Ignore error, Drop record and Abort job. |
import-job.error-handling.default | DROP_RECORD | Legal values for import job error handling are DROP_RECORD, ABORT_JOB. The equivalent error handling modes in the UI are Drop record and Abort job. |
data-link.error-handling.default | DROP_RECORD | Legal values for data link error handling are DROP_RECORD, ABORT_JOB. The equivalent error handling modes in the UI are Drop record and Abort job. |
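
The legal values differ by job type: IGNORE is accepted for workbooks and export jobs but not for import jobs or data links. As a minimal sketch, the following Python snippet checks a set of property overrides against the legal values listed in the rows above. The function name and the idea of validating a plain dictionary are assumptions made for illustration; this is not a Datameer X API.

```python
# Legal values per property, copied from the table rows above.
LEGAL_MODES = {
    "workbook.error-handling.default": {"IGNORE", "DROP_RECORD", "ABORT_JOB"},
    "export-job.error-handling.default": {"IGNORE", "DROP_RECORD", "ABORT_JOB"},
    "import-job.error-handling.default": {"DROP_RECORD", "ABORT_JOB"},
    "data-link.error-handling.default": {"DROP_RECORD", "ABORT_JOB"},
}


def invalid_error_handling(props):
    """Return (name, value) pairs whose value is not legal for that property.

    `props` is a hypothetical dict of property overrides. Properties that
    are not error-handling defaults are ignored.
    """
    return [
        (name, value)
        for name, value in props.items()
        if name in LEGAL_MODES and value not in LEGAL_MODES[name]
    ]


if __name__ == "__main__":
    overrides = {
        "workbook.error-handling.default": "IGNORE",
        "import-job.error-handling.default": "IGNORE",  # not legal for imports
        "das.job.health.check": "true",  # unrelated property, skipped
    }
    for name, value in invalid_error_handling(overrides):
        print(f"{name}: {value} is not a legal value")
```
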
...