...
Name | Default Value | Description |
---|---|---|
fs.AbstractFileSystem.hdfs.impl | org.apache.hadoop.fs.Hdfs | Required to start the YARN application successfully, because the DAS framework strips this property out while setting up the overlay. |
io.compression.codecs.addition | datameer.dap.common.util.ZipCodec,datameer.dap.common.util.LzwCodec | Additional comma-separated compression codecs that are added to io.compression.codecs. Required to start the YARN application successfully, because the DAS framework strips this property out while setting up the overlay. |
das.yarn.available-node-vcores | auto | This property sets the number of available node vcores and is used in order to calculate the optimal number of splits and tasks (numberOfNodes * das.yarn.available-node-vcores). It can be set either to the number of free CPUs per node, or it can be set to auto. In auto mode, Datameer fetches information about the vCores from every node and sets das.yarn.available-node-vcores to the average of the available vCores. |
das.job.health.check | true | Turns on the DAS job configuration health check for the cluster configuration. |
das.local.exec.map.count | 5 | DAS property to influence the number of mappers in Local Execution mode. |
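
The calculation described for das.yarn.available-node-vcores (numberOfNodes * das.yarn.available-node-vcores, with auto taking the average across nodes) can be sketched in Python as follows. This is only an illustration under stated assumptions, not Datameer's implementation: the function names are hypothetical, and rounding the average down to a whole number is an assumption, since the rounding rule is not documented here.

```python
def resolve_node_vcores(setting, vcores_per_node):
    """Resolve das.yarn.available-node-vcores to a number of vcores per node.

    setting: the property value, either "auto" or a number as a string.
    vcores_per_node: available vcores reported by each node (hypothetical input).
    """
    if not vcores_per_node:
        raise ValueError("at least one node is required")
    if setting == "auto":
        # Assumption: the average is rounded down and is never below 1.
        return max(1, sum(vcores_per_node) // len(vcores_per_node))
    return int(setting)


def optimal_task_count(setting, vcores_per_node):
    """numberOfNodes * das.yarn.available-node-vcores, as described above."""
    number_of_nodes = len(vcores_per_node)
    return number_of_nodes * resolve_node_vcores(setting, vcores_per_node)
```

For example, with four nodes reporting 8, 8, 4 and 4 available vcores, `optimal_task_count("auto", [8, 8, 4, 4])` uses an average of 6 vcores per node, while `optimal_task_count("4", [8, 8, 4, 4])` uses the explicit value 4.
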

List of Hive Properties

Name | Default Value | Description |
---|---|---|
dap.hive.use-datameer-file-splitter | false | Determines whether Datameer uses its own FileSplitter logic to generate RewindableFileSplits or Hive's own splitting logic. |

List of Execution Framework Properties

Name | Default Value | Description |
---|---|---|
General | ||
das.execution-framework | Tez | Sets the execution framework. For example, 'das.execution-framework=Spark' runs Spark as the execution framework and 'das.execution-framework=Tez' runs Tez as the execution framework. |
Spark | ||
TEZ | ||
das.tez.session-pool.max-cached-sessions | auto | Maximum number of idle sessions held in the cache. 0 means no session pooling; auto uses the number of nodes in the cluster as the maximum number of sessions. |
das.tez.session-pool.max-idle-time | 2m | How long a session held in the pool waits for a DAG to be submitted before it times out. |
das.tez.session-pool.max-time-to-live | 2h | Maximum amount of time a Tez session stays alive, measured from its start time, regardless of whether it is idle or active. |
tez.runtime.compress | true | Enables compression of intermediate data with Tez. |
tez.runtime.compress.codec | org.apache.hadoop.io.compress.SnappyCodec | Codec used for intermediate compression with Tez. |
framework.local.das.parquet-storage.max-parquet-block-size | 67108864 | Sets the maximum Parquet block size for the local execution framework, 64MB by default. The value can be lowered but not raised beyond 64MB. |
tez.shuffle-vertex-manager.desired-task-input-size | 52428800 | Sets the input size for each TEZ sub-task. |
das.cluster.plugin.resources.fixer.enabled | true | Intercepts the YARN errors YARN-3591 and YARN-6641 and rebuilds the missing plug-in class files on the fly. Setting the value to 'false' disables this behavior. |
YARN MR2 | ||
das.yarn.base-counter-wait-time | 40000 | YARN counters take a while to show up at the Job History Server, so Datameer waits for this base wait time plus a small factor multiplied by the number of executed tasks for the counters to show up. |
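
The wait time described for das.yarn.base-counter-wait-time (base wait time plus a small factor times the number of executed tasks) can be sketched in Python as follows. The function name and the per-task factor of 10 are hypothetical, chosen only for illustration; the actual factor and the unit of the property value are not stated here.

```python
def counter_wait_time(executed_tasks, base_wait_time=40000, per_task_factor=10):
    """Total time to wait for YARN counters to show up.

    base_wait_time: value of das.yarn.base-counter-wait-time (default 40000).
    per_task_factor: hypothetical "small factor"; the real value is not documented here.
    The result is in the same unit as base_wait_time.
    """
    if executed_tasks < 0:
        raise ValueError("executed_tasks must not be negative")
    return base_wait_time + per_task_factor * executed_tasks
```

Under these assumptions, a job with 500 executed tasks waits for `counter_wait_time(500)`, that is, the base wait time plus 500 times the assumed factor.
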

List of Windows Properties

Name | Default Value | Description |
---|---|---|
windows.das.join.disabled-strategies | MEMORY_BACKED_MAP_SIDE | By default, memory-backed joins are enabled on the Windows platform. To disable them, uncomment this property. |
mapreduce.input.linerecordreader.line.maxlength | 2097152 | Controls the maximum line size (in characters); longer lines are filtered out on read. The default value corresponds to a maximum of around 4MB per line. |
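
The relationship between the character limit and the stated size of around 4MB per line can be checked with a short Python calculation. It assumes 2 bytes per character (as for Java's in-memory UTF-16 strings); that width is an assumption made for this sketch, and the function name is hypothetical.

```python
def max_line_size_mib(max_chars, bytes_per_char=2):
    """Approximate size in MiB of a line with max_chars characters.

    bytes_per_char=2 is an assumption (UTF-16 code units), not a documented value.
    """
    return max_chars * bytes_per_char / (1024 * 1024)


# 2097152 characters at an assumed 2 bytes each
print(max_line_size_mib(2097152))  # 4.0
```
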

List of Error Handling Properties

Name | Default Value | Description |
---|---|---|
workbook.error-handling.default | IGNORE | Legal values for workbook error handling are IGNORE, DROP_RECORD, ABORT_JOB. The equivalent error handling modes in the UI are Ignore, Skip and Abort. |
export-job.error-handling.default | DROP_RECORD | Legal values for export job error handling are IGNORE, DROP_RECORD, ABORT_JOB. The equivalent error handling modes in the UI are Ignore error, Drop record and Abort job. |
import-job.error-handling.default | DROP_RECORD | Legal values for import job error handling are DROP_RECORD, ABORT_JOB. The equivalent error handling modes in the UI are Drop record and Abort job. |
data-link.error-handling.default | DROP_RECORD | Legal values for data link error handling are DROP_RECORD, ABORT_JOB. The equivalent error handling modes in the UI are Drop record and Abort job. |
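
The legal values listed in the table above can be captured in a small Python check, for example to validate a set of custom properties before applying them. This is a sketch only: the helper name `validate_error_handling` and the idea of passing properties as a dictionary are assumptions for illustration, not part of Datameer.

```python
# Legal values per property, taken from the table above.
LEGAL_ERROR_HANDLING = {
    "workbook.error-handling.default": {"IGNORE", "DROP_RECORD", "ABORT_JOB"},
    "export-job.error-handling.default": {"IGNORE", "DROP_RECORD", "ABORT_JOB"},
    "import-job.error-handling.default": {"DROP_RECORD", "ABORT_JOB"},
    "data-link.error-handling.default": {"DROP_RECORD", "ABORT_JOB"},
}


def validate_error_handling(properties):
    """Return a list of messages for error-handling properties with illegal values.

    properties: dict of property name to value (hypothetical input format).
    Properties that are not error-handling properties are ignored.
    """
    problems = []
    for name, value in properties.items():
        legal = LEGAL_ERROR_HANDLING.get(name)
        if legal is not None and value not in legal:
            problems.append(
                f"{name}={value} is not legal; use one of {sorted(legal)}"
            )
    return problems
```

For instance, `validate_error_handling({"import-job.error-handling.default": "IGNORE"})` reports one problem, because IGNORE is legal for workbooks and export jobs but not for import jobs or data links.
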
...