Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Table of Contents

SQL Editor Worksheet (Beta)

Note

SQL worksheets are available as of Datameer 7.1 as a beta feature.

An SQL editor interface has been added to Datameer. Creating an SQL worksheet gives you the ability to write SQL queries directly in a workbook. Users working within SQL worksheets can utilize this feature to combine processes that might take several steps using the traditional interface.

Updated Amazon S3 Connector

The new Amazon S3 Native connector has been added and gives a boost to performance compared to the original S3 connector. The performance improvement comes from multipart technology in the uploading parameters.

This connector also has the ability to export to S3 buckets even without access to read the getObjectMetadata()method. 

Image Added

Currently, the S3 Native connector is available for export jobs with the import feature scheduled for a Datameer update in the near future.

Exporting data

HiveServer2 can export using the original format type

Exports to HiveServer2 into existing partitioned tables are written in the origin format type. (E.g., Text file, SequenceFile, Rcfile, Avro, Orc, Parquet).

Comprehensive improvements for exporting into HiveServer2

New improvements exporting into existing partitioned Hive tables.

Output format

Whereas it was only possible to write in Hive tables formatted as a textfile, Datameer now has the ability to export to tables which are formatted as a sequencefile, rcfile, avro, orc, or parquet. This allows you to benefit from the advantages of these fast formats.

Hive data type support

In Datameer v6.4 we extended the list of supported Hive data types for export with decimal, date, and timestamp. Now we extend this list again with tinyint, smallint, int, float, and varchar. This allows you to optimize Hive tables to the data types you really need while ensuring full compatibility to Datameer's Hive export.

Datameer Value Typevoidbooleantinyintsmallintintbigintfloatdoubledecimalstringvarcharchartimestampdatebinaryarray <data_type>
STRING








(tick)(tick)(tick)



BOOLEAN
(tick)













FLOAT







(tick)(tick)







INTEGER

(tick)(tick)(tick)(tick)











DATE








(tick)

(tick)(tick)

BIG_DECIMAL







(tick)






BIG_INTEGER




(tick)









LIST














(tick)

Partition support

Until now, the advantages of partitioning could only be used for importing data. Datameer v7.1 brings you the ability to export into existing Hive partitions for the first time. In addition, Datameer not only supports date partitions (as for importing data) but all available types such as string partitions.

Mapping validation

The order of the columns in a workbook and in a Hive table no longer need to be identical. Also, mapping errors are now clearly marked in a red color (in the UI) for quicker troubleshooting.

For data links, users have the option to generate sample data immediately after saving the data link (this saves the sample data to the DM private folder during the data link ingestion) or deferring sample data generation until using the data link in a workbook. Deferring sample generation is useful for when the data link is available to multiple users but each user may have different access to the original source of the data link. When deferring sample data, the sample data is created and stored in the DM private folder by the workbook user rather than the data link creator. This sample data access still follows impersonation security policies (e.g., Sentry/Ranger). 

Image Added

User Permissions

Cascading permissions

...

Schedules created with non complex cron patterns are converted automatically in the inspector. Select Schedule Type and Custom from the drop down menu to view or edit the schedule cron pattern

Additions to Supported Hadoop Distributions

Older Hadoop distributions might no longer be supported as of Datameer v7.0. See Supported Hadoop Distributions for all supported distributions

Support for Hive Server 1 and Hive Server 2 Using HDP 2.6

HDP 2.6 supports the use of Hive 1.2 and Hive 2.1.

Datameer's default HDP packaging supports Hive 2.1. In order to use Hive 1.2 with HDP 2.6, you need to download and install the HDP Datameer artifact with the suffix "-hive1".