11.0 New and Noteworthy

JSON Parsing 

JSON data can now be extracted from a workbook, and the JSON parsing can be configured. The root structure shows the individual nodes, each with its object type and an example value. Each node creates a card, and depending on the node type, additional configuration options become available. You can select a single node or multiple nodes for extraction; each selected node is added with its specific path. A preview of the JSON extraction is available in the 'Preview' section. Furthermore, the raw JSON data can be viewed.
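The node view described above can be pictured with a small sketch. It is not Datameer X's implementation: the function name 'list_nodes', the JSONPath-like path notation, and the sample document are assumptions made for illustration only. The sketch walks a parsed JSON document and reports each node with its path, type, and an example value.

```python
import json


def list_nodes(value, path="$"):
    """Yield (path, type_name, example) for the root and every node below it."""
    if isinstance(value, dict):
        yield path, "object", None
        for key, child in value.items():
            yield from list_nodes(child, f"{path}.{key}")
    elif isinstance(value, list):
        yield path, "array", None
        if value:
            # The first element serves as the representative example of the array.
            yield from list_nodes(value[0], f"{path}[*]")
    else:
        # Scalar node: report the Python type name and the value as the example.
        yield path, type(value).__name__, value


# Hypothetical sample document, not taken from Datameer X.
raw = '{"order": {"id": 17, "items": [{"sku": "A-1", "qty": 2}]}}'
nodes = list(list_nodes(json.loads(raw)))

for node_path, type_name, example in nodes:
    print(node_path, type_name, example)
```

Selecting a node for extraction then corresponds to picking one or more of the listed paths, for example '$.order.items[*].sku'.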

Upgrade Tool 

Upgrading the Database

From now on, the database is upgraded automatically during a Datameer X upgrade by the new upgrade tool, which is available in the Datameer X distribution installation folder. For this, a compatible SQL connector JAR file must be provided.

Before the upgrade, the new upgrade tool automatically creates a database dump, validates the database schema, and selects only the necessary upgrades in the correct order. The execution log can be output on the command line.
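The idea of selecting only the necessary upgrades in the correct order can be sketched as follows. This is a minimal illustration under assumptions, not the upgrade tool's actual logic: the version tuples, script names, and the function 'select_upgrades' are hypothetical.

```python
def select_upgrades(current_version, available):
    """Return the upgrade steps newer than current_version, oldest first."""
    pending = [step for step in available if step[0] > current_version]
    return sorted(pending, key=lambda step: step[0])


# Hypothetical upgrade steps: (schema version, script name).
steps = [
    ((11, 0, 2), "add_index.sql"),
    ((10, 1, 0), "already_applied.sql"),
    ((11, 0, 1), "new_table.sql"),
]

# Assume the database schema is currently at version 10.1.0.
plan = select_upgrades((10, 1, 0), steps)
for version, script in plan:
    print(version, script)
```

Steps at or below the current schema version are skipped, and the remaining steps are applied in ascending version order.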

MariaDB is also supported as an alternative to MySQL.

Retriggering of the Search Index

With Datameer X v11, the search index is retriggered automatically.

Apache Knox WebHDFS & Apache Knox Hive Server2 JDBC

Apache Knox WebHDFS as a Connection

Apache Knox WebHDFS can be used as a connection to access an HDFS that is not the same HDFS where Datameer's Private Folder is stored, e.g., in a hybrid cloud or on a second cluster. The connector can be disabled in the 'Admin' tab.

Apache Knox Hive Server2 JDBC as a Connection

The Apache Knox Hive Server2 JDBC connection is needed when you have a Hive Server2 JDBC instance. A corresponding plug-in is part of the Hive Server2 plug-in. The plug-in can be disabled in the 'Admin' tab.

New Supported Hadoop Distributions

Support for Cloudera CDH

Datameer X now supports Cloudera CDH v6.3.2, CDH v7.0.3 and CDH v7.1.1.

Support for Amazon EMR

Datameer X now supports Amazon EMR v5.28.1, EMR v5.29.0, EMR v5.30.0 and EMR v6.0.0.

Support for Hortonworks HDP

Datameer X now supports Hortonworks HDP 3.1.4 and 3.1.5.

HBase Support

HBase 2.0 is now supported for Hortonworks HDP 3.1.0 and 3.1.4.

MapR Extension Package with MapR 6.3

Datameer X now supports MapR Extension Package with MapR 6.3.

Advanced Governance Updates

Column Obfuscation

The 'Column Obfuscation' plug-in is now configurable on the 'Admin' tab. The obfuscation algorithm can be modified and the encryption key uploaded.

Setup and Administration Updates

Google Cloud Storage as a Private Folder

Google Cloud Storage can now be used for the Datameer X Private Folder. This can be set up within the Google Cloud Dataproc connection.

Set up Google Cloud Dataproc on Datameer X

Google Cloud Dataproc can now be used as an execution engine if it is deployed against a Google Cloud Platform 1.4 cluster.

Multi-Group Sharing Plug-In

With the Multi-Group Sharing plug-in (/wiki/spaces/DASSB110/pages/20221232594), Datameer X provides a new extension point that allows using Access Control Lists to set up file and folder permissions for different users and groups in the HDFS.

Set up Datameer X on EMR

Both the cluster name and the ID are validated when the configuration is saved, to avoid errors later on.

New Housekeeping Service Property

To address performance issues, the new property 'housekeeping.run.task-attempts-per-run' was introduced. It can be set in the 'default.properties' file and is set to '50' by default. As a result, Housekeeping does not run all tasks again but only those that need to run again.
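As an illustration of how such a setting is read, the following sketch parses simple 'key=value' lines and falls back to the documented default of 50 when the property is absent. The parser 'read_properties', the sample file content, and the value 25 are assumptions for this example; they do not describe how Datameer X itself loads 'default.properties'.

```python
def read_properties(text):
    """Parse simple 'key=value' lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


PROPERTY = "housekeeping.run.task-attempts-per-run"
DEFAULT = 50  # default value stated in the release notes

# Illustrative excerpt; the value 25 is made up for this example.
sample = "# excerpt of default.properties\nhousekeeping.run.task-attempts-per-run=25\n"

configured = int(read_properties(sample).get(PROPERTY, DEFAULT))
fallback = int(read_properties("").get(PROPERTY, DEFAULT))
print(configured, fallback)
```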

UI Change in Admin Tab - Cluster Configuration

The former 'Hadoop Cluster' section in the 'Admin' tab was renamed to 'Cluster Configuration'.

Neebo JDBC Dialect Introduction

Datameer X comes with the new Neebo dialect in the 'Database Drivers' section of the 'Admin' UI. Contact support@datameer.com to receive the JDBC driver to connect to Neebo.


Import & Export Updates

Tableau Export as a Connection

Setting 'das.splitting.disable-combining=true' as a custom property makes Tableau export jobs run significantly faster. Datameer X now verifies that exporting a sheet to a Tableau Server launches multiple tasks and not just one.

You can now select the authentication mode when configuring Tableau as a connector: 'Username/ Password' or 'Personal Access Token'.

Snowflake as a Connection

In Datameer X, a connection to the Snowflake data warehouse can now be established.

Azure Cosmos DB as a Connection

You can now import and export through an Azure Cosmos DB connection.

Azure Databricks as a Connection

Importing from Azure Databricks is now implemented.

Importing Data with Amazon Athena Dialect

Datameer X now supports importing data with the Amazon Athena database dialect.

Azure Data Lake Store Gen 2 as a Connection

You can now connect to Azure Data Lake Store Gen 2. The connection can be built to an existing HDInsight Data Lake Store (ADL). Both import and export are supported.

Working with Workbooks Updates

Binned Encoding

The button 'Add new Divider' adds an additional bucket and recalculates the percentile size and the corresponding absolute values. When a new divider is added, the chart is re-rendered and shows the proper values after each percentile change.
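The recalculation can be illustrated with a small worked example. It assumes that the buckets are of equal percentile size, which these release notes do not state explicitly, and it uses Python's standard 'statistics.quantiles' function rather than anything from Datameer X. The function name 'buckets' and the sample values are hypothetical.

```python
from statistics import quantiles


def buckets(values, dividers):
    """Split values into dividers + 1 buckets of equal percentile size.

    Returns the percentile size of each bucket and the absolute values
    at which the dividers sit.
    """
    count = dividers + 1
    percentile_size = 100 / count
    cut_points = quantiles(values, n=count, method="inclusive")
    return percentile_size, cut_points


values = list(range(101))  # sample data: 0, 1, ..., 100

size_before, cuts_before = buckets(values, 3)  # three dividers, four buckets
size_after, cuts_after = buckets(values, 4)    # one divider added, five buckets

print(size_before, cuts_before)  # 25.0 [25.0, 50.0, 75.0]
print(size_after, cuts_after)    # 20.0 [20.0, 40.0, 60.0, 80.0]
```

Adding one divider shrinks each bucket from 25 to 20 percentile points and moves every divider to a new absolute value, which is why the chart has to be drawn again.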

Quick Column Sorting

A third state 'None' is available for quick column sorting.

Commentary Function in SQL Sheets

You can now insert comments in the SQL Editor in an SQL sheet. Comments are allowed for a single line or multiple lines.
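The sketch below shows what a commented query can look like. It assumes the standard SQL comment markers ('--' for a single line, '/* ... */' for multiple lines) and runs the query against an in-memory SQLite database as a stand-in; the exact syntax accepted by the Datameer X SQL Editor may differ, and the table and column names are made up.

```python
import sqlite3

query = """
-- single-line comment: count the orders per region
SELECT region, COUNT(*) AS orders
/* multi-line comment:
   the table and column names are hypothetical */
FROM sales
GROUP BY region
ORDER BY region
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EU", 10.0), ("EU", 5.0), ("US", 7.5)],
)
# The comments are ignored by the SQL parser; only the statement is executed.
rows = conn.execute(query).fetchall()
conn.close()

print(rows)
```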

Workbook Inspector - Search in the Sheet Inspector

You can now search for a column within the Sheet Inspector. Enter at least one character of a column name to view the results. You can go to the column directly by clicking the search result. All applied formulas are also listed in the 'Columns' section.

SHIFTTIMEZONE function

The documentation of this function has been updated.

Developer's Updates

Workbook Health Check

Datameer X provides a new list of the Workbook Health Check versions for each release.

REST API Creating a Folder

A folder is created when a connection is created or updated, provided that all capabilities and permissions are sufficient.

REST API for Workbook Variables

Workbook variables can now be set via the REST API while executing a job.

REST API Lookups

The new Lookups REST API returns information about the entity referenced by an existing UUID.

REST API Job Commands

The response of the request 'List Active Jobs' now contains all jobs with the job status 'RUNNING' or 'QUEUED'.
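Expressed as a filter, the rule looks like the sketch below. The job records, their field names, and the 'COMPLETED' status are hypothetical and do not reflect the actual response format of the REST API; only the two statuses 'RUNNING' and 'QUEUED' come from these release notes.

```python
ACTIVE_STATUSES = {"RUNNING", "QUEUED"}


def active_jobs(jobs):
    """Keep only the jobs whose status counts as active."""
    return [job for job in jobs if job["status"] in ACTIVE_STATUSES]


# Hypothetical job records for illustration.
jobs = [
    {"id": 1, "status": "RUNNING"},
    {"id": 2, "status": "COMPLETED"},
    {"id": 3, "status": "QUEUED"},
]

active = active_jobs(jobs)
print([job["id"] for job in active])
```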

Public User Documentation Updates

Setup and Installation

The page 'Custom Properties' has been moved to a subpage of 'General Setup Information' and updated. The new page provides general information about custom properties and a how-to.

General Information on Connections

This new documentation page provides general information about creating, deleting, and editing connections.

Working with Workbooks

The general information documentation for the SQL function 'CAST' has been moved to a subpage of 'Using SQL Sheets', which is a how-to documentation.

The documentation for 'Join Data' has been split into 'Joining General Information', which provides general information, and 'Joining Data', which is a how-to documentation.

The documentation for 'Hiding and Expanding Columns' within a workbook, 'Removing Columns', 'Reordering Columns', 'Resizing Columns' as well as 'Splitting Columns' has been moved to the /wiki/spaces/DASSB110/pages/20217740613 section.

The section 'Adding a Sheet' is now available on the documentation page 'Working with Sheets'.

Guide for Developers

The whole REST API documentation is now available in a new design.

Glossary

New entries have been added to the glossary: AES, CBC, DAS, NMU, Spark, AWS Athena, Teradata Aster, HSQL file, Neebo and UUID.