7.4 New and Noteworthy
Search Feature
Datameer has implemented a search feature giving you a fast and convenient way to find the files you need.
Based on Apache Lucene technology, the search feature is a powerful tool for both basic and advanced queries.
Tag words
You can now tag files in Datameer. Tags let you categorize files and make them easier to find when you need them.
Visual Explorer Worksheet Improvements
Details sheet
When analyzing your data with Visual Explorer, you can now choose to create a worksheet that brings in all the relevant details from your source sheet with your newly applied filters.
Metadata summary in the worksheet inspector
In the worksheet inspector, the summary section displays the following metadata for the visual exploration sheet:
- The name of the source sheet used.
- A description of the chart configuration used to generate the sheet.
- A description of any exploration filters added from the visual exploration. Workbook filters added from Filtering Data are not displayed in this section.
WalkMe Guided Training
Datameer has added WalkMe's Digital Adoption Platform to provide you with on-screen, step-by-step guidance with Smart Walk-Thrus of core Datameer functions and capabilities.
Back Up and Restore Data
Datameer users can create a backup copy of their data as a .zip file and restore that data to Datameer from either the REST API or the user interface.
Remote Data Browser
Datameer has added a Remote Data Browser feature for HDFS, S3, SFTP, and SSH import jobs and data links. This file browser gives you a visual interface to select the file you need from your connector.
The filter box at the top of the Remote Data Browser can be used to find folders/files within the current directory.
Deduplication - Removing Redundant Data
The deduplication feature eliminates duplicate/redundant data from your worksheets. This procedure can be performed across all columns of a worksheet or for specifically selected columns.
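The difference between deduplicating across all columns and across selected columns can be sketched in a few lines of Python. This is only an illustration of the idea, not Datameer's implementation: the `deduplicate` function, the sample rows, and the choice to keep the first occurrence of each duplicate are all assumptions.

```python
def deduplicate(rows, key_columns=None):
    """Return rows with duplicates removed, keeping the first occurrence.

    rows is a list of dicts (one dict per worksheet row, hypothetical layout).
    If key_columns is None, two rows are duplicates only when every column
    matches; otherwise only the listed columns are compared.
    """
    seen = set()
    result = []
    for row in rows:
        if key_columns is None:
            key = tuple(sorted(row.items()))
        else:
            key = tuple(row[column] for column in key_columns)
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


# Hypothetical sample data.
rows = [
    {"id": 1, "city": "Berlin"},
    {"id": 1, "city": "Berlin"},
    {"id": 2, "city": "Berlin"},
]

all_columns = deduplicate(rows)              # compares id and city
city_only = deduplicate(rows, ["city"])      # compares city only
```

With the sample data, comparing all columns removes only the exact duplicate row, while comparing on `city` alone collapses all three rows into one.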
Partition Aware Workbook Filters on Hive Data Links
Performance has been improved when a filter set on a partitioned workbook column is pushed down to your Hive data link. Previously, all data in the Hive table was read and then filtered, not just the partition for your column's filter value. Now, only the selected partition is read and filtered.
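The effect of a partition-aware filter can be illustrated with a small Python sketch. It uses a hypothetical in-memory dictionary as a stand-in for a Hive table partitioned by a `country` column; the function names and data are assumptions and do not reflect how Datameer or Hive implement the feature.

```python
# Hypothetical stand-in for a Hive table partitioned by "country":
# each key is a partition value, each value holds that partition's rows.
partitions = {
    "DE": [{"country": "DE", "amount": 10}, {"country": "DE", "amount": 25}],
    "US": [{"country": "US", "amount": 40}],
    "FR": [{"country": "FR", "amount": 5}],
}


def full_scan(partitions, country):
    """Read every row of every partition, then filter (the old behavior)."""
    rows_read = 0
    matches = []
    for rows in partitions.values():
        for row in rows:
            rows_read += 1
            if row["country"] == country:
                matches.append(row)
    return matches, rows_read


def pruned_scan(partitions, country):
    """Read only the partition that matches the filter (the new behavior)."""
    rows = partitions.get(country, [])
    return list(rows), len(rows)


full_matches, full_rows_read = full_scan(partitions, "DE")
pruned_matches, pruned_rows_read = pruned_scan(partitions, "DE")
```

Both scans return the same rows, but the pruned scan reads fewer rows because it never touches the partitions that the filter excludes.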
Date and Time Workbook Functions
The following Date & Time workbook functions have been added:
Function Name | Description |
---|---|
CEILINGDATE | Rounds a date argument up to the beginning of the next date interval. |
DAYOFYEAR | Returns the day of the year for the supplied date in a range from 1 to 366. |
FLOORDATE | Rounds a date argument down to the beginning of the current date interval. |
QUARTER | Returns the quarter of the year for the supplied date in a range from 1 to 4. |
TIMESTAMPDIFF | Returns the number of whole date intervals between two date arguments. |
WEEKOFYEAR | Returns the week of the year for the supplied date in a range from 1 to 53. |
The function ADDTODATE has been updated to also support the date constant arguments Quarter (q) and Week (w).
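To make the descriptions above concrete, here is a rough Python sketch of the rounding and date-part semantics. It is not Datameer's implementation and does not use Datameer's function syntax. The helper names and unit strings are hypothetical, the week number follows the ISO 8601 definition (Datameer's week definition may differ), and the ceiling of a date that already sits on an interval boundary is assumed to be the start of the following interval.

```python
from datetime import datetime, timedelta


def quarter(d):
    """Quarter of the year, 1 to 4."""
    return (d.month - 1) // 3 + 1


def day_of_year(d):
    """Day of the year, 1 to 366."""
    return d.timetuple().tm_yday


def week_of_year(d):
    """ISO 8601 week number, 1 to 53 (assumed week definition)."""
    return d.isocalendar()[1]


def floor_date(d, unit):
    """Round down to the beginning of the current interval."""
    if unit == "year":
        return datetime(d.year, 1, 1)
    if unit == "quarter":
        return datetime(d.year, 3 * (quarter(d) - 1) + 1, 1)
    if unit == "month":
        return datetime(d.year, d.month, 1)
    if unit == "day":
        return datetime(d.year, d.month, d.day)
    raise ValueError("unsupported unit: " + unit)


def _add_months(start, months):
    """Shift a first-of-month datetime by a number of months."""
    index = start.month - 1 + months
    return datetime(start.year + index // 12, index % 12 + 1, 1)


def ceiling_date(d, unit):
    """Round up to the beginning of the next interval."""
    start = floor_date(d, unit)
    if unit == "year":
        return _add_months(start, 12)
    if unit == "quarter":
        return _add_months(start, 3)
    if unit == "month":
        return _add_months(start, 1)
    return start + timedelta(days=1)  # unit == "day"


sample = datetime(2019, 5, 17, 14, 30)
floor_quarter = floor_date(sample, "quarter")      # 2019-04-01 00:00
ceiling_quarter = ceiling_date(sample, "quarter")  # 2019-07-01 00:00
```

For the sample date, flooring and ceiling to the quarter give the first day of the second and third quarter of 2019, respectively.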
Support for OpenJDK
Datameer now supports OpenJDK, a free and open-source implementation of the Java Platform, Standard Edition (Java SE). It can be used as an alternative to Oracle's JDK.
Support Added for MariaDB
Datameer now supports MariaDB, a community-developed relational database management system intended to remain free to use.
Support Added for Amazon Corretto OpenJDK
Datameer now supports Amazon Corretto, a free distribution of OpenJDK that is compatible with the Java SE standard and can be used as an alternative to Oracle's JDK.
Pause the Job Scheduler when Starting Datameer
If you start or restart Datameer with previously scheduled jobs, those jobs run immediately and, depending on other data chains, can trigger more jobs. This can lead to errors and other problems.
To account for this situation, a new optional parameter, --jobschedulerPaused, can be passed when starting or restarting Datameer:
bin/conductor.sh start --jobschedulerPaused
Configuring the Number of Job Scheduler and Event Bus Threads
Previously, a single thread count was shared by the Job Scheduler and the Event Bus. As a result, some jobs became blocked because the Event Bus was using the available threads.
Now, the thread count settings for the Job Scheduler and Event Bus have been separated and can be configured from within Datameer.
Restoring Configuration Settings
A new properties file (overrides.properties), created and stored off of the Datameer directory, has been added to help with the upgrade process. This file holds configuration settings that override the default settings.
Learn more about overrides.properties in the Upgrade Guide.
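The general idea of an overrides file layered on top of defaults can be sketched as follows. This is a simplified Python illustration, not a description of how Datameer loads its configuration: the property names are invented, the parser handles only plain `key=value` lines and comments, and the real file format and precedence rules are documented in the Upgrade Guide.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Hypothetical default settings shipped with an installation.
defaults = parse_properties("""
# defaults (property names are invented for this example)
example.thread.count=4
example.log.level=INFO
""")

# Hypothetical contents of an overrides file kept across upgrades.
overrides = parse_properties("""
example.thread.count=8
""")

# Settings from the overrides file win over the defaults.
effective = {**defaults, **overrides}
```

Because the overrides are kept separately from the defaults, an upgrade can replace the default files without losing local changes.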
Retrieving Essential Log and System Information for Support
A new plug-in called "Support Engineer Report" standardizes the process of obtaining the system information and logs needed by the Datameer customer services department. The plug-in collects all log files contained in <datameerInstallFolder>/logs as well as standardized general system information.
Encryption Support for the S3 Connector
Datameer now supports encryption with AES-256 and SSE-KMS. This option is available when configuring an S3 connector.
Additions to Supported Hadoop Distributions
Older Hadoop distributions might no longer be supported as of Datameer v7.0. See Supported Hadoop Distributions for all supported distributions.
Parquet Import: Support for Advanced Data Types
Datameer now supports importing Parquet files that contain DECIMAL columns stored as binary, fixed, int32, or int64 types.
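For background, the Parquet format stores a DECIMAL value as an unscaled integer plus a scale taken from the column metadata. With int32 and int64 the unscaled integer is stored directly, and with binary or fixed-length byte arrays it is stored as big-endian two's complement bytes. The Python sketch below shows how such values map to decimal numbers. It illustrates the encoding only and is not Datameer's import code; the function names are hypothetical.

```python
from decimal import Decimal


def decimal_from_unscaled(unscaled, scale):
    """Build an exact Decimal from an unscaled integer (int32/int64 case)."""
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(ch) for ch in str(abs(unscaled)))
    return Decimal((sign, digits, -scale))


def decimal_from_bytes(raw, scale):
    """Decode big-endian two's complement bytes (binary/fixed case)."""
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    return decimal_from_unscaled(unscaled, scale)


price = decimal_from_unscaled(12345, 2)          # int32/int64 storage
same_price = decimal_from_bytes(b"\x30\x39", 2)  # binary/fixed storage
negative = decimal_from_bytes(b"\xff\x85", 2)    # negative value
```

Both storage forms decode to the same number when the unscaled integer and scale are the same.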