Upgrade Guide

Upgrading From an Older Release

Datameer customers can download updated software versions from http://my.datameer.com

To upgrade from an older release, you should first back up your data. Then you will need to upgrade the application and upgrade the database.

Upgrading to version 6.3

Parquet is the new default storage format

Datameer now uses Parquet as its default storage format as opposed to the previous sequence storage format. Sequence Files are still readable in Datameer but all new artifacts are written in Parquet. Depending on the configuration of your jobs, it might be difficult to roll back after upgrading to Datameer 6.3.

There is a property that forces Datameer to use Sequence Files instead of Parquet if a rollback is required. Under the JAVA_OPTIONS in etc/das-env.sh, add "-Duse-sequence-storage=true" before starting Datameer for the first time following the upgrade. This property has been removed as of Datameer 7.0.
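As a rough illustration of the setting described above, the lines below show one way the flag could be appended in etc/das-env.sh. This is a sketch that assumes JAVA_OPTIONS is an ordinary shell variable exported by that file; the actual layout of das-env.sh in your installation may differ.

```shell
# Hypothetical excerpt for etc/das-env.sh (assumes JAVA_OPTIONS is a plain
# shell variable that the startup scripts pass on to the JVM).
# Keep whatever options are already set and append the rollback flag.
JAVA_OPTIONS="${JAVA_OPTIONS:-} -Duse-sequence-storage=true"
export JAVA_OPTIONS
```

The flag has to be in place before Datameer is started for the first time after the upgrade.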

If upgrading from a Datameer version before 6.1, we highly recommend first upgrading to Datameer 6.1 and verifying that all job functions work without the MapReduce framework before upgrading to Datameer 6.3.

In 6.3, Datameer uses Parquet as its default storage format. While Datameer can still read data written in the sequence file format, all artifacts are now stored using the Parquet file format. Datameer still uses the sequence file format for writing intermediate files and preview files for performance reasons, but the final artifacts are saved using the Parquet format. Due to this change, it might not be possible to roll back after upgrading to Datameer 6.3, depending on the configuration of your artifacts. For example, if import jobs and workbooks are configured to purge historical data and keep only the most recent result, the final artifacts are stored as Parquet files, and prior versions of Datameer cannot read them.

External authentication changes

As of Datameer 6.3, you must use ports 389/636 to connect to LDAP or ports 3268/3269 to connect to Active Directory.

Upgrading to versions 7.0+

Ensure that the Datameer application server, as well as all data nodes, have Java 1.8 installed (Oracle recommended).
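One way to confirm the Java requirement on a host is to inspect the banner printed by `java -version`. The helper below is a minimal sketch; the function name `is_java_8` is hypothetical, and it only checks for a 1.8 version string, not for the vendor.

```shell
# Returns 0 when the given `java -version` banner reports a 1.8 runtime,
# and 1 otherwise. The banner is passed in as the first argument.
is_java_8() {
  case "$1" in
    *'version "1.8.'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use on the application server and on each data node
# (java -version writes its banner to stderr, hence the redirect):
#   is_java_8 "$(java -version 2>&1)" && echo "Java 1.8 found"
```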

Upgrading to version 7.2

If you are upgrading to Datameer 7.2 from a 6.1.x or earlier version, you must first upgrade to a 7.1.12 or later 7.1.x release in order to trigger an important database schema migration process.
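The rule above can be expressed as a small check on the currently installed version. This is only an illustrative sketch: the function name `needs_intermediate_upgrade` is hypothetical, and it assumes the version is given in major.minor[.patch] form.

```shell
# Returns 0 if the given current version is 6.1.x or earlier, meaning an
# intermediate upgrade to 7.1.12 or a later 7.1.x release is needed before
# moving to 7.2. Returns 1 otherwise.
needs_intermediate_upgrade() {
  major=${1%%.*}      # text before the first dot
  rest=${1#*.}        # text after the first dot
  minor=${rest%%.*}   # second version component
  [ "$major" -lt 6 ] || { [ "$major" -eq 6 ] && [ "$minor" -le 1 ]; }
}

# Example:
#   needs_intermediate_upgrade 6.1.5 && echo "upgrade to 7.1.12+ first"
```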

Upgrading with Kerberos

Keep the following in mind if you use Kerberos:

  • If you used Kerberos prior to 5.11, you need to install the plug-in during upgrade.

  • If you are running a Kerberos-secured cluster with impersonation enabled, you need to run the secure_hdfs_tool.sh command line tool when upgrading to versions 6 and above. For versions 6.1 and above, run the command after upgrading, starting, and stopping Datameer. Once you've run the command, restart Datameer.

Migrating workbooks and other artifacts when upgrading

Workbooks and other JSON files downloaded as a backup in older versions of Datameer are not supported in newer versions of Datameer.

When upgrading to a newer version of Datameer, ensure that all needed workbooks and files are part of the migration process so that they can be used in your new version of Datameer.

Upgrade Planning

  • Before upgrading Datameer, a plan should be made for application downtime scheduling, notifications, and creating a maintenance window for Datameer. Please take note of the system requirements and consider upgrading the MySQL service for Datameer's application database from 5.1 to 5.5 or 5.6, if applicable.

  • Backed up Datameer files need to be included in the migration process. Previous files that aren't part of the migration process aren't supported in newer versions.

  • Jobs need to be stopped from executing prior to the upgrade process, for example by pausing the Job Scheduler.

  • Request assistance from the database administrator if the database credentials are not available to the application owner.

  • Custom plug-ins that were created using the Plug-in SDK of a former Datameer version need to be recompiled.

  • Since copying old configuration files or scripts to the new location is not recommended, note the changes you made in the previous setup. You will need this information to make the necessary changes in the new configuration files.

    • As of Datameer 7.4, a property file exists so Job Scheduler and Event Bus settings adjusted in the UI are read during Datameer start up.

      • The properties file, called "overrides.properties", isn't written to the Datameer installation but to your system's HOME folder. (Path = <home>/.datameer/overrides.properties)

      • This property file is auto-created when adjusting the Event Bus or Job Scheduler settings under the Admin tab. This file can also be manually added and edited.

      • This is the last properties file read on startup (e.g., it overrides other property files like default.properties and deployMode.properties).

      • This file can be modified to allow for storing and the restoring of custom properties.
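Because overrides.properties lives outside the Datameer installation directory, it is easy to overlook when noting down previous settings. The following sketch copies it to a backup location; the function name `backup_overrides` and the backup directory are hypothetical, and only the <home>/.datameer/overrides.properties path comes from this guide.

```shell
# Copies <home>/.datameer/overrides.properties into a backup directory.
# Args: $1 = home directory of the user running Datameer
#       $2 = backup directory (created if it does not exist)
backup_overrides() {
  src="$1/.datameer/overrides.properties"
  if [ ! -f "$src" ]; then
    echo "no overrides.properties found at $src" >&2
    return 1
  fi
  # -p preserves the modification time and permissions of the original
  mkdir -p "$2" && cp -p "$src" "$2/overrides.properties"
}

# Example:
#   backup_overrides "$HOME" /path/to/backup
```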

Disabling Housekeeping/Compaction services

In the event of a problem after a Datameer upgrade, it is possible to roll back to the previous release. To allow for the possibility of a roll back and avoid any potential data loss, Datameer strongly recommends disabling the Housekeeping and Compaction services before beginning the upgrade process. Once upgrade validation is complete and the environment is considered stable, the Housekeeping and Compaction services can be re-enabled.

  1. Open the /<Datameer new version installation directory>/conf/default.properties file.

  2. Disable Housekeeping by setting the property housekeeping.enabled to false.

  3. Disable Compaction by setting the property auto-compaction.enabled to false.

  4. Start Datameer and perform upgrade validation testing.

  5. Once the upgrade is validated and considered successful, turn Housekeeping and Compaction back on by setting the above-mentioned properties back to true.

Restarting Datameer is required to apply these changes.
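Steps 2, 3, and 5 above can also be scripted rather than edited by hand. The helper below is a sketch under the assumption that default.properties uses plain key=value lines; the function name `set_property` is hypothetical, while the two property names are the ones given in the steps.

```shell
# Sets key=value in a Java-style properties file. An existing assignment of
# the key is replaced; if the key is absent, the assignment is appended.
# Args: $1 = properties file, $2 = key, $3 = value
set_property() {
  prop_file="$1"
  prop_tmp="$1.tmp.$$"
  awk -v k="$2" -v v="$3" '
    BEGIN { FS = "=" }
    {
      name = $1
      gsub(/^[ \t]+|[ \t]+$/, "", name)   # ignore spaces around the key
      if (name == k) { print k "=" v; done = 1 } else { print }
    }
    END { if (!done) print k "=" v }
  ' "$prop_file" > "$prop_tmp" && mv "$prop_tmp" "$prop_file"
}

# Example, before starting the upgraded instance:
#   conf="/<Datameer new version installation directory>/conf/default.properties"
#   set_property "$conf" housekeeping.enabled false
#   set_property "$conf" auto-compaction.enabled false
```

Running the same two calls with true instead of false re-enables both services once validation is complete.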

Workbook Validation

A Workbook is a complex structure that Datameer is constantly striving to improve. Workbooks can be built in a nearly unlimited number of combinations of Column Renames, Joins, Unions, Filters, and Functions, including user-built functions. While we attempt to cover all the possibilities when implementing our upgrade code, some customer development decisions are unpredictable, and as a result our Workbook upgrade code cannot always successfully transform all workbooks.

To validate whether any Workbook has been broken during an upgrade, Datameer created the Workbook Health Check feature. The tool reviews a Workbook's structure and reports any logical issues. This tool is available in versions 6.4.14, 7.1.13, 7.2.13, 7.4.11+, 7.5.4+, and 10.0.1+. Datameer's best practice recommendations for upgrades are as follows:

Backup Keyfiles and Keystore

If you have set up password encryption and/or enabled TLS with custom certificates, back up your Keystore and Keyczar keyfiles.

Upgrade the Application

If you update Datameer as user/group root, it is recommended for security reasons to change the permissions back to user/group datameer afterwards.

Pausing jobs being submitted to the cluster

Before performing a graceful shutdown of Datameer, view the currently running or queued jobs and double-check that nothing is running or scheduled.

  1. Stop the application.

  2. Unzip the upgrade file.
    Move the MySQL JDBC driver located in <Datameer path>/das-data/jdbc-jars, as described in the Install the JDBC Database Drivers section of the Installation Guide.
    If you are using Datameer with MySQL as an application database, check that the database mode is configured accordingly in your etc/das-env.sh file:

    export DAS_DEPLOY_MODE=live
  3. If upgrading in Workgroup or Enterprise versions, users may want to consider allocating additional memory in etc/das-env.sh. 

  4. Consider changing the container sizing of the Map, Reduce, and AM containers, which respectively correspond to MapReduce and Tez jobs. To do so, change the settings of the following properties:

  5. If existing, copy the files from the das-data folder of the old distribution to the new location. Keep the original das-data information as a backup.

    cp -r /<old-location>/das-data /<new-location>/
  6. Users need to check, and update if necessary, the folder where plug-in configurations are stored.

    Open the file system of the previous Datameer version you are upgrading from and navigate to the default.properties file.

    Search for the property that defines the folder where plug-in configurations are stored.

    Open the file system of the upgrade version of Datameer and find the same property in the default.properties file.

    Check that the new property value is the same folder name as in the previous version, and update it if necessary.

    Copy the contents of the previous plug-in configurations folder into the file system of the new upgraded version of Datameer.

  7. If you made changes in your conf/ directory previously (like changing live.properties and log4j-<xyz>.properties), you need to apply the same changes to the new conf/ directory of your upgraded instance.

  8. Copy over the native libraries that you have added to Datameer (if this applies). You don't have to copy over the native libraries that are already bundled with Datameer.

    cp -r /<old-location>/lib/native/* /<new-location>/lib/native/
  9. If you have made changes to the conductor.sh script (for example, to enable SSH), apply these changes to /<new-location>/bin/conductor.sh again. Note that JAVA_OPTIONS has been moved to etc/das-env.sh.

  10. Migrate the SSL values from the previous start.ini to the new one.

  11. Copy over files from etc/custom-jars (these files could be database drivers or 3rd party libraries).

  12. If you are using custom Datameer plugins (jar archives stored under etc/custom-plugins), update those to be in sync (API compatible) with the new version of Datameer and be sure to remove jar files related to older versions from etc/custom-plugins.
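The directory copies in steps 5 and 11 follow the same pattern, so they can be wrapped in one small helper. The sketch below is an illustration only: the function name `copy_carried_over_files` is hypothetical, and it deliberately leaves out lib/native, conf/, and etc/custom-plugins, which the steps above say need selective or manual handling.

```shell
# Copies das-data and etc/custom-jars from the old installation to the new
# one, skipping any directory that does not exist in the old installation.
# The old installation is left untouched so it can serve as a backup.
# Args: $1 = old installation root, $2 = new installation root
copy_carried_over_files() {
  old_root="$1"
  new_root="$2"
  for dir in das-data etc/custom-jars; do
    if [ -d "$old_root/$dir" ]; then
      mkdir -p "$new_root/$dir"
      # The trailing "/." copies the directory contents, hidden files included
      cp -R "$old_root/$dir/." "$new_root/$dir/"
    fi
  done
}

# Example:
#   copy_carried_over_files /<old-location> /<new-location>
```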