Installation Prerequisites

Before installing Datameer X, complete the following prerequisites:

Creating the Datameer X User

Administrative rights are required to create the Datameer X user on the machine where Datameer X is being installed. This can be accomplished under the root account. Make sure the user ID is above 500 and that the account has enough resources and file descriptors available. 
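
As a minimal sketch of this step, assuming a Linux host with the standard groupadd and useradd tools, the functions below create the account and raise its open-file limit. The account name "datameer", the ID 1001, and the limit 65536 are illustrative assumptions, not values prescribed by Datameer.

```shell
# Assumed account name and IDs; adjust to your environment.
DM_USER="datameer"
DM_UID=1001

# Create the group and user (must be run as root).
create_dm_user() {
    groupadd --gid "$DM_UID" "$DM_USER" &&
    useradd --uid "$DM_UID" --gid "$DM_UID" --create-home --shell /bin/bash "$DM_USER"
}

# Succeed only if the given numeric user ID is above 500.
uid_ok() {
    [ "$1" -gt 500 ]
}

# Raise the open-file limit for the account (must be run as root).
# The value 65536 is an assumption; size it for your workload.
set_fd_limits() {
    limits_file="${1:-/etc/security/limits.conf}"
    printf '%s\n' "$DM_USER soft nofile 65536" "$DM_USER hard nofile 65536" >> "$limits_file"
}
```

Run create_dm_user and set_fd_limits as root, then confirm the result with uid_ok "$(id -u datameer)" and, after logging in as the new user, with ulimit -n.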

Creating Directories for Application, Cache, Logs, and Temporary Files 

For performance reasons, and for better control over where space on the file systems and disks is used, create separate directories for application, cache, logs, and temporary files. Do this according to the Linux Filesystem Hierarchy Standard (FHS). Creating the directories and changing their permissions requires administrative rights, so complete this task under the root user account.
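
The following sketch shows one way to lay out the four directories. The paths are assumptions chosen to resemble common FHS locations; Datameer does not mandate these names.

```shell
# Assumed FHS-style locations; adjust names to your site conventions.
DM_DIRS="/opt/datameer /var/cache/datameer /var/log/datameer /var/tmp/datameer"

# Create all directories below an optional prefix (empty for the real filesystem).
create_dm_dirs() {
    prefix="${1:-}"
    for dir in $DM_DIRS; do
        mkdir -p "${prefix}${dir}" || return 1
    done
}

# Hand the directories to the Datameer X user (must be run as root).
own_dm_dirs() {
    owner="$1"
    for dir in $DM_DIRS; do
        chown -R "$owner" "$dir" || return 1
    done
}
```

As root, create_dm_dirs followed by own_dm_dirs datameer:datameer would prepare the layout. The prefix argument exists only so the layout can be tried out in a scratch directory first.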

Switching the User and Changing the Working Directory 

This should be the last task for which administrative rights are necessary.

Downloading and Unzipping Datameer X

Get in touch with Datameer Support or your Datameer representative to receive the installation package download link.

INFO: If you already have a Datameer X installation, you can also start from here.
Download and unzip the appropriate Datameer X package for your Hadoop cluster distribution:

curl -s -k -o Datameer-<package>.zip "https://download.datameer.com.s3.amazonaws.com/releases/Datameer-<version>/<dist>/Datameer-<package>.zip?<AWSproperties>" ; unzip Datameer* 

Creating Symlinks for Future Updates

To prepare for future upgrades, create symlinks to the current (or latest) package and to the log directory.

Create the symlink and change the working directory:

ln -s Datameer-<package> current 
cd current
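
The text also mentions a symlink for the log directory, which the commands above do not show. A minimal sketch covering both links follows; the external log path and the name "logs" inside the package are assumptions.

```shell
# Point "current" at the given package directory, replacing an older link.
link_current() {
    ln -sfn "$1" current
}

# Link a log directory that lives outside the installation into the package.
# If a real "logs" directory already exists there, move it aside first,
# otherwise the link is created inside that directory.
link_logs() {
    ln -sfn "${1:-/var/log/datameer}" current/logs
}
```

After an upgrade, calling link_current with the new package directory switches "current" over, so scripts that reference the symlink keep working.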

Selecting and Preparing the Database

By default, the Datameer X application runs with an HSQL file database that is created on the local filesystem under 'das-data/database/hsql-db'.

If you are setting up Datameer X for production use, Datameer strongly recommends MySQL instead of the HSQL file database. MariaDB is also supported as an alternative metastore database engine to MySQL.

To define which database to use, make an entry in the 'live.properties' file under 'conf/live.properties':

#Define which database to use: hsql-memory, hsql-file, mysql, mariadb
system.property.db.mode=mysql

or 

#Define which database to use: hsql-memory, hsql-file, mysql, mariadb
system.property.db.mode=mariadb 
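
If you prefer to script this entry rather than edit the file by hand, a small helper along these lines could be used. The function name set_property is hypothetical, and the sketch assumes simple values without "|" or "&" characters.

```shell
# Set key=value in a properties file: replace the line if the key exists,
# append it otherwise. Dots in the key are escaped so they match literally.
set_property() {
    file="$1"; key="$2"; value="$3"
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if [ -f "$file" ] && grep -q "^${pattern}=" "$file"; then
        sed "s|^${pattern}=.*|${key}=${value}|" "$file" > "${file}.tmp" &&
            mv "${file}.tmp" "$file"
    else
        printf '%s\n' "${key}=${value}" >> "$file"
    fi
}
```

For example, set_property conf/live.properties system.property.db.mode mysql leaves exactly one system.property.db.mode line in the file.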

Downloading and Installing the MySQL Database JDBC Connector 

By default, the Datameer X application runs with an HSQL file database that is created on the local filesystem under das-data/database/hsql-db. If you are setting up Datameer X for production use, Datameer strongly recommends using MySQL instead of the HSQL file database. 

Configuring Datameer X for MySQL Database

The Datameer X service depends on the MySQL database, which is used for writing to workbooks, permission changes, job execution, scheduling, and more. For Datameer X to function properly, the database response time should be between ten and twenty milliseconds. To run the application in MySQL mode, implement the following changes.

  1. Check the database connection:

    mysqladmin version
    mysqladmin ping
    mysqladmin status
    echo q | telnet -e q `hostname` 3306
    nc -z -w1 `hostname` 3306

    INFO: You can follow up later using the "Check if the Datameer X Application Database is Running and Accessible" article.

  2. Create a new database via the "./bin/mysql-init.sql" script:

    For MySQL 5.x and lower:

    mysql -u <user> -p<password> -h <host/ip> -P <port> < bin/mysql-init.sql


    For MySQL 8.x and higher:

    mysql -u <user> -p<password> -h <host/ip> -P <port> < bin/mysql8x-init.sql


  3. Set the deploy mode to "live" in the "./etc/das-env.sh" file:

    # Change this to DAS_DEPLOY_MODE=live when you want to run in live mode against a mysql db
    export DAS_DEPLOY_MODE=live


  4. Set the database name in the "./conf/default.properties" file:

    # Set the name of the MySql database DATAMEER uses.
    system.property.db.name=dap


  5. Execute the command to initialize the database:

    ./bin/database.sh init


The database configuration is now completed.
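
When these steps are scripted, the connection check from step 1 can be repeated until the database answers before the initialization runs. The retry helper below is a hypothetical sketch, not part of the Datameer X distribution.

```shell
# Run a command until it succeeds, up to a number of attempts,
# sleeping a fixed number of seconds between tries.
retry() {
    attempts="$1"; delay="$2"; shift 2
    try=1
    while [ "$try" -le "$attempts" ]; do
        "$@" && return 0
        [ "$try" -lt "$attempts" ] && sleep "$delay"
        try=$((try + 1))
    done
    return 1
}

# Example (not executed here): wait up to about 20 seconds for MySQL,
# then initialize the database.
# retry 10 2 mysqladmin ping && ./bin/database.sh init
```

Chaining with && means the initialization is skipped if the database never responds.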

Installing the License

If you don't have a license, email the application's product ID to license@datameer.com and request the key. The product ID is displayed on the 'Welcome' page.

See 'License Information' for information on how to update the license and for details about volume-based licensing.

If you have already received a Datameer X license:

  1. Launch the Datameer X application and open the UI. The welcome page with all available licensing options is loaded. 
  2. Click "Activate" and upload the key you received from Datameer. The license is activated and you are redirected to the login page.

Starting Datameer X

Start the Datameer X service.

Stopping Datameer X

Stop the Datameer X service.

Restarting Datameer X

Restart the Datameer X service.

Datameer X Graceful Shutdown

Gracefully shut down the Datameer X service.

  1. Pause the Job Scheduler located under the Admin tab in Datameer X.
  2. Wait for current jobs to be marked as completed.
  3. When all jobs have been completed, stop the application with the conductor.sh "stop" command.
  4. After the Datameer X application has been stopped, perform the needed maintenance.
  5. With all maintenance completed, start Datameer X again with the conductor.sh "start" command.
  6. Under the Admin tab in Datameer X, resume the Job Scheduler.
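
The stop, maintenance, and start steps (3 to 5) can be wrapped so that the service is started again even if the maintenance command fails. This is a sketch only: the function name is hypothetical, the location of conductor.sh is passed in because it depends on your installation, and pausing and resuming the Job Scheduler remain manual steps in the UI.

```shell
# Stop the service, run a maintenance command, then start the service again.
# The first argument is the path to conductor.sh; the rest is the command to run.
maintenance_window() {
    conductor="$1"; shift
    "$conductor" stop || return 1
    "$@"
    status=$?
    "$conductor" start || return 1
    return "$status"
}
```

A call such as maintenance_window ./bin/conductor.sh sh -c 'echo "maintenance goes here"' returns the exit status of the maintenance command, so a failed task is still visible to the caller.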

Service Check 

Check if the Datameer X service is running and accessible.
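
One simple accessibility check is to request the web UI with curl. The address below, including port 8080, is an assumption; use the host and port of your own installation.

```shell
# Return success if the URL answers with an HTTP status below 400 within 5 seconds.
is_up() {
    curl --silent --fail --max-time 5 --output /dev/null "$1"
}

# Assumed address; replace host and port with those of your installation.
DM_URL="http://localhost:8080/"
if is_up "$DM_URL"; then
    echo "Datameer X is reachable at $DM_URL"
else
    echo "Datameer X is not reachable at $DM_URL"
fi
```

This only confirms that something answers HTTP requests at that address; it does not verify that jobs can run.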

Configuring Datameer X for a Kerberos Secured Cluster

Before configuring Datameer X for a Kerberos-secured cluster, test Kerberos authentication and job execution on the CLI.
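
A minimal authentication test on the CLI might look like the sketch below, assuming the MIT Kerberos client tools and the hdfs command are installed on the Datameer X host. The function name is hypothetical, and the keytab path and principal are supplied by you.

```shell
# Obtain a ticket from a keytab, confirm it is valid, then list the HDFS root.
kerberos_smoke_test() {
    keytab="$1"; principal="$2"
    kinit -kt "$keytab" "$principal" || return 1
    # klist -s prints nothing and fails if the ticket cache is missing or expired.
    klist -s || return 1
    hdfs dfs -ls / >/dev/null
}
```

Call it as kerberos_smoke_test <keytab> <principal>. A job execution test would follow the same pattern by submitting a small job with the same ticket.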

To configure Datameer X for a Kerberos-secured cluster, follow the Secure Mode Configuration instructions. 

Secure Hadoop Distributed Filesystem (HDFS)

You must have a properly configured connection to a Kerberos-secured cluster to use the tool to secure the Hadoop Distributed Filesystem (HDFS).

Starting Testing

Start the Datameer X service to do final testing. 

Best Practices for Installing Datameer X

Implement frequent database backups

The Datameer X service depends on the MySQL database, which is used for writing to workbooks, permission changes, job execution, scheduling, and more. It is highly recommended to back up the application database frequently.

0 * * * * mysqldump -u'dap' -p'dap' dap | gzip > /home/datameer/<company>_<system>_<datameer-version>_`date +\%Y\%m\%d_\%H\%M`.sql.gz


Don't leave the backups unattended for a long time. Monitor the size of the /home/datameer directory.


# Check from time to time how long the database dump takes and whether it fits into the time slot
time mysqldump -u'dap' -p'dap' dap | gzip > /home/datameer/<company>_<system>_<datameer-version>_`date +\%Y\%m\%d_\%H\%M`.sql.gz
# Verify from time to time that the files are OK
gzip -d /home/datameer/<company>_<system>_<datameer-version>_<date>_<time>.sql.gz
head /home/datameer/<company>_<system>_<datameer-version>_<date>_<time>.sql


Validate the content. Don't leave backup files on the application server. Move backup files from /home/datameer to a safe and secure remote location.
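
The verification and size monitoring described above can also be scripted. The sketch below uses gzip -t, which tests an archive without unpacking it, so the compressed file stays in place. The function names and the 7-day retention default are assumptions; prune only after the files have been copied to the remote location.

```shell
# Check a compressed dump without unpacking it.
verify_backup() {
    gzip -t "$1" 2>/dev/null
}

# Report the size of the backup directory and delete dumps older than N days.
# The 7-day default is an assumption; align it with your retention policy.
prune_backups() {
    dir="$1"; days="${2:-7}"
    du -sh "$dir"
    find "$dir" -name '*.sql.gz' -type f -mtime +"$days" -exec rm -f {} +
}
```

A passing gzip -t only shows that the archive is intact; inspecting the SQL content, as in the commands above, is still worthwhile.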

Changing your Stored Data Directory

Use a path that doesn't depend on a Datameer X installation directory. Because the das-data folder is stored inside of your installation directory by default, you need to make a backup of your stored data every time you create a new distribution or upgrade. 


Change the default admin password

Log in and change the default admin password following the instructions on managing user accounts.

Downloading and installing Plug-ins

If you are setting up Datameer X for production use, it is most likely in a Kerberos-secured environment. To use Kerberos, an additional plug-in is necessary. This Datameer X plug-in is part of the Advanced Governance module.

Configuring Datameer X for Enterprise 

By default, the application runs with settings where files are created on the local filesystem under the current directory. To address enterprise requirements, some changes need to be implemented. 
To avoid any mismatch in the configuration files or incompatibility with different versions, don't copy over configuration files from other versions. Make changes every time based on the originally delivered versions. 

Review the Changes Implemented by Accessing the Change Log

Validate the changes made. Move files from /home/datameer to a safe and secure remote location.


Enable and Configure Transport Layer Security (TLS)

Before proceeding, consider using a reverse proxy or a load balancer to offload the SSL traffic or to use wildcard certificates. In that case, you only need to configure rewrite handling.

Enable TLS for use with Datameer X in production environments. Because Datameer X is packaged with Jetty 9, you only need to enable modules.

You can proceed further with Enabling SSL for MySQL service as well.


Configure Bash for Operations 

Set up shell aliases for the most common commands to make work easier, faster, and less error-prone.
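
For example, the following lines could be added to the Datameer X user's ~/.bashrc. The alias names and the installation path are assumptions; only the conductor.sh "start" and "stop" commands are taken from this guide.

```shell
# Assumed installation path; the "current" symlink keeps the aliases valid across upgrades.
DM_HOME="/opt/datameer/current"

alias dmstart="$DM_HOME/bin/conductor.sh start"
alias dmstop="$DM_HOME/bin/conductor.sh stop"
alias dmhome="cd $DM_HOME"
```

Because the aliases are defined with double quotes, DM_HOME is expanded once when the file is read, so changing the variable later does not affect them.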

Conductor.sh Commands and Parameters 

Usage: conductor.sh <command> <option>

Commands:

Options:

Examples: