Migrating from One Hadoop Cluster to Another
When migrating from one cluster to another (e.g., from a QA cluster to a production cluster), you must update all paths in the Datameer database (DB) so that they point to the new cluster.
Follow these steps to move your data:
- Stop Datameer.
- Move all your data from the old cluster to the new cluster. The best way to do this is with the Hadoop distcp tool:
hadoop distcp hdfs://old.cluster:9000/old-root-path hdfs://new.cluster:9000/new-root-path
- Make a backup of the Datameer DB:
mysqldump [-h <dbhost>] -u dap -p dap > das-backup.dmp
- Check that the backup was created successfully:
head -n 50 das-backup.dmp
- Update the paths to the new location in the Datameer DB:
bin/update_paths.sh hdfs://old.cluster:9000/old-root-path hdfs://new.cluster:9000/new-root-path
If the Datameer DB settings differ from those in conf/default.properties, you can pass the corresponding parameters to the update tool:
bin/update_paths.sh -h <dbhost> -o <dbport> -n <dbname> -u <dbuser> -p <dbpassword> hdfs://old.cluster:9000/old-root-path hdfs://new.cluster:9000/new-root-path
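Reading the first 50 lines of the dump confirms that it starts correctly, but not that it finished. The following is a minimal sketch of a stricter check, assuming mysqldump was run with its default options, in which case the file normally ends with a "-- Dump completed" comment line. The function name check_dump is hypothetical and not part of Datameer.

```shell
# check_dump FILE
# Succeeds only if FILE is non-empty and its last line is the
# "-- Dump completed" comment that mysqldump writes by default.
# Assumption: the dump was not created with --skip-comments.
check_dump() {
  dump_file=$1

  # Reject a missing or zero-length file.
  if [ ! -s "$dump_file" ]; then
    echo "missing or empty: $dump_file" >&2
    return 1
  fi

  # Inspect only the final line of the dump.
  if tail -n 1 "$dump_file" | grep -q '^-- Dump completed'; then
    echo "backup looks complete: $dump_file"
  else
    echo "backup may be truncated: $dump_file" >&2
    return 1
  fi
}
```

One possible use is to run check_dump das-backup.dmp before bin/update_paths.sh and to stop if it returns a non-zero status, so that the paths are only rewritten once a complete backup exists.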
After completing the steps above, update the cluster settings in Datameer:
- Restart Datameer.
- Open the Datameer cluster settings: click the Admin tab at the top of the page, then click Hadoop Cluster in the column on the left.
- Click Edit, then update the settings as needed.
- Click Save.
If the user running this utility doesn't have permission to write to /tmp, set the TMPJ environment variable to designate another path.
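The fallback described above can be sketched as a small shell helper. This is only an illustration: the function name pick_tmp_dir and the fallback directory are hypothetical, and it assumes that "this utility" refers to bin/update_paths.sh and that TMPJ is read from the environment of the shell that starts it.

```shell
# pick_tmp_dir DEFAULT FALLBACK
# Prints DEFAULT if it is an existing, writable directory;
# otherwise creates FALLBACK (if needed) and prints that instead.
pick_tmp_dir() {
  default_dir=$1
  fallback_dir=$2

  if [ -d "$default_dir" ] && [ -w "$default_dir" ]; then
    printf '%s\n' "$default_dir"
  else
    mkdir -p "$fallback_dir" && printf '%s\n' "$fallback_dir"
  fi
}

# Hypothetical fallback location; any path writable by the current user works.
chosen_dir=$(pick_tmp_dir /tmp "${HOME:-.}/datameer-tmp")

# Export TMPJ only when /tmp cannot be used.
if [ "$chosen_dir" != /tmp ]; then
  TMPJ=$chosen_dir
  export TMPJ
fi
```

Running these lines in the same shell session before the utility leaves the environment untouched when /tmp is writable, and otherwise exports TMPJ so that the utility can pick up the alternative path.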