Secure Impersonation on Datameer Without Super Group Requirement
Previously, secure impersonation functioned by having a single super group user, the Datameer service user, in the Hadoop environment impersonate all authenticated users set in Kerberos. Datameer's new secure impersonation method (Native Multi User) validates each individual user separately using the user's own Kerberos keytab.
Setup Instructions for Secure Impersonation in Native Multi User Mode
First, create a user for your Datameer installation. This user will be called the Datameer service user. Unless otherwise stated, perform the next steps as this user.
Preparing Datameer for startup using the command line
- Unzip Datameer on your Hadoop cluster in a folder of your choice as the Datameer service user.
- Configure your MySQL database.
- Add the MySQL JDBC driver to webapps/conductor/WEB-INF/lib/ (e.g., for MySQL 5.7.19 you can use "mysql-connector-java-5.1.43-bin.jar").
- Configure your MySQL database parameters in the conf/default.properties file.
- Use the "bin/create-table.sql" script to initially fill the Datameer database.
- Start Datameer
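Before starting Datameer, it can help to confirm that the files named in the steps above are in place. The following is a minimal sketch, not part of Datameer: the function name `preflight_problems` is hypothetical, and the jar pattern is an assumption based on the example driver name given above.

```python
from pathlib import Path

# Relative paths taken from the steps above. The jar pattern is an assumption
# that matches names like "mysql-connector-java-5.1.43-bin.jar".
JDBC_LIB_DIR = Path("webapps/conductor/WEB-INF/lib")
JDBC_JAR_PATTERN = "mysql-connector-java-*.jar"
REQUIRED_FILES = [Path("conf/default.properties"), Path("bin/create-table.sql")]


def preflight_problems(install_dir):
    """Return a list of problems found; an empty list means every check passed."""
    root = Path(install_dir)
    problems = []
    # Each required file must exist below the installation folder.
    for rel in REQUIRED_FILES:
        if not (root / rel).is_file():
            problems.append(f"missing file: {rel}")
    # At least one MySQL JDBC driver jar must be present in the lib folder.
    if not any((root / JDBC_LIB_DIR).glob(JDBC_JAR_PATTERN)):
        problems.append(f"no MySQL JDBC driver jar in {JDBC_LIB_DIR}")
    return problems
```

Calling `preflight_problems("<path to your Datameer installation>")` as the Datameer service user returns an empty list when both files and a driver jar are found. It only checks that the files exist, not that the database parameters are correct.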
Configure Datameer in the UI
License
Authentication
- Go to the Admin tab.
- Click Authentication from the menu.
- Configure your Active Directory or LDAP by clicking Edit. (The following steps use Active Directory as the example.)
Important
Enter the impersonation attribute (e.g., userPrincipalName). This is necessary if your users are in different realms. If you don't set it, the default realm configured in the Datameer server's Kerberos client configuration is used to create the principal for the Kerberos keytab file.
- Save the authentication configuration.
Don't log out! If you log out, you will not be able to log in again as the current admin user. Only the superuser functionality of Datameer will be able to log you in again.
- Wait until the authentication cache has refreshed.
- Click Users from the menu. Select the groups and users that should be able to authenticate.
- Define a new admin user. The current user is no longer valid after logout.
Hadoop cluster
- Click on Hadoop cluster from the menu and click Edit.
- From the Cluster Mode settings, select Multi User Kerberos Secured Hadoop Cluster.
- Enter your name node.
- Enter the private folder path for Datameer.
- Select the Enable Impersonation checkbox.
- Enter your Cluster Settings depending on your Hadoop cluster.
- Enter under Kerberos Settings:
- the principal for the Datameer service user. (E.g., <name>@<realm> )
- the path to the Datameer service user's keytab file.
- the yarn principal. (E.g., yarn/_HOST@<realm>)
- the HDFS principal. (E.g., hdfs/_HOST@<realm>)
- the mapred principal. (E.g., mapred/_HOST@<realm>)
- Enter any necessary Hadoop properties for the cluster.
- Enter under Automatic Keytab Management Settings:
- the full path to ktutil. (E.g., /usr/bin/ktutil)
- the path to the folder where Datameer should store the generated Keytab files. The folder needs to have the user "Datameer" as the owner and the permission 700. There is already a Keytab folder in the default installation of Datameer. (E.g., <path to your Datameer installation>/etc/keytabs)
- Don't set the Kerberos Realm if you already set the impersonation attribute in the Active Directory configuration. If you set neither the Kerberos Realm nor the impersonation attribute, the default realm from the Datameer server's Kerberos client configuration is used.
- Enter the password encryption algorithms necessary for your Kerberos installation. (E.g., aes256-cts-hmac-sha1-96, aes256-cts, arcfour-hmac)
- Log out and restart Datameer.
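Two of the settings above can be sketched in code: how the realm of a user's principal is chosen, and the owner and permission requirement for the keytab folder. This is a rough illustration under stated assumptions, not Datameer's implementation. Both function names are hypothetical. The sketch assumes that the impersonation attribute value already has the form user@REALM, and that a configured Kerberos Realm is used when no impersonation attribute is set. The source advises against setting both and does not describe that case; the sketch simply prefers the attribute.

```python
import os
import stat


def resolve_principal(username, impersonation_value=None,
                      kerberos_realm=None, default_realm=None):
    """Pick the principal for a user following the realm rules described above."""
    if impersonation_value:
        # E.g. the userPrincipalName value from Active Directory, assumed to
        # already have the form user@REALM.
        return impersonation_value
    # Assumption: a configured Kerberos Realm wins over the client default realm.
    realm = kerberos_realm or default_realm
    if not realm:
        raise ValueError("no realm available to build a principal")
    return f"{username}@{realm}"


def keytab_dir_problems(path, expected_uid):
    """Check that the keytab folder has the expected owner and permission 700."""
    info = os.stat(path)
    problems = []
    if info.st_uid != expected_uid:
        problems.append(f"owner uid is {info.st_uid}, expected {expected_uid}")
    mode = stat.S_IMODE(info.st_mode)  # keep only the permission bits
    if mode != 0o700:
        problems.append(f"mode is {mode:o}, expected 700")
    return problems
```

For example, `resolve_principal("alice", default_realm="EXAMPLE.COM")` yields `alice@EXAMPLE.COM`, while passing an `impersonation_value` returns that value unchanged. When run as the Datameer service user, `keytab_dir_problems(path, os.getuid())` returns an empty list if the folder is set up as required.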
Private Folder Permissions
Private folder permissions are automatically set up during the installation process of the secured Hadoop Distributed Filesystem. The details below are provided in case these settings need to be applied with other tools, such as Apache Ranger.
- Via ACLs or Ranger, set the permissions on <datameer private folder> to r-x for the common group or for individual users.
- Via ACLs or Ranger, set the permissions on the <datameer private folder> subfolders to rwx for the common group or for individual users.
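If HDFS ACLs are used rather than Ranger, the two settings above correspond to `hdfs dfs -setfacl -m` calls, which require ACL support to be enabled on the NameNode. The sketch below only builds the command lines and does not run them. The helper name `setfacl_command` is hypothetical, and the group name and paths in the usage note are placeholders.

```python
def setfacl_command(path, group, perms):
    """Build (but do not run) an `hdfs dfs -setfacl` command for one group entry."""
    # Accept only three-character permission strings such as "r-x" or "rwx".
    if len(perms) != 3 or any(c not in "rwx-" for c in perms):
        raise ValueError(f"unexpected permission string: {perms!r}")
    return ["hdfs", "dfs", "-setfacl", "-m", f"group:{group}:{perms}", path]
```

For a hypothetical group `dm_users`, `setfacl_command("<datameer private folder>", "dm_users", "r-x")` covers the private folder itself, and the same call with `"rwx"` and a subfolder path covers each subfolder. Returning a list makes it easy to review the commands before passing them to `subprocess.run`.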
Datameer restricts its folders to the following permissions:
folder | owner | group | user permissions | group permissions | others permissions |
---|---|---|---|---|---|
datameer private folder | datameer service user | datameer service user group | rwx | --- | --x |
datameer private folder 1st level siblings | datameer service user | datameer service user group | rwx | --- | -wx |
individual job folders | job owner | default group or group shared with | rwx | no group sharing → ---; group allowed to edit → -wx; group allowed to view → r-x | no others sharing → ---; others allowed to edit → -wx; others allowed to view → r-x |
job execution folder | job owner | default group or group shared with | rwx | no group sharing → ---; group allowed to edit → --x; group allowed to view → r-x | no others sharing → ---; others allowed to edit → --x; others allowed to view → r-x |
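The sharing rules in the last two table rows can be read as a lookup from sharing level to permission bits. The sketch below restates the table for illustration only. The names `SHARING_BITS` and `folder_mode` and the keys "none", "edit", and "view" are hypothetical labels for the table's "no sharing", "allowed to edit", and "allowed to view" cases.

```python
# Permission bits per folder kind and sharing level, copied from the table above.
SHARING_BITS = {
    "individual job folder": {"none": "---", "edit": "-wx", "view": "r-x"},
    "job execution folder": {"none": "---", "edit": "--x", "view": "r-x"},
}


def folder_mode(folder_kind, group_sharing="none", others_sharing="none"):
    """Return the nine-character mode string: owner bits, group bits, others bits."""
    bits = SHARING_BITS[folder_kind]
    # The job owner always has rwx according to the table.
    return "rwx" + bits[group_sharing] + bits[others_sharing]
```

For instance, an individual job folder that the group may edit and others may view gives `folder_mode("individual job folder", "edit", "view")`, which evaluates to `"rwx-wxr-x"`.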
Changing ownership of a file or folder
The change of ownership of a file or folder is a privileged action. When using secure impersonation on Datameer without the supergroup requirement, the Datameer service user is no longer a member of the superuser group and loses the privilege to change ownership of files in the Datameer home folder in HDFS. The change of ownership is still possible through a process of copying a file or folder from the Datameer home folder to a new owner.
Change of ownership process:
- The original folder (e.g., /path/to/dm_home_folder/workbooks/<wbk_config_id>) is copied by the new owner. (The user changing ownership must be an admin and the new owner requires view access to the original data.)
- The new owner's username is appended to the name of the copy (e.g., /path/to/dm_home_folder/workbooks/<wbk_config_id>_userName).
- New data is written to the new folder in HDFS.
- The original folder is submitted to Housekeeping, which deletes the old data.
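The naming convention for the copied folder in the steps above can be illustrated with a one-line helper. This is only a sketch of the path pattern shown in the example. The function name `copied_folder_path` is hypothetical, and the actual copy and Housekeeping steps are performed by Datameer, not by this code.

```python
def copied_folder_path(original_folder, new_owner):
    """Return the original folder path with "_<username>" appended."""
    # Strip a trailing slash so the suffix attaches to the folder name itself.
    return f"{original_folder.rstrip('/')}_{new_owner}"
```

With a hypothetical workbook configuration ID of 42 and a new owner `alice`, `copied_folder_path("/path/to/dm_home_folder/workbooks/42", "alice")` gives `/path/to/dm_home_folder/workbooks/42_alice`.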
Limitations of changing owners:
- Appending import jobs where the old data is shared with the new owner require an additional condition: the data must be shared while the user sharing it and the new owner are members of the same group.
- Logs saved in HDFS under <dm_private-folder>/jobhistory are unavailable for folders whose ownership has been transferred.