Simple Impersonation with Datameer

By default, the operating system user that starts Datameer runs all tasks within Datameer, no matter which user is signed in to Datameer. The impersonation feature allows the Datameer administrator to make the Datameer users appear as the ones running the tasks, and data from HDFS sources is previewed as the logged-in user.

Prerequisites

The following prerequisites must be met to enable simple impersonation within Datameer.

  • The Datameer users must also be operating system users. The user names must map 1:1.
  • All of the operating system users must be located within a common group.
  • The user running Datameer must be a superuser of HDFS. For more information, refer to the Apache Hadoop documentation.

    Simple impersonation isn't supported by the Spark execution frameworks.
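
The first two prerequisites can be spot-checked from a shell before enabling impersonation. The following is a minimal sketch, not part of Datameer: the function names are hypothetical, and it assumes group names contain no whitespace. It relies only on the standard id command.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with Datameer).

# Returns 0 if the space-separated group list $1 contains group $2.
group_list_contains() {
    for g in $1; do
        if [ "$g" = "$2" ]; then
            return 0
        fi
    done
    return 1
}

# Returns 0 if operating system user $1 exists and belongs to group $2.
user_in_group() {
    # id -Gn prints every group the user belongs to; it fails for unknown users.
    user_groups=$(id -Gn "$1" 2>/dev/null) || return 1
    group_list_contains "$user_groups" "$2"
}

# Example usage (user and group names are placeholders):
#   for u in alice bob; do
#       user_in_group "$u" datameer_users || echo "$u is missing or not in the group"
#   done
```

A user that fails the check either has no operating system account with the same name or is not a member of the common group.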

Configuring Simple Impersonation 

Follow the steps to configure simple impersonation in Datameer:

  1. Log into Datameer as an administrator.
  2. Click on the Admin tab.
  3. Select Hadoop Cluster from the side menu.
  4. Click Edit to adjust the Hadoop cluster configurations.
  5. Select Enable Impersonation under Storage Settings.
  6. Click Save.

Running the Simple Impersonation Tool to Migrate Permissions on Existing Objects (Optional)

Datameer packages a tool named unsecure_hdfs_tool.sh, located in the /bin/ folder.

The user running Datameer must be a superuser of HDFS.

Follow the steps to begin running simple impersonation on Datameer:

  1. Stop Datameer.
  2. Run the following command:

    bin/unsecure_*.sh -u -g <Operating system group name with Datameer users>
  3. Start Datameer.
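
After the tool has run, one way to spot-check the result is to look at the group column of an HDFS listing. The sketch below is an assumption-laden illustration, not a Datameer utility: the function name is hypothetical, it assumes the migrated objects are expected to carry the group passed with -g (which this page does not spell out), and it assumes paths contain no whitespace. It reads output in the format printed by hadoop fs -ls, where the fourth column is the group and the eighth is the path.

```shell
#!/bin/sh
# Hypothetical helper: reads `hadoop fs -ls` style output on stdin and prints
# the path of every entry whose group (4th column) is not the expected group.
# Header lines such as "Found N items" have fewer than 8 fields and are skipped.
list_wrong_group() {
    expected_group=$1
    awk -v grp="$expected_group" 'NF >= 8 && $4 != grp { print $8 }'
}

# Example usage (path and group name are placeholders):
#   hadoop fs -ls -R /path/to/datameer/data | list_wrong_group datameer_users
```

An empty result means every listed entry already belongs to the expected group.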

Simple Impersonation on MapR

Additional steps for those running MapR to enable simple impersonation:


  1. Create the file /opt/mapr/conf/proxy/mapr on the local file system as user root.

    touch /opt/mapr/conf/proxy/mapr
  2. Add the environment variable MAPR_IMPERSONATION_ENABLED="true". In this case, add it as a new line in the file etc/das-env.sh.

    export MAPR_IMPERSONATION_ENABLED="true"
  3. Add the configuration in core-site.xml.

    <property>
      <name>hadoop.proxyuser.mapr.groups</name>
      <value>*</value>
    </property>
    
    <property>
      <name>hadoop.proxyuser.mapr.hosts</name>
      <value>*</value>
    </property>
    
    <property>
      <name>hadoop.proxyuser.root.groups</name>
      <value>*</value>
    </property>
    
    <property>
      <name>hadoop.proxyuser.root.hosts</name>
      <value>*</value>
    </property>
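
A quick way to confirm that all four proxy user properties made it into the file is a plain text search. The sketch below is a rough check under stated assumptions: the function names are hypothetical, the path to core-site.xml is passed in by the caller, and it only looks for the <name> elements without parsing the XML or validating the values.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with Datameer or MapR).

# Returns 0 if property name $2 appears as a <name> element in file $1.
has_property() {
    grep -qF "<name>$2</name>" "$1"
}

# Prints each missing proxy user property; returns 1 if any is missing.
check_proxyuser_config() {
    file=$1
    missing=0
    for user in mapr root; do
        for kind in groups hosts; do
            prop="hadoop.proxyuser.$user.$kind"
            if ! has_property "$file" "$prop"; then
                echo "missing: $prop"
                missing=1
            fi
        done
    done
    return "$missing"
}

# Example usage (the path is a placeholder):
#   check_proxyuser_config /path/to/core-site.xml
```

Because this is a text match, a property that is commented out in the XML would still be reported as present, so a manual look at the file remains worthwhile.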

Expected Impersonation Behaviors

Refer to the following table to understand how simple impersonation affects the ownership of import jobs, file uploads, data links, workbooks, and export jobs. Note that the group permissions apply to the artifact, not the folders the artifacts are in.

| Scenario | Owner in HDFS | Group in HDFS | Permissions for owner in HDFS | Permissions for group in HDFS | Owner of YARN application (when job is triggered manually) | Owner of YARN application (when job is triggered by schedule) | Preview data accessed as |
|---|---|---|---|---|---|---|---|
| Creating an artifact | Creator | Group selected, if none selected, the default Datameer group | Read and write | Only read | n/a | n/a | n/a |
| Running a job | Creator | n/a | Read and write | Only read | Creator | Creator | Logged in user |
| Generating preview data | Creator | Group selected, if none selected, the default Datameer group | Read and write | Only read | Creator | Creator | Logged in user |
| Saving edited artifact (not as creator) | Creator | Group selected, if none selected, the default Datameer group | Read and write | Only read | Creator | Creator | Logged in user |
| Updating permissions | Creator | Newly selected group | Read and write | Newly selected group and read permission only | Creator | Creator | Logged in user |