...

Code Block
secure_hdfs_tool.sh [options] arguments
 -G (--hdfs-groups) [comma separated    : Comma separated list of applicable
 list of HDFS groups]                   : Datameer X HDFS groups, when set extra
                                          validation is run to ensure only
                                          these groups are present in permissions
 -g (--core-group) VAL                  : Group for the core Datameer X
                                          directories, this HDFS group should
                                          contain ALL Datameer X users (only
                                          applies when -u,--update-core-directories
                                          is used)
 -s (--sync-hdfs)                       : Sync Datameer X entity permissions down
                                          to HDFS
 -u (--update-core-directories)         : When set, create/update all core
                                          Datameer X directories and set
                                          permissions according to Datameer X
                                          user and optionally --group

...

By default, with no arguments, secure_hdfs_tool.sh validates that all Datameer X entities conform to the requirements of secure impersonation (i.e., they have exactly one group permission entry).

...

Warning
Change to HDFS permissions in Hadoop 2.3 and higher

Changes to HDFS permissions in Hadoop 2.3 and higher cause Datameer's secure_hdfs_tool.sh to fail.

Follow this workaround to set up secure impersonation in Datameer X when using Hadoop 2.3 or higher.

Panel

<super user of HDFS> is the user who starts the Datameer X application.

<user group> is the group whose members the super user is allowed to impersonate, as defined by the Hadoop property hadoop.proxyuser.<super user name>.groups in core-site.xml.

The super user must now be added to the user group before the tool is run.

  1. Start the Datameer X application and configure it to secure Hadoop.
  2. Stop the Datameer X application.
  3. Add the <super user of HDFS> to <user group> 
    1. Execute the command: usermod -g <user group name> <super user name>.
  4. Run Datameer's secure HDFS tool.
    1. Execute the tool: bin/secure_hdfs_tool.sh -u -g <user group name> as <super user>.
  5. Start the Datameer X application.

Datameer X recommends removing the super user from the user group and adding them back to their original group once the tool has run.
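The command-line part of the workaround (steps 3 and 4, plus the recommended restore afterwards) can be sketched as a dry-run helper that only prints the commands for review. This is a minimal sketch, not part of the Datameer X tooling: the function name print_workaround_commands is hypothetical, the use of sudo -u to run the tool as the super user is an assumption about your environment, and the original group must be looked up beforehand (for example with id -gn <super user name>).

```shell
#!/bin/sh
# Hypothetical helper: prints (does not execute) the commands for the
# Hadoop 2.3+ workaround so they can be reviewed before being run as root.
#   $1 = super user name, $2 = user group name, $3 = super user's original group
print_workaround_commands() {
  super_user="$1"
  user_group="$2"
  original_group="$3"

  # Step 3: make the user group the super user's primary group.
  printf 'usermod -g %s %s\n' "$user_group" "$super_user"

  # Step 4: run the tool as the super user (sudo -u is an assumption).
  printf 'sudo -u %s bin/secure_hdfs_tool.sh -u -g %s\n' "$super_user" "$user_group"

  # Afterwards: restore the super user's original primary group.
  printf 'usermod -g %s %s\n' "$original_group" "$super_user"
}

# Example call with made-up names:
#   print_workaround_commands hdfs das_users hadoop
```

Printing rather than executing keeps the sketch safe to try on any machine. The Datameer X application must still be stopped before these commands and started again afterwards, as in steps 2 and 5.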

If the current cluster mode for Datameer X isn't "Secure", the tool aborts. You must have a properly configured connection to a secure cluster to use this tool. To achieve this, navigate to Administration > Hadoop Cluster and configure Secure mode.

...

In all execution modes, the tool emits lines to STDOUT describing invalid Datameer X entities which need to be fixed for secure impersonation to work properly. As you fix entities, for example when preparing for secure impersonation the first time, you can simply rerun the script to find out what is left to update. Redirecting STDOUT to a file after grepping for INVALID_ENTITY is a good way to build a work list when dealing with large numbers of entities.

Code Block
secure_hdfs_tool.sh -G foo,bar,baz | grep INVALID_ENTITY > datameer_invalid.txt
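When rerunning the tool repeatedly, it can help to see how many entries are still on the work list. The following is a small sketch under the assumption that each invalid entity produces one line containing the INVALID_ENTITY marker; the function name count_invalid is hypothetical and the exact layout of the tool's output lines is not assumed.

```shell
#!/bin/sh
# Hypothetical helper: counts the lines in a work-list file that contain
# the INVALID_ENTITY marker.
#   $1 = path to the work-list file (e.g. datameer_invalid.txt)
count_invalid() {
  # grep -c prints the number of matching lines; it exits non-zero when
  # that number is 0, so the exit status is masked with "|| true".
  grep -c 'INVALID_ENTITY' "$1" || true
}

# Example:
#   count_invalid datameer_invalid.txt
```

A count of 0 after a rerun suggests that no invalid entities are left to fix.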

Updating Core Datameer X Directories

Another use for the tool is to reset or create the Datameer X core HDFS directories with appropriate ownership and permissions. This is mostly done when enabling or re-enabling secure impersonation mode.

...

Passing the optional -g (--core-group) argument changes group ownership of the core directories to match the argument's value and sets permissions to 770. By default, with no -g, the group is inherited from the parent directory and the core directories' permissions are 777. Datameer X strongly recommends using a core HDFS group containing all Datameer X users to control access to these directories.
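To make the difference between 770 and 777 concrete, the sketch below expands an octal mode into the familiar rwx notation for owner, group, and others. It is a generic illustration of octal permission bits, not something the tool provides; the function name mode_to_symbolic is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: expands an octal permission mode such as 770 into
# symbolic form, three characters each for owner, group, and others.
mode_to_symbolic() {
  mode="$1"
  out=""
  while [ -n "$mode" ]; do
    rest="${mode#?}"           # everything after the first character
    digit="${mode%"$rest"}"    # the first character itself
    case "$digit" in
      7) out="${out}rwx" ;;
      6) out="${out}rw-" ;;
      5) out="${out}r-x" ;;
      4) out="${out}r--" ;;
      3) out="${out}-wx" ;;
      2) out="${out}-w-" ;;
      1) out="${out}--x" ;;
      0) out="${out}---" ;;
      *) return 1 ;;           # not an octal digit
    esac
    mode="$rest"
  done
  printf '%s\n' "$out"
}

# mode_to_symbolic 770   # prints rwxrwx---
# mode_to_symbolic 777   # prints rwxrwxrwx
```

With 770, users outside the owning group have no access to the core directories, which is why a core group containing all Datameer X users is needed.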

Note

Sticky Bits aren't supported by Datameer. To avoid access problems, don't use them for Datameer X core directories.

Synchronizing HDFS Artifacts

The final major use for the tool is synchronizing HDFS artifacts with the Datameer X entities represented in the database. There are several occasions where this might become necessary:

...

Updates the core Datameer X directory ownership with the secure principal as username and das_users as group, setting permissions to 770. Also runs Datameer X entity validation to ensure that proper permissions exist, including verifying that the single groups are in the set foo, bar, baz.

Code Block
secure_hdfs_tool.sh --hdfs-groups foo,bar,baz

Validates Datameer X entity group permissions, emitting any invalid groups. Also checks that the referenced groups are in the set foo, bar, baz. This is an example of something you would run repeatedly while modifying Datameer X entities until there are no more errors.

...

Synchronizes HDFS artifact ownership and permissions with those stored in the Datameer X database. Continue to include the -G (--hdfs-groups) argument if it applies to you; this guarantees complete validation.