Amazon S3


Prerequisites

INFO

The following is the minimum access level required to import data from an S3 bucket into Datameer.

The account or IAM role used in the S3 connection must be allowed to perform the following actions:

  • GetObject

  • GetObjectAttributes

  • ListBucket (should be applied to the entire bucket, but can be restricted to one or more specific directories via the Condition key: "Condition": { "StringLike": {"s3:prefix": "<path>/*" }})
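
As an illustration of the permissions listed above, the following Python sketch assembles a matching IAM policy document. The function name build_s3_import_policy, the bucket name, and the prefix are hypothetical placeholders, and the resource ARNs assume a standard AWS partition. It is a starting point under those assumptions, not an official Datameer policy.

```python
import json


def build_s3_import_policy(bucket, prefix=None):
    """Return an IAM policy document (as a dict) granting the actions listed above.

    `bucket` and `prefix` are placeholders; substitute your own values.
    """
    # ListBucket is a bucket-level action, so its resource is the bucket ARN.
    list_statement = {
        "Effect": "Allow",
        "Action": ["s3:ListBucket"],
        "Resource": f"arn:aws:s3:::{bucket}",
    }
    if prefix:
        # Optional restriction of listing to a directory via the s3:prefix condition key.
        cleaned = prefix.strip("/")
        list_statement["Condition"] = {"StringLike": {"s3:prefix": f"{cleaned}/*"}}

    # GetObject and GetObjectAttributes are object-level actions,
    # so their resource is the set of objects in the bucket.
    object_statement = {
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:GetObjectAttributes"],
        "Resource": f"arn:aws:s3:::{bucket}/*",
    }
    return {"Version": "2012-10-17", "Statement": [list_statement, object_statement]}


if __name__ == "__main__":
    # Hypothetical bucket and directory names, for illustration only.
    print(json.dumps(build_s3_import_policy("example-bucket", "imports/sales"), indent=2))
```

Calling the function without a prefix produces the unrestricted ListBucket statement that applies to the entire bucket.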

For an edge-case suggestion, see the community article "Bucket policy for S3 Import in multi-tenant Datameer instance".

Usable Custom Properties

Disabling the IAM Role Authentication Option

INFO

Disabling the IAM role mode allows stricter access control to S3 buckets, because the 'IAM with assumed role' mode must then be used instead. In 'IAM with assumed role' mode, data access happens under an assumed AWS IAM role explicitly defined in the S3 connection.

To disable the IAM role option, enter the following custom property on the Admin page within the '/wiki/spaces/DASSB100/pages/32569526922' settings:

das.s3.iam.without.assumed.role.enabled=true
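
On the AWS side, the 'IAM with assumed role' mode generally needs two pieces: a trust policy on the role to be assumed, and permission for the instance role to call sts:AssumeRole. The Python sketch below builds both documents. The function names and both role ARNs are hypothetical, and the exact setup in a given account may differ.

```python
import json


def build_trust_policy(instance_role_arn):
    """Trust policy for the role to be assumed: lets the instance role assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": instance_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_assume_permission(target_role_arn):
    """Permission policy for the instance role: allows assuming the target role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": target_role_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical ARNs, for illustration only.
    instance_role = "arn:aws:iam::111122223333:role/example-instance-role"
    target_role = "arn:aws:iam::111122223333:role/example-s3-import-role"
    print(json.dumps(build_trust_policy(instance_role), indent=2))
    print(json.dumps(build_assume_permission(target_role), indent=2))
```

The assumed role itself would then carry the S3 read permissions, so access to each bucket is tied to the role named in the connection rather than to the instance role.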

Configuring Amazon S3 as a Connection

To configure Amazon S3 as a connector:

  1. Click the "+" button and select "Connection", or right-click in the browser and select "Create New" → "Connection". The "New Connection" tab appears in the menu bar.

  2. Select "S3" from the drop-down and confirm with "Next". The type is displayed in the drop-down.

      

  3. Enter the "S3 Bucket"
     

  4. Select the encryption algorithm.   

     

  5. Select how to authenticate. The details below adapt to the selection.
     INFO: Selecting 'IAM role' uses the IAM role associated with the EC2 instance or the EMR cluster. Selecting 'IAM assumed role' uses the IAM role associated with the EC2 instance or the EMR cluster to assume a specific role.

     

  6. If 'Access key and secret' is selected, enter the "Access key", if needed the "Access secret", the "Root path prefix", the "Region", and the "S3 Endpoint", and confirm with "Next". The "Save" tab opens.

     

  7. If needed, enter a description and confirm with "Next". The 'Save Connection' dialog opens.

     

  8. Select the folder to save the connection in and enter a name in 'Save as'. Confirm with "Save". The connection is saved, and the Amazon S3 connection configuration is complete.