Datameer's Amazon S3 Native connector uses multipart uploads to improve performance. If you currently use the S3 connector, consider switching to the S3 Native connector.
The S3 Native connector can export to S3 buckets where Datameer X can't call the getObjectMetadata() method.
The older S3 connector must be able to call the getObjectMetadata() method in order to read from and write to S3 buckets.
...
- Click the + (plus) button and select Connection or right-click in the browser and select Create new > Connection.
- From the drop-down list, select S3 Native as the connection type.
- Select the authentication process.
If using an access key and secret key, enter those below.
If using IAM, Datameer's S3 client uses the instance profile credentials to sign and authenticate the S3 requests.
Select an encryption algorithm, enter the AWS region, and enter the S3 bucket name.
Enter the uploading parameters.
Planning for memory usage:
By default, Datameer X starts 3 threads per stream, and each stream has a queue capacity of 3 tasks. The part size (the buffer size) is set to 10MB, so the default settings consume 3 x 3 x 10MB = 90MB of RAM.
Size planning:
Amazon limits any multipart upload to 10,000 parts. As a result, with the default settings each Datameer X task can upload a file of at most 100GB (10,000 x 10MB = 100GB). Increase the part size setting if you have a larger file. Note that this limit applies per Datameer X task: if you have five exporting tasks, the original worksheet data is divided into five pieces, each with a 100GB limit.
These settings can be overridden in each export job.
- If required, add a description and click Save.
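The memory and size planning arithmetic from the uploading parameters step can be sketched in a few lines of Python. This is only an illustrative calculator, not part of Datameer X: the function names and the `default_mb` parameter are hypothetical, 1GB is treated as 1,000MB to match the figures above, and the worksheet data is assumed to be split evenly across export tasks.

```python
import math

MAX_PARTS = 10_000  # Amazon's limit on parts per multipart upload
MB_PER_GB = 1_000   # assumption: matches the arithmetic 10,000 x 10MB = 100GB


def buffer_memory_mb(threads=3, queue_capacity=3, part_size_mb=10):
    """RAM consumed by upload buffers: threads x queue capacity x part size."""
    return threads * queue_capacity * part_size_mb


def max_upload_gb(part_size_mb=10, tasks=1):
    """Largest export that fits, given the part size and number of export tasks."""
    return MAX_PARTS * part_size_mb * tasks / MB_PER_GB


def min_part_size_mb(total_gb, tasks=1, default_mb=10):
    """Smallest whole-MB part size that keeps each task within 10,000 parts.

    Assumes the data is divided evenly across tasks and never goes
    below the default part size.
    """
    per_task_mb = total_gb * MB_PER_GB / tasks
    return max(default_mb, math.ceil(per_task_mb / MAX_PARTS))


print(buffer_memory_mb())                 # 90  (MB, default settings)
print(max_upload_gb())                    # 100.0 (GB, one task)
print(max_upload_gb(tasks=5))             # 500.0 (GB, five tasks)
print(min_part_size_mb(250))              # 25  (MB part size for a 250GB file)
print(buffer_memory_mb(part_size_mb=25))  # 225 (MB of RAM at that part size)
```

The last two lines show the trade-off: raising the part size to fit a larger file raises the buffer memory by the same factor.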
Importing Data with an S3 Native Connector (Not Available)
Importing with the S3 Native connector is currently unavailable; the connector is used for export only.
Learn about exporting data with the S3 Native connector.