Importing from a Cloud Storage
INFO
Find all information on how to import from a cloud storage here.
Supported Cloud Storages
Datameer X supports importing from:
- Azure Data Lake Gen 2
- Google Cloud Storage
Importing from a Cloud Storage
Azure Data Lake Gen 2
To import data from Azure Data Lake Storage Gen 2:
- Click the "+" button and choose "Import Job" or right-click in the file browser and select "Create New" → "Import Job". The 'New Import Job' tab appears in the menu bar.
- Click "Select Connection". The dialog 'Select Connection' opens.
- Click the connection for Azure Data Lake Gen 2 and confirm with "Select". The connection is displayed.
- Select the required file type from the drop-down "File Type" and confirm with "Next".
- Enter the file or folder name as it is named in your storage.
- Define the delimiter character, the schema and the column names.
INFO: The default value for delimiter is ','.
- Select the schema of the imported data.
- If the first row does not contain the column names, clear the check-box.
INFO: The check-box is selected by default, which means the first row is expected to contain the column names.
- If you want to filter by date and time, select the filter method from the drop-down.
INFO: For the filter mode 'Fixed dates', select the start date and end date from the calendar.
INFO: For the filter mode 'Dynamic dates', enter a date expression as the start and end expression, e.g. 'TODAY()-4d'.
- If needed, exclude data by file modification date by entering the number of days.
- If needed, modify the advanced settings, e.g. the character encoding, and confirm with "Next". The tab 'Data Fields' opens.
- Confirm with "Next". The tab 'Define Fields' opens.
- Mark all required columns.
- If needed, enter placeholder values and confirm with "Apply".
- Decide how to handle invalid data.
- Decide whether you want to partition the data and confirm with "Next". The tab 'Schedule' opens.
INFO: If you have checked 'Partition Data', enter a date expression and select the data format from the drop-down.
- Decide whether the import shall be triggered manually or on a schedule.
- Select the option for data retention.
- If needed, enter the number of sample records and the maximum number of errors to log and confirm with "Next". The tab 'Schedule' opens.
INFO: Higher values lead to more precise preview results but can rapidly decrease the performance.
- If needed, enter an import job description.
- Clear the check-box if the import should not start immediately after saving.
INFO: The check-box is selected by default, so the import starts right after the import job is saved.
- If needed, enter the email address for notifications and confirm with "Next". The 'Save Import Job' dialog opens.
- Select the path the data shall be imported to, enter a name and confirm with "Save". The data import from Azure Data Lake Storage Gen 2 is finished.
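To preview how the delimiter, first-row and dynamic-date settings from the steps above affect a file before creating the import job, a small local check can help. The following is a minimal Python sketch and not Datameer X code: the function names, the placeholder column names and the reading of 'TODAY()-4d' as "four days before today" are assumptions made for illustration only.

```python
import csv
import io
from datetime import date, timedelta


def parse_delimited(text, delimiter=",", first_row_is_header=True):
    """Split delimited text into (column_names, rows).

    Mirrors the 'delimiter' and 'column names in first row' settings
    conceptually; the placeholder names below are hypothetical.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return [], []
    if first_row_is_header:
        return rows[0], rows[1:]
    # No header row: generate placeholder names and keep every row as data.
    names = [f"column_{i + 1}" for i in range(len(rows[0]))]
    return names, rows


def window_start(today, days_back):
    """Assumed meaning of an expression like TODAY()-4d: today minus N days."""
    return today - timedelta(days=days_back)


def in_window(day, start, end):
    """True if 'day' lies within the inclusive start/end range."""
    return start <= day <= end


if __name__ == "__main__":
    sample = "id,name\n1,alpha\n2,beta\n"
    names, rows = parse_delimited(sample)
    print(names, rows)

    today = date(2024, 1, 10)
    start = window_start(today, 4)
    print(start, in_window(date(2024, 1, 8), start, today))
```

With the default settings, the first row of the sample becomes the column names and the remaining rows are data. Passing first_row_is_header=False keeps every row as data, which corresponds to clearing the check-box in the wizard.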