INFO
This chapter describes how to optimize a workbook.
Avoiding Unexpectedly Large Parquet Files
INFO
Kept sheets in a Datameer workbook that contain large strings (e.g. 200 MB of JSON) result in large Parquet files.
To avoid large Parquet files, the following options are available:
- restructure the workbook chain into a multi-layer data transformation in which only the required data is defined as a kept/result sheet
- increase the JVM heap space for the Datameer job
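The first option can be illustrated outside of Datameer with a minimal Python sketch. It is not Datameer code and uses no Datameer API; the field names (`id`, `status`, `payload`) and the helper `project_required_fields` are hypothetical. The sketch only shows the principle: an intermediate layer holds the large JSON strings, and the result layer that is kept contains just the fields that are actually needed.

```python
import json


def project_required_fields(raw_json: str, required: list[str]) -> dict:
    """Parse one JSON string and return only the fields named in `required`."""
    record = json.loads(raw_json)
    return {name: record.get(name) for name in required}


# Intermediate layer (hypothetical data): large JSON strings that should NOT be kept.
raw_rows = [
    json.dumps({"id": 1, "status": "ok", "payload": "x" * 10_000}),
    json.dumps({"id": 2, "status": "failed", "payload": "y" * 10_000}),
]

# Result layer: only the small, required fields are carried forward and kept.
kept_rows = [project_required_fields(row, ["id", "status"]) for row in raw_rows]

# Compare the serialized sizes of both layers.
raw_size = sum(len(row) for row in raw_rows)
kept_size = sum(len(json.dumps(row)) for row in kept_rows)
print(f"raw layer: {raw_size} characters, kept layer: {kept_size} characters")
```

In a workbook chain, the same idea corresponds to extracting the needed values in a downstream sheet and marking only that sheet as kept.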
Deletion of Non-kept Sheets
INFO
Limiting kept sheets to those that are used by a downstream Workbook or an export job reduces a Workbook's processing time and storage cost.
To avoid long processing times, an indicator shows which sheets are consumed and therefore need to be kept, while non-kept sheets can be deleted:
- find the 'Consumer' indicator on the Workbook Setting page and on the Workbook Details page
- the 'Consumer' indicator helps admins and users identify which portions of old work no longer need to be kept
- delete non-kept sheets that are no longer used
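The review described above can be sketched as a small Python script. This is an illustration only, assuming the sheet names, kept flags, and consumers have been noted down by hand from the 'Consumer' indicator; the `Sheet` class and both helper functions are hypothetical and do not call any Datameer API.

```python
from dataclasses import dataclass, field


@dataclass
class Sheet:
    """One workbook sheet as noted from the UI (hypothetical structure)."""
    name: str
    kept: bool
    consumers: list[str] = field(default_factory=list)


def kept_without_consumers(sheets: list[Sheet]) -> list[str]:
    """Kept sheets that nothing consumes: candidates for no longer being kept."""
    return [s.name for s in sheets if s.kept and not s.consumers]


def unused_non_kept(sheets: list[Sheet]) -> list[str]:
    """Non-kept sheets that nothing consumes: candidates for deletion."""
    return [s.name for s in sheets if not s.kept and not s.consumers]


# Hypothetical inventory of one workbook.
inventory = [
    Sheet("raw_events", kept=False, consumers=["daily_summary"]),
    Sheet("daily_summary", kept=True, consumers=["export_to_warehouse"]),
    Sheet("old_experiment", kept=True),
    Sheet("scratch", kept=False),
]

print("review kept flag:", kept_without_consumers(inventory))
print("candidates for deletion:", unused_non_kept(inventory))
```

Both lists are only suggestions for a manual review; whether a sheet can really be removed still has to be confirmed in the workbook itself.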