INFO
This chapter describes how to optimize a workbook.
Avoiding Unexpectedly Large Parquet Files
INFO
Kept sheets in a Datameer workbook that contain large strings (e.g. 200 MB of JSON) result in large Parquet files.
If the content of a column in a kept sheet exceeds the size limit, a warning is displayed in the preview data.
<insert img>
To avoid large Parquet files, the following options apply:
- Restructure the workbook chain into a multi-layer data transformation in which only the required data is defined as a kept (result) sheet.
- Increase the JVM heap space for the Datameer job.
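
The first option can be pictured outside Datameer with a minimal Python sketch. It is not Datameer code and does not use any Datameer API; the function names, the field names (`id`, `status`, `payload`), and the sample record are all hypothetical. It only illustrates the idea: an intermediate layer parses the large JSON string, and the kept layer stores just the fields that are needed, so the large string never reaches the stored result.

```python
import json


def extract_required(raw_json, required_fields):
    """Intermediate layer: parse the large JSON string and keep only the required fields."""
    record = json.loads(raw_json)
    return {field: record.get(field) for field in required_fields}


def largest_string_bytes(row):
    """Return the size in bytes (UTF-8) of the largest string value in a row."""
    return max(
        (len(value.encode("utf-8")) for value in row.values() if isinstance(value, str)),
        default=0,
    )


# Hypothetical source record: small metadata plus a large payload string.
raw = json.dumps({"id": 7, "status": "ok", "payload": "x" * 1_000_000})

# Layer 1 (not kept): the row still carries the full JSON string.
unkept_row = {"raw": raw}

# Layer 2 (kept/result): only the fields that downstream consumers need.
kept_row = extract_required(raw, ["id", "status"])

print("largest string before:", largest_string_bytes(unkept_row), "bytes")
print("largest string after: ", largest_string_bytes(kept_row), "bytes")
```

In a workbook, the equivalent would be to leave the sheet holding the raw JSON column unkept and to mark only the sheet with the extracted columns as kept.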