...
- Has proper sizing been done? See the Capacity Planning section.
- Ensure that MultiParquetDataLoader is selected in the Data loader field under Advanced Configurations. This setting allows optimal parallelism when Datameer's job output is written as a single large part file.
- Have you used all configured memory? Delays can occur if new datasets start evicting other active indices in the memory cache. See the Capacity Planning section.
- Are you running on EMR? If so, note that the initial column reads and the Visual Explorer (VE) load time take longer than on an on-premises cluster due to EMR ↔ S3 latency.
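
To illustrate the memory-cache item in the checklist above, here is a minimal sketch of how loading a new dataset can evict an active index once the configured memory is fully used. It assumes a least-recently-used eviction policy and a fixed memory budget; the `IndexCache` class, the dataset names, and the sizes are all hypothetical and do not reflect the product's actual cache implementation.

```python
from collections import OrderedDict


class IndexCache:
    """Hypothetical LRU cache of dataset indices with a fixed memory budget (MB)."""

    def __init__(self, budget_mb):
        self.budget_mb = budget_mb
        self.entries = OrderedDict()  # dataset name -> size in MB, least recent first
        self.evicted = []             # names of evicted datasets, in eviction order

    def used_mb(self):
        return sum(self.entries.values())

    def load(self, name, size_mb):
        # A dataset that is already cached is only marked as recently used.
        if name in self.entries:
            self.entries.move_to_end(name)
            return
        # Evict least recently used indices until the new one fits the budget.
        while self.entries and self.used_mb() + size_mb > self.budget_mb:
            victim, _ = self.entries.popitem(last=False)
            self.evicted.append(victim)
        self.entries[name] = size_mb


cache = IndexCache(budget_mb=100)
cache.load("sales", 60)
cache.load("clicks", 30)
cache.load("sales", 60)       # touch: "clicks" is now the least recently used
cache.load("inventory", 40)   # 90 MB used + 40 MB exceeds the budget
print(cache.evicted)          # ['clicks']
```

In this sketch, the next query against the evicted dataset would have to reload its index, which is the kind of delay the checklist item describes. Sizing the memory budget above the combined size of the concurrently active indices avoids the eviction.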
...