...

  • Has proper sizing been done? See the Capacity Planning section.
  • Is MultiParquetDataLoader selected in the Data loader field in Advanced Configurations? This setting ensures optimal parallelism when Datameer's job output is written as one large part file.
  • Have you used all configured memory? Delays can occur if new datasets start evicting other active indices in the memory cache. See the Capacity Planning section.
  • Are you running on EMR? If so, note that the initial column reads and VE load take longer than on an on-premises cluster because of EMR ↔ S3 latency.

RTA takes a long time to start:

...