...

RTA loads into memory the columns that users start to interact with (Active Columns) for aggregation and filtering operations. Because sizing depends on many variables, such as data types, the type of interaction with a column (filtering or aggregation), the number of records, cardinality, and the length of string columns, it is difficult to provide a single formula that covers every case. The best way to plan memory capacity is to start with the largest known datasets that are planned for use with Visual Explorer. For sizing this initial deployment, you can use this very conservative formula: 1GB of memory per 2.5 million to 8 million records (assuming 10 active columns). Next, a Datameer administrator can use the URL http://<RTA Master Node>:9200/_parlenecache/stats?pretty (requires the Elasticsearch HTTP interface to be enabled in the plug-in configuration) to test potential columns of interest and get a sense of how much memory the largest dataset requires. Note that additional users working with the same dataset do not increase memory requirements. See the Troubleshooting section below for additional information on finding the RTA Master Node URL.
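The rule of thumb and the stats lookup described above can be sketched in a few lines of Python. This is only an illustration, not part of the product: the function names are hypothetical, and the sketch assumes that "SumBytes" is a byte count stored at the top level of the JSON response.

```python
import json
from urllib.request import urlopen

# Bounds from the rule of thumb above: 1 GB of memory covers between
# 2.5 million and 8 million records, assuming 10 active columns.
RECORDS_PER_GB_LOW = 2_500_000
RECORDS_PER_GB_HIGH = 8_000_000


def estimate_memory_gb(records):
    """Return (optimistic_gb, conservative_gb) for a dataset with `records` rows."""
    optimistic = records / RECORDS_PER_GB_HIGH
    conservative = records / RECORDS_PER_GB_LOW
    return optimistic, conservative


def sum_bytes_gb(stats_json):
    """Read the top-level "SumBytes" value from a stats response, in GB.

    Assumption: the value is reported in bytes.
    """
    stats = json.loads(stats_json)
    return stats["SumBytes"] / 1024 ** 3


def fetch_stats(master_node):
    """Fetch the raw stats response from the RTA Master Node (hypothetical helper)."""
    url = f"http://{master_node}:9200/_parlenecache/stats?pretty"
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    low, high = estimate_memory_gb(100_000_000)
    print(f"Plan for roughly {low:.1f} GB to {high:.1f} GB")
```

For a dataset of 100 million records, the formula gives a planning range of 12.5GB to 40GB. Measuring the columns actually in use through the stats URL will usually give a lower figure than this conservative estimate.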

Example:

You have a dataset with 100 million records. After testing different dimensions and measures of interest, you learn that you need about 7GB of memory to touch 12 columns of this dataset. This figure is reported as the top-level "SumBytes" value at the URL above. This is the sample output from http://<RTA Master Node>:9200/_parlenecache/stats?pretty:

...

  • conductor.log & runtime-analytics.log
    • Useful to investigate problems with Runtime Analytics starting, stopping, or crashing.
  • runtime-analytics-query.log
    • Read this log to investigate problems with Runtime Analytics crashing, queries failing, or the wrong results being delivered.

Where is RTA Master Deployed?

Once the RTA deployment is live, go to the Yarn App and copy a URL from one of the active nodes holding the container:

...