Configuring Hive Server2 as a Connection

Configuring a HiveServer2 connection is similar to configuring Hive.

...

The Hive plug-in is included by default with the Datameer installation and is used to import from and export to Hive servers. To find it, open the Admin tab and select Plug-ins from the menu.

...

...

As of Datameer v6.1

Click the cog icon under actions to configure Datameer's Hive plug-in. One of the plug-in's features is to cache

Export Settings

Two options control how Datameer data field types are mapped to HiveServer2 types on export.

  • Datameer classic is the default setting for the plug-in. The mapping specifics can be found on Data Field Types in Datameer: Hive Server2 Mapping.
  • Hive specific is similar to the default setting, except that Datameer's BigDecimal is mapped to Hive's Decimal and Datameer's Date is mapped to Hive's Timestamp data type.
    • BigDecimal
      • New Table - Datameer exports the BigDecimal type to the Hive type Decimal with a precision (total number of digits) of 38 and a scale (number of digits to the right of the point) of 18.
      • Existing Table - Datameer exports the BigDecimal type to the Hive type Decimal with the precision and scale defined on the Hive server. 
        • The maximum precision and scale are (38,37).
      • If an exported value doesn't fit within the precision/scale of either a new or existing table, a failure occurs. 
    • Date
      • New Table - Datameer exports the Date type to the Hive type Timestamp.
      • Existing Table - Datameer exports the Date type to the Hive type Timestamp/Date/String depending on what is defined on the Hive server.
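
The precision and scale rule for BigDecimal can be checked ahead of an export with a short script. The following is a minimal sketch, not Datameer's implementation: the function name fits_decimal is hypothetical, and it assumes that a value with more fractional digits than the scale allows counts as not fitting (the text above does not say whether such values are rounded or rejected).

```python
from decimal import Decimal


def fits_decimal(value: Decimal, precision: int = 38, scale: int = 18) -> bool:
    """Return True if value fits DECIMAL(precision, scale) without rounding.

    Hypothetical helper. Defaults match the new-table mapping described
    above: precision 38 (total digits), scale 18 (digits after the point).
    """
    if not value.is_finite():
        return False  # NaN and Infinity have no DECIMAL representation
    if value == 0:
        return True
    _, digits, exponent = value.as_tuple()
    digits = list(digits)
    # Trailing fractional zeros carry no information, so ignore them.
    while exponent < 0 and len(digits) > 1 and digits[-1] == 0:
        digits.pop()
        exponent += 1
    fractional_digits = max(-exponent, 0)
    integer_digits = max(len(digits) + exponent, 0)
    # DECIMAL(p, s) leaves p - s digits for the integer part.
    return fractional_digits <= scale and integer_digits <= precision - scale


if __name__ == "__main__":
    for text in ["123.45", "1E+19", "1E+20", "0.1234567890123456789"]:
        print(text, fits_decimal(Decimal(text)))
```

For an existing table, pass the precision and scale defined on the Hive server instead of the defaults, for example fits_decimal(value, 38, 37) for the maximum setting.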

The selected mapping mode has no effect when exporting into an existing partitioned Hive table.

Cache

Datameer caches Java objects that represent partitions on HiveServer2. These objects contain locations, column names, and other information, and Datameer stores them on a local disk. When Datameer needs to read Hive partitions, it uses the stored cache instead of retrieving the same information each time, which improves performance.

...

From the Hive plug-in configuration settings, you can clear the current cache or start filling it immediately, without waiting for the automated renewal.

  • Clear the cache to remove stored partition data that is no longer used and is decreasing performance.
  • Fill the cache manually before the automatic update to cache new or changed Hive partition data and increase performance.

The feature is unavailable for HiveServer1.
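
The caching behavior described in this section (read from local disk when possible, clear on demand, fill ahead of the automatic renewal) can be illustrated with a small disk-backed cache. This is a minimal sketch of the general pattern only, not Datameer's code: the class PartitionCache, its methods, and the fetch callback are all hypothetical names, and JSON files stand in for the serialized Java objects.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, List


class PartitionCache:
    """Hypothetical disk-backed cache of partition metadata, one file per table."""

    def __init__(self, cache_dir: str, fetch: Callable[[str], List[dict]]):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        # fetch stands in for the expensive metadata call to the Hive server.
        self.fetch = fetch

    def _path(self, table: str) -> Path:
        # Assumes plain table names that are safe to use as file names.
        return self.cache_dir / f"{table}.json"

    def _refresh(self, table: str) -> List[dict]:
        partitions = self.fetch(table)
        self._path(table).write_text(json.dumps(partitions))
        return partitions

    def get(self, table: str) -> List[dict]:
        """Return cached partitions, fetching only on a cache miss."""
        path = self._path(table)
        if path.exists():
            return json.loads(path.read_text())
        return self._refresh(table)

    def clear(self) -> None:
        """Drop all stored partition data."""
        for path in self.cache_dir.glob("*.json"):
            path.unlink()

    def fill(self, tables: Iterable[str]) -> None:
        """Re-fetch and store partitions now instead of waiting for a renewal."""
        for table in tables:
            self._refresh(table)
```

In this sketch, clear corresponds to removing stale partition data and fill corresponds to picking up new or changed partitions before the next scheduled update.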

...