GROUPCOUNTDISTINCT

Syntax

GROUPCOUNTDISTINCT(<any>)

Description

Counts the distinct values of a group.

Example

Given the following data:

Customer_	Name	Item
2535	Jon	Zamioculcas
2888	Mike	Spider Lily
2535	Jon	Datura
2535	Jon	Dahlia
5788	Jon	Snapdragon
2888	Mike	Marguerite
2535	Jon	Windflower
2888	Mike	Elephant's Ear
2535	Jon	Chinese Evergreen
5788	Jon	Begonia
5788	Jon	Starfish Plant
2535	Jon	Hare's Foot Fern
2535	Jon	Venus Flytrap
3545	Monique	Red Flame Ivy

First create a group using GROUPBY(#RawData!Name)

Name
Jon
Mike
Monique

Then use the GROUPCOUNTDISTINCT(#RawData!Customer_), and the results are a count of each time an individual value appears in that column in relation to the GROUPBY() column.

Name	Customer_COUNTDISTINCT
Jon	2
Mike	1
Monique	1

Known issue:

When working with the GROUPCOUNTDISTINCT function on a dataset with a very high number of records in a group, out of memory exceptions might occur.

Cause:

In order to determine the true distinct value, GROUPCOUNTDISTINCT shifts all mapped information to disk on one reducer. This can cause disk space and performance/memory issues on large datasets. This is a known limitation that Datameer X is working to resolve.

Solution:

In order to work around this issue, an intermediary sheet along with a combination of GROUPBY and GROUPCOUNT functions is suggested as described below.

Original:

Sheet 1: user_id, product

Sheet 2: GROUPBY(Sheet1.product), GROUPCOUNTDISTINCT(Sheet2.user_id)

Workaround:

Sheet 1: user_id, produt

Sheet 2: groupby(Sheet1.product), groupby(Sheet1.user_id)

Sheet 3: group by(Sheet2.product), groupcount

This work around gives the same desired output of the GROUPCOUNTDISTINCT function.

GROUPCOUNTDISTINCT

[data-colorid=f9kilqwhpx]{color:#333d47} html[data-color-mode=dark] [data-colorid=f9kilqwhpx]{color:#b8c2cc}Syntax

Description

Example

Syntax