Demos for MapR

The Datameer trial version provided with MapR's distribution includes several ready-to-use demos for exploring the functionality of Datameer and illustrating big data analytics use cases in the areas of customer behavior analytics and IT systems management.

Demo Structure

Each demo consists of one or more data sets, workbooks, and dashboards. Data sets can be seen in Uploaded Files, in the Data folder. Workbooks can be viewed, edited, and run from the Analytics folder, and visualizations can be viewed and modified from the Visualizations folder. We suggest you create copies of artifacts before making major changes in case you want to start over.

These demos also showcase role-based access control. We've defined users and groups, and applied permissions to all the artifacts, corresponding to some typical roles and usage patterns found in a Datameer environment. Datameer recommends you start by accessing the demo as user admin (password: "admin") before exploring the permissions management features.

If you have any questions, visit our online documentation. You can also contact Datameer at info@datameer.com for more information.

Behavioral Analytics

Clickstream analytics

This demo combines raw web server logs and a customer database to analyze each visitor's clickstream, enriched with their profile from a system of record. Datameer first sessionizes, filters, and enriches the raw data to produce the data relevant to the analysis. The demo then answers questions about click path, page dwell time, and session length and depth, and summarizes these across all visitors as well as unique and known customers to find the stickiest pages and establish high-value customer segments.
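
To make the sessionization step concrete, here is a minimal Python sketch of the general technique. It is not Datameer's implementation: the 30-minute inactivity timeout, the (visitor_id, timestamp, page) event layout, and the function names are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta
from itertools import groupby

# Assumption: a session ends after 30 minutes of inactivity.
# This is a common convention, not a value taken from the demo.
SESSION_TIMEOUT = timedelta(minutes=30)


def sessionize(events, timeout=SESSION_TIMEOUT):
    """Group (visitor_id, timestamp, page) events into per-visitor sessions."""
    sessions = []
    # Sort by visitor, then by time, so each visitor's clicks are contiguous.
    ordered = sorted(events, key=lambda e: (e[0], e[1]))
    for _, visitor_events in groupby(ordered, key=lambda e: e[0]):
        current = []
        for event in visitor_events:
            # A gap longer than the timeout closes the current session.
            if current and event[1] - current[-1][1] > timeout:
                sessions.append(current)
                current = []
            current.append(event)
        sessions.append(current)
    return sessions


def session_metrics(session):
    """Return the visitor, depth (pages viewed), and length of one session."""
    return {
        "visitor": session[0][0],
        "depth": len(session),
        "length_seconds": (session[-1][1] - session[0][1]).total_seconds(),
    }


# Hypothetical sample events, not taken from the weblogs data set.
events = [
    ("v1", datetime(2013, 1, 1, 10, 0), "/home"),
    ("v1", datetime(2013, 1, 1, 10, 5), "/products"),
    ("v1", datetime(2013, 1, 1, 12, 0), "/home"),
    ("v2", datetime(2013, 1, 1, 10, 1), "/home"),
]
sessions = sessionize(events)
metrics = [session_metrics(s) for s in sessions]
```

With this sample, visitor v1 produces two sessions (the two-hour gap exceeds the timeout) and visitor v2 produces one. Session depth and length fall out directly once the clicks are grouped this way.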

Here we illustrate the simplicity of correlating multiple data sources, and the ability to distribute the work of complex analytics across multiple users. The demo consists of two workbooks, ClickstreamPreProcessing and ClickstreamAnalytics. The ClickstreamPreprocessing workbook triggers the ClickstreamAnalytics workbook automatically upon completion (something you can modify by clicking Configure after selecting the ClickstreamAnalytics workbook).

The demo visualizes the results of the analytics pipeline in the dashboard Web_User_Behavior, and utilizes two data sets: weblogs, which contains raw log data from an Apache web server, and customers, which simulates the user profile table from a customer database.

Email analytics

This demo analyzes email content, looking at time distribution and finding patterns of word use and response to specific topics. This demo consists of the workbook EmailAnalytics, the dashboard Email_Analytics_Dashboard, and utilizes the data set email_archives.

IT Systems Management

Log file analytics

This demo examines web traffic via raw Apache log files to summarize the various types of errors experienced by users, identifying hotspots by time of day, page, and error type, as well as potential security threats. This demo consists of the workbook WebserverErrorAnalysis, the dashboard Web_Errors_Overview, and utilizes the data set weblogs.
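
As a rough illustration of this kind of error summary, the Python sketch below parses lines in the Apache Common Log Format and counts error responses by hour and by page. It is a sketch under stated assumptions, not the logic of the WebserverErrorAnalysis workbook: the regular expression, the function names, and the sample lines are hypothetical, and we treat any HTTP status of 400 or above as an error.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: log lines follow the Apache Common Log Format.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    return record


def summarize_errors(lines):
    """Count responses with status >= 400 by (hour, status) and by page."""
    by_hour_status = Counter()
    by_page = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None or record["status"] < 400:
            continue
        by_hour_status[(record["time"].hour, record["status"])] += 1
        by_page[record["path"]] += 1
    return by_hour_status, by_page


# Hypothetical sample lines, not taken from the weblogs data set.
sample = [
    '10.0.0.1 - - [10/Oct/2013:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2013:13:58:01 -0700] "GET /missing HTTP/1.1" 404 512',
    '10.0.0.3 - - [10/Oct/2013:14:02:10 -0700] "POST /login HTTP/1.1" 500 -',
    '10.0.0.2 - - [10/Oct/2013:14:05:00 -0700] "GET /missing HTTP/1.1" 404 512',
]
by_hour_status, by_page = summarize_errors(sample)
```

Here by_page ranks /missing highest with two errors, and by_hour_status separates the 13:00 errors from the 14:00 errors, which is the shape of result a time-of-day hotspot chart needs.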

MapR cluster usage

This demo aggregates and summarizes resource utilization and availability information from a MapR cluster, including CPU, memory, Hadoop jobs, service status, and errors. Dashboards visualize trends in these metrics over time. The demo is based on approximately one month of historical data captured from a large MapR cluster via a REST API. This data does not come from your own MapR cluster, although such data can also be ingested into Datameer. See Datameer's documentation for more information.

This demo also highlights the ETL capabilities of Datameer, cleansing and transforming raw, hierarchical, semi-structured data in JSON format into metrics analysts can easily work with. The data was collected from MapR's REST-based monitoring API. The MapR_Cluster_ETL workbook triggers the MapR_Cluster_Analytics workbook automatically upon completion (something you can modify by clicking Configure after selecting the MapR_Cluster_Analytics workbook).
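
To show what turning hierarchical JSON into flat metrics can look like, here is a small Python sketch of a generic flattening step. It does not reproduce the MapR_Cluster_ETL workbook, and the JSON document below is a made-up example: its field names are assumptions and do not reflect the actual response format of MapR's monitoring API.

```python
import json


def flatten(node, prefix=""):
    """Flatten nested dicts and lists into {dotted.path: scalar} pairs."""
    flat = {}
    if isinstance(node, dict):
        for key, value in node.items():
            flat.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            flat.update(flatten(value, f"{prefix}{index}."))
    else:
        # Drop the trailing dot that was added for the last path segment.
        flat[prefix[:-1]] = node
    return flat


# Hypothetical record; the structure and field names are illustrative only.
raw = """
{
  "timestamp": 1357000000,
  "node": {
    "hostname": "node1",
    "cpu": {"utilization": 42.5},
    "memory": {"used_mb": 2048, "total_mb": 8192},
    "services": ["fileserver", "tasktracker"]
  }
}
"""
row = flatten(json.loads(raw))
memory_used_pct = 100.0 * row["node.memory.used_mb"] / row["node.memory.total_mb"]
```

Each nested value ends up in its own column-like key, such as node.cpu.utilization, so derived metrics like the memory usage percentage become simple arithmetic on a flat row.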

The demo visualizes the results of the analytics pipeline in the dashboards MapR_Cluster_Utilization and MapR_Cluster_Status, and utilizes two data sets: MapR_cluster_stats, which contains system-level information, and MapR_node_stats, which contains node-level information.