Credit Card Fraud

The Credit Card Fraud application demonstrates how Datameer's powerful analytic tools are able to be used to find where credit card fraud transpires. 

Uses for the Credit Card Fraud App

Credit card theft happens everyday all around the world and companies need to secure their products. Credit card fraud is always changing and some credit card thieves have become more sophisticated adapting to circumvent credit card security parameters.

For example, criminals can now make numerous, small transactions that are seemingly benign. If a credit card company has a customer who normally only spends $100-$200 in a grocery store every few weeks but suddenly starts making multiple $10-$20 charges over a few days, there is a potential security risk. By analyzing point of sale, geolocation, authorization, and transaction data with Datameer, this financial customer was able to identify fraud patterns in historical data. This analysis helped the firm identify over $2 billion in fraud. By applying the fraud model to new transactions, the company was able to identify points of potential fraud and proactively notify customers.


Identifying Potential Fraudulent Security Risks

Data (file uploads and import files)

The basis of all analytic processes is having the data. Datameer uses two file uploads in this application as a source for the credit card company's data. This data covers all the transactions that the credit card company will analyze.

  • CreditCardData
    • VendorCategoryCode (number for vendor category. grocery, airline, hotel, etc.)
    • VendorID (vendor number)
    • Vendor (vendor name)
    • Date (date and time of each credit card transaction)
    • AmountUSD (amount of credit card transactions)
    • CreditCardNumber (credit card being used for each transaction)
    • isFraud (was the card reported as missing)
  • CatCodes
    • CategoryCode (number for vendor category)
    • Category (name of vendor category)

Analysis (workbooks)

After the data has been added into Datameer, workbooks are created to analyze that data for possible fraudulent transactions.

  • Outliers by Category
    • Where and when cards that were reported missing were used
  • Outliers by Spending
    • Where and when cards were spending outside of their normal spending pattern
  • Affected Vendors
    • Which vendors at what times had the largest occurrences of possible fraud

Outliers by category

The credit card company is looking for potential vendor types, locations, and times where credit cards are used after they have been reported missing in the Outliers by Category calculation workbook.

The credit card data and the category code references are loaded into a workbook. A GROUPBY() function is used to group the category codes that represent vendor types. They are then separated by 3 day intervals using a GROUPBYBIN() function and the number of flagged transaction occurrences are counted during those days.

An inner join is used to join the two data sources together so that the category codes can be translated into category names.

The last page in the workbook groups by the date and displays the top 5 category types that had the most occurrences of a credit card being used after it was reported as missing. 

The chart below shows that grocery stores towards the end of the year experience the most fraudulent alerts for a card being used after it was reported missing. The credit card company could then create additional security measures to take into account this data to prevent thefts.

Outliers by spending

People are creatures of habit. When normal patterns are broken it is a good indicator that something is not as it should. The Outliers by Spending workbook analyzes credit card data to look for when spending patterns deviate. 

Both data files are added to a workbook. The first calculation sheet groups by the category code and takes an average of the amount spent at each category type. A group standard deviation is preformed on the total amount spent per group.

double inner join is made to create a sheet that joins both the imported credit card data vendor codes to the category code vendor names as well as the credit card data category codes to the transaction categories.

The next sheet creates a filter to display any outliers in normal spending patterns:

-- ABS(#AmountUSD- #AverageSpend) > 2* #SpendStandardDeviation – This filtered pattern looks for the absolute number (disregarding +/-) of the amount of the credit card charge minus the average spending for the user at that vendor. If this number is twice the amount of the standard spending deviation then it will be considered an outlier.

The final worksheet groups by the outliers purchase date and category. A GROUPCOUNT() function is used to display the number of deviating fraudulent events.

Affected vendors

The credit card company wanted to identify the highest risk for fraud by individual store and date in this workbook.

Using built-in Datameer functions like GROUPBY() and GROUPAVERAGE(), you can create a list on how often individual stores receive fraudulent flags. Using the same functions on the vendor category lets you know how often each vendor type receives flagged transactions. Finally, using the function GROUPBYBIN() on the credit card transaction dates lets you create a date range for each of the vendors.

sheet join is used to link all three of the calculation sheets together. A final HotVendors worksheet is created to identify which vendors had the highest amount of fraudulent flags during three day periods.