AML

AML Issues

The AML team, comprising Dr. Sajjad, Saleha, Osama, Asma, and Farah, held its first meeting, in which Osama and Farah were briefed about the project.
 * **Discussion on 5th May 2010**


 * **Discussion on April 24, 2010 (Saturday)**

A meeting was arranged to discuss AML issues with Mr. Nauman Sheikh and two of the CreditChux team members. The following issues have been discussed so far:
 * A Basic Banking Account is an account the bank opens mostly for two purposes. One is when a loan is approved: the bank opens a deposit account and links it to the credit account. The other is for employees who would avail minimal facilities (no ATM, etc.). At the other end, a Value account is for high-end customers.
 * The number of HBL DPA accounts is three and their credit amounts are very high. Should we keep these outliers or not? At this moment we will not remove them; we will first take input from a bank official and then decide.
 * We can discard locker accounts.
 * Discard all other account types and take only the top three into consideration: Basic Banking, PLS Saving, and Current accounts.
 * Remove ATM switch fee, withholding tax, and transaction charges.
 * Remove all those whose average amount is less than 1000.
 * Remove all those whose transaction count is less than 100.
 * Blank ages are those that have not been provided.
 * An interesting pattern is observed for ages between 48 and 55: an average of 10 transactions.
 * An interesting pattern is observed in Karachi North: the average number of transactions is much higher than in other regions.
 * Bank Draft Proceeds is a credit and Bank Draft Issued is a debit.
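
The filtering rules above could be sketched roughly as follows. This is only an illustration: the field names (`account_id`, `account_type`, `txn_type`, `amount`) and the label strings are hypothetical, and it assumes the two thresholds are applied per account after the fee and tax transactions have been removed, which the notes do not state explicitly.

```python
# Hypothetical labels; the real data uses the bank's own account types and transaction codes.
KEPT_ACCOUNT_TYPES = {"Basic Banking", "PLS Saving", "Current"}
EXCLUDED_TXN_TYPES = {"ATM switch fee", "Withholding tax", "Transaction charges"}
MIN_AVG_AMOUNT = 1000        # threshold quoted in the notes
MIN_TRANSACTION_COUNT = 100  # threshold quoted in the notes


def clean_transactions(transactions):
    """Return only the transactions that survive the agreed filters.

    Each transaction is a dict with the assumed keys
    'account_id', 'account_type', 'txn_type' and 'amount'.
    """
    # Step 1: drop irrelevant account types and fee/tax transactions.
    kept = [
        t for t in transactions
        if t["account_type"] in KEPT_ACCOUNT_TYPES
        and t["txn_type"] not in EXCLUDED_TXN_TYPES
    ]

    # Step 2: collect the remaining amounts per account.
    amounts_by_account = {}
    for t in kept:
        amounts_by_account.setdefault(t["account_id"], []).append(t["amount"])

    # Step 3: keep accounts that meet both thresholds (assumed to be per account).
    active = {
        account
        for account, amounts in amounts_by_account.items()
        if len(amounts) >= MIN_TRANSACTION_COUNT
        and sum(amounts) / len(amounts) >= MIN_AVG_AMOUNT
    }
    return [t for t in kept if t["account_id"] in active]
```

Whether the HBL DPA accounts stay in or out is left untouched here, since that decision is pending input from the bank.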


 * **Discussion on April 21, 2010 (Wednesday)**


 * The following steps will be performed:
 * Clean the data by removing the irrelevant account types and transaction codes and then recompute the previously agreed variables.
 * Run clustering algorithms on the cleaned (and selected training) data set and store the cluster specifications (average value, lower bound, and upper bound of each attribute). To begin with, we can focus only on K-Means.
 * Assign each customer to a cluster.
 * Process the test data sequentially and compare each record against its corresponding cluster; if the record is considered an outlier, raise a flag.
 * The flagged records should be a small percentage (2-5%) of the test data. Adjust the cluster specifications and outlier metrics accordingly to meet this percentage.
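
A minimal sketch of the cluster specification and flagging steps, assuming the customers have already been assigned to clusters. The outlier metric used here (bounds widened by a `tolerance` fraction of each attribute's range) is an assumption made for illustration; the notes leave the metric open. All function names are hypothetical.

```python
def cluster_spec(rows):
    """Per-attribute (average, lower bound, upper bound) for one cluster's rows."""
    columns = list(zip(*rows))
    return [(sum(c) / len(c), min(c), max(c)) for c in columns]


def is_outlier(record, spec, tolerance):
    """Flag a record if any attribute falls outside the widened bounds.

    Bounds are widened by `tolerance` times the attribute's range
    (an assumed metric, not one fixed in the notes).
    """
    for value, (_, lower, upper) in zip(record, spec):
        margin = tolerance * (upper - lower)
        if value < lower - margin or value > upper + margin:
            return True
    return False


def flag_rate(test_records, specs, tolerance):
    """Fraction of (cluster_id, record) test pairs flagged as outliers."""
    flagged = sum(is_outlier(rec, specs[cid], tolerance) for cid, rec in test_records)
    return flagged / len(test_records)


def tune_tolerance(test_records, specs, target_high=0.05, step=0.05, max_steps=200):
    """Widen the bounds step by step until at most `target_high` is flagged."""
    tolerance = 0.0
    for _ in range(max_steps):
        if flag_rate(test_records, specs, tolerance) <= target_high:
            break
        tolerance += step
    return tolerance
```

This only enforces the upper end of the 2-5% range. If too few records are flagged, the same loop could be run in the other direction to narrow the bounds.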


 * **Discussion on April 7, 2010 (Wednesday)**


 * It was decided that in the beginning we would focus on the following variables:
 * Average Monthly Withdrawal, Average Monthly Deposit, Average Number of Withdrawal Transactions, Average Number of Deposit Transactions, Average Start of the Month Withdrawal, Average Start of the Month Deposit, Average End of the Month Withdrawal, Average End of the Month Deposit
 * Branch Code, Birth Year, Account Type, Emp_Ind, Gender, NTC_NBR_GVN_IND, Max Year to Date Balance
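
As an illustration of how the first four aggregate variables might be derived for one customer, here is a small sketch. It assumes each transaction is a `(date, amount)` pair with deposits positive and withdrawals negative, and that averages are taken over months with any activity; both are assumptions, as is the function name. The start-of-month and end-of-month variables are left out because the notes do not define those windows.

```python
from collections import defaultdict
from datetime import date


def monthly_aggregates(transactions):
    """Average monthly deposit/withdrawal amounts and counts for one customer."""
    months = defaultdict(lambda: {"dep_amt": 0.0, "wd_amt": 0.0, "dep_n": 0, "wd_n": 0})
    for when, amount in transactions:
        bucket = months[(when.year, when.month)]
        if amount >= 0:
            bucket["dep_amt"] += amount
            bucket["dep_n"] += 1
        else:
            bucket["wd_amt"] += -amount  # store withdrawals as positive totals
            bucket["wd_n"] += 1

    n_months = len(months) or 1  # avoid division by zero for an empty history

    def average(key):
        return sum(b[key] for b in months.values()) / n_months

    return {
        "avg_monthly_deposit": average("dep_amt"),
        "avg_monthly_withdrawal": average("wd_amt"),
        "avg_deposit_count": average("dep_n"),
        "avg_withdrawal_count": average("wd_n"),
    }


# Example with made-up transactions for a single customer.
history = [
    (date(2010, 1, 5), 1000.0),
    (date(2010, 1, 20), 3000.0),
    (date(2010, 1, 25), -500.0),
    (date(2010, 2, 3), 2000.0),
    (date(2010, 2, 10), -1500.0),
    (date(2010, 2, 27), -500.0),
]
print(monthly_aggregates(history))
```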


 * The continuous attributes would be discretized first and then K-Means would be applied to form K clusters.
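
A rough sketch of "discretize, then cluster", using only the standard library. Equal-width binning and seeding K-Means from the first K distinct points are assumptions chosen to keep the example short and deterministic; the notes do not fix either choice.

```python
def discretize(values, n_bins):
    """Equal-width binning: map each value to a bin index in [0, n_bins - 1]."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins or 1.0  # fall back to 1.0 when all values are equal
    return [min(int((v - low) / width), n_bins - 1) for v in values]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=100):
    """Plain Lloyd's algorithm; returns (centroids, labels).

    Centroids are seeded with the first k distinct points (an assumed,
    deterministic choice; random seeding is more common in practice).
    """
    seeds = list(dict.fromkeys(tuple(p) for p in points))[:k]
    if len(seeds) < k:
        raise ValueError("need at least k distinct points")
    centroids = [list(s) for s in seeds]

    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Example with made-up values for two attributes of six customers.
deposits = discretize([100, 120, 110, 5000, 5200, 5100], n_bins=5)
counts = discretize([2, 3, 2, 40, 42, 41], n_bins=5)
centroids, labels = k_means(list(zip(deposits, counts)), k=2)
```

Discretizing first puts attributes with very different scales (amounts versus counts) on a comparable footing before distances are computed.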


 * The initial benchmark is the SARs (Suspicious Activity Reports) generated by the bank. Our first goal is to replicate (or even improve on) the SAR generation process.


 * In the next step, we would focus on those aggregate variables that capture the sequence aspect too. For instance,
 * Average Delay in Two Consecutive Withdrawals, Average Delay in Two Consecutive Deposits, Average Ratio between Two Consecutive Deposits, Average Ratio between Two Consecutive Withdrawals, etc.
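
The sequence-aware variables listed above might be computed along these lines. The sketch assumes delays are measured in days and that a ratio is the later amount divided by the earlier one; neither definition is given in the notes, and the function names are hypothetical.

```python
from datetime import date


def average_delay_days(dates):
    """Mean gap in days between consecutive transaction dates."""
    ordered = sorted(dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps) if gaps else None


def average_consecutive_ratio(amounts):
    """Mean of amounts[i + 1] / amounts[i] for amounts in chronological order.

    Pairs whose earlier amount is zero are skipped to avoid division by zero.
    """
    ratios = [later / earlier for earlier, later in zip(amounts, amounts[1:]) if earlier != 0]
    return sum(ratios) / len(ratios) if ratios else None


# Example with made-up withdrawals for one customer.
withdrawal_dates = [date(2010, 1, 1), date(2010, 1, 4), date(2010, 1, 10)]
withdrawal_amounts = [100.0, 200.0, 100.0]
delay = average_delay_days(withdrawal_dates)            # gaps of 3 and 6 days
ratio = average_consecutive_ratio(withdrawal_amounts)   # ratios of 2.0 and 0.5
```

Both functions return `None` when there are fewer than two transactions, so such customers can be handled separately rather than being given a misleading zero.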