R10246002 陳俊憲、R10246006 鄭書承


Task Description - Suspicious Activity Report Prediction

    For the financial industry, anti-money laundering is an unavoidable challenge. Criminals use various money laundering channels to disguise illegal funds in order to evade investigation and prosecution, and may even use the proceeds of crime to fund further illegal activity. Because financial institutions hold large amounts of the public's funds, an institution that does not actively scrutinize the transactions it handles risks becoming a laundering channel for criminal groups, which damages its own reputation and disrupts the order of the financial market. In addition, financial criminals constantly adopt emerging technologies and channels to conceal illegal income, and money laundering techniques keep evolving, so relying on manual review alone is clearly insufficient for identifying suspicious activity.

Goal: predict whether an alert corresponds to suspicious (money laundering) activity.

Data Preprocessing and Feature Engineering

Dataset

custinfo: Customer data indexed by alert keys, including risk level, asset quantity, occupation, and age.

ccba: Monthly credit card transaction data indexed by customer information.

cdtx: Detailed information for a single purchase indexed by customer information.

remit: Foreign exchange trade data indexed by customer information.

dp: Detailed information for each individual debit or credit transaction, indexed by customer information.

Preprocessing

Approach 1: Join all of the other customer data onto each alert key, so that every alert row carries the information available for its customer.
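
The join in Approach 1 could be sketched roughly as follows. This is only a minimal illustration on toy data, not the pipeline actually used in this report. The column names (`alert_key`, `cust_id`, `risk_rank`, `amt`, `trade_amount_usd`), the helper functions `summarize` and `build_features`, and the choice of count/sum/mean aggregates are all assumptions made for the example. It uses pandas: each per-transaction table is first collapsed to one row per customer, then left-joined onto the alert-level table so that no alert is dropped.

```python
import pandas as pd

# Toy stand-ins for the real tables. All column names are hypothetical.
custinfo = pd.DataFrame({
    "alert_key": [1, 2, 3],
    "cust_id": ["a", "a", "b"],
    "risk_rank": [1, 1, 3],
})
cdtx = pd.DataFrame({
    "cust_id": ["a", "a", "b"],
    "amt": [100.0, 250.0, 40.0],
})
remit = pd.DataFrame({
    "cust_id": ["a"],
    "trade_amount_usd": [5000.0],
})


def summarize(table, value_col, prefix):
    """Collapse a per-transaction table to one row per customer."""
    agg = table.groupby("cust_id")[value_col].agg(["count", "sum", "mean"])
    agg.columns = [f"{prefix}_{name}" for name in agg.columns]
    return agg.reset_index()


def build_features(alerts, customer_summaries):
    """Left-join each per-customer summary onto the alert-level table."""
    features = alerts.copy()
    for summary in customer_summaries:
        # how="left" keeps every alert, even if the customer has no rows
        # in a given table; the missing aggregates become NaN.
        features = features.merge(summary, on="cust_id", how="left")
    return features


features = build_features(
    custinfo,
    [
        summarize(cdtx, "amt", "cdtx"),
        summarize(remit, "trade_amount_usd", "remit"),
    ],
)
print(features)
```

Aggregating before the merge keeps exactly one row per alert key. Merging the raw transaction tables directly would instead duplicate each alert once per matching transaction.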