Financial Fraud Detection: Insider Trading, Market Manipulation, Fraud



SLIDE 1

1/23/2010

Data Mining in Business Intelligence

Professor Hui Xiong

I/UCRC Center for Dynamic Data Analytics Rutgers University

Data Mining Tasks

Data

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes
11   No      Married         60K             No
12   Yes     Divorced        220K            No
13   No      Single          85K             Yes
14   No      Married         75K             No
15   No      Single          90K             Yes


Financial Fraud Detection

  • Insider Trading, Market Manipulation, Fraud
SLIDE 2

Spoiled by One Very Rotten Apple ‐ Rogue Trader’s $7.14 Billion Loss

  • Biggest Bank Fraud in History

– 2008: Bank Societe Generale

– $7.14 Billion Loss

  • A single futures trader, Jerome Kerviel, ran a scheme of fictitious transactions

  • China Aviation Oil (CAO): trading under Chen Jiulin led to a loss of $550 million

Business & Economic Networks

Example: eBay bidding

– vertices: eBay users
– links: bidder-seller or buyer-seller relationships
– fraud detection: bidding rings

Example: corporate boards

– vertices: corporations
– links: between companies that share a board member

Example: corporate partnerships

– vertices: corporations
– links: formal joint ventures

Example: goods exchange networks

– vertices: buyers and sellers of commodities
– links: “permissible” transactions

A Sample Network of Board of Directors
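
The corporate-boards example above (vertices are corporations, links join companies that share a board member) can be sketched in a few lines. This is only an illustration under stated assumptions: the company and director names are hypothetical, and the function name `board_interlock_edges` is not from the slides.

```python
from collections import defaultdict
from itertools import combinations


def board_interlock_edges(boards):
    """Map each pair of companies to the set of directors they share.

    `boards` maps a company name to the set of its directors.
    """
    # Invert the mapping: director -> companies on whose boards they sit.
    companies_by_director = defaultdict(set)
    for company, directors in boards.items():
        for director in directors:
            companies_by_director[director].add(company)

    # Every pair of companies served by the same director becomes a link.
    edges = defaultdict(set)
    for director, companies in companies_by_director.items():
        for a, b in combinations(sorted(companies), 2):
            edges[(a, b)].add(director)
    return dict(edges)


# Hypothetical boards, for illustration only.
boards = {
    "AlphaCorp": {"Ann", "Bob"},
    "BetaInc": {"Bob", "Cid"},
    "GammaLLC": {"Cid", "Dee"},
    "DeltaCo": {"Eve"},
}
edges = board_interlock_edges(boards)
for (a, b), shared in sorted(edges.items()):
    print(a, "--", b, "via", sorted(shared))
```

Companies with no shared director (here `DeltaCo`) simply have no links, which is what makes isolated and densely connected boards easy to tell apart.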

SLIDE 3

Financial Fraud Detection

  • Cross‐account/channel Fraud Detection

– Money transfer (ring of traders, multiple accounts)
– Price manipulations (incoming or outgoing stars, potentially with losses)

  • Fraud Risk Propagation in Corporation Networks
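
The two cross-account patterns named above, rings and stars, correspond to simple graph structures: a ring is a directed cycle of transfers, and a star is an account with many distinct incoming or outgoing partners. The sketch below is a minimal illustration, not the project's detection method. The account names, the `max_len` bound, and the `min_degree` threshold are all assumptions.

```python
def find_rings(transfers, max_len=4):
    """Return simple directed cycles of 2..max_len accounts.

    Each cycle is reported once, starting at its smallest account name.
    """
    graph = {}
    for src, dst in transfers:
        graph.setdefault(src, set()).add(dst)

    rings = []

    def dfs(start, node, path):
        for nxt in sorted(graph.get(node, ())):
            if nxt == start and len(path) >= 2:
                rings.append(tuple(path))
            elif nxt > start and nxt not in path and len(path) < max_len:
                dfs(start, nxt, path + [nxt])

    for start in sorted(graph):
        dfs(start, start, [start])
    return rings


def find_stars(transfers, min_degree=3):
    """Return (outgoing_stars, incoming_stars) as sorted account lists."""
    out_partners, in_partners = {}, {}
    for src, dst in transfers:
        out_partners.setdefault(src, set()).add(dst)
        in_partners.setdefault(dst, set()).add(src)
    outgoing = sorted(a for a, p in out_partners.items() if len(p) >= min_degree)
    incoming = sorted(a for a, p in in_partners.items() if len(p) >= min_degree)
    return outgoing, incoming


# Hypothetical transfers (source account, destination account).
transfers = [
    ("A", "B"), ("B", "C"), ("C", "A"),   # a ring of three accounts
    ("H", "X"), ("H", "Y"), ("H", "Z"),   # an outgoing star centred on H
    ("X", "Y"),
]
rings = find_rings(transfers)
outgoing_stars, incoming_stars = find_stars(transfers)
print(rings, outgoing_stars, incoming_stars)
```

Bounding the cycle length keeps the search cheap on large graphs. A production system would also need to weigh amounts and timing, which this sketch ignores.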

Deliverables

  • First 6 months

– Build a database of bankrupt companies with associated information, such as their boards of directors

  • 12 months and associated knowledge transfer

– A demo system for detecting fraud / short signals

Cab Location Traces

  • 500 taxi drivers
  • About 30 days of data in San Francisco
  • Spatial‐temporal sequence

– Latitude
– Longitude
– Identifier of business (1 indicates with passenger, 0 indicates no passenger)
– Time stamp
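
A record with the four fields listed above could be parsed as follows. The slides do not give the file layout, so this sketch assumes one whitespace-separated line per sample, in the order latitude, longitude, passenger flag, and UNIX epoch time stamp. The sample line and the names `TracePoint` and `parse_trace_line` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TracePoint:
    latitude: float
    longitude: float
    occupied: bool        # True when the flag is 1 (with passenger)
    timestamp: datetime   # timezone-aware, UTC


def parse_trace_line(line):
    """Parse 'lat lon flag epoch' (assumed layout) into a TracePoint."""
    lat, lon, flag, epoch = line.split()
    if flag not in ("0", "1"):
        raise ValueError(f"passenger flag must be 0 or 1, got {flag!r}")
    return TracePoint(
        latitude=float(lat),
        longitude=float(lon),
        occupied=(flag == "1"),
        timestamp=datetime.fromtimestamp(int(epoch), tz=timezone.utc),
    )


# Hypothetical sample line, for illustration only.
point = parse_trace_line("37.75134 -122.39488 1 1213084687")
print(point)
```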

SLIDE 4

Profiling Driver Behaviors

Profile driver behaviors to identify transportation-related green knowledge, i.e. highly effective use of energy, safe driving, and the driving patterns that affect gasoline consumption

Method

  • Driver segmentation
  • Trajectory clustering

Ref: Transecurity

Understanding Cab Driver Behaviors

Energy‐related Knowledge Discovery

  • Driver segmentation based on effective driving time

– Ratio between driving time with customers and driving time without customers

  • Clustering of effective pick‐up points
  • Frequent trajectory with customers
  • Frequent trajectory without customers
  • Moving pattern of most profitable drivers
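
The effective-driving-time ratio used for driver segmentation above can be computed directly from a time-ordered trace. The sketch below assumes each sample is a `(seconds, occupied)` pair and that the passenger flag holds until the next sample. The segment labels and the `threshold` value are hypothetical, not taken from the slides.

```python
def effective_driving_ratio(samples):
    """Return time with customers divided by time without customers.

    `samples` is a time-ordered list of (timestamp_seconds, occupied) pairs;
    each interval is attributed to the flag at its start.
    """
    occupied_time = 0
    vacant_time = 0
    for (t0, occupied), (t1, _) in zip(samples, samples[1:]):
        if occupied:
            occupied_time += t1 - t0
        else:
            vacant_time += t1 - t0
    if vacant_time == 0:
        return float("inf") if occupied_time else 0.0
    return occupied_time / vacant_time


def segment_drivers(ratios, threshold=1.0):
    """Label each driver 'high' or 'low' by comparing the ratio to a threshold."""
    return {d: ("high" if r >= threshold else "low") for d, r in ratios.items()}


# Hypothetical traces: (seconds since start, passenger flag).
traces = {
    "driver_a": [(0, 0), (60, 1), (180, 1), (240, 0), (300, 0)],
    "driver_b": [(0, 0), (120, 0), (180, 1), (240, 0)],
}
ratios = {d: effective_driving_ratio(s) for d, s in traces.items()}
segments = segment_drivers(ratios)
print(ratios, segments)
```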
SLIDE 5


Energy‐Efficient Mobile Recommender Systems

  • Recommend routes

– Suggest a sequence of pick‐up points for cab drivers in real time, based on knowledge learned from historical data
– Suggest avoiding areas that may lead to less effective use of gasoline

  • Knowledge for Safety Driving Training
  • Pattern for Cab Driver Coaching and Feedback
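
One plausible way to score a sequence of pick‐up points is the expected distance driven without a passenger. The sketch below is an illustrative stand-in, not the system described in the slides. It assumes planar coordinates, independent pick‐up probabilities estimated from history, and a hypothetical `miss_penalty` distance charged when the whole route yields no passenger. It searches all orderings by brute force, which is only feasible for a handful of candidates.

```python
from itertools import permutations
from math import hypot


def expected_vacant_distance(start, route, points, miss_penalty):
    """Expected distance driven empty along `route`.

    `points` maps a name to (x, y, pickup_probability). A leg is only driven
    if every earlier point failed to produce a passenger.
    """
    expected = 0.0
    p_vacant = 1.0
    x, y = start
    for name in route:
        px, py, prob = points[name]
        expected += p_vacant * hypot(px - x, py - y)
        p_vacant *= 1.0 - prob
        x, y = px, py
    return expected + p_vacant * miss_penalty


def recommend_sequence(start, points, length, miss_penalty):
    """Return the ordering of `length` points with the lowest expected cost."""
    return min(
        permutations(sorted(points), length),
        key=lambda route: expected_vacant_distance(start, route, points, miss_penalty),
    )


# Hypothetical candidate pick-up points: name -> (x, y, pickup probability).
points = {"A": (1.0, 0.0, 0.2), "B": (0.0, 2.0, 0.8), "C": (3.0, 0.0, 0.5)}
best = recommend_sequence((0.0, 0.0), points, length=2, miss_penalty=10.0)
print(best)
```

With real latitude and longitude values, the straight-line `hypot` distance would need to be replaced by a road or geodesic distance.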

Context‐Aware Customer Service Support

Customer service support: an integral part of most companies

Customer Service Problem Log

  • Structured attributes: limited information
  • Unstructured attributes
  • A Sample Problem Log Entry
SLIDE 6


Context‐Aware Customer Service Support

  • User behaviors identified from problem logs
  • Demographic information of Customers
  • Multi‐focal Learning

Multi‐focal Learning: An illustration

  • Multi‐focal learning: partition the training data into several different focal groups and build a prediction model within each focal group
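
The partition-then-model idea above can be sketched as follows. The per-group model here is deliberately trivial (a majority-label lookup), standing in for whatever learner would actually be used. The field names and problem-log rows are hypothetical.

```python
from collections import Counter, defaultdict


def fit_majority_model(rows, feature, label):
    """Fit a lookup model: feature value -> most common label in `rows`."""
    by_value = defaultdict(Counter)
    overall = Counter()
    for row in rows:
        by_value[row[feature]][row[label]] += 1
        overall[row[label]] += 1
    table = {value: counts.most_common(1)[0][0] for value, counts in by_value.items()}
    default = overall.most_common(1)[0][0]
    return lambda row: table.get(row[feature], default)


def fit_multifocal(rows, focal, feature, label):
    """Partition rows by the focal attribute and fit one model per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[focal]].append(row)
    return {group: fit_majority_model(members, feature, label)
            for group, members in groups.items()}


def predict(models, row, focal):
    """Route the row to the model of its own focal group."""
    return models[row[focal]](row)


# Hypothetical problem-log entries; 'expertise' plays the role of focal group.
logs = [
    {"expertise": "novice", "symptom": "no connection", "fix": "check cable"},
    {"expertise": "novice", "symptom": "no connection", "fix": "check cable"},
    {"expertise": "novice", "symptom": "slow", "fix": "restart modem"},
    {"expertise": "expert", "symptom": "no connection", "fix": "replace firmware"},
    {"expertise": "expert", "symptom": "slow", "fix": "change channel"},
]
models = fit_multifocal(logs, focal="expertise", feature="symptom", label="fix")
novice_fix = predict(models, {"expertise": "novice", "symptom": "no connection"}, "expertise")
expert_fix = predict(models, {"expertise": "expert", "symptom": "no connection"}, "expertise")
print(novice_fix, "|", expert_fix)
```

The same symptom leads to different predictions in different focal groups, which is the effect a single global model would blur.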

Deliverables

  • First 6 months

– Context‐aware feature selection
– Multi‐source demographic customer data collection

  • 12 months and associated knowledge transfer

– A software package for context‐aware multi‐focal learning for customer service support

SLIDE 7

NSF Industry/University Center for Dynamic Data Analytics (CDDA) Project Summary

Project Name: An Energy-Efficient Mobile Recommender System

Project Investigators: Hui Xiong

Description: The increasing availability of large-scale location traces creates unprecedented opportunities to change the paradigm for knowledge discovery in transportation systems. A particularly promising area is the extraction of energy-efficient transportation patterns (green knowledge), which can guide the reduction of inefficiencies in the energy consumption of transportation sectors. However, extracting green knowledge from location traces is not a trivial task. Conventional data analysis tools are usually not designed to handle the massive, complex, dynamic, and distributed nature of location traces. To that end, this project provides a focused study of extracting energy-efficient transportation patterns from location traces. Specifically, the initial focus is on sequences of mobile recommendations. As a case study, we will develop a mobile recommender system that can recommend a sequence of pick-up points for taxi drivers. The goal of this mobile recommendation system is to maximize the probability of business success.

Experimental Plan:

  • Sept. 10: Data Preprocessing
  • Dec 10: Algorithm Design
  • Spring 11: Testing of algorithms
  • Fall 11: Performance Evaluation

Related Work Elsewhere:

  • Classic recommender systems are focused on traditional application domains, such as commercial item recommendation

How Ours Is Different:

  • Mobile recommender systems are under-explored
  • Recommendation based on business success instead of user ratings

Related Work in Center:

  • Vision and data analysis applications
  • DHS work on camera networks

Milestones:

  • 2010-2011: Focus on algorithm development
  • 2011: Implementation of a demo system and evaluation of the performance of energy-efficient mobile recommendation

Deliverables:

  • Technical demonstration along with a technical report resulting in a publication

Budget: $50,000

Potential Benefits to Member Companies:

  • Ideas for developing energy-efficient location based services
SLIDE 8

NSF Industry/University Center for Dynamic Data Analytics (CDDA) Project Summary

Project Name: Mobile Web Usage Profiling for System Performance Tuning

Project Investigators: Hui Xiong

Description: The objective of this proposed research is to profile the behaviors of mobile web users. Due to differences in age, profession, gender, and cultural background, mobile users may exhibit a large degree of diversity in how they access the mobile Internet. Understanding this diversity, as well as extracting similarities in user patterns, is thus critical to designing and developing future mobile applications that are centered on mobile search. To address this need, we have obtained web usage logs from a mobile service provider and propose to perform a detailed analysis of the logs. Specifically, we propose to analyze the logs using user segmentation, which clusters users with similar behaviors based on their demographic data, search keywords, and click histories. This research poses challenges in, as well as advances the development of, both data mining and mobile computing. By the end of the project, we expect to develop a set of techniques that can effectively characterize users’ usage patterns and a list of observations that can be leveraged for improving the performance of mobile Web sites.
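
User segmentation as described above groups users whose behavior is similar. The sketch below is a deliberately simple stand-in for a real clustering method: it uses only search keywords, measures similarity with the Jaccard index, and assigns each user greedily to the first segment whose seed user is similar enough. The user IDs, keywords, and `threshold` are hypothetical, and demographic data and click histories are left out.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def segment_users(keywords_by_user, threshold=0.5):
    """Greedy one-pass segmentation on keyword overlap.

    Users are visited in sorted order; each joins the first segment whose
    seed user's keywords reach `threshold` similarity, else starts a new one.
    """
    segments = []  # list of (seed_keywords, member_ids)
    for user in sorted(keywords_by_user):
        keywords = keywords_by_user[user]
        for seed_keywords, members in segments:
            if jaccard(seed_keywords, keywords) >= threshold:
                members.append(user)
                break
        else:
            segments.append((keywords, [user]))
    return [members for _, members in segments]


# Hypothetical search keywords per user.
keywords_by_user = {
    "u1": {"news", "weather", "sports"},
    "u2": {"news", "weather"},
    "u3": {"games", "music"},
    "u4": {"music", "games", "video"},
}
user_segments = segment_users(keywords_by_user)
print(user_segments)
```

The result depends on visiting order and on the threshold, which is one reason a real study would prefer an established clustering algorithm over this greedy pass.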

Experimental Plan:

  • Sept. 10: Data Preprocessing
  • Dec 10: Algorithm Design
  • Spring 11: Testing of algorithms
  • Fall 11: Performance Evaluation

Related Work Elsewhere:

  • Customer Segmentation
  • Customer Profiling

How Ours Is Different:

  • Cross-information-source collaborative customer analysis

Related Work in Center:

  • Vision and data analysis applications

Milestones:

  • 2010-2011: Focus on algorithm development
  • 2011: Testing of a demo system for customer analysis; evaluation of its performance

Deliverables:

  • Technical demonstration along with a technical report resulting in a publication

Budget: $50,000

Potential Benefits to Member Companies:

  • Techniques for multi-source and context-aware customer analysis