Knowledge Discovery from Transportation Network Data: Paper Review

Jiang, W., Vaidya, J., Balaporia, Z., Clifton, C., and Banich, B. Knowledge Discovery from Transportation Network Data. In ICDE, 2005.


SLIDE 1

Knowledge Discovery from Transportation Network Data

Paper Review

Jiang, W., Vaidya, J., Balaporia, Z., Clifton, C., and Banich, B. Knowledge Discovery from Transportation Network Data. In ICDE, 2005

SLIDE 2

Outline

  • Background.
  • Experiments.
  • - Structurally Similar Routes
  • - Temporally Repeated Routes
  • Experiment results.
  • Conventional techniques.
  • New challenges.

SLIDE 3

A natural application area for Data Mining

  • Transportation and logistics are an important sector of the economy.
  • - Transportation consumes 60% of oil worldwide.
  • Data mining has led to significant gains in other areas.
  • Computer use is widespread in transportation and logistics.
  • - Inventory management, parcel tracking, and even on-truck location sensors.

SLIDE 4

Existing Applications

Data Mining

  • Mining the transactional characteristics of freight and events.
  • - e.g., classification on safety/accident records might find that trucks are prone to accidents at 7:00 AM on east-west roads.
  • - Does not use the geometry of the network.

Network Structure

  • Optimization
  • - Finds a solution (minimize cost).

SLIDE 5

Transportation Networks

  • Graph problems
  • Graph mining
  • - e.g., finding the frequent sub-graphs
  • - Algorithms: WARMR, AGM, SUBDUE, FSG

SLIDE 6

Dataset

  • Six months of origin-destination (OD) data from a large third-party logistics company: 98,292 transactions.
  • Represented as a directed graph by mapping locations to vertices.
  • Each transaction can then be represented as the edge of an OD pair.
  • The edges are labeled with the other attributes of the transaction: pickup date, delivery date, distance, hours, weight, and mode (discretized using a binning strategy).
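The mapping from OD records to a labeled directed graph can be sketched as follows. This is a minimal illustration, not the authors' code: the bin boundaries, the label names, and the helper names `bin_distance` and `build_graph` are all assumptions, since the slides do not give the actual binning strategy.

```python
from collections import defaultdict

# Hypothetical bin edges in miles; the slides do not state the real bins.
DISTANCE_BINS = [
    (0, 100, "short"),
    (100, 500, "medium"),
    (500, float("inf"), "long"),
]

def bin_distance(miles):
    """Map a continuous distance to a discrete edge label."""
    for low, high, label in DISTANCE_BINS:
        if low <= miles < high:
            return label
    raise ValueError(f"distance out of range: {miles}")

def build_graph(transactions):
    """Build a directed multigraph: origin -> list of (destination, edge_label)."""
    graph = defaultdict(list)
    for origin, destination, miles in transactions:
        graph[origin].append((destination, bin_distance(miles)))
    return graph

# Made-up OD records: (origin, destination, distance in miles).
records = [("A", "B", 40), ("A", "C", 250), ("B", "C", 900)]
graph = build_graph(records)
```

Each location becomes a vertex, each record becomes one edge, and the binned attribute becomes the edge label, so repeated shipments on the same OD pair show up as parallel edges.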

SLIDE 7

SLIDE 8

Mining Interests

  • Structurally Similar Routes
  • - Identify structurally similar patterns that occur in many locations.
  • - Methods: SUBDUE, FSG
  • Temporally Repeated Routes
  • - Find patterns of routes repeated in time, rather than space.
  • - Method: FSG

SLIDE 9

Structurally Similar Routes

  • We assign all vertices the same label.
  • Three variants for edge labels: weight, distance, and time.

  • - OD_TD : TOTAL-DISTANCE
  • - OD_GW : GROSS-WEIGHT
  • - OD_TH : MOVE-TRANSIT-HOURS

SLIDE 10

Experiments with SUBDUE (MDL principle)

SUBDUE: A substructure discovery system

Results:

  • Took about 3.25 hours on a graph of 100 vertices and 561 edges to find the best 3 patterns with a beam size of 4.
  • Would need 6 months on the complete graph.
  • Results were trivial.
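For context, the MDL principle named in the slide title can be stated compactly. The notation here is an assumption for illustration and does not come from the slides: G is the input graph, S is a candidate substructure, DL(S) is the number of bits needed to describe S, and DL(G | S) is the number of bits needed to describe G after each instance of S has been replaced by a single vertex.

```latex
\[
  S^{*} = \arg\min_{S} \bigl[\, DL(S) + DL(G \mid S) \,\bigr]
\]
```

A substructure scores well when it is small to describe and occurs often enough that collapsing its instances shortens the overall encoding of the graph.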

SLIDE 11

  • Significant traffic from node 2 to node 4 via node 3, but not much return traffic (deadheading).
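One simple way to surface this kind of one-directional traffic is to compare the volume on each lane with the volume in the opposite direction. The sketch below is an illustration under assumptions: the function name `imbalanced_lanes`, both thresholds, and the shipment data are made up and do not come from the paper.

```python
from collections import Counter

def imbalanced_lanes(shipments, min_forward=3, max_return_ratio=0.25):
    """Return (origin, destination, forward, back) for lanes with little return traffic.

    shipments is an iterable of (origin, destination) pairs.
    """
    volume = Counter(shipments)
    flagged = []
    for (origin, destination), forward in sorted(volume.items()):
        back = volume.get((destination, origin), 0)
        # Flag busy lanes whose return volume is a small fraction of the forward volume.
        if forward >= min_forward and back <= max_return_ratio * forward:
            flagged.append((origin, destination, forward, back))
    return flagged

# Made-up shipments echoing the 2 -> 3 -> 4 pattern on the slide.
shipments = [(2, 3)] * 5 + [(3, 4)] * 4 + [(4, 3)] * 1 + [(3, 2)] * 1
lanes = imbalanced_lanes(shipments)
```

With this data both legs of the 2 -> 3 -> 4 route are flagged, which is the situation in which trucks are likely to return empty.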

SLIDE 12

Experiments with FSG

  • FSG mines patterns across a set of graph transactions.
  • To apply it to a single graph, the graph is divided into multiple distinct sub-graphs, and each sub-graph is treated as a separate transaction.
  • - Breadth-first partitioning
  • - Depth-first partitioning
  • - Both may result in patterns being broken across partitions.
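The breadth-first variant can be sketched as below. This is one plausible reading of the idea, not the procedure used in the paper, which the slides do not spell out: the function name `bfs_partitions`, the fixed edge budget per partition, and the toy edge list are assumptions.

```python
from collections import deque

def bfs_partitions(edges, max_edges):
    """Split a directed edge list into partitions of at most max_edges edges,
    visiting edges in breadth-first order from each unvisited start vertex."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
        adjacency.setdefault(dst, [])
    seen_vertices = set()
    partitions, current = [], []
    for start in adjacency:
        if start in seen_vertices:
            continue
        seen_vertices.add(start)
        queue = deque([start])
        while queue:
            vertex = queue.popleft()
            for neighbor in adjacency[vertex]:
                current.append((vertex, neighbor))
                # Close the partition once it reaches the edge budget.
                if len(current) == max_edges:
                    partitions.append(current)
                    current = []
                if neighbor not in seen_vertices:
                    seen_vertices.add(neighbor)
                    queue.append(neighbor)
    if current:
        partitions.append(current)
    return partitions

# Toy graph, made up for illustration.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
parts = bfs_partitions(edges, max_edges=2)
```

In this toy run the path B -> D -> E ends up split between the second and third partitions, which illustrates how a pattern can be broken across partitions.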

SLIDE 13

Results

  • Partition sizes: 400, 800, 1200, and 1600.
  • Depth-first partitioning: 200 frequent patterns were found with a minimum support of 120.
  • Breadth-first partitioning: 667 frequent patterns were found with a minimum support of 240.
  • Had runtime and memory problems with lower supports on the breadth-first partitions.
  • FSG is not an appropriate tool for mining recurring patterns in a large single graph.

SLIDE 14

SLIDE 15

Temporally Repeated Routes

  • FSG
  • Exploits the temporal nature of the transportation graph.
  • Partition the graph into a set of graph transactions based on date.
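Date-based partitioning can be sketched as follows. It is a simplified illustration that only counts single routes (edges) repeated across days, whereas FSG searches for larger connected sub-graphs. The function names, the dates, and the shipment records are made up.

```python
from collections import Counter, defaultdict

def routes_by_date(shipments):
    """Group (pickup_date, origin, destination) records into one edge set per date."""
    daily = defaultdict(set)
    for pickup_date, origin, destination in shipments:
        daily[pickup_date].add((origin, destination))
    return daily

def repeated_routes(daily, min_support):
    """Return routes that appear on at least min_support distinct dates."""
    support = Counter()
    for edges in daily.values():
        # Each date's edge set is a set, so a route counts once per date.
        support.update(edges)
    return {route: days for route, days in support.items() if days >= min_support}

# Made-up records: (pickup date, origin, destination).
shipments = [
    ("2004-01-05", "A", "B"), ("2004-01-05", "B", "C"),
    ("2004-01-06", "A", "B"),
    ("2004-01-07", "A", "B"), ("2004-01-07", "B", "C"), ("2004-01-07", "A", "B"),
]
daily = routes_by_date(shipments)
repeated = repeated_routes(daily, min_support=3)
```

Each date yields one graph transaction, and the support of a route is the number of dates on which it occurs, regardless of how many shipments ran that day.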

SLIDE 16

Results

  • Unable to run FSG on the entire data set due to insufficient memory / swap space.
  • Most discovered patterns were small. (The slide shows the biggest one.)

SLIDE 17

Patterns Discovered by Using Conventional Mining Algorithms

  • Mapped the dataset into a standard “transactional” representation.
  • Used traditional data mining approaches.
  • Used Weka for association rule mining, instance (tuple) classification, and cluster analysis on the transportation data.
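To make the transactional view concrete, the sketch below computes the support and confidence of a single association rule over tuples of binned attributes. It is plain Python for illustration, not Weka's API, and the helper name `rule_stats`, the attribute values, and the rows are made up.

```python
def rule_stats(rows, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    rows is a list of dicts; antecedent and consequent are dicts of
    attribute -> required value.
    """
    def matches(row, items):
        return all(row.get(key) == value for key, value in items.items())

    n_antecedent = sum(matches(row, antecedent) for row in rows)
    n_both = sum(
        matches(row, antecedent) and matches(row, consequent) for row in rows
    )
    support = n_both / len(rows)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence

# Made-up tuples: one per shipment, with binned attribute values.
rows = [
    {"mode": "truck", "distance": "short", "weight": "light"},
    {"mode": "truck", "distance": "short", "weight": "heavy"},
    {"mode": "rail", "distance": "long", "weight": "heavy"},
    {"mode": "truck", "distance": "long", "weight": "heavy"},
]
support, confidence = rule_stats(rows, {"distance": "short"}, {"mode": "truck"})
```

Note that each row describes a shipment in isolation, so nothing in this representation records how routes connect to one another.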

SLIDE 18

Evaluations of Conventional Algorithms

  • Traditional data mining techniques have produced interesting and meaningful results to summarize our data.
  • Further experimentation is required to explore the potential and limitations of these techniques on temporal transportation network data.
  • They lose some insights from the structural characteristics of the data.

SLIDE 19

Challenges for Data Mining Research

  • Handling the temporal aspects of graphs (dynamic graphs).
  • Incorporating the notion of events into a graph.
  • Expanding graph mining techniques beyond data similar to molecular structures.
  • Determining what makes a graph pattern interesting.