Knowledge Discovery from Transportation Network Data: Paper Review

  1. Knowledge Discovery from Transportation Network Data: Paper Review
     Jiang, W., Vaidya, J., Balaporia, Z., Clifton, C., and Banich, B. Knowledge Discovery from Transportation Network Data. In ICDE, 2005.

  2. Outline
     ● Background.
     ● Experiments.
        -- Structurally Similar Routes
        -- Temporally Repeated Routes
     ● Experiment results.
     ● Conventional techniques.
     ● New challenges.

  3. A natural application area for Data Mining
     ● Transportation and logistics are an important sector of the economy.
        -- Transportation consumes 60% of oil worldwide.
     ● Data mining has led to significant gains in other areas.
     ● Computer use is widespread in transportation and logistics.
        -- Inventory management, parcel tracking, and even on-truck location sensors.

  4. Existing Applications
     Data Mining
     ● Mining with transactional characteristics of freight and events.
        -- e.g., classification on safety/accident records might find that trucks are prone to accidents at 7:00 AM on east-west roads.
        -- No geometry of the network is used.
     Network Structure
     ● Optimization
        -- Finds a solution (minimize cost).

  5. Transportation Networks
     ● Graph problems
     ● Graph mining, i.e., finding the frequent sub-graphs
     Algorithms
        * WARMR
        * AGM
        * SUBDUE
        * FSG

  6. Dataset
     ● Six months of origin-destination (OD) data from a large third-party logistics company: 98,292 transactions.
     ● Represented as a directed graph by mapping locations to vertices.
     ● Each transaction can then be represented as the edge of an OD pair.
     ● The edges are labeled with the other attributes of the transaction: pickup date, delivery date, distance, hours, weight, and mode (binning strategy).
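The graph representation described on the Dataset slide can be sketched as follows. This is a minimal illustration, not the paper's implementation: the field names (`origin`, `destination`, `distance`), the bin boundaries, and the helper names `bin_value` and `build_od_graph` are all assumptions made for the example, since the slides do not specify the actual schema or binning thresholds.

```python
from collections import defaultdict

def bin_value(value, upper_bounds):
    """Return the index of the first bin whose upper bound is >= value."""
    for i, upper in enumerate(upper_bounds):
        if value <= upper:
            return i
    return len(upper_bounds)  # overflow bin for values above every bound

def build_od_graph(transactions, distance_bins):
    """Map locations to vertices and each transaction to a labeled directed edge."""
    graph = defaultdict(list)  # origin -> list of (destination, binned label)
    for t in transactions:
        label = bin_value(t["distance"], distance_bins)
        graph[t["origin"]].append((t["destination"], label))
    return dict(graph)

# Hypothetical records; the real data set has 98,292 transactions.
transactions = [
    {"origin": "A", "destination": "B", "distance": 120.0},
    {"origin": "B", "destination": "C", "distance": 480.0},
    {"origin": "A", "destination": "B", "distance": 95.0},
]
od_graph = build_od_graph(transactions, distance_bins=[100, 250, 500])
```

Repeated shipments between the same OD pair become parallel edges, so the result is a directed multigraph. Binning keeps the number of distinct edge labels small, which matters for frequent sub-graph miners that match labels exactly.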

  8. Mining Interests
     ● Structurally Similar Routes
        -- Identify structurally similar patterns that occur in many locations.
        Methods
        * SUBDUE
        * FSG
     ● Temporally Repeated Routes
        -- Find patterns of routes repeated in time, rather than space.
        Method
        * FSG

  9. Structurally Similar Routes
     ● We assign all vertices the same label.
     ● Three variants for edge labels: weight, distance, and time.
        -- OD_TD : TOTAL-DISTANCE
        -- OD_GW : GROSS-WEIGHT
        -- OD_TH : MOVE-TRANSIT-HOURS

  10. Experiments with SUBDUE (MDL principle)
     SUBDUE: A substructure discovery system
     Results:
     ● Took about 3.25 hours on a graph of 100 vertices and 561 edges to find the best 3 patterns with a beam size of 4.
     ● Would need 6 months on the complete graph.
     ● Results were trivial.

  11. ● Significant traffic from node 2 to node 4 via node 3, but not much return traffic (deadheading).
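One simple way to surface this kind of one-directional flow is to compare shipment counts in each direction of an OD pair. The sketch below is an illustrative assumption, not the method used in the paper (which found the pattern through graph mining); the function name `directional_imbalance` and the sample edge counts are hypothetical.

```python
from collections import Counter

def directional_imbalance(edges):
    """For each directed pair, report how many more trips go forward than back."""
    counts = Counter(edges)
    imbalance = {}
    for (u, v), forward in counts.items():
        backward = counts.get((v, u), 0)
        if forward > backward:
            imbalance[(u, v)] = forward - backward
    return imbalance

# Hypothetical counts mirroring the slide: heavy 2 -> 3 -> 4 traffic, little return.
edges = [(2, 3)] * 5 + [(3, 4)] * 5 + [(4, 3)] + [(3, 2)]
surplus = directional_imbalance(edges)
```

Pairs with a large surplus are candidates for deadheading, since trucks that deliver along them have little freight to carry back.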

  12. Experiments with FSG
     ● FSG mines patterns across a set of graph transactions.
     ● Divides the single graph into multiple distinct sub-graphs, and treats each sub-graph as a separate transaction.
        -- Breadth-first partitioning
        -- Depth-first partitioning
        -- Both may result in patterns being broken across partitions.
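The breadth-first variant can be sketched as below. This is a rough illustration under stated assumptions: the slides do not give the exact partitioning procedure, so the choice to cut partitions by edge count, the adjacency format (origin mapped to a list of `(destination, label)` pairs), and the name `breadth_first_partitions` are all hypothetical.

```python
from collections import deque

def breadth_first_partitions(adjacency, max_edges):
    """Split a directed graph's edges into chunks of at most max_edges, in BFS order."""
    visited = set()
    partitions, current = [], []
    for start in adjacency:  # restart the search to cover disconnected parts
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, label in adjacency.get(u, []):
                current.append((u, v, label))
                if len(current) == max_edges:  # partition is full, start a new one
                    partitions.append(current)
                    current = []
                if v not in visited:
                    visited.add(v)
                    queue.append(v)
    if current:
        partitions.append(current)
    return partitions

# Small hypothetical graph: A -> B -> D, A -> C -> D, D -> A.
adjacency = {
    "A": [("B", 0), ("C", 1)],
    "B": [("D", 0)],
    "C": [("D", 2)],
    "D": [("A", 1)],
}
parts = breadth_first_partitions(adjacency, max_edges=2)
```

In this example the route A -> B -> D ends up split between the first and second partition, which is the pattern-breaking problem the slide mentions. A depth-first variant would swap the queue for a stack and break different patterns.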

  13. Results
     ● Partition sizes: 400, 800, 1200, and 1600.
     ● Depth-first partitioning: 200 frequent patterns were found with a minimum support of 120.
     ● Breadth-first partitioning: 667 frequent patterns were found with a minimum support of 240.
     ● Had runtime and memory problems with lower supports on the breadth-first partitions.
     ● FSG is not an appropriate tool for mining recurring patterns in a large single graph.

  15. Temporally Repeated Routes
     ● FSG
     ● Exploits the temporal nature of the transportation graph.
     ● Partition each graph into a set of graph transactions based on date.
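Date-based partitioning can be sketched as follows: each day's shipments form one graph transaction, and a route's support is the number of days on which it appears. The field names (`pickup_date`, `origin`, `destination`) and both function names are assumptions for illustration, and counting a single edge is only a stand-in for the multi-edge patterns that FSG actually mines.

```python
from collections import defaultdict

def partition_by_date(transactions, date_key="pickup_date"):
    """Group OD edges into one graph transaction per date."""
    daily = defaultdict(set)
    for t in transactions:
        daily[t[date_key]].add((t["origin"], t["destination"]))
    return dict(daily)

def edge_support(daily_graphs, edge):
    """Count the graph transactions (days) that contain the given edge."""
    return sum(1 for edges in daily_graphs.values() if edge in edges)

# Hypothetical shipments over three days.
transactions = [
    {"pickup_date": "2004-01-05", "origin": "A", "destination": "B"},
    {"pickup_date": "2004-01-05", "origin": "B", "destination": "C"},
    {"pickup_date": "2004-01-06", "origin": "A", "destination": "B"},
    {"pickup_date": "2004-01-07", "origin": "C", "destination": "A"},
]
daily_graphs = partition_by_date(transactions)
support_ab = edge_support(daily_graphs, ("A", "B"))
```

A pattern counts as frequent when its support meets a chosen minimum; here the route A -> B appears on two of the three days.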

  16. Results
     ● Unable to run FSG on the entire data set due to insufficient memory / swap space.
     ● Most were small patterns. (The following is the biggest one.)

  17. Patterns Discovered by Using Conventional Mining Algorithms
     ● Mapped the dataset into a standard “transactional” representation.
     ● Used traditional data mining approaches.
     ● Used Weka for association rule mining, instance (tuple) classification, and cluster analysis on the transportation data.

  18. Evaluations of Conventional Algorithms
     ● Traditional data mining techniques have produced interesting and meaningful results that summarize our data.
     ● Further experimentation is required to explore the potential and limitations of these techniques on temporal transportation network data.
     ● Some insights from the structural characteristics of the data are lost.

  19. Challenges for Data Mining Research
     ● Handling the temporal aspects of graphs (dynamic graphs).
     ● Incorporating the notion of events into a graph.
     ● Expanding graph mining techniques beyond data similar to molecular structures.
     ● Determining what makes a graph pattern interesting.
