SLIDE 1

Visualising Real Time Large Scale Micro-Simulation of Transport Networks

Peter Heywood, Paul Richmond & Steve Maddock

Department of Computer Science, The University of Sheffield

SLIDE 2

Overview

  • Introduction
  • Road Traffic Network Simulation
  • Visualisation & Performance
  • Conclusions

Heywood P., Richmond P. & Maddock S.

SLIDE 3

Introduction

SLIDE 4

Why Simulate Transport Networks?

  • Increasing traffic demand globally
  • UK projected increase between & : [Dep]
  • Up to % increase of car ownership
  • % to % growth in UK road traffic
  • Poor utilisation of existing infrastructure
  • Need for improved road simulation systems [NFN∗, UK ]
  • Used for planning & trialling road network changes
  • Cheaper & less disruptive than real world trials

Rush hour traffic on the M motorway (Mat Fascione - CC BY-SA .)

SLIDE 5

Why Visualise Transport Network Simulations?

  • Decision makers are often not modelling specialists [NFN∗]
  • Visualisation increases accessibility of simulations

  • Improves decision making

An example of traffic microsimulation visualisation (sumo-gui)

SLIDE 6

Our Aims

  • Use the GPU for Agent Based Simulation of Road Network
  • Using FLAME GPU (Flexible Large Scale Agent Modelling Environment for the GPU)
  • Large-scale simulation of a road network
  • Car following behaviour on an artificial road network
  • Demonstrate performance of road network simulation using FLAME GPU
  • Described in forthcoming paper “Road Network Simulation using FLAME GPU” [HRM]
  • Develop custom visualisation for the simulation
  • Enable interactive simulation observation
  • Minimal impact on performance

SLIDE 7

Road Traffic Network Simulation

SLIDE 8

Microsimulation, Agent Based Modelling & the GPU

Microsimulation & Agent Based Modelling (ABM)

  • Bottom up simulations - individual level with local interactions [SYGD]
  • ABM provides a natural method for describing agents and behaviours
  • Allows emergence of more complex behaviour
  • Good for modelling congested transport networks

Why General Purpose computing on Graphics Processing Units (GPGPU)?

  • Increased performance due to massively parallel architecture
  • Microsimulation is well suited for GPGPU computing [SN, WS]
  • However it is not embarrassingly parallel

SLIDE 9

The Simulation

Artificial Road Network

  • Scales consistently unlike real world networks
  • Uniform grid of N junctions & N(N − ) roads

Gipps’ Car Following Model [Gip8]

  • Safety distance car following model
  • Considers driver and vehicle limitations
  • Extensively used [CPM]

FLAME GPU

  • “Template based simulation environment” for agent based simulation on GPU architecture [Ric]
  • Provides a high level interface for describing agents, abstracting the CUDA programming model [Ric]

  • State-based agent representation
  • Message-based communication

Figure panels: grid networks for N = 3, N = 4 and N = 5

www.flamegpu.com github.com/flamegpu

SLIDE 10

Visualisation & Performance

SLIDE 11

General Purpose computing on Graphics Processing Units

  • Massively parallel architecture
  • Perform same operation on many items of data (SIMD)
  • Kernels (GPU functions) execute same code in parallel using many threads

  • Multiple memory spaces
  • Memory access pattern is important for performance
  • Dedicated cards connected over PCI bus
  • Host-Device memory transfers are relatively slow

Nvidia Tesla C (Source - CC .) CC .

SLIDE 12

What is Geometry Instancing?

  • Rendering multiple copies of the same geometry
  • Vertex data is copied but modified to reduce repetition
  • Position
  • Colour
  • Animation state
  • Data needs to be accessible on the GPU
  • OpenGL Buffers
  • Requires fewer API calls [Khr]

SLIDE 14

Interactive Visualisation

  • Cross platform C++, OpenGL & libSDL [SDL]
  • Mouse & Keyboard controls (no-clip)
  • Simulation updated per frame (currently)
  • Geometry loaded from wavefront (.obj) files
  • Flat shading

Overview of visualisation

SLIDE 15

Interactive Visualisation

  • OpenGL Texture Buffers populated with agent data via CUDA OpenGL Interop [Nvi]
  • Geometry Instancing [Khr] used to apply data to models
  • gvec4 texelFetch(gsamplerBuffer sampler, int P); [Khr]
  • Reduced number of API calls [Khr]
  • Minimises host-device memory transfers
  • Fragment shader used to differentiate vehicles & apply lighting model
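The instancing idea on this slide can be illustrated without a GPU. The sketch below is a CPU-side emulation only, not the presenters' shader code: `InstanceData`, `fetchInstance`, `transformVertex` and `drawInstanced` are hypothetical names, and the per-instance record layout is an assumption. `fetchInstance` plays the role that `texelFetch` with the instance index plays in a vertex shader: one shared mesh, one buffer of per-agent data, one lookup per vertex.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical per-instance record, standing in for one texel of agent data.
struct InstanceData { Vec3 position; float colour; };

// Stand-in for texelFetch(sampler, gl_InstanceID): read one record by index.
inline const InstanceData& fetchInstance(const std::vector<InstanceData>& buffer, int instanceId) {
    assert(instanceId >= 0 && static_cast<std::size_t>(instanceId) < buffer.size());
    return buffer[static_cast<std::size_t>(instanceId)];
}

// Stand-in for the vertex stage: translate a shared vertex by its instance position.
inline Vec3 transformVertex(const Vec3& v, const std::vector<InstanceData>& buffer, int instanceId) {
    const InstanceData& inst = fetchInstance(buffer, instanceId);
    return Vec3{v.x + inst.position.x, v.y + inst.position.y, v.z + inst.position.z};
}

// Emulates one instanced draw: every instance reuses the same mesh vertices.
inline std::vector<Vec3> drawInstanced(const std::vector<Vec3>& mesh,
                                       const std::vector<InstanceData>& buffer) {
    std::vector<Vec3> out;
    out.reserve(mesh.size() * buffer.size());
    for (int id = 0; id < static_cast<int>(buffer.size()); ++id) {
        for (const Vec3& v : mesh) {
            out.push_back(transformVertex(v, buffer, id));
        }
    }
    return out;
}
```

In the real pipeline the outer loop is a single instanced draw call and the buffer already lives on the device, which is where the reduction in API calls and host-device transfers comes from.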

Nearby view of visualisation

SLIDE 16

How we use the GPU

  • Road network stored in CUDA Constant Memory
  • Does not change during kernels
  • Maximum size to Kb currently -> CUDA Read-only Memory

  • Geometry Instancing & CUDA interop
  • Avoids unnecessary host-device transfers
  • FLAME GPU
  • One thread per agent
  • State-based representation minimises branching
  • Synchronisation points defined by message dependence

  • Transparently converts between AoS & SoA
  • Minimal transfer of data to host (CPU)
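The AoS and SoA layouts mentioned above can be sketched in plain C++. This is an illustration of the two layouts only, not FLAME GPU's generated code: the struct names, the fields and the helpers `toSoA` and `toAoS` are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array of Structures: one record per agent, convenient for describing behaviour.
struct VehicleAoS { float x; float speed; int roadId; };  // hypothetical fields

// Structure of Arrays: one contiguous array per field, so neighbouring
// threads reading the same field touch neighbouring memory.
struct VehiclesSoA {
    std::vector<float> x;
    std::vector<float> speed;
    std::vector<int> roadId;
};

inline VehiclesSoA toSoA(const std::vector<VehicleAoS>& agents) {
    VehiclesSoA soa;
    soa.x.reserve(agents.size());
    soa.speed.reserve(agents.size());
    soa.roadId.reserve(agents.size());
    for (const VehicleAoS& a : agents) {
        soa.x.push_back(a.x);
        soa.speed.push_back(a.speed);
        soa.roadId.push_back(a.roadId);
    }
    return soa;
}

inline std::vector<VehicleAoS> toAoS(const VehiclesSoA& soa) {
    assert(soa.x.size() == soa.speed.size() && soa.x.size() == soa.roadId.size());
    std::vector<VehicleAoS> agents(soa.x.size());
    for (std::size_t i = 0; i < agents.size(); ++i) {
        agents[i] = VehicleAoS{soa.x[i], soa.speed[i], soa.roadId[i]};
    }
    return agents;
}
```

With one thread per agent, thread i reading `x[i]` in the SoA layout sits next to thread i+1 reading `x[i+1]`, which is the memory access pattern the earlier GPGPU slide calls important.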

Figure: Instanced rendering memory transfers. Diagram labels: Host, Device, PCI Bus, Initial Agent Data, .obj model file, FLAME GPU simulation, Agent Data, Texture Buffers (TBO1, TBO2, ..., n), Model VBOs, Instanced Rendering.

SLIDE 17

Demonstration

SLIDE 18

Performance Impact

  • Instanced visualisation has minimal performance impact
  • N = , length m, vehicles & iterations
  • NVIDIA GeForce GT9

Console: ms | Visualisation: ms | Run-time Increase: .8x

Figure: Performance comparison between FLAME GPU message partitioning schemes. Simulation time (ms) per iteration (10^-1 to 10^4) against Number of Agents (2^8 to 2^18) for Non Partitioned Messaging and Spatially Partitioned messaging (radius = 5000m, 2500m, 250m).

SLIDE 19

What Next?

Procedural Instancing

  • Increase variation of instanced vehicles
  • Procedurally generate data at runtime to modify instances
  • Use simulation data such as vehicle length / type
  • Applicable to many types of agents
  • Vehicles
  • Pedestrians
  • Environment

Other Future Work

  • Analyse and visualise aggregate data using the GPU to increase accessibility
  • Further performance optimisations for large populations

SLIDE 20

Conclusions

SLIDE 21

Conclusions

  • Highlighted difficulties of large scale GPGPU microsimulation of transport networks
  • Expensive host-device memory transfers
  • Number of GPU draw calls
  • Described & demonstrated techniques used to combat these issues
  • CUDA OpenGL Interoperability
  • Geometry Instancing
  • Demonstrated minimal performance impact for an example visualisation

SLIDE 22

Thank You

ptheywood.uk ptheywood@sheffield.ac.uk

SLIDE 23

Additional Slides

SLIDE 24

Gipps’ Car Following Model Equation

$$
v_n(t+\tau) = \min\left\{
  v_n(t) + 2.5\,a_n\tau\left(1 - \frac{v_n(t)}{V_n}\right)\sqrt{0.025 + \frac{v_n(t)}{V_n}},\;
  b_n\tau + \sqrt{b_n^2\tau^2 - b_n\left[2\left[x_{n-1}(t) - s_{n-1} - x_n(t)\right] - v_n(t)\tau - \frac{v_{n-1}(t)^2}{\hat{b}}\right]}
\right\}
$$

  • a_n: the maximum acceleration of vehicle n
  • b_n: the most severe braking that vehicle n will undertake
  • s_n: the effective size of vehicle n, including a margin
  • V_n: the target speed of vehicle n
  • x_n(t): the location of the front of vehicle n at time t
  • v_n(t): the speed of vehicle n at time t
  • τ: constant reaction time for all vehicles
  • b̂: estimate of the leading vehicle's most severe braking

Figure: Free-flow and Braking components of Gipps’ Car Following Model. v_n(t + τ) plotted against t. Series: Free-flow Component (V_n = 15, v_n(0) = 0); Braking Component (V_{n−1} = 10, x_{n−1}(t) = 50).
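A minimal C++ sketch of the speed update from Gipps' model [Gip], following the standard published form with constants 2.5 and 0.025. The struct and function names are hypothetical, braking rates are taken as negative (Gipps' sign convention), and clamping the radicand and the returned speed at zero is an added safeguard, not part of the model.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct GippsParams {
    double a;     // a_n: maximum acceleration (> 0)
    double b;     // b_n: most severe braking, negative by convention
    double V;     // V_n: target speed
    double tau;   // reaction time
    double bHat;  // estimate of the leader's most severe braking (negative)
};

// Free-flow component: accelerate towards the target speed V_n.
inline double gippsFreeFlow(double v, const GippsParams& p) {
    assert(p.V > 0.0 && p.tau > 0.0);
    return v + 2.5 * p.a * p.tau * (1.0 - v / p.V) * std::sqrt(0.025 + v / p.V);
}

// Braking component: the highest speed from which the follower can still stop safely.
inline double gippsBraking(double v, double x, double xLead, double vLead,
                           double sLead, const GippsParams& p) {
    const double gap = xLead - sLead - x;
    const double radicand = p.b * p.b * p.tau * p.tau
        - p.b * (2.0 * gap - v * p.tau - vLead * vLead / p.bHat);
    // Assumption: clamp a negative radicand to zero rather than return NaN.
    return p.b * p.tau + std::sqrt(std::max(radicand, 0.0));
}

// v_n(t + tau): the smaller of the two components, never below zero (assumption).
inline double gippsNextSpeed(double v, double x, double xLead, double vLead,
                             double sLead, const GippsParams& p) {
    const double next = std::min(gippsFreeFlow(v, p), gippsBraking(v, x, xLead, vLead, sLead, p));
    return std::max(0.0, next);
}
```

With a distant leader the free-flow term is the smaller one and the vehicle accelerates towards V_n; as the gap closes the braking term takes over.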

SLIDE 25

Results: Fixed Grid Network

Figure: Simulation time (ms) per iteration (10^-1 to 10^4) against Number of Agents (2^8 to 2^18) for Non Partitioned Messaging and Spatially Partitioned messaging (radius = 5000m, 2500m, 250m).

  • Spatially partitioned messaging outperforms non-partitioned messaging
  • Smaller radii outperform larger radii beyond the partitioning overhead
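Why a smaller radius helps can be seen in a CPU sketch of spatial partitioning: messages are binned into a uniform grid whose cell size equals the interaction radius, so a reader visits only the 3 x 3 block of cells around it. This is a generic histogram, prefix sum and reorder scheme written for illustration, not FLAME GPU's implementation. All names are hypothetical, and the 2D square environment of side `extent` is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Message { float x, y; };

struct PartitionedMessages {
    float radius = 0.0f;          // cell edge length = interaction radius
    int cellsPerSide = 0;         // grid is cellsPerSide x cellsPerSide
    std::vector<int> cellStart;   // cell c owns sorted[cellStart[c] .. cellStart[c + 1])
    std::vector<Message> sorted;  // messages reordered by cell
};

inline int cellCoord(float v, float radius, int cellsPerSide) {
    const int c = static_cast<int>(std::floor(v / radius));
    return std::min(cellsPerSide - 1, std::max(0, c));
}

inline std::size_t cellIndex(const Message& m, const PartitionedMessages& p) {
    const int cx = cellCoord(m.x, p.radius, p.cellsPerSide);
    const int cy = cellCoord(m.y, p.radius, p.cellsPerSide);
    return static_cast<std::size_t>(cy) * p.cellsPerSide + cx;
}

inline PartitionedMessages partitionMessages(const std::vector<Message>& msgs,
                                             float extent, float radius) {
    assert(radius > 0.0f && extent > 0.0f);
    PartitionedMessages p;
    p.radius = radius;
    p.cellsPerSide = std::max(1, static_cast<int>(std::ceil(extent / radius)));
    const std::size_t cells = static_cast<std::size_t>(p.cellsPerSide) * p.cellsPerSide;
    // Histogram: count messages per cell, then prefix-sum into start offsets.
    p.cellStart.assign(cells + 1, 0);
    for (const Message& m : msgs) ++p.cellStart[cellIndex(m, p) + 1];
    for (std::size_t c = 0; c < cells; ++c) p.cellStart[c + 1] += p.cellStart[c];
    // Reorder: scatter each message into its cell's slice.
    std::vector<int> cursor(p.cellStart.begin(), p.cellStart.end() - 1);
    p.sorted.resize(msgs.size());
    for (const Message& m : msgs) p.sorted[cursor[cellIndex(m, p)]++] = m;
    return p;
}

// Counts messages within `radius` of (x, y), reading only the surrounding 3 x 3 cells.
inline int countNeighbours(const PartitionedMessages& p, float x, float y) {
    const int cx = cellCoord(x, p.radius, p.cellsPerSide);
    const int cy = cellCoord(y, p.radius, p.cellsPerSide);
    int found = 0;
    for (int ny = std::max(0, cy - 1); ny <= std::min(p.cellsPerSide - 1, cy + 1); ++ny) {
        for (int nx = std::max(0, cx - 1); nx <= std::min(p.cellsPerSide - 1, cx + 1); ++nx) {
            const std::size_t c = static_cast<std::size_t>(ny) * p.cellsPerSide + nx;
            for (int i = p.cellStart[c]; i < p.cellStart[c + 1]; ++i) {
                const float dx = p.sorted[i].x - x;
                const float dy = p.sorted[i].y - y;
                if (dx * dx + dy * dy <= p.radius * p.radius) ++found;
            }
        }
    }
    return found;
}
```

Non-partitioned messaging corresponds to every reader scanning the whole message list. The histogram and reorder passes are the fixed overhead that partitioning has to pay back.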

SLIDE 26

Results: Fixed Grid Network - Per Agent

Figure: Simulation time (ms) per agent per iteration (10^-5 to 10^-1) against Number of Agents (2^8 to 2^18) for Non Partitioned and Spatially Partitioned messaging (radius = 5000m, 2500m, 250m).

  • Distinct gradient change at agents - hardware utilisation vs larger message lists
  • Maximum message count
  • Non-partitioned
  • Partitioned r =
  • Partitioned r =
  • Partitioned r =

SLIDE 27

Results: Fixed Grid Network - Kernel Profiling

Figure: Average Kernel Execution Times for Partitioned Messaging r = 250. Average Kernel Time (ms) for the kernels inputdata, outputdata, reorder location messages and hist location messages.

  • Kernel times averaged over iterations
  • Agents
  • inputdata kernel is dominant

Figure: Average inputdata Kernel Execution Time. Average Kernel Time (ms) by Message Partitioning Scheme (Non-partitioned, Partitioned r = 5000, r = 2500, r = 250).

Figure: Average outputdata Kernel Execution Time. Average Kernel Time (ms) by Message Partitioning Scheme (Non-partitioned, Partitioned r = 5000, r = 2500, r = 250).

SLIDE 28

Scaled Grid Network

Figure: Average iteration execution time for increasing Grid Size N with a fixed vehicle density of 64 agents per 1000m. Simulation time (ms) per iteration (10^-1 to 10^4) against Number of Agents & Grid Size for Non Partitioned Messaging and Spatially Partitioned Messaging (radius = 500m, 250m).

Number of Agents (Grid Size): 512 (N=2), 3072 (N=4), 7680 (N=6), 14336 (N=8), 23040 (N=10), 33792 (N=12), 46592 (N=14), 61440 (N=16), 78336 (N=18), 97280 (N=20), 118272 (N=22), 141312 (N=24)

  • Spatially partitioned messaging outperforms non-partitioned beyond overhead
  • Up to x performance increase for spatial partitioning over non-partitioned
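The agent counts on the x axis (512 for N=2 up to 141312 for N=24) are all reproduced by 64 × 4N(N − 1). One reading, inferred from the plotted numbers and not stated on the slide, is 4N(N − 1) road segments of 1000 m each at the stated density of 64 agents per 1000 m. The function names below are hypothetical.

```cpp
#include <cassert>

// Road count that reproduces the axis labels: 4N(N - 1).
// Inferred from the plotted agent counts, not taken from the slide text.
constexpr long roadsInGrid(long n) { return 4 * n * (n - 1); }

// Agents for grid size N, assuming `agentsPerRoad` vehicles on each 1000 m road.
inline long agentsInGrid(long n, long agentsPerRoad = 64) {
    assert(n >= 2);
    return agentsPerRoad * roadsInGrid(n);
}
```

For example, N = 2 gives 8 roads and 512 agents, and N = 24 gives 2208 roads and 141312 agents, matching the first and last labels.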

SLIDE 29

Bibliography

SLIDE 30

References I

[CPM] Ciuffo B., Punzo V., Montanino M.:

Thirty years of Gipps’ car-following model. Transportation Research Record: Journal of the Transportation Research Board , (), –.

[Dep] Department for Transport:

Road traffic forecasts 5. https://www.gov.uk/government/uploads/system/uploads/attachment_data/file// road-transport-forecasts--extended-version.pdf, Mar. .

[Gip] Gipps P. G.:

A behavioural car-following model for computer simulation. Transportation Research Part B: Methodological , (), –.

[HRM] Heywood P., Richmond P., Maddock S.:

Road network simulation using FLAME GPU. In Euro-Par : Parallel Processing Workshops, Lecture Notes in Computer Science. . forthcoming.

[Khr] Khronos Group:

OpenGL SDK glDrawArraysInstanced manpage. https://www.opengl.org/sdk/docs/man/html/glDrawArraysInstanced.xhtml.

[Khr] Khronos Group:

GL_ARB_draw_instanced specification. https://www.opengl.org/registry/specs/ARB/draw_instanced.txt, . Last accessed --.

SLIDE 31

References II

[Khr] Khronos Group:

OpenGL SDK - texelFetch. https://www.opengl.org/sdk/docs/man/html/texelFetch.xhtml, . Last accessed --.

[NFN∗] Neffendorf H., Fletcher G., North R., Worsley T., Bradley R.:

Modelling for intelligent mobility. https://ts.catapult.org.uk/documents///Modelling+Intelligent+Mobility,+Feb+/ bcf-da-fca-adf-eedb, Feb. .

[Nvi] Nvidia C.:

CUDA C programming guide. http://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf, Mar. . Last accessed --.

[Ric] Richmond P.:

FLAME GPU technical report and user guide. Technical report CS--, University of Sheffield, Department of Computer Science, .

[Ric] Richmond P.:

Resolving conflicts between multiple competing agents in parallel simulations. In Euro-Par : Parallel Processing Workshops, Lopes L., Žilinskas J., Costan A., Cascella R., Kecskemeti G., Jeannot E., Cannataro M., Ricci L., Benkner S., Petit S., Scarano V., Gracia J., Hunold S., Scott S., Lankes S., Lengauer C., Carretero J., Breitbart J., Alexander M., (Eds.), vol. of Lecture Notes in Computer Science. Springer International Publishing, , pp. –.

[SDL] Simple DirectMedia Layer (libSDL).

https://www.libsdl.org/.

SLIDE 32

References III

[SN] Strippgen D., Nagel K.:

Multi-agent traffic simulation with CUDA. In High Performance Computing & Simulation, 9. HPCS’9. International Conference on (), IEEE, pp. –.

[SYGD] Sommer C., Yao Z., German R., Dressler F.:

On the need for bidirectional coupling of road traffic microsimulation and network simulation. In Proceedings of the st ACM SIGMOBILE workshop on Mobility models (), ACM, pp. –.

[UK ] UK Department for Transport:

Quarterly Road Traffic Estimates: Great Britain Quarter 4 (October - December) 4. https://www.gov.uk/government/uploads/system/uploads/attachment_data/file//road-traffic-estimates-quarter--.pdf, Feb. .

[WS] Wang K., Shen Z.:

A GPU based traffic parallel simulation module of artificial transportation systems. In Service Operations and Logistics, and Informatics (SOLI), IEEE International Conference on (), IEEE, pp. –.