XML processing using GPGPU: research proposal, Jordan Vincent (KDE Data Base Lab.)

SLIDE 1

XML processing using GPGPU

Research proposal Jordan Vincent

University of Tsukuba

February 2, 2011

KDE Data Base Lab.

Jordan Vincent XML processing using GPGPU

SLIDE 2

Jordan Vincent

Academic achievements
  • Engineering degree with emphasis on software development.
  • Research master's degree with emphasis on parallel algorithms.
  from University of Technology of Belfort-Montbeliard (France).

Internship
  • Final project assignment (6 months) at the Kitagawa Data Engineering laboratory (University of Tsukuba, Japan).

SLIDE 3

Outline

1. Research project
  • Background
  • Master project
  • Next challenge

2. Research plan
  • Schedule
  • Scope of research

SLIDE 5

XML/XPath

XML
Semi-structured data format for exchanging data in textual form.

    <a>
      <b arg1="value1">
        <c />
      </b>
      <b arg1="value2">
        <c>text</c>
      </b>
    </a>

Example sizes of XML data sets (Sept 2010):
  • Worldmap: 171 GB
  • English articles: 27 GB

XPath
Core retrieval language for XML documents. XPath is a subset of XQuery.

    /a/b[@arg1="value2"]/c
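
As a small illustration of the query above, the following sketch evaluates it on the example document with Python's standard `xml.etree.ElementTree` module. The function name `select_c` is only illustrative, and ElementTree supports a limited XPath subset in which paths are relative to the root element, so `/a/b[...]/c` is written `./b[...]/c` here.

```python
import xml.etree.ElementTree as ET

# The example document from the slide, with straight quotes around attributes.
DOC = """
<a>
  <b arg1="value1"><c /></b>
  <b arg1="value2"><c>text</c></b>
</a>
"""

def select_c(doc: str) -> list:
    """Return the <c> elements selected by /a/b[@arg1="value2"]/c.

    select_c is a hypothetical helper name, not part of any library.
    """
    root = ET.fromstring(doc)  # root is the <a> element
    # ElementTree evaluates paths relative to the root element,
    # so the leading /a step is implicit.
    return root.findall("./b[@arg1='value2']/c")

matches = select_c(DOC)
print([c.text for c in matches])
```

Only the `<c>` element under the second `<b>` is selected, because the predicate filters on the value of `arg1`.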

SLIDE 6

XML pattern matching

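[Figure: three trees. XML document: A with two B children, each with a C child; attributes arg1 "value1" and arg1 "value2"; one text node "text". XPath query: the pattern A, B, C. Result: the C node with "text", under the B with arg1 "value2".]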

TwigStack
TwigStack [1] is a well-known algorithm for XML pattern matching.
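
Holistic twig join algorithms such as TwigStack [1] are usually described over a positional (region) encoding of the document nodes. The sketch below is not TwigStack: it is a naive nested-loop matcher, shown only to make the idea of structural pattern matching concrete. It handles the structural part of the query (the path a/b/c) and ignores the attribute predicate; the names `region_encode`, `is_parent` and `match_path` are illustrative.

```python
import xml.etree.ElementTree as ET
from itertools import count

def region_encode(root):
    """Label every element with (tag, start, end, level) in document order."""
    counter = count(1)
    labels = []

    def visit(elem, level):
        entry = [elem.tag, next(counter), None, level]
        labels.append(entry)
        for child in elem:
            visit(child, level + 1)
        entry[2] = next(counter)  # end position, assigned after the subtree

    visit(root, 1)
    return [tuple(e) for e in labels]

def is_parent(p, c):
    """p is the parent of c if p's region contains c's, one level deeper."""
    return p[1] < c[1] and c[2] < p[2] and c[3] == p[3] + 1

def match_path(labels, tags):
    """Naive matcher (assumption: not TwigStack) for tags[0]/tags[1]/..."""
    partial = [(n,) for n in labels if n[0] == tags[0]]
    for tag in tags[1:]:
        partial = [m + (n,) for m in partial for n in labels
                   if n[0] == tag and is_parent(m[-1], n)]
    return partial

root = ET.fromstring(
    '<a><b arg1="value1"><c /></b><b arg1="value2"><c>text</c></b></a>'
)
labels = region_encode(root)
matches = match_path(labels, ["a", "b", "c"])
for m in matches:
    print([(tag, start, end) for tag, start, end, _ in m])
```

The nested loops compare every partial match with every node. Algorithms of the TwigStack family exist to avoid this kind of wasted intermediate work.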

SLIDE 7

Manycore processor family and OpenCL

Manycore processor family
Heterogeneous parallel processor architectures:
  • Nvidia (GPU)
  • ATI/AMD (GPU/CPU)
  • Intel (Larrabee project)

CUDA
Nvidia-specific toolkit for general-purpose development on GPU.

OpenCL
"The open standard for parallel programming of heterogeneous systems". It covers some GPUs and CPUs, but also some DSP chips:
  • Sony/IBM/Toshiba Cell
  • Apple iPhone

SLIDE 8

Research works about GPGPU and DB processing

  • Fast computation of database operations using graphics processors [Govindaraju, SIGMOD’04]
  • GPUQP: Query Co-Processing Using Graphics Processors [Fang, SIGMOD’07]
  • Relational Joins on Graphics Processors [He, SIGMOD’08]
  • Data Monster: Why graphics processors will transform database processing? [Di Blas, 2009]
  • Accelerating SQL Database Operations on a GPU with CUDA [Bakkum, GPGPU’10]
  • Exploring utilisation of GPU for database applications [Walkowiak, ICCS’10]
  • Accelerating XML Query Matching through Custom Stack Generation on FPGAs [Moussalli, HiPEAC’10]

→ No research result about XML processing using GPGPU.

SLIDE 9

Master project: TwigStackGPU

Outline
Based on Imam Machdi’s research [2] at the KDE lab on the TwigStack algorithm for parallel query processing on clusters and multicore processors.

Current result
  • Is the approach applicable to Nvidia GPGPU?
  • 6 months of work, with many technical problems encountered.

→ The project works, but its execution time is slow.

SLIDE 10

Illustrated example

[Figure: an XML document tree (root to leaves) is split by the partitioning algorithm into partition 1, partition 2 and partition 3. XML pattern matching runs on each partition. The intermediate solutions are then merged into the query matches.]
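
The partition, match and merge steps of this example can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of [2] nor the TwigStackGPU code: partitions are contiguous groups of the root's child subtrees, CPU threads stand in for the parallel workers, and the query is the /a/b[@arg1="value2"]/c example used earlier. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

def partition(root, k):
    """Split the subtrees under the root into at most k contiguous partitions."""
    children = list(root)
    size = max(1, -(-len(children) // k))  # ceiling division, at least 1
    return [children[i:i + size] for i in range(0, len(children), size)]

def match_partition(subtrees):
    """Match b[@arg1='value2']/c inside one partition (intermediate solutions)."""
    found = []
    for b in subtrees:
        if b.tag == "b" and b.get("arg1") == "value2":
            found.extend(c.text for c in b.findall("c"))
    return found

def parallel_match(doc, k=3):
    """Partition the document, match each partition in a worker, then merge."""
    root = ET.fromstring(doc)
    with ThreadPoolExecutor(max_workers=k) as pool:
        # pool.map returns the results in partition order.
        intermediate = list(pool.map(match_partition, partition(root, k)))
    # Merge step: concatenate the per-partition solutions in document order.
    return [m for part in intermediate for m in part]

# Synthetic document: nine <b> subtrees, every third one has arg1="value2".
DOC = "<a>" + "".join(
    f'<b arg1="value{i % 3}"><c>text{i}</c></b>' for i in range(9)
) + "</a>"

result = parallel_match(DOC)
print(result)
```

Because the query here never spans two subtrees of the root, the merge is a plain concatenation. A pattern that crosses partition boundaries would need a real merge of partial matches.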

SLIDE 11

Next challenge

Immediate future tasks
  • Performance evaluation and profiling.
  • Solve implementation issues for better performance.

Many more problems to be addressed
  • Evaluate architectures other than Nvidia.
  • Enhance the pattern matching algorithm to make use of more GPU capabilities.
  • Explore other problems that share the same representation of XML documents.

SLIDE 13

Estimated schedule

[Figure: timeline from 2011 to 2014 with phases A, B, C and D, covering the Master thesis and the PhD thesis.]

Past
  • First demo in CUDA (nVIDIA)

Research directions
  • Non-uniform parallelism exploration
  • Design a new query processing algorithm that better fits GPGPU
  • Address other fields of the XML processing domain

Tasks
  • TwigStack implementation on GPGPU
  • OpenCL framework for XML processing
  • Algorithm for XML query processing on GPGPU
  • XML-OLAP implementation on GPGPU

Milestones
  • Manycore platform comparison of the TwigStack algorithm using the new OpenCL framework
  • TwigStackGPU presentation and benchmark
  • Efficient XML query processing on GPGPU
  • XML-OLAP on GPU presentation and benchmark

SLIDE 14

"Layer cake"

[Figure: layered stack between hardware and software.]
  • Software: OLAP operation, XML DB, Webserver, ...
  • Layers: OpenCL, Query processing, XML processing, Non-uniform parallelism
  • Hardware: GPU, CPU, ...

SLIDE 15

Conclusion

1. XML query processing is a problem because of the growing amount of content stored in XML documents.
2. The current project shows that XML query processing on GPGPU is possible, but that the well-known algorithm is not efficient there.
3. There are no research results on XML query processing with GPGPU yet, but there are promising results on relational database query processing.
4. An efficient GPU framework could be the basis for other research related to XML processing (e.g., XML-OLAP operation [3] using GPGPU).

SLIDE 16

References I

[1] Nicolas Bruno, Nick Koudas, Divesh Srivastava. Holistic Twig Joins: Optimal XML Pattern Matching. SIGMOD, 2002.
[2] Imam Machdi, Toshiyuki Amagasa, Hiroyuki Kitagawa. Executing parallel TwigStack algorithm on a multi-core system. International Journal of Web Information Systems, 2010.
[3] Chantola Kit, Toshiyuki Amagasa, Hiroyuki Kitagawa. Algorithms for Efficient Structure-based Grouping in XML-OLAP. iiWAS, 2008.

SLIDE 17

backup slide: CPU vs GPU

Thread scheduling
  • GPU: hardware scheduling, massive parallelism of non-divergent threads
  • CPU: software scheduling, limited parallelism of divergent threads

Memory consistency
  • GPU: no hardware consistency, software consistency not recommended (small independent caches, many cores)
  • CPU: hardware-managed, complex memory consistency (big unified caches, few cores)

Computing priority
  • GPU: less global memory
  • CPU: more global memory

SLIDE 18

backup slide: GPU powered webserver

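[Figure: webserver pipeline with GPU and CPU sides. A dynamic XML document request reaches the webserver and is passed on through fastCGI (auth., AES decrypt, ...). XPath queries are run on the XML document. The XML document answer goes back through the webserver (AES encrypt, ...).]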
