H a c k i n g P o s t g r e S Q L PGCon 2013 Ottawa, Canada - - PowerPoint PPT Presentation

h a c k i n g p o s t g r e s q l
SMART_READER_LITE
LIVE PREVIEW

H a c k i n g P o s t g r e S Q L PGCon 2013 Ottawa, Canada - - PowerPoint PPT Presentation

H a c k i n g P o s t g r e S Q L PGCon 2013 Ottawa, Canada Stephen Frost sfrost@snowman.net Resonate, Inc. Digital Media PostgreSQL Hadoop techjobs@resonateinsights.com http://www.resonateinsights.com S t e p h e n F r o


slide-1
SLIDE 1

H a c k i n g P o s t g r e S Q L

PGCon 2013 Ottawa, Canada

Stephen Frost sfrost@snowman.net

Resonate, Inc. • Digital Media • PostgreSQL • Hadoop • techjobs@resonateinsights.com • http://www.resonateinsights.com

slide-2
SLIDE 2

S t e p h e n F r o s t

  • PostgreSQL
  • Major Contributor
  • Implemented Roles in 8.3
  • Column-Level Privileges in 8.4
  • Contributions to PL/pgSQL, PostGIS
  • Resonate, Inc.
  • Principal Database Engineer
  • Online Digital Media Company
  • We're Hiring! * techjobs@resonateinsights.com
slide-3
SLIDE 3

D o y o u r e a d . . .

  • planet.postgresql.org
slide-4
SLIDE 4

P o s t g r e S Q L S o u r c e

  • Overall PG source tree structure

c o n t r i b - c o n t r i b m o d u l e s ( M i g h t b e c o m e e x t e n s i o n s , o n e d a y . . ) d o c - D o c u m e n t a t i o n ( S G M L ) s r c - P o s t g r e S Q L " c o r e " ( C c o d e , m o s t l y ) . . . s r c / b a c k e n d - P o s t g r e S Q L s e r v e r ( " B a c k - E n d " ) s r c / b i n - p s q l , p g _ d u m p , i n i t d b , e t c ( " F r o n t - E n d " ) s r c / c o m m o n - C o d e c o m m o n t o f r o n t & b a c k s r c / i n c l u d e - . h f i l e s , a n d f r i e n d s s r c / i n t e r f a c e s - l i b p q , e c p g s r c / p l - C o r e p r o c e d u r a l l a n g u a g e s ( p l p g s q l , p l p e r l , t c l , e t c ) s r c / p o r t - P l a t f o r m - s p e c i f i c h a c k s s r c / t o o l s - D e v e l o p e r t o o l s ( p g i n d e n t , e t c )

slide-5
SLIDE 5

D o w n t h e R a b b i t H o l e . .

  • Components of the backend (src/backend/...)

a c c e s s - M e t h o d s f o r a c c e s s i n g d i f f e r e n t t y p e s o f d a t a ( h e a p , b t r e e i n d e x e s , g i s t / g i n , e t c ) . c a t a l o g - D e f i n i t i o n o f t h e P G t a b l e s ( p g _ c a t a l o g . * ) c o m m a n d s - U s e r - l e v e l S Q L c o m m a n d s ( A L T E R , C R E A T E T A B L E , V A C U U M , e t c ) e x e c u t o r - D u h , t h e E x e c u t o r - r u n s t h e q u e r i e s a f t e r p l a n n i n g / o p t i m i z a t i o n f o r e i g n - H a n d l e s F o r e i g n D a t a W r a p p e r s , u s e r m a p p i n g s , e t c l i b - " G e n e r a l P u r p o s e " / " M i s c " f u n c t i o n s ( b u t t h e y a r e e l s e w h e r e t o o . . ) l i b p q - B a c k e n d i n t e r f a c e t o t a l k t o l i b p q , a k a t h e w i r e l i n e p r o t o c o l m a i n - m a i n ( ) , d e t e r m i n e s h o w t h e b a c k e n d P G p r o c e s s i s s t a r t i n g a n d h a n d s o f f t o t h e r i g h t s u b s y s t e m n o d e s - G e n e r a l i z e d " N o d e " s t r u c t u r e i n P G a n d f u n c t i o n s t o c o p y , c o m p a r e , e t c

  • p t i m i z e r - Q u e r y o p t i m i z e r , i m p l e m e n t s t h e c o s t i n g s y s t e m a n d g e n e r a t e s a p l a n f o r t h e e x e c u t o r

p a r s e r - L e x e r a n d G r a m m a r , h o w P G u n d e r s t a n d s t h e q u e r i e s y o u s e n d i t p o r t - B a c k e n d - s p e c i f i c p l a t f o r m - s p e c i f i c h a c k s p o s t m a s t e r - T h e " m a i n " P G p r o c e s s t h a t a l w a y s r u n s , a n s w e r s r e q u e s t s , h a n d s o f f c o n n e c t i o n s r e g e x - H e n r y S p e n c e r ' s r e g e x l i b r a r y , a l s o u s e d b y T C L , m a i n t a i n e d m o r e - o r - l e s s b y P G n o w r e p l i c a t i o n - B a c k e n d c o m p o n e n t s t o s u p p o r t r e p l i c a t i o n , s h i p p i n g W A L l o g s , r e a d i n g t h e m i n , e t c r e w r i t e - Q u e r y r e w r i t e e n g i n e , u s e d w i t h R U L E s s n o w b a l l - S n o w b a l l s t e m m i n g , u s e d w i t h f u l l - t e x t s e a r c h s t o r a g e - S t o r a g e l a y e r , h a n d l e s m o s t d i r e c t f i l e i / o , s u p p o r t f o r l a r g e o b j e c t s , e t c t c o p - " T r a f f i c C o p " - t h i s i s w h a t g e t s t h e a c t u a l q u e r i e s , r u n s t h e m , e t c t s e a r c h - F u l l - T e x t S e a r c h e n g i n e u t i l s - V a r i o u s b a c k - e n d u t i l i t y c o m p o n e n t s , c a c h e i n g s y s t e m , m e m o r y m a n a g e r , e t c

slide-6
SLIDE 6

S o y o u h a v e a n i d e a . .

  • Where to begin?
  • Depends on your idea, but I prefer the parser
  • Grammar drives a lot of things
  • Also one of the hardest items to get agreement on
  • The grammar is in src/backend/parser/
  • scan.l - lexer, handles tokenization
  • gram.y - actual grammar
  • Built with flex (lexer) and bison (parser)
  • Rarely have to change the lexer
slide-7
SLIDE 7

M o d i f y i n g t h e g r a m m a r

  • Grammar is a set of productions
  • "main" is the 'stmt' production
  • Lists all the top-level commands
  • Each is its own production then

s t m t : A l t e r E v e n t T r i g S t m t | A l t e r D a t a b a s e S t m t | A l t e r D a t a b a s e S e t S t m t . . . | C o p y S t m t C o p y S t m t : C O P Y o p t _ b i n a r y q u a l i f i e d _ n a m e o p t _ c o l u m n _ l i s t o p t _ o i d s c o p y _ f r o m o p t _ p r o g r a m c o p y _ f i l e _ n a m e c o p y _ d e l i m i t e r o p t _ w i t h c o p y _ o p t i o n s { C o p y S t m t * n = m a k e N o d e ( C o p y S t m t ) ; n - > r e l a t i o n = $ 3 ;

slide-8
SLIDE 8

M o d i f y i n g C o p y S t m t

  • Add it into the COPY production
  • Modify the C template code as needed
  • C code is extracted by bison
  • Run through a set of changes (eg: changes "$3")
  • Compiled as part of the overall parser (gram.c)
  • Remember to update the keywords list (kwlist.h)
  • Also remember to add to unreserved_keywords
  • Try to avoid creating new reserved keywords
slide-9
SLIDE 9

A d d i n g a n o p t i o n t o C O P Y

  • - - a / s r c / b a c k e n d / p a r s e r / g r a m . y

+ + + b / s r c / b a c k e n d / p a r s e r / g r a m . y @ @ - 5 2 1 , 8 + 5 2 1 , 8 @ @ s t a t i c v o i d p r o c e s s C A S b i t s ( i n t c a s _ b i t s , i n t l o c a t i o n , c o n s t c h a r * c o n s t r T y p e ,

  • C O M M I T T E D C O N C U R R E N T L Y C O N F I G U R A T I O N C O N N E C T I O N C O N S T R A I N T C O N S T R A I N T S
  • C O N T E N T _ P C O N T I N U E _ P C O N V E R S I O N _ P C O P Y C O S T C R E A T E

+ C O M M I T T E D C O M P R E S S E D C O N C U R R E N T L Y C O N F I G U R A T I O N C O N N E C T I O N C O N S T R A I N T + C O N S T R A I N T S C O N T E N T _ P C O N T I N U E _ P C O N V E R S I O N _ P C O P Y C O S T C R E A T E @ @ - 2 4 0 3 , 6 + 2 4 0 3 , 1 0 @ @ c o p y _ o p t _ i t e m : { $ $ = m a k e D e f E l e m ( " h e a d e r " , ( N o d e * ) m a k e I n t e g e r ( T R U E ) ) ; } + | C O M P R E S S E D + { + $ $ = m a k e D e f E l e m ( " c o m p r e s s e d " , ( N o d e * ) m a k e I n t e g e r ( T R U E ) ) ; + } | Q U O T E o p t _ a s S c o n s t { $ $ = m a k e D e f E l e m ( " q u o t e " , ( N o d e * ) m a k e S t r i n g ( $ 3 ) ) ; @ @ - 1 2 4 7 1 , 6 + 1 2 4 7 5 , 7 @ @ u n r e s e r v e d _ k e y w o r d : | C O M M I T T E D + | C O M P R E S S E D | C O N F I G U R A T I O N

slide-10
SLIDE 10

W h a t a b o u t t h e c o d e ?

  • COPY has a function to process options
  • Surprise, it's called "ProcessCopyOptions"
  • COPY is defined in backend/commands/copy.c
  • COPY state info
  • Local state structure CopyStateData also in copy.c
  • Not in a .h because only COPY needs it
  • Define structures in .c files near the top
slide-11
SLIDE 11

O p t i o n h a n d l i n g i n c o p y . c

@ @ - 1 0 9 , 6 + 1 1 9 , 7 @ @ t y p e d e f s t r u c t C o p y S t a t e D a t a b o o l b i n a r y ; / * b i n a r y f o r m a t ? * / + b o o l c o m p r e s s e d ; / * c o m p r e s s e d f i l e ? * / b o o l o i d s ; / * i n c l u d e O I D s ? * / @ @ - 8 8 9 , 6 + 1 1 8 6 , 2 0 @ @ P r o c e s s C o p y O p t i o n s ( C o p y S t a t e c s t a t e , } + e l s e i f ( s t r c m p ( d e f e l - > d e f n a m e , " c o m p r e s s e d " ) = = 0 ) + { + # i f d e f H A V E _ L I B Z + i f ( c s t a t e - > c o m p r e s s e d ) + e r e p o r t ( E R R O R , + ( e r r c o d e ( E R R C O D E _ S Y N T A X _ E R R O R ) , + e r r m s g ( " c o n f l i c t i n g o r r e d u n d a n t o p t i o n s " ) ) ) ; + c s t a t e - > c o m p r e s s e d = d e f G e t B o o l e a n ( d e f e l ) ; + # e l s e + e r e p o r t ( E R R O R , + ( e r r c o d e ( E R R C O D E _ S Y N T A X _ E R R O R ) , + e r r m s g ( " N o t c o m p i l e d w i t h z l i b s u p p o r t . " ) ) ) ; + # e n d i f + } e l s e i f ( s t r c m p ( d e f e l - > d e f n a m e , " o i d s " ) = = 0 )

slide-12
SLIDE 12

T h a t ' s i t , r i g h t ?

  • Not hardly.
  • Lots of changes to copy.c
  • New 'COMPRESSED' state
  • Tracking gzFile instead of FILE*
  • Using gzread / gzwrite instead of read/write
  • Data in and out
  • All is buffered with 2 buffers
  • Uncompressed data
  • Compressed data
slide-13
SLIDE 13

D i f f s t a t

d o c / s r c / s g m l / r e f / c o p y . s g m l | 1 2 + + s r c / b a c k e n d / c o m m a n d s / c o p y . c | 4 5 8 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + - - - - - s r c / b a c k e n d / p a r s e r / g r a m . y | 9 + - s r c / b a c k e n d / s t o r a g e / f i l e / f d . c | 9 7 + + + + + + + + + + + + s r c / i n c l u d e / p a r s e r / k w l i s t . h | 1 + s r c / i n c l u d e / s t o r a g e / f d . h | 9 + + s r c / t e s t / r e g r e s s / i n p u t / c o p y . s o u r c e | 2 0 + + + s r c / t e s t / r e g r e s s / o u t p u t / c o p y . s o u r c e | 1 8 + + + 8 f i l e s c h a n g e d , 5 8 3 i n s e r t i o n s ( + ) , 4 1 d e l e t i o n s ( - )

  • Documentation updates in doc/src
  • Modify fd.c to support compressed files
  • fd.c provides file descriptor cacheing
  • Added: AllocateFileGz, FreeFileGz
  • Regression test updates
slide-14
SLIDE 14

C O P Y P I P E

  • Follow the mailing lists
  • Watch for others working on similar capabilities
  • Try to think about general answers, not specific
  • Be supportive of other ideas and approaches
  • Send and receive COPY data from program instead
  • E.g. for gzipped files

p o s t g r e s = # C O P Y t F R O M P R O G R A M ' z c a t / t m p / t . c s v . g z '

slide-15
SLIDE 15

H a c k i n g t h e P G w a y

  • PG has specific ways to do
  • Memory management
  • Error logging / cleanup
  • Linked lists
  • Catalog lookups
  • Nodes
  • Datums
  • Code Style
  • How to submit your patch
slide-16
SLIDE 16

M e m o r y H a n d l i n g

  • All memory is part of a memory context
  • Allocated through palloc()
  • Contexts exist for most of what you would expect
  • CurrentMemoryContext - what pg_malloc() will use
  • TopMemoryContext - Backend Lifetime
  • Per-Query Context
  • Per-Tuple Context
slide-17
SLIDE 17

L o g g i n g f r o m P G

  • Use ereport() with errcode() and errmsg()
  • error level and errmsg() are required
  • PG has a style guide for error messages
  • ERROR or higher and PG will handle most cleanup
  • Rolls back transaction
  • Frees appropriate memory contexts

+ i f ( g z w r i t e ( c s t a t e - > c o p y _ g z f i l e , f e _ m s g b u f - > d a t a , + f e _ m s g b u f - > l e n ) ! = f e _ m s g b u f - > l e n ) + e r e p o r t ( E R R O R , + ( e r r c o d e _ f o r _ f i l e _ a c c e s s ( ) , + e r r m s g ( " c o u l d n o t w r i t e t o C O P Y f i l e : % m " ) ) ) ;

slide-18
SLIDE 18

C a t a l o g L o o k u p s

  • SysCache
  • General function 'SearchSysCache'
  • Defined in utils/cache/syscache.c
  • Also some convenience routines in lsyscache.c
  • Scanning catalog tables and Snapshots
  • Beware of SnapshotNow semantics
  • Viewing exactly what is in the heap
  • Heap can change while scanning it
slide-19
SLIDE 19

N o d e s

  • PG has a node structure for expression trees
  • Each node has a 'type' plus appropriate data
  • 'type' is stored in the node, allows IsA() testing
  • Backend memory only, never out on disk, etc
  • Create nodes using makeNode(TYPE)
  • Adding node type
  • Node types defined in include/nodes/nodes.h
  • make / copy / equality funcs in backend/nodes/
slide-20
SLIDE 20

D a t u m s

  • General data type structure
  • Defined in postgres.h
  • Helper macros also in postgres.h
  • Example helpers, theres a bunch of them
  • Int32GetDatum(int) - Returns Datum of int
  • DatumGetInt32(Datum) - Returns int from Datum
slide-21
SLIDE 21

T u p l e s

  • Heap Tuple defined in include/access/htup.h
  • HeapTupleData is in-memory construct
  • Provides length of tuple, pointer to header
  • Used in multiple ways
  • Pointer to disk buffer (must be pin'd)
  • Empty
  • Single pmalloc'd chunk
  • Seperately allocated
  • Minimal Tuple structure
slide-22
SLIDE 22

T u p l e s ( m o r e )

  • HeapTupleHeaderData and friends in htup_details.h
  • Number of attributes
  • Provides various flags (NULL bitmap, etc)
  • Data follows the header (not in the struct)
  • Lots of macros for working with tuples in details
slide-23
SLIDE 23

T o a s t

  • Large values can be compressed
  • May also get "TOASTed" and moved to "toast" table
  • Handled as a stored-out-of-line Datum
  • Need to be careful with variable length Datums
  • Typically try to avoid de-TOASTing Datums until

absolutely required to

slide-24
SLIDE 24

O t h e r s u b s y s t e m s

  • Many things have already been done
  • Eg: linked list implementation (llist.h)
  • Generalized code should go in common area
  • Look at existing code
  • Real examples help immensely
  • Chances are, you will find what you need
  • Portability considerations
slide-25
SLIDE 25

C o d e S t y l e

  • Try to make your code 'fit in'
  • Follow the PG style guide in the FAQ
  • Beware of copy/paste
  • Comments
  • C-style comments only, no C++
  • Generally on their own lines
  • Describe why, not what or how
  • Big comment blocks for large code blocks
  • Functions, big conditions or loops
slide-26
SLIDE 26

S u b m i t t i n g P a t c h e s

  • Patch format
  • Context diff or git-diff
  • Ideally, pick which is better
  • Include in email to -hackers
  • Description of the patch
  • Regression tests
  • Documentation updates
  • pg_dump support
  • Register on commitfest.postgresql.org
slide-27
SLIDE 27

T h a n k y o u !

Stephen Frost sfrost@snowman.net @net_snow