
An unsophisticated cooperative approach to prefetching linked data structures

Alexander Galazin, Murad Neiman-zade. JSC MCST, Moscow. EPIC-8, April 24, 2010.


  1. An unsophisticated cooperative approach to prefetching linked data structures. Alexander Galazin, Murad Neiman-zade, JSC “MCST”, Moscow. EPIC-8, April 24, 2010

  2. Motivation
     • Pointer-based applications lose significant performance because their memory access patterns are irregular
     • There is no information on how the addresses of linked data structures evolve in major applications
     • Existing approaches propose sophisticated cooperative techniques that require major modifications to the CPU

  3. Background

     App         Procedure       %T app   Data Misses
     181.mcf     flow_cost       53.7%    94.2%
     181.mcf     update_tree     15.8%    95.1%
     197.parser  xfree            7.0%    43.6%
     197.parser  table_pointer    3.6%    59.4%
     254.gap     CollectGarb      9.4%    82.4%
     300.twolf   new_dbox_a      17.3%    71.0%

  4. Studying LDS Traversal
     • Discover LDS traversal
     • Collect Δ, the difference between two addresses addr (indexed by k and i), where addr is the address with which the LDS traversal operates, i is the loop iteration and k = {1..16}
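
The collection step can be sketched in C. This is only an illustration under a stated assumption: Δ for a distance k is taken to be the address visited on the current iteration minus the address visited k iterations earlier. The slide's exact indexing did not survive, so this definition, the `struct node` layout and the names `addr_trace`, `trace_push` and `trace_delta` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_K 16                  /* the slide uses k = 1..16 */
#define TRACE_LEN (MAX_K + 1)     /* current address plus MAX_K older ones */

/* Hypothetical list node; the slides only show `a = a->next`. */
struct node { struct node *next; };

/* Ring buffer holding the most recently visited node addresses. */
struct addr_trace {
    uintptr_t addr[TRACE_LEN];
    size_t    count;              /* total number of addresses recorded */
};

/* Record the address the traversal operates on in this iteration. */
static void trace_push(struct addr_trace *t, const void *p)
{
    t->addr[t->count % TRACE_LEN] = (uintptr_t)p;
    t->count++;
}

/* Assumed definition: delta = addr(i) - addr(i - k).
 * Returns 1 and stores the delta if k older iterations exist, else 0. */
static int trace_delta(const struct addr_trace *t, size_t k, ptrdiff_t *delta)
{
    assert(k >= 1 && k <= MAX_K);
    if (t->count <= k)
        return 0;
    uintptr_t cur = t->addr[(t->count - 1) % TRACE_LEN];
    uintptr_t old = t->addr[(t->count - 1 - k) % TRACE_LEN];
    /* Unsigned subtraction wraps; the cast recovers negative deltas
     * on the usual two's-complement targets. */
    *delta = (ptrdiff_t)(cur - old);
    return 1;
}
```

Calling `trace_push` once per loop iteration and `trace_delta` for each k of interest yields the stream of Δ values whose distribution the next slide summarizes.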

  5. LDS Traversal Behavior
     • 181.mcf
       – flow_cost: 2 addresses in LDS and only 1 Δ if k is fixed
       – update_tree: 3 Δ in 97%
     • 197.parser
       – xfree: 1 Δ in 90%
       – table_pointer: 3 Δ in 49%
     • 254.gap
       – CollectGarb: 2 Δ in 96%
     • 300.twolf
       – new_dbox_a: 3 Δ in 98%

  6. Our method
     • Architectural support
       – New instruction IsOperandsNotReady
     • Compiler support
       – Discover LDS traversal
       – Inject prefetching code
       – Create compensating nodes

  7. Architectural support
     • IsOperandsNotReady(TI)
       – returns TRUE if any of the operands of TI are not ready
       – otherwise FALSE
       – is always scheduled together with TI in the same wide instruction and requires 1 logical unit.

     C-code:
       while(a)
       {
         a=a->next;
       }

     ASM-code:
       {
         cmpesb,1 %r0, 0, %pred1
         pass %ionr1, %pred5
       }

  8. Compiler support. Preparation
     • for each LD we create a global array for keeping the 3 most popular Δ and their frequencies;
     • we keep a history of addresses for the load for D iterations;
     • in the preloop we load all elements of the array to registers;
     • in the postloop we save the values of the 3 top Δ and their frequencies in the array;

     Original loop:  LD r1 → r1
     Preloop:        LD arr[i] → d_i
                     LD arr[i+1] → f_i
     Loop:           HISTORY(r1)
                     LD r1 → r1
     Postloop:       ST arr[i] ← d_i
                     ST arr[i+1] ← f_i
     HISTORY(r1):    MOV r1_i → r_i
                     …
                     MOV r1_(i+k) → r_(i+k)
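
A minimal C sketch of the preparation step follows. It assumes a history depth of D = 4 (the slides do not give D) and models the registers as local variables. The names `delta_table`, `g_table_for_load`, `history_shift` and `traverse` are hypothetical and are not taken from the authors' compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOP_DELTAS 3   /* slide: the 3 most popular deltas are kept */
#define HISTORY_D  4   /* assumption: the slides do not state D */

/* Hypothetical list node; the slides only show `a = a->next`. */
struct node { struct node *next; };

/* Deltas d[] and their frequencies f[] for one load. */
struct delta_table {
    ptrdiff_t d[TOP_DELTAS];
    unsigned  f[TOP_DELTAS];
};

/* One global table per load; it survives between loop executions. */
static struct delta_table g_table_for_load;

/* Addresses of the load for the last HISTORY_D iterations,
 * a[0] newest, a[HISTORY_D - 1] oldest. */
struct history { uintptr_t a[HISTORY_D]; };

/* Models the MOV chain of HISTORY(r1): shift every entry one slot
 * towards "older" and store the current address in front. */
static void history_shift(struct history *h, uintptr_t addr)
{
    assert(h != NULL);
    memmove(&h->a[1], &h->a[0], (HISTORY_D - 1) * sizeof h->a[0]);
    h->a[0] = addr;
}

/* Traversal with preloop and postloop around the original loop. */
static size_t traverse(struct node *a, struct history *h)
{
    struct delta_table t = g_table_for_load;  /* preloop: array -> "registers" */
    size_t visited = 0;
    while (a) {
        history_shift(h, (uintptr_t)a);       /* HISTORY(r1) */
        a = a->next;                          /* LD r1 -> r1 */
        visited++;
    }
    g_table_for_load = t;                     /* postloop: "registers" -> array */
    return visited;
}
```

In this sketch the table passes through the loop unchanged; updating it is the job of the compensating node described on a later slide.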

  9. Compiler support. Prefetching
     • in the loop head we create prefetches for (A+Δ), where A is the address of the LD on the current iteration;
     • after the USE of the LD result we add IsOperandsNotReady and a branch which transfers control to a compensating node;

     Original loop:  LD r1 → r1; USE(r1)
     Preloop:        LD arr[i] → d_i
                     LD arr[i+1] → f_i
     Loop:           HISTORY(r1)
                     LD r1 → r1
                     PREFETCH(r1+d_1)
                     PREFETCH(r1+d_2)
                     PREFETCH(r1+d_3)
                     IsONR(USE) → P
                     BRANCH cn P
     Postloop:       ST arr[i] ← d_i
                     ST arr[i+1] ← f_i
     HISTORY(r1):    MOV r1_i → r_i
                     …
                     MOV r1_(i+k) → r_(i+k)
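
The injected prefetches can be approximated in portable-ish C with the GCC/Clang builtin `__builtin_prefetch`. This is a sketch, not the authors' generated code: the `payload` field, the function name `sum_with_prefetch` and the convention that a zero delta means "unused slot" are assumptions. The IsOperandsNotReady check and the branch to the compensating node are omitted because they need the new hardware instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOP_DELTAS 3   /* slide: prefetches for the 3 most popular deltas */

/* Hypothetical node layout; `payload` stands in for whatever USE reads. */
struct node { struct node *next; int payload; };

/* Walk the list and, in the loop head, prefetch A + delta for each
 * retained delta, where A is the address used on this iteration. */
static long sum_with_prefetch(const struct node *a,
                              const ptrdiff_t d[TOP_DELTAS])
{
    assert(d != NULL);
    long sum = 0;
    while (a) {
        for (int j = 0; j < TOP_DELTAS; j++) {
            if (d[j] != 0) {  /* assumption: 0 marks an unused slot */
                /* Integer arithmetic keeps the address expression valid
                 * even when A + delta points outside any object. */
                uintptr_t target = (uintptr_t)a + (uintptr_t)d[j];
                __builtin_prefetch((const void *)target);
            }
        }
        sum += a->payload;    /* USE of the loaded data */
        a = a->next;          /* LD r1 -> r1 */
    }
    return sum;
}
```

A prefetch is only a hint, so a wrong delta costs memory bandwidth but does not change the result; that is why the loop can issue all three candidates unconditionally.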

  10. Compiler support. Calculating Δ
     • in the compensating node we calculate S, the difference between the current load address and its oldest retained address;
     • then we search for whether there is such a Δ and, if there is, we increment the value of the register which keeps its frequency;
     • if there is no such Δ we initialize a new register with S and set a frequency register to one;
     • if the frequency of S becomes greater than that of the previous register we swap them, thus doing a “lazy bubble sort”;

     Original loop:      LD r1 → r1
     Preloop:            LD arr[i] → d_i
                         LD arr[i+1] → f_i
     Loop:               HISTORY(r1)
                         LD r1 → r1
                         PREFETCH(r1+d_1)
                         PREFETCH(r1+d_2)
                         PREFETCH(r1+d_3)
                         IsONR(LD) → P
                         BRANCH cn P
     Postloop:           ST arr[i] ← d_i
                         ST arr[i+1] ← f_i
     HISTORY(r1):        MOV r1_i → r_i
                         …
                         MOV r1_(i+k) → r_(i+k)
     Compensating node:  SUB r1, r_i → v_i
                         SEARCH(v_i) in d_i
                         INCR(f_i)
                         SWAP(d_i, d_(i-1))
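
The table update performed in the compensating node can be sketched as follows. The names `delta_table` and `record_delta` are hypothetical, a frequency of zero is assumed to mark a free slot, and the behavior when all three slots are taken is an assumption (the last slot is overwritten), since the slides do not say what happens in that case.

```c
#include <assert.h>
#include <stddef.h>

#define TOP_DELTAS 3   /* slide: 3 deltas and their frequencies are kept */

struct delta_table {
    ptrdiff_t d[TOP_DELTAS];   /* candidate deltas, most frequent first */
    unsigned  f[TOP_DELTAS];   /* f[j] == 0 marks a free slot (assumption) */
};

/* Record one observed difference s (current address minus the oldest
 * retained address) and keep the table roughly sorted by frequency. */
static void record_delta(struct delta_table *t, ptrdiff_t s)
{
    assert(t != NULL);
    int j;
    for (j = 0; j < TOP_DELTAS; j++) {
        if (t->f[j] == 0) {            /* not found: start a new entry */
            t->d[j] = s;
            t->f[j] = 1;
            break;
        }
        if (t->d[j] == s) {            /* found: bump its frequency */
            t->f[j]++;
            break;
        }
    }
    if (j == TOP_DELTAS) {
        /* Assumption: table full and s unknown, so replace the last
         * (least frequent) entry. The slides leave this case open. */
        j = TOP_DELTAS - 1;
        t->d[j] = s;
        t->f[j] = 1;
    }
    /* "Lazy bubble sort": at most one swap with the previous slot. */
    if (j > 0 && t->f[j] > t->f[j - 1]) {
        ptrdiff_t td = t->d[j];
        unsigned  tf = t->f[j];
        t->d[j] = t->d[j - 1];
        t->f[j] = t->f[j - 1];
        t->d[j - 1] = td;
        t->f[j - 1] = tf;
    }
}
```

Because each call moves an entry up by at most one position, the update stays cheap, and a delta that keeps recurring still drifts to the front over repeated visits to the compensating node.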

  11. Experimental results
     • The method was evaluated on a computer with the Elbrus microprocessor.
     • The microprocessor has an EPIC architecture, a 4-way associative L2 cache of 256 KB, and 4 load/store units.
     • 181.mcf reduced by 15%
     • 254.gap reduced by 4%
     • The method is still in the phase of active development.
