Leveraging Customer Behavioral Data to Drive Revenue the GPU way


  1. Leveraging Customer Behavioral Data to Drive Revenue the GPU way @arnon86 S7456

  2. Hi! Arnon Shimoni, Senior Solutions Architect. I like hardware & parallel / concurrent stuff. In my 4th year at SQream Technologies. Send gifs to @arnon86 or arnon@sqream.com

  3. tl;dr
     • GPUs are good number crunchers, which makes them good for data processing
     • SQream DB with GPUs is fast
     • Rethink current solutions, the GPU can help
     • Simple hardware is good enough, so let’s avoid throwing lots of hardware at issues. No need to shovel money at the problem!

  4. SQream DB – an SQL database powered by GPUs
     • Powered by GPUs: massively parallel engine; relies on GPUs for power, not RAM
     • Fast: columnar storage; always-on compression; 2 TB / hour / GPU ingest speed
     • Scalable: 10 TB to 1 PB with ease
     • SQL database: familiar ANSI SQL; standard connectors (ODBC, JDBC)
     • Extensible for AI: Python, Jupyter, etc.; data science

  5. This story starts at MWC last year. That’s my ear!

  6. SQream knows telecoms. We’ve helped operators with:
     • Better analysis of network events
     • Speeding up CDR preparations
     • More history with security management (SIEM)
     • And now: customer behaviour

  7. There is a lot of data about customers in telecoms
     • Where and when they wake up and where they spend their days (daily grinders)
     • When and where they were Instagramming (when and where data was used)
     • How frustrated they got (what the network experience was in each location)
     • What modes of transport they use
     • How close they are to competitor locations
     But are they actually using this data? Are they getting anything actionable? Are they looking at the entire customer base, and not just a single customer?

  8. “You know, Telefonica has this multi-million dollar product based on Hadoop for selling this customer behaviour data to 3rd party companies. Have you thought about maybe getting the same solution for your company, but much simpler?”

  9. “Oh, and we’ll do it for you with a single machine”

  10. Why their current setup wasn’t good enough for this
      • Data scientists and BI professionals have only short windows of time to run queries, because of overloaded systems
      • Windows cut even shorter due to long overnight loading
      • Queries take hours, and iterations become painful
      Long queries → Coffee breaks → Bathroom breaks → Unhappy managers → Unhappy everyone

  11. Databases that displease data scientists
      • When data scientists or BI professionals want to ask questions that no one has asked before, these systems tend to ‘break’ and not deliver what’s expected
      • They’re just not designed for ad-hoc querying
      • Legacy databases require indexing and a lot of manual tuning
      • Newer databases like Vertica also require creating projections, which is time-consuming and inflexible
      • Distributed databases don’t perform well when JOIN operations are necessary
      • In-memory databases are very painful on the wallet if you need more than a couple of terabytes

  12. Picking the wrong databases will cause pain! Just some of what we saw:
      • Cloudera – for the BI team
      • Teradata – for the marketing team
      • Oracle Exadata (transactional) – for CDR collection and customer records
      • Vertica, Netezza – for financial
      • Lots of Greenplum – to collect from many sources, for marketing and BI

  13. Chanel says racks are fashionable. Our customers think otherwise.

  14. SQream DB software in a standard 2U server. Configured with 96 GB RAM and a single Tesla K80 for a $4,000 total investment. Designed to handle ~40 TB of telecom data.

  15. Sample dashboards generated. Dashboard showing 3G/4G data throughput throughout the day (Morning, Lunch, Evening, Night, …). Larger circles represent more data throughput. Colour becomes darker as the day progresses. Dark-outline circles mean more night-time traffic. Dashboard aggregates directly off SQream DB, with no intermediate steps. Represents a 3-table join (3.3B rows ⋈ 40M rows ⋈ 300K rows).

  16. Sample dashboards generated. Dashboard showing 3G/4G data throughput throughout the day (Morning, Lunch, Evening, Night, …). Larger circles represent more data throughput. Colour becomes darker as the day progresses. Dark-outline circles mean more night-time traffic. Dashboard aggregates directly off SQream DB, with no intermediate steps. Represents a 3-table join (3.3B rows ⋈ 40M rows ⋈ 300K rows).
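The dashboard slides describe a three-table join aggregated by time of day. The sketch below shows what such a query could look like, run against an in-memory SQLite database so it is self-contained. It is an illustration only: the table names, column names, day-part boundaries, and sample rows are all assumptions, since the slides give only the join shape and row counts, not the schema or the actual SQL used with SQream DB.

```python
import sqlite3

# Hypothetical schema: the slides only state a 3-table join
# (3.3B rows ⋈ 40M rows ⋈ 300K rows). All names and rows below are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data_events (subscriber_id INTEGER, cell_id INTEGER,
                          hour INTEGER, bytes INTEGER);
CREATE TABLE subscribers (subscriber_id INTEGER PRIMARY KEY, network TEXT);
CREATE TABLE cells       (cell_id INTEGER PRIMARY KEY, lat REAL, lon REAL);
""")
conn.executemany("INSERT INTO subscribers VALUES (?, ?)",
                 [(1, "3G"), (2, "4G")])
conn.executemany("INSERT INTO cells VALUES (?, ?, ?)",
                 [(10, 32.08, 34.78), (11, 32.10, 34.80)])
conn.executemany("INSERT INTO data_events VALUES (?, ?, ?, ?)",
                 [(1, 10, 8, 500), (1, 10, 13, 200),
                  (2, 10, 8, 700), (2, 11, 23, 900)])

# Join events to subscribers and cells, bucket each event into a part of
# the day (assumed boundaries), and sum the bytes per cell and day part.
QUERY = """
SELECT c.cell_id, c.lat, c.lon,
       CASE WHEN e.hour BETWEEN 6  AND 11 THEN 'Morning'
            WHEN e.hour BETWEEN 12 AND 14 THEN 'Lunch'
            WHEN e.hour BETWEEN 15 AND 21 THEN 'Evening'
            ELSE 'Night' END AS day_part,
       SUM(e.bytes) AS total_bytes
FROM data_events e
JOIN subscribers s ON s.subscriber_id = e.subscriber_id
JOIN cells c       ON c.cell_id = e.cell_id
WHERE s.network IN ('3G', '4G')
GROUP BY c.cell_id, c.lat, c.lon, day_part
ORDER BY c.cell_id, day_part
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Each result row carries a location, a day part, and a byte total, which maps onto the circle position, colour, and size described on the slides.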

  17. Saving hours on reporting with SQream DB. Augmenting legacy MPP with a faster, easier-to-use GPU-powered analytics database. [Diagram: data sources (CDR 4G, CDR 3G, non-CDR) → ETL → aggregations → dozens of reports. 80-node process: 5 hours. With SQream DB, using direct loading at a 2 TB/h ingest rate: 20 minutes, 15x faster.]

  18. The cost of performance: 80 nodes vs SQream DB v1.9.6
      • Hardware: 80 nodes – 5 full racks (960 CPU cores, 5.12 TB RAM) vs HP DL380g9 with NVIDIA Tesla K80 (96 GB RAM + 6 TB storage)
      • ETL time: 300 m vs 20 m (15x faster)
      • Reporting time: 120 m vs 10 m (12x faster)
      • TCO w/license: $10,000,000 vs $200,000 (50x more cost effective)

  19. That wasn’t an anomaly. We’ve done it against Netezza, Teradata, Oracle, Vertica, and even Hadoop-based systems. Netezza vs SQream DB v1.9.7:
      • Hardware: 8 full 42U racks, 56 S-Blades (7 TB RAM) vs Dell C4130 with 4x NVIDIA Tesla K80 (512 GB RAM + iSCSI JBOD, 20 TB)
      • Average query time (seconds): 33.70 vs 31.70
      • Processing units (S-Blades / GPUs): 56 vs 4
      • Compression ratio: 4.0 vs 4.7
      • Cost of ownership: $12,000,000 vs $500,000
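The ratios quoted on the two comparison slides can be re-derived from the raw figures. The short sketch below does that arithmetic. The input numbers are taken from the slides; the ratios for the Netezza comparison are not stated in the deck and are computed here only as an illustration.

```python
def ratio(baseline: float, improved: float) -> float:
    """How many times larger the baseline figure is than the improved one."""
    return baseline / improved

# (baseline, SQream DB) pairs as quoted on slide 18.
slide_18 = {
    "ETL time (min)": (300, 20),
    "Reporting time (min)": (120, 10),
    "TCO with license (USD)": (10_000_000, 200_000),
}

# (Netezza, SQream DB) pairs as quoted on slide 19.
slide_19 = {
    "Average query time (s)": (33.70, 31.70),
    "Processing units": (56, 4),
    "Cost of ownership (USD)": (12_000_000, 500_000),
}

results_18 = {name: ratio(*pair) for name, pair in slide_18.items()}
results_19 = {name: ratio(*pair) for name, pair in slide_19.items()}

for label, results in (("Slide 18", results_18), ("Slide 19", results_19)):
    for name, value in results.items():
        print(f"{label} | {name}: {value:.2f}x")
```

The slide 18 results match the 15x, 12x, and 50x figures on the slide. For slide 19 the query times are roughly equal, so the difference lies in the hardware count and the cost of ownership.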

  20. Find out more about SQream’s high-performance GPU-driven database software: www.sqream.com or arnon@sqream.com
