Build and operate a CEPH Infrastructure - University of Pisa case study
Simone Spinelli simone.spinelli@unipi.it
17 TF-Storage meeting - Pisa, 13-14 October 2015
Agenda
- CEPH@unipi: an overview
- Infrastructure bricks:
  – Network
  – OSD nodes
  – Monitor nodes
  – Racks
  – MGMT tools
- Performance
- Our experience
- Conclusions
University of Pisa
- Large Italian university:
  – 70K students
  – 8K employees
  – Not a campus: spread all over the city → no big datacenter but many small sites
- Owns and manages an optical infrastructure with an MPLS-based MAN on top
- Proud host of a GARR network PoP
- Surrounded by other research/educational institutions (CNR, Sant'Anna, Scuola Normale…)
How we use CEPH
Currently in production as the backend of an Openstack installation, it hosts:
- department tenants (web servers, etc.)
- tenants for research projects (DNA sequencing, etc.)
- tenants for us: multimedia content from e-learning platforms
Working on:
- An email system for students hosted on Openstack → RBD
- A sync&share platform → RadosGW
Timeline
- Spring 2014: planning starts:
  – Capacity/replica planning
  – Rack engineering (power/cooling)
  – Bare metal management
  – Configuration management
- Dec 2014: first testbed
- Feb 2015: 12-node cluster goes into production
- Jul 2015: Openstack goes into production
- Oct 2015: start deploying new Ceph nodes (+12)
Overview
- 3 sites (we started with 2):
  – One replica per site
  – 2 sites active for computing and storage
  – 1 site for storage and quorum only
- 2 different network infrastructures:
  – services (1Gb and 10Gb)
  – storage (10Gb and 40Gb)
Network
- Ceph client and cluster networks are realized as VLANs on the same switching infrastructure
- Redundancy and load balancing are achieved with LACP (a host-side sketch follows below)
- Switching platforms:
  – Juniper EX4550: 32p SFP
  – Juniper EX4200: 24p copper
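A rough host-side sketch of such a bond on Ubuntu 14.04 (ifenslave installed; interface names and addresses are placeholders, not the actual ones used here):

    # /etc/network/interfaces
    auto p2p1
    iface p2p1 inet manual
        bond-master bond0

    auto p2p2
    iface p2p2 inet manual
        bond-master bond0

    auto bond0
    iface bond0 inet static
        address 10.10.20.11
        netmask 255.255.255.0
        bond-slaves none
        bond-mode 802.3ad              # LACP; must match the aggregated ports on the switch
        bond-miimon 100
        bond-xmit-hash-policy layer3+4 # spread flows across both links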
Storage ring
- Sites interconnected with a 2x40Gb ERP (Ethernet Ring Protection) ring
- For storage nodes: 1 Virtual Chassis per DC:
  – Maximizes the bandwidth: 128Gb back-end inside the VC
  – Easy to configure and manage (NSSU)
  – No more than 8 nodes per VC
  – Computing nodes use a different VC
Hardware: OSD nodes
DELL R720XD (2U):
- 2x Xeon E5-2603 @ 1.8GHz: 8 cores total
- 64GB DDR3 RAM
- 2x 10Gb Intel X520 network adapters
- 12x 2TB SATA disks (6 disks/RU)
- 2x Samsung 850 256GB SSDs:
  – mdadm RAID1 for the OS
  – 6 partitions per disk for the XFS journals (see the sketch below)
- Ubuntu 14.04, Linux 3.13.0-46-generic #77-Ubuntu
- Linux bonding driver:
  – No special functions
  – Less complex
- Really easy to deploy with iDRAC
- Intended to be the virtual machine pool (faster)
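A minimal sketch of how a data disk and a pre-made SSD journal partition become an OSD on this kind of node (ceph-disk was the standard tool at the time; device names are placeholders):

    # one SATA disk as OSD data, its FileStore journal on an SSD partition
    ceph-disk prepare --fs-type xfs /dev/sdc /dev/sda3
    ceph-disk activate /dev/sdc1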
Hardware: OSD nodes
Supermicro SSG6047R-OSD120H:
- 2x Xeon E5-2630v2 @ 2.60GHz: 24 cores total
- 256GB DDR3 RAM
- 4x 10Gb Intel X520 network adapters
- 30x 6TB SATA disks (7.5 disks/RU)
- 6x Intel S3700 SSDs for the XFS journals:
  – 1 SSD → 5 OSDs (see the partitioning sketch below)
- 2x SSD in RAID1 for the OS (dedicated)
- Ubuntu 14.04, Linux 3.13.0-46-generic #77-Ubuntu
- Linux bonding driver:
  – No special functions
  – Less complex
- Intended to be the object storage pool (slow)
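A rough sketch of carving one journal SSD for five OSDs (device name and the 5 GiB journal size are assumptions; the GUID is the partition type ceph-disk expects for journals):

    # five journal partitions on a hypothetical /dev/sdx
    for i in 1 2 3 4 5; do
        sgdisk --new=${i}:0:+5G \
               --typecode=${i}:45b0969e-9b03-4f30-b4c6-b4b80ceff106 \
               /dev/sdx
    done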
Hardware: monitor nodes
Sun SunFire X4150:
- Physical hardware, not virtual (3 in production, going to be 5)
- Ubuntu 14.04, Linux 3.13.0-46-generic #77-Ubuntu
- 2x Intel Xeon X5355 @ 2.66GHz
- 2x 1Gb Intel NICs for the Ceph client network (LACP)
- 16GB RAM
- 5x 120GB Intel S3500 SSDs in RAID10 + hot spare
Racks plans
NOW: computing and storage are mixed.
- 24U OSD nodes
- 4U Computing nodes
- 2U monitor/cache
- 10U network
IN PROGRESS: computing and storage will be in dedicated racks. For storage:
- 32U OSD nodes
- 2U monitor/cache
- 8U network
For computing:
- 32U for computing nodes
- 10U network
The storage network fan-out is optimized
CRUSH configuration essentials

CRUSH tree (excerpt):
 -1  262.1   root default
-15   87.36      datacenter fibonacci
-16   87.36          rack rack-c03-fib
-14   87.36      datacenter serra
-17   87.36          rack rack-02-ser
-18   87.36          rack rack-03-ser
-35   87.36      datacenter ingegneria
-31                  rack rack-01-ing
-32                  rack rack-02-ing
-33                  rack rack-03-ing
-34                  rack rack-04-ing

rule serra_fibo_ing_high-end_ruleset {
    ruleset 3
    type replicated
    min_size 1
    max_size 10
    step take default
    step choose firstn 0 type datacenter
    step chooseleaf firstn 1 type host-high-end
    step emit
}
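To make a pool use such a rule, the pre-Luminous command has this form (the pool name here is only an example):

    ceph osd pool set volumes crush_ruleset 3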
Tools
Just 3 people working on CEPH (not at 100%) and we need to grow quickly → automation is REALLY important.
- Configuration management: Puppet (a sketch follows below)
  – Most of the classes are already production-ready
  – A lot of documentation (best practices, books, community)
- Bare metal installation: The Foreman
  – Complete lifecycle for hardware
  – DHCP, DNS, Puppet ENC
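A sketch of what a node could look like with the community puppet-ceph module (class and parameter names are the module's; fsid, addresses and devices are placeholders):

    class { 'ceph::repo': }

    class { 'ceph':
      fsid            => 'a5a6f2f0-0000-0000-0000-000000000000',
      mon_host        => '10.10.10.1,10.10.10.2,10.10.10.3',
      public_network  => '10.10.10.0/24',
      cluster_network => '10.10.20.0/24',
    }

    # one OSD per data disk, journal on a dedicated SSD partition
    ceph::osd { '/dev/sdc':
      journal => '/dev/sda3',
    }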
Tools
For monitoring/alarming:
- Nagios + CheckMK
  – alarms
  – graphing
- Rsyslog
- Looking at collectd + Graphite (a sketch follows below)
  – metrics correlation
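A minimal collectd sketch for that pipeline (daemon name, socket path and Graphite host are assumptions):

    LoadPlugin ceph
    LoadPlugin write_graphite

    <Plugin ceph>
      <Daemon "osd.0">
        SocketPath "/var/run/ceph/ceph-osd.0.asok"
      </Daemon>
    </Plugin>

    <Plugin write_graphite>
      <Node "graphite">
        Host "graphite.example.org"
        Port "2003"
        Protocol "tcp"
      </Node>
    </Plugin>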
Test environment: Vagrant and VirtualBox, used to test whatever is hardware-independent:
- new functionalities
- Puppet classes
- upgrade procedures
Openstack integration
- It works out of the box (a config sketch follows below)
- Ceph as a backend for:
  – Volumes
  – VMs
  – Images
- Copy-on-write: VMs are spawned as snapshots/clones of their image
- Shared storage → live migration
- Multiple pools are supported
- Current issues (OpenStack Juno, Ceph Giant):
  – Massive volume deletion
  – Evacuate
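For reference, the usual Juno-era wiring of Cinder/Glance/Nova to RBD looks roughly like this (pool names, user and secret UUID are placeholders, not necessarily what is used here):

    # cinder.conf
    [DEFAULT]
    enabled_backends = ceph
    [ceph]
    volume_driver   = cinder.volume.drivers.rbd.RBDDriver
    rbd_pool        = volumes
    rbd_ceph_conf   = /etc/ceph/ceph.conf
    rbd_user        = cinder
    rbd_secret_uuid = 457eb676-0000-0000-0000-000000000000

    # glance-api.conf
    [glance_store]
    default_store  = rbd
    rbd_store_pool = images
    rbd_store_user = glance

    # nova.conf
    [libvirt]
    images_type     = rbd
    images_rbd_pool = vms
    rbd_user        = cinder
    rbd_secret_uuid = 457eb676-0000-0000-0000-000000000000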
Performance – rados bench writes
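The three runs below are rados bench write runs of different lengths; the command has the same shape as the read commands on the next slide, e.g. (pool name taken from the read slide, duration assumed):

    rados bench -p BenchPool 60 write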
                            60 s run      10 s run      120 s run
    Total time run          60.308706     10.353915     120.537838
    Total writes made       5942          1330          12593
    Write size              4194304       4194304       4194304
    Bandwidth (MB/sec)      394.106       513.815       417.894
    Stddev Bandwidth        103.204       161.337       84.4311
    Max bandwidth (MB/sec)  524           564           560
    Min bandwidth (MB/sec)  0             0             0
    Average Latency         0.162265      0.123224      0.153105
    Stddev Latency          0.211504      0.0928879     0.175394
    Max latency             2.71961       0.955342      2.05649
    Min latency             0.041313      0.045272      0.038814
Performance – rados bench reads
rados bench -p BenchPool 10 rand
    Total time run:       10.065519
    Total reads made:     1561
    Read size:            4194304
    Bandwidth (MB/sec):   620.336
    Average Latency:      0.102881
    Max latency:          0.294117
    Min latency:          0.04644

rados bench -p BenchPool 10 seq
    Total time run:       10.057527
    Total reads made:     1561
    Read size:            4194304
    Bandwidth (MB/sec):   620.829
    Average Latency:      0.102826
    Max latency:          0.328899
    Min latency:          0.041481
Performances: adding VMs
What to measure:
- How latency is influenced by IOPS, measured while adding VMs (fixed load generator)
- How total bandwidth decreases while adding VMs

Setup:
- 40 VMs on Openstack, each with two 10GB volumes (pre-allocated with dd):
  – one with a bandwidth cap (100MB/s)
  – one with an IOPS cap (200 total)
- fio is used as the benchmark tool, launched from a master node with dsh (command lines on the next slide)

Reference:
- Measure Ceph RBD performance in a quantitative way: https://software.intel.com/en-us/blogs/2013/10/25/measure-ceph-rbd-performance-in-a-quantitative-way-part-i
Fio
fio --size=1G \
    --runtime=60 \
    --ioengine=libaio \
    --direct=1 \
    --rw=randread [randwrite] \
    --name=fiojob \
    --blocksize=4K \
    --iodepth=2 \
    --rate_iops=200 \
    --output=randread.out

fio --size=4G \
    --runtime=60 \
    --ioengine=libaio \
    --direct=1 \
    --rw=read [write] \
    --name=fiojob \
    --blocksize=128K [256K] \
    --iodepth=64 \
    --output=seqread.out
Performance – write (charts)
Performance – read (charts)
Dealing with:
Software:
- Slow requests / blocked operations
- Scrub errors: fix them with pg repair and check the logs (commands below)

Automation:
- When something is broken, Puppet can make it worse

Hardware:
- Most of the problems came from hardware (disks, controllers, nodes): but maybe we are too small…
- More RAM = less PAIN (especially during recovery/rebalancing)
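A minimal sketch of the scrub-error routine (the PG id is a placeholder):

    ceph health detail                       # list the inconsistent PGs
    ceph pg repair 3.1a                      # ask the primary OSD to repair that PG
    grep ERR /var/log/ceph/ceph-osd.*.log    # find the object/disk behind the error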
...so what?
- Ceph is addressing our needs:
  – It performs (well?)
  – It's robust
- In about 9 months, production and non-production, nothing really bad has happened.

Now we are going to:
- Work more on monitoring and performance graphing
- Run more benchmarks to understand what to improve
- Add SSD cache
- Activate RadosGW (in production) and the slow pool
Questions ?
For you:
- VMware support?
- Xen/XenServer?
- SMB/NFS/iSCSI?