Distributed File Storage in Multi-Tenant Clouds using CephFS
OpenStack Vancouver, 2018 May 23
Patrick Donnelly, CephFS Engineer, Red Hat, Inc.
Tom Barron, Manila Engineer, Red Hat, Inc.
Ramana Raja, CephFS Engineer, Red Hat, Inc.
Openstack Sydney: 2017 November 06
The Ceph storage stack:
RGW: S3- and Swift-compatible object storage with object versioning, multi-site federation, and replication.
LIBRADOS: a library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby, PHP).
RADOS: a software-based, reliable, autonomic, distributed object store comprised of self-healing, self-managing, intelligent storage nodes (OSDs) and lightweight monitors (MONs).
RBD: a virtual block device with snapshots, copy-on-write clones, and multi-site replication.
CEPHFS: a distributed POSIX file system with coherent caches and snapshots on any directory.
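Each layer above has its own client interface. As a sketch, one illustrative command per layer (the pool, bucket, image, and object names are placeholders, not from the slides; the commands are echoed rather than executed, since they need a live cluster):

```shell
# One illustrative command per Ceph interface (all names are placeholders):
RGW_CMD="s3cmd put ./payload.bin s3://mybucket/"         # S3-compatible object storage via RGW
RADOS_CMD="rados -p mypool put myobject ./payload.bin"   # raw object access via the RADOS CLI
RBD_CMD="rbd create mypool/myimage --size 1024"          # 1 GiB virtual block device
echo "$RGW_CMD"; echo "$RADOS_CMD"; echo "$RBD_CMD"
```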
CephFS is a distributed POSIX file system layered on RADOS.
All file system metadata and data are stored in RADOS; the MDS has no local state.
Two client implementations:
○ FUSE: ceph-fuse ...
○ Kernel: mount -t ceph ...
Client caches are kept coherent via inode capabilities, which enforce synchronous or buffered writes.
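As a sketch, the two client mounts might look like the following; the monitor address, cephx user, and mount point are placeholders, not from the slides, and the commands are echoed rather than executed because both require a running cluster and root:

```shell
# Kernel client (placeholder monitor address and cephx user):
KERNEL_MOUNT="mount -t ceph 192.0.2.10:6789:/ /mnt/cephfs -o name=manila,secretfile=/etc/ceph/manila.secret"
# FUSE client: same cluster through the userspace implementation:
FUSE_MOUNT="ceph-fuse -m 192.0.2.10:6789 --id manila /mnt/cephfs"
echo "$KERNEL_MOUNT"
echo "$FUSE_MOUNT"
```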
[Architecture diagram: clients issue metadata RPCs to ceph-mds and perform file I/O directly against RADOS; the MDS journals its metadata into RADOS.]
[Deployment diagram, CephFS native driver: controller nodes run the Manila API and share services plus Ceph MON and MGR; storage nodes run the Ceph OSDs; tenant VMs (tenants A and B) on compute nodes have two NICs, one on the external provider network and one on the storage (Ceph public) network, joined by routers. Open question on Ceph MDS placement: with the MONs, with the Python services, or dedicated?]
https://docs.openstack.org/manila/ocata/devref/cephfs_native_driver.html
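With the native driver, the tenant-facing workflow goes through the Manila CLI. A sketch of the sequence (share, type, and user names are illustrative; the commands are echoed rather than executed since they need a live cloud):

```shell
# Illustrative Manila workflow for a native CephFS share (names are placeholders):
CREATE="manila create --share-type cephfstype --name cephshare1 CephFS 1"
ALLOW="manila access-allow cephshare1 cephx alice"       # grant a cephx identity access
EXPORTS="manila share-export-location-list cephshare1"   # path the tenant mounts via Ceph
echo "$CREATE"; echo "$ALLOW"; echo "$EXPORTS"
```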
[Comparison, native Ceph vs. NFS: with native Ceph, the client sends metadata updates and data updates straight to the storage cluster's server daemons (Monitor, Metadata Server, OSD Daemon), so the OpenStack client/Nova VM sits directly in the data plane and HA comes from the storage cluster itself (MON, MDS, OSD); with NFS, the client reaches an NFS gateway instead, whose HA is provided externally (Pacemaker/Corosync).]
[Deployment diagram, CephFS NFS driver: controller nodes run the Manila API and share services plus Ceph MON and MDS; storage nodes run the Ceph OSDs; tenant VMs (tenants A and B) on compute nodes have two NICs, one on the external provider network and one on the storage NFS network, joined by routers. Same open question on Ceph MDS placement: with the MONs, with the Python services, or dedicated?]
https://docs.openstack.org/manila/queens/admin/cephfs_driver.html
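On the NFS path, access rules are IP-based and tenants mount through the Ganesha gateway rather than talking to Ceph directly. A sketch (the share name, gateway address, and export path below are placeholders; echoed, not executed):

```shell
# Illustrative NFS-side workflow (share name, gateway address, export path are placeholders):
ALLOW="manila access-allow mynfsshare ip 203.0.113.0/24"
MOUNT="mount -t nfs -o vers=4.1 198.51.100.5:/volumes/_nogroup/share-1234 /mnt/share"
echo "$ALLOW"; echo "$MOUNT"
```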
[Deployment diagram, Ganesha under Kubernetes: controller nodes run the Manila API and share services, Ceph MON/MDS/MGR, and Kubernetes; Ceph OSDs sit on the Ceph public network; tenant VMs for tenants A and B on compute nodes reach shares through routers from the external provider network.]
[Diagram: Ganesha NFS gateways orchestrated via Rook/Kubernetes with Kuryr as the network driver. A Kubernetes pod (HA managed by Kubernetes) hosts the MDS, OSDs, Ganesha NFSGW, and MGR alongside Manila.]
Manila gets/puts shares through the MGR's REST API, publishing its intent; a share specifies the CephFS name, export paths, network share (e.g. Neutron ID+CIDR), and share server count. Manila reaches the cluster through /usr/bin/ceph.
The MGR spawns a Ganesha container in the share's network, pushes its config, and starts the grace period; the gateway gets/puts client state in RADOS, fetches its share/config, advertises itself to the ServiceMap, and performs metadata I/O against the MDS and data I/O against the OSDs.
HA is managed by Kubernetes; scale-out and shares are managed by the MGR.
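The "share intent" that Manila publishes over the REST API carries the four fields named on the slide; it can be sketched as a record (the variable names and values below are illustrative, not an actual API schema):

```shell
# The four fields of a published share intent (values and names are illustrative):
SHARE_CEPHFS_NAME="cephfs"                          # which CephFS file system to export
SHARE_EXPORT_PATH="/volumes/_nogroup/share-1234"    # export path within it
SHARE_NETWORK="neutron-uuid,203.0.113.0/24"         # network share: Neutron ID + CIDR
SHARE_SERVER_COUNT=2                                # scale-out factor for the gateways
echo "fs=$SHARE_CEPHFS_NAME path=$SHARE_EXPORT_PATH net=$SHARE_NETWORK servers=$SHARE_SERVER_COUNT"
```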
[Diagram: StatefulSet #1 runs pods #1 and #2, each hosting a Ganesha NFSGW; a single Kubernetes Service fronts them for the NFS clients.]
One IP to access the NFS cluster per tenant, reachable only via the tenant network. The number of pods equals the scale-out factor for a "CephFSShareNamespace"; can it dynamically grow/shrink? This requires stable network identifiers for the pods!
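The "stable network identifiers" requirement is what a StatefulSet plus a headless Service provide in Kubernetes: each pod gets a predictable DNS name of the form `<pod>.<service>.<namespace>.svc.cluster.local`. A sketch, where the StatefulSet, Service, and namespace names are assumptions:

```shell
# Stable per-pod DNS names for a StatefulSet "ganesha" behind a headless
# Service "ganesha-nfs" in namespace "manila" (all names illustrative):
STATEFULSET="ganesha"; SERVICE="ganesha-nfs"; NAMESPACE="manila"
POD0_DNS="$STATEFULSET-0.$SERVICE.$NAMESPACE.svc.cluster.local"
POD1_DNS="$STATEFULSET-1.$SERVICE.$NAMESPACE.svc.cluster.local"
echo "$POD0_DNS"; echo "$POD1_DNS"
```

Because the names are derived from the pod ordinal, a Ganesha gateway that is rescheduled keeps the same identity, which is what lets NFS clients reconnect through the grace period.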
Patrick Donnelly pdonnell@redhat.com Thanks to the CephFS team: John Spray, Greg Farnum, Zheng Yan, Ramana Raja, Doug Fuller, Jeff Layton, and Brett Niver. Homepage: http://ceph.com/ Mailing lists/IRC: http://ceph.com/IRC/