Where to store all the IoT Data?
Piotr Robert Konopelko Business & Technical Support Manager, MooseFS Pro piotr.konopelko@moosefs.pro
Linux Servers + MooseFS = Storage
High Availability, High Performance, Petabytes, Long-term
MooseFS Software-Defined Storage is designed for mission-critical applications with high availability and high performance requirements.
No Single Point of Failure (SPOF-less configuration). File system metadata is kept in two or more copies on physically redundant servers. User data is spread redundantly across the storage servers in the system.
with more than 2 billion files.
Designed to support high-performance I/O operations. User data is read and written in parallel, directly on many storage nodes at once, avoiding the bottleneck of a single central server or a single network connection.
The MooseFS storage software is designed for Big Data support. MooseFS provides a virtually limitless pool of storage to support the most demanding distributed workloads.
All-Flash and Hybrid storage setups are supported. Disks and servers from different manufacturers may be used in one storage system. Older and newer technology may be mixed if required. MooseFS can be run on a wide range of clusters, from Raspberry Pi to Enterprise Servers, on many architectures.
MooseFS-based storage is built with commodity components from virtually any manufacturer. It supports all major disk and disk interface types: SATA / SAS, SSD / HDD.
The support you get comes directly from the software manufacturer. We know each line of the source code, so we can solve each issue.
Open source software with commodity components and operating systems makes MooseFS safe and usable for its entire lifetime.
Tiering and rolling upgrade features extend hardware life, as older, smaller and slower servers and/or disks may be used in less intensive tiers.
– Sensors
– Control Systems
– Data Acquisition Equipment
Country: Warsaw, Poland
Market: Media & Entertainment, Market analysis
Company: Medium company with 20+ European offices
Purpose of use: MooseFS is used as primary storage for Internet traffic measurement data, which is the core business data for this company. MooseFS is installed on a few separate storage clusters, where heavy calculations are performed alongside data storage. Concurrently used by 300+
thousands of new records per second. Clusters have been online since 2005 and are built with over 150 servers storing a few petabytes in total.
Country: Boston, MA, USA
Market: Healthcare, Education, Research
Company: One of the prominent Ivy League Universities
Purpose of use: MooseFS is used on two clusters in the university labs: the first is designed for medical data storage and processing, while the second is used to store VM disk images. A few dedicated custom features were added to the system for this customer. The clusters have been online since 2013 and store half a petabyte of crucial data.
Country: New York, USA
Market: Research, Education
Company: One of the prominent Ivy League Universities
Purpose of use: MooseFS is used as a storage solution for human genome research activities. An interesting use of the MooseFS atomic snapshot feature: creating and discarding copies of genome data for different proposed research algorithms. More than 10 petabytes of data are stored on a single cluster.
Country: Sweden
Market: Cloud & Hosting
Company: Cloud Provider
Purpose of use: MooseFS is used as storage for on-premises cloud solutions for their customers. Around 30 servers and 50 end-application connections form a storage cluster of several petabytes. The solution includes storage for surveillance systems.
Country: France
Market: Research
Company: European oceanographic research institute
Purpose of use:
MooseFS is used as primary storage for research and data analysis for the exploration
machines where data is stored. Nearly half a billion files, mostly satellite images, take up around 4 petabytes and are stored on over a hundred servers, serving a few hundred computing applications.
Country: Poland
Market: Cloud & Hosting
Company: Data Center Storage, CDN
Purpose of use:
MooseFS is used as primary distributed storage in one of their Data Centers in
hosting and Content Delivery Network (mainly images and videos) for the company's customers; the second one is based on MooseFS + Proxmox integration and provides VPS
Country: Middle East
Market: Cloud & Hosting
Company: CDN, Cloud & Hosting
Purpose of use: MooseFS is used as a core backend storage solution for CDN clusters (images, ringtones, and videos). A few hundred million files store a couple
Network Provider.
Country: Warsaw, Poland
Market: Media & Entertainment, Internet television
Company: A division of the biggest Polish media group
Purpose of use: MooseFS is used as the backend of the Content Delivery Network (CDN) service for internet TV, serving Video on Demand content. Interesting case
are used to store a few Petabytes of video and image content.
All system components are redundant and in case of failure there's an automatic failover mechanism that is transparent to the users.
Support for scheduling computation on data nodes, which improves overall system TCO by using idle CPU and memory resources.
Ability to perform system upgrades one node at a time without service disruption, including hardware replacement and additions. This makes it possible to keep the hardware platform up to date with no downtime.
Instant, uninterrupted provisioning of the state
This feature is ideal for on-line backup solutions.
Support for POSIX, the family of standards specified by the IEEE to clarify and unify the application programming interfaces (and ancillary issues, such as command-line shell utilities) provided by Unix-like operating systems.
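In practice, POSIX compliance means an application can use ordinary file calls against the mounted file system without a storage-specific SDK. The following is a minimal sketch of that idea, not MooseFS-specific code: the function name, the file name and the use of a temporary directory in place of a real mount point are all assumptions made for illustration.

```python
import os
import stat
import tempfile


def store_reading(directory, name, payload):
    """Write payload with plain POSIX-style calls, publish it atomically, read it back."""
    final_path = os.path.join(directory, name)
    tmp_path = final_path + ".tmp"

    # Write to a temporary name first and force the data to stable storage.
    with open(tmp_path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())

    # rename() within one file system is atomic under POSIX:
    # readers see either no file or the complete file.
    os.replace(tmp_path, final_path)

    # Standard Unix permission bits: owner read/write, group read.
    os.chmod(final_path, 0o640)

    mode = stat.S_IMODE(os.stat(final_path).st_mode)
    with open(final_path, "rb") as f:
        return f.read(), mode


# Assumption: a temporary directory stands in for the real mount point here.
with tempfile.TemporaryDirectory() as mount_point:
    data, mode = store_reading(mount_point, "sensor-0001.bin", b"\x01\x02\x03")
    print(data, oct(mode))
```

Pointing `directory` at a directory on a mounted POSIX file system would be the only change needed; the calls themselves stay the same.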
Ensures data redundancy using error correction code algorithms with up to 9 parity parts. This saves raw disk space compared to plain data duplication.
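The space saving can be illustrated with simple arithmetic. The sketch below compares the raw capacity needed for parity-based redundancy with the capacity needed for full copies. The stripe layout of 8 data parts is an assumption chosen for illustration, as are the function names; the slides only state that up to 9 parity parts are supported.

```python
def raw_space_erasure_coded(user_data_tb, data_parts, parity_parts):
    """Raw capacity when every stripe of `data_parts` pieces carries `parity_parts` extra pieces."""
    return user_data_tb * (data_parts + parity_parts) / data_parts


def raw_space_replicated(user_data_tb, copies):
    """Raw capacity when every piece of data is stored as `copies` full copies."""
    return user_data_tb * copies


user_data_tb = 100

# Assumption: 8 data parts per stripe (hypothetical layout, not taken from the slides).
# Both setups below survive the loss of any two parts or copies.
ec = raw_space_erasure_coded(user_data_tb, data_parts=8, parity_parts=2)
rep = raw_space_replicated(user_data_tb, copies=3)

print(f"8 data + 2 parity parts: {ec} TB raw")   # 125.0 TB
print(f"3 full copies:           {rep} TB raw")  # 300 TB
```

Under these assumptions, protecting 100 TB against two simultaneous failures takes 125 TB of raw space with parity parts and 300 TB with three full copies.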
A virtual, global space for deleted objects, configurable for each individual file and directory. A very useful feature for recovering data deleted accidentally through human error.
Limits set by a system administrator to restrict certain aspects of file system usage and to allocate limited disk space in a reasonable way, e.g. the number of i-nodes
Access to files and directories is based on a standard Unix access control model enhanced with standard Access Control Lists.
For performance reasons there is a dedicated client component for Linux, FreeBSD and Mac OS X systems.
Performs all I/O operations in parallel threads of execution to deliver good read/write performance.
A standard 1-200 Gbps Ethernet network is used for all communication, with support for LACP configurations.