MongoDB for a High Volume Logistics Application


SLIDE 1

Santa Clara, California | April 23rd – 25th, 2018

MongoDB for a High Volume Logistics Application

SLIDE 2

about me ...

Eric Potvin

Software Engineer in the performance team

at Shipwire, an Ingram Micro company, in Sunnyvale, California

SLIDE 3

… A little background

SLIDE 4

who are we?

We offer a cloud-based fulfillment software platform. This platform enables thousands of brands and online retailers to manage their order fulfillment operations.

We support 20+ warehouses in multiple countries, including the USA, Canada, Australia, Italy, Germany, and China

SLIDE 5

warehouses are … old-fashioned

Some warehouses are unable to easily adapt their systems to new technologies
Warehouses are using old infrastructure: legacy servers (AS/400) or service providers
Warehouses understand files … and FTP

SLIDE 6

what we have to deal with

Millions of files received monthly
Gigabytes of various document file types (XML, TXT/CSV, PDF)
Limitations on files received (raw files vs. zip files)
Limitations on FTP connections

SLIDE 7

lots of data to maintain

8 processing servers
Ingesting millions of files per month
Thousands of log files
100+ GB of monthly logs / 250+ GB of data files

SLIDE 8

server resources & limitations

By manipulating so many files, we suffer from high server resource consumption

  • Lots of processes with constant high CPU usage
  • Each process has high RAM usage,
  • And high network usage - GBs of data transferred hourly
SLIDE 9

searching for information can be tedious

Often, we need to look for data in case of errors or a common “we didn’t receive these files”
Data and logs are not available to users
Finding information requires an engineer to connect to each server

SLIDE 10

what about...

NFS? This would eliminate the lookup across servers, but still has some issues:

  • Still a large number of files
  • Network overhead for large files
  • And … -bash: /bin/ls: Argument list too long

MySQL?

  • Changing the data structure requires maintenance
SLIDE 11

SLIDE 12

SLIDE 13

… so why did we choose MongoDB?

SLIDE 14

get all data at no cost?

Analytics software is great and allows any user to see data
But it can be costly and limited
MongoDB gives us the flexibility to save what we need
With no monthly or setup fee

SLIDE 15

better integrations

All data is now visible to all users
Can be integrated with our in-house applications
A self-service tool allows users to take action immediately in case of issues
Accurate real-time tracking of documents
Real-time monitoring of documents and server resources

SLIDE 16

no more frequent reads/writes

No more slow CRUD operations on an XML file on disk
Avoid millions of disk and memory operations
It also makes our code healthier …

SLIDE 17

simplified code

From:

Document doc = db.parse(<my_file>);
Element elem = doc.getDocumentElement();
NodeList nl = elem.getElementsByTagName(<child>);
for (int i = 0; i < nl.getLength(); i++) {
    NodeList node = ((Element) nl.item(i)).getElementsByTagName(<tag>);
    for (int j = 0; j < node.getLength(); j++) {
        // fetch the data I need
        // and update later
    }
}

To:

mongoClient.getDatabase(myDatabase)
    .getCollection(myCollection)
    .find(search)
    .projection(whatINeed);

// and update later
collection.update(search, dataToUpdate);

SLIDE 18

available to everyone, instantly

Now all our apps can access MongoDB
Microservices can access the same data without delay
Data is available instantly, even after multiple manipulations

SLIDE 19

another ALTER? seriously? ...

No more “system under maintenance” because we need to alter a big table
No need to worry about schema updates when a warehouse changes its file format
And no need to store the entire content in a blob and try to search within it (see the sketch below)
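A minimal mongo-shell sketch of why this works: documents with different shapes can live in the same collection, so a new field from a warehouse needs no migration (the warehouse_files collection and its fields are hypothetical, for illustration only):

// hypothetical collection and fields, for illustration only
db.warehouse_files.insertOne({ file: "orders_1234.xml", warehouse: "CHI1", received: ISODate() });
// a warehouse starts sending an extra field; no ALTER, no migration
db.warehouse_files.insertOne({ file: "orders_1235.xml", warehouse: "CHI1", received: ISODate(), carrier: "UPS" });
// both documents remain queryable side by side
db.warehouse_files.find({ warehouse: "CHI1" });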

SLIDE 20

where is my data?

Can access data using a “single point of access” (it all depends which secondary I am reading from)
Faster data access with multiple secondaries
No more “file locked” … and waiting for unlock ...
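A minimal shell sketch of reading from secondaries (assuming a replica-set connection; setReadPref applies to the current shell session, and the orders collection is hypothetical):

// prefer a secondary for reads, falling back to the primary if none is available
db.getMongo().setReadPref("secondaryPreferred");
db.orders.find({ status: "shipped" });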

SLIDE 21

server goes down, no big deal

The election process is fantastic!
No more downtime due to single points of failure
Easy to expand and/or upgrade

SLIDE 22

SLIDE 23

How did we reduce server resource usage?

SLIDE 24

example of manipulating a single order

One order from Chicago, USA to Québec City, Canada, using an international carrier, with one product ordered. This requires at least 7 XML files and 3 PDF files to be created.

SLIDE 25

shipping confirmation example

This file contains multiple nodes with details about the shipment

  • Tracking numbers
  • Number of boxes shipped
  • Carrier details
  • etc...

File size can be up to a few megabytes

SLIDE 26

nested loops of … O(n*r)?

Looping through a file of a few megabytes is slow

  • Each loop calls an API and updates database records

If the process crashes, where do we start from?

  • Manual recovery

Constant monitoring of server resources

SLIDE 27

iterations (what we used to have)

Open the entire file in memory
Loop through each record
For each record, loop through each box shipped
For each box shipped, loop through each product (quantity shipped, reason if not shipped)

SLIDE 28

Enough! Let’s keep this simple: O(1)

SLIDE 29

no more loops ...

Save only the data we care about

  • Our own standard format, using kilobytes of data

Higher efficiency when searching documents

  • One simple document, one single query (see the sketch below)
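A minimal mongo-shell sketch of the idea: one compact document replaces megabytes of XML, and one query replaces the nested loops (the shipments collection and all field values are illustrative, not our actual schema):

// one compact document per shipping confirmation
db.shipments.insertOne({
  orderId: 1234,
  carrier: "UPS",
  tracking: ["1Z999AA10123456784"],
  boxesShipped: 2,
  items: [ { sku: "ABC-1", qtyShipped: 1 } ],
  updated: ISODate()
});

// one simple document, one single indexed query
db.shipments.createIndex({ orderId: 1 });
db.shipments.findOne({ orderId: 1234 });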
SLIDE 30

“Stateful” resource

Keep track of data changes inside the document (sketched below)
No more intensive memory and disk usage due to multiple file manipulations
Real-time manual changes from a UI by any user
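One way to keep that state inside the document is an embedded history array; a minimal shell sketch, continuing the hypothetical shipments example from earlier:

// update the current status and append the change to the document's own history
db.shipments.updateOne(
  { orderId: 1234 },
  {
    $set: { status: "shipped" },
    $push: { history: { status: "shipped", by: "ui_user", at: ISODate() } }
  }
);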

SLIDE 31

Fault tolerant

MongoDB gives us persistent data (server reboot, segmentation fault, etc…)
Eliminates memory issues when reading multiple large text files in memory
Frees up resources for other applications running on the same server

SLIDE 32

server resources

This results in processes with very low resource usage
CPU percentage and load went down drastically
Network usage dropped considerably

SLIDE 33

disk utilization

No more -bash: /bin/ls: Argument list too long
Lots of free space reused for something else
No more frequent “cleanup” or disk maintenance
No more file archiving/maintenance to a backup server
No more disk at 95% utilization alerts

SLIDE 34

Let’s see a simple example

SLIDE 35

Application logs

SLIDE 36

application logs (what we used to have)

Each application logs its data to its own specific files
Each log uses a different log level based on what is executed:
CRIT (0), ERR (1), WARN (2), INFO (3), DEBUG (4)
Logs are saved with the following format in /var/log/my_application/my_app.log

2017-11-12T03:50:02-08:00 [ INFO / 3 ] (PID: 12345): My message

SLIDE 37

application log (search)

To search, we simply need to run:

for x in $(seq 1 8); do
  ssh "p$x.myserver" grep -r "my search" /logs/app/*
done

… wait … and … wait

SLIDE 38

no more! let’s fix this

SLIDE 39

logging in MongoDB

Each application logs its data to its own specific namespace
Database used: <application_name>
Collection used: <application_specific>
Example: warehouse.sending_files
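A minimal shell sketch of writing one entry into such a namespace; the insertOne call is an illustration, and the document shape follows the example on the next slide:

use warehouse
db.sending_files.insertOne({
  datetime: ISODate(),
  level: "INFO",
  code: 3,
  pid: 12345,
  message: "file orders_1234.zip sent to /inbound/"
});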

SLIDE 40

logging in MongoDB (example)

{ "datetime": ISODate(), "level": "INFO", "code": 3, "pid": 12345, "message": "file orders_1234.zip sent to /inbound/" }

SLIDE 41

MongoDB log (search)

use logs;
db.my_app.find();
db.my_app.find({ level: "INFO" });
db.my_app.find({ message: /some specific data/ });

SLIDE 42

archiving logs

Archiving data can be done by using a TTL index (see the sketch below)

  • Warning: the TTL monitor runs every 60 seconds over all namespaces to identify which records need to be removed. This can slow down data access.
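A minimal shell sketch of a TTL index that expires log entries after one year (the 31536000-second value is illustrative):

// documents are removed once their datetime is older than expireAfterSeconds
db.my_app.createIndex({ datetime: 1 }, { expireAfterSeconds: 31536000 });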

Another way is to create a daemon that generates “yearly or monthly” collections. Then, use mongodump to archive the records.
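A rough sketch of that rollover idea in the shell; the archive collection name and the cutoff date are hypothetical, and note that $out replaces the target collection if it already exists:

// move last month's records into an archive collection, then delete them
var cutoff = new Date();
cutoff.setMonth(cutoff.getMonth() - 1);
db.my_app.aggregate([
  { $match: { datetime: { $lt: cutoff } } },
  { $out: "my_app_2018_03" }
]);
db.my_app.deleteMany({ datetime: { $lt: cutoff } });

The archive collection can then be exported with mongodump and dropped.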

SLIDE 43

SLIDE 44

So … What can MongoDB do for you?

SLIDE 45

Q+A?

SLIDE 46

Thank You!