Performance Analysis of MongoDB vs. PostGIS/PostgreSQL Databases for Line Intersection and Point Containment Spatial Queries


SLIDE 1

Performance Analysis of MongoDB vs. PostGIS/PostgreSQL Databases for Line Intersection and Point Containment Spatial Queries

Sarthak Agarwal, KS Rajan
September 16, 2015

Lab for Spatial Informatics
International Institute of Information Technology, Hyderabad

FOSS4G Seoul, South Korea

SLIDE 2

Problem Statement

Why do we need an Internet connection to get directions from one point to another?
Why not deploy the server on the mobile device itself?
Why query heavy SQL servers every time?

FOSS4G Seoul, South Korea | September 14th – 19th , 2015 Sarthak Agarwal, KS Rajan 2

SLIDE 3

Spatial Databases

∙ Spatial databases are currently based primarily on RDBMSs, e.g., PostGIS.
∙ They have great potential to store, manage, and query very large datasets.
∙ SQL databases face scalability and agility challenges.
∙ Spatial applications do not always have a fixed schema.

SLIDE 4

PostgreSQL/PostGIS

∙ PostgreSQL is an open source object-relational database management system (ORDBMS).
∙ PostGIS adds support for geographic objects to the PostgreSQL object-relational database.
∙ PostGIS functions fall into five broad categories:
  ∙ Management
  ∙ Conversion
  ∙ Retrieval
  ∙ Comparison
  ∙ Generation

SLIDE 5

Spatial Databases

∙ NoSQL ("Not only SQL") databases are non-relational data stores.
∙ They have great potential to store, manage, and query very large datasets.
∙ They perform better in cases where:
  ∙ query response time needs to improve;
  ∙ both the volume of stored data and the frequency at which it is accessed and processed are rising.
∙ Spatial applications must deal with schemas and data sizes that evolve over time.

SLIDE 6

MongoDB

∙ Document-oriented data store.
∙ High performance, while retaining some familiar properties of SQL.
∙ Stores GeoJSON objects.
∙ Multiple geospatial indexes per collection: 2d, 2dsphere.
∙ Data can be imported from CSV files by converting it into GeoJSON objects.
∙ No support for R-trees.
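
The CSV-to-GeoJSON import step can be sketched as follows. This is a minimal illustration, not the authors' import script: the column names `lon` and `lat`, the function name `csv_to_geojson_points`, and the sample rows are all assumptions made for the example.

```python
import csv
import io
import json


def csv_to_geojson_points(csv_text, lon_field="lon", lat_field="lat"):
    """Convert CSV rows with longitude/latitude columns into GeoJSON Point features.

    The column names are hypothetical; adjust them to the actual CSV header.
    """
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # GeoJSON coordinate order is [longitude, latitude].
        lon = float(row.pop(lon_field))
        lat = float(row.pop(lat_field))
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": dict(row),  # remaining columns become properties
        })
    return features


# Illustrative sample rows, not data from the study.
sample = "id,lon,lat\n1,78.35,17.44\n2,126.98,37.57\n"
docs = csv_to_geojson_points(sample)
print(json.dumps(docs[0]))
```

Documents of this shape could then be inserted into a collection and covered by a 2dsphere index on the `geometry` field, since that index type works on GeoJSON geometries.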

SLIDE 7

Comparison

SQL
∙ Not designed for distributed systems.
∙ Good for structured data; not very suitable for unstructured data, points, and lines.

NoSQL
∙ Distributed databases spread over multiple servers.
∙ Schema-less databases where multiple geometries can be stored in the same column.

SLIDE 8

Does NoSQL hold promise in the context of spatial databases and spatial queries?

SLIDE 9

How to compare

∙ Conventional databases.
∙ Geometry databases:
  ∙ Point in a polygon
  ∙ Line intersection
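
To make the two query types concrete, here is a hedged sketch of how each might be expressed in the two systems. The slides do not show the exact queries used, so everything below is an assumption: the table names `polygons` and `lines`, the `geom` and `geometry` field names, the SRID 4326, and the helper function names are hypothetical. Only the operators themselves (`$geoIntersects` in MongoDB, `ST_Contains` and `ST_Intersects` in PostGIS) are real.

```python
def mongo_polygons_containing_point(lon, lat, field="geometry"):
    """MongoDB filter: stored polygons that the point (lon, lat) falls on or inside."""
    point = {"type": "Point", "coordinates": [lon, lat]}
    return {field: {"$geoIntersects": {"$geometry": point}}}


def mongo_lines_intersecting(coords, field="geometry"):
    """MongoDB filter: stored geometries that intersect the given line."""
    line = {"type": "LineString", "coordinates": coords}
    return {field: {"$geoIntersects": {"$geometry": line}}}


# PostGIS equivalents as parameterised SQL strings (hypothetical schema).
POSTGIS_POINT_IN_POLYGON = (
    "SELECT id FROM polygons "
    "WHERE ST_Contains(geom, ST_SetSRID(ST_MakePoint(%s, %s), 4326))"
)
POSTGIS_LINE_INTERSECTION = (
    "SELECT a.id, b.id FROM lines a JOIN lines b "
    "ON a.id < b.id AND ST_Intersects(a.geom, b.geom)"
)

point_filter = mongo_polygons_containing_point(78.35, 17.44)
line_filter = mongo_lines_intersecting([[0.0, 0.0], [1.0, 1.0]])
print(point_filter)
```

The two point-containment forms are not strictly identical: `$geoIntersects` also matches a point lying on a polygon boundary, while `ST_Contains` does not, so a fair comparison would need to account for that.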

SLIDE 10

Datasets

∙ A synthetic dataset was created for every case.
∙ Best-case scenarios were tested.
∙ Data sizes ranged from small to very large.
∙ All data in the analysis was processed in memory; no secondary memory was used.
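
One way such synthetic datasets could be generated is sketched below. The slides do not describe the generator, so the uniform distribution, the bounding box, the seed, and the example sizes are all assumptions chosen for illustration.

```python
import random

# Hypothetical bounding box (min_x, min_y, max_x, max_y) in degrees.
BOUNDS = (0.0, 0.0, 10.0, 10.0)


def _random_position(rng, bounds):
    """Draw one [x, y] position uniformly inside the bounding box."""
    min_x, min_y, max_x, max_y = bounds
    return [rng.uniform(min_x, max_x), rng.uniform(min_y, max_y)]


def synthetic_points(n, seed=0, bounds=BOUNDS):
    """Generate n GeoJSON Point geometries, reproducibly for a given seed."""
    rng = random.Random(seed)
    return [
        {"type": "Point", "coordinates": _random_position(rng, bounds)}
        for _ in range(n)
    ]


def synthetic_lines(n, seed=0, bounds=BOUNDS):
    """Generate n two-vertex GeoJSON LineString geometries."""
    rng = random.Random(seed)
    return [
        {
            "type": "LineString",
            "coordinates": [
                _random_position(rng, bounds),
                _random_position(rng, bounds),
            ],
        }
        for _ in range(n)
    ]


# Example sizes only; the sizes used in the study are not given in the slides.
datasets = {size: synthetic_points(size) for size in (10, 100, 1000)}
lines = synthetic_lines(5, seed=42)
```

Fixing the seed keeps runs repeatable, so both databases can be loaded with exactly the same geometries.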

SLIDE 11

Performance

SLIDE 12

Results

∙ MongoDB performs better as the data size increases.
∙ PostGIS fails on very large datasets.
∙ Indexing improves the performance of both databases.
∙ PostGIS query time increases exponentially as the dataset grows, whereas MongoDB's stays within bounds.

SLIDE 13

Discussions

∙ These results suggest that MongoDB performs better by an average factor of 25x.
∙ This factor grows exponentially with data size, for both indexed and non-indexed operations.
∙ Based on these results, NoSQL databases appear better suited to simultaneous multi-user query systems, including Web-GIS and mobile-GIS.
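
A speedup factor like the one quoted above could be derived from timings along the following lines. The slides do not say how the timings were taken or how the average was computed, so this is only one plausible reading: median wall-clock time per query, and an arithmetic mean of per-size ratios. The function names are hypothetical and the workload is a stand-in, not a database query.

```python
import statistics
import time


def median_runtime(query, repeats=5):
    """Run `query` several times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        query()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


def mean_speedup(pairs):
    """Arithmetic mean of per-dataset speedups from (baseline, candidate) timings."""
    return statistics.mean(speedup(b, c) for b, c in pairs)


# Stand-in workload; in a real benchmark this would issue the spatial query.
elapsed = median_runtime(lambda: sum(range(10_000)))
print(f"median runtime: {elapsed:.6f} s")
```

Using the median rather than the mean of repeated runs makes the measurement less sensitive to one-off stalls such as cache warm-up.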

SLIDE 14

Conclusion

∙ Non-relational databases are better suited to multi-user query systems.
∙ They have the potential to be deployed on servers with limited computational power.
∙ Further studies are required.
∙ In future, we plan to extend the study to other spatial query functions as well as spatial algorithms.

SLIDE 15

Questions?
