ZODB Performance

ZODB is very interesting software, quite different from mainstream databases, so it is only natural to ask about performance. Yes, it offers great speed of software development, but how does it perform at runtime? In particular, a ZODB application is just a Python application, with potentially lots of disk access to small Python objects, and with lots of traversals of the object graph, particularly if used in Zope. So how does ZODB perform?

 

Introduction

To understand ZODB performance, one needs to understand the structure of ZODB. There are many ways to store ZODB data, and each one has different performance characteristics. The simplest is file storage, where all the data is stored in one file. Large objects, such as images and documents, can be stored as blobs, each kept in a separate file. For increased performance, one can use ZEO as a shared database server, and cache the objects in ZEO server memory or in application server memory. For even better performance, one can use RelStorage to store the objects in a relational database, or NEO to store the objects in multiple relational databases.

ZODB Read and Writes

There is a difference between ZODB read and write performance. ZODB is generally considered best for applications that have more reads than writes. Web applications are usually like that. Write-heavy applications, such as banking or airline reservation systems, should use a different technology.

So how do reads work? With file storage, ZODB runs off a single file. Reads can be all over the file; writes are only at the end of the file. Since a single file can be distributed across multiple hard drive spindles, in principle read parallelism is possible, but in practice the read speed is limited by the speed of the ZEO server process. Writes are different. ZODB only appends to the end of the file, and writes may run into conflicts and require retries. Together, these limit the write performance of ZODB.


And of course the server caches data from the disk, and that also has a big impact on performance. 
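The read and write pattern described in this section can be sketched with a toy store. This is not ZODB's real file format or API: the class AppendOnlyStore, its ConflictError and the helper increment_with_retry are hypothetical names invented for this illustration. It only shows the idea of reading from anywhere, appending at the end, and retrying when a write was based on stale data.

```python
import io
import pickle
import struct


class ConflictError(Exception):
    """Raised when a write was based on an outdated revision (toy version)."""


class AppendOnlyStore:
    """Toy append-only object store; an illustration, not ZODB's format."""

    def __init__(self):
        self._file = io.BytesIO()  # stands in for the single data file
        self._index = {}           # oid -> offset of the latest record
        self._serial = {}          # oid -> revision counter

    def load(self, oid):
        # Reads may seek anywhere in the file.
        self._file.seek(self._index[oid])
        (length,) = struct.unpack(">I", self._file.read(4))
        return pickle.loads(self._file.read(length)), self._serial[oid]

    def store(self, oid, value, based_on):
        current = self._serial.get(oid, 0)
        if based_on != current:
            raise ConflictError(oid)
        data = pickle.dumps(value)
        # Writes only ever go to the end of the file.
        offset = self._file.seek(0, io.SEEK_END)
        self._file.write(struct.pack(">I", len(data)) + data)
        self._index[oid] = offset
        self._serial[oid] = current + 1
        return offset


def increment_with_retry(store, oid, attempts=3):
    """Reload and retry when another writer got there first."""
    for _ in range(attempts):
        value, serial = store.load(oid)
        try:
            store.store(oid, value + 1, based_on=serial)
            return value + 1
        except ConflictError:
            continue
    raise ConflictError(oid)


store = AppendOnlyStore()
first_offset = store.store("counter", 0, based_on=0)

# Two clients read the same revision, then both try to write.
value_a, serial_a = store.load("counter")
value_b, serial_b = store.load("counter")
second_offset = store.store("counter", value_a + 1, based_on=serial_a)
try:
    store.store("counter", value_b + 1, based_on=serial_b)
    conflicted = False
except ConflictError:
    conflicted = True

final_value = increment_with_retry(store, "counter")
```

The second client loses the race and has to reload before it can write. Every retry repeats work, which is one reason write-heavy workloads are the harder case.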

 

Small Databases

ZODB performs brilliantly on small databases. If your entire database fits in RAM, or even just the active part of it, then ZODB will cache it in the database server's RAM and in the application server's RAM, and reads will rarely touch the disk. Of course, you are still limited by the number of writes the database can handle.
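Why "the active part fits in RAM" matters so much can be seen with a toy least-recently-used cache. The class ObjectCache and the run_workload helper below are hypothetical names for this illustration, not ZODB's cache implementation, and the workload (cycling over 100 objects ten times) is an assumption chosen to make the effect obvious.

```python
from collections import OrderedDict


class ObjectCache:
    """Toy least-recently-used cache that counts hits and misses."""

    def __init__(self, capacity, load_from_storage):
        self.capacity = capacity
        self._load = load_from_storage
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, oid):
        if oid in self._entries:
            self._entries.move_to_end(oid)  # mark as most recently used
            self.hits += 1
            return self._entries[oid]
        self.misses += 1                    # would be a disk or network read
        value = self._load(oid)
        self._entries[oid] = value
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the least recently used
        return value


def run_workload(capacity, working_set=100, passes=10):
    """Cycle over the working set several times and report (hits, misses)."""
    cache = ObjectCache(capacity, load_from_storage=lambda oid: "object %d" % oid)
    for _ in range(passes):
        for oid in range(working_set):
            cache.get(oid)
    return cache.hits, cache.misses


fits = run_workload(capacity=100)         # working set fits in the cache
does_not_fit = run_workload(capacity=50)  # cache is half the working set
```

When the working set fits, only the first pass misses. With this cyclic pattern and a cache half the size, every access misses. In real ZODB the size of the per-connection object cache is set with the cache_size argument of ZODB.DB, counted in objects.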

Large Databases

Zope Corporation had databases with over a terabyte of blob data in S3 and a couple of hundred-gigabyte main databases.

ZODB on SSD

Every database's performance is ultimately limited by its hard drives.

Historically, relational databases were designed to load a bunch of identical records all at once. Trees were expensive, particularly across record types. On SSDs a database can afford to navigate the graph, and ZODB and Python make it easy to do so. In fact, you can navigate the graph faster than ZEO can handle the data. What a huge change to the world of software design.

Move ZODB to a solid-state drive, about 4 times more expensive, and you will no longer be limited by your hard drive performance. You will now be limited by the speed of the Python process on the server.

ZODB Scaling

ZODB scales by adding application servers that connect to a shared ZEO server. In its largest deployment, Zope Corporation had as many as 50 application servers, and that means far more web clients than that. Sharding is not used, but NEO allows multiple relational databases to serve ZODB data.

Reliability

Sure, ZODB is amazing at storing Python objects, but can you trust it with your data? Rather than giving you my opinion, here are some quotes:

"What is amazing is the reliability of zodb under extreme abuse. it has been beaten up over a decade. it's rock solid." Alan Runyan, Enfold Systems

I have the same experience.  Generally the people I speak to speak well of it. 



