Performance Evaluation of NoSQL Document Databases: Couchbase, CouchDB, and MongoDB
Abstract
1. Introduction
- Analysis of the main characteristics of NoSQL document databases;
- Performance and scale-up evaluation of NoSQL document databases using YCSB benchmark;
- Study of how the number of threads and the number of records affect query execution time in these databases.
2. Related Work
3. NoSQL Document Databases
3.1. Couchbase
- Provides triggers, CRUD (create, read, update, and delete) operations, ad hoc queries, MapReduce, and indexes;
- It uses MVCC (MultiVersion Concurrency Control);
- It supports master–master replication and master–slave replication;
- It supports horizontal and vertical scaling.
- It is not compatible with the four key properties of a transaction: atomicity, consistency, isolation, and durability (ACID properties);
- Indexing takes up too much RAM.
3.2. CouchDB
- Provides triggers, CRUD operations, and MapReduce;
- It uses master–master replication;
- It supports horizontal scaling;
- It uses MVCC.
- Views are temporary;
- Does not have support for ad hoc queries.
3.3. MongoDB
- It provides indexing, ad hoc queries, CRUD operations, and MapReduce;
- It is ACID-compliant;
- It provides native drivers for programming languages and frameworks;
- It supports master–slave replication;
- It supports horizontal, vertical, and tiered scaling;
- It uses MVCC.
- Data can easily be deleted by mistake due to the lack of relations;
- Indexing takes up too much RAM;
- It supports triggers only on MongoDB Atlas.
4. Yahoo! Cloud Serving Benchmark (YCSB)
- Workload properties: define the workload; for example, the combination of reading and writing, the distribution to be used, and the size and number of fields in a record;
- Runtime properties: define the properties specific to one execution of a workload; for example, the properties used to initialize the database interface layer, such as the database service hostname, and the number of client threads.
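As a rough illustration of the two property groups, the sketch below renders a YCSB-style properties file for workload A, using the values from the workload table in this article (20 fields of 500 bytes, 50% reads and 50% updates, Zipfian distribution). The workload keys follow YCSB's core workload property names as we understand them. The helper `render_properties` and the key `example.host` are hypothetical: each database binding defines its own connection property, and the article does not list the ones used.

```python
# Workload properties: what the benchmark does (values from the workload table).
WORKLOAD_A = {
    "recordcount": 100_000,
    "fieldcount": 20,        # 20 fields per record
    "fieldlength": 500,      # 500 bytes per field
    "readproportion": 0.5,
    "updateproportion": 0.5,
    "scanproportion": 0,
    "insertproportion": 0,
    "requestdistribution": "zipfian",
}

# Runtime properties: how one particular execution is carried out.
RUNTIME = {
    "threadcount": 3,
    # Hypothetical placeholder key, not a real YCSB property: the actual
    # connection setting depends on the database binding in use.
    "example.host": "localhost",
}


def render_properties(*groups):
    """Merge property groups and render one `key=value` pair per line."""
    merged = {}
    for group in groups:
        merged.update(group)
    return "\n".join(f"{key}={value}" for key, value in merged.items())


if __name__ == "__main__":
    print(render_properties(WORKLOAD_A, RUNTIME))
```

Keeping the two groups in separate dictionaries mirrors the split described above: the same workload definition can be reused while only the thread count or the target host changes between runs.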
5. Experimental Results
5.1. Couchbase
- With 100,000 records, increasing the number of threads from one to three cut the runtime roughly in half;
- With 1,000,000 records, increasing the number of threads from one to three reduced the runtime by more than half. However, moving from three to six threads did not halve the runtime again; it decreased by only 40%. This occurs because the computer used supports only six threads and was running other processes alongside the experiment. Thus, Couchbase cannot take full advantage of all available threads;
- With 10,000,000 records, the runtime decreased by almost 60% each time the number of threads was increased;
- When increasing from 100,000 to 1,000,000 records, we expected the runtime increase to be 10 times; however, there was an increase of 18.3 times. This increment can be explained by the limitations of computer hardware with a single physical processor and also by the use of computer resources by other processes running at the same time;
- When moving from 1,000,000 records to 10,000,000 records, we expected the runtime to increase 10 times since the workload also increased 10 times; however, there was a 23 times increase. This 23-fold increase is due to the fact that the workload was higher; therefore, more disk accesses had to be made, which made runtime much slower;
- When increasing from 100,000 records to 10,000,000 records, the runtime increases 421.3 times, when the expected increase would be 100 times since the workload also increased 100 times. This growth is explained by the limitations of computer hardware.
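The growth factors quoted in the last three points can be recomputed from the Couchbase totals per record count (the "Total" row of the Couchbase thread table). The following is a minimal sketch of that arithmetic; the names `COUCHBASE_TOTALS` and `growth_factor` are ours, and "expected" simply encodes the assumption of linear scaling used in the text.

```python
# Total runtimes in seconds, summed over 1, 3 and 6 threads
# ("Total" row of the Couchbase thread table).
COUCHBASE_TOTALS = {
    100_000: 1839.00,
    1_000_000: 33_743.33,
    10_000_000: 774_815.06,
}


def growth_factor(totals, small, large):
    """Return (observed, expected) runtime growth from `small` to `large` records."""
    observed = totals[large] / totals[small]
    expected = large / small  # linear scaling: runtime grows with the record count
    return observed, expected


if __name__ == "__main__":
    steps = [
        (100_000, 1_000_000),
        (1_000_000, 10_000_000),
        (100_000, 10_000_000),
    ]
    for small, large in steps:
        observed, expected = growth_factor(COUCHBASE_TOTALS, small, large)
        print(f"{small:>10,} -> {large:>10,}: "
              f"observed {observed:.1f}x, expected {expected:.0f}x")
```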
5.2. CouchDB
- With 100,000 records, increasing the number of threads from one to three more than halved the runtime. The same did not happen when switching from three to six threads, where the runtime was reduced by only 30%. This may be explained by how the database engine manages threads;
- With 1,000,000 records, the runtime almost halved as the number of threads increased;
- Considering 10,000,000 records, the runtime from one to three threads decreases by almost 60%, and from three to six threads, the runtime decreases by 50%. This shows that CouchDB presents good scale-up when using more threads;
- When increasing from 100,000 records to 1,000,000 records, the expected runtime was 10 times longer; however, there was an increase of 17.9 times;
- When increasing from 1,000,000 records to 10,000,000 records, the expected runtime increase was 10 times since the workload also increased 10 times, but it was found to increase 18.1 times;
- From 100,000 records to 10,000,000 records, the runtime increases by 323.7 times, when the expected increase was only 100 times since the workload increase was also 100 times. As explained before, this increase is due to the limitations of computer hardware and the high workload, implying more disk access.
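The thread scale-up discussed above can be checked against the CouchDB thread table. The sketch below computes the percentage by which the runtime drops at each step (one to three threads, three to six threads); the function names are ours. The text quotes approximate figures, so values recomputed this way need not match them exactly.

```python
# Total runtime in seconds per thread count (CouchDB thread table).
COUCHDB_RUNTIME = {
    100_000: {1: 742.38, 3: 314.27, 6: 243.09},
    1_000_000: {1: 12_233.93, 3: 7_097.16, 6: 3_937.96},
    10_000_000: {1: 222_046.97, 3: 132_570.91, 6: 66_041.70},
}


def reduction_percent(before, after):
    """Percentage by which the runtime drops from `before` to `after` seconds."""
    return (before - after) / before * 100


def thread_steps(by_threads):
    """Yield (from_threads, to_threads, reduction %) for consecutive thread counts."""
    counts = sorted(by_threads)
    for low, high in zip(counts, counts[1:]):
        yield low, high, reduction_percent(by_threads[low], by_threads[high])


if __name__ == "__main__":
    for records, by_threads in COUCHDB_RUNTIME.items():
        for low, high, pct in thread_steps(by_threads):
            print(f"{records:>10,} records, {low} -> {high} threads: "
                  f"runtime reduced by {pct:.1f}%")
```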
5.3. MongoDB
- With 100,000 records, the runtime is more than halved when increasing from one to three threads. The same does not happen when increasing from three to six threads, where the runtime drops by only 20%, which represents poor scale-up. This 20% can also be explained by the fact that the computer used supports only six threads and also uses them for other processes running simultaneously;
- With 1,000,000 records, the runtime was reduced by 50% when switching from one to three threads. From three to six threads, the improvement was even smaller, at only 10%. This may be because the larger amount of processing leads to more frequent disk use, which slows down execution;
- With 10,000,000 records, scale-up worsened again, because there is more data to process and more computer resources are consumed. When increasing from one to three threads, the runtime decreases by only 30%, and from three to six threads it decreases by only 10%. These results show that MongoDB scales up poorly when using more threads;
- When moving from 100,000 records to 1,000,000 records, the runtime was expected to increase by 10 times; however, there was a 37.4 times increase. This is much higher than for Couchbase and CouchDB, whose runtimes increased 18.3 times and 17.9 times, respectively. This growth can be explained by the limitations of computer hardware and the use of computer resources by other processes running at the same time;
- When moving from 1,000,000 records to 10,000,000 records, the expected increase was 10 times; however, the runtime increased 23.8 times. This 23.8-fold increase is due to the higher workload, which requires more disk accesses and therefore slows down the runtime;
- When moving from 100,000 records to 10,000,000 records, the runtime increases by 890 times, a much higher value than for Couchbase and CouchDB, whose runtimes increased 421.3 times and 323.7 times, respectively. This growth is also explained by the reasons given in the previous points.
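The three growth factors for MongoDB follow directly from the "Total" row of the MongoDB thread table. As a worked example, writing T(n) for the total runtime in seconds with n records (this notation is ours, not the article's):

```latex
\begin{align*}
\frac{T(10^{6})}{T(10^{5})} &= \frac{20{,}600.62}{550.56} \approx 37.4, \\
\frac{T(10^{7})}{T(10^{6})} &= \frac{490{,}011.99}{20{,}600.62} \approx 23.8, \\
\frac{T(10^{7})}{T(10^{5})} &= \frac{490{,}011.99}{550.56} \approx 890.
\end{align*}
```

Under linear scaling, the first two ratios would be 10 and the third 100.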
5.4. Discussion of the Results
- With 100,000 and 1,000,000 records, MongoDB was the database that had the shortest runtime. However, when using 10,000,000 records, CouchDB had the shortest runtime, followed by MongoDB and Couchbase;
- MongoDB had the worst runtime in workload E. Couchbase, on the other hand, had the worst runtime in workloads A, E, and F. The characteristic operations of these workloads are update in workload A, scan in workload E, and read–modify–write in workload F;
- It was verified that the three document databases present good scale-up when moving from one to three threads. When moving from three to six threads, MongoDB presents the worst scale-up, especially when increasing the workload size;
- In some cases, CouchDB performed better than MongoDB, particularly when switching from three to six threads and also when using a large number of records, where the difference between runtime was greater.
- MongoDB has a 3.28 times faster runtime than CouchDB;
- MongoDB has a 7.00 times faster runtime than Couchbase;
- CouchDB has a 2.13 times faster runtime than Couchbase.
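These three ratios can be reproduced from the last totals table in this article. For Couchbase and CouchDB, the totals in that table equal the overall totals minus the workload E totals, so the table appears to exclude workload E; treat that reading as an assumption. The sketch below, with names of our own choosing, divides the slower total by the faster one.

```python
# Total runtimes in seconds from the last totals table of this article
# (assumed to exclude workload E, see the note above).
TOTALS = {
    "Couchbase": 662_487.95,
    "CouchDB": 310_390.35,
    "MongoDB": 94_575.80,
}


def times_faster(totals, fast, slow):
    """How many times shorter the total runtime of `fast` is than that of `slow`."""
    return totals[slow] / totals[fast]


if __name__ == "__main__":
    pairs = [("MongoDB", "CouchDB"), ("MongoDB", "Couchbase"), ("CouchDB", "Couchbase")]
    for fast, slow in pairs:
        print(f"{fast} vs {slow}: {times_faster(TOTALS, fast, slow):.2f}x faster")
```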
6. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Workloads | Operations | Distribution | Records | Threads | Data Size |
---|---|---|---|---|---|
A (Update Heavy) | Read: 50%, Update: 50% | Zipfian | 100,000; 1,000,000; 10,000,000 | 1, 3, 6 | Field size = 500 bytes; field number = 20; 500 bytes × 20 = 10 KB |
B (Read Mostly) | Read: 95%, Update: 5% | Zipfian | 100,000; 1,000,000; 10,000,000 | 1, 3, 6 | Field size = 500 bytes; field number = 20; 500 bytes × 20 = 10 KB |
C (Read Only) | Read: 100% | Zipfian | 100,000; 1,000,000; 10,000,000 | 1, 3, 6 | Field size = 500 bytes; field number = 20; 500 bytes × 20 = 10 KB |
D (Read Latest) | Read: 95%, Insert: 5% | Latest | 100,000; 1,000,000; 10,000,000 | 1, 3, 6 | Field size = 500 bytes; field number = 20; 500 bytes × 20 = 10 KB |
E (Short Ranges) | Scan: 95%, Insert: 5% | Zipfian, Uniform | 100,000; 1,000,000; 10,000,000 | 1, 3, 6 | Field size = 500 bytes; field number = 20; 500 bytes × 20 = 10 KB |
F (Read–Modify–Write) | Read: 50%, Read–Modify–Write: 50% | Zipfian | 100,000; 1,000,000; 10,000,000 | 1, 3, 6 | Field size = 500 bytes; field number = 20; 500 bytes × 20 = 10 KB |
G (Update Mostly) | Update: 95%, Read: 5% | Zipfian | 100,000; 1,000,000; 10,000,000 | 1, 3, 6 | Field size = 500 bytes; field number = 20; 500 bytes × 20 = 10 KB |
H (Update Only) | Update: 100% | Zipfian | 100,000; 1,000,000; 10,000,000 | 1, 3, 6 | Field size = 500 bytes; field number = 20; 500 bytes × 20 = 10 KB |
Couchbase | 100,000 Records | 1,000,000 Records | 10,000,000 Records |
---|---|---|---|
Workload A | 10.28 s | 1037.79 s | 85,910.67 s |
Workload B | 9.27 s | 687.31 s | 33,770.34 s |
Workload C | 6.72 s | 80.34 s | 3423.30 s |
Workload D | 12.01 s | 201.87 s | 29,839.63 s |
Workload E | 373.08 s | 3288.45 s | 34,028.25 s |
Workload F | 33.16 s | 2822.24 s | 47,416.49 s |
Workload G | 20.63 s | 523.50 s | 6905.86 s |
Workload H | 22.88 s | 83.78 s | 1078.91 s |
Total | 488.03 s (8.13 min) | 8725.28 s (145.42 min) | 242,373.45 s (4039.56 min) |
Couchbase | 100,000 Records | 1,000,000 Records | 10,000,000 Records |
---|---|---|---|
1 Thread | 1094.68 s | 18,951.37 s | 384,101.63 s |
3 Threads | 488.03 s | 8725.28 s | 242,373.45 s |
6 Threads | 256.29 s | 6066.70 s | 148,339.99 s |
Total | 1839.00 s (30.65 min) | 33,743.33 s (562.39 min) | 774,815.06 s (12,913.58 min) |
CouchDB | 100,000 Records | 1,000,000 Records | 10,000,000 Records |
---|---|---|---|
Workload A | 10.58 s | 1278.84 s | 18,513.14 s |
Workload B | 14.83 s | 790.41 s | 28,537.45 s |
Workload C | 17.20 s | 145.05 s | 4150.06 s |
Workload D | 19.00 s | 194.48 s | 9021.59 s |
Workload E | 181.30 s | 2806.93 s | 45,257.58 s |
Workload F | 22.58 s | 734.97 s | 8988.81 s |
Workload G | 26.41 s | 789.59 s | 12,043.40 s |
Workload H | 22.37 s | 356.89 s | 6058.88 s |
Total | 314.27 s (5.24 min) | 7097.16 s (118.29 min) | 132,570.91 s (2209.52 min) |
CouchDB | 100,000 Records | 1,000,000 Records | 10,000,000 Records |
---|---|---|---|
1 Thread | 742.38 s | 12,233.93 s | 222,046.97 s |
3 Threads | 314.27 s | 7097.16 s | 132,570.91 s |
6 Threads | 243.09 s | 3937.96 s | 66,041.70 s |
Total | 1299.74 s (21.66 min) | 23,269.05 s (387.82 min) | 420,659.55 s (7010.99 min) |
MongoDB | 100,000 Records | 1,000,000 Records | 10,000,000 Records |
---|---|---|---|
Workload A | 10.65 s | 297.41 s | 4950.83 s |
Workload B | 7.15 s | 122.37 s | 2564.01 s |
Workload C | 6.61 s | 111.58 s | 2305.59 s |
Workload D | 7.31 s | 81.03 s | 1427.47 s |
Workload E | 78.82 s | 4491.09 s | 126,310.74 s |
Workload F | 13.52 s | 296.36 s | 4756.72 s |
Workload G | 12.22 s | 541.20 s | 6765.96 s |
Workload H | 12.22 s | 463.96 s | 7034.36 s |
Total | 137.85 s (2.30 min) | 6107.59 s (101.79 min) | 151,164.85 s (2519.41 min) |
MongoDB | 100,000 Records | 1,000,000 Records | 10,000,000 Records |
---|---|---|---|
1 Thread | 297.30 s | 8883.50 s | 197,138.39 s |
3 Threads | 137.85 s | 6107.59 s | 151,164.85 s |
6 Threads | 115.41 s | 5609.53 s | 141,708.75 s |
Total | 550.56 s (9.18 min) | 20,600.62 s (343.34 min) | 490,011.99 s (8166.87 min) |
All workloads | Couchbase | CouchDB | MongoDB |
---|---|---|---|
Total | 810,397.40 s (13,506.62 min) | 445,228.30 s (7420.47 min) | 510,909.90 s (8515.16 min) |
Workload E | Couchbase | CouchDB | MongoDB |
---|---|---|---|
Total | 147,909.45 s (2465.16 min) | 134,837.95 s (2247.30 min) | 417,344.15 s (6955.74 min) |
All workloads except E | Couchbase | CouchDB | MongoDB |
---|---|---|---|
Total | 662,487.95 s (11,041.47 min) | 310,390.35 s (5173.17 min) | 94,575.80 s (1576.26 min) |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Carvalho, I.; Sá, F.; Bernardino, J. Performance Evaluation of NoSQL Document Databases: Couchbase, CouchDB, and MongoDB. Algorithms 2023, 16, 78. https://doi.org/10.3390/a16020078