Introduction
NoSQL (short for “Not Only SQL”) is an alternative to traditional relational databases, focused on capturing and processing large amounts of data.
There are several types of NoSQL databases, each with a unique approach to data modeling and different use cases.
In this tutorial, we will provide a brief overview of multiple NoSQL database types and list some of the popular examples for each one.
NoSQL Database Types
The four most popular types of NoSQL databases are key-value databases, document-based databases, graph-based databases, and wide column-based databases.
Note: For a more in-depth overview of each database type, check out our guide to NoSQL databases.
Key-Value Databases
Key-value databases organize data in pairs of keys and values, where each key is tied to a specific object, representing a data field. Providing a key allows you to view the data stored in the object it is paired with.
This is the simplest and most scalable type of NoSQL database, offering flexibility and improved performance.
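The key-to-value lookup described above can be illustrated with a minimal Python sketch. The `KeyValueStore` class and the key names are hypothetical and exist only for this example; a real key-value database adds persistence, networking, and distribution on top of the same idea.

```python
from typing import Any


class KeyValueStore:
    """Toy in-memory key-value store: a value is reachable only through its key."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value: Any) -> None:
        # Writing to an existing key overwrites the previous value.
        self._data[key] = value

    def get(self, key: str, default: Any = None) -> Any:
        # Lookup is by exact key; there is no query over the stored values.
        return self._data.get(key, default)

    def delete(self, key: str) -> bool:
        # Returns True if the key existed and was removed.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("user:42:name", "Ada")
store.put("user:42:cart", ["book", "pen"])
print(store.get("user:42:name"))
```

Because every operation addresses a single key, data like this is straightforward to spread across many servers, which is one reason this model tends to scale well.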
Document-Based NoSQL Databases
Document-based databases also use key-value pairs, which they store in documents. These documents are further grouped into collections based on their content and use.
These databases most commonly store data in JSON, XML, BSON, or YAML format, usually without implementing a schema. This approach makes them suitable for cases that require a flexible structure and the ability to quickly add and retrieve data.
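As a rough illustration of schema-free documents grouped into a collection, the sketch below models a collection as a Python list of dictionaries. The field names and the `find` helper are assumptions made up for this example and do not correspond to any particular database's API.

```python
import json

# A "collection" here is just a list of documents (Python dicts).
# The two documents deliberately do not share the same fields.
articles = [
    {"_id": 1, "title": "Intro to NoSQL", "tags": ["nosql", "databases"]},
    {"_id": 2, "title": "Graph basics", "author": {"name": "Ada"}, "views": 120},
]


def find(collection, **criteria):
    """Return documents whose top-level fields equal all given criteria."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# Adding a document with a brand-new field needs no schema change.
articles.append({"_id": 3, "title": "Caching", "views": 120, "draft": True})

matches = find(articles, views=120)
print(json.dumps(matches, indent=2))
```

Documents that lack a field are simply skipped by a query on that field, which is the flexibility the schema-free approach trades for weaker structural guarantees.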
Graph-Based Databases
Graph-based databases represent data as a collection of nodes (data elements) connected by edges. In this data structure, nodes contain pieces of data, while edges define relationships between them.
This database type is commonly used to represent relationships between different data entries, such as friend connections on social networks. Users can perform complex queries and directly pull multiple pieces of data at the same time.
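The friend-connection example can be sketched in a few lines of Python. The user names and the `friends_of_friends` helper are hypothetical; a graph database would express the same traversal in its own query language.

```python
from collections import defaultdict

# Undirected "friend" edges between hypothetical users (the nodes).
edges = [("ana", "ben"), ("ben", "cleo"), ("cleo", "dev"), ("ana", "eli")]

# Adjacency sets: for each node, the nodes it shares an edge with.
friends = defaultdict(set)
for a, b in edges:
    friends[a].add(b)
    friends[b].add(a)


def friends_of_friends(person):
    """People two hops away who are not already direct friends."""
    direct = friends[person]
    two_hops = set()
    for friend in direct:
        two_hops |= friends[friend]
    return two_hops - direct - {person}


print(sorted(friends_of_friends("ana")))  # prints ['cleo']
```

Following edges from node to node like this is the core operation that graph databases optimize, whereas a relational database would typically need repeated joins to answer the same question.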
Wide Column-Based Databases
Wide column-based databases store data in separate columns, similar to how data is stored in tables with relational databases. Unlike relational databases, wide-column databases do not use predefined keys or column names.
This allows for variations in column names, even within the same table. It is also easy to add large amounts of data as new columns, or group existing ones into column families.
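A minimal sketch of this layout, assuming a nested mapping of row key, column family, and column name. The names `put`, `get_family`, and the sample rows are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# table[row_key][column_family][column_name] = value
table = defaultdict(lambda: defaultdict(dict))


def put(row_key, family, column, value):
    table[row_key][family][column] = value


def get_family(row_key, family):
    """Return all columns stored for one row within one column family."""
    # .get() avoids creating empty entries when reading unknown rows.
    return dict(table.get(row_key, {}).get(family, {}))


# Two rows in the same table with different column names.
put("user1", "profile", "name", "Ada")
put("user1", "profile", "email", "ada@example.com")
put("user2", "profile", "name", "Ben")
put("user2", "activity", "last_login", "2021-03-01")

print(get_family("user1", "profile"))
print(get_family("user2", "profile"))
```

Here `user2` has no `email` column and `user1` has no `activity` family, yet both rows live in the same table without any placeholder values.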
Object Databases
Object databases store data elements as objects to be used in object-oriented programming. They are designed to work with programming languages like Python, Ruby, Delphi, Java, etc.
Grid and Cloud Databases
Grid and cloud databases use a data grid – a network of systems working with data accessible through the cloud.
This type of database works with both SQL and NoSQL data models and is typically offered as a database-as-a-service.
Multi-Model Databases
Finally, multi-model databases combine the features of two or more different database types. This allows them to provide a solution for unique use cases where other database types are not suitable.
List of NoSQL Databases
Below is a list of NoSQL databases for 2021, arranged into sections by database type:
List of Key-Value Databases
Redis
Redis works as a data structure server that stores data in-memory. This means that Redis reads and modifies data from the main memory, but it also has built-in persistence. This feature allows saving data to disk so it can be reconstructed if the system restarts.
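The idea of serving data from memory while still being able to rebuild it after a restart can be sketched conceptually as follows. This is not Redis's implementation or its client API: the `SnapshotStore` class, the JSON file format, and the file name are assumptions made for illustration only.

```python
import json
import os
import tempfile


class SnapshotStore:
    """Toy store: reads and writes hit memory; save() writes a snapshot to disk."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        if os.path.exists(path):
            # Rebuild the in-memory state from the last snapshot.
            with open(path, "r", encoding="utf-8") as f:
                self.data = json.load(f)

    def set(self, key, value):
        self.data[key] = value  # memory only until save() is called

    def get(self, key):
        return self.data.get(key)

    def save(self):
        # Write to a temporary file first, then rename it, so a crash
        # mid-write does not leave a truncated snapshot behind.
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
        os.replace(tmp_path, self.path)


snapshot_file = os.path.join(tempfile.mkdtemp(), "snapshot.json")

store = SnapshotStore(snapshot_file)
store.set("counter", 10)
store.save()
store.set("unsaved", 1)  # written after the last snapshot

restarted = SnapshotStore(snapshot_file)  # simulates a restart
print(restarted.get("counter"), restarted.get("unsaved"))  # prints: 10 None
```

The sketch also shows the trade-off of snapshot-style persistence: anything written after the last snapshot is lost on restart.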
Advantages of using Redis:
- Working in-memory allows for high performance and flexibility.
- Support for many different data types and programming languages.
- Easy to scale and supports automatic partitioning.
Note: If you are interested in using Redis, check out our guides on how to install Redis on Ubuntu and Mac.
Aerospike
Like Redis, Aerospike is an open-source, in-memory NoSQL database. Aerospike is optimized for online retail use thanks to its high performance and the ability to combine transaction data with analytics.
Advantages of using Aerospike:
- Reliable performance with very low latency.
- A good ratio of price and performance makes it suitable for smaller businesses.
Riak
Riak stores key-value pairs in data objects it calls “buckets.” It supports a wide range of data formats and emphasizes data stability and predictable performance.
Advantages of using Riak:
- Key-value pairs are saved in clusters of three nodes, with the option of replicating the data to additional nodes for backup.
- Data can be stored in-memory, on disks, or both.
- Multi-datacenter replication allows you to back up your data to data centers in different locations.
Project Voldemort
LinkedIn uses Project Voldemort as their solution for high-scalability key-value storage. It works as a distributed, fault-tolerant, and persistent hash table.
Advantages of using Project Voldemort:
- Data is automatically replicated and partitioned over multiple servers.
- Storage and serialization plugins are available.
- Good single-node performance.
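Automatic partitioning and replication across servers is often explained with a hash ring. The sketch below is a generic illustration of that technique, not Project Voldemort's actual code; the `HashRing` class, the server names, and the replica count are assumptions.

```python
import hashlib
from bisect import bisect_right


def _hash(value: str) -> int:
    # Stable hash (Python's built-in hash() is randomized per process).
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, servers, replicas=2):
        if not servers:
            raise ValueError("at least one server is required")
        self.replicas = replicas
        # Place each server at a point on the ring, sorted by hash.
        self.ring = sorted((_hash(s), s) for s in servers)
        self.points = [h for h, _ in self.ring]

    def servers_for(self, key):
        """Return the servers that should hold copies of `key`."""
        start = bisect_right(self.points, _hash(key)) % len(self.ring)
        count = min(self.replicas, len(self.ring))
        # Walk clockwise from the key's position, wrapping around the ring.
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]


ring = HashRing(["server-a", "server-b", "server-c"], replicas=2)
for key in ["user:1", "user:2", "user:3"]:
    print(key, "->", ring.servers_for(key))
```

Every client that knows the server list computes the same placement for a key, so no central lookup table is needed to find where a value and its copies live.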
List of Document-Based NoSQL Databases
MongoDB
MongoDB is an open-source, agile database that a wide range of companies use across different industries. It stores documents as JSON objects that can quickly change schemas according to your needs.
Advantages of using MongoDB:
- Easy to scale from a single server to complex systems.
- Consistently provides high performance.
- High reliability thanks to replication and load balancing.
Note: Also, have a look at our guides to installing MongoDB on Ubuntu and CentOS.
Couchbase Server
Couchbase Server (known initially as Membase) is an open-source, distributed database solution. The primary design intention is to work with interactive applications to store large amounts of user data as JSON objects.
Advantages of using Couchbase Server:
- Cluster management allows for quick scaling.
- Customizable replication, even between data centers.
CouchDB
CouchDB is an open-source database written in Erlang. It offers features such as multi-version concurrency control (using ACID semantics), multi-master replication, and map/reduce.
Advantages of using CouchDB:
- Able to replicate data to devices like smartphones for offline access.
- Guarantees eventual consistency, providing availability and partition tolerance.
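Multi-version concurrency control can be sketched as a store that rejects updates based on an outdated revision. This is a simplified illustration, not CouchDB's API: the `RevisionedStore` class is hypothetical, and it uses plain integer revisions where CouchDB uses its own `_rev` strings.

```python
class ConflictError(Exception):
    """Raised when an update is based on an outdated revision."""


class RevisionedStore:
    def __init__(self):
        self._docs = {}  # doc_id -> (revision, body)

    def get(self, doc_id):
        return self._docs[doc_id]

    def put(self, doc_id, body, rev=None):
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        # Writers must name the revision they read; a mismatch means
        # someone else updated the document in the meantime.
        if rev != current_rev:
            raise ConflictError(f"expected rev {current_rev}, got {rev}")
        new_rev = (current_rev or 0) + 1
        self._docs[doc_id] = (new_rev, dict(body))
        return new_rev


db = RevisionedStore()
rev1 = db.put("note", {"text": "draft"})
rev2 = db.put("note", {"text": "final"}, rev=rev1)

try:
    db.put("note", {"text": "stale edit"}, rev=rev1)  # based on an old revision
except ConflictError as err:
    print("conflict:", err)  # prints: conflict: expected rev 2, got 1
```

Readers are never blocked by writers in this scheme; a writer that loses the race simply has to re-read the document and retry.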
Elasticsearch
Elasticsearch is a distributed database that works as a search engine capable of full-text search with fuzzy matching. It falls under dual licensing: some parts are covered by the Server Side Public License, while others fall into the proprietary Elastic License category.
Advantages of using Elasticsearch:
- You can expand the features by combining Elasticsearch with other solutions, such as Logstash (data collection and log-parsing), Kibana (analytics), and Beats (data shipping).
- Scalable, real-time search with faceting and percolating.
Note: Check out our complete tutorial on ELK Stack to learn more about Elasticsearch.
List of Graph-Based Databases
Neo4J
Neo4J is an open-source graph-based database built in Java, with additional features available as a part of their Graph Data Platform. It uses the Cypher query language to offer access to a wider range of queries than other database types while maintaining high performance.
Advantages of using Neo4J:
- Useful for solving problems that require repeated network probing.
- Facilitates the analysis of data objects and their relationships.
OrientDB
OrientDB is an open-source, multi-model database system with a strong emphasis on the graph database model. It can be deployed on any operating system and boasts a wide range of features, which can be further expanded by upgrading to the Enterprise Edition.
Advantages of using OrientDB:
- A strong security system based on users and roles.
- Easy to get started with a free Udemy course and extensive user support through Stack Overflow.
- Easy to import other relational databases into OrientDB.
RedisGraph
RedisGraph is a graph database module for Redis. It is based on the Property Graph model, accepts queries written in the Cypher query language, and translates them into linear algebra expressions.
Advantages of using RedisGraph:
- Easy to combine with existing Redis databases.
- Allows adding node labels and relationship types.
InfiniteGraph
InfiniteGraph is a distributed database focusing on performing complex object queries. It is used to develop web or mobile applications that solve graph problems working with complex big data sets.
Advantages of using InfiniteGraph:
- Able to handle complex or parallel queries that require high performance.
- Backup with flexible consistency (from ACID to relaxed).
List of Wide Column-Based Databases
Cassandra
Apache Cassandra is a free, open-source database solution built to handle large data loads with minimal impact on performance. Twitter, Netflix, and Reddit all use Cassandra due to its high speed and availability.
Advantages of using Cassandra:
- Asynchronous, masterless replication ensures protection from data loss without causing latency.
- Scales easily across multiple data centers with no downtime.
Note: For more information, refer to our guides on installing Cassandra on Ubuntu and Windows.
Cosmos DB
Microsoft’s Azure Cosmos DB is a proprietary database solution. The product was designed to be globally distributed to help manage large-scale, horizontally scalable databases.
Advantages of using Cosmos DB:
- Combines with other Microsoft Azure services for expanded features.
- Automatic partitioning over multiple data centers.
HBase
HBase is designed to work with extremely large databases, with billions of rows and millions of columns. It runs on top of the Hadoop Distributed File System (HDFS) and allows Hadoop to work like Google’s Bigtable.
Advantages of using HBase:
- Allows for large throughput on a scale of petabytes of data.
- Enables random, real-time read/write access to your database.
Accumulo
Apache Accumulo is another solution built on Hadoop and based on Google’s Bigtable. It improves on the Bigtable design by adding features like cell-based access control and server-side programming.
Advantages of using Accumulo:
- You can add cell-level security labels and store data of different security levels in the same table.
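Cell-level security labels can be illustrated with a simplified filter. This sketch is not Accumulo's API: it assumes each cell lists labels that a reader must all hold, whereas Accumulo supports richer boolean visibility expressions. The `Cell` class, the `scan` function, and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    row: str
    column: str
    value: str
    required_labels: frozenset  # reader must hold every label listed here


# Cells with different security levels stored in the same table.
cells = [
    Cell("patient1", "name", "J. Doe", frozenset()),
    Cell("patient1", "diagnosis", "flu", frozenset({"medical"})),
    Cell("patient1", "billing", "$120", frozenset({"finance", "audit"})),
]


def scan(table, authorizations):
    """Return only the cells whose labels are all covered by the reader."""
    auths = frozenset(authorizations)
    return [cell for cell in table if cell.required_labels <= auths]


print([c.column for c in scan(cells, {"medical"})])           # ['name', 'diagnosis']
print([c.column for c in scan(cells, {"finance", "audit"})])  # ['name', 'billing']
```

Because the check happens per cell during the scan, two readers querying the same row can receive different subsets of its columns.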
List of Object Databases
ObjectDB
ObjectDB is an object database solution for Java development with built-in support for Java APIs. It works in client-server or embedded mode.
Advantages of working with ObjectDB:
- Supports various platforms and operating systems.
- Uses both JDO and JPA query languages.
Ninja Database Pro
Ninja Database Pro features a high automation level that makes it easy for beginners to use. It provides a robust and fast way to manage data objects in a database.
Advantages of using Ninja Database Pro:
- Can work with complex data objects like doubly linked lists, multi-dimensional arrays, and dictionaries.
- ACID compliant, with built-in AES encryption.
NeoDB
NeoDB structures data as a network of objects resembling a large tree. This network is called a node space, and it focuses on nodes (objects), their relationships, and their properties.
Advantages of using NeoDB:
- Good for handling semi-structured data, with few mandatory but many optional attributes.
Objectivity/DB
Objectivity/DB is a distributed database that allows you to work with data objects in C++, C#, Java, or Python without converting them into tables.
Advantages of using Objectivity/DB:
- Uses any supported programming language on the operating system of your choice.
- Its architecture makes it a good choice for grid computing environments.
List of Cloud and Grid Databases
Oracle Coherence
Oracle Coherence is a distributed cache and in-memory data grid based on Java. It manages data in clustered applications, which eliminates the need to query the database directly each time you need to manage data.
Advantages of using Oracle Coherence:
- Provides high availability, scalability, and low latency.
Infinispan
Infinispan is an open-source data grid solution written in Java. Infinispan can be embedded into Java applications as a library, and non-Java applications can access it over TCP/IP.
Advantages of using Infinispan:
- Highly scalable while maintaining availability.
- Its pluggable architecture allows it to persist data to the filesystem or other database managers.
Hazelcast
Hazelcast is an open-source data grid. It is based on Java and can run on-premises, virtually, in the cloud, or in Docker containers.
Advantages of using Hazelcast:
- Allows for horizontal scaling of storage and processing power.
List of Multi-Model Databases
ArangoDB
ArangoDB is a free and open-source database manager that supports key-value, document, and graph database models.
Advantages of using ArangoDB:
- The AQL query language allows you to target different data models with a single query.
- Works as a distributed cluster with the single-click cluster deployment option.
Conclusion
After going through this article, you should have a solid overview of the most popular NoSQL databases and their main features. The lists help you find a database solution that best fits your needs.
Please note that this is not a comprehensive list, and there are many more solutions available on the market.
Original article by bd101bd101. If you republish it, please credit the source: https://blog.ytso.com/tech/database/226202.html