Databases

The data ingestion tools discussed in the previous section help stream data into the system. For the next few steps, we need a storage layer where these incoming data streams can be stored and accessed. Hence the need for a database. Databases fall into two broad types: relational and non-relational.

Relational Database

Relational databases consist of a collection of tables whose data is described by fixed attributes (columns) and data types. Using Structured Query Language (SQL) statements, we can create, read, update, or delete data in these tables. Each table typically has a primary key that uniquely identifies each row. Popular relational database management systems include Oracle, MySQL, Microsoft SQL Server, and PostgreSQL.
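
To make the create, read, update, and delete operations concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. SQLite is not one of the systems listed above; it is used here only because it needs no server. The table name sensor_readings and its columns are hypothetical, chosen purely for illustration.

```python
import sqlite3

# In-memory SQLite database: an assumption made so the example needs no server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Fixed attributes and data types; "id" is the primary key identifying each row.
# (Table and column names are hypothetical.)
cur.execute(
    "CREATE TABLE sensor_readings ("
    "id INTEGER PRIMARY KEY, sensor TEXT NOT NULL, value REAL NOT NULL)"
)

# Create: insert two rows.
insert_sql = "INSERT INTO sensor_readings (id, sensor, value) VALUES (?, ?, ?)"
cur.execute(insert_sql, (1, "temp-01", 21.5))
cur.execute(insert_sql, (2, "temp-02", 19.0))

# Read: look up a single row by its primary key.
cur.execute("SELECT sensor, value FROM sensor_readings WHERE id = ?", (1,))
row_before = cur.fetchone()

# Update: change the value stored in row 1.
cur.execute("UPDATE sensor_readings SET value = ? WHERE id = ?", (22.0, 1))

# Delete: remove row 2.
cur.execute("DELETE FROM sensor_readings WHERE id = ?", (2,))

# Read everything that is left.
cur.execute("SELECT id, sensor, value FROM sensor_readings ORDER BY id")
remaining = cur.fetchall()

conn.close()
```

The same SQL statements would look broadly similar on the systems named above, although data type names and connection setup differ between products.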

Non-Relational Database

Non-relational databases do not require a fixed schema, which allows unstructured or semi-structured data to be stored and manipulated. Key-value stores such as Redis and Amazon DynamoDB simply store data as key-value pairs. Document stores such as MongoDB and Couchbase are schema-free and store data as JSON documents. Graph databases such as Neo4j and DataStax Enterprise Graph represent data as a network of related nodes, or objects. Each node holds free-form data and is connected to other nodes through relationships. Elasticsearch, Splunk, and Solr are search engines that, like document stores, store data as JSON documents. They go one step further by emphasizing easy access to unstructured data through text-based searches with query strings of varying complexity.
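
The four data models described above can be sketched with plain Python structures. This is only a conceptual illustration and does not use the client API of any product named here; every key, field, and helper name (kv_store, neighbors, search, and so on) is hypothetical.

```python
import json

# Key-value model: a value is stored and fetched by its key alone.
kv_store = {}
kv_store["session:42"] = "alice"

# Document model: each record is a JSON document, and documents in the
# same collection do not have to share the same fields (no fixed schema).
documents = [
    json.dumps({"_id": 1, "name": "alice", "tags": ["admin"]}),
    json.dumps({"_id": 2, "name": "bob", "email": "bob@example.com"}),
]
parsed = [json.loads(doc) for doc in documents]

# Graph model: nodes carry free-form data, edges carry named relationships.
nodes = {
    "alice": {"kind": "person"},
    "bob": {"kind": "person"},
    "acme": {"kind": "company"},
}
edges = [
    ("alice", "KNOWS", "bob"),
    ("alice", "WORKS_AT", "acme"),
]


def neighbors(node, relation):
    """Return the nodes reachable from `node` through `relation`."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]


def search(term):
    """Naive text search: ids of documents whose JSON text contains `term`.

    Real search engines build an index instead of scanning every document;
    the linear scan is an assumption made to keep the sketch short.
    """
    needle = term.lower()
    return [doc["_id"] for doc in parsed if needle in json.dumps(doc).lower()]
```

For example, neighbors("alice", "KNOWS") follows a relationship between nodes, while search("admin") finds documents by their text content rather than by a key.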

