So, what is a database?

If you are considering a career in Analytics and Business Intelligence, or simply want to take on more data-intensive problem solving at work, it is important to understand the basics of databases and relational models. Data is so ubiquitous in professional life that anyone, not only data analysts, may find themselves using it to make decisions, irrespective of industry or job function.

Here is a short, jargon-free, beginner-friendly take on what a database is, along with some essentials you need to be familiar with to get going with all things data.

What is a Database (DB)?

Put simply, a database is a place where digital data is stored in a structured format. The purpose of creating a database is to make information easy to access. The information stored in a database can range from customer information and internal employee data to market or product data; basically, any data.

What is data? Anything that anyone ever creates, stores, reads, uses, modifies, shares, or deletes.

What is a Database Management System (DBMS)?

A DBMS is a software application used to create, update and access the data in databases. Users of the application are given different access privileges, which determine the operations they can perform on the database, such as manipulating the data or changing the structure of the data itself.

What is a Relational Database Management System (RDBMS)?

There are different types of database management systems, such as hierarchical, network-based and relational. Here, we will focus only on relational DBMS.

Without going into too much history, an RDBMS, or Relational Database Management System, is, as the name suggests, based on the relational data model. In a relational data model, the structure of the data is represented in a tabular format of rows and columns. For example, any data you access in a spreadsheet, such as accounts or expenses, is tabular data.

Everything in a relational database is stored in the form of relations, or tables, and information spread across different tables is combined through joins. Most commercial RDBMS use SQL (Structured Query Language) to access the database. JOINS in SQL are explained here. We will cover SQL basics in a separate article. RDBMS operations follow the CRUD style: Create, Read, Update and Delete functions to access and modify data.
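To make CRUD and joins a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module with a throwaway in-memory database. The customers and orders tables, their columns and the sample values are all made up for illustration; other RDBMS products accept very similar SQL, but details vary.

```python
import sqlite3

# Throwaway in-memory database; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create: define two tables and insert some rows.
cur.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
cur.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Asha"), (2, "Ben")])
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0)],
)

# Update: change the amount of one order.
cur.execute("UPDATE orders SET amount = 45.0 WHERE order_id = 11")

# Delete: remove one order.
cur.execute("DELETE FROM orders WHERE order_id = 12")

# Read: a join combines each order with the customer it belongs to.
rows = cur.execute(
    """
    SELECT c.name, o.order_id, o.amount
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY o.order_id
    """
).fetchall()

print(rows)  # [('Asha', 10, 25.0), ('Asha', 11, 45.0)]
conn.close()
```

Ben has no remaining orders after the delete, so he does not appear in the joined result.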

Some basic characteristics of RDBMS

Anyone starting out in a career in database analytics must know the concepts of primary and foreign keys. Both are SQL constraints: rules enforced on the columns of a table to ensure the accuracy and reliability of the data.

1. Primary key

A primary key is the attribute, or column, that uniquely identifies each row of a table. By definition, it cannot contain duplicate or null values. A primary key can be composed of one or more columns. When it is constructed from a combination of two or more attributes, it is called a composite key.

2. Foreign key

A foreign key is a column, or set of columns, in one table that refers to a row (record) in a different table. Its value must either match the primary key of a row in that other table or be null. Like a primary key, a foreign key can be composed of a combination of two or more attributes.
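The two constraints can be seen in action with a small sketch, again using Python's sqlite3 module. The departments, employees and project_members tables are hypothetical, and try_insert is a helper written just for this example. One SQLite-specific assumption to note: foreign keys are only checked after running PRAGMA foreign_keys = ON; other database systems typically enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables for illustration.
conn.execute("CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE employees (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER REFERENCES departments(dept_id)  -- foreign key
    )
    """
)
conn.execute(
    """
    CREATE TABLE project_members (
        project_id INTEGER,
        emp_id     INTEGER REFERENCES employees(emp_id),
        PRIMARY KEY (project_id, emp_id)                 -- composite key
    )
    """
)


def try_insert(sql, params):
    """Run one INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    try_insert("INSERT INTO departments VALUES (?, ?)", (1, "Finance")),
    # Same dept_id again: violates the primary key.
    try_insert("INSERT INTO departments VALUES (?, ?)", (1, "Sales")),
    # dept_id 1 exists, so the foreign key is satisfied.
    try_insert("INSERT INTO employees VALUES (?, ?, ?)", (100, "Asha", 1)),
    # A null foreign key is allowed.
    try_insert("INSERT INTO employees VALUES (?, ?, ?)", (101, "Ben", None)),
    # There is no department 99: violates the foreign key.
    try_insert("INSERT INTO employees VALUES (?, ?, ?)", (102, "Chen", 99)),
    try_insert("INSERT INTO project_members VALUES (?, ?)", (7, 100)),
    # Same project, different employee: the combination is still unique.
    try_insert("INSERT INTO project_members VALUES (?, ?)", (7, 101)),
    # Same combination again: violates the composite key.
    try_insert("INSERT INTO project_members VALUES (?, ?)", (7, 100)),
]

print(results)
# ['ok', 'rejected', 'ok', 'ok', 'rejected', 'ok', 'ok', 'rejected']
```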

Some other important basics to know:

  1. The ordering of rows is not important in a relational database. You can insert rows in any order in the table.

  2. All rows should be distinct or unique. In other words, a table should not contain duplicate rows, and defining a primary key is what enforces this.
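Both points can be illustrated with one more short sqlite3 sketch. The expenses table and its values are invented for this example. Rows are inserted in no particular order, the query itself states the order it wants through ORDER BY, and the primary key turns away an exact duplicate of an existing row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table for illustration.
conn.execute(
    "CREATE TABLE expenses (expense_id INTEGER PRIMARY KEY, category TEXT, amount REAL)"
)

# Insert rows in no particular order.
conn.executemany(
    "INSERT INTO expenses VALUES (?, ?, ?)",
    [(3, "travel", 120.0), (1, "rent", 900.0), (2, "food", 60.5)],
)

# The order of the result comes from the query, not from the insertion order.
by_id = conn.execute("SELECT expense_id FROM expenses ORDER BY expense_id").fetchall()
by_amount = conn.execute(
    "SELECT expense_id FROM expenses ORDER BY amount DESC"
).fetchall()

# An exact copy of an existing row is rejected because of the primary key.
try:
    conn.execute("INSERT INTO expenses VALUES (?, ?, ?)", (1, "rent", 900.0))
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

print(by_id)               # [(1,), (2,), (3,)]
print(by_amount)           # [(1,), (3,), (2,)]
print(duplicate_accepted)  # False
```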