Home SQL Fundamentals SQL SQL Fundamentals SQL Fundamentals

SQL Fundamentals

March 20, 2021
SQL Fundamentals

In this blog post will provide a brief introduction to working with relational databases and you’ll learn about their structure. Also, you’ll learn how to talk about them using database lingo, and how to begin an analysis using simple SQL commands to select and summarize columns from database tables.

Data Scientist

The role of a data scientist is to turn raw data into actionable insights. Much of the world’s raw data lives in organized collections of tables called relational databases. To be an effective data scientist, you must know how to wrangle and extract data from these databases using a language called SQL. In this blog post will talk about syntax in SQL (shared by many types of databases, such as PostgreSQL, MySQL, SQL Server, and Oracle) and everything you need to know to begin working with databases today.

Beginning your SQL journey

SQL, which stands for Structured Query Language, is a language for interacting with data stored in something called a relational database.

You can think of a relational database as a collection of tables. A table is just a set of rows and columns, like a spreadsheet, which represents exactly one type of entity. For example, a table might represent employees in a company or purchases made, but not both.

Each row, or record, of a table contains information about a single entity. For example, in a table representing employees, each row represents a single person. Each column, or field, of a table contains a single attribute for all rows in the table. For example, in a table representing employees, we might have a column containing first and last names for all employees.

The table of employees might look something like this:


SELECTing single columns

While SQL can be used to create and modify databases, the focus of this blog post will be querying databases. A query is a request for data from a database table (or combination of tables). Querying is an essential skill for a data scientist, since the data you need for your analyses will often live in databases.

In SQL, you can select data from a table using a SELECT statement. For example, the following query selects the name column from the people table:

FROM people;

That said, it’s good practice to make SQL keywords uppercase to distinguish them from other parts of your query, like column and table names.

It’s also good practice (but not necessary) to include a semicolon at the end of your query. This tells SQL where the end of your query is.

SELECT title
FROM movies;

SELECTing multiple columns

In the real world, you will often want to select multiple columns. Luckily, SQL makes this really easy. To select multiple columns from a table, simply separate the column names with commas.

For example, this query selects two columns, name and birthdate, from the people table:

SELECT name, birthdate
FROM people;

Sometimes, you may want to select all columns from a table. Typing out every column name would be a pain, so there’s a handy shortcut:

FROM people;

If you only want to return a certain number of results, you can use the LIMIT keyword to limit the number of rows returned:

FROM people


Often your results will include many duplicate values. If you want to select all the unique values from a column, you can use the DISTINCT keyword. This might be useful if, for example, you’re interested in knowing which languages are represented in the movies table:

FROM movies;

Learning to COUNT

What if you want to count the number of employees in your employees table? The COUNT statement lets you do this by returning the number of rows in one or more columns.

For example, this code gives the number of rows in the employees table:

FROM employees;

You can combine keywords to select data. Next example will count all of the unique names from the employees table:

FROM employees;
Was this article helpful?