In this tutorial, we will expand on what we’ve learned about
JOIN and combine multiple tables together to filter and view useful information.
#Table of Contents
Query 1. Check the overdue fees owed by customers and the number of times they were late returning movies by combining 5 different tables.
JOIN first. Do you remember what this function does?
JOIN allows you to combine two or more tables using a common column (Join Key).
Different kinds of
Depending on the information you want to see, different
JOIN functions can be used. In this next example, we will user
INNER JOIN to combine two tables and view only rentals that are past due.
Q1. Check the overdue fees owed by customers and the number of times they were late returning movies by combining 5 different tables.
Let’s say we want to give the customers with the least amount of late fees a coupon in our imaginary DVD store. To find who they are, we have to combine multiple tables that contain different levels of data. We specified the columns we will use to combine all five tables below:
To calculate an accurate fee for late returns, we need to know how long the movie was rented and how much it costs to rent. We also need to factor in when it was rented, and when it was returned.
The main information we want extracted from all these tables is the customer ID number, customer’s full name, email address, the number of times they were late in returning a movie, and how much they owe for being late. This requires several table combinations because we need to calculate the number of overdue payments and the rent by movie.
Here is a visual break down of what we need from each table:
And here is a breakdown of what the image means:
- Join the rental table and the customer table using the
- Join the rental table and the payment table using the
- Join the rental table and the inventory table using the
- Join the inventory table and the film table using the
Now that we checked out the order we will combine the tables and the columns we need, let’s look at our query in full.
If it looks too complicated, don’t worry. We’re going to break it down into 6 simple steps. :)
JOIN in the
FROM function to join 5 tables: rental table, customer table, payment table, inventory table, film table.
We usually start with
SELECT, but this time we will start with
FROM. The reason we’re doing this is because we want to define a nickname (alias) for all our tables first, which will make writing our
SELECT function later on much more intuitive and easy.
So let’s write the
JOIN query that combines our five tables in
FROM. We want all data including loan/return information, overdue information, payment information, movie information, and customer information. So we will combine the tables with
INNER JOIN, which does not generate any
To save time and energy so we don’t have to type out the name of the table every time we need to call on it, we will give every table a nickname. Here is the alias cheat-sheet:
Rental table alias ▶ ️ r
Customer table alias ▶ c
Payment table alias ▶ p
Film table alias ▶ f
Inventory table alias ▶ i
Combine the tables in the order of
JOIN above, and note the common column that combines each table after
ON. Great! We’ve combined all the tables to get the information we need.
2. Add an
AND function to the
JOIN query to get the data that corresponds to the overdue period.
We already joined the film table, but we also want a calculation result from the film table. We can do this using
To get an accurate time period of how long a customer was late in returning a movie, we’re going to use
DATEDIFF to calculate the difference between the day the movie was rented out and the day it was returned. If you forgot how to use
DATEDIFF, check out our previous lesson on it.
⭐️ Don’t forget to utilize the ‘r’ alias for the rental table here.
3. Specify the columns we want to view for customer information (full name, email) in the
Using the aliases we already established, we can clarify the columns we want to see in
Start with the customer ID, which is from the customer table (alias c). Next, connect the first and last name of the customers using
CONCAT. Put a space between the names using quotations and rename the column as customer_name. Last, add in the customer email.
⭐️ Keep in mind that all these columns are from the customer table, so we need to add the alias
c. before each column name.
4. Count of the number of delayed movie returns per customer and view them with the
With information from the rental table, we can find the number of delayed returns of movies per customer. Using the
COUNT function, count all the rental IDs from the rental table (alias r). Rename this column overdue_amount.
🔑 Because we only joined the long rental periods through
JOIN, we will only get the number of rentals that are overdue.
GROUP BY customer ID to organize your results. Don’t forget to use the alias of the customer table (
5. Find the total late fee amount owed per customer for their late returns in the
To calculate the late fee, we need to find the difference between the actual rent paid for each movie that is past due and the specified standard rent. Use
SUM to calculate the amount from the payment table (
p.amount) subtracted (
-) by the rental rate from the films table (
GROUP BY is also critical to this calculation because it helps better organize the total sum of past due fees per customer ID.
6. Order the total amount of late fees and late payment calculated per customer in the order of low fees to high fees owed.
ORDER BY, sort by the overdue_number column we created in step 4. Don’t forget to close the query with a semicolon (
We can now execute the query and view our data!
Here is the full query if you want to copy & paste it:
SELECT c.customer_id, CONCAT(c.first_name, “, “, c.last_name) AS customer_name, c.email, COUNT(r.rental_id) as overdue_number, SUM(p.amount-f.rental_rate) as overdue_amount FROM rental r INNER JOIN customer c ON r.customer_id = c.customer_id INNER JOIN payment p ON r.rental_id = p.rental_id INNER JOIN inventory i ON r.inventory_id = i.inventory_id INNER JOIN film f ON i.film_id = f.film_id AND f.rental_duration < DATEDIFF(r.return_date, r.rental_date) GROUP BY c.customer_id ORDER BY overdue_number;
This is a good way to join several tables and calculate information we need from all of them. Using the
JOIN function, we’re able to accurately view the information of each customer and how much they owe our imaginary movie store in late fees. We can also see how many times they were late in returning a movie! 🤗