In this post, we demonstrate how to combine two datasets and filter results using SQL techniques like JOIN
and subqueries. This approach is useful when you need to cross-reference and extract filtered information from multiple tables. For example, let’s find Nobel Prize winners from South America by combining two datasets: a list of Nobel Laureates and a list of countries with their respective continents.
Step 1: Setup
Assume we have imported the following two tables into a database named Nobel_Laureates
:
Table Name | Columns |
---|---|
Countries |
Country , Continent |
Nobel_Laureates |
Winner , Country_of_Birth |
We want to filter Nobel Laureates who were born in countries located in South America. The first step is to identify the countries in South America.
Step 2: Filter Countries Using Subquery
To get the list of countries in South America, we use the Countries
table with the following query:
SELECT Country FROM Countries WHERE Continent = 'South America';
This query returns all countries in South America. Now, we can use this result to filter Nobel Laureates born in these countries by applying a subquery:
SELECT Winner FROM Nobel_Laureates WHERE Country_of_Birth IN ( SELECT Country FROM Countries WHERE Continent = 'South America' );
This query filters the Nobel_Laureates
table using the subquery, which dynamically retrieves the list of South American countries.
Step 3: Filter Countries Using JOIN
Alternatively, we can achieve the same result using a JOIN
. By joining the Nobel_Laureates
table with the Countries
table, we directly match Nobel Laureates to their respective continents:
SELECT Winner FROM Nobel_Laureates INNER JOIN Countries ON Nobel_Laureates.Country_of_Birth = Countries.Country WHERE Countries.Continent = 'South America';
In this query, the INNER JOIN
creates a connection between the two tables based on the shared Country
column. The WHERE
clause then filters the rows to include only those belonging to South America.
Result Comparison
Both the subquery and the JOIN
approach produce the same result. However, the choice between these methods depends on your specific use case. JOIN
is generally faster when working with large datasets because it avoids creating a temporary result set like a subquery does.
Approach | Advantages | Use Case |
---|---|---|
Subquery | Simpler and easier to read for small datasets | Filtering data based on dynamic criteria |
JOIN | More efficient for large datasets | Cross-referencing data between tables |
Conclusion
Both subqueries and joins are powerful tools for combining and filtering data in SQL. Understanding their strengths and when to use them will help you write more efficient and maintainable queries. Practice with these techniques to master filtering rows across multiple tables.