subqueries in the from clause
From clause subqueries always seemed somewhat limited to me. Generally, you need to be able to fully create the subquery staticly on it's own, without any data from the main query's results, and then join that static data set to your main query using the on clause.
But, sometimes, you need to base the subquery on data items from the main query. (Where the subquery gets re-evaluated for each row of the main query.)
Basing the subquery on the main query data happens all the time when using subqueries in the columns and where sections of sql statements. for example:
if we have some tables:
student
id name
1 joe
2 mary
scores
id studentId score scoreDate
1 1 75 2025-01-31
2 1 92 2025-02-25
3 1 88 2025-03-15
4 2 87 2025-01-31
5 2 95 2025-02-25
6 2 100 2025-03-15
and we select data to return the top score for each student:
select
student.name,
(
select
score
from
scores
where
score.studentID = student.id
order by
scoreDate desc
limit 1
) as topScore
from
student
or select only the students with their top score being greater than 90:
select
student.name
from
student
where
(
select
score
from
scores
where
score.studentID = student.id
order by
scoreDate desc
limit 1
) > 90
everything is fine, but if you try to use the subquery in the from clause :
select
student.name,
scores.score as topScore
from
student
left join (
select
score
from
scores
where
score.studentID = student.id
order by
scoreDate desc
limit 1
) as topScore
FAIL: the sql engine will say: "can not access table 'student' in subquery"
WTF, why not?!?
Well the answer lies within the "type" of subquery. A subquery that references the main query in order to re-rerun itself and get new data with every row from the main query is called a "correlated subquery". While the columns section and the where section don't care which type of subquery is used (correlated vs non-correlated), the from section does care.
Question: if you can use a correlated subquery in the columns clause, why need it in the from clause?
Well, because sometimes you need to get multiple columns from the subquery, and subqueries in the columns clause only allow one column to be returned. For example, what if I wanted topScore AND the scoreDate of that score? I would love to do this:
select
student.name,
topScore.score as topScore,
topScore.scoreDate as topScoreDate
from
student
left join (
select
score
from
scores
where
score.studentID = student.id
order by
scoreDate desc
limit 1
) as topScore
FAIL: still can't do it of course!
Do we need to create 2 subqueries in the columns clause?!? That is extremely inefficient! The same query would have to be called twice now. (and using correlated subqueries in the first place incurs a cost, so having to double-up the cost is painful!)
select
student.name,
(
select
score
from
scores
where
score.studentID = student.id
order by
scoreDate desc
limit 1
) as topScore,
(
select
scoreDate
from
scores
where
score.studentID = student.id
order by
scoreDate desc
limit 1
) as topScoreDate,
from
student
(yuck!)
I found the answer
I've been developing in sql for about 20 years and never knew about this. When I found out, I felt a little foolish, but definitely relieved--here's how it's done:
The secret is in the on clause. On clauses, like columns clauses and where clauses also do not care if a correlated subquery is used, and that fixes the problem.
select
student.name,
topScores.score as topScore,
topScores.scoreDate as topScoreDate
from
student
left join scores as topScores on topScores.id = (
select
id
from
scores
where
score.studentID = student.id
order by
scoreDate desc
limit 1
)
Yes this does incur an extra lookup to get the topScores record "again", however, since it's via the primary key, it's not so bad.
By the way, some database engines offer special syntax for correlated subqueries in from clauses, which would then eliminate that extra lookup. See the outer apply join operator for MS SQL Server, and the lateral join operator in MySql. As of 9/2025, MariaDB and Sqlite do not have any special syntax.