Hi, I am working through example #7 from the SQLZoo tutorial, SELECT within SELECT. In the following question:
"Find each country that belongs to a continent where all p
Why use a subquery?
Try using:
SELECT name, continent, population FROM world
WHERE population > 25000000
and/or
SELECT name, continent, population FROM world
WHERE population <= 25000000
The column in your condition, population, already belongs to the table in the FROM clause, world.
There is no need to query the same table world again in a subquery; just use the population column directly in the WHERE clause.
Or are you trying to do this:
SELECT name, continent, population FROM world
WHERE continent NOT IN (
SELECT continent FROM world
GROUP BY continent
HAVING SUM(population) > 25000000)
Notice the SUM(), GROUP BY, and HAVING.
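To make the difference between the SUM()/GROUP BY/HAVING form and a per-country test concrete, here is a minimal sketch run through Python's built-in sqlite3 module. The continents "Alpha" and "Beta" and their countries are made-up rows for illustration, not SQLZoo data, and SQLite stands in for whatever engine the tutorial uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE world (name TEXT, continent TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO world VALUES (?, ?, ?)",
    [
        # Hypothetical data: Alpha totals 30M, yet no single country exceeds 25M.
        ("A1", "Alpha", 10_000_000),
        ("A2", "Alpha", 10_000_000),
        ("A3", "Alpha", 10_000_000),
        # Beta totals 10M.
        ("B1", "Beta", 5_000_000),
        ("B2", "Beta", 5_000_000),
    ],
)

# Excludes continents whose TOTAL population is over 25M.
by_sum = [r[0] for r in conn.execute("""
    SELECT name FROM world
    WHERE continent NOT IN (
        SELECT continent FROM world
        GROUP BY continent
        HAVING SUM(population) > 25000000)
    ORDER BY name
""")]

# Excludes continents that contain at least one COUNTRY over 25M.
by_country = [r[0] for r in conn.execute("""
    SELECT name FROM world
    WHERE continent NOT IN (
        SELECT continent FROM world
        WHERE population > 25000000)
    ORDER BY name
""")]

print(by_sum)      # ['B1', 'B2']
print(by_country)  # ['A1', 'A2', 'A3', 'B1', 'B2']
```

With these rows the SUM() version drops Alpha because its total is 30M, while the per-country version keeps it because no single country is over the limit. Which one is right depends on what the question actually asks.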
If I'm reading this correctly, the question asks to list every country in a continent where every country has a population below 25000000, correct?
If yes, look at your sub query:
SELECT continent FROM world
WHERE population > 25000000
You are pulling every continent that has at least one country with a population over 25000000, so excluding those continents is why it works.
Example: Continent Alpha has 5 countries. Four of them are small, but one of them, country Charlie, has a population of 50000000.
So your sub query will return Continent Alpha, because country Charlie fits the constraint population > 25000000. This sub query finds everything that you don't want, and that is why using NOT IN works.
On the other hand:
SELECT continent FROM world
WHERE population < 25000000
If ANY country is below 25000000, this returns the continent, which is not what you want, because you want EVERY country to be below.
Example: Continent Alpha from before, the four small countries. Those four are below 25000000, so they will be returned by your sub query, regardless of the fact that Country Charlie has 50000000.
Obviously, this is not the best way to go about it, but this is why the first query worked, and the second did not.
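The Continent Alpha example can be checked with a small sketch using Python's built-in sqlite3 module. Alpha and Charlie come from the example above; the names of the four small countries, the second continent "Beta", and all population figures other than Charlie's are assumptions added only so there is something to compare against.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE world (name TEXT, continent TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO world VALUES (?, ?, ?)",
    [
        ("Small1", "Alpha", 1_000_000),
        ("Small2", "Alpha", 2_000_000),
        ("Small3", "Alpha", 3_000_000),
        ("Small4", "Alpha", 4_000_000),
        ("Charlie", "Alpha", 50_000_000),  # the one large country in Alpha
        ("Delta", "Beta", 5_000_000),      # hypothetical continent, all countries small
        ("Echo", "Beta", 6_000_000),
    ],
)

def continents(sql):
    """Run a query and return the distinct continents it yields, sorted."""
    return sorted({row[0] for row in conn.execute(sql)})

# Exclude every continent that has at least one country over 25M.
not_in_over = continents("""
    SELECT continent FROM world
    WHERE continent NOT IN (
        SELECT continent FROM world WHERE population > 25000000)
""")

# Keep every continent that has at least one country under 25M.
in_under = continents("""
    SELECT continent FROM world
    WHERE continent IN (
        SELECT continent FROM world WHERE population < 25000000)
""")

print(not_in_over)  # ['Beta']
print(in_under)     # ['Alpha', 'Beta']
```

The NOT IN form with > drops Alpha because of Charlie, while the IN form with < keeps Alpha because its four small countries satisfy the sub query.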
Because every other continent has at least one country with a population of less than 25 million. That is what this says:
SELECT name, continent, population FROM world
WHERE continent IN (
SELECT continent FROM world
WHERE population < 25000000)
Translating it into words: from the list of all countries (in table world), find all countries whose continent has at least one country with a population of less than 25 million.
Show the table declaration. It seems you use CONTINENT as the continent number. If so, you should check that it is marked with the PRIMARY KEY and NOT NULL options. I really suspect you just forgot about the very special meaning NULL has in SQL.
Here is an example on a Firebird 2.5.1 SQL server.
CREATE TABLE WORLD (
CONTINENT INTEGER,
NAME VARCHAR(20),
POPULATION INTEGER
);
INSERT INTO WORLD (CONTINENT, NAME, POPULATION) VALUES (NULL, 'null-id', 100);
INSERT INTO WORLD (CONTINENT, NAME, POPULATION) VALUES (1, 'normal 1', 10);
INSERT INTO WORLD (CONTINENT, NAME, POPULATION) VALUES (2, 'normal 2', 200);
INSERT INTO WORLD (CONTINENT, NAME, POPULATION) VALUES (3, 'null-pop', NULL);
INSERT INTO WORLD (CONTINENT, NAME, POPULATION) VALUES (4, 'normal 4', 110);
COMMIT WORK;
Now let's try your queries and see whether the first row, the one where CONTINENT IS NULL, shows up anywhere:
SELECT continent, population FROM world
WHERE continent IN (
SELECT continent FROM world
WHERE population > 100)
CONTINENT POPULATION
2 200
4 110
and then
SELECT continent, population FROM world
WHERE continent NOT IN (
SELECT continent FROM world
WHERE population > 100)
CONTINENT POPULATION
1 10
3 <NULL>
By the logic of your query, you treat CONTINENT as the row ID. In that case you should declare it NOT NULL, and then there would be no row that is invisible to both the IN and the NOT IN condition.
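The reason the NULL row appears under neither IN nor NOT IN is SQL's three-valued logic: comparing NULL against a list yields unknown, and WHERE keeps only rows whose condition is true. A minimal sketch, using Python's built-in sqlite3 module as a stand-in (an assumption; the results above were produced on Firebird, though this behaviour is standard SQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Evaluate the predicates as scalar expressions.
# SQLite reports NULL (unknown) as None, false as 0, and true as 1.
null_in, null_not_in, one_in, one_not_in = conn.execute(
    "SELECT NULL IN (2, 4), NULL NOT IN (2, 4), 1 IN (2, 4), 1 NOT IN (2, 4)"
).fetchone()

print(null_in, null_not_in)  # None None
print(one_in, one_not_in)    # 0 1
```

Both tests on NULL come back unknown, so a row with a NULL continent is filtered out either way, whereas continent 1 gets a definite answer.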
Now, let's rephrase this as a flat query:
SELECT continent, population FROM world
WHERE NOT (population > 100)
CONTINENT POPULATION
<NULL> 100
1 10
SELECT continent, population FROM world
WHERE population > 100
CONTINENT POPULATION
2 200
4 110
This time the missing row is the one with NULL in the POPULATION column.
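As a sketch of where that row went, the same five rows can be split three ways: condition true, condition false, and condition unknown. This uses Python's built-in sqlite3 module instead of Firebird (an assumption made only so the snippet is self-contained), and the helper name `names` is made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE world (continent INTEGER, name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO world VALUES (?, ?, ?)",
    [
        (None, "null-id", 100),
        (1, "normal 1", 10),
        (2, "normal 2", 200),
        (3, "null-pop", None),
        (4, "normal 4", 110),
    ],
)

def names(where):
    """Return the names of rows matching a WHERE condition, sorted."""
    sql = f"SELECT name FROM world WHERE {where} ORDER BY name"
    return [r[0] for r in conn.execute(sql)]

over = names("population > 100")          # condition is true
not_over = names("NOT (population > 100)")  # condition is false
unknown = names("population IS NULL")     # condition is unknown

print(over)      # ['normal 2', 'normal 4']
print(not_over)  # ['normal 1', 'null-id']
print(unknown)   # ['null-pop']
```

Only the explicit IS NULL test picks up the row that both flat queries skip.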
Then FreshPrinceOfSO suggested using an EXISTS clause. While it may potentially end up with the slowest (least efficient) query plan, it at least masks the special meaning of the NULL value in SQL.
SELECT continent, population FROM world w_ext
WHERE EXISTS (
SELECT continent FROM world w_int
WHERE (w_int.population > 100) and (w_int.continent = w_ext.continent)
)
CONTINENT POPULATION
2 200
4 110
SELECT continent, population FROM world w_ext
WHERE NOT EXISTS (
SELECT continent FROM world w_int
WHERE (w_int.population > 100) and (w_int.continent = w_ext.continent)
)
CONTINENT POPULATION
<NULL> 100
1 10
3 <NULL>
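To compare the two approaches side by side, here is a sketch that runs both pairs of queries on the same five rows and counts what each pair covers. It uses Python's built-in sqlite3 module in place of Firebird (an assumption), and the `run` helper and `{neg}` placeholder are just conveniences for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE world (continent INTEGER, name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO world VALUES (?, ?, ?)",
    [
        (None, "null-id", 100),
        (1, "normal 1", 10),
        (2, "normal 2", 200),
        (3, "null-pop", None),
        (4, "normal 4", 110),
    ],
)

exists_sql = """
    SELECT name FROM world w_ext
    WHERE {neg} EXISTS (
        SELECT 1 FROM world w_int
        WHERE w_int.population > 100 AND w_int.continent = w_ext.continent)
    ORDER BY name
"""
in_sql = """
    SELECT name FROM world
    WHERE continent {neg} IN (
        SELECT continent FROM world WHERE population > 100)
    ORDER BY name
"""

def run(template, neg):
    """Fill in '' or 'NOT' and return the matching names."""
    return [r[0] for r in conn.execute(template.format(neg=neg))]

exists_rows, not_exists_rows = run(exists_sql, ""), run(exists_sql, "NOT")
in_rows, not_in_rows = run(in_sql, ""), run(in_sql, "NOT")

print(len(exists_rows) + len(not_exists_rows))  # 5: every row lands on one side
print(len(in_rows) + len(not_in_rows))          # 4: the NULL-continent row is lost
print(not_exists_rows)  # ['normal 1', 'null-id', 'null-pop']
```

EXISTS is always either true or false, so EXISTS and NOT EXISTS together cover the whole table, while IN and NOT IN leave out the row whose continent is NULL.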