In my table I have the following columns:
id|name|gender
I would like to extract all the names which are duplicated, but I don\'t want to scan
As per my understanding what you mean from your statement that you do not want to read all the records at once is that you want the query for finding duplicate names in chunks.
If my understanding is correct, then you can run the query in batches using id as the separator for that ie. first run the query for 1000 records, then run the query for records between 1000 and 2000, then between 2000 and 3000.
But there is one issue when you do not want to read all the records at once : It might happen that the name is unique in the chunk you have picked up but still it can be a duplicate in the whole table i.e. a name can be unique between records 1000 and 2000 but can occur between records 4000 and 5000.
SELECT
name
FROM
Mytable
GROUP BY
name
HAVING
COUNT(*) > 1
To separate gender too, change the GROUP BY to name, gender
To delete duplicate rows
DELETE MyTable
WHERE ID NOT IN
(
SELECT
MIN(Id)
FROM
Mytable
GROUP BY
name
)
Is this what you're looking for?
select name from tbl
group by name
having count(*) > 1