Question
I have a CSV file with 1M rows. If I select 10 rows, will I be billed for only 10 rows? What do "data returned" and "data scanned" mean in S3 Select?
There is little documentation on these S3 Select terms.
Answer 1:
To keep things simple, let's forget for the moment that S3 reads in a columnar way. Suppose you have the following data:
| City | Last Updated Date |
|------------|---------------------|
| London | 1st Jan |
| London | 2nd Jan |
| New Delhi | 2nd Jan |
A query that fetches the rows with the latest update date
- forces S3 to scan all 3 records
- but returns only 2 records (those whose last updated date is 2nd Jan)
A query that selects the city where the last updated date is 1st Jan
- will scan all 3 rows
- but return only 1 string: "London".
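Hence, depending on your query, S3 Select might scan more data (3 rows) but return less data (2 rows). Both quantities are billed, and they are measured in bytes rather than rows, so selecting 10 rows out of 1M does not mean you pay for only 10 rows.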
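To make the distinction concrete, here is a minimal local sketch in Python that mimics the two queries above on the same three rows. It does not call S3; `run_query` and its counters are hypothetical names used only for illustration, and the byte counts are simply the sizes of the input and output text, assuming the whole object has to be read to evaluate the filter.

```python
import csv
import io

# The sample table from above, as a CSV object would store it.
CSV_DATA = """City,Last Updated Date
London,1st Jan
London,2nd Jan
New Delhi,2nd Jan
"""


def run_query(csv_text, predicate, columns):
    """Read every row, keep only the requested columns of matching rows,
    and report how much was scanned versus how much was returned."""
    reader = csv.DictReader(io.StringIO(csv_text))
    returned = []
    rows_scanned = 0
    for row in reader:
        rows_scanned += 1  # every row is examined, matching or not
        if predicate(row):
            returned.append([row[c] for c in columns])

    # Serialize the result to measure the size of what goes back to the caller.
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(returned)

    return {
        "rows_scanned": rows_scanned,
        "rows_returned": len(returned),
        "bytes_scanned": len(csv_text.encode("utf-8")),
        "bytes_returned": len(out.getvalue().encode("utf-8")),
        "records": returned,
    }


# Query 1: rows whose last updated date is the latest one (2nd Jan).
latest = run_query(
    CSV_DATA,
    lambda r: r["Last Updated Date"] == "2nd Jan",
    ["City", "Last Updated Date"],
)

# Query 2: city names where the last updated date is 1st Jan.
first_jan = run_query(
    CSV_DATA,
    lambda r: r["Last Updated Date"] == "1st Jan",
    ["City"],
)

print(latest["rows_scanned"], latest["rows_returned"])
print(first_jan["rows_scanned"], first_jan["records"])
```

In both cases all three rows are scanned, while the returned output is only the matching rows and the selected columns.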
I hope the difference between Data Scanned and Data Returned is clear now.
Source: https://stackoverflow.com/questions/53001443/how-s3-select-pricing-works-what-is-data-returned-and-scanned-in-s3-select-mean
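If you want to see the real numbers for your own query, the S3 Select response stream includes a `Stats` event reporting bytes scanned, processed, and returned. The following is a rough sketch using boto3's `select_object_content`; the helper names `summarize_select_stats` and `run_select` are made up for this example, and the bucket, key, and CSV settings (a header row, CSV output) are assumptions you would adapt to your data.

```python
def summarize_select_stats(events):
    """Pull the byte counters out of an S3 Select event stream.

    `events` is any iterable of event dicts, such as response["Payload"].
    """
    stats = {}
    for event in events:
        if "Stats" in event:
            details = event["Stats"]["Details"]
            stats = {
                "bytes_scanned": details["BytesScanned"],
                "bytes_processed": details["BytesProcessed"],
                "bytes_returned": details["BytesReturned"],
            }
    return stats


def run_select(bucket, key, expression):
    """Run an S3 Select query and return its byte counters.

    Assumes a CSV object with a header row; requires boto3 and AWS credentials.
    """
    import boto3  # imported here so the parsing helper works without boto3

    s3 = boto3.client("s3")
    response = s3.select_object_content(
        Bucket=bucket,
        Key=key,
        ExpressionType="SQL",
        Expression=expression,
        InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
        OutputSerialization={"CSV": {}},
    )
    return summarize_select_stats(response["Payload"])
```

A call might look like `run_select("my-bucket", "data.csv", "SELECT * FROM S3Object s LIMIT 10")`, where the bucket and key are placeholders. Comparing `bytes_scanned` with `bytes_returned` in the result shows how the two quantities differ for that query.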