A typical search experience (displaying a search page with 20 hits) requires
num segments * (1 + num terms * 2) + 20 GET requests.
We have 180 segments for our commoncrawl index, so with a couple of terms per query that comes to roughly 920 requests.
We can therefore take 1,000 requests as a generous upper bound.
At $0.0004 per 1,000 GET requests, this adds about $0.0004 per commoncrawl search request.
Storage costs us $5 per day, so GET request costs only start exceeding storage costs at more than 10k searches per day.
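
To make the arithmetic concrete, here is a quick back-of-the-envelope sketch in Python. The 2-terms-per-query figure is an assumption for illustration; the other numbers come straight from the paragraph above.

```python
# Back-of-the-envelope S3 cost model for searching the commoncrawl index.

NUM_SEGMENTS = 180            # segments in the commoncrawl index
NUM_TERMS = 2                 # assumed number of terms in a typical query
HITS_PER_PAGE = 20            # one GET per displayed hit
S3_GET_PRICE = 0.0004 / 1000  # S3 bills $0.0004 per 1,000 GET requests
STORAGE_PER_DAY = 5.0         # storage bill in $/day

# GET requests needed to serve one search page.
gets_per_search = NUM_SEGMENTS * (1 + NUM_TERMS * 2) + HITS_PER_PAGE  # = 920

# Use the generous 1,000-request upper bound for the cost estimate.
cost_per_search = 1000 * S3_GET_PRICE                                 # = $0.0004

# Daily search volume at which GET costs catch up with storage costs.
break_even_searches_per_day = STORAGE_PER_DAY / cost_per_search       # = 12,500

print(f"GET requests per search:   {gets_per_search}")
print(f"Cost per search:           ${cost_per_search:.4f}")
print(f"Break-even searches / day: {break_even_searches_per_day:,.0f}")
```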
Our search engine is meant for searching large datasets with a low query volume: logs, SIEM, e-discovery, exotic big data datasets, etc.
These use cases typically have a low daily query rate.
For high request rates (e.g. 1 query per second), like e-commerce, entirely decoupling storage and compute is actually a bad idea.
For low request rates (< 1,000 per day), using S3 without worrying about the GET request cost is perfectly fine.
And in between, you will probably want to use another object storage service with a more favorable pricing model.