I have an app that involves sending Apple Push Notifications to ~1M users periodically. The setup for doing so has been built and tested for small numbers of notifications. Sinc
I completely agree with you that this API is very frustrating, and if they would have sent a response for each notification it would have been much easier to implement.
That said, here's what Apple say you should do (from Technical Note) :
Push Notification Throughput and Error Checking
There are no caps or batch size limits for using APNs. The iOS 6.1 press release stated that APNs has sent over 4 trillion push notifications since it was established. It was announced at WWDC 2012 that APNs is sending 7 billion notifications daily.
If you're seeing throughput lower than 9,000 notifications per second, your server might benefit from improved error handling logic.
Here's how to check for errors when using the enhanced binary interface. Keep writing until a write fails. If the stream is ready for writing again, resend the notification and keep going. If the stream isn't ready for writing, see if the stream is available for reading.
If it is, read everything available from the stream. If you get zero bytes back, the connection was closed because of an error such as an invalid command byte or other parsing error. If you get six bytes back, that's an error response that you can check for the response code and the ID of the notification that caused the error. You'll need to send every notification following that one again.
Once everything has been sent, do one last check for an error response.
It can take a while for the dropped connection to make its way from APNs back to your server just because of normal latency. It's possible to send over 500 notifications before a write fails because of the connection being dropped. Around 1,700 notifications writes can fail just because the pipe is full, so just retry in that case once the stream is ready for writing again.
Now, here's where the tradeoffs get interesting. You can check for an error response after every write, and you'll catch the error right away. But this causes a huge increase in the time it takes to send a batch of notifications.
Device tokens should almost all be valid if you've captured them correctly and you're sending them to the correct environment. So it makes sense to optimize assuming failures will be rare. You'll get way better performance if you wait for write to fail or the batch to complete before checking for an error response, even counting the time to send the dropped notifications again.
None of this is really specific to APNs, it applies to most socket-level programming.
If your development tool of choice supports multiple threads or interprocess communication, you could have a thread or process waiting for an error response all the time and let the main sending thread or process know when it should give up and retry.