S3Stream is getting closed before processing the entire payload

Submitted by 别说谁变了你拦得住时间么 on 2019-12-25 01:54:12

Question


I am processing a bulk JSON payload from S3. The code is as follows:

import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;

import com.amazonaws.services.s3.model.S3Object;
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParseException;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.ListenableFuture;

public boolean sync(Job job)
        throws IOException
{
    // Validate the JSON payload from S3.
    try (InputStream s3Stream = readStreamFromS3())
    {
        validationService.validate(s3Stream);
    }
    catch (S3SdkInteractionException e)
    {
        logger.error(e.getLocalizedMessage());
    }

    // Process the JSON payload from S3.
    try (InputStream s3Stream = readStreamFromS3())
    {
        return syncService.process(s3Stream);
    }
    catch (S3SdkInteractionException e)
    {
        logger.error(e.getLocalizedMessage());
    }
    return false;
}


public InputStream readStreamFromS3()
{
    // s3Object: an S3Object instance obtained elsewhere (not shown here)
    return s3Object.getObjectContent();
}

// process() syncs the user data in the S3 stream.
// I do not close the stream until the entire stream is processed; I
// need to handle this as stream processing.
// I don't want to keep the contents in memory for processing; that is
// not feasible for my use case.
public boolean process(InputStream s3Stream)
        throws IOException
{
    JsonFactory jsonFactory = objectMapper.getFactory();
    try (JsonParser jsonParser = jsonFactory.createParser(s3Stream)) {

        JsonToken jsonToken = jsonParser.nextToken();
        List<HttpResponseFuture<UserResponse>> userFutures = new ArrayList<>(20);
        while (true) {
            for (int i = 0; i < 20; i++) {
                try {
                    // Stream is processed fully
                    if (jsonToken == null || jsonToken == JsonToken.END_OBJECT) { break; }

                    while (!jsonToken.isStructStart()) {
                        jsonToken = jsonParser.nextToken();
                    }

                    // Fetch the user record from the stream
                    if (jsonToken.isStructStart()) {
                        Map<String, Object> userNode = jsonParser.readValueAs(Map.class);

                        // Call an external service and keep the future response
                        userFutures.add(executeAsync(httpClient, userNode));

                        // Move to the next user record
                        if (jsonToken == JsonToken.START_OBJECT) {
                            jsonToken = jsonParser.nextToken();
                        }
                    }
                }
                catch (JsonParseException jpe) {
                    logger.error(jpe.getLocalizedMessage());
                    break;
                }
            }

            // Wait for the current batch of up to 20 calls to finish
            try {
                for (ListenableFuture<UserResponse> responseFuture : Futures.inCompletionOrder(userFutures)) {
                    UserResponse response = responseFuture.get();
                }
            }
            catch (InterruptedException | ExecutionException e) {
                logger.error(e.getLocalizedMessage());
            }
            userFutures.clear();

            // Stop once the stream is exhausted
            if (jsonToken == null || jsonToken == JsonToken.END_OBJECT) { break; }
        }
    }
    return false;
}
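
The batching idea in process() (send up to 20 records, wait for them, then read more from the stream) can be sketched with JDK types only. This is a minimal illustration, not the code from the question: BatchedSync, processInBatches, and the sendAsync parameter are hypothetical names, and CompletableFuture stands in for the Guava ListenableFuture used above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical helper: pulls records lazily and keeps at most batchSize calls in flight.
final class BatchedSync {

    /** Sends records in batches and waits for each batch; returns the number of records sent. */
    static <T, R> int processInBatches(Iterator<T> records,
                                       int batchSize,
                                       Function<T, CompletableFuture<R>> sendAsync) {
        int total = 0;
        List<CompletableFuture<R>> inFlight = new ArrayList<>(batchSize);
        while (records.hasNext()) {
            // Fill one batch; nothing beyond this batch is read from the source yet.
            while (inFlight.size() < batchSize && records.hasNext()) {
                inFlight.add(sendAsync.apply(records.next()));
            }
            // Block until the whole batch has completed.
            // join() throws CompletionException if any call failed.
            CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
            total += inFlight.size();
            // Drop finished futures so memory stays bounded by batchSize.
            inFlight.clear();
        }
        return total;
    }
}
```

In this sketch the iterator would wrap the JSON parser, so the underlying stream has to stay open until processInBatches returns.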

ServiceA ingests data (a JSON payload) into S3. ServiceB (the pseudocode shown above) processes the S3 data and calls serviceC to sync the data (JSON payload) to the underlying store.

Problem:

I am seeing a repeated S3 warning from our code:

com.amazonaws.services.s3.internal.S3AbortableInputStream Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use
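
The warning text itself suggests draining the stream after use. A minimal sketch of what draining means, using only java.io, is shown here. StreamDrain and drain are hypothetical names, and a ByteArrayInputStream stands in for the S3 stream so the example runs without the SDK.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of the AWS SDK.
final class StreamDrain {

    /** Reads and discards everything left in the stream; returns the number of bytes discarded. */
    static long drain(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long discarded = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            discarded += n;
        }
        return discarded;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = new ByteArrayInputStream(new byte[100_000])) {
            in.read(new byte[10]);                 // a consumer that stops early
            long leftover = drain(in);             // read the rest before close()
            System.out.println("drained " + leftover + " bytes");
        }
    }
}
```

For a payload close to 2 GB, draining downloads the remainder of the object, so it is only reasonable when little is left unread. The SDK's S3ObjectInputStream also exposes abort() for deliberately dropping the connection instead.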

The validation phase executes as expected without any issues. However, when syncing the data (i.e. syncService.process()), the s3Stream is closed before the entire payload is processed. Because the stream is closed before I have processed all of it, I end up in an inconsistent state.
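
One way to find out what closes the stream early is to wrap it in a small diagnostic stream that counts the bytes read and remembers the call site of the first close(). This is a sketch built on java.io.FilterInputStream. CloseTrackingInputStream and its accessor methods are hypothetical names, and bytes skipped via skip() are not counted.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical diagnostic wrapper: records how much was read and who closed the stream.
final class CloseTrackingInputStream extends FilterInputStream {
    private long bytesRead;
    private Throwable closedAt;   // holds the stack trace of the first close() call

    CloseTrackingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) bytesRead++;
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        // FilterInputStream.read(byte[]) delegates here, so bytes are counted once.
        int n = super.read(buf, off, len);
        if (n > 0) bytesRead += n;
        return n;
    }

    @Override
    public void close() throws IOException {
        if (closedAt == null) {
            closedAt = new Throwable("first close() after " + bytesRead + " bytes");
        }
        super.close();
    }

    long bytesRead() { return bytesRead; }
    boolean isClosed() { return closedAt != null; }
    Throwable closedAt() { return closedAt; }
}
```

Wrapping the result of readStreamFromS3() in this class and printing closedAt() would show which call closes the stream and how far reading had progressed. One candidate worth checking is the JsonParser itself: Jackson enables JsonParser.Feature.AUTO_CLOSE_SOURCE by default, so closing the parser also closes the underlying InputStream.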

Dependency information is as follows:

aws-java-sdk-s3:1.11.411

guava:guava-25.0-jre

jackson-core:2.9.6

The JSON payload can vary from a few MB to 2 GB.

Any help would be appreciated.

Source: https://stackoverflow.com/questions/55227573/s3stream-is-getting-closed-before-processing-the-entire-payload
