Breaking gzipped JSON into chunks of arbitrary size

Submitted by 只谈情不闲聊 on 2021-01-29 18:25:36

Question


This is being done in Typescript, but the algorithm is applicable to any language, I think.

I am sending log data (parsed AWS ALB logs) to New Relic, whose maximum payload size is 10^6 bytes. What I'm doing right now is encoding the entire ALB log I get from S3 as JSON, gzipping it, and then checking the size via Buffer.byteLength. If it exceeds 900,000 bytes (I want to leave some headroom, because the gzipped size doesn't scale exactly linearly with the number of log entries), I compute a multiplier as 900,000 / byte length and break the log entries into chunks of messages.length * multiplier entries each, as shown below.

This works, but I'm concerned that the algorithm won't work as well when the data are more heterogeneous. That 900,000 number is fairly arbitrary, after all. Is there a better way to break these records up? I suppose I could try and dynamically determine the optimal chunk size, but I feel like that would needlessly burn up a lot of CPU.

  import { chunk } from 'lodash'
  import { promisify } from 'util'
  import { gzip as gzipCallback } from 'zlib'

  // Assumed here: gzip is a promisified zlib.gzip, and MaxPayloadSize is the
  // 900,000-byte headroom threshold described above.
  const gzip = promisify(gzipCallback)
  const MaxPayloadSize = 900_000

  async function chunkify(messages: Array<unknown>): Promise<Array<Buffer>> {
    const postdata = [{ logs: messages }]
    const postdataGzipped: Buffer = (await gzip(
      JSON.stringify(postdata)
    )) as Buffer

    if (postdataGzipped.byteLength < MaxPayloadSize) {
      // The whole payload already fits, so send it as a single chunk.
      return [postdataGzipped]
    } else {
      // Scale the chunk size down in proportion to how far over the limit we are.
      const multiplier = MaxPayloadSize / postdataGzipped.byteLength
      const chunkSize = Math.floor(messages.length * multiplier)
      console.info(
        `Break ${messages.length} messages into chunks of (up to) ${chunkSize} elements each`
      )
      const chunks: Buffer[] = await Promise.all(
        chunk(messages, chunkSize).map(
          (messageChunk) =>
            gzip(JSON.stringify([{ logs: messageChunk }])) as Promise<Buffer>
        )
      )
      return chunks
    }
  }
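
For reference, here is a rough, untested sketch of the "dynamically determine the optimal chunk size" idea I mentioned: recursively split the batch in half until every gzipped chunk fits under the limit. It assumes the same promisified gzip and the same 900,000-byte MaxPayloadSize as above, and chunkifyRecursive is just an illustrative name.

  // Hypothetical sketch, not my current implementation: split the batch
  // recursively until every gzipped chunk is under MaxPayloadSize.
  async function chunkifyRecursive(
    messages: Array<unknown>
  ): Promise<Array<Buffer>> {
    const gzipped = (await gzip(JSON.stringify([{ logs: messages }]))) as Buffer
    if (gzipped.byteLength < MaxPayloadSize || messages.length <= 1) {
      // Fits (or cannot be split further), so emit it as a single chunk.
      return [gzipped]
    }
    // Too big: halve the batch and handle each half independently.
    const mid = Math.ceil(messages.length / 2)
    const halves = await Promise.all([
      chunkifyRecursive(messages.slice(0, mid)),
      chunkifyRecursive(messages.slice(mid)),
    ])
    return halves.flat()
  }

The extra gzip passes are the CPU cost I was worried about, though each level of recursion only re-compresses data that failed the size check.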

Source: https://stackoverflow.com/questions/64124467/breaking-gzipped-json-into-chunks-of-arbitrary-size
