Reading files from a tar.gz archive in Nim

Submitted by 自作多情 on 2019-12-10 14:24:54

Question


I'm looking for a way to read a file from a tar.gz archive using the Nim programming language (version 0.11.2). Say I have an archive

/my/path/to/archive.tar.gz

and a file in that archive

my/path/to/archive/file.txt

My goal is to be able to read the contents of the file line by line in Nim. In Python I can do this with the tarfile module. In Nim there are the libzip and zlib modules, but the documentation is minimal and there are no examples. There's also the zipfiles module, but I'm not sure if this is capable of working with tar.gz archives.


Answer 1:


To my knowledge, libzip and zlib cannot read tar archives on their own: libzip handles zip archives and zlib handles deflate/gzip compression, while a tar.gz needs gzip decompression plus tar parsing. Unfortunately, it looks like there are no Nim libraries yet that read tar.gz archives.

If you are okay with a quick-and-dirty tar-based solution, you can do this:

import osproc

proc extractFromTarGz(archive: string, filename: string): string =
  # -x extracts
  # -f specifies filename
  # -z runs through gzip
  # -O prints to STDOUT
  # quoteShell guards against spaces and shell metacharacters in the paths
  result = execProcess("tar -zxf " & quoteShell(archive) & " " & quoteShell(filename) & " -O")

let content = extractFromTarGz("test.tar.gz", "some/subpath.txt")
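
Since the goal in the question is to read the file line by line, here is a minimal sketch of a variant that returns the lines directly. It assumes a `tar` binary on the PATH that accepts `-xzOf`, and the proc name `readLinesFromTarGz` is made up for illustration. The arguments are passed as an array rather than as one command string, so no shell is involved.

```nim
import osproc, strutils

proc readLinesFromTarGz(archive, member: string): seq[string] =
  ## Runs `tar -xzOf <archive> <member>` and returns the member's lines.
  # Without poEvalCommand the arguments go to tar as-is, so paths with
  # spaces or quotes need no escaping.
  let output = execProcess("tar", args = ["-xzOf", archive, member],
                           options = {poUsePath})
  # Note: if the text ends with a newline, the last element is an empty string.
  result = output.splitLines()
```

A caller could then write `for line in readLinesFromTarGz("test.tar.gz", "some/subpath.txt"): echo line`. This sketch does not check tar's exit code, so a missing archive or member simply yields no content.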

If you want a clean and flexible solution, this would be a good opportunity to write a wrapper for the libarchive library ;).




Answer 2:


In a project at my company, we've been using the following module, exposing gzip files as streams:

import
  zlib, streams

type
  GZipStream* = object of StreamObj
    f: GzFile

  GzipStreamRef* = ref GZipStream

proc fsClose(s: Stream) =
  discard gzclose(GZipStreamRef(s).f)

proc fsReadData(s: Stream, buffer: pointer, bufLen: int): int =
  return gzread(GZipStreamRef(s).f, buffer, bufLen)

proc fsAtEnd(s: Stream): bool =
  return gzeof(GZipStreamRef(s).f) != 0

proc newGZipStream*(f: GzFile): GZipStreamRef =
  new result
  result.f = f
  result.closeImpl = fsClose
  result.readDataImpl = fsReadData
  result.atEndImpl = fsAtEnd
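  # other methods (setPositionImpl, getPositionImpl, writeDataImpl, ...) are nil,
  # so the stream only supports sequential reading, atEnd, and close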

proc newGZipStream*(filename: cstring): GZipStreamRef =
  # returns nil if the file cannot be opened
  var gz = gzopen(filename, "r")
  if gz != nil: return newGZipStream(gz)

But you also need to be able to read the tar headers in order to find the location of the desired file in the uncompressed gzip stream. You could wrap an existing C library such as libtar to do this, or you could roll your own implementation.
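
As a rough idea of what rolling your own could look like, here is a minimal sketch that walks the headers of an uncompressed tar archive already held in a string (for example, after reading the whole decompressed stream). It relies on the basic tar layout: 512-byte blocks, the member name in bytes 0..99 of the header, and the size as octal text in bytes 124..135. The names `TarEntry`, `cString`, and `listEntries` are hypothetical, and the sketch ignores the ustar prefix field, GNU/PAX long names, and base-256 sizes.

```nim
import strutils

const blockSize = 512  # tar stores headers and data in 512-byte blocks

type
  TarEntry = object
    name: string   # member path (bytes 0..99 of the header)
    size: int      # member size in bytes (octal text at bytes 124..135)
    offset: int    # index in the tar data where the member's content starts

proc cString(s: string): string =
  ## Returns `s` truncated at the first NUL byte.
  let nul = s.find('\0')
  result = if nul >= 0: s[0 ..< nul] else: s

proc listEntries(tar: string): seq[TarEntry] =
  ## Walks the headers of an uncompressed tar archive held in memory.
  result = @[]
  var pos = 0
  while pos + blockSize <= tar.len:
    let header = tar[pos ..< pos + blockSize]
    # A header whose name starts with NUL is treated as the end-of-archive marker.
    if header[0] == '\0': break
    let name = cString(header[0 ..< 100])
    let size = parseOctInt(cString(header[124 ..< 136]).strip())
    result.add(TarEntry(name: name, size: size, offset: pos + blockSize))
    # Skip the header plus the data, which is padded to a multiple of 512 bytes.
    pos += blockSize + ((size + blockSize - 1) div blockSize) * blockSize
```

Given an entry `e` from `listEntries(tar)`, the file content would be `tar[e.offset ..< e.offset + e.size]`, which can then be split into lines. Holding the whole archive in memory keeps the sketch short; a streaming version would read one block at a time instead.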




Answer 3:


I created a basic untar package that may help with this: https://github.com/dom96/untar



Source: https://stackoverflow.com/questions/33082639/reading-files-from-tar-gz-archive-in-nim
