Python concatenate text files

Backend | Unresolved | 12 replies | 1949 views
無奈伤痛 2020-11-22 02:51

I have a list of 20 file names, like ['file1.txt', 'file2.txt', ...]. I want to write a Python script to concatenate these files into a new file. I could open each file and copy its contents into the new file, but is there a more elegant or efficient way to do this?

12 answers
  • 2020-11-22 03:02

    If you have a lot of files in the directory, glob2 might be a better option for generating the list of filenames than writing them out by hand.

    import glob2
    
    filenames = glob2.glob('*.txt')  # list of all .txt files in the directory
    
    with open('outfile.txt', 'w') as f:
        for file in filenames:
            with open(file) as infile:
                f.write(infile.read()+'\n')
    
  • 2020-11-22 03:03
    import os

    def concatFiles():
        path = 'input/'
        files = os.listdir(path)
        for idx, infile in enumerate(files):
            print("File #" + str(idx) + "  " + infile)
        # Join the contents of every file under path.
        concat = ''.join([open(path + f).read() for f in files])
        with open("output_concatFile.txt", "w") as fo:
            fo.write(concat)  # not path + concat, which prepended the directory name

    if __name__ == "__main__":
        concatFiles()
    
  • 2020-11-22 03:04

    I don't know about elegance, but this works:

        import glob
        import os

        # Unix-only: shells out to cat to append each matching file.
        for f in glob.glob("file*.txt"):
            os.system("cat " + f + " >> OutFile.txt")
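    A sketch of the same Unix-only cat approach using subprocess instead, which passes the filenames as arguments and so avoids shell quoting problems with spaces in names (`concat_with_cat` is a hypothetical helper name):

```python
import glob
import subprocess

def concat_with_cat(pattern, out_path):
    # Unix-only: hand all matching files to a single cat invocation.
    # subprocess passes each name as its own argument, so no shell
    # quoting is needed and names with spaces are safe.
    files = sorted(glob.glob(pattern))
    if not files:
        raise FileNotFoundError("no files match " + pattern)
    with open(out_path, "wb") as out:
        subprocess.run(["cat", *files], stdout=out, check=True)
```

    Sorting the glob result also makes the concatenation order deterministic, which glob alone does not guarantee.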
    
  • 2020-11-22 03:06

    Use shutil.copyfileobj.

    It automatically reads the input files chunk by chunk for you, which is more efficient than reading them line by line, and it works even if some of the input files are too large to fit into memory:

    import shutil
    
    with open('output_file.txt', 'wb') as wfd:
        for f in ['seg1.txt', 'seg2.txt', 'seg3.txt']:
            with open(f, 'rb') as fd:
                shutil.copyfileobj(fd, wfd)
    
  • 2020-11-22 03:06
    outfile.write(infile.read()) # time: 2.1085190773010254s
    shutil.copyfileobj(fd, wfd, 1024*1024*10) # time: 0.60599684715271s
    

    A simple benchmark shows that shutil.copyfileobj performs better.
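    The answer doesn't show how those timings were obtained; a minimal sketch of such a benchmark using timeit on two generated sample files (the file sizes, repeat count, and 10 MiB buffer are assumptions, and absolute numbers will vary by machine):

```python
import os
import shutil
import tempfile
import timeit

# Generate two sample input files in a temp directory.
tmp = tempfile.mkdtemp()
inputs = []
for i in range(2):
    p = os.path.join(tmp, "seg%d.txt" % i)
    with open(p, "w") as fh:
        fh.write("some line of text\n" * 50_000)  # arbitrary sample size
    inputs.append(p)

def concat_lines(out_path):
    # Line-by-line text copy, as in the plain Python answers.
    with open(out_path, "w") as out:
        for p in inputs:
            with open(p) as infile:
                for line in infile:
                    out.write(line)

def concat_copyfileobj(out_path):
    # Chunked binary copy with a 10 MiB buffer, as in the shutil answer.
    with open(out_path, "wb") as out:
        for p in inputs:
            with open(p, "rb") as infile:
                shutil.copyfileobj(infile, out, 1024 * 1024 * 10)

t1 = timeit.timeit(lambda: concat_lines(os.path.join(tmp, "a.txt")), number=5)
t2 = timeit.timeit(lambda: concat_copyfileobj(os.path.join(tmp, "b.txt")), number=5)
print("line loop: %.3fs  copyfileobj: %.3fs" % (t1, t2))
```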

  • 2020-11-22 03:07

    This should do it

    For large files:

    filenames = ['file1.txt', 'file2.txt', ...]
    with open('path/to/output/file', 'w') as outfile:
        for fname in filenames:
            with open(fname) as infile:
                for line in infile:
                    outfile.write(line)
    

    For small files:

    filenames = ['file1.txt', 'file2.txt', ...]
    with open('path/to/output/file', 'w') as outfile:
        for fname in filenames:
            with open(fname) as infile:
                outfile.write(infile.read())
    

    … and another interesting one that I thought of:

    import itertools

    filenames = ['file1.txt', 'file2.txt', ...]
    with open('path/to/output/file', 'w') as outfile:
        # Python 3: the built-in map replaces itertools.imap
        for line in itertools.chain.from_iterable(map(open, filenames)):
            outfile.write(line)
    

    Sadly, this last method leaves a few open file descriptors, which the GC should eventually take care of anyway. I just thought it was interesting.
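    The descriptor leak can be avoided while keeping the chained-iterator flavor: a sketch using a small generator that closes each file as soon as it is exhausted (`concat` and `iter_lines` are hypothetical names):

```python
def iter_lines(filenames):
    # Yield every line of every file; the with-block closes each
    # file deterministically instead of waiting for the GC.
    for fname in filenames:
        with open(fname) as infile:
            yield from infile

def concat(filenames, out_path):
    # writelines drains the generator, so files are opened one at a time.
    with open(out_path, "w") as outfile:
        outfile.writelines(iter_lines(filenames))
```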
