Combining several one-column text files into one multi-column file, separated by tabs

别跟我提以往 2021-01-26 04:39

I have several one-column text files that I want to put next to each other (i.e. each file becomes one column). The only documentation I could find was on appending files to the bott

2 Answers
  • 2021-01-26 05:22

    Here's a good way to do it incrementally that also makes sure all the files involved are closed when it ends:

    from contextlib import contextmanager
    import glob
    from itertools import izip
    
    @contextmanager
    def multi_file_manager(files, mode='rb'):
        """ Open multiple files and make sure they all get closed. """
        files = [open(file, mode) for file in files]
        try:
            yield files
        finally:
            # Close every file even if the body of the with-block raises.
            for file in files:
                file.close()
    
    def read_info(file):
        """ Generator function to read and extract info from each line of a file. """
        for line in file:
            # Strip the line ending first so lines without '[' do not carry a newline.
            yield line.rstrip('\n').split('[')[0]
    
    with open("MA_continuous_results.csv", "wb") as outfile:
        MA_files = glob.glob("MA*continuous*2")
    
        col_headers = (filename.split("maskave_")[-1] for filename in MA_files)
        outfile.write('\t'.join(col_headers) + '\n')
    
        with multi_file_manager(MA_files) as infiles:
            generators = [read_info(file) for file in infiles]
    
            for fields in izip(*generators):
                outfile.write('\t'.join(fields) + '\n')
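
For readers on Python 3, where `itertools.izip` no longer exists, the same idea can be sketched with the built-in `zip` and `contextlib.ExitStack`. This is only a sketch under assumptions: the names `first_field` and `merge_columns` are made up for illustration, the header row simply reuses the input paths, and line endings are stripped before splitting on `[`.

```python
import contextlib
import glob


def first_field(line):
    """Return the text before the first '[' with the line ending removed."""
    return line.rstrip("\n").split("[")[0]


def merge_columns(in_paths, out_path):
    """Write one tab-separated row per line position across in_paths."""
    with contextlib.ExitStack() as stack:
        # ExitStack closes every file it was given, even if an error occurs.
        infiles = [stack.enter_context(open(path)) for path in in_paths]
        with open(out_path, "w") as outfile:
            outfile.write("\t".join(in_paths) + "\n")
            # zip reads the files in lockstep and stops at the shortest one.
            for lines in zip(*infiles):
                fields = [first_field(line) for line in lines]
                outfile.write("\t".join(fields) + "\n")


if __name__ == "__main__":
    # Same glob pattern as in the code above; sorted for a stable column order.
    paths = sorted(glob.glob("MA*continuous*2"))
    if paths:
        merge_columns(paths, "MA_continuous_results.csv")
```

Note that `zip` silently drops trailing lines of longer files, so this assumes all inputs have the same number of lines.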
    
  • 2021-01-26 05:32

    I would rewrite your code as follows:

    import glob
    from itertools import izip
    
    def extract_meaningful_info(line):
        return line.rstrip('\n').split('[')[0]
    
    MA_files = glob.glob("MA*continuous*2")
    
    with open("MA_continuous_results.csv", "wb") as outfile:
        outfile.write("\t".join(MA_files) + '\n')
        for fields in izip(*(open(f) for f in MA_files)):
            fields = [extract_meaningful_info(f) for f in fields]
            outfile.write('\t'.join(fields) + '\n')
    

    (This code is for Python 2.)

    You might want to read about:

    • itertools.izip

    • the * operator for unpacking arguments in a function call
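
To make the two items above concrete, here is a small sketch in Python 3, where the built-in `zip` behaves like Python 2's `itertools.izip`. The sample data and the helper name `pad_rows` are invented for illustration; `itertools.zip_longest` is shown as one possible way to handle columns of unequal length, which the original code does not attempt.

```python
from itertools import zip_longest

# Each inner list stands for the lines of one single-column file.
columns = [["a1", "a2", "a3"], ["b1", "b2", "b3"]]

# The * unpacks the list so that zip receives each column as its own
# argument; zip then yields one tuple per row.
rows = list(zip(*columns))


def pad_rows(columns, fill=""):
    """Like zip(*columns), but pads short columns instead of truncating."""
    return list(zip_longest(*columns, fillvalue=fill))


for row in rows:
    print("\t".join(row))
```

Plain `zip` stops at the shortest input, so `pad_rows` may be the safer choice when the files can differ in length.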
