PyTorch DataLoader: too many threads, too much CPU memory allocation

Submitted by 大兔子大兔子 on 2019-12-11 15:53:33

Question


I'm training a model using PyTorch. To load the data, I'm using torch.utils.data.DataLoader with a custom dataset I've implemented. A strange problem occurs: every time the second `for` loop in the following code executes, the number of threads/processes increases and a huge amount of memory is allocated.

    for epoch in range(start_epoch, opt.niter + opt.niter_decay + 1):
        epoch_start_time = time.time()
        if epoch != start_epoch:
            epoch_iter = epoch_iter % dataset_size
        for i, item in tqdm(enumerate(dataset, start=epoch_iter)):

I suspect that the threads and memory of the previous iterators are not released after each __iter__() call on the data loader. The memory allocated per worker is close to the amount of memory held by the main thread/process at the time the workers are created. That is, in the initial epoch the main process uses 2 GB of memory, so two workers of roughly 2 GB each are created. In the next epoch, 5 GB of memory is allocated by the main process and two roughly 5 GB workers are constructed (num_workers is 2). I suspect the fork() call copies most of the parent's context into the new workers.
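The fork-inheritance suspicion above can be illustrated without PyTorch at all. This is a minimal, hypothetical sketch (none of these names come from the question): a child process created with the "fork" start method sees module-level state the parent set up before forking, even though that state was never passed to it explicitly. This is the same mechanism by which each DataLoader worker appears to carry the main process's footprint.

```python
import multiprocessing as mp

# State created by the "parent" before forking; never sent to the child.
PARENT_STATE = []


def child_reads(q):
    # The forked child inherits a snapshot of the parent's memory,
    # so it can see PARENT_STATE without receiving it as an argument.
    q.put(len(PARENT_STATE))


def demo():
    global PARENT_STATE
    ctx = mp.get_context("fork")  # DataLoader workers also fork by default on Linux
    PARENT_STATE = list(range(100_000))  # pretend this is the 2-5 GB of training state
    q = ctx.Queue()
    p = ctx.Process(target=child_reads, args=(q,))
    p.start()
    seen = q.get()
    p.join()
    return seen


n = demo()
print(n)  # the child saw all 100000 entries it was never given
```

Note that fork is copy-on-write, so the pages are shared until either side writes to them; however, CPython's reference counting touches object headers, which can turn much of that "shared" memory into real per-worker copies.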

The Activity Monitor screenshot below (omitted here) shows the processes created by Python; the ZMQbg/1 entries are Python-related processes.

The dataset used by the data loader consists of 100 sub-datasets; each __getitem__ call randomly selects one of them, ignoring the index. (The sub-datasets are AlignedDataset instances from the pix2pixHD GitHub repository.)
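For concreteness, the wrapper described above might look like the following. This is an illustrative sketch, not code from the question; the class name and seed handling are hypothetical, and the real sub-datasets are AlignedDataset instances rather than plain lists.

```python
import random


class RandomSubsetDataset:
    """Wraps several sub-datasets; __getitem__ ignores the index and samples randomly."""

    def __init__(self, sub_datasets, seed=None):
        self.sub_datasets = sub_datasets
        self.rng = random.Random(seed)

    def __len__(self):
        # Total length across all sub-datasets, so the DataLoader
        # still knows how many samples make up one epoch.
        return sum(len(d) for d in self.sub_datasets)

    def __getitem__(self, index):
        # The index is deliberately ignored, matching the question's description.
        sub = self.rng.choice(self.sub_datasets)
        return sub[self.rng.randrange(len(sub))]


ds = RandomSubsetDataset([[1, 2], [3, 4], [5, 6]], seed=0)
total = len(ds)
sample = ds[0]
print(total)   # 6
```

One consequence of ignoring the index: each worker draws its own random items, so samples can repeat within an epoch and shuffling at the DataLoader level has no effect.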


Answer 1:


torch.utils.data.DataLoader prefetches 2 * num_workers samples ahead of the training loop, so that data is always ready to send to the GPU/CPU. This could be the reason you see the memory increase.

https://pytorch.org/docs/stable/_modules/torch/utils/data/dataloader.html



Source: https://stackoverflow.com/questions/57250275/pytorch-dataloader-too-many-threads-too-much-cpu-memory-allocation
