How to open a huge Excel file efficiently

佛祖请我去吃肉 2021-01-30 21:29

I have a 150 MB single-sheet Excel file that takes about 7 minutes to open on a very powerful machine using the following:

# using python
import xlrd
wb = xlrd.open_         


        
11 Answers
  •  情话喂你
    2021-01-30 21:47

    The C# and OLE solutions still have some bottlenecks, so I tested it with C++ and ADO.

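    // makeConnStr and sqlSelectSheet are the author's own helpers (not shown here);
    // pRec, num, str and cellCount are declared elsewhere.
    _bstr_t connStr(makeConnStr(excelFile, header).c_str());

    // Create the recordset and open the selected sheet through a SQL SELECT.
    TESTHR(pRec.CreateInstance(__uuidof(Recordset)));
    TESTHR(pRec->Open(sqlSelectSheet(connStr, sheetIndex).c_str(), connStr, adOpenStatic, adLockOptimistic, adCmdText));

    // The column count does not change while iterating, so fetch it once.
    const long fieldCount = pRec->Fields->GetCount();

    // Walk every row; adoEOF is the usual rename of Recordset::EOF in the ADO #import.
    while(!pRec->adoEOF)
    {
        for(long i = 0; i < fieldCount; ++i)
        {
            _variant_t v = pRec->Fields->GetItem(i)->Value;
            if(v.vt == VT_R8)          // numeric cell
                num[i] = v.dblVal;
            else if(v.vt == VT_BSTR)   // text cell
                str[i] = v.bstrVal;
            ++cellCount;
        }
        pRec->MoveNext();
    }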

    On a machine with an i5-4460 and an HDD, I found that reading 500,000 cells from an .xls file takes 1.5 s, while the same data in .xlsx takes 2.829 s, so it should be possible to process your data in under 30 s.
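
The estimate above is simple throughput arithmetic. As a rough sketch, assuming read time scales linearly with the number of cells (an assumption, not something measured in this answer), the two timings can be turned into cells per second and into the number of cells that would fit in a 30 s budget. The function names here are made up for illustration.

```python
def cells_per_second(cells: int, seconds: float) -> float:
    """Throughput implied by one timing measurement."""
    return cells / seconds


def cells_within_budget(cells: int, seconds: float, budget_s: float) -> float:
    """Cells readable in `budget_s` seconds, assuming linear scaling (an assumption)."""
    return cells_per_second(cells, seconds) * budget_s


# Timings quoted in the answer: 500,000 cells in 1.5 s (.xls) and 2.829 s (.xlsx).
XLS_BUDGET = cells_within_budget(500_000, 1.5, 30.0)
XLSX_BUDGET = cells_within_budget(500_000, 2.829, 30.0)

if __name__ == "__main__":
    print(f".xls : about {XLS_BUDGET:,.0f} cells in 30 s")
    print(f".xlsx: about {XLSX_BUDGET:,.0f} cells in 30 s")
```

At those rates the 30 s budget works out to roughly 10 million cells for .xls and about 5.3 million for .xlsx. Whether that covers the 150 MB file depends on its cell count, which the question does not give.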

    If you really need to get under 30 s, use a RAM drive to reduce file I/O; it should significantly speed up your processing. I cannot download your data to test it, so please tell me the result.
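
One way to try the RAM drive suggestion from the question's Python side is to copy the workbook onto the drive first and compare raw read times. This is only a sketch: `ram_dir` is a hypothetical path that is assumed to already be a RAM-backed mount (creating the drive itself is outside the script), and the helper names are invented for illustration. It measures plain file I/O only, not spreadsheet parsing.

```python
import shutil
import time
from pathlib import Path


def stage_on_ram_drive(source: Path, ram_dir: Path) -> Path:
    """Copy `source` into `ram_dir` and return the path of the copy.

    `ram_dir` is assumed to sit on a RAM-backed drive; this function
    does not create or verify the drive, it only copies the file.
    """
    ram_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(source, ram_dir / source.name))


def time_full_read(path: Path, chunk_size: int = 1 << 20) -> float:
    """Return the seconds needed to read the whole file in 1 MiB chunks."""
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while handle.read(chunk_size):
            pass
    return time.perf_counter() - start
```

Calling `time_full_read` on the original file and on the staged copy gives a quick indication of how much of the total time is disk I/O. If the two numbers are close, the bottleneck is more likely the parsing than the drive.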
