large-files

Large File Download

生来就可爱ヽ(ⅴ<●) submitted on 2019-12-22 03:15:50
Question: Internet Explorer has a file download limit of 4 GB (2 GB on IE6); Firefox does not have this problem (I haven't tested Safari yet). More info here: http://support.microsoft.com/kb/298618. I am working on a site that will allow the user to download very large files (up to and exceeding 100 GB). What is the best way to do this without using FTP? The end user must be able to download the file from their browser using HTTP. I don't think Flash or Silverlight can save files to the client, so as far as
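
Answers to this kind of question usually come down to serving the file over HTTP with Range support so a browser or download manager can resume. Purely as a client-side illustration (a minimal sketch, assuming the requests library and a hypothetical URL), a resumable, streamed download looks like this:

    import os
    import requests

    def download(url, dest, chunk_size=1024 * 1024):
        # Resume from wherever a previous attempt stopped.
        start = os.path.getsize(dest) if os.path.exists(dest) else 0
        headers = {"Range": "bytes=%d-" % start} if start else {}
        with requests.get(url, headers=headers, stream=True, timeout=60) as r:
            r.raise_for_status()
            with open(dest, "ab" if start else "wb") as f:
                for chunk in r.iter_content(chunk_size):
                    f.write(chunk)

    download("https://example.com/big.iso", "big.iso")  # hypothetical URL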

Efficient way to aggregate and remove duplicates from very large (password) lists

倾然丶 夕夏残阳落幕 submitted on 2019-12-22 01:36:15
Question: Context: I am attempting to combine a large number of separate password-list text files into a single file for use in dictionary-based password cracking. Each text file is line-delimited (a single password per line) and there are 82 separate files at the moment. Most (66) files are in the 1-100 MB size range, 12 are 100-700 MB, 3 are 2 GB, and 1 (the most problematic) is 11.2 GB. In total I estimate 1.75 billion non-unique passwords need processing; of these I estimate ~450 million (25%) will
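
One memory-bounded way to deduplicate input that large (a sketch, not taken from the post; bucket count, encoding, and file names are all assumptions) is to hash each line into one of many bucket files, then deduplicate each bucket on its own, since identical passwords always land in the same bucket:

    import glob
    import hashlib

    N_BUCKETS = 256  # assumed; pick so that each bucket fits comfortably in RAM

    def partition(paths):
        # Identical passwords hash to the same bucket, so deduplication
        # can later be done one bucket at a time.
        buckets = [open("bucket_%03d.txt" % i, "w", encoding="latin-1")
                   for i in range(N_BUCKETS)]
        try:
            for path in paths:
                with open(path, encoding="latin-1") as f:
                    for line in f:
                        pw = line.rstrip("\r\n")
                        if pw:
                            h = int(hashlib.md5(pw.encode("latin-1")).hexdigest(), 16)
                            buckets[h % N_BUCKETS].write(pw + "\n")
        finally:
            for b in buckets:
                b.close()

    def merge_unique(out_path):
        with open(out_path, "w", encoding="latin-1") as out:
            for name in sorted(glob.glob("bucket_*.txt")):
                with open(name, encoding="latin-1") as f:
                    for pw in set(f.read().splitlines()):  # one bucket fits in memory
                        out.write(pw + "\n")

    partition(glob.glob("wordlists/*.txt"))  # hypothetical input location
    merge_unique("combined_unique.txt")

GNU sort -u is the usual off-the-shelf alternative; it spills to temporary files on disk instead of hash buckets.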

C# serialize large array to disk

孤街浪徒 submitted on 2019-12-21 20:35:07
Question: I have a very large graph stored in a single-dimensional array (about 1.1 GB), which I am able to hold in memory on my machine running Windows XP with 2 GB of RAM and 2 GB of virtual memory. I can generate the entire data set in memory, but when I try to serialize it to disk using the BinaryFormatter, the file gets to about 50 MB and then I get an OutOfMemoryException. The code I am using to write this is the same I use for all of my smaller problems:
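
The general remedy is to stream the array to disk in fixed-size pieces rather than letting a formatter build one giant serialized buffer. The asker's code is C#; purely to illustrate the chunked-write idea, here is a sketch in Python (array size and dtype are made up):

    import numpy as np

    def save_in_chunks(arr, path, chunk_elems=16 * 1024 * 1024):
        # Write the array a slice at a time so no full-size serialized
        # copy ever has to exist in memory.
        with open(path, "wb") as f:
            for start in range(0, arr.size, chunk_elems):
                arr[start:start + chunk_elems].tofile(f)

    def load(path, dtype):
        # Map the file instead of reading it all back into RAM.
        return np.memmap(path, dtype=dtype, mode="r")

    graph = np.zeros(140 * 1000 * 1000, dtype=np.float64)  # ~1.1 GB, assumed dtype
    save_in_chunks(graph, "graph.bin")
    view = load("graph.bin", np.float64)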

Reading and graphing data read from huge files

对着背影说爱祢 submitted on 2019-12-21 17:26:42
Question: We have pretty large files, on the order of 1-1.5 GB combined (mostly log files), with raw data that is easily parsed into a CSV, which is then supposed to be graphed to generate a set of graph images. Currently we are using bash scripts to turn the raw data into a CSV file containing just the numbers that need to be graphed, and then feeding it into a gnuplot script. But this process is extremely slow. I tried to speed up the bash scripts by replacing some piped cut, tr, etc. invocations with a
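
A common replacement for the bash + gnuplot pipeline is to parse and plot in a single Python pass. A minimal sketch, assuming whitespace-separated "timestamp value" lines and hypothetical file names:

    import pandas as pd
    import matplotlib
    matplotlib.use("Agg")            # render straight to image files
    import matplotlib.pyplot as plt

    # Parse only the columns that will actually be plotted.
    df = pd.read_csv("app.log", sep=r"\s+", header=None,
                     names=["ts", "value"], usecols=[0, 1])
    df["ts"] = pd.to_datetime(df["ts"], unit="s")

    fig, ax = plt.subplots(figsize=(12, 4))
    ax.plot(df["ts"], df["value"], linewidth=0.5)
    ax.set_xlabel("time")
    ax.set_ylabel("value")
    fig.savefig("values.png", dpi=120)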

off_t without -D_FILE_OFFSET_BITS=64 on a file > 2GB

喜你入骨 submitted on 2019-12-21 05:16:13
Question: 1. I'm wondering what the problem would be if I tried to read a file larger than 2 GB using off_t and the second function on this page, without compiling my program with the option -D_FILE_OFFSET_BITS=64. Would it segfault? 2. I'm planning to use this implementation with off64_t and #define _LARGEFILE64_SOURCE 1 #define _FILE_OFFSET_BITS 64. Would there be any problem? Answer 1: stat() will fail, with errno set to EOVERFLOW in that case. Here's what the Linux man page says: EOVERFLOW stat

Reading large text files with Pandas [duplicate]

孤者浪人 submitted on 2019-12-21 05:07:09
Question: This question already has answers here: How to read a 6 GB csv file with pandas (13 answers). Closed last year. I have been trying to read a few large text files (sizes around 1.4 GB - 2 GB) with Pandas using the read_csv function, to no avail. These are the versions I am using: Python 2.7.6 Anaconda 1.9.2 (64-bit) (default, Nov 11 2013, 10:49:15) [MSC v.1500 64 bit (AMD64)] IPython 1.1.0 Pandas 0.13.1 I tried the following: df = pd.read_csv('data.txt') and it crashed IPython with a message:
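
The standard workaround (a sketch with a hypothetical file and column names) is to pass chunksize to read_csv and aggregate each piece as it arrives, so the full DataFrame never has to fit in memory:

    import pandas as pd

    totals = None
    for chunk in pd.read_csv("data.txt", chunksize=1000000):
        # Reduce each ~1M-row piece; only the running result stays resident.
        part = chunk.groupby("key")["value"].sum()   # "key"/"value" are assumed columns
        totals = part if totals is None else totals.add(part, fill_value=0)

    print(totals.head())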

Python: slicing a very large binary file

浪子不回头ぞ submitted on 2019-12-21 04:04:27
Question: Say I have a binary file of 12 GB and I want to slice 8 GB out of the middle of it. I know the position indices I want to cut between. How do I do this? Obviously 12 GB won't fit into memory, and that's fine, but 8 GB won't either... which I thought was fine, but binary data doesn't seem to like being handled in chunks! I was appending 10 MB at a time to a new binary file, and there are discontinuities at the edges of each 10 MB chunk in the new file. Is there a Pythonic way of doing this easily
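
For reference, a chunked binary copy that seeks to the start offset and streams the slice into a new file (file names and offsets below are made up) looks like this:

    def slice_file(src, dst, start, end, chunk=10 * 1024 * 1024):
        """Copy bytes [start, end) of src into dst, 10 MB at a time."""
        with open(src, "rb") as fin, open(dst, "wb") as fout:
            fin.seek(start)
            remaining = end - start
            while remaining > 0:
                data = fin.read(min(chunk, remaining))
                if not data:          # reached EOF early
                    break
                fout.write(data)
                remaining -= len(data)

    # hypothetical offsets: cut 8 GB out of the middle of a 12 GB file
    slice_file("big.bin", "middle.bin", 2 * 1024**3, 10 * 1024**3)

Both files are opened in binary mode ("rb"/"wb"); on Windows, text-mode newline translation is a common cause of corrupted binary output.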

Large file not flushed to disk immediately after calling close()?

余生长醉 submitted on 2019-12-21 03:36:05
Question: I'm creating large files with my Python script (more than 1 GB each; actually there are 8 of them). Right after I create them I have to start a process that will use those files. The script looks like:

    import os
    import subprocess
    import time

    # This is a more complex function, but it basically does this:
    def use_file():
        subprocess.call(['C:\\use_file', 'C:\\foo.txt'])

    f = open('C:\\foo.txt', 'wb')
    for i in range(10000):
        f.write(one_MB_chunk)
    f.flush()
    os.fsync(f.fileno())
    f.close()

    time.sleep(5)  # With this line added it just works fine

    t =
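
As a point of comparison, here is a sketch (assuming the writer and the consumer live in the same script, with a hypothetical stand-in chunk) that writes, fsyncs, and closes the file, checks its on-disk size, and only then launches the consumer:

    import os
    import subprocess

    ONE_MB = b'\0' * 1024 * 1024          # stand-in for the real 1 MB chunk
    EXPECTED = 10000 * len(ONE_MB)        # bytes we intend to write

    def write_then_use(path):
        with open(path, 'wb') as f:
            for _ in range(10000):
                f.write(ONE_MB)
            f.flush()                     # push Python's buffer to the OS
            os.fsync(f.fileno())          # ask the OS to push it to the disk
        # The file is closed here; verify what the filesystem reports before handing it off.
        assert os.path.getsize(path) == EXPECTED
        subprocess.call(['C:\\use_file', path])

    write_then_use('C:\\foo.txt')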

How to read millions of rows from the text file and insert into table quickly

夙愿已清 submitted on 2019-12-20 00:50:12
Question: I have gone through the "Insert 2 million rows into SQL Server quickly" link and found that I can do this by using bulk insert. So I am trying to create the DataTable (code below), but as this is a huge file (more than 300K rows) I am getting an OutOfMemoryException in my code:

    string line;
    DataTable data = new DataTable();
    string[] columns = null;
    bool isInserted = false;

    using (TextReader tr = new StreamReader(_fileName, Encoding.Default))
    {
        if (columns == null)
        {
            line = tr.ReadLine();
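
The usual fix is to build and flush the rows in batches instead of accumulating the whole file in one DataTable. The question is C#; purely to illustrate the batching idea, here is a Python sketch with pyodbc (connection string, table, column count, and delimiter are all assumptions):

    import pyodbc

    BATCH = 10000                                   # rows per round-trip keeps memory flat

    conn = pyodbc.connect("DSN=mydb")               # assumed connection
    cur = conn.cursor()
    cur.fast_executemany = True

    batch = []
    with open("rows.txt", encoding="utf-8") as f:
        f.readline()                                # skip the header line
        for line in f:
            batch.append(line.rstrip("\n").split("\t"))   # assumed tab-delimited, 3 columns
            if len(batch) >= BATCH:
                cur.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", batch)
                batch = []
        if batch:
            cur.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", batch)
    conn.commit()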