large-files

Large File Download

生来就可爱ヽ(ⅴ<●) submitted on 2019-12-22 03:15:50
Question: Internet Explorer has a file download limit of 4 GB (2 GB on IE6); Firefox does not have this problem (I haven't tested Safari yet). More info here: http://support.microsoft.com/kb/298618. I am working on a site that will allow the user to download very large files (up to and exceeding 100 GB). What is the best way to do this without using FTP? The end user must be able to download the file from their browser using HTTP. I don't think Flash or Silverlight can save files to the client, so as far as
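
Answers to this kind of question usually come down to serving the file over HTTP with Range support so a browser or download manager can resume. Purely as a client-side illustration (a minimal sketch, assuming the requests library and a hypothetical URL), a resumable, streamed download looks like this:

    import os
    import requests

    def download(url, dest, chunk_size=1024 * 1024):
        # Resume from wherever a previous attempt stopped.
        start = os.path.getsize(dest) if os.path.exists(dest) else 0
        headers = {"Range": "bytes=%d-" % start} if start else {}
        with requests.get(url, headers=headers, stream=True, timeout=60) as r:
            r.raise_for_status()
            with open(dest, "ab" if start else "wb") as f:
                for chunk in r.iter_content(chunk_size):
                    f.write(chunk)

    download("https://example.com/big.iso", "big.iso")  # hypothetical URL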

Efficient way to aggregate and remove duplicates from very large (password) lists

倾然丶 夕夏残阳落幕 submitted on 2019-12-22 01:36:15
Question: Context: I am attempting to combine a large number of separate password-list text files into a single file for use in dictionary-based password cracking. Each text file is line-delimited (a single password per line) and there are 82 separate files at the moment. Most (66) files are in the 1-100 MB size range, 12 are 100-700 MB, 3 are 2 GB, and 1 (the most problematic) is 11.2 GB. In total I estimate 1.75 billion non-unique passwords need processing; of these I estimate ~450 million (25%) will
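
One memory-bounded way to deduplicate input that large (a sketch, not taken from the post; bucket count, encoding, and file names are all assumptions) is to hash each line into one of many bucket files, then deduplicate each bucket on its own, since identical passwords always land in the same bucket:

    import glob
    import hashlib

    N_BUCKETS = 256  # assumed; pick so that each bucket fits comfortably in RAM

    def partition(paths):
        # Identical passwords hash to the same bucket, so deduplication
        # can later be done one bucket at a time.
        buckets = [open("bucket_%03d.txt" % i, "w", encoding="latin-1")
                   for i in range(N_BUCKETS)]
        try:
            for path in paths:
                with open(path, encoding="latin-1") as f:
                    for line in f:
                        pw = line.rstrip("\r\n")
                        if pw:
                            h = int(hashlib.md5(pw.encode("latin-1")).hexdigest(), 16)
                            buckets[h % N_BUCKETS].write(pw + "\n")
        finally:
            for b in buckets:
                b.close()

    def merge_unique(out_path):
        with open(out_path, "w", encoding="latin-1") as out:
            for name in sorted(glob.glob("bucket_*.txt")):
                with open(name, encoding="latin-1") as f:
                    for pw in set(f.read().splitlines()):  # one bucket fits in memory
                        out.write(pw + "\n")

    partition(glob.glob("wordlists/*.txt"))  # hypothetical input location
    merge_unique("combined_unique.txt")

GNU sort -u is the usual off-the-shelf alternative; it spills to temporary files on disk instead of hash buckets.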

C# serialize large array to disk

孤街浪徒 submitted on 2019-12-21 20:35:07
Question: I have a very large graph stored in a single-dimensional array (about 1.1 GB), which I am able to hold in memory on my machine running Windows XP with 2 GB of RAM and 2 GB of virtual memory. I can generate the entire data set in memory, but when I try to serialize it to disk using the BinaryFormatter, the file gets to about 50 MB and then I get an OutOfMemoryException. The code I am using to write this is the same I use for all of my smaller problems:
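
The general remedy is to stream the array to disk in fixed-size pieces rather than letting a formatter build one giant serialized buffer. The asker's code is C#; purely to illustrate the chunked-write idea, here is a sketch in Python (array size and dtype are made up):

    import numpy as np

    def save_in_chunks(arr, path, chunk_elems=16 * 1024 * 1024):
        # Write the array a slice at a time so no full-size serialized
        # copy ever has to exist in memory.
        with open(path, "wb") as f:
            for start in range(0, arr.size, chunk_elems):
                arr[start:start + chunk_elems].tofile(f)

    def load(path, dtype):
        # Map the file instead of reading it all back into RAM.
        return np.memmap(path, dtype=dtype, mode="r")

    graph = np.zeros(140 * 1000 * 1000, dtype=np.float64)  # ~1.1 GB, assumed dtype
    save_in_chunks(graph, "graph.bin")
    view = load("graph.bin", np.float64)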

Reading and graphing data read from huge files

对着背影说爱祢 submitted on 2019-12-21 17:26:42
Question: We have pretty large files, on the order of 1-1.5 GB combined (mostly log files), with raw data that is easily parsed into a CSV, which is then supposed to be graphed to generate a set of graph images. Currently we are using bash scripts to turn the raw data into a CSV file containing just the numbers that need to be graphed, and then feeding it into a gnuplot script. But this process is extremely slow. I tried to speed up the bash scripts by replacing some piped cut, tr, etc. invocations with a
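
A common replacement for the bash + gnuplot pipeline is to parse and plot in a single Python pass. A minimal sketch, assuming whitespace-separated "timestamp value" lines and hypothetical file names:

    import pandas as pd
    import matplotlib
    matplotlib.use("Agg")            # render straight to image files
    import matplotlib.pyplot as plt

    # Parse only the columns that will actually be plotted.
    df = pd.read_csv("app.log", sep=r"\s+", header=None,
                     names=["ts", "value"], usecols=[0, 1])
    df["ts"] = pd.to_datetime(df["ts"], unit="s")

    fig, ax = plt.subplots(figsize=(12, 4))
    ax.plot(df["ts"], df["value"], linewidth=0.5)
    ax.set_xlabel("time")
    ax.set_ylabel("value")
    fig.savefig("values.png", dpi=120)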

off_t without -D_FILE_OFFSET_BITS=64 on a file > 2GB

喜你入骨 submitted on 2019-12-21 05:16:13
Question: 1. I'm wondering what the problem would be if I tried to read a file larger than 2 GB using off_t and the second function on this page, without compiling my program with the option -D_FILE_OFFSET_BITS=64. Would it segfault? 2. I'm planning to use this implementation with off64_t and #define _LARGEFILE64_SOURCE 1 #define _FILE_OFFSET_BITS 64. Would there be any problem? Answer 1: stat() will fail, with errno set to EOVERFLOW in that case. Here's what the Linux man page says: EOVERFLOW stat

Reading large text files with Pandas [duplicate]

孤者浪人 submitted on 2019-12-21 05:07:09
Question: This question already has answers here: How to read a 6 GB csv file with pandas (13 answers). Closed last year. I have been trying to read a few large text files (sizes around 1.4 GB - 2 GB) with Pandas using the read_csv function, to no avail. These are the versions I am using: Python 2.7.6 Anaconda 1.9.2 (64-bit) (default, Nov 11 2013, 10:49:15) [MSC v.1500 64 bit (AMD64)] IPython 1.1.0 Pandas 0.13.1 I tried the following: df = pd.read_csv('data.txt') and it crashed IPython with a message:
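
The standard workaround (a sketch with a hypothetical file and column names) is to pass chunksize to read_csv and aggregate each piece as it arrives, so the full DataFrame never has to fit in memory:

    import pandas as pd

    totals = None
    for chunk in pd.read_csv("data.txt", chunksize=1000000):
        # Reduce each ~1M-row piece; only the running result stays resident.
        part = chunk.groupby("key")["value"].sum()   # "key"/"value" are assumed columns
        totals = part if totals is None else totals.add(part, fill_value=0)

    print(totals.head())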

Python: slicing a very large binary file

浪子不回头ぞ submitted on 2019-12-21 04:04:27
Question: Say I have a binary file of 12 GB and I want to slice 8 GB out of the middle of it. I know the position indices I want to cut between. How do I do this? Obviously 12 GB won't fit into memory, and that's fine, but 8 GB won't either... which I thought was fine, but binary data doesn't seem to like being handled in chunks! I was appending 10 MB at a time to a new binary file, and there are discontinuities at the edges of each 10 MB chunk in the new file. Is there a Pythonic way of doing this easily
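
For reference, a chunked binary copy that seeks to the start offset and streams the slice into a new file (file names and offsets below are made up) looks like this:

    def slice_file(src, dst, start, end, chunk=10 * 1024 * 1024):
        """Copy bytes [start, end) of src into dst, 10 MB at a time."""
        with open(src, "rb") as fin, open(dst, "wb") as fout:
            fin.seek(start)
            remaining = end - start
            while remaining > 0:
                data = fin.read(min(chunk, remaining))
                if not data:          # reached EOF early
                    break
                fout.write(data)
                remaining -= len(data)

    # hypothetical offsets: cut 8 GB out of the middle of a 12 GB file
    slice_file("big.bin", "middle.bin", 2 * 1024**3, 10 * 1024**3)

Both files are opened in binary mode ("rb"/"wb"); on Windows, text-mode newline translation is a common cause of corrupted binary output.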

Large file not flushed to disk immediately after calling close()?

余生长醉 submitted on 2019-12-21 03:36:05
Question: I'm creating large files with my Python script (more than 1 GB each; actually there are 8 of them). Right after I create them I have to start a process that will use those files. The script looks like:

    import os
    import subprocess
    import time

    # This is a more complex function, but it basically does this:
    def use_file():
        subprocess.call(['C:\\use_file', 'C:\\foo.txt'])

    f = open('C:\\foo.txt', 'wb')
    for i in range(10000):
        f.write(one_MB_chunk)
    f.flush()
    os.fsync(f.fileno())
    f.close()

    time.sleep(5)  # With this line added it just works fine

    t =
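
As a point of comparison, here is a sketch (assuming the writer and the consumer live in the same script, with a hypothetical stand-in chunk) that writes, fsyncs, and closes the file, checks its on-disk size, and only then launches the consumer:

    import os
    import subprocess

    ONE_MB = b'\0' * 1024 * 1024          # stand-in for the real 1 MB chunk
    EXPECTED = 10000 * len(ONE_MB)        # bytes we intend to write

    def write_then_use(path):
        with open(path, 'wb') as f:
            for _ in range(10000):
                f.write(ONE_MB)
            f.flush()                     # push Python's buffer to the OS
            os.fsync(f.fileno())          # ask the OS to push it to the disk
        # The file is closed here; verify what the filesystem reports before handing it off.
        assert os.path.getsize(path) == EXPECTED
        subprocess.call(['C:\\use_file', path])

    write_then_use('C:\\foo.txt')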

How to read millions of rows from the text file and insert into table quickly

夙愿已清 submitted on 2019-12-20 00:50:12
Question: I have gone through the "Insert 2 million rows into SQL Server quickly" link and found that I can do this by using bulk insert. So I am trying to create the DataTable (code below), but as this is a huge file (more than 300K rows) I am getting an OutOfMemoryException in my code:

    string line;
    DataTable data = new DataTable();
    string[] columns = null;
    bool isInserted = false;

    using (TextReader tr = new StreamReader(_fileName, Encoding.Default))
    {
        if (columns == null)
        {
            line = tr.ReadLine();
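
The usual fix is to build and flush the rows in batches instead of accumulating the whole file in one DataTable. The question is C#; purely to illustrate the batching idea, here is a Python sketch with pyodbc (connection string, table, column count, and delimiter are all assumptions):

    import pyodbc

    BATCH = 10000                                   # rows per round-trip keeps memory flat

    conn = pyodbc.connect("DSN=mydb")               # assumed connection
    cur = conn.cursor()
    cur.fast_executemany = True

    batch = []
    with open("rows.txt", encoding="utf-8") as f:
        f.readline()                                # skip the header line
        for line in f:
            batch.append(line.rstrip("\n").split("\t"))   # assumed tab-delimited, 3 columns
            if len(batch) >= BATCH:
                cur.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", batch)
                batch = []
        if batch:
            cur.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", batch)
    conn.commit()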