large-files

Reading large csv files with strings containing commas as one field

半城伤御伤魂 posted on 2019-12-12 12:33:18

Question: I have a large .csv file (~26,000 rows) that I want to read into MATLAB. Another problem is that one of the fields contains a collection of strings delimited by commas, and I'm having trouble reading it. I tried things like tdfread, which won't work here. Are there any tricks with textscan I should be aware of? Is there any other way?

Answer 1: I'm not sure what is generating your CSV file, but that is your problem. The point of a CSV file is that the file itself designates separation of fields …
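The question concerns MATLAB, but for reference, here is a minimal sketch of the underlying idea in Python: a CSV reader that honors quoting returns a quoted, comma-containing field as a single value. The file name and column position are hypothetical.

    import csv

    # Minimal sketch: csv.reader honors double quotes, so a quoted field that
    # contains commas (e.g.  1,foo,"a, b, c",42) comes back as one value.
    with open("data.csv", newline="") as f:      # "data.csv" is a placeholder name
        reader = csv.reader(f)
        header = next(reader)                    # read the header row, if any
        for row in reader:
            print(row[2])                        # the comma-containing field, assumed to be the third column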

How to read line-delimited JSON from large file (line by line)

社会主义新天地 posted on 2019-12-12 08:20:20

Question: I'm trying to load a large file (2 GB in size) filled with JSON strings, delimited by newlines. Ex:

    { "key11": value11, "key12": value12, }
    { "key21": value21, "key22": value22, }
    …

The way I'm importing it now is:

    content = open(file_path, "r").read()
    j_content = json.loads("[" + content.replace("}\n{", "},\n{") + "]")

which seems like a hack (adding commas between each JSON string, plus opening and closing square brackets, to make it a proper list). Is there a better way to specify the …
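Assuming each JSON object really does occupy a single line (which is what line-delimited JSON means), one sketch of a streaming approach is to parse one line at a time instead of loading the whole 2 GB file; the path and processing function below are placeholders.

    import json

    def parse_ndjson(path):
        """Yield one parsed object per non-blank line of a newline-delimited JSON file."""
        with open(path, "r") as f:
            for line in f:
                line = line.strip()
                if line:
                    yield json.loads(line)

    # Objects are produced lazily, so the file never needs to fit in memory:
    # for obj in parse_ndjson("big.ndjson"):
    #     handle(obj)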

How can you concatenate two huge files with very little spare disk space? [closed]

99封情书 posted on 2019-12-12 07:31:29

Question: (Closed as off-topic; closed 7 years ago.) Suppose that you have two huge files (several GB) that you want to concatenate, but you have very little spare disk space (say, a couple hundred MB). That is, given file1 and file2, you want to end up with a single file which is the result of concatenating file1 and file2 together byte-for-byte.
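One well-known trick, sketched below in Python under the assumption that the filesystem supports sparse files and that truncation frees blocks immediately, is to copy file2 into its final position inside file1 back-to-front, truncating file2 after each chunk so peak extra disk usage stays around one chunk. This is illustrative only and not crash-safe.

    import os

    CHUNK = 64 * 1024 * 1024  # 64 MB working buffer; peak extra disk usage is roughly one chunk

    def concat_in_place(path1, path2):
        """Append path2 to path1 while releasing path2's blocks as we go (sketch only)."""
        len1 = os.path.getsize(path1)
        len2 = os.path.getsize(path2)
        with open(path1, "r+b") as f1, open(path2, "r+b") as f2:
            while len2 > 0:
                n = min(CHUNK, len2)
                f2.seek(len2 - n)
                data = f2.read(n)          # last n bytes still present in path2
                f1.seek(len1 + len2 - n)
                f1.write(data)             # write them at their final offset in path1
                f2.truncate(len2 - n)      # release those bytes from path2
                len2 -= n
        os.remove(path2)                   # path2 is now empty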

Reading a particular line by line number in a very large file

坚强是说给别人听的谎言 posted on 2019-12-12 05:02:44

Question: The file will not fit into memory. It is over 100 GB, and I want to access specific lines by line number without counting line by line until I reach it. I have read http://docstore.mik.ua/orelly/perl/cookbook/ch08_09.htm. When I build an index using the methods described there, line retrieval works up to a certain point, but once the line number is very large, the same line is returned no matter which line I request. It seems to work for line …
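One likely culprit, guessed from the symptom, is that the index stores offsets in a fixed 32-bit format, so offsets past 4 GB wrap around and many large line numbers map to the same position. A minimal sketch of a byte-offset index using 64-bit offsets, written in Python for illustration (the file name is a placeholder; for a 100 GB file you would persist the index to disk rather than keep it in RAM):

    from array import array

    def build_line_index(path):
        """Return an array of 64-bit byte offsets, one per line (0-based)."""
        offsets = array("Q")
        pos = 0
        with open(path, "rb") as f:
            for line in f:
                offsets.append(pos)
                pos += len(line)
        return offsets

    def read_line(path, offsets, lineno):
        """Fetch a single line by number without rescanning the whole file."""
        with open(path, "rb") as f:
            f.seek(offsets[lineno])
            return f.readline()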

Unable to Find and Replace a string in 2GB XML file using Powershell

你说的曾经没有我的故事 posted on 2019-12-12 04:52:41

Question: I am new to Windows PowerShell. I am trying to perform a find-and-replace on a string in four places, but even a simple find and replace throws the error "Exception of type 'System.OutOfMemoryException' was thrown." I used Get-Content. Is there a way to achieve this without exhausting memory? E.g. replace ".000000000Z" with ".732Z", where 732 will be the milliseconds at which the job is run. PSVersion: 3.0

Answer 1: The typical method is to use .NET methods to do it line by line. Assuming you've got …
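The question is about PowerShell, but the streaming idea the answer points at is language-agnostic: read one line, rewrite it, write it out, and never hold the 2 GB file in memory at once. A hedged Python sketch of that idea (file names and the millisecond source are placeholders):

    import datetime

    def stream_replace(src, dst):
        """Copy src to dst line by line, rewriting the timestamp suffix on the fly."""
        millis = f"{datetime.datetime.now().microsecond // 1000:03d}"   # e.g. "732"
        with open(src, "r", encoding="utf-8") as fin, open(dst, "w", encoding="utf-8") as fout:
            for line in fin:
                fout.write(line.replace(".000000000Z", f".{millis}Z"))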

Error while parsing a very large (10 GB) XML file in R, using the XML package

对着背影说爱祢 posted on 2019-12-12 03:49:54

Question: Context: I'm currently working on a project involving OSM (OpenStreetMap) data. In order to manipulate geographic objects, I have to convert the data (an OSM XML file) into an object. The osmar package lets me do this, but it fails to parse the raw XML data.

The error:

    Error in paste(file, collapse = "\n") : result would exceed 2^31-1 bytes

The code:

    require(osmar)
    osmar_obj <- get_osm("anything", source = osmsource_file("my filename"))

Inside the get_osm function, the code calls ret <- …
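The error arises because pasting the whole 10 GB document into a single string exceeds R's 2^31-1 byte limit for one string. Whatever the eventual fix on the R side, the general technique for files this size is a streaming parser; a sketch of that idea in Python (not R), with hypothetical file and tag names:

    import xml.etree.ElementTree as ET

    def count_elements(path, tag):
        """Stream through a huge XML file, counting elements with the given tag."""
        count = 0
        for _event, elem in ET.iterparse(path, events=("end",)):
            if elem.tag == tag:
                count += 1
            elem.clear()          # discard the processed element so memory stays bounded
        return count

    # e.g. count_elements("map.osm", "node")   # file and tag names are placeholders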

Perl read a large file for use with multi line regex

牧云@^-^@ posted on 2019-12-12 03:34:17

Question: I have a 4 GB text file with lines of highly variable length; this is only a sample file, and production files will be much larger. I need to read the file and apply a multi-line regex. What is the best way to read such a large file for the multi-line regex? If I read it line by line, I don't think my multi-line regex will work correctly. When I use the read function in its three-argument form, my regex results vary as I change the length I specify in the read statement. I believe that the file's …
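The question is about Perl, but one general way to let a multi-line regex see the whole file without reading 4 GB into RAM is to memory-map the file and let the regex engine walk the mapping. A Python sketch of that idea (the pattern and file name are hypothetical):

    import mmap
    import re

    pattern = re.compile(rb"BEGIN.*?END", re.DOTALL)   # hypothetical multi-line pattern

    with open("big.txt", "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for m in pattern.finditer(mm):
                print(m.start(), m.group()[:60])        # offset and a preview of each match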

Visual Studio 2008 Debugging large c file

删除回忆录丶 posted on 2019-12-12 02:07:24

Question: I am debugging a very large C file, approximately 70,000+ lines of code. The debugger is not functioning properly, although the code compiles correctly. Is there any flag or setting that needs to be set to debug this file? Edit: I moved the function from the bottom of the file to the top, and it is now debugged as expected; I don't know the reason.

Answer 1: The easiest solution is to split the file in two, keeping each file under 65535 lines. There is rarely a good reason to …

Git - remove commit of large file

帅比萌擦擦* posted on 2019-12-12 01:57:33

Question: I erroneously committed a big file (>100 MB) that I really didn't need in my git history. I removed the file, removed it from the git cache, and committed again. Despite this, when I try to push to my remote branch, git gives me a size error. I also tried a git rebase, but the commit is still there. What should I do?

    remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
    remote: error: Trace: …

Best approach to write huge sql dataset into xml file? [closed]

半城伤御伤魂 posted on 2019-12-12 00:57:23

Question: (Closed as too localized; closed 6 years ago.) In C#, I have a huge dataset that I want to write into an XML file with WriteXml. This is my code:

    using (var myConnection = new SqlConnection("Data …
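Regardless of language, the way to keep memory bounded is to stream rows from the data reader straight into the XML writer rather than filling a whole dataset first. A rough Python sketch of that shape (sqlite3 stands in for the real SQL Server connection; table, column, and file names are placeholders):

    import sqlite3
    from xml.sax.saxutils import escape

    def dump_table_to_xml(db_path, table, out_path):
        """Stream every row of a table into a simple XML file, one row at a time."""
        con = sqlite3.connect(db_path)
        cur = con.execute(f"SELECT * FROM {table}")       # table name is a placeholder
        cols = [d[0] for d in cur.description]
        with open(out_path, "w", encoding="utf-8") as out:
            out.write(f"<{table}>\n")
            for row in cur:                               # rows are fetched incrementally
                cells = "".join(
                    f"<{name}>{escape('' if value is None else str(value))}</{name}>"
                    for name, value in zip(cols, row)
                )
                out.write(f"  <row>{cells}</row>\n")
            out.write(f"</{table}>\n")
        con.close()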