read.csv

Read csv file in R with currency column as numeric

☆樱花仙子☆ submitted on 2019-11-27 08:07:25
I'm trying to read into R a csv file that contains information on political contributions. From what I understand, the columns are imported as factors by default, but I need the amount column ('CTRIB_AMT' in the dataset) to be imported as a numeric column so I can run a variety of functions that wouldn't work for factors. The column is formatted as currency with a "$" prefix. I used a simple read command to import the file initially: contribs <- read.csv('path/to/file') And then tried to convert CTRIB_AMT from currency to numeric: as.numeric(as.character(sub("$","",contribs$CTRIB
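A minimal sketch of one way to do the conversion, using a small inline sample in place of the real file (the names and amounts are made up). In a regular expression, `$` is the end-of-string anchor, so `sub("$", "", x)` removes nothing; placing it inside a character class matches the literal symbol. In R 4.0.0 and later, read.csv no longer converts strings to factors by default, so `stringsAsFactors = FALSE` only matters on older versions.

```r
# Hypothetical sample standing in for the contributions file
csv_text <- 'NAME,CTRIB_AMT
"Smith","$1,250.00"
"Jones","$75.50"'

contribs <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# "[$,]" matches a literal dollar sign or a thousands separator;
# gsub() removes every occurrence, not just the first
contribs$CTRIB_AMT <- as.numeric(gsub("[$,]", "", contribs$CTRIB_AMT))
```

If the column was already read as a factor, wrapping it in as.character() before gsub() keeps the original text rather than the factor's internal codes.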

How can I read the header but also skip lines - read.table()?

≡放荡痞女 submitted on 2019-11-27 07:49:04
Data.txt:

    Index;Time;
    1;2345;
    2;1423;
    3;5123;

The code:

    dat <- read.table('data.txt', skip = 1, nrows = 2, header = TRUE, sep = ';')

The result:

      X1 X2345
    1  2  1423
    2  3  5123

I expect the header to be Index and Time, as follows:

      Index Time
    1     2 1423
    2     3 5123

How do I do that?

Beasterfield: I am afraid that there is no direct way to achieve this. Either you read the entire table and afterwards remove the lines you don't want, or you read in the table twice and assign the header later:

    header <- read.table('data.txt', nrows = 1, header = FALSE, sep = ';', stringsAsFactors = FALSE)
    dat <- read.table(
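The excerpt is cut off mid-call. As a hedged sketch of the two-pass idea the answer describes (not necessarily the answer's exact code), the header can be read on its own and attached after the data rows are read. The file contents are inlined via `text =` so the example runs without Data.txt; the `col_names` and `keep` helpers are introduced here for illustration.

```r
# Inline copy of the sample data from the question (trailing ";" included)
txt <- "Index;Time;\n1;2345;\n2;1423;\n3;5123;"

# Pass 1: read only the header line, as ordinary data
header <- read.table(text = txt, nrows = 1, header = FALSE, sep = ";",
                     stringsAsFactors = FALSE)

# Pass 2: skip the header line and the first data row, then read two rows
dat <- read.table(text = txt, skip = 2, nrows = 2, header = FALSE, sep = ";")

# The trailing ";" produces an empty extra column; keep only named columns
col_names <- unlist(header[1, ], use.names = FALSE)
keep <- !is.na(col_names) & nzchar(col_names)
dat <- dat[, keep, drop = FALSE]
names(dat) <- col_names[keep]
```

With a real file, `text = txt` would be replaced by the file path in both calls.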

Imported a csv-dataset to R but the values becomes factors

折月煮酒 submitted on 2019-11-27 00:48:30
I am very new to R and I am having trouble accessing a dataset I've imported. I'm using RStudio and used the Import Dataset function when importing my csv-file and pasted the line from the console-window to the source-window. The code looks as follows: setwd("c:/kalle/R") stuckey <- read.csv("C:/kalle/R/stuckey.csv") point <- stuckey$PTS time <- stuckey$MP However, the data isn't integer or numeric as I am used to but factors so when I try to plot the variables I only get histograms, not the usual plot. When checking the data it seems to be in order, just that I'm unable to use it since it's
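The excerpt ends before a cause is identified. One common reason, assumed here, is that a column contains at least one non-numeric entry, so read.csv treats the whole column as text (and, before R 4.0.0, as a factor). The sketch below uses made-up values with "-" as a hypothetical missing-value marker and shows why calling as.numeric() directly on a factor is misleading.

```r
# Hypothetical data: "-" marks a missing value, so PTS is not read as numeric
csv_text <- "PTS,MP\n12,30\n-,25\n8,18"
stuckey <- read.csv(text = csv_text, stringsAsFactors = TRUE)
class(stuckey$PTS)                      # "factor"

# Converting the factor directly returns internal level codes, not the values
codes <- as.numeric(stuckey$PTS)

# Going through character first recovers the values; "-" becomes NA
point <- suppressWarnings(as.numeric(as.character(stuckey$PTS)))

# Alternatively, declare the marker as missing when reading the file
stuckey2 <- read.csv(text = csv_text, na.strings = "-")
```

Running str() on the imported data frame is a quick way to see which columns were not read as numbers.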

Why am I getting X. in my column names when reading a data frame?

筅森魡賤 submitted on 2019-11-26 22:22:14
I asked a question about this a few months back, and I thought the answer had solved my problem, but I ran into the problem again and the solution didn't work for me. I'm importing a CSV: orders <- read.csv("<file_location>", sep=",", header=TRUE, check.names = FALSE) Here's the structure of the data frame: str(orders) 'data.frame': 3331575 obs. of 2 variables: $ OrderID : num -2034590217 -2034590216 -2031892773 -2031892767 -2021008573 ... $ OrderDate: Factor w/ 402 levels "2010-10-01","2010-10-04",..: 263 263 269 268 301 300 300 300 300 300 ... If I run the length command on the first column,
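The excerpt stops before the actual symptom is shown, so the following only illustrates where an `X` prefix in column names usually comes from. By default read.csv passes header names through make.names(), which replaces invalid characters with "." and prepends "X" when a name does not start with a letter. The header text below is made up.

```r
# Hypothetical header with a space and a name starting with a digit
csv_text <- "Order ID,1st Date\n1,2010-10-01"

mangled <- read.csv(text = csv_text)                        # check.names = TRUE (default)
as_is   <- read.csv(text = csv_text, check.names = FALSE)   # names kept verbatim

names(mangled)   # "Order.ID"  "X1st.Date"
names(as_is)     # "Order ID"  "1st Date"
```

Names kept verbatim need backticks or `[[ ]]` to be accessed, for example ``as_is$`Order ID` ``.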

read.csv blank fields to NA

断了今生、忘了曾经 submitted on 2019-11-26 20:23:17
Question: I have a tab-delimited text file named 'a.txt'. The D column is empty.

    A   B   C   D
    10  20  NaN
    30      40
    40  30  20
    20  NA  20

I want the data frame to look and behave exactly like the text file, with a blank in the 2nd row, 2nd column. Unfortunately, read.csv converts all the blanks and NA to "NA". I want to read NA and NaN as characters.

    b <- read.csv("a.txt", sep="\t", skip=0, header=TRUE, comment.char="", check.names=FALSE, quote="")

To summarize: I want to replicate the same
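One possible approach, sketched here on a cut-down three-column version of the data (an assumption, since the excerpt is truncated): read every column as character and set `na.strings` to a value that never occurs in the file, so that neither blanks nor the literal text NA are turned into missing values. The sentinel string is arbitrary and purely illustrative.

```r
# Tab-delimited sample with a blank field, a literal "NA", and a "NaN"
txt <- "A\tB\tC\n10\t20\tNaN\n30\t\t40\n20\tNA\t20"

b <- read.csv(text = txt, sep = "\t", header = TRUE,
              colClasses = "character",          # keep every field as text
              na.strings = "<no-such-value>",    # hypothetical sentinel, never matches
              comment.char = "", check.names = FALSE, quote = "")
```

The trade-off is that every column stays character, so numeric columns have to be converted explicitly afterwards.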

How to read only lines that fulfil a condition from a csv into R?

随声附和 submitted on 2019-11-26 20:17:35
I am trying to read a large csv file into R. I only want to read and work with the rows that fulfil a particular condition (e.g. Variable2 >= 3). This is a much smaller dataset. I want to read these lines directly into a data frame, rather than load the whole dataset into a data frame and then select according to the condition, since the whole dataset does not easily fit into memory. You could use the read.csv.sql function in the sqldf package and filter using SQL select. From the help page of read.csv.sql: library(sqldf) write.csv(iris, "iris.csv", quote = FALSE, row.names = FALSE)
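The sqldf example is cut off in the excerpt. As a different, base-R-only technique (not the answer's method), the file can be read in chunks from an open connection, keeping only the matching rows of each chunk. This is a rough sketch: the sample file, the `chunk_size` value, and the column names are assumptions made for illustration.

```r
# Write a small sample file so the sketch is runnable (hypothetical data)
path <- tempfile(fileext = ".csv")
write.csv(data.frame(Variable1 = 1:10, Variable2 = rep(1:5, 2)),
          path, row.names = FALSE)

chunk_size <- 4
con <- file(path, open = "r")

# First chunk: header line plus chunk_size rows
first <- read.csv(con, nrows = chunk_size)
kept <- list(first[first$Variable2 >= 3, ])

repeat {
  # Later chunks have no header; reuse the column names from the first chunk.
  # read.csv() raises an error once no lines remain, which ends the loop.
  chunk <- tryCatch(
    read.csv(con, nrows = chunk_size, header = FALSE, col.names = names(first)),
    error = function(e) NULL
  )
  if (is.null(chunk)) break
  kept[[length(kept) + 1]] <- chunk[chunk$Variable2 >= 3, ]
  if (nrow(chunk) < chunk_size) break   # short chunk means end of file
}
close(con)

result <- do.call(rbind, kept)
```

Only one chunk plus the rows kept so far are in memory at any time, so a larger `chunk_size` trades memory for fewer read calls.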

How to detect the right encoding for read.csv?

£可爱£侵袭症+ submitted on 2019-11-26 19:33:46
I have this file (http://b7hq6v.alterupload.com/en/) that I want to read in R with read.csv, but I am not able to detect the correct encoding. It seems to be a kind of UTF-8. I am using R 2.12.1 on a Windows XP machine. Any help? Marek: First of all, based on a more general question on StackOverflow, it is not possible to detect the encoding of a file with 100% certainty. I've struggled with this many times and arrived at a non-automatic solution: Use iconvlist to get all possible encodings: codepages <- setNames(iconvlist(), iconvlist()) Then read the data using each of them: x <- lapply(codepages, function(enc) try
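The answer's code is truncated, so here is a smaller sketch of the same trial-and-error idea rather than a reconstruction of it: decode a sample of the bytes under each candidate encoding with iconv() and discard the candidates that fail. The sample bytes and the two-entry candidate list are made up; iconvlist() would supply the full list.

```r
# "Köln" as latin1 bytes; with a real file these would come from readBin()
# or readLines() (made-up sample, not the asker's file)
bytes <- as.raw(c(0x4B, 0xF6, 0x6C, 0x6E))
sample_text <- rawToChar(bytes)

candidates <- c("UTF-8", "latin1")   # hypothetical shortlist
decoded <- sapply(candidates, function(enc) {
  iconv(sample_text, from = enc, to = "UTF-8")
})

# iconv() returns NA when the bytes are not valid in the source encoding
plausible <- names(decoded)[!is.na(decoded)]
```

A surviving candidate can then be passed to read.csv through its `fileEncoding` argument. Several encodings often decode without error, so the results still need to be inspected by eye.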

read.csv, header on first line, skip second line [duplicate]

独自空忆成欢 submitted on 2019-11-26 18:49:37
This question already has an answer here: "How can I read the header but also skip lines - read.table()?" (5 answers)

I have a CSV file with two header rows. I want the first row to be the header, and I want to discard the second row. If I run the following command:

    data <- read.csv("HK Stocks bbg.csv", header = T, stringsAsFactors = FALSE)

the first row becomes the header and the second row of the file becomes the first row of my data frame:

      Xaaaaaaaaa       X X.1 Xbbbbbbbbbb     X.2 X.3
    1       Date PX_LAST  NA        Date PX_LAST  NA
    2 31/12/2002  38.855  NA  31/12/2002  19.547  NA
    3 02/01/2003  38.664  NA  02/01/2003  19.547  NA
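One way to drop the second line, sketched on a simplified two-column file (the file contents here are hypothetical, not the asker's data): read all lines as text, remove line 2, and hand the rest to read.csv through its `text` argument.

```r
# Hypothetical file with a header row followed by a second row to discard
path <- tempfile(fileext = ".csv")
writeLines(c("Date,PX_LAST",
             "dd/mm/yyyy,HKD",
             "31/12/2002,38.855",
             "02/01/2003,38.664"), path)

all_lines <- readLines(path)

# Drop line 2, keep the real header and the data rows
data <- read.csv(text = all_lines[-2], stringsAsFactors = FALSE)
```

Because the discarded row never reaches the parser, numeric columns are detected as numeric instead of being read as text. For very large files, reading the header and the data in two separate calls avoids holding every line in memory as text.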

Specifying colClasses in the read.csv

无人久伴 submitted on 2019-11-26 18:19:37
I am trying to specify the colClasses option in the read.csv function in R. In my data, the first column "time" is basically a character vector while the rest of the columns are numeric. data <- read.csv("test.csv", comment.char="" , colClasses=c(time="character", "numeric"), strip.white=FALSE) In the above command, I want R to read the "time" column as "character" and the rest as numeric. Although the "data" variable did have the correct result after the command completed, R returned the following warnings. I am wondering how I could fix these warnings? Warning messages: 1: In read
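The warning text is cut off, so the following is a sketch of two ways to specify colClasses that avoid mixing named and unnamed entries, not a diagnosis of the exact warning. The three-column sample and the names `by_position` and `by_name` are made up for illustration.

```r
# Hypothetical stand-in for test.csv: one text column, two numeric columns
txt <- "time,a,b\n10:00,1,2\n10:05,3,4"

# Option 1: one class per column, in column order
by_position <- read.csv(text = txt,
                        colClasses = c("character", rep("numeric", 2)))

# Option 2: name only the columns to force; the others are auto-detected
by_name <- read.csv(text = txt, colClasses = c(time = "character"))
```

Option 1 requires knowing the number of columns in advance; Option 2 does not, at the cost of leaving the remaining types to read.csv's detection.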

Read csv file in R with currency column as numeric

…衆ロ難τιáo~ submitted on 2019-11-26 17:44:56
Question: I'm trying to read into R a csv file that contains information on political contributions. From what I understand, the columns are imported as factors by default, but I need the amount column ('CTRIB_AMT' in the dataset) to be imported as a numeric column so I can run a variety of functions that wouldn't work for factors. The column is formatted as currency with a "$" prefix. I used a simple read command to import the file initially: contribs <- read.csv('path/to/file') And then