How can I convert a data frame with a factor column to an xts object?

I have a CSV file, and when I use this command

SOLK<-read.table('Book1.csv',header=TRUE,sep=';')

I get this output

> SOLK
          Time Close Volume
1   10:27:03,6  0,99   1000
2   10:32:58,4  0,98    100
3   10:34:16,9  0,98    600
4   10:35:46,0  0,97    500
5   10:35:50,6  0,96     50
6   10:35:50,6  0,96   1000
7   10:36:10,3  0,95     40
8   10:36:10,3  0,95    100
9   10:36:10,4  0,95    500
10  10:36:10,4  0,95    100
.      .         .       .
.      .         .       .
.      .         .       .
285 17:09:44,0  0,96    404

Here is the result of dput(SOLK[1:10,]):

 > dput(SOLK[1:10,])
structure(list(Time = structure(c(1L, 2L, 3L, 4L, 5L, 5L, 6L, 
6L, 7L, 7L), .Label = c("10:27:03,6", "10:32:58,4", "10:34:16,9", 
"10:35:46,0", "10:35:50,6", "10:36:10,3", "10:36:10,4", "10:36:30,8", 
"10:37:23,3", "10:37:38,2", "10:37:39,3", "10:37:45,9", "10:39:07,5", 
"10:39:07,6", "10:39:46,6", "10:41:21,8", "10:43:20,6", "10:43:36,4", 
"10:43:48,8", "10:43:48,9", "10:43:54,6", "10:44:01,5", "10:44:08,4", 
"10:45:47,2", "10:46:16,7", "10:47:03,6", "10:47:48,6", "10:47:55,0", 
"10:48:09,9", "10:48:30,6", "10:49:20,6", "10:50:31,9", "10:50:34,6", 
"10:50:38,1", "10:51:02,8", "10:51:11,5", "10:55:57,7", "10:57:57,2", 
"10:59:06,9", "10:59:33,5", "11:00:31,0", "11:00:31,1", "11:04:46,4", 
"11:04:53,4", "11:04:54,6", "11:04:56,1", "11:04:58,9", "11:05:02,0", 
"11:05:02,6", "11:05:24,7", "11:05:56,7", "11:06:15,8", "11:13:24,1", 
"11:13:24,2", "11:13:32,1", "11:13:36,2", "11:13:37,2", "11:13:44,5", 
"11:13:46,8", "11:14:12,7", "11:14:19,4", "11:14:19,8", "11:14:21,2", 
"11:14:38,7", "11:14:44,0", "11:14:44,5", "11:15:10,5", "11:15:10,6", 
"11:15:12,9", "11:15:16,6", "11:15:23,3", "11:15:31,4", "11:15:36,4", 
"11:15:37,4", "11:15:49,5", "11:16:01,4", "11:16:06,0", "11:17:56,2", 
"11:19:08,1", "11:20:17,2", "11:26:39,4", "11:26:53,2", "11:27:39,5", 
"11:28:33,0", "11:30:42,3", "11:31:00,7", "11:33:44,2", "11:39:56,1", 
"11:40:07,3", "11:41:02,1", "11:41:30,1", "11:45:07,0", "11:45:26,6", 
"11:49:50,8", "11:59:58,1", "12:03:49,9", "12:04:12,6", "12:06:05,8", 
"12:06:49,2", "12:07:56,0", "12:09:37,7", "12:14:25,5", "12:14:32,1", 
"12:15:42,1", "12:15:55,2", "12:16:36,9", "12:16:44,2", "12:18:00,3", 
"12:18:12,8", "12:28:17,8", "12:28:17,9", "12:28:23,7", "12:28:51,1", 
"12:36:33,2", "12:37:45,0", "12:39:22,2", "12:40:19,5", "12:42:22,1", 
"12:58:46,3", "13:06:05,8", "13:06:05,9", "13:07:17,6", "13:07:17,7", 
"13:09:01,3", "13:09:01,4", "13:09:11,3", "13:09:31,0", "13:10:07,8", 
"13:35:43,8", "13:38:27,7", "14:11:16,0", "14:17:31,5", "14:26:13,9", 
"14:36:11,8", "14:38:43,7", "14:38:47,8", "14:38:51,8", "14:48:26,7", 
"14:52:07,4", "14:52:13,8", "15:09:24,7", "15:10:25,8", "15:29:12,1", 
"15:31:55,9", "15:34:04,1", "15:44:10,8", "15:45:07,1", "15:57:04,9", 
"15:57:13,9", "16:16:27,9", "16:21:41,7", "16:36:01,5", "16:36:13,2", 
"16:46:10,5", "16:46:10,6", "16:47:37,3", "16:50:52,4", "16:50:52,5", 
"16:51:44,5", "16:55:11,5", "16:56:21,8", "16:56:37,5", "16:57:37,9", 
"16:58:18,6", "16:58:44,5", "17:00:39,1", "17:01:50,7", "17:03:13,2", 
"17:03:28,3", "17:03:46,7", "17:03:47,0", "17:04:30,4", "17:08:41,8", 
"17:09:44,0"), class = "factor"), Close = structure(c(8L, 7L, 
7L, 6L, 5L, 5L, 4L, 4L, 4L, 4L), .Label = c("0,92", "0,93", "0,94", 
"0,95", "0,96", "0,97", "0,98", "0,99"), class = "factor"), Volume = c(1000L, 
100L, 600L, 500L, 50L, 1000L, 40L, 100L, 500L, 100L)), .Names = c("Time", 
"Close", "Volume"), row.names = c(NA, 10L), class = "data.frame")

The first column includes the time stamp of every transaction during a stock's exchange daily session. I would like to convert the Close and Volume columns to an xts object ordered by the Time column.

UPDATE: From your edits, it appears you imported your data using two different commands. It also appears you should be using read.csv2. I've updated my answer with Lines that (I assume) look more like your original CSV (I have to guess because you don't say what the file looks like). The rest of the answer doesn't change.

You have to add a date to your times because xts stores all index values internally as POSIXct (I just used today's date).

I had to convert the "," decimal separator to "." (using gsub), but that may be locale-dependent, so you may not need to. Then paste a date onto the (possibly converted) time and convert the result to POSIXct to create an index suitable for xts.

I've also formatted the index so you can see the fractional seconds.

Lines <- "Time;Close;Volume
10:27:03,6;0,99;1000
10:32:58,4;0,98;100
10:34:16,9;0,98;600
10:35:46,0;0,97;500
10:35:50,6;0,96;50
10:35:50,6;0,96;1000
10:36:10,3;0,95;40
10:36:10,3;0,95;100
10:36:10,4;0,95;500
10:36:10,4;0,95;100"

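library(xts)

# read.csv2 defaults to sep = ";" and dec = ",", so Close is read as numeric
SOLK <- read.csv2(con <- textConnection(Lines))
close(con)

# prepend a date to each time and swap the decimal comma for a point
solk <- xts(SOLK[,c("Close","Volume")],
  as.POSIXct(paste("2011-09-02", gsub(",",".",SOLK[,1]))))
# show fractional seconds (recent xts versions use tformat(solk) <- ... instead)
indexFormat(solk) <- "%Y-%m-%d %H:%M:%OS6"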
solk
#                            Close Volume
# 2011-09-02 10:27:03.599999  0.99   1000
# 2011-09-02 10:32:58.400000  0.98    100
# 2011-09-02 10:34:16.900000  0.98    600
# 2011-09-02 10:35:46.000000  0.97    500
# 2011-09-02 10:35:50.599999  0.96     50
# 2011-09-02 10:35:50.599999  0.96   1000
# 2011-09-02 10:36:10.299999  0.95     40
# 2011-09-02 10:36:10.299999  0.95    100
# 2011-09-02 10:36:10.400000  0.95    500
# 2011-09-02 10:36:10.400000  0.95    100
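
The trailing .599999 in some rows above is a display effect rather than a parsing error. Here is a minimal base R sketch of why it happens, assuming the same placeholder date and an arbitrary time zone (tz = "UTC" is my assumption, not something from the question):

```r
# Parse one timestamp from the example; the date and tz = "UTC" are assumptions.
t1 <- as.POSIXct("2011-09-02 10:27:03.6", tz = "UTC")

# POSIXct stores seconds since 1970 as a double, so .6 is only approximated.
secs <- as.numeric(t1)
frac <- secs - floor(secs)
print(frac, digits = 15)

# On output, %OS6 truncates to six decimal places instead of rounding.
format(t1, "%H:%M:%OS6")

# Rounding the fractional part yourself recovers the tenth of a second.
round(frac, 1)
```

The stored index is still accurate to well below a tenth of a second, so ordering and merging by time should not be affected.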

That's an odd structure. Translating it to dput syntax:

SOLK <- structure(list(structure(c(1L, 2L, 3L, 4L, 5L, 5L, 6L, 6L, 7L, 
7L), .Label = c("10:27:03,6", "10:32:58,4", "10:34:16,9", "10:35:46,0", 
"10:35:50,6", "10:36:10,3", "10:36:10,4"), class = "factor"), 
    Close = c(0.99, 0.98, 0.98, 0.97, 0.96, 0.96, 0.95, 0.95, 
    0.95, 0.95), Volume = c(1000L, 100L, 600L, 500L, 50L, 1000L, 
    40L, 100L, 500L, 100L)), .Names = c("", "Close", "Volume"
), class = "data.frame", row.names = c("1", "2", "3", "4", "5", 
"6", "7", "8", "9", "10"))

I'm assuming the comma in the timestamp is a decimal separator.

library("xts")
library("chron")
time.idx <- times(gsub(",",".",as.character(SOLK[[1]])))

Unfortunately, it seems xts won't accept this as a valid order.by, so a date (today's, for lack of a better choice) must be included to make xts happy.

xts(SOLK[[2]], order.by=chron(Sys.Date(), time.idx))
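
If the data frame is already loaded with factor columns, as in the question's dput output, the Close column needs converting as well as the Time column. The following is a rough base R sketch of that step, not either answerer's method. The three-row SOLK copy, the placeholder date, tz = "UTC", and the names close_num, stamp, idx and clean are all assumptions made for illustration.

```r
# Small copy of the first rows from the question, with Time and Close as factors
# (stringsAsFactors = TRUE mimics the factor columns shown in the dput output).
SOLK <- data.frame(
  Time   = c("10:27:03,6", "10:32:58,4", "10:34:16,9"),
  Close  = c("0,99", "0,98", "0,98"),
  Volume = c(1000L, 100L, 600L),
  stringsAsFactors = TRUE
)

# Go through as.character() first: as.numeric() on a factor returns level codes.
close_num <- as.numeric(sub(",", ".", as.character(SOLK$Close), fixed = TRUE))

# Build a POSIXct index; the date is a placeholder and the time zone is assumed.
stamp <- paste("2011-09-02", sub(",", ".", as.character(SOLK$Time), fixed = TRUE))
idx <- as.POSIXct(stamp, format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")

# Numeric columns ready to be handed to xts together with idx.
clean <- data.frame(Close = close_num, Volume = SOLK$Volume)
```

With xts loaded, xts(clean, order.by = idx) would then keep both Close and Volume in one object.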

Source: https://stackoverflow.com/questions/7288045/how-can-i-convert-a-dataframe-with-a-factor-column-to-a-xts-object
