How can i convert a dataframe with a factor column to a xts object?

廉价感情. 提交于 2019-12-22 17:55:05

问题


I have a csv file and when i use this command

SOLK<-read.table('Book1.csv',header=TRUE,sep=';')

I get this output

> SOLK
          Time Close Volume
1   10:27:03,6  0,99   1000
2   10:32:58,4  0,98    100
3   10:34:16,9  0,98    600
4   10:35:46,0  0,97    500
5   10:35:50,6  0,96     50
6   10:35:50,6  0,96   1000
7   10:36:10,3  0,95     40
8   10:36:10,3  0,95    100
9   10:36:10,4  0,95    500
10  10:36:10,4  0,95    100
.      .         .       .
.      .         .       .
.      .         .       .
285 17:09:44,0  0,96    404

Here is the result of dput(SOLK[1:10,]):

 > dput(SOLK[1:10,])
structure(list(Time = structure(c(1L, 2L, 3L, 4L, 5L, 5L, 6L, 
6L, 7L, 7L), .Label = c("10:27:03,6", "10:32:58,4", "10:34:16,9", 
"10:35:46,0", "10:35:50,6", "10:36:10,3", "10:36:10,4", "10:36:30,8", 
"10:37:23,3", "10:37:38,2", "10:37:39,3", "10:37:45,9", "10:39:07,5", 
"10:39:07,6", "10:39:46,6", "10:41:21,8", "10:43:20,6", "10:43:36,4", 
"10:43:48,8", "10:43:48,9", "10:43:54,6", "10:44:01,5", "10:44:08,4", 
"10:45:47,2", "10:46:16,7", "10:47:03,6", "10:47:48,6", "10:47:55,0", 
"10:48:09,9", "10:48:30,6", "10:49:20,6", "10:50:31,9", "10:50:34,6", 
"10:50:38,1", "10:51:02,8", "10:51:11,5", "10:55:57,7", "10:57:57,2", 
"10:59:06,9", "10:59:33,5", "11:00:31,0", "11:00:31,1", "11:04:46,4", 
"11:04:53,4", "11:04:54,6", "11:04:56,1", "11:04:58,9", "11:05:02,0", 
"11:05:02,6", "11:05:24,7", "11:05:56,7", "11:06:15,8", "11:13:24,1", 
"11:13:24,2", "11:13:32,1", "11:13:36,2", "11:13:37,2", "11:13:44,5", 
"11:13:46,8", "11:14:12,7", "11:14:19,4", "11:14:19,8", "11:14:21,2", 
"11:14:38,7", "11:14:44,0", "11:14:44,5", "11:15:10,5", "11:15:10,6", 
"11:15:12,9", "11:15:16,6", "11:15:23,3", "11:15:31,4", "11:15:36,4", 
"11:15:37,4", "11:15:49,5", "11:16:01,4", "11:16:06,0", "11:17:56,2", 
"11:19:08,1", "11:20:17,2", "11:26:39,4", "11:26:53,2", "11:27:39,5", 
"11:28:33,0", "11:30:42,3", "11:31:00,7", "11:33:44,2", "11:39:56,1", 
"11:40:07,3", "11:41:02,1", "11:41:30,1", "11:45:07,0", "11:45:26,6", 
"11:49:50,8", "11:59:58,1", "12:03:49,9", "12:04:12,6", "12:06:05,8", 
"12:06:49,2", "12:07:56,0", "12:09:37,7", "12:14:25,5", "12:14:32,1", 
"12:15:42,1", "12:15:55,2", "12:16:36,9", "12:16:44,2", "12:18:00,3", 
"12:18:12,8", "12:28:17,8", "12:28:17,9", "12:28:23,7", "12:28:51,1", 
"12:36:33,2", "12:37:45,0", "12:39:22,2", "12:40:19,5", "12:42:22,1", 
"12:58:46,3", "13:06:05,8", "13:06:05,9", "13:07:17,6", "13:07:17,7", 
"13:09:01,3", "13:09:01,4", "13:09:11,3", "13:09:31,0", "13:10:07,8", 
"13:35:43,8", "13:38:27,7", "14:11:16,0", "14:17:31,5", "14:26:13,9", 
"14:36:11,8", "14:38:43,7", "14:38:47,8", "14:38:51,8", "14:48:26,7", 
"14:52:07,4", "14:52:13,8", "15:09:24,7", "15:10:25,8", "15:29:12,1", 
"15:31:55,9", "15:34:04,1", "15:44:10,8", "15:45:07,1", "15:57:04,9", 
"15:57:13,9", "16:16:27,9", "16:21:41,7", "16:36:01,5", "16:36:13,2", 
"16:46:10,5", "16:46:10,6", "16:47:37,3", "16:50:52,4", "16:50:52,5", 
"16:51:44,5", "16:55:11,5", "16:56:21,8", "16:56:37,5", "16:57:37,9", 
"16:58:18,6", "16:58:44,5", "17:00:39,1", "17:01:50,7", "17:03:13,2", 
"17:03:28,3", "17:03:46,7", "17:03:47,0", "17:04:30,4", "17:08:41,8", 
"17:09:44,0"), class = "factor"), Close = structure(c(8L, 7L, 
7L, 6L, 5L, 5L, 4L, 4L, 4L, 4L), .Label = c("0,92", "0,93", "0,94", 
"0,95", "0,96", "0,97", "0,98", "0,99"), class = "factor"), Volume = c(1000L, 
100L, 600L, 500L, 50L, 1000L, 40L, 100L, 500L, 100L)), .Names = c("Time", 
"Close", "Volume"), row.names = c(NA, 10L), class = "data.frame")

The first column includes the time stamp of every transaction during a stock's exchange daily session. I would like to convert the Close and Volume columns to an xts object ordered by the Time column.


回答1:


UPDATE: From your edits, it appears you imported your data using two different commands. It also appears you should be using read.csv2. I've updated my answer with Lines that (I assume) look more like your original CSV (I have to guess because you don't say what the file looks like). The rest of the answer doesn't change.

You have to add a date to your times because xts stores all index values internally as POSIXct (I just used today's date).

I had to convert the "," decimal notation to the "." convention (using gsub), but that may be locale-dependent and you may not need to. paste today's date with the (possibly converted) time and then convert it to POSIXct to create an index suitable for xts.

I've also formatted the index so you can see the fractional seconds.

Lines <- "Time;Close;Volume
10:27:03,6;0,99;1000
10:32:58,4;0,98;100
10:34:16,9;0,98;600
10:35:46,0;0,97;500
10:35:50,6;0,96;50
10:35:50,6;0,96;1000
10:36:10,3;0,95;40
10:36:10,3;0,95;100
10:36:10,4;0,95;500
10:36:10,4;0,95;100"

SOLK <- read.csv2(con <- textConnection(Lines))
close(con)

solk <- xts(SOLK[,c("Close","Volume")],
  as.POSIXct(paste("2011-09-02", gsub(",",".",SOLK[,1]))))
indexFormat(solk) <- "%Y-%m-%d %H:%M:%OS6"
solk
#                            Close Volume
# 2011-09-02 10:27:03.599999  0.99   1000
# 2011-09-02 10:32:58.400000  0.98    100
# 2011-09-02 10:34:16.900000  0.98    600
# 2011-09-02 10:35:46.000000  0.97    500
# 2011-09-02 10:35:50.599999  0.96     50
# 2011-09-02 10:35:50.599999  0.96   1000
# 2011-09-02 10:36:10.299999  0.95     40
# 2011-09-02 10:36:10.299999  0.95    100
# 2011-09-02 10:36:10.400000  0.95    500
# 2011-09-02 10:36:10.400000  0.95    100



回答2:


That's an odd structure. Translating it to dput syntax

SOLK <- structure(list(structure(c(1L, 2L, 3L, 4L, 5L, 5L, 6L, 6L, 7L, 
7L), .Label = c("10:27:03,6", "10:32:58,4", "10:34:16,9", "10:35:46,0", 
"10:35:50,6", "10:36:10,3", "10:36:10,4"), class = "factor"), 
    Close = c(0.99, 0.98, 0.98, 0.97, 0.96, 0.96, 0.95, 0.95, 
    0.95, 0.95), Volume = c(1000L, 100L, 600L, 500L, 50L, 1000L, 
    40L, 100L, 500L, 100L)), .Names = c("", "Close", "Volume"
), class = "data.frame", row.names = c("1", "2", "3", "4", "5", 
"6", "7", "8", "9", "10"))

I'm assuming the comma in the timestamp is decimal separator.

library("chron")
time.idx <- times(gsub(",",".",as.character(SOLK[[1]])))

Unfortunately, it seems xts won't take this as a valid order.by; so a date (today, for lack of a better choice) must be included to make xts happy.

xts(SOLK[[2]], order.by=chron(Sys.Date(), time.idx))


来源:https://stackoverflow.com/questions/7288045/how-can-i-convert-a-dataframe-with-a-factor-column-to-a-xts-object

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!