Convert data frame with date column to timeseries

后端 未结 5 873
走了就别回头了
走了就别回头了 2020-11-29 20:06

I\'ve got a data frame with the following data:

>PRICE
         DATE  CLOSE
1    20070103 54.700
2    20070104 54.770
3    20070105 55.120
4    20070108 5         


        
相关标签:
5条回答
  • 2020-11-29 20:53

    Whether you want to convert a data frame (or any time series) to a xts or zoo object, as in the answers above, or to any other time series (such as a ts object) the tsbox package makes coercion easy:

    PRICE <- structure(list(
      DATE = c(20070103L, 20070104L, 20070105L, 20070108L, 20070109L,
               20070110L, 20070111L, 20070112L, 20070115L),
      CLOSE = c(54.7, 54.77, 55.12, 54.87, 54.86, 54.27, 54.77, 55.36, 55.76)),
      .Names = c("DATE", "CLOSE"), class = "data.frame",
      row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9"))
    
    library(tsbox)
    
    ts_xts(PRICE)
    #> [time]: 'DATE' [value]: 'CLOSE'
    #> Loading required namespace: xts
    #> Registered S3 method overwritten by 'xts':
    #>   method     from
    #>   as.zoo.xts zoo
    #>            CLOSE
    #> 2007-01-03 54.70
    #> 2007-01-04 54.77
    #> 2007-01-05 55.12
    #> 2007-01-08 54.87
    #> 2007-01-09 54.86
    #> 2007-01-10 54.27
    #> 2007-01-11 54.77
    #> 2007-01-12 55.36
    #> 2007-01-15 55.76
    
    ts_ts(PRICE)
    #> [time]: 'DATE' [value]: 'CLOSE'
    #> Time Series:
    #> Start = 2007.00547581401 
    #> End = 2007.0383306981 
    #> Frequency = 365.2425 
    #>  [1] 54.70 54.77 55.12    NA    NA 54.87 54.86 54.27 54.77 55.36    NA
    #> [12]    NA 55.76
    
    0 讨论(0)
  • 2020-11-29 20:54

    Your DATE column may represent a date, but it is actually either a character, factor, integer, or a numeric vector.

    First, you need to convert the DATE column to a Date object. Then you can create an xts object from the CLOSE and DATE columns of your PRICE data.frame. Finally, you can use the xts object to calculate returns and the Calmar ratio.

    PRICE <- structure(list(
      DATE = c(20070103L, 20070104L, 20070105L, 20070108L, 20070109L,
               20070110L, 20070111L, 20070112L, 20070115L),
      CLOSE = c(54.7, 54.77, 55.12, 54.87, 54.86, 54.27, 54.77, 55.36, 55.76)),
      .Names = c("DATE", "CLOSE"), class = "data.frame",
      row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9"))
    
    library(PerformanceAnalytics)  # loads/attaches xts
    # Convert DATE to Date class
    PRICE$DATE <- as.Date(as.character(PRICE$DATE),format="%Y%m%d")
    # create xts object
    x <- xts(PRICE$CLOSE,PRICE$DATE)
    CalmarRatio(Return.calculate(x))
    #                  [,1]
    # Calmar Ratio 52.82026
    
    0 讨论(0)
  • 2020-11-29 21:02

    Most people find working with the time series class to be a big pain. You should consider using the zoo class from package zoo. It will not complain about missing times , only about duplicates. The PerformanceAnalytics functions are almost certainly going to be expecting 'zoo' or its descendant class 'xts'.

    pricez <- read.zoo(text="   DATE  CLOSE
     1    20070103 54.700
     2    20070104 54.770
     3    20070105 55.120
     4    20070108 54.870
     5    20070109 54.860
     6    20070110 54.270
     7    20070111 54.770
     8    20070112 55.360
     9    20070115 55.760
     ")
     index(pricez) <- as.Date(as.character(index(pricez)), format="%Y%m%d")
     pricez
    2007-01-03 2007-01-04 2007-01-05 2007-01-08 2007-01-09 2007-01-10 2007-01-11 2007-01-12 2007-01-15 
         54.70      54.77      55.12      54.87      54.86      54.27      54.77      55.36      55.76 
    
    0 讨论(0)
  • 2020-11-29 21:08

    An alternative solution is to use the tidyquant package, which allows the functionality of the financial packages, including time series functionality, to be used with data frames. The following examples shows how you can get the Calmar Ratio for multiple assets. The tidyquant vignettes go into more details on how to use the package.


    library(tidyquant)
    # Get prices
    price_tbl <- c("FB", "AMZN", "NFLX", "GOOG") %>%
        tq_get(get  = "stock.prices",
               from = "2010-01-01",
               to   = "2016-12-31")
    price_tbl
    #> # A tibble: 6,449 × 8
    #>    symbol       date  open  high   low close    volume adjusted
    #>     <chr>     <date> <dbl> <dbl> <dbl> <dbl>     <dbl>    <dbl>
    #> 1      FB 2012-05-18 42.05 45.00 38.00 38.23 573576400    38.23
    #> 2      FB 2012-05-21 36.53 36.66 33.00 34.03 168192700    34.03
    #> 3      FB 2012-05-22 32.61 33.59 30.94 31.00 101786600    31.00
    #> 4      FB 2012-05-23 31.37 32.50 31.36 32.00  73600000    32.00
    #> 5      FB 2012-05-24 32.95 33.21 31.77 33.03  50237200    33.03
    #> 6      FB 2012-05-25 32.90 32.95 31.11 31.91  37149800    31.91
    #> 7      FB 2012-05-29 31.48 31.69 28.65 28.84  78063400    28.84
    #> 8      FB 2012-05-30 28.70 29.55 27.86 28.19  57267900    28.19
    #> 9      FB 2012-05-31 28.55 29.67 26.83 29.60 111639200    29.60
    #> 10     FB 2012-06-01 28.89 29.15 27.39 27.72  41855500    27.72
    #> # ... with 6,439 more rows
    
    # Convert to period returns
    return_tbl <- price_tbl %>%
        group_by(symbol) %>%
        tq_transmute(ohlc_fun   = Ad, 
                     mutate_fun = periodReturn,
                     period     = "daily")
    return_tbl
    #> Source: local data frame [6,449 x 3]
    #> Groups: symbol [4]
    #> 
    #>    symbol       date daily.returns
    #>     <chr>     <date>         <dbl>
    #> 1      FB 2012-05-18    0.00000000
    #> 2      FB 2012-05-21   -0.10986139
    #> 3      FB 2012-05-22   -0.08903906
    #> 4      FB 2012-05-23    0.03225806
    #> 5      FB 2012-05-24    0.03218747
    #> 6      FB 2012-05-25   -0.03390854
    #> 7      FB 2012-05-29   -0.09620809
    #> 8      FB 2012-05-30   -0.02253811
    #> 9      FB 2012-05-31    0.05001770
    #> 10     FB 2012-06-01   -0.06351355
    #> # ... with 6,439 more rows
    
    # Calculate performance
    return_tbl %>%
        tq_performance(Ra = daily.returns,
                       performance_fun = CalmarRatio)
    #> Source: local data frame [4 x 2]
    #> Groups: symbol [4]
    #> 
    #>   symbol CalmarRatio
    #>    <chr>       <dbl>
    #> 1     FB  0.50283172
    #> 2   AMZN  0.91504597
    #> 3   NFLX  0.14444744
    #> 4   GOOG  0.05068483
    
    0 讨论(0)
  • 2020-11-29 21:09

    This answer based on @Joshua_Ulrich's answer creates a time series from the built-in airquality dataset containing "Daily air quality measurements in New York, May to September 1973".

    > head(airquality,3)
      Ozone Solar.R Wind Temp Month Day
    1    41     190  7.4   67     5   1
    2    36     118  8.0   72     5   2
    3    12     149 12.6   74     5   3
    

    Convert Month and Day to a vector of class "Date"

    airqualitydate = as.Date(sprintf("1973%02.0f%02.0f", airquality$Month, airquality$Day), 
                             format="%Y%m%d")
    

    Create the time series object

    ts_airquality <- xts(airquality, airqualitydate)
    head(ts_airquality, 3)
               Ozone Solar.R Wind Temp Month Day
    1973-05-01    41     190  7.4   67     5   1
    1973-05-02    36     118  8.0   72     5   2
    1973-05-03    12     149 12.6   74     5   3
    

    Plot to illustrate the different output of the plot.xts() function. (compare to plot(airquality))

    plot(ts_airquality$Ozone, main="Ozone (ppb)")
    lines(ts_airquality$Temp, on=NA, main="Temperature (degrees F)")
    

    Note, the base R ts() method is mostly suited for quarterly or yearly data. As explained in an answer to "starting a daily time series in R":

    "Time Series Object does not work well with creating daily time series. I will suggest you use the zoo library."

    In particular the xts package is an extension to zoo.

    0 讨论(0)
提交回复
热议问题