Combining time-series objects and lists: Package “termstrc”

臣服心动 2021-01-03 06:08

The R package "termstrc", designed for term-structure estimation, is an incredibly useful tool, but it requires data to be set in a particularly awkward format: lists with

2 Answers
  • 2021-01-03 06:33

    My 2 cents: I have been trying to get this to work with the newer Rblpapi package. I still have some problems with the createCouponBonds part, but I think the other functions return correctly. This won't solve the whole problem, but it is at least a partial fix. BBcurveIDs, bbStaticDataFields, bbDynamicDataFields and bbHistoricDataFields are the same as in the other answer.

    bbGetCountry <- function(cCode, up = FALSE) {
      if (up == TRUE) startDate <- as.Date("2016-01-01") else startDate <- histStartDate 
      cal <- Calendar(weekdays=c("saturday", "sunday"))
      wdays <- as.list(bizseq(startDate, Sys.Date(), cal))
      actives <- lapply(wdays, function(x) { 
        bds(BBcurveIDs[cCode][[1]], "CURVE_MEMBERS", override = c(CURVE_DATE=format(x, "%Y%m%d")))
      })
      names(actives) <- sapply(wdays, format)  # "YYYY-MM-DD" strings; a list of Dates would be coerced to day numbers
      uniqueActives <- unique(unlist(actives))
      staticData <- bdp(uniqueActives, bbStaticDataFields)
      cfData <- lapply(uniqueActives, function(x) {
        bds(x, "DES_CASH_FLOW_ADJ", override = c(SETTLE_DT = format(as.Date(staticData[x, "FIRST_SETTLE_DT"]), "%Y%m%d")))
      })
      names(cfData) <- uniqueActives
    
      historicData <- lapply(bbHistoricDataFields, function(x) bbdh(uniqueActives, flds = x, startDate = startDate))
      names(historicData) <- bbHistoricDataFields
      allDates <- as.Date(index(historicData$LAST_PRICE))
    
      save(actives, file = paste("data_", cCode, "actives.dat", sep = ""))
      save(staticData, file = paste("data_", cCode, "staticData.dat", sep = ""))
      save(cfData, file = paste("data_", cCode, "cfData.dat", sep = ""))
      save(historicData, file = paste("data_", cCode, "historicData.dat", sep = ""))
      #save(settleDates, file = paste("data_", cCode, "settleDates.dat", sep = ""))
      assign(paste(cCode, "data", sep = ""), list(actives = actives, staticData = staticData, cfData = cfData,    #
                                                  historicData = historicData), pos = 1)
    
    }
    

    And the bbdh function:

    # Requires: Rblpapi (bdh), plyr (ldply), reshape2 (dcast), xts
    bbdh <- function(secs, years = 1, flds = "last_price", startDate = NULL) {
      if (is.null(startDate)) startDate <- Sys.Date() - years * 365.25
      if (inherits(startDate, "Date")) startDate <- format(startDate, "%Y%m%d")  # was assigned to a misspelt "stardDate"
      if (nchar(startDate) > 8) startDate <- format(as.Date(startDate), "%Y%m%d")
      rawd <- bdh(secs, flds, 
                  startDate, 
                  include.non.trading.days = FALSE,
                  options = structure(c("PREVIOUS_VALUE", "NON_TRADING_WEEKDAYS"),
                                      names = c("nonTradingDayFillMethod","nonTradingDayFillOption")))
      rawd <- ldply(rawd, data.frame)
      colnames(rawd) <- c("sec", "date", "fld")
      rawd <- dcast(rawd, date ~ sec, value.var="fld")
      colnames(rawd) <- gsub(" Corp", "", colnames(rawd))
      return(xts(rawd[,-1], order.by=rawd[,1]))
    }
    
  • 2021-01-03 06:46

    This is a fairly advanced data manipulation question. R has many powerful data manipulation tools, and you're not going to need to move away from R to prepare the (admittedly fairly obtuse) dyncouponbonds object. Indeed, you actually shouldn't, because taking a structure from another language and then turning it into dyncouponbonds will simply be more work.

    The first thing I would make sure of is that you are very familiar with the lapply function, because you're going to be making plenty of use of it. You'll use it to create a list of couponbonds objects, which is what dyncouponbonds actually is. Creating the couponbonds objects themselves is a little tougher, mainly because of the CASHFLOWS sublist, which wants each cashflow associated with the bond's ISIN and with the date of the cashflow. For this you'll use lapply and some fairly advanced subscripting. The subset function will also come in handy.
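
    As a rough illustration of the lapply-plus-subsetting pattern just described, here is a minimal sketch on made-up data. The ISINs, dates and amounts are invented, and `flows` is a hypothetical intermediate structure (one data frame of cashflows per bond); neither termstrc nor Bloomberg hands you the data in exactly this shape.

    ```r
    # Hypothetical per-bond cashflow tables, keyed by ISIN (all values made up).
    flows <- list(
      XS0000000001 = data.frame(Date = as.Date(c("2020-06-30", "2021-06-30", "2022-06-30")),
                                Amount = c(2, 2, 102)),
      XS0000000002 = data.frame(Date = as.Date(c("2020-12-31", "2021-12-31")),
                                Amount = c(3, 103))
    )
    today <- as.Date("2021-01-03")

    # For each bond, keep only future cashflows and tag each one with its ISIN.
    perBond <- lapply(names(flows), function(isin) {
      future <- subset(flows[[isin]], Date >= today)
      list(isin = rep(isin, nrow(future)), cf = future$Amount, dt = future$Date)
    })

    # Flatten into the three parallel vectors held by the CASHFLOWS sublist.
    CASHFLOWS <- list(
      ISIN = unlist(lapply(perBond, function(x) x$isin)),
      CF   = unlist(lapply(perBond, function(x) x$cf)),
      DATE = do.call(c, lapply(perBond, function(x) x$dt))  # c() keeps the Date class
    )
    str(CASHFLOWS)
    ```

    The three vectors stay aligned element by element, which is what lets a single flat sublist describe the cashflows of many bonds at once.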

    This question also depends very much on where you will be getting the data from. Getting it out of Bloomberg is non-trivial, mainly because you will need to go back in history, using the BDS function and the "DES_CASH_FLOW" field for each bond to get its cashflows (the code below uses the "DES_CASH_FLOW_ADJ" variant). I say history because, if you're using dyncouponbonds, I'm assuming you will want to do historic yield curve analysis. You'll need to override the BDS function's "SETTLE_DT" field with the value returned for the bond by the BDP function and the "FIRST_SETTLE_DT" field, so that you get all the cashflows from the beginning of the bond's life (otherwise it will only return cashflows from today onwards, which is no good for historic analysis). But I digress. If you're not using Bloomberg, I don't know where you'll get this data from.

    You'll then need to get the static data for each bond, namely the maturity, the ISIN, the coupon rate and the issue date. You'll also need historic price and accrued interest data. Again, if using Bloomberg, you'll use the BDP function for the static data, with the fields you'll see in the code below, and the historic data function BDH, which I have wrapped as bbdh, for the historic data. Assuming again that you're a Bloomberg user, here is the code:

    bbGetCountry <- function(cCode, up = FALSE) {
    # this function is going to get all the data out of bloomberg that we need for a
    # country, and update it if necessary
        if (up == TRUE) startDate <- as.Date("2012-01-01") else startDate <- histStartDate 
        # first get all the curve members for history
        wdays <- wdaylist(startDate, Sys.Date()) # create the list of working days from startdate
        actives <- lapply(wdays, function(x) { 
            bds(conn, BBcurveIDs[cCode], "CURVE_MEMBERS", override_fields = "CURVE_DATE",
            override_values = format(x, "%Y%m%d"))
        })
        names(actives) <- wdays
        uniqueActives <- unique(unlist(actives)) # there will be puhlenty duplicates. Get rid of them
        # now get the unchanging bond data
        staticData <- bdp(conn, uniqueActives, bbStaticDataFields)
        # now get the cash flow data
        cfData <- lapply(uniqueActives, function(x) {
            bds(conn, x, "DES_CASH_FLOW_ADJ", override_fields = "SETTLE_DT", 
                override_values = format(as.Date(staticData[x, "FIRST_SETTLE_DT"]), "%Y%m%d"))
        })
        names(cfData) <- uniqueActives
        # now for historic data
        historicData <- lapply(bbHistoricDataFields, function(x) bbdh(uniqueActives, flds = x, startDate = startDate))
        names(historicData) <- bbHistoricDataFields   # put the names in otherwise we get a numbered list
        allDates <- as.Date(index(historicData$LAST_PRICE)) # all the dates we will find settlement dates for for all bonds. No posix
        save(actives, file = paste("data/", cCode, "actives.dat", sep = ""))      #save all the files now
        save(staticData, file = paste("data/", cCode, "staticData.dat", sep = ""))
        save(cfData, file = paste("data/", cCode, "cfData.dat", sep = ""))
        save(historicData, file = paste("data/", cCode, "historicData.dat", sep = ""))
        #save(settleDates, file = paste("data/", cCode, "settleDates.dat", sep = ""))
        assign(paste(cCode, "data", sep = ""), list(actives = actives, staticData = staticData, cfData = cfData,    #
            historicData = historicData), pos = 1)
    

    }

    The bbdh function I use above is a wrapper around the Rbbg library's bdh function and looks like this:

    bbdh <- function(secs, years = 1, flds = "last_price", startDate = NULL) {
            #this function gets secs over years from bloomberg daily data
                if(is.null(startDate)) startDate <- Sys.Date() - years * 365.25
                if(inherits(startDate, "Date")) startDate <- format(startDate, "%Y%m%d") #convert date classes to bb string
                if(nchar(startDate) > 8) startDate <- format(as.Date(startDate), "%Y%m%d") # if we've been passed wrong format character string 
                rawd <- bdh(conn, secs, flds, startDate, always.display.tickers = TRUE, include.non.trading.days = TRUE,
                    option_names = c("nonTradingDayFillOption", "nonTradingDayFillMethod"),
                    option_values = c("NON_TRADING_WEEKDAYS", "PREVIOUS_VALUE"))
                rawd <- dcast(rawd, date ~ ticker) #put into columns
                colnames(rawd) <- sub(" .*", "", colnames(rawd)) #remove the govt, currncy bits from bb tickers
                return(xts(rawd[, -1], order.by = as.POSIXct(rawd[, 1])))
            }
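
    To make the reshaping step inside bbdh more concrete, here is a minimal base-R sketch of the same idea: pivot long data into one column per security, then strip the ticker suffix. It assumes, purely for illustration, that the raw result is a long data frame with `date`, `ticker` and `value` columns; the tickers and prices are made up, and `tapply` stands in for `dcast` so that the example needs no extra packages.

    ```r
    # Made-up long-format data: one row per (date, ticker) pair.
    long <- data.frame(
      date   = as.Date(c("2021-01-04", "2021-01-04", "2021-01-05", "2021-01-05")),
      ticker = c("AAA Govt", "BBB Govt", "AAA Govt", "BBB Govt"),
      value  = c(101.5, 99.2, 101.7, 99.0)
    )

    # Pivot to one row per date and one column per ticker.
    wide <- with(long, tapply(value, list(date, ticker), function(v) v[1]))

    # Drop everything after the first space, e.g. "AAA Govt" -> "AAA".
    colnames(wide) <- sub(" .*", "", colnames(wide))
    print(wide)
    ```

    The result is a plain numeric matrix with the dates as row names; wrapping it in xts, as bbdh does, is then a matter of supplying those dates as the index.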
    

    The country code comes from a structure which associates two-letter names with Bloomberg yield curve descriptions:

    BBcurveIDs  <- list(PO = "YCGT0084 Index", #Portugal
                        DE = "YCGT0016 Index", 
                        FR = "YCGT0014 Index", 
                        SP = "YCGT0061 Index",
                        IT = "YCGT0040 Index",
                        AU = "YCGT0001 Index", #Australia
                        AS = "YCGT0063 Index", #Austria
                        JP = "YCGT0018 Index",
                        GB = "YCGT0022 Index",
                        HK = "YCGT0095 Index",
                        CA = "YCGT0007 Index",
                        CH = "YCGT0082 Index",
                        NO = "YCGT0078 Index",
                        SE = "YCGT0021 Index",
                        IR = "YCGT0062 Index",
                        BE = "YCGT0006 Index",
                        NE = "YCGT0020 Index", 
                        ZA = "YCGT0090 Index",
                        PL = "YCGT0177 Index", #Poland
                        MX = "YCGT0251 Index")
    

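    So bbGetCountry will create 4 different data structures, called actives, staticData, cfData, and historicData. The static and historic data are requested using the following Bloomberg field lists (bbDynamicDataFields is listed for completeness but is not used by the code shown here):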

    bbStaticDataFields <- c("ID_ISIN",
                          "ISSUER", 
                          "COUPON",
                          "CPN_FREQ",
                          "MATURITY",
                          "CALC_TYP_DES",                    # pricing calculation type 
                          "INFLATION_LINKED_INDICATOR",     # N or Y, in R returned as TRUE or FALSE
                          "ISSUE_DT",
                          "FIRST_SETTLE_DT",
                          "PX_METHOD",                      # PRC or YLD 
                          "PX_DIRTY_CLEAN",                 # market convention dirty or clean
                          "DAYS_TO_SETTLE",
                          "CALLABLE",
                          "MARKET_SECTOR_DES",
                          "INDUSTRY_SECTOR",
                          "INDUSTRY_GROUP",
                          "INDUSTRY_SUBGROUP")
    
    bbDynamicDataFields <- c("IS_STILL_CALLABLE",
                            "RTG_MOODY",
                            "RTG_MOODY_WATCH",
                            "RTG_SP",
                            "RTG_SP_WATCH",
                            "RTG_FITCH",
                            "RTG_FITCH_WATCH")
    
    bbHistoricDataFields <- c("PX_BID",
                              "PX_ASK",
                              #"PX_CLEAN_BID",
                              #"PX_CLEAN_ASK",
                              "PX_DIRTY_BID",
                              "PX_DIRTY_ASK",
                              #"ASSET_SWAP_SPD_BID",
                              #"ASSET_SWAP_SPD_ASK",
                              "LAST_PRICE",
                              #"SETTLE_DT",
                              "YLD_YTM_MID")
    

    Now you're ready to create couponbonds objects, using all these data structures:

    createCouponBonds <- function(cCode, dateString) {
        cdata <- get(paste(cCode, "data", sep = "")) # get the data set
        today <- as.Date(dateString)
        settleDate <- today
        daycount <- 0
        while(daycount < 3) {
            settleDate <- settleDate + 1
            if (!(weekdays(settleDate) %in% c("Saturday", "Sunday"))) daycount <- daycount + 1
        }
        goodbonds <- subset(cdata$staticData, COUPON != 0 & INFLATION_LINKED_INDICATOR == FALSE) # clean out zeros and tbills
        goodbonds <- goodbonds[rownames(goodbonds) %in% cdata$actives[[dateString]][, 1], ]
        stripnames <- sapply(strsplit(rownames(goodbonds), " "), function(x) x[1])
        pxbid <- cdata$historicData$PX_BID[today, stripnames]
        pxask <- cdata$historicData$PX_ASK[today, stripnames]
        pxdbid <- cdata$historicData$PX_DIRTY_BID[today, stripnames]
        pxdask <- cdata$historicData$PX_DIRTY_ASK[today, stripnames]
        price <- as.numeric((pxbid + pxask) / 2)
        accrued <- as.numeric(pxdbid - pxbid)
        cashflows <- lapply(rownames(goodbonds), function(x) {
            goodflows <- cdata$cfData[[x]][as.Date(cdata$cfData[[x]][, "Date"]) >= today, ]
            #gfstipnames <- sapply(strsplit(rownames(goodflows), " "), function(x) x[1]) dunno if I need this
            isin <- rep(cdata$staticData[x, "ID_ISIN"], nrow(goodflows))
            cf <- apply(goodflows[, 2:3], 1, sum) / 10000
            dt <- as.Date(goodflows[, 1])
            return(list(isin = isin, cf = cf, dt = dt))
        })
        isinvec <- unlist(lapply(cashflows, function(x) x$isin))
        cfvec <- as.numeric(unlist(lapply(cashflows, function(x) x$cf)))
        datevec <- do.call(c, lapply(cashflows, function(x) x$dt)) # c() keeps the Date class; unlist() would return bare numbers
        govbonds <- list(ISIN = goodbonds$ID_ISIN, 
                         MATURITYDATE = as.Date(goodbonds$MATURITY),
                         ISSUEDATE = as.Date(goodbonds$FIRST_SETTLE_DT),
                         COUPONRATE = as.numeric(goodbonds$COUPON) / 100,
                         PRICE = price,
                         ACCRUED = accrued,
                         CASHFLOWS = list(ISIN = isinvec, CF = cfvec, DATE = as.Date(datevec)),
                         TODAY = settleDate)
        govbonds <- list(govbonds)
        names(govbonds) <- cCode
        class(govbonds) <- "couponbonds"
        return(govbonds)
    }
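
    The settlement-date loop at the top of createCouponBonds relies on weekdays(), which returns day names in the current locale, so the comparison with "Saturday" and "Sunday" can silently fail on a non-English system. Below is a minimal sketch of the same T+3 weekday count using the numeric day of the week instead. The helper name `addWeekdays` is hypothetical, and public holidays are deliberately not handled.

    ```r
    # Step forward from `tradeDate` until `n` weekdays have been counted.
    # Weekends are skipped; public holidays are NOT handled in this sketch.
    addWeekdays <- function(tradeDate, n = 3) {
      d <- as.Date(tradeDate)
      counted <- 0
      while (counted < n) {
        d <- d + 1
        wday <- as.POSIXlt(d)$wday   # 0 = Sunday, 6 = Saturday, independent of locale
        if (!(wday %in% c(0, 6))) counted <- counted + 1
      }
      d
    }

    addWeekdays("2021-01-07")  # trade date is a Thursday, so the count crosses a weekend
    ```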
    

    Take a close look at the cashflows <- lapply... call, because this is where you'll create the sublist, and it is the core of the answer to your question. Of course, how this is done depends very much on how you have decided to build the intermediate data structures, and I have given you just one possibility.

    I realise that my answer is complex, but the problem is very complex. Not all the code you need is in this answer either: a few helper functions are missing, but I am happy to provide them if you contact me. Certainly the skeleton of the core functions is all here, and much of the problem is actually getting the data in the first place and structuring it appropriately. You correctly surmise that some of the data is static for each bond, some of it is dynamic, and some of it is historical. So the dimensions of the intermediate data structures are different for different pieces of the couponbonds objects. How you represent that is up to you, though I have used separate lists / data frames for each, linked via the bond IDs where necessary.

    The function above takes a date string, so you can call it for each of your historic data points using the above-mentioned lapply, and hey presto, dyncouponbonds:

    spl <- lapply(dodates, function(x) createCouponBonds("SP", x))
    # name each element by its date, as a character string
    names(spl) <- sapply(spl, function(x) format(x$SP$TODAY))
    class(spl) <- "dyncouponbonds"
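
    For anyone without Bloomberg access who wants to see the shape of the final object, here is a minimal self-contained sketch of the same assembly step. `makeToyBonds` is a hypothetical stand-in for createCouponBonds that returns a nearly empty couponbonds-like list with invented ISINs, so only the list-of-lists structure, the naming and the class attributes are illustrated, not a dataset termstrc could actually estimate from.

    ```r
    # Hypothetical stand-in for createCouponBonds(): returns a tiny
    # "couponbonds"-like list for one country code and one date (no real bond data).
    makeToyBonds <- function(cCode, dateString) {
      bonds <- list(list(ISIN = c("XS0000000001", "XS0000000002"),
                         TODAY = as.Date(dateString)))
      names(bonds) <- cCode
      class(bonds) <- "couponbonds"
      bonds
    }

    dodates <- c("2021-01-04", "2021-01-05", "2021-01-06")

    # One snapshot per date, named by the snapshot's own date as a character string.
    toy <- lapply(dodates, function(x) makeToyBonds("SP", x))
    names(toy) <- vapply(toy, function(x) format(x$SP$TODAY), character(1))
    class(toy) <- "dyncouponbonds"

    names(toy)
    class(toy[["2021-01-05"]])
    ```

    Using vapply with character(1) rather than lapply for the names guarantees a plain character vector, so each snapshot can later be looked up by its date string.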
    

    There you go. You asked for it....

    If you're not using Bloomberg, your input data structures will be very different but, as I said starting out, get super familiar with lapply and sapply. Obviously there are many other ways this problem could be solved, but the above works for Bloomberg. If you understand this code, you'll surely know what you're doing for other data sources.

    Finally, please note that the Rbbg package from findata.org is used to interface to Bloomberg.
