Error with a function to retrieve data from a database

后悔当初 2021-01-13 11:42

I am trying to get a FASTA file from the NCBI website. I use the following function:

getncbiseq <- function(accession){
  dbs <- c()
  for (i in 1:numdbs){         


        
2 answers
  • 2021-01-13 12:09

    I think you intended to assign your call to query() to a variable called query2, but you forgot to do it. Try this:

    if (!(inherits(resquery, "try-error"))) {
      queryname <- "query2"
      thequery <- paste("AC=", accession, sep="")
      query2 <- query(queryname, thequery)
      # see if a sequence was retrieved:
      seq <- getSequence(query2$req[[1]])
      closebank()
      return(seq)
    }
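
    The `try()` / `inherits()` check used in that snippet may be easier to follow in isolation. The following is a minimal sketch that runs without seqinr or a network connection; `risky_lookup` is a hypothetical stand-in for `query()`, not part of any package.

    ```r
    # Hypothetical stand-in for query(): fails for anything except "known".
    risky_lookup <- function(key) {
      if (key != "known") stop("nothing found for ", key)
      paste0("record:", key)
    }

    # try() returns the value on success and an object of class "try-error"
    # on failure; silent = TRUE suppresses the printed error message.
    ok  <- try(risky_lookup("known"),   silent = TRUE)
    bad <- try(risky_lookup("missing"), silent = TRUE)

    found_ok  <- !inherits(ok,  "try-error")
    found_bad <- !inherits(bad, "try-error")
    ```

    Only when the check passes is it safe to use the returned value, which is why the assignment to `query2` sits inside the `if` block.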
    

    As you mentioned, the rest of your code also has some quirks and kinks which could probably be improved upon.

    Update:

    Here is a refactor of your code that uses sapply over the dbs vector instead of an explicit for loop (explicit loops are often discouraged in idiomatic R):

    processdbs <- function(x, y) {
        choosebank(x)
        resquery <- try(query(".tmpquery", paste("AC=", y)), silent = TRUE)
        if (!(inherits(resquery, "try-error"))) {
          queryname <- "query2"
          thequery  <- paste("AC=", y, sep="")
          query2 <- query(queryname, thequery)
    
          # see if a sequence was retrieved:
          seq <- getSequence(query2$req[[1]])
          closebank()
          return(seq)
        }
        closebank()
    }
    
    getncbiseq <- function(accession) {
       dbs <- c("genbank","refseq","refseqViruses","bacterial")
       result <- sapply(dbs, processdbs, y=accession)
       closebank()
    
       print(paste("ERROR: accession",accession,"was not found"))
    }
    

    As written, getncbiseq() never looks at result and always prints the error message, so you will need a small amount of additional work to inspect result and determine whether any database returned a sequence.
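
    One possible way to do that inspection is sketched below. It assumes the per-database lookup returns NULL when nothing is found; `fake_processdbs`, `first_hit`, and the accession string are hypothetical names used only for illustration, and the stand-in lookup does not contact ACNUC.

    ```r
    # Hypothetical stand-in for processdbs(): pretends only "refseq" has a hit.
    fake_processdbs <- function(db, accession) {
      if (db == "refseq") c("a", "t", "g", "c") else NULL
    }

    # Query every database, drop the empty results, and return the first hit.
    # lapply() is used because it always returns a list, even with NULL entries.
    first_hit <- function(dbs, accession, lookup) {
      results <- lapply(dbs, lookup, accession)
      hits <- Filter(Negate(is.null), results)
      if (length(hits) == 0) {
        message("accession ", accession, " was not found")
        return(NULL)
      }
      hits[[1]]
    }

    dbs <- c("genbank", "refseq", "refseqViruses", "bacterial")
    seq <- first_hit(dbs, "MY_ACCESSION", fake_processdbs)
    ```

    With the real processdbs() you would pass it as the `lookup` argument, after making sure it returns NULL explicitly when the query fails.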

  • 2021-01-13 12:16

    Thanks for the great help. I was stuck on this for a full day. I finally got the following code to work under Windows 10 with R 3.4.0 (32-bit):

    getncbiseq <- function(accession)
    {
      require("seqinr") # this function requires the SeqinR R package
      # first find which ACNUC database the accession is stored in:
      dbs <- c("genbank","refseq","refseqViruses","bacterial")
      numdbs <- length(dbs)
      for (i in 1:numdbs)
      {
        db <- dbs[i]
        choosebank(db)
        # check if the sequence is in ACNUC database 'db':
        resquery <- try(query(".tmpquery", paste("AC=", accession)), silent = TRUE)
    
        if (!(inherits(resquery, "try-error"))) {
          queryname <- "query2"
          thequery <- paste("AC=", accession, sep="")
          query2 <- query(queryname, thequery)
          # see if a sequence was retrieved:
          seq <- getSequence(query2$req[[1]])
          closebank()
          return(seq)
        }
        closebank()
      }
      print(paste("ERROR: accession",accession,"was not found"))
    }
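
    A side note on the two paste() calls in the function: they build slightly different strings, because paste() separates its arguments with a space unless sep is given. The short sketch below only shows the string difference; whether the ACNUC server treats both forms the same is not checked here, and the accession value is a placeholder.

    ```r
    accession <- "MY_ACCESSION"  # placeholder, not a real accession

    with_space    <- paste("AC=", accession)            # default sep = " "
    without_space <- paste("AC=", accession, sep = "")  # no separator
    same_as_paste0 <- paste0("AC=", accession)          # shorthand for sep = ""
    ```

    Using the same form in both places (for example paste0) would make the test query and the real query identical.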
    