Remove text after second colon

橙三吉。 Posted on 2019-12-11 07:25:07

Question


I need to remove everything after the second colon. I have several date formats that need to be cleaned using the same algorithm.

a <- "2016-12-31T18:31:34Z"
b <- "2016-12-31T18:31Z"

I have tried to match the two colon groups, but I cannot figure out how to remove only the second matched group.

sub("(:.*){2}", "", "2016-12-31T18:31:34Z")

Answer 1:


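A regex you can use: (:[^:]+):.*

It captures the first colon together with the non-colon characters that follow it, then matches the second colon and everything after it. Replacing the whole match with \\1 keeps only the captured part. You can check the pattern on regex101 and use it like this: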

sub("(:[^:]+):.*", "\\1", "2016-12-31T18:31:34Z")
[1] "2016-12-31T18:31"
sub("(:[^:]+):.*", "\\1", "2016-12-31T18:31Z")
[1] "2016-12-31T18:31Z"
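As a rough side-by-side sketch (the vector name `x` is only illustrative), the snippet below runs the attempt from the question and the pattern above on both inputs. In the attempt, the repeated group matches from the first colon to the end of the string, so the minutes are removed as well. Note that the pattern above drops the trailing "Z" only when seconds were present.

```r
x <- c("2016-12-31T18:31:34Z", "2016-12-31T18:31Z")

# Attempt from the question: the repeated group spans from the first
# colon to the end of the string, so the minutes are removed too.
attempt <- sub("(:.*){2}", "", x)
# attempt is c("2016-12-31T18", "2016-12-31T18:31Z")

# Pattern from this answer: keep the first ":<non-colons>" group and
# drop the second colon plus everything after it.
trimmed <- sub("(:[^:]+):.*", "\\1", x)
# trimmed is c("2016-12-31T18:31", "2016-12-31T18:31Z")

print(attempt)
print(trimmed)
```

Both calls are vectorised, so the same line works on a whole column of timestamps.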



Answer 2:


Use this as an opportunity to write a partial timestamp validator, rather than just targeting any trailing seconds:

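remove_seconds <- function(x) {
  # Requires the stringi package; works on a whole character vector
  x <- stringi::stri_trim_both(x)
  # Column 2 holds the captured "YYYY-MM-DDTHH:MM" part, NA where nothing matched
  m <- stringi::stri_match_first_regex(x, "([[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}T[[:digit:]]{2}:[[:digit:]]{2})")
  # Return NA for strings that do not look like a timestamp, else re-append "Z"
  ifelse(is.na(m[, 2]), NA_character_, sprintf("%sZ", m[, 2]))
}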

That way, you'll catch errant timestamp strings.
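The same idea can be sketched in base R without extra packages. This is only an illustration under stated assumptions: the helper name `strip_seconds_checked` is hypothetical, and the pattern assumes every valid timestamp ends in "Z" and has optional two-digit seconds.

```r
# Hypothetical helper: a base-R variant of the validating approach.
# Assumption: valid input looks like YYYY-MM-DDTHH:MM[:SS]Z.
strip_seconds_checked <- function(x) {
  x <- trimws(x)
  pattern <- "^([0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2})(:[0-9]{2})?Z$"
  ok <- grepl(pattern, x)
  # Anything that does not match the expected shape becomes NA
  out <- rep(NA_character_, length(x))
  out[ok] <- sub(pattern, "\\1Z", x[ok])
  out
}

checked <- strip_seconds_checked(
  c("2016-12-31T18:31:34Z", " 2016-12-31T18:31Z", "not a timestamp")
)
print(checked)
# "2016-12-31T18:31Z" "2016-12-31T18:31Z" NA
```

Returning NA for malformed input makes bad rows easy to find afterwards with `is.na()`.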




Answer 3:


Let's say you have a vector:

date <- c("2016-12-31T18:31:34Z", "2016-12-31T18:31Z", "2017-12-31T18:31Z")

Then you could split each string on ":" and keep only the first two pieces, dropping the rest:

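# head(p, 2) keeps at most two pieces, so strings with fewer colons are not padded with NA
out <- vapply(strsplit(date, ":"), function(p) paste(head(p, 2), collapse = ":"), character(1))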
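As a quick check of the split approach, here is a small self-contained sketch. The names `stamps`, `kept`, and `normalised` are illustrative only, and the final step that makes every result end in "Z" is an optional addition, not part of the answer.

```r
stamps <- c("2016-12-31T18:31:34Z", "2016-12-31T18:31Z", "2017-12-31T18:31Z")

# Split on ":" and glue the first two pieces back together.
parts <- strsplit(stamps, ":")
kept <- vapply(parts, function(p) paste(head(p, 2), collapse = ":"), character(1))
# kept is c("2016-12-31T18:31", "2016-12-31T18:31Z", "2017-12-31T18:31Z")

# Optional: strip any trailing "Z" and append one, so all results look alike.
normalised <- paste0(sub("Z$", "", kept), "Z")
# normalised is c("2016-12-31T18:31Z", "2016-12-31T18:31Z", "2017-12-31T18:31Z")

print(kept)
print(normalised)
```

The first element loses its "Z" because the "Z" sat in the third piece, which is why the normalising step may be useful.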


Source: https://stackoverflow.com/questions/46213661/remove-text-after-second-colon
