“last name, first name” -> “first name last name” in serialized strings

慢半拍i 2021-01-13 00:14

I have a bunch of strings that contain lists of names in last name, first name format, separated by commas, like so:

names <- c('Beaufoy


        
3 Answers
  • 2021-01-13 01:00

    If you can be certain that a comma isn't going to be in a person's name, this might work:

    mynames <- c('Beaufoy, Simon, Boyle, Danny',
                 'Nolan, Christopher',
                 'Blumberg, Stuart, Cholodenko, Lisa',
                 'Seidler, David',
                 'Sorkin, Aaron',
                 'Hoover, J. Edgar')
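    # Split each string into its comma-separated fields
    mynames2 <- strsplit(mynames, ", ")
    
    # Even-indexed fields are first names, odd-indexed fields are last names
    unlist(lapply(mynames2, 
                  function(x) paste(x[seq_along(x) %% 2 == 0], 
                                    x[seq_along(x) %% 2 != 0])))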
    # [1] "Simon Beaufoy"     "Danny Boyle"       "Christopher Nolan"
    # [4] "Stuart Blumberg"   "Lisa Cholodenko"   "David Seidler"    
    # [7] "Aaron Sorkin"      "J. Edgar Hoover"        
    

    I've added J. Edgar Hoover in there for good measure.

    If you want the names that were quoted together to stay together, add collapse = ", " to your paste() function:

    # Same pairing as before, but names from one input string stay in one element
    unlist(lapply(mynames2, 
                  function(x) paste(x[seq_along(x) %% 2 == 0], 
                                    x[seq_along(x) %% 2 != 0],
                                    collapse = ", ")))
    # [1] "Simon Beaufoy, Danny Boyle"       "Christopher Nolan"               
    # [3] "Stuart Blumberg, Lisa Cholodenko" "David Seidler"                   
    # [5] "Aaron Sorkin"                     "J. Edgar Hoover"    
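
The odd/even pairing above can also be wrapped in a small function. This is only a sketch: `swap_names` is a hypothetical helper name, not part of the original answer, and returning `NA` for strings with an odd number of fields is an assumption about how malformed input should be handled.

```r
# Hypothetical helper: swap every "Last, First" pair inside each string.
swap_names <- function(x, collapse = ", ") {
  pieces <- strsplit(x, ", ", fixed = TRUE)
  vapply(pieces, function(p) {
    # An odd number of fields cannot be split into clean "Last, First" pairs.
    if (length(p) %% 2 != 0) return(NA_character_)
    last  <- p[c(TRUE, FALSE)]   # fields 1, 3, 5, ... (logical index is recycled)
    first <- p[c(FALSE, TRUE)]   # fields 2, 4, 6, ...
    paste(first, last, collapse = collapse)
  }, character(1))
}

swap_names(c("Beaufoy, Simon, Boyle, Danny", "Hoover, J. Edgar", "Cher"))
# gives "Simon Beaufoy, Danny Boyle", "J. Edgar Hoover" and NA
```

The recycled logical index `c(TRUE, FALSE)` does the same job as the `%% 2` test, and `vapply` guarantees one string per input element.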
    
  • 2021-01-13 01:00

    I'm in favor of @AnandaMahto's answer, but just for fun, here is another method using scan, split, and rapply.

    names <- c(names, 'Chambers, John, Ihaka, Ross, Gentleman, Robert')
    
    # extract names: one character vector of fields per input string
    snames <- lapply(names, function(x)
      scan(text = x, what = '', sep = ',', strip.white = TRUE, quiet = TRUE))
    
    # break up names into (last, first) pairs
    snames <- lapply(snames, function(x) split(x, rep(seq_len(length(x) %/% 2), each = 2)))
    
    # collapse together, reversed
    rapply(snames, function(x) paste(x[2:1], collapse=' '))
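
A related way to do the pairing, shown here as a sketch rather than as the answer's own method, is to reshape the scanned fields into a two-row matrix instead of using split and rapply. `pair_up` is a hypothetical helper name, and the code assumes every string holds an even number of fields.

```r
# Hypothetical helper: scan one string and pair its fields via a 2-row matrix.
pair_up <- function(x) {
  fields <- scan(text = x, what = "", sep = ",", strip.white = TRUE, quiet = TRUE)
  m <- matrix(fields, nrow = 2)   # filled column-wise: row 1 = last, row 2 = first
  paste(m[2, ], m[1, ])
}

pair_up("Chambers, John, Ihaka, Ross, Gentleman, Robert")
# [1] "John Chambers"    "Ross Ihaka"       "Robert Gentleman"
```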
    
  • 2021-01-13 01:09

    (1) Maintain same names in each element. This can be done with a single gsub (assuming there are no commas within names):

    > gsub("([^, ][^,]*), ([^,]+)", "\\2 \\1", names)
    [1] "Simon Beaufoy, Danny Boyle"       "Christopher Nolan"               
    [3] "Stuart Blumberg, Lisa Cholodenko" "David Seidler"                   
    [5] "Aaron Sorkin"    
    
    > gsub("([^, ][^,]*), ([^,]+)", "\\2 \\1", "Hoover, J. Edgar")
    [1] "J. Edgar Hoover"
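
For readers who want the regular expression spelled out, here is a short annotated sketch. The input vector `x` is illustrative and much shorter than the question's `names`.

```r
# The pattern from the answer, piece by piece:
#   ([^, ][^,]*)  group 1: last name (starts with a non-comma, non-space character)
#   ", "          the separator inside one "Last, First" pair
#   ([^,]+)       group 2: first name(s), up to the next comma
pattern <- "([^, ][^,]*), ([^,]+)"

# Illustrative input (not the question's full vector)
x <- c("Beaufoy, Simon, Boyle, Danny", "Hoover, J. Edgar")

# "\\2 \\1" writes group 2, a space, then group 1 for every matched pair
swapped <- gsub(pattern, "\\2 \\1", x)
print(swapped)
# [1] "Simon Beaufoy, Danny Boyle" "J. Edgar Hoover"
```

Because group 1 cannot start with a space or comma, the `", "` between two people is never consumed as part of a name, so it survives in the output.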
    

    (2) Separate into one name per element. If you want each first name last name in a separate element, then use (a) scan:

    scan(text = out, sep = ",", what = "", strip.white = TRUE)
    

    where out is the result of the gsub above (strip.white = TRUE drops the space that follows each comma), or, to get it directly, try (b) strapply:

    > library(gsubfn)
    > strapply(names, "([^, ][^,]*), ([^,]+)", x + y ~ paste(y, x), simplify = c)
    [1] "Simon Beaufoy"     "Danny Boyle"       "Christopher Nolan"
    [4] "Stuart Blumberg"   "Lisa Cholodenko"   "David Seidler"    
    [7] "Aaron Sorkin"     
    
    > strapply("Hoover, Edgar J.", "([^, ][^,]*), ([^,]+)", x + y ~ paste(y, x), 
    +   simplify = c)
    [1] "Edgar J. Hoover"
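
If installing gsubfn is not an option, a base R sketch of the same "one name per element" result is to run the gsub and then split on the remaining separators. This is an alternative to strapply, not what the answer itself uses, and `x` is again an illustrative input.

```r
pattern <- "([^, ][^,]*), ([^,]+)"

# Illustrative input (not the question's full vector)
x <- c("Beaufoy, Simon, Boyle, Danny", "Nolan, Christopher")

# Step 1: swap each pair in place; step 2: split on the ", " left between people
swapped <- gsub(pattern, "\\2 \\1", x)
one_per_element <- unlist(strsplit(swapped, ", ", fixed = TRUE))
print(one_per_element)
# [1] "Simon Beaufoy"     "Danny Boyle"       "Christopher Nolan"
```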
    

    Note that all examples above used the same regular expression for matching.

    UPDATE: removed comma separating first and last name.

    UPDATE: added code to separate out each first name last name into a separate element in case that is the preferred output format.
