Splitting a string based on locating a varying code (with a similar format)

难免孤独 2021-01-26 09:09

I uploaded a txt file into R as follows: Election_Parties <- readr::read_lines("Election_Parties.txt") Let's say the following text

1 Answer
  • 2021-01-26 09:44

    You may use

    strsplit(paste(Election_Parties, collapse=" "), "\\s+(?=P\\d+-)", perl=TRUE)[[1]]
    

    See the R demo online.

    Output:

    [1] "P23-Andalusian Social Democratic Party (Partido Social-Demócrata Andaluz [PSDA])"                              
    [2] "P24-Andalusian Socialist Movement (Movimiento Socialista Andaluz [MSA])"                                       
    [3] "P235-Andalusian Socialist Party-Andalucian Party (Partido Socialista Andalucista-Partido Andalucista [PSA-PA])"
    [4] "P26-Andalusist Party (Partido Andalucista [PA])"                                                               
    [5] "P217-Andecha Astur (Andecha Astur [AA])" 
    

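    The \s+(?=P\d+-) pattern matches one or more whitespace characters that are followed by P, one or more digits, and a hyphen. The P<numbers>- part is not consumed, because it sits inside a positive lookahead, which is a zero-width assertion. Only the whitespace is removed by the split, so each piece still starts with its code. Because of the lookahead, the perl=TRUE argument is required so that the regex is processed by the PCRE engine; R's default regex engine does not support lookarounds.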
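
    As a minimal, self-contained sketch of the same idea: the input vector below is a hypothetical stand-in for the result of readr::read_lines(), with invented line breaks, since the original file contents are not shown in the question. The last two lines are an optional extra step (not part of the answer above) that separates each code from its name using base R only.

    ```r
    # Hypothetical stand-in for readr::read_lines("Election_Parties.txt");
    # the line breaks are invented for illustration.
    Election_Parties <- c(
      "P26-Andalusist Party",
      "(Partido Andalucista [PA]) P217-Andecha Astur",
      "(Andecha Astur [AA])"
    )

    # Join all lines into one string, then split on the whitespace
    # that immediately precedes a code such as "P217-".
    joined  <- paste(Election_Parties, collapse = " ")
    parties <- strsplit(joined, "\\s+(?=P\\d+-)", perl = TRUE)[[1]]
    print(parties)
    # [1] "P26-Andalusist Party (Partido Andalucista [PA])"
    # [2] "P217-Andecha Astur (Andecha Astur [AA])"

    # Optional: pull the code and the remaining name apart.
    codes      <- regmatches(parties, regexpr("^P\\d+", parties))
    names_only <- sub("^P\\d+-", "", parties)
    ```

    Note that " Party" and " (Partido" do not trigger a split: the lookahead requires P to be followed directly by digits and a hyphen.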
