How can I split a string in Java and retain the delimiters?

无人共我 2020-12-06 17:01

I have this string (Java 1.5):

:alpha;beta:gamma;delta

I need to get an array:

{":alpha", ";beta", ":gamma", ";delta"}


        
6 Answers
  • 2020-12-06 17:24

    This should work with Java 1.5 (Pattern.quote was introduced in Java 1.5).

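    // Split the string on the delimiter, but keep the delimiter in the result.
    // Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
    private String[] splitStringOnDelimiter(String text, String delimiter, String safeSequence){
        // String.split deletes whatever it splits on, so a temporary marker
        // (safeSequence) is inserted before each delimiter and the split is done on that.
        // For safeSequence use something that doesn't occur in your texts.
        // quoteReplacement keeps '$' and '\' in the replacement literal.
        text=text.replaceAll(Pattern.quote(delimiter), Matcher.quoteReplacement(safeSequence+delimiter));
        return text.split(Pattern.quote(safeSequence));
    }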
    

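    If the text starts with the delimiter, the first element is an empty string. This variant drops it:

    private String[] splitStringOnDelimiter(String text, String delimiter, String safeSequence){
        text=text.replaceAll(Pattern.quote(delimiter), Matcher.quoteReplacement(safeSequence+delimiter));
        String[] tempArray = text.split(Pattern.quote(safeSequence));
        // Only drop the first element when it is empty (the text began with the delimiter)
        if (tempArray.length == 0 || tempArray[0].length() > 0) {
            return tempArray;
        }
        String[] returnArray = new String[tempArray.length-1];
        System.arraycopy(tempArray, 1, returnArray, 0, returnArray.length);
        return returnArray;
    }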
    

    E.g., here "a" is the delimiter:

    splitStringOnDelimiter("-asd-asd-g----10-9asdas jadd", "a", "<>")
    

    You get this:

    1.: -
    2.: asd-
    3.: asd-g----10-9
    4.: asd
    5.: as j
    6.: add
    

    If you in fact want this:

    1.: -a
    2.: sd-a
    3.: sd-g----10-9a
    4.: sda
    5.: s ja
    6.: dd
    

    You switch:

    safeSequence+delimiter
    

    with

    delimiter+safeSequence
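
As a rough illustration of the two orderings above, the sketch below puts the choice behind a flag. The class name `MarkerSplitDemo` and the `delimiterLeadsToken` parameter are hypothetical names introduced here, not part of the answer, and the marker `<>` is assumed not to occur in the text.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MarkerSplitDemo {

    // delimiterLeadsToken = true:  marker goes before the delimiter, so tokens start with it.
    // delimiterLeadsToken = false: marker goes after the delimiter, so tokens end with it.
    static String[] split(String text, String delimiter, String safeSequence,
                          boolean delimiterLeadsToken) {
        String marked = delimiterLeadsToken
                ? safeSequence + delimiter
                : delimiter + safeSequence;
        // quoteReplacement keeps '$' and '\' in the replacement from being interpreted.
        String prepared = text.replaceAll(Pattern.quote(delimiter),
                                          Matcher.quoteReplacement(marked));
        return prepared.split(Pattern.quote(safeSequence));
    }

    public static void main(String[] args) {
        String text = "-asd-asd-g----10-9asdas jadd";
        // prints [-, asd-, asd-g----10-9, asd, as j, add]
        System.out.println(Arrays.toString(split(text, "a", "<>", true)));
        // prints [-a, sd-a, sd-g----10-9a, sda, s ja, dd]
        System.out.println(Arrays.toString(split(text, "a", "<>", false)));
    }
}
```

Both calls reproduce the two listings shown in this answer.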
    
  • 2020-12-06 17:25
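    import java.util.ArrayList;

    /**
     * Splits str into tokens that each start with a delimiter (any character
     * that is not a letter or digit). Text before the first delimiter is dropped.
     *
     * @param list an empty String list, used internally to collect the tokens.
     * @param str  the String to be processed.
     * @return the tokens in their original order, each with its leading delimiter.
     */
    public String[] split(ArrayList<String> list, String str){
      for(int i = str.length()-1 ; i >=0 ; i--){
         if(!Character.isLetterOrDigit(str.charAt(i))) {
            // Recurse on the prefix first so tokens are added left to right
            split(list, str.substring(0,i));
            list.add(str.substring(i));
            break;
         }
      }
      return list.toArray(new String[list.size()]);
    }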
    
  • 2020-12-06 17:26

    You can do this simply with the Pattern and Matcher classes from java.util.regex.

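    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public static String[] mysplit(String text)
    {
        List<String> s = new ArrayList<String>();
        // Each match is one delimiter (':' or ';') followed by the word after it
        Matcher m = Pattern.compile("(:|;)\\w+").matcher(text);
        while(m.find()) {
            s.add(m.group());
        }
        return s.toArray(new String[s.size()]);
    }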
    
  • 2020-12-06 17:32

    Assuming that only a finite set of separators can appear before the words in your string (e.g. ;, : etc.), you can use the following technique. (Apologies for any syntax errors, but it's been a while since I used Java.)

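    String toSplit = ":alpha;beta:gamma;delta";
    // "~" is a temporary marker; it must not occur in the text itself
    toSplit = toSplit.replace(":", "~:");
    toSplit = toSplit.replace(";", "~;");
    // repeat for all your possible separators
    // the first element is empty because the text starts with a separator
    String[] splitStrings = toSplit.split("~");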
    
  • 2020-12-06 17:34
    str.split("(?=[:;])")
    

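    This gives the desired array, but on Java 7 and earlier (including Java 1.5) it has an empty first item; since Java 8, split no longer produces a leading empty string for a zero-width match at the start of the input. And: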

    str.split("(?=\\b[:;])")
    

    This will give the array without the empty first item.

    • The key here is (?=X), a zero-width positive lookahead (a non-capturing construct; see the java.util.regex.Pattern documentation).
    • [:;] means "either ; or :"
    • \b is a word boundary. It is there so that the first : is not treated as a delimiter: it sits at the beginning of the sequence, with no word character before it.
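
To make the lookahead split concrete, here is a minimal runnable sketch. The class and method names (`LookaheadSplitDemo`, `splitKeepingDelimiters`) are hypothetical, and the filtering loop is an addition of this sketch rather than part of the answer: it simply discards an empty first element so the result is the same whether or not the Java version in use produces one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LookaheadSplitDemo {

    // Splits before every ':' or ';' so each token keeps its leading delimiter.
    static String[] splitKeepingDelimiters(String text) {
        // (?=[:;]) matches the empty position just before a ':' or ';'
        String[] parts = text.split("(?=[:;])");
        // Older Java versions return an empty first element when the text starts
        // with a delimiter; dropping empty strings makes the result uniform.
        List<String> result = new ArrayList<String>();
        for (String part : parts) {
            if (part.length() > 0) {
                result.add(part);
            }
        }
        return result.toArray(new String[result.size()]);
    }

    public static void main(String[] args) {
        // prints [:alpha, ;beta, :gamma, ;delta]
        System.out.println(Arrays.toString(splitKeepingDelimiters(":alpha;beta:gamma;delta")));
    }
}
```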
  • 2020-12-06 17:36

    To keep the separators, you can use a StringTokenizer:

    new StringTokenizer(":alpha;beta:gamma;delta", ":;", true)
    

    That would yield the separators as tokens.

    To have them as part of your tokens, you could use String#split with lookahead.
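
If you would rather stay with StringTokenizer, one possible way to attach each separator to the word that follows it is sketched below. The class and method names (`TokenizerSplitDemo`, `splitKeepingDelimiters`) are hypothetical, and the gluing logic is an assumption of this sketch, not something StringTokenizer does by itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerSplitDemo {

    static String[] splitKeepingDelimiters(String text, String delimiters) {
        // returnDelims = true: each delimiter is returned as its own one-character token.
        StringTokenizer tokenizer = new StringTokenizer(text, delimiters, true);
        List<String> result = new ArrayList<String>();
        String pending = ""; // delimiter waiting to be attached to the next word
        while (tokenizer.hasMoreTokens()) {
            String token = tokenizer.nextToken();
            boolean isDelimiter = token.length() == 1 && delimiters.indexOf(token) >= 0;
            if (isDelimiter) {
                if (pending.length() > 0) {
                    result.add(pending); // previous delimiter had no word after it
                }
                pending = token;
            } else {
                result.add(pending + token);
                pending = "";
            }
        }
        if (pending.length() > 0) {
            result.add(pending); // trailing delimiter with nothing after it
        }
        return result.toArray(new String[result.size()]);
    }

    public static void main(String[] args) {
        // prints [:alpha, ;beta, :gamma, ;delta]
        System.out.println(Arrays.toString(
                splitKeepingDelimiters(":alpha;beta:gamma;delta", ":;")));
    }
}
```

Unlike the lookahead split, this keeps a delimiter that has no word after it as a token of its own.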
