Replicating String.split with StringTokenizer

心在旅途 2021-02-06 14:03

Encouraged by this, and by the fact that I have billions of strings to parse, I tried to modify my code to accept StringTokenizer instead of String[].

The only t

9 answers
  •  小蘑菇 (original poster)
    2021-02-06 14:49

    Well, the fastest thing you could do would be to manually traverse the string, e.g.

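    import java.util.ArrayList;
    import java.util.List;

    // Splits s on ',' by scanning with indexOf; no regex is involved.
    List<String> split(String s) {
        List<String> out = new ArrayList<>();
        int idx = 0;
        int next;
        while ((next = s.indexOf(',', idx)) > -1) {
            out.add(s.substring(idx, next));
            idx = next + 1;
        }
        // Add whatever follows the last comma; a trailing empty field is dropped.
        if (idx < s.length()) {
            out.add(s.substring(idx));
        }
        return out;
    }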
    

    In an informal test, this looks to be roughly twice as fast as split. However, iterating this way is a bit dangerous: for example, it will break on escaped commas. If you end up needing to deal with that at some point (because your list of a billion strings has 3 escaped commas), then by the time you've allowed for it you'll probably have lost some of the speed benefit.
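    To make the escaped-comma concern concrete, here is a rough sketch of what allowing for it might look like. The escape convention (a backslash before a comma or another backslash) and the names EscapedSplit and splitEscaped are assumptions for illustration only; the question does not say how commas would be escaped.

```java
import java.util.ArrayList;
import java.util.List;

class EscapedSplit {

    // Hypothetical helper: splits on ',' but treats "\," as a literal comma
    // and "\\" as a literal backslash. This escape convention is an assumption.
    static List<String> splitEscaped(String s) {
        List<String> out = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                // Escaped character: skip the backslash, copy the next char verbatim.
                field.append(s.charAt(++i));
            } else if (c == ',') {
                // An unescaped comma ends the current field.
                out.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        // Like the indexOf version, drop a trailing empty field.
        if (field.length() > 0) {
            out.add(field.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // The input is a,b\,c,d once Java's own string escaping is resolved.
        System.out.println(splitEscaped("a,b\\,c,d")); // prints [a, b,c, d]
    }
}
```

    Note that this version inspects and copies every character into a StringBuilder instead of jumping between commas with indexOf and substring, which is the kind of extra work that is likely to eat into the speed advantage.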

    Ultimately it's probably not worth the bother.
