Converting a sentence string to a string array of words in Java

余生分开走 2020-12-01 00:04

I need my Java program to take a string like:

"This is a sample sentence."

and turn it into a string array like:

{"this", ...}

16 Answers
  • 2020-12-01 00:26

    String.split() will do most of what you want. You may then need to loop over the words to pull out any punctuation.

    For example:

    String s = "This is a sample sentence.";
    String[] words = s.split("\\s+");
    for (int i = 0; i < words.length; i++) {
        // You may want to check for a non-word character before blindly
        // performing a replacement
        // It may also be necessary to adjust the character class
        words[i] = words[i].replaceAll("[^\\w]", "");
    }
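
    The question asks for lower-case words, which the loop above does not produce. A minimal sketch of the same split-then-clean approach with lower-casing added follows; the class name SentenceWords and the helper toWords are hypothetical names chosen for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class SentenceWords {
    // Hypothetical helper: lower-cases the sentence, splits it on whitespace,
    // then strips non-word characters from each token.
    static String[] toWords(String sentence) {
        List<String> words = new ArrayList<>();
        for (String token : sentence.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            String cleaned = token.replaceAll("\\W", "");
            if (!cleaned.isEmpty()) {   // skip tokens that were pure punctuation
                words.add(cleaned);
            }
        }
        return words.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // prints [this, is, a, sample, sentence]
        System.out.println(Arrays.toString(toWords("This is a sample sentence.")));
    }
}
```

    Collecting into a list first makes it easy to drop tokens that become empty after cleaning, which a fixed-length array cannot do in place.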
    
  • 2020-12-01 00:26

    Most of the answers here convert the String to a String array, as the question asked. In practice, though, we generally work with a List, so this may be more useful:

    String dummy = "This is a sample sentence.";
    List<String> wordList = Arrays.asList(dummy.split(" "));
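
    One thing to keep in mind is that Arrays.asList returns a fixed-size list backed by the array, so add() and remove() throw UnsupportedOperationException. If the list needs to grow or shrink, a sketch like the following copies it into an ArrayList; WordListDemo and toMutableList are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WordListDemo {
    // Hypothetical helper: returns a resizable list of the space-separated words.
    static List<String> toMutableList(String sentence) {
        // Copy the fixed-size view into an ArrayList so it can be modified later.
        return new ArrayList<>(Arrays.asList(sentence.split(" ")));
    }

    public static void main(String[] args) {
        List<String> words = toMutableList("This is a sample sentence.");
        words.add("extra");          // allowed on the ArrayList copy
        System.out.println(words);   // prints [This, is, a, sample, sentence., extra]
    }
}
```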
    
  • 2020-12-01 00:28

    Try this:

    // requires: import java.util.regex.Pattern;
    String[] stringArray = Pattern.compile("\\s+").split(
        "This is a sample sentence"
            .replaceAll("[^\\p{Alnum}\\s]+", "") // remove everything except alphanumerics and whitespace
    );

    for (int i = 0; i < stringArray.length; i++) {
        System.out.println(i + " \"" + stringArray[i] + "\"");
    }
    
  • 2020-12-01 00:29

    The following code snippet splits a sentence into words and also counts how often each word occurs.

    import java.util.HashMap;
    import java.util.Map;

    public class StringToword {
        public static void main(String[] args) {
            String s = "a a a A A";
            String[] splitedString = s.split(" ");
            // Map each distinct word to the number of times it occurs
            Map<String, Integer> m = new HashMap<>();
            for (String s1 : splitedString) {
                // merge() stores 1 for a new word, otherwise adds 1 to that word's own count
                m.merge(s1, 1, Integer::sum);
            }
            // Print each word=count entry
            for (Map.Entry<String, Integer> entry : m.entrySet()) {
                System.out.println(entry);
            }
        }
    }
    
  • 2020-12-01 00:33

    Now, this can be accomplished with split alone, since it takes a regex:

    String s = "This is a sample sentence with []s.";
    String[] words = s.split("\\W+");

    This gives words as: {"This","is","a","sample","sentence","with","s"}

    The \\W+ matches one or more consecutive non-word characters (anything other than a letter, digit, or underscore), so there is no need for a separate replace step. Note that split does not change case; call toLowerCase() first if lower-case words are needed. You can try other patterns as well.
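
    One edge case of this approach: split drops trailing empty strings but keeps a leading one, so a sentence that starts with punctuation yields "" as its first element. A minimal sketch that strips leading separators first and also lower-cases the input is shown below; NonWordSplit and splitWords are hypothetical names for illustration.

```java
import java.util.Arrays;
import java.util.Locale;

class NonWordSplit {
    // Hypothetical helper: lower-cases the input, removes leading non-word
    // characters, then splits on runs of non-word characters.
    static String[] splitWords(String sentence) {
        String stripped = sentence.toLowerCase(Locale.ROOT).replaceFirst("^\\W+", "");
        // "".split(...) would return one empty string, so handle that case explicitly
        return stripped.isEmpty() ? new String[0] : stripped.split("\\W+");
    }

    public static void main(String[] args) {
        // Without the stripping step the first element would be an empty string
        System.out.println(Arrays.toString("[]s are here.".split("\\W+")));      // prints [, s, are, here]
        System.out.println(Arrays.toString(splitWords("[]s are here.")));        // prints [s, are, here]
    }
}
```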

  • 2020-12-01 00:34

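    String.replaceAll() does not work correctly with a locale other than the default one, at least in jdk7u10.

    This example creates a word dictionary from a text file in the Windows Cyrillic charset CP1251:

        import java.nio.charset.Charset;
        import java.nio.file.Files;
        import java.nio.file.Paths;
        import java.util.List;
        import java.util.Set;
        import java.util.TreeSet;

        public static void main(String[] args) {
            String fileName = "Tolstoy_VoinaMir.txt";
            try {
                // Read the whole file using the Windows Cyrillic charset
                List<String> lines = Files.readAllLines(Paths.get(fileName),
                                                        Charset.forName("CP1251"));
                // A TreeSet keeps the distinct words in sorted order
                Set<String> words = new TreeSet<>();
                for (String s : lines) {
                    for (String w : s.split("\\s+")) {
                        // Strip punctuation from each token
                        w = w.replaceAll("\\p{Punct}", "");
                        words.add(w);
                    }
                }
                for (String w : words) {
                    System.out.println(w);
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }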
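
    A likely reason patterns such as [^\\w] misbehave on Cyrillic text is that, by default, Java's \w and \p{Punct} classes cover only ASCII characters. A minimal sketch of a Unicode-aware alternative splits on anything that is not a letter (\p{L}) or a decimal digit (\p{Nd}). UnicodeWords and unicodeWords are hypothetical names, and the Cyrillic sample text is written with \u escapes only so the source compiles regardless of file encoding.

```java
import java.util.ArrayList;
import java.util.List;

class UnicodeWords {
    // Hypothetical helper: splits on runs of characters that are neither
    // Unicode letters nor decimal digits, and drops empty tokens.
    static List<String> unicodeWords(String line) {
        List<String> words = new ArrayList<>();
        for (String w : line.split("[^\\p{L}\\p{Nd}]+")) {
            if (!w.isEmpty()) {   // a leading separator produces an empty first token
                words.add(w);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // The title "War and Peace" in Russian, wrapped in guillemets and followed by a period
        String line = "\u00ab\u0412\u043e\u0439\u043d\u0430 \u0438 \u043c\u0438\u0440\u00bb.";
        System.out.println(unicodeWords(line));   // three words, without the guillemets or the period
        // The ASCII-only \w class removes every Cyrillic letter
        System.out.println("\u043c\u0438\u0440".replaceAll("[^\\w]", "").isEmpty());   // prints true
    }
}
```

    Passing Pattern.UNICODE_CHARACTER_CLASS to Pattern.compile is another option, since it makes \w and \W follow Unicode definitions.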
    