Latin Regex with symbols
Question: I need to split a text and keep only words, numbers, and hyphenated compound words. I also need to match Latin words with accented letters, so I used \p{L}, which matches é, ú, ü, ã, and so forth. The example is:
String myText = "Some latin text with symbols, ? 987 (A la pointe sud-est de l'île se dresse la cathédrale Notre-Dame qui fut lors de son achèvement en 1330 l'une des plus grandes cathédrales d'occident) : ! @ # $ % ^& * ( ) + - _ #$% \" ' : ; > < / \\ | , here some is wrong… * + () e -";
Pattern pattern =
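
For illustration, here is a minimal sketch of one way to extract such tokens. It is not the asker's pattern, which is cut off above. It assumes a token is a run of Unicode letters (\p{L}) or digits (\p{N}), optionally followed by further runs joined by single hyphens, and it assumes apostrophes act as separators, so l'île yields l and île. The class name LatinTokenizer and the method tokens are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; the names are not from the original post.
class LatinTokenizer {
    // One or more letters/digits, then zero or more "-letters/digits" groups.
    // A lone "-" or "_" never matches because each token must start with a letter or digit.
    private static final Pattern TOKEN =
            Pattern.compile("[\\p{L}\\p{N}]+(?:-[\\p{L}\\p{N}]+)*");

    // Collects every match instead of splitting, so symbols are simply skipped.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // \u00ee is the letter î, written as an escape to avoid source-encoding issues.
        String sample = "sud-est de l'\u00eele, Notre-Dame 1330 ( ) - _ #$%";
        System.out.println(tokens(sample));
    }
}
```

On the sample fragment this collects sud-est, de, l, île, Notre-Dame, and 1330, and skips the standalone symbols. Matching with find() rather than calling split() avoids having to enumerate every separator character; if apostrophes should stay inside a word, the character class would need to be adjusted.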