Yes, that is correct. UTF-8 is an encoding for the Unicode character set, which supports pretty much every language in the world.
I think the only difference comes with sorting your results: different letters might come in a different order in other languages (accents, umlauts, etc.). Also, comparing a to ä might behave differently in another collation.
The _ci suffix means that sorting and comparison are case-insensitive.
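To make the difference concrete, here is a rough Python sketch of the idea. It is not how the database implements collations; fold_key is a hypothetical helper I made up that strips accents and case, standing in for a case- and accent-insensitive collation, while plain string comparison stands in for a binary one.

```python
import unicodedata


def fold_key(text: str) -> str:
    """Hypothetical stand-in for a case- and accent-insensitive collation key."""
    # Decompose characters so that "ä" becomes "a" plus a combining diaeresis.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks (accents, umlauts), keeping the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold case so that "A" and "a" compare equal.
    return stripped.casefold()


def equal_binary(a: str, b: str) -> bool:
    """Compare code point by code point, like a binary collation would."""
    return a == b


def equal_folded(a: str, b: str) -> bool:
    """Compare using the folded key, ignoring case and accents."""
    return fold_key(a) == fold_key(b)


words = ["zebra", "Äpfel", "apple", "Zulu"]

# Binary order follows raw code points: uppercase first, "Ä" after "z".
sorted_binary = sorted(words)

# Folded order ignores case and accents when ranking the words.
sorted_folded = sorted(words, key=fold_key)

print(equal_binary("a", "ä"), equal_folded("a", "ä"))
print(sorted_binary)
print(sorted_folded)
```

The stored data is the same in both cases; only the rule used to compare and order it changes, which is what picking a collation does.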
http://www.collation-charts.org/ might be of interest to you.