I'm trying to get the day of the week, and have it work consistently in any locale. In locales with Latin alphabets, everything is fine.
Sys.getlocale()
##
The RStudio/Architect problem
This can be solved, slightly messily, by explicitly changing the encoding of the weekdays string to UTF-8.
current_codepage <- as.character(l10n_info()$codepage)
iconv(weekdays(Sys.Date()), from = current_codepage, to = "UTF-8")
Note that codepages only exist on Windows; l10n_info()$codepage is NULL on Linux.
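Because the codepage is NULL outside Windows, the conversion above fails if it is run unchanged on Linux. A minimal sketch of a guarded helper follows; the name to_utf8 is hypothetical, and falling back to enc2utf8() is my assumption about what is wanted when no codepage is reported.

```r
# Hypothetical helper: convert a string from the native encoding to UTF-8.
# On Windows the current codepage is used as the source encoding; where no
# codepage is reported (e.g. Linux) we fall back to enc2utf8().
to_utf8 <- function(x)
{
  codepage <- l10n_info()$codepage
  if(is.null(codepage))
  {
    return(enc2utf8(x))
  }
  iconv(x, from = as.character(codepage), to = "UTF-8")
}

to_utf8(weekdays(Sys.Date()))
```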
The LC_TIME problem
It turns out that under Windows you have to set both the LC_CTYPE and LC_TIME locale categories, and you have to set LC_CTYPE before LC_TIME, or it won't work.
In the end, we need different implementations for different OSes.
Windows version:
get_today_windows <- function(locale = NULL)
{
  if(!is.null(locale))
  {
    # Remember the current settings and restore them on exit
    lc_ctype <- Sys.getlocale("LC_CTYPE")
    lc_time <- Sys.getlocale("LC_TIME")
    on.exit(Sys.setlocale("LC_CTYPE", lc_ctype), add = TRUE)
    on.exit(Sys.setlocale("LC_TIME", lc_time), add = TRUE)
    # LC_CTYPE must be set before LC_TIME
    Sys.setlocale("LC_CTYPE", locale)
    Sys.setlocale("LC_TIME", locale)
  }
  today <- weekdays(Sys.Date())
  # Convert from the current Windows codepage to UTF-8
  current_codepage <- as.character(l10n_info()$codepage)
  iconv(today, from = current_codepage, to = "UTF-8")
}
get_today_windows()
## [1] "Tuesday"
get_today_windows("French_France")
## [1] "mardi"
get_today_windows("Arabic_Qatar")
## [1] "الثلاثاء"
get_today_windows("Serbian (Cyrillic)")
## [1] "уторак"
get_today_windows("Chinese (Traditional)_Taiwan")
## [1] "星期二"
Linux version:
get_today_linux <- function(locale = NULL)
{
  if(!is.null(locale))
  {
    # Remember the current LC_TIME setting and restore it on exit
    lc_time <- Sys.getlocale("LC_TIME")
    on.exit(Sys.setlocale("LC_TIME", lc_time), add = TRUE)
    Sys.setlocale("LC_TIME", locale)
  }
  weekdays(Sys.Date())
}
get_today_linux()
## [1] "Tuesday"
get_today_linux("fr_FR.utf8")
## [1] "mardi"
get_today_linux("ar_QA.utf8")
## [1] "الثلاثاء"
get_today_linux("sr_RS.utf8")
## [1] "уторак"
get_today_linux("zh_TW.utf8")
## [1] "週二"
Enforcing the .utf8 encoding in the locale seems important; get_today_linux("zh_TW") doesn't display properly.
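The two implementations can be folded into one function that branches on the OS. This is only a sketch: the name get_today_any is hypothetical, it assumes .Platform$OS.type is a good enough test for "Windows", and locale names still have to be given in the form the current OS expects.

```r
# Hypothetical combined version of the Windows and Linux functions.
get_today_any <- function(locale = NULL)
{
  on_windows <- .Platform$OS.type == "windows"
  if(!is.null(locale))
  {
    # Windows needs LC_CTYPE set before LC_TIME; elsewhere LC_TIME suffices
    categories <- if(on_windows) c("LC_CTYPE", "LC_TIME") else "LC_TIME"
    old <- vapply(categories, Sys.getlocale, character(1))
    # Restore the previous settings when the function exits
    on.exit(
      for(category in names(old)) Sys.setlocale(category, old[[category]]),
      add = TRUE
    )
    for(category in categories) Sys.setlocale(category, locale)
  }
  today <- weekdays(Sys.Date())
  if(on_windows)
  {
    # Convert from the current Windows codepage to UTF-8
    codepage <- as.character(l10n_info()$codepage)
    today <- iconv(today, from = codepage, to = "UTF-8")
  }
  today
}

get_today_any()
get_today_any("C")
```

The locale argument is passed through untouched, so a caller would still write "French_France" on Windows and "fr_FR.utf8" on Linux.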
The system of naming locales is OS-specific. I recommend reading the section on locales in the R Installation and Administration manual for a complete explanation.
The supported languages are listed in MSDN Language Strings. Surprisingly, Arabic is not among them. The "Language string" column contains the legal input for setting the locale in R, and even the list of country/region strings contains no Arabic-speaking country.
Of course, you can change your locale settings in the control panel (panel setting --> region --> ..), but this changes the locale globally, and there is no guarantee that the output will be free of encoding problems.
Arabic is generally not supported by default, but it is easy to set up using locale.
locale -a ## list all locales that are already installed
sudo locale-gen ar_QA.UTF-8 ## generate the locale if it does not exist yet
Then, in RStudio:
> Sys.setlocale('LC_TIME','ar_QA.UTF-8')
[1] "ar_QA.UTF-8"
> format(Sys.Date(),'%A')
[1] "الثلاثاء"
Note also that in the plain R console the output does not look as good as in RStudio, because the text is written from left to right rather than from right to left.
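If the requested locale has not been generated, Sys.setlocale() does not stop; it returns an empty string and issues a warning, so later calls silently keep the old locale. A small sketch of a wrapper that turns this into an error follows; the name set_time_locale and the wording of the message are my own.

```r
# Hypothetical wrapper: fail loudly when the OS cannot honour the locale.
set_time_locale <- function(locale)
{
  # Sys.setlocale() returns "" (with a warning) when the request fails
  result <- suppressWarnings(Sys.setlocale("LC_TIME", locale))
  if(!nzchar(result))
  {
    stop("Locale '", locale, "' is not available; it may need to be generated first.")
  }
  invisible(result)
}

set_time_locale("C")
format(Sys.Date(), "%A")
```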