How can I convert a string to title case following actual title-case rules, rather than simply capitalizing the first letter of every word?
Sample rules:
- Capitalize all words, with the following exceptions:
- Lowercase all articles (a, the), prepositions (to, at, in, with), and coordinating conjunctions (and, but, or)
- Capitalize the first and last word in a title, regardless of part of speech
Is there an easy way to do this in bash? One-liners appreciated.
(As an additional note, this is to be used in parcellite actions.)
$ cat titles.txt
purple haze
Somebody To Love
fire on the mountain
THE SONG REMAINS THE SAME
Watch the NorthWind rise
eight miles high
just dropped in
strawberry letter 23
$ cat cap.awk
BEGIN {
    # Words kept lowercase unless they are the first or last word
    split("a the to at in on with and but or", w)
    for (i in w) nocap[w[i]]
}
# Uppercase the first character and lowercase the rest
function cap(word) {
    return toupper(substr(word,1,1)) tolower(substr(word,2))
}
{
    for (i=1; i<=NF; ++i) {
        printf "%s%s", (i==1||i==NF||!(tolower($i) in nocap)?cap($i):tolower($i)),
                       (i==NF?"\n":" ")
    }
}
$ awk -f cap.awk titles.txt
Purple Haze
Somebody to Love
Fire on the Mountain
The Song Remains the Same
Watch the Northwind Rise
Eight Miles High
Just Dropped In
Strawberry Letter 23
EDIT (as a one-liner):
$ echo "the sun also rises" | awk 'BEGIN{split("a the to at in on with and but or",w); for(i in w)nocap[w[i]]}function cap(word){return toupper(substr(word,1,1)) tolower(substr(word,2))}{for(i=1;i<=NF;++i){printf "%s%s",(i==1||i==NF||!(tolower($i) in nocap)?cap($i):tolower($i)),(i==NF?"\n":" ")}}'
The Sun Also Rises
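If awk is not an option, roughly the same logic can be sketched in pure bash. This is only a sketch under a few assumptions: it needs bash 4 or later (for the `${var,,}` and `${var^}` case expansions), the function name `title_case` is made up for illustration, and the word list is simply copied from cap.awk above.

```shell
#!/usr/bin/env bash
# title_case: hypothetical helper mirroring cap.awk; requires bash 4+.
title_case() {
    # Words kept lowercase unless first or last (same list as cap.awk).
    local minor=" a the to at in on with and but or "
    local -a words out=()
    local i last
    # Lowercase the whole input, then split it on whitespace.
    read -r -a words <<< "${1,,}"
    last=$(( ${#words[@]} - 1 ))
    for i in "${!words[@]}"; do
        if (( i != 0 && i != last )) && [[ $minor == *" ${words[i]} "* ]]; then
            out+=("${words[i]}")      # minor word in the middle: keep lowercase
        else
            out+=("${words[i]^}")     # capitalize the first letter
        fi
    done
    # Join the words with single spaces.
    printf '%s\n' "${out[*]}"
}
```

For example, `title_case "THE SONG REMAINS THE SAME"` prints `The Song Remains the Same`.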
Thanks, @jas, for the nice answer to this one. In the end, what I needed for parcellite
was this one long liner in the shell (for the love of the pipe!):
echo '%s' | sed 's/\<./\u&/g' | sed 's/\ The\ /\ the\ /' | sed 's/\ A\ /\ a\ /' | sed 's/\ An\ /\ an\ /' | sed 's/\ As\ /\ as\ /' | sed 's/\ At\ /\ at\ /' | sed 's/\ But\ /\ but\ /' | sed 's/\ By\ /\ by\ /' | sed 's/\ For\ /\ for\ /' | sed 's/\ In\ /\ in\ /' | sed 's/\ Of\ /\ of\ /' | sed 's/\ Off\ /\ off\ /' | sed 's/\ On\ /\ on\ /' | sed 's/\ Per\ /\ per\ /' | sed 's/\ To\ /\ to\ /' | sed 's/\ Up\ /\ up\ /' | sed 's/\ Via\ /\ via\ /' | sed 's/\ And\ /\ and\ /' | sed 's/\ Nor\ /\ nor\ /' | sed 's/\ Or\ /\ or\ /' | sed 's/\ So\ /\ so\ /' | sed 's/\ Yet\ /\ yet\ /' | parcellite
The sed commands were, of course, generated from a loop:
for word in {The,A,An,As,At,But,By,For,In,Of,Off,On,Per,To,Up,Via,And,Nor,Or,So,Yet}
do
low=`echo "$word" | tr '[A-Z]' '[a-z]'`
printf "sed 's/\ $word\ /\ $low\ /' | "
done
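Two caveats about the long pipeline: each sed call lacks the g flag, so only the first occurrence of each word is lowercased, and all-caps input stays all caps because nothing lowercases it first. A more compact variant might look like the sketch below. It assumes GNU sed (`\L`, `\u`, `\l`, `\<` and `\>` are GNU extensions), and the function name `title_case_sed` is hypothetical. It has not been tried inside a parcellite action.

```shell
# title_case_sed: hypothetical filter; assumes GNU sed.
title_case_sed() {
    # Same word list as the loop above.
    local minor='The|A|An|As|At|But|By|For|In|Of|Off|On|Per|To|Up|Via|And|Nor|Or|So|Yet'
    # 1. lowercase the whole line
    # 2. capitalize the first letter of every word
    # 3. lowercase listed words that follow a space (so never the first word);
    #    the trailing space is not consumed, so adjacent words both match
    # 4. re-capitalize the last word in case step 3 lowercased it
    sed -E "s/.*/\L&/; s/\<[a-z]/\u&/g; s/ (${minor})\>/ \l\1/g; s/\<[a-z]([^ ]*)\$/\u&/"
}
```

For example, `echo 'fire on the mountain' | title_case_sed` prints `Fire on the Mountain`. Trailing whitespace on the line would defeat step 4, so this remains a sketch rather than a complete solution.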
Thanks to those who tried. :-)
Source: https://stackoverflow.com/questions/35006611/how-to-convert-text-following-title-case-rules-in-bash