My input file looks like this:
“true true, rohith Rohith;
cold burn, and fact and fact good good?”
Output should look like:
Depending on your expected input, this might work:
sed -r 's/([a-zA-Z0-9_-]+)( *)\1/\1\2/g ; s/ ([.,;:])/\1/g ; s/  / /g' myfile
([a-zA-Z0-9_-]+) = captures a word that might be repeated.
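( *)\1 = checks whether the captured text is repeated after zero or more spaces. No word boundaries are checked, so a doubled letter inside a word (the "tt" in "letter") also counts as a repeat.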
s/ ([.,;:])/\1/g = removes the space left before punctuation (you might want to add characters to this group, for example ? and !).
s/  / /g = replaces double spaces with a single space.
This works with GNU sed; -r turns on extended regular expressions (newer versions also accept -E).
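
If matches inside words turn out to be a problem for your input, here is a rough variant of the same idea. It is only a sketch under a few assumptions: it relies on GNU sed's \b word-boundary extension, it drops the hyphen from the character class so that \b behaves predictably, it requires at least one space between the two copies, and it loops so that three or more copies collapse to one. The function name dedupe is just made up for the example. Like the command above, it is case-sensitive, so "rohith Rohith" is left alone.

```shell
#!/bin/sh
# Hypothetical helper: remove adjacent repeated words from stdin (GNU sed only).
dedupe() {
  # :a ... ta  = repeat the first substitution until it no longer matches
  # \b         = GNU word boundary, so "the theory" is not treated as "the the"
  # second s   = drop spaces before punctuation, third s = collapse runs of spaces
  sed -r ':a; s/\b([a-zA-Z0-9_]+) +\1\b/\1/; ta; s/ +([.,;:?!])/\1/g; s/  +/ /g'
}

printf '%s\n' 'true true, rohith Rohith;' | dedupe
printf '%s\n' 'cold burn, and fact and fact good good?' | dedupe
```

Note that "and fact and fact" is not changed by this, because no single word there is immediately followed by itself; collapsing repeated phrases would need a different pattern.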