I want to replace whitespace with underscore in a string to create nice URLs. So that for example:
\"This should be connected\" becomes \"This_should_be_conn
I'm using the following piece of code for my friendly urls:
from unicodedata import normalize
from re import sub
def slugify(title):
name = normalize('NFKD', title).encode('ascii', 'ignore').replace(' ', '-').lower()
#remove `other` characters
name = sub('[^a-zA-Z0-9_-]', '', name)
#nomalize dashes
name = sub('-+', '-', name)
return name
It works fine with unicode characters as well.
Using the re
module:
import re
re.sub('\s+', '_', "This should be connected") # This_should_be_connected
re.sub('\s+', '_', 'And so\tshould this') # And_so_should_this
Unless you have multiple spaces or other whitespace possibilities as above, you may just wish to use string.replace
as others have suggested.
Replacing spaces is fine, but I might suggest going a little further to handle other URL-hostile characters like question marks, apostrophes, exclamation points, etc.
Also note that the general consensus among SEO experts is that dashes are preferred to underscores in URLs.
import re
def urlify(s):
# Remove all non-word characters (everything except numbers and letters)
s = re.sub(r"[^\w\s]", '', s)
# Replace all runs of whitespace with a single dash
s = re.sub(r"\s+", '-', s)
return s
# Prints: I-cant-get-no-satisfaction"
print(urlify("I can't get no satisfaction!"))
Python has a built in method on strings called replace which is used as so:
string.replace(old, new)
So you would use:
string.replace(" ", "_")
I had this problem a while ago and I wrote code to replace characters in a string. I have to start remembering to check the python documentation because they've got built in functions for everything.
You don't need regular expressions. Python has a built-in string method that does what you need:
mystring.replace(" ", "_")
Django has a 'slugify' function which does this, as well as other URL-friendly optimisations. It's hidden away in the defaultfilters module.
>>> from django.template.defaultfilters import slugify
>>> slugify("This should be connected")
this-should-be-connected
This isn't exactly the output you asked for, but IMO it's better for use in URLs.