How to make Django slugify work properly with Unicode strings?

猫巷女王i 2020-11-28 19:58

What can I do to prevent the slugify filter from stripping out non-ASCII alphanumeric characters? (I'm using Django 1.0.2)

cnprog.com, for example, has Chinese characters in its URLs.

8 answers
  • I am interested in allowing only ASCII characters in the slug, which is why I benchmarked some of the available tools on the same string:

    • Unicode Slugify:

      In [5]: %timeit slugify('Παίζω τρέχω %^&*@# και γ%^(λώ la fd/o', only_ascii=True)
      37.8 µs ± 86.7 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
      
      'paizo-trekho-kai-glo-la-fdo'
      
    • Django Uuslug:

      In [3]: %timeit slugify('Παίζω τρέχω %^&*@# και γ%^(λώ la fd/o')
      35.3 µs ± 303 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
      
      'paizo-trekho-kai-g-lo-la-fd-o'
      
    • Awesome Slugify:

      In [3]: %timeit slugify('Παίζω τρέχω %^&*@# και γ%^(λώ la fd/o')
      47.1 µs ± 1.94 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
      
      'Paizo-trekho-kai-g-lo-la-fd-o'
      
    • Python Slugify:

      In [3]: %timeit slugify('Παίζω τρέχω %^&*@# και γ%^(λώ la fd/o')
      24.6 µs ± 122 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
      
      'paizo-trekho-kai-g-lo-la-fd-o'
      
    • django.utils.text.slugify with Unidecode:

      In [15]: %timeit slugify(unidecode('Παίζω τρέχω %^&*@# και γ%^(λώ la fd/o'))
      36.5 µs ± 89.7 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
      
      'paizo-trekho-kai-glo-la-fdo'
      
  • 2020-11-28 20:27

    There is a Python package called unidecode that I've adopted for the askbot Q&A forum. It works well for Latin-based alphabets and even looks reasonable for Greek:

    >>> from unidecode import unidecode
    >>> unidecode(u'διακριτικός')
    'diakritikos'
    

    It does something weird with Asian languages:

    >>> unidecode(u'影師嗎')
    'Ying Shi Ma '
    

    Does this make sense?

    In askbot we compute slugs like so:

    from unidecode import unidecode
    from django.template import defaultfilters
    slug = defaultfilters.slugify(unidecode(input_text))
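
To illustrate why the transliteration step matters, here is a minimal standard-library sketch of what an ASCII-only slugify roughly does, next to a variant that keeps Unicode letters. The names `ascii_slugify` and `unicode_slugify` are hypothetical and are not part of Django or unidecode. The logic is an approximation written for this example, not Django's actual source.

```python
import re
import unicodedata


def ascii_slugify(value: str) -> str:
    """Hypothetical ASCII-only slugify: non-ASCII characters are dropped."""
    # NFKD splits accented letters into a base letter plus combining marks.
    value = unicodedata.normalize("NFKD", value)
    # Encoding with "ignore" silently discards every non-ASCII character.
    value = value.encode("ascii", "ignore").decode("ascii")
    # Drop anything that is not a word character, whitespace, or a hyphen.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value)


def unicode_slugify(value: str) -> str:
    """Hypothetical variant that keeps non-ASCII letters in the slug."""
    # NFKC keeps characters composed; \w is Unicode-aware for str patterns.
    value = unicodedata.normalize("NFKC", value)
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    return re.sub(r"[-\s]+", "-", value)


if __name__ == "__main__":
    print(repr(ascii_slugify("Crème brûlée")))
    print(repr(ascii_slugify("影師嗎")))
    print(repr(unicode_slugify("影師嗎 hello")))
```

With the ASCII-only version, accented Latin text survives as its base letters, but a purely Chinese or Greek title collapses to an empty slug unless it is transliterated first (for example with unidecode, as above). Later Django releases also accept an `allow_unicode=True` argument in `django.utils.text.slugify`, which may be the simpler route if you can upgrade from 1.0.2.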
    