Validate that string contains only allowed characters in Ruby

情深已故 2021-01-07 12:13

How can I test whether a Ruby string contains only a specific set of characters?

For example, if my set of allowed characters is "AGHTM" plus digits ...

4 answers
  • 2021-01-07 12:50
    allowed = "AGHTM"
    allowed = /\A[\d#{allowed}]+\z/i
    
    "MT3G22AH" =~ allowed #⇒ truthy
    "TAR34" =~ allowed #⇒ falsey
    
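One caveat with interpolating the allowed set straight into a character class: characters such as `-`, `]` or `^` would be read as regex syntax. A minimal sketch, assuming a hypothetical helper named `only_allowed?` (not part of the answer above), escapes the set with `Regexp.escape` first. It also drops the `/i` flag on the assumption that lowercase letters should be rejected; keep the flag if case-insensitive matching is what you want.

```ruby
# Hypothetical helper (name and signature are illustrative only).
# Returns true if +string+ is non-empty and consists solely of ASCII digits
# and the characters listed in +allowed+.
def only_allowed?(string, allowed)
  # Regexp.escape keeps characters such as "-", "]" or "^" from being
  # interpreted as character-class syntax.
  pattern = /\A[\d#{Regexp.escape(allowed)}]+\z/
  string.match?(pattern) # String#match? requires Ruby 2.4 or later
end

only_allowed?("MT3G22AH", "AGHTM") # => true
only_allowed?("TAR34", "AGHTM")    # => false ("R" is not allowed)
only_allowed?("A-1", "A-")         # => true  ("-" is treated literally)
only_allowed?("", "AGHTM")         # => false (the "+" requires at least one character)
```

Replace `+` with `*` in the pattern if an empty string should count as valid.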
  • 2021-01-07 12:50

    String#delete

    One possibility is to delete all the allowed characters and check whether the resulting string is empty:

    "MT3G22AH".delete("AGHTM0-9").empty?
    #=> true
    "TAR34".delete("AGHTM0-9").empty?
    #=> false
    
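The argument to `String#delete` uses the same character-selector syntax as `String#count`: `0-9` is a range, and a leading `^` would negate the set. The sketch below wraps the check in a hypothetical helper (`allowed_only?` and `ALLOWED_SELECTOR` are illustrative names, not from the answer) to show two edge cases worth knowing about: an empty string passes, and the check is case-sensitive.

```ruby
# Hypothetical names, used only for illustration.
ALLOWED_SELECTOR = "AGHTM0-9".freeze # letters A, G, H, T, M plus the range 0-9

# True if nothing is left once every allowed character has been deleted.
def allowed_only?(string)
  string.delete(ALLOWED_SELECTOR).empty?
end

allowed_only?("MT3G22AH") # => true
allowed_only?("TAR34")    # => false ("R" is left over)
allowed_only?("")         # => true  (nothing to delete, so the result is empty)
allowed_only?("mt3")      # => false (lowercase "m" and "t" are left over)
```

If empty input should be rejected, add an explicit `!string.empty?` guard.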

    Performance

    Short strings

    For short strings, @steenslag's method is the fastest, followed by @Jesse's and mine.

    def mudasobwa(string)
      allowed = 'AGHTM'
      allowed = /\A[\d#{allowed}]+\z/i
      string.match? allowed
    end
    
    def eric(string)
      string.delete('AGHTM0-9').empty?
    end
    
    def meagar(string)
      allowed = 'AGHTM0123456789'
      string.chars.uniq.all? { |c| allowed.include?(c) }
    end
    
    def jesse(string)
      string.count('^AGHTM0-9').zero?
    end
    
    def steenslag(string)
      !string.match?(/[^AGHTM0-9]/) 
    end
    
    require 'fruity'
    
    n = 1
    str1 = 'MT3G22AH' * n
    str2 = 'TAR34' * n
    compare do
      _jesse { [jesse(str1), jesse(str2)] }
      _eric { [eric(str1), eric(str2)] }
      _mudasobwa { [mudasobwa(str1), mudasobwa(str2)] }
      _meagar { [meagar(str1), meagar(str2)] }
      _steenslag { [steenslag(str1), steenslag(str2)] }
    end
    

    It outputs:

    Running each test 1024 times. Test will take about 2 seconds.
    _steenslag is faster than _jesse by 2.2x ± 0.1
    _jesse is faster than _eric by 8.000000000000007% ± 1.0%
    _eric is faster than _meagar by 4.3x ± 0.1
    _meagar is faster than _mudasobwa by 2.4x ± 0.1
    

    Longer strings

    For longer strings (n = 5000), @Jesse's method becomes the fastest.

    Running each test 32 times. Test will take about 12 seconds.
    _jesse is faster than _eric by 2.5x ± 0.01
    _eric is faster than _mudasobwa by 4x ± 1.0
    _mudasobwa is faster than _steenslag by 2x ± 0.1
    _steenslag is faster than _meagar by 11x ± 0.1
    
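Speed aside, the five benchmarked methods are not strictly equivalent on edge cases, which may matter more than the timings. The sketch below restates each one as a lambda (the `IMPLEMENTATIONS` hash and `results_for` helper are hypothetical names, and the `delete` selector is written with `0-9`) and compares their answers for an empty string and a lowercase string.

```ruby
# Each benchmarked approach, restated as a lambda for side-by-side comparison.
IMPLEMENTATIONS = {
  mudasobwa: ->(s) { s.match?(/\A[\dAGHTM]+\z/i) },
  eric:      ->(s) { s.delete('AGHTM0-9').empty? },
  meagar:    ->(s) { s.chars.uniq.all? { |c| 'AGHTM0123456789'.include?(c) } },
  jesse:     ->(s) { s.count('^AGHTM0-9').zero? },
  steenslag: ->(s) { !s.match?(/[^AGHTM0-9]/) }
}.freeze

# Runs every implementation on +string+ and returns { name => result }.
def results_for(string)
  IMPLEMENTATIONS.transform_values { |impl| impl.call(string) }
end

results_for('MT3G22AH') # every value is true
results_for('TAR34')    # every value is false
results_for('')         # only :mudasobwa is false (its "+" needs one character)
results_for('mt3')      # only :mudasobwa is true (its /i flag ignores case)
```

Whichever variant you pick, decide first whether empty and lowercase input should be accepted.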
  • 2021-01-07 12:54

    This seems to be faster than every method in the previous benchmarks (by @Eric Duminil), at least on Ruby 2.4:

    !string.match?(/[^AGHTM0-9]/) 
    
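One plausible contributor to the speed (an assumption, not something measured in this thread) is that `String#match?`, added in Ruby 2.4, does not set `$~` and the related special variables, so no `MatchData` object has to be built. The helper names below are hypothetical and exist only to make the difference visible.

```ruby
# $~ is local to the current method, so each helper starts with $~ == nil.

# Uses String#match?, which leaves $~ untouched.
def last_match_after_match_p(string)
  string.match?(/[^AGHTM0-9]/)
  $~
end

# Uses =~, which stores a MatchData object in $~ on success.
def last_match_after_operator(string)
  string =~ /[^AGHTM0-9]/
  $~
end

last_match_after_match_p('TAR34')     # => nil
last_match_after_operator('TAR34')[0] # => "R" (the first disallowed character)
```

The second form is still useful when you want to report which character was rejected.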
  • 2021-01-07 12:57

    A nicely idiomatic non-regex solution is to use String#count:

    "MT3G22AH".count("^AGHTM0-9").zero?  # => true
    "TAR34".count("^AGHTM0-9").zero?     # => false
    

    The inverse also works, if you find it more readable:

    "MT3G22AH".count('AGHTM0-9') == "MT3G22AH".size  # => true
    

    Take your pick.

    For longer strings, both methods here perform significantly better than regex-based options.
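To check that claim on your own Ruby version, a rough sketch using only the standard library's `Benchmark` module is shown here. The `CANDIDATES` hash, the `timings` helper, and the iteration count are illustrative assumptions; timings vary by machine, so no expected numbers are given.

```ruby
require 'benchmark'

# Input size mirrors the n = 5000 case used in the benchmark answer.
LONG_STRING = ('MT3G22AH' * 5000).freeze
ITERATIONS = 100 # arbitrary; raise it for steadier numbers

CANDIDATES = {
  count_negated: ->(s) { s.count('^AGHTM0-9').zero? },
  count_size:    ->(s) { s.count('AGHTM0-9') == s.size },
  regex_negated: ->(s) { !s.match?(/[^AGHTM0-9]/) }
}.freeze

# Returns { name => elapsed seconds } for +iterations+ runs of each candidate.
def timings(string, iterations)
  CANDIDATES.transform_values do |check|
    Benchmark.realtime { iterations.times { check.call(string) } }
  end
end

# Print the candidates from fastest to slowest.
timings(LONG_STRING, ITERATIONS).sort_by { |_, seconds| seconds }.each do |name, seconds|
  puts format('%-14s %.4fs', name, seconds)
end
```

For more careful measurements, a dedicated benchmarking gem such as the `fruity` gem used earlier in this thread handles warm-up and repetition for you.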
