How to create the most compact mapping n → isprime(n) up to a limit N?

遇见更好的自我 2020-11-22 02:11

Naturally, for bool isprime(number) there would be a data structure I could query.
I define the best algorithm, to be the algorithm that pr

30 Answers
  • 2020-11-22 02:38

    According to Wikipedia, the Sieve of Eratosthenes needs O(n log log n) arithmetic operations (O(n * (log n) * (log log n)) bit operations) and O(n) memory, so it's a pretty good place to start if you aren't testing especially large numbers.
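
    A minimal sketch of such a sieve in Python, stored as one byte per number so that the result doubles as a lookup table for isprime(n). The name build_prime_table is an assumption made for this example, not something from the question:

```python
def build_prime_table(limit):
    """Return a bytearray where table[n] == 1 iff n is prime, for 0 <= n <= limit."""
    table = bytearray([1]) * (limit + 1)
    # 0 and 1 are not prime; the slice also copes with limit < 1.
    table[:2] = bytes(min(2, limit + 1))
    p = 2
    while p * p <= limit:
        if table[p]:
            # Cross out p*p, p*p + p, ... ; smaller multiples of p were
            # already crossed out by smaller primes.
            count = len(range(p * p, limit + 1, p))
            table[p * p::p] = bytes(count)
        p += 1
    return table


table = build_prime_table(30)
print([n for n, flag in enumerate(table) if flag])
```

    After the table is built, each query is a single index operation: bool(table[n]). Packing eight flags per byte or storing only odd numbers would shrink it further, at the cost of slightly more index arithmetic.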

  • 2020-11-22 02:39

    An efficient primality check in JavaScript:

    function isPrime(num) {
        if (num <= 1) return false;
        else if (num <= 3) return true; // 2 and 3 are prime
        else if (num % 2 === 0 || num % 3 === 0) return false;
        // Every remaining prime has the form 6k ± 1,
        // so test i and i + 2 for i = 5, 11, 17, ...
        var i = 5;
        while (i * i <= num) {
            if (num % i === 0 || num % (i + 2) === 0) return false;
            i += 6;
        }
        return true;
    }
    
  • 2020-11-22 02:41

    I think the method I wrote below is one of the fastest.

    #include <cmath>
    #include <iostream>
    using namespace std;

    void prime(long long int number) {
        // Establishing variables
        long long int i = 5;
        int w = 2;
        const long long int lim = sqrt(number);

        // Gets 1, 2 and 3 out of the way
        if (number == 1) { cout << number << " is hard to classify. \n";  return; }
        if (number == 2) { cout << number << " is Prime. \n";  return; }
        if (number == 3) { cout << number << " is Prime. \n";  return; }

        // Tests the small factors 2 and 3
        if (number % 2 == 0) { cout << number << " is not Prime. \n";  return; }
        if (number % 3 == 0) { cout << number << " is not Prime. \n";  return; }

        // Remaining candidates have the form 6k ± 1: 5, 7, 11, 13, ...
        while (i <= lim) {
            if (number % i == 0) { cout << number << " is not Prime. \n";  return; }
            i = i + w; // Advances to the next candidate
            w = 6 - w; // Alternates the step between 2 and 4; we already tested 2 and 3,
            // so this skips their multiples
        }
        cout << number << " is Prime. \n"; return;
    }
    
  • 2020-11-22 02:43

    One can use sympy.

    import sympy
    
    sympy.ntheory.primetest.isprime(33393939393929292929292911111111)
    
    True
    

    From the sympy docs: The first step is looking for trivial factors, which if found enables a quick return. Next, if the sieve is large enough, use bisection search on the sieve. For small numbers, a set of deterministic Miller-Rabin tests are performed with bases that are known to have no counterexamples in their range. Finally if the number is larger than 2^64, a strong BPSW test is performed. While this is a probable prime test and we believe counterexamples exist, there are no known counterexamples.
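
    The deterministic Miller-Rabin step mentioned in that description can be sketched roughly as follows. This is not sympy's implementation: the name is_prime_u64 and the choice of the first twelve primes as bases are assumptions made for illustration. That base set is known to produce no false positives for inputs below about 3.3 * 10^24, which covers every n < 2^64.

```python
# Assumed base set: the first twelve primes (sufficient for all n < 2**64).
_BASES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)


def is_prime_u64(n):
    """Deterministic Miller-Rabin test, valid for 0 <= n < 2**64."""
    if not 0 <= n < 2**64:
        raise ValueError("n must satisfy 0 <= n < 2**64")
    if n < 2:
        return False
    # Trivial factors first: this also handles every n <= 37.
    for p in _BASES:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in _BASES:
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue  # a does not witness compositeness
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a proves that n is composite
    return True


print(is_prime_u64(2**61 - 1))
```

    For numbers beyond this range, a fixed base set no longer gives a proof, which is why a library would fall back to a different test there.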

  • 2020-11-22 02:44

    For large numbers, it is not feasible to naively check whether the candidate number N is divisible by any of the numbers up to sqrt(N). Much more scalable tests are available, such as the Miller-Rabin primality test. Below is an implementation in Python:

    def is_prime(x):
        """Fast implementation of the Miller-Rabin primality test, correct if the generalized Riemann hypothesis holds."""
        import math
        def get_sd(x):
            """Returns (s: int, d: int) for which x = d*2^s """
            if not x: return 0, 0
            s = 0
            while 1:
                if x % 2 == 0:
                    x //= 2  # integer division, so x stays an int in Python 3
                    s += 1
                else:
                    return s, x
        if x <= 2:
            return x == 2
        # x - 1 = d*2^s
        s, d = get_sd(x - 1)
        if not s:
            return False  # divisible by 2!
        log2x = int(math.log(x) / math.log(2)) + 1
        # As long as Riemann hypothesis holds true, it is impossible
        # that all the numbers below this threshold are strong liars.
        # Hence the number is guaranteed to be a prime if no contradiction is found.
        threshold = min(x, 2*log2x*log2x+1)
        for a in range(2, threshold):
            # From Fermat's little theorem if x is a prime then a^(x-1) % x == 1
            # Hence the below must hold true if x is indeed a prime:
            if pow(a, d, x) != 1:
                for r in range(0, s):
                    if -pow(a, d*2**r, x) % x == 1:
                        break
                else:
                    # Contradicts Fermat's little theorem, hence not a prime.
                    return False
        # No contradiction found, hence x must be a prime.
        return True
    

    You can use it to find huge prime numbers:

    x = 10000000000000000000000000000000000000000000000000000000000000000000000000000
    for e in range(1000):
        if is_prime(x + e):
            print('%d is a prime!' % (x + e))
            break
    
    # 10000000000000000000000000000000000000000000000000000000000000000000000000133 is a prime!
    

    If you are testing random integers, you probably want to first check whether the candidate is divisible by any of the primes smaller than, say, 1000 before you call Miller-Rabin. This filters out obvious non-primes such as 10444344345.
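
    A minimal sketch of such a pre-filter, assuming the cutoff of 1000 suggested above. The names SMALL_PRIMES and has_small_factor are made up for this example:

```python
# All primes below 1000, found by plain trial division (done once at import).
SMALL_PRIMES = [p for p in range(2, 1000)
                if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def has_small_factor(n):
    """Return True if some prime p < 1000 with p < n divides n."""
    for p in SMALL_PRIMES:
        if p >= n:
            break
        if n % p == 0:
            return True
    return False


print(has_small_factor(10444344345))  # divisible by 5, so no expensive test is needed
```

    A candidate for which has_small_factor returns True is certainly composite; only the remaining candidates need to be passed to the more expensive test.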

  • 2020-11-22 02:45

    I compared the efficiency of the most popular suggestions for determining whether a number is prime. I used Python 3.6 on Ubuntu 17.10 and tested with numbers up to 100,000 (you can test with bigger numbers using my code below).

    The first plot compares the functions (which are explained further down in my answer), showing that the running time of the later functions grows more slowly than that of the first one as the numbers increase.

    In the second plot we can see that for prime numbers the time grows steadily, while for non-prime numbers it grows much more slowly (because most of them can be eliminated early on).
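
    The early-elimination effect is easy to see by counting modulo operations instead of measuring time. The helper below, count_trial_divisions, is a hypothetical illustration and not one of the functions I benchmarked:

```python
def count_trial_divisions(n):
    """Return (is_prime, number of % operations) for plain trial division, n >= 2."""
    steps = 0
    i = 2
    while i * i <= n:
        steps += 1
        if n % i == 0:
            return False, steps  # composite: stop at the smallest factor
        i += 1
    return True, steps  # prime: every candidate up to sqrt(n) was tried


for n in (97, 98, 91):
    print(n, count_trial_divisions(n))
```

    A prime always pays for every candidate up to its square root, whereas a composite stops at its smallest factor, which for most numbers is tiny.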

    Here are the functions I used:

    1. This answer and this answer suggested a construct using all():

      def is_prime_1(n):
          return n > 1 and all(n % i for i in range(2, int(math.sqrt(n)) + 1))
      
    2. This answer used some kind of while loop:

      def is_prime_2(n):
          if n <= 1:
              return False
          if n == 2:
              return True
          if n == 3:
              return True
          if n % 2 == 0:
              return False
          if n % 3 == 0:
              return False
      
          i = 5
          w = 2
          while i * i <= n:
              if n % i == 0:
                  return False
              i += w
              w = 6 - w
      
          return True
      
    3. This answer included a version with a for loop:

      def is_prime_3(n):
          if n <= 1:
              return False
      
          if n % 2 == 0 and n > 2:
              return False
      
          for i in range(3, int(math.sqrt(n)) + 1, 2):
              if n % i == 0:
                  return False
      
          return True
      
    4. And I mixed a few ideas from the other answers into a new one:

      def is_prime_4(n):
          if n <= 1:          # negative numbers, 0 or 1
              return False
          if n <= 3:          # 2 and 3
              return True
          if n % 2 == 0 or n % 3 == 0:
              return False
      
          for i in range(5, int(math.sqrt(n)) + 1, 2):
              if n % i == 0:
                  return False
      
          return True
      

    Here is my script to compare the variants:

    import math
    import pandas as pd
    import seaborn as sns
    import time
    from matplotlib import pyplot as plt
    
    
    def is_prime_1(n):
        ...
    def is_prime_2(n):
        ...
    def is_prime_3(n):
        ...
    def is_prime_4(n):
        ...
    
    default_func_list = (is_prime_1, is_prime_2, is_prime_3, is_prime_4)
    
    def assert_equal_results(n, func_list=default_func_list):
        for i in range(-2, n):
            r_list = [f(i) for f in func_list]
            if not all(r == r_list[0] for r in r_list):
                print(i, r_list)
                raise ValueError
        print('all functions return the same results for integers up to {}'.format(n))
    
    def compare_functions(n, func_list=default_func_list):
        result_list = []
        n_measurements = 3
    
        for f in func_list:
            for i in range(1, n + 1):
                ret_list = []
                t_sum = 0
                for _ in range(n_measurements):
                    t_start = time.perf_counter()
                    is_prime = f(i)
                    t_end = time.perf_counter()
    
                    ret_list.append(is_prime)
                    t_sum += (t_end - t_start)
    
                is_prime = ret_list[0]
                assert all(ret == is_prime for ret in ret_list)
                result_list.append((f.__name__, i, is_prime, t_sum / n_measurements))
    
        df = pd.DataFrame(
            data=result_list,
            columns=['f', 'number', 'is_prime', 't_seconds'])
        df['t_micro_seconds'] = df['t_seconds'].map(lambda x: round(x * 10**6, 2))
        print('df.shape:', df.shape)
    
        print()
        print('', '-' * 41)
        print('| {:11s} | {:11s} | {:11s} |'.format(
            'is_prime', 'count', 'percent'))
        df_sub1 = df[df['f'] == 'is_prime_1']
        print('| {:11s} | {:11,d} | {:9.1f} % |'.format(
            'all', df_sub1.shape[0], 100))
        for (is_prime, count) in df_sub1['is_prime'].value_counts().items():
            print('| {:11s} | {:11,d} | {:9.1f} % |'.format(
                str(is_prime), count, count * 100 / df_sub1.shape[0]))
        print('', '-' * 41)
    
        print()
        print('', '-' * 69)
        print('| {:11s} | {:11s} | {:11s} | {:11s} | {:11s} |'.format(
            'f', 'is_prime', 't min (us)', 't mean (us)', 't max (us)'))
        for f, df_sub1 in df.groupby('f'):
            col = df_sub1['t_micro_seconds']
            print('|{0}|{0}|{0}|{0}|{0}|'.format('-' * 13))
            print('| {:11s} | {:11s} | {:11.2f} | {:11.2f} | {:11.2f} |'.format(
                f, 'all', col.min(), col.mean(), col.max()))
            for is_prime, df_sub2 in df_sub1.groupby('is_prime'):
                col = df_sub2['t_micro_seconds']
                print('| {:11s} | {:11s} | {:11.2f} | {:11.2f} | {:11.2f} |'.format(
                    f, str(is_prime), col.min(), col.mean(), col.max()))
        print('', '-' * 69)
    
        return df
    

    Running the function compare_functions(n=10**5) (numbers up to 100,000) I get this output:

    df.shape: (400000, 5)
    
     -----------------------------------------
    | is_prime    | count       | percent     |
    | all         |     100,000 |     100.0 % |
    | False       |      90,408 |      90.4 % |
    | True        |       9,592 |       9.6 % |
     -----------------------------------------
    
     ---------------------------------------------------------------------
    | f           | is_prime    | t min (us)  | t mean (us) | t max (us)  |
    |-------------|-------------|-------------|-------------|-------------|
    | is_prime_1  | all         |        0.57 |        2.50 |      154.35 |
    | is_prime_1  | False       |        0.57 |        1.52 |      154.35 |
    | is_prime_1  | True        |        0.89 |       11.66 |       55.54 |
    |-------------|-------------|-------------|-------------|-------------|
    | is_prime_2  | all         |        0.24 |        1.14 |      304.82 |
    | is_prime_2  | False       |        0.24 |        0.56 |      304.82 |
    | is_prime_2  | True        |        0.25 |        6.67 |       48.49 |
    |-------------|-------------|-------------|-------------|-------------|
    | is_prime_3  | all         |        0.20 |        0.95 |       50.99 |
    | is_prime_3  | False       |        0.20 |        0.60 |       40.62 |
    | is_prime_3  | True        |        0.58 |        4.22 |       50.99 |
    |-------------|-------------|-------------|-------------|-------------|
    | is_prime_4  | all         |        0.20 |        0.89 |       20.09 |
    | is_prime_4  | False       |        0.21 |        0.53 |       14.63 |
    | is_prime_4  | True        |        0.20 |        4.27 |       20.09 |
     ---------------------------------------------------------------------
    

    Then, running the function compare_functions(n=10**6) (numbers up to 1,000,000) I get this output:

    df.shape: (4000000, 5)
    
     -----------------------------------------
    | is_prime    | count       | percent     |
    | all         |   1,000,000 |     100.0 % |
    | False       |     921,502 |      92.2 % |
    | True        |      78,498 |       7.8 % |
     -----------------------------------------
    
     ---------------------------------------------------------------------
    | f           | is_prime    | t min (us)  | t mean (us) | t max (us)  |
    |-------------|-------------|-------------|-------------|-------------|
    | is_prime_1  | all         |        0.51 |        5.39 |     1414.87 |
    | is_prime_1  | False       |        0.51 |        2.19 |      413.42 |
    | is_prime_1  | True        |        0.87 |       42.98 |     1414.87 |
    |-------------|-------------|-------------|-------------|-------------|
    | is_prime_2  | all         |        0.24 |        2.65 |      612.69 |
    | is_prime_2  | False       |        0.24 |        0.89 |      322.81 |
    | is_prime_2  | True        |        0.24 |       23.27 |      612.69 |
    |-------------|-------------|-------------|-------------|-------------|
    | is_prime_3  | all         |        0.20 |        1.93 |       67.40 |
    | is_prime_3  | False       |        0.20 |        0.82 |       61.39 |
    | is_prime_3  | True        |        0.59 |       14.97 |       67.40 |
    |-------------|-------------|-------------|-------------|-------------|
    | is_prime_4  | all         |        0.18 |        1.88 |      332.13 |
    | is_prime_4  | False       |        0.20 |        0.74 |      311.94 |
    | is_prime_4  | True        |        0.18 |       15.23 |      332.13 |
     ---------------------------------------------------------------------
    

    I used the following script to plot the results:

    def plot_1(n, func_list=default_func_list):
        df_orig = compare_functions(func_list=func_list, n=n)
        df_filtered = df_orig[df_orig['t_micro_seconds'] <= 20]
        sns.lmplot(
            data=df_filtered, x='number', y='t_micro_seconds',
            col='f',
            # row='is_prime',
            markers='.',
            ci=None)
    
        plt.ticklabel_format(style='sci', axis='x', scilimits=(3, 3))
        plt.show()
    