How to optimize printing out the difference between the greater and the lesser of two integers?

Submitted by 独自空忆成欢 on 2019-12-07 08:41:03

Question


This is UVa problem no. 10055, Hashmat the Brave Warrior, probably the easiest problem there. The input consists of a series of pairs of unsigned integers ≤ 2^32 (thus mandating the use of 64-bit integers…). For each pair, the task is to print the difference between the greater and the lesser integer.

According to the statistics, the fastest solutions run in under 0.01 sec. However, all my attempts typically run in 0.02 sec, with what are probably random deviations of ± 0.01 sec.

I tried:

#include <cstdint>
#include <iostream>
using namespace std;

int main()
{
  ios_base::sync_with_stdio(false);
  cin.tie(nullptr);  

  uint_fast64_t i, j;
  while(cin >> i >> j) {
    if(i > j)
      cout << i-j << '\n';
    else
      cout << j-i << '\n';
  }
}

And also:

#include <cstdlib>
#include <cstdint>
#include <iostream>
using namespace std;

int main()
{
  ios_base::sync_with_stdio(false);
  cin.tie(nullptr);  

  int_fast64_t i, j;
  while(cin >> i >> j) {
    cout << abs(i-j) << '\n';
  }
}

And also:

#include <algorithm>
#include <cstdint>
#include <iostream>
using namespace std;

int main()
{
  ios_base::sync_with_stdio(false);
  cin.tie(nullptr);  

  uint_fast64_t i, j;
  while(cin >> i >> j) {
    cout << max(i,j)-min(i,j) << '\n';
  }
}

All with the same results.

I also tried using printf()/scanf() instead of cin/cout, still with the same results (besides, my benchmarks showed that cin/cout preceded by cin.tie(nullptr) can be even a little faster than printf()/scanf(), at least unless there are ways to optimize the performance of cstdio that I'm not aware of).

Is there any way to optimize this down to below 0.01 sec., or should I assume that guys who’ve achieved this time are either extremely lucky or cheaters printing out a precomputed answer to the judge’s input?

The programs are compiled with C++11 5.3.0 - GNU C++ Compiler with options: -lm -lcrypt -O2 -std=c++11 -pipe -DONLINE_JUDGE.

EDIT: This is my attempt to combine the advice of @Sorin and @MSalters:

#include <stdio.h>
#include <stdint.h>

unsigned long long divisors[] = {
  1000000000,
  1000000000,
  1000000000,
  1000000000,
  100000000,
  100000000,
  100000000,
  10000000,
  10000000,
  10000000,
  1000000,
  1000000,
  1000000,
  1000000,
  100000,
  100000,
  100000,
  10000,
  10000,
  10000,
  1000,
  1000,
  1000,
  1000,
  100,
  100,
  100,
  10,
  10,
  10,
  1,
  1,
  1
};


int main()
{
  unsigned long long int i, j, res;

  unsigned char inbuff[2500000]; /* To be certain there's no overflow here */
  unsigned char *in = inbuff;
  char outbuff[2500000]; /* To be certain there's no overflow here */
  char *out = outbuff;

  int c = 0;

  /* Read the whole input once, leaving room for the terminating NUL */
  inbuff[fread(inbuff, 1, sizeof inbuff - 1, stdin)] = '\0';

  while(1) {
    i = j = 0;

    /* Skip whitespace before first number and check if end of input */
    do {
      c = *(in++);
    } while(c != '\0' && !(c >= '0' && c <= '9'));

    /* If end of input, print answer and return */
    if(c == '\0') {
      *(--out) = '\0';
      puts(outbuff);
      return 0;
    }

    /* Read first integer */
    do {
      i = 10 * i + (c - '0');
      c = *(in++);
    } while(c >= '0' && c <= '9');

    /* Skip whitespace between first and second integer */
    do {
      c = *(in++);
    } while(!(c >= '0' && c <= '9'));

    /* Read second integer */
    do {
      j = 10 * j + (c - '0');
      c = *(in++);
    } while(c >= '0' && c <= '9');

    if(i > j)
      res = i-j;
    else
      res = j-i;



    /* Buffer answer */
    if(res == 0) {
      *(out++) = '0';
    } else {
      unsigned long long divisor = divisors[__builtin_clzll(res)-31];
      /* Skip leading 0 */
      if(res < divisor) {
        divisor /= 10;
      }
      /* Buffer digits */
      while(divisor != 0) {
        unsigned long long digit = res / divisor;
        *(out++) = digit + '0';
        res -= divisor * digit;
        divisor /= 10;
      }
    }
    *(out++) = '\n';
  }
}   

Still 0.02sec.


Answer 1:


I would try to eliminate IO operations. Read one block of data (as big as you can). Compute the outputs, write them to another string then write that string out.

You can use sscanf or the stringstream equivalents to read from and write to your memory blocks.

I/O usually has to go through the kernel, so there's a small chance that you will lose the CPU for a bit. There's also some time cost associated with each call. It's small, but you are trying to run in less than 10 ms.
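
A minimal sketch of that structure, assuming the whole input fits in memory. The names solve and run are made up for illustration and are not from the answer, and stringstream is used only because the answer mentions it; it is unlikely to be the fastest parser.

```cpp
#include <cstdint>
#include <istream>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: solves the whole problem on an in-memory block.
std::string solve(const std::string& input) {
    std::istringstream in(input);  // parse from memory, no I/O calls here
    std::ostringstream out;        // collect every answer before writing
    std::uint64_t a, b;
    while (in >> a >> b) {
        out << (a > b ? a - b : b - a) << '\n';
    }
    return out.str();
}

// Hypothetical driver: one bulk read, one bulk write.
void run(std::istream& is, std::ostream& os) {
    std::string input((std::istreambuf_iterator<char>(is)),
                      std::istreambuf_iterator<char>());
    os << solve(input);
}
```

Calling run(std::cin, std::cout) from main gives a single read of the input followed by a single write of all answers.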




Answer 2:


printf is a Swiss Army knife. It knows many ways to format its arguments, and there can be any number of them. In this case, you want a single dedicated function, so you don't waste time scanning for the single occurrence of %d. (BTW, this is a speed benefit of std::cout <<: the compiler sorts out the overloading at compile time.)

Once you have that single formatting function, make it output to a single char[] and call puts on that. As puts does no formatting of its own, it can be much faster than printf.
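
One possible shape for such a dedicated function, as a sketch only. The name write_u64 is hypothetical, and the caller is assumed to provide a buffer with at least 20 free bytes.

```cpp
// Hypothetical helper: writes the decimal digits of v starting at out and
// returns a pointer just past the last digit. No terminator is written.
char* write_u64(char* out, unsigned long long v) {
    char tmp[20];  // 2^64 - 1 has 20 decimal digits
    int n = 0;
    do {
        tmp[n++] = static_cast<char>('0' + v % 10);  // least significant digit first
        v /= 10;
    } while (v != 0);
    while (n > 0) {
        *out++ = tmp[--n];  // copy back in the correct order
    }
    return out;
}
```

To follow the answer, append '\n' after every result except the last, write a final '\0', and pass the buffer to puts once; puts itself adds the last newline.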




Answer 3:


Here is my variant with assembler routines.

#include <iostream>
#include <string>

using namespace std;

int main()
{
   unsigned long long i, j;
   string outv;   
   while(cin >> i >> j) {
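     asm("movq %1, %%rax;"
         "movq %2, %%rdx;"
         "subq %%rax, %%rdx;"   // rdx = j - i
         "jns 1f;"              // non-negative: already the absolute difference
         "negq %%rdx;"          // otherwise negate (two's complement)
         "1: movq %%rdx, %0"    // store the result in i
         : "=r"(i) : "r"(i), "r"(j) : "rax", "rdx", "cc");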
     string str = to_string(i);
     outv += str + "\n";     
    }
    cout << outv;   
 }



Answer 4:


The trick is to use:

  • Unsafe input: see https://www.quora.com/What-is-the-fastest-input-output-method-in-C++ . On Windows, use the Microsoft thread-unsafe version ( https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/getchar-nolock-getwchar-nolock ), as in this Codeforces submission: http://codeforces.com/contest/339/submission/27533017 .

    On Linux and Mac OS, with GCC and clang, use the POSIX thread-unsafe versions (unlocked stdio): https://linux.die.net/man/3/unlocked_stdio .

  • Custom input (sometimes called naive input), which is faster than the standard functions. It reads characters from the input and converts them to an integer itself. To optimize reading from the console, read: http://stackoverflow.com/questions/705303/faster-i-o-in-c/705378 . To optimize string-to-integer conversion, read the article: http://tinodidriksen.com/2010/02/16/cpp-convert-string-to-int-speed/ , and the code: http://tinodidriksen.com/uploads/code/cpp/speed-string-to-int.cpp . For a speed comparison, read: http://codeforces.com/blog/entry/5217 and the code: https://bitbucket.org/andreyv/cppiotest/src/tip/iotest.cpp?fileviewer=file-view-default

This solution, which runs in less than 0.001 seconds, is based on the UVa Online Judge submission http://ideone.com/ca8sDu by http://uhunt.felix-halim.net/id/779215 ; it has been abridged and modified:

#include <cstdio>

#define pll(n) printf("%lld ",(n))
#define plln(n) printf("%lld\n",(n))
typedef long long ll;

#if defined(_WINDOWS) // On Windows GCC, use the slow thread safe version
inline int getchar_unlocked() {
    return getchar();
}
#elif defined  (_MSC_VER)// On   Visual Studio
inline int getchar_unlocked(){
    return _getchar_nolock(); // use Microsoft Thread unsafe version
}
#endif 

inline int  scn( ll & n){
     n = 0;
     int  c = getchar_unlocked(),t=0;
    if (c == EOF) 
        return 0;
    while(c < '0' || c > '9') {
        if(c == EOF)   // input ended while skipping separators
            return 0;
        if(c == '-')
            t=1;
        c = getchar_unlocked(); 
    }
    while(c >= '0' && c <= '9'){
        n = n *10+ c - '0';       
        c = getchar_unlocked();
    }
    if(t!=0)
        n *=-1;
    return 1;
}

int main(){
    ll n, m;
    while(scn(n)+scn(m)==2){
        if (n>m)
            plln(n - m);
        else
            plln(m - n);
    }
    return 0;
}


Source: https://stackoverflow.com/questions/44108745/how-to-optimize-printing-out-the-difference-between-the-greater-and-the-lesser-o
