Chess Optimizations

♀尐吖头ヾ 提交于 2019-12-02 15:38:37

Over the last year of development of my chess engine (www.chessbin.com), much of the time has been spent optimizing my code to allow for better and faster move searching. Over that time I have learned a few tricks that I would like to share with you.

Measuring Performance

Essentially you can improve your performance in two ways:

  • Evaluate your nodes faster
  • Search fewer nodes to come up with the same answer

Your first problem in code optimization will be measurement. How do you know you have really made a difference? In order to help you with this problem you will need to make sure you can record some statistics during your move search. The ones I capture in my chess engine are:

  • Time it took for the search to complete.
  • Number of nodes searched

This will allow you to benchmark and test your changes. The best way to approach testing is to create several save games from the opening position, middle game and the end game. Record the time and number of nodes searched for black and white. After making any changes I usually perform tests against the above mentioned save games to see if I have made improvements in the above two matrices: number of nodes searched or speed.

To complicate things further, after making a code change you might run your engine 3 times and get 3 different results each time. Let’s say that your chess engine found the best move in 9, 10 and 11 seconds. That is a spread of about 20%. So did you improve your engine by 10%-20% or was it just varied load on your pc. How do you know? To fight this I have added methods that will allow my engine to play against itself, it will make moves for both white and black. This way you can test not just the time variance over one move, but a series of as many as 50 moves over the course of the game. If last time the game took 10 minutes and now it takes 9, you probably improved your engine by 10%. Running the test again should confirm this.

Finding Performance Gains

Now that we know how to measure performance gains lets discuss how to identify potential performance gains.

If you are in a .NET environment then the .NET profiler will be your friend. If you have a Visual Studio for Developers edition it comes built in for free, however there are other third party tools you can use. This tool has saved me hours of work as it will tell you where your engine is spending most of its time and allow you to concentrate on your trouble spots. If you do not have a profiler tool you may have to somehow log the time stamps as your engine goes through different steps. I do not suggest this. In this case a good profiler is worth its weight in gold. Red Gate ANTS Profiler is expensive but the best one I have ever tried. If you can’t afford one, at least use it for their 14 day trial.

Your profiler will surly identify things for you, however here are some small lessons I have learned working with C#:

  • Make everything private
  • Whatever you can’t make private, make it sealed
  • Make as many methods static as possible.
  • Don’t make your methods chatty, one long method is better than 4 smaller ones.
  • Chess board stored as an array [8][8] is slower then an array of [64]
  • Replace int with byte where possible.
  • Return from your methods as early as possible.
  • Stacks are better than lists
  • Arrays are better than stacks and lists.
  • If you can define the size of the list before you populate it.
  • Casting, boxing, un-boxing is evil.

Further Performance Gains:

I find move generation and ordering is extremely important. However here is the problem as I see it. If you evaluate the score of each move before you sort and run Alpha Beta, you will be able to optimize your move ordering such that you will get extremely quick Alpha Beta cutoffs. This is because you will be able to mostly try the best move first. However the time you have spent evaluating each move will be wasted. For example you might have evaluated the score on 20 moves, sort your moves try the first 2 and received a cut-off on move number 2. In theory the time you have spent on the other 18 moves was wasted.

On the other hand if you do a lighter and much faster evaluation say just captures, your sort will not be that good and you will have to search more nodes (up to 60% more). On the other hand you would not do a heavy evaluation on every possible move. As a whole this approach is usually faster.

Finding this perfect balance between having enough information for a good sort and not doing extra work on moves you will not use, will allow you to find huge gains in your search algorithm. Furthermore if you choose the poorer sort approach you will want to first to a shallower search say to ply 3, sort your move before you go into the deeper search (this is often called Iterative Deepening). This will significantly improve your sort and allow you to search much fewer moves.

Answering an old question.

Assuming you already have a working transposition table.

Late Move Reduction. That gave my program about 100 elo points and it is very simple to implement.

In my experience, unless your implementation is very inefficient, then the actual board representation (0x88, bitboard, etc.) is not that important.

Although you can criple you chess engine with bad performance, a lightning fast move generator in itself is not going to make a program good.

The search tricks used and the evaluation function are the overwhelming factors determining overall strength.

And the most important parts, by far, of the evaluation are Material, Passed pawns, King Safety and Pawn Structure.

The most important parts of the search are: Null Move Pruning, Check Extension and Late Move reduction.

Your program can come a long, long way, on these simple techniques alone!

Good move ordering!

An old question, but same techniques apply now as for 5 years ago. Aren't we all writing our own chess engines, I have my own called "Norwegian Gambit" that I hope will eventually compete with other Java engines on the CCRL. I as many others use Stockfish for ideas since it is so nicely written and open. Their testing framework Fishtest and it's community also gives a ton of good advice. It is worth comparing your evaluation scores with what Stockfish gets since how to evaluate is probably the biggest unknown in chess-programming still and Stockfish has gone away from many traditional evals which have become urban legends (like the double bishop bonus). The biggest difference however was after I implemented the same techniques as you mention, Negascout, TT, LMR, I started using Stockfish for comparison and I noticed how for the same depth Stockfish had much less moves searched than I got (because of the move ordering).

Move ordering essentials

The one thing that is easily forgotten is good move-ordering. For the Alpha Beta cutoff to be efficient it is essential to get the best moves first. On the other hand it can also be time-consuming so it is essential to do it only as necessary.

  1. Transposition table
  2. Sort promotions and good captures by their gain
  3. Killer moves
  4. Moves that result in check on opponent
  5. History heuristics
  6. Silent moves - sort by PSQT value

The sorting should be done as needed, usually it is enough to sort the captures, and thereafter you could run the more expensive sorting of checks and PSQT only if needed.

About Java/C# vs C/C++/Assembly

Programming techniques are the same for Java as in the excellent answer by Adam Berent who used C#. Additionally to his list I would mention avoiding Object arrays, rather use many arrays of primitives, but contrary to his suggestion of using bytes I find that with 64-bit java there's little to be saved using byte and int instead of 64bit long. I have also gone down the path of rewriting to C/C++/Assembly and I am having no performance gain whatsoever. I used assembly code for bitscan instructions such as LZCNT and POPCNT, but later I found that Java 8 also uses those instead of the methods on the Long object. To my surprise Java is faster, the Java 8 virtual machine seems to do a better job optimizing than a C compiler can do.

I know that one improvement that was talked about at the AI courses in university where having a huge database of finishing moves. So having a precalculated database for games with only a small number of figures left. So that if you hit a near end positioning in your search you stop the search and take a precalculated value that improves your search results like extra deepening that you can do for important/critique moves without much computation time spend. I think it also comes with a change in heuristics in a late game state but I'm not a chess player so I don't know the dynamics of game finishing.

Be warned, getting game search right in a threaded environment can be a royal pain (I've tried it). It can be done, but from some literature searching I did a while back, it's extremely hard to get any speed boost at all out of it.

Its quite an old question, I was just searching questions on chess and found this one unanswered. Well it may not be of any help to you now, but may prove helpful to other users.

I didn't see null move pruning, transposition tables.. are you using them? They would give you a big boost...

One thing that gave me a big boost was minimizing conditional branching... Alot of things can be precomputed. Search for such opportunities.

Most modern PCs have multiple cores so it would be a good idea making it multithreading. You don't necessarily need to go MDF(f) for that.

I wont suggest moving your code to bitboard. Its simply too much work. Even though bitboards could give a boost on 64 bit machines.

Finally and most importantly chess literature dominates any optimizations we may use. optimization is too much work. Look at open source chess engines, particularly crafty and fruit/toga. Fruit used to be open source initially.

As far as tips, I know large gains can be found in optimizing your move generation routines before any eval functions. Making that function as tight as possible can give you 10% or more in nodes/sec improvement.

If you're moving to bitboards, do some digging on rec.games.chess.computer archives for some of Dr. Robert Hyatts old posts about Crafty (pretty sure he doesn't post anymore). Or grab the latest copy from his FTP and start digging. I'm pretty sure it would be a significant shift for you though.

Late answer, but this may help someone:

Given all the optimizations you mentioned, 1450 ELO is very low. My guess is that something is very wrong with your code. Did you:

  1. Wrote a perft routine and ran it through a set of positions? All tests should pass, so you know your move generator is free of bugs. If you don't have this, there's no point in talking about ELO.

  2. Wrote a mirrorBoard routine and ran the evaluation code through a set of positions? The result should be the same for the normal and mirrored positions, otherwise you have a bug in your eval.

  3. Do you have a hashtable (aka transposition table)? If not, this is a must. It will help while searching and ordering moves, giving a brutal difference in speed.

  4. How do you implement move ordering? This links back to point 3.

  5. Did you implement the UCI protocol? Is your move parsing function working properly? I had a bug like this in my engine:

      /* Parses a uci move string and return a Board object */
      Board parseUCIMoves(String moves)// e2e4 c7c5 g1f3 ...{
          //...
          if (someMove.equals("e1g1") || someMove.equals("e1c1"))
              //apply proper castle
         //...
     }
    

Sometimes the engine crashed while playing a match, and I thought it was the GUI fault, since all perft tests were ok. It took me one week to find the bug by luck. So, test everything.

For (1) you can search every position to depth 6. I use a file with ~1000 positions. See here https://chessprogramming.wikispaces.com/Perft

For (2) you just need a file with millions of positions (just the FEN string).

Given all the above and a very basic evaluation function (material, piece square tables, passed pawns, king safety) it should play at +-2000 ELO.

  • Transposition Table
  • Opening Book
  • End Game Table Bases
  • Improved Static Board Evaluation for Leaf Nodes
  • Bitboards for Raw Speed

Profile and benchmark. Theoretical optimizations are great, but unless you are measuring the performance impact of every change you make, you won't know whether your work is improving or worsening the speed of the final code.

Try to limit the penalty to yourself for trying different algorithms. Make it easy to test various implementations of algorithms against one another. i.e. Make it easy to build a PVS version of your code as well as a NegaScout version.

Find the hot spots. Refactor. Rewrite in assembly if necessary. Repeat.

Assuming "history heuristic" involves some sort of database of past moves, a learning algorithm isn't going to give you much more unless it plays a lot of games against the same player. You can probably achieve more by classifying a player and tweaking the selection of moves from your historic database.

It's been a long time since I've done any programming on any chess program, but at the time, bit boards did give a real improvement. Other than that I can't give you much advise. Do you only evaluate the position of pawns? Some (slight) bonuses for position or mobility of some key pieces may be in order.

I'm not certain what type of thing you would like it to learn however...

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!