Null move pruning

The null move heuristic has been added and yielded another pretty solid increase in Elo.

For the uninitiated, a “null move” isn’t a real over-the-board move of course, it’s just a heuristic that can be used to terminate a line of search early, before any moves are even generated. The heuristic is based on the “Null Move Observation,” which says that it is almost always better to do something rather than nothing.

The basic idea is to actually try “doing nothing” and seeing if that would allow the opponent to improve their position. If they can’t, odds are that when you really do make a move your position will be too good (the opponent will do something else which prevents this entire line anyway). An example might be capturing the opponent’s queen with a pawn, leaving none of my pieces hanging (en-prise). No matter what the opponent does in response, he’s probably losing, so he’ll make a move which avoids leaving his queen hanging in the first place.

For this to actually save time, the search that is done after “doing nothing” (allowing the opponent a second move) is done to a reduced depth. How much to reduce is the big question – too much and you will make some very unsafe and risky cutoffs. Too little and you’re not really saving much time. A very conservative value would be R=2 or R=3, but with increasing processor speeds and greater and greater search depths many programmers push this value to extremes. It should all be backed by testing. Currently I am using R=3 but I do plan to revisit this in the future.

I am also reducing that reduction towards the leaves to ensure there is at least one ply of full width depth search remaining before dropping into the quiescence search. The idea here is to give the search an opportunity to resolve checks (the qsearch is currently capture only). My programs also avoid using this heuristic in positions that are likely zugzwang, meaning any move could potentially be harmful (which violates the Null Move Observation the heuristic is based on).

You’d have to know a little about the alpha-beta algorithm for a more technical description to make sense, but there’s really no reason for me to do that when you can read all about it on CPW – https://www.chessprogramming.org/Null_Move_Pruning or on this old-but-excellent description by Bruce Moreland – http://web.archive.org/web/20070708232843/http://www.brucemo.com/compchess/programming/nullmove.htm .

The important question – how much is it worth?

WinsLossesDrawsPctEloError
78051670456.6%+46.13+/- 12.29
81250468457.7%+53.93+/- 12.41
chess4j self play match, with null move vs. without, 5+0.25

… and …

WinsLossesDrawsPctEloError
237111864528.0%-164.29+/- 13.13
Prophet 4 vs Prophet 3, 10+0.5

So once again a pretty large jump, and of course I’m very pleased. Prophet4 is now within 200 elo of Prophet3 and there is still a pretty good list of enhancements to implement.

The next four planned changes are going to be smaller and perhaps more difficult to measure: (1) stopping the iterator early if less than half the allotted time remains, (2) proper treatment of mate scores in hash (rather than the “cheat” that’s in place now), (3) aspiration windows within the iterative deepening function, and (4) re-examination of the three-fold rep: is a two-fold repetition good enough? I don’t expect any of those will be big changes on their own, but I’m hoping that cumulatively they’ll be worth a few points. But, you never know, which is why we test.

Posted in chess4j, Computer Chess, Prophet | Comments Off on Null move pruning

Hashing

Both chess4j and Prophet now have hashing. Maybe I should say that chess4j has hashing again, because it did have hashing before as I wrote about here. As stated in a previous post, I basically “tore down” the search in chess4j in order to build it back up alongside P4, so I could test equivalence as I go.

The replacement strategy is very simple and certainly an area for future work. For every single node in the full width search where moves are expanded and searched (so these wouldn’t include quiescence nodes, leaf nodes, or nodes where a draw caused an early exit), the result is stored in the hash table. For every node of the full width search except the root, the hash table is probed. The hash entry includes not just the score for the position, but the depth of the search that produced it, and the type of node (PV, ALL, CUT) as well so we know if the score represents a bound or an exact value. In the case of PV and CUT nodes, the “best move” is also stored to aid in move ordering. The hash table is not “bucketed” – an entry consists of just a single slot, so anytime something is hashed any previous values are overwritten. This is referred to in the literature as the “always replace” strategy. It’s very simple, but very effective. In the future I intend to experiment with probing from the qsearch, different replacement strategies (such as “depth preferred”) and expanding the number of values that can be mapped to a single key.

When a hash probe is done and a value found, if the depth associated with the entry is at least as big as the depth we intend to search, we may be able to immediately exit the search without expanding any moves. If not, we at least have a move to try. If we do have to search and there is a suggested move from the hash table, it is tried first, before any moves are generated.

The results of two runs of chess4j with hash vs. without:

WinsLossesDrawsPctEloError
95943960263.0%+92.4612.97
98146955062.8%+90.9713.22

… and Prophet4 vs Prophet3:

WinsLossesDrawsPctEloError
145136049519.6%-244.9214.94

A very solid jump in strength; I’ll take it. 🙂

I’ve been putting some thought into my testing regime. In the past I’ve tested changes by running gauntlets of 10+0.5, which can take quite a long time on a single processor if you really want to get the error bars down. I have an older laptop that I’ve dedicated to testing. It has two cores, so it would be nice to utilize both of them. So, I did some self play tests with both of my programs, running two games concurrently, and happily the results came out pretty close to 50%. Additionally, I also experimented with reducing game times in self play matches from 10+0.5 down to 5+0.25. The results were still around 50%, and without any timeouts. So, at the moment my testing regime is to run a self play match, and then to run a P4 vs P3 match as verification. I think this will work until I’ve reached parity with P3, at which point I intend to go back to gauntlet testing. It would be good to introduce some statistical methods to measure the confidence a change is good (and to use for early termination), particularly as the elo become harder to come by and more and more games are needed to measure the change. I imagine at some point I’ll need more cores for testing as well.

Posted in chess4j, Computer Chess, Prophet | Comments Off on Hashing

Quiescence Search

It’s not very often that you can make a change that will net your program 350 Elo, but that’s exactly what I just did! A quiescence search was added to chess4j and Prophet4, resulting in a dramatic increase in playing strength.

The main issue with a fixed depth search that simply calls a static evaluation at the leaves is the horizon effect. A quiescence search attempts to minimize this issue by performing a more limited (highly selective) search at the leaves, until the position becomes quiet. There is some variability between programs as to what moves to expand, but typically they would be capturing moves, check evasions and possibly checking moves. Promotions or other threatening moves may also be considered. The danger is search explosion – expanding too many moves will cause the subtree to be too big and take too much time to resolve, possibly at the expense of starting a new iteration of the full width search. Therefore, it may become necessary to limit the quiescence search in some way, such as a maximum depth.

My programs are currently only considering captures, though I do plan to experiment with checks at some point. Captures seemed to be a safe starting point, sort of a “base line,” and minimize the chance of a search explosion since capture sequences tend to “fizzle out” at some point (though I’m sure there are some pathological positions out there that would cause problems).

Results of a chess4j self play match: with a quiescence search vs without:

WinsLossesDrawsPctEloError
16831601570.881347.3621.23

… and the latest Prophet4 vs Prophet3:

WinsLossesDrawsPctEloError
8316612560.105-371.3320.18

Still a long way to go to catch P3, but considering that before the quiescence search was added the results were more like 5 wins, the needle has moved quite a lot.

The next planned addition – hash tables – should net another large gain. I’m not sure that I’ll ever see this kind of increase again, but I expect it will be significant.

I’m also starting to do some experimentation to devise a testing strategy. I’ll write about that in another post when I have some results to share.

Posted in chess4j, Computer Chess, Prophet | Comments Off on Quiescence Search

Pondering and Analysis

chess4j can now ponder! If you’re not familiar with that term, it means that the program will make use of the time the opponent is on move. It does this by predicting the move the opponent will make, and thinking about what its response will be.

Let’s say that the program would spend X seconds formulating its response. The opponent makes its move in Y seconds, and it makes the predicted move. If X <= Y, the program can move instantly. If X > Y, then the program only needs to spend X-Y more seconds. If, however, the program did not correctly predict the opponent’s move, it will need to throw out the work and start over.

There are other ways to implement pondering. One variation is for the engine to start thinking about what it would do if it were the opponent. The idea is that when the opponent does move, the hash tables will be full of good information that should make the search move along very quickly. It’s hard to see how this can be more effective than the “guess” approach though, assuming the prediction rate is fairly good. A 50% prediction rate is completely reasonable. That’s a lot of the opponent time being utilized by the engine.

So how many ELO is pondering worth? Conventional wisdom is that a doubling of speed is worth around 70 ELO, so that should serve as an upper bound (effectively using 100% of the opponent’s time would conceptually be equivalent to a doubling of time). If we effectively use about 50% of the opponent’s time, I would expect a 30-40 ELO boost.

I ran some tests, chess4j with pondering vs without pondering. This was mainly a stability test; I wouldn’t put too much stock in the results as the prediction rate against itself will obviously be a lot higher than it would against another opponent. Still, the results are interesting.

With PonderingWithout PonderingDrawRate
10106513390.670

Clearly pondering is a big advantage.

Analysis mode is also supported now, meaning it’s possible to analyze games or just try different moves to see the engine’s “opinion” of each. Make a move, let the engine think for a while, back up and try another move, etc.

One final note: Pondering and Analysis mode are not supported in Prophet directly, but they are supported by chess4j when using Prophet. This highlights the benefit of the JNI integration – keep the native engine “lean and clean” and implement the ancillary functions in the high level language.

Now, onto a quiescence search.

Posted in chess4j, Computer Chess | Comments Off on Pondering and Analysis

chess4j + Prophet4 rewrite status 6/14/20

Support for time controls has been added to both Prophet and chess4j. Each is capable of playing with a fixed time per move or incremental chess clocks. They can use conventional chess clocks as well, but the timing strategy doesn’t really take into account that time will be added after the time control is up (for example 40 minutes for the first 40 moves, then 5 more minutes for each 10 moves, etc.).

Another addition that was tricky to get right was the JNI support to print changes in the principal variation during a search. I ended up adding a callback function that is invoked from the search when the PV changes at the root. When using P4 as a standalone engine, that callback would be a simple “print PV” function. When invoking the P4 search from chess4j, the callback is a method in the JNI layer that calls a similar “print PV” function in the Java code. It was tricky, but works beautifully.

I mentioned in my last post that I was kicking off some self play matches in P4 using 10 second matches with 0.5 second increments. This did uncover a couple of bugs, but everything is stable now. Unlike fixed depth matches where you would expect an exact .500 result, the timed matches will introduce a small amount of variability as the system becomes “busy” at times which will cause the search to stop at different points. So, the result of a 4000 game match was close to but not exactly 0.500.

Just out of curiosity I also played a 2000 game match of P4 vs P3. There was absolutely no doubt that P3 should win this match hands down. P4 is missing many search algorithms that are worth lots of ELO. I just wanted to test for stability and get a baseline measurement to gauge future improvements against. The result – 5 – 1959 – 36! Clearly a long way to go, but search improvements are coming soon.

Next up: analysis support, ponder support in chess4j, and a Windows build of the chess4j + P4 bundle. Once those features are complete, it will be time to start re-adding those search enhancements to climb the ELO ladder. Can’t wait.

Posted in chess4j, Computer Chess, Prophet | Comments Off on chess4j + Prophet4 rewrite status 6/14/20

Prophet4 rewrite status 5/31/20

The last couple of weeks have been focused on transitioning P4 from a single threaded engine that plays to a fixed search depth to a multi-threaded engine that can respond to commands while searching, and can search for a fixed period of time.

As soon as the code to use threads for searching was complete, I immediately pulled out an old laptop, got everything up to date, and turned it loose on doing a series of fixed depth self play matches using the fantastic Cute Chess: https://github.com/cutechess/cutechess .

The first test was to run depth 1 vs depth 1, then 2 vs 2, up to 5 vs 5. Each match was 20,000 games using over 1000 starting positions. The results are below. The score itself isn’t interesting, only that the end result was a perfect .500 in each case, and that there were no crashes, stalls, etc.

DepthWinsLossesDrawsScore
122552255154900.500
27147714757060.500
38913891321740.500
47804780443920.500
59123912317540.500

100,000 self play games (actually 200,000 if you consider that the program played both sides) is a pretty good indication that it’s stable, but just out of curiosity I also played some fixed depth self play matches where one side was able to search deeper than the other. Naturally you would expect the side that can “see further” to have a large advantage, and indeed that is the case. The table below shows the results of a series of 20,000 game matches.

12345
10.50.0120.001
20.9880.50.0380.045
30.9990.9620.50.10.029
40.9550.90.50.095
50.9720.9050.5

As you can see, even a 1 ply advantage is enormous, at least at very shallow depths. I believe that advantage would diminish at larger depths, but our branching factor isn’t good enough to play any longer games without it taking forever. Anyway, the point of this exercise wasn’t really to quantify the advantage of a larger search depth, but to test P4’s stability, and I’m happy to report it’s rock solid. P4 has played around 1 million fixed depth games at this point with 0 crashes.

I’m starting another round of testing now, using incremental clocks – 10 seconds per side with 0.5 second increment. I don’t expect any stability issues but I’d like to see that the engine never runs out of time (or at least very, very rarely).

Posted in Computer Chess, Prophet | Comments Off on Prophet4 rewrite status 5/31/20

chess4j + Prophet integration status 5/17/20

Several months ago I wrote about a proof-of-concept JNI integration between chess4j and Prophet4. At the time I had managed to compile Prophet4 as a static library, load that library into chess4j and write a JNI layer. The only function the Prophet4 library was serving at the time was for endpoint evaluation, but it was enough to show the idea was viable. Based on the success of the proof-of-concept, I’ve decided to go “all in” with this chess4j + Prophet4 integration. And, it’s going really well.

Since the time I wrote that article, I’ve added a search function to Prophet4 with some basic heuristics, as well as an iterative deepening driver. I’ve also added some additional JNI layer code that allows chess4j to utilize Prophet’s search. The native code runs 2-3x faster, so there is a tremendous speed improvement.

I’ve also realized some benefits when it comes to testing. Just to be able to compare apples to apples, I’ve stripped a lot of chess4j code out to simplify the search to the exact equivalent of Prophet’s. It didn’t really bother me too much to do that, because I wanted to refactor it to improve the test coverage anyway. Now, I’m building both programs back up, feature by feature. I’ve found that it’s often easier to build more comprehensive tests in Java, largely due to the ease of “injecting” mocked dependencies. As a feature is implemented in chess4j and I’m satisfied with the testing, I’ll implement the same feature in Prophet4. I still add tests in P4, using Google’s test framework, but not always as comprehensive. Then, the JNI layer is leveraged to ensure that the searches traverse exactly the same tree, produce the same score, and that the iterative deepening driver produces exactly the same PV. Proving that the searches are equivalent gives a sort of transitivity argument.

So, chess4j and Prophet4 are now and probably forever linked. In some sense you could consider chess4j a Java port of Prophet4, or Prophet4 a C port of chess4j. But, there’s more to it than that. The programs serve different purposes. chess4j will be the more “feature complete” engine. It contains an opening book, where P4 does not. It will be capable of pondering, but P4 may not. Prophet4 will be the lighter weight / faster of the two and more suitable for testers that focus on engine-to-engine testing. The chess4j releases will include platform specific builds that utilize P4 as a static library, and for any platforms that aren’t included, the Java engine will still work.

For future work I’m envisioning an SMP implementation such as Young Brothers Wait in P4. P4 is the more likely to use tablebases. Since chess4j uses P4 as a static library, it will automatically benefit from those features. Macro level work such as distributed computing and learning experiments are more likely to be done in chess4j. Who knows, but for now I’m having fun. I guess that’s the main benefit.

Posted in chess4j, Computer Chess, Prophet | Tagged , | Comments Off on chess4j + Prophet integration status 5/17/20

Prophet4 rewrite status 3/24/20

Killer moves have been implemented. Two killer moves are kept per ply. A killer move is defined as a non-capturing move that causes a fail high. A new killer move is stored in “slot 1,” while the previous killer move in slot 1 is shifted to slot 2. Killer moves are tried after capturing moves, before any other non-capturing moves are even generated.

The next item on the agenda is to add draw checks into the search, then iterative deepening and “last PV” move ordering.

Posted in Computer Chess, Prophet | Comments Off on Prophet4 rewrite status 3/24/20

Prophet4 rewrite status 2/16/20

Added a simple depth first search. The search uses the alpha/beta algorithm and recognizes checkmated and stalemated positions but otherwise has no “intelligence.” It is a simple fixed depth search without iterative deepening or timing.

The next phase of work will be to add some basic move ordering, such as trying the previous PV first, MVV/LVA capture scoring and killer moves.

The JNI integration with chess4j has gone extremely well. chess4j is capable of using Prophet for search, so after this rewrite is complete there will likely be a chess4j bundle that includes P4 for some platforms. I am leaning towards keeping P4 as a lightweight, XBoard compatible “calculation engine” and leveraging chess4j for ancillary functions such as opening book, pondering support, and distributed computing support.

Posted in chess4j, Computer Chess, Prophet | Comments Off on Prophet4 rewrite status 2/16/20

chess4j + Prophet4 POC

I just wrapped up a successful proof-of-concept to marry up my two chess engines – the Java based chess4j and the C based Prophet4 . Note, Prophet4 isn’t even a complete engine yet – it doesn’t even have a search! The idea for the POC was to see what it would look like for chess4j to load Prophet4 as a static library and call down into the native code for endpoint evaluation. It has worked out fairly well. The “bridge” is the Java Native Interface (JNI).

Why would I want to do this? Well, there are advantages and disadvantages to writing a chess engine in a high level language like Java. (I know C is technically a high level language too, but it’s much closer to the hardware than Java.) It is certainly easier to quickly write and test new code in Java (well, to me it is). It is easier to maintain – the refactoring support in IntelliJ is fantastic. There are almost limitless open source libraries available for things like opening books and even machine learning. There are high level abstractions to support functional programming and parallel processing. In short, programming in a higher level language is just more productive. The major disadvantages are that they tend to run slower and are less memory efficient. Can we have the best of both worlds?

I can envision writing the core functionality in native code for speed of execution, but implementing the engine itself with all the bells and whistles in Java. The protocol code, opening book, search iterator, pondering code and distributed (multi-machine) search could be implemented at a high level. When it’s time to execute a search, drop down to native code. The search in the native code could efficiently use multiple processors.

In this POC, calling the Prophet4 library to do the endpoint evaluation slowed the engine down by more than 50%. That’s not unexpected though. There is an overhead to “crossing the bridge.” Evaluating a single endpoint is simply too granular a unit-of-work to overcome that overhead. I’m sure that when the C code is called to perform a search, there will be a tremendous speedup.

One major drawback to this entire idea is that it flies in the face of one of the main benefits of Java, which is machine independence. By relying on native code, you are once again tied to a specific architecture. That means platform-specific build artifacts: Linux, Windows, Mac, for both 32 bit/64 bit if desired. This being a personal project, I’m not terribly worried about it. I could always leave a Java implementation of the search in place as a fallback if I wanted to – still deciding.

The next major task on my list is to implement a search in Prophet4. As that develops I will continue to play around with this idea. We’ll see where it goes, but so far I’m feeling pretty good about it.

Posted in chess4j, Computer Chess, Prophet | Tagged , | Comments Off on chess4j + Prophet4 POC