I’m releasing version 3.5 of my Java based chess engine, chess4j. The only changes since 3.4 are some bugfixes to the ponder search and changes in some logging levels.
It’s been about a year since the last release of Prophet3, which was version 20170909. The main changes since then are the addition of a pawn hash table, improved Winboard compatibility, and a ponder search (*). I also added a depth preferred hash table, and tried using it in combination with an “always replace” table, but surprisingly that seemed to be slightly weaker. I plan to revisit this idea – it seems it should work.
* – A ponder search means Prophet will take advantage of the time the opponent is “on move.” Instead of idling, it assumes the opponent will make the next move in the principal continuation evaluated from the last search. If the opponent doesn’t make that move, then nothing is lost, we just start the search over. If, however, the opponent does make the predicted move, then we are already that much closer to having a response ready. In some cases, we may play an instant response. This can be a huge time saver, allowing us to allocate slightly more time-per-move to the engine.
This version isn’t much stronger without pondering, but with pondering on it gains about 40 elo vs my standard set of opponents. Here are the current test results (without pondering):
|11||prophet 2.0 e1||-106||19180||40%||-35||18%|
chess4j 3.4 is officially released. The main changes in this release are the hashing related changes that I discussed some time back in this post. Other than that, there are many smaller changes, many of which I’m sure I’ve lost track of, but some that come to mind are: converting to Java8 syntax, incorporating Scala in testing, and improved Winboard compatibility.
Also, the source code has been moved from SourceForge to Github. The new project website is https://github.com/jswaff/chess4j.
The next major area of work will be to parallelize the search, but I anticipate another minor release before that work begins.
Hashing is an area I’ve meant to revisit in chess4j since it was originally implemented, for a couple of reasons. (1) The hash entries consume too much space, limiting the number of entries the hash table can store, and (2) in preparation for multi-processing capabilities. I’ll address the first issue in this post, and follow up on the second at a later date.
If you don’t know what a hash table is used for, I recommend reading this article: https://chessprogramming.wikispaces.com/Hash+Table .
Up until recently the table entries in chess4j had the following structure:
TranspositionTableEntry long zobristKey (64 bits) TranspositionTableEntryType type (32 bit reference to an Enum) int score (32 bits) int depth (32 bits) Move move (32 bit reference to an Object)
So how much space does a single hash entry take? It’s not very straightforward to figure it out. With C you can use the ‘sizeof’ operation, but things get more complicated with object oriented languages. But, we can make some educated guesses and do some hand waiving to get some estimates.
The zobristKey, score, and depth members are all primitive: longs are 64 bits, and ints are 32 bits each. The ‘type’ member is a reference to an enum. An assumption is that references are 32 bits, since we’re using less than 32GB of heap — though that is JVM specific. By the same logic ‘move’ is also a 32 bit reference. Therefore, the TranspositionTableEntry itself is 64 + 32 + 32 + 32 + 32 bits == 192 bits == 24 bytes. (That likely ignores some OO “overhead.”)
That is already pretty bad, but it’s not even the complete picture. Note ‘move’ is a reference to an instance of a Move object. The Move object being referenced has a cost associated with it too. That object can’t be deallocated from the heap as long as we’re referencing it. So how big is a Move?
Move Square from Square to Piece piece Piece captured (possibly null) Piece promotion (possibly null) boolean castle boolean epCapture
Again, it’s difficult to come up with an exact number, but we have between 3 and 5 references (2 Square and 1-3 Piece) to Singleton objects, and two booleans which are 8 bits each. So, somewhere between 14 and 22 bytes. Note that not every hash entry will have a Move associated with it: upper bounds for example, but most would.
All in all, that makes the cost of a single hash entry somewhere between 24 and 46 bytes! Doing a little more hand waiving, let’s assume an average of 38 bytes (which is probably low). That means a fully loaded hash table would have the following maximums:
So how can this be improved? It turns out that with a little bit twiddling it’s possible to encode the type, score, depth and move members into a single 64 bit value!
TranspositionTableEntry long zobristKey (64 bits) long val (64 bits)
Not only does that reduce the overhead of an entry from 192 bits to 128 bits (24 bytes to 16 bytes), but it completely eliminates the need to keep those Move objects on the heap! Now a fully loaded hash table has the following maximums:
That’s a fantastic improvement in terms of storage capacity. Essentially we’re trading the computational cost of doing the transformations for space. The theory is that the extra space, which allows us to store far more entries, will more than offset that computational cost. So does it work?
chess4j has a command line parameter “-hash=NNN” that can be used to set the number of hash entries. NNN must be a power of 2. I ran several tests with the old and new hash schemas. Using the Silent but Deadly test suite at 30 seconds per problem, I measured average depth achieved at various combinations of max heap space (the Java -Xmx VM argument) and max # hash entries.
Results for the old hash schema:
|Heap size (-Xmx)||NNN=1m||NNN=2m||NNN=4m||NNN=8m||NNN=16m||NNN=32m||NNN=64m||NNN=128m|
OOM = “Out of Memory”. The green cells indicate the peak performance setting for that combination of heap space and # hash entries. As you can see, performance can actually decline when the number of hash entries is too large for the given heap space, even before the JVM throws an Out of Memory error.
Results for the new hash schema:
|Heap size (-Xmx)||NNN=1m||NNN=2m||NNN=4m||NNN=8m||NNN=16m||NNN=32m||NNN=64m||NNN=128m|
So, it appears the new hash schema indeed performs better, particularly with smaller heap sizes.
It occurred to me during testing that perhaps the JVM should have been initialized with the ‘-Xms’ argument equal to ‘-Xmx’, in order to not “penalize” the program for the overhead of increasing the heap, since it will surely grow to capacity. I reasoned that, using a large initial -Xms value would likely make the larger heap sizes look even better relative to the smaller ones, but doesn’t really matter if the point of the test is to compare performance of one schema to the other, as long as all settings are consistent.
Node rate per second, hash hit rate, and hash collision rate were also measured, but were less interesting. Node rate and hash collision rate predictably decreased with successively larger tables while hash hit rate increased.
It’s worth pointing out that my experience here is similar to Jonatan Pettersson’s with Mediocre Chess. See http://mediocrechess.sourceforge.net/guides/memoryinhashtables.html
These changes will be part of the 3.4 release. For the impatient, the changes to the main hash table have already been merged into master on the project github repo.
Back in early March I added aspiration windows to Prophet with a window of 1/3 pawn, as well as some futility pruning and delta pruning. My schedule hasn’t allowed me to do a lot of work on the code since then, so I took advantage of the time to find some new sparring partners. The results are in the table below.
These games — almost 200k of them — are played at 3+0.5. That is, the clock starts at 3 seconds and 1/2 second is added after each move. That may seem like a fast game but when you’re going for quantity, it’s actually slower than I’d like. That’s about as fast as I could go on my older test system without getting a lot of time forfeits.
I’ve actually dropped Beowulf, Prophet2, and Matheus off the active list. The others should keep Prophet busy for a while.
With summer coming and other commitments starting to wind down for a while, I’ve started working again. The latest version seems to be about +10 ELO above this version.
I just released chess4j 3.2. This release is purely about optimizing for speed. That is, there are no changes to the search or evaluation terms.
chess4j is now a bitboard engine, and uses magic bitboards for move generation. Simply by replacing some loops over all squares with loops over (possibly sparsely populated) bitboards, and probably most importantly getting rid of some managed collections stuff, chess4j 3.2 is 4-5x faster than 3.1. In a head to head match of almost 700 games with 3.1, 3.2 scored 79% which translates into a 200-250 elo gain.
One other important difference is that the opening book is now off by default. If you want to use the opening book included with the release, you must add “-book=book.db” to the command line. (This is done in the included chess4j.bat.)
I think I can still get some more ELO just off speed optimizations and efficiency improvements (such as splitting up capture and noncapture move generation), so there will probably be one more release in the 3.x line before moving onto 4.0. The 4.x line will be focused on parallel search.
As usual you can grab the latest and greatest from the project website. Enjoy!
Several months ago I decided to get a little more rigorous about how I test changes to Prophet. With previous versions of Prophet, I would make a change, run some test suites consisting of a few hundred or a thousand tactical positions, play a few games online to convince myself the change was good, and that was it. That doesn’t really cut it any more though. Fruit and other engines that have followed suit have shown the importance of a having a solid testing methodology that accurately measures how effective changes are. So, to that end, I found some sparring partners and took my first measurements:
Then Christmas happened, and I found myself with a little spare time to work on Prophet. I implemented bitboards, and then magic bitboards, and a few other speed optimizations. Suddenly it looked like I needed some new sparring partners!
Here is what I ended up with. I like this a lot, as Prophet3 is right in the middle with some engines below and above. That seems like a pretty good cross section. As Prophet3 continues to improve, I’ll just add newer strong engines to the mix and drop the bottom ones off.
I just released version 3.1 of chess4j, my Java based chess engine. This version is about 100 elo stronger than 3.0. The key differences are:
- The Groovy language is no longer used. Groovy is a fine language for some purposes. chess4j didn’t make heavy use of it but did use it for some I/O type classes, which I wrote about in this post. Unfortunately, it was a constant struggle with my IDE of choice (STS). It constantly balked and complained about the Groovy classes. I had hoped that in time IDE support would improve but in the end I just got tired of constantly messing around with it. On a positive note the chess4j jar file went from 11.5 mb to 4.5 mb.
- Some modest search improvements, worth about 20 elo.
- Added pondering. This means that when it’s the opponent’s turn to move, chess4j is already thinking about its response. If chess4j correctly predicted what the opponent will play, then it is that far ahead in formulating a response (it may even be able to play its move instantly). Seems to be worth about 80 elo.
In my testing, chess4j 3.1 scores 53% against 3.0 in a 1000 game match without pondering, and 64% in a 1600 game match with pondering.
I have my sights set on a parallel search for the 4.0 release, but I think there will be another minor release or two before that to work on search speed optimizations.
You can grab the latest and greatest from the project website. Enjoy!
After a VERY long absence from server play, the rewrite of my chess engine Prophet is finally far enough along to play some games online. The engine itself has been capable of play for a little while now, but I’ve establishing a testing routine and wanted to get an opening book.
Previous versions of Prophet had a custom book format, using fseek() and the like on a binary file. I may go back to that at some point but right now Prophet 3 uses SQLite to query the book built for chess4j.
Anyway, the engine has been online ICC most of the day:
67: + 1632 W 1469 brittabeast [ br 5 3] D11 Res Nov 30 16 15:29
66: + 1622 B 1479 brittabeast [ br 5 3] B03 Mat Nov 30 16 15:23
65: + 1611 W 1466 ituchesstech [ br 5 3] D08 Mat Nov 30 16 15:13
64: + 1600 B 1533 Knipshun [ br 5 3] A00 Res Nov 30 16 14:56
63: + 1586 W 1547 Knipshun [ br 5 3] C00 Res Nov 30 16 14:53
62: + 1570 B 1563 Knipshun [ br 5 3] A00 Res Nov 30 16 14:49
61: + 1553 W 1092 Darkpower [ br 5 3] A04 Res Nov 30 16 14:42
60: + 1551 B 1094 Darkpower [ br 5 3] B00 Res Nov 30 16 14:37
59: + 1549 W 1096 Darkpower [ br 5 3] D35 Res Nov 30 16 14:33
58: + 1547 B 1098 Darkpower [ br 5 3] B02 Res Nov 30 16 14:29
57: + 1545 W 1100 Darkpower [ br 5 3] C41 Res Nov 30 16 14:23
56: + 1543 B 1102 Darkpower [ br 5 3] B07 Res Nov 30 16 14:16
55: + 1541 W 1104 Darkpower [ br 5 3] D31 Res Nov 30 16 14:14
54: + 1539 B 1106 Darkpower [ br 5 3] B06 Res Nov 30 16 14:04
53: + 1537 W 1108 Darkpower [ br 5 3] D37 Res Nov 30 16 13:59
52: + 1534 B 1111 Darkpower [ br 5 3] B01 Res Nov 30 16 13:53
51: + 1531 B 1520 zalk [ br 5 3] A00 Res Nov 30 16 13:50
50: + 1514 W 1537 zalk [ br 5 3] B06 Res Nov 30 16 13:45
49: + 1495 W 1233 TT [ br 5 3] A04 Fla Nov 30 16 13:33
48: + 1695 B 1431 amr [ Br 2 1] B00 Mat Nov 30 16 13:28
20/20 isn’t too bad! Granted, it helps that it was vastly underrated, but still…what fun 🙂
After a long period of inactivity I’m happy to announce the release of version 3.0 of my Java chess program, chess4j. This version is a good bit stronger than the previous version, though still not as strong as my C/C++ program Prophet (see below). The key areas of improvement are:
- The addition of an opening book, which I discussed here.
- Improvements in the positional evaluation function
- Passed pawns are rewarded
- Isolated pawns are penalized
- Doubled pawns are penalized
- Rooks on open files or half open files are rewarded
- Rooks and queens on the 7th rank, particularly when connected to another rook or queen are rewarded
- There is some attempt to keep the king sheltered until the endgame, and then encouraged to move towards the center
- Almost 2x speed increase
The table below compares the performance of chess4j v2 against v3 on several different test suites. Each suite was run twice and the results averaged. I highlighted changes of +10% or more in green (good), and -10% or more in red (bad). The top 4 suites are tactical suites. There weren’t any significant changes here, despite the 2x speed increase. However, the ‘SBD’ suite and all the ‘STS’ suites below that are positional suites, and there were some differences there. Key areas of improvement (according to these tests) were ‘undermining’, ‘open files and diagonals’, ‘bishop vs knight’, ‘offer of simplification’, and ‘queens and rooks to the 7th rank’. The one suite marked in red measures advancement of the f/g/h pawns. I think the added king safety code has something to do with that: the program is less willing to push those pawns if the king is castled on that side.
|Suite/Time||v2 raw||v2 %||v3 raw||v3 %||Delta raw||Delta %|
The table below shows chess4j’s performance in a head-to-head match verse four other engines (the bottom one being my engine, Prophet). The match conditions were: only one processor could be used, no thinking on the opponents time, and no opening book libraries or endgame tablebases. The matches were done using a set of 20 starting positions called the ‘Nunn Test Suite’. The engines play from each position twice- once as white and once as black, making the entire match 40 games. One thing that may jump out is that the results don’t seem consistent. Why the variability against the same opponent using different times? It very well could be that chess4j is better at longer time controls, but it would take thousands of games to prove that theory. This game set is too small to be statistically significant. The main point was to get a feel for how strong the engine is.
|Opp||1 min||3 min||10 min|
While there is certainly a lot to do with the evaluation, it’s not so far behind Prophet’s that it would explain such a lopsided result. I’m afraid the main disadvantage chess4j has comes down to speed. On my laptop I typically see chess4j search between 180,000 and 200,000 positions per second. That may sound like a lot but this (32 bit compile of Prophet) searches between 800,000 and 1 million positions per second (and there are programs out there that search much faster than that!). To test that theory I ran another set of matches against the same opponents using time odds. In the table below, the 2x, 4x, and 8x columns show how chess4j would fare in a 1 minute match if it were 2x, 4x, or 8x faster than it is now. As you can see, if chess4j were 5-10 times faster than it is it would be much more competitive against this group.
Is it possible to make chess4j that much faster? Maybe. I was able to improve the speed of this version by a factor of two by doing a careful analysis of the data structures being used (mostly Collections stuff, like ‘ArrayList’ vs ‘HashSet’ and such), and looking up their runtime complexities for different operations (like adding, or iterating…) and changing the data structures where appropriate. But, I think I’ve gone about as far as I can down that road. This is a Java program after all. It’s not as “close to the metal” as a C program. I think to squeeze much more out of I’d have to scale back on how much object instantiation it does. But then it would look less like an object oriented program, and would probably be less readable and possibly harder to test… all of which go against what I want to accomplish with this program.
So, what’s next? My next areas of focus are:
- It’s time to establish a proper testing routine. Meaning that, going forward, all changes to the program need to be measured in terms of ELO gain. This is a lot harder than it sounds but it will ensure changes are good changes.
- Add a pondering search. Instead of sitting idly as the opponent is on move, try to guess what the opponent is going to do and start formulating a response. If we’re right often enough we can allow more time to think about each move.
- Use all the machine’s processors! Most computers these days have multiple processors, yet chess4j is only capable of using one. (using multiple processors is easy. doing it efficiently [in the context of a computer chess] is MUCH harder than you might think)
If you’d like to try your hand against chess4j, you can download it from the
project website. It’s a console program though, not graphical. However, you can download a GUI and set it up to use chess4j… see the readme.txt file.