# Announcing: New 4x4x4 Brute Force Solver

Status
Not open for further replies.

#### unsolved

##### Member

The links on the left will always have the latest version of the program ready for downloading. Let your mouse hoover over a link, and you will see the amount of RAM you need in order to run it. The highest performance version requires 67 GB of RAM, and the lowest needs only 650 MB.

• 5-TFS + Centers @ 7-TFS explicit hash table = 67 GB
• 5-TFS + Centers @ 7-TFS implicit hash table = 20 GB
• 4-TFS + Centers @ 7-TFS explicit hash table = 28 GB
• 4-TFS + Centers @ 7-TFS implicit hash table = 7 GB
• 4-TFS + Centers @ 6-TFS implicit hash table = 2 GB
• 3-TFS + Centers @ 6-TFS explicit hash table = 650 MB

Explicit hash table: Stores 64-bit hash number plus entire cube position in RAM for explicit verification that the positions must match, therefore the stored solution must be correct for that position.

Implicit hash table: Stores 64-bit hash number without the actual position, and requires the search to also probe the stored PV before being able to determine if the hash table position matches the cube being searched. Less RAM and slower with more CPU activity when hitting multiple solutions.

On my machine, roughly every 90 minutes, there is a 64-bit hash found during the search that matches a position in the table but is not a solution. This is why it is very important to verify that the 64-bit hash key is also really associated with the position in question. I thought 64-bits of hash safety would last until the sun would burn out, but this apparently is one of those Birthday Problem type of phenomena.

http://lightningcloudcomputing.com/OO_4x4x4/Rubiks_Revenge_4x4x4_Program.shtml

Last edited:

#### qqwref

##### Member
With your bitboard setup, it occurs to me that there may be a faster way to check that the puzzle is solved. Currently I believe you check that each side is equal to 0x22222222 or 0x33333333 or ...

Each side is a 64-bit integer with every 4 bits representing one sticker. If the side is solved, all those 16 stickers are the same. So, if we circular-shift the integer left or right by 4 bits, and it is unchanged, that side is solved. Rotating is a single assembly operation, although unfortunately it can't be written as such in C, so you'll have to write something like "cube._4x4x4_left == (cube._4x4x4_left << 4) | (cube._4x4x4_left >> 60)". It should optimize down though.

#### cuBerBruce

##### Member
write something like "cube._4x4x4_left == (cube._4x4x4_left << 4) | (cube._4x4x4_left >> 60)"
Just want to point out the | operator (like the || operator) has lower precedence than ==, so the whole expression to right of the == should be parenthesized.

As long as the bitboards are an unsigned type (which I believe is the case), the right shift should work as expected. Just noting that since a right shift can shift in 1s if the value is negative.

#### unsolved

##### Member
@cuBerBruce: Such a simple idea, but it's actually quite hard to implement it correctly. But I think I figured out the three necessary rules.

1. if there are 2 moves on the same axis, both turn in the same direction (U is the same direction as d') and one of the moves is of <D>, then ignore the case.
2. if there are 3 moves on the same axis, and >=2 moves turn in the same direction, then ignore the case.
3. ignore all cases with 4 moves on the same axis.

Using this rules, if get the following numbers of valid sequences:

 n all 36*33^(n-1) canceling moves (U d U = U2 d) cancel by rotation (U u = d D) 0 1​ 1​ 1​ 1 36​ 36​ 36​ 2 1.188​ 1.026​ 999​ 3 39.204​ 28.836​ 27.288​ 4 1.293.732​ 810.891​ 746.562​ 5 42.693.156​ 22.803.120​ 20.421.648​ 6 1.408.874.148​ 641.245.896​ 558.627.948​

As it turns out, there is not much difference. 2nd column has a grow rate of ~28.12, 3rd has ~27.35. It only has an impact for searches with >> 6 moves.

You can simplify the recursion:
Let a, b, c be the number of sequences that currently have 1, 2, 3 (respectively) turns on the same axis
seqCnt(n) = a(n) + b(n) + c(n)
a(1) = 36, b(1) = 0, c(1) = 0
b(2) = 3*(12*9/2-3*3) = 135
c(2) = 0
c(3) = 3*(12*9*6/6-4*7*3) = 72
a(n) = seqCnt(n-1) * 24
b(n) = seqCnt(n-2)*2*(12*9/2 - 3*3) = seqCnt(n-2)*90
c(n) = seqCnt(n-3)*2*(12*9*6/6-4*7*3) = seqCnt(n-3)*48

Therefore seqCnt(n) = seqCnt(n-1) * 24 + seqCnt(n-2)*90 + seqCnt(n-3)*48 with seqCnt(1) = 36, seqCnt(2) = 999, seqCnt(3) = 27288.

seqCnt(10) = 312757467379056, which is approximately 3/4 of the last implementation.
seqCnt(n) ~= 1.332886714636981*27.3543072482344^n. => Grow factor is ~ 27.3543072482344. Not much difference.
I finally got around to implementing Jakube's Rules in my 4x4x4 program. Rule #1 was the most difficult to implement, the others were quite easy. My program currently generates the minimal number of nodes per depth. If you subtract from my cumulative totals the total from the prior depth, our numbers match. There is a slight error in Jakube's list starting at depth 4, probably from rounding errors common in Microsoft Excel. I have the correct numbers up to depth 32 (shown at the bottom).

I also have TFS-06 solved now on my machine with 64 GB of RAM. This allows the program to complete depth-12 searches in just under 4 minutes, even though only searching 2.4 million positions per second. Probing the large hash table is a slowdown of a factor of 5, but reducing the search space by a factor of 27 is well worth it.

And the grand totals:

Edit: Interesting test position

The test scramble below is one of my "litmus test" positions I have been using for the last year:

Now that I solved every cube position 8-turns-from a solved state if only the centers are disturbed, the program can see an upcoming solution beyond its present depth of search. The scramble was 14 moves in length. At nominal depth 6, the program finds sequences leading to the 8-TFS centers database. At the conclusion of nominal depth 6, the logic indicates it still needs to keep searching for a potential quicker solution. A 14-move solution has been found, but depth 6 + (full) TFS-06 = 12 moves have only been searched to completion. So nominal depth 7 will be able to complete 13 moves of searching with the full 6-Turns-From-Solved database in RAM.

At nominal depth 7, I expected to see some 7-TFS center database hits (7 + 7 == 14 and matches the optimal depth found), but the program found 13-move solutions with the full 6-TFS database beforehand. So the "noose tightening" algorithm is working fine, only reporting move sequences at or better than the current optimal solution's depth.

The variable search depth code makes things a little more complex, but will have a large payoff in the Stage Solving code, since I use the Cage Method.

Last edited:

#### cuBerBruce

##### Member
I finally got around to implementing Jakube's Rules in my 4x4x4 program. Rule #1 was the most difficult to implement, the others were quite easy. My program currently generates the minimal number of nodes per depth. If you subtract from my cumulative totals the total from the prior depth, our numbers match. There is a slight error in Jakube's list starting at depth 4, probably from rounding errors common in Microsoft Excel. I have the correct numbers up to depth 32 (shown at the bottom).
I looked at using a syllable-based approach to see how many different maneuvers we should get at each depth. I got numbers that were generally a bit less than unsolved's numbers. His numbers were generally a bit less than Jakube's numbers. So I looked into why there were discrepancies.

The 4x4x4 obviously has 4 layers for each of the three main axes. Each layer can be in one of 4 positions (that we care about). So there are 4^4 - 256 possible ways a given set of 4 layers can be arranged. But some are essentially equivalent if we don't care about arrangements that are the same except for how the cube is oriented. Also, each time we change axis, we want to make some change for the new axis. So the total number of arrangements that matter is 4^3-1 = 63. This is the number of syllables we need to use.

Jakube's rules correctly allow 12 syllables at depth 1. They also correctly allow 45 syllables of depth 2. This leaves 63 - 12 - 45 = 6 syllables we need at depth 3 (or greater). None require more than 3 moves to produce, so 6 is indeed the number of syllables we need at depth 3.

In looking at Jakube's rules and numbers, it appears he is using 24 "syllables" at depth 3. For depth 3, Jakube allows any layer to remain fixed, and each of the other three layers must be moved different amounts (and not remain fixed, of course). Since we only care about how they move with respect to one another, we can arbitrarily pick one of the layers to be the fixed one. This reduces the number at depth 3 from 24 (what Jakube used) to 6 (in agreement with the number depth-3 syllables I calculated in the previous paragraph).

Thus, to properly reduce to minimal syllables, there should be another rule added to Jakube's rules. This rule could be:

if there are 3 moves on the same axis, ignore the case if the D layer (or L or B) is one of the layers moved.

As for Jakube's rules as stated, what are the correct numbers? Well, I did my own calculations and the results I got exactly match those that unsolved gives in the previous post. So I believe there must be a flaw in Jakube's formula or at least in his calculated numbers.

Using 6 depth-3 syllables, the numbers I get are given below. (I note that here I am counting the "null" maneuver at depth 0, while unsolved chose not to count it in his table.)

Code:
depth                                       maneuvers                              cumulative maneuvers

0                                                 1                                                 1
1                                                36                                                37
2                                               999                                              1036
3                                             27234                                             28270
4                                            743958                                            772228
5                                          20318040                                          21090268
6                                         554915988                                         576006256
7                                       15155534808                                       15731541064
8                                      413919090792                                      429650631856
9                                    11304715303584                                    11734365935440
10                                   308747751874992                                   320482117810432
11                                  8432337451411872                                  8752819569222304
12                                230299053086277216                                239051872655499520
13                               6289792617720221568                               6528844490375721088
14                             171783125652467209536                             178311970142842930624
15                            4691639939891068296576                            4869951910033911227200
16                          128135317377520330634880                          133005269287554241862080
17                         3499556609158513688443392                         3632561878446067930305472
18                        95577836863059851099339520                        99210398741505919029644992
19                      2610365803346232902311672320                      2709576202087738821341317312
20                     71292779277294878418682013184                     74002355479382617240023330496
21                   1947106558998594761469610899456                   2021108914477977378709634229952
22                  53178231940562968127779782841344                  55199340855040945506489417071296
23                1452372670234712302140003953301504                1507572011089753247646493370372800
24               39666350238991745519997910665750528               41173922250081498767644404036123328
25             1083344084840212755290083569169244160             1124518007090294254057727973205367488
26            29587658029717179771387497667419025408            30712176036807474025445225640624392896
27           808080756551699363435647440171277590528           838792932588506837461092665811901983424
28         22069827508933413454943894357008405389312         22908620441521920292404987022820307372736
29        602758180200411471785118384155625740795904        625666800641933392077523371178446048168640
30      16462177769692502926149019481147829595226112      17087844570334436318226542852326275643394752
31     449605340620764303649696448853838327821729792     466693185191098739967922991706164603465124544
32   12279357272333073488607547946405292040181415936   12746050457524172228575470938111456643646540480

Last edited:

#### unsolved

##### Member
I looked at using a syllable-based approach to see how many different maneuvers we should get at each depth. I got numbers that were generally a bit less than unsolved's numbers. His numbers were generally a bit less than Jakube's numbers. So I looked into why there were discrepancies.

As for Jakube's rules as stated, what are the correct numbers? Well, I did my own calculations and the results I got exactly match those that unsolved gives in the previous post. So I believe there must be a flaw in Jakube's formula or at least in his calculated numbers.
Well that part is at least encouraging.

Thus, to properly reduce to minimal syllables, there should be another rule added to Jakube's rules. This rule could be:

if there are 3 moves on the same axis, ignore the case if the D layer (or L or B) is one of the layers moved.
OK, so then a move sequence such as U u2 D should be disallowed?

I tried playing around to create this same position using only U, u and d and couldn't come up with something.

https://alg.cubing.net/?puzzle=4x4x4&setup=U_2U2_D

EDIT:

But now I see U2 u' d' then rotating the cube with y' handles that. Hmmm.

https://alg.cubing.net/?puzzle=4x4x4&setup=U2_2U-_2D-_y-

EDIT AGAIN

So I added Bruce Rule #4 to the move generator code, turned off the centers database probing code, and this is what my 13-move litmus test position is reporting:

So our cumulative numbers match if you factor into account that my program does not generate a null move at depth 0.

Code:
for(axis_of_rotation = 0; axis_of_rotation < 3; axis_of_rotation++)
{
present_depth_index = unchanging_depth - which_depth;
GLOBAL_AXIS[present_depth_index] = axis_of_rotation;

for(move_group = 0; move_group < 4; move_group++)
{
if((axis_of_rotation != last_move_axis) || ((axis_of_rotation == last_move_axis) && (move_group > last_move_group)))
{
___4___CONSECUTIVE___MOVES___ON___THE___SAME___AXIS___TEST___

for(move_direction = 0; move_direction < 3; move_direction++)
{
GLOBAL_DIRECTION[present_depth_index] = move_direction;

___3___CONSECUTIVE___MOVES___ON___THE___SAME___AXIS___WITH___2___MOVES___IN___THE___SAME___DIRECTION___TEST___

if(avoid_move_generation == 0)
{
which_move_to_make = GLOBAL_AXIS_ROTATION[axis_of_rotation][move_group][move_direction];

___3___CONSECUTIVE___MOVES___ON___THE___SAME___AXIS___BRUCE___RULE___4___

___2___CONSECUTIVE___MOVES___ON___THE___SAME___AXIS___WITH___2___MOVES___IN___THE___SAME___DIRECTION___TEST___

modified_cube = cube_move_array[which_move_to_make](which_cube);
search_forever_maximized(modified_cube, which_depth-1, axis_of_rotation, move_group, which_move_to_make, unchanging_depth);
}
}
}
}
}
where...

Code:
#define ___4___CONSECUTIVE___MOVES___ON___THE___SAME___AXIS___TEST___ \
avoid_move_generation = 1;\
if(present_depth_index >= 3)\
{\
for(i = 0; i <= 2; i++)\
avoid_move_generation = avoid_move_generation && (GLOBAL_AXIS[present_depth_index - 3] == GLOBAL_AXIS[present_depth_index - i]);\
}\
else avoid_move_generation = 0;\
if(avoid_move_generation) continue;\

#define ___3___CONSECUTIVE___MOVES___ON___THE___SAME___AXIS___WITH___2___MOVES___IN___THE___SAME___DIRECTION___TEST___ \
avoid_move_generation = 1;\
if(present_depth_index >= 2)\
{\
for(i = 0; i <= 1; i++)\
avoid_move_generation = avoid_move_generation && (GLOBAL_AXIS[present_depth_index - 2] == GLOBAL_AXIS[present_depth_index - i]);\
if(avoid_move_generation)\
{\
unsigned short counter[3];\
counter[0] = 0; counter[1] = 0; counter[2] = 0;\
for(i = 0; i <= 2; i++)\
{\
counter[GLOBAL_DIRECTION[present_depth_index - i]] = 1 + counter[GLOBAL_DIRECTION[present_depth_index - i]];\
if(counter[GLOBAL_DIRECTION[present_depth_index - i]] >= 2) avoid_move_generation = 2;\
}\
if(avoid_move_generation != 2) avoid_move_generation = 0;\
}\
}\
else avoid_move_generation = 0;\
if(avoid_move_generation == 2) continue;\

#define ___3___CONSECUTIVE___MOVES___ON___THE___SAME___AXIS___BRUCE___RULE___4___ \
avoid_move_generation = 1;\
if(present_depth_index >= 2)\
{\
for(i = 0; i <= 1; i++)\
avoid_move_generation = avoid_move_generation && (GLOBAL_AXIS[present_depth_index - 2] == GLOBAL_AXIS[present_depth_index - i]);\
}\
else avoid_move_generation = 0;\
if(avoid_move_generation)\
{\
if((which_move_to_make >= MOVE_GENERATOR_L_PLUS_) && (which_move_to_make <= MOVE_GENERATOR_L_TWICE)) continue;\
if((which_move_to_make >= MOVE_GENERATOR_B_PLUS_) && (which_move_to_make <= MOVE_GENERATOR_B_TWICE)) continue;\
if((which_move_to_make >= MOVE_GENERATOR_K_PLUS_) && (which_move_to_make <= MOVE_GENERATOR_K_TWICE)) continue;\
}\

#define ___2___CONSECUTIVE___MOVES___ON___THE___SAME___AXIS___WITH___2___MOVES___IN___THE___SAME___DIRECTION___TEST___ \
avoid_move_generation = 1;\
if(present_depth_index >= 1)\
{\
avoid_move_generation = (GLOBAL_AXIS[present_depth_index - 1] == GLOBAL_AXIS[present_depth_index]);\
}\
else avoid_move_generation = 0;\
if(avoid_move_generation)\
{\
unsigned short same_direction_counter[3];\
same_direction_counter[0] = 0; same_direction_counter[1] = 0; same_direction_counter[2] = 0;\
for(i = 0; i <= 1; i++)\
{\
same_direction_counter[GLOBAL_DIRECTION[present_depth_index - i]] = 1 + same_direction_counter[GLOBAL_DIRECTION[present_depth_index - i]];\
if(same_direction_counter[GLOBAL_DIRECTION[present_depth_index - i]] == 2) avoid_move_generation = 2;\
}\
if(avoid_move_generation != 2) avoid_move_generation = 0;\
else\
{\
if(((GLOBAL_CUBE_SOLUTION[present_depth_index - 1] - 1) <= MOVE_GENERATOR_l_TWICE) && ((GLOBAL_CUBE_SOLUTION[present_depth_index - 1] - 1) >= MOVE_GENERATOR_R_PLUS_) && (which_move_to_make >= MOVE_GENERATOR_L_PLUS_) && (which_move_to_make <= MOVE_GENERATOR_L_TWICE)) continue;\
if(((GLOBAL_CUBE_SOLUTION[present_depth_index - 1] - 1) <= MOVE_GENERATOR_b_TWICE) && ((GLOBAL_CUBE_SOLUTION[present_depth_index - 1] - 1) >= MOVE_GENERATOR_T_PLUS_) && (which_move_to_make >= MOVE_GENERATOR_B_PLUS_) && (which_move_to_make <= MOVE_GENERATOR_B_TWICE)) continue;\
if(((GLOBAL_CUBE_SOLUTION[present_depth_index - 1] - 1) >= MOVE_GENERATOR_F_PLUS_) && ((GLOBAL_CUBE_SOLUTION[present_depth_index - 1] - 1) <= MOVE_GENERATOR_k_TWICE) && (which_move_to_make >= MOVE_GENERATOR_K_PLUS_) && (which_move_to_make <= MOVE_GENERATOR_K_TWICE)) continue;\
}\
}\
EDIT: P.S., Thanks Bruce!

Last edited:

#### Herbert Kociemba

##### Member
I looked at using a syllable-based approach to see how many different maneuvers we should get at each depth. I got numbers that were generally a bit less than unsolved's numbers. His numbers were generally a bit less than Jakube's numbers. So I looked into why there were discrepancies.

The 4x4x4 obviously has 4 layers for each of the three main axes. Each layer can be in one of 4 positions (that we care about). So there are 4^4 - 256 possible ways a given set of 4 layers can be arranged. But some are essentially equivalent if we don't care about arrangements that are the same except for how the cube is oriented. Also, each time we change axis, we want to make some change for the new axis. So the total number of arrangements that matter is 4^3-1 = 63. This is the number of syllables we need to use.

Jakube's rules correctly allow 12 syllables at depth 1. They also correctly allow 45 syllables of depth 2. This leaves 63 - 12 - 45 = 6 syllables we need at depth 3 (or greater). None require more than 3 moves to produce, so 6 is indeed the number of syllables we need at depth 3.

In looking at Jakube's rules and numbers, it appears he is using 24 "syllables" at depth 3. For depth 3, Jakube allows any layer to remain fixed, and each of the other three layers must be moved different amounts (and not remain fixed, of course). Since we only care about how they move with respect to one another, we can arbitrarily pick one of the layers to be the fixed one. This reduces the number at depth 3 from 24 (what Jakube used) to 6 (in agreement with the number depth-3 syllables I calculated in the previous paragraph).

Thus, to properly reduce to minimal syllables, there should be another rule added to Jakube's rules. This rule could be:

if there are 3 moves on the same axis, ignore the case if the D layer (or L or B) is one of the layers moved.

As for Jakube's rules as stated, what are the correct numbers? Well, I did my own calculations and the results I got exactly match those that unsolved gives in the previous post. So I believe there must be a flaw in Jakube's formula or at least in his calculated numbers.

Using 6 depth-3 syllables, the numbers I get are given below. (I note that here I am counting the "null" maneuver at depth 0, while unsolved chose not to count it in his table.)
I do not know how you generate the counts, but there is a quite elegant way to do get them using generating functions. I there are 12/45/6 syllables of length 1/2/3 on one axis we define a generating function a(x) = 12x + 45x^2 + 6x^3 for this axis. Then the generating function for the number of canonical maneuvers is

gf(x) = 1+3a(x)/(1-2a(x))

and the generating function for the cumulated counts is Gf(x) = gf(x)/(1-x)

This works not only for 4x4x4 but for any metric for nxnxn, you have to plug in the correct numbers for the syllables in the formula for a(x) of course. Only in the case of OBTM we have a closed formula for a(x) for nxnxn which is a(x)= (1+3x)^(n-1) - 1 in HTM and a(x)=(1+2x+x^2)^(n-1) - 1 in QTM.

But back to the 4x4x4. If we use a(x) = 12x + 45x^2 + 6x^3 we get

gf(x) =-(1/2)-3/(-2+48 x+180 x^2+24 x^3)

In Mathematica, you then use for example

Series[gf[x], {x, 0, 10}]

to get the first 10 terms of the Taylor series expansion which gives

1+36 x+999 x^2+27234 x^3+743958 x^4+20318040 x^5+554915988 x^6+15155534808 x^7+413919090792 x^8+11304715303584 x^9+308747751874992 x^10+O[x]^11

Last edited:

#### unsolved

##### Member
I do not know how you generate the counts, but there is a quite elegant way to do get them using generating functions.
The way Jake did his original enumeration was quite easy.

The move counts are a function of the sum of legal parallel plane moves. Compute how many legal moves there are when only one plane is moved, then two parallel planes (like U followed by either u d or D), then three consecutive parallel plane moves (U u2 D = U2 u' d' so one of these must be disallowed) and then add them up.

There is no way to perform 2 consecutive parallel plane moves at depth 1, so that gets as 0, as well as 3 parallel plane moves. So for depth 1 you get 36 moves of one plane (U, R, F, etc..) and 0 for the others. At depth 2, there are 135 parallel plane moves of 2 parallel planes. At depth 3, there are 72 moves of 3 parallel planes.

The numbers 36, 135, and 72 are shown shaded in purple.

Every other number below the purple is derived from recursive formulas.

For 1 parallel plane, the total moves at a given depth is 24 times the total sum at the prior depth. 24 * 36 = 864.

For 2 parallel planes, the total moves at a given depth is 90 times the total sum from 2 depths prior. 90 * 36 = 3240.

For 3 parallel planes, the total moves at a given depth is 48 times the total sum from 3 depths prior. 48 * 36 = 1728.

Excel can handle up to depth 10 before rounding errors screw things up. Then you need a big number library to handle the rest.

Last edited:

#### Herbert Kociemba

##### Member
If I use a(x) := 12 x + 45 x^2 + 24 x^3 I get Jakes numbers. So as Bruce already stated, he seems to use 24 syllables of length 3.

#### rokicki

##### Member
Well that part is at least encouraging.
Code:
for(axis_of_rotation = 0; axis_of_rotation < 3; axis_of_rotation++)
{
present_depth_index = unchanging_depth - which_depth;
GLOBAL_AXIS[present_depth_index] = axis_of_rotation;

for(move_group = 0; move_group < 4; move_group++)
{
if((axis_of_rotation != last_move_axis) || ((axis_of_rotation == last_move_axis) && (move_group > last_move_group)))
{
___4___CONSECUTIVE___MOVES___ON___THE___SAME___AXIS___TEST___

for(move_direction = 0; move_direction < 3; move_direction++)
{
GLOBAL_DIRECTION[present_depth_index] = move_direction;

___3___CONSECUTIVE___MOVES___ON___THE___SAME___AXIS___WITH___2___MOVES___IN___THE___SAME___DIRECTION___TEST___

if(avoid_move_generation == 0)
{
which_move_to_make = GLOBAL_AXIS_ROTATION[axis_of_rotation][move_group][move_direction];

___3___CONSECUTIVE___MOVES___ON___THE___SAME___AXIS___BRUCE___RULE___4___

___2___CONSECUTIVE___MOVES___ON___THE___SAME___AXIS___WITH___2___MOVES___IN___THE___SAME___DIRECTION___TEST___

modified_cube = cube_move_array[which_move_to_make](which_cube);
search_forever_maximized(modified_cube, which_depth-1, axis_of_rotation, move_group, which_move_to_make, unchanging_depth);
}
}
}
}
}
Wow. You can replace all of this with a small state table. For SST I use
49 states. We have two tables:

const int MOVEPRUNE_STATES = 49 ;
unsigned char nextmove_state[MOVEPRUNE_STATES][NMOVES] ;

so the move stuff just goes:

Code:
int recur(const cubepos &cp, int moveprune_state) {
...
for (int mv=0; mv<NMOVES; mv++)
if ((movemask >> mv) & 1) {
cp2 = cp ;
cp2.move(mv) ;
...
recur(cp2, nextmove_state[moveprune_state][mv]) ;
...
}
}
Generating the state tables can be done just once, using the more complicated
and longer code you show above.

This may reduce your instruction count a fair bit and bump your node evaluation
speed.

Last edited:

#### unsolved

##### Member
Wow. You can replace all of this with a small state table. For SST I use
49 states. We have two tables:

const int MOVEPRUNE_STATES = 49 ;
unsigned char nextmove_state[MOVEPRUNE_STATES][NMOVES] ;

so the move stuff just goes:

Code:
int recur(const cubepos &cp, int moveprune_state) {
...
for (int mv=0; mv<NMOVES; mv++)
if ((movemask >> mv) & 1) {
cp2 = cp ;
cp2.move(mv) ;
...
recur(cp2, nextmove_state[moveprune_state][mv]) ;
...
}
}
Generating the state tables can be done just once, using the more complicated
and longer code you show above.

This may reduce your instruction count a fair bit and bump your node evaluation
speed.
Well the axis rotation data is pretty small.

Code:
void init_axis_rotation_data(void)
{
GLOBAL_AXIS_ROTATION[0][0][0] = MOVE_GENERATOR_R_PLUS_;
GLOBAL_AXIS_ROTATION[0][0][1] = MOVE_GENERATOR_R_MINUS;
GLOBAL_AXIS_ROTATION[0][0][2] = MOVE_GENERATOR_R_TWICE;
GLOBAL_AXIS_ROTATION[0][1][0] = MOVE_GENERATOR_r_PLUS_;
GLOBAL_AXIS_ROTATION[0][1][1] = MOVE_GENERATOR_r_MINUS;
GLOBAL_AXIS_ROTATION[0][1][2] = MOVE_GENERATOR_r_TWICE;

GLOBAL_AXIS_ROTATION[0][2][0] = MOVE_GENERATOR_l_PLUS_;
GLOBAL_AXIS_ROTATION[0][2][1] = MOVE_GENERATOR_l_MINUS;
GLOBAL_AXIS_ROTATION[0][2][2] = MOVE_GENERATOR_l_TWICE;
GLOBAL_AXIS_ROTATION[0][3][0] = MOVE_GENERATOR_L_PLUS_;
GLOBAL_AXIS_ROTATION[0][3][1] = MOVE_GENERATOR_L_MINUS;
GLOBAL_AXIS_ROTATION[0][3][2] = MOVE_GENERATOR_L_TWICE;

GLOBAL_AXIS_ROTATION[1][0][0] = MOVE_GENERATOR_T_PLUS_;
GLOBAL_AXIS_ROTATION[1][0][1] = MOVE_GENERATOR_T_MINUS;
GLOBAL_AXIS_ROTATION[1][0][2] = MOVE_GENERATOR_T_TWICE;
GLOBAL_AXIS_ROTATION[1][1][0] = MOVE_GENERATOR_t_PLUS_;
GLOBAL_AXIS_ROTATION[1][1][1] = MOVE_GENERATOR_t_MINUS;
GLOBAL_AXIS_ROTATION[1][1][2] = MOVE_GENERATOR_t_TWICE;

GLOBAL_AXIS_ROTATION[1][2][0] = MOVE_GENERATOR_b_PLUS_;
GLOBAL_AXIS_ROTATION[1][2][1] = MOVE_GENERATOR_b_MINUS;
GLOBAL_AXIS_ROTATION[1][2][2] = MOVE_GENERATOR_b_TWICE;
GLOBAL_AXIS_ROTATION[1][3][0] = MOVE_GENERATOR_B_PLUS_;
GLOBAL_AXIS_ROTATION[1][3][1] = MOVE_GENERATOR_B_MINUS;
GLOBAL_AXIS_ROTATION[1][3][2] = MOVE_GENERATOR_B_TWICE;

GLOBAL_AXIS_ROTATION[2][0][0] = MOVE_GENERATOR_F_PLUS_;
GLOBAL_AXIS_ROTATION[2][0][1] = MOVE_GENERATOR_F_MINUS;
GLOBAL_AXIS_ROTATION[2][0][2] = MOVE_GENERATOR_F_TWICE;
GLOBAL_AXIS_ROTATION[2][1][0] = MOVE_GENERATOR_f_PLUS_;
GLOBAL_AXIS_ROTATION[2][1][1] = MOVE_GENERATOR_f_MINUS;
GLOBAL_AXIS_ROTATION[2][1][2] = MOVE_GENERATOR_f_TWICE;

GLOBAL_AXIS_ROTATION[2][2][0] = MOVE_GENERATOR_k_PLUS_;
GLOBAL_AXIS_ROTATION[2][2][1] = MOVE_GENERATOR_k_MINUS;
GLOBAL_AXIS_ROTATION[2][2][2] = MOVE_GENERATOR_k_TWICE;
GLOBAL_AXIS_ROTATION[2][3][0] = MOVE_GENERATOR_K_PLUS_;
GLOBAL_AXIS_ROTATION[2][3][1] = MOVE_GENERATOR_K_MINUS;
GLOBAL_AXIS_ROTATION[2][3][2] = MOVE_GENERATOR_K_TWICE;
}
I use T for top (as opposed to U), B for Bottom (as opposed to Back), and K for Back.

So the 3 for loops are just going through 0...2, 0...3, and 0..2.

The MOVE_GENERATOR_R_PLUS_ to MOVE_GENERATOR_K_TWICE are just constants 0 through 35. They are passed to an array of function pointers.

Code:
which_move_to_make = GLOBAL_AXIS_ROTATION[axis_of_rotation][move_group][move_direction];
modified_cube = cube_move_array[which_move_to_make](which_cube);
So cube_move_array[ ... ] takes an index, and that index sends you to the routine that executes the associated move. It only takes 6 lines of code to make any move.

Code:
BITBOARD_4x4x4_CUBE T_PLUS(BITBOARD_4x4x4_CUBE original_4x4x4_cube)
{
BITBOARD_4x4x4_CUBE new_cube;

/**************************************************************************/
/* Rotating the top will change something on every face except the bottom */
/**************************************************************************/
new_cube._4x4x4_bottom = original_4x4x4_cube._4x4x4_bottom;

/******************************************************************************************/
/* Entire row of cubes moves from right -----> front -----> left -----> back -----> right */
/* new front = old right; new right = old back; new back = old left; new left = old front */
/******************************************************************************************/

/*********************************************************************************/
/*                                 Rotate the TOP                                */
/*********************************************************************************/
/*                                                                               */
/* THESE CUBE LOCATIONS BECOME....              THESE CUBE LOCATIONS             */
/*                                                                               */
/*                                                                               */
/*********************************/             /*********************************/
/*  01   *  02   *  03   *   04  */             /*  13   *  09   *  05   *   01  */
/*********************************/             /*********************************/
/*  05   *  06   *  07   *   08  */             /*  14   *  10   *  06   *   02  */
/*********************************/             /*********************************/
/*  09   *  10   *  11   *   12  */             /*  15   *  11   *  07   *   03  */
/*********************************/             /*********************************/
/*  13   *  14   *  15   *   16  */             /*  16   *  12   *  08   *   04  */
/*********************************/             /*********************************/
/*                                                                               */
/*                                                                               */
/*********************************************************************************/

new_cube._4x4x4_top = ((original_4x4x4_cube._4x4x4_top & BITMASK_CUBE_FACE_SQUARE_13) << 48)  | \
((original_4x4x4_cube._4x4x4_top & BITMASK_CUBE_FACE_SQUARE_09) << 28)  | \
((original_4x4x4_cube._4x4x4_top & BITMASK_CUBE_FACE_SQUARE_05) << 8)   | \
((original_4x4x4_cube._4x4x4_top & BITMASK_CUBE_FACE_SQUARE_01) >> 12)  | \
((original_4x4x4_cube._4x4x4_top & BITMASK_CUBE_FACE_SQUARE_14) << 36)  | \
((original_4x4x4_cube._4x4x4_top & BITMASK_CUBE_FACE_SQUARE_10) << 16)  | \
((original_4x4x4_cube._4x4x4_top & BITMASK_CUBE_FACE_SQUARE_06) >> 4)   | \
((original_4x4x4_cube._4x4x4_top & BITMASK_CUBE_FACE_SQUARE_02) >> 24)  | \
((original_4x4x4_cube._4x4x4_top & BITMASK_CUBE_FACE_SQUARE_15) << 24)  | \
((original_4x4x4_cube._4x4x4_top & BITMASK_CUBE_FACE_SQUARE_11) << 4)   | \
((original_4x4x4_cube._4x4x4_top & BITMASK_CUBE_FACE_SQUARE_07) >> 16)  | \
((original_4x4x4_cube._4x4x4_top & BITMASK_CUBE_FACE_SQUARE_03) >> 36)  | \
((original_4x4x4_cube._4x4x4_top & BITMASK_CUBE_FACE_SQUARE_16) << 12)  | \
((original_4x4x4_cube._4x4x4_top & BITMASK_CUBE_FACE_SQUARE_12) >> 8)   | \
((original_4x4x4_cube._4x4x4_top & BITMASK_CUBE_FACE_SQUARE_08) >> 28)  | \

return new_cube;
}
When I disable the 64-bit hash key generation and hash table probing, the search reaches ~ 20,000,000 nodes/sec per core. I typically have 12 cores engaged for parallel searching, which can absorb everything generated by the first two for loops.