Welcome to the Speedsolving.com, home of the web's largest puzzle community! You are currently viewing our forum as a guest which gives you limited access to join discussions and access our other features.

The links on the left will always have the latest version of the program ready for downloading. Let your mouse hoover over a link, and you will see the amount of RAM you need in order to run it. The highest performance version requires 67 GB of RAM, and the lowest needs only 650 MB.

Explicit hash table: Stores 64-bit hash number plus entire cube position in RAM for explicit verification that the positions must match, therefore the stored solution must be correct for that position.

Implicit hash table: Stores 64-bit hash number without the actual position, and requires the search to also probe the stored PV before being able to determine if the hash table position matches the cube being searched. Less RAM and slower with more CPU activity when hitting multiple solutions.

On my machine, roughly every 90 minutes, there is a 64-bit hash found during the search that matches a position in the table but is not a solution. This is why it is very important to verify that the 64-bit hash key is also really associated with the position in question. I thought 64-bits of hash safety would last until the sun would burn out, but this apparently is one of those Birthday Problem type of phenomena.

With your bitboard setup, it occurs to me that there may be a faster way to check that the puzzle is solved. Currently I believe you check that each side is equal to 0x22222222 or 0x33333333 or ...

Each side is a 64-bit integer with every 4 bits representing one sticker. If the side is solved, all those 16 stickers are the same. So, if we circular-shift the integer left or right by 4 bits, and it is unchanged, that side is solved. Rotating is a single assembly operation, although unfortunately it can't be written as such in C, so you'll have to write something like "cube._4x4x4_left == (cube._4x4x4_left << 4) | (cube._4x4x4_left >> 60)". It should optimize down though.

Just want to point out the | operator (like the || operator) has lower precedence than ==, so the whole expression to right of the == should be parenthesized.

As long as the bitboards are an unsigned type (which I believe is the case), the right shift should work as expected. Just noting that since a right shift can shift in 1s if the value is negative.

@cuBerBruce: Such a simple idea, but it's actually quite hard to implement it correctly. But I think I figured out the three necessary rules.

if there are 2 moves on the same axis, both turn in the same direction (U is the same direction as d') and one of the moves is of <D>, then ignore the case.

if there are 3 moves on the same axis, and >=2 moves turn in the same direction, then ignore the case.

ignore all cases with 4 moves on the same axis.

Using this rules, if get the following numbers of valid sequences:

n

all
36*33^(n-1)

canceling moves
(U d U = U2 d)

cancel by rotation
(U u = d D)

0

1

1

1

1

36

36

36

2

1.188

1.026

999

3

39.204

28.836

27.288

4

1.293.732

810.891

746.562

5

42.693.156

22.803.120

20.421.648

6

1.408.874.148

641.245.896

558.627.948

As it turns out, there is not much difference. 2nd column has a grow rate of ~28.12, 3rd has ~27.35. It only has an impact for searches with >> 6 moves.

Spoiler: recursion

You can simplify the recursion:
Let a, b, c be the number of sequences that currently have 1, 2, 3 (respectively) turns on the same axis
seqCnt(n) = a(n) + b(n) + c(n)
a(1) = 36, b(1) = 0, c(1) = 0
b(2) = 3*(12*9/2-3*3) = 135
c(2) = 0
c(3) = 3*(12*9*6/6-4*7*3) = 72
a(n) = seqCnt(n-1) * 24
b(n) = seqCnt(n-2)*2*(12*9/2 - 3*3) = seqCnt(n-2)*90
c(n) = seqCnt(n-3)*2*(12*9*6/6-4*7*3) = seqCnt(n-3)*48

seqCnt(10) = 312757467379056, which is approximately 3/4 of the last implementation.
seqCnt(n) ~= 1.332886714636981*27.3543072482344^n. => Grow factor is ~ 27.3543072482344. Not much difference.

I finally got around to implementing Jakube's Rules in my 4x4x4 program. Rule #1 was the most difficult to implement, the others were quite easy. My program currently generates the minimal number of nodes per depth. If you subtract from my cumulative totals the total from the prior depth, our numbers match. There is a slight error in Jakube's list starting at depth 4, probably from rounding errors common in Microsoft Excel. I have the correct numbers up to depth 32 (shown at the bottom).

I also have TFS-06 solved now on my machine with 64 GB of RAM. This allows the program to complete depth-12 searches in just under 4 minutes, even though only searching 2.4 million positions per second. Probing the large hash table is a slowdown of a factor of 5, but reducing the search space by a factor of 27 is well worth it.

And the grand totals:

Edit: Interesting test position

The test scramble below is one of my "litmus test" positions I have been using for the last year:

Now that I solved every cube position 8-turns-from a solved state if only the centers are disturbed, the program can see an upcoming solution beyond its present depth of search. The scramble was 14 moves in length. At nominal depth 6, the program finds sequences leading to the 8-TFS centers database. At the conclusion of nominal depth 6, the logic indicates it still needs to keep searching for a potential quicker solution. A 14-move solution has been found, but depth 6 + (full) TFS-06 = 12 moves have only been searched to completion. So nominal depth 7 will be able to complete 13 moves of searching with the full 6-Turns-From-Solved database in RAM.

At nominal depth 7, I expected to see some 7-TFS center database hits (7 + 7 == 14 and matches the optimal depth found), but the program found 13-move solutions with the full 6-TFS database beforehand. So the "noose tightening" algorithm is working fine, only reporting move sequences at or better than the current optimal solution's depth.

The variable search depth code makes things a little more complex, but will have a large payoff in the Stage Solving code, since I use the Cage Method.

I finally got around to implementing Jakube's Rules in my 4x4x4 program. Rule #1 was the most difficult to implement, the others were quite easy. My program currently generates the minimal number of nodes per depth. If you subtract from my cumulative totals the total from the prior depth, our numbers match. There is a slight error in Jakube's list starting at depth 4, probably from rounding errors common in Microsoft Excel. I have the correct numbers up to depth 32 (shown at the bottom).

I looked at using a syllable-based approach to see how many different maneuvers we should get at each depth. I got numbers that were generally a bit less than unsolved's numbers. His numbers were generally a bit less than Jakube's numbers. So I looked into why there were discrepancies.

The 4x4x4 obviously has 4 layers for each of the three main axes. Each layer can be in one of 4 positions (that we care about). So there are 4^4 - 256 possible ways a given set of 4 layers can be arranged. But some are essentially equivalent if we don't care about arrangements that are the same except for how the cube is oriented. Also, each time we change axis, we want to make some change for the new axis. So the total number of arrangements that matter is 4^3-1 = 63. This is the number of syllables we need to use.

Jakube's rules correctly allow 12 syllables at depth 1. They also correctly allow 45 syllables of depth 2. This leaves 63 - 12 - 45 = 6 syllables we need at depth 3 (or greater). None require more than 3 moves to produce, so 6 is indeed the number of syllables we need at depth 3.

In looking at Jakube's rules and numbers, it appears he is using 24 "syllables" at depth 3. For depth 3, Jakube allows any layer to remain fixed, and each of the other three layers must be moved different amounts (and not remain fixed, of course). Since we only care about how they move with respect to one another, we can arbitrarily pick one of the layers to be the fixed one. This reduces the number at depth 3 from 24 (what Jakube used) to 6 (in agreement with the number depth-3 syllables I calculated in the previous paragraph).

Thus, to properly reduce to minimal syllables, there should be another rule added to Jakube's rules. This rule could be:

if there are 3 moves on the same axis, ignore the case if the D layer (or L or B) is one of the layers moved.

As for Jakube's rules as stated, what are the correct numbers? Well, I did my own calculations and the results I got exactly match those that unsolved gives in the previous post. So I believe there must be a flaw in Jakube's formula or at least in his calculated numbers.

Using 6 depth-3 syllables, the numbers I get are given below. (I note that here I am counting the "null" maneuver at depth 0, while unsolved chose not to count it in his table.)

I looked at using a syllable-based approach to see how many different maneuvers we should get at each depth. I got numbers that were generally a bit less than unsolved's numbers. His numbers were generally a bit less than Jakube's numbers. So I looked into why there were discrepancies.

As for Jakube's rules as stated, what are the correct numbers? Well, I did my own calculations and the results I got exactly match those that unsolved gives in the previous post. So I believe there must be a flaw in Jakube's formula or at least in his calculated numbers.

So I added Bruce Rule #4 to the move generator code, turned off the centers database probing code, and this is what my 13-move litmus test position is reporting:

So our cumulative numbers match if you factor into account that my program does not generate a null move at depth 0.

I looked at using a syllable-based approach to see how many different maneuvers we should get at each depth. I got numbers that were generally a bit less than unsolved's numbers. His numbers were generally a bit less than Jakube's numbers. So I looked into why there were discrepancies.

The 4x4x4 obviously has 4 layers for each of the three main axes. Each layer can be in one of 4 positions (that we care about). So there are 4^4 - 256 possible ways a given set of 4 layers can be arranged. But some are essentially equivalent if we don't care about arrangements that are the same except for how the cube is oriented. Also, each time we change axis, we want to make some change for the new axis. So the total number of arrangements that matter is 4^3-1 = 63. This is the number of syllables we need to use.

Jakube's rules correctly allow 12 syllables at depth 1. They also correctly allow 45 syllables of depth 2. This leaves 63 - 12 - 45 = 6 syllables we need at depth 3 (or greater). None require more than 3 moves to produce, so 6 is indeed the number of syllables we need at depth 3.

In looking at Jakube's rules and numbers, it appears he is using 24 "syllables" at depth 3. For depth 3, Jakube allows any layer to remain fixed, and each of the other three layers must be moved different amounts (and not remain fixed, of course). Since we only care about how they move with respect to one another, we can arbitrarily pick one of the layers to be the fixed one. This reduces the number at depth 3 from 24 (what Jakube used) to 6 (in agreement with the number depth-3 syllables I calculated in the previous paragraph).

Thus, to properly reduce to minimal syllables, there should be another rule added to Jakube's rules. This rule could be:

if there are 3 moves on the same axis, ignore the case if the D layer (or L or B) is one of the layers moved.

As for Jakube's rules as stated, what are the correct numbers? Well, I did my own calculations and the results I got exactly match those that unsolved gives in the previous post. So I believe there must be a flaw in Jakube's formula or at least in his calculated numbers.

Using 6 depth-3 syllables, the numbers I get are given below. (I note that here I am counting the "null" maneuver at depth 0, while unsolved chose not to count it in his table.)

I do not know how you generate the counts, but there is a quite elegant way to do get them using generating functions. I there are 12/45/6 syllables of length 1/2/3 on one axis we define a generating function a(x) = 12x + 45x^2 + 6x^3 for this axis. Then the generating function for the number of canonical maneuvers is

gf(x) = 1+3a(x)/(1-2a(x))

and the generating function for the cumulated counts is Gf(x) = gf(x)/(1-x)

This works not only for 4x4x4 but for any metric for nxnxn, you have to plug in the correct numbers for the syllables in the formula for a(x) of course. Only in the case of OBTM we have a closed formula for a(x) for nxnxn which is a(x)= (1+3x)^(n-1) - 1 in HTM and a(x)=(1+2x+x^2)^(n-1) - 1 in QTM.

But back to the 4x4x4. If we use a(x) = 12x + 45x^2 + 6x^3 we get

gf(x) =-(1/2)-3/(-2+48 x+180 x^2+24 x^3)

In Mathematica, you then use for example

Series[gf[x], {x, 0, 10}]

to get the first 10 terms of the Taylor series expansion which gives

The way Jake did his original enumeration was quite easy.

The move counts are a function of the sum of legal parallel plane moves. Compute how many legal moves there are when only one plane is moved, then two parallel planes (like U followed by either u d or D), then three consecutive parallel plane moves (U u2 D = U2 u' d' so one of these must be disallowed) and then add them up.

There is no way to perform 2 consecutive parallel plane moves at depth 1, so that gets as 0, as well as 3 parallel plane moves. So for depth 1 you get 36 moves of one plane (U, R, F, etc..) and 0 for the others. At depth 2, there are 135 parallel plane moves of 2 parallel planes. At depth 3, there are 72 moves of 3 parallel planes.

The numbers 36, 135, and 72 are shown shaded in purple.

Every other number below the purple is derived from recursive formulas.

For 1 parallel plane, the total moves at a given depth is 24 times the total sum at the prior depth. 24 * 36 = 864.

For 2 parallel planes, the total moves at a given depth is 90 times the total sum from 2 depths prior. 90 * 36 = 3240.

For 3 parallel planes, the total moves at a given depth is 48 times the total sum from 3 depths prior. 48 * 36 = 1728.

Excel can handle up to depth 10 before rounding errors screw things up. Then you need a big number library to handle the rest.

So cube_move_array[ ... ] takes an index, and that index sends you to the routine that executes the associated move. It only takes 6 lines of code to make any move.

When I disable the 64-bit hash key generation and hash table probing, the search reaches ~ 20,000,000 nodes/sec per core. I typically have 12 cores engaged for parallel searching, which can absorb everything generated by the first two for loops.