
That's not quite valid, though. Stefan said: "So get to m=13..20 moves and then add an 8-moves-identity (if m=20) or replace some move so that you have 28 moves". But there is no 7-move identity to add, and your sequence adds 8 moves, so you missed the case of taking a 20-move algorithm and getting a 27-move algorithm. As for the paragraph you typed after the above quote, I have no idea what you're saying.

If we are trying to find this number, and we know it is not a 20f* state if it does exist, why should we include 20f* states?

@ Stefan - of course there has to exist such a number, but if it is 20, I wouldn't call it a second number. A position could be a counterexample, as Tempus said.

Edit: After some further thinking, I realize why 20f* states should be counted.

I have finished a proof of something I thought would make the upper bound 25, but it actually only makes the upper bound 26.

If a sequence ends in U' R', you can replace the U' R' with F2 L F L' U' F2 R' F', unless the move before the U' R' is an F2; in that case, you can replace it with R2 F D R' D' R2 F' U'. This lets you replace the last 2 moves with an 8-move sequence, adding 6 moves. I have done this with every combination of 2 moves.

Shorter sequences have been given if there were any.

Spoiler: combinations

U' R' (2,2) F2 L F L' U' F2 R' F' (8,10) R2 F D R' D' R2 F' U' (8,10)

U2 R' (2,3) D2 L2 R2 D2 U2 L2 R (7,13) L' R B2 R' B2 L R' U2 (8,11) B U B' R' U2 F' U' F (8,9)

U R' (2,2) R' U2 F' L' U L U2 F (8,10) B' R2 D' R' D B R2 U (8,10)

U' R2 (2,3) D2 U L2 R2 D2 U2 L2 (7,13) D' F2 U F2 D U' R2 U' (8,11) R2 D' U F2 U' F2 D U' (8,11)

U2 R2 (2,4) D2 L2 R2 D2 U2 L2 (6,12) L2 B' F U2 R2 D2 B F' (8,12) D2 L2 U L2 D2 U2 R2 U' (8,14)

U R2 (2,3) D2 U' L2 R2 D2 U2 L2 (7,13) B' R B R2 U F R' F' (8,9) R' U F R2 D R D' F' (8,9)

U' R (2,2) F R2 D R D' F' R2 U' (8,10) D2 U L2 R2 D2 U2 L2 R' (8,14)

U2 R (2,3) D2 L2 R2 D2 U2 L2 R' (7,13) F' L' U L U2 F R U' (8,9) D2 B2 F2 D2 U2 B2 F2 R (8,15)

U R (2,2) D2 U' L2 R2 D2 U2 L2 R' (8,14) R2 B' D' R D R2 B U (8,10)

L R (2,2) L' B2 F2 L2 R2 B2 F2 R' (8,14) B2 F2 L2 R2 B2 F2 L' R' (8,14)

L2 R (2,3) B2 F2 L2 R2 B2 F2 R' (7,13) D2 U2 L2 R2 D2 U2 R' (7,13) R' B2 F2 L2 R2 B2 F2 (7,13) R' D2 U2 L2 R2 D2 U2 (7,13) R2 B2 F2 L2 R2 B2 F2 R (8,15)

L' R (2,2) L B2 F2 L2 R2 B2 F2 R' (8,14)

L2 R2 (2,4) B2 F2 L2 R2 B2 F2 (6,12) D2 U2 L2 R2 D2 U2 (6,12) R B2 F2 L2 R2 B2 F2 R' (8,14)

L' R2 (2,3) L B2 F2 L2 R2 B2 F2 (7,13) L D2 U2 L2 R2 D2 U2 (7,13) B2 F2 L2 R2 B2 F2 L (7,13) D2 U2 L2 R2 D2 U2 L (7,13) L2 B2 F2 L2 R2 B2 F2 L' (8,15)

L' R' (2,2) L B2 F2 L2 R2 B2 F2 R (8,14)

Edit 2:

To improve upon this, you would have to find 8-move solutions for every possible 3-move ending (or 9-move solutions for every 4-move ending, and so on). This started to seem like a lot of work, and I didn't think it would yield any results until you got to higher movecounts. I did some preliminary testing just to see how far up I could push it. I found the 3-move ending D F' U', which takes 9 moves to solve optimally if you don't count the 3-move solution. From there I tried moves to see if I could come up with a 4-move ending that could not be solved in fewer than 10 moves optimally, except for the 4-move solution. I found B D F' U'.

This continued until I got to the 9-move ending R' F' L U' R' B D F' U', which takes 15 moves optimally if you don't count the 9-move solution. I would go further, but it is beginning to take Cube Explorer longer and longer to find solutions.

Edit 3:

B' R' F' L U' R' B D F U' requires 16 moves, not counting the 10-move solution.

I just ran the HugeOptimalSolver of Cube Explorer 5.00s on three
random cubes. All three of them were 18f*, and the runtimes were
0:50, 1:30 and 2:56. The superflip took 3:40 (it reached depth 18
much faster, I think because of the high level of symmetry).

For most optimal solving needs, CubeExplorer is the right way to
go. Like a bike, it gets you there easily, quickly, and with a
minimum of fuss.

My optimal solver is a freight train. It's made to solve
hundreds of thousands or millions of random positions as quickly
as possible. There is no UI (command-line only). It wants lots
of memory and lots of cores. It only supports two metrics (QTM
and HTM, not STM).

Even if you have lots of memory, generating or loading the
pruning table it wants can be slow; I typically use a 176GB
pruning table, which takes many minutes to load.

For pruning tables similar in size to what Cube Explorer uses, I
don't believe it is significantly faster than Cube Explorer.

Given all this, if anyone *really* has interest, I can document
it and put it out there. The pruning table sizes it supports
are:

Your choice. The larger the pruning table, the faster the
solution rate, but also the longer it takes to load and generate
the pruning table. On my 256GB 16-core box, I use the 176GB
size; on my 64GB 12-core box, I use the 33GB size. On my 16GB
2-core laptop, I use the 7.6GB size.

With a 176GB pruning table on the 16-core box I can solve random
positions in the half-turn metric at a rate of 33 per second. In
the quarter-turn metric it is even faster.

Now, back to the point of this thread. I solved a few hundred
thousand random positions and was able to find a length-20
canonical solution sequence for all. Then I tried all positions
with four degrees of symmetry or more, and for all of those was
able to find a length-20 canonical solution sequence. I am
running positions with three degrees of symmetry now and am
considering solving all positions with two degrees of symmetry
next.

Based on these results, I am moderately confident that every
position has a distance-20 canonical solution sequence (that is,
that God's Second Number is also 20.)

I have pretty much gotten as far as I am going to go following that sequence I posted in my last post (it became too time-consuming for regular Cube Explorer). I decided I would go back and find every 3-move position that takes at least 9 moves and follow them all up as far as I can.

To rule out 3-move positions that do have other solutions shorter than 9 moves, I went back to my 2-move combinations so I wouldn't try to branch off from 2-move positions with other solutions shorter than 8 moves. That was when I realized that all of the 2-move combinations that had multiple solutions under 8 moves in length were the combinations that had half-turns in them.

This makes me think that there is at least some truth to what I said earlier about positions with fewer half-turns seeming harder.

Hi rokicki, I'm pretty interested in your pruning table. Before guessing the coordinate you used to construct the pruning table, I want to confirm how you store it. Is it stored as 1.6 bits per entry (only storing depth mod 3, as in Kociemba's implementation)? Or 32 bits per entry (as mentioned in the source code of the proof of 20 moves)?

It's two bits per entry. I don't store the value mod 3, though.
For each specific size of pruning table, I have a fixed BASE
value, and I store whether the pruning value is <BASE,
=BASE, =BASE+1, or >BASE+1 using those two bits.

In addition, for every 62 values, I have a single four-bit value
that holds the minimum *actual* value across those 62 values.
(This is in the same cache line as the 62 values.)

Essentially, then, I'm using 512/495 * 2 or 2.07 bits per entry.
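A bare-bones illustration of this scheme (my own sketch, not the actual solver code; the BASE value, the unpacked struct, and all names here are assumptions):

```c
#include <assert.h>
#include <stdint.h>

/* Each entry stores one of four codes relative to a fixed BASE. */
enum { BASE = 10 };

static uint8_t encode(int value) {
    if (value < BASE)      return 0;  /* <  BASE   */
    if (value == BASE)     return 1;  /* == BASE   */
    if (value == BASE + 1) return 2;  /* == BASE+1 */
    return 3;                         /* >  BASE+1 */
}

/* A group of 62 two-bit codes plus one 4-bit field holding the exact
   minimum value over the group (unpacked here for clarity; the real
   table packs a group into the same cache line as its entries). */
struct group {
    uint8_t codes[62];
    uint8_t min;
};

/* Lower bound recoverable for one entry: when the code says "< BASE",
   fall back on the group's exact minimum. */
static int lower_bound(const struct group *g, int i) {
    switch (g->codes[i]) {
    case 0:  return g->min;   /* somewhere in [min, BASE-1] */
    case 1:  return BASE;
    case 2:  return BASE + 1;
    default: return BASE + 2; /* at least BASE+2 */
    }
}
```

A lower bound is all a depth-limited search needs from a pruning table, so losing exact values away from BASE costs little when BASE sits near the typical depth of the table.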

So, you're going to guess the coordinates I use for all of my
12 different pruning tables? That would be impressive. Do you
want the exact sizes to help you in this quest?

Here's my guess of the coordinate of these pruning tables:
In each line, the first number is the estimated file size of the table, the second is the estimated number of entries, and then come the possible coordinates.

"495" is the UDSlice coordinate
"*16" is the edge orientation of 4 edges represented by the UDSlice coordinate.
"*256" is the edge orientation of 8 edges not represented by the UDSlice coordinate.
"*2048" is the edge orientation of all 12 edges.
"*2187" is the corner orientation of all 8 corners.
"*24" is the permutation of 4 edges represented by the UDSlice coordinate.
"*70" can be the "enhanced" UDSlice coordinate. Based on UDSlice, the position of 4 edges in U face or D face can be represented by 1..C(8,4)=70.
"*70" also can be the number to represent the position of 4 corners in U face or D face.
"/16" symmetry reduction. All combined coordinates listed follow can be reduced via the subgroup of the symmetry group generated by S_F2, S_U4 and S_LR2 with 16 elements.
As the symmetry reduction is not perfect because of some self-symmetry state and the imperfect implementation of symmetry reduction, the file size as well as the number of entries is underestimated.

All the pruning tables I use are with respect to subgroups that include U and D.

That is, the positions U, D, U2, D2, U3, and D3 are all at distance 0 in the
subgroups represented by the pruning table.

All also have sixteen degrees of symmetry.

This means for every node I can use six pruning table entries: three in the
three major axis orientations, and three more for the inverse position.

My search switches between forward and backward depending on which
pruning results are most restrictive (by branching factor).

So for instance if I have 12 moves to go, and a lookup in the primary
orientation forward says the position is at distance 12, I know the result
will not end in any twist of the up or down faces. Thus, I can search
from the *reverse* position and only consider moves of the other four
faces. If both the UD position and the LR position give a value of 12,
I know any solution must end in a move of only the F or B faces. This
is a generalization of the "if all three probes are 11, then the distance
must be 12" trick that Kociemba uses in his optimal solver.
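The axis restriction described above can be sketched as follows (my illustration, not the solver's code): a probe equal to the remaining budget rules out finishing (for forward probes) or starting (for inverse probes) the solution with a twist of that table's two free faces.

```c
#include <assert.h>

/* The three axes of the cube, one pruning-table orientation each. */
enum { UD, LR, FB };

/* probe[a] is the pruning value from the table oriented along axis a.
   A probe equal to (or exceeding) the remaining budget means no move
   of that axis's two faces can sit at the constrained end of the
   solution; return a bitmask of such blocked axes. */
static int blocked_axes(int togo, const int probe[3]) {
    int mask = 0;
    for (int a = 0; a < 3; a++)
        if (probe[a] >= togo)
            mask |= 1 << a;
    return mask;
}
```

With one mask from the forward probes and one from the inverse probes, the search can recurse at whichever end has more blocked axes, i.e. the smaller branching factor.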

Further, every recursion into a deeper node leaves one pruning value
unchanged, so I need do at most five probes of the table.

And of course I exploit the 16-way symmetry to get pretty deep tables in
a relatively compact space.

Hi rokicki, I have a question about your solver.
As you mentioned, for every move you make you do 5 probes of the pruning table, and 3 of these probes go through the inverse position. Does that mean that when you do a move 'X' on state 'S', you calculate S * X and X^-1 * S^-1 simultaneously?
If so, to my knowledge it's hard to calculate X^-1 * S^-1 at the coordinate level, but if we do the move at the cubie level, it will be 10x slower than at the coordinate level.
Would you mind sharing your implementation details on how you deal with this?

BTW, have you compared the performance of your implementation and Kociemba's when using the same pruning table?

Yes, I calculate XS and S'X' both. I don't use coordinates; I work at the cubie level in almost all my programs, including this one.

Coordinates work great when all the coordinates and their move tables fit in cache, but once they don't, you're actually
faster doing things at the cubie level, assuming your indexing is fast enough.
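To make the "calculate XS and S'X' both" step concrete: since (XS)' = S'X' and X' comes from a fixed set of precomputed move permutations, the inverse state can be maintained incrementally for one extra composition per move rather than recomputed from scratch. A minimal sketch with plain permutation arrays (my illustration, not rokicki's code; a real cubie representation would also carry orientations):

```c
#include <assert.h>

#define N 8  /* illustration: a permutation of 8 slots */

/* c = a composed with b: c[i] = a[b[i]] */
static void compose(const int a[N], const int b[N], int c[N]) {
    for (int i = 0; i < N; i++) c[i] = a[b[i]];
}

/* ainv = a^-1 */
static void invert(const int a[N], int ainv[N]) {
    for (int i = 0; i < N; i++) ainv[a[i]] = i;
}

/* Applying move X to state S gives XS; the inverse state becomes
   (XS)' = S'X'.  Since X' is precomputed once per move, updating the
   inverse is just one more compose(). */
static void apply_move(const int X[N], const int Xinv[N],
                       int S[N], int Sinv[N]) {
    int t[N];
    compose(X, S, t);          /* forward state: XS   */
    for (int i = 0; i < N; i++) S[i] = t[i];
    compose(Sinv, Xinv, t);    /* inverse state: S'X' */
    for (int i = 0; i < N; i++) Sinv[i] = t[i];
}
```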

It would be interesting to compare my implementation to Kociemba's, but Kociemba's is for Windows and mine is in C.
I'd have to find a Windows machine and a C compiler for it and compile my programs for that (I've long since switched
to Linux and Macs). Maybe I'll do that.

I'd be happy to share the code, but it's not beautiful yet and there's no real UI.

Here's my knowledge about cache and move table:
The L3 cache of modern CPU (e.g. intel core i7 4770) is ~8M, which is large enough to store most of the move tables. Although the L1 and L2 cache miss rate may be very high, I don't think the average access time to the move table might exceed 50 clocks (http://stackoverflow.com/questions/4087280/approximate-cost-to-access-various-caches-and-main-memory). Especially, if we split the move table into several parts, the size of the move table will extremely decreased. For example, the size of CornerTwistMoveTable is only 2,187 * 18 * 2 = 78,732 Bytes, which is smaller than the size of L2 cache and the table can be accessed within 10 clock cycles on average.

But if the move is done at the cubie level, there will be more than 20 memory accesses (8 corner cubies + 12 edge cubies), which will cost about 80 clocks, not counting the overhead of indexing.

However, when the performance bottleneck is not the move procedure (it might instead be the memory accesses to the pruning table), the benefits of doing moves at the cubie level, e.g. searching from the inverse simultaneously, will speed up the search.

Can you release an alpha version that can solve a given state (no matter how the state is input)? Let's do some benchmarks to verify the performance of doing moves at the cubie level.

BTW, in my opinion, the search time is proportional to the number of accessed nodes, and by pruning at the inverse state simultaneously, the number of accessed nodes can be reduced. However, when I implemented such a solver, the number of accessed nodes was only reduced by about 50%. The result is reasonable because we have 3 pruning values already, and in most cases max(x1, x2, x3) == max(x1_inv, x2_inv, x3_inv), by the properties of order statistics. If that's true, inverse pruning won't show much improvement over a conventional solver. Have you gathered statistics on the number of accessed nodes compared to a solver that doesn't use the inverse state?

Kociemba claims 7000 cubes/day on an i7-920, which is an average rate of 1 cube per 12.34 seconds. And the i7-920 is supposed to be only about 55% of the throughput of the E5-1620. Something is odd.

The distribution appears to be *highly* skewed, but there is no easy way to get all the data.

Node rate: 4.55M nodes/sec/core or 36.4M nodes/sec overall (very fast!)

Long term average: 549M nodes per solution

My 40 samples from Kociemba's solver show 7.53s and 274M nodes per solution.

Mine, at 1.3GB: 227M nodes per solution; solution every 8.8s.

Here are some notes on the different options for my solver.
The "average1" is the average value in the pruning table with
one probe; the average6 is the average value returned with
6 probes (these are for random positions).

Cool, Kociemba's 1.8G table is equivalent to your 2.5G table.
And it seems that doing moves at the coordinate level is a little faster than at the cubie level, while due to the lack of inverse pruning the number of accessed nodes is almost doubled.

That might be reasonable, but I'll have to check; I don't remember
exactly what he uses. If it is equivalent I'm a bit surprised; I'm
using 2 bits per and I thought he was using 1.6 bits per, so
the ratio should be 1.25:1, but 2.5:1.8 is significantly greater
than that.

I would not be surprised if my optimal solver could be sped up
quite a bit.

The inverse helps more than just max(a,b,c) vs max(a', b', c');
let us say the number of allowed remaining moves is 10, and the
pruning values come out to be (10,9,9); (10, 10, 9). This
tells me that any move sequence cannot start with a twist of
the UD or FB faces, and it cannot end with a twist of the UD
faces. Because the start move is more constrained, I recurse
on the next (start) move; if the end move were more constrained,
I would recurse on the last (end) move. I think this helps
cut the nodes explored a fair bit.

It's due to your implementation of symmetry. The symmetry coordinate Kociemba used is UDSliceFlip, 2048*495 -> 64430, 1.69% redundancy. So the total number of entries in his table is 64430 * 2187 * 70 = 9,863,588,700, and the file size is 9,863,588,700 / 5 = 1,972,717,740 bytes.
The symmetry coordinate you used is CornerCombinationTwist, 2187*70 -> 9930, 3.78% redundancy. The total number of entries is 9930 * 495 * 2048 = 10,066,636,800, and the file size is 9930 * 512 * 2048 / 4 = 2,603,089,920 bytes (which is not 9930 * 495 * 2048 / 4 as you described earlier. Is that correct?)

I believe you are correct! The file size is different from my description because there are some other things going on with my files.
For instance, for the four "values" I might use ranges of 0-9, 10, 11, 12+, but every 62 entries or so I have a 4-bit field that
gives the minimum value over those 62 entries exactly (so whenever the two bits give me a range of 0-9, I use that larger
field instead which improves the accuracy of my table). This adds a factor of 512/495, which then matches your calculations.