# 5x5x5 OBTM upper bound

#### xyzzy

##### Member
(Has anyone else attempted such computations before? I didn't find anything.)

tl;dr: 143 OBTM 141 OBTM 135 OBTM (2017-10-31) 134 OBTM (2018-05-15) 130 OBTM (2018-05-20).

I finished writing a 5x5x5 solver about three weeks ago, and in the time since I've mostly been working on trying to get a "good" upper bound for the 5x5x5 God's number in OBTM. I'm not entirely satisfied with this analysis, because according to this older thread with move counts of human solves, reduction can and should take much fewer moves than the upper bound I got, but anyway.

We divide the solving process into seven main phases, the last of which is a 3x3x3 solve. Some of these phases will in turn be divided into subphases that are easier to analyse. The division of phases is largely based on Charles Tsai's documentation of his 4x4x4 solver.

(Side note: the code I use for generating these tables is mostly not shared with the solver, apart from trivial utility functions like calculating binomial coefficients and such, so I'm not fully certain the numbers are correct. Independent confirmation would be awesome. Also, the solver itself usually finds a reduction in around 70 to 75 moves with about a minute of CPU time, which is… very far from the bound presented here, to say the least.)

Phase 1: Solving the UD centres to the UD faces

E: Obsolete; see post #7 by qq280833822.

Someone with more disk space to spare (and using a programming language where accessing hundred-gigabyte arrays isn't a PITA) could probably do a breadth-first search on the whole state space of C(24,8)^2 ~ 541 billion states, but that someone is unfortunately not me. This is split into two subphases, where the first subphase solves the T-centres and the second subphase solves the X-centres. As an optimisation, we can reduce the move count by one if we skip the last move of the first subphase before proceeding to the second subphase.

Obsoleted by later analysis; preserved for posterity.

Moves allowed: U, D, R, L, F, B, Uw, Dw, Rw, Lw, Fw, Bw
Code:
0     5
1     66
2     900
3     9626
4     80202
5     329202
6     302146
7     13324
Obsoleted by later analysis; preserved for posterity.

Moves allowed: U, D, R, L, F, B, Uw, Dw, Rw, Lw, Fw, Bw
Code:
0     -
1     1
2     -
3     8
4     64
5     363
6     3604
7     27752
8     159086
9     392648
10     151261
11     684

Upper bound: 7 + 11 = 18 12 moves.

Phase 2: Solving the RL centres to the RL faces; FB centres to the FB faces; and wing parity

Moves allowed: U, D, R, L, F, B, Uw, Dw, Rw2, Lw2, Fw2, Bw2
Code:
0     1
1     2
2     33
3     374
4     3848
5     40230
6     409051
7     3816922
8     27309148
9     108088828
10     153042267
11     38271627
12     291452
13     17

Upper bound: 13 moves.
Cumulative upper bound: 31 25 moves.

Phase 3: Orienting all midges and separating wings into two orbits

Note that no restriction is placed on the centres in this phase, unlike the corresponding phase in Tsai's solver. We allow either all the edges to be oriented, or all the edges to be flipped.

Obsoleted by later analysis; preserved for posterity.

Moves allowed: U, D, R, L, F, B, Uw2, Dw2, Rw2, Lw2, Fw2, Bw2
Code:
0     1
1     2
2     33
3     382
4     3920
5     45980
6     515594
7     4882282
8     37725852
9     214944909
10     748549490
11     1177896283
12     541147752
13     43045056
14     297920
15     288

Upper bound: 15 13 moves.
Cumulative upper bound: 46 44 38 moves.

(See post #5 in this thread for how the upper bound was reduced from 15 moves to 13 moves.)

Phase 4: "Domino" reduction

Again, no restriction is placed on the centres here. This is divided into two subphases. In the first subphase, the pieces needed to form four tredges are placed in the L and R layers. (This takes no moves at all ~52% of the time, or ~78% of the time if we get to do a z rotation as well, but we're doing a worst-case analysis, not an average-case analysis.)
In phase 3, we have the wings separated into two orbits (each of 12 pieces), such that within each orbit, each wing has a different pair of colours. (To borrow terminology from earlier work, arbitrarily call one of the orbits "low" and the other "high".) The low orbit has eight wings in the L, R layers, as does the high orbit.

Let A be the set of colour pairs of the eight wings in the low orbit and the L, R layers, and B be the set of colour pairs of the eight wings in the high orbit and the L, R layers. We have |A| = |B| = 8 and |A∪B| ≤ 12 (there are only 12 possible colour pairs), so by inclusion-exclusion, |A∩B| ≥ 4, i.e. there are at least four colour pairs shared between A and B. We can choose any four such colour pairs, and then move the midges and wings with those colour pairs onto the L and R layers.

This has C(8,4) ⋅ C(8,4) ⋅ C(12,4) cases to consider, where the four low wings are already in the L, R layers, the four high wings likewise, and the four midges can be anywhere. We do not constrain the four low wings or the four high wings to stay in the L and R layers throughout the subphase; we only require that they end up there.

(The actual computation was done with a BFS through the C(12,4)^3 cases with the wings being anywhere, and then filtered to include only the cases where the low/high wings start on the L and R layers.)

We can use the same inclusion-exclusion trick on, say, the low wings and the midges (instead of low wings and high wings), but that leads to an upper bound of 7 moves, which is worse than the 6 moves we have here.

In the second subphase, four tredges are formed out of the pieces chosen in the previous subphase, and these tredges are placed in the E slice positions (FL, BL, BR, FR). The moves used here are almost the same as in phase 5, but off by a z rotation.

Moves allowed: U, D, R, L, F2, B2, Uw2, Dw2, Rw2, Lw2, Fw2, Bw2
Code:
0     343000
1     24750
2     194785
3     977228
4     672305
5     204640
6     8792
Moves allowed: U2, D2, R, L, F2, B2, Uw2, Dw2, Fw2, Bw2
Code:
0     1
1     3
2     17
3     110
4     761
5     4944
6     30453
7     178759
8     995736
9     5144375
10     22521971
11     67281036
12     84004674
13     17258876
14     146284

Upper bound: 6 + 14 = 20 moves.
Cumulative upper bound: 66 64 58 moves.

Phase 5: 3x3x3 reduction (except centres)

In this phase, all the tredges are formed; the centres remain in a "3-colour solved" state.

Moves allowed: U, D, R2, L2, F2, B2, Rw2, Lw2, Fw2, Bw2
Code:
0     1
1     4
2     26
3     156
4     999
5     5892
6     36376
7     222480
8     1301886
9     7238228
10     36410756
11     144974952
12     343690470
13     262142742
14     16825016
15     1216

Upper bound: 15 moves.
Cumulative upper bound: 81 79 73 moves.

Phase 6: Centres

Three subphases: one for the U and D centres, one for the R and L centres, and one for the F and B centres. The depth distributions are identical for all three.

Partially obsolete.

Moves allowed (for solving R, L, F and B centres): U2, D2, R, L, F, B, Uw2, Dw2, Rw2, Lw2, Fw2, Bw2
See post #9 for details. Upper bound is 27 moves.

Moves allowed (for solving U and D centres): U, D, R2, L2, F2, B2, Rw2, Lw2, Fw2, Bw2
Code:
0     1
1     -
2     -
3     -
4     2
5     6
6     29
7     146
8     331
9     189
10     592
11     940
12     1560
13     1064
14     40

Upper bound: 3⋅14 = 42 27 + 14 = 41 moves.
Cumulative upper bound: 123 121 115 114 moves.

Phase 7: 3x3x3

Upper bound: 20 moves.
Cumulative upper bound: 143 141 135 134 moves.

Last edited:

#### Herbert Kociemba

##### Member
Someone with more disk space to spare (and using a programming language where accessing hundred-gigabyte arrays isn't a PITA) could probably do a breadth-first search on the whole state space of C(24,8)^2 ~ 541 billion states, but that someone is unfortunately not me.
It is not so hopeless as you think. You can reduce the problem by 16 symmetries, then you have two symmetry-reduced coordinates with size each of about C(24,8)/16 = 50000. These can be combined to a coordinate of size about 50000^2*16= 40GB. You can construct a pruning table with 2 bits for each entry, so the pruning table will not have more than 10 GB. Just for a breadth first search you only need 1 bit per entry, so even 5 GB will be enough.

#### xyzzy

##### Member
It is not so hopeless as you think. You can reduce the problem by 16 symmetries, then you have two symmetry-reduced coordinates with size each of about C(24,8)/16 = 50000. These can be combined to a coordinate of size about 50000^2*16= 40GB. You can construct a pruning table with 2 bits for each entry, so the pruning table will not have more than 10 GB. Just for a breadth first search you only need 1 bit per entry, so even 5 GB will be enough.
We don't need two symmetry-reduced coordinates, but only one symmetry-reduced coordinate and one non-reduced coordinate, right? 5 GB would fit in my free RAM pretty snugly, so I'm going to give this a shot. I'm somewhat afraid that the computation will take many hours and I'll stay up all night staring at a progress bar slowly fill up, though.

#### Herbert Kociemba

##### Member
We don't need two symmetry-reduced coordinates, but only one symmetry-reduced coordinate and one non-reduced coordinate, right? 5 GB would fit in my free RAM pretty snugly, so I'm going to give this a shot
Yes this ist true and surely a simpler approach. Additionally you then need a table which transforms the non-reduced coordinates to new non-reduced coordinates by conjugation with each of the 16 symmetries.

#### xyzzy

##### Member
For phase 3, we don't actually have to look at all 2^11 C(24,12) / 2 ~ 2.8 billion states, since we're free to assign the starting midge position to be whatever we want. For instance, if we consider the midges to always start as being oriented, the distribution changes to:

Code:
 0          1
1          -
2          -
3          -
4          8
5        128
6        863
7       3990
8      23788
9     151274
10     611614
11    1110867
12     661700
13     132522
14       7364
15         37
This doesn't improve on the upper bound, but we can look at other starting midge orientations (since we don't care if the reduction leaves the cube with flipped tredges at the end).

Of the 288 antipodal positions, 37 of them have midges solved at the start, 21 have midges in the S slice flipped, 75 have midges in the E slice flipped, and 75 have midges in the M slice flipped, so 208 of the 288 antipodes have D8 symmetry on the midges along some axis. One of the antipodes is given by having all wings solved and all midges flipped, which is about as symmetric as you can get.

Anyway, if we start with the six midges at UF, UL, UB, FL, BL and DR flipped, this is the distribution.

Code:
 0         -
1         -
2         -
3         -
4         -
5         9
6       194
7      2888
8     30212
9    216169
10    857006
11   1246820
12    346302
13      4556

This reduces the upper bound for phase 3 to 13 moves, and the overall upper bound to 141 moves. This is the "best" we can do with a fixed starting midge orientation (every starting orientation has some case that needs at least 13 moves), but a potential improvement in this direction is to use set cover techniques to try different starting orientations.

Last edited:

#### qq280833822

##### Member
I've just finished the analysis of the phase1 of your algorithm. The analysis result shows that phase 1 can be solved in 12 moves. Here is the depth distribution of phase 1.
Code:
Depth      Positions             Total
0                 1                 1
1                 4                 5
2                98               103
3             2,036             2,139
4            41,096            43,235
5           824,950           868,185
6        16,300,291        17,168,476
7       311,709,304       328,877,780
8     5,376,178,126     5,705,055,906
9    67,415,029,758    73,120,085,664
10   316,925,367,079   390,045,452,743
11   150,473,278,964   540,518,731,707
12       398,860,134   540,917,591,841
BTW, It seems that the moves spend on phase 6 is 3*14, wherein the centers of UD/FB/RL are independently solved. Will the depth be significantly reduced by solving centers of 3 axes simultaneously?

The number of states in this phase is C(8, 4)^6 = 117,649,000,000. If all 48 symmetries are considered, the search space will be reduced to about 2.5 billion unique states. If we have a really fast solver, maybe it can be finished in several hours or several days at most.

Last edited:

#### xyzzy

##### Member
Nice! The phase 1 distribution matches mine up to 6 moves, which was where I stopped my computation.

BTW, It seems that the moves spend on phase 6 is 3*14, wherein the centers of UD/FB/RL are independently solved. Will the depth be significantly reduced by solving centers of 3 axes simultaneously?

The number of states in this phase is C(8, 4)^6 = 117,649,000,000. If all 48 symmetries are considered, the search space will be reduced to about 2.5 billion unique states. If we have a really fast solver, maybe it can be finished in several hours or several days at most.
Yeah, I think phase 6 is wasting a lot of moves, but I really don't know where to even begin shaving moves off that.

Within ⟨U, D, R2, L2, F2, B2, Rw2, Lw2, Fw2, Bw2⟩, the R/L/F/B faces are always in a "vertical bars" state (C(4,2)^2 = 36), and in total all the centres can be solved in 16 moves with this restriction:
Code:
0 1
1 -
2 -
3 -
4 12
5 18
6 68
7 294
8 986
9 3489
10 9953
11 15459
12 34228
13 54528
14 45832
15 11340
16 192
(I ran this computation last year, but ended up not using these results.) So we could do something like two subphases: first subphase brings the RLFB centres to a vertical bars position, then second subphase solves all centres (at most 16 moves). The first subphase has only C(8,4)^4 = 24010000 positions to consider (~1.5e6 reduced by symmetry), and we can use ⟨U2, D2, R, L, F, B, Uw2, etc.⟩ to solve this, so the "edge pairing" part has only 12!^2/2 states because EO is preserved (instead of 24!/2 if EO isn't preserved), which allows for good pruning tables to be constructed.

(I have an optimal reduction solver, but it's extremely slow, partly because of how ineffective the edge pairing pruning tables are when there are no move restrictions at all. I think we would run into the same problem if we want to do phase 6 all at once, which has no meaningful move restrictions, which in turn makes pruning less effective, which makes exhaustive searching slower.)

A lot less ambitious than doing phase 6 without subphases, but still likely to reduce the upper bound for phase 6 significantly.

#### xyzzy

##### Member
Some experimentation on optimising phase 6 (and more generally phases 4 through 6), hopefully an actual lowering of the bound within the next few days as I verify my code and run a bunch of exhaustive searches.

So we could do something like two subphases: first subphase brings the RLFB centres to a vertical bars position, then second subphase solves all centres (at most 16 moves). The first subphase has only C(8,4)^4 = 24010000 positions to consider (~1.5e6 reduced by symmetry), and we can use ⟨U2, D2, R, L, F, B, Uw2, etc.⟩ to solve this, so the "edge pairing" part has only 12!^2/2 states because EO is preserved (instead of 24!/2 if EO isn't preserved), which allows for good pruning tables to be constructed.
This might work, but it's much slower than I'd expected it to be. I have a solver for solving UDRL centres and pairing all the edges within the move set {U, D, R, L, F2, B2, Uw2, Dw2, Rw2, Lw2, Fw2, Bw2}, and I tweaked the algorithm slightly to be able to return suboptimal solutions (basically, multiplying the pruning table values by a fixed constant greater than 1).

For solving the UD centres, preserving RL and ignoring FB, with the aforementioned set of allowed moves, this is the move count distribution:
Code:
 0:  1
1:  0
2:  0
3:  0
4:  6
5:  12
6:  19
7:  178
8:  459
9:  285
10:  722
11:  1274
12:  1440
13:  496
14:  8
The upper bound here is 14 moves, which isn't better than what we had before. The antipodal positions are this state with 3 of the x-centres wrong and this other state also with 3 of the x-centres wrong (and the mirrors of these).

The cases with either of these patterns on both the UD centres and the RL centres can be solved in 21-24 moves, and for the remaining cases we can solve the UD centres and RL centres separately in at most 13+14 = 27 moves (since we won't have the 14-move cases on both axes), so this reduces the upper bound of phase 6 by a grand total of one move to 41 moves, bringing the total upper bound down to 134 moves.

(that's a lot of work for shaving off one move, wtf)

Solving to a position with the UDRL centres in a bars state doesn't seem to be faster. ("Faster" in the sense of CPU time needed, which is the real bottleneck here.) I also ran the computation for solving UD centres without preserving RL and FB, but less than one quarter of the way through I noticed that the 14-move positions were still taking 14 moves even if RL didn't need to be preserved, so I aborted it.

The 21-move solution of the "worst case" position was produced with the suboptimal solver, because I ran the optimal solver for two hours and it got nowhere. We might need to do some extra face aligning (because this isn't a unique "worst case" position), which explains why the actual bound we get for this position is 24 moves. I think at most two alignments are needed, but it's 4 am and I can't be bothered to work out the cases now (and it's not necessary anyway).

Given that the worst case of "axis-by-axis" solving comes from the case with only x-centres unsolved, we might get a better bound by solving the x-centres then the t-centres. Or we could just throw this approach out and come up with something completely different and """""original""""", like solving the centres first then doing edge pairing.

This is the move count distribution for solving one orbit of the wings after phase 3 (ignoring all centres):
Code:
total states: 479001600
distance 0: 1 nodes
distance 1: 6 nodes
distance 2: 55 nodes
distance 3: 454 nodes
distance 4: 3287 nodes
distance 5: 21258 nodes
distance 6: 125118 nodes
distance 7: 680883 nodes
distance 8: 3278462 nodes
distance 9: 13363986 nodes
distance 10: 43383252 nodes
distance 11: 103042620 nodes
distance 12: 156868710 nodes
distance 13: 124957752 nodes
distance 14: 33212116 nodes
distance 15: 63640 nodes
Solving the other orbit of wings while preserving the first (and still ignoring all centres) takes at least 18 moves in the worst case. Assuming the worst case is indeed 18 moves, we could decrease the upper bound of phase 4 + phase 5 (currently 35 moves) to 15+18 = 31 moves. However, this would take way too much computation to do directly (~3 minutes per state per thread and there are ~3 million of them with symmetry reduction), and there's no "room" for splitting up into subphases because that would probably waste 4+ moves.

This is the distribution for solving the wings relative to each other (i.e. the edges are all paired, except the midges might be wrong), still ignoring all centres:
Code:
total states: 14999140
distance 0: 1 nodes
distance 1: 2 nodes
distance 2: 7 nodes
distance 3: 24 nodes
distance 4: 103 nodes
distance 5: 619 nodes
distance 6: 4287 nodes
distance 7: 28697 nodes
distance 8: 187493 nodes
distance 9: 1087267 nodes
distance 10: 4323558 nodes
distance 11: 7657009 nodes
distance 12: 1708625 nodes
distance 13: 1448 nodes
(note: this is with full symmetry reduction)
This looks a bit more promising, because it seems that we can solve the midges in relatively few moves and it's way faster (~275 milliseconds per state per thread). A random sample of two thousand states:
Code:
  8:  1
9:  2
10:  2
11:  10
12:  33
13:  129
14:  441
15:  563
16:  702
17:  117
This could potentially cut the upper bound for phase 4 + phase 5 to 13+17 = 30 moves, which is still pretty high, but we'll take it.

Last edited:

#### xyzzy

##### Member
>triple post

Here we optimise phases 4+5 by doing something completely different instead. This cuts the bound for those two phases to 13+18 = 31 moves, which is four moves fewer than what we had in the original analysis (20+15 = 35). Just to avoid confusion, these will be called phases A4 and A5 instead of just "phase 4" and "phase 5". (I still think it's possible to optimise the original phase 4 more—it just needs more disk space to store the computation than I have free.)

Phase A4: Pairing wings (ignoring midges) + parity of wings relative to midges

At the end of phase 3, we have the wings separated into a "low" orbit and a "high" orbit, each with twelve pieces. Thus we can define the permutations of the wings within the orbits as $$L,H\in S_{12}$$, with the permutation of the midges $$M\in S_{12}$$. In this phase, we solve $$H^{-1}L$$ (the goal is to have $$H^{-1}L=e$$, the identity permutation) and $$\operatorname{sgn}(M^{-1}L)$$.

As the parity of all 24 wings is solved in phase 2, this ensures that $$H^{-1}L$$ can only be an even permutation, so this phase has (12!/2) × 2 ~ 4.8e8 cases. An example of a case where $$H^{-1}L=e$$ and $$\operatorname{sgn}(M^{-1}L)=1$$ is given by doing the scramble Rw2 U2 Rw2 U2 Rw2.

This is to reduce the number of actual solving that is needed. It turns out that, regardless of whether parity is considered or ignored, the upper bound for this phase will be 13 moves. The distribution for this phase can be computed with a straightforward breadth-first search, which takes only a couple of minutes, whereas the distribution for the next phase has to be done with loads of optimal solves. Thus increasing the number of cases in this phase (which is relatively fast) in exchange for halving the number of cases in the next phase (which is very slow) is a net win in terms of CPU time needed. It also (possibly) reduces the upper bound.
Moves allowed: U, D, R, L, F2, B2, Uw2, Dw2, Rw2, Lw2, Fw2, Bw2
Code:
0:         1
1:         3
2:        20
3:       149
4:      1198
5:      8608
6:     67897
7:    496029
8:   3476696
9:  21726393
10:  98243928
11: 237241650
12: 116562172
13:   1176856

Upper bound: 13 moves.

Phase A5: 3×3×3 reduction (except centres)

This is split into two subphases. The first subphase reduces to a set of roughly one million cases that can be solved in at most 10 moves, and the second subphase finishes edge pairing in at most 10 moves. The first subphase takes at most 7 moves 99.989% of the time, and for the remaining 0.011%, we run an optimal solver on all of those cases.

(Note: the computation was done with symmetry reduction, but the distributions here are for all cases.)

For this subphase, the set of destination states we choose consists of the states with $$H^{-1}L=e$$ that are solvable in at most 10 moves. The computation was done by first filtering out the states where the individual orbits are solvable in 10 moves (47879988 cases in total), then doing optimal solves for those. Note that this set does not restrict to even parity like phase A4 does; 741052 of the states have even parity and 207024 have odd parity.

Moves allowed: U, D, R, L, F2, B2, Uw2, Dw2, Rw2, Lw2, Fw2, Bw2
Code:
  0:    741052
1:   2155220
2:   8027812
3:  15356860
4:  49350248
5:  69508264
6:  75709048
7:  18625784
≥8:  26512
Moves allowed: U, D, R, L, F2, B2, Uw2, Dw2, Rw2, Lw2, Fw2, Bw2
Code:
  0:       1
1:       0
2:       3
3:      14
4:      77
5:     534
6:    1584
7:    7247
8:   45800
9:  197708
10:  695108
These are the 26512 states that cannot be reduced to the "10 moves from solved" set in 7 or fewer moves. An optimal solver (wrt the move set) was used. It's plausible that the 18-move states can be solved in fewer moves if we use stuff like M' U2 M, since this falls outside of the standard move set.

Moves allowed: U, D, R, L, F2, B2, Uw2, Dw2, Rw2, Lw2, Fw2, Bw2
Code:
 15:    112
16:   3584
17:  22176
18:    640

Upper bound: max(7+10, 18) = 18 moves.

Upper bound for reduction: 12 (phase 1) + 13 (phase 2) + 13 (phase 3) + 13 (phase A4) + 18 (phase A5) + 41 (phase 6) = 110 moves.
Upper bound for the full solve: 110 + 20 = 130 moves.