Welcome to the Speedsolving.com, home of the web's largest puzzle community! You are currently viewing our forum as a guest which gives you limited access to join discussions and access our other features.

Thread closed:
This thread has been closed as requested by unsolved
---

For those who are interested in seeing if they have the correct hardware to run my Windows 7 64-bit brute force 4x4x4 solver console application, here is the link to download and test it:

I named the program Omnia Obtorquebantur 4x4x4 (Latin for All (sides) will be turned). A logical nickname for it is OO 4x4x4 (since I may make a 5x5x5 solver if this proves successful in terms of speed/performance). I pronounce this nickname as "Oh Oh 4 by 4 by 4."

The newest version of OO_4x4x4 has the following features:

1. Up to 8-Turns-From-Solved databases for centers-only needed to be solved
2. Up to 6-Turns-From-Solved databases for every cube position.
3. When a sequence of moves leads to a shorter solve, then only moves of that length or shorter will be printed out subsequently (see the image above).
4. Theoretical Minimum Search Tree is now being generated, a fairly large improvement over prior versions.
5. All solves are logged to a "solved_cubes.txt" file, with an ASCII drawing of the starting cube along with the moves of every solution.

Several people I would like to thank already for their input. The following have already shared information that made this program much, much faster:

This is just the first iteration, to make sure things are working properly. Optimization and speed comes later. It's a brute-force solver so it was designed for searching for algs mostly, not solving completely scrambled cubes from the start. If there is a solution within its search horizon, however, it will find it. A good way to test to see if an alg is the fewest number of moves.

Edit:
I just benchmarked the speed of the program tonight on my 3.1 GHz Intel E5-2687W. It averages 92,000,000 turns/second. The problem is, the branching factor is so large.

1st level = 36 moves (I use U,U'U2,u,u' and u2 moves for the Up side, and similar moves for the other 5 sides).
2nd level = 1188 (only 33 moves are needed for each level after the first, since there is no need to generate U' if the move at level one is U, otherwise you just undo it. No need for U or U2 either after U move is made).
3rd level = 39,204
4th level = 1,293,732
5th level = 42,693,156
6th level = 1,408,874,148
7th level = 46,492,846,884
8th level = 1,534,263,947,172 = 4 hours 36 minutes to complete.

Does anyone know how fast any other 4x4x4 solvers are? I probably have to implement a hash table to prune similar positions created from other move orders.

1st level = 36 moves (I use U,U'U2,u,u' and u2 moves for the Up side, and similar moves for the other 5 sides).
2nd level = 1188 (only 33 moves are needed for each level after the first, since there is no need to generate U' if the move at level one is U, otherwise you just undo it. No need for U or U2 either after U move is made).
3rd level = 39,204
...

1. You don't need 36 different moves for 4x4. The l,d,b wide/slice turns aren't necessary, just u,r,f are sufficient.
2. Your count is including useless things like U D u D' U' u'.

It's cool that you started this project, but I hope you realize you have a long way to go (in terms of understanding the theory, and in terms of improving the program itself) to catch up with existing tools. The 8-move alg you gave (U b2 U' f2 U b2 U' f2) is a commutator that should be obvious to an experienced solver. Even your upper limit of 10 moves is way too short to solve the kind of situations people would want algs for, like OLL parity, most last layer algs, or situations that normally require two commutators. Well, at least now you are using more standard notation rather than the T- stuff.

1. You don't need 36 different moves for 4x4. The l,d,b wide/slice turns aren't necessary, just u,r,f are sufficient.
2. Your count is including useless things like U D u D' U' u'.

I gave this some thought, so perhaps you can weigh in on this subject and see if my reasoning makes sense.

I set up the move generator to generate all U,F, and R moves first, but in triplets.

Move 1 = U
Move 2 = U'
Move 3 = U2

Move 4 = R
Move 5 = R'
Move 6 = R2

.....

Move 31 = b
Move 32 = b'
Move 33 = b2

Move 34 = l
Move 35 = l'
Move 36 = l2

My logic was this:

For every one of those 36 possible moves, I only have to generate 33 moves from now on.

Why?

If I make U on the original position of the cube, at the "next level" of search, I can omit U, U', and U2. That's because U @ level 2 after U @ level 1 = just U2 @ level 1. U' @ level 2 after U @ level 1 = no net move made at all, and it just a waste of a move. Likewise, U2 @ level 2 after U @ level 1 = the same as U' at level 1, so, again, a duplicate position is generated, and the search tree is bloated.

If I make R on the original position of the cube, at the "next level" of search, I can omit R, R', and R2 (same reasons).

So my search would look like this (simplified for means of clearer explanation):

Every "clump of 3" in the lower loop can be skipped without effecting the search, and trimming the tree automatically. This perpetuates on down the loops.

If I generated only U and U' at each level, I would generate the U2 position twice on each and every even level of the search.

Level 1 = U
Level 2 = U

That is the same as just Level 1 = U2, and I saved myself an additional depth of search.

Level 1 = U'
Level 2 = U'

That duplicates the U2 position, took 1 additional level to reach, and is superfluous.

I think it is more important to create positions early in the tree, rather than deal with generating duplicate branches that bloat the tree and result in having to examine more nodes to get to the same position.

Yes, indeed. But it is a start and it can only improve from here. You have to understand, I am programming on a computer that has a total of 16 CPU cores (Dual Intel E5-2687W with 8 CPU cores per chip) and I am running this in single-core mode. Once I get it to run as a parallel search, the speedup in nodes/second is guaranteed to be linear (16x faster or 1.6 billion turns/second) but the time-to-depth will not be as fortunate (branching factor of 33 means I will be lucky to cut search times in half to each given depth with 16 cores running).

Adding a hash table to prune redundant positions from the tree will be a big help. Changing the move generator over to a bitboard implementation will be about a 4x speed increase. I compiled with optimized 64-bit code and saw the speed more than double, for example, and this is without even using 64-bit data structures. The 16 squares per side of the 4x4x4 cube make it perfect to represent 2 squares per byte, using a nice native 8-byte = 64-bit variable per side of the cube.

I expect, with every programming trick in the book added to the code, then turning on hyperthreading to turn my machine into 32 virtual cores, I should be able to complete 18-move brute force searches on the 4x4x4 within a reasonable amount of time.

Dude! You just gave me a brilliant idea! I know how to prune away every such position without having to call the move generator!!!

And, the beauty of it is, the mathematical construct is so similar to a "3-cycle" it is not funny!

I need to multiply numbers together for each triplet of moves in such a way, that when the same position is created, the answer = 1.

For U I will use 2.0
For U' I will use 0.5
For U2 I will use -1.0

So say I have the move sequence U U' ..... some other moves .... U ..... some other moves.... U2 ...... some other moves .... U' ....... some other moves ...... U2.

(2.0)(.5)(2.0)(-1.0)(0.5)(-1.0) = 1.0

That means all of the U moves are "back to normal."

I do the same thing for D moves, but I use 0 for the other moves to make this invalid.

That way, if the product = 1.0, I know that parallel, non-intersecting faces were just slid back and forth without changing the parent position!

What if you have somethink like U D U. This position can also reach by U2 D2. Your product would be 2*2=4. So you would search twice the same position.

An idea would be, that you order the moves on an axis: U,U',U2 > u,u',u2 > d,d',d2 > D,D',D2. This should mean, if you last move was U,U',U2 you don't allow U,U',U2 for the next. If your last move was u,u',u2 than you don't allow U,U',U2,u,u',u2. If your last move was d,d',d2 you don't allow U,U',U2,u,u',u2,d,d',d2 and if your last move was D,D',D2 you don't allow any of these moves.
In this way you won't look for d U2 d u, but you would look for U2 u d2, which is the same.
This also solves the U2 D = D U2 problem.

Dude! You just gave me a brilliant idea! I know how to prune away every such position without having to call the move generator!!!

And, the beauty of it is, the mathematical construct is so similar to a "3-cycle" it is not funny!

I need to multiply numbers together for each triplet of moves in such a way, that when the same position is created, the answer = 1.

For U I will use 2.0
For U' I will use 0.5
For U2 I will use -1.0

So say I have the move sequence U U' ..... some other moves .... U ..... some other moves.... U2 ...... some other moves .... U' ....... some other moves ...... U2.

(2.0)(.5)(2.0)(-1.0)(0.5)(-1.0) = 1.0

That means all of the U moves are "back to normal."

I do the same thing for D moves, but I use 0 for the other moves to make this invalid.

That way, if the product = 1.0, I know that parallel, non-intersecting faces were just slid back and forth without changing the parent position!

WTF. Not only will this not work (U U U U = 16.0?) but floating point multiplication is one of the slowest possible commands for a computer. If you are going to do this kind of thing (and I don't recommend it, do what Jakube mentioned), just use an integer counter where U=+1 U'=-1 U2=+2, and check its value modulo 4 if you want to know if the U layer is back to normal.

WTF. Not only will this not work (U U U U = 16.0?) but floating point multiplication is one of the slowest possible commands for a computer. If you are going to do this kind of thing (and I don't recommend it, do what Jakube mentioned), just use an integer counter where U=+1 U'=-1 U2=+2, and check its value modulo 4 if you want to know if the U layer is back to normal.

I never had to worry about the case of U U U ... since my move generator skips moves in triplets on the subsequent depth of search.

I generate U, U', and U2 as a triplet, and in the next level of search, F, F, and F2 are next. U U never occurs, neither does U U' nor U U2 nor U' U2 and all other permutations possible with the other faces and slices.

I finally settled on:

signed short GLOBAL_TOP_FACE_PLUS_MINUS_TRACKER[36] = {-1,-1,1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0};
signed short GLOBAL_TOP_FACE_DOUBLE_SPIN_TRACKER[36] = {1,1,-1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0};
signed short global_top_face_plus_minus_unchanged_tracker;
signed short global_top_face_double_spin_unchanged_tracker;

signed short GLOBAL_TOP_SLICE_PLUS_MINUS_TRACKER[36] = {1,1,1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0, -1,-1,1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0};
signed short GLOBAL_TOP_SLICE_DOUBLE_SPIN_TRACKER[36] = {1,1,1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0, 1,1,-1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0};
signed short global_top_slice_plus_minus_unchanged_tracker;
signed short global_top_slice_double_spin_unchanged_tracker;

signed short GLOBAL_BOTTOM_FACE_PLUS_MINUS_TRACKER[36] = {1,1,1, 0,0,0, 0,0,0, -1,-1,1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0};
signed short GLOBAL_BOTTOM_FACE_DOUBLE_SPIN_TRACKER[36] = {1,1,1, 0,0,0, 0,0,0, 1,1,-1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0};
signed short global_bottom_face_plus_minus_unchanged_tracker;
signed short global_bottom_face_double_spin_unchanged_tracker;

signed short GLOBAL_BOTTOM_SLICE_PLUS_MINUS_TRACKER[36] = {1,1,1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0, -1,-1,1, 0,0,0, 0,0,0};
signed short GLOBAL_BOTTOM_SLICE_DOUBLE_SPIN_TRACKER[36] = {1,1,1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0, 1,1,1, 0,0,0, 0,0,0, 1,1,-1, 0,0,0, 0,0,0};
signed short global_bottom_slice_plus_minus_unchanged_tracker;
signed short global_bottom_slice_double_spin_unchanged_tracker;

... etc., for all of the different faces and slices. That way, all I have to do is this:

unsigned short level_4_prune(unsigned short i, unsigned short j, unsigned short k, unsigned short m)
{
global_top_face_plus_minus_unchanged_tracker = GLOBAL_TOP_FACE_PLUS_MINUS_TRACKER * GLOBAL_TOP_FACE_PLUS_MINUS_TRACKER[j] * GLOBAL_TOP_FACE_PLUS_MINUS_TRACKER[k] * GLOBAL_TOP_FACE_PLUS_MINUS_TRACKER[m];
global_top_face_double_spin_unchanged_tracker = GLOBAL_TOP_FACE_DOUBLE_SPIN_TRACKER * GLOBAL_TOP_FACE_DOUBLE_SPIN_TRACKER[j] * GLOBAL_TOP_FACE_DOUBLE_SPIN_TRACKER[k] * GLOBAL_TOP_FACE_DOUBLE_SPIN_TRACKER[m];
global_bottom_face_plus_minus_unchanged_tracker = GLOBAL_BOTTOM_FACE_PLUS_MINUS_TRACKER * GLOBAL_BOTTOM_FACE_PLUS_MINUS_TRACKER[j] * GLOBAL_BOTTOM_FACE_PLUS_MINUS_TRACKER[k] * GLOBAL_BOTTOM_FACE_PLUS_MINUS_TRACKER[m];
global_bottom_face_double_spin_unchanged_tracker = GLOBAL_BOTTOM_FACE_DOUBLE_SPIN_TRACKER * GLOBAL_BOTTOM_FACE_DOUBLE_SPIN_TRACKER[j] * GLOBAL_BOTTOM_FACE_DOUBLE_SPIN_TRACKER[k] * GLOBAL_BOTTOM_FACE_DOUBLE_SPIN_TRACKER[m];

Basically, I track the faces and slices with different variables. I give a clockwise turn a score of -1, and likewise for a counterclockwise turn. I then compute the products. If the answer is -1, then I know they did not cancel out. If it is +1, then I know the starting position has occurred again. If I get a 0 value, that means in the middle of all of the U u d U' d' D u' there must have also been an F,R,B, etc., somewhere along the way, which invalidates the whole thing.

Only parallel faces and slices have the +1/-1 values, the others have zeros.

My move generator can't produce a cycle back to the original position in fewer than 4 plies. Here is the data I collected on the positions that were pruned from the tree:

The problem was, when pruning at a leaf node (the deepest spot in the tree) you get no further reductions since you are done and the search retreats one ply to start back down with a new move. Also, it took longer to evaluate a leaf node for a no-net-progress scenario than it to would have to just generate the node at the leaf. So, I removed pruning at the leaf nodes, and left pruning in for plies 4 and up. The result: It ran faster, examined slightly fewer nodes, and was an overall winner.

I don't think, you really understand, which sequences you have to filter. The idea here is, that you don't search for a sequence of moves multiple times. For instance, you don't have to try the sequence d2 U d', when you have have already checked the sequence d U (You can cancel d2 and d', because the U doesn't move any pieces of the d-Layer). I have the impression, that you try to only look for moves that cancel themselves completely, like U2 d U D2 U (U2 + U + U = nothing). At least the "Multiplication-Idea" implies this. I haven't look at your recent code, because the amount of variables and especially the length of the variable names scared me off.

At least from the statistic I can see, that you don't filter out all unnecessary sequences.
It should filter out sequences even from the 2nd level. If you tried U D, you don't have to try D U. If I count correctly, there are only 1026 different unique sequences of length 2, way less than 36*33 = 1188
The same with sequences of the 3rd level. You don't have to check F r R2, if you already checked F R2 r, or you don't have to check sequences like U2 D U', because it is the same sequence as U D, you you checked this sequence already in level 2.
I'm too lazy to calculate the exact numbers, but I'll rewrite my only own 4x4x4 solver (unfinished), so it output's the correct numbers.

Yep, those numbers are correct. Using unsolved's 36 turns, the recurrence looks like this:

Spoiler

Let a, b, c, d be the number of sequences that currently have 1, 2, 3, or 4 (respectively) turns on the same axis
a(1) = 36, b(1) = 0, c(1) = 0, d(1) = 0
a(n) = 24*(a(n-1) + b(n-1) + c(n-1) + d(n-1))
b(n) = (9/2)*a(n-1)
c(n) = (6/3)*b(n-1)
d(n) = (3/4)*c(n-1)
moves(n) = a(n) + b(n) + c(n) + d(n)

So moves(10) = 401001809545944. The growth factor is ~28.1209788, and moves(n) is roughly 1.29670744 * 28.1209788^n.

And I agree that his code and variable names are extremely long. For computers, multiplying is still slow (even if they are shorts, and even if they are small numbers). Whatever he's doing (I can't tell) there is definitely a clearer and more efficient way to code it.

I would say you could even save more. Assuming the Singmaster convention where u is single inner layer move, U u can be regarded the to be the same as D d, as the resulting states will be the same except for the orientation of the puzzle. Likewise, U is essentially the same as D d u' and two moves fewer in single-slice turn metric. Of course, this would require considering all 24 cube orientations of the canonical solved state as being solved states (which should be considered in any case).

@qqwref: Thanks for the nice recursion. I thought quite a while about a recursion, but couldn't find it.

@cuBerBruce: Such a simple idea, but it's actually quite hard to implement it correctly. But I think I figured out the three necessary rules.

if there are 2 moves on the same axis, both turn in the same direction (U is the same direction as d') and one of the moves is of <D>, then ignore the case.

if there are 3 moves on the same axis, and >=2 moves turn in the same direction, then ignore the case.

ignore all cases with 4 moves on the same axis.

Using this rules, if get the following numbers of valid sequences:

n

all
36*33^(n-1)

canceling moves
(U d U = U2 d)

cancel by rotation
(U u = d D)

0

1

1

1

1

36

36

36

2

1.188

1.026

999

3

39.204

28.836

27.288

4

1.293.732

810.891

746.562

5

42.693.156

22.803.120

20.421.648

6

1.408.874.148

641.245.896

558.627.948

As it turns out, there is not much difference. 2nd column has a grow rate of ~28.12, 3rd has ~27.35. It only has an impact for searches with >> 6 moves.

Spoiler: recursion

You can simplify the recursion:
Let a, b, c be the number of sequences that currently have 1, 2, 3 (respectively) turns on the same axis
seqCnt(n) = a(n) + b(n) + c(n)
a(1) = 36, b(1) = 0, c(1) = 0
b(2) = 3*(12*9/2-3*3) = 135
c(2) = 0
c(3) = 3*(12*9*6/6-4*7*3) = 72
a(n) = seqCnt(n-1) * 24
b(n) = seqCnt(n-2)*2*(12*9/2 - 3*3) = seqCnt(n-2)*90
c(n) = seqCnt(n-3)*2*(12*9*6/6-4*7*3) = seqCnt(n-3)*48

seqCnt(10) = 312757467379056, which is approximately 3/4 of the last implementation.
seqCnt(n) ~= 1.332886714636981*27.3543072482344^n. => Grow factor is ~ 27.3543072482344. Not much difference.

I don't think, you really understand, which sequences you have to filter. The idea here is, that you don't search for a sequence of moves multiple times.

I understand that repeated positions are the ones to filter out ideally, but that would require a hashing algorithm, and I am nowhere near ready for that yet. Down the road when I switch to a bitwise move generator, yes, absolutely, you are correct.
For now, I am just filtering out the starting position from moves that reproduce it by moving parallel faces and/or slices.

I had another idea that worked out much better than expected. I added more moves to the move generator, to treat Uu, Rr, Ff as one move. That increases the legal move count to 45 from 36, but it has some nice benefits when you couple it with another great piece of technology: pre-solved cube endings.

I generated every one of the (including superfluous positions for now) 140,026,320 cube arrangements possible from 5 turns, @ 48 bytes per cube position, and load them into a huge 6.25 GB database in RAM. When I search 6 moves or deeper, I probe this database at the leaf nodes, and look for solutions. This slows down the search tremendously, but with the new "2 moves in 1" added to the move generator, I am able to solve the dreaded single dedge flip during an overnight run.

My apologies for the notation, I was writing it before I experienced this forum and have it in my program:

[Rr]- means a face and a slice turn for the right side in the counterclockwise direction.
(tb)- means turn both the top and bottom slices in the same direction, - from the perspective of the top side. I guess this is u'd in your notation.
K is the letter I use for Back, and B is the letter I use for bottom.
T is top.
(fk)+ means turn both the front and back slices in the same direction, + from the perspective of the front slice. So this is fb' in your notation.

And one other thing that you guys will hate: My l- is in the same direction as r-, so it is really like l and not l'. I could never get used to l and r spinning in different directions. I treat everything as a move being made by a right hand, "reaching through" the cube to get to the left side. Ditto for my T and B which is your U and D. My T+ = your U, and my B+ = your D'.

edit: so that makes the solution:

R'r' u'd R D'd' R'r' u' Rr Dd R'r' D2d2 B'b' u' Bb U'u' U' l fb' Rr

I can't get your algorithm to work. Nevertheless, the speed is really impressive. Even if you prune 5 moves, you have to search up to 13 moves, to find a 18 move solution.

I can't get your algorithm to work. Nevertheless, the speed is really impressive. Even if you prune 5 moves, you have to search up to 13 moves, to find a 18 move solution.

Actually, with all of the [] attempted in pairs, there was only 9 "nominal" depths probed. Actually, my code had a bug in it, so I was lucky to hit on this by virtue of the move generator hitting on the correct move to make early on in the game tree. In some parts of the code I did not change the [36] dimension properly! It is a miracle it did not crash.

I had another crazy thought: Generate 15 "moves" per cube side, as pseudo 1- and 2-ply move combinations that could actually be 3 moves in 1 in some cases. Example:

That way, on level 2, you can eliminate all 15 of these, since they are there own "2-move" combinations, inclusive, without duplicates.

So if "i" is the move index at level 1 and "j" is the move index at level 2, then if((i/15) == (j/15)) you can eliminate all of these nodes with one simple instruction.

So, with 15x6 = 90 nodes @ level 1, you really did a 2-ply and partial 3-ply search, if you think about it.

And, it is much easier to code which moves to eliminate for later levels also.

If i = 1,2, or 3 and j = 9,10, or 11, then you have U,u,[Uu] vs. U', u', and [Uu]'. With the right intermediate turn, you can skip these as well.

I was originally quite skeptical, since it looked like you were trying to write something where you didn't really understand what you were doing, and what others have done before. But I like that you're staying determined and learning things.

I generated every one of the (including superfluous positions for now) 140,026,320 cube arrangements possible from 5 turns, @ 48 bytes per cube position, and load them into a huge 6.25 GB database in RAM. When I search 6 moves or deeper, I probe this database at the leaf nodes, and look for solutions. This slows down the search tremendously, but with the new "2 moves in 1" added to the move generator, I am able to solve the dreaded single dedge flip during an overnight run.

It seems you've discovered a classic meet-in-the-middle technique.
A good technique (that works on 3x3x3 stages) is to use the forward calculations at the same time as the backwards calculations. Each "backwards" probe can also be interpreted as a "forwards" probe (if you're working with permutations), and eventually two of them have to collide around half the search depth. This gets trickier on 4x4x4 due to centers, and maybe your solver isn't quite set up for it.

In any case, consider how hard it is to brute-force anything near 18 moves for 3x3x3. Even if you can push your solver to the mid-tens to moves, getting to 18 will require a nearly impossible push if you don't have clever techniques.
Even with clever enumeration, exponentiation base 45-ish is *huge*.

Any chance you're willing to use SiGN?
It will make some parts of your solution longer, and some shorter, but at least it's some sort of notation standard. Plus then you can link your solutions to alg.cubing.net. ;-)

I was originally quite skeptical, since it looked like you were trying to write something where you didn't really understand what you were doing, and what others have done before.

I've written chess and checkers programs and probed pre-computed databases before. It seems like it would be perfect for the cube. I need a better way to index/retrieve positions. That is what I am experimenting with now.

Right now, centers just bloat the database. Even though they are 4 "of the same" piece, my program as of yet considers them as center #1, center #2, center #3, and center #4. This means 1234 is considered different from 2143 even though they are identical looking. Once I get rid of this issue, the database sizes will be small enough to distribute with the program.

Even if you can push your solver to the mid-tens to moves, getting to 18 will require a nearly impossible push if you don't have clever techniques.
Even with clever enumeration, exponentiation base 45-ish is *huge*.

That one dedge result was a fluke and should be discarded. I re-ordered the move generator to perform pairs of moves with them first, so whenever you saw two of my [][] moves back to back, that was accomplished via one ply of nominal depth and not two plies as would have been normal. Combined with the pre-computed database, it was only a search through an exhaustive depth of 9 plies.

I have since reverted back to the 36/33 move generator, with non-leaf node pruning + superfluous 5-turn database probing.

My goal for today is to remove the superfluous positions and see if a 6-turn database would fit in RAM on my 32 GB box with 16 cores. I'd also love it if someone could translate the duplicate moves previously cited by others into an indexing function. That way, I pass in a move index (1-36) and the move history (1 to whatever depth) and it can prune that node, and all of the other branches springing forth from that node.

Any chance you're willing to use SiGN?
It will make some parts of your solution longer, and some shorter, but at least it's some sort of notation standard. Plus then you can link your solutions to alg.cubing.net. ;-)

Yes, at some point, I will learn how it is you guys communicate.

I use the notation {t/f} to refer to your M. It is the center slice from the top to the front. I call {t/f}+ a move of that center slice in the direction from t (the top) towards f (the front). So {t/f}- is a turn in the opposite direction.

So f2 B- {t/r}2 B+ f2 B- {t/r}2 B+ is the way to exchange 2 centers on a 5x5x5 cube from the top and bottom faces, which you guys would write as f2 D E2 D' f2 D E2 D' I think.

I finally implemented your awesome idea, and it is way, way, waaaaay faster than my older code! Thanks!!!

You need to get the last few moves, not just one, but I knew what you meant. This is the code (mostly comments) and it has been truncated immensely. Many of the permutations are not even shown, but you get the idea:

I believe Clément Gallet had once made an optimal solver, good enough to solve dedge flip. I also several years ago, hacked up an optimal solver of limited use (fairly limited search depth in a reasonable amount of time) and no way to enter a scramble or position to be solved. (I needed to edit/recompile the program to specify the scramble and/or what moves are allowed.)

I have also made custom solvers for optimally changing OLL parity while preserving reduction. I used both depth-first search and meet-in-the-middle breadth-first search approaches. These programs did not need to use full 4x4x4 state information. The meet-in-the-middle solver used a big hash table to store the positions. Because the two trees were symmetrically related, I only actually had to build (and store) one tree. I discussed the results in this thread: http://www.speedsolving.com/forum/showthread.php?26564-4x4x4-edge-parity-is-there-a-shorter-alg-that-doesn-t-preserve-corners