# Most Efficient Way to Digitally Store CubeStates?

##### Member
Sort of a side note; if you wanted to store the entire 43 quintillion cube states, you could further reduce that using all the rotations (both sorts) reflections and inverses to reduce by a factor of 2^3*2*24*6~2300 unless I for my calculations wrong somewhere if you add a certain tag at the front to show rotation/reflection which would be 10 bits long I think tough it may be possible to reduce that further.

#### Herbert Kociemba

##### Member
I think that the most straightforward way is just encode EP, EO, CP, and CO separately then concatenate them all together:
EP from 0 to 12!-1, use 29 bits
EO from 0 to 2^11-1, use 11 bits
CP from 0 to 8!-1, use 16 bits
CO from 0 to 3^7-1, use 12 bits
Total of 68 bits.
....

http://www.jaapsch.net/puzzles/compindx.htm
Since the position of the last two edges is forced you can use a EP* from 0 to 12!/2 -1 by combining the index of the combination of 10 edges (Binomial(12,10) possibilities) with the 10! possible permutations of these 10 edges (Binomial(12,10)*10! = 12!/2). I used this kind of coordinate several times in my programs. This saves 1 bit.
Of course you can now build the total Index for example with

3^7(8!(2^11 EP*+EO)+CP)+CO

and you get an unique index from 0...43252003274489856000-1 for each cube, which needs 66 Bits.

#### Herbert Kociemba

##### Member
Sort of a side note; if you wanted to store the entire 43 quintillion cube states, you could further reduce that using all the rotations (both sorts) reflections and inverses to reduce by a factor of 2^3*2*24*6~2300 unless I for my calculations wrong somewhere if you add a certain tag at the front to show rotation/reflection which would be 10 bits long I think tough it may be possible to reduce that further.
This will not help. Your tag bits will eat up your savings by the reduction. It is impossible to get below Log[2,43 quintillion] bits.
Btw. the reduction factor by symmetries and inversion is only 96, not 2300.

Last edited:

#### hamfaceman

##### Member
I think you guys may be overthinking this. How about a scramble sequence?

##### Member
I think you guys may be overthinking this. How about a scramble sequence?
That has come up. It's less efficient than some of he other methods and would also mean you need to gen all of the algs foe all of the cases and store them in binary.

#### EMI

##### Member
The efficiency of your representation depends on what you want to do with it. The length of the shortest possible (binary) representation is obvious. (Btw, too bad that it is 66 (65?) bits, not 64...)
But to make calculations fast, you want to store a state as some bit string(s) in such a way, that simple bit operations made up from AND, OR, XOR etc. and some constants can represent single moves, as well as the "concatenation" of two different states maybe.
I have wondered before how those fast solving programs (Cube Explorer, ...) implement it, but I was too lazy to find out. ^^

#### RicardoRix

##### Member
That has come up. It's less efficient than some of he other methods and would also mean you need to gen all of the algs foe all of the cases and store them in binary.
That's not true. Whatever is deemed 'human readable' has not be defined, but if you encode a scramble then the 'decoder' method (and everything suggested so far would need some kind of decoder) would just give you a scramble, no need for any kind of 43q lookup.

hamfaceman maybe talking about storing the scramble sequence without any compression at all, and so then would not need any kind of decoder to be human readable, would need a few more bytes though, but even so would be an efficient solution for this scenario.
Also a coded scramble (19 different types of move (5 bits), 20 moves = 100 bits = 13 bytes) and very easily decoded, no lookup, the same is true for your BLD method, and also makes it more efficient than the 14 bytes you quoted earlier.

#### SenorJuan

##### Member
If you're describing a scramble sequence, then after the first turn (18 possibilities = 5 bits) you have less options for the subsequent move: just 5 faces & 3 rotation choices = a 4-bit number. Then a 20-turn sequence would be 5 + 19 x 4 = 81 bits long.

#### RicardoRix

##### Member
If you're describing a scramble sequence, then after the first turn (18 possibilities = 5 bits) you have less options for the subsequent move: just 5 faces & 3 rotation choices = a 4-bit number. Then a 20-turn sequence would be 5 + 19 x 4 = 81 bits long.
Fair enough, very clever. I was considering a 'no-move' for the 19th different type of move, but if you go to that extent I suppose the length of the data could deal with no-moves.

#### SenorJuan

##### Member
I guess you would choose the 'unused' 16th value to define a no-move, and hence end-of-scramble. And ditto, one of the many spare states for the first move could be used to represent no-move, ie. the cube is solved, without needing any turns.

#### rokicki

##### Member
Since the position of the last two edges is forced you can use a EP* from 0 to 12!/2 -1 by combining the index of the combination of 10 edges (Binomial(12,10) possibilities) with the 10! possible permutations of these 10 edges (Binomial(12,10)*10! = 12!/2). I used this kind of coordinate several times in my programs. This saves 1 bit.
Of course you can now build the total Index for example with

3^7(8!(2^11 EP*+EO)+CP)+CO

and you get an unique index from 0...43252003274489856000-1 for each cube, which needs 66 Bits.
I have to ask, why do we need a particularly concise representation? If you're not storing many positions, the
difference between a cubie representation (20 bytes, one per cubie, each holding a value of 0..23) and a 66-bit
index (which generally really takes 128 bits or 16 bytes due to alignment) is not a big deal.

Directly indexing positions has a certain elegance, but it's not too far from just lexicographically ordering a
working cubie representation.

The cubie representation permits operations such as moves, inversions, reorientations, and direct position
multiplication (pretty much the standard group operations). And it's pretty quick to turn into an index, or to
derive from an index.

So if the reason you want such a concise representation is to store many many positions, then you get into
ideas of how to efficiently store a finite subset of a large finite set, or conversely, how many distinct
elements you can store in a given amount of memory. While you can't do a whole lot better than 66 bits
per, you can do somewhat better, depending on the operations you need to perform and their required
relative speed.

For instance, to store 200M positions (say, known distance-20 positions) concisely, you
might just index them, sort the indices, and then break them out into about 5000 separate lists by the
leading 13 bits, reducing the bit count of each entry to about 53 bits (and paying a small amount of
overhead for the required datastructures to keep track of this information). If access speed is not important,
you can just encode the delta between adjacent entries in a big sorted list; this would probably be about 40
bits per position (using Huffman or arithmetic coding to encode the bit length of each entry).

But you have to ask yourself if saving 26 bits out of 66 or about 40% is worth all these heroics.

#### Bruce MacKenzie

##### Member
There are 4.33 x 10^19 3x3x3 cube positions. One may map each cube position onto the integers and uniquely represent a cube state in I think 65 bits. This is commonly done to index pruning tables based on cube subgroups using some kind of radix compression scheme. For example the configuration of the corner cubies may be represented by the number:

... 15 * (18 * (21 * {0 to 23} + {0 to 20}) + {0 to 17} ) + {0 to 14} ) ...

The first corner may in any of 24 states numbered 0 to 23, leaving the second corner any of 21 states, ect.

Last edited:

##### Member
There are 4.33 x 10^19 3x3x3 cube positions. One may map each cube position onto the integers and uniquely represent a cube state in I think 65 bits. This is commonly done to index pruning tables based on cube subgroups using some kind of radix compression scheme. For example the configuration of the corner cubies may be represented by the number:

... 15 * (18 * (21 * {0 to 23} + {0 to 20}) + {0 to 17} ) + {0 to 15} ) ...

The first corner may in any of 24 states numbered 0 to 23, leaving the second corner any of 21 states, ect.
This was discussed earlier in the thread but was ruled out as an answer to the problem as envisioned by the op pretty early on.

#### Bruce MacKenzie

##### Member
This was discussed earlier in the thread but was ruled out as an answer to the problem as envisioned by the op pretty early on.
The point being made is that a representation of a cube position requires a minimum of 65 bits. Any scheme requiring more than that is not as efficient in terms of space. Any scheme using less than that will not be able to represent all possible states.

Woops, I went to the calculator and found that it requires 66 bits, not 65.

Last edited:

##### Member
The point being made is that a representation of a cube position requires a minimum of 65 bits. Any scheme requiring more than that is not as efficient in terms of space. Any scheme using less than that will not be able to represent all possible states.

Woops, I went to the calculator and found that it requires 66 bits, not 65.
I agree that it is not as efficient in terms of space though the op stated he wanted a somewhat "human-readable" way of storing the infomation rather than just assigning each state a number (though again that is the most efficient way to store cubestates).

A way of getting the mapping on the cubestates is by using the devil's algorithm and assigning each state a certain distance along the cycles.

Last edited:

#### Bruce MacKenzie

##### Member
I agree that it is not as efficient in terms of space though the op stated he wanted a somewhat "human-readable" way of storing the infomation rather than just assigning each state a number (though again that is the most efficient way to store cubestates).

A way of getting the mapping on the cubestates is by using the devil's algorithm and assigning each state a certain distance along the cycles.
A cube simulation will have a "usable" representation of the cube. Commonly, the simulation will have a representation which specifies the position and orientation of the eight corner cubies and the twelve edge cubies. The simulation has routines for graphically rendering the representation to the display. It can apply face turns to the representation, multiply two representations, etc.

Anyway, if it becomes necessary to store cube positions in a minimum of space the algorithms for mapping such a representation onto the integers 0 to 43,252,003,274,489,855,999 are straightforward.

#### Bruce MacKenzie

##### Member
66-bit index (which generally really takes 128 bits or 16 bytes due to alignment) is not a big deal.
You're correct that if the configuration is saved in a structure such as below the structure uses 16 bytes.

struct configuration{ uint8 rump; uint64 radix; };

However, the data may be stored in 9 bytes if a 9 element array of uint8 is used instead. i.e

struct configuration{ uint8 bytes[9]; } foo ;

orientationIndex = permutationIndex % MAX_ORIENTATION_INDEX;

permutationIndex = 4 * (permutationIndex / MAX_ORIENTATION_INDEX) + foo.bytes[8];

#### Bruce MacKenzie

##### Member
So I could put a further constraint on the question: The Possibility of Human Recall: "Store the cubestate in such a way that a Human Being can look at the data and easily figure out the cube state".
The best human readable representation I've run across is that of Michael Reid who wrote one of the first optimal solvers 15 or 20 years ago. His represented the cube as a facelet permutation. The representation for a solved cube is:

UF UR UB UL DF DR DB DL FR FL BR BL UFR URB UBL ULF DRF DFL DLB DBR

and an example of a scrambled cube would be:

DR FL FD UB DL LB UL BD BR RF UR UF LFU BDL RFD URB LDF DBR LUB FRU

This specifies that the down_right cubie is in the up_front slot with the down facelet in the up position. The front_left cubie is in the up_right slot with the front facelet in the up position, and so forth. Stripped of spaces this rep uses 48 bytes.

#### CuberFles

##### Member
But 48 bytes is like 384 bits, not at all efficient :?

#### Bruce MacKenzie

##### Member
But 48 bytes is like 384 bits, not at all efficient :?
Not efficient but it is human readable. Also, it may be directly used to represent the cube as a facelet permutation. One may then use permutation multiplication to model cube manipulations.

...........................1..............2.............3.............4
Index.........12 34 56 78 90 12 34 56 78 90 12 34 567 890 123 456 789 012 345 678
Slot..........UF UR UB UL DF DR DB DL FR FL BR BL UFR URB UBL ULF DRF DFL DLB DBR
Configuration.DR FL FD UB DL LB UL BD BR RF UR UF LFU BDL RFD URB LDF DBR LUB FRU

Facelet Permutation (GAP Canonical Cycle Notation):
(1,11,24,2,12,23)(3,19,18,22,4,20,17,21)(5,10,16,13,7)(6,9,15,14,8)(25,35,29,43,33,37,42,48)(26,36,30,44,31,38,40,46)(27,34,28,45,32,39,41,47)