
Quantitative Method Analysis

Silky

Member
Joined
Apr 5, 2020
Messages
432
Hi, Silky here,

Wanted to make a post to start a discussion on creating a more systematic way to analyze and compare different methods. Most method debates in the past have relied on rather subjective analysis and comparison, which makes comparing methods difficult, as each one is judged on unequal grounds. Our main goal here is to use the MCC calculator to analyze the Ergo-Efficiency of different methods so as to better compare them.

The plan is to compile a list of methods and gather 100 example solves of each. We'd run each solve through the calculator and find the average movecount and average Ergo-Efficiency Metric. This would also help in analyzing the Ergo-Efficiency of each individual step across methods to try and optimize them. Long term, we'd like to use these results to update the 'Which Method Should I Use' thread with the methods with the greatest potential. For now, please fill out this survey so we can start making moves on this project.
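To make the aggregation concrete, here's a minimal sketch (Python, with made-up numbers and an assumed record format, since the MCC calculator's actual output may look different) of computing each method's average movecount and an average per-move efficiency coefficient:

```python
from collections import defaultdict

# Hypothetical solve records: (method, movecount, MCC time estimate).
# The "Ergo-Efficiency" coefficient here is assumed to be the MCC estimate
# divided by movecount, i.e. estimated time cost per move.
solves = [
    ("CFOP", 58, 42.3),
    ("CFOP", 62, 44.1),
    ("Roux", 48, 39.0),
    ("Roux", 45, 36.8),
]

def summarize(records):
    """Return {method: (avg movecount, avg per-move coefficient)}."""
    buckets = defaultdict(list)
    for method, moves, mcc in records:
        buckets[method].append((moves, mcc))
    summary = {}
    for method, rows in buckets.items():
        n = len(rows)
        avg_moves = sum(m for m, _ in rows) / n
        avg_coeff = sum(mcc / m for m, mcc in rows) / n
        summary[method] = (round(avg_moves, 2), round(avg_coeff, 3))
    return summary

print(summarize(solves))
```

With 100 solves per method, the same summary would just run over longer record lists.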

Cheers!
 

IsThatA4x4

Member
Joined
Jul 18, 2021
Messages
587
Location
UK
WCA
2022RITC01
I'd honestly say 100 solves per method is not enough, and something more like 250-500 would be better.
For example, look at your PB ao100: it could be quite a bit away from your PB ao250/500. Some methods have a lot of cases, and luck potential should also be taken into consideration; more solves would help get a better idea of that.
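On sample size: the standard error of an average shrinks only with the square root of the number of solves, so going from 100 to 400 solves halves the uncertainty rather than quartering it. A quick Python illustration with simulated movecounts (placeholder data, not real solves):

```python
import math
import random

def stderr_of_mean(samples):
    """Standard error of the sample mean: sample sd divided by sqrt(n)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return math.sqrt(var / n)

# Simulated movecounts, mean 55 and sd 5.
random.seed(1)
movecounts = [random.gauss(55, 5) for _ in range(400)]

# Quadrupling the sample size roughly halves the standard error.
for n in (100, 400):
    print(n, round(stderr_of_mean(movecounts[:n]), 2))
```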
Filled out the survey, looking forward to the results of this investigation!
 

Silky

Member
Joined
Apr 5, 2020
Messages
432
I'd honestly say 100 solves per method is not enough, and something more like 250-500 would be better.
For example, look at your PB ao100: it could be quite a bit away from your PB ao250/500. Some methods have a lot of cases, and luck potential should also be taken into consideration; more solves would help get a better idea of that.
Filled out the survey, looking forward to the results of this investigation!
I agree, but we're talking about 100 solves over 7-8 methods, not including variants. Not to mention that for some methods only 1-2 people would be contributing (aka me with SSC). 100 is just the current benchmark; longer term I'd like it to be much higher. I'd also like to incorporate fastest video execution, but that would be an even bigger task. This will take several months and will be on a rolling basis; we'll update as we go along.
 

Athefre

Member
Joined
Jul 25, 2006
Messages
1,180
You and I had a short discussion about this on Discord. This is something that I've been thinking about for several months. After I finish my current project, the plan was to talk about what I think in the development Discord server. But since a discussion has been started, I can post here.

The big idea that I have is that if we do a comprehensive analysis of many methods and their variants, it may be possible to produce close to objective results. These results can then be shared around the community, and it may effect a change in the larger community. It seems that many aren't aware of the benefits of other methods, or even the existence of some methods. How should we do it? We can't analyze based on competition results. It is illogical to say that because one method has all of the results, it must be the best. The one method with all of the results is used by almost every competitor: popularity breeds popularity, which means that method has a much greater chance of getting records.

I think we can do something similar to what I did for the Petrus vs APB vs Mehta vs Others analysis. This showed how strong APB is in so many areas and caused a little shift in the community. So for an analysis of other methods and their variants, I think we should have a list of method qualities and create a table and also a written analysis where necessary. This should be complemented by analysis of actual speedsolves, just as you also mentioned. There should be an overall analysis of all steps combined and a step by step analysis of the speedsolve reconstructions received for the project. Some methods have had very little use. Those may have to have a qualifier attached saying that the full potential isn't necessarily represented in the speedsolve reconstruction results. A few of the typical method qualities are below. I think we can go even deeper with more qualities and sub-qualities.
  • Movecount - Human movecounts. Use the speedsolve results for intuitive steps. For algorithm steps, use both the speedsolve results and the full list algorithms required for the steps.
  • Ergonomics - This is still a debate, but we can create data from the information provided by the speedsolve reconstructions. MCC is also something that should be used with the understanding that it is movecount + ergonomics and not just a measure of ergonomics.
  • Recognition - Number of steps in the recognition process, number of stickers, sticker locations (meaning the easiness or difficulty of finding them), how long it took the solver to actually recognize in the speedsolve reconstructions.
  • Lookahead - How easy is it to find the necessary pieces of the next step?
  • Advancement - Some methods have obvious advancements. This can show their future potential. These advancements can be analyzed as well to show if a method has something worth working towards vs a method which becomes a dead end by not having any good advancements.
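One way to fold qualities like these into a single comparison table is a weighted score. The weights and per-method numbers below are invented purely to show the mechanics ("MethodA"/"MethodB" are placeholders, not real methods):

```python
# Relative importance of each quality (weights sum to 1; all made up).
weights = {"movecount": 0.3, "ergonomics": 0.3, "recognition": 0.2,
           "lookahead": 0.1, "advancement": 0.1}

# Placeholder 0-10 scores per method.
methods = {
    "MethodA": {"movecount": 7, "ergonomics": 8, "recognition": 6,
                "lookahead": 7, "advancement": 5},
    "MethodB": {"movecount": 8, "ergonomics": 6, "recognition": 7,
                "lookahead": 6, "advancement": 8},
}

def overall(scores):
    """Weighted sum of a method's quality scores."""
    return sum(weights[q] * scores[q] for q in weights)

for name, scores in sorted(methods.items(), key=lambda kv: -overall(kv[1])):
    print(name, round(overall(scores), 2))
```

The written analysis would still matter: a single number hides which quality drives the difference.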
 

L1meDaBestest

Member
Joined
Oct 21, 2020
Messages
16
Location
Australia
WCA
2015HIGH01
YouTube
Visit Channel
We should also be specific about the way we solve with a certain method. For example, making sure to only use something like base CFOP: despite most CFOP users making use of other alg sets, whoever does the test must consciously avoid them. This would mean nothing like WV, VLS, ZBLL, COLL, or even edge control at LS can be done during the speedsolves. While we're at it, we may also need to place restrictions on the beginning of the solve: does a cross explicitly have to be solved before anything else? I know I, for one, often solve a pair or two before my last cross edge on certain scrambles.

Then again, we could also choose to include this, and perhaps even make notes of how many times this occurs in the average. If we did this, however, it would become a lot more subjective, as not every solver will know the same amount, and it would be less accurate, as an Ao100 would be nowhere near enough.

The same can be applied to other methods.
 

Silky

Member
Joined
Apr 5, 2020
Messages
432
We should also be specific about the way we solve with a certain method. For example, making sure to only use something like base CFOP: despite most CFOP users making use of other alg sets, whoever does the test must consciously avoid them. This would mean nothing like WV, VLS, ZBLL, COLL, or even edge control at LS can be done during the speedsolves. While we're at it, we may also need to place restrictions on the beginning of the solve: does a cross explicitly have to be solved before anything else? I know I, for one, often solve a pair or two before my last cross edge on certain scrambles.

Then again, we could also choose to include this, and perhaps even make notes of how many times this occurs in the average. If we did this, however, it would become a lot more subjective, as not every solver will know the same amount, and it would be less accurate, as an Ao100 would be nowhere near enough.

The same can be applied to other methods.
Yes, I plan not to allow option selecting; variants will be split up accordingly. We should, in general, be evaluating each method on its own merits. Sometimes things can't really be avoided (e.g. sometimes you just get an R U' R' Winter Variation case in ZZ/Petrus, which would be more of a skip than an option select). We'll have to wait and see; it's most likely going to depend on the variants included. WV is usually an ancillary algset rather than a variant. As such we'd probably ignore cases that aren't R U' R' or R U2 R'.

It will be a bit challenging because, on one hand, we don't want to punish clever solving but, on the other, we don't want optimized/FMC solutions (I'd like this to evaluate methods on average rather than on optimized potential). I'm using example solves instead of reconstructions here, as it seems to me to be the right balance between the two. Recons would be good, but they are much more dependent on the skill of the solver than the viability of the method.
 

L1meDaBestest

Member
Joined
Oct 21, 2020
Messages
16
Location
Australia
WCA
2015HIGH01
YouTube
Visit Channel
Yes, I plan not to allow option selecting; variants will be split up accordingly. We should, in general, be evaluating each method on its own merits. Sometimes things can't really be avoided (e.g. sometimes you just get an R U' R' Winter Variation case in ZZ/Petrus, which would be more of a skip than an option select). We'll have to wait and see; it's most likely going to depend on the variants included. WV is usually an ancillary algset rather than a variant. As such we'd probably ignore cases that aren't R U' R' or R U2 R'.

It will be a bit challenging because, on one hand, we don't want to punish clever solving but, on the other, we don't want optimized/FMC solutions (I'd like this to evaluate methods on average rather than on optimized potential). I'm using example solves instead of reconstructions here, as it seems to me to be the right balance between the two. Recons would be good, but they are much more dependent on the skill of the solver than the viability of the method.
Yes, that sounds good. The one caveat with example solves is that people may end up seeing things they wouldn't have seen at full speed, and hence may be more efficient than in a real solve.

This probably wouldn't be too hard to avoid by basically saying, as a rule of thumb, don't try too hard to optimise the solution and mostly just do the first or second thing you see.
 

Billabob

Member
Joined
Jul 12, 2018
Messages
136
I think you'll run into problems quickly because proficiency in a method is extremely important; even within solves using the same method by the same person there'll be some large variance. If I naïvely go through a CFOP solve step by step, the movecount would be around 60-70 moves, but in my slow solves, where I look ahead as much as possible, I average 35-45... you'll have to make some effort to define conditions that will be the same for all methods. Perhaps using a competition format?

The fact you can look ahead like this with CFOP comes from its relative simplicity, I think: for the majority of the solve you're only solving 2 pieces at a time, and there are only so many ways you can affect the other pieces. Curious if it's more or less flexible than, for example, Petrus. You can plan ahead and alter your solution in all methods, but CFOP is so much more atomised... but maybe I'm biased, as I haven't practiced these methods much.

Obviously some methods (including 100% of the meme methods that got a Wiki page in the last 5 years) will unavoidably be worse than the old guard. I'm interested to see how you'll quantify these methods. IIRC Feliks said once that if he didn't already have 10 years of experience learning CFOP he would be using Roux, but I may have confused him in my head with another speedsolver.
 

L1meDaBestest

Member
Joined
Oct 21, 2020
Messages
16
Location
Australia
WCA
2015HIGH01
YouTube
Visit Channel
I agree. There are a lot of situations and rules that must be specified before we can start doing solves. However I think it’s only a matter of determining these and once Silky or others have them set we can begin giving recons/examples/solves.
 

Swagrid

Member
Joined
Jun 5, 2018
Messages
375
Location
Downtown Swagistan
YouTube
Visit Channel
As well as all of the issues with the solves themselves, the MCC itself isn't perfect. And I can't blame it; ergonomics are complicated, and I don't think it ever will be perfect. It's good for genning a good-enough alg on the spot, but not for objective optimality. See the attached image of MCC favouring a UFEM-genned monster of an alg over many PLLs.
 

Attachments

  • unknown-54_34616583710863.png (44.4 KB)

Silky

Member
Joined
Apr 5, 2020
Messages
432
I think you'll run into problems quickly because proficiency in a method is extremely important; even within solves using the same method by the same person there'll be some large variance. If I naïvely go through a CFOP solve step by step, the movecount would be around 60-70 moves, but in my slow solves, where I look ahead as much as possible, I average 35-45... you'll have to make some effort to define conditions that will be the same for all methods. Perhaps using a competition format?

The fact you can look ahead like this with CFOP comes from its relative simplicity, I think: for the majority of the solve you're only solving 2 pieces at a time, and there are only so many ways you can affect the other pieces. Curious if it's more or less flexible than, for example, Petrus. You can plan ahead and alter your solution in all methods, but CFOP is so much more atomised... but maybe I'm biased, as I haven't practiced these methods much.

Obviously some methods (including 100% of the meme methods that got a Wiki page in the last 5 years) will unavoidably be worse than the old guard. I'm interested to see how you'll quantify these methods. IIRC Feliks said once that if he didn't already have 10 years of experience learning CFOP he would be using Roux, but I may have confused him in my head with another speedsolver.
Personally, what I've been doing is going through a solve 1-3 times and writing down the solution. A bit of optimization isn't bad (getting rid of double moves or unnecessary rotations), especially for intuitive methods, as there are so many options in how to solve something and a first pass isn't always the best. Allowing several walkthroughs of a solve helps to bridge the gap between skill levels.

I plan to have a verification process/time limit for submissions. If we see 30-40 move solutions for a CFOP solve, we may flag them. We can have others do an example solve of the same scramble and see how reasonable the solution is. One good thing to note is that the coefficient is more important than the movecount: even if we were to get a bunch of 30-move solves, we can take the coefficient of each solve and just multiply it by the projected average movecount. Also, the more solves we get, the less this will matter.
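The coefficient trick amounts to a couple of lines of arithmetic (hypothetical numbers; "coefficient" is assumed here to mean the MCC time estimate divided by the solve's movecount):

```python
# An unusually efficient 32-move example solve.
mcc_estimate = 24.0                     # hypothetical MCC time estimate
movecount = 32
coefficient = mcc_estimate / movecount  # estimated time cost per move

# Project that per-move cost onto a typical movecount for the method.
projected_avg_movecount = 55
projected_mcc = coefficient * projected_avg_movecount
print(round(projected_mcc, 2))
```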

As well as all of the issues with the solves themselves, the MCC itself isn't perfect. And I can't blame it; ergonomics are complicated, and I don't think it ever will be perfect. It's good for genning a good-enough alg on the spot, but not for objective optimality. See the attached image of MCC favouring a UFEM-genned monster of an alg over many PLLs.

Agreed. When filling out the survey I did include an option to change the multiplier settings, so if you haven't filled that out, please do. MCC is not a perfect tool of analysis, of course, but it's the best quantitative tool we have at the moment.
For things to be most accurate we'd have to adjust the multipliers per method. In J Perm's video comparing the Big 3 (which the multipliers are based off of), he penalizes ZZ for RUL regrips more than rotations in CFOP. However, depending on the method that you main, these things will matter more or less. Regrips during ZZ are bad, sure, but a good ZZ solver knows how to compensate for that where a CFOP solver may not. Similarly, S/E moves are penalized heavily, but for a blind solver or someone who uses LMCF they would be far less of a problem. Unfortunately this reintroduces the problem of subjective analysis, but overall we can just vote on multipliers for what we feel is fair. What I would propose to fix this problem is to use videos of the fastest execution of example solves. This would give us a bunch of data on how different moves/movesets compare to one another. It doesn't fix the problem altogether, but it could help.
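A sketch of what per-method multipliers could look like: the same move sequence is costed under two invented tables (all numbers are placeholders, not the calculator's real values), reflecting e.g. an LMCF solver's comfort with S/E moves:

```python
# Invented per-move cost multipliers for two hypothetical solver profiles.
multipliers = {
    "ZZ":   {"R": 1.0, "U": 1.0, "L": 1.1, "S": 1.8, "E": 1.8},
    "LMCF": {"R": 1.0, "U": 1.0, "L": 1.3, "S": 1.2, "E": 1.2},
}

def weighted_cost(alg, profile):
    """Sum the profile's multiplier for each move (ignoring ' and 2 suffixes)."""
    table = multipliers[profile]
    return sum(table[move.rstrip("'2")] for move in alg.split())

alg = "R U S E R"
print({p: weighted_cost(alg, p) for p in multipliers})
```

The same sequence comes out cheaper under the LMCF table purely because of the smaller S/E penalty, which is the point: the "fair" penalty depends on who is solving.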
 

GodCubing

Member
Joined
May 13, 2020
Messages
192
I feel the need to point out the bias of MCC and a few error tendencies it has that I'm not sure were patched.
1. MCC was based around OLL and PLL algorithms and their speeds.
2. Some aspects of MCC are not fine-tuned and are often changed depending on how the solver prefers their algorithms, such as the S/E move penalty, which is arbitrary because of the low amount of data on algs that use these moves.
3. There were some clear examples presented in the past of times when a U move before an alg actually gave it a better MCC (not sure if patched). After posting this I will go check those messages for concrete examples and check whether they are patched.
4. Solves are not fully memorized solutions, so the way the MCC does finger tricks might not be how a solver would execute them, because they do not know the whole solution yet.
5. Obviously this fails to take lookahead into account.
6. MCC is known to be very biased against MU algs. Some of you may rejoice at this; others will be saddened.


4 continued -

Example:
R S' U S R: 9.7
U' R S' U S R: 9.1
(The second alg is clearly just a U' followed by the first alg, yet its MCC is lower by 0.6)

According to Trangium (the creator of the MCC), "it's a consequence of the MCC's greedy algorithm. The MCC tries all the initial starting grips, then goes as long as possible before regripping, always choosing the finger trick that adds the least time (going from left to right). The U' forces finger tricks that are worse in the short run but better in the long run (if you know what I mean)."

The MCC sacrifices accuracy for speed: instead of testing every fingertrick combination, it chooses the best fingertrick at each point. So there are cases where the fastest fingertrick at one point forces worse fingertricks later on.
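The greedy-vs-exhaustive effect is easy to reproduce with a toy cost model (the grips, costs, and regrip penalty below are all invented; the real MCC uses its own fingertrick tables). The greedy pass refuses a regrip that never pays off immediately, while a dynamic-programming search finds the cheaper plan:

```python
from functools import lru_cache

# costs[i][grip] = time for move i executed from that grip (invented numbers).
costs = [
    {"home": 1.0, "tilted": 1.5},
    {"home": 0.9, "tilted": 0.5},
    {"home": 0.9, "tilted": 0.5},
    {"home": 0.9, "tilted": 0.5},
]
REGRIP = 0.5  # flat penalty for changing grip before a move

def greedy(start):
    """Always take the cheapest immediate option, like MCC's left-to-right pass."""
    grip, total = start, 0.0
    for step in costs:
        other = next(g for g in step if g != grip)
        if REGRIP + step[other] < step[grip]:
            grip, total = other, total + REGRIP + step[other]
        else:
            total += step[grip]
    return total

def optimal(start):
    """Exhaustive search via dynamic programming over (move index, grip)."""
    @lru_cache(maxsize=None)
    def best(i, grip):
        if i == len(costs):
            return 0.0
        other = next(g for g in costs[i] if g != grip)
        return min(costs[i][grip] + best(i + 1, grip),
                   REGRIP + costs[i][other] + best(i + 1, other))
    return best(0, start)

print(greedy("home"), optimal("home"))
```

Here the regrip costs 1.0 up front against 0.9 to stay, so the greedy pass never switches, yet switching once saves 0.7 over the whole sequence.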

example 2:
R U' R2' F R U R U' R2' F' R f R' F' R S': 16.3
S' R U' R2' F R U R U' R2' F' R f R' F' R: 17.7
U S' R U' R2' F R U R U' R2' F' R f R' F' R: 17.5

The third alg has a lower MCC by 0.2, but it's just a U followed by the second alg.

Of course these are small differences, but over the course of an entire solve these defects add up. However, a large sample of solves may lessen the degree of error from these defects.

If the difference between method ratings is very close, it should be known that there are large errors incurred by this approach. This will ultimately be better than nothing, but some will take it too seriously. I'm making this post to keep people from that. This will be at best a rough estimate.

MCC is not god.

I'm not either
 

Silky

Member
Joined
Apr 5, 2020
Messages
432
I feel the need to point out the bias of MCC and a few error tendencies it has that I'm not sure were patched.
1. MCC was based around OLL and PLL algorithms and their speeds.
2. Some aspects of MCC are not fine-tuned and are often changed depending on how the solver prefers their algorithms, such as the S/E move penalty, which is arbitrary because of the low amount of data on algs that use these moves.
3. There were some clear examples presented in the past of times when a U move before an alg actually gave it a better MCC (not sure if patched). After posting this I will go check those messages for concrete examples and check whether they are patched.
4. Solves are not fully memorized solutions, so the way the MCC does finger tricks might not be how a solver would execute them, because they do not know the whole solution yet.
5. Obviously this fails to take lookahead into account.
6. MCC is known to be very biased against MU algs. Some of you may rejoice at this; others will be saddened.


4 continued -

Example:
R S' U S R: 9.7
U' R S' U S R: 9.1
(The second alg is clearly just a U' followed by the first alg, yet its MCC is lower by 0.6)

According to Trangium (the creator of the MCC), "it's a consequence of the MCC's greedy algorithm. The MCC tries all the initial starting grips, then goes as long as possible before regripping, always choosing the finger trick that adds the least time (going from left to right). The U' forces finger tricks that are worse in the short run but better in the long run (if you know what I mean)."

The MCC sacrifices accuracy for speed: instead of testing every fingertrick combination, it chooses the best fingertrick at each point. So there are cases where the fastest fingertrick at one point forces worse fingertricks later on.

example 2:
R U' R2' F R U R U' R2' F' R f R' F' R S': 16.3
S' R U' R2' F R U R U' R2' F' R f R' F' R: 17.7
U S' R U' R2' F R U R U' R2' F' R f R' F' R: 17.5

The third alg has a lower MCC by 0.2, but it's just a U followed by the second alg.

Of course these are small differences, but over the course of an entire solve these defects add up. However, a large sample of solves may lessen the degree of error from these defects.

If the difference between method ratings is very close, it should be known that there are large errors incurred by this approach. This will ultimately be better than nothing, but some will take it too seriously. I'm making this post to keep people from that. This will be at best a rough estimate.

MCC is not god.

I'm not either
All very good points; of course MCC isn't a deity. As Athefre pointed out, there are tons of other aspects to be analyzed. At the moment this is generally a jumping-off point and an attempt to help standardize analytical approaches. A large part is just collecting a ton of raw data so we have more to go off of as approaches improve. I know @Athefre has had discussions with @trangium about adapting MCC for method analysis; it will be up to him to help improve the software (or make it open source so others can contribute?).

One way we could hopefully improve the analysis is to run each individual step through MCC and take the summation, instead of running the entire solve as a whole.

The (4) 'U' issue was fixed by the option to ignore starting/ending U moves.
As I mentioned, hopefully using execution videos can help compare the speed difference between moves/movesets. As previously mentioned, this may need to be fine-tuned per method to compensate (having more fine-grained control over per-move multipliers could help?).
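That "ignore starting/ending U moves" normalization might look something like this (a hypothetical helper, not the calculator's actual code):

```python
def strip_auf(alg):
    """Drop leading and trailing U-layer moves before scoring; inner U moves stay."""
    moves = alg.split()
    while moves and moves[0].rstrip("'2") == "U":
        moves.pop(0)
    while moves and moves[-1].rstrip("'2") == "U":
        moves.pop()
    return " ".join(moves)

print(strip_auf("U' R S' U S R"))  # the leading U' is ignored, the inner U kept
```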

I'm not entirely sure how to quantify lookahead, but I assume the best way is to analyze solve videos. Trainers could be another way to get recognition times. Obviously TPS plays a factor as well, but, for now, we probably shouldn't focus on a multi-variable analysis. One step at a time.

For now, let's get as much input via the survey as possible and then transition to next steps. This is going to be a long-term study, so we'll have to adjust as we continue.
 

Silky

Member
Joined
Apr 5, 2020
Messages
432
Also, for everyone that filled it out: I've updated the survey with more methods/variants.
 

GodCubing

Member
Joined
May 13, 2020
Messages
192
The (4) 'U' issue was fixed by the option to ignore starting/ending U moves.
However, in a solve that will not be an option, and some solves might get forced into some pretty bad scenarios.

You are correct; this is an excellent starting line. It reminds me of the data collected on fast CFOPers, and hopefully soon data on fast Rouxers. Except that data is obviously not about method comparison, and it's more precise for various reasons. This should be a nice ballpark!
 

Silky

Member
Joined
Apr 5, 2020
Messages
432
Going to be closing the survey in a few days!! Please make sure to fill out the form. I'll have fleshed-out guidelines for everyone within the next week, and hopefully a Google Doc by the end of the weekend/early next week.
 