
Speed-Cubjectives: The tool that computes statistics-based speedcubing objectives

tseitsei

Member
Joined
Jan 12, 2012
Messages
1,374
Location
Tampere, Finland
WCA
2012LEHT01
Funniest results:

Event   Predicted time   Real time   How much faster the real time is
3BLD    3:42.47          36.99       83%
4BLD    13:02.79         4:15.07     67%
5BLD    24:31.21         15:00.00    39%
Pyra    13.30            5.79        56%
MBLD    3.17 cubes       9 cubes     183% more cubes than predicted (my official MBLD sucks) :D

If I counted my MBLD as 20/23, which I should AT LEAST be able to do at my next comp, I would get 436% more cubes than predicted...

But then again, those are the only events that I practise. In all other categories (except 2x2) I'm slower than my predicted times.

Interesting program...
 

kotarot

Member
Joined
Feb 9, 2015
Messages
8
Location
Tokyo, Japan
WCA
2010TERA01
As a statistician, using simple linear regression for (in most cases) data with increasing variation is really making me uncomfortable. :p Results would be much better with a GLM or even just log transformations on the variables.

I knew intuitively that simple linear regression isn't suitable for most combinations of events; I used it just because it's the simplest method I know.

So the problem is that the variables are not normally distributed? I'll try Poisson regression instead.

Anyway, I'm very glad to hear your opinion, because my background is in computer science rather than statistics.
 

kotarot

Member
Joined
Feb 9, 2015
Messages
8
Location
Tokyo, Japan
WCA
2010TERA01
Well, "correlation does not imply causation", so I don't like that the results are called "objectives".

That's true...
but the basic idea is that cubers who are fast in one event, because they can turn faster, know more algorithms, and so on, should also be fast in other events. The assumed causation rests on that expectation.

In the first place, only a few combinations of events are actually well correlated...
 

AlphaSheep

Member
Joined
Nov 11, 2014
Messages
1,083
Location
Gauteng, South Africa
WCA
2014GRAY03
Well, "correlation does not imply causation", so I don't like that the results are called "objectives".

I really like that they're called objectives because, to me at least, the name doesn't imply any direct correlation between the events. It simply says that if you average x in 3x3, then you should aim to average less than y in 4x4 to be faster than a typical cuber with your 3x3 speed.

I can't think of a better name. Anything like "expected times" or along those lines would have been a terrible choice, because it implies a stronger correlation between times, whereas there is actually only a slight one.
 

Laura O

Member
Joined
Aug 27, 2009
Messages
289
Location
Germany
WCA
2009OHRN01
That's true...
but the basic idea is that cubers who are fast in one event, because they can turn faster, know more algorithms, and so on, should also be fast in other events. The assumed causation rests on that expectation.

Sure, I know what the idea behind this is and I like those statistics, but I don't like how they are actually interpreted.

Another way to define objectives would be to look at the results of competitors at your level (e.g. within +/- 1% of your average) and then take their best result in every event. That's what I would actually call an "objective" - and a far more motivating one at that. :)
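
A rough R sketch of that idea, assuming a hypothetical data frame wca with one row per competitor and placeholder columns such as avg333, avg444 and avgPyra holding their averages in seconds (not the tool's actual data format):

Code:
# Competitors whose 3x3 average is within +/- 1% of yours
my333 <- 15.00
peers <- wca[abs(wca$avg333 - my333) / my333 <= 0.01, ]

# Best (i.e. minimum) result among those peers in each other event
objectives <- sapply(peers[, c("avg444", "avgPyra")], min, na.rm = TRUE)
objectives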
 

lerenard

Member
Joined
Sep 5, 2014
Messages
274
Location
Tennessee
It seems like basically everyone has better times than the program expects. Perhaps this is because most people practice 3x3 more than other events? Maybe you should find each competitor's 3x3 speed as a percentile of all competitors, and then use that percentile to look up the corresponding times in the other events? Idk
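
A minimal R sketch of that percentile idea, again assuming a hypothetical data frame wca with per-competitor averages in seconds in placeholder columns like avg333 and avgPyra:

Code:
# Your 3x3 average as a percentile of all competitors (lower time = faster)
my333 <- 15.00
pct <- ecdf(wca$avg333)(my333)

# The time at the same percentile in another event, e.g. pyraminx
quantile(wca$avgPyra, probs = pct, na.rm = TRUE)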
 

AlphaSheep

Member
Joined
Nov 11, 2014
Messages
1,083
Location
Gauteng, South Africa
WCA
2014GRAY03
If I enter my 3x3 time as 2.00 sec, it gives a pyraminx expected time of 9.29 sec. Why is that? Many people are sub-9 in pyra, but no one is even close to 2 sec in 3x3.

Can someone explain this?
It's just a quirk of the straight line fitted to the data, which is 0.31x + 8.67. Pyraminx is very easy, so it's easy to be, say, sub-20 even if you're a 60-second solver on 3x3, which skews the higher end downward. Then at the lower end, I guess it is skewed upwards by people who are fast at 3x3 but have never bothered getting faster at pyraminx.

What's even funnier is that some people (Feliks *cough*) should be faster one-handed than two-handed. In fact, anyone faster than 4.52 seconds on 3x3 will get a negative time for one-handed.
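
Just to make the pyraminx example concrete, here is the quoted line evaluated in R (only the two coefficients above are used; nothing else is assumed):

Code:
# Pyraminx prediction from the quoted fit: 0.31 * (3x3 time) + 8.67
predict_pyra <- function(x333) 0.31 * x333 + 8.67

predict_pyra(2.00)  # 9.29, the value quoted above
predict_pyra(0)     # 8.67, so the prediction can never drop below the intercept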
 

qqwref

Member
Joined
Dec 18, 2007
Messages
7,834
Location
a <script> tag near you
WCA
2006GOTT01
Some of these results are incredibly bad. No matter how fast you are at 3x3x3, it'll never give you results better than 1:31 BLD, 8.70 pyraminx, 24.20 square-1, 19.38 clock, 9.60 skewb, 9:36 4BLD, 20:24 5BLD, and 4.18 cubes multi. The results I listed vary from noobish to decent but not great (other results are very good, as they should be for someone who gets 0.10 second times on 3x3x3).

Straight regression lines are really not a good fit for this type of data. In fact, even a more complicated curve would be a very poor fit. Different people have different amounts of skill on different events, and that skill doesn't automatically transfer over. So a better 3x3x3 time (for instance) doesn't lead to a much better time at other events. So the given times don't really correspond to anything, especially in the outlying regions where most experienced cubers are. It would be better to use the WCA ranks, and compare (say) top 5% in 3x3x3 to top 5% in other events - that way the times on other events would require very roughly the same skill as it took to get the time you input at the top.
 

Kit Clement

Premium Member
Joined
Aug 25, 2008
Messages
1,631
Location
Aurora, IL
WCA
2008CLEM01
Here's the problem with the data -- the variance is obviously increasing as times get larger (for most relationships). This is very clear for the example of 3x3 vs. 4x4:

[Plot: 3x3 vs. 4x4 averages on the original scale, with the spread increasing as times get larger]

To reduce this, as I mentioned before, we can take logarithms of the data. The variance increase is so strong here that I actually found taking the log twice worked well.

[Plot: the same data after applying the log twice to both variables]

Conveniently, this also fixes the strong right skew of the data, and makes it much closer to normal. Still a bit off, but it's not as important as the variance.

We can now fit a linear model to the loglog data.

Code:
Call:
lm(formula = X333loglog ~ X444loglog, data = x3x4)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.41500 -0.04422  0.00383  0.04860  0.32557 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.657522   0.009407   -69.9   <2e-16 ***
X444loglog   1.126283   0.006288   179.1   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

How do we interpret these coefficients now? We can write the model as:

Predicted 3x3x3 time = exp(exp(-0.658 + 1.126 * log(log(4x4x4 time))))

As you can see, this fits the data much better:

[Plot: the back-transformed fit overlaid on the original 3x3 vs. 4x4 data]

This won't work for every pair of events (MBLD is especially hard to model) but should work much better for events that have this kind of pattern.
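
For anyone who wants to reproduce this, a rough R sketch along the same lines, assuming a data frame x3x4 with per-competitor 3x3 and 4x4 averages in seconds in columns X333 and X444 (the column names are guesses based on the output above):

Code:
# Double-log transform to stabilise the increasing variance
# (times are in seconds, so they are all above 1 and log(log(.)) is defined)
x3x4$X333loglog <- log(log(x3x4$X333))
x3x4$X444loglog <- log(log(x3x4$X444))

# Linear model on the transformed scale, as in the summary above
fit <- lm(X333loglog ~ X444loglog, data = x3x4)
summary(fit)

# Back-transform a prediction to seconds, e.g. for a 45-second 4x4 average
exp(exp(predict(fit, newdata = data.frame(X444loglog = log(log(45))))))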
 

YouCubing

Member
Joined
Mar 31, 2015
Messages
2,420
Location
Roswell GA
WCA
2015JOIN01
I put in my 3x3 global average of 34.99. I had some fun looking at the average predictions, and then... 6x6. My global average is 8:16.63 on 6x6 (Yes, I am a nub. I know.) But the website predicted my global average to be 8:16.64. Gj.
 