Hard statistical question: Erik beating Rama on OH

AvGalen

Premium Member
Joined
Jul 6, 2006
Messages
6,857
Location
Rotterdam (actually Capelle aan den IJssel), the N
WCA
2006GALE01
Not entirely Puzzle Theory, but this is the best place for this question:

Given these results, is it possible to calculate the chances of Erik beating Rama?

I would like to know:
1) What are the chances of Erik's best single being better than Rama's best single during an average of 5?
2) What are the chances of Erik's (corrected) average being better than Rama's (corrected) average during an average of 5?

Bonus question: What about Mats beating them both on single :D
 
Joined
Apr 29, 2008
Messages
1,680
Location
Almelo, Holland
WCA
2008SMIT04
I have no idea. Nuff said.

I think I can get a pretty vague estimation, by taking the average of all the ranks. The number that's lower has a bigger chance. That probably will be Rama, as I am too lazy to compute it all. I think that from the ratio between those averages you should be able to give a (really) rough estimation of Erik's chances beating Rama. But seriously, there are so many other factors involved, like experience, practise, etc., so even with a more professional approach than mine, an estimation won't say anything.
 

Swoncen

Member
Joined
Jun 3, 2008
Messages
436
It's hard to tell from the given data because it is spread over several years and improvement plays a big role. If you have, e.g., 100 recent solves from each of them, you can estimate the distributions and then calculate the overlapping area on the left, which is the probability of Erik beating Rama. I know the explanation is not good without a picture, so I made one:

erik.gif


The cyan area should be the probability of Erik beating Rama.
 

Swoncen

Member
Joined
Jun 3, 2008
Messages
436
Sorry for being stupid. You have to integrate over the time (x-axis), multiplying the distributions inside the integral. My first assumption covered only ONE case. You need all cases, from the far left of Erik's distribution to the far right of Rama's distribution, which is why you use the integral. That's it!

@trying: What is your computation? Taking the averages and then giving an estimate? Based on what? If you only have the averages, you have two numbers and one is better, so you'd get 100% for one and 0% for the other. At the very least you have to include the deviations.
 
Joined
Apr 29, 2008
Messages
1,680
Location
Almelo, Holland
WCA
2008SMIT04
As I said, I have no experience in this, I am not good in chance computations.
you should be able to give a (really) rough estimation of Erik's chances beating Rama.
Well, say Rama gets a 20 average and Erik gets a 40, you might say that Erik has a 40/(40+20)*100 = 66.7% chance of losing to Rama?

Again, I have no idea. I'm just randomly trying stuff.
 

Swoncen

Member
Joined
Jun 3, 2008
Messages
436
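\int_{-\infty}^{+\infty} D_1(x) \, D_2(x) \, dx

D1 and D2 are the distributions: for the result to be a probability, D1(x) has to be the density of one solver's time and D2(x) the cumulative probability that the other solver is slower than x. Now it should be clear for all of us.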
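
A minimal numerical sketch of that integral, assuming (hypothetically) that both solvers' single times are normally distributed. The means and deviations below are made-up illustration values, not data from this thread, and `prob_first_faster` is just a name chosen for the example. It multiplies the density of the first solver's time by the probability that the second solver is slower, and sums over a fine grid:

```python
from statistics import NormalDist

def prob_first_faster(d1: NormalDist, d2: NormalDist, steps: int = 20000) -> float:
    """Approximate the integral of pdf_1(x) * P(solver 2 is slower than x) dx."""
    # Integrate over +/- 8 standard deviations around solver 1's mean;
    # the density outside that window is negligible.
    lo = d1.mean - 8 * d1.stdev
    hi = d1.mean + 8 * d1.stdev
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx  # midpoint rule
        total += d1.pdf(x) * (1.0 - d2.cdf(x)) * dx
    return total

# Hypothetical solvers: (mean, standard deviation) in seconds.
solver_a = NormalDist(20.0, 1.0)
solver_b = NormalDist(21.0, 1.5)
p_a_wins = prob_first_faster(solver_a, solver_b)
```

Under the normal assumption the same number can be cross-checked against the closed form Phi((mean_b - mean_a) / sqrt(sd_a^2 + sd_b^2)); the numerical version is only useful because it also works for distributions that are not normal.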
 

Swoncen

Member
Joined
Jun 3, 2008
Messages
436
As I said, I have no experience in this, I am not good in chance computations.
you should be able to give a (really) rough estimation of Erik's chances beating Rama.
Well, say Rama gets a 20 average and Erik gets a 40, you might say that Erik has a 40/(40+20)*100 = 66.7% chance of losing to Rama?

Again, I have no idea. I'm just randomly trying stuff.

Nonononono. You don't include the deviation, and your calculation is wrong! Let's say one solver has an average of 10 seconds and has never had a time over 11 seconds. The other one has an average of 20 and has never had a time below 19. By your computation, the second one would have a 20/30 * 100 = 66.7% chance of losing to the better one, so still a 33.3% chance of beating him? Think about it *g*. I don't blame you, but don't say yours is more accurate :)
 

Swoncen

Member
Joined
Jun 3, 2008
Messages
436
Then just take the times and integrate as in my previous post.. it should work. You just have to compute the distribution first.
 

Swoncen

Member
Joined
Jun 3, 2008
Messages
436
OK, since we don't have such distributions, each result is instead a point probability (probability mass; "Punktwahrscheinlichkeit" in German) of 1/n, so we have to do that computation for discrete values. An example:

Distribution x: 1.1, 1.2, 1.3, 1.4
Distribution y: 1.34, 1.35, 1.36, 1.5

The chance of x being worse than y (i.e. y is better than x) is

P(x=1.1 and y < 1.1) + P(x=1.2 and y < 1.2) + P(x=1.3 and y < 1.3) + P(x=1.4 and y < 1.4)

which is:

1/4 * 0 + 1/4 * 0 + 1/4 * 0 + 1/4 * 3/4 = 3/16 = 0.1875 = 18.75%
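
The same sum can be checked with a few lines of Python. This is only a sketch of the discrete computation described above; the function name `prob_y_beats_x` is made up for the example, and exact fractions are used so the result is not blurred by rounding:

```python
from fractions import Fraction

def prob_y_beats_x(x_times, y_times):
    """P(y < x) when every recorded time has point probability 1/n."""
    total = Fraction(0)
    for x in x_times:
        p_x = Fraction(1, len(x_times))            # probability of this x value
        faster = sum(1 for y in y_times if y < x)  # y values that beat it
        total += p_x * Fraction(faster, len(y_times))
    return total

x_times = [1.1, 1.2, 1.3, 1.4]
y_times = [1.34, 1.35, 1.36, 1.5]
result = prob_y_beats_x(x_times, y_times)
print(result, float(result))  # 3/16 0.1875
```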
 

TomZ

Member
Joined
Dec 30, 2007
Messages
294
Rama
Average: 19.865 - SD: 1.281

Erik
Average: 20.813 - SD: 0.785

Using a recursive computer program to simulate each of them getting three times in the range 15-23 (in steps of 1), I found that:

The cases where Rama wins have a combined probability of 0.853. The cases where Erik wins have a combined probability of 0.142.

Accounting for the 0.5% of cases my search did not cover, Erik has a 14.2% chance to win his next mean-of-3 match against Rama. I think this probability would be the same for an average-of-5, as you do not need to account for their best and worst times.

Mats:
His chances are pretty slim. Looking only at his most recent competition, we find:
Average: 27.462 - SD: 3.078

Using a similar technique (this time in the range 10-35) we find that Mats has a 1.2% chance of beating Erik and a 0.7% chance of beating Rama. The chance of beating them both at the same time is less than 1%%.
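
TomZ's program is not shown, so the following is only a guess at that kind of enumeration, not his actual code. It assumes each single solve is normally distributed with the mean and SD quoted above, rounds times to whole seconds in the range 15-23, enumerates every combination of three solves per person, and compares the sums (which is the same as comparing the means of 3). The helper names are hypothetical:

```python
from itertools import product
from statistics import NormalDist

def grid_probs(dist, lo, hi):
    """Probability of each whole-second time lo..hi (time rounded to the nearest second)."""
    return {t: dist.cdf(t + 0.5) - dist.cdf(t - 0.5) for t in range(lo, hi + 1)}

def sum_of_three(probs):
    """Distribution of the sum of three independent solves on the grid."""
    sums = {}
    for a, b, c in product(probs, repeat=3):
        s = a + b + c
        sums[s] = sums.get(s, 0.0) + probs[a] * probs[b] * probs[c]
    return sums

rama = sum_of_three(grid_probs(NormalDist(19.865, 1.281), 15, 23))
erik = sum_of_three(grid_probs(NormalDist(20.813, 0.785), 15, 23))

rama_wins = sum(pr * pe for r, pr in rama.items() for e, pe in erik.items() if r < e)
erik_wins = sum(pr * pe for r, pr in rama.items() for e, pe in erik.items() if e < r)
ties = sum(pr * erik.get(r, 0.0) for r, pr in rama.items())
uncovered = 1.0 - rama_wins - erik_wins - ties  # cases outside the 15-23 range
```

On a grid this coarse, equal sums are not rare, so ties get their own bucket here; a finer step would shrink that bucket. The exact figures depend on the step size and on how ties are handled, so they should not be expected to reproduce the numbers above digit for digit.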
 
Joined
Apr 29, 2008
Messages
1,680
Location
Almelo, Holland
WCA
2008SMIT04
As I said, I have no experience in this, I am not good in chance computations.
you should be able to give a (really) rough estimation of Erik's chances beating Rama.
Well, say Rama gets a 20 average and Erik gets a 40, you might say that Erik has a 40/(40+20)*100 = 66.7% chance of losing to Rama?

Again, I have no idea. I'm just randomly trying stuff.

nonononono. You don't include the deviation and your calculation is wrong! Let's say one has an average of 10 seconds and never had a time over 11 seconds. The other one has an average of 20 and never had a time below 19. In your computation the second one would have a chance of 20/30 * 100 = 66,7% of beating the better one? Think about it *g*. I don't blame you, but don't say your's is more accurate :)
It's more accurate than a 90% chance that no one wins ^^

Also, I was talking about the average rank, not the average time.
The cases where Rama wins have a combined probability of 0.853. The cases where Erik wins have a combined probability of 0.142.
Hehe... And the last 0.005 are for Mats :p
 

TomZ

Member
Joined
Dec 30, 2007
Messages
294
The cases where Rama wins have a combined probability of 0.853. The cases where Erik wins have a combined probability of 0.142.
Hehe... And the last 0.005 are for Mats :p

:p No, the last 0.005 are for the cases where Rama or Erik either break the world record or screw up horribly. My computer didn't have the time to calculate that! (And of course the nearly infinitesimal chance of a draw)
 

Tortin

Member
Joined
Feb 26, 2009
Messages
373
Location
Toronto
WCA
2009WANG15
I'm probably missing something big here, but couldn't you get the % from the number of competitions they competed in together?

For example, if you take the stats from 5 competitions, because each has two rounds, that would make ten rounds. If Erik beat Rama once out of those ten rounds, that would make his chances of winning 10%.
 

qqwref

Member
Joined
Dec 18, 2007
Messages
7,834
Location
a <script> tag near you
WCA
2006GOTT01
Roughly, if you write down the last m of Erik's OH solves and the last n of Rama's OH solves, and k is the number of ways you can pick one of Erik's OH solves and one of Rama's slower OH solves, the chance of Erik beating Rama in a single solve is k/(n*m). Of course this is only an approximation to the real-life probability (which you could never calculate), and you'd have better results if you had more data, but it's better than assuming a normal distribution (since it might very well not be). I don't feel like writing up code to do this, but that's how you'd do it. You'd want to choose n and m so that Erik and Rama didn't improve (much) in the interval.

Tortin: That doesn't work because it's a really really small sample size. You don't get an accurate result at all.
 

Swoncen

Member
Joined
Jun 3, 2008
Messages
436
Roughly, if you write down the last m of Erik's OH solves and the last n of Rama's OH solves, and k is the number of ways you can pick one of Erik's OH solves and one of Rama's slower OH solves, the chance of Erik beating Rama in a single solve is k/(n*m).

This is only at one position of the distributions. In my example

n=4
m=4

k=0 for x=1.3

would be 0

You have to do that for each value (approximation of the integral since we have point distributions)

k=0 for x=1.2 would be 0
k=0 for x=1.1 would be 0
k=3 for x=1.4 would be 3/16

0+0+0+3/16 would be the right answer. In this case only one value gives a contribution greater than 0, but that's not always the case.
 

qqwref

Member
Joined
Dec 18, 2007
Messages
7,834
Location
a <script> tag near you
WCA
2006GOTT01
Roughly, if you write down the last m of Erik's OH solves and the last n of Rama's OH solves, and k is the number of ways you can pick one of Erik's OH solves and one of Rama's slower OH solves, the chance of Erik beating Rama in a single solve is k/(n*m).

This is only at one position of the distributions. In my example

n=4
m=4

k=0 for x=1.3

would be 0

You have to do that for each value (approximation of the integral since we have point distributions)

k=0 for x=1.2 would be 0
k=0 for x=1.1 would be 0
k=3 for x=1.4 would be 3/16

0+0+0+3/16 would be the right answer. In this case only one value gives a contribution greater than 0, but that's not always the case.

This post makes no sense... in your example n=4 and m=4, and then k=3 because there are 3 ways to choose one number from x and then one smaller number from y, so the probability is 3/16.

k/(n*m) is the chance that if you randomly pick one of Erik's solves from the set you chose, and randomly pick one of Rama's solves from the set you chose, Erik's solve will be better. So this is a pretty good approximation for the actual probability, and since we don't know the real distribution it's probably more accurate in the long run than just taking the mean and SD and assuming they're both normal. (And if you do the mean and SD bit, how do you factor in DNFs? Does it make the mean infinite?)
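
A short sketch of the k/(n*m) count. How to treat DNFs is left open above; this example assumes, as one possible convention, that a DNF is slower than any finished solve and represents it as infinity. The times are illustrative only, not real results, and the function name is made up:

```python
import math

DNF = math.inf  # assumption: a DNF counts as slower than any finished solve

def prob_first_beats_second(first, second):
    """k / (n*m): fraction of pairs in which the first solver's time is strictly faster."""
    k = sum(1 for a in first for b in second if a < b)
    return k / (len(first) * len(second))

# Illustrative times only, not real results.
solver_a = [19.8, 21.2, 20.5, DNF]
solver_b = [20.1, 19.5, 22.0, 20.9]
p = prob_first_beats_second(solver_a, solver_b)
print(p)  # 0.375, i.e. k = 6 out of 16 pairs
```

With this convention a DNF never wins a pair and a DNF against a DNF counts as a tie, so no mean or standard deviation ever has to be computed from an infinite value.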


By the way, if the distribution for one solve has mean m and standard deviation s, the distribution of the mean of 3 solves is approximately normally distributed (even if the distribution for one solve isn't) with mean m and standard deviation s/sqrt(3). I'm not sure about the distribution of a trimmed avg5 but it's probably similar.
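
As a rough illustration of the s/sqrt(3) remark, the sketch below treats each mean of 3 as normal and the two solvers as independent (both are assumptions), and plugs in the means and SDs TomZ quoted earlier in the thread. The function name is hypothetical:

```python
from math import sqrt
from statistics import NormalDist

def prob_lower_mean_of_3(mean_a, sd_a, mean_b, sd_b):
    """P(mean of 3 for A < mean of 3 for B), normal approximation, independent solves."""
    # Each mean of 3 has standard deviation sd / sqrt(3); the difference of two
    # independent normals is normal with the variances added.
    sd_diff = sqrt(sd_a ** 2 / 3 + sd_b ** 2 / 3)
    return NormalDist(mean_a - mean_b, sd_diff).cdf(0.0)

# Means and SDs as quoted by TomZ earlier in the thread.
p_erik = prob_lower_mean_of_3(20.813, 0.785, 19.865, 1.281)
```

This comes out at roughly 14%, in the same neighbourhood as TomZ's enumerated 14.2%. It says nothing about a trimmed average of 5, where the best and worst times are dropped.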
 

Swoncen

Member
Joined
Jun 3, 2008
Messages
436
This post makes no sense... in your example n=4 and m=4, and then k=3 because there are 3 ways to choose one number from x and then one smaller number from y, so the probability is 3/16.

Ah sorry, your post was not clear to me. I think I know what you mean now. Your "k" is already the sum of my k's, and my 1/16 is just your 1/(m*n). It's just another way of writing it:

k = \sum_{i=1}^{4} j(i)

j(1)=0 for x=1.1
j(2)=0 for x=1.2
j(3)=0 for x=1.3
j(4)=3 for x=1.4

Did you mean that? Then we had the same result..
 