EGA (Estimated Global Average) - A new metric for assessing competitors' strength

KAINOS

Member
I've been thinking about this idea for quite a while now (about 2 years) but never got around to publishing it, mainly because I don't really know how to code and couldn't produce results without a lot of manual work. But I finally decided to post this thread anyway ¯\_(ツ)_/¯ So here it goes!

How does this algorithm/system work?

Basically, you take every official solve the competitor has done in the event, including DNFs. Then you give each solve a weight according to how old it is, using exponential decay. (I set the decay rate to 0.1x per year in the examples below, which means a solve done today is 10 times more relevant than a solve from a year ago, 10^0.5=3.16 times more relevant than one from 6 months ago, and so on.) From these we calculate the weighted average as if all the solves were done in a single huge round (i.e. the average of the middle 60% for sighted events and of the top 1/3 for blind events).
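To make the weighting concrete, here is a minimal Python sketch of the decay step only. The names `DECAY_PER_YEAR` and `solve_weight` are made up for illustration; the original calculation was done with Excel macros, not with this code.

```python
# Hypothetical helper names; only the 0.1x-per-year factor comes from the post.
DECAY_PER_YEAR = 0.1


def solve_weight(years_ago, decay=DECAY_PER_YEAR):
    """Relative weight of a solve done `years_ago` years before today."""
    return decay ** years_ago


# Print the weight of a solve at a few different ages.
for label, age in [("today", 0.0), ("6 months ago", 0.5),
                   ("1 year ago", 1.0), ("2 years ago", 2.0)]:
    print(f"{label:>13}: weight {solve_weight(age):.4f}")
```

The ratio `solve_weight(0) / solve_weight(0.5)` comes out to 10^0.5, which is the "3.16 times more relevant" figure quoted above.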

Examples

Here are the lists of the top 20 3x3 and 3BLD cubers according to the EGA algorithm I explained above. (These were actually calculated a while ago (in September and July respectively), but I think they should be enough to show how it works. Also, I don't want to do this again because I had to copy and paste a LOT of data into Excel and run macros by hand :/ )

3x3

Max Park 6.538

Feliks Zemdegs 6.786

Seung Hyuk Nahm (남승혁) 6.980

Bill Wang 7.044

Lucas Etter 7.097

Philipp Weyer 7.112

Patrick Ponce 7.168

Sebastian Weyer 7.367

Leo Borromeo 7.401

Sean Patrick Villanueva 7.433

Mats Valk 7.434

Tymon Kolasiński 7.470

Dylan Miller 7.661

Rami Sbahi 7.684

Martin Vædele Egdal 7.690

Danny SungIn Park 7.716

Zibo Xu (徐子博) 7.774

Max Siauw 7.810

Kai-Wen Wang (王楷文) 7.836

3BLD

Max Hilliard 19.272

Jake Klassen 19.572

Jack Cai 19.736

Tommy Cherry 20.337

Stanley Chapel 20.598

Daniel Lin 21.672

Liam Chen 21.848

Jeff Park 22.006

Manuel Gutman 22.104

Berta García Parra 22.221

Kaijun Lin (林恺俊) 22.340

Grigorii Alekseev 22.900

Jens Haber 23.020

Gianfranco Huanqui 23.070

Yichuan Xie (谢逸川) 23.239

Heejun Kim (김희준) 23.364

Sebastiano Tronto 23.374

Martin Vædele Egdal 23.572

Arthur Garcin 23.780

Ádám Barta 23.803

Pros and Cons

The biggest advantage is that it is a lot less luck-dependent than the traditional ranking methods based on a single solve or a single round. This will let us really know who is faster in the quicker events (2x2, Pyra, Skewb, etc.).

One problem I couldn't solve is the delay in reflecting growth in speed. For example, in the 3x3 list above, Tymon Kolasiński has a 3x3 EGA of 7.470. Yet I'm pretty sure he was (in September) in the sub-7 range, or at least very close to it. This is because he had been slower than that for a while and had improved by quite a bit within only half a year or so, so his older, slower solves still carry a lot of weight. Maybe lowering the decay factor (so that old solves lose weight faster) would fix it?
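As a rough illustration of how the decay factor controls this lag, the sketch below estimates what share of the total weight comes from solves done in the last 6 months. It is a toy model, not real WCA data: it assumes evenly spaced solves (100 per year over 10 years), and `recent_weight_share` and its parameters are hypothetical names.

```python
def recent_weight_share(decay, window_years, solves_per_year=100, history_years=10):
    """Fraction of the total weight carried by solves newer than `window_years`.

    Toy assumption: solves are evenly spaced over `history_years` years.
    """
    n = int(solves_per_year * history_years)
    ages = [(i + 0.5) / solves_per_year for i in range(n)]  # age of each solve in years
    weights = [decay ** age for age in ages]
    recent = sum(w for age, w in zip(ages, weights) if age < window_years)
    return recent / sum(weights)


# Compare a gentler factor, the 0.1 used in this thread, and a steeper one.
for decay in (0.5, 0.1, 0.01):
    share = recent_weight_share(decay, window_years=0.5)
    print(f"decay {decay}: last 6 months carry {share:.1%} of the weight")
```

Under these assumptions the share works out to roughly 1 - decay^0.5, so with a factor of 0.1 nearly a third of the weight still comes from solves older than 6 months, while a factor of 0.01 would cut that to about a tenth.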

My original plan was to make a website that displays the EGAs of top competitors in every event and is updated regularly, but I haven't learned how to. I'm hoping to figure out that kind of stuff and work on the webpage.

Let me know if you have any questions about it!

Iwannaganx

Member
Sounds cool. I don't know computer stuff so I can't help with the website. I was just wondering whether some competitors could gain an advantage by going to more or fewer comps, and I couldn't think of a way, so well done!

casi

Member
Do you have an equation that represents this so I can try coding it?

GuRoux

Member
something like this?:

SUM[ ( 0.1 )^( t_i ) * T_i ] / SUM[ ( 0.1 )^( t_i ) ] ,
T_i = the time for the solve
t_i = the number of years since the solve has been done
SUM['expression'] = the sum of 'expression' for every solve that is in the middle 60%.

KAINOS

Member
Pretty much correct, but there's one thing I forgot to mention in the first post: when I said 'the middle 60% of the solves' I meant 'the solves in the middle that account for 60% of the total weight.' I know that's pretty confusing (mostly because of my terrible English), so let me give an example.

Let's say someone has done only 10 solves in an event, and the weights are 1.0, 0.7, 0.8, 0.5, 0.4, 0.6, 0.3, 0.4, 0.2, 0.1 respectively, listed from fastest to slowest. Now you want to exclude the fastest solves that make up 20% of the total weight, and likewise the slowest solves that make up 20%. In this case the total is 5.0, so the fastest solve (weight=1.0) and the 4 slowest solves (0.3+0.4+0.2+0.1=1.0) aren't included in the final calculation, leaving the 5 solves in between (0.7+0.8+0.5+0.4+0.6=3.0, i.e. 60% of the total). This is different from merely using the 6 solves in the middle. The same goes for the BLD events.
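The weight-based trimming in this example could be sketched in Python roughly as follows. This is only one reading of the description, not the macro from the .xlsm files: the function name, the made-up solve times (10 to 19 seconds) and the handling of a solve that straddles a cut-off (it is counted with only the part of its weight that falls inside the window) are all assumptions. DNFs are represented as `math.inf`, so they sort as the slowest solves.

```python
import math


def weighted_trimmed_mean(solves, lower=0.2, upper=0.8):
    """Weighted mean of the solves whose cumulative weight lies in [lower, upper].

    `solves` is a list of (time, weight) pairs; a DNF is math.inf.
    Assumption: a solve straddling a cut-off contributes only the part
    of its weight that lies inside the window.
    """
    if not solves:
        raise ValueError("need at least one solve")
    ordered = sorted(solves, key=lambda s: s[0])  # fastest first
    total = sum(w for _, w in ordered)
    lo, hi = lower * total, upper * total
    eps = 1e-9 * total  # tolerance for floating-point rounding at the cut-offs
    num = den = cum = 0.0
    for time, weight in ordered:
        start, end = cum, cum + weight
        cum = end
        overlap = min(end, hi) - max(start, lo)  # weight inside the window
        if overlap > eps:
            num += overlap * time
            den += overlap
    return num / den


# Weights from the example above; the times are invented for illustration.
weights = [1.0, 0.7, 0.8, 0.5, 0.4, 0.6, 0.3, 0.4, 0.2, 0.1]
times = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, math.inf]
ega = weighted_trimmed_mean(list(zip(times, weights)))
print(round(ega, 3))  # 12.8: only the solves at 11-15 s are counted
```

For a blind event, the same function could be called with `lower=0.0, upper=1/3` to keep only the fastest third of the weight.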

Also, I'm sharing the .xlsm files I made! You simply copy and paste the official results from the WCA website into the 'Data' sheet, run the macro named 'calculation', and the result will pop up a few seconds later. (You'll need to download the files and run them in Excel because the macros don't work in Google Sheets.)