CoLPI - collective letter-pair images database for BLD. In all languages.

Roman

Inspired by Tom's Letter Pair List, this tool is intended to collect letter-pair images used by different people in all languages.
Huge thanks to Tom Nelson for the initial set of words and Enoch Gray for exporting this collection (as well as his own!) into CoLPI.

>>> bestsiteever.ru/colpi <<<

Features:
  • Quick search for words for a given letter-pair
  • Multi-language support
  • PAO support
  • Export as table (.csv)
  • Export as Anki deck (.txt)
Screenshot from 2019-07-03 00-01-48.png

Before exporting to Anki or csv, make sure no words appear twice for different letter-pairs. How?
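
One possible way to answer the "How?" above is to group the rows by word before exporting and flag every word that shows up under more than one letter-pair. This is only a minimal sketch: the `(letter_pair, word)` row shape and the function name `find_repeated_words` are assumptions made for illustration, not CoLPI's actual schema or code.

```python
from collections import defaultdict


def find_repeated_words(rows):
    """Return {word: [letter-pairs]} for words used under two or more letter-pairs.

    `rows` is an iterable of (letter_pair, word) tuples (hypothetical shape).
    Words are compared case-insensitively after trimming whitespace.
    """
    pairs_by_word = defaultdict(set)
    for letter_pair, word in rows:
        pairs_by_word[word.strip().casefold()].add(letter_pair.upper())
    # Keep only the words that occur under more than one distinct letter-pair.
    return {
        word: sorted(pairs)
        for word, pairs in pairs_by_word.items()
        if len(pairs) > 1
    }
```

Running this on the selected rows right before export would give a list of conflicts to resolve by hand; the same word repeated under a single letter-pair is not reported.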

Deal with offensive words. Maybe add a "show offensive words" option to the user settings panel, disabled by default.

Deal with PAO type. So far users can specify/edit it but it isn't used anywhere.

Add a two-letters-per-sticker option. Most likely I will never do that, as it requires a complete DB rebuild.

Passive learning page. Same as http://bestsiteever.ru/stare but for letter-pairs, so that beginners can "passively" learn words by staring at the screen while doing dishes or pushups.
edit: thanks to xyzzy, I have started this https://docs.google.com/document/d/1Eexb0EI5473gcbc81gMj8i4ZHb1Z7uqn_tzG7Tfzw10/edit?usp=sharing

I intend to build a similar tool where users can submit their 3-cycle/flip/twist/parity algs and vote for them.
 
I intend to build a similar tool where users can submit their 3-cycle/flip/twist/parity algs and vote for them. For that, I would need advice from someone good at SQL on resolving this issue:
(I'm not good at SQL so maybe I'm talking nonsense.) I think what you're looking for is some way of canonically naming the 3-cycle cases. Rather than restricting to a fixed set of buffers, what if you consider the general problem of referring to any 3-cycle (with any buffer)? Then for edges you have six choices of a letter triplet, and for corners you have nine choices of a letter triplet; you could always pick the alphabetically earliest one as the canonical name for that 3-cycle. So for example, UF-FD-BL would have CKR, KRC, RCK, UHI, HIU, IUH as the six choices (using Speffz), and CKR is the alphabetically earliest one, so that's the canonical name. It wouldn't matter whether the user is actually using UF or DF (or FU or FD) as their buffer because the case's name doesn't depend on that.
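
The naming rule described above can be sketched for edges as follows. This is a rough illustration, not code from CoLPI: the function names are made up for the example, and the lettering table is the standard Speffz edge scheme. Corners would work the same way, with three orientations per piece instead of two, which gives the nine choices mentioned above.

```python
# Speffz edge lettering: sticker letter -> sticker position.
# The first face of the position is the face the sticker itself is on.
SPEFFZ_EDGES = {
    "A": "UB", "B": "UR", "C": "UF", "D": "UL",
    "E": "LU", "F": "LF", "G": "LD", "H": "LB",
    "I": "FU", "J": "FR", "K": "FD", "L": "FL",
    "M": "RU", "N": "RB", "O": "RD", "P": "RF",
    "Q": "BU", "R": "BL", "S": "BD", "T": "BR",
    "U": "DF", "V": "DR", "W": "DB", "X": "DL",
}
POSITION_TO_LETTER = {pos: letter for letter, pos in SPEFFZ_EDGES.items()}


def flip(letter):
    """Return the letter of the other sticker on the same edge piece."""
    return POSITION_TO_LETTER[SPEFFZ_EDGES[letter][::-1]]


def edge_cycle_names(triplet):
    """Return all six letter triplets that describe the same edge 3-cycle."""
    flipped = "".join(flip(letter) for letter in triplet)
    names = []
    for word in (triplet, flipped):
        for shift in range(3):
            # Rotating the triplet keeps the cycle order, so the case is unchanged.
            names.append(word[shift:] + word[:shift])
    return names


def canonical_edge_cycle(triplet):
    """Pick the alphabetically earliest of the six names as the canonical one."""
    return min(edge_cycle_names(triplet))
```

With this, `canonical_edge_cycle("UHI")` and `canonical_edge_cycle("KRC")` both give `"CKR"`, so a single key could identify the case in a database regardless of which buffer the submitter uses.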
 
Rather than restricting to a fixed set of buffers, what if you consider the general problem of referring to any 3-cycle (with any buffer)?
This makes it a bit hard to learn. I think a good way to do it would be to choose the buffer, and then it gives you options for the next pieces after that. It would be a lot more user friendly to be able to see all the cycles for a specific buffer.
 
Updates!
  • Once logged in with WCA, you stay logged in forever until clicking "logout".
  • Quick quiz for the least filled letter-pair. You will see the question "Which word would you use for ...?" when you open CoLPI.
  • Fixed: accent marks collation. "Sabiá" and "Sábia" are now considered different words, "AÑ" and "AN" are different letter-pairs.
  • Added website footer with some stats.
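
To illustrate the accent-mark fix in the list above: under an accent-insensitive comparison "Sabiá" and "Sábia" collapse into the same key, while an accent-sensitive comparison keeps them apart. The sketch below only demonstrates the difference in Python; how CoLPI actually implements it (most likely through a database collation setting) is not stated in the thread, and both function names are made up for the example.

```python
import unicodedata


def accent_insensitive_key(text):
    """Comparison key that ignores accents and case (the behaviour before the fix)."""
    decomposed = unicodedata.normalize("NFD", text)
    # Drop combining marks such as the acute accent or the tilde.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def accent_sensitive_key(text):
    """Comparison key that ignores case but keeps accents (the behaviour after the fix)."""
    return unicodedata.normalize("NFC", text).casefold()
```

With the first key, "AÑ" and "AN" are treated as one letter-pair; with the second they stay distinct, which matches the behaviour described in the update.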

A lot is yet to be done. Web programming turned out to be fascinating :p
 
That is exactly what I need!
And I need to also refer to multiple cycles at once (like 2e2e algs) as well as flips and twists. And I think involving Speffz is redundant. Here is my approach: Cube cycles canonical representation.

Do you see any flaws in it? Am I reinventing the wheel?
 
nice tool! why is ʧ included? It's probably clearer as ch or if you prefer one letter, č from Czech
also what does green mean? also can you show how many votes each phrase has?
 
Thanks!
- "ʧ" vs. "ch" vs. "č" is just a matter of preference. I don't think "ch" would be clearer than the IPA symbol that explicitly denotes this sound.
- Green means that an image with sufficient votes exists for this letter-pair.
- There is no practical benefit from showing how many upvotes/downvotes each word has (or is there?)
 
- There is no practical benefit from showing how many upvotes/downvotes each word has (or is there?)
maybe more popular words are better. I don't know
why only have ch tho. why not sh or th.
actually in chinese, q has a ch sound like in qiyi. so maybe you can use q for ch
 
č is worse than ch: how many people who aren't Czech know about č? Ch is used in English, so most people would be familiar with it.

IPA allows for complete clarity, assuming you know it, and /ʧ/ is a unique character. The sound th in IPA is either a theta (θ) or an eth (ð).
 
yeah but ch isn't one letter. idk if that matters
also I think č was the former IPA symbol and easier to think about than ʧ
 
There are not many contributors for Danish and Gujarati; I have to get speedcubers interested in contributing to these languages.

Also, a good feature update: a toggle between kids' mode and a mode where all types of words are allowed.

Also, the leaderboard system has changed: it used to count all-time contributions, and now it counts contributions made in the last 90 days.
colpi.PNG
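
The 90-day leaderboard mentioned above could be computed roughly like this. It is a sketch under assumptions: the `(user, submitted_at)` record shape, the function name `leaderboard`, and the `window_days` parameter are hypothetical and not taken from CoLPI.

```python
from collections import Counter
from datetime import datetime, timedelta


def leaderboard(contributions, now, window_days=90):
    """Rank users by the number of contributions made in the last `window_days` days.

    `contributions` is an iterable of (user, submitted_at) tuples (hypothetical shape),
    where `submitted_at` is a datetime. Older contributions are ignored.
    """
    cutoff = now - timedelta(days=window_days)
    counts = Counter(
        user for user, submitted_at in contributions if submitted_at >= cutoff
    )
    # most_common() returns (user, count) pairs, highest count first.
    return counts.most_common()
```

Passing a very large `window_days` would reproduce the old all-time ranking, so both variants can share one function.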
 
What should be the next language added to Colpi? (https://bestsiteever.ru/colpi/)

Languages already there:
  1. Czech
  2. Danish
  3. German
  4. English
  5. Spanish
  6. French
  7. Gujarati
  8. Hindi
  9. Hungarian
  10. Indonesian
  11. Italian
  12. Lithuanian
  13. Macedonian
  14. Malay
  15. Dutch
  16. Norwegian
  17. Polish
  18. Portuguese
  19. Russian
  20. Swedish
  21. Slovene
  22. Thai
  23. Turkish
  24. Vietnamese
  25. Chinese
 