Classify backgrounds into difficulty




Tomb Titivator

Posts: 907

Joined: Monday, 29th September 2014, 09:04

Post Friday, 17th August 2018, 22:24

Classify backgrounds into difficulty

Species have been grouped into something vaguely similar to "difficulty". I think the same should happen to backgrounds.

Potentially, keep the layout of the background screen the same and use colour to distinguish the three difficulties.

I think this would help steer newbies to stronger backgrounds, and for all players reduce feelings of frustration when you die in the early game with a weak background.
Last edited by chequers on Friday, 17th August 2018, 23:28, edited 1 time in total.

For this message the author chequers has received thanks:
Implojin

Ziggurat Zagger

Posts: 8651

Joined: Sunday, 5th May 2013, 08:25

Post Friday, 17th August 2018, 23:17

Re: Classify backgrounds into difficulty

CK and Mo are harder than any of the other backgrounds imo.

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Saturday, 18th August 2018, 02:16

Re: Classify backgrounds into difficulty

To get anything other than unfounded opinion we need to back up everything with stats.
  Code:
<causative> !lg tenpercenters recent /won s=class o=%
<Sequell> 2883/11222 games for tenpercenters (recent): 243/678x Wanderer [35.84%], 239/739x Berserker [32.34%], 75/240x Assassin [31.25%], 115/376x Hunter [30.59%], 99/324x Artificer [30.56%], 507/1703x Fighter [29.77%], 99/342x Earth Elementalist [28.95%], 138/489x Gladiator [28.22%], 91/339x Skald [26.84%], 109/408x Fire Elementalist [26.72%], 81/315x Wizard [25.71%], 75/292x Conjurer [25.68%], 82/330x
<Sequell> Arcane Marksman [24.85%], 123/512x Monk [24.02%], 82/346x Abyssal Knight [23.70%], 69/305x Warper [22.62%], 95/422x Necromancer [22.51%], 73/340x Air Elementalist [21.47%], 68/329x Venom Mage [20.67%], 82/403x Ice Elementalist [20.35%], 84/427x Summoner [19.67%], 90/470x Enchanter [19.15%], 73/423x Chaos Knight [17.26%], 91/670x Transmuter [13.58%]

There is also viewtopic.php?t=15936 where I analyzed how likely different classes are to break or continue streaks. The advantage of that analysis is that if a player is on a long streak, he's going to be trying harder to win. That makes whether or not he wins a better indicator of how objectively hard it was to win (for a player of his skill level), helping to filter out speedrunners or careless play. The chart of my conclusions is:
[Image: chart of classes ranked by how often they continue or break streaks]

In my opinion:
Hard classes: CK, Transmuter, Arcane Marksman, Summoner, Warper, Skald, Wizard, Venom Mage.
Easy classes: Berserker, Artificer, Assassin

Wanderer is consistently one of the highest winrate classes, but I can't find a reason for it to be an "easy class" - I think it's more likely that players are just more careful with it. There is also self-selection bias, because players who are willing to play Wanderer, especially while streaking, are more confident in their skills.

Justifications for hard classes:
CK - no explanation needed.
Transmuter - poor damage starting out due to skills and stats split between magic and might. Worse than Monk.
Arcane Marksman - poor damage starting out due to split skills and stats. Worse than Hunter. Also faces the difficulty that they may run out of ammo.
Summoner - ranks low in my chart, and ranks low in the tenpercenter query too. I know it's not a popular opinion that summoners are weak, but there are reasons to explain why they might rank low in the stats. Dedicated summoners are fragile mages, but they don't have spells that say "get rid of that deadly threat in two turns" like a blaster does. Summoning encourages a character to fight in the open so their summons can get surface area on the enemy, which is itself dangerous. The summons act as a shield in some circumstances, but many monsters will ignore the summons and go straight for the mage, with smiting, lightning bolt, other bolt spells, Portal Projectile, fast movement, etc. Also, summoning is an unusual playstyle, and therefore unfamiliar, which causes players to make more mistakes.
Warper, Skald - poor damage starting out due to split skills and stats. Their magic does not compensate for this until later in the game.
Wizard, Venom Mage - starting book lacks a solid all purpose damage spell of Lair tier. VM also has nothing against early dungeon undead.

Justifications for easy classes:
Berserker - no explanation needed
Artificer - consistently high in the stats. It's not hard to see why. They start with wands that can take care of any dangerous early threat. Aside from that they're just a gimped fighter background, they can pick up any decent weapon and have no problem.
Assassin - ranks high in both queries.

I find it hard to name any more classes as "easy." There is a lot of variation between my chart and the tenpercenter query on most of these classes. I might say AK, FE, Cj, IE, Fi, Gl are relatively easy. These classes are fairly straightforward to play, you can just train your starting skills and end up with a strong character. However, it doesn't seem to clearly show up in the win rates.
streaks: 5 fifteen rune octopodes. 15 diverse chars. 13 random chars. 24 NaWn^gozag.
251 total wins Berder hyperborean + misc
83/108 recent wins (76%)
guides: safe tactics value of ac/ev/sh forum toxicity

For this message the author Berder has received thanks:
hermbot

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Saturday, 18th August 2018, 03:07

Re: Classify backgrounds into difficulty

The above analysis assumes it is relevant how difficult a background is to a skilled player. That may actually be the wrong view; if the difficulty of a background is supposed to be a guide for novices, then we would rather look at how difficult a background is to a novice with few wins and a low winrate. However, I don't know how to construct such a query.

Ziggurat Zagger

Posts: 8651

Joined: Sunday, 5th May 2013, 08:25

Post Saturday, 18th August 2018, 04:03

Re: Classify backgrounds into difficulty

If you come up with an approach for evaluating the strength of backgrounds, and it ends up placing Summoner as worse than Monk, your conclusion should be that either the approach is flawed or the input data is bad.

For this message the author duvessa has received thanks: 4
Arrhythmia, Implojin, nago, Shard1697

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Saturday, 18th August 2018, 04:13

Re: Classify backgrounds into difficulty

There's nothing wrong with Monk. It's just a brute background, not the best or the worst. It's not even unarmed combat anymore.

Blades Runner

Posts: 618

Joined: Saturday, 12th December 2015, 23:54

Post Saturday, 18th August 2018, 05:26

Re: Classify backgrounds into difficulty

Mo is a bottom 3 background and clearly worse than Tm. It deals far less damage than Tm at level 1 and will continue to deal less damage than Tm for the remainder of the earlygame unless it gets very good weapon drops. Tm earlygame damage is high btw - you'd have a stronger case for calling it a hard bg if you focused on their bad defenses.
Remove spell hunger.

For this message the author Hellmonk has received thanks: 3
duvessa, nago, Shard1697

Tomb Titivator

Posts: 907

Joined: Monday, 29th September 2014, 09:04

Post Saturday, 18th August 2018, 05:38

Re: Classify backgrounds into difficulty

I originally had a table in the OP of my own selections but I removed it because it's not important. The species categories were endlessly bikeshedded and everybody has their pet "major error", but the result is nevertheless fantastic. The species screen rework benefits newbies, who got a framework to compare species, and also experienced players, who are gently encouraged to look to more challenging options than yet another minotaur.

For this message the author chequers has received thanks: 2
Arrhythmia, hermbot

Ziggurat Zagger

Posts: 4341

Joined: Friday, 8th May 2015, 17:51

Post Saturday, 18th August 2018, 06:57

Re: Classify backgrounds into difficulty

Wn is indeed a very powerful class; starting with an identified scroll of teleportation or potion of haste changes a lot on D:1.
Underestimated: cleaving, Deep Elf, Formicid, Vehumet, EV
Overestimated: AC, GDS
Twin account of Sandman25

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Saturday, 18th August 2018, 08:28

Re: Classify backgrounds into difficulty

You're right, Transmuter does deal good damage on d:1 once you cast Beastly Appendage. The question about Transmuter is not to argue that it is weak, but to explain why it is weak; it consistently turns up at the bottom of the stats. It may be that people skill it wrong, or forget to use Beastly Appendage, or that splitting skills and stats between physical and magic is inherently the problem. Or it may be the weak defenses later on.

Monk does deal significantly more damage on d1 than Skald or Warper (~20% more in my test against a leopard gecko). It also receives 10% more damage, which will be remedied as soon as you find a better armor. This is offset by the fact that Monk starts with more HP than Skald or Warper due to having more Fighting; a HuMo will have 18 HP compared to a Skald's or Warper's 16. The raw combat power of Monk is higher than Skald or Warper, although this is offset by the Skald and Warper spells. However, putting XP into spells early on and wearing lighter armor to cast them carries its own cost. So perhaps we can say that Monk is on a par with Skald and Warper, certainly not significantly weaker, perhaps stronger.

Incidentally, here are the recent stats from players with >20 wins who had >50% recent winrate around a month ago (a list of players I had saved from an earlier discussion):
  Code:
<causative> !lg MalcolmRose|dying5ever|Dynast|NicuTudor|Nebukadnezar|ebarrett|Berder|gammafunk|Sphara|BestGodBeogh|Pekkekk|jumbajumba|ParticlePhysics|gameguard|dstrtn recent /won s=class o=%
<Sequell> 1010/1479 games for MalcolmRose|dying5ever|Dynast|NicuTudor|Nebukadnezar|ebarrett|Berder|gammafunk|Sphara|BestGodBeogh|Pekkekk|jumbajumba|ParticlePhysics|gameguard|dstrtn (recent): 1/1x Death Knight [100.00%], 57/68x Artificer [83.82%], 28/36x Abyssal Knight [77.78%], 82/107x Fighter [76.64%], 32/42x Warper [76.19%], 30/40x Air Elementalist [75.00%], 46/62x Berserker [74.19%], 254/348x Wanderer
<Sequell> [72.99%], 32/45x Earth Elementalist [71.11%], 36/51x Monk [70.59%], 38/54x Assassin [70.37%], 37/54x Gladiator [68.52%], 21/31x Conjurer [67.74%], 27/40x Fire Elementalist [67.50%], 30/45x Ice Elementalist [66.67%], 30/45x Skald [66.67%], 27/41x Transmuter [65.85%], 23/35x Summoner [65.71%], 41/64x Hunter [64.06%], 27/44x Venom Mage [61.36%], 26/45x Necromancer [57.78%], 25/44x Enchanter
<Sequell> [56.82%], 18/33x Wizard [54.55%], 25/55x Arcane Marksman [45.45%], 17/49x Chaos Knight [34.69%]

This would agree with my feelings. CK, AM, Wz, En, Ne, VM, Hu, Su, Tm, Sk - these are the lowest ranked classes here, and they are all weak in my estimation. Monk does quite well here. The only thing surprising to me is how high Warper scores. I think this must be a fluke, since Warper has so little to recommend it.
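Whether a result like Warper's 32/42 is a fluke depends on how wide the uncertainty is at these sample sizes. As a rough illustration only, here is a minimal Python sketch that puts an approximate 95% Wilson score interval around a few of the win counts from the Sequell output above. The Wilson interval is a standard technique swapped in for illustration; it was not used for the queries in this thread, and the function name is made up.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Approximate Wilson score interval for a win rate (z=1.96 is about 95%)."""
    if games == 0:
        raise ValueError("need at least one game")
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - half, center + half

# Win counts copied from the high-winrate-player query above.
warper = wilson_interval(32, 42)
monk = wilson_interval(36, 51)
chaos_knight = wilson_interval(17, 49)

for name, (lo, hi) in [("Warper", warper), ("Monk", monk), ("Chaos Knight", chaos_knight)]:
    print(f"{name}: {lo:.1%} to {hi:.1%}")
```

Under this approximation the Warper and Monk intervals overlap substantially, while the Chaos Knight interval sits clearly below both. The intervals say nothing about player self-selection; they only show how much of the ordering could be sampling noise.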
Last edited by Berder on Saturday, 18th August 2018, 08:49, edited 1 time in total.

Lair Larrikin

Posts: 28

Joined: Saturday, 10th December 2016, 15:38

Post Saturday, 18th August 2018, 08:43

Re: Classify backgrounds into difficulty

Does it really matter which background is stronger or weaker in the first 3-4 levels of the dungeon? If you die that early, who cares? You start a new game and in 3 minutes you're back where you died.

For this message the author Scuka has received thanks:
Wahaha

Ziggurat Zagger

Posts: 8651

Joined: Sunday, 5th May 2013, 08:25

Post Saturday, 18th August 2018, 08:55

Re: Classify backgrounds into difficulty

Berder wrote:Incidentally, here are the recent stats from players with >20 wins who had >50% recent winrate around a month ago (a list of players I had saved from an earlier discussion):
Are these players okay with you using their games as "evidence" that fucking Summoner is weak?

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Saturday, 18th August 2018, 09:09

Re: Classify backgrounds into difficulty

It might be more useful for new players to divide the classes into those that are "straightforward" in the sense that you can just train your starting skills and fight with them and you'll do OK, versus those that will require you to substantially depart from your starting build. Classes that start with only a dagger, that rely on ranged ammo, or that are mage classes without a high damage spell for lair would be the non-straightforward classes. I include Air Elementalist as a non-straightforward class because, while Lightning Bolt and Airstrike can kill things in lair, pure air elementalism is weaker than fire, earth, or ice in the midgame until you get to the level of Chain Lightning. I would usually make it a priority to branch out on an Air Elementalist and not train Air magic too high.

Straightforward classes: Skald, Monk, Warper, Fighter, Gladiator, Berserker, Abyssal Knight, Conjurer, Fire Elementalist, Summoner, Earth Elementalist, Ice Elementalist, Chaos Knight, Transmuter.

Non-straightforward classes: Wanderer, Hunter, Wizard, Arcane Marksman, Necromancer, Assassin, Artificer, Venom Mage, Air Elementalist, Enchanter.

But I wouldn't recommend that any new player try Summoner, Chaos Knight, or Transmuter, because they are unusual and difficult. So we get the altered list:
Recommended for new players: Skald, Monk, Warper, Fighter, Gladiator, Berserker, Abyssal Knight, Conjurer, Fire Elementalist, Earth Elementalist, Ice Elementalist.
Not recommended: Wanderer, Hunter, Wizard, Arcane Marksman, Necromancer, Assassin, Artificer, Venom Mage, Air Elementalist, Enchanter, Summoner, Chaos Knight, Transmuter.
Last edited by Berder on Saturday, 18th August 2018, 09:15, edited 1 time in total.

Ziggurat Zagger

Posts: 8651

Joined: Sunday, 5th May 2013, 08:25

Post Saturday, 18th August 2018, 09:12

Re: Classify backgrounds into difficulty

Putting Skald in both lists is a nice touch

For this message the author duvessa has received thanks:
Berder

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Saturday, 18th August 2018, 09:15

Re: Classify backgrounds into difficulty

Fixed.

Zot Zealot

Posts: 946

Joined: Tuesday, 4th January 2011, 15:03

Post Saturday, 18th August 2018, 10:30

Re: Classify backgrounds into difficulty

Berder: I am not sure if you have done much statistics apart from DCSS.

But I would like to point out to you that a statistic like this is completely meaningless: you obtained percentages without taking into account other factors that can affect difficulty.

In particular, it seems obvious that race also has an effect on difficulty, and it is absolutely not obvious that in your sample race and background are not correlated.
(In layman's terms: did you check whether there were more trolls among those monks and more nagas among those transmuters?)

As a very first step here you must group games by races first, and only calculate win percentages of the backgrounds for each individual group. And no, you cannot sum them up to get a meaningful number.
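A minimal Python sketch of why pooled percentages can mislead, using made-up counts (the race and background labels are hypothetical and not taken from Sequell). Background B wins more often than A within each race, yet A looks far better once the races are summed, simply because A was mostly played with the stronger race.

```python
# Hypothetical (wins, games) counts per (race, background); not real data.
games = {
    ("strong_race", "A"): (80, 100),
    ("strong_race", "B"): (9, 10),
    ("weak_race", "A"): (2, 10),
    ("weak_race", "B"): (30, 100),
}

def rate(race, background):
    """Win rate of one background within a single race."""
    wins, total = games[(race, background)]
    return wins / total

def pooled_rate(background):
    """Win rate of a background with all races summed together."""
    wins = sum(w for (_, b), (w, _) in games.items() if b == background)
    total = sum(g for (_, b), (_, g) in games.items() if b == background)
    return wins / total

for race in ("strong_race", "weak_race"):
    print(race, "A:", rate(race, "A"), "B:", rate(race, "B"))
print("pooled A:", pooled_rate("A"), "pooled B:", pooled_rate("B"))
```

This is the usual Simpson's paradox setup: the per-race comparison and the pooled comparison point in opposite directions.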

*****

A more fundamental problem is that since it has been demonstrated that DCSS can be won nearly 100% of the time by a good player with any combination (maybe except CK), the percentages can be a result of something entirely different than the combinations themselves. Maybe some players find some combinations very annoying to play. Maybe they speedrun with some backgrounds, but play normally with others. Maybe they find summoners so easy that they always try challenge games with them. I especially think that for an experienced player, whose win percentage is above 50%, it is NOT likely that losing a game is a result of the background not being straightforward or easy.

Your personal straightforward / non-straightforward classification seems to be completely arbitrary to me. Why is Warper or CK straightforward? They start with no good tools to kill, they need to adapt. Why is Hunter not straightforward? Because you need to melee early on to conserve arrows? It can be annoying I guess but what is not straightforward about it?

For this message the author sanka has received thanks:
Rast

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Saturday, 18th August 2018, 17:01

Re: Classify backgrounds into difficulty

sanka wrote:Berder: I am not sure if you have done much statistics apart from DCSS.

But I would like to point out to you that a statistic like this is completely meaningless: you obtained percentages without taking into account other factors that can affect difficulty.

In particular, it seems obvious that race also has an effect on difficulty, and it is absolutely not obvious that in your sample race and background are not correlated.
(In layman's terms: did you check whether there were more trolls among those monks and more nagas among those transmuters?)

As a very first step here you must group games by races first, and only calculate win percentages of the backgrounds for each individual group. And no, you cannot sum them up to get a meaningful number.

I did do exactly this for the graphical chart. It's not possible to do it in a simple Sequell query. You can see that the classes in the graphical chart are not sorted precisely by win percentage (the numbers in the margin); this is because the ranking is also adjusted for race difficulty.

sanka wrote:A more fundamental problem is that since it has been demonstrated that DCSS can be won nearly 100% of the time by a good player with any combination (maybe except CK),

That's just a tavern meme. 80%, 90% maybe. Not 100%. It's also a meme that the game is won beyond the early game. Many players, even the best players, get into trouble and die in the later game. The early game is only half the challenge.

But the more important point is that there are substantial difficulty differences in the amount of skill required to keep different characters alive. Some characters will find themselves in much more dangerous situations more often, and you have to be experienced to understand when this is starting to happen and what to do about it. Other characters can tab through everything or blast through everything without any stress.

sanka wrote:the percentages can be a result of something entirely different than the combinations themselves. Maybe some players find some combinations very annoying to play.

I would think that annoyance to play should legitimately be a part of the difficulty.

sanka wrote:Maybe they speedrun with some backgrounds, but play normally with others.

Part of the reason I look at very high winrate players or at streak players is to address this problem. A player with >50% winrate or a player on a long streak is unlikely to be doing much speedrunning.

sanka wrote:Maybe they find summoners so easy that they always try challenge games with them.

You mean, challenge races? I mentioned how that is accounted for in the chart.

sanka wrote:I especially think that for an experienced player, whose win percentage is above 50%, it is NOT likely that losing a game is a result of the background not being straightforward or easy.

I am such an experienced player, and I can tell you, some combos are a lot harder to win than others. They get into a lot more trouble, burn a lot more scrolls of blinking and fear, have to abort a lot more fights, and sometimes die.

sanka wrote:Your personal straightforward / non-straightforward classification seems to be completely arbitrary to me. Why is Warper or CK straightforward? They start with no good tools to kill, they need to adapt.

Warper and CK do start with good tools to kill, they have a decent non-shortblades weapon. They can just train that weapon and swap it out for a better one of its kind later on. In many cases they don't need any skills outside their starting ones. Straightforward here does not mean "easy" - I defined it as being able to just continue using and training with the skills they started with.

sanka wrote:Why is Hunter not straightforward? Because you need to melee early on to conserve arrows? It can be annoying I guess but what is not straightforward about it?

Yes - and a new player wouldn't really be prepared for that. You can run out of arrows or bolts anyway sometimes.

Often I branch into melee with a hunter. In fact, perhaps the safest way to play a hunter is with javelins and melee, which requires you to find a decent melee weapon.


Sanka, if you would like to criticize the use of statistics, my challenge to you would be: find better statistics yourself. Just opinion against unfounded opinion is not going to lead anywhere. If you have an opinion, you should try to justify it by looking at the evidence, and what better evidence is there than the statistics? It may be imperfect but it's the only source of evidence about these things that we have.

For this message the author Berder has received thanks:
mdonais

Ziggurat Zagger

Posts: 8651

Joined: Sunday, 5th May 2013, 08:25

Post Saturday, 18th August 2018, 18:00

Re: Classify backgrounds into difficulty

If the "evidence" places Summoner below Monk, then the evidence is worse than unfounded opinion.

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Saturday, 18th August 2018, 18:23

Re: Classify backgrounds into difficulty

You already said that. Find evidence supporting your view if you think that way.

Zot Zealot

Posts: 946

Joined: Tuesday, 4th January 2011, 15:03

Post Saturday, 18th August 2018, 18:31

Re: Classify backgrounds into difficulty

Berder wrote:I did do exactly this for the graphical chart. It's not possible to do it in a simpler sequell query. You can see that the classes in the graphical chart are not sorted precisely by win percentage (the numbers in the margin); this is because the ranking is adjusted also for race difficulty.


Well, I am really sorry, but I can't seem to find your explanation. How did you account for race difficulty?

Also, I am puzzled why the percentages are there instead of the statistic actually used to order them.

Why did you bother to calculate the percentages if you agree that they are meaningless? Sorry for this part of my post, but I am always a little annoyed by summing different data sources and then taking a percentage. You know that the result can be the exact opposite of the thing you want to measure, don't you? So even if you secretly used some other statistic instead of the percentages, they are very confusing there.

Berder wrote:That's just a tavern meme. 80%, 90% maybe. Not 100%


Well, we may argue what is "close" to 100%, but there were many accounts with 100% winrate the last time I checked, and some extremely long streaks.
I thought maybe 90% at least. But the main point is that since the percentages are much lower than those that are demonstrably possible for good players if they try, they surely reflect something other than the difficulty.

Berder wrote:I would think that annoyance to play should legitimately be a part of the difficulty.


Well, it certainly explains many things in your posts, so thanks for the addition. But for me it implies that your list is only very loosely correlated with how hard it is to win with a background (if one tries). And this certainly explains my impression of a very arbitrary order.

Because in DCSS annoyance is often negatively correlated with how hard a background is to win. Many backgrounds are annoying exactly because they are powerful and not much of a challenge, so you get bored after ~5 floors. So difficulty and annoyance cannot be combined into a meaningful number to order anything.

Berder wrote:Sanka, if you would like to criticize the use of statistics, my challenge to you would be: find better statistics yourself.


I intended to imply that it is impossible to measure how hard it is to win with a certain background/race combination etc. from these data. That is because the data are more a psychological measure of how people play (annoyance, etc.) than of properties of the game (how hard it is to play a combination and how complex a good strategy is), and you cannot make good predictions of these psychological properties without taking into account the people who play.

So if you think that your list correlates with how hard it is to win with these backgrounds, then I disagree.
If you think that your list correlates with how annoying, etc. these backgrounds are, well, probably, but I do not think it proves it. We would need some data about the players themselves, too.

Spider Stomper

Posts: 248

Joined: Monday, 4th September 2017, 10:53

Post Saturday, 18th August 2018, 18:40

Re: Classify backgrounds into difficulty

duvessa wrote:If you come up with an approach for evaluating the strength of backgrounds, and it ends up placing Summoner as worse than Monk, your conclusion should be that either the approach is flawed or the input data is bad.



I've said it before, and I will say it again. You can't really use Sequell statistics to judge mechanical strengths because it is not normalized data and instead simply samples player choice, so all you do is get a statistical representation of that bias, as there is no way to deconvolve the bias from the desired observable. Berder doesn't give a fuck though. He will argue that monk isn't that bad, when really all you can say is the player base prefers monk and has learned to play them better than summoners, in general. Even if you normalized for the number of character combinations run, you still couldn't get away from this problem of more runs = more experience = better runs, in general.

For this message the author crawlnoob has received thanks: 4
Arrhythmia, duvessa, Fingolfin, Rast

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Saturday, 18th August 2018, 19:39

Re: Classify backgrounds into difficulty

Mechanical strengths - which we might interpret as the ideal win rate of a character if it was played by the hand of god - are to some degree not what we want to measure. It matters more to us how well actual players do with a character. If it turns out the player base is less experienced with a certain type of character and therefore does worse with it, then that is a legitimate basis for saying that character is more difficult for the player base.

An individual player choosing a character would be most interested in how well *he* would do with that character. The best way for him to figure that out is to look at players most similar to himself.

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Saturday, 18th August 2018, 20:05

Re: Classify backgrounds into difficulty

Here is the streak-breaker chart updated to the present day (previous chart was from 2015):
[Image: updated streak-breaker chart]
- Green dots represent games in streak that won
- Red dots represent games in streak that broke the streak
- Percentages in margin are win percentages
- Only games that could be 4th in streak or later are counted. This includes streakbreakers of a 3 streak or longer, or wins that are 4th in streak or later.
- Classes are reordered to account for the difficulty of the races that they were tried with, and vice versa for races. For classes, this process involves looking at which proportions of races the class was tried with. Then, look at the baseline expected winrate for that composition of races, using the classes other than the one being scored currently. The class score is then how much above or below the baseline that class performed, and the score is used to order the classes. A similar process is used for scoring and ranking the races.

Spider Stomper

Posts: 242

Joined: Friday, 17th April 2015, 16:22

Post Saturday, 18th August 2018, 20:36

Re: Classify backgrounds into difficulty

Isn't the thread's stated purpose (supposed to be) accomplished by highlighting the races for viability?

I'm all for adding a third color though and combing through the current definition of "viable", since there's a difference between MiAr, DsSu, and OpAs. (only the last of the three is considered viable by the game for some reason)

For this message the author Doesnt has received thanks:
Fingolfin

Zot Zealot

Posts: 946

Joined: Tuesday, 4th January 2011, 15:03

Post Saturday, 18th August 2018, 21:56

Re: Classify backgrounds into difficulty

Berder wrote: Classes are reordered to account for the difficulty of the races that they were tried with, and vice versa for races. For classes, this process involves looking at which proportions of races the class was tried with. Then, look at the baseline expected winrate for that composition of races, using the classes other than the one being scored currently. The class score is then how much above or below the baseline that class performed, and the score is used to order the classes. A similar process is used for scoring and ranking the races.


Thank you for including your explanation (I guess again).

I am sorry, but I cannot figure out what it is supposed to mean.

Zot Zealot

Posts: 946

Joined: Tuesday, 4th January 2011, 15:03

Post Saturday, 18th August 2018, 22:00

Re: Classify backgrounds into difficulty

Also, I am a little bit puzzled that NaCK being better than CeCj does not bother you.

For this message the author sanka has received thanks:
Arrhythmia

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Saturday, 18th August 2018, 22:20

Re: Classify backgrounds into difficulty

sanka wrote:Thank you for including your explanation (I guess again).

I am sorry, but I cannot figure out what it is supposed to mean.

Perhaps this is clearer:

Score of a class c = (\Sum_{r in races} games(r,c) * (wins(r,c)/games(r,c) - baseline(r,c))) / (total games with class c).
baseline(r,c) represents the baseline winrate for that race, excluding games by that class. Specifically:
baseline(r,c) = (\Sum_{c2 in classes, c2 != c} wins(r,c2)) / (\Sum_{c2 in classes, c2 != c} games(r,c2))

Also, I am a little bit puzzled that NaCK being better than CeCj does not bother you.

I don't see where you would make that assumption. My data has only 1 NaCK and 1 CeCj game. The NaCK won and the CeCj lost but only 1 game is not enough to draw any conclusion about that.
streaks: 5 fifteen rune octopodes. 15 diverse chars. 13 random chars. 24 NaWn^gozag.
251 total wins Berder hyperborean + misc
83/108 recent wins (76%)
guides: safe tactics value of ac/ev/sh forum toxicity

Ziggurat Zagger

Posts: 4341

Joined: Friday, 8th May 2015, 17:51

Post Sunday, 19th August 2018, 05:09

Re: Classify backgrounds into difficulty

The whole thread is meaningless IMHO. Mo better than Su? Of course, just play Tr. Su better than Mo? Of course, just play DE.
Underestimated: cleaving, Deep Elf, Formicid, Vehumet, EV
Overestimated: AC, GDS
Twin account of Sandman25

Zot Zealot

Posts: 946

Joined: Tuesday, 4th January 2011, 15:03

Post Sunday, 19th August 2018, 10:42

Re: Classify backgrounds into difficulty

Berder wrote:Perhaps this is clearer:

Score of a class c = (\Sum_{r in races} games(r,c) * (wins(r,c)/games(r,c) - baseline(r,c))) / (total games with class c).
baseline(r,c) represents the baseline winrate for that race, excluding games by that class. Specifically:
baseline(r,c) = (\Sum_{c2 in classes, c2 != c} wins(r,c2)) / (\Sum_{c2 in classes, c2 != c} games(r,c2))


Thank you for the equations, it is much clearer now.

This statistic, while more sophisticated, does not avoid the same fundamental error as the plain win percentages. See the following example:

Let us have two races and three classes. The table shows the (won games)/(all games) for every combination in the dataset:

  Code:
    C1   C2   C3
R1  9/9  0/1  1/1
R2  1/1  8/9  0/4


score(C1) = (9*(9/9 - 1/2) + 1 *(1/1 - 8/13)) / 10 = 0.488
score(C2) = (1*(0/1 - 10/10) + 9 * (8/9 - 1/5)) / 10 = 0.52

As you can see, C1 gets a lower score even though it has a better winrate both overall and for each race individually. Surely this is a strange property of your statistic.

(Let me know if I misunderstood your equations or made a calculation error.)
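For anyone who wants to re-run the arithmetic, here is a minimal sketch in Python. It assumes the formula is read exactly as quoted (baseline excludes the class being scored); the function names `baseline`, `score` and `overall_winrate` are just illustrative choices, not anything from Berder's actual code.

```python
# Rows are races R1, R2; columns are classes C1, C2, C3 (the toy table above).
wins = [[9, 0, 1],
        [1, 8, 0]]
games = [[9, 1, 1],
         [1, 9, 4]]

def baseline(r, c):
    """Winrate of race r over all classes except c."""
    others = [k for k in range(len(games[r])) if k != c]
    return sum(wins[r][k] for k in others) / sum(games[r][k] for k in others)

def score(c):
    """Game-weighted average of (winrate - baseline) over races for class c."""
    races = range(len(games))
    weighted = sum(games[r][c] * (wins[r][c] / games[r][c] - baseline(r, c))
                   for r in races)
    return weighted / sum(games[r][c] for r in races)

def overall_winrate(c):
    """Plain wins/games for class c, pooled over races."""
    return sum(row[c] for row in wins) / sum(row[c] for row in games)

for c, name in [(0, "C1"), (1, "C2")]:
    print(name, "score:", round(score(c), 3),
          "overall winrate:", overall_winrate(c))
```

This reproduces the 0.488 and 0.52 above, next to overall winrates of 10/10 for C1 and 8/10 for C2.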

Berder wrote:I don't see where you would make that assumption. My data has only 1 NaCK and 1 CeCj game. The NaCK won and the CeCj lost but only 1 game is not enough to draw any conclusion about that.


Yet, in your statistics one of them will appear as "0" and the other as "1.0". So it will affect the scores you calculate.

Vestibule Violator

Posts: 1586

Joined: Saturday, 18th June 2016, 13:57

Post Sunday, 19th August 2018, 11:29

Re: Classify backgrounds into difficulty

ATM the difficulty rating for species is actually a "straightforwardness" rating. The more unusual the species is, assuming human is the most usual, the more advanced it is rated.

I am not sure that such a system can be easily implemented for backgrounds. All of them are different, but I don't think that there is a "baseline" background from which others branch out.

So you would have to pick out a few basic archetypes, identify the backgrounds which represent them best, and then look for variations branching away from the archetype backgrounds. The further removed, the more advanced.

This leads to some oddities. For example, Be is an easy background, but it is more complex than Fi, because it comes with a god, and the player starts off with a relatively complex ability like rage (with its aftereffects). However, because winning Be is easier than winning Fi, it is used by the game as an introductory background.

All in all, I don't think that statistics have much to do with this. This shouldn't be a matter of which background is stronger, or which one has won the most. It should be a matter of which one is more straightforward.
I Feel the Need--the Need for Beer
Spoiler: show
3DsBeTr, 15DsFiRu, 3DsMoNe, 3FoHuGo, 3TrArOk, 3HOFEVe, 3MfGlOk, 4GrEEVe, 3BaIEChei, 3HuMoOka, 3MiWnQaz, 3VSFiAsh, 3DrTmMakh, 3DsCKXom, 3OgMoOka, 3NaFiOka, 3FoFiOka, 3MuFEVeh, 3CeHuOka, 3TrMoTSO, 3DEFESif, 3DsMoOka

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Sunday, 19th August 2018, 12:29

Re: Classify backgrounds into difficulty

sanka wrote:
Berder wrote:Perhaps this is clearer:

Score of a class c = (\Sum_{r in races} games(r,c) * (wins(r,c)/games(r,c) - baseline(r,c))) / (total games with class c).
baseline(r,c) represents the baseline winrate for that race, excluding games by that class. Specifically:
baseline(r,c) = (\Sum_{c2 in classes, c2 != c} wins(r,c2)) / (\Sum_{c2 in classes, c2 != c} games(r,c2))


Thank you for the equations, it is much clearer now.

This statistic, while more sophisticated, does not avoid the same fundamental error as the plain win percentages. See the following example:

Let us have two races and three classes. The table shows the (won games)/(all games) for every combination in the dataset:

  Code:
    C1   C2   C3
R1  9/9  0/1  1/1
R2  1/1  8/9  0/4


score(C1) = (9*(9/9 - 1/2) + 1 *(1/1 - 8/13)) / 10 = 0.488
score(C2) = (1*(0/1 - 10/10) + 9 * (8/9 - 1/5)) / 10 = 0.52

As you can see, C1 gets a lower score even though it has a better winrate both overall and for each race individually. Surely this is a strange property of your statistic.

The key to your example is the baseline for R2. (In fact you would get the same result of C2 > C1 if you deleted the R1 row, and with only the R1 row you would find C1>C2.) The baseline for R2 is lower for C2 than for C1.

You might argue that the baseline should be the same for C2 and C1, that the baseline should include all classes including the one being scored. The resulting chart is a little bit different - not substantially.

When the baseline excludes the class itself (as in the image given):
Wz CK Tm Ne AM Sk Su AE Mo Hu VM Wr AK IE Gl Cj En EE Fi Be FE As Ar Wn
When the baseline includes the class itself:
Wz CK Tm Ne AM Su Sk AE Mo VM Hu Wr AK IE Gl Cj Fi En Be EE FE As Ar Wn

The only differences are that Sk and Su swapped places, VM and Hu swapped places, and En-EE-Fi-Be shuffled around.
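The same comparison can be tried on sanka's toy table. This is only a sketch under assumptions: the `score` function and its `include_self` flag are hypothetical names for the two baseline variants described above, and exact fractions are used to avoid rounding questions.

```python
from fractions import Fraction

# Rows are races R1, R2; columns are classes C1, C2, C3 (sanka's toy table).
WINS = [[9, 0, 1],
        [1, 8, 0]]
GAMES = [[9, 1, 1],
         [1, 9, 4]]

def score(wins, games, c, include_self):
    """Game-weighted mean of (winrate - baseline) for class column c.

    include_self=False: baseline excludes the class being scored.
    include_self=True:  baseline is the race's winrate over all classes.
    """
    total = Fraction(0)
    for r in range(len(games)):
        cols = [k for k in range(len(games[r])) if include_self or k != c]
        base = Fraction(sum(wins[r][k] for k in cols),
                        sum(games[r][k] for k in cols))
        total += games[r][c] * (Fraction(wins[r][c], games[r][c]) - base)
    return total / sum(row[c] for row in games)

# Full table, both baseline variants: scores for C1 and C2.
for include_self in (False, True):
    print(include_self,
          [float(score(WINS, GAMES, c, include_self)) for c in (0, 1)])

# Single-row tables with the exclude-self baseline.
only_r2 = [score(WINS[1:], GAMES[1:], c, False) for c in (0, 1)]
only_r1 = [score(WINS[:1], GAMES[:1], c, False) for c in (0, 1)]
print(only_r2, only_r1)
```

On this table C2 stays ahead of C1 under either baseline (127/260 vs 13/25 excluding the class itself, 181/1540 vs 201/1540 including it). With only the R2 row C2 is ahead, and with only the R1 row C1 is ahead.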

Berder wrote:I don't see where you would make that assumption. My data has only 1 NaCK and 1 CeCj game. The NaCK won and the CeCj lost but only 1 game is not enough to draw any conclusion about that.


Yet, in your statistics one of them will appear as "0" and the other as "1.0". So it will affect the scores you calculate.

It affects the scores only a tiny bit, as it should since it's only two games. Everything is weighted by the number of games.

For this message the author Berder has received thanks:
sanka

Tartarus Sorceror

Posts: 1891

Joined: Monday, 1st April 2013, 04:41

Location: Toronto, Canada

Post Sunday, 19th August 2018, 15:52

Re: Classify backgrounds into difficulty

Shtopit wrote:ATM the difficulty rating for species is actually a "straightforwardness" rating. The more unusual the species is, assuming human is the most usual, the more advanced it is rated.


This is demonstrably untrue, because Human is in the second category rather than the first.
take it easy

For this message the author Arrhythmia has received thanks:
duvessa

Vestibule Violator

Posts: 1586

Joined: Saturday, 18th June 2016, 13:57

Post Sunday, 19th August 2018, 19:24

Re: Classify backgrounds into difficulty

That should teach me not to write from memory, I guess!

Abyss Ambulator

Posts: 1192

Joined: Friday, 18th April 2014, 01:41

Post Thursday, 23rd August 2018, 01:37

Re: Classify backgrounds into difficulty

I don't think trying to rank-order backgrounds is that useful. It depends on what race you are, and trying to average over races is an exercise in futility. Plus, most of them are more or less as good as one another, barring a few which are really good (Be, IE, FE) and a few which are really bad (CK, AM, VM).
remove food

Ziggurat Zagger

Posts: 4288

Joined: Wednesday, 23rd October 2013, 07:56

Post Thursday, 23rd August 2018, 07:15

Re: Classify backgrounds into difficulty

Please stop calling VM really bad, because it isn't.
...{HaBeKoAK}CeVM{MfWnMiAK}TeAMDrIE{FoVMVSFi}{MuVMGhGlVpMo}HaWrSpWz{OgGlTrMo}{CeWnMfBeMiSk}
DrEE{GrFiFoGl}DgEnFeNe{OpGlHuSu}DDArHaCKSpAEGrTmDgFEDsCjGhMoHuVM{HaAMBaEn}{HuMoHOWn}
DsWzDDHu{DgWnGnBe}FeIE{MiEnMfCj}SpNeBaEEGrFE{HaAKTrCK}DsFESpHu{FoArNaBe}FeEE

Ziggurat Zagger

Posts: 8651

Joined: Sunday, 5th May 2013, 08:25

Post Thursday, 23rd August 2018, 07:27

Re: Classify backgrounds into difficulty

Meh, VM is below median and I can see an argument that it's quite far below median, even if I'd disagree with it.

I feel like starting with a hand crossbow should rule out AM and Hu from being mentioned in the same breath as the likes of Mo or CK though.

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Thursday, 23rd August 2018, 09:03

Re: Classify backgrounds into difficulty

Feel free to support your claims with some evidence.

Tartarus Sorceror

Posts: 1750

Joined: Monday, 14th October 2013, 01:05

Post Thursday, 23rd August 2018, 18:03

Re: Classify backgrounds into difficulty

Doing this thing where you brush aside anyone else's opinion when they don't have "evidence", when anyone who knows anything about statistics knows your evidence is clearly flawed and not something to base assumptions on, isn't doing you any favors.

For this message the author Shard1697 has received thanks:
Arrhythmia

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Thursday, 23rd August 2018, 19:34

Re: Classify backgrounds into difficulty

Evidence is better than no evidence.

Ziggurat Zagger

Posts: 6177

Joined: Tuesday, 30th October 2012, 19:06

Post Thursday, 23rd August 2018, 19:39

Re: Classify backgrounds into difficulty

Berder wrote:Evidence is better than no evidence.

Not always. I don't have any comment on whether or not it's better than no evidence in this case, but cherry-picked and misrepresentative evidence can generate false confidence in erroneous conclusions, which is, in fact, worse than "we just don't know, or are uncertain but have this suspicion or opinion".
Spoiler: show
This high quality signature has been hidden for your protection. To unlock it's secret, send 3 easy payments of $9.99 to me, by way of your nearest theta band or ley line. Complete your transmission by midnight tonight for a special free gift!

For this message the author Siegurt has received thanks: 2
Arrhythmia, Fingolfin

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Thursday, 23rd August 2018, 19:50

Re: Classify backgrounds into difficulty

My evidence is pretty good, in fact. You'd have to do a lot of work to make a better analysis than my streak table. Regardless, even weak evidence is better than no evidence.

"we just don't know, or are uncertain but have this suspicion or opinion".

You think suspicions or opinions are not affected by cherry-picking and biased statistical sampling? They absolutely are. Anyway, it's not opinion vs evidence, it's opinion vs. opinion+evidence. Aside from the evidence, my opinion is backed up by a lot of experience and skill in the game.

Ziggurat Zagger

Posts: 6177

Joined: Tuesday, 30th October 2012, 19:06

Post Thursday, 23rd August 2018, 20:40

Re: Classify backgrounds into difficulty

Berder wrote:My evidence is pretty good, in fact. You'd have to do a lot of work to make a better analysis than my streak table. Regardless, even weak evidence is better than no evidence.

"we just don't know, or are uncertain but have this suspicion or opinion".

You think suspicions or opinions are not affected by cherry-picking and biased statistical sampling? They absolutely are. Anyway, it's not opinion vs evidence, it's opinion vs. opinion+evidence. Aside from the evidence, my opinion is backed up by a lot of experience and skill in the game.

Like I said, I wasn't commenting on whether *this* evidence was or was not better than no evidence, only that it's *possible* to use evidence to cement erroneous conclusions, and that therefore there *exist* cases when evidence is worse than no evidence.

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Thursday, 23rd August 2018, 21:23

Re: Classify backgrounds into difficulty

Siegurt wrote:Like I said, I wasn't commenting on whether *this* evidence was or was not better than no evidence, only that it's *possible* to use evidence to cement erroneous conclusions, and that therefore there *exist* cases when evidence is worse than no evidence.

And why would you comment this if you didn't think it applied? In any case, it is possible to have bad evidence, but the bar to beat *no evidence at all* is extremely low. Bob, who has never seen a Gorblat, says "I think Gorblats are round." Alice says "Hmm, I've studied Gorblats and according to my data they are square." Maybe Alice is wrong, but she doesn't have to do much work at all to have a more informed opinion than Bob, who did no work and has no evidence.

Ziggurat Zagger

Posts: 6177

Joined: Tuesday, 30th October 2012, 19:06

Post Thursday, 23rd August 2018, 22:40

Re: Classify backgrounds into difficulty

Berder wrote:And why would you comment this if you didn't think it applied?

Because you appeared to be making a blanket generalization that any evidence at all, in any circumstance, even misleading or false evidence, is better than no evidence, and I disagree.

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Friday, 24th August 2018, 00:07

Re: Classify backgrounds into difficulty

Well, technically you're right, but in practice, if one side of a dialectic has nothing but opinion and the other side has facts backing them up, by default I'm going to be persuaded by the side with the facts.

Shoals Surfer

Posts: 329

Joined: Tuesday, 14th April 2015, 19:56

Location: France

Post Friday, 24th August 2018, 02:08

Re: Classify backgrounds into difficulty

Obligatory xkcd : https://xkcd.com/1781/
Image
3 runes : MiMo^Ru, HOFi^Beogh, TrMo^Yredelemnul, GrFi^Ru, FoFi^Gozag, MiGl^Okawaru
4 runes : DDFi^Makhleb
5 runes : GrEE^Vehumet
15 runes : MiFi^Ru, NaWz^Sif Muna, GrWz^Sif Muna
I mostly play offline or online on CXC

For this message the author Fingolfin has received thanks:
Arrhythmia

Ziggurat Zagger

Posts: 8651

Joined: Sunday, 5th May 2013, 08:25

Post Friday, 24th August 2018, 02:26

Re: Classify backgrounds into difficulty

Image

this should all probably be split off to another thread

For this message the author duvessa has received thanks: 6
Arrhythmia, Fingolfin, Gigaslurp, Implojin, Pereza0, Shard1697

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Friday, 24th August 2018, 03:14

Re: Classify backgrounds into difficulty

Can we not be reduced to grunting and pointing with memes. If you have anything specific to say about the streak table, let's hear it. If you have a better analysis of the data, let's hear that. As it is, it is the best analysis so far. And there is no cherry picking. My process is transparent and not designed to produce any particular result.

If you'd like my source code so you can play around, just ask. IMO the most important variable to control for is player skill. I partially control for this by restricting to longer streaks. There are better ways to control for player skill, and if anyone feels like taking on the project, go right ahead. I could suggest an Elo-like system to find player ratings and combo ratings, treating each game as a contest between a player's rating and a combo rating equal to (class rating + race rating).
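To make the Elo-like idea a bit more concrete, here is a rough sketch. Everything in it is an assumption for illustration: the step size `K`, the conventional 400-point scale, starting all ratings at 0, treating class and race ratings as "difficulty" (higher means harder), and splitting each update evenly between class and race. The labels `player_a`, `R1` and `C1` are placeholders, not real data.

```python
from collections import defaultdict

K = 16.0       # hypothetical step size per game
SCALE = 400.0  # conventional Elo scale, assumed here

player_rating = defaultdict(float)  # higher = stronger player
class_rating = defaultdict(float)   # higher = harder class
race_rating = defaultdict(float)    # higher = harder race

def expected_win(player, race, cls):
    """Predicted win chance of `player` against the combo's difficulty."""
    difficulty = class_rating[cls] + race_rating[race]
    return 1.0 / (1.0 + 10 ** ((difficulty - player_rating[player]) / SCALE))

def record_game(player, race, cls, won):
    """Shift ratings toward the observed result of one game."""
    surprise = (1.0 if won else 0.0) - expected_win(player, race, cls)
    player_rating[player] += K * surprise
    # Assumption: the combo's share of the update is split evenly.
    class_rating[cls] -= K * surprise / 2
    race_rating[race] -= K * surprise / 2

# One placeholder game: an unrated player wins with an unrated combo.
record_game("player_a", "R1", "C1", won=True)
print(player_rating["player_a"], class_rating["C1"], race_rating["R1"])
```

A real version would replay all games in chronological order and would probably need several passes or a decaying `K` before the class and race ratings settle.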

For this message the author Berder has received thanks:
RoGGa

Zot Zealot

Posts: 946

Joined: Tuesday, 4th January 2011, 15:03

Post Saturday, 25th August 2018, 14:57

Re: Classify backgrounds into difficulty

Berder wrote: Everything is weighted by the number of games.


Well, this is not true, or, if interpreted differently, this is one of the problems with your statistics.

It is biased by the number of combo selections, which is especially bad since there is a huge difference in the number of games played on certain combos. For some combinations there is no example in your data, and you did not even mention how you handled those!

Let me try to convince you again. Let's assume you have two races, R1 and R2, and two backgrounds, C1 and C2. Let's assume that the win probabilities of a player are the following:

R1C1 = 0.1 R1C2 = 0.2
R2C1 = 0.8 R2C2 = 0.9

And assume that the games played are the following:

  Code:
    C1   C2
R1  10    1
R2   1   10


If you run some simulations, you will find that C1 is more likely to produce a higher score than C2, while for both races we assumed that C1 has a lower probability to win.

Spoiler: show
  Code:
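from __future__ import division
import numpy.random as rand

racenum, classnum = 2, 2

# probs[r][c]: assumed true win probability for race r with class c
probs = [[0.1, 0.2],
         [0.8, 0.9]]
# games[r][c]: number of games played with race r and class c
games = [[10, 1],
         [1, 10]]

def baseline(r):
    # Winrate of race r over all classes (this includes the class being scored)
    return (sum(wins[r][c] for c in range(classnum))
            / sum(games[r][c] for c in range(classnum)))

def score(c):
    # Game-weighted average of (winrate - baseline); reads the global `wins`
    total = sum(games[r][c] * (wins[r][c] / games[r][c] - baseline(r))
                for r in range(racenum))
    return total / sum(games[r][c] for r in range(racenum))

count = 0
for i in range(10000):
    # Draw a random number of wins for every race/class combination
    wins = [[rand.binomial(games[r][c], probs[r][c]) for c in range(classnum)]
            for r in range(racenum)]
    if score(0) > score(1):
        count += 1

# Number of trials (out of 10000) in which C1 scored higher than C2
print("Count : " + str(count))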


You can also get a strange result if you assume that the number of wins mirrors the probabilities:

R1C1 = 1 R1C2= 0
R2C1 = 1 R2C2 = 9

These win counts still lead to C1 being better than C2.
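This particular case needs no simulation and can be checked by hand or with a few lines of Python. The sketch below assumes the same score definition as the simulation above, and also tries the exclude-own-class baseline from Berder's formula; the `exclude_self` flag is a hypothetical name for that switch.

```python
from fractions import Fraction

# Rows are races R1, R2; columns are classes C1, C2.
games = [[10, 1],
         [1, 10]]
wins = [[1, 0],
        [1, 9]]

def score(c, exclude_self):
    """Game-weighted mean of (winrate - baseline) for class column c."""
    total = Fraction(0)
    for r in range(2):
        cols = [k for k in range(2) if not (exclude_self and k == c)]
        base = Fraction(sum(wins[r][k] for k in cols),
                        sum(games[r][k] for k in cols))
        total += games[r][c] * (Fraction(wins[r][c], games[r][c]) - base)
    return total / sum(games[r][c] for r in range(2))

for exclude_self in (False, True):
    print(exclude_self, score(0, exclude_self), score(1, exclude_self))
# prints: False 2/121 -2/121
#         True 1/10 -1/10
```

Under both baseline variants C1 ends up with the higher score, even though C2 has the higher winrate for R1's assumed probabilities and for R2's.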

For this message the author sanka has received thanks: 2
Arrhythmia, Berder

Tartarus Sorceror

Posts: 1774

Joined: Tuesday, 23rd December 2014, 23:39

Post Saturday, 25th August 2018, 17:46

Re: Classify backgrounds into difficulty

That's an interesting quirk, but it's caused by the low number of games in your example. It's not a flaw in my metric; any sane metric should give the same result. If you instead multiply the number of games by 10, with games = [[100,10],[10,100]], then C2 wins most of the time, as you would expect.
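A quick way to compare the two game counts side by side is a seeded simulation. This is a sketch only, using the standard library instead of numpy and the same all-classes baseline as sanka's snippet; `c1_ahead_rate`, the trial count and the seed are arbitrary illustrative choices.

```python
import random

# Assumed true win probabilities: rows are races R1, R2; columns C1, C2.
PROBS = [[0.1, 0.2],
         [0.8, 0.9]]

def draw_wins(games, rng):
    """Draw a binomial win count for every race/class cell."""
    return [[sum(rng.random() < PROBS[r][c] for _ in range(games[r][c]))
             for c in range(2)] for r in range(2)]

def score(wins, games, c):
    """Game-weighted mean of (winrate - baseline) for class column c."""
    total = 0.0
    for r in range(2):
        base = sum(wins[r]) / sum(games[r])  # baseline over all classes
        total += games[r][c] * (wins[r][c] / games[r][c] - base)
    return total / (games[0][c] + games[1][c])

def c1_ahead_rate(games, trials=3000, seed=0):
    """Fraction of simulated datasets in which C1 outscores C2."""
    rng = random.Random(seed)
    ahead = 0
    for _ in range(trials):
        wins = draw_wins(games, rng)
        if score(wins, games, 0) > score(wins, games, 1):
            ahead += 1
    return ahead / trials

small = c1_ahead_rate([[10, 1], [1, 10]])
large = c1_ahead_rate([[100, 10], [10, 100]])
print("C1 ahead, few games: ", small)
print("C1 ahead, 10x games: ", large)
```

With the small table C1 comes out ahead in more than half of the trials; with ten times as many games it comes out ahead in well under half.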


To figure out what is happening in the [[10, 1],[1,10]] case, consider the [[100,1]] case which is simpler and easier to analyze, but shows the same effect you saw.

racenum, classnum = 1,2
probs = [[0.1,0.2]]
games = [[100,1]]

Now, in this example, C1 will almost always have some wins, because it played 100 games at 10% underlying winrate. But 80% of the time C2 will have no wins. Therefore, it makes sense that 80% of the time, C1 (with some wins) should beat C2 (with no wins), despite the lower underlying winrate. Any sane metric should say that C1 beats C2 most of the time in this example.
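The 80% figure can be computed exactly rather than simulated. With a single race, C1 scores higher than C2 exactly when its raw winrate is higher (this holds for either baseline variant, since both classes are measured against the same race), so it is enough to sum binomial probabilities. A small sketch, where `prob_c1_higher_winrate` is just an illustrative helper name:

```python
from math import comb

def binom_pmf(n, k, p):
    """Probability of exactly k wins in n games at win probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_c1_higher_winrate(n1, p1, n2, p2):
    """P(wins1/n1 > wins2/n2) for independent binomial win counts."""
    total = 0.0
    for w1 in range(n1 + 1):
        for w2 in range(n2 + 1):
            if w1 * n2 > w2 * n1:  # compare winrates without dividing
                total += binom_pmf(n1, w1, p1) * binom_pmf(n2, w2, p2)
    return total

print(prob_c1_higher_winrate(100, 0.1, 1, 0.2))  # games = [[100, 1]]
print(prob_c1_higher_winrate(10, 0.1, 1, 0.2))   # games = [[10, 1]]
```

For games = [[100,1]] this works out to 0.8 * (1 - 0.9^100), a hair under 80%. For games = [[10,1]] it is 0.8 * (1 - 0.9^10), roughly 52%, which is the same effect at a smaller magnitude.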

The same effect is seen when games = [[10,1]], just of lesser magnitude: C1 winrate is nonzero more often than C2 winrate is nonzero. And when games = [[10,1],[1,10]] the effect is just combined with the opposite one, where C1 winrate with R2 is 100% more often than C2 winrate with R2 is 100%, still in favor of C1.