Wednesday, January 28, 2015

Theorycrafting 101

So I saw a Twitter conversation where someone was basically asking, "How do I Theorycraft?" While Twitter is amazing for jumping in on these conversations, it's terrible for any sort of protracted discussion. I'm relatively new to the Theorycrafting space for WoW itself, but I've managed to make a couple waves (like with my secondary stats posts). I'm no Theck, or Bouchebaguette, or Vixsin, but I do okay I think.

So here's how I got into Theorycrafting, and a slightly formalized measure of how to do it. Warning, my example will be WoW-heavy.

There are definitely days where I feel this.

Theorycrafting 101

Theorycrafting is science. You're basically applying the scientific method to something in the game you're playing. As a refresher for those who don't remember 6th grade science class, the scientific method is as follows:

Totally stolen from http://www.cdn.sciencebuddies.org/Files/5084/7/2013-updated_scientific-method-steps_v6_noheader.png

Step 1: Ask a Question, Do Some Research

The process of forming your question starts with being curious about something, then performing research, simplifying variables, and adding constraints and base assumptions to come up with a more specific question, repeating until you think you have something you can try to answer.

Okay, so what do you want to figure out? For my secondary stats posts, one of the questions I wanted to get the answer to was, "When should I take an upgrade?". I started scouring the Internet using sources like other theorycrafters, blue posts, data in wowpedia.org, wowhead.com, icy-veins.com, and using my own logic and knowledge of the game to figure out what is involved in a gear upgrade: ilvl, primary stats versus secondary stats, sockets, warforged, tertiary stats, DPS versus survivability, different classes determining how much a stat is worth via different stat weights. That's clearly way too much to handle, so it was time to start whittling that down to more specific questions and clear away irrelevant data.

I decided personally to focus on DPS. "When was it useful for DPS to take an upgrade?" Tanks and healers muddy the water because they're not concerned with raw throughput, but for my first attempt, raw throughput is a relatively easy concept to measure.

Once I had a more specific question, I could clear away largely irrelevant data. DPS versus survivability, well, we're trying to optimize our DPS throughput, so Stamina and the defensive benefits of Versatility could be ignored. Tertiary stats are not DPS boosting stats. Well, wait, what about movement speed? The faster you can move, the less moving you need to do, which means more uptime on your rotation. Ehh, like physicists, let's pretend the cow is a perfect sphere, or the fight is Patchwerk, so we're creating more constraints to simplify the problem.

This is how theoretical physicists do it.
So now we're down to ilvl, primary stats versus secondary stats, sockets, warforged, and stat weights. More research indicates that ilvl is tied to something called a stat budget. Basically, the amount of primary and secondary stats on a piece of gear is related to a formula based on ilvl. Ideally, two bracers, for example, with the same ilvl, will have the same budget, or raw values of primary and secondary stats. Basically, ilvl is equivalent to your primary stats (because your gear will always have Stamina plus your primary attribute), so we can collapse a couple more of our inputs.

So, ilvl, secondary stats, sockets, warforged, and stat weights. Warforged is just ilvl, so we can collapse that down to ilvl as well. Sockets throw a wrench in the works, so I decided to ignore that for the time being.

We know that higher ilvl should generally be better, but because of how secondary stats and stat weights interact, not every piece is created equal for every class. Stat weights inform how good a specific secondary is for a class, so we know that this is going to vary per class. But maybe there's a good rule of thumb? I mean, eventually we'll get enough stat budget that it won't matter what secondaries are on the item, we'll just want to take it regardless.

Stat weights themselves are generated through simulation. Someone else did all the work, and for my piece, while I did a lot of research into how those are generated, that research wasn't strictly necessary--unless I felt like challenging how stat weights are generated, but I didn't. So a basic assumption I went in with was that stat weights are generally accurate for our purposes.

So the question has become, "For maximum standing DPS throughput for most classes, how many more ilvls does an upgrade need to have before you should probably take it no matter what, ignoring sockets and tertiaries?"


Step 2: Form a Hypothesis

Hypothesis: Poop Theft
For our question, we should take an educated guess at what the result might be. For my post, I figured, well, 15 ilvls is probably a pretty good bet given that's how much space there is between raid tiers.

If you're having difficulty forming a hypothesis, or you have so many caveats your hypothesis might be mistaken for a cell phone contract, you'll want to go back to Step 1. Your question is likely too broad.

The process of coming up with an educated guess will inform how you design your experiment. If your question is good enough, this step should be relatively simple.


Step 3: Construct and Execute an Experiment

Constructing an experiment requires you to take your question with your research, and use logic and/or math to determine a way to test your hypothesis.


Time to get into the nitty gritty! So to test that a 15 ilvl jump is sufficient to take an upgrade, we need a way to measure that jump. Thankfully, we can use aforementioned stat weights to measure how much value a piece of equipment offers. To determine if an item is an upgrade, we need only evaluate its value and compare it to the piece we're currently trying to upgrade.

But that's not sufficient. Remember that we're trying to find a more general answer, so perhaps we should consider the worst-case scenario: the item we're starting with has nothing but the best secondary we like, and the item we're upgrading to has nothing but the worst secondary, like an Enhancement Shaman with 630 Haste bracers, upgrading to 645 Versatility bracers. If we can show what ilvl bump is needed for the worst possible scenario, then anything better will clearly fit our model and hooray, success!
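The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`item_value`, `is_upgrade`), the stat weights, and the stat amounts on both bracers are made-up placeholders, not simulated weights or real item data.

```python
def item_value(stats, weights):
    """Value of an item: each stat amount times that stat's weight, summed."""
    return sum(amount * weights.get(stat, 0.0) for stat, amount in stats.items())


def is_upgrade(current, candidate, weights):
    """True when the candidate item is worth strictly more than the current one."""
    return item_value(candidate, weights) > item_value(current, weights)


# Hypothetical stat weights (value per point); real ones come from simulation.
weights = {"agility": 1.0, "haste": 0.6, "versatility": 0.4}

# Worst-case pairing: the current item carries only the best secondary,
# the candidate carries only the worst. All amounts are placeholders.
current_bracers = {"agility": 100, "haste": 130}
candidate_bracers = {"agility": 115, "versatility": 150}

print(item_value(current_bracers, weights))
print(item_value(candidate_bracers, weights))
print(is_upgrade(current_bracers, candidate_bracers, weights))
```

With these placeholder numbers the higher-ilvl bracers lose, which is exactly the kind of case the worst-case scenario is meant to probe.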


Step 4: Analyze Data and Draw Conclusions

What conclusion can you draw from this data?
Once you have your data, you need to compare it to your hypothesis. Do the results hold up your hypothesis? Partially or fully? Can you break or find holes in your results? Can you refine your experiment to test boundaries? Can you draw other inferences from your data? You may need to repeat the loop between Step 3 and Step 4 many times before coming to a satisfactory answer.

When I executed the experiment, I ended up showing that actually, 10 ilvls was sufficient for Enhancement Shaman in the worst possible case. But again, that's not enough. We can't generalize with that. While Enhancement Shaman have some of the more divergent secondary stat weightings, they're not the worst.

So instead I found even more divergent stat weightings and repeated the experiment for a stat weight spread where the best is nearly twice as good as the worst (something that pretty well never actually occurs in simulated stat weights), and we found that, hm, actually, we'd need about 18 ilvls. More data and more calculations proved that to be an outlier, however, so we could caveat our results to the point where in all but the most egregious stat weighting cases, 10 ilvls was sufficient, and 15 ilvls was a sure bet.
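For anyone who wants to repeat this kind of worst-case search, here is a rough Python sketch. It is not the calculation from my original posts: the budget curve (a flat 1% more stats per ilvl), the ratio of secondary to primary stats, and every weight passed in are assumptions picked purely for illustration, so the gaps it prints should not be read as results.

```python
def budget(ilvl, base_ilvl=630, base_budget=100.0, growth_per_ilvl=1.01):
    """Hypothetical stat budget: grows by a fixed percentage per item level."""
    return base_budget * growth_per_ilvl ** (ilvl - base_ilvl)


def worst_case_gap(w_primary, w_best, w_worst,
                   secondary_ratio=1.0, start_ilvl=630, max_gap=60):
    """Smallest ilvl gap at which an item with only the worst secondary
    is worth at least as much as an item with only the best secondary.
    Returns None if no gap up to max_gap is enough."""
    def value(ilvl, w_secondary):
        b = budget(ilvl)
        # Primary stat equals the budget; secondary is a fixed ratio of it.
        return b * w_primary + b * secondary_ratio * w_secondary

    current_value = value(start_ilvl, w_best)
    for gap in range(1, max_gap + 1):
        if value(start_ilvl + gap, w_worst) >= current_value:
            return gap
    return None


# Placeholder weights: a mild spread and a spread where best is twice worst.
print(worst_case_gap(w_primary=1.0, w_best=0.6, w_worst=0.4))
print(worst_case_gap(w_primary=1.0, w_best=0.8, w_worst=0.4))
```

The useful part is the shape of the experiment: fix the worst-case pairing, then walk the ilvl gap upward until the candidate wins.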

Since the results held at lower ilvls, and stat budget is exponential in nature, ilvl gaps at higher ilvls would only exhibit this behaviour more strongly. That is, it would require smaller ilvl gaps to make an upgrade worthwhile at higher ilvls. Empirical analysis (basically, just trying the experiment at higher ilvls) bore this out.

Interestingly, my results didn't take into account procs on Trinkets, or the stat weightings for weaponry, so I had to caveat my results there as well. They don't match the constraints I put together in my question and research phase, so they'd also require their own experiments and refinement. Sockets, as mentioned before, also did not fit.


Step 5: Communicate Results

This is a very important step, not only because you're trying to answer people's questions, but because you're opening yourself up to peer review. Folks who read your work will naturally find holes in it. Every theorycrafter makes mistakes, but mistakes are completely okay. By correcting those mistakes, you either find your conclusions were bunk, or you can alter the experiment to make your conclusions even more airtight.

In my case, I published my results for secondary stats on my blog, and a totally valid criticism came along that I had used bracers, an item with very low, if not the lowest, stat budgets. So I repeated my experiments with an item that had one of the highest stat budgets: the chest piece. My results came back even stronger.

Expect challenges to your work. In fact, relish those challenges. If someone pokes a hole in your logic or math, don't despair. Fix it! And if you can't fix it, and your experiment was a failure, you've still contributed knowledge to the greater community as well as your own experiences. In science, and in theorycrafting, failing and being wrong aren't bad things. It's still useful information other folks can build upon.



Conclusion

So that's it, that's theorycrafting in a nutshell. Having a math/science background helps significantly, as does knowledge in computer science. I can't offer how to analyze your data precisely, or how to generate exact experiments, because it differs greatly depending on the question, and some of that does require mathematics, statistical models, and so on.

But people didn't know off-hand how to build an experiment to test if molecules were made up of atoms: that only came after numerous other experiments that expanded scientific knowledge and consensus, and the same holds with theorycrafting. It takes practice, doing, and sound logical thought.

So start with something relatively small, what you think might be easy--you'll be quickly surprised at how not easy even the simplest questions tend to be with all of their caveats--but that's okay. Simplify and answer a very specific question, add in the caveats afterwards to generalize or explore your results further. And most importantly, have fun doing it! #Theorycrafting

Monday, January 26, 2015

[WoW] Is Raiding With 10 Harder Than 30?

So one of the things I've been mulling about in the back of my mind is in World of Warcraft, are current raids more difficult with fewer people? When you look at Highmaul, most mechanics require a static number of players to handle them, regardless of raid size:
  • Kargath - 5 players go to the stands
  • Butcher - Most strategies require the movement of 2 players
  • Tectus - Each shard requires a ranged player to deal with Crystalline Barrage
  • Brackenspore - Two players are utilizing Flamethrowers to keep moss at bay
  • Twin Ogron - Enfeebling Roar is distributed among 10 targets maximum (20 for Mythic)
  • Ko'ragh - Every time Ko'ragh goes into recharge mode, you need to use a (preferably) ranged player to use Nullification Barrier to soak Overflowing Energy
  • Imperator Mar'gok - Usually need 2 players to handle Branded
The more players you have, the less you notice the absence of players handling special mechanics. For example, in a 30-player raid, losing two DPS to handle Flamethrowers is trivial, because you're probably running 22 - 23 DPS, meaning at worst you're only losing 10%. In a 10-player raid, you're only running 5 - 6 DPS, so you've now lost 33% - 40% of your total DPS.
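The arithmetic above is easy to check with a tiny Python helper. The function name `dps_share_lost` is just something made up for this sketch, and it assumes every DPS player contributes equally.

```python
def dps_share_lost(assigned_dps, total_dps):
    """Fraction of raid DPS lost when some DPS players are pulled onto a
    mechanic, assuming every DPS player does the same damage."""
    if total_dps <= 0:
        raise ValueError("total_dps must be positive")
    return assigned_dps / total_dps


# Two players on Flamethrowers, for the DPS counts mentioned above.
for total in (5, 6, 22, 23):
    print(f"{total} DPS in raid: {dps_share_lost(2, total):.1%} lost")
```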

Now, granted, in the Brackenspore case they get a buff to offset DPS lost, but that's not always the case. On Tectus, at some point you're losing 5 ranged players to running around dodging Crystalline Barrage. In a 10-man raid, that means you may end up with all of your ranged DPS and healers running around only able to toss out instant-casts!

Similarly with Kargath, losing half your raid (more than half your DPS if you're currently running 3 healers) every time you get tossed into the stands is a significant loss of DPS on the boss. Twin Ogron's Enfeebling Roar becomes trivial to soak if you have a big enough raid, as your melee/tanks will basically auto-soak it by virtue of 10 players near the boss, with nothing extra required to handle it.

I understand why the developers/designers wanted to avoid numerically scaling mechanics: it creates optimal inflection points (like ye olde 14 players of SoO Flex) that, correctly or incorrectly, raids will balance themselves around. But arguably health on some of these bosses should not scale linearly. Adding an extra DPS player on Kargath to reach 11 players means you've added 50% extra damage during the stands phases (having 3 DPS on the ground), whereas adding an extra DPS player to go from 29 to 30 players would be an extra 6% or so damage during the stands phases (with 19 DPS on the ground).

Given each Chain Hurl you're expected to be in the stands for 45 seconds, and starting at 1:30 into the fight you see a Chain Hurl about every 2 minutes, in a 5-minute fight you're short DPS for 30% of the fight. Meaning for 10-player, you're short a total of ~15% DPS or so, compared to 30-player where that would work out to a ~1.8% DPS deficit. Adding an extra DPS to make an 11-player raid would reduce that to ~7.5%, and from there the number begins to decline.
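The "30% of the fight" figure can be reproduced with a short Python sketch. It only covers the share of the fight spent in the stands, using the timings quoted above as defaults; the function name is made up here, and it does not attempt to recreate the per-raid-size deficit numbers.

```python
def stands_time_fraction(fight_seconds, first_hurl=90.0,
                         interval=120.0, stands_seconds=45.0):
    """Fraction of the fight during which the Chain Hurl group is in the stands.
    Defaults follow the timings quoted in the post (first hurl at 1:30,
    then roughly every 2 minutes, 45 seconds in the stands each time)."""
    in_stands = 0.0
    hurl_time = first_hurl
    while hurl_time < fight_seconds:
        # Clip the last visit if the boss dies while players are still up there.
        in_stands += min(stands_seconds, fight_seconds - hurl_time)
        hurl_time += interval
    return in_stands / fight_seconds


print(f"{stands_time_fraction(300.0):.0%} of a 5-minute fight")
```

Longer or shorter kills shift that fraction, which is one reason the exact deficit depends on how fast your raid kills the boss.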

Not quite sure what happened at Players = 20. I checked a couple logs and they both had that dip.
However, digging through WarcraftLogs, I found the health values for Normal Kargath across all raid sizes. It's clearly a linear progression (approximately 3.3 million health per player added). This suggests that fewer DPS players need to do a fair bit more DPS to make up for the lack of players. Thus, Kargath is quantifiably more difficult for 10 players than he is for 30. The more players you have, the less DPS each player needs to do in aggregate.
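To see why linear health scaling is harder on small raids, here is a rough Python sketch. The 3.3 million health per player comes from the logs mentioned above; everything else is an assumption made for illustration: health is treated as exactly proportional to raid size (no base amount), every raid runs 2 tanks and one healer per 5 players, the fight lasts 5 minutes, and the stands are ignored entirely.

```python
def dps_players(raid_size, tanks=2, players_per_healer=5):
    """Hypothetical role split: 2 tanks and one healer per 5 players."""
    healers = raid_size // players_per_healer
    return raid_size - tanks - healers


def required_dps_each(raid_size, health_per_player=3.3e6, fight_seconds=300.0):
    """Damage per second each DPS player must sustain to kill the boss in time,
    assuming boss health is simply health_per_player * raid_size."""
    boss_health = health_per_player * raid_size
    return boss_health / (fight_seconds * dps_players(raid_size))


for size in (10, 15, 20, 25, 30):
    print(f"{size} players: {dps_players(size)} DPS, "
          f"{required_dps_each(size):,.0f} DPS needed from each")
```

Under those assumptions the per-player requirement shrinks as the raid grows, because tanks are a fixed cost that a bigger raid spreads over more DPS.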

Granted, especially on lower difficulties, many raids just ignore the stands completely (healing through the extra damage), but the fight as designed expects you to send folks to the stands, and for initial kills that's likely the case.

The same pattern can easily hold for pretty well every mechanic I've listed in the preamble. I just wish I could filter WarcraftLogs' data for average ilvl for kills over a time period per raid size so I could either support or debunk this argument. In theory, if the average ilvl of kills at one raid size is lower than at another, then with enough samples we could say that raid size is likely easier--caveat certain raid sizes for Mythic guilds, as they tend to be above 20 players on Heroic mode so they can gear their bench, so data may be slightly skewed.
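If that data were available, the comparison itself would be simple. Here is a Python sketch that assumes kill records have been collected by hand as (raid size, average ilvl) pairs; it does not use or imply any WarcraftLogs API, and the sample records are invented placeholders, not real log data.

```python
from collections import defaultdict
from statistics import mean


def average_ilvl_by_raid_size(kills):
    """Group (raid_size, average_ilvl) kill records by raid size and
    return the mean ilvl for each size, ordered by raid size."""
    by_size = defaultdict(list)
    for raid_size, avg_ilvl in kills:
        by_size[raid_size].append(avg_ilvl)
    return {size: mean(ilvls) for size, ilvls in sorted(by_size.items())}


# Placeholder records for illustration only (NOT real kill data).
sample_kills = [(10, 640.0), (10, 650.0), (30, 630.0)]
for size, ilvl in average_ilvl_by_raid_size(sample_kills).items():
    print(f"{size}-player kills: average ilvl {ilvl:.1f}")
```

A real comparison would also need enough kills per raid size, and some way to account for the Mythic-guild skew mentioned above.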

How's Healing Affected?

Healing is a lot harder to quantify, as lots of damage tends to be optional. However, more healers means more overlap. When you're running two healers, that leaves little to no room for error. When you're running with 5 or 6 healers, yes, there's more damage going out, but you can only amp up tank damage so much before we're back in two-shot territory circa MoP, and even if you, say, double the damage taken between 10 players and 30 players, the fact that you have twice as many healers staring at your tanks' health bars means if one healer gets distracted for a moment (like having to run away from Tectus' mechanics), you're less likely to lose that tank.

Then there's also the age-old issue of losing players. Thankfully the battle-rez charging mechanic they added helps this issue quite nicely, but you still have the issue where at the end of the day, if you're down two players in 10-player raids, that's 20% of your raid, versus two players in 30-player raids, that's only 6.67% of your raid.

About the only thing that is harder in bigger raids--aside from having to organize that many people, and the probably nightmarish job of analyzing what went wrong--is space. Fewer players means fewer people overlapping for any mechanics that require spreading (e.g., Expel: Fire on Ko'ragh, or Pulverize on Twin Ogron), but that goes back to the fact that when you have 2 - 3 times as many healers, you need to amp up damage somewhere, and if you aren't increasing it on the tank, you're increasing it on the raid, and a lot of these overlap mechanics are where you'll see the increased damage taken to keep those extra healers busy.

Conclusion

So even ignoring the touchyfeelycraft around tank damage and healers, DPS at least shows a quantifiable increase in difficulty for smaller raid sizes that is currently not being taken into account by Blizzard's raid encounter designers if Kargath himself is any indication. Note that the more stringent DPS requirements mean that if they're not being met, the strain will be passed on to the healers and their mana pools, as they'll have to heal for a longer amount of time.
#WorldOfWarcraft, #Theorycrafting



UPDATE: It's clear from the discussions on Twitter and Bakoth's comment below that my point was lost in the minutiae of my post. It's not just that raiding with 10 is harder than 30, it's that it's excessively more difficult.



The above graph is the DPS loss charted over the number of players (and you can clearly see in the graph when I add an extra healer). For a 10-man, the players need to play disproportionately better than the players in an 11-man, and then again for a 12-man. The DPS loss is a curve. If it were linear, then yes, I would agree that all is fine and dandy.




This graph is the DPS deficit each remaining DPS would have to make up (this is mildly faulty given that the 3 players in the stands can increase their own DPS in the ~70% of the fight they aren't in the stands, but I don't really care to do the algebra when this will still illustrate my point accurately enough). This graph is clearly an inverse exponential curve.

The remaining DPS must make up a 9% deficit each for 10-player. But 11-player needs only to make up 5% each. 12-player, 3.21%. 13-player 2.25%. Once we hit 14 or 15 players, we're close enough to linear that I would say it would meet the goal of making things a little bit easier as your raid gets bigger.

Kargath, due to the nature of 3 DPS rather than 2 in most other cases, is a more extreme example. Luckily, he's the first boss of the tier, so this difference is largely steamrollered by the fact that, well, he's quite an easy boss on both Normal and Heroic. However, my point stands: static assignments disproportionately punish the smallest raids.

Is it a huge differential? To a Mythic guild bumming around in Heroic, no. It's not. To a guild that's struggling to hit their DPS targets to begin with (take a hypothetical Kargath-like fight later in pretend Tier-18), you have fewer players, and all of them must improve by a significant margin.

Is it even worth changing the mechanics? Honestly, probably not. But a simple tweak in health scaling at the lowest end would smooth that curve out a little. Leave it such that more players means an easier raid, if that's Blizzard's goal, by all means, have at it. However, the current style of mechanics isn't really solving the "inflection point" issue they had in SoO Flex. Instead, it's just pushed the inflection point such that optimal raid size is greater than 13 or 14 players.

Monday, January 19, 2015

[WoW] My First WoW Arena Foray

Over the past couple weeks I've started getting into WoW's PvP, and I've actually had a pretty good time.

Now, let me make something clear: I still hate World PvP, where the modus operandi is wander around with a gang of players and find smaller gangs to annihilate. And after playing through enough Ashran to make my eyes bleed to get my honor set, I still hold that to be true. That is emphatically Not Fun. Oh the occasional one-on-one or two-on-two skirmish that I got into was a blast, even when I lost. But if I wasn't getting whacked by roving bands of the opposing team, it was watching the complete cluster that was the mobs of players on the main road.

Granted, a single Flame Shock and a judicious Lava Lash, run away and spam Fire Nova? Numbers everywhere the eye could see. But even that lost its luster incredibly quickly.

What I DID find fun was Arena. But let's back up a second.

My main character is an Enhancement Shaman this expansion, and since my BF convinced me to give Arena a try, and he always mains a healer, I figured, all right, let's take the DPS I'm most familiar with even if everyone says it sucks.

In Ashran, at least, I managed to make about 3k honor an hour thanks to Tablet of Ghost Wolf and just wandering around looking for treasures (which would net me anywhere from 30 - 150 artifact fragments). When you turn in 500 of those suckers, you're looking at 2k honor plus 2.5k rep. Getting my honor set went extremely quickly. I just basically ignored all the PvP aspects of Ashran to do it.

With a full set of honor gear and a little research into glyphs/talents and PvP tactics, away we went. We weren't expecting to win very many matches. He hadn't Arena'd since Cataclysm, and I'd never Arena'd at all. Add that to our very odd combination of Enhancement Shaman/Discipline Priest, and, well, we expected the worst.

Turns out we're better than we thought.

Enter the Arena

Sixty-one 2v2 arena matches, with a 54% win rate at a 1400ish MMR. Which isn't great, I realize. It's kinda terrible if you compare it to folks who do PvP with any sort of regularity. But given that's a grand total of 61 PvP battles I've **ever** had, well, I'd say it's okay. Anyways, we were expecting closer to a 25% win rate to start.

However, boy am I really starting to feel how poorly Enhancement Shaman is set up for PvP. Our one major CC, Hex, 45 second cooldown. Our only stun, Capacitor Totem, easy to stomp, 5 second windup, basically requires Totemic Projection to be usable (which I'm still only so-so at). Compare that to spammable Cyclone, or a single Rogue putting me out for a full 12 - 16 seconds, even with a trinket, and it's just strange. I cannot begin to imagine how frustrating PvP was before they did a pass on hard CC.

Granted, I do have lots of snares, and lots of movement boosts, so unless I'm facing a DK, I'm pretty hard if not close to impossible to shake. DKs, on the other hand, are close to impossible for me to escape, even with Windwalk Totem/Spiritwalk.

And having 20%+ of our DPS tied to a totem that can be taken out in 2 GCDs? Thanks Fire Elemental, for absolutely nothing. I've gotten decent at hiding the totem behind pillars, or using Totemic Projection to move it if enemies are getting close, but anybody who's even remotely useful stomps it flat in no time. Actually, totems in general are a problem for that.

But spammable Purge? Yay! At least we're good for something.

However, Enhancement single target DPS is definitely lacking. Nine times out of ten the other team has more damage done than I did, even with all of my cooldowns in use. I simply cannot compete with Warriors, Rogues, DKs, or Ret Paladins for DPS even remotely. They usually have a good 30% on top of me. And don't get me started on Hybrid Heals. I've seen Frost DKs pull off as much healing as a healer does in an Arena match. I mean, what's the point of bringing a healer?

Comp Time

Anyways, those frustrations aside, I have been having fun. There are some comps that we generally just throw up our hands because unless they're totally incompetent, we can't do squat about them (Rogue/Warrior or Ret/DK are generally the case, though we've actually gotten better at surviving Ret/DKs and have actually won a couple, but there are a LOT of Ret/DK comps). Other comps are annoying, but we generally win unless we goof up.

I've seen a total of 2 other Shaman: a Resto and an Enhancement. The Enhancement Shaman was part of a comp with a Beast Mastery Hunter, and they summoned all the pets. I was like, okay, you're going to give me 6+ targets? You're going to get Fire Nova'd into oblivion. And they did, and we won in short order. It was like Enhancement was the perfect counter to Enhancement. The Resto Shaman I trounced quickly as well, burst them into the ground. They definitely did not pose a threat to us whatsoever.

Interestingly, the vast majority of players we've run into have been Ret Paladins, Frost DKs, Arms Warriors, Hunters, and Rogues, with a few Feral Druids for good measure. Pretty much every other spec we've only run into 1 or 2 (although that Brewmaster was lethal as heck. Another class where I just could not out-DPS their healing throughput). Not a lot of ranged classes, now that I think of it.

My BF is starting to note comps we lose or win to, and why we won/lost so we can start being more strategic about things, but it's been a good time so far. Should be interesting to see how things go. In any case, my PvP instincts are starting to get better, and hopefully we'll see if my skills improve. On the bright side, some of the Conquest 660s I've bought are good PvE upgrades since we're only starting to get into Heroic mode.
#WorldOfWarcraft, #PvP

Monday, January 12, 2015

[WoW/Design] Whose Game Is It Anyway? Designers vs. Players

Alternative Chat has yet another great post for thought on social interactions in MMOs, pointing out the differences between what players want, and what designers need to do. It's hard to boil her post down into a single point, but the big themes I got out of it are that designers are trying to make a game, and players want something the designers aren't giving them.

Or more to the point, the players think they want something the designers aren't giving them. If there's one thing I've learned watching the WoW community, it's that players are pretty bad at figuring out what they actually want. And to be fair, that's not the players' job.

A prime historical example of this would be the transition from Wrath to Cataclysm in World of Warcraft. Vocal players thought Wrath was too easy. Designers agreed, so they made Cataclysm harder. Whelp, turns out the designers missed the silent majority of players who didn't think the game was too easy, and pissed them right off to the point where many of them left. However, making the game easy again wasn't the correct solution--though if you asked those players who were saying, "5 mans are too hard!" they'd have told you, make it easier. But making it easier again would just lose the customers who enjoyed the harder content.


(Source) Corborus, The PUG Breaker. I can't count the number of times I've had random groups stuck on this boss.
That's not to say there aren't players out there who have good ideas. But as a designer, and heck as a programmer for other designers, the thing I've learned to do is ask what they think the problem is. Many folks love providing solutions, but as a programmer, if you give me a solution without telling me what problem you want to solve, I'll program what you think you want, but I can guarantee it won't make you happy. I've already seen that happen a few times on the game I'm currently making.

Context is key, and asking "Why?" over and over again until you can distill the actual reasoning for the solution posited is a better way to figure out if the proposed solution actually addresses the problem in a way that makes sense, or if the person is offering up potential solutions without actually thinking of the real reasons why. And those reasons can differ from player to player:

Let's take the statement, "Heroic 5-mans in Cataclysm were too frustrating."
Player 1: Why? Because we keep dying to bosses. Why? Because people don't know how to stay out of the bad. Why? Because they don't know how to play. Why? I don't know. My guildies don't have the same issues.
Once you get down to the bottom of it, this player probably had an issue with being thrown into the LFD tool with folks of differing skills/attitude/goals.
Player 2: Why? Because the dungeons are too hard. Why? Because we keep dying to bosses. Why? Because I keep instantly dying. Why? Because bosses kill me in one shot. Why? Because I can't react fast enough or I don't see the mechanic.
This player isn't at the skill level required to down the dungeons, and may never be, but they still want to do the content because last expac, heroic 5-mans were easy enough to do, but now they aren't. Or potentially some of the mechanics aren't as obvious as they should be.

So two player perspectives, completely different, stemming from the same base statement of, "this content is too frustrating." Once you start digging, you realize that you need different solutions, because while on the surface it sounds like the same issue, it's quite different once you get a little bit below the surface.

For WoW, solutions came about eventually by the addition of the buff for random groups to help offset communication issues, and later on in Warlords the addition of an automated skill gate to ensure players didn't accidentally walk into content that was too difficult for them. Also, the addition of further difficulty levels for raiding. Lots of different techniques to mitigate the "too hard" issue without actually necessarily changing the difficulty of the content for folks who were enjoying it.



Difficulty Levels. Not a new concept, but an important one for a diverse player base.
While Alternative Chat talks about how "designers are first and foremost making a living," and that the No Flying in Draenor issue illustrates the designers making decisions--no matter how unpopular--for the betterment of the game as a whole, I think the scenario is a bit more nuanced than that.

MMOs provide a unique opportunity in video games for rapid iteration of features. One-shot video games don't get this. They'll hear feedback on forums and the like after release, but the game is done, there's no more fixing it. Sequels can address that feedback, but MMOs are a fantastic opportunity for player feedback to be heard and acted upon. And player feedback is key to keeping the game engaging and keeping the dollars flowing for said designers making a living, because if your playerbase isn't happy, they'll leave, and take their dollars with them.

That's not to say designers should address everything players say. Sometimes players will never be happy, and sometimes you might have to decide, well, this 3% segment hates this part of the game, but this other 50% segment loves it. Oh well, too bad for that 3%. And sometimes other parts of the game fall apart without making an otherwise unpopular decision (like Draenor's treasure finding and flight).

So whose game is it anyway? Ultimately, it's the designers'. They'll make decisions that are (hopefully) right for the game, to make it fun and to give it longevity. But they can't ignore player feedback, either. It needs to be taken into account in some fashion, or they'll end up losing a chunk of players.
#WorldOfWarcraft, #GameDesign

Monday, January 5, 2015

[WoW] Are Silver Proving Grounds a Good Gate for Random Heroic 5-Mans?

Something that's been on the back of my mind since WoD launched is the Silver Proving Grounds requirements for queuing random Heroic 5-Man dungeons. I've run Gold Healer and DPS PGs on two characters so far, and while Silver was trivial, Gold required some thought and planning for DPS at least--healing I just waltzed in, waved my hands, took Gold, and walked out, but then again, I'm a much better healer than I am DPS.

In The Ancient Gaming Noob's recent post, he goes into how he nabbed DPS Silver, no problems, and then proceeded to get trounced by the Heroic dungeons. Similarly, I recently healed Heroic Slag Mines for some guildies of mine, and while I love them dearly, after wiping on Roltall six times I had to break it to them that they did not have the DPS to kill the boss (despite being nearly 630 ilvl each). Roltall is a DPS race: if you don't kill him fast enough, the entire platform gets covered in fire and ain't no one healing through that.

But since we queued for that random 5th player, they all must have had Silver PGs in their roles. So where's the disconnect?

Clearly Silver isn't actually doing what it's supposed to: ensuring that folks who are queuing for random dungeons meet a minimum bar where they can be successful in a group where skill levels are not guaranteed and communication is at a premium.

Make no mistake, current Heroic 5-mans are difficult. They're not Challenge Mode difficult, sure, but they're nearly as difficult as Cataclysm Heroics were. Those of us with a lot of experience raiding probably cannot fairly judge the relative difficulty of dungeons, as once you've seen a few expansions worth of content, everything starts to look recycled. It's simple to break down fights into older fights as a proxy and use known strategies/coping techniques.

But for those who are not that experienced, it's a wake-up call, and not a pleasant one. If Silver was supposed to show you are prepared, well, enter Illidan's infamous quote. At least, not in its current incarnation.

Gold PGs currently might be a tad too difficult as a gate for Heroic 5-Mans

So let's say Blizzard made Silver more difficult, or upped the requirement to Gold. Some folks already struggle with Silver, and only get through because someone else points out a flaw in their strategy (often around the use of interrupts and/or stuns if my experiences are any indication). This is currently a failure on Blizzard's part for not giving enough feedback in PGs to show why you failed. Currently it is left as an exercise for the player, and many players don't have the skills yet to break down why they goofed. Heck, I've seen Mythic-level raiders who've subbed in our raid on occasion give completely incorrect and off-base wipe analyses, so it's no wonder that players who already struggle with Silver cannot figure it out.

And sure, one could argue that they need to do the research outside the game, the same as the rest of us have, but to me that's a failure in game design. The game should give you enough feedback to figure out what happened so that you can tailor a new strategy from there. Now, this is probably far too big for raiding/dungeons, but for PGs that have a very scripted, controlled environment? Should be completely doable.

Of course, another argument from there is if they get used to the game telling them, it's a crutch that doesn't exist elsewhere so they may be lost in dungeons, but frankly, I think the WoW design team needs to start somewhere.

I also think that Proving Grounds currently don't provide enough mechanics. Where's the bad on the floor that's so prevalent in most dungeons? Something that deals you damage and reduces your DPS? Perhaps more interesting techniques like Line of Sighting casters, or anticipating a big attack on your tank and pre-casting a heal to line up after the attack lands? The PGs provide a prime opportunity for teaching, not just testing, and I think if Blizzard wants to continue to use them as an effective gate, they need to up the quality of each test.

So currently, no, I don't think Silver Proving Grounds are a good gate. Their difficulty is not in tune with Heroic 5-mans currently, and they leave newer DPS players with little recourse but to look up the puzzle solution if they can't figure it out--Healing is less a puzzle and more about cooldown usage, but stuns/interrupts definitely make it much easier; I cannot comment on Tanking.

They're not a bad gate--better than no gate, I think, so from that perspective they've improved things by keeping the truly unaware/unprepared away from an environment where they and everyone around them would be frustrated, but they could definitely be better. #WorldOfWarcraft, #GameDesign

Wednesday, December 31, 2014

[WoW] Secondary Wars: The Socket Strikes Back

Last week I took a look at the point at which it makes sense to take a gear upgrade regardless of the secondaries involved. One of the criticisms leveled at it was that I used a pretty odd slot (bracers), given how small a contribution it makes. Totally valid, because with so few stats on the item it's harder to see the exponential curve as stats go up.

I also glossed over Sockets pretty quickly, and between my conversation in the comments with Balkoth, as well as more thought on it myself, I wanted to do a more thorough analysis, and also perform a comparison versus Warforged. But first, let's talk secondary stats a second time.

Secondary Stats Revisited

So last week I chose bracers. This week, let's go the other direction, let's pick the chest piece, one of the--if not the--biggest chunks of stats on a single piece of gear.



I nabbed plate chest pieces this time, and when you compare it to the bracers last week, it provides nearly twice the secondary stats as the bracers did (and about 50% more primary stat). So I ran through the same calculations I did last week for both Retribution Paladin and Enhancement Shaman stat weightings.


Retribution Stat Weights. You'll note they're much closer together than the Enhance weights were.
As before, I chose to make the base item the best piece mathematically possible, and check it against the mathematically worst Warforged version to see if it's worthwhile.

Retribution Chest

Item Level 630
167 Strength
216 Mastery
1632.3 = (167 * 5.7) + (216 * 3.15)

Item Level 636
177 Strength
229 Versatility
1627.2 = (177 * 5.7) + (229 * 2.70)

Item Level 685
279 Strength
365 Mastery
2740.05 = (279 * 5.7) + (365 * 3.15)

Item Level 691
295 Strength
385 Versatility
2721.00 = (295 * 5.7) + (385 * 2.70)
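Every score in these comparisons is just a weighted sum, so it's easy to script if you want to plug in your own gear. Here's a minimal Python sketch. The helper name `stat_score` and the dictionary layout are my own illustration (not anything from AskMrRobot or an addon); the weights and stat values are the Retribution numbers quoted above.

```python
# Retribution weights as quoted in the comparison above.
RET_WEIGHTS = {"Strength": 5.7, "Mastery": 3.15, "Versatility": 2.70}


def stat_score(item, weights):
    """Weighted sum of an item's stats (hypothetical helper for illustration)."""
    return sum(amount * weights[stat] for stat, amount in item.items())


# Best-case base items (all Mastery) and worst-case Warforged versions (all Versatility).
chest_630 = {"Strength": 167, "Mastery": 216}
chest_636_wf = {"Strength": 177, "Versatility": 229}
chest_685 = {"Strength": 279, "Mastery": 365}
chest_691_wf = {"Strength": 295, "Versatility": 385}

for base, warforged in [(chest_630, chest_636_wf), (chest_685, chest_691_wf)]:
    base_score = stat_score(base, RET_WEIGHTS)
    wf_score = stat_score(warforged, RET_WEIGHTS)
    print(f"{base_score:.2f} vs {wf_score:.2f} (difference {wf_score - base_score:+.2f})")
```

Swap in a different weights dictionary and the same helper works for any spec.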

As we can see, it's really quite close. Close to the point where 10 ilvls would be overkill for Paladins. Chances are, Warforged anything is better than the plain piece you're running. But they have a much closer stat spread than other classes, so let's take a look at Enhancement Shaman based on the spreads last week.

Enhancement Chest

Item Level 630
167 Agility
216 Haste
1463.4 = (167 * 5.4) + (216 * 2.6)

Item Level 636
177 Agility
229 Versatility
1413.8 = (177 * 5.4) + (229 * 2.0)

Item Level 685
279 Agility
365 Haste
2455.6 = (279 * 5.4) + (365 * 2.6)

Item Level 691
295 Agility
385 Versatility
2363.0 = (295 * 5.4) + (385 * 2.0)

So a larger gap, but the calculations for a 640 item (1467.6 = [184 * 5.4] + [237 * 2.0]) show that 10 ilvls is still sufficient to overcome the disparity. And frankly, like last week, if you pick the more reasonable case where the base item has your best two secondaries and the Warforged one has your worst two, Warforged is actually generally an upgrade for Ret:

Retribution Chest

Item Level 630
167 Strength
121 Mastery
95 Haste
1618.05 = (167 * 5.7) + (121 * 3.15) + (95 * 3.0)

Item Level 636
177 Strength
128 Versatility
101 Crit
1637.30 = (177 * 5.7) + (128 * 2.70) + (101 * 2.8)

Item Level 685
279 Strength
197 Mastery
168 Haste
2714.85 = (279 * 5.7) + (197 * 3.15) + (168 * 3.0)

Item Level 691
295 Strength
208 Versatility
177 Crit
2738.70 = (295 * 5.7) + (208 * 2.70) + (177 * 2.8)

For Enhancement, though, it's actually a (very minor) downgrade, so minor I'm not sure it's worth bothering with unless you think that extra ~20 DPS you'll get is worth it:

Enhancement Chest

Item Level 630
167 Agility
121 Haste
95 Mastery
1420.65 = (167 * 5.4) + (121 * 2.6) + (95 * 2.15)

Item Level 636
177 Agility
128 Versatility
101 Multistrike
1418.85 = (177 * 5.4) + (128 * 2.0) + (101 * 2.05)

Item Level 685
279 Agility
197 Haste
168 Mastery
2380.00 = (279 * 5.4) + (197 * 2.6) + (168 * 2.15)

Item Level 691
295 Agility
208 Versatility
177 Multistrike
2371.85 = (295 * 5.4) + (208 * 2.0) + (177 * 2.05)

So the conclusions I came to last week still hold as far as vanilla vs. Warforged. 15 ilvls is still a no-brainer, and unless you have a really broad stat spread (broader than Enhancement's), even 10 ilvls is solid regardless of secondaries on the piece. Add in the extra Stamina for survivability, and it's an even bigger boon.

The "exponential" stat curve is still really shallow at the ilvls where we are, so even a chest piece doesn't show it too clearly. That exponential curve really starts showing in aggregate, as the combined stat budget of all of your items starts scaling quickly as you go up ilvls. It'll be interesting to see how that changes when we hit T18 in a patch or two.

Warforged Versus Sockets

As Balkoth brought up last week, Sockets are a static bonus, and as stat budgets increase exponentially, the relative value of a Socket will decrease. Also, on a slot like bracers, which has a smaller overall budget than a chest piece, a gem has a greater relative effect. That is, a gem is going to compare more favorably to Warforged in bracers than in a chest.
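To put rough numbers on that, here's a small Python sketch that expresses a +50 Haste gem as a fraction of an item's own stat score. It assumes the Enhancement weights used in these posts (Agility 5.4, Haste 2.6) and all-Haste items; the chest values are from the comparison above and the bracers are last week's 630 pair. The helper name `gem_share` is made up for illustration.

```python
# Enhancement weights quoted in these posts.
AGILITY, HASTE = 5.4, 2.6
GEM_HASTE = 50  # the +50 gem discussed in the text


def gem_share(agility, haste):
    """Gem value as a fraction of the item's Agility + Haste score (hypothetical helper)."""
    item_score = agility * AGILITY + haste * HASTE
    return (GEM_HASTE * HASTE) / item_score


# name -> (Agility, Haste) for all-Haste versions of each item
items = {
    "630 bracers": (94, 121),
    "630 chest": (167, 216),
    "685 chest": (279, 365),
}

for name, (agility, haste) in items.items():
    print(f"{name}: gem adds {gem_share(agility, haste):.1%} on top of the item")
```

The gem is worth the same 130 points everywhere, so its share shrinks both as the slot gets bigger and as the ilvl goes up.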

So using Enhancement stat weights, I graphed the difference in stat weights for worst Warforged, best Warforged, and +50 sockets with the best secondary.



That graph needs some explanation. I took two baselines for 630/640/655/670/685, one with the best possible secondary, and one with the worst possible secondary. Then, I have the stat score for a Warforged item of that ilvl with the best/worst as a base, and I have the stat score for a +50 Haste gem in that ilvl with the best/worst item as a base, the idea being reality will lie between the maximum/minimum values of best/worst stat scores.

The green line, the worst secondary baseline, is clearly terrible. Everything is better. The best secondary baseline, however, is better than the worst-secondary item with either Warforged or a socket. But that makes sense given sidegrades, and the much larger chunk of base stats to work off of. The baselines of course change depending on the stat spread of a given class.

Really, the more interesting piece of information we can glean from this is that there's an inflection point where a +50 gem goes from better to worse than Warforged, as predicted by Balkoth in our conversations prior. The inflection point for Enhancement chest pieces lies somewhere between ~665 ilvl (worst stats, Warforged vs. +50 Socket) and ~687 ilvl (best stats, Warforged vs. +50 Socket), depending on the secondary stats on your item.

Given just how close it is for even the best case--and you're not likely to see too many chest pieces with nothing but Haste--for Heroic/Mythic pieces Warforged is probably about the same as Sockets, or Sockets eke out a slight win over Warforged until Mythic, where Warforged wins (barely). So I'd expect, in this scenario at least, that we'll hit the full inflection point somewhere in Mythic Foundry gear.

Let's take a look at Bracers:



No contest. Sockets > Warforged all the way through Mythic, and beyond. The inflection point likely lies in T18, I'm going to guess somewhere in the range of 705 ilvl.

So Enhancement has a decently wide stat spread, and Agility is worth a lot compared to even the best secondary. Let's perform the same exercise for Retribution, which has a much closer spread, and a primary that isn't worth as much comparatively.



Retribution mimics Enhancement relatively closely. For both chest and bracers we're looking at nearly identical inflection points. That may be a fluke, however, so let's take a third spec, Frost Mage, which has an even wider stat spread than Enhance.


Frost Mage Stat Weights. Versatility isn't last? Wow!

Even more interesting, the inflection point for a Frost Mage chest item is actually beyond Mythic for the best case item, probably by about 3 - 5 ilvls if I were to guess visually. And of course, for bracers that inflection point likely won't hit until 710ish.

So with all of that, I'm quite confident that Sockets are better than Warforged for this tier, excepting perhaps Mythic Blackrock Foundry gear for the big ticket items (i.e.: chest). For smaller items, Sockets will be better than Warforged into T18, unless Blizzard introduces Epic gems or something, in which case Sockets will likely be king for the whole expac as long as you're using +50 gems--which if you're using Heroic or better gear, when that +15 difference starts to really matter, you should be.

The caveats on this are weapons and trinkets. Weapons rely heavily on weapon damage/spell power boosts. I would bet money that Warforged for those would be better than Sockets (though I haven't done the math to prove it). For trinkets I think it'll depend on the proc, as Warforged would boost the proc effect, but a Socket would not. Also note that I'm still ignoring Stamina, but given how low Versatility is on the totem pole, many people seem to be ignoring the benefits of survivability.

All of this will get a bit more interesting once we toss in Blackrock Foundry gear, given it's 10 ilvls off Highmaul (650/665/680/695), giving us more options for in-between gear, and side-grades where we may have to decide if 4 or 6 ilvls are worth dropping a Socket (they won't be given Sockets > Warforged in all but the super-top-end gear, and Warforged is just 6 ilvls) or if 10 ilvls are worth dropping a Socket, or so on.

Addendum: Link to the spreadsheet where all my work on this is stored for perusal.
#WorldOfWarcraft, #Theorycrafting

Monday, December 22, 2014

[WoW] Why Rando-Secondary Stats Matter Less Than You Think

I've seen it time and time again:

"Versatility is garbage."
"Crit is the worst stat for my spec."
"I don't want that upgrade, haste is useless to me."

I've had guildies pass up 15+ ilvl upgrades because they didn't like the secondary stats. Murf mentions something quite similar in his listmas post. So, is that the truth of the matter? I venture that it's not, and that you're best off taking a raw ilvl upgrade in most cases.

Let's take a look at some classes. I know Enhancement Shaman relatively well, so I brought up my character on AskMrRobot and took a look at the stat weights given:


Not gonna lie, the fact that AskMrRobot doesn't normalize these to Agility = 1 is driving me mildly batty.
Now, at the end of the day, I don't actually care what the stat weights are. What we're trying to find out here is the point at which it really does matter which secondaries your gear has in terms of upgrading your character. And when you think about stat weights, they're generally just generated via simulation runs to see how much extra DPS each point of a stat is worth, usually compared to your primary stat. You'll also note that due to the way that stats interact with abilities, they don't always scale the same. For example, Enhance loves Haste right now, but if I recall correctly, the more Haste you get, the better Mastery looks, until there's an inflection point where Mastery overtakes Haste.

So really I only care about what the spread is between the "best" secondaries and the "worst" secondaries. The actual values, or the exact secondaries we care about are immaterial.
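If the un-normalized weights bug you as much as they bug me, here's a quick Python sketch that rescales them so Agility = 1 and confirms the best-to-worst spread doesn't change. It uses only the Enhancement weights that show up in the calculations in this post (Crit is left out because I don't quote a value for it), and the variable names are just for illustration.

```python
# Enhancement weights used in this post (Crit omitted: its value is not quoted here).
weights = {
    "Agility": 5.4,
    "Haste": 2.6,
    "Mastery": 2.15,
    "Multistrike": 2.05,
    "Versatility": 2.0,
}

# Rescale so the primary stat is worth exactly 1.
normalized = {stat: value / weights["Agility"] for stat, value in weights.items()}

secondaries = [stat for stat in weights if stat != "Agility"]
best = max(secondaries, key=weights.get)
worst = min(secondaries, key=weights.get)

# The best/worst ratio is what matters, and it survives normalization.
spread_raw = weights[best] / weights[worst]
spread_norm = normalized[best] / normalized[worst]
print(best, worst, round(spread_raw, 2), round(spread_norm, 2))
```

Both ratios come out to about 1.3, which is the "Haste is 30% better than Versatility" figure I lean on later.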

I grabbed some items from the current WoW raids/heroic 5-mans, and charted out the stat progression for the basic item, and with warforged (I'm ignoring sockets for a moment for simplicity. We'll get back to those). I chose Bracers--interesting point I learned, there are no mail belts in Highmaul LFR from bosses. I wonder if there's a shared drop I missed? Below you can see a chart that shows all of the values that I extracted:



So now that we have all of that, let's assume you have double your best secondary on your 630 ilvl item, and all your upgrades have double your worst secondary. In the case of the Enhancement Shaman weightings, that would be an item that looked like:

Item Level 630
+94 Agility
+121 Haste

And all our upgrades only have Versatility. So let's compare our 630 to the next closest thing, a Warforged version with Versatility, which would look like:

Item Level 636
+99 Agility
+127 Versatility

So now take the stat weightings above, and multiply them by the differences. We had 121 Haste, so that's 314.6 points of power (121 * 2.6). 127 Versatility is 254 points of power (127 * 2), meaning the difference is ~60.6 points. The difference in power attributed to Agility is (99 * 5.4 - 94 * 5.4) = (5 * 5.4) = 27, leaving us with a power differential of ~33.6, or around 7 more points of Agility to make it worthwhile.

All right, let's bump it up to LFR.

Item Level 640
+103 Agility
+134 Versatility

So we have our prior 121 Haste (314.6), versus our new 134 Versatility (268), but a new difference of 9 Agility from our base item (48.6), meaning the upgrade is 2 points better than what we started. So in the absolute worst case for the weights given, 10 ilvls difference is sufficient to say screw it, take the item regardless of the secondaries no matter what (for a weighting of 2.6 vs 2.0 and stacking the BEST versus the worst).
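Here's the same worst-case comparison as a small Python sketch, in case you want to try other ilvls. It assumes the Enhancement weights above (Agility 5.4, Haste 2.6, Versatility 2.0) and the bracer stats from this post; `delta_vs_base` is a made-up helper name.

```python
AGILITY, HASTE, VERSATILITY = 5.4, 2.6, 2.0  # Enhancement weights used in this post

# ilvl 630 bracers with nothing but Haste (the best secondary).
BASE_SCORE = 94 * AGILITY + 121 * HASTE


def delta_vs_base(agility, versatility):
    """Score of an all-Versatility upgrade minus the all-Haste 630 (hypothetical helper)."""
    return agility * AGILITY + versatility * VERSATILITY - BASE_SCORE


upgrades = {636: (99, 127), 640: (103, 134)}  # ilvl -> (Agility, Versatility)

for ilvl, (agility, versatility) in upgrades.items():
    delta = delta_vs_base(agility, versatility)
    # If the upgrade is behind, express the shortfall as extra Agility needed.
    agility_needed = max(0.0, -delta / AGILITY)
    print(f"ilvl {ilvl}: {delta:+.1f} points, {agility_needed:.1f} more Agility to break even")
```

For the 636 the shortfall works out to a bit over 6 Agility, which is where the "around 7 more points" above comes from once you round up; the 640 lands about 2 points ahead.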

Interestingly enough, that seems to hold scaling up. Performing similar calculations for 676 upgrading to 685 shows that it would be better to keep the 676 (with the caveat that the stats on that item are a bit strange), but if we were to get a couple more ilvls on the 685, it would beat out the lower level item.

So that's the worst case. Let's take a normal case, where the secondaries are split between the best two and the worst two.

Item Level 630
+94 Agility
+70 Haste
+51 Mastery
(94 * 5.4 + 70 * 2.6 + 51 * 2.15) = 799.25

Item Level 636
+99 Agility
+74 Versatility
+53 Multistrike
(99 * 5.4 + 74 * 2 + 53 * 2.05) = 791.25

And just like that, the 636 is within 8 points (about 1%) of the 630, close enough to call it a wash even when going from the two best secondaries to the two worst.

All of the above assumes our weightings are "correct". Haste is 30% better than Versatility according to AskMrRobot's weights. What if the gap was wider? If you look at the weightings for Holy Priests, you're looking at a 57% gap--1.1 for the best, 0.7 for the worst, with Intellect at 2, ignoring Spirit for a moment because that's another bucket of worms.

Holy Priest Weightings. It's clear to me that these weightings are generally created to enforce priority order, rather than work with raw numbers...but let's pretend for a moment, shall we?
Item Level 630
+94 Intellect
+121 Multistrike
(94 * 2 + 121 * 1.1) = 321.1

Item Level 640
+103 Intellect
+134 Versatility
(103 * 2 + 134 * 0.7) = 299.8

As expected, still worse. In the case of weightings like these, it doesn't overcome the stat weight disparity until a little above 646 (if we could get items at 648 or so, that would do the trick). But again, it's rare to see stats stacked like that, so let's take a much closer to real world example:

Item Level 630
+94 Intellect
+70 Multistrike
+51 Haste
(94 * 2 + 70 * 1.1 + 51 * 1) = 316

Item Level 640
+103 Intellect
+74 Versatility
+60 Mastery
(103 * 2 + 74 * 0.7 + 60 * 0.8) = 305.8

In the case of Holy Priests with the weights as given, Warforged wasn't sufficient, and even 10 ilvls leaves the 640 about 10 points (roughly 3%) behind when every secondary goes from one of your best two to one of your worst two. That's about as bad as a realistic item gets, and the gap is small enough that taking the 640 won't hurt much, but with a stat spread this wide 10 ilvls isn't quite an automatic replace.

All of this is to say that ilvl is still largely king. If it's a gap of 10 ilvls or more, you're probably best just taking the upgraded item and calling it a day, regardless of the stats that it has on it. All of this gets completely out of whack when it comes to trinkets (because of procs), weapons (due to weapon damage/spellpower), or sockets. I'm not going to cover trinkets because that's a case by case basis, but sockets are interesting.

Right now sockets are worth +35/+50 of a secondary stat. For our Enhancement case, if we got a 630 item with a socket and gemmed Haste, that's a huge power differential:

Item Level 630
+94 Agility
+70 Haste
+51 Mastery
+35 Haste (Socket)
(94 * 5.4 + 70 * 2.6 + 51 * 2.15 + 35 * 2.6) = 890.25

Item Level 646
+109 Agility
+78 Versatility
+64 Multistrike
(109 * 5.4 + 78 * 2 + 64 * 2.05) = 875.8

So a single green gem in a socket is enough to make that 630 better than the unsocketed LFR Warforged item. What if we flipped it so that we had the best socket but worst stats to start?

Item Level 630
+94 Agility
+70 Versatility
+51 Multistrike
+35 Haste (Socket)
(94 * 5.4 + 70 * 2 + 51 * 2.05 + 35 * 2.6) = 843.15

Item Level 636
+99 Agility
+74 Haste
+53 Mastery
(99 * 5.4 + 74 * 2.6 + 53 * 2.15) = 840.95

The socket is better than Warforged, even when comparing the worst stats to the best. And that's with a crappy gem, no less.
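Both socket comparisons can be reproduced with a short Python sketch. It assumes the Enhancement weights used throughout this post; the `score` helper and its `gem` parameter are my own illustration, not part of any tool.

```python
WEIGHTS = {"Agility": 5.4, "Haste": 2.6, "Mastery": 2.15, "Multistrike": 2.05, "Versatility": 2.0}


def score(stats, gem=None):
    """Weighted stat score; `gem` is an optional (stat, amount) pair. Hypothetical helper."""
    total = sum(amount * WEIGHTS[stat] for stat, amount in stats.items())
    if gem is not None:
        stat, amount = gem
        total += amount * WEIGHTS[stat]
    return total


green_gem = ("Haste", 35)

# Best secondaries plus a socket vs. an unsocketed 646 with the worst secondaries.
socketed_best = score({"Agility": 94, "Haste": 70, "Mastery": 51}, gem=green_gem)
plain_646 = score({"Agility": 109, "Versatility": 78, "Multistrike": 64})

# Worst secondaries plus a socket vs. an unsocketed Warforged 636 with the best secondaries.
socketed_worst = score({"Agility": 94, "Versatility": 70, "Multistrike": 51}, gem=green_gem)
warforged_636 = score({"Agility": 99, "Haste": 74, "Mastery": 53})

print(f"{socketed_best:.2f} vs {plain_646:.2f}")
print(f"{socketed_worst:.2f} vs {warforged_636:.2f}")
```

Change `green_gem` to `("Haste", 50)` to see how much further ahead the socketed items pull with the better gem.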

All of the above assumes the stat weights are accurate, which frankly, they are not. It's clear in both the Enhancement Shaman and Holy Priest cases that while they may be based on sims, they've been tweaked to look pretty--especially for the Priest, which I'm confident is literally just enforcing priority ordering rather than actual weights with math behind them. Remember, those weights change as you get more of a given stat because of how stats interact in your spec.

In my mind, that means a 15 ilvl upgrade (ie: LFR->Normal->Heroic->Mythic) is an absolute no-brainer, take it regardless of stats, unless you have a socket. Sockets are worth a good 5 - 7 ilvls on their own, if not 10 with a rare gem. 10 ilvls is probably okay as well, unless you're truly going from 2 awesome stats to 2 of your worst stats, at which point you need to understand more about your stat weights and how spread out they are. Basically, you mostly only need to worry about optimizing your secondary stats once you're getting side-grades (including warforged).

I think I'm justified in saying ilvl is largely king, but I definitely underestimated the power of sockets. #WorldOfWarcraft, #Theorycrafting