Going deeper on the London Mulligan

The London mulligan is here to stay.

Mythic Championship London saw the ultra-fair Humans triumph in the finals over the mostly-fair Affinity. The sky didn’t fall, and Modern wasn’t overthrown by turn-two kills.

But in the meantime… something evil was forming, ready to arise and cause terror among unsuspecting Modern players. The printing of Neoform in WAR powered out turn-one kills, and Hogaak in MH1 allowed for the same on turn two. Cards from both sets have allowed more linear strategies to potentially break the new mulligan rule.

Now, while the rule is becoming more questionable in Modern with each passing set, what about Limited? Wizards’ recent moves have clearly been about pushing Arena’s Standard and Limited formats, so we’ll take a look at what the rule’s effect is on Limited. With the data in your hands you can then consider whether you think the risk to Modern is worth it.

Previously I walked you through how the new rule would affect starting hand quality. I showed a significant power boost to linear strategies relying on a strong starting hand, and a much more tempered effect on the format’s more “fair” strategies. In particular, I showed that a very aggressive mulligan strategy for Tron could be worth the risk, with a much greater likelihood of finding a playable hand. I predicted Tron players at MC London would be mulliganing much more aggressively than usual, and while I don’t have that data, the coverage of that event did indeed seem to show high-performing Tron players mulliganing very aggressively. I myself managed to lose to Tron three times on day one, including losses to opponents on only four and five cards.

Method

In my previous article, for simplicity, we only compared the Paris and London mulligans at the time of a player keeping. In this article we will instead look at hand quality after the first draw, which allows us to consider the Vancouver mulligan as well. Then we’ll compare all three mulligan styles against each other. We will also analyze each mulligan system on a simplistic Limited deck to properly consider the wide range of games of Magic. I’ll show the improvement from Paris to Vancouver to London and the difference in improvement for each deck.

A quick overview of the method being used: we’re using a Monte Carlo simulation where a hand of cards is randomly selected from a deck, the hand is scored based on my own custom parameters, and the ranks are recorded to determine the likelihood of each hand strength for each mulligan method. For this article I’ve overhauled the hand selection algorithms a bit to ensure that drawing a card never reduces a hand’s quality. This is because we’re more concerned with overall strength than keepability and want a direct comparison of seven- to five-card hands. This, and the fact that we’re looking at hands after the first draw, accounts for the difference in data between the two articles. However, as you’ll see, the overall conclusions are still the same.
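To make the method concrete, here is a minimal sketch of that kind of loop. It is not the code from the repository: the 17-land deck, the `toy_score` rule and the function names are all assumptions made up for illustration, and it only models a plain random hand plus the first draw.

```python
import random
from collections import Counter

def simulate(deck, hand_size, score_hand, trials=100_000, seed=0):
    """Estimate how often each hand rank comes up for a given hand size."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        # Opening hand plus the first draw step, sampled without replacement.
        cards = rng.sample(deck, hand_size + 1)
        counts[score_hand(cards)] += 1
    return {rank: n / trials for rank, n in counts.items()}

# Hypothetical 40-card deck: 17 lands and 23 spells.
deck = ["land"] * 17 + ["spell"] * 23

def toy_score(cards):
    # Assumed rule for illustration only: keep on two to five lands.
    lands = cards.count("land")
    return "keep" if 2 <= lands <= 5 else "mulligan"

result = simulate(deck, 7, toy_score, trials=20_000)
print(result)
```

A real scoring function would return several ranks rather than just two, but the counting works the same way.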

As I write this article Hogaak has arisen and is currently running amok in Modern. While I’d have loved to include a Hogaak analysis, writing the hand strength algorithm takes time and experience with the deck. I haven’t had a chance to play with it yet, but if it continues its reign of terror and survives the next ban hammer I’ll look into including it in a future article. I’ve also pushed all of the code to my GitHub repository if you’d like to play with it yourself.

Results

I’ll first show a chart for each deck displaying the likelihood of finding a keepable hand at seven, six and five cards for each mulligan strategy. We’ll see the likelihood of a keepable hand at seven cards being the same for all mulligan strategies, as you’d expect with no difference between them, then we can view the drop-off in quality from there. I’ve created this set of high level charts for easy digestion, but I’ve also created a sheet with the full data set if you’d like to look at the data in more depth.

As we discussed in the previous article we indeed see that the London mulligan provides a significant improvement for each deck. In particular Tron and Amulet see the largest improvements. We can also now see the effect of the Vancouver rule’s scry being part way between the quality of Paris and London mulligans. Not exactly groundbreaking data, but nice to see the data roughly following what we’d expect.

In the first article I made a simplification, stating that the strength of a hand with the London mulligan would be roughly equivalent to that of a seven card hand. For this article I spent the extra effort simulating the London mulligan directly by testing the hand quality of each subset hand to select which card to “put back.” With this new method we do indeed still see that the overall likelihood of finding a playable hand is roughly the same. I’ve chosen to display Tron here as a sample showing how the quality of the hands is generally reduced, so while hands are indeed just as keepable they are on average weaker. Every archetype similarly shows a reduced hand quality with a stable keepability percentage.
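The “test every subset” idea can be sketched in a few lines. This is an illustration under assumptions, not the repository code: `best_subset`, `london_hand` and the toy scoring rule are hypothetical names, and the score here is just a number where higher is better.

```python
import random
from itertools import combinations

def best_subset(seven, keep_size, score):
    """Try every way of keeping keep_size cards and return the best one."""
    return max(combinations(seven, keep_size), key=score)

def london_hand(deck, keep_size, score, rng):
    """Draw seven cards, then bottom the cards that hurt the hand least."""
    seven = rng.sample(deck, 7)
    return best_subset(seven, keep_size, score)

def toy_score(hand):
    # Assumed rule for illustration: prefer an even split of lands and spells.
    lands = hand.count("land")
    return -abs(lands - len(hand) / 2)

deck = ["land"] * 17 + ["spell"] * 23
hand = london_hand(deck, 5, toy_score, random.Random(1))
print(hand)
```

Going from seven cards to five means checking only 21 subsets, so the brute-force search is cheap even across 100,000 trials.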

The London mulligan helps every deck, but it helps linear, synergy-based decks more than non-linear decks. A player is also just as likely to find a keepable five card hand as a keepable seven, though the card quality will be lower.

Limited

We’ve taken a look at four staple Modern archetypes, but what about the other formats? Modern and other eternal formats are generally very synergistic formats where players will frequently give up resources to push a single synergistic strategy. At the other end of the spectrum is simple core-set Limited. This kind of Magic is generally defined by one-for-one trades and trying to curve-out. Where players are incentivized to mulligan very aggressively in Modern, the opposite is true in Limited where every card can count.

While there has been some discomfort in what the London mulligan rule might do to eternal formats, there has been near universal agreement that it would be good for Limited play. Let’s take a look at the numbers to see how much of an improvement there is.

I’ve decided to use a very simplistic, all-commons version of a Core 2019 Limited deck. While this may be oversimplified, I believe it gives a good indication of how mulliganing will affect the simplest Limited deck. The deck follows “CABS Theory” (Cards that Affect the Board), a concept popularized by Marshall Sutcliffe which, conveniently, is also easy to simulate. While I would not be that happy drafting this deck, it wouldn’t be a complete disaster, and it provides a good baseline for the mulligan rule in Limited.

Since I described my mulligan strategies in my previous article I decided to skip them in this article, but I’ll quickly describe what I’m defining as keepable hands with this deck.

  1. Perfect Curve
    1. Both colours of mana
    2. At least four spells
    3. A 2-drop and either a second 2-drop or a 3-drop
  2. Good Curve
    1. Both colours of mana
    2. At least three spells
    3. At least one 2 or 3-drop
  3. Keepable
    1. Version 1)
      1. 1 land of each colour
      2. At least three spells
      3. At least one 3-drop
      4. No 2-drop (this would have been called “good curve” otherwise)
    2. Version 2)
      1. At least two lands
      2. At least two spells
      3. At least one 2 or 3-drop
      4. At least one castable spell of the same colour as lands by turn four
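
The categories above translate fairly directly into code. The sketch below is one possible reading, not the algorithm from the repository. The `Card` type, the colour letters and the function name are hypothetical. It also assumes that “both colours of mana” means at least one land of each colour, that version 1 is checked before “good curve” (so a two-land hand without a 2-drop is downgraded, as the parenthetical suggests), and that the last rule of version 2 means a spell costing four or less that shares a colour with a land in hand.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Card:
    colour: str            # "W" or "U" in this hypothetical two-colour deck
    cost: int = 0          # converted mana cost; unused for lands
    is_land: bool = False

def classify(hand, colours=("W", "U")):
    lands = [c for c in hand if c.is_land]
    spells = [c for c in hand if not c.is_land]
    land_colours = {c.colour for c in lands}
    both = all(col in land_colours for col in colours)
    twos = sum(c.cost == 2 for c in spells)
    threes = sum(c.cost == 3 for c in spells)

    if both and len(spells) >= 4 and twos >= 1 and (twos >= 2 or threes >= 1):
        return "perfect curve"
    # Keepable, version 1: exactly one land of each colour and no 2-drop.
    one_each = all(sum(c.colour == col for c in lands) == 1 for col in colours)
    if one_each and len(spells) >= 3 and threes >= 1 and twos == 0:
        return "keepable"
    if both and len(spells) >= 3 and twos + threes >= 1:
        return "good curve"
    # Keepable, version 2: the loose keep.
    castable = any(s.cost <= 4 and s.colour in land_colours for s in spells)
    if len(lands) >= 2 and len(spells) >= 2 and twos + threes >= 1 and castable:
        return "keepable"
    return "mulligan"

def land(colour):
    return Card(colour, is_land=True)

sample = [land("W"), land("U"), land("W"),
          Card("W", 2), Card("U", 2), Card("W", 3), Card("U", 4)]
print(classify(sample))
```

Anything that returns something other than "mulligan" counts as a keepable hand in the sense used in the rest of the article.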

I’d consider version two of “keepable” to be a bit of a loose keep. Some players may mulligan more aggressively than that, but I believe any hand in that category would at least be under consideration. As a small sanity check, when our team tested RNA for MC1 we recorded mulligans in 750 games and found we were mulliganing between 7 per cent and 30 per cent of our hands, a fairly wide range. The algorithm here mulligans 13.9 per cent of seven card hands, which does a pretty good job of simulating a rough average of this real-life data.

As we saw with the Modern deck data, the London mulligan again stays roughly equivalent across all hand sizes as far as keepability goes. We also see the same pattern where those hands are keepable but of generally lower quality. I find the most interesting data point to be the comparison between the London and Vancouver mulligans at six cards. The London mulligan produces a keepable hand after the first draw only 1 per cent more often than the Vancouver mulligan. However, where we do see a significant difference is before that first draw. After a single mulligan the Vancouver mulligan will have a keepable hand 75.3 per cent of the time, while the London mulligan offers a keepable hand 86.3 per cent of the time. This implies the Vancouver mulligan is forcing players to keep potentially sketchy hands and cross their fingers that the scry will get them there.

I’ve included the hand quality of the Vancouver mulligan to show that the hand quality of those kept hands is also worse than we’re seeing with the London mulligan. A Vancouver mull to five will be keepable 84.0 per cent of the time vs 92.7 per cent of the time for London. Not only are players less likely to be able to keep their Vancouver mulligan’d five card hand, but the quality of that hand will also be reduced.

We again see the London mulligan helping players find keepable hands after mulliganing, giving them a better shot at playing Magic. It is easy to see Wizards’ reasoning for this change; given the frequent jokes directed at the game, I’m sure it’s a bit of a sore spot for them. While the new rule has less of an effect on Limited than on other formats, the difference is still clearly noticeable.

Overall Comparisons 

We now calculate the likelihood of finding a keepable five card hand by the time we reach five cards. I’m considering this essentially the last chance for a player to play a reasonable game of Magic. We can therefore read this data as: “how likely are we to get to play a game?” We see clear improvement across the board and, as noted multiple times, the linear decks, Tron and Amulet, see the greatest increase in performance.

Lastly, we calculate the likelihood of finding a keepable hand by the time we reach five cards. From this data we see the biggest improvement from Amulet, where the London mulligan allows the deck to select which of its many moving pieces it can afford to give up. In Limited we see a jump from Paris at 75.1 per cent to Vancouver at 84.0 per cent to London at 92.7 per cent, upping the chance of keeping that five card hand by nearly nine percentage points over Vancouver. This is the gain we’re seeing for Limited, and it is Wizards’ reasoning for wanting to push through this new mulligan rule. I also expect that in more synergistic formats, or in decks with splash colours, we would see more aggressive mulliganing and therefore a larger improvement from the London mulligan.
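One way to think about “keepable by the time we reach five cards” is to chain the per-size keep rates together. The sketch below shows that arithmetic only. It assumes a player keeps the first keepable hand and that each new hand is independent of the last, the function name is hypothetical, the rates in the example are made up, and it is not necessarily how the figures above were produced.

```python
def chance_to_play(keep_rates):
    """Chance that at least one hand in the sequence is keepable."""
    miss = 1.0
    for rate in keep_rates:
        miss *= 1.0 - rate   # every hand so far was unkeepable
    return 1.0 - miss

# Made-up keep rates at seven, six and five cards.
example = chance_to_play([0.8, 0.7, 0.6])
print(round(example, 3))
```

With those made-up rates the only way to miss is to fail three times in a row, 0.2 x 0.3 x 0.4 = 0.024, which leaves a 0.976 chance of finding a keepable hand.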

How accurate is this?

For this article I ran 100,000 simulations for each mulligan style at each hand size. Given that we have five decks, with three hand sizes and three mulligan styles we’re looking at data generated after 4.5 million simulations. The full simulation takes my PC roughly an hour to run.

Now for a little math: after my previous article I got a few questions about confidence intervals and the error bounds of my data. Because each simulated hand is a binary outcome, a standard ANOVA test doesn’t apply. Instead I use a Wald-style interval and estimate the 95 per cent error as 2 * sqrt((1 - p) / n), where n is the sample size and p is the calculated probability. In my dataset the smallest value recorded is roughly 0.01 and the largest is 0.9995, which give errors of 0.0063 and 0.00014 respectively; at a value of 0.5 the error bound is 0.0045. The textbook Wald half-width, 2 * sqrt(p * (1 - p) / n), carries an extra factor of p under the square root, so these figures are if anything conservative. In simple terms, the 95 per cent error of the data you’re seeing should be no more than +/- 0.6 per cent.
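For anyone who wants to check the error bounds themselves, here is a small sketch of the textbook Wald half-width for a proportion, z * sqrt(p * (1 - p) / n). The function name is hypothetical, and this is the standard formula rather than the exact expression quoted above, so the outputs come out at or below the figures in the text.

```python
import math

def wald_half_width(p, n, z=1.96):
    """Half-width of the Wald confidence interval for a proportion."""
    return z * math.sqrt(p * (1.0 - p) / n)

n = 100_000  # simulations per mulligan style and hand size
for p in (0.01, 0.5, 0.9995):
    print(p, wald_half_width(p, n))
```

The interval is widest at p = 0.5, where it comes to roughly 0.0031, comfortably inside the +/- 0.6 per cent bound.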

With that said, when viewing the numbers we should consider that greater error will come from the hand ranking system we’re using. It should be understood that strategies will change for each mulligan style and at different hand sizes. Mulligan strategy changes further based on matchup, play/draw, etc. Here we assume a consistent mulligan strategy so we can make a direct comparison, but I believe the majority of any error we’re seeing is from this simplification.

Conclusions

We’ve now examined the mulligan rule’s impact on both Modern and Limited play and shown that they are both improved. Overall Limited sees a much lower improvement than any of the Modern decks. This appears to come from the fact that players just don’t mulligan as aggressively in Limited. This rule could encourage players to mulligan more with the confidence that they’ll find a playable hand, especially in formats with higher synergy.

I have not yet considered the impact of interaction with this mulligan rule. I’ve begun working on an algorithm accounting for an early Thoughtseize and had wanted to include it in this article, but building the simulation ended up taking longer than expected and I decided there was enough data to share at this time. For my next article, I’ll discuss how interaction affects those mulligans. I expect to see the London mulligan’s consistently playable hand once we add interaction to the mix.

My personal stance on the London mulligan is that I do believe it makes Modern a worse format. It incentivizes players to mulligan more aggressively in an attempt to push their primary game plan. This makes decks more consistent, incentivizes linear strategies and causes more shuffling (ugh!). While it doesn’t necessarily mean anything is “broken” or “unfair”, I believe it’s worse for the format as a whole. However, as Wizards is focused on Limited and Standard for play on Arena, I do think the change is overall good for the game. It’s just unfortunate that the eternal formats will likely suffer a bit until they find equilibrium either through metagame shifts or bans.
