March « 2019 « Dungeon Crawl Stone Soup

What is seeding?

The 0.23 release introduced a major and ongoing overhaul in how DCSS approaches random seeding from the player’s perspective. If you’re not familiar with this idea from other games with random procedural generation like Minecraft, Spelunky and brogue, here’s how it’s supposed to work: if you enter a specific code (the “seed”) into the game when starting it, you’ll get a repeatable form of randomness that encompasses dungeon layout, items, and monsters. Any dungeon generated with that seed should be identical. This has many applications: from Spelunky or Brogue (and many others) you might be familiar with the concept of a daily, weekly, or monthly run. In these, all players enter the same seed and over some time interval explore the same dungeon. Here are a few ways crawl community has been experimenting with this since the release:

On the CPO server, run by chequers, there is an experimental weekly online challenge with a share dungeon.
Ultraviolent4 has started running an offline Game of the Month challenge. (Edit: The next challenge begins on Apr 7, per uv4.)

I’m excited to see what else develops. But what does it take for this to actually work? This post goes into the background on what has been involved in doing this retrofitting for DCSS, where it is at, and what the remaining problems are.

Randomness and procedural generation

Procedural level generation happens via constrained randomness. Some process repeatedly draws on a random number generator according to a fixed, predefined set of constraints and rules, making choices that (ideally) generate a coherent, explorable, locale. At the heart of this is a random number generator, which usually takes the form of a function that the programmer can repeatedly call, generating a sequence of numbers.

Now, “true randomness” is more of an ideal than anything that can be implemented, so RNGs are really only “pseudo-random”. What this means is that they aim to (in addition to matching some overall distribution) be unpredictable, which a viewer or adversary will perceive as random. Unpredictability can be defined in precise mathematical terms that I won’t go into here, but it doesn’t actually mean complicated: some of the best rng algorithms are remarkably simple. They basically all work by initializing the rng state with a “seed”, and repeatedly transforming that state, and the heart of the one we use, PCG, is just 5 lines of code. For cases where there is an adversary, you want this initial seed to itself be unpredictable, and so common strategies are to use some external state of the program (e.g. exact time, external devices) to do this initialization: then, if the adversary doesn’t know the initial state, and the algorithm is unpredictable in the right way, they’re out of luck.

Video games flip this idea on its head. For some kinds of RNGs, if you initialize the generator in the same way, multiple instances will generate the same sequence of numbers in the same order, while still behaving unpredictably and well-distributed as long as the seed is unknown. This means that if the rules and constraints that draw on the RNG are implemented correctly (we’ll come back to this), the behavior given that seed acts random but is completely replicable. You can think of the RNG as, given a seed, providing a “code book” of integers that acts random, but as long as you have the code book and the correct rules/constraints (giving you position in the list), you can decode the behavior of the game.

There are, then, two main ways to make this go wrong: mess up the initial conditions, or cause the rng to get out of sync. This latter case is when two processes are using the same sequence, but make decisions starting from different points in the sequence. Because of unpredicatability, their behavior will diverge extremely rapidly (more on this later). In general, if RNG behavior diverges at all, it tends to diverge thoroughly and quickly.

DCSS has long used the PCG RNG which (like the Mersenne Twister, xorshift, and many others) supports this reproducibility; retired dev bh added this in 2015, replacing a hand-rolled implementation of an algorithm from the 80s that was more predictable. However, using a good RNG algorithm is only one piece of the picture, and turns out to be probably the least complicated part of this project.

Why player-facing seeding was (and is) hard for DCSS

Crawl wasn’t remotely designed with seeding as a player-facing feature in mind; though many roguelikes have used specifiable seeds for testing, the mainstream use as a player feature seems to be quite recent. Actually, I couldn’t find out much of the history of using seeding for completely repeatable procedural generation as a player-oriented feature, but it doesn’t seem to have been very mainstream before games like Minecraft and Spelunky; an earlier example, pointed out to me by Roguelikes discord user esc, is Civ 3. Seed-based pseudo-randomness, internal to the game, is much older, and for example Moria allowed choosing a seed for testing purposes. But getting seeding to work as a player feature is a whole different project.

Here are four challenges that DCSS faced (and still faces) when trying to get player-facing seeding to work well:

Mixing of gameplay / levelgen RNG
Arbitrary interactions in level generation between levels.
Extremely complex level generation with a huge diversity of procedural generation techniques, implemented in a mix of C++ and Lua code.
Diversity of build targets and settings: support for windows, mac, linux, with 32 or 64 bit builds. Builds on non-x86 architectures such as ARM.

When starting the project, I was aware of factors 3-4, but thought 1-2 would be the biggest challenges, and so it was there that I began. In retrospect, 4 is actually the biggest problem.

Before 0.23, generating a level and entering a level were completely intertwined: levels were not generated until you entered them. “Greedy” level generation goes back to Linley’s Dungeon Crawl as far as I’m aware, and on its own terms is a fairly simple and clean approach, spreading the CPU cost of level generation across the game in a way that hides it from the player. However, it can lead to a problem illustrated by the following toy example: suppose that you have a separate RNG for each level, the player is on level 2, and for both levels 3 and 4 Sigmund is poised to generate given the RNG state. If the player takes stairs to level 3, then he’ll generate there; if the player takes a shaft to level 4, then he’ll generate there. Since Sigmund is a unique monster, he cannot then regenerate if the player enters levels 4 and 3 respectively, and so the RNG diverges right away — the builder may place another unique, or do something different altogether, and after that initial decision will tend to be completely different.

Not only that, this first assumption wasn’t made: DCSS before 0.23 used a single RNG for gameplay and levelgen decisions, so any different action the player took might lead to completely different level generation.

Part 1: pregeneration and RNG separation

This problem has long been known, and comes up regularly in discussions of seeding. I adopted two solutions from these discussions: separate out levelgen RNG from gameplay, and allow pregeneration of the dungeon. The latter solves almost all of problems 1-2 on its own, and had several past implementations by community members. The most developed of these was due to tavern user bel in the above linked thread, and just needed some adjustments so that level generation was better separated from level entry. The initial seeding patch can be seen here.

Incidentally, if you’re designing a game from scratch, your life is going to be easier if you take a different strategy for these sorts of non-independent decisions. For example, you could place uniques independently of levelgen (something we do now only for one unique, Boris, who respawns), perhaps deciding where they go at the beginning of the game. However, it’s probably too late for DCSS in this respect.

Part 2: testing and fixing many, many seed stability bugs

This patch, it turns out, was only the beginning, opening things up to a whole world of ways in which a repeatable RNG can get out of sync based on problems 3-4.

To start with, I wrote a lua test that, given a seed, prints the vault list for that seed, re-runs generation, and compares the result. While true seed stability bugs on a single device have been fairly rare, this second part helped identify a tremendous amount of global state that can matter for seeding, scattered all over crawlcode. You can see an example of the output here if you scroll down to “Running test #31: ‘vault_catalog.lua’.”

The real challenge has been seed stability across devices. Here is a sampling of problems that have come up:

If you’re drawing randomly from a list, and something permutes the underlying list, it doesn’t matter if your rng draw is the same! For example, if the OS provides a directory listing in a different order.
If the order of some calls isn’t determined by the language/library standard, then it will probably be implementation dependent. This is a problem that has come up again and again:

C++ (and C) doesn’t define the order of multiple function calls that make up an arithmetic expression, and in practice their order varies. You might think this would go by order of operations, but this would be wrong. They all have to be explicitly sequenced.
Lua iteration order, both through the C interface, and via the “pairs” function, is not guaranteed to be stable.
Anything in C++ that sorts may have implementation-specific behavior if a comparison operation indicates two objects are equal. This is fine if your comparison operation is fine-grained enough, but sometimes it wasn’t.

A large set of bugs that were introduced with the initial pregeneration patch because the level builder (and related code) assumed the player was about to enter the level; something that could be designed around from scratch but was fairly baked into dcss as it stood.
DCSS caches parsed maps, and some cases of randomness in the cache were causing divergence.
Cases where the builder made use of player state: the most prominent of these were vaults that generated items using the acquirement code. (This still can happen in troves, but only there.)
Floating point issues. First, Lua 5.1 uses different Number to int conversion techniques on different build targets, and these have different truncation behavior. Then, even controlling for this it turns out that e.g. SSE2 behavior for non-exactly-represented integers can lead to different ints than x87, for example, (1 / 100 * .7). I doubt all of these have been found yet, and for the 0.23.2 release I still needed to manually force the windows build to not use x87. I’m looking at you layout_delve…
Cases where unexpected code was drawn upon by the builder. For example, levelgen tries to ensure connectivity, and it turned out the code that does this was being affected by certain rc options! Another example is that slime connectivity in particular uses a different algorithm (because of the wall damage) that led to an underspecification bug mentioned above.

In effectively all of these examples, the result of the bug was immediate loss of seed synchronization across devices — and many of these were things that wouldn’t be considered bugs at all without the goal of seed stability. Debugging them was uniformly extremely difficult, because it required both replicating the divergence and then finding the exact point of RNG divergence. Without VMs and docker, I would not have been able to do most of this.

Some of these problems can be programmed defensively against, with experience, but I doubt I’ve found all the seed stability bugs in crawlcode. At this point, I’m mostly reliant on people testing seeds on various platforms — Ultraviolent4′s seeded Game of The Month series has so far been invaluable for finding seed/pregen bugs. (Incidentally, the challenge of localizing seed bugs, together with the challenges of developing for seed stability, are why I would strongly recommend fellow roguelike devs against using seed-based save formats…)

Part 3: next steps

There are still many improvements to be made, and I expect a lot of my effort for 0.24 will go towards finding and quashing more seed stability bugs. Here are a few improvements I hope to include in the next version:

Incremental pregeneration: generate the dungeon in a predetermined order, but only do as much as you need to do to play catchup on level changes. This will allow hopefully removing the distinction between pregenerated and non-pregenerated games, along with allowing seeded play online. (Right now, the main barrier is that up-front level generation is very hard on a server’s cpu.)
More RNG improvements for repeatability: apply various techniques to make rng de-synchronization less impactful. Remove floating point code as much as possible, and improve some of the lua numeric issues further.
Non-numeric (readable) seeds, a la cogmind.
A seed catalog of some kind.
What seed-related features would you like to see?

The DCSS community has been extremely helpful in finding and reporting bugs, and I doubt we are done. Please go play one of the seeding based challenges, and let us know if you find problems!

The 0.23 Winter tournament is over. On behalf of the Dev Team, many thanks to all the server admins, outside contributors, bug reporters, and to the many DCSS players who made the 0.23 release and tournament possible! For 16 days, players could compete for tournament points and banners by playing 0.23 games on the public servers. First let’s give a short summary of player and clan results. To see first through third place winners in all categories, please see the tournament overview page.

The winning player was Yermak, with 8721 points. Yermak won 35 games during the tournament, including the longest streak at 22 games, and won 21 tier-three banners. In second place was MrMan, with 8092 points and 31 wins, including the second-longest streak for 20 wins and 17 tier-three banners. In third place was JiyvaJigglypuff, with 6631 points and 36 wins and 16 tier-three banners. The players ethniccake and watermold were the only ones to win all 24 tier-three tournament banners.

The fastest win by turncount was achieved by Yermak (13586 turns with a DDFi of Makhleb). Yermak also had the highest scoring game of the tournament, a 15 rune GnEE of Chei won in 19994 turns for 71M points. p0werm0de had the fastest realtime win in 25 minutes with a DDFi of Uskayaw, and Yermak had the lowest level win with an XL 15 FeEn of Ashenzari. The first victory of the tournament was claimed by JiyvaJigglypuff, winning a MiFi of Makhleb in just 33 minutes (38 minutes after tournament start). The first 15 rune win of the tournament was by YTcomUltraviolent4, winning a GrGl of The Shining One in 2h 29m (2h 31m after tournament start).

The clan competition was won by Tab and Dab (26578 points), followed by Gozag or Go Home (22637 points), and Big Dispersal (19276 points). There were 126 clans in all with points scored in the tournament.

Here are some basic statistics on the players in this tournament compared with the 0.22 Summer and 0.21 Winter tournaments (in parentheses):

Players: 3440 (0.22: 3130, 0.21: 3153)
Total player time: 38876 hrs (0.22: 41988 hrs, 0.21: 41283)
Avg player time: 11.3 hrs (0.22: 13.41 hrs, 0.21: 13.09)
Proportion of players using WebTiles: 95.88% (0.22: 95.39%, 0.21: 94.48%)

It’s nice to see that this tournament had the most players of any we’ve run so far. It looks like the winrate decreased from the high of 2.82% in version 0.22 down to 2.46%, which is similar to version 0.21. We buff players by locking up ghosts, then we nerf players by giving them new traps and Klown pies to the face. Yet you all keep coming back for more! It must be the shiny loot…

We had a lot of fun bringing the 0.23 release together, with multiple devs individually putting in hundreds of commits. We also had lots of great outside contributions ranging from vaults to artwork to bugfixes, including some important feature implementations. Thanks to all those taking the time to contribute to the project! A 0.23.1 bugfix release is imminent, so be on the lookout and be sure to update if you’re playing offline.

In the meantime, we on the Dev Team are excited to start work on the next version. Thanks again, everyone, and happy crawling!

Dungeon Crawl Stone Soup

Development live!

Archive for March, 2019:

0.23.2 and 0.22.2 bugfix releases

Behind the scenes on retrofitting DCSS’ seeding