Some notes on writing parser-based interactive fiction in Python (part 2)
A slightly earlier draft of this post was a response to a question on the intfiction.org forums. It's been moved here because it might be interesting outside of that context.
Other posts in this series:
Part 1
Part 3
Part 4
Part 2: Talkin' 'bout Data Ontology but Keepin' it Funky
All that being said, though, let's dive in and start by talking about data ontology.
The traditional way for homebrew parser IF to function was that objects were illusions presented by the underlying system: there was no brass lantern represented as an object in the underlying memory banks, but rather just an entry in several tables or arrays with a common ID number. So it might be that the ID number for the brass lantern is, say, 22, and when the player types EXAMINE BRASS LANTERN, the parser seizes on the description BRASS LANTERN, then looks it up to discover that BRASS LANTERN corresponds to the object with ID# 22, and then looks at the twenty-second entry in the array of descriptions, and prints something along the lines of "It's a small lantern made of brass." If you need to know something else about the lantern, you look in another table or array, or find the information somewhere else in the model world: if you want to know where the brass lantern is, the system might look through every room, then see if it is in any containers, then if it is being held by anyone ... or it might just have another array that tracks the current location of every object.
All of this was necessary because '80s homebrew games were largely written in BASIC, the language that came with virtually every home computer at the time; users could be assumed to have a BASIC interpreter pre-installed, because it was wired into the computer's hardware. But BASIC in most implementations has no data types other than numbers and strings, and arrays of those things; there was no underlying representation of "objects," so programmers made do without one. They did that by using tables and arrays of numbers and strings.
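For comparison, here is a minimal sketch of that table-driven style transliterated into Python. The table names, the sample data, and the small ID numbers are all invented for illustration; nothing here comes from an actual '80s game or from Zombie Apocalypse.

```python
# Parallel "tables", BASIC-style: one list per property, indexed by object ID.
# Every name and value below is made up for this example.
NAMES = ["", "rusty key", "brass lantern"]
DESCRIPTIONS = ["", "It's a small, rusty key.", "It's a small lantern made of brass."]
LOCATIONS = [0, 3, 7]  # room number currently holding each object


def find_id(phrase):
    """Return the ID whose name matches the phrase, or None if nothing does."""
    phrase = phrase.lower()
    for obj_id, name in enumerate(NAMES):
        if name and name == phrase:
            return obj_id
    return None


def examine(phrase):
    """Resolve the phrase to an ID, then index into the description table."""
    obj_id = find_id(phrase)
    if obj_id is None:
        return "You can't see any such thing."
    return DESCRIPTIONS[obj_id]


print(examine("BRASS LANTERN"))
```

Note that nothing ties the three lists together except the shared index: the "lantern" exists only as position 2 in each table.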
You can write BASIC-style code in Python if you want, and of course it's possible to represent objects in tables and arrays, but it seems to me that representing in-world objects as Python objects is a good move for a lot of reasons: it makes the model conceptually simpler and lets objects themselves carry around information about what they're capable of and what can be done to or with or by them. It also lets you leverage Python's inheritance system to provide and override defaults. And since Python has good introspection facilities for objects, it's possible for the parser to query the objects themselves about their characteristics, which makes parsing easier.
Every in-game noun in Zombie Apocalypse is a descendant, at some (possibly far) remove, of an abstract object called Noun. Noun provides basic information for every single noun in the game. Mostly, this means default rejection messages and default attributes.
class Noun(object, metaclass=StandardGrammar):
    """Person, place, or thing. Everything accessible to the parser is an
    instance of this class, or a subclass of it.
    """
    _points = None
    _plural = False
    _plural_in_form = False
    _touched = False
    _size = 0

    def describe(self, **kwargs):
        """Every descendant needs to implement this in a different way."""
        raise NotImplementedError("ERROR: Noun.describe() must be overridden by all descendants, but the class %s does not do so." % self.__class__)

    def look(self, **kwargs):
        """LOOK is a synonym for DESCRIBE"""
        self.describe(**kwargs)

    def look_at(self, **kwargs):
        """A synonym, in this case."""
        self.look(**kwargs)

    def examine(self, **kwargs):
        """Yet another synonym for DESCRIBE"""
        self.describe(**kwargs)

    def inspect(self, **kwargs):
        """INSPECT is a synonym for DESCRIBE"""
        self.describe(**kwargs)

    def eat(self, actor):
        """Though I have a cat who disagrees with me about this, most things
        are not edible.
        """
        if actor == globs.the_hero:
            su.printer('Sorry, {[spec]} is not edible.', self)
            # Subclass objects.Food overrides this; so do some medications.
        else:
            su.printer('{[SPEC]} is quite unwilling to eat {[spec]}.', actor, self)

    def swallow(self, **kwargs):
        """A synonym for EAT."""
        self.eat(**kwargs)

    def consume(self, **kwargs):
        """A synonym for EAT."""
        self.eat(**kwargs)

    def gobble(self, **kwargs):
        """A synonym for EAT."""
        self.eat(**kwargs)
(su.printer is a utility function, the printer routine in the small_utilities module, that breaks text into appropriately sized lines and buffers it until the entire round has been run, then dumps it to the screen all at once, pausing with a "press any key for more" message occasionally. Just using Python's print() would happily let words be split between lines and would allow more than a screenful of text to be printed during a single turn, forcing the user to scroll back -- assuming they're playing in a terminal emulator that allows for scrollback at all. su.printer() also supports transcripting. Every text-printing operation in Zombie Apocalypse uses su.printer() instead of print() except for a few debugging routines and, of course, su.printer() itself, which internally uses print() to print each line.) (We'll skip the metaclass declaration for quite a long time before coming back to it.)
Again, that's just an excerpt. The actual definition of Noun is much longer, of course, because it provides default rejection messages for everything, many of which are overridden by subclasses. By default, you can't EAT most things; but you can define something as Food instead of Noun, and Food and its subclasses override the .eat() method. Synonyms just dispatch to the canonically named method: if the player types GOBBLE STEAK or SWALLOW STEAK, the Python attribute search finds .gobble() defined on Noun, kicks off another attribute search for .eat(), and finds .eat() defined on Food, of which the STEAK object is either an instance or an instance of a subclass. (There are currently no STEAK objects in ZA, so this is a purely hypothetical example.) At the level of an abstract base class like Noun, it makes sense to reject pretty much every action, because by default most actions should only succeed on certain subclasses. You can't EAT a place; you can't GO TO an object. (There's no reason why either has to be impossible; it's my game, after all, and certain types of fantasy writing support both ideas. But there have to be basic parameters at some point, and the Noun class sets a lot of them.)
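The synonym dispatch described above can be seen in a stripped-down, self-contained sketch. This is not the game's code: the classes are stand-ins, the messages are invented, and the methods return strings rather than calling su.printer() so that the example runs on its own.

```python
class Noun:
    """Stand-in base class: rejects EAT and routes synonyms to it."""

    def __init__(self, name):
        self._name = name

    def eat(self, actor):
        return f"Sorry, the {self._name} is not edible."

    def gobble(self, **kwargs):
        # Synonym: a fresh attribute lookup on self finds the most
        # specific .eat(), which may be defined on a subclass.
        return self.eat(**kwargs)

    def swallow(self, **kwargs):
        return self.eat(**kwargs)


class Food(Noun):
    def eat(self, actor):
        return f"{actor} ate the {self._name}."


steak = Food("steak")
rock = Noun("rock")
print(steak.gobble(actor="You"))   # found on Noun, dispatches to Food.eat
print(rock.swallow(actor="You"))   # found on Noun, dispatches to Noun.eat
```

The synonyms only need to be written once, on the base class; every subclass that overrides .eat() gets working GOBBLE and SWALLOW for free.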
Other defaults can be overridden by subclasses in the same way. Most other descendants of Noun are going to override _size, for instance, at some point in their inheritance chain. Again, there's no inherent need to deal with the sizes of objects in parser IF, and plenty of parser IF makes no effort to do so. But ZA does have situations where the relative sizes of objects are important, so it defines a _size attribute on the base class that everything else derives from; this means that every descendant of Noun has a _size attribute that the Python inheritance search can find, so it's always safe to refer to an in-world thing's _size.
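As a small illustration of why a base-class default makes the attribute always safe to read, here is a sketch with made-up classes and sizes (Piano and fits_inside() are hypothetical, not part of ZA):

```python
class Noun:
    _size = 0          # default that every descendant can fall back on


class Thing(Noun):
    pass               # defines no _size; lookup continues up to Noun


class Piano(Thing):
    _size = 9          # hypothetical override further down the chain


def fits_inside(item, container):
    """Works for any Noun descendant: _size is always found somewhere."""
    return item._size < container._size


pebble, piano = Thing(), Piano()
print(pebble._size, piano._size)
print(fits_inside(pebble, piano))
```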
There's a pattern here that's important to support the mechanics of the parser I'm writing: object attributes, including method names, must begin with an underscore unless they are action routines that handle a command. So _plural is a boolean flag that indicates whether a noun is plural, a collection; and _plural_in_form indicates whether a noun is a single thing that gets grammatically treated as plural, like "pants." Neither is an action -- you can't PLURAL STEAK -- and the fact that the attribute begins with an underscore signals that to the parser. By contrast, .eat() is an action-handling routine, so it doesn't begin with an underscore; the parser, inspecting the object, knows that EAT is something that can be done to that object. (There's nothing special or magical about the underscore at the beginning of the attribute name; it's just a convention that the parsing code checks for. But there needs to be SOME way to distinguish whether something is an action-handling routine, and though Python lets you check whether something is callable with the callable() check, that doesn't distinguish between routines that handle in-game actions and utility routines that do things for the class other than handling actions. The underscore rule adapts a common convention in Python by which "private" -- actually pseudoprivate -- names are signaled as such by beginning with an underscore. It's a rather small deformation of that convention, really.)
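Here is a minimal sketch of how a parser could apply that convention using only built-in introspection. The helper name action_names() and the toy class are assumptions made for this example; the actual parsing code in ZA is not shown in this post and may work quite differently.

```python
def action_names(obj):
    """Return the public callable attribute names: the verbs obj handles."""
    return sorted(
        name
        for name in dir(obj)
        if not name.startswith("_") and callable(getattr(obj, name))
    )


class Noun:
    _plural = False              # plain attribute: underscore, so not a verb

    def _describe_size(self):    # utility routine: callable, but not a verb
        return "small"

    def eat(self, actor):        # action routine: no leading underscore
        return "Not edible."

    def examine(self, actor):
        return "Nothing special."


print(action_names(Noun()))
```

The callable() test alone would also have admitted _describe_size(); the underscore check is what filters it out, along with all of Python's double-underscore special methods.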
Noun itself is never instantiated directly. That's why .describe(), the canonical synonym for the LOOK/LOOK AT/EXAMINE family of synonyms, raises NotImplementedError if it's ever called: It's intended to force me to realize early that I've done something I'm not supposed to do. (It's better to see the errors when you're designing the program instead of allowing them to propagate.) Python has a system for formally specifying that something is an abstract base class with methods that must be overridden in the abc module in the standard library, but I avoid working with that and just manually raise errors instead for two reasons: (1) it's much slower to use the isinstance() call to check whether descendants of abstract base classes formally declared as such are instances of another class, and Zombie Apocalypse runs such checks a fair amount; and (2) formally registering Noun as an abstract base class with the abc module requires using abc.ABCMeta as the metaclass of Noun (and therefore all of its descendants), but I've already got another metaclass I'm using for something else, and a class can only have one metaclass. So formally declaring Noun to be an abstract base class is out, and I just do it informally by raising errors during play when things happen that are supposed to be handled by subclasses.
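The metaclass problem in reason (2) is easy to reproduce. In the sketch below, StandardGrammar is an empty stand-in for the game's real metaclass (which isn't shown in this post), and Rock and try_mixing_in_abc() are hypothetical names used only for the demonstration.

```python
import abc


class StandardGrammar(type):
    """Empty stand-in for the game's metaclass."""


class Noun(metaclass=StandardGrammar):
    def describe(self, **kwargs):
        # Informal "abstract method": fail loudly if a subclass forgot it.
        raise NotImplementedError(
            "%s must override describe()" % type(self).__name__
        )


class Rock(Noun):
    def describe(self, **kwargs):
        return "A plain grey rock."


def try_mixing_in_abc():
    """Return the error raised by combining abc.ABC with another metaclass."""
    try:
        class Broken(abc.ABC, metaclass=StandardGrammar):
            pass
    except TypeError as err:   # Python reports a metaclass conflict
        return err
    return None


print(Rock().describe())
print(type(try_mixing_in_abc()).__name__)
```

Python does allow writing a metaclass that derives from both abc.ABCMeta and a custom metaclass, but that is extra machinery, and it would not address the isinstance() cost in reason (1).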
There are three primary subclasses of Noun, all of which are also mostly abstract but are occasionally instantiated directly: Creature, Thing (so named because object is already the name of Python's built-in base class), and Location. Each is subclassed repeatedly, sometimes with many steps in a descendant chain. Here is an excerpt from the definition of one abstract subclass:
class Creature(nouns.Noun):
    """A generic Creature defining default behaviors; possibly never
    instantiated directly.
    """
    _description = "an indescribable crawling Thing"
    _size = 3                       # But many descendants will override this. See nouns.Noun for documentation.
    _gender_pronoun = "its"
    _items = None
    _contained_by = None
    _hitpoints = 1
    _direction_traveling = None     # if the Creature has travel plans that last beyond the current turn, those are here: a direction string
    _holding = None                 # Only Humans and subclasses can use weapons, but let's make sure we explicitly track that other Creatures aren't equipped with anything.
    _last_location = None
    _relationships = None
    _responsive_after = -1          # Turn number on which this creature can execute scripts. Its primary
                                    # purpose is to delay following a script until the next turn, if a
                                    # Creature gets a script before its "move" comes up; this allows things
                                    # to have a one- (or more-) turn delay before the script is executed.

    def describe(self, actor):
        """Routine for describing a Creature."""
        if actor is globs.the_hero:     # The narrative is focalized through the protagonist!
            su.printer("{[SPEC]} {[verbf]} {[desc]}.", self, ('be', self), self)
        else:
            su.printer('"{[SPEC]} is {[desc]}," {[spec]} tells you{[str]}.', self, self, actor, random.choice(['', '', '', ' helpfully']))
Here, the method handling the DESCRIBE command is overridden: it has to be, otherwise EXAMINE HORSE would raise NotImplementedError (because that's what's defined at the base class, Noun). On the other hand -- and you can't tell this because you don't see the entire definition of Creature here, which is about five hundred and fifty lines long, not counting the methods and attributes inherited from Noun -- the method handling EAT is not overridden, because I'm happy with a Creature being non-edible by default for the purposes of the story I'm writing. Creature objects (and instances of their subclasses) also have a whole bunch of convenience methods, all of which start with underscores because they're not verbs; these include _every_turn_trigger(), a routine that gets called every turn to give the Creature a chance to do something if it wants to; _die(), something that happens to creatures often in a world where zombies roam; _go(), which handles movement; _possessive_pronoun(), which returns the possessive pronoun that's grammatically appropriate to use for things belonging to this creature; and plenty of others.
Creature can be instantiated directly, but it's also subclassed over and over and over; Mammal is a subclass of Creature, and Human is a subclass of Mammal. Housecat is also a subclass of Mammal that overrides different behaviors and attributes than Human does. Human itself is subclassed a lot, largely to provide generic support for character tropes from zombie movies that I poke fun at: subclasses include Coward, Leader, Asshole, Cynic, Child. Leader gives several new capabilities; these are people who can have bands of followers that follow them around. Protagonist is a singleton that's a subclass of Leader; it has a lot of modifications to many of its parents' methods, including that calling _die() on the single instance of the Protagonist class runs the handling-the-end-of-the-game code.
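A compressed sketch of that chain shows how one override at the bottom changes behavior for a single class while everything else is inherited. The class names follow the post, but every method body, the _alive and _followers attributes, and the messages are invented here; the real classes are far larger.

```python
class Creature:
    _hitpoints = 1

    def __init__(self, name):
        self._name = name
        self._alive = True          # hypothetical bookkeeping attribute

    def _die(self):
        """Default death: mark the creature dead and report it."""
        self._alive = False
        return f"{self._name} dies."


class Mammal(Creature):
    pass


class Human(Mammal):
    _hitpoints = 10                 # made-up value


class Leader(Human):
    def __init__(self, name):
        super().__init__(name)
        self._followers = []        # hypothetical: the band that follows a Leader


class Protagonist(Leader):
    def _die(self):
        # Reuse the inherited behavior, then end the game.
        message = super()._die()
        return message + " *** GAME OVER ***"


hero, extra = Protagonist("The hero"), Human("A bystander")
print(extra._die())
print(hero._die())
```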
Similarly, Location subclasses Noun to provide location-based commands, because there are times where you want the player to be able to refer to locations when talking to the parser: GO TO HOSPITAL moves the player one step closer to a known location. Location is further subclassed, sometimes just to provide different default text so I don't have to describe similar things over and over: InteriorLocation, ExteriorLocation, OutsideBuildingLocation (if the most interesting thing to say is that you're standing outside a building), etc. etc. etc. If a whole group of Locations has specific properties -- every room inside the large mall has specific every-turn behavior, for instance -- those rooms are likely to all be instances of a specific further subclass of InteriorLocation. Or if every place in a forest exhibits a specific kind of "you're lost in the forest" behavior, those are likely to all be instances of a further subclass of ExteriorLocation.
And of course the direct subclass of Noun that's most frequently and deeply subclassed is Thing. Some inheritance chains descending from Thing are:
Noun -> Thing -> Food -> SpoiledFood (rejects attempts to eat it with a "that smells gross" message by overriding the .eat() method);
Bed (inherits behavior from the Noun -> Thing -> Container -> Furniture -> BarricadeableFurniture chain, from the Noun -> Thing -> Supporter -> Counter chain, and also from the Noun -> Thing -> Container -> Counter chain, because Counter itself inherits from both Container and Supporter. Python supports multiple inheritance, so a Bed can be both a container [you can put things in drawers under the bed] and a supporter [you can lie on top of it], as well as being a thing you can use to barricade the doors when the zombies are trying to break in);
Thing -> Cigarette
Thing -> PrintedMatter -> Plaque
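The Bed chain is the interesting one, because it relies on multiple inheritance. Here is a skeleton of it with empty classes and three placeholder methods (the _can_*() names are invented). The order in which Bed lists its bases is a guess; the post doesn't say, and that order determines which parent wins when two define the same name.

```python
class Noun:
    pass


class Thing(Noun):
    pass


class Container(Thing):
    def _can_contain(self):
        return True


class Supporter(Thing):
    def _can_support(self):
        return True


class Counter(Container, Supporter):
    pass


class Furniture(Container):
    pass


class BarricadeableFurniture(Furniture):
    def _can_barricade(self):
        return True


# Base order here is an assumption made for this sketch.
class Bed(BarricadeableFurniture, Counter):
    pass


bed = Bed()
# The method resolution order is the sequence Python searches for attributes.
print([cls.__name__ for cls in Bed.__mro__])
print(bed._can_contain(), bed._can_support(), bed._can_barricade())
```

Even though Container is reachable by two routes, it appears only once in the resolution order, so its behavior isn't duplicated.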
Anyway, you get the idea. Attributes and behavior can be overridden repeatedly, at different levels, and are defined on objects themselves. (Well, on classes of objects, anyway. Python's object model requires that objects be instances of a class, which is sometimes a small annoyance, and one that languages like Inform ameliorate: there you can define behaviors for a specific wine bottle rather than on the WineBottle class as a whole, or on a specific class that you have to write so you can instantiate it exactly once as BottleOfReallyExpensiveOldWine. C'est la vie: once you've decided to use Python, you're bound by its rules. You could monkey-patch individual objects by attaching methods manually to them, but that introduces huge problems that are not worth raising just to get a boost in conceptual purity.)
So this system groups together all of the behaviors for a specific object (or, well, class of objects, where there may be just one object in the class, and I'm going to stop talking about this distinction now) on the objects themselves and has real benefits in code organization: instead of writing a lot of if object == Bottle: [...] elif object == Cigarette: [...] elif object == Shotgun: [...] else: print("I don't know how to " + verb + ' a ' + objName + '!') code in verb-dispatch tables, you define action-handling behavior on the objects themselves. Leveraging Python's object orientation means you group verb-methods and object attributes together in classes instead of spreading them out throughout the code.
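With handlers living on the objects, the dispatch step itself can shrink to a single attribute lookup. The following is a sketch under assumptions, not ZA's parser: dispatch() is a hypothetical helper, the DRINK and SMOKE verbs and their messages are invented, and handlers return strings so the example is self-contained.

```python
class Noun:
    def __init__(self, name):
        self._name = name

    def smoke(self, actor):
        return f"You can't smoke the {self._name}."

    def drink(self, actor):
        return f"You can't drink the {self._name}."


class Cigarette(Noun):
    def smoke(self, actor):
        return "You light up. It's a filthy habit."


class Bottle(Noun):
    def drink(self, actor):
        return "You take a long swig."


def dispatch(verb, noun, actor="you"):
    """Find the verb's handler on the noun itself and call it."""
    if verb.startswith("_"):
        # Underscore names are never verbs under the naming convention.
        return "I don't know how to do that."
    handler = getattr(noun, verb, None)
    if not callable(handler):
        return f"I don't know how to {verb} the {noun._name}."
    return handler(actor=actor)


print(dispatch("smoke", Cigarette("cigarette")))
print(dispatch("smoke", Bottle("bottle")))
print(dispatch("fly", Bottle("bottle")))
```

Adding a Shotgun class later means writing one new class; dispatch() doesn't change, and no if/elif chain has to grow.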
I think that this has a real conceptual benefit for code-organization on large games: You make an Orange a descendant of Food, which is a descendant of Thing, which is a descendant of Noun, and it inherits all of the behaviors of all of its superclasses, most of which are rejection messages, except for those behaviors that it specifically overrides. All of the code that's specific to the Orange class is wrapped together under the class definition, not spread throughout the code base in verb-dispatching tables, all of which have to be looked up by object ID. The parser itself knows what verbs are possible on a given object: it can examine the object's non-underscore attributes. This helps with disambiguation, too: if the player is trying to EAT ORANGE, and there is an ORANGE JACKET in the room, the parser can introspect the objects to see which have an .eat() method that doesn't result in a rejection.
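How the parser decides that a method "doesn't result in a rejection" isn't spelled out here, so the sketch below substitutes one simple heuristic: treat a handler as a likely rejection if it is still the one inherited unchanged from Noun. That heuristic, and the helper names overrides_default() and disambiguate(), are assumptions made for this example, not a description of ZA's parser.

```python
class Noun:
    def __init__(self, name):
        self._name = name

    def eat(self, actor):
        return f"Sorry, the {self._name} is not edible."


class Food(Noun):
    def eat(self, actor):
        return f"You eat the {self._name}."


def overrides_default(obj, verb):
    """True if obj's class supplies its own handler instead of Noun's."""
    return getattr(type(obj), verb, None) is not getattr(Noun, verb, None)


def disambiguate(verb, candidates):
    """Prefer candidates whose handler is not the base-class rejection."""
    preferred = [c for c in candidates if overrides_default(c, verb)]
    # If nothing stands out, fall back to the full candidate list.
    return preferred or candidates


orange = Food("orange")
jacket = Noun("orange jacket")
matches = disambiguate("eat", [jacket, orange])
print([m._name for m in matches])
```

The heuristic is imperfect: a class like SpoiledFood overrides .eat() precisely in order to reject, so a real implementation would need some other signal for that case.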