Some notes on writing parser-based interactive fiction in Python (part 3)
A slightly earlier draft of this post was a response to a question on the intfiction.org forums. It's been moved here because it might be interesting outside of that context.
Other posts in this series:
Part 1
Part 2
Part 4
Part 3: Parsing, an Overview
All that being said, at the most basic level, all you really need to do to get a verb-noun parser, when the objects themselves know what can be done to them, is to match what the user typed against the textual descriptions of the objects in play, then dispatch to the appropriate verb method. Of course, players deserve more than a simple verb-noun parser, and you'll want to handle special cases: pronouns, addressing someone else, verbs that require an indirect object.
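To make that concrete, here's a minimal sketch of the bare-bones version. This is not code from Zombie Apocalypse: the class names Thing and Donut, the function simple_parse(), and the choice to return strings rather than print them are all invented for illustration.

```python
class Thing:
    """A hypothetical in-game object that knows which verbs apply to it."""

    def __init__(self, name):
        self.name = name

    def examine(self):
        return "You see nothing special about the %s." % self.name


class Donut(Thing):
    """A Thing that additionally knows how to be eaten."""

    def eat(self):
        return "Yum, the %s is tasty." % self.name


def simple_parse(command, objects_in_scope):
    """Handle two-word VERB NOUN commands; reject everything else."""
    words = command.lower().split()
    if len(words) != 2:
        return "Sorry, I only understand two-word commands."
    verb, noun = words
    if verb.startswith('_'):
        # Never let the player reach private or dunder methods.
        return "Sorry, I don't know how to %s." % verb
    for obj in objects_in_scope:
        if obj.name == noun:
            # The object itself decides whether the verb makes sense.
            handler = getattr(obj, verb, None)
            if callable(handler):
                return handler()
            return "I don't know how to %s the %s." % (verb, noun)
    return "You can't see any %s here." % noun
```

Calling simple_parse("eat donut", [Donut("donut")]) finds the donut by name and calls its eat() method. Everything the rest of this post describes is, in one way or another, an elaboration of those two steps.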
That being said, here's most of the main loop for Zombie Apocalypse:
    from bin import globs
    from bin.parser import parsing_help as ph
    from bin.parser.grammar import *
    from bin.parser import parsing_help
    from bin.start import startup
    from bin.util import debugging
    from bin.util import simple_commands as sc


    def main_loop():
        """The main loop for the game."""
        try:    # At this level, catch any exceptions not otherwise caught and dump the text buffer before re-raising
            while True:
                try:    # At this level, catch control-key combos.
                    command = su.get_input(ph.command_prompt)
                    if len(command) > 0:
                        split_command = [what.lower().strip() for what in ph.tokenize(command)]
                        # First, handle commands requiring ... well, special handling.
                        if split_command[0] == 'debug':     # Parse DEBUG commands without altering capitalization.
                            current_verb = ph.parse(command)
                        elif split_command[0] == 'save':    # SAVE and LOAD need to represent potentially case-sensitive filenames, too.
                            current_verb = 'save'
                            sc.do_save(the_command=command)
                        elif split_command[0] in ['load', 'restore']:
                            current_verb = 'load'
                            sc.do_load(the_command=command)
                        else:
                            # Figure out what the command is and execute it. Everything not handled above is case-insensitive.
                            current_verb = ph.parse(command.lower())
                        if current_verb not in ph.extradiegetic_verbs:
                            parsing_help.late_turn_actions(current_verb, command)
                    else:
                        su.printer("{[RND]}?", ["I beg your pardon", "I'm sorry", "Sorry, what", "Hmmmm", "What was that"])
                except EOFError:            # Ctrl-D in Linux; I forget what it is in Windows. Something else. Ctrl-Z, maybe?
                    print("wait")
                    sc.do_wait()
                    parsing_help.late_turn_actions("wait", "wait")
                except KeyboardInterrupt:   # If Ctrl-C is hit, just abort the current attempt to enter a command and start over.
                    print()
        except Exception as e:              # If an unhandled error occurs ...
            su.printer("ERROR: We're about to crash. Here's the most recent exception:")
            su.flush_buffer()               # dump the buffer-queued text...
            import traceback
            for i in traceback.format_exc().split('\n'):    # Add the traceback to the transcript ...
                su.debug_printer(i.rstrip(), prefix=" ", min_level=-1)
            su.flush_buffer()
            su.close_transcript()
            raise e                         # ... and let the error propagate.


    if __name__ == "__main__":
        # First, set up
        startup.opening()
        # Now, run the game.
        main_loop()
The outer try catches errors, helps to make them more intelligible, and then lets them crash the program; it's better to find errors and then fix the underlying causes than to try to keep a broken system running evenly. But it's better still to get the most detailed, helpful tracebacks you can.
There's then an inner try handler that only catches control-key combos: control-C aborts the current attempt to enter a command and does nothing without "taking a turn"; control-D is mapped to WAIT.
Aside from that, on every turn ...
Input is requested. If it's zero-length, a randomized "Wait, what?" message is printed and nothing happens. If it's not zero-length, processing continues.
It's split into a list of words by parsing_help.tokenize(), which doesn't do much more than the .split() method on a string; each word is then lowercased and has whitespace stripped off of both ends.
The main loop checks to see if it's a DEBUG command; if it is, the command is handed to the parser with its capitalization intact, debugging code is invoked, and no other processing happens this turn.
If the command is SAVE, LOAD, or RESTORE, the relevant file-handling code is invoked (again with the original capitalization, because filenames may be case-sensitive), and nothing else happens this turn.
Otherwise, the lowercased command string is passed to parsing_help.parse(), which tokenizes it again, performs (or delegates) the meat of the parsing, and returns the name of the verb that it decided on. The name of the verb is important because some verbs are "out of world" actions, metacommands, that don't "take a turn"; this is determined by looking at the list of extradiegetic_verbs in the parsing_help module. If the detected verb is not in this list, then the action "takes a turn," and also a series of "every turn actions" has a chance to occur.
The process then repeats until the player quits the game or reaches the end.
parsing_help.parse is responsible for understanding the command and executing any actions that need to be taken in response to it. If the player types EAT DONUT, it determines whether there is an object in scope that matches the description DONUT. If there isn't, it prints a message saying, essentially, "What are you talking about? You can't see a donut here."
If there is an object matching the description DONUT, then it examines the object to see if it can find an action-handling method that matches EAT. If it can, it calls it, and that action-handling method does whatever needs to be done to handle the action: in this case, prints a message saying something like "Yum, the donut is tasty" and removes the donut from the model world.
If there is no action-handler defined for EAT on the Donut class or any of its ancestors, it prints a message saying something along the lines of "I don't know how to EAT a DONUT." (Though this will never happen: there's an .eat() method defined on Noun, the ancestor of everything, so that will be invoked instead. This is handy if you type EAT TABLE: the rejection-by-default message defined on Noun will print "Sorry, the table doesn't look edible." for everything unless something lower down in the inheritance chain overrides it. So Food, a descendant of Thing, which is a descendant of Noun, has a handler for the eat() action that prints a generic message, which is sometimes good enough, and the Donut class can define a Donut-specific handler that overrides Food because you want to print a special message because Donuts are especially delicious. Similarly, you can write custom rejection messages for other classes: Horse.eat() might print "But horses are beautiful, noble creatures! You would NEVER eat one!", whereas Human.eat() might print "The world is falling apart, but you're not ready to resort to cannibalism yet.")
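Here's a stripped-down sketch of that inheritance chain, assuming nothing beyond what's described above. The real classes do much more; these versions return their messages instead of printing them, never touch the model world, and Horse is made a direct child of Noun purely for brevity, so treat all of the details as illustrative.

```python
class Noun:
    """Ancestor of everything; supplies the rejection-by-default handler."""

    def __init__(self, name):
        self.name = name

    def eat(self, **kwargs):
        return "Sorry, the %s doesn't look edible." % self.name


class Thing(Noun):
    """Inherits the default refusal unchanged."""


class Food(Thing):
    def eat(self, **kwargs):
        # Generic success message: sometimes good enough.
        return "You eat the %s." % self.name


class Donut(Food):
    def eat(self, **kwargs):
        # Donut-specific override, because donuts are especially delicious.
        return "Yum, the %s is tasty." % self.name


class Horse(Noun):
    def eat(self, **kwargs):
        # A custom rejection message rather than the default one.
        return "But horses are beautiful, noble creatures! You would NEVER eat one!"
```

With these definitions, Thing("table").eat() falls all the way back to Noun's refusal, Food("apple").eat() gets the generic message, and Donut("donut").eat() gets the special one. The parser never needs to know which of these it's dealing with; it just calls eat() and lets Python's method resolution pick the most specific handler.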
All that being said, and given that data ontology, here's the current, mediocre version of parsing_help.parse(), which handles only some special cases that need to be handled, then passes everything else to another routine to do the real parsing:
    def parse(command):
        """Parse commands entered by the user and respond to them. Note that this routine only handles simple
        situations, delegating more complex multi-part commands to the routine multi_parse(), above.
        """
        su.debug_printer("the command is: %s" % command, 3, prefix="PARSING: ")

        # First, split the command up and check to see if any preprocessing needs to be done.
        command_parts = tokenize(command)
        command_parts = regularize_command(command_parts)
        su.debug_printer("there are %s parts to the command" % (len(command_parts)), 3, prefix=" ")

        try:
            # Next, check to see if there's just one word in the command.
            if len(command_parts) == 1:
                if command_parts[0] == "again":
                    parse(globs.command_history[-1])
                elif command_parts[0] in extradiegetic_verbs:
                    extradiegetic_verbs[command_parts[0]]()
                elif command_parts[0] in snowflake._all_verbs:
                    # If we can dispatch one-word commands through this proxy, do so.
                    getattr(snowflake, command_parts[0])(actor=globs.the_hero)
                else:
                    su.printer("Sorry, I don't know how to " + command_parts[0].strip() + ".")
            # If we haven't handled it yet, pass control off to the real parsing engine.
            elif len(command_parts) > 1:
                multi_parse(command_parts)
        except ParseError as the_complaint:
            su.printer("Sorry, I couldn't understand that. %s" % the_complaint)
        except SilentParseError as the_complaint:
            if str(the_complaint):
                su.printer(str(the_complaint))
        return command_parts[0]
ParseError and SilentParseError are exceptions that are caught at this level; they can be raised anywhere down the call chain to stop processing immediately if it becomes clear that the command cannot be processed. As you might expect, ParseError necessarily prints an "I couldn't understand that" message, whereas SilentParseError does not necessarily do so.
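A self-contained sketch of that pattern follows. The two exception classes are assumed here to be unrelated subclasses of Exception (which is what the order of the except clauses in parse() suggests), and the functions deep_in_the_call_chain() and respond() are made up for the example; messages are collected in a list rather than printed.

```python
class ParseError(Exception):
    """Raised when the command can't be understood; always reported."""


class SilentParseError(Exception):
    """Raised to stop parsing; reported only if it carries a message."""


def deep_in_the_call_chain(words, scope):
    """Stand-in for the real parsing work, several calls below parse()."""
    verb, noun = words[0], words[-1]
    if noun not in scope:
        raise ParseError("You can't see any %s here." % noun)
    if noun == "steve":
        # Pretend Steve has already voiced his own refusal.
        raise SilentParseError
    return "You %s the %s." % (verb, noun)


def respond(words, scope):
    """Catch both exceptions at the top level, as parse() does."""
    messages = []
    try:
        messages.append(deep_in_the_call_chain(words, scope))
    except ParseError as the_complaint:
        messages.append("Sorry, I couldn't understand that. %s" % the_complaint)
    except SilentParseError as the_complaint:
        if str(the_complaint):      # An exception raised with no arguments stringifies to ''.
            messages.append(str(the_complaint))
    return messages
```

The useful property is that code anywhere down the call chain can bail out with one raise statement, without every intermediate function having to check and pass along a failure code.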
So parse() is maybe inaccurately named because it does very little of the actual parsing work; it mostly handles special cases and dispatches more common commands to the longer and more complex multi_parse() routine, about which more in a minute. The only commands actually handled at this level are: (1) AGAIN, which repeats the last action (at this point, by re-parsing it, which is not ideal and will eventually have to be rewritten, but that's for later); (2) extradiegetic ("out of world") verbs, which are listed in a command-dispatch dictionary I'll talk about in a minute; and (3) "snowflake" commands, those that need special handling, and which are handled by a "snowflake" command-dispatch object that I'll also talk about in a minute. (I initially named it that years ago, using the reasoning that "every one of these situations is different, but they can all be handled by an every-situation-is-different object"; that was before American conservatism was sneeringly applying "snowflake" as a label to anyone who isn't a terrible person, or at least when I wasn't as cognizant of that usage. In retrospect, I might rename it for exactly that reason, but that won't happen today. I have other things to do today.)
The extradiegetic_verbs dispatch dictionary just maps specific verbs to functions that handle them, like so:
    # Extradiegetic verbs don't increment the command counter or otherwise "take a turn".
    extradiegetic_verbs = {'about': sc.do_about,
                           'brief': sc.do_brief,
                           'commands': sc.do_help_commands,
                           'credits': sc.do_credits,
                           'debug': sc.do_help_debug,
                           'exit': sc.do_quit,
                           'gender': gender.set_pronoun_preference,
                           'help': sc.do_help,
                           'hint': sc.do_hint,
                           'history': sc.do_print_history,
                           'inventory': sc.do_inventory,
                           'license': sc.do_license,
                           'load': sc.do_load,
                           'ponder': sc.do_ponder,
                           'quit': sc.do_quit,
                           'restore': sc.do_load,
                           'save': sc.do_save,
                           'score': su.print_score,
                           'script': su.ob.start_transcript,
                           'verbs': sc.do_list_verbs,
                           'verbose': sc.do_verbose,
                           }
Most of these are imported from another module, simple_commands, with import simple_commands as sc. Doing extradiegetic_verbs[command_parts[0]]() just looks up the first word of the command in that dictionary and calls the function listed there.
The "snowflake" mostly handles slightly more complex tasks, where an objectless verb needs to be translated to a verb-plus-object pair. It serves as a substitute Noun (note that it is not actually a descendant of Noun) that can be passed to the part of the parser that calls a method on the pseudo-Noun just as if it were an in-game Noun. So, for instance, here is part of its definition:
    class SnowflakeDispatcher(object):
        def defecate(self, actor):
            """Refusal text."""
            sc.do_bodily_functions()

        def go(self, actor, direction_text):
            """Move in a direction."""
            actor._go(' '.join(direction_text))

        def hear(self, actor):
            """Listen for any noises."""
            sc.do_listen()

        def listen(self, actor):
            """Listen for any noises."""
            sc.do_listen()

        def look(self, actor):
            """Describe the current area."""
            sc.do_look()

        def smell(self, actor):
            """Smell the current area."""
            sc.do_smell()

        def sniff(self, actor):
            """Smell the current area."""
            sc.do_smell()

        def wait(self, actor):
            """Let a turn pass without doing anything."""
            sc.do_wait()

        def xyzzy(self, actor):
            """Refusal text."""
            sc.do_xyzzy()


    snowflake = SnowflakeDispatcher()
So this is the object that transforms "go n" to the game's internal representation of a movement action: "north" is not an in-game object, unlike in certain other development systems. (Typing GO NORTH results in calling snowflake.go(), which in turn calls ._go('north') on the Protagonist object, not in finding the north object and GOing it. Directions here are strings, not Nouns. For this reason, it's handled outside the meat of the parsing loop, as a special case, because most of the parsing loop is focused on identifying in-game objects.)
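The excerpt above doesn't show how snowflake._all_verbs is built, so what follows is only a guess at one way such an attribute could work: a property that lists every public method on the dispatcher. MiniDispatcher and its return-a-string methods are invented for this sketch.

```python
class MiniDispatcher:
    """Hypothetical pared-down stand-in for SnowflakeDispatcher."""

    def look(self, actor):
        return "%s looks around." % actor

    def wait(self, actor):
        return "%s waits." % actor

    def go(self, actor, direction_text):
        # direction_text arrives as a tokenized list, e.g. ['north'].
        return "%s goes %s." % (actor, ' '.join(direction_text))

    @property
    def _all_verbs(self):
        """Names of all public methods, i.e. the verbs this proxy handles."""
        cls = type(self)
        return sorted(name for name in dir(cls)
                      if not name.startswith('_') and callable(getattr(cls, name)))


dispatcher = MiniDispatcher()


def dispatch(verb, **call_parameters):
    """Call the verb method on the proxy, exactly as if it were a Noun."""
    if verb not in dispatcher._all_verbs:
        return "Sorry, I don't know how to %s." % verb
    return getattr(dispatcher, verb)(**call_parameters)
```

So dispatch("go", actor="hero", direction_text=["north"]) ends up in MiniDispatcher.go() with the direction still a plain list of strings, which is the same shape of call that the parser makes on real in-game objects.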
The meat of the parsing is handled by multi_parse, which takes a tokenized list of strings and tries to match descriptions in it to objects that are in scope:
    def multi_parse(command):
        """Parse a multi-part command. Pass in the tokenized command list as COMMAND.

        Some special cases are understood and handled outside of this main parsing logic. Currently, these are:
                                                        #FIXME: should just be extradiegetic verbs
            * SAVE, LOAD, RESTORE;                      #FIXME: this whole list needs revision
            * GO, MOVE;
            * any single-word command;
            * commands that begin with the verb DEBUG;
            * and maybe other things handled below in parse(), though I try to remember to keep this list more or
              less current.
        """
        from bin.core.nouns import Noun     # Avoid a circular dependency by not importing at the top.

        # This is the new v4 parser that evolved out of the v3 parser.
        # Last commit with v1 parser had SHA-1 hash of cec1b506508d058560981b06d212746bca9e4c5b.
        # 2nd parser was too complex, never worked well, and was never committed in Git.
        # Final commit for v3 parser had SHA-1 hash of 4bbccb4fae21a109d4974ba414e35feb4f020a3f (21 July 2016)
        # First work on v4.1 parser started: 30 May 2018.

        su.debug_printer("Tokenized command is: %s" % command, 3, prefix="PARSING: ")

        # Modifiers of various kinds get shoved into this dictionary. This will become the **kwargs parameter for the
        # call to the relevant object's verb method, once we've identified the relevant object and its verb.
        # Examples of modifiers understood include:
        #   actor           -> who's performing the action. The grammatical subject of the action.
        #                      (Default: globs.the_hero). IMPLEMENTED.
        #   using           -> what tool is used in performing the action.
        #   about           -> a topic of conversation.
        #   dest-*          -> broken up automatically into:
        #       dest        -> where to? -- for movement of things
        #       prep        -> what preposition expresses the spatial relationship?
        #   direction_text  -> tokenized description of the user's directional phrase (for snowflake.go)
        call_parameters = dict()

        # Clean up the command by removing any words that have been identified as NEVER being meaningful to the parser.
        stripped_command = [x for x in command if x not in fluff_words]     # FIXME: don't strip fluff words inside quotes

        # First: check to see if someone is being addressed directly.
        su.debug_printer("Checking to see if anyone is being addressed directly", 3, prefix="PARSING: ")
        comma = False
        for pos, word in enumerate(stripped_command):   # Find the first word in the command that ends with a comma
            if word.endswith(','):
                comma = pos
                break
        if not isinstance(comma, bool):     # Something ends with a comma. Let's see if it's a plausible addressee.
            possible_object = stripped_command[0:1 + comma]
            possible_object[-1] = possible_object[-1].rstrip(',')   # Take the comma off
            possible_addressees = [what for what in object_list_from_description(possible_object, default_scope())
                                   if what._is_addressable()]
            if possible_addressees:         # Someone present is in fact being addressed here.
                su.debug_printer("taking the string '%s' to indicate that someone is being directly addressed. ... potential matches are: %s" % (' '.join(possible_object).upper(), possible_addressees), 3, prefix="PARSING: ")
                possible_addressee = prune_possibilities(object_description=possible_object,
                                                         current_options=possible_addressees,
                                                         the_verb=None, actor=None)
                if possible_addressee:      # If there's anything left, we'll assume the Creature being addressed is the first one in the list
                    gender.update_pronouns_from(possible_addressee)
                    stripped_command = stripped_command[1 + comma:]     # Remove the appositive from the command before parsing any more
                    su.debug_printer("determined that the person being addressed is: %s" % possible_addressee, 3, prefix="PARSING: ")
                    if possible_addressee._accept_command(the_command=stripped_command, who_commands=globs.the_hero):
                        call_parameters['actor'] = possible_addressee
                    else:   # If the person addressed declines to do what the Protagonist wants, stop trying to parse: it's over
                        raise SilentParseError
        if 'actor' not in call_parameters:
            call_parameters['actor'] = globs.the_hero

        # Our next goals are: (a) to find one or more direct objects in the command and map the textual representation
        # that the player typed in zir command onto one or more in-game objects; and (b) to discover what verb method the
        # player's command needs to call on those objects.
        # Subtasks of (a) include parsing prepositional phrases.

        # Some commands don't map neatly onto the [verb] [in-game object(s)] model. For these special-snowflake cases, we
        # use the proxy SnowflakeDispatcher object to act as a virtual in-game object for the player's command. In these
        # special cases, the object is manually "found" by special-casing code in the parser and allowed to serve as
        # direct-object-to-which-calls-are-dispatched, just as if it were a real in-game object. Its methods then re-route
        # to whatever other calls are necessary to execute the command. These verb methods on the virtual proxy object
        # take **kwargs parameters just like "real" objects (i.e., descendants of nouns.Noun). In particular, they should
        # expect to receive an ACTOR parameter, just like other verb methods.

        the_verb = None
        direct_objects = [][:]

        # First, treat special cases.
        if stripped_command[0] == "debug":
            debugging.do_debug_command(stripped_command)
            return
        elif stripped_command[0] in ['go', 'move'] and len(stripped_command) == 2 and stripped_command[1] in movement_directions:
            the_verb = 'go'
            direct_objects = [snowflake]    #FIXME: we should be able to MOVE a Noun.
            call_parameters['direction_text'] = stripped_command[1:]
        elif False:     # Do whatever other special-situation processing needs to be done for other special situations.
            pass
        elif len(stripped_command) == 1 and stripped_command[0] in snowflake._all_verbs:
            the_verb = stripped_command[0]
            direct_objects = [snowflake]

        if the_verb is None:
            # All right, there are a number of assumptions about the command structure that we're parsing. ("Non-simple"
            # means commands that aren't handled by parse(), below.) These assumptions are:
            #   1. A (non-simple) command has this form:
            #      [ADDRESSEE], (VERB) (DIRECT_OBJECT) [and DIRECT_OBJECT and DIRECT_OBJECT .. ] [PREPOSITIONAL_PHRASE] [PREPOSITIONAL_PHRASE...]
            #   2. Some verbs can only take one direct object; those are listed in single_direct_object_verbs.
            #   3. Multiple direct objects have to be connected with AND. If there are more than 2, *all* have to be
            #      connected with AND. (No commas.)
            #   4. Prepositional phrases are sometimes optional, sometimes required to complete an action.
            #      - they are essentially modifiers, as in ATTACK RICK WITH SPOON or ASK STEVE ABOUT THE PUB
            #      - grammar is enforced with decorators: verb methods might be modified with, say, @MustSpecifyTool, or @CantSpecifyTopic
            #      - prepositional phrases MUST COME AFTER all direct objects.

            # We've already found any appositive leaders. What's next is the verb, or at least the first word of it.
            the_verb = stripped_command[0]      # Luckily, the simple imperative mood in English always begins with the verb

            # Check to see if there are any quoted (spoken or written) phrases in the command
            # Current limitations:
            #   * double quotes only
            #   * no nested quotes, period
            #   * ABSOLUTELY NO FUNNY BUSINESS WITH QUOTES
            if '"' in ' '.join(stripped_command[1:]):
                found_a_phrase = True       # Pass through at least once
                while found_a_phrase and '"' in ' '.join(stripped_command[1:]):
                    opening_pos, closing_pos, found_a_phrase = False, False, False
                    for i, word in enumerate(stripped_command[1:]):     # Skip the verb
                        if not opening_pos:     # We're looking for an opening quote
                            if word.startswith('"'):
                                opening_pos = i + 1     # Remember, we skipped the verb; we want this to be the position relative to the command as a whole
                        if opening_pos:         # We're looking for a closing quote.
                            if word.endswith('"'):
                                closing_pos, found_a_phrase = i + 1, True
                                break
                    if not found_a_phrase:
                        raise ParseError("I'm confused by your use of quotation marks.")
                    the_phrase = stripped_command[opening_pos:closing_pos+1]
                    direct_objects += [objects.Phrase(_the_phrase=' '.join(the_phrase))]
                    # Cut the quoted phrase out of the command, keeping the words on both sides of it.
                    # (Slicing past either end of a list just yields an empty list, so no bounds checks are needed.)
                    stripped_command = stripped_command[:opening_pos] + stripped_command[closing_pos+1:]

            # Now, check to see if the verb might be a phrasal verb.
            # FIXME: practically speaking, this method only works for two-word (not three-or-more-word) phrasal verbs.
            endings = possible_phrasal_endings(the_verb)
            if endings:
                su.debug_printer("possible phrasal verb endings %s detected for verb %s." % (', '.join([shlex.quote(e) for e in endings]), the_verb), 3, prefix="PARSING: ")
                for e in endings:
                    su.debug_printer("Checking for potential ending %s in command. " % e, 3, prefix=" ")
                    if e in stripped_command[1:]:   # We found the other part of our separable multipart verb
                        the_verb = "%s_%s" % (the_verb, e)      # Munge the verb name to match the method name for the object
                        stripped_command[0] = the_verb
                        stripped_command.remove(e)
                        su.debug_printer("Found it, and rearranged command to %s." % ' '.join(stripped_command).upper(), 3, prefix=" ")
                        break
                    else:
                        su.debug_printer("...not found.", 3, prefix=" ")

            # Now that we know what the verb is, figure out what the potential objects we might be trying to match for that verb are.
            search_path = get_scope(verb=the_verb, actor=call_parameters['actor'] if ('actor' in call_parameters) else None)

            su.debug_printer(" PARSING: After pruning, the stripped, verb-normalized command is: %s." % (str(stripped_command)), 3)
            su.debug_printer(" The search path for objects is %s." % str(search_path), 3)
            su.debug_printer(" The verb detected is '%s'." % the_verb, 3)

            objects_text = stripped_command[1:]     # OK, let's deal with everything after the verb.

            # Check to see if there are any prepositional phrases in the command.
            su.debug_printer("PARSING: looking for prepositional phrases.", 3)
            su.debug_printer(" Text remaining to parse: %s" % objects_text, 3)
            preposition_locations = [][:]           # First, let's pull out all the prepositional phrases
            for i, w in enumerate(objects_text):    # We start by finding the list indexes where each phrase starts.
                if w in prepositions:               # Note that prepositional phrases have to come at the end of the sentence.
                    preposition_locations += [i]
            if preposition_locations:
                su.debug_printer(" prepositions found:", 3)
                for i in preposition_locations:
                    su.debug_printer("\t%d\t->\t%s" % (i, objects_text[i]), 3)
            else:
                su.debug_printer(" no prepositions found.", 3)
            for i, w in enumerate(preposition_locations):   # OK, examine and process each phrase individually.
                if i < (len(preposition_locations) - 1):
                    the_phrase = objects_text[w:preposition_locations[i + 1]]
                else:
                    the_phrase = objects_text[preposition_locations[i]:]
                su.debug_printer("PARSING: examining the prepositional phrase '%s'" % ' '.join(the_phrase), 3)
                key, what = prepositions[the_phrase[0]](command=the_phrase, actor=call_parameters['actor'])
                if key.strip().startswith('dest-'):     # Special handling here.
                    call_parameters['dest'] = what
                    call_parameters['prep'] = key.strip()[len('dest-'):]
                else:
                    call_parameters[key] = what
                if isinstance(what, Noun):      # Only try to update pronouns based on parser-accessible in-game objects.
                    gender.update_pronouns_from(what)
            if preposition_locations:
                objects_text = objects_text[: preposition_locations[0]]     # Strip off the prepositional phrases at the end.

            # OK. Now, split the objects_text into a list of chunks (each of which is a list of words), separated by 'and',
            # dropping the 'and' each time. Each of these chunks will be parsed as a direct object.
            # This may result in just one part. In fact, it usually will. That's OK.
            direct_object_phrases = su.split_list(objects_text, 'and')
            su.debug_printer("direct objects split into: %s" % direct_object_phrases, 3, prefix="PARSING: ")
            for which_phrase in direct_object_phrases:
                if which_phrase:    # Don't try to match empty lists and other non-truthy objects
                    su.debug_printer("about to find an object matching the description %s" % ' '.join(which_phrase).upper(), 3)
                    the_object = object_from_description(which_phrase if (isinstance(which_phrase, list)) else [which_phrase],
                                                         the_verb, search_path, actor=call_parameters['actor'])
                    direct_objects.append(the_object)
                    gender.update_pronouns_from(the_object)     #FIXME: we have previously updated from prepositions, which come later. This is potentially confusing.

        if the_verb in single_direct_object_verbs:
            su.debug_printer("Note that %s is a verb that can take only one direct object." % the_verb, 3, prefix=" ")
            if len(direct_objects) > 1:
                raise ParseError('You can only %s one thing at a time.' % the_verb)

        # Now we've got a verb and a list of things to do it to. Let's do it to them.
        if not direct_objects:
            raise ParseError("I couldn't figure out what you're trying to interact with.")     #FIXME: find a missing dir. obj.
        for the_item in direct_objects:
            getattr(the_item, the_verb)(**call_parameters)      # Call the THE_VERB() method of each THE_ITEM with CALL_PARAMETERS.
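Two of the list-slicing steps in multi_parse() are easier to follow in isolation: peeling the prepositional phrases off the end of the command, and splitting what's left on AND. The sketch below is an approximation, not the game's code. su.split_list() isn't shown in this post, so split_list() here is a guess at its behavior; PREPOSITIONS is a made-up set standing in for the real prepositions dispatch dictionary; and split_objects_and_phrases() is a name invented for the example.

```python
PREPOSITIONS = {'with', 'about', 'in', 'on'}    # Hypothetical; the real table maps words to handler functions.


def split_list(words, separator):
    """Split a list of words into chunks at each occurrence of separator, dropping the separator."""
    chunks, current = [], []
    for word in words:
        if word == separator:
            chunks.append(current)
            current = []
        else:
            current.append(word)
    chunks.append(current)
    return chunks


def split_objects_and_phrases(objects_text):
    """Return (direct-object chunks, prepositional phrases) for the words after the verb."""
    # Find the index at which each prepositional phrase starts.
    locations = [i for i, w in enumerate(objects_text) if w in PREPOSITIONS]
    # Each phrase runs from its preposition up to the next preposition (or the end of the command).
    ends = locations[1:] + [len(objects_text)]
    phrases = [objects_text[start:end] for start, end in zip(locations, ends)]
    if locations:
        objects_text = objects_text[:locations[0]]  # Everything before the first preposition is direct objects.
    return split_list(objects_text, 'and'), phrases
```

For ATTACK RICK AND STEVE WITH SPOON, the words after the verb come back as two direct-object chunks, ['rick'] and ['steve'], plus one prepositional phrase, ['with', 'spoon']. A doubled AND produces an empty chunk, which is presumably why the real loop skips any chunk that isn't truthy.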











