Entities and Components in NullAwesome
Remember the 90s when object-oriented programming was all the rage? There was not only time for Klax, there was also apparently plenty of time to toss your old structured programming code, learn new buzzwords like "encapsulation", "inheritance", and "polymorphism", and come to grips with bundling your operations together with your data in "methods" rather than writing functions or procedures to operate on the data.
The object-oriented craze hit the mainstream around the same time GUIs did -- in the late 1980s and the early 1990s. I don't believe this to be a coincidence, as GUIs -- with their hierarchy of nested controls distinguished by polymorphic behavior in response to keyboard and mouse events -- seemed a close fit to be modelled in an object-oriented fashion. At the time, it seemed like games were going to be an even closer fit.
Now, a game engine generally consists of three parts running in a loop: an input phase, during which keyboard and mouse input is collected or responded to; an update phase, during which the game state -- or status of all the objects in the game world -- is updated for the next timer tick; and a render phase, during which the game world as represented by the current game state is drawn on the screen. Depending on the speed of your rendering routines, there could be multiple update ticks per rendering frame. If you have an object-oriented hammer, this looks like a perfect nail to beat into the wood with a class hierarchy: simply create classes that represent game objects and have update() and render() methods which can be invoked at the appropriate points in the game loop, and then put instances of those classes in an array list or linked list and loop through them, calling their update() and render() methods. This is more or less how SpriteCore, my first attempt at a game framework originally written over 20 years ago, works.
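To make the three phases concrete, here is a minimal sketch of such a loop with a fixed update tick. Everything in it is an assumption for illustration: the class name GameLoopSketch, the runFrame() method, the 60-updates-per-second tick, and the counters that stand in for real work. It is not how SpriteCore or NullAwesome is actually written.

```java
/** A minimal fixed-timestep loop sketch; all names here are hypothetical. */
class GameLoopSketch {
    static final long TICK_NANOS = 1_000_000_000L / 60; // assumed 60 updates per second

    long accumulator = 0;  // elapsed time not yet consumed by update ticks
    int updates = 0;       // counters stand in for real update/render work
    int renders = 0;

    void pollInput() { /* collect keyboard and mouse events here */ }
    void update()    { updates++; }  // advance the game state by one tick
    void render()    { renders++; }  // draw the current game state

    /** Runs one frame, given how much time passed since the previous frame. */
    void runFrame(long elapsedNanos) {
        pollInput();
        accumulator += elapsedNanos;
        // A slow frame leaves several ticks' worth of time to catch up on.
        while (accumulator >= TICK_NANOS) {
            update();
            accumulator -= TICK_NANOS;
        }
        render();
    }
}
```

Calling runFrame() with three ticks' worth of elapsed time runs update() three times and render() once, which is the "multiple update ticks per rendering frame" case.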
Here's a concrete example. Let's say you were going to write Super Mario Bros., or a blatantly copyright-infringing clone thereof, in C++. You think to yourself: hmm, a lot of the objects in the game, such as Mario himself, the enemies, the power-ups, and even the coins have lots of things in common, such as a location, (possibly zero) velocity, and an image or animation to be drawn on the screen. I know! I'll create a base class, GameObject, that encapsulates all of these and then subclass it for more specific objects. The subclasses will extend the base class with information and state relevant to that particular game object. I'll create a Player class for Mario himself, an Enemy class for things that can hurt him, a Powerup class for things that power you up -- wait a tick, the Powerup class should be a subclass of the Collectible class, because there are things like coins and 1-ups that don't power you up but you can nevertheless collect them... and so on and so forth.
Well, I ran into some problems when attempting to write games with this method. Let's say you want to model an object that's affected by gravity. Do you put it in the GameObject class? Hold up there, shorty: there are some objects -- such as Lakitu and the Super Leaf that turns you into Raccoon Mario -- that aren't affected by gravity. So what do you do? Typically GameObject or some other base class will have a set of flags associated with it. Do you set an AFFECTED_BY_GRAVITY flag in objects of that class? Okay, fine. You're breaking the class abstraction but whatever. But what if there is state associated with being affected by gravity, like the acceleration induced by gravity or the center of the pull? (We're getting pathological now, and turning this game into 2D Mario Galaxy, but whatever, just go with it.) Then objects with the AFFECTED_BY_GRAVITY flag will use that state but objects without that flag will just have that extra memory sitting there, useless.
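Here is roughly what that flag-based workaround looks like, as a sketch only. The field names (gravityAccel, pullCenterX, pullCenterY), the second flag, and the 9.8 constant are made up for illustration.

```java
/** Sketch of the flag-based workaround; class and field names are illustrative only. */
class GameObject {
    static final int AFFECTED_BY_GRAVITY = 1 << 0;
    static final int AFFECTED_BY_WALLS   = 1 << 1;

    int flags;
    float x, y, vx, vy;

    // Gravity state lives in every instance, whether or not the flag is set.
    float gravityAccel = 9.8f;
    float pullCenterX, pullCenterY;

    GameObject(int flags) { this.flags = flags; }

    void update(float dt) {
        // Only flagged objects accelerate; everyone still pays for the fields above.
        if ((flags & AFFECTED_BY_GRAVITY) != 0) {
            vy += gravityAccel * dt;
        }
        x += vx * dt;
        y += vy * dt;
    }
}
```

A Lakitu built with new GameObject(0) never touches gravityAccel or the pull center, yet carries them around anyway.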
Aha, but what if we could compose object types by specifying traits, with state and behaviors that are unique to each trait? So you could say something like:
class Mushroom <AffectedByWalls, AffectedByGravity, Powerup> { .... }
class Lakitu <Enemy> { .... }
and have all of the behaviors of the various traits AffectedByWalls, AffectedByGravity, Powerup, and Enemy incorporated into those classes, leaving to you only the need to specialize the classes with behaviors unique to those objects?
Some programming languages allow this in their object systems, with extensions to object-oriented programming actually called "traits" (or sometimes "mixins"). C++ and Java do not allow this sort of style easily. It can be done in C++ with multiple inheritance or in Java with interfaces, but there is more boilerplate that must be written in order to make it all work. One thing I learned in the course of writing SpriteCore (and trying to write games with it) is just how hard it is to make this work. The behavior of any given sprite is scattered across many different classes and hence source files. If you want two object types to share a common behavior, such as floating or shooting fireballs, you have to either implement that functionality separately in each class, or put it as a method in their closest shared ancestor class and call it from within the subclasses. But this risks exposing a float() or shootFireball() method to classes that don't need it, and all such classes have to have instance variables associated with those, which they will never use. (Perhaps fireball-shooting things keep track of the fireballs they shot.)
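To give a feel for the boilerplate, here is a sketch of the interface approach, assuming Java 8 or later so that interfaces can carry default methods. The accessor names, the "grow" effect, and the 9.8 constant are invented for this example. The catch it shows is that an interface can share behavior but not state, so each class has to supply its own fields and accessors.

```java
/** Trait-like interface: behavior is shared, but state must be supplied by each class. */
interface AffectedByGravity {
    float getVy();
    void setVy(float vy);

    default void applyGravity(float dt) {
        setVy(getVy() + 9.8f * dt);
    }
}

interface Powerup {
    String effect();
}

/** Every implementing class repeats the same field and accessor boilerplate. */
class Mushroom implements AffectedByGravity, Powerup {
    private float vy;

    public float getVy() { return vy; }
    public void setVy(float vy) { this.vy = vy; }
    public String effect() { return "grow"; }
}
```

Every other gravity-affected class would need the same vy field and the same two accessors written out again.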
There are ways around these problems, but the way modern game developers seem to like best is called the entity-component system, or ECS. Instead of a bag of GameObjects, in ECS the game world is represented as a state vector in a configuration space. So what that means is it's a real mathy way of saying -- okay, say I'm in 3-dimensional space and you want to find my position. x, y, z coordinates; a vector of 3 elements, 3 dimensions. But now say you also want to track my orientation, how I'm rotated in 3-dimensional space. So now describing my exact configuration -- position and orientation -- takes a 6-dimensional vector.
But it gets better. Say my friend Jay is in space with me, and you want to describe both my position and orientation and his. You can do so with a 12-dimensional vector -- 3 for my position, 3 for my orientation, 3 for Jay's position, and 3 for Jay's orientation. The vector space in which our positions and orientations are reckoned with these long vectors is called a configuration space. So now expand this thought to encompass an entire game world, where you track not just the position and orientation of everything in the game world, but also whether it is alive or dead, whether it is floating, how much health it has, how much ammo or MP it has, and so forth in a great big configuration space with a very large dimension.
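In code, the me-and-Jay example might look like the sketch below. The class name ConfigurationSpace and the choice to store orientation as three angles are assumptions made for illustration.

```java
/** Packs position and orientation of several bodies into one flat state vector. */
class ConfigurationSpace {
    static final int DIMS_PER_BODY = 6; // x, y, z plus three orientation angles (assumed)

    final double[] state;

    ConfigurationSpace(int bodies) {
        state = new double[bodies * DIMS_PER_BODY];
    }

    void setPosition(int body, double x, double y, double z) {
        int base = body * DIMS_PER_BODY;
        state[base] = x;
        state[base + 1] = y;
        state[base + 2] = z;
    }

    void setOrientation(int body, double rx, double ry, double rz) {
        int base = body * DIMS_PER_BODY + 3;
        state[base] = rx;
        state[base + 1] = ry;
        state[base + 2] = rz;
    }

    /** Two bodies give a 12-dimensional vector, as in the text. */
    int dimension() { return state.length; }
}
```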
Another way to put it is to say that your game world is in a big spreadsheet, with one row for each game object and as many columns as there are variables in that object's state. Changes made to the sheet are reflected in the game world on the rendering step. The columns in the sheet are divided into components. Each component is a group of variables -- a data structure -- which describes some aspect of the entity. So no longer do you have a Player class, an Enemy class, a Powerup class, etc. A Player is an entity that has all the components necessary to make a Player work (so things like a position and velocity, but also things like current power-up state or HP or even a structure that maps user input to player actions). An Enemy might have many of the same components that a Player has, but include one that specifies how its AI will behave. A Powerup is an entity that has all of the components necessary to make a Powerup work (so position and velocity again, but also some component which describes the effect the Powerup will have when picked up). And so forth.
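A toy version of the spreadsheet idea, as a sketch: each component is a small data class, each component type gets its own column, and an entity is whatever its row happens to have filled in. The component names and the WorldSheet class are illustrative and are not NullAwesome's actual components.

```java
/** Components are plain data; the names below are illustrative only. */
class Position { float x, y; }
class Velocity { float dx, dy; }
class EnemyAi { String behavior = "patrol"; }
class PowerupEffect { String effect = "grow"; }

/** One row per entity, one column per component type; null means "no such component". */
class WorldSheet {
    static final int ROWS = 8;

    final Position[] position = new Position[ROWS];
    final Velocity[] velocity = new Velocity[ROWS];
    final EnemyAi[] ai = new EnemyAi[ROWS];
    final PowerupEffect[] powerup = new PowerupEffect[ROWS];

    /** An enemy is just a row with position, velocity, and AI filled in. */
    void makeEnemy(int row) {
        position[row] = new Position();
        velocity[row] = new Velocity();
        ai[row] = new EnemyAi();
    }

    /** A power-up shares position and velocity but swaps the AI for an effect. */
    void makePowerup(int row) {
        position[row] = new Position();
        velocity[row] = new Velocity();
        powerup[row] = new PowerupEffect();
    }
}
```

There is no Enemy or Powerup class anywhere in this sketch; the difference between the two is only which columns are non-null.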
So actually, you can basically model your entire game as a single function which takes a game state, user input, and the time and yields a new game state. But you don't have to. First of all, we're working in Java so it's okay to update the state vector in-place. Secondly, we can break these functions into chunks, which for the time being I'll call updaters. In conventional ECS terminology these are called "systems", but I'm going to call them updaters just to be clear (or less blurry).
Each updater runs each timer tick and operates only on those entities which it's interested in, but it does them all at once. So a basic updater would go through the entity table, look at all entities that have a position and a velocity (generally these will be integrated into one "physics" component), and update the position by adding the velocity on each timer-tick. Another updater may check for collisions between players and enemies, and update the player's status to "dead" (or decrease its HP) accordingly; depending on conditions (such as a player landing on an enemy's head) it may decide to kill the enemy instead. Yet another updater will decide what actions the enemy will take to attack the player: walking, flying, homing in on the player's location, etc. So it's the updaters which are the basic classes we'll define in this new world order of gaming.
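The position-plus-velocity updater is about the simplest one there is. Here is a minimal sketch of it, assuming a hypothetical Physics component and a plain array as the entity table, where a null slot means the entity has no such component.

```java
/** Hypothetical physics component holding position and velocity together. */
class Physics {
    float x, y, vx, vy;

    Physics(float x, float y, float vx, float vy) {
        this.x = x;
        this.y = y;
        this.vx = vx;
        this.vy = vy;
    }
}

/** An updater that moves every entity that has a Physics component. */
class MovementUpdater {
    /** Runs once per timer tick; slots without a Physics component are skipped. */
    static void update(Physics[] table) {
        for (Physics p : table) {
            if (p == null) continue;
            p.x += p.vx;
            p.y += p.vy;
        }
    }
}
```

Note that the updater has no idea whether a given row is a player, an enemy, or a coin. It only cares that the row has the data it operates on.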
We'll also define some renderer classes, which run during the rendering phase and are responsible for basically drawing the game world on the screen. These work on the same principle: each renderer examines only the entities it's interested in and draws them all at once. So for a 2D game like NullAwesome you might have a background renderer which draws a tile-mapped background, a sprite renderer which draws individual bitmaps, etc.
Entity-component systems have several advantages over the "bag of subclassed GameObjects". For one, they favor composition over inheritance, which is an object-oriented catchphrase meaning that instead of relying on subclasses to extend functionality, you create objects by gluing together relevant -- well, components. For another, if you are working in a language like C++ (which we aren't for NullAwesome), an ECS lets you lay out the game data in any way you see fit, allowing the updaters to update the game state in a way that minimizes cache misses. On modern CPUs, this can translate into huge performance benefits.
ECS game engines are also more data-driven, which has its own benefits. For example, if you avoid the use of pointers in your game's components, you also get a win when you go to implement save games: simply copy the contents of the entire game state vector from memory to disk and you have a save file. By double-buffering the state vector, you can have all updates go into the "back buffer", ensuring that every updater reads the current state of the world from an unchanging snapshot. Since entities are just data, you aren't locked into a particular language's object system, which makes it easier to interface other programming languages such as Lisp, Python, or Lua to an ECS. You can even write ECS-based game engines in plain C.
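The double-buffering idea can be sketched in a few lines. This is an illustration under my own assumptions: the state is a bare float array, and the smooth() updater is a made-up example chosen because its result depends on neighboring values, which is exactly where reading half-updated state would go wrong.

```java
/** Double-buffered state: updaters read front and write back, then the two swap. */
class DoubleBufferedState {
    float[] front;  // stable snapshot of the current tick
    float[] back;   // receives the next tick's values

    DoubleBufferedState(int size) {
        front = new float[size];
        back = new float[size];
    }

    /** Example updater: each cell becomes the average of itself and its left neighbor. */
    void smooth() {
        for (int i = 0; i < front.length; i++) {
            // Always the old value, never one already changed during this tick.
            float left = front[i == 0 ? 0 : i - 1];
            back[i] = (front[i] + left) / 2f;
        }
    }

    /** Called once per tick after all updaters have run. */
    void swap() {
        float[] tmp = front;
        front = back;
        back = tmp;
    }
}
```

Starting from front = {0, 2, 4}, one smooth() and swap() gives {0, 1, 3}. Updating a single array in place would instead produce 2.5 in the last cell, because it would read the already-modified middle value.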
Finally, I find ECS-based engines to be more testable. All you need to test an updater is the state vector itself, the updater, and whatever classes that one updater depends on (generally very few). There's no need to mock out the world.
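As a sketch of what such a test can look like, here is a made-up damage updater together with a hand-built state to run it against. The Health component, the DamageUpdater, and the check itself are hypothetical and use no test framework. The point is only that the setup is a three-element array.

```java
/** Hypothetical component: hit points and nothing else. */
class Health {
    int hp;
    Health(int hp) { this.hp = hp; }
}

/** Hypothetical updater: damages every entity that has a Health component. */
class DamageUpdater {
    static void update(Health[] table, int damage) {
        for (Health h : table) {
            if (h != null) h.hp = Math.max(0, h.hp - damage); // clamp at zero
        }
    }
}

/** A framework-free check: build a tiny state, run the updater, inspect the result. */
class DamageUpdaterCheck {
    static boolean run() {
        Health[] table = { new Health(3), null, new Health(1) };
        DamageUpdater.update(table, 2);
        return table[0].hp == 1 && table[1] == null && table[2].hp == 0;
    }
}
```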
There are drawbacks: ECS seems to be a more convoluted, less clear way of describing game behavior, especially if you grew up with object-oriented programming. It seems to make much more sense to say "if an enemy sees the player on this game tick, the enemy should fire at the player" by defining seesPlayer() and fire() methods in some Enemy class and calling them each tick, than to write procedures which update some memory table. But as I found out -- and as I would come to discover the developers of StarCraft found out way ahead of me -- writing things the traditional bag-of-GameObjects way also leads to convolution and lack of clarity, even if it doesn't seem like it would at first glance.
The entity-component system in NullAwesome is a fairly straightforward implementation. Each entity is referenced by an integer called an eid (for Entity ID). The EntityRepository class, a singleton, stores the component info for each entity (components are Plain Old Java Objects with all-public instance variables; each component type is a different class). It does this with a hashtable mapping component classes to arrays of length MAX_ENTITIES. For any given eid x, the element in the xth slot in, say, the SpriteMovement array will contain the SpriteMovement component for the entity whose eid is x.
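The storage scheme just described can be sketched as follows. This is a stand-in written for this explanation, not the real EntityRepository: the ComponentStore and SpriteMovementSketch names, the MAX_ENTITIES value of 16, and the generic getComponent signature are all assumptions, and it leaves out entity allocation and error checking entirely.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in for the storage scheme described above; not the real EntityRepository. */
class ComponentStore {
    static final int MAX_ENTITIES = 16; // assumed size

    // One array ("column") per component class, indexed by eid.
    private final Map<Class<?>, Object[]> columns = new HashMap<>();

    void addComponent(int eid, Object comp) {
        // getClass() yields the most specific runtime class of comp.
        Object[] column = columns.computeIfAbsent(comp.getClass(), k -> new Object[MAX_ENTITIES]);
        column[eid] = comp;
    }

    /** Returns the component of the given class for this eid, or null if there is none. */
    <T> T getComponent(int eid, Class<T> kls) {
        Object[] column = columns.get(kls);
        return column == null ? null : kls.cast(column[eid]);
    }
}

/** Hypothetical component with public fields, in the POJO style described above. */
class SpriteMovementSketch {
    public float x, y;
}
```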
All this is handled for you behind the scenes by the EntityRepository's interface. Here are the highlights:
get(): Call EntityRepository.get() to retrieve the app's one and only Entity Repository.
newEntity(): Allocates a fresh new entity and returns its eid, or throws an exception if the entity table is full.
removeEntity(int eid): Removes the entity with eid eid from the table. This eid may be subsequently reused on future calls to newEntity().
addComponent(int eid, Object comp): Adds a component comp to the entity whose eid is eid. The entity will be flagged as having a component of comp's class as determined by the Java runtime; i.e., the most specific class to which comp belongs. Can throw an exception if eid does not exist.
getComponent(int eid, Class kls): Returns the component of type kls belonging to the entity whose eid is eid, or null if the entity has no such component. Throws an exception if eid does not exist.
processEntitiesWithComponent(Class kls, EntityProcessor p): Calls p's process() method on all entities having a component of type kls.
findEntityWithComponent(Class kls): Returns the eid of the first entity having a component of type kls. Some entities -- such as the one representing the current level -- are singletons by design, which is where this method comes in handy.
Updaters conform to the UpdateAgent interface; renderers conform to the RenderAgent interface. Both kinds of "agents" often use the EntityRepository's processEntitiesWithComponent() method to perform the same operation across a selection of entities in the system.
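Here is a sketch of how an updater might lean on processEntitiesWithComponent(). Be warned that the signatures are guesses: this post names process() and the two agent interfaces but doesn't spell out their parameters, so process(int eid) and update(MiniRepository) below are assumptions, and MiniRepository and Movement are tiny stand-ins rather than the real classes.

```java
import java.util.HashMap;
import java.util.Map;

/** Assumed callback shape: the post names process() but not its parameters. */
interface EntityProcessor {
    void process(int eid);
}

/** Assumed shape of an updater; the real UpdateAgent interface may differ. */
interface UpdateAgent {
    void update(MiniRepository repo);
}

/** Hypothetical component with public fields. */
class Movement {
    public float x, vx;
}

/** Tiny stand-in repository with just enough behavior for this example. */
class MiniRepository {
    static final int MAX_ENTITIES = 8; // assumed size
    private final Map<Class<?>, Object[]> columns = new HashMap<>();

    void addComponent(int eid, Object comp) {
        columns.computeIfAbsent(comp.getClass(), k -> new Object[MAX_ENTITIES])[eid] = comp;
    }

    Object getComponent(int eid, Class<?> kls) {
        Object[] column = columns.get(kls);
        return column == null ? null : column[eid];
    }

    /** Calls p.process(eid) for every eid that has a component of class kls. */
    void processEntitiesWithComponent(Class<?> kls, EntityProcessor p) {
        Object[] column = columns.get(kls);
        if (column == null) return;
        for (int eid = 0; eid < column.length; eid++) {
            if (column[eid] != null) p.process(eid);
        }
    }
}

/** Updater that advances every entity having a Movement component. */
class MovementAgent implements UpdateAgent {
    public void update(MiniRepository repo) {
        repo.processEntitiesWithComponent(Movement.class, eid -> {
            Movement m = (Movement) repo.getComponent(eid, Movement.class);
            m.x += m.vx;
        });
    }
}
```

A render agent would presumably follow the same pattern, with drawing code in place of the position update.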
Here are some of the component types I've written so far for NullAwesome:
SpriteMovement: contains position, velocity, and acceleration information. It's not only used for in-game sprites but it also determines, for example, where the screen is scrolled to in the level.
SpriteShape: Determines the image and animation a particular sprite has. Animation data is loaded from JSON, which I may get into in a future post.
PlayerInfo: Info unique to the player character.
StageInfo: Info about the current level, including a TileMap object representing the level's layout.
There will of course be many more.
So that's a quick look at entity-component systems and the implementation of such we're using in NullAwesome. Feel free to browse the source for more implementation details!