Can you explain what it means, exactly, for music to be "8-bit?"
What significance does the number of bits have for waveforms, channels, sample size/quality, volume, speed, etc.? In other words, the number of bits limits _____ to _____?
This is a great question, and I'll try to answer it as best I can.
1. What is 8-bit music? 16-bit music?
In general terms, I'd argue that these titles are meaningless by themselves. A lot of people associate "8-bit" with things that sound like the NES, but the NES doesn't sound the way it does primarily because it's 8-bit. It sounds the way it does because of the particular kinds of sounds it uses.
It's not that the limitations don't matter at all. The NES would have had different audio if the system belonged to a different 'bit generation'. But here's an example of why these titles don't mean much on their own:
Let's say you have a 16-bit wave file of CD music and you convert it to 8-bit. Is the end result something that sounds like the NES? Not at all.
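To make that concrete, here's a rough sketch of what a 16-bit to 8-bit conversion does to the numbers. Everything in it is my own illustration: the function names are made up, the "CD music" is stood in for by a plain 440 Hz sine tone, and the conversion simply drops the low 8 bits of each sample (real converters usually add dither).

```python
import math

def sine_16bit(freq_hz, sample_rate, n_samples):
    """Signed 16-bit sine samples (-32767..32767); a stand-in for real audio."""
    return [round(32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
            for i in range(n_samples)]

def to_8bit(samples_16):
    """Drop the low 8 bits of each signed 16-bit sample, leaving -128..127."""
    return [s >> 8 for s in samples_16]

def zero_crossings(samples):
    """Count sign changes: a crude check that pitch and overall shape survive."""
    signs = [s >= 0 for s in samples]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

original = sine_16bit(440, 44100, 4410)   # 0.1 seconds of A440
reduced = to_8bit(original)

print(len(set(original)), "distinct 16-bit values used")
print(len(set(reduced)), "distinct 8-bit values used")
print(zero_crossings(original), "vs", zero_crossings(reduced), "zero crossings")
```

The 8-bit version has far fewer distinct values to work with, so it's a coarser copy, but it is still the same waveform at the same pitch. Nothing about the conversion turns it into a pulse wave or a triangle wave.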
This is why specificity matters. Classic video game systems sound the way they do because of the approach to building or creating sounds. Is the system using sound synthesis? If it is, is it perhaps using subtractive synthesis, FM synthesis, or wavetable synthesis? Or is the system playing back samples, or some combination of different kinds of synthesis and samples?
Once you've made these distinctions, you know why a given system sounds the way it does.
2. Put another way, systems can sound similar or dissimilar regardless of which bit generation they belong to.
Another way to illustrate my first point would be to share a few examples. Consider Amiga music:
This is 8-bit music, but it doesn't sound anything like the NES. That's because it is sample-based, which makes it more comparable to the SNES (the SNES uses 8 channels of 16-bit sample playback, the Amiga uses 4 channels of 8-bit sample playback).
Another good example is the Virtual Boy.
A while back, I was trying to find out more about the audio behind the Virtual Boy, and was disappointed to find that most sites, including Wikipedia, had nothing more to say than "16 bit stereo audio".
That tells me practically nothing. The SNES is also 16-bit, as is the Sega Genesis, and all three sound different from each other (it turns out that the Virtual Boy uses 6 channels of primarily wavetable synthesis).
Last but not least is the Game Boy. It is the closest thing there is to the NES in terms of audio, but its audio is 4-bit, not 8-bit.
3. But the bits do matter...
Before I give the false impression that the 'bit limitations' of a system aren't incredibly important to its sound capabilities (too late?), I need to clarify that the number of bits available to manipulate is the be-all and end-all of what you can do.
Then why spend so much time downplaying the titles of "8-bit" and "16-bit"? It's because these titles can be misleading, as they often don't tell you about the specific limitations of different parameters.
Let's consider volume on the NES. If we were to assume that the NES was 8-bit everything, then we'd expect there to be 256 different possible volume settings for a given voice. But that's not the case: the channel volume is determined by a 4-bit value, allowing 16 possible volumes (0-15) for the pulse wave and noise channels.
What does this mean in musical terms? Fewer bits means less resolution. Let's say you want a note to fade out. If you start with a high or medium volume, you have room to descend and your ear will hear a smooth-enough fade out. But if you start with a quiet note and want a long fade out, there are no values in between the ones available. If you go from volume 3, to 2, to 1, and then 0, and you don't do this quickly, you won't hear a smooth fade out at all. The volume will distinctly "jump" between those values, with no means for a smoother fade (unless maybe you try to cover it up with other sounds at the same time).
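Here's a small sketch of that quiet fade out in numbers. It's only an illustration: the `fade` helper is something I made up, the 8-bit volume register is hypothetical (it's there purely for comparison), and I'm assuming the fade is a straight line that gets snapped to whatever values the register can hold.

```python
def fade(start_fraction, n_steps, bits):
    """Fade linearly to silence, snapping each step to an n-bit volume value."""
    max_level = 2 ** bits - 1                  # 15 for 4 bits, 255 for 8 bits
    start = round(start_fraction * max_level)
    return [round(start * (n_steps - i) / n_steps) for i in range(n_steps + 1)]

# A quiet note (20% of full volume) fading out over 12 updates.
quiet_4bit = fade(0.2, 12, bits=4)   # starts at volume 3 of 15
quiet_8bit = fade(0.2, 12, bits=8)   # hypothetical: starts at volume 51 of 255

print("4-bit:", quiet_4bit)
print("8-bit:", quiet_8bit)
print(len(set(quiet_4bit)), "distinct volumes vs", len(set(quiet_8bit)))
```

With 4 bits the fade gets stuck on the same value for several updates and then drops a whole step at once, while the hypothetical 8-bit version has a new, slightly lower value available at every update.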
There's also the sample channel on the NES. Those samples aren't 8-bit either; they're actually just 1-bit / 7-bit... (more on that in a future post)
A quick search on the NESdev page that details the NES APU will yield 5 matches for 4-bit and 0 matches for 8-bit. Go figure.
Some systems use wavetable synthesis, which in layman's terms is a sort of free space to draw whatever waveform you want. In software, it often looks like a bar graph.
-The Game Boy allows for a table of 32 4-bit samples to create a waveform (in one channel).
-The Virtual Boy allows for a table of 32 6-bit samples to create a waveform (in five channels).
-The Famicom Disk System add-on adds one additional channel to the Famicom that uses 64 6-bit samples.
Even though these are all examples of wavetable synthesis, the bit limitations create sounds that are distinct from each other. In a future post, I may compare the wavetable synthesis of these three systems more closely. The basic summary is that the lowest resolution of the three is found on the Game Boy, which has the "dirtiest" sound. By comparison, the Famicom Disk System can create some very smooth sounds.
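As a rough way to see the difference in resolution, the sketch below "draws" one cycle of a sine wave into each of the three table sizes listed above and measures how far the stepped result is from the ideal curve. The function names are my own, and the RMS error is only a crude proxy for how "dirty" a table sounds. It doesn't model any of the actual hardware (DACs, filtering, playback rates), just the table length and bit depth.

```python
import math

def sine_table(length, bits):
    """One sine cycle stored as `length` unsigned samples of `bits` bits each."""
    top = 2 ** bits - 1
    return [round((math.sin(2 * math.pi * (i + 0.5) / length) + 1) / 2 * top)
            for i in range(length)]

def rms_error(table, bits, oversample=64):
    """RMS difference between the stepped table and an ideal sine (-1..1 scale)."""
    top = 2 ** bits - 1
    n = len(table) * oversample
    total = 0.0
    for k in range(n):
        ideal = math.sin(2 * math.pi * (k + 0.5) / n)
        stepped = table[k // oversample] / top * 2 - 1   # hold each table entry
        total += (ideal - stepped) ** 2
    return math.sqrt(total / n)

# (table length, bits per sample) as listed above
configs = {
    "Game Boy": (32, 4),
    "Virtual Boy": (32, 6),
    "Famicom Disk System": (64, 6),
}

errors = {}
for name, (length, bits) in configs.items():
    errors[name] = rms_error(sine_table(length, bits), bits)
    print(f"{name}: {length} x {bits}-bit, RMS error {errors[name]:.4f}")
```

The Game Boy table comes out with the largest error and the Famicom Disk System table with the smallest, which lines up with the ordering described above, at least for this one test shape.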
I hope this answers your question. The number of bits limits your resolution, which affects the qualities and potential for all sorts of parameters.
There are plenty of videos out there explaining bit depth for audio samples, or how bits and binary work, though when discussing video game music I think the most important distinction to make is that the "bit generation" a system belongs to doesn't, by itself, tell you anything about how it sounds.