Using generative AI to create QuickBMS scripts
In this post I'll share what I learned using Claude and ChatGPT to analyze Earthworm Jim 3D's textures.dat archive. To follow along, you'll need access to those tools, plus a hex editor and Windows PowerShell.
First impressions
Claude is much more competent at this type of task than ChatGPT. Before attempting to write a script, it asked me for details about the archive:
Does it have a header with file count/offsets?
Are the BMP files stored with their original headers or are they raw pixel data?
Is there any compression used?
It also asked me whether the archive is more likely to use an offset table or sequential storage. Together, we concluded that the directory structure apparent in the filenames (e.g. Visual FX\Jims_shadow.bmp) makes an offset table more likely.
Would you be able to check if you see any patterns of 4-byte values that could be offsets near the start of the file? Offset tables typically have sequences of increasing 32-bit numbers.
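The check Claude describes can also be done programmatically rather than by eye. Here is a minimal Python sketch, assuming the values are little-endian and 4-byte aligned (neither is confirmed for textures.dat). The function name find_increasing_runs and the sample bytes are made up for illustration; the sample is synthetic and not taken from the game's archive.

```python
import struct

def find_increasing_runs(data, min_run=4):
    """Return (start_offset, values) for each run of strictly increasing
    little-endian uint32 values found at 4-byte-aligned positions."""
    count = len(data) // 4
    values = struct.unpack("<%dI" % count, data[:count * 4])
    runs = []
    start = 0
    for i in range(1, count + 1):
        # A run ends at the end of the data or when the sequence stops rising.
        if i == count or values[i] <= values[i - 1]:
            if i - start >= min_run:
                runs.append((start * 4, list(values[start:i])))
            start = i
    return runs

# Synthetic header for demonstration only: NOT the real textures.dat layout.
sample = b"DEMO" + struct.pack("<5I", 64, 1088, 5184, 9280, 13376) + b"\x00" * 8
for offset, values in find_increasing_runs(sample):
    print(hex(offset), values)
```

To try it on a real archive, read the first few kilobytes of the file in binary mode and pass them in place of sample. A run that starts right after a small header and whose values stay below the file size is a reasonable candidate for an offset table, though it is only a hint.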
Claude also asked about a "magic number" in the first few bytes, implying this is a common way to identify archive formats. Separately, it wanted to find a file count near the start of the archive.
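Both questions can be explored with a few lines of Python. The sketch below assumes, purely for illustration, that the first 4 bytes are a magic number and the next 4 bytes are a 32-bit file count; the real layout of textures.dat may differ. The function name describe_header and the sample bytes are hypothetical.

```python
import struct

def describe_header(data):
    """Interpret the first 8 bytes as a 4-byte magic plus a uint32 count.
    Both interpretations are guesses to be checked in a hex editor."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes")
    magic = data[:4]
    # Show printable ASCII as-is and everything else as a dot.
    magic_text = "".join(chr(b) if 32 <= b < 127 else "." for b in magic)
    # Decode the candidate count in both byte orders, since the order is unknown.
    count_le, = struct.unpack("<I", data[4:8])
    count_be, = struct.unpack(">I", data[4:8])
    return {"magic_hex": magic.hex(), "magic_text": magic_text,
            "count_le": count_le, "count_be": count_be}

# Synthetic bytes for demonstration only: NOT the real textures.dat header.
sample = b"DEMO" + struct.pack("<I", 5)
print(describe_header(sample))
```

Printing both byte orders makes it easy to spot which one gives a plausible count: one of the two is usually absurdly large.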
Providing sample hex
This is where I got stuck: I wanted to provide at least 1250 characters so that the sample would include the (presumed) offset table as well as some file names. Sadly, this exceeds the conversation length allowed on Claude's free tier.
I will probably eventually work up the nerve to upgrade. But until then, for posterity, here is the PowerShell script to get that sample data:
format-hex "F:\GOG Galaxy\Games\Earthworm Jim 3D\textures.dat" | select -first 1250 | ForEach-Object { ($_ -split '\s{2,}')[1] -replace '\s+', ' ' }
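If the limit is counted in characters of pasted text, it helps to know how many bytes fit in a given budget. The Python sketch below is an alternative to the command above, not a translation of it. It assumes Python 3.8 or later (for the separator argument of bytes.hex) and assumes the sample is pasted as space-separated hex. The function name hex_sample is made up for illustration.

```python
def hex_sample(data, max_chars=1250):
    """Return space-separated hex for as many leading bytes as fit in max_chars."""
    # Each byte costs two hex digits plus one separator, except the last byte,
    # so n bytes take 3n - 1 characters.
    max_bytes = (max_chars + 1) // 3
    return data[:max_bytes].hex(" ")

# Synthetic bytes for demonstration only; for a real archive, read the first
# few kilobytes of the file in binary mode and pass them in instead.
sample = bytes(range(256)) * 2
text = hex_sample(sample)
print(len(text), text[:23])
```

With a 1250-character budget this yields 417 bytes of data, which shows how little of the archive fits in a single message at that size.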














