Utility of Unions in C
"Why are unions in C useful?" I was driven to write this post because I couldn't find any compelling examples of the utility of the union keyword in C. There is plenty of great reference material breaking down the differences between struct and union and the mechanics of union, but not much showing how union can be powerful. I found only one example I felt was exemplary, and it worked at the byte level, which is great if you're doing embedded programming but not relatable at all if you're coming from a different background.

So I'll give a short, direct explanation of how I wrapped my brain around union and why I think this mental model is better than a lot of the explanations on the internet. I think of union as a way of declaring a group of members that overlap one another in the same area of memory. That immediately clarifies the behavior of union and why things need to be accessed "one at a time": every member refers to the same block of memory.

Disclaimer: I don't know if that's actually how it works, but I went ahead and wrote a program to see if I could validate my mental model, and lo and behold it worked.

So here's an example I would call useful. Consider this: you're writing code for a business that sells books at scale. You need it to be performant, or maybe it's a really old system; at any rate, you have to write it in C. You sell physical books and are extending the system to sell eBooks, so naturally you think to yourself, "It'd be great if I could just write the extension and leave most of the existing code alone."

Let's look at some sample code. Consider the following snippet as part of your existing code:
    #include <stdio.h>   /* printf */
    #include <string.h>  /* strcpy */

    struct Paperback {
        char bookId[25];
        char bookTitle[200];
        char author[200];
    };

    void printPaperbackInfo(struct Paperback pb) {
        printf(
            "This is a book with ID %s and Title of %s written by %s.\n",
            pb.bookId, pb.bookTitle, pb.author
        );
    }

    int main(int argc, char* argv[]) {
        struct Paperback theFall;
        strcpy(theFall.bookId, "PHY255ACTF");
        strcpy(theFall.bookTitle, "The Fall");
        strcpy(theFall.author, "Albert Camus");

        printf("Okay so far we have...\n");
        printPaperbackInfo(theFall);
        return 0;
    }
You think to yourself, "Okay, well, I need to add some eBook concepts, so I'd probably create some eBook-specific stuff and use it as I need it." So that's what you do:
    struct ElectronicBook {
        char bookId[25];
        char bookTitle[200];
        char author[200];
        char format[10];
        char DRMS[50];
    };

    void printElectronicBookInfo(struct ElectronicBook eb) {
        printf(
            "This is a book with ID %s and Title of %s written by %s.\n",
            eb.bookId, eb.bookTitle, eb.author
        );
        printf(
            "This book is in format %s signed with %s.\n",
            eb.format, eb.DRMS
        );
    }
Well, now you notice you have duplicated code. ElectronicBook and Paperback also seem to share a lot in common. Either way you're left wondering something to the effect of, "I guess I could reuse void printPaperbackInfo(struct Paperback pb) and just cast my structs every time I want to use it, but I'd still have two distinct structs, and really I wish I had just the one."

Enter union. At this point I would say, "Hey, I want a union of these two data structures, because they're basically the same thing in memory with some added stuff, and I don't want to be casting all the time and dealing with compiler errors." So your code would evolve to look like this:
    #include <stdio.h>   /* printf */
    #include <string.h>  /* strcpy */

    struct Paperback {
        char bookId[25];
        char bookTitle[200];
        char author[200];
    };

    struct ElectronicBook {
        char bookId[25];
        char bookTitle[200];
        char author[200];
        char format[10];
        char DRMS[50];
    };

    /* Both views of a book share the same storage. */
    union Book {
        struct Paperback pb;
        struct ElectronicBook eb;
    };

    void printPaperbackInfo(struct Paperback pb) {
        printf(
            "This is a book with ID %s and Title of %s written by %s.\n",
            pb.bookId, pb.bookTitle, pb.author
        );
    }

    /* Only the eBook-specific fields are printed here now. */
    void printElectronicBookInfo(struct ElectronicBook eb) {
        printf(
            "This book is in format %s signed with %s.\n",
            eb.format, eb.DRMS
        );
    }
Okay, now we have some definitions. This seems strictly better than the duplicated code we had before, so let's see where it goes. Its usage might look something like:
    int main(int argc, char* argv[]) {
        union Book theFall;
        strcpy(theFall.pb.bookId, "PHY255ACTF");
        strcpy(theFall.pb.bookTitle, "The Fall");
        strcpy(theFall.pb.author, "Albert Camus");

        printf("Okay so far we have...\n");
        printPaperbackInfo(theFall.pb);

        // but wait maybe we reach out to some part of our system
        // and discover this is available as an eBook and we
        // want to treat it as such now
        printf("We're assigning ebook data\n");
        strcpy(theFall.eb.format, "kindle");
        strcpy(theFall.eb.DRMS, "Digital Rights Management Signature");

        printf("That's cool the paperback code is still happy...\n");
        printPaperbackInfo(theFall.pb);

        printf("But so is the eBook code...\n");
        printElectronicBookInfo(theFall.eb);
        return 0;
    }
So, I don't know about you, but I'm feeling pretty confident in this mental model. In summary, unions allow us to manipulate the same area of memory in code for the "overlapping" members (aka a union of the members). This is useful (as I've shown above) because we're able to define a union with the label Book and then use that union both in parts of the code that only know about Paperback and in parts that only know about ElectronicBook, by passing around pb and eb respectively.

I think this is a sound analysis on my part; that, or maybe I'm getting lucky and the data written through pb is still there in memory only because of deallocation that hasn't happened. However, this example is congruent with the expected behavior from the accepted StackOverflow answer referenced above, so that seems unlikely. For what it's worth, the C standard also explicitly permits inspecting the common initial sequence of structs that share one inside a union, and bookId, bookTitle, and author are exactly that shared prefix here.

Hopefully this helps people out. If it backfires on you horribly, I'd also like to hear about that so I'm not misinforming people.