For one of my classes, I have spent this semester cleaning and analyzing data from the Grand Comics Database (GCD) with an eye towards assessing gender representation in English-language superhero comics.
Starting with GCD’s records of over 1.5 million comics from around the world, I identified the 66,000 individual comic book titles that fit my criteria. For the characters appearing in those comics, I hand-coded gender wherever a character had a self-identified male or female gender.
From this, I built a bipartite network – comic books on one side and comic book characters on the other. A comic and a character are linked if the character appeared in that comic. The resulting network has around 66,000 comic titles, 10,000 characters, and a total of nearly 300,000 links between the two sides.
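For readers who want to try something similar, here is a minimal sketch of how a bipartite structure like this could be assembled. It is not the code used for the analysis: the `appearances` list, the identifiers in it, and the `build_bipartite` helper are all made up for illustration, and it assumes the data is available as (comic, character) pairs.

```python
from collections import defaultdict

# Hypothetical appearance records: (comic_id, character_name) pairs.
appearances = [
    ("comic_1", "Character A"),
    ("comic_1", "Character B"),
    ("comic_2", "Character A"),
    ("comic_2", "Character C"),
    ("comic_2", "Character A"),  # duplicate record; should count as one link
]

def build_bipartite(pairs):
    """Return two adjacency maps: comic -> characters and character -> comics."""
    comic_to_chars = defaultdict(set)
    char_to_comics = defaultdict(set)
    for comic, character in pairs:
        # Sets collapse repeated records into a single link.
        comic_to_chars[comic].add(character)
        char_to_comics[character].add(comic)
    return comic_to_chars, char_to_comics

comic_to_chars, char_to_comics = build_bipartite(appearances)

# Each (comic, character) membership is one link in the bipartite network.
n_links = sum(len(chars) for chars in comic_to_chars.values())
print(len(comic_to_chars), "comics,", len(char_to_comics), "characters,", n_links, "links")
```

Storing both directions makes it cheap to ask either "who is in this comic?" or "which comics feature this character?".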
From the bipartite network, I examined the projections onto each type of node. For example, the visualization below contains only characters, linking two characters if they appeared in the same issue. Nodes here are colored by publisher:
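A projection onto the character side can be sketched roughly as follows. The toy `comic_to_chars` mapping and the `project_onto_characters` helper are hypothetical, and weighting each link by the number of shared comics is an assumption on my part; the description above only requires that a link exists.

```python
from collections import Counter
from itertools import combinations

# Hypothetical comic -> characters mapping (one side of the bipartite network).
comic_to_chars = {
    "comic_1": {"Character A", "Character B"},
    "comic_2": {"Character A", "Character B", "Character C"},
}

def project_onto_characters(comic_to_chars):
    """Link every pair of characters that share a comic.

    The weight of a link is the number of comics the pair shares.
    """
    edge_weights = Counter()
    for chars in comic_to_chars.values():
        # Sorting gives each pair a canonical (a, b) order, so (A, B) and
        # (B, A) are never counted as different links.
        for a, b in combinations(sorted(chars), 2):
            edge_weights[(a, b)] += 1
    return edge_weights

edges = project_onto_characters(comic_to_chars)
for (a, b), weight in sorted(edges.items()):
    print(a, "-", b, "shared comics:", weight)
```

The same function applied to a character -> comics mapping would give the other projection, linking comics that share a character.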
The character network is heavily biased towards men; nearly 75% of the characters are male. Since the dataset includes comics from the 1930s to the present, this imbalance can be better assessed over time. Using the publication year of each comic, we can look at what percentage of all characters in a given year were male or female:
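One way to compute a yearly breakdown like this is sketched below. The `records` list is invented sample data, and counting each character once per year (rather than once per appearance) is an assumption, not necessarily how the chart was produced.

```python
from collections import defaultdict

# Hypothetical records: (publication_year, character_name, coded_gender).
records = [
    (1940, "Character A", "male"),
    (1940, "Character B", "male"),
    (1940, "Character C", "female"),
    (1940, "Character A", "male"),    # same character in a second comic that year
    (2005, "Character C", "female"),
    (2005, "Character D", "female"),
    (2005, "Character A", "male"),
    (2005, "Character E", "male"),
]

def gender_share_by_year(records):
    """Return {year: {gender: fraction of distinct characters that year}}."""
    chars_by_year = defaultdict(dict)
    for year, character, gender in records:
        # Keyed by character, so repeat appearances in a year count once.
        chars_by_year[year][character] = gender

    shares = {}
    for year, genders in sorted(chars_by_year.items()):
        total = len(genders)
        shares[year] = {
            g: sum(1 for coded in genders.values() if coded == g) / total
            for g in ("female", "male")
        }
    return shares

shares = gender_share_by_year(records)
for year, share in shares.items():
    print(year, share)
```

Counting appearances instead of distinct characters would weight the result toward characters who show up in many comics, which answers a slightly different question.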
While comics were very gender-skewed through the 1970s, the balance has gotten a little better in recent years, though male characters still dominate. If anyone knows what spiked the number of female characters in the early 2000s, please let me know. I looked at a couple of things, but couldn’t identify the driving force behind that shift. It’s possible it just represents some inaccuracies in the original data set.
If you prefer, we can also look at the various eras of comic books to see how gender representation changed over time:
I was particularly interested in applying a rudimentary version of the Bechdel test to this dataset. Unfortunately, I didn’t have the data to apply the full test, which asks whether two women (i) appear in the same scene and (ii) talk to each other (iii) about something other than a man. But I could look at raw character counts for the titles in my dataset, checking whether each one includes at least two female characters:
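A count-based check of this kind might look like the sketch below. The data and the `has_two_women` helper are hypothetical, and it only tests whether a comic contains at least two characters coded as female, which is far weaker than the full Bechdel test.

```python
# Hypothetical comic -> characters mapping and hand-coded genders.
comic_to_chars = {
    "comic_1": {"Character A", "Character B"},
    "comic_2": {"Character B", "Character C", "Character D"},
    "comic_3": {"Character C", "Character D"},
}
gender = {
    "Character A": "male",
    "Character B": "male",
    "Character C": "female",
    "Character D": "female",
}

def has_two_women(chars, gender, minimum=2):
    """True if at least `minimum` characters in the comic are coded female.

    Characters without a coded gender are ignored.
    """
    n_female = sum(1 for c in chars if gender.get(c) == "female")
    return n_female >= minimum

passing = sorted(
    comic for comic, chars in comic_to_chars.items()
    if has_two_women(chars, gender)
)
print(passing)
```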
I then looked at additional attributes of those titles which pass this rudimentary Bechdel test. For example, when were they published? Below are two different ways of bucketing the publication years: first by accepted comic book eras and second by uniform time blocks. Both approaches show that having two female characters in comic books started out rare but has become more common, coinciding roughly with the overall growth of female representation in comic books.
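The uniform-block version of this bucketing is simple to sketch. The years below are invented, the ten-year width is an assumption (the block size used for the chart isn't stated), and `block_start` is a hypothetical helper. Bucketing by era would need a table of era boundaries, which I leave out here.

```python
from collections import Counter

def block_start(year, width=10):
    """Return the first year of the fixed-width block containing `year`."""
    return (year // width) * width

# Hypothetical publication years of comics that pass the check.
passing_years = [1942, 1948, 1965, 1991, 1993, 1999, 2004]

counts = Counter(block_start(year) for year in passing_years)
for start in sorted(counts):
    print(f"{start}-{start + 9}: {counts[start]}")
```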
Finally, I could also look at the publishers of these comic books. My own biases gave me a suspicion of what I might find, but rationally I wasn’t at all sure what to expect. But as you can see, Marvel published an overwhelming number of the “Bechdel passed” comics in my dataset.
To be fair, this graphic doesn’t account for anything more general about Marvel’s publishing habits. Marvel is known for its ensemble casts, for example, so perhaps they have more comics with two women simply because they have more characters in their comics.
This turns out to be partly true, but not quite enough to account for Marvel’s dominance in this area. About half of all comics with more than two characters of any gender are published by Marvel, while DC contributes about a third.
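One way to make this comparison concrete is to divide each publisher’s share of the passing comics by its share of a baseline set, such as all multi-character comics. The sketch below uses invented counts and placeholder publisher names, not figures from my dataset, and the `shares` and `representation_ratio` helpers are hypothetical.

```python
def shares(counts):
    """Convert raw counts per publisher into fractions of the total."""
    total = sum(counts.values())
    return {publisher: n / total for publisher, n in counts.items()}

def representation_ratio(passing_counts, baseline_counts):
    """Ratio of a publisher's share of passing comics to its baseline share.

    A ratio above 1 means the publisher is over-represented among passing
    comics relative to the baseline; below 1 means under-represented.
    """
    passing = shares(passing_counts)
    baseline = shares(baseline_counts)
    return {
        publisher: passing.get(publisher, 0.0) / base
        for publisher, base in baseline.items()
        if base > 0
    }

# Made-up counts for illustration only.
baseline_counts = {"Publisher X": 50, "Publisher Y": 30, "Other": 20}
passing_counts = {"Publisher X": 14, "Publisher Y": 4, "Other": 2}

ratios = representation_ratio(passing_counts, baseline_counts)
for publisher, ratio in ratios.items():
    print(publisher, round(ratio, 2))
```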