Visualizing The Gender Gap in Bachelor's Degrees

My analysis begins by looking at STEM degrees.

I'll be working with data from the Department of Education Statistics that was compiled by Randal Olson, a data scientist at the University of Pennsylvania. His blog covers a variety of interesting topics, including Madden player ratings and trends in traffic deaths by a state's political leanings. The original dataset is available here.

Our data ranges from 1970 to 2011 and covers 17 general degree fields. The first cell is a quick script looking at the 6 STEM categories in the dataset, plotting the percentage of women and men holding a Bachelor's in each field. I will then move on to the liberal arts degrees and our "Other" degrees, which comprise Agriculture, Architecture, Education, Public Administration, Health Professions & Business.

Our dataset only has the percentage of women holding a degree in a given field. By necessity, the percentage of men is the complement of this value. Therefore, I plot the percentage of men in each category by subtracting the percentage of women from 100. I used colors from the Tableau Color Blind 10 palette.

In [1]:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt

women_degrees = pd.read_csv('percent-bachelors-degrees-women-usa.csv')
cb_dark_blue = (0/255,107/255,164/255)
cb_orange = (255/255, 128/255, 14/255)
stem_cats = ['Engineering', 'Computer Science', 'Psychology', 'Biology', 'Physical Sciences', 'Math and Statistics']

fig = plt.figure(figsize=(18, 4))

for sp in range(0,6):
    ax = fig.add_subplot(1,6,sp+1)
    ax.plot(women_degrees['Year'], women_degrees[stem_cats[sp]], c=cb_dark_blue, label='Women', linewidth=3)
    ax.plot(women_degrees['Year'], 100-women_degrees[stem_cats[sp]], c=cb_orange, label='Men', linewidth=3)
    for key, spine in ax.spines.items():
        spine.set_visible(False)
    ax.set_xlim(1968, 2011)
    ax.set_ylim(0,100)
    ax.set_title(stem_cats[sp])
    # Use booleans: the string "off" is truthy and leaves ticks visible in current matplotlib
    ax.tick_params(bottom=False, top=False, left=False, right=False)
    
    if sp == 0:
        ax.text(2005, 87, 'Men')
        ax.text(2002, 8, 'Women')
    elif sp == 5:
        ax.text(2005, 62, 'Men')
        ax.text(2001, 35, 'Women')
plt.show()

In Psychology, women reign supreme

The plots here are organized in descending order based on the size of the gender gap. We see there has been a slight increase in the percent of women obtaining Engineering degrees. It's also worth noting that a gain of roughly 20 percentage points from near zero is pretty decent progress on its own: Engineering is the only category that had essentially no women in the field in 1970. Computer Science saw a surge in interest among women through the '70s, but a decline since the mid-'80s has brought the share of women close to where it was in 1970, sitting around 20%.

Psychology has apparently exploded in popularity amongst women: just above 20% of its degree holders are men. It would be worth investigating what job prospects are like for Psychology graduates vs. the other categories, and how much a Psych degree costs on average (e.g., which schools are conferring the greatest number of Psych degrees). Biology is the only other place we see women outnumbering men, which has been the case since the late '80s (1987 to be exact).
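
A crossover year like the one quoted for Biology can be read straight off the data rather than eyeballed from the chart. Here is a minimal sketch, assuming the frame has a `Year` column sorted in ascending order; `first_majority_year` is a hypothetical helper of mine and the demo values are made up, not taken from the dataset.

```python
import pandas as pd

def first_majority_year(df, column, threshold=50.0):
    """Return the first year in which df[column] exceeds threshold, or None."""
    above = df.loc[df[column] > threshold, 'Year']
    return int(above.iloc[0]) if not above.empty else None

# Illustrative values only, NOT taken from the real dataset.
demo = pd.DataFrame({
    'Year': [2000, 2001, 2002, 2003],
    'Field A': [48.0, 49.5, 50.5, 52.0],
})
print(first_majority_year(demo, 'Field A'))  # 2002
```

With the real data, the call would be `first_majority_year(women_degrees, 'Biology')`.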

In both the Physical Sciences and Math & Statistics the gender gap was consistently closing until roughly 2000, when the trend breaks and men slowly start to retake a share of degrees. It's interesting to note that nearly every category has seen a break in trends starting between 2000 and 2004 and continuing into 2011. Engineering is the exception: it saw a slight uptick in the percent of women holding Engineering degrees starting in 2008, breaking an initial downtrend in the '00s.
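
One rough way to date a trend break like these is to look for the year in which the share of women peaks. The sketch below assumes a `Year` column; `peak_year` is a hypothetical helper and the demo numbers are invented for illustration. A peak is only a crude proxy for a break, so I'd still check it against the chart.

```python
import pandas as pd

def peak_year(df, column):
    """Return (year, value) at which df[column] reaches its maximum."""
    idx = df[column].idxmax()  # index label of the largest value
    return int(df.loc[idx, 'Year']), float(df.loc[idx, column])

# Illustrative values only, NOT taken from the real dataset.
demo = pd.DataFrame({
    'Year': [1980, 1984, 1990, 2000],
    'Field A': [20.0, 37.0, 30.0, 25.0],
})
print(peak_year(demo, 'Field A'))  # (1984, 37.0)
```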

Now let's look at all of the categories

The charts are organized by percentage of women holding a certain degree in descending order.
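
I typed the category lists in the next cell by hand, but the same ordering could be derived from the data. This is a sketch under the assumption that "descending order" means sorting by the share of women in the most recent year; `order_by_latest_share` is a hypothetical name and the demo frame is made up.

```python
import pandas as pd

def order_by_latest_share(df, columns):
    """Sort column names by their value in the most recent year, highest first."""
    latest = df.sort_values('Year').iloc[-1]  # row for the last year
    return sorted(columns, key=lambda c: latest[c], reverse=True)

# Illustrative values only, NOT taken from the real dataset.
demo = pd.DataFrame({
    'Year': [1970, 2011],
    'A': [10.0, 30.0],
    'B': [60.0, 80.0],
    'C': [40.0, 55.0],
})
print(order_by_latest_share(demo, ['A', 'B', 'C']))  # ['B', 'C', 'A']
```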

In [2]:
stem_cats = ['Psychology', 'Biology', 'Math and Statistics', 'Physical Sciences', 'Computer Science', 'Engineering']
lib_arts_cats = ['Foreign Languages', 'English', 'Communications and Journalism', 'Art and Performance', 'Social Sciences and History']
other_cats = ['Health Professions', 'Public Administration', 'Education', 'Agriculture', 'Business', 'Architecture']
cb_grey = (171/255, 171/255, 171/255)
fig = plt.figure(figsize=(18, 18))

for sp in range(0,6):
    ax = fig.add_subplot(6,3,(sp*3+1))
    ax.plot(women_degrees['Year'], women_degrees[stem_cats[sp]], c=cb_dark_blue, label='Women', linewidth=3)
    ax.plot(women_degrees['Year'], 100-women_degrees[stem_cats[sp]], c=cb_orange, label='Men', linewidth=3)
    for key, spine in ax.spines.items():
        spine.set_visible(False)
    ax.set_xlim(1968, 2011)
    ax.set_ylim(0, 100)
    ax.set_title(stem_cats[sp])
    ax.tick_params(bottom=False, top=False, left=False, right=False, labelbottom=False)
    ax.set_yticks([0,100])
    ax.axhline(50, c=cb_grey, alpha=0.3)
    
    
    if sp == 5:
        ax.tick_params(labelbottom=True)
        
    if sp == 0:
        ax.text(2004, 82, 'Women')
        ax.text(2005, 15, 'Men')
    elif sp == 5:
        ax.text(2005, 90, 'Men')
        ax.text(2004, 7, 'Women')
        
for sp in range(0,5):
    ax = fig.add_subplot(6,3,(sp*3+2))
    ax.plot(women_degrees['Year'], women_degrees[lib_arts_cats[sp]], c=cb_dark_blue, label='Women', linewidth=3)
    ax.plot(women_degrees['Year'], 100-women_degrees[lib_arts_cats[sp]], c=cb_orange, label='Men', linewidth=3)
    for key, spine in ax.spines.items():
        spine.set_visible(False)
    ax.set_xlim(1968, 2011)
    ax.set_ylim(0, 100)
    ax.set_title(lib_arts_cats[sp])
    ax.tick_params(bottom=False, top=False, left=False, right=False, labelbottom=False)
    ax.set_yticks([0,100])
    ax.axhline(50, c=cb_grey, alpha=0.3)
    
    
    if sp == 4:
        ax.tick_params(labelbottom=True)
        
    if sp == 0:
        ax.text(2004, 76, 'Women')
        ax.text(2005, 23, 'Men')
    
for sp in range(0,6):
    ax = fig.add_subplot(6,3,(sp*3+3))
    ax.plot(women_degrees['Year'], women_degrees[other_cats[sp]], c=cb_dark_blue, label='Women', linewidth=3)
    ax.plot(women_degrees['Year'], 100-women_degrees[other_cats[sp]], c=cb_orange, label='Men', linewidth=3)
    for key, spine in ax.spines.items():
        spine.set_visible(False)
    ax.set_xlim(1968, 2011)
    ax.set_ylim(0, 100)
    ax.set_title(other_cats[sp])
    ax.tick_params(bottom=False, top=False, left=False, right=False, labelbottom=False)
    ax.set_yticks([0,100])
    ax.axhline(50, c=cb_grey, alpha=0.3)

    
    if sp == 5:
        ax.tick_params(labelbottom=True)
        
    if sp == 0:
        ax.text(2004, 90, 'Women')
        ax.text(2005, 5, 'Men')
    elif sp == 5:
        ax.text(2005, 65, 'Men')
        ax.text(2004, 35, 'Women')

    
# Save once, after every subplot has been drawn (not on each loop iteration)
plt.savefig('gender_degrees.png')
    
plt.show()
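
The three loops above repeat the same styling code, so the per-chart work could be pulled into one function. This is only a sketch of how I might refactor it, not what the cell above runs: `plot_gender_gap` and its `show_xlabels` parameter are hypothetical names, and the demo data is invented.

```python
import matplotlib.pyplot as plt
import pandas as pd

CB_DARK_BLUE = (0/255, 107/255, 164/255)
CB_ORANGE = (255/255, 128/255, 14/255)
CB_GREY = (171/255, 171/255, 171/255)

def plot_gender_gap(ax, df, column, show_xlabels=False):
    """Draw the women/men lines for one degree field on an existing Axes."""
    ax.plot(df['Year'], df[column], c=CB_DARK_BLUE, label='Women', linewidth=3)
    ax.plot(df['Year'], 100 - df[column], c=CB_ORANGE, label='Men', linewidth=3)
    for spine in ax.spines.values():
        spine.set_visible(False)
    ax.set_xlim(1968, 2011)
    ax.set_ylim(0, 100)
    ax.set_title(column)
    ax.set_yticks([0, 100])
    ax.axhline(50, c=CB_GREY, alpha=0.3)  # faint 50% reference line
    ax.tick_params(bottom=False, top=False, left=False, right=False,
                   labelbottom=show_xlabels)

# Illustrative values only, NOT taken from the real dataset.
demo = pd.DataFrame({'Year': [1970, 1990, 2011], 'Field A': [30.0, 45.0, 55.0]})
fig = plt.figure(figsize=(6, 3))
ax = fig.add_subplot(1, 1, 1)
plot_gender_gap(ax, demo, 'Field A', show_xlabels=True)
```

Each of the three loops would then shrink to an `add_subplot` call, one `plot_gender_gap` call, and the text annotations.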

Overwhelmingly, STEM fields are more likely to have a majority of men holding degrees

In nearly all of the liberal arts categories, more women hold degrees than men. The exception is Social Sciences and History, which is essentially a 50/50 split. However, most of the splits are in the 60/40 range, making them much closer to even than Comp. Sci. and Engineering, where the split is closer to 80/20.

Looking at our "Other" category, women hold a large majority in Education, Public Administration and Health Professions. The share of men holding Agriculture degrees has been steadily declining, and in 2011 there was basically an even split. The same is true of Business, where men and women have been essentially evenly distributed since about 1995. Trends in Architecture have been less regular, with the percent of men and women diverging around 1990 only to come back to about 43% women in 2011.

We see a strong converging trend in Agriculture, Business, Architecture, & Social Sciences and History. There's a weak convergence happening in Foreign Languages leading into 2011. Before 2000 Physical Sciences & Math and Statistics were moving consistently towards an even split. There seems to be a slight divergence over the last four decades in Health Professions, Public Administration, Education, English, & Communications and Journalism.
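
Converging and diverging can be put into numbers by comparing the size of the gap at two points in time. Since men's share is 100 minus women's share, the gap in percentage points is |2w - 100|, so an 80/20 split is a 60-point gap. The sketch below assumes both years are present in the `Year` column; `gap_change` is a hypothetical helper and the demo values are made up.

```python
import pandas as pd

def gap_change(df, column, start_year, end_year):
    """Change in the women/men gap (percentage points) between two years.

    Negative means the field is converging toward an even split,
    positive means it is diverging.
    """
    by_year = df.set_index('Year')[column]
    gap = (2 * by_year - 100).abs()  # |women - men| = |w - (100 - w)|
    return float(gap.loc[end_year] - gap.loc[start_year])

# Illustrative values only, NOT taken from the real dataset.
demo = pd.DataFrame({
    'Year': [1970, 2011],
    'Field A': [10.0, 45.0],  # gap shrinks from 80 to 10 points
    'Field B': [60.0, 85.0],  # gap grows from 20 to 70 points
})
print(gap_change(demo, 'Field A', 1970, 2011))  # -70.0
print(gap_change(demo, 'Field B', 1970, 2011))  # 50.0
```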

Of course, the share of women holding a given degree isn't even half the story

Regardless of gender, it's also important to consider the cost of education and the payoff, i.e., how much greater potential income is for having earned a degree. In a future post I will explore median earnings by degree, paying attention to the gender gap in each of those fields and whether a degree is necessary for the type of employment most commonly obtained post-graduation.
