Last update: Friday 10/12/18
Membership in this group is not fixed. Until recently, one would have expected that its size would become substantially larger within the next few decades and that the number of Black students who graduated from each of the institutions within the group would also become substantially larger.
A. What's At Stake?
Table 1 of this report shows that the 6-year graduation rate for all students in accredited four year bachelors programs in the U.S. in 2016, regardless of race, was 52 percent.
Therefore the thousands of Black students who graduated from our 75 percent group had graduation rates that were well above the 52 percent national average. It goes without saying that six year grad rates are incomplete measures of academic success. Nevertheless they enable easy dismissal of some negative stereotypes about the so-called academic deficiencies of Black students.
Conservative opponents of affirmative action in the 1970s, 80s, and 90s charged that Black applicants were unworthy of admission to the nation's top-rated institutions of higher education, unworthy because most were incapable of completing the required course loads. The admission of more than a small number of Black students could only be achieved by unfairly rejecting qualified White applicants. However, Table 4 of this report shows that the success of large cohorts of Black students in the most selective colleges and universities within the Black 6-Year 75 Percent 50 Group is yet another reminder that these old stereotypes were never more than unfounded slurs.
By contrast with the conservative challenge, the neoconservative challenge to affirmative action for Black students in the New Millennium is indirect. Neoconservatives may concede that the growing numbers of Black students who have been admitted to the nation's selective colleges and universities are, indeed, capable of completing the required course loads in a timely manner. But they claim that the admission of so many qualified Black, Hispanic/LatinX, and other minority applicants has been achieved by unfairly rejecting too many Asian applicants who were equally or, perhaps, better qualified.
The pages in the following report merely identify the members of the group, i.e., they describe what might be lost if affirmative action is prohibited. The DLL will soon generate comparable reports for Hispanic/LatinX enrollments and for Asian enrollments at selective U.S. colleges and universities. But first, it will produce a follow-up report to this initial Black edition that will provide a data science critique of the neoconservative legal challenge to Black enrollments at Harvard University as described in the media. The DLL's critique will be based on the data in this report plus additional findings.
Definition -- The six-year graduation rates presented in this report begin as fractions: denominator = the number of full-time, first-time students who entered college in the Fall 2010 semester; and numerator = the number of students from this entering class who graduated by August 2016, six years later. This fraction is multiplied by 100 to convert it to a percentage, and the percentage is then rounded to the nearest whole number:
- GradRate % = 100 * (Graduations by Aug 2016) / (Fall 2010 enrollments)
DLL Editor's Apology -- In order to facilitate the publication of our report on the TECH-Levers blog, extensive abbreviations of the names of its variables were required to enable the tables to fit within the blog's narrow margins. We apologize to our readers for this inconvenience.
B. Contextual Tables and Charts
The first two tables and charts provide context for understanding everything that follows. All tables, maps, and charts are based on data that was downloaded from the IPEDS database of the U.S. Dept. of Education's National Center for Education Statistics.
SixYrStats Total Asian White Other Hispanic Black
Grads 884,753 64,877 574,799 89,365 89,412 66,300
Enter 1,685,883 94,960 993,087 185,420 212,274 200,142
SixYearPer 52 68 58 48 42 33
Descriptions of the rows in Table 1
- Enter = The number of students who entered accredited four year bachelors degree programs as first-time, full-time students in Fall 2010
- Grads = The number of students from each racial group who entered as first-time, full-time in Fall 2010 and graduated within the next six years, i.e., by August 2016
- SixYearPer = The percentage of Fall 2010 entrants who graduated by August 2016, i.e., the six-year graduation rates for each race
Key points from Table 1 and Chart 1 (below)
- Highest grad rates -- Asian Americans had the highest six-year graduation rate (68 percent)
- Lowest grad rates -- Black Americans had the lowest (33 percent)
- Missing values -- A substantial number of colleges and universities had missing values for the students who entered in Fall 2010 and completed by August 2016. Accordingly, the entries, graduations, and graduation rates shown in Table 1 and displayed in Chart 1 should be regarded as rough approximations. This point is fully discussed in Appendix #1
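As a rough check, the percentage row of Table 1 can be recomputed from the Grads and Enter rows using the formula given in the Definition. The sketch below uses base R only; the vector names grads, enter, and sixYrPer are illustrative and are not taken from the report's own script.

```r
# Counts copied from Table 1 (Fall 2010 entrants, graduations by Aug 2016)
groups <- c("Total", "Asian", "White", "Other", "Hispanic", "Black")
grads  <- c(884753, 64877, 574799, 89365, 89412, 66300)
enter  <- c(1685883, 94960, 993087, 185420, 212274, 200142)

# GradRate % = 100 * graduations / entrants, rounded to a whole number
sixYrPer <- round(100 * grads / enter)
names(sixYrPer) <- groups
print(sixYrPer)
```

With these inputs the computed rates agree with the percentages printed in Table 1.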
Chart 1. Fall 2010 Cohorts -- 6-Year Graduation Rates by Race
Table 2. Summary of Black 6-Year Graduation Rates
Nraw mvRaw N6Yr G6Yr75per50 MedSixYrPer gt100
2,708 889 1,819 55 85 2
Descriptions of the variables in Table 2
- Nraw = The number of accredited colleges that offered four year bachelors degrees in the Fall 2010 semester = 2708
- mvRaw = The number of institutions that did not provide IPEDS with the number of students who entered in Fall 2010 and/or who graduated by August 2016. The graduation rates for these institutions became missing values = 889
- N6Yr = The number of institutions for which this report calculated Black 6-Year graduation rates = Nraw - mvRaw = 2708 - 889 = 1819
- G6Yr75per50 = The number of institutions whose Black 6-year graduation rate was at least 75 percent and that graduated at least 50 Black students by August 2016 = 55
- MedSixYrPer = the median value for Black 6-year graduation rates among the "Black 6-year 75 Percent 50 Group" = 85%
- gt100 = The number of institutions whose Black 6-year graduation rates were greater than 100% = 2.
Key points from Table 2 and Chart 2 (below)
- Missing values -- Although there were 2,708 accredited colleges and universities in the data file downloaded from IPEDS, calculation of the Black 6-year graduation rates resulted in missing values for the graduation rates for 889 institutions. A full discussion of the logic and the significance of these adjustments is presented in Appendix #1.
- Membership -- Only 55 colleges and universities satisfied the requirements for membership in the "Black 6-year 75 Percent 50 Group"
- Over-rated -- Two institutions, UC San Diego and Everglades University, reported 6-year graduation rates that were larger than 100%. The obvious but absurd interpretation is that these institutions somehow graduated more students by August 2016 than they enrolled in the Fall 2010 semester. More plausibly, these may be two cases of data entry errors. On the other hand, the graduations might have included transfer students who started their studies elsewhere in Fall 2010 and then graduated by August 2016. (Note: Chart 2 does not include the two graduation rates that were over 100 percent)
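The two membership conditions and the over-100 check can be sketched on a small invented data frame. Everything here is hypothetical (the institutions A to E, their counts, and the column names blackEnter, blackGrads, member, and over100); it is a minimal base R illustration of the logic, not the report's own code.

```r
# Hypothetical institutions; counts are invented for illustration only
dt <- data.frame(
  inst       = c("A", "B", "C", "D", "E"),
  blackEnter = c(100, 60, 40, 200, 50),
  blackGrads = c(80, 40, 36, 150, 60),
  stringsAsFactors = FALSE
)

# Black 6-year graduation rate, rounded to a whole percent
dt$rate <- round(100 * dt$blackGrads / dt$blackEnter)

# Condition 1: a rate of at least 75 percent
# Condition 2: at least 50 Black graduates in the cohort
dt$member <- dt$rate >= 75 & dt$blackGrads >= 50

# Rates above 100 percent signal a likely data problem
dt$over100 <- dt$rate > 100

print(dt)
```

In this toy data, institution E passes both membership conditions while also exceeding 100 percent, which is why a separate flag for such cases is useful.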
Chart 2. Distribution of Black 6-Year Graduation Rates
C. The Black Six Year 75 Percent 50 Group (B6Yr75Per50)
Table 3 lists the 55 U.S. accredited colleges and universities wherein at least 75 percent of the Black students who entered four year bachelors degree programs as first-time, full-time students in Fall 2010 graduated within six years, i.e., by August 2016. Moreover, there were at least 50 Black students in each graduating cohort.
Table 3. Members of the "Black Six Year 75 Percent 50 Group"
Institution Reg ST Grad Sh% B% T% Ga% P% M25
nInst = 55 ALL ALL 6,983 100.0 85 88 -3 21 613
U of IL at Urbana-Ch Lakes IL 286 4.1 80 85 -5 21 680
U of Mich, Ann Arbor Lakes MI 225 3.2 80 91 -11 15 640
Northwestern U Lakes IL 104 1.5 90 93 -3 15 690
U of Wisc, Madison Lakes WI 90 1.3 75 85 -10 15 620
Andrews U Lakes MI 60 0.9 75 62 13 32 430
U of Notre Dame Lakes IN 58 0.8 76 94 -18 12 670
U of MD, College Park MEast MD 412 5.9 81 86 -5 15 610
Syracuse U MEast NY 218 3.1 79 82 -3 25 540
SUNY at Albany MEast NY 182 2.6 78 66 12 37 530
Cornell U MEast NY 155 2.2 92 94 -2 15 670
U of Pennsylvania MEast PA 137 2.0 93 95 -2 16 690
Columbia U MEast NY 131 1.9 87 94 -7 18 690
Georgetown U MEast DC 104 1.5 95 94 1 14 650
George Washington U MEast DC 100 1.4 76 83 -7 13 600
Princeton U MEast NJ 93 1.3 97 97 0 11 710
Binghamton U MEast NY 92 1.3 78 82 -4 26 620
Carnegie Mellon U MEast PA 66 0.9 80 90 -10 13 680
Villanova U MEast PA 59 0.8 91 90 1 11 620
Boston U NewEn MA 113 1.6 85 86 -1 17 600
Boston College NewEn MA 109 1.6 96 94 2 16 640
Harvard U NewEn MA 97 1.4 98 97 1 18 700
Brown U NewEn RI 89 1.3 85 95 -10 17 670
Yale U NewEn CT 88 1.3 98 98 0 13 700
Northeastern U NewEn MA 82 1.2 81 86 -5 13 630
Dartmouth College NewEn NH 72 1.0 97 97 0 13 680
M.I.T. NewEn MA 60 0.9 90 92 -2 19 740
Amherst College NewEn MA 54 0.8 89 94 -5 23 670
Tufts U NewEn MA 54 0.8 78 92 -14 10 680
Wash U, St Louis Plain MO 85 1.2 94 94 0 6 710
U of Florida SEast FL 448 6.4 79 87 -8 29 600
Spelman College SEast GA 334 4.8 77 77 0 51 470
U of NC, Chapel Hill SEast NC 299 4.3 85 91 -6 20 620
U of Georgia SEast GA 274 3.9 80 84 -4 22 570
UVA, Main Campus SEast VA 204 2.9 91 94 -3 13 620
Duke U SEast NC 154 2.2 94 94 0 13 680
Vanderbilt U SEast TN 130 1.9 91 92 -1 13 690
Georgia Tech SEast GA 111 1.6 76 86 -10 17 650
U of Miami SEast FL 100 1.4 76 82 -6 22 620
Coll of William & Mary SEast VA 83 1.2 90 91 -1 10 620
Wake Forest U SEast NC 69 1.0 85 88 -3 14 NA
Elon U SEast NC 66 0.9 76 82 -6 10 560
Everglades U SEast FL 51 0.7 255 240 15 84 NA
Texas A&M, College St SWest TX 195 2.8 88 86 2 23 570
Southern Methodist U SWest TX 66 0.9 80 79 1 15 580
Rice U SWest TX 52 0.7 91 93 -2 16 690
US Naval Academy USsrv MD 80 1.1 86 90 -4 NA 590
USC West CA 150 2.1 89 91 -2 18 650
U of Wash, Seattle West WA 129 1.8 78 84 -6 24 570
Stanford U West CA 111 1.6 91 94 -3 16 690
UCLA West CA 111 1.6 85 91 -6 32 590
UC Santa Barbara West CA 66 0.9 79 82 -3 34 550
UC Davis West CA 65 0.9 83 86 -3 39 560
UC San Diego West CA 55 0.8 177 87 90 47 610
UC Irvine West CA 53 0.8 85 87 -2 38 560
Loyola Marymount U West CA 52 0.7 79 83 -4 18 560
Descriptions of the variables in Table 3
- Institution = The 55 colleges/universities in the "Black Six Year 75 Percent 50 Group"
- Reg = Great Lakes, Mid-East, New England, Great Plains, Southeast, Southwest, U.S. Service Academies, Far West
- ST = State, e.g., New York, California, etc
- Grad = Black graduates as of August 2016 from the cohort that entered in Fall 2010
- Sh% = Percentage share of all Black graduates in the group that come from each institution
- B% = Black 6-year graduation rate ... all numbers are percentages
- T% = 6-year graduation rate of all students (total) ... all numbers are percentages
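- Ga% = B% - T%, i.e., the gap between the Black rate and the overall rate; negative values mean that the Black rate is lower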
- P% = Percentage of entering students who received Pell grants in Fall 2010
- M25 = Twenty-fifth percentile score on math SAT exam
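The share and gap columns can be recomputed for a few rows as a sanity check. The sketch below copies three rows from Table 3 and uses the Grad value in the table's top row as the denominator. The gap is computed as B% minus T%, which is the sign convention that reproduces the Ga% column printed in the table. The vector names are illustrative only.

```r
# Three rows copied from Table 3; groupTotal is the Grad value in its top row
inst       <- c("U of Florida", "U of MD, College Park", "Spelman College")
grad       <- c(448, 412, 334)
bPer       <- c(79, 81, 77)   # B% column
tPer       <- c(87, 86, 77)   # T% column
groupTotal <- 6983

shPer <- round(100 * grad / groupTotal, 1)  # share of the group's Black graduates
gaPer <- bPer - tPer                        # negative => Black rate is lower

print(data.frame(inst, grad, shPer, bPer, tPer, gaPer))
```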
Key points from Table 3 and Map 3 (below)
- Group size -- Only 55 colleges and universities met both conditions for membership in the Black 6-Year 75 Percent 50 group, i.e., only 55 graduated at least 75 percent of their Fall 2010 Black entrants within six years and had at least 50 Black students in the graduating cohort
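- Total Black graduates -- The total number of Black students who graduated from this group = 6,983 (the top row of Table 3). Excluding the two institutions that reported rates above 100 percent (51 + 55 = 106 graduates) leaves 6,877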
- Leaders -- University of Florida (448 graduates), University of Maryland, College Park (412 graduates), and Spelman College (334 graduates)
- Regions -- Over 90 percent of the Black graduates received their bachelors degrees from colleges and universities in the central, eastern, and southern states.
- States -- The institutions in the Black 6-Year 75 Percent 50 Group are located in only 23 of the 51 jurisdictions formed by the 50 states plus the District of Columbia. Map 3 emphasizes this point by displaying only the states within the group, leaving empty spaces for the other 28.
- Colors -- The states on Map 3 are colored by the share of all Black graduates who were educated in each state. For example Georgia (southeast) is bright red because its share = 4.9% (Spelman College) + 4.0% (University of Georgia) + 1.6% (Georgia Tech) = 10.5%
- Gaps -- With eight exceptions, the 6-year graduation rates for Black students were about the same as the graduation rates for all students at each institution. The gaps ranged from +2 to -8, where +2 means that Black students graduated at a rate that was two percentage points higher and -8 means that Black students graduated at a rate that was eight percentage points lower
- Outliers -- The gaps at eight institutions are puzzling: +12 (SUNY at Albany), -10 (Brown University), -10 (University of Wisconsin, Madison), -10 (Carnegie Mellon University), -10 (Georgia Tech), -11 (University of Michigan, Ann Arbor), -14 (Tufts University), and -18 (University of Notre Dame).
This report is based on an exploratory investigation, so it can only note what happened; it can't say why things happened the way they happened. At best it can raise questions that would require further investigation to resolve. For example, why was SUNY at Albany's Black graduation rate so much higher than its overall graduation rate? And why were the Black graduation rates at the other outlier institutions so much lower than the overall rates?
The -18% gap at Notre Dame is large enough to be especially worrisome because it suggests that some of the Black students in Notre Dame's entering cohort might not have met the full range of criteria that were imposed on other applicants. Admitting Black applicants who were not fully qualified -- e.g., applicants who required remedial courses before engaging with the regular curriculum -- was the basis for the old conservative challenge to affirmative action. Black students who were admitted under such circumstances took slots that might otherwise have gone to fully qualified applicants, presumably White and/or Asian.
Map 3. State percent shares of Black graduates from B6Yr75Per50
All of the colleges and universities listed in Table 3 had overall six-year graduation rates that were well above the 52 percent national average for individual students; indeed, all but three had graduation rates above 80 percent. So they were all selective institutions. Table 4 will focus on the twenty members of the group that were highly selective.
But before moving on to Table 4, the author asks the reader's indulgence while he offers a few comments about the significance of the presence of Spelman College in Table 3. This digression will put the findings displayed in Tables 3 and 4 into a broader context.
Spelman College (Atlanta, Georgia) is one of the nation's most prominent Historically Black Colleges and Universities (HBCUs). Having predominantly Black enrollments, HBCUs don't have affirmative action policies. HBCUs are justly renowned for providing Black students with access to a college education that might otherwise be beyond their reach.
Spelman's achievement of a 77 percent 6-year graduation rate (Table 3) is testimony to the high quality of the educational opportunities that it provided. Up until now this report has adopted a "let's celebrate because the glass is half-full" perspective. Spelman's presence in Table 3 suggests that it is also useful to regard the glass as half empty.
- Like most HBCUs, Spelman enrolled a high percentage of students who were financially pressed. As shown in the next to last column of Table 3, Spelman's percentage of entering students who received Pell grants = 51%. At all of the other institutions listed in Table 3, the Pell percentage was less than 40 percent; indeed, most institutions had Pell percentages that were less than 30 percent.
- As also happens at most HBCUs, many of Spelman's students had math SAT scores that were much lower than the scores at all of the other institutions listed in Table 3. As shown in the last column, Spelman's 25th percentile = 470, i.e., 25 percent of its entering students had math scores that were lower than 470. No doubt many of Spelman's entering students achieved much higher scores, but Spelman accepted a calculated risk by enrolling so many students whose lower scores reflected substantial shortfalls in their prior academic preparation.
- Nevertheless, Spelman managed to transform its Fall 2010 entering class into a cohort that achieved a 77 percent graduation rate, a rate that greatly exceeded the national average. Given that the other institutions in Table 3 had access to far greater financial resources than Spelman, why didn't they produce larger cohorts of Black students with higher graduation rates?
Based on his experience as a member of the tenured faculty and senior staff of another prominent HBCU for over forty years, the author of this report has concluded that the success of Spelman and other highly renowned HBCUs is related to their ability to quickly enmesh their Black students in large, predominantly Black, overlapping, dense networks of student-peers, teachers, advisor-mentors, and role models. These resources enable them to succeed despite being underfunded year after year after year. Perhaps most non-HBCUs weren't more successful because they couldn't mobilize these kinds of support networks.
But what about the other two standouts in Table 3: the University of Florida (448 Black graduates) and the University of Maryland at College Park (412 Black graduates)? Although both have substantial Black enrollments, neither employs large numbers of Black faculty or Black staff. So what accounts for their higher productivity? ... and ... Could their methods enable other non-HBCUs to become more productive?
Table 4. Highly Selective Subset of B6Yr75Per50 Group
Institution Reg ST Grad Sh% B% T% Ga% P% M25
nInst = 20 ALL ALL 2,076 100.0 89 92 -3 15 687
Northwestern U Lakes IL 104 5.0 90 93 -3 15 690
U of IL, Urbana-Ch Lakes IL 286 13.8 80 85 -5 21 680
U of Notre Dame Lakes IN 58 2.8 76 94 -18 12 670
Princeton University MEast NJ 93 4.5 97 97 0 11 710
Columbia University MEast NY 131 6.3 87 94 -7 18 690
U of Pennsylvania MEast PA 137 6.6 93 95 -2 16 690
Carnegie Mellon U MEast PA 66 3.2 80 90 -10 13 680
Cornell University MEast NY 155 7.5 92 94 -2 15 670
M.I.T. NewEn MA 60 2.9 90 92 -2 19 740
Harvard University NewEn MA 97 4.7 98 97 1 18 700
Yale University NewEn CT 88 4.2 98 98 0 13 700
Dartmouth College NewEn NH 72 3.5 97 97 0 13 680
Tufts University NewEn MA 54 2.6 78 92 -14 10 680
Amherst College NewEn MA 54 2.6 89 94 -5 23 670
Brown University NewEn RI 89 4.3 85 95 -10 17 670
Washington U, St Louis Plain MO 85 4.1 94 94 0 6 710
Vanderbilt University SEast TN 130 6.3 91 92 -1 13 690
Duke University SEast NC 154 7.4 94 94 0 13 680
Rice University SWest TX 52 2.5 91 93 -2 16 690
Stanford University West CA 111 5.3 91 94 -3 16 690
Descriptions of the variables in Table 4
- Institution = The 20 most selective institutions in the "Black Six Year 75 Percent 50 Group"
- Reg = Great Lakes, Mid-East, New England, Great Plains, Southeast, Southwest, U.S. Service Academies, Far West
- ST = State, e.g., New York, California, etc
- Grad = Black graduates as of August 2016 from the cohort that entered in Fall 2010
- Sh% = Percentage share of all Black graduates in this subgroup that come from each institution
- B% = Black 6-year graduation rate ... all numbers are percentages
- T% = 6-year graduation rate of all students (total) ... all numbers are percentages
- Ga% = B% - T%, i.e., the gap between the Black rate and the overall rate; negative values mean that the Black rate is lower
- P% = Percentage of entering students who received Pell grants in Fall 2010
- M25 = Twenty-fifth percentile score on math SAT exam >= 670
Key points from Table 4 and Map 4 (below)
- Group Size -- The report employed a simple measure of higher selectivity ==> the 25th percentile score on the math SAT exam. The 20 colleges and universities whose 25th percentile for their Fall 2010 entering class was greater than or equal to 670 are listed in Table 4. In other words, only 25 percent of the first-time, full-time students who entered any of these institutions in the Fall 2010 semester received math scores that were lower than 670; a strong majority, 75 percent, scored 670 or higher.
The number 670 was chosen because it is the lowest score that would include all eight members of the Ivy League in the group, the Ivy League being the initial targets of the current neoconservative challenge to affirmative action admissions policies.
The colleges and universities are sorted by regions in alphabetical order, and within each region by their math SAT scores in declining order.
- Total Black graduates = The total number of Black students who graduated from this highly selective subgroup = 2,076
- Leaders -- The University of Illinois at Urbana-Champaign was the most productive institution by far, its cohort of 286 Black graduates being more than twice as large as the cohorts of all but two of the other members of this highly selective subgroup.
- Location -- Illinois is displayed in Map 4's brightest red, reflecting its having the largest total share of all Black graduates = 18.8% = 13.8% (University of Illinois) + 5.0% (Northwestern University). Notre Dame (2.8%) is located in Indiana, so its share is not part of the Illinois total.
- Pipelines -- The states that provided the biggest pipelines of Black graduates from the nation's top-rated colleges and universities into the nation's most demanding job markets and best graduate schools were the three reddest states on Map 4 ==> Illinois (18.8%), New York (13.8%), and Massachusetts (12.8%)
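State shares of this kind can be derived by summing the Grad column within each state and dividing by the subgroup total. The sketch below copies the ST and Grad columns from Table 4 and uses base R; the variable names st, grad, byState, and share are illustrative, and the report's own mapping code may aggregate differently.

```r
# ST and Grad columns copied from Table 4, in the table's row order
st   <- c("IL", "IL", "IN", "NJ", "NY", "PA", "PA", "NY", "MA", "MA",
          "CT", "NH", "MA", "MA", "RI", "MO", "TN", "NC", "TX", "CA")
grad <- c(104, 286, 58, 93, 131, 137, 66, 155, 60, 97,
          88, 72, 54, 54, 89, 85, 130, 154, 52, 111)

# Sum graduates by state, then convert to percent shares of the subgroup total
byState <- sapply(split(grad, st), sum)
share   <- round(100 * byState / sum(grad), 1)

# Largest shares first
print(sort(share, decreasing = TRUE))
```

With these inputs New York comes to 13.8 and Massachusetts to 12.8, matching the figures quoted for those two states.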
Map 4. State percent shares of Black graduates from highly selective B6Yr75Per50
As previously noted, this report is based on an exploratory investigation; as such it can describe what happened; but it can't say why things happened the way they happened. It can only raise questions that would require further investigation to resolve. So here are a couple of final questions for further investigation: What made the University of Illinois such a highly effective producer of Black graduates? ... and ... Could its methods enable other highly selective colleges and universities to become more effective producers of Black graduates?
D. Closing comments
As promised in the opening section of this report, Tables 3 and 4 show what's at stake: 6,983 Black graduates from selective institutions and 2,076 Black graduates from highly selective institutions. Although these figures were for the cohort of Black students who entered in the Fall 2010 semester, one would have expected the numbers to grow year by year, decade by decade ... that is, until the neoconservative challenge to Harvard's admissions policies became a federal case. If the courts decide to prohibit or to severely restrict affirmative action admissions policies at Harvard and at other selective institutions, the numbers of Black students who will be admitted to and graduate from the nation's top-tier colleges and universities in the coming decades may be reduced to small fractions of their current values.
+++++++++++++++++++++
Related notes on this blog:
- Affirmative action at America's top tier private colleges and universities ... updated 9/29/18
APPENDIX #1 -- Technical note
This appendix covers an important technical issue that might not be of interest to most readers.
Missing values
Only two of the variables in the tables of this report had missing values that were relevant to the calculation of the 6-year graduation rates for Black students: (1) the number of full-time, first-time Black students who entered in the Fall 2010 semester, and (2) the number of Black students who graduated by August 2016. Table 2MV (below) provides the data required to understand why missing values turned out to be relatively unimportant.
Table 2MV -- Missing values from Table 2
Nraw mvRaw zEntGrad mv6Yr N6Yr
2,708 889 186 703 1,819
Descriptions of the variables in Table 2MV
- Nraw = The number of accredited colleges and universities that offered bachelors degrees in the Fall 2010 semester = 2,708
- mvRaw = The number of institutions for whom the data required to construct 6-year graduation rates for Black students was missing from raw data that was downloaded from the IPEDS database, i.e., the data for Fall 2010 entrants and/or the entrants who subsequently graduated by August 2016 = 889
- zEntGrad = The number of institutions that reported zero Black entrants in Fall 2010 and zero Black graduates by August 2016 = 186 (See additional discussion below)
- mv6Yr = The "corrected" number of institutions for which Black 6-year graduation rates were missing = mvRaw - zEntGrad = 889 - 186 = 703
- N6Yr = The number of institutions used to produce Table 2 = Nraw - mvRaw = 2,708 - 889 = 1,819
In R, dividing zero by zero (0/0) returns NaN ("not a number"), which R counts as a missing value. This is usually just another bit of uninteresting R trivia, but in this case 0/0 represents an important special case. Institutions that enrolled zero Black students would also produce zero Black graduates; so their Black graduation rates would be 0/0, thereby causing R to toss them onto the same pile of missing cases as institutions that failed to inform IPEDS about the number of Black students they enrolled or graduated.
So we only need to be concerned about the 703 institutions whose Black entrants or graduates actually had missing values. We don't need to be concerned about the 186 institutions that didn't teach any Black students, i.e., they didn't enroll any Black students, so they didn't graduate any Black students.
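The 0/0 behavior and the resulting bookkeeping can be sketched in a few lines of base R. The four institutions and their counts below are invented for illustration; the names mvRaw, zEntGrad, and mv6Yr simply mirror the columns of Table 2MV and are not claimed to match the report's script.

```r
# Hypothetical counts for four institutions (invented for illustration)
blackEnter <- c(120,  0, NA, 40)
blackGrads <- c( 90,  0, 30, NA)

# 0/0 gives NaN; any NA input gives NA
rate <- round(100 * blackGrads / blackEnter)
print(rate)

# is.na() is TRUE for both NaN and NA, so both land in the "missing" pile
mvRaw <- sum(is.na(rate))

# Separate true zeros from genuinely unreported counts
zEntGrad <- sum(blackEnter == 0 & blackGrads == 0, na.rm = TRUE)
mv6Yr    <- mvRaw - zEntGrad

# The same bookkeeping with the counts reported in Table 2MV
stopifnot(889 - 186 == 703, 2708 - 889 == 1819)
print(round(100 * 703 / 2708))  # percent of institutions with truly missing rates
```

In the toy data, three rates are flagged as missing, but only two of them reflect unreported counts; the third is an institution with no Black entrants at all.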
The 703 colleges and universities not included in the pool from which the most productive institutions were extracted by this report represent a substantial portion of all U.S. institutions that offer bachelors degrees, 703 / 2708 = 26%. Graduation rates are one of the defining attributes of the Black 6-Year 75 percent 50 Group. Omitting 26% of the nation's bachelors degree programs would ordinarily be a major limitation, if not a showstopper, but not in this case.
The author's many decades of experience in examining Black graduation rates at the nation's colleges and universities make him confident that every one of the largest producers of Black graduates is either included in Table 3 or was excluded because it has not achieved a 75 percent graduation rate.
- For example only one HBCU, Spelman College, has a Black graduation rate that's greater than or equal to 75 percent. A few of the other 86 HBCUs that offer bachelors degrees are close, but they aren't there yet.
- As for non-HBCUs, the biggest enrollers of Black students are in Georgia, Illinois, North Carolina, Florida, the tri-state Maryland-DC-Virginia region, New York, and the Ivy League ... and they are all fully represented in Table 3 and depicted on Map 3.
On the other hand, this report was intended to be a celebration of the success of affirmative action policies for Black students in higher education. Had the report overestimated the number of institutions that had met its criteria for success, there would be cause for concern. But an underestimate provides cause for hope that subsequent editions of the report will celebrate even greater success as lagging institutions eventually provide IPEDS with their missing data.
Furthermore, the problem of missing values for one cohort of entrants and graduates will be minimized in subsequent editions of this report because subsequent editions will use the moving averages of the most recent two or three cohorts of entrants and graduates from each institution to determine the membership of the Black 6-Year 75 percent 50 Group. Missing entering/graduating values for one cohort will be extrapolated from the entering/graduating values provided by cohorts from the other years.
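One way such a multi-cohort rate might be computed is sketched below. This is an assumption about a future method, not the report's implementation: it pools entrants and graduates over whichever cohorts have both counts reported, which is only one simple reading of "moving averages". The function name pooledRate and the counts are hypothetical.

```r
pooledRate <- function(enter, grads) {
  # Keep only cohorts where both counts were reported
  ok <- !is.na(enter) & !is.na(grads)
  # No usable cohort, or no entrants at all: the rate stays missing
  if (!any(ok) || sum(enter[ok]) == 0) return(NA_real_)
  round(100 * sum(grads[ok]) / sum(enter[ok]))
}

# Three consecutive cohorts for one hypothetical institution;
# the middle cohort's graduate count is missing
enter <- c(100, 110, 90)
grads <- c( 78,  NA, 72)
print(pooledRate(enter, grads))
```

Pooling counts rather than averaging per-cohort percentages keeps large cohorts from being outweighed by small ones, but other choices are equally defensible.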
APPENDIX #2 -- IPEDS Data
The tables in this report are based on data obtained from the IPEDS database of the U.S. Dept. of Education's National Center for Education Statistics. Readers who are interested in obtaining the IPEDS data from which the tables in this report were derived can do so via the following procedures. They should begin by creating a master folder/directory on their computers called "SIX-YEAR-GRAD-RATES".
IPEDS Data -- Enrollment and graduation data
- Direct your browser to the IPEDS Use The Data page
- Click "Shortcuts", then click "Select your shortcut"
- From the drop-down menu select "Previously saved IPEDS data session"
- When the next page is loaded, enter Guest_4802986377 in the "Job number" field. This access number will be in effect for thirty days beginning 10/3/2018
- Click "Continue" to display the selected institutions and variables
- Click "Continue" again to obtain the data download page
- Select "Download in comma separated format", then click "Continue"
- Specify the directory/folder and filename for your data.
Readers who intend to use the R script in APPENDIX #3 to generate the tables that appear in this report should name the file "IPEDS-raw.csv" and place it in their SIX-YEAR-GRAD-RATES folder
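Before running the full script, it may help to confirm that the download landed where the script expects it. The helper below is a hypothetical convenience, not part of the report's script: the name readRaw is invented, and it uses base R's read.csv rather than the tidyverse reader that the script itself relies on.

```r
# Hypothetical helper: stop early with a clear message if the download
# was not saved in the expected folder under the expected name
readRaw <- function(home, fileName = "IPEDS-raw.csv") {
  path <- file.path(home, fileName)
  if (!file.exists(path)) {
    stop("Expected IPEDS download not found: ", path)
  }
  # check.names = FALSE keeps the original IPEDS column headers intact
  read.csv(path, stringsAsFactors = FALSE, check.names = FALSE)
}
```

A call such as readRaw(home), with home set to the full path of the SIX-YEAR-GRAD-RATES folder, would then either return the raw data or fail with a message naming the missing file.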
APPENDIX #3 -- R Script
The R script in this appendix will enable readers to generate the tables, maps, and charts that appear in this report. The script should be copied and pasted into a file called "master.R" in the SIX-YEAR-GRAD-RATES folder. It's a long but straightforward script, straightforward because it makes extensive use of some of the elegant "tidyverse" tools developed by Hadley Wickham and associates at RStudio.
- The makeTable_0() function reads the data in IPEDS-raw.csv, renames the IPEDS variables to user-friendly names, generates some additional variables, then returns the reformatted data in table_0 (an internal table).
- Other scripts and functions use table_0 to generate the tables, maps, and charts that appear in the report.
- The tables that appear in the report are saved into pairs of .csv files and .html files; maps are saved as .png files; charts are saved as .jpg files
Note 1: Users should remember to set the "home" variable on the first line of the script to equal the full path on their computers to their SIX-YEAR-GRAD-RATES folder.
Note 2: The following R script was preformatted for this blog using Source Code Formatter
### R Script to generate all tables and maps
### User must change the "home" variable to point to
### the SIX-YEAR-GRAD-RATES folder/directory on his/her computer
#
### A. Set the working directory, load required libraries, and read IPEDS data
home <- "blah/blah-blah/.../blah-blah-blah/SIX-YEAR-GRAD-RATES/"
setwd(home)
#
### Developed using R = version 3.5.1 and R Studio = version 1.456
#
### install.packages("tidyverse")
library(tidyverse)
### install.packages("maps")
library(maps)
### needed by ggplot2, but not installed automatically
###install.packages("mapproj")
library(mapproj)
library(grid)
#
### install.packages("openintro")
# Used by mapping functions to convert state names to abbreviations
library(openintro)
#
### Installs prettydoc for converting R-markdown files to html via knitr
###install.packages("prettydoc")
library(prettydoc)
#
### Read IPEDS as tibble, not dataframe
### ... Use tibbles throughout
### ... Most things are better with tibbles,
### ... but, as per Star Trek #1, there is, sometimes,
### ... "Trouble with Tibbles" ... :-)
###
### Code is written to generate five reports, one for Black,
### White, Hispanic/LatinX, Asian, and Other
### Variables are usually accessed indirectly, via column
### names in quotes. For example, set race <- "black", then
### set raceGrads <- paste0(race, "Grads") ... so that
### dt[, raceGrads] is equivalent to dt[,"blackGrads"]
#
dtIPEDS <- read_csv("IPEDS-raw-final.csv")
### Ignore the warning message about unnamed variable now called "X22"
#
### B. Source the local functions
#################################
#################################
#
addPer <- function(num, denom, DEC=0) {
### Function returns vector triple = c(num, denom, and 100 * num/denom)
### Used for making table 1 more compact ...
#
per <- round(100 * num/denom, DEC)
return(c(num, denom, per))
}
#
theme_clean <- function(base_size = 12) {
### Function provides default theme used by makeMap function ...
#
require(grid) # Needed for unit() function
theme_grey(base_size) %+replace%
theme(
axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank(),
panel.background = element_blank(),
panel.grid = element_blank(),
###panel.margin = unit(0, "lines"),
plot.margin = unit(c(0, 0, 0, 0), "lines"),
complete = TRUE
)
}
#
makeMap <- function(dt, title=NULL) {
### Function draws map of percents in Sh% variable by states ...
### State variable must be ST, share variable must be Sh%
#
states_map <- map_data("state")
#
dt <- select(dt, `ST`, `Sh%`)
dt <- dt[-1, ] #Omit the ALL 100 top row
#
dtMap <- dt %>%
mutate(state = abbr2state(ST)) %>%
mutate(state = tolower(state)) %>%
group_by(state) %>%
summarise(share = sum(`Sh%`)) %>%
select(state, share)
#
dtMap <- merge(states_map, dtMap, by.x="region", by.y= "state")
GroupData <- dtMap[,"share"]
#
ggMap <- ggplot(data=dtMap, aes(map_id=region, fill=GroupData)) +
geom_map(map=states_map, colour="black") +
scale_fill_gradient2(low="#559999",
mid="grey90", high="#FF0000", midpoint= median(GroupData)) +
expand_limits(x=states_map$long, y=states_map$lat) +
coord_map("polyconic") + theme_clean() +
guides(fill=guide_legend(title.position = "left")) +
theme(legend.title=element_blank(),
plot.margin=unit(c(1,1,1,1), "cm"))
if(!is.null(title)) {
ggMap <- ggMap + ggtitle(title)
}
#
return(ggMap)
}
#
addShareCol <- function(dt, colSource, p=1) {
### Given the name of a column = colSource, and precision = p
### in a tibble, function computes a new column
### wherein each value of original row is divided by the
### total of the column and rounded to p decimal places
### Names the new column = "shareXXX"
### Called by makeTable3, makeTable4 ...
#
totCol <- sum(dt[, colSource], na.rm=TRUE)
share_xxx <- as.matrix(round((100 * dt[, colSource]/totCol), p))
share_xxx <- share_xxx[, 1]
dt <- mutate(dt, shareXXX = share_xxx)
#
return(dt)
}
#
addSumRow <- function(dt){
### Adds a new first row that contains the sums of all numeric columns
### Called by makeTable3, makeTable4 ...
#
dt <- select(dt, c(institution, state, region, raceGrads, share, raceEntered, totalGrads, totalEntered, race6YrPer, total6YrPer, gapPer6Yr, mat25, totalPell, pellPer))
#
dtSums <- dt %>% #calculate totals for numeric cols
summarise(raceGrads = sum(raceGrads, na.rm=TRUE),
raceEntered = sum(raceEntered, na.rm=TRUE),
race6YrPer = round((100 * raceGrads/raceEntered),0),
totalGrads = sum(totalGrads, na.rm=TRUE),
totalEntered = sum(totalEntered, na.rm=TRUE),
total6YrPer = round((100 * totalGrads/totalEntered),0),
gapPer6Yr = race6YrPer - total6YrPer,
#
totalPell = sum(totalPell, na.rm=TRUE),
pellPer = round((100 * totalPell/totalEntered), 0),
#
weightedMat25 =
sum(dt[, "mat25"] * dt[, "totalEntered"], na.rm=TRUE),
mat25 = round((weightedMat25/totalEntered), 0),
nInst = n())
#
# Reorder
dt <- select(dt, c(institution, region, state, raceGrads, share,
raceEntered, totalGrads, totalEntered, race6YrPer,
total6YrPer, gapPer6Yr, mat25, totalPell, pellPer))
#
# Use first row of dt as "template" for sum Row,
# plug in values into each column
dtSumRow <- dt[1,]
dtSumRow <- dtSumRow %>%
# Tibble fudge ...
mutate(institution = paste0("nInst = ",
as.matrix(dtSums[1, "nInst"][,1])),
region = "ALL",
state = "ALL",
raceGrads = as.matrix(dtSums[1,"raceGrads"])[, 1],
raceEntered = as.matrix(dtSums[1, "raceEntered"])[, 1],
share = 100,
totalGrads = as.matrix(dtSums[1, "totalGrads"])[, 1],
totalEntered = as.matrix(dtSums[1, "totalEntered"])[, 1],
race6YrPer = as.matrix(dtSums[1,"race6YrPer"])[, 1],
total6YrPer = as.matrix(dtSums[1,"total6YrPer"])[, 1],
gapPer6Yr = as.matrix(dtSums[1, "gapPer6Yr"])[, 1],
totalPell = as.matrix(dtSums[1, "totalPell"])[, 1],
pellPer = as.matrix(dtSums[1,"pellPer"])[, 1],
mat25 = as.matrix(dtSums[1, "mat25"])[, 1])
#
dt <- rbind(dtSumRow, dt)
return(dt)
}
#
#
prepTable_0 <- function(dt){
### Reads raw IPEDS data, renames variables, calculates misc other stuff,
### returns master tibble ... Called by makeTable1, 2, 3, _4 ...
#
dt <- dt %>%
# Rename IPEDS variables to user friendly names
rename(institution = `Institution Name`,
totalGrads = `Grand total (GR2016 Bachelor's or equiv subcohort (4-yr institution) Completers of bachelor's or equiv degrees total (150% of normal time))`,
asianGrads = `Asian total (GR2016 Bachelor's or equiv subcohort (4-yr institution) Completers of bachelor's or equiv degrees total (150% of normal time))` ,
blackGrads = `Black or African American total (GR2016 Bachelor's or equiv subcohort (4-yr institution) Completers of bachelor's or equiv degrees total (150% of normal time))`,
hispanicGrads = `Hispanic total (GR2016 Bachelor's or equiv subcohort (4-yr institution) Completers of bachelor's or equiv degrees total (150% of normal time))`,
whiteGrads = `White total (GR2016 Bachelor's or equiv subcohort (4-yr institution) Completers of bachelor's or equiv degrees total (150% of normal time))`,
totalEntered = `Grand total (EF2010A_RV Full-time students Undergraduate Degree/certificate-seeking First-time)`,
blackEntered = `Black or African American total (EF2010A_RV Full-time students Undergraduate Degree/certificate-seeking First-time)`,
asianEntered = `Asian total (EF2010A_RV Full-time students Undergraduate Degree/certificate-seeking First-time)` ,
hispanicEntered = `Hispanic total (EF2010A_RV Full-time students Undergraduate Degree/certificate-seeking First-time)`,
whiteEntered = `White total (EF2010A_RV Full-time students Undergraduate Degree/certificate-seeking First-time)`,
region = `Bureau of Economic Analysis (BEA) regions (HD2016)`,
state = `State abbreviation (HD2016)`,
totalPell = `Number of full-time first-time undergraduates awarded Pell grants (SFA1011_RV)`,
mat25 = `SAT Math 25th percentile score (IC2010_RV)`) %>%
#
# Calculate "other" racial groups besides asian, black, hispanic, white
mutate(otherGrads = totalGrads - (blackGrads + whiteGrads + asianGrads + hispanicGrads)) %>%
mutate(otherEntered = totalEntered - (blackEntered + whiteEntered + asianEntered + hispanicEntered))%>%
#
mutate(total6YrPer = round(100 * totalGrads/totalEntered)) %>%
mutate(black6YrPer = round(100 * blackGrads/blackEntered)) %>%
mutate(asian6YrPer = round(100 * asianGrads/asianEntered)) %>%
mutate(hispanic6YrPer = round(100 * hispanicGrads/hispanicEntered)) %>%
mutate(white6YrPer = round(100 * whiteGrads/whiteEntered)) %>%
mutate(other6YrPer = round(100 * otherGrads/otherEntered)) %>%
#
mutate(pellPer = round(100 * totalPell/totalEntered, 0)) %>%
mutate(region = as.factor(region))
levels(dt$region) = c("USsrv",
"NewEn", # CT ME MA NH RI VT
"MEast", # DE DC MD NJ NY PA
"Lakes", # IL IN MI OH WI
"Plain", # IA KS MN MO NE ND SD
"SEast", # AL AR FL GA KY LA MS NC SC TN VA WV
"SWest", # AZ NM OK TX
"RockyM", # CO ID MT UT WY
"West", # AK CA HI NV OR WA
"Outer") # AS FM GU MH MP PR PW VI
#
dt <- select(dt, UnitID, institution, totalGrads, blackGrads,
asianGrads, hispanicGrads, whiteGrads, otherGrads,
totalEntered, blackEntered, asianEntered,
hispanicEntered, whiteEntered, otherEntered,
total6YrPer, black6YrPer, asian6YrPer, hispanic6YrPer,
white6YrPer, other6YrPer, region, state, mat25,
totalPell, pellPer)
#
return(dt)
}
#
makeTable1 <- function(dt){
### Creates table of national six-year graduation rates of students
### from all racial groups, ignoring their colleges and universities ...
#
dt <- dt %>%
select(totalGrads, totalEntered, blackGrads, blackEntered,
whiteGrads, whiteEntered, asianGrads, asianEntered,
hispanicGrads, hispanicEntered, otherGrads, otherEntered) %>%
colSums(na.rm=TRUE) # Calculate column sums of all races
#
Total <- addPer(dt["totalGrads"], dt["totalEntered"])
Black <- addPer(dt["blackGrads"], dt["blackEntered"])
White <- addPer(dt["whiteGrads"], dt["whiteEntered"])
Asian <- addPer(dt["asianGrads"], dt["asianEntered"])
Hispanic <- addPer(dt["hispanicGrads"], dt["hispanicEntered"])
Other <- addPer(dt["otherGrads"], dt["otherEntered"])
#
nInst = dim(dt[1])
SixYrStats <- c("Grads", "Enter", "SixYearPer")
dt <- tibble(SixYrStats, Total, Asian, White, Other, Hispanic, Black)
###rownames(dt) <- c("Grads", "Enter", "SixYearPer")
return(dt)
}
#
makeChart1 <- function(dt){
### Function makes chart for Table1 ...
#
# Omit "SixYrStats" and "Total" columns
group <- as.factor(colnames(dt[,-c(1, 2)]))
group<- factor(group, levels = c("Asian", "White", "Other", "Hispanic", "Black"))
SixYearPer <- as.numeric(dt[3, -c(1, 2)])
dt <- data.frame(group, SixYearPer)
#
### title = "Fall 2010 Cohorts -- Graduating Within 6 Years"
xLabel = "Racial Groups"
yLabel = "6-Year Graduation Rates"
gg <- dt %>%
ggplot(aes(x = group, y = SixYearPer)) +
geom_bar(stat="identity", fill="turquoise2") +
### ggtitle(title) +
xlab(xLabel) +
ylab(yLabel) +
geom_text(aes(label = SixYearPer), nudge_y = 2) +
theme(plot.title = element_text(size=12))
#
return(gg)
}
#
prepTables_234 <- function(dt, race, sixYearPerMin, minGrads){
### Prepares tibble required to make tables 2, 3, 4 ...
#
race6YrPer <- paste0(race, "6YrPer")
raceGrads <- paste0(race, "Grads")
raceEntered <- paste0(race, "Entered")
#
### Remove institutions with missing race6YrPer
dt <- filter(dt, !is.na(dt[, race6YrPer]))
#
### drop institutions with 6yr > 100 for all students
##dt <- filter(dt, total6YrPer <= 100)
#
### drop institutions with 6yr > 100 for race students
### dt <- filter(dt, dt[, race6YrPer] <= 100)
#
### Meet or exceed the target graduation rate
dt <- filter(dt, dt[, race6YrPer] >= sixYearPerMin)
#
### Set minimum size of graduating cohorts
dt <- filter(dt, dt[, raceGrads] >= minGrads)
dt <- select(dt, UnitID, institution, state, region, race6YrPer,
total6YrPer, totalGrads, totalEntered, mat25,
raceGrads, raceEntered, totalPell, pellPer)
dt[, "gapPer6Yr"] = dt[, race6YrPer] - dt[, "total6YrPer"]
#
### Shorten names for blog pages ... do final manual
### edit on names on the html page generated by knitr
dt$institution <- str_trunc(dt$institution, width=31)
#
return(dt)
}
#
makeTable2mv <- function(dt, race) {
### Function calculates missing values for 6-Year grad rates
#
# Access data via names of columns
race6YrPer <- paste0(race, "6YrPer")
raceEntered <- paste0(race, "Entered")
raceGrads <- paste0(race, "Grads")
#
N1 <- dim(dt)[1]
mv1 <- sum(is.na(dt[,race6YrPer]))
zEnt1 = sum((dt[, raceEntered] == 0), na.rm=TRUE)
zGrad1 = sum((dt[, raceGrads] == 0), na.rm=TRUE)
zEntGrad = sum((dt[, raceEntered] == 0) & (dt[, raceGrads] == 0), na.rm=TRUE)
#
dt <- dt %>%
select(race6YrPer) %>%
filter(!is.na(race6YrPer)) %>%
summarise(Nraw = N1,
mvRaw = mv1,
zEnt = zEnt1,
zGrad = zGrad1,
zEntGrad = zEntGrad,
mv6Yr = mvRaw - zEntGrad,
N6Yr = n() - mvRaw)
dt <- round(dt, 1)
dt <- select(dt, Nraw, mvRaw, zEntGrad, mv6Yr, N6Yr)
#
return(dt)
}
#
makeTable2 <- function(dt, dt_mv, race, sixYearPerMin, minGrads) {
### Function calculates distribution of 6-year grad rates for
### Black students at all accredited U.S. colleges and universities ...
#
race6YrPer <- paste0(race, "6YrPer")
raceGrads <- paste0(race, "Grads")
raceEntered <- paste0(race, "Entered")
#
dt <- select(dt, c(race6YrPer, raceGrads, raceEntered))
#
dt <- dt %>%
summarise(G6Yr75per50 = n(),
MedSixYrPer = median(as.matrix(dt[, race6YrPer]), na.rm=TRUE),
gt100 = sum(as.matrix(as.matrix(dt[, race6YrPer]) > 100), na.rm=TRUE))
#
dt <- round(dt, 1)
dt <- select(dt, G6Yr75per50, MedSixYrPer, gt100)
dt <- bind_cols(dt_mv, dt)
dt <- dt[, -c(3, 4)]
#
return(dt)
}
#
makeChart2 <- function(dt, race, Race, sixYearPerMin, minGrads){
### Function charts distribution of grad rates ...
### ... see colours() for names of color on chart
#
xLabel <- paste0(Race, " Graduation Rates")
race6YrPer <- paste0(race,"6YrPer")
raceGrads <- paste0(race, "Grads")
#
# Exclude outliers over 100
dt <- filter(dt, dt[, race6YrPer] <= 100)
MedSixYrPer <- median(as.matrix(dt[, race6YrPer]), na.rm=TRUE)
rates <- as.matrix(dt[, race6YrPer])
#
gg <- dt %>%
ggplot(aes(x = rates)) +
geom_bar(position="identity", fill="turquoise2") +
###ggtitle(title) +
xlab(xLabel) +
ylab("Number of Institutions") +
###theme(plot.title = element_text(size=12)) +
geom_vline(xintercept = MedSixYrPer, colour = "orangered2")
#
return(gg)
}
#
makeTables34 <- function(dt, race, matMin = NULL){
### Creates List of institutions in "Black Six Year
### 75 Percent 50 Group" ...
### plus stats associated with each institution
### Or creates list of most selective institutions via mat25
#
# Change names for three variables
if (race=="black") {
dt <- mutate(dt, raceGrads = blackGrads,
raceEntered = blackEntered,
race6YrPer = black6YrPer)
} else if(race == "asian") {
dt <- mutate(dt, raceGrads = asianGrads,
raceEntered = asianEntered,
race6YrPer = asian6YrPer)
} else if (race == "hispanic") {
dt <- mutate(dt, raceGrads = hispanicGrads,
raceEntered = hispanicEntered,
race6YrPer = hispanic6YrPer)
} else if (race == "white") {
dt <- mutate(dt, raceGrads = whiteGrads,
raceEntered = whiteEntered,
race6YrPer = white6YrPer)
} else {
dt <- mutate(dt, raceGrads = otherGrads,
raceEntered = otherEntered,
race6YrPer = other6YrPer)
}
#
if (!is.null(matMin)) {
dt <- filter(dt, mat25 >= matMin) ### Select the most selective
}
# This function really does two things ... rest of code is fluff
# 1. Add share column of grads to original dt
dt <- dt %>%
addShareCol("raceGrads") %>%
rename(share = shareXXX)
#
# 2. Add summary line at top of tibble table
dt <- addSumRow(dt)
#
# Select and reorder
dt <- select(dt, institution, region, state, raceGrads, share,
race6YrPer, total6YrPer, gapPer6Yr, pellPer, mat25)
if (!is.null(matMin)) {
dt <- arrange(dt, region, desc(mat25))
} else {
dt <- arrange(dt, region, desc(raceGrads))
}
#
# Only reassert true names for needed variables
if (race == "black") {
mutate(dt, blackGrads = raceGrads)
raceGrad6yrPer = "B%"
} else if (race == "asian") {
mutate(dt, asianGrads = raceGrads)
raceGrad6yrPer = "A%"
} else if (race == "hispanic") {
mutate(dt, hispanicGrads = raceGrads)
raceGrad6yrPer = "H%"
} else if (race == "white") {
mutate(dt, whiteGrads = raceGrads)
raceGrad6yrPer = "W%"
} else {
mutate(dt, otherGrads = raceGrads)
raceGrad6yrPer = "Oth%"
}
#
names(dt) <- c("Institution", "Reg", "ST",
"Grad", "Sh%", raceGrad6yrPer, "T%",
"Ga%", "P%", "M25")
return(dt)
}
############################
############################
#
### C. Execute the script commands that call the functions
# to create the tables, charts, and maps
table_0 <- prepTable_0(dtIPEDS)
table1 <- makeTable1(table_0)
chart1 <- makeChart1(table1)
table2mv <- makeTable2mv(table_0, "black")
tables_234 <- prepTables_234(table_0, sixYearPerMin = 75,
minGrads = 51, race="black")
table2 <- makeTable2(tables_234, table2mv,
"black", sixYearPerMin = 75, minGrads=50)
chart2 <- makeChart2(tables_234, "black",
"Black", sixYearPerMin = 75, minGrads=50)
table3 <- makeTables34(tables_234, race="black")
map3 <- makeMap(table3)
table4 <- makeTables34(tables_234, race="black", matMin=670)
map4 <- makeMap(table4)
#
### Display
glimpse(table_0)
table1
chart1
print(tables_234, n=100)
table2mv
table2
chart2
print(table3, n = 200) ### long tibble
map3
print(table4, n = 100) ### long tibble
map4
#
### Save the tables into a file for subsequent processing by "knitr"
save(file="tables.rda", table1, table2, table2mv, table3, table4)
#
### Save charts and maps as 5 x 3 inch images at 300 dpi
### Pass each plot explicitly so ggsave() does not depend on the last plot displayed
ggsave("chart1.jpg", plot=chart1, width=5, height=3, dpi=300)
ggsave("chart2.jpg", plot=chart2, width=5, height=3, dpi=300)
ggsave("map3.png", plot=map3, width=5, height=3, dpi=300)
ggsave("map4.png", plot=map4, width=5, height=3, dpi=300)
### Write to .csv files
write.csv(table1, file = "table1.csv")
write.csv(table2, file = "table2.csv")
write.csv(table2mv, file="table2mv.csv")
write.csv(table3, file = "table3.csv")
write.csv(table4, file = "table4.csv")
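Two idioms recur throughout the script: building a column name as a string with paste0() and then indexing by that name, and expressing each institution's graduates as a percent share of the column total, as addShareCol() does. The sketch below shows both on a made-up three-row data frame in base R. The institution names and counts are invented, and the script itself applies the same idea to tibbles.

```r
# Invented data for illustration only
dt <- data.frame(
  institution = c("College A", "College B", "College C"),
  blackGrads  = c(50, 100, 50),
  asianGrads  = c(20, 30, 50)
)

race <- "black"
raceGrads <- paste0(race, "Grads")   # builds the string "blackGrads"

# Indexing by the string selects the same column as dt$blackGrads
grads <- dt[[raceGrads]]

# Percent share of the column total, rounded to one decimal place
share <- round(100 * grads / sum(grads, na.rm = TRUE), 1)
print(data.frame(institution = dt$institution, grads, share))
```

Changing race to "asian" switches every derived column name at once, which is what lets the same functions produce a report for each group.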