Saturday, December 08, 2018

All public policy decisions should be based on open source data

Last update: Saturday 12/8/18

DLL Editor's note: Basing public policy decisions on open source data is an old idea. It's been around for a long time, assuming different names from time to time. I am just the latest "visionary" to recognize its necessity.

My personal epiphany came at the close of the first Harvard trial in Boston (October/November 2018) that challenged its race-based affirmative action admissions policies. I have personal interests in this case as a Black educator and as an alum of one of Harvard's PhD programs. 

The advocates for the plaintiffs, i.e., for the Asian American students whose applications had been rejected by Harvard's admissions office, hired an economist to develop a statistical model. The model was used to analyze six years of Harvard application files that were shared with the plaintiffs as part of the discovery process. It estimated that Asian American applicants had been unfairly rejected in favor of Black and other minority applicants. In response, Harvard's expert witness, another economist, declared that the plaintiffs' model was "deeply flawed," so its estimates of the enrollment penalties imposed on Asian American applicants were invalid. We are now awaiting the judge's decision.

Everyone understands that this trial was not just about Harvard's admissions policies. Everyone expects that this case will be appealed all the way up to the Supreme Court. The highest court in the land will then decide whether race-conscious affirmative action admissions can be used by any U.S. college or university. In other words, the ultimate judicial decision will be a public policy decision that will affect every college and university in the country.

What concerned me about this process was that pundits in the general media and in the higher ed media were voicing strong opinions about the case even though they did not have access to the underlying data. What concerned me even more was that many concerned citizens, like myself and the readers of this blog, read the pundits' opinions and then formed our own opinions, even though we did not have access to the underlying data either. Of course, without the benefit of the underlying data, most of our opinions were merely reiterations of our old opinions; hence they were probably irrelevant to Harvard's trial.

The difference between me and other concerned citizens was that I had the time to explore related data that was available from a public source: IPEDS (the Integrated Postsecondary Education Data System). I used IPEDS data to develop my own estimates of the increased share of Harvard's enrollments that Asian American applicants might gain if race-conscious affirmative action were prohibited. My estimates were much, much lower than the estimates provided by the advocates' statistical model.

This personal experience led me to the following "insights":
  • All public policy decision processes should be based on open source data, i.e., on data that is available to everyone from public sources.
  • If a lawsuit or any other kind of public policy decision process requires the use of data that was previously private (e.g., data in Harvard's application files), that data should be anonymized and posted on an open source website in a standard format. This will provide the plaintiffs, defendants, and concerned citizens like me and the readers of this blog with access to the same data.
  • Then we can all develop informed opinions and make our informed opinions heard if we feel strongly enough about what we have discovered. 
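
For readers who wonder what "anonymized and posted in a standard format" might look like in practice, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the field names (`applicant_id`, `name`, `email`, `address`), the `salt` value, and the choice of CSV as the "standard format" are hypothetical, not anything taken from Harvard's files or the trial record. Note also that dropping names and hashing ID numbers only pseudonymizes the records; genuine anonymization of admissions data would probably require further steps, because combinations of the remaining fields could still identify a person.

```python
import csv
import hashlib
import io

# Hypothetical field names; a real application file would have its own schema.
DIRECT_IDENTIFIERS = {"name", "email", "address"}


def anonymize_records(records, salt, id_field="applicant_id"):
    """Drop direct identifiers and replace the ID with a salted hash.

    This is pseudonymization only: the remaining fields are copied unchanged.
    """
    cleaned = []
    for record in records:
        # Keep every field that is neither a direct identifier nor the raw ID.
        kept = {
            key: value
            for key, value in record.items()
            if key not in DIRECT_IDENTIFIERS and key != id_field
        }
        # A salted SHA-256 digest gives a stable stand-in for the raw ID.
        raw = (salt + str(record[id_field])).encode("utf-8")
        pseudo_id = hashlib.sha256(raw).hexdigest()[:16]
        cleaned.append({"pseudo_id": pseudo_id, **kept})
    return cleaned


def to_csv(rows):
    """Serialize the cleaned rows as CSV text with a header row."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `to_csv(anonymize_records(records, salt="some-secret"))` on a list of dictionaries would produce text that could be posted as a file for everyone to download. The salt would have to stay private with whoever publishes the data; otherwise the hashed IDs could be reversed by guessing the original ID numbers.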

