Harvard, Others Saving Data As Trump’s Team Scrubs Federal Webpages

by admin

A number of organizations, including Harvard University, have begun taking steps to preserve data files and other information as the Trump administration continues its sweeping purge of federal web pages and the data to which they once provided access.

The rate at which federal government websites are being completely disabled or selectively censored by the administration has alarmed researchers, health officials, security experts and the general public, which has relied on government data for everything from public health warnings, scientific research, and environmental risk alerts to crime statistics and mental health information.

According to a February 2 story in The New York Times, more than 8,000 pages had been taken down from at least a dozen U.S. government websites, including those of agencies like the Centers for Disease Control and Prevention, the Census Bureau, the Department of Justice, the Food and Drug Administration, and the IRS. More have been deleted since then.

In some cases, entire websites have been disabled; in others, only portions have been purged to eliminate references to now apparently taboo topics like diversity or gender.

Now, various nonfederal organizations and agencies are stepping in to preserve the information the government is trying to hide from the public.

Last week, Harvard University announced that its library’s Innovation Lab was releasing its archive of data.gov as part of its data vault project. The collection already includes over 311,000 datasets retrieved from 2024 and 2025, representing “a complete archive of federal public datasets linked by data.gov.”

Reuters quoted Amanda Watson, Harvard Law School’s assistant dean for library and information services, as saying that the library’s rescue effort was about “upholding our fundamental belief that government information belongs to the public.”

Others are also attempting to save information that the administration is wiping from government webpages. For example:

  • The National Security Archive’s Climate Change Transparency Project has published a list of materials and information on climate change and environmental justice that have been deleted from agency websites.
  • The End of Term Web Archive has been retrieving and saving U.S. government websites at the end of presidential administrations since 2008. It is currently harvesting sites for its End of Term 2024 Web Archive.
  • MIT Technology Review recently identified a handful of other organizations working to preserve government files before they are lost forever.
  • Another resource is the Internet Archive’s Wayback Machine, which gives access to millions of websites across the internet in the form of a digital library.

Harvesting disabled government websites is an important, but difficult, task. Web crawlers can be a useful tool, but sometimes the work must be done by hand, a very time-consuming process that’s fraught with the possibility of missing important elements.

Of course, as essential as data recovery and preservation are, their value becomes increasingly limited as the data age and new information is not made available.

“All of this data archiving work is a temporary Band-Aid,” Christina Gosnell, co-founder and president of the Catalyst Cooperative, told the MIT Technology Review. “If data sets are removed and are no longer updated, our archived data will become increasingly stale and thus ineffective at informing decisions over time.”

The administration’s elimination of data from government websites is likely to have a number of dire consequences, including setbacks to scientific research, new security risks, an increase in misinformation, and a generally less-informed public. The administration may be able to hide data it disfavors, but that won’t eliminate the problems those data illuminate.
