The Challenge of Preserving the Historical Record of #MeToo by Nora Caplan-Bricker (The New Yorker) March 11, 2019
Around the height of the #MeToo revelations, in the fall of 2017, I interviewed an archivist at a prominent research library for a piece about social-media preservation. It quickly became apparent that he knew less about the subject than I did; he saved Facebook posts by painstakingly copying and pasting them into Word, comment by comment, and manually pressing print. The longer we spoke, the more visibly annoyed he grew by my questions, to which he offered no answers. He leaned farther and farther back in his chair and gazed over my left shoulder. Finally, he launched into a story about his senior year of college, when he wrote his thesis on an ill-defined topic and slowly realized that he’d bitten off more than he could chew. At first, I nodded politely, not understanding how this related to anything. Then the intent of the anecdote flooded through me on a tide of adrenaline, and I moved my notepad to my lap so that he couldn’t see my hand shaking with rage.
Outside of the library, I stood in the cold and waited for my heart rate to return to normal. I wondered if the archivist was right that I was a naïve girl with inchoate ideas. Then I asked myself what gave him the power to make me wonder. Until then, I’d endured male condescension as if it couldn’t touch me, observing my small humiliations, across years in the workplace, from the other side of a cool pane of glass. But the barrage of #MeToo stories had stirred the sediment from those experiences, revealing that my self-regard had always been porous. The meeting with the archivist was barely a #MeToo moment—it was not sexual assault or harassment but rather a familiar kind of gendered belittlement. Still, it was the moment when #MeToo fully reached me.
When I consider why I feel compelled to publish this story—so small, and so many months after the fact—I arrive at the answer that I want to contribute, however meagrely, to the record of what it was like to live through that fall. The notion that the memory of #MeToo needs preserving—both because it matters and because it could disappear—is also the premise of a much larger archival effort. In June, the Schlesinger Library at Harvard University’s Radcliffe Institute, arguably the paramount repository of works on American feminism, announced its intention to collect the millions of tweets and hundreds of thousands of Web pages—news articles, legislation, changing H.R. policies, public apologies—that composed #MeToo and remain as its evidence. (Harvard faculty members of the steering committee for the #MeToo project include Jill Lepore, a staff writer for this magazine, and Jeannie Suk Gersen, a contributing writer.)
The undertaking has few major precedents. Only in the past decade have historians recognized the value of social media and libraries begun building tools to collect it at scale. (For seven years, starting in 2010, the Library of Congress vacuumed up every tweet, but that archive remains closed to researchers.) The Schlesinger has had to locate its own answers to a battery of technical and ethical questions as it prepares to allow access to its holdings, at least in part, by late 2019. For example, many of the women and men who shared #MeToo stories may have thought of them as ephemeral; they didn’t anticipate that they could become fodder for future theories of history. It’s a thorny dilemma but one that the library has no hope of solving if it doesn’t gather the posts before they disappear. “The argument is always in favor of preservation and against loss,” Jane Kamensky, the Schlesinger director and a Harvard history professor, told me. “As a historian, I believe we only understand things through primary evidence. Anything that stays dark is not going to be understood.”
I met with Kamensky and her team in January, in temporary offices a few blocks from the Radcliffe Quadrangle (where the library is undergoing renovation), for a virtual tour of what they’ve collected so far. She said that she hopes the data will go some way toward answering essential questions about #MeToo: What, if anything, has it accomplished? Was #MeToo a burst of revolutionary energy that has since flamed out, or is it a still-growing constellation of attempts to organize? Amanda Strauss, the special projects manager at the Schlesinger, said that their Twitter searches for #MeToo and related hashtags have continued to yield around a hundred and fifty thousand tweets every week, leaving them unsure about when to impose a temporal boundary on the archive—or where, in hindsight, historians will locate the end of #MeToo.
The archivists use a tool, created at George Washington University and called Social Feed Manager, to perform weekly downloads of roughly fifty hashtags, which capture both trending conversations (#MuteRKelly) and sector-specific ones (#MeTooMedicine, #TimesUpTech), and also the parallel universe of counter-denunciations that collect under banners like #MeTooLiars and #IStandWithBrett. On the day of my visit, they scrolled through the latest haul of tweets from the evangelical #ChurchToo movement to demonstrate how the app creates spreadsheets of the posts’ metadata (information such as the number of retweets, the number of followers, and geolocation, if user-enabled). Twitter’s A.P.I. permits this kind of free mass download for the first week or so after a post appears on the Web, and the library is working with the company to purchase the nearly nineteen million tweets from the first year of the hashtags’ use.
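To make the idea of a metadata spreadsheet concrete, here is a minimal Python sketch of turning tweet records into rows. It is an illustration only, not Social Feed Manager's code or export format: the field names and the two sample records are invented for the example.

```python
import csv
import io

# Hypothetical column names, chosen for illustration only.
FIELDS = ["hashtag", "retweets", "followers", "geolocation"]

def tweets_to_csv(tweets):
    """Write one spreadsheet row of metadata per tweet record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for tweet in tweets:
        # Keep only the metadata columns; geolocation is blank
        # unless the user enabled it.
        row = {field: tweet.get(field, "") for field in FIELDS}
        writer.writerow(row)
    return buffer.getvalue()

# Invented sample records, not real tweets.
sample = [
    {"hashtag": "#ChurchToo", "retweets": 12, "followers": 340, "text": "..."},
    {"hashtag": "#MeToo", "retweets": 3, "followers": 58, "geolocation": "Boston, MA"},
]
print(tweets_to_csv(sample))
```

The text of each post is deliberately left out of the rows, since the spreadsheet described above holds metadata rather than content.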
Other social-media sites present greater challenges: the Schlesinger can crawl individual Facebook and Instagram pages with a tool called the Webrecorder, but collecting them at scale is notoriously difficult. (Strauss told me that the issue of archiving these platforms more extensively is temporarily “on hold.”) The staff had planned to choose news articles and other URLs by hand before realizing that they needed to rethink their reach. Now they’re using Media Cloud, a tool developed by faculty members at M.I.T. and Harvard, to run vast searches of news stories that they can archive en masse.
As Jill Lepore has written in The New Yorker, most people assume that the Web’s contents will be with us forever. (“Don’t post that picture if you don’t want it to follow you!”) In fact, the Internet is among our most ephemeral inventions: a piece of paper can survive for seven hundred years, but “the average life of a Web page is about a hundred days,” she wrote. To Strauss, collecting #MeToo feels “like we’re in the middle of a ticker-tape parade, and this content is raining down around us, and we need to pick it up or it’s going to get swept up and put in the trash.” So far, the vulnerable sites shored up on Harvard’s servers include a Medium post about sexual harassment in the children’s-book industry, a whisper-network-sourced accounting of abuses in STEM, and a hyper-local news service’s investigation of the California state legislature. Ultimately, the collection will reach back to 2006, when the activist Tarana Burke began campaigning against sexual assault using the slogan “me too.”
The Schlesinger staff’s choices shape the archive to a degree that’s unusual, and a little unwelcome. Standard collections have implicit boundaries: they’re the papers of a person or an organization or perhaps the surviving documentation of a single event, such as the March for Life. But #MeToo—which drew strength from millions of sources and exerted influence in every direction—can only be captured in what archivists call a “constructed” or “artificial” collection, an assemblage of objects of disparate provenance, with borders that are imposed, not absolute. The Schlesinger has other constructed collections, such as a survey of early women’s blogs, but the #MeToo project is by far the most ambitious. It pushes the library to the edge of its traditional role in a field that draws a stark line between archivists, who leave as few fingerprints as possible, and scholars, who mold history from the assembled clay.
“It’s not the job of the archivist or librarian to try to answer questions like ‘Is #MeToo a movement?’ or impose our own conceptual framework on this material,” Jane Kelly, the Web-archiving assistant who harvests the bulk of the actual posts and pages, said. “My interpretation is a moot point.” But in this case the collectors can’t help but be curators, judging where #MeToo ends and where every other discussion of women, work, harassment, and violence begins.
Some Internet preservationists have been pushing brick-and-mortar libraries to embrace this evolution. “Traditional archivists seem most comfortable dealing with the outcomes of the work of various types of documenters,” Clifford Lynch, the director of the Coalition for Networked Information, wrote in an influential 2017 paper. But the Internet is more like “a nearly infinite number of unique, individual, personalized performances”—a new one every time you log on—than it is an assortment of artifacts. Lynch warns, “If archivists will not create, capture, curate the ‘Age of Algorithms,’ then we must quickly figure out who will undertake this task.”
The #MeToo project is one step in this direction, and it’s fitting that it occurs in the service of feminist history. Feminism’s contributions include a heightened awareness of history as a construct, its shape determined by what it leaves out. In her famous poem “Diving Into the Wreck,” Adrienne Rich, whose papers the Schlesinger holds, envisions the legacy of women’s lives as a hulk left to decay beneath the surface of the official past. She imagines herself swimming down “to see the damage that was done / and the treasures that prevail ... the wreck and not the story of the wreck.” Survivors used #MeToo to dredge their experiences for inclusion in history. The archive of their efforts will tacitly acknowledge, by way of its construction, that no record is ever objective or complete.
Another theme of feminist scholarship, and of #MeToo, is women’s agency: telling your own story on your own terms. In the archive, this principle competes with the belief that the stories are worth saving. It’s impossible to ask the individuals behind millions of Twitter accounts for permission to preserve their words, and, legally speaking, it’s also unnecessary under the company’s user agreement. Twitter says that account holders own their tweets, but the fine print reveals that they don’t control them. Many users don’t realize this: in a recent survey of two hundred and sixty-eight Twitter users, almost half were under the mistaken impression that their tweets couldn’t be gathered for research without their express consent.
At the Schlesinger, these concerns are shaping conversations about which parts of the archive will be accessible to researchers, and when. The library could choose to share Twitter data from verified accounts, or accounts that reach a certain threshold number of followers, under the assumption that they had little expectation of privacy. In theory, it could allow keyword searches in which a researcher “asks the data a question and gets back an answer,” Kelly said, even as the actual tweets remain out of sight. The library quite possibly would remove content from view of the researchers—though not from the collection—if petitioned by its author. And the staff has considered the possibility that parts of the collection could be requested in the course of ongoing #MeToo litigation.
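The access options under discussion can be sketched in a few lines of Python. This is a hypothetical model, not the library's system: the `Tweet` fields, the follower cutoff, and the sample records are all assumptions made for illustration, and the article does not say which of these rules, if any, the Schlesinger will adopt.

```python
from dataclasses import dataclass

@dataclass
class Tweet:
    text: str
    verified: bool
    followers: int
    withdrawn: bool = False  # author petitioned to remove it from view

# Hypothetical cutoff; the article names no number.
FOLLOWER_THRESHOLD = 10_000

def viewable(tweet):
    """True if a researcher could read the tweet under this sketched policy."""
    if tweet.withdrawn:
        return False  # hidden from view, but still in the collection
    return tweet.verified or tweet.followers >= FOLLOWER_THRESHOLD

def count_matches(tweets, keyword):
    """Answer a keyword question with a count, never the tweets themselves."""
    needle = keyword.lower()
    return sum(1 for t in tweets if needle in t.text.lower())

# Invented placeholder records, not real posts.
collection = [
    Tweet("placeholder post about workplace policy", verified=True, followers=500),
    Tweet("placeholder post about a workplace", verified=False, followers=120),
    Tweet("placeholder post, later withdrawn", verified=True, followers=90_000, withdrawn=True),
]

print([viewable(t) for t in collection])
print(count_matches(collection, "workplace"))
```

One design choice is left open here: `count_matches` searches the whole collection, including withdrawn posts, because only a number is returned. A stricter policy could exclude withdrawn posts from counts as well.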