Last week I spent three wonderful and intellectually stimulating days at the Forgotten Television Drama conference, an event organised by a number of my colleagues in the Department of Media Arts at Royal Holloway (you can read more about the wider project here and a report on the conference here). Given the topic of the conference and the aims of the project from which it emerged, there was a natural and understandable gravitation towards the early “analogue” histories of television drama. Although there was a wide variety of papers, the vast majority focused on programming from the birth of television to the late 1960s and early 1970s – the point at which the preservation of recordings became much more commonplace (although by no means universal) and the period in which, it could be argued, television was at its most ephemeral.
Of course, forgotten doesn’t necessarily mean lost – i.e. that no physical recording survives. It can also mean that, for one reason or another, viewers simply don’t remember certain programmes. However, this issue of access to material, or rather the lack of it, was a recurring theme of the conference. Indeed, many of the panels I attended ended with fascinating discussions about the limitations and difficulties of archival research – either the programmes themselves were missing or incomplete, or the other useful research materials surrounding them, such as correspondence and production notes, were similarly missing, incomplete, difficult to access, or even protected by copyright and therefore ineligible for direct inclusion in scholarly works.
I learnt a great deal from these conversations (particularly as I’m much more au fait with the more recent digital history of television) but I was struck by the lack of discussion around current issues of preservation. This observation is not at all intended as a criticism – on the contrary, the range and quality of papers across the three days was superb. Moreover, the conference itself and the History of Forgotten Television Drama project are both primarily concerned with the history rather than the future of television. Nevertheless, I was inspired by these discussions, and I want to take this opportunity to offer some of my own thoughts about digital preservation – particularly as they seem to resonate so clearly with the debates that permeated the conference as a whole. Before I do this, however, I should point out (or rather, confess) that the ideas that follow represent a much-truncated version of the paper that I presented at the conference itself.
For quite some time now, digital technologies have been an integral part of media culture. To use Jay David Bolter and Richard Grusin’s (2000) term, most digital versions of analogue technologies are part of a process of “remediation” in which new devices promise to address the shortcomings of their predecessors. In this instance, and specifically in the context of television, digital innovations such as Freeview, DVRs, DVDs, and streaming services promise a number of significant improvements: near-endless choice, instant and ubiquitous access, and, perhaps most importantly, the possibility of permanent storage. However, these promises are rarely, if ever, fulfilled. As soon as we begin to look beneath these utopian discourses, we discover that digital media are highly ephemeral. In fact, in many ways they are much more ephemeral than their analogue predecessors: hard drives regularly crash; digital files are prone to corruption or subject to software obsolescence; content is suddenly and unexpectedly removed from our favourite streaming service; the list goes on.
The increasing fragility of digital media has not gone unnoticed. Earlier this year, Vint Cerf, vice president of Google, warned us that we are entering a “digital dark age”, with the apparent repercussion that future historians will know very little about 21st-century culture. This highly provocative metaphor has a relatively long history that dates back almost 20 years (see its original formulation here) and is seemingly the result of a growing recognition of the ephemerality of digital media and, to a lesser extent, our increasingly casual and transient relationship with this material. In Google’s case, the threat of a “digital dark age” happens to reinforce their broader corporate strategy, but for media historians it raises some genuinely pressing questions. And so, just as the delegates at the Forgotten Television Drama conference regularly lamented the questionable preservation policies of early TV broadcasters, I can’t help but wonder if historians in the future will be having a similar conversation about the digital media culture of today.
Significantly, anxieties around the “digital dark age” are not only fuelled by a growing recognition of the fragility of digital media but are also exacerbated by the rate at which media (and information more broadly) is now proliferating. As such, there are at least two distinct conditions contributing to this supposed “digital dark age”: firstly, the ephemerality of digital media itself and the associated problems of access, storage and loss; and secondly, the often overwhelming volume of information now available, and the subsequent difficulties this presents when it comes to selecting, storing and archiving. This latter issue is further complicated by the fact that when we speak of the proliferation of television we are really speaking of its “content” (i.e. the programming) as well as its “information” (i.e. the data and metadata that surrounds or accompanies the main content). Indeed, there is a growing critical consensus that we now live in the era of “big data”: a term that refers to the rapid proliferation of various kinds of data and the increasingly integral, if often unseen, role that they play within economic, political, and cultural spheres.
The production of such large quantities of data might seem to contradict or, at the very least, alleviate some of the widespread fears about an impending “digital dark age”. In the context of digital television, we have potential access to a volume and variety of information that, until relatively recently, never existed. I should emphasise, however, that this is only “potential” access – and even this is probably a rather optimistic view. And herein lies one of several paradoxes of “big data” (and the “digital dark age” for that matter): the quantity of information may be rapidly expanding, but it is also increasingly ephemeral, hard to access, and difficult to interpret. As Mark Andrejevic (2014) has recently pointed out, corporations that gather information rarely make this data available to individuals (or media scholars). Instead, the data is only of value and/or makes sense when it is subsumed within a larger database of information. This has led to a culture in which access to information is highly uneven (favouring corporations rather than individuals) – a situation that Andrejevic (2014) has described as the “big data divide”.
Of all the recent digital developments within television culture, it seems to me that streaming services such as Netflix foreground these issues most clearly. Netflix is ephemeral in a number of ways that resonate with the broader debates around the “digital dark age” and “big data” – access is only ever temporary, subscribers do not have the option to purchase or permanently store content, the database is constantly changing, and the service itself produces an enormous volume and variety of data – much of which is either inaccessible or may even be disposed of once its purpose has been served. Given these ephemeral attributes, services such as Netflix present a number of problems for the preservation of digital media and, by extension, challenges for television historians. Perhaps the most obvious methodological barrier is the problem of access, or rather a lack of access, to information. In many ways, Netflix exemplifies Andrejevic’s “big data divide” critique. Despite a recent announcement that US ratings firm Nielsen will soon be publishing basic viewing figures for Netflix (and Amazon Instant), it turns out that this information is only inferred using the company’s somewhat fallible method of audience sampling. Even if Netflix were to provide Nielsen with precise viewing figures, this would only constitute a small fraction of the overall data produced by users of the streaming service. This lack of transparency is especially problematic given that “big data” is increasingly used when it comes to making decisions about the organisation and distribution of media content. As Reed Hastings, CEO of Netflix, explained in a recent interview in Wired magazine:
With a streaming service, we get a lot of signals about what and how people are watching … we know what we’ve shown to you — we know what we put on the screen as possibilities for you, what you snapped up or passed over in favour of something else.
As Hastings goes on to explain, this information forms the basis of Netflix’s recommendation algorithms. However, the company is also harvesting a wider variety of data, and increasingly using it to inform purchasing and commissioning decisions – an example of what Philip M. Napoli has recently called the ‘algorithmic turn in media production’ (2014). The most regularly cited example of this with regard to Netflix is its acquisition of House of Cards. In this instance, Netflix didn’t use an algorithm to conceive the show but rather to determine whether or not the particular combination of stars, director, producer, plot and genre would prove popular amongst its subscribers. While precise viewing figures aren’t readily available (to media historians, at least), House of Cards’ average rating of 4.5 stars, which is based on more than six million Netflix reviews, suggests, if anything, that the algorithm was right.
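To make this kind of decision-making a little more concrete, the logic described above – scoring a proposed combination of attributes against historical viewing data, rather than generating the show itself – can be illustrated with a toy sketch. Everything here (the attribute names, the engagement figures, the averaging method) is invented for illustration; it bears no relation to Netflix’s actual, proprietary models.

```python
# Toy illustration of scoring a proposed show against historical
# viewing data. All attribute names and figures are invented;
# this is NOT Netflix's actual algorithm.

# Hypothetical aggregate engagement scores (0-1) derived from
# past viewing behaviour, keyed by attribute.
historical_engagement = {
    "star:kevin_spacey": 0.81,
    "director:david_fincher": 0.77,
    "genre:political_drama": 0.64,
    "source:uk_remake": 0.58,
}

def score_proposal(attributes):
    """Average the historical engagement of a proposal's attributes.

    Attributes absent from the data contribute a neutral 0.5,
    reflecting uncertainty rather than dislike.
    """
    scores = [historical_engagement.get(a, 0.5) for a in attributes]
    return sum(scores) / len(scores)

proposal = ["star:kevin_spacey", "director:david_fincher",
            "genre:political_drama", "source:uk_remake"]
print(round(score_proposal(proposal), 3))  # 0.7
```

Even this crude sketch makes the key point visible: the model evaluates a combination of known quantities against past behaviour, which is a very different activity from creative conception.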
Clearly, we haven’t reached the stage where algorithms will entirely replace creative decisions – nor do I think we ever will. Nevertheless, these examples suggest that “big data” is playing an increasingly significant role in the organisation, delivery, commissioning, promotion and purchasing of digital television. At the same time, however, there exists a very problematic “data divide” in which distributors such as Netflix are not only gathering large quantities of information from their subscribers but are also keeping this data to themselves. Simply gaining access to this information won’t necessarily solve the problem. As Andrejevic (2014: 1676) notes, in contrast to large organisations, most individuals are generally not equipped to interpret such large bodies of information. Within the field of television studies, we tend to think of ourselves as “researchers”. However, as “big data” continues to make inroads within the media industries, we may need to add new skills such as “data mining” and “sentiment analysis” to our résumés.
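To give a concrete sense of what a skill like “sentiment analysis” involves at its very simplest, here is a minimal lexicon-based sketch. The word lists and reviews are invented for illustration; real tools rely on far larger lexicons or trained statistical models, but the underlying idea – turning free text about programmes into countable data – is the same.

```python
# Minimal lexicon-based sentiment sketch: count positive and
# negative words in a review. Word lists and example reviews are
# invented; real sentiment tools use much larger lexicons or
# trained statistical models.

POSITIVE = {"gripping", "brilliant", "compelling", "superb"}
NEGATIVE = {"dull", "predictable", "tedious", "slow"}

def sentiment(text):
    """Return 'positive', 'negative' or 'neutral' for a review."""
    # Lowercase, split on whitespace, and strip trailing punctuation
    # so that "Dull," matches the lexicon entry "dull".
    words = [w.strip(".,!?") for w in text.lower().split()]
    balance = (sum(w in POSITIVE for w in words)
               - sum(w in NEGATIVE for w in words))
    if balance > 0:
        return "positive"
    if balance < 0:
        return "negative"
    return "neutral"

print(sentiment("A gripping and brilliant political drama"))  # positive
print(sentiment("Dull, predictable and far too slow"))        # negative
```

Trivial as it is, the sketch shows why Andrejevic’s point about interpretation matters: gaining access to millions of reviews is only useful if researchers also have the methods to process them.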
It’s difficult to draw any conclusions about what this might mean for the future of television history, not least because we’re still in the midst of these changes. Perhaps, though, by drawing attention to the precarious nature of digital television culture as well as the various advantages and limitations of “big data”, we might be better equipped to prevent a “scholarly dark age” for future television historians.
JP Kelly is a lecturer in film and television at Royal Holloway, University of London. He has published work on the emerging economies of online TV in Ephemeral Media (BFI, 2011) and on television seriality in Time in Television Narrative (University Press of Mississippi, 2012). His current research explores a number of interrelated issues including the development of narrative form in television, issues around digital memory and digital preservation, and the relationship between TV and “big data”. He still hasn’t watched House of Cards despite the best efforts of the Netflix algorithm to convince him otherwise.
These issues are also being recognised within academia. One of Debra Ramsay’s recent contributions to CST, for instance, refers to what Amy Holdsworth and Andrew Hoskins have called the ‘digital archival myth of total access and accumulation’ (forthcoming: 31).
Incidentally, volume and variety constitute two of the “four Vs” that pose the greatest methodological challenges for those working with big data – the other two being the velocity of information, and the veracity of information. For more on this, see Bizer, Boncz, Brodie and Erling (2012).
Somewhat ironically, the article in which Andrejevic explores these issues of access is openly available to read here.