Executive Summary
This research report explores archiving practices and policies at newspapers, magazines, wire services, and digital-only news producers, with the goal of identifying the current state of archiving and potential strategies for preserving content in the age of digital distribution. Between March 2018 and January 2019, we interviewed 48 people from 30 news organizations and preservation initiatives.
What we found is that the majority of news outlets have given no thought to even basic strategies for preserving their digital content, and not one was properly saving a holistic record of what it produces. Of the 21 news organizations in our study, 19 were taking no protective steps at all to archive their web output. The remaining two lacked formal strategies to ensure that their current practices have the kind of longevity needed to outlast changes in technology.
Meanwhile, interviewees frequently (and mistakenly) equated digital backup and storage in Google Docs or content management systems with archiving. (They are not the same: backup means making copies for data recovery in case of damage or loss, while archiving refers to long-term preservation, ensuring that records remain available even as formatting and distribution technologies change in the future.)
Instead, news organizations have handed over their responsibilities as public stewards to third-party organizations such as the Internet Archive, Google, Ancestry, and ProQuest, which store and distribute copies of news content on remote servers. As such, the news cycle now includes reliance on proprietary organizations with increasing control over the public record. The Internet Archive aside, the larger problem is that these companies' incentives are neither journalistic nor archival, and can conflict with both.
Although many news-archiving technologies are being developed by individuals and nonprofit organizations, it is worth noting that preserving digital content is not primarily a technical challenge. It is, rather, a test of human decision-making and a matter of priority. The first step in any archival process is saving the content in the first place. News organizations have to get there.
The findings of this study should be a wake-up call to an industry that asserts democracy cannot be sustained without journalism, the very claim that gives it legitimacy as a watchdog for truth and accountability. At a time when journalism is already under attack, stewardship of its record and its future matters more than ever.
Local, independent, and alternative news sources are especially at risk of going unpreserved, threatening to leave critical exclusions in a record that will favor dominant versions of public history. As the abrupt shutdown of Gawker in 2016 demonstrated, without archiving practices in place, content can be seized and disappear in an instant.
Key Findings
- The majority of news organizations that participated in this research (19 of 21) had no documented policies for preserving their content. They did not even have informal or ad hoc archival practices in place.
- Beyond failing to archive the articles published on their own websites, none of the news outlets we interviewed were saving their social media posts, including tweets and posts on Facebook, Instagram, or any other social platform. Only one was taking the steps needed to address the problem of archiving dynamic, interactive news apps.
- Digital-only news organizations were even less attuned than print publications to the importance of preservation. A persistent confusion that backing up to third-party cloud servers is the same as archiving means that very little effort is currently going into preserving the news.
- When we asked interviewees why they believed news organizations were not archiving content, they said again and again that journalism is primarily focused on "what's new" and "what's happening now." Journalists and their news organizations are more interested in saving the documentation that makes a story accurate than in preserving what is ultimately published.
- As a result, the platforms and third-party vendors that increasingly host news content on their closed servers control the elements needed for holistic preservation, with no journalistic incentive to release them.
- News organization staff often said they rely on the Internet Archive, a nonprofit digital library that maintains hundreds of billions of web captures, to preserve their own publications, even though web archiving is limited in the formats it can capture and saves only a fraction of what is published online.
- News apps and interactive features, in particular, are at high risk of being lost, as the new technologies they are built on become obsolete before anyone thinks to save them. Newsroom developers and emulation-based web archiving tools can be valuable allies in preserving these resources, along with other endangered ones.
- There are a number of other archiving initiatives, by both individuals and nonprofits, that news managers can learn from or obtain services through, including PastPages by Ben Welsh, NewsGrabber by Archive Team, and Archive-It by the Internet Archive. According to news organizations, for digital archiving efforts to succeed, the process must be simplified in terms of both implementation and workflow.
- Partnerships among archivists, technologists, memory institutions, and news organizations will be indispensable to building better practices and policies that ensure future access to news content distributed in digital form. Collaboration among all parties should start with two questions: What should be preserved? Who should preserve it?
- Building robust digital archives means asking thorny questions, such as how often to capture a copy of a constantly updated homepage, whether personalized content and newsletters should be preserved, and what to do with reader comments and social media posts.
- To spark lasting change, it will be essential to find thought leaders in the field who can help present archiving ideas in a way that makes sense to staff, as well as to the people in leadership positions who must ultimately be convinced of its benefits and its compatibility with their goals and priorities.
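For newsrooms wondering whether the Internet Archive already holds a capture of a given story, the Archive exposes a public availability endpoint (https://archive.org/wayback/available) that returns the closest snapshot as JSON. The sketch below is ours, not an official client; the function names are our own, and the response parsing assumes the documented `archived_snapshots` shape.

```python
import json
import urllib.parse
import urllib.request

WAYBACK_API = "https://archive.org/wayback/available"

def parse_availability(data):
    """Pick the closest capture out of an availability API response.
    The API returns {"archived_snapshots": {}} when nothing is archived."""
    snap = data.get("archived_snapshots", {}).get("closest")
    return snap["url"] if snap and snap.get("available") else None

def closest_snapshot(url, timestamp=None):
    """Ask the Wayback Machine for the capture of `url` closest to
    `timestamp` (YYYYMMDDhhmmss), or the most recent one if omitted.
    Returns the snapshot URL, or None if the page was never captured."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    query = WAYBACK_API + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(query, timeout=30) as resp:
        return parse_availability(json.load(resp))
```

A call such as `closest_snapshot("example.com")` would return a `web.archive.org` URL when a capture exists, which a newsroom could use to audit how much of its back catalog the Archive actually holds.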
Introduction
The production process for print journalism, particularly between roughly 1950 and 1990, consisted of a set of steps refined over decades, beginning with a reporter's discovery of a story and ending with distribution to the public. A single news outlet handled most stages of production, including archiving. In many newsrooms, an in-house librarian was one stop in that production process, guaranteeing a degree of future access by clipping individual stories from the paper and filing them on site by subject keyword in a "morgue" (the space allocated to clippings).
Entire back issues of newspapers were also frequently stored on site in multi-story buildings. Librarians at some outlets negotiated contracts with ProQuest and other commercial information companies to microfilm their publications. These, or the original print versions if they skipped the microfilming step, were delivered to the Library of Congress. Even less systematically and completely, broadcasters taped basic recordings of past radio and television news segments, or even entire episodes, as did many other media outlets. Wire services managed print versions of stories according to their own internal criteria.
That infrastructure began to crumble in the mid-1990s with the widespread adoption of the internet and the multifaceted production of online news. Today, a single news product often includes no fewer than half a dozen elements, among them a headline, a byline, text, and images, along with comments, interactive features, embedded video, and outbound links. In addition, a reporter or editor will post links to stories on external sites such as Facebook, Twitter, Instagram, and other third-party platforms. While the internet has created a dynamic news infrastructure, very little digital content is being archived, and the old models can no longer guarantee long-term access. Although some journalists recognize the risk of losing content, they continue to rely on content management systems (CMS) or cloud-based servers to store their work. These are practices they conflate with preservation, and which, we argue, are not the same thing.
This report extends previous scholarship and surveys that provide much-needed context for examining the digital infrastructure that now defines news production, as well as the role the internet has played in archiving practices and policies. For example, having a staff member dedicated to archiving can be one of the most decisive factors in whether something is saved for the future, according to the book Future-Proofing the News: Preserving the First Draft of History, a formative survey of media preservation efforts over the past 300 years that takes into account the cultural and systemic issues at play.
In plain terms, former newsroom librarians Kathleen A. Hansen and Nora Paul detail the infrastructure required to preserve the news, including the capture of newsprint on microfilm and in digital form. Even though it is a widespread practice that news organizations and cultural institutions may regard as a good alternative to archiving, the authors note that "the notion of a 'preservation' version of anything in digital form is very problematic."
In the same vein, "Missing Links: The Digital News Preservation Discontinuity" connects current challenges to an institutional history that did not prioritize preservation. The 2014 report and its accompanying survey trace the development and management of news archives, from the morgue to digital repositories, looking for evidence of the intent to preserve. Its authors found little, writing: "While the market rewards breaking news, managing previously published news content has always been someone else's problem, most often a librarian's."
Along with archivists and curators, librarians are the ones who care most about protecting the newspaper as a cultural record, the report argues. Librarians were replaced by automated information-retrieval systems and services, including electronic news libraries, commercial databases, and "electronic morgues" compiled from newspapers by private companies. Computer automation in newsrooms increased and, as journalists gained access to information independent of news libraries and their keepers, the role of librarians diminished, according to this account.
Around the same time, local newspapers and a few big metros stopped clipping. After successive rounds of layoffs and buyouts over recent decades, few librarians still work in newsrooms. Since they no longer oversee news preservation, decisions are made by newsroom staff, including reporters, editors, and management, who often do not recognize the historical value of news content.
The pace of digital news is fast, and stories are continuously updated. Journalists are unlikely to slow down and look back at what was published yesterday when, in the words of one daily newspaper editor, "the momentum is always forward." They are even less likely to stop and determine which version of a story to save, and whether to include photos, interactives, and databases. This presents a significant point of failure.
In the best cases, newsrooms that dismantled their morgues or libraries donated the material to historical societies or libraries instead of throwing it away. Even so, collections can be incomplete because clippings were stolen or mishandled. The remaining content may survive on microfilm, in which case it can be digitized, if funding is available. Although we found it was not uncommon for news organizations to have digitized versions of newspapers covering most of the twentieth century, and even the mid-to-late nineteenth century, they had little or nothing of the newspapers published in the twenty-first century.
In 2011, the Center for Research Libraries traced the ways the digital environment was affecting news preservation. In 2014, the Journalism Digital News Archive project began organizing a series of "Dodging the Memory Hole" forums at the University of Missouri's Reynolds Journalism Institute. A collaboration among researchers, activists, archivists, librarians, journalists, and technologists, the forums the project organized focused on strategies, models, and ways of raising awareness of the importance of digital news preservation.
The initiative's name comes from George Orwell's 1984, in which photographs and documents that conflicted with Big Brother's shifting narrative were dropped into a "memory hole" and destroyed. According to Journalism Digital News Archive founder Ed McCain, today's memory hole is largely the unintended result of technology systems that are not designed to keep information for the long term. In digital newsrooms, he added, a software or hardware crash can wipe out decades of text, photos, videos, and applications in a split second. Digital archives can easily become obsolete as the formats and digital systems used by modern media evolve, to say nothing of media failures, bit rot, and link rot. As part of "Dodging the Memory Hole," McCain organized a series of events bringing together media companies, memory institutions, and other stakeholders committed to preserving, as the site puts it, the "first draft of history" created in digital formats.
In her 2015 article "Preserving News Apps Presents Huge Challenges," Meredith Broussard described news apps (interactive news applications, pieces of born-digital journalism, or software custom-built to tell a story) as an understudied preservation challenge more pervasive and insidious than the fragility of high-acid newsprint. She characterizes them as distinctly interactive and exploratory pieces of software that create an experience that could not have been built within the constraints of a conventional content management system. These characteristics set them apart from other born-digital news content and make them especially difficult to archive and preserve. Understanding the scale of the problem is itself difficult.
Preservation priorities and criteria are lacking in part because efforts to document the number and nature of news apps, and to assess how many have disappeared or need preserving, have been incomplete. News apps also present additional, daunting challenges beyond the legal, technical, and financial obstacles that have stood in the way of preserving print and online content. This puts many terabytes of born-digital news content in danger of being lost down the "black hole of technological obsolescence," as Broussard and her colleague Katherine Boss put it. As we detail in this report, the solution is to emulate the environment the apps were originally built for, so that they can be played back in the future.
Educopia, an institute that fosters collaboration among libraries, museums, and other cultural memory organizations to advance archiving efforts, has also contributed to making digital preservation easier, with its 2014 "Chronicles in Preservation" project. The resulting report provided guidance on strategies, workflows, and tools for preparing news content for a preservation repository, and compared the MetaArchive-LOCKSS, Chronopolis-iRODS, and UNT-CODA technical approaches to digital archiving.
Despite differences in their methods and priorities, these initiatives reveal a policy vacuum in the United States. At present, American newspaper preservation policy revolves around copyright for printed newspapers. Otherwise, where policy exists, it is inconsistent and selective. Overall, very few of the nation's newspaper back issues can be obtained in digital form, with the exception of a few large metropolitan dailies, such as The New York Times, and national magazines that provide access through their websites.
Otherwise, what is available is scattered across newsrooms, heritage institutions, and commercial vendors, notably Google. That leaves out a vast range of publications, including reporting by communities of color and LGBTQ groups, alternative weeklies, and the newsletters newsrooms now use to sidestep platforms' influence over distribution. Notably, we spoke with one alternative weekly that carefully managed its print content but paid no attention to preserving the stories it published online.
In his 2018 book, Networked Press Freedom: Creating Infrastructures for a Public Right to Hear, communication scholar Mike Ananny examines how news organizations can work against preservation by exploiting the very affordances that put digital content at risk. He describes how U.S. News & World Report deleted much of the content produced before 2007 after switching to a new content management system. Similarly, he notes that BuzzFeed erased more than 4,000 posts in 2014 that no longer met its new editorial standards. Both incidents raised public concern about a news organization's attempt to alter its past and, by extension, the public record.
The journalists we interviewed often cited the case of Gawker, which we discuss elsewhere, as a cautionary tale illustrating the precarity of digital news. Gawker's shutdown prompted the Freedom of the Press Foundation and the Internet Archive to work on gathering at-risk news sites into a repository called "Threatened Outlets." But news preservation has long dogged the industry and cultural heritage institutions. Accordingly, the findings of this report are consistent with earlier work highlighting the institutional, organizational, political, and technological dimensions of news preservation. We build on that prior work to examine the growing role of platforms in the news publishing infrastructure, as well as the evolution of the web's infrastructure as APIs (application programming interfaces) become commonplace, further loosening publishers' control over their content and raising the question of how to preserve what is served dynamically from an external site.
Previous Tow Center research identified an unequal balance of power between publishers and platforms, noting that "platforms have increasing power over formats and data, including publishers' editorial strategies, distribution strategies, and workflows." Google provides an example of this trend as the platform continues to expand into journalism, with deals between Google Cloud Platform and Telegraph Media Group, as well as the management of millions of photos for The New York Times. How this affects the long-term record, however, has not been addressed and continues to evolve.
Social media represents an extremely unstable form of news, and one poorly understood by journalists, who can rarely account for the scale of what they publish on platforms. They also forget that, even as its news value continues to be debated, social media content belongs to the platforms. The tech giants can simply flip a switch and make every trace of strategic, platform-centered publishing arrangements, arrangements that undoubtedly shape editorial priorities and values, disappear. Of course, some news organizations might prefer that state of affairs; meanwhile, historians, academics, and researchers would lose the evidence of how audiences once engaged with news organizations online.
In short, newsrooms are currently doing very little to nothing to preserve digital news. That said, our point of departure is the recognition that newsrooms are still adapting (however poorly) to digital news, and that born-digital content is not merely ephemeral; it is multifaceted, unstable, and malleable. Rather than a crisis that technology can solve, we understand the problem to be structural. While technology can help with digital preservation, human action is the first imperative. For that reason, we devote the first part of this report to the perceptions of preservation that journalists shared with us.
The next section describes the complexity of preserving digital news, while the section after that covers the people and collectives making active progress in the field. Finally, the appendix provides additional archiving resources for newsrooms. In that spirit, we also offer this report as a tool for discovering new ideas about maintaining news, in any medium, for the future.
Methodology
This research project on news archiving began in November 2017 at the Columbia Journalism School and was conceptualized during Platforms and Publishers: Policy Exchange Forum IV at Stanford University, organized by the Tow Center for Digital Journalism and the Brown Institute for Media Innovation. The one-day conference, "The Public Record Under Threat: News and the Archives in the Age of Digital Distribution," discussed the importance of news preservation in the digital age and included audience members from the journalism community, as well as archivists and librarians.
The main research questions driving our work sought to examine archiving practices and policies at news organizations. In the course of this inquiry, we asked questions such as: How do news organizations save content, including print, digital, social media, and multimedia, as well as video, images, and sound recordings? What is their workflow? What technologies do they use, if any?
Another line of inquiry focused on the decision-making process. Whom do newsrooms consult about archiving strategies, and how do those consultants define the strategies and models? How do they perceive the importance of preservation, and what technology do they use for it? What preservation policies exist within and across platforms? Where do the responsibilities of journalist, editor, programmer, and consultant begin and end?
For the purposes of our research, we built a list of US-based news outlets of varying sizes and formats, much larger than what became our pool of consenting interviewees. We then added professionals and experts in digital preservation in general, and news preservation in particular. For example, we included the Internet Archive because of its central role in preserving news content. We also included private companies and vendors offering newsroom services that include preservation and archiving features.
In March 2018, we began scheduling interviews. We introduced ourselves as researchers from Columbia University and fellows at the Tow Center for Digital Journalism. Between March 2018 and January 2019, we interviewed 48 people from 30 organizations. Of these, 21 were interviews with members of news organizations, three with memory institutions (such as libraries), two with technology companies offering archiving services to news outlets, and five with professionals working on and studying the issue of news preservation. For the majority of news organizations in our sample, we interviewed more than one staff member. In some cases, we interviewed as many as four staff members together, at the same time.
We asked them to tell us about the challenges and opportunities of preserving news content. Finding the right person at news organizations was difficult. We began by asking whether the organization had a news librarian position. In the absence of a dedicated news librarian, we asked for help finding someone well placed to speak with us about archiving and content preservation within the organization. In many cases, that person turned out to hold a senior editorial position. In other cases, we were referred to the research department or to the person in charge of what remained of the morgue and the clip files. Although we asked all participants to meet with us in person, they often asked to be contacted by phone.
We promised participants anonymity. When anonymity was not feasible, as in the case of the Internet Archive, we asked for permission to use the organization's name. Otherwise, all names and identifying information were anonymized and kept confidential, and participants' answers are not identifiable in the report.
The majority of the interviews we conducted were in person. When that was not possible, we conducted interviews by video or phone. Interviews lasted between one and three hours. The conversations were semi-structured and centered primarily on preservation and archiving practices and policies. With participants' consent, all interviews were recorded and transcribed. We kept all copies of the recordings encrypted on our own personal computers. We analyzed the data using thematic analysis, a qualitative research method. First, we read the interview transcripts and identified several themes that recurred across the majority of interviews. Perceptions about the value of news and its preservation, which we discuss in the next section, were prominent among them.
Perceptions of News Preservation
In this section, we discuss some of the common perceptions about preserving news content and news archives, as reflected in the interviews we conducted with staff at news organizations, libraries, and archival initiatives. Identifying these shared ideas can be a first step toward raising awareness of the perceptions, and misconceptions, surrounding news archives and the preservation of news content.
"If It's in a Google Doc, It's There Forever. Right?"
The majority of participants we interviewed said they relied heavily on cloud services, particularly Google Docs and Gmail, in their daily work. Platform adoption in the newsroom aligns with previous studies, which conclude that platforms and technology companies are involved in every aspect of journalism. While the nature of the relationship between publishers and platforms has been widely debated with respect to content distribution, advertising revenue, and the need to optimize search engine performance, these studies overlook the ways newsrooms rely on the tools and services provided by digital platforms for content backup and storage.
Some of the participants we interviewed expressed concern about using Google or Amazon, saying that the decision to use these services had been discussed and revisited in editorial meetings. For example, one problem raised in our interviews was that when journalists left a news organization, they retained ownership of the Google Docs they had created and could therefore revoke sharing privileges, making the content inaccessible to others in the newsroom.
Despite these concerns, most interviewees focused on the productivity affordances of these services, largely ignoring the implications of relying on platforms to store news content, and of ceding control over that data.
Given that most newsroom correspondence happens through digital platforms and that news content is published primarily in digital form, relying on cloud services seemed almost natural to newsroom staff. The practice of actively archiving news content to ensure its long-term availability, work once performed by news librarians, seemed redundant to our interviewees, because “if it's on Google Docs, it's kind of there forever.” This sentiment, which we heard from many participants, reflects not only the organizational changes in a news industry that has let go of news librarians and archivists over several rounds of layoffs, but also how staff at news organizations perceive their own responsibility (or lack thereof) for preserving published reporting.
When we asked about news content published on social media, the most frequent answer was that Facebook, Twitter, or Instagram was keeping it. One person stated: “We don't archive our own tweets or anything like that. If Twitter decided to close up shop tomorrow and shut down its databases, we would lose all of that.”
Although very few people worried that Google might not exist in 10, 20, or 30 years, nearly every participant shared personal stories and experiences of losing information. In an interview at a nonprofit news organization, the editor described losing access to an old email account as deeply traumatic. A magazine editor who had tried to access articles written ten years earlier said they had simply “disappeared.” Another editor raised concerns about interactive content published five years ago that no longer works.
Even though the experience of losing content came up in nearly all of our interviews (and sometimes several times within a single interview), when we asked what participants had done to prevent information loss, few had an answer. In other words, they had taken no steps to minimize the loss of news content in the future, despite being aware that digital news is inherently difficult to maintain.
In one interview, an editor working for a digital-only outlet described the detailed efforts made to meticulously document every step of their reporting, including paid tracking software and documentation devices. In the end, they did not save the published story. When asked why, they answered that “it was all kept on the website.” Likewise, none of the content creators we spoke to had attempted to download the stories they had written (or edited) and keep them on separate storage devices. Instead, the majority relied on their outlets' websites, a content management system (CMS), or Google to retrieve old articles.
The challenge of preventing the loss of digital content was raised only in relation to personal experiences, never as an organizational hurdle to address. This disconnect, between understanding that digital information tends to disappear absent active preservation efforts and taking no steps to prevent it, was evident across all the interviews we conducted.
“News is about what is new and now, in the present, not the past”
Given the lack of preservation policies in place at the majority of the news organizations that participated in our research, it was difficult to find the right person to interview. Out of the 21 news organizations, only six employed news archivists or librarians. None of the digital-only outlets had a news librarian or archivist on staff. When those positions did exist, they included additional responsibilities, taking the focus away from the work required for preservation. Furthermore, most people we talked to associated archiving with print news.
Digital news outlets think differently about preserving content than organizations that are still publishing print editions. As an editor at a digital-only outlet explained, “The difference between digital-only and print organizations is that we try to keep everything in circulation. We don’t [have] to preserve things for the record. We have to keep the record publicly available.” In this editor’s view, the primary job of digital-only news organizations is to get fresh news content into circulation on the internet, where it can be found by anyone looking.
While those outlets that publish a print edition seemed more attuned to preservation and the stewardship of the news, digital-only publishers often conceived of their companies as content-generators whose work had limited long-term value and life span. As an editor at a leading news organization told us, “In the worst-case scenario, we will always have our papers as evidence.” Those companies that produced print editions were also more likely than digital-only outlets to cooperate with third-party vendors such as NewsBank, a database company that provides archives of media publications as reference materials to libraries.
When we asked why news organizations are not preserving digital content, we heard the same answer from most: reporting and producing news is about what is happening now, while archiving and preservation were perceived as acts of the past. “Archives are all about old news,” one person said.
It quickly became clear that staff at news organizations did not believe it was their responsibility, as reporters or editorial staff, to preserve content for the future. One editor explained it to us this way, saying: “Who cares what existed 10 years ago? I need my thing now. And so, for better, for worse, if there was some value in [archiving], I probably got a better value out of the new thing.” In other words, journalists’ time is better spent working on short-term needs and news cycles rather than ensuring access to historical content in the future.
The addition of data science professionals in the newsroom and the influence of Silicon Valley startup culture on the journalism profession has no doubt informed this attitude. One data editor told us:
I just think that it is part of the ethos that things are disposable, sites are disposable. If I didn’t treat some of our sites as disposable or our apps as disposable, I would kill myself. I mean trying to keep this stuff running . . . we do have to just let stuff go. Don’t get me wrong, I love history but it’s not on my priority list.
One significant outlier in the digital news archiving arena is The New York Times, which is currently working on building out its own web archive at https://archive.nytimes.com. Its aim, the site notes, is to preserve “the original web presentation of articles” and interactive projects “for posterity” by hosting “a copy of the HTML of NYTimes.com pages from when they were first published.” This is an effort, it should be emphasized, that has required careful consideration among staff around defining specific goals for archiving.
In 2018, when the newspaper moved to a new version of the system that had long powered its website, senior product manager Eugene Wang told Nieman Lab: “There was one path we could’ve taken where we’d say: We have all these articles and can render them on our new platform and just be done with it. But we recognized there was value in having a representation of them when they were first published. The archive also serves as a picture of how tools of digital storytelling evolved.”
Archive versus Backup
Multiple participants reported having backups of story versions, though they are not available to the public. Many interviewees, however, were unable to distinguish between these backups and an archive, or note the difference between storage and preservation. This confusion stood out in the majority of our interviews. It was evident that participants perceived Google Docs, Amazon cloud servers, or their company’s content management system as equivalent to archives, rather than as mechanisms for storage. Given the focus we heard about the present—about what is new—combined with the Silicon Valley emphasis on iteration now increasingly common in newsrooms, the lack of distinction between backups and archives did not surprise us.
However, the difference between the two is an important one. As the literature about digital preservation for archive libraries and museums illustrates, backups maintain continuity of an organization and the ability to recover and restore information, rather than ensure long-term access. Backup strategies do not take into account potential future hazards such as obsolescence of hardware, outdated data formats and storage media, or obsolete software. Therefore, while backup and recovery strategies are a key component of adequate archiving, backing up information is not enough to ensure ongoing access and cannot be considered an archiving policy.
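The distinction can be made concrete in a small sketch: a plain backup only copies bytes, while even a minimal preservation record also captures fixity (checksum) and format information that lets a future custodian detect corruption and plan format migrations. The function and field names below are our own illustration, not any organization's actual workflow.

```python
import hashlib
import tempfile
from pathlib import Path

def make_preservation_record(path: Path) -> dict:
    """Record what a plain backup omits: a fixity checksum and
    format information that future migrations can be checked against."""
    data = path.read_bytes()
    return {
        "filename": path.name,
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        # A real archive would record a format-registry identifier
        # (e.g., a PRONOM ID); the extension is a stand-in here.
        "format": path.suffix.lstrip(".") or "unknown",
    }

def fixity_ok(path: Path, record: dict) -> bool:
    """Verify the file still matches its recorded checksum."""
    return hashlib.sha256(path.read_bytes()).hexdigest() == record["sha256"]

# Demo with a temporary "story" file.
with tempfile.TemporaryDirectory() as d:
    story = Path(d) / "story.html"
    story.write_text("<h1>Breaking news</h1>")
    record = make_preservation_record(story)
    assert fixity_ok(story, record)          # intact copy verifies
    story.write_text("<h1>Silently altered</h1>")
    assert not fixity_ok(story, record)      # silent change is detected
```

A backup regime would happily propagate the altered file; a preservation regime notices the mismatch and can restore from a verified copy.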
The confusion means that not only is very little being done currently to preserve the news, but also that the prospects for the survival of some of the most significant reporting published online today are shaky. In the past, people had faith that paper or microfilm copies would survive, even if a news organization did not. In contrast, when asked what digital content they believe will still be available in 20 years, few interviewees were optimistic. In the words of one chief editor, “I am not sure we are prepared for the day that [name of the news organization] is no longer a thing.”
When asked to conceptualize what archives should be, participants were unsure and conflicted. One editor described archiving as the “preservation of editorial intent or preservation of fidelity.” That is, it is more than a technical feat; it is a challenge to maintain the integrity and context of the publication. In a different interview, an editor of a digital-only news outlet called the question of archiving content on the internet philosophical, asking whether it is even possible to preserve an experience. For them, the web is far more dynamic for news consumers than television or print ever was.
Our interviewees are not alone in their estimations of the significant issues associated with archiving the web. As Hansen and Paul told us, “It's like nailing smoke to the floor. No one can do it, and they are not even sure they want to or need to do it.” But even as web archiving raises theoretical, philosophical, and methodological challenges, the first step in tackling the process is the intention to save content. Most news organizations not only lack the interest, but are unsure whether archiving digital content is even possible.
Archive.org
The majority of the participants mentioned using archive.org, also known as the Internet Archive (IA) or the Wayback Machine, to find content that was no longer available through a search engine or on a publication’s website. When we asked about a specific story or content that had disappeared, the common answer was: “It is probably available at archive.org.”
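Such lookups are easy to script because the Wayback Machine addresses snapshots with a public URL scheme: a 14-digit UTC timestamp (YYYYMMDDhhmmss) prepended to the original URL, where requesting any timestamp redirects to the closest capture actually held. A minimal sketch (no network access; the example URL is hypothetical):

```python
from datetime import datetime, timezone

def wayback_url(page_url: str, when: datetime) -> str:
    """Build a Wayback Machine lookup URL. Snapshots are keyed by a
    14-digit UTC timestamp; the service redirects to the nearest capture."""
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"https://web.archive.org/web/{stamp}/{page_url}"

print(wayback_url("https://example.com/story",
                  datetime(2015, 6, 1, tzinfo=timezone.utc)))
# → https://web.archive.org/web/20150601000000/https://example.com/story
```

Of course, the URL resolving is no guarantee that the page was ever captured, which is exactly the false comfort discussed below.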
Some interviewees cited the Internet Archive not just as a tool that they use in their journalistic work, but also as a model for how to archive online content and preserve the news. We repeatedly heard, “Thank God for the Internet Archive.” This reverence was certainly in part a result of a recent series of attempts by high-profile people and publications to delete websites and social media, raising the profile of the Internet Archive for journalists and casual internet users alike. But the organization is also the recognized industry leader in digital preservation, dwarfing other initiatives. Founded in 1996, the Internet Archive envisions itself as a modern-day Library of Alexandria, providing “universal access to all knowledge.”
The organization’s founder, Brewster Kahle, was an early proponent of digital preservation and started the organization to, in his words,“archive the internet.” As a result, the majority of what the organization collects is digital. The main mechanism for that collecting is a procedure called crawling, similar to the way Google “crawls” and indexes the web.
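Crawling, at its core, means fetching a page, extracting its links, and repeating to a chosen depth; anything not reachable within that depth is never captured. The toy sketch below (hypothetical pages, standard library only) shows why a shallow crawl misses deeper material such as a story's comment page:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect href targets from anchor tags — the discovery step
    a crawler repeats to find pages worth capturing."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# A toy "site": pages the crawler can only find by following links.
pages = {
    "/": '<a href="/story-1">Story</a> <a href="/about">About</a>',
    "/story-1": '<a href="/story-1/comments">Comments</a>',
    "/story-1/comments": "reader comments here",
    "/about": "about page",
}

def crawl(start: str, max_depth: int) -> set:
    """Breadth-first crawl to a fixed depth; anything deeper
    (like the comments page at depth 2) is never captured."""
    seen, frontier = {start}, [start]
    for _ in range(max_depth):
        next_frontier = []
        for url in frontier:
            parser = LinkExtractor()
            parser.feed(pages.get(url, ""))
            for link in parser.links:
                if link not in seen:
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return seen

print(sorted(crawl("/", max_depth=1)))
# → ['/', '/about', '/story-1']
```

A depth-1 crawl of this toy site never reaches `/story-1/comments`, illustrating on a tiny scale the coverage limits that interviewees describe later in this section.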
The Wayback Machine is the main tool that researchers and journalists use to retrieve information, including content that was taken down, whether intentionally or accidentally. It was even referenced in Congress during Mark Zuckerberg's recent testimony, when North Carolina Senator Thom Tillis called it a “history grabber machine.”
The significance of the IA for online journalistic work cannot be overstated. Newsrooms rely on the organization not only to preserve evidence for their reporting, but also to preserve their own published content. This engenders a dangerous and false sense of security. First, as scholars point out, the Internet Archive does not follow traditional archival practices such as standards for description and organization of archival materials. Second, and more importantly, the IA does not preserve everything.
For one, preserving the entirety of digital news content is an immense responsibility that no single organization or technology can meet. An IA staff member told us, “People ask me all the time, ‘did you have a backup of my site?’ And I say, ‘I have no idea. I can tell you what I backed up from your site, but I can’t tell you how that relates to everything that’s on your site, because I don’t know everything that’s on your site.’”
Even if the IA has captured a website, what it collects may be limited to the first level of content and could exclude links, comments, personalized content, and different versions of a story. Also, although the Internet Archive actively crawls sites, the nonprofit relies heavily on voluntary, non-staff contributors. As a manager put it, “We are actively looking for people to work with . . . we know we can’t do it by ourselves.”
Corrections and Versioning
Another issue that emerged in our interviews relates to article corrections and the publication of multiple versions of a story on the web. Corrections (such as a street name or more substantive information) to a news story are routine but tend to be limited. All interviewees reported that with every correction to a published story, they make sure to add a note specifying the change, always trying to make the process as transparent as possible for their readers.
Meanwhile, dozens of versions of a single story may be published in one day. Since most news organizations that participated in our research did not have a policy or procedure to ensure that their content is archived or digitally preserved for the long term, many interviewees expressed concerns regarding the evolution of stories and the difficulty of retrieving previous versions. One person mentioned that in the case of a lawsuit, being able to access these versions would become important.
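What interviewees describe wanting, retrieval of the version that was live at a given moment, is straightforward to support if each publish is retained rather than overwritten. A minimal sketch (our own class and method names, not any real CMS's API):

```python
from datetime import datetime, timezone

class VersionStore:
    """Minimal sketch of version retention: every publish appends an
    immutable (timestamp, text) pair instead of overwriting the story."""
    def __init__(self):
        self._versions = {}  # story_id -> list of (timestamp, text)

    def publish(self, story_id: str, text: str, when: datetime):
        self._versions.setdefault(story_id, []).append((when, text))

    def latest(self, story_id: str) -> str:
        return self._versions[story_id][-1][1]

    def as_of(self, story_id: str, when: datetime) -> str:
        """Return the version that was live at a given moment,
        e.g. for a correction audit or a legal dispute."""
        candidates = [(t, txt) for t, txt in self._versions[story_id]
                      if t <= when]
        return max(candidates)[1]

store = VersionStore()
t0 = datetime(2019, 1, 1, 9, 0, tzinfo=timezone.utc)
t1 = datetime(2019, 1, 1, 12, 0, tzinfo=timezone.utc)
store.publish("story-42", "Initial report", t0)
store.publish("story-42", "Updated with correction", t1)
print(store.as_of("story-42",
                  datetime(2019, 1, 1, 10, 0, tzinfo=timezone.utc)))
# → Initial report
```

The hard part in practice is not this logic but sustaining the commitment across CMS migrations and server moves, which is where content tends to get lost.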
Some participants were confident they could access old versions from their CMS or server, even if the content was not visible outside the newsroom. Even so, nearly everyone admitted having lost content during a migration from one CMS to another, or because of a server crash. And in one case, content from an old CMS could no longer be accessed at all, emphasizing again that long-term preservation and archiving are not the same as backups and storage.
Deletion
Deletion is the flip side of preservation. News organizations, in certain cases, actively remove content from the public record—an act that raises questions about the role of journalism in society.
While interviewees reported deleting emails sent to the newsroom or phone messages left by readers for reasons ranging from cleaning space on their computers to protecting sources, most said that deleting published content is abnormal behavior in the newsroom. Receiving requests from readers to delete stories is, however, common. An executive editor at a digital-only organization said the issue has been exacerbated by a move from print to digital. “In the past, if your name was mentioned in the newspaper, then everybody read it. But later it was folded and forgotten or preserved on microfilm that people actively need to go and search for.”
Today, when people are searching for someone’s name on the internet, the editor said, “this article is the first thing they see,” adding, “and then again, the internet doesn’t forget.” News organizations generally reply to those requests by saying that they cannot remove content from their site, the editor told us.
Some participants did, however, report that they delete tweets. On Twitter, compared to other social media platforms, there is more of a tendency toward deletion. “When the text was changed or misspelled, we delete it,” one person said. But in cases of a more significant error, the majority of participants described their efforts at transparency around any correction or mistake that was made. Another person said, “Our concerns are primarily with the here and now. We are more interested in making sure that misinformation is not being spread [on Twitter], or our reporting is not being used to further kind of disinformation. We've been known to correct a tweet or two.”
The Case of Gawker and Gothamist
Naturally, the story of Gawker and Gothamist drew a lot of attention in the journalism community. In August 2016, after losing a lawsuit filed by former wrestler Hulk Hogan and filing for bankruptcy, Gawker.com shut down. Then in November 2017, shortly after employees at Gothamist voted and won the right to unionize, billionaire owner Joe Ricketts announced that they had all lost their jobs and that the site would cease to operate.
Many participants told us they realized right then that 10 years' worth of work could disappear, as if it never existed. The fragility of digital content became apparent. In the days of print, one interviewee said, if a newspaper closed down, reporters could still access hard copies of their work. The case of Gawker and Gothamist highlighted how easily digital reporting can be made to disappear.
As of today, the Gawker website (including old stories) is still available, and in April 2018 Gothamist was relaunched thanks to funding from a consortium of public radio outlets. Even so, interviewees described the ability of a single person to effectively erase history as terrifying. The Gawker and Gothamist cases scared reporters who do not personally archive their own work, just as they demonstrated the role of news archives in democratic societies and the need for preservation policies that provide the public with a faithful account of history.
The Intricacy of Archiving Digital News
Whereas news was originally received as a finished product, delivered or broadcast to audiences, production now continues in a seemingly never-ending cycle. Instead of on industrial printing presses, news is produced in bits and bytes in content management systems like WordPress and distributed on the internet. News is increasingly dynamic, interactive, and personalized. In comparison, the newspaper seems almost frozen.
“The web, however, is not,” one news librarian told us. “It’s constantly changing and being updated.” But while the internet is often implicated in the troubles of today’s news industry, many of the developments discussed here transcend media formats, affecting both print and digital-only publishers.
Control and Care
Since the 1980s, newspapers have been contributing to searchable commercial databases, eliminating conventional clip files and the personnel hired to maintain them. Those automated information retrieval services have expanded over the years. The most prominent vendors include ProQuest, NewsBank, and Gale—and nearly every news organization interviewed had a contract with one of them. Many used their services as a commercial database. A select number of newspapers also contract with one of these vendors to receive microfilm or e-print (i.e., PDF) versions of newspapers that are then distributed to the Library of Congress according to regulations and/or for copyright purposes.
Another database with a growing role in maintaining historical collections, including news, is Ancestry.com, the largest privately owned genealogy company in the world. After acquiring Archives.com and its digitized collection of newsprint in 2012, the company began hosting the content on a stand-alone site it owns called Newspapers.com, which claims to be the largest online newspaper archive, with more than 11,000 newspapers and 43 million pages from the 18th century to today. It adds new pages (millions each month, according to company promotional information) by scanning publications accessed through partnerships with libraries, publishers, and historical organizations for free, bypassing the Library of Congress and other public programs.
Newspapers.com provides the archived news content to subscribers for family history research and to its partners, including the New York Public Library and the Brooklyn Public Library. The closest competitor is NewsBank.com, which news outlets use to host articles behind a paywall. Staff, including the newsroom librarians we spoke to, welcomed the arrangement because it provided otherwise expensive digitization services for free. “It's the last place that funding goes. So, it's always kind of a struggle to keep these systems going,” the sole news librarian at a mid-sized newspaper said, referring to digitizing newsprint and photographs.
But scanning and digitization, and storage in a database, are not alone adequate for long-term preservation. True archiving requires forethought and custodianship. To be fair, the tools for the preservation we need are not well developed yet. In the interim, these relationships accomplish specific goals within particular financial realities, but they do not account for the potential impact on long-term preservation and access. While they are expanding to encompass sophisticated access systems under the umbrella of private sector activities, they are also affecting the relationship between user and record on a fundamental level. Commercial companies are now the largest stewards of digital news with the potential to affect findability, since standards for metadata and indexing are inconsistent, variable, and based on commercial priorities rather than ubiquitous access.
Platforms
Facebook, Google, and Apple possess considerable news discoverability and delivery power. But the boundary between what belongs to the platforms and what belongs to the news outlets is blurring. In early 2018 Apple News invited the three largest national dailies, The New York Times, The Wall Street Journal, and The Washington Post, to contribute to its new digital magazine distribution app, Texture. Now the app is home to over 200 magazines.
By and large, media outlets and historical institutions also willingly participated in the Google News newspaper archive, which scanned millions of microfilmed newsprint pages and added them to already digitized content absorbed through acquisitions. The project abruptly discontinued service in 2011, however, due to copyright claims by newspaper companies and the complexity of archiving newsprint layouts. The content was added to Google News, meaning that another commercial company controls about 2,000 historic newspaper titles, all indexed according to Google’s standards.
In February 2019, Telegraph Media Group switched from Amazon Web Services to the Google Cloud Platform in a deal that expands Google's reach deeper into the news organization's publishing infrastructure. The Telegraph already distributes digital versions of its news stories using Google Play. Soon its digital publishing systems and public-facing digital products will all run on the Google Cloud Platform (GCP). Newsroom staff will be trained to use Google discovery tools to find content and to help develop personalized content for audiences.
Google Cloud also signed a deal with The New York Times near the end of 2018 to turn a collection of about five million printed photographs into high-resolution digital scans. Google is providing the Times with an asset management system. Metadata, including information commonly written on the back of a print photo, is assigned and stored in a PostgreSQL database running on Google's cloud. How the metadata is categorized to make the information discoverable is largely based on Google products.
For example, according to Google, the Cloud Natural Language API identifies Penn Station as a location and classifies it (and the entire sentence it was embedded in) into the category “travel” and the subcategory “bus & rail.” Media coverage of the deals revolved around increasing productivity and user engagement. Nothing was said of the consequences of relying on a proprietary platform for care and control.
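As a rough sketch of what such a metadata store might look like, the example below uses Python's standard-library sqlite3 in place of PostgreSQL; the table, columns, and sample row are illustrative inventions, not the Times's actual schema. The point is that discoverability hinges entirely on what was recorded and how it was categorized.

```python
import sqlite3

# In-memory SQLite stands in for the PostgreSQL store described above;
# schema and data are hypothetical, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE photo_metadata (
        photo_id   TEXT PRIMARY KEY,
        caption    TEXT,   -- text transcribed from the back of the print
        taken_date TEXT,
        location   TEXT,
        category   TEXT    -- label assigned by an automated classifier
    )
""")
conn.execute(
    "INSERT INTO photo_metadata VALUES (?, ?, ?, ?, ?)",
    ("photo-001", "Commuters at Penn Station", "1942-07-01",
     "Penn Station, New York", "travel/bus & rail"),
)

# A researcher can only find what the classifier's taxonomy exposes.
row = conn.execute(
    "SELECT photo_id FROM photo_metadata WHERE category LIKE 'travel%'"
).fetchone()
print(row[0])
# → photo-001
```

If the classifier had filed the photo under a different taxonomy branch, the same query would return nothing, which is the discoverability risk of proprietary categorization.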
Monetizing the archive
The journalism industry has been monetizing the news (beyond subscriptions) for decades by reselling access to third parties, including commercial information aggregators like NewsBank, ProQuest, and others. These vendors do the work of microfilming and digitizing print, and otherwise making content discoverable. NewsBank, for instance, is a fee-based database of digital articles that acts as a paywall would for content on the live website. In return, the vendors receive a share of what users (including researchers who access the content through university libraries) pay for access, and news outlets get a cut as well. Some news outlets license photos of celebrities and sports figures, charging film companies that want old articles or footage. Local papers, in particular, may attract local historians and genealogists.
Many of the news professionals we interviewed saw more of an opportunity for monetizing past content from newspapers than from digital-only publications, though they noted that how lucrative the deals are depends on the terms of the contracts. Beyond money, it's worth noting that these third-party deals take care of a service that news organizations want (digitization, microfilming, fee-based information retrieval systems) but see as ancillary to their primary focus, namely publishing news, not preserving it.
Social Media
Despite their pervasive use of platforms to publish and promote news content, none of the organizations interviewed said they had practices for saving Twitter, Facebook, and other social media communications. Tweets, Facebook posts, and the like are regarded as “inherently self-preserving” but not of inherent value. This attitude was informed by an acknowledgment that social media is controlled by commercial firms with potentially limited life spans. One data editor said that most reporters take access to platforms for granted until they can’t find something and suddenly realize, “Oh, Twitter might not be forever. It could be gone in five years.”
Despite a recognition that social media posts will have historic worth for future researchers studying the sites, the news workers we interviewed did not consider newsroom posts on the platforms to be news per se. Instead, they perceived news to be fixed on a website or in print, and they regarded social media as a tool they could use to direct audiences to that content. Therefore, they considered saving their social media unnecessary. “The pointers to the content we don't worry about,” one person said.
Neither the executive director who said this, nor any other staff we spoke to, initially considered the value of social media for understanding the strategic relationships that have developed between newsrooms and social media platforms. Archival evidence will likely be necessary to examine how these relationships affected reporting and publishing, including headlines, sourcing, or push notifications. Complicating matters, social media companies control access to the content on their platforms. Facebook makes it particularly difficult to crawl its News Feed, even for the account owner. Downloading content from the site when permitted is often the only way to capture it and does nothing for discoverability and long-term access.
Content management systems
Newsrooms rely on content management systems for news production but rarely consider the ramifications of their relationships to them in terms of custody, even though the connection to preservation is critical. It’s an important consideration because the content largely exists in no other location. The majority of newsrooms said they had experienced a loss of content during transfers, which might occur once every three to five years during upgrades, or as the result of a merger or purchase of one news organization by another.
Archivists and newsroom librarians agreed that CMS migrations are one of the most important points of failure. Most news organizations can only estimate the gaps that exist after migration, because the IT departments supervising them run sample searches rather than produce exact file counts. Moreover, they define a certain level of loss as acceptable.
Generally speaking, management systems are not designed for preservation and are vulnerable to server crashes. They are, at best, short-term storage with limited capacity for keeping content in a stable way over time. Worse yet, they struggle to keep track of the various pieces that make up a news story. In this context, several interviewees mentioned ARC, a content management system developed in-house by The Washington Post that brings together the multiple components of an online news story, including web, print, video, and social media. This helps wrangle the moving parts that make preservation different than print. However, ARC is not an archival system. It is useful for production—getting the news published, as one interviewee said—“but what is useful for production is not necessarily the thing that's useful for history.”
Partnerships
Beyond the platforms, news organizations have experimented with a variety of partnership models that bring print newspapers, public radio and TV outlets, magazines, and web-only producers together in collaboration. Media outlets considered the collaborations necessary in the context of financial constraints attributed to the collapse of print advertising income. One distinctive form of partnership involved accepting contributions from unpaid bloggers and special initiatives. That trend has mostly passed; organizations moved on to new strategies and largely forgot about those contributions. The blogs have since been deleted or neglected.
Furthermore, no long-term plans for keeping the content produced by the other partnerships and initiatives have been established. By way of illustration, Digital First Media founded a project called “Thunderdome” to provide multimedia reporting to a network of local newsrooms. Management and former “Thunderdome” journalists said they were unaware of where to find the reporting, noting that it had probably vanished “down the memory hole.”
Shared productivity software
Many of the news organizations interviewed primarily employed Dropbox, Box, and Google Drive to manage content in the pre-publication stages. Each is a cloud-based application that allows users to store and share data online at a relatively low cost without the need for training or in-house troubleshooting. The applications allow multiple users to collaborate regardless of where they are. Story elements can be handed off from a reporter to a copy editor, then a designer, a printer, and an online editor. This flexibility not only makes the system flow but also fits with the nature of dispersed staff and the increased reliance on freelancers. On the flip side, editorial staff members are now very accustomed to using private accounts that ultimately give the user control, meaning the power to delete an account and its contents altogether. In the case of Google Docs and Sheets, the user who creates the document controls its accessibility. That removes custody from the newsroom.
These systems also carry with them capacity limitations that can influence decisions about what to keep and what to discard, as well as affect the stability of the system. Those we interviewed from small newsrooms tended to use Dropbox or Box to store image, graphic, and text files. There are several reasons why a newsroom may prefer one over the other, but, either way, news outlets keep content based on the storage capacity they can afford. In addition, they face the possibility of losing content to server crashes or hacks. One newsroom has begun to move staff onto a shared, on-site storage system that centralizes media assets that can easily be backed up at the end of a project. Convincing editorial staff to give up some of the convenience of shared productivity software has not been easy.
Code
Without the computer code in which they are written, the future of newsroom apps is an open question. Much of that code can be found on GitHub and other web-based code management systems, whose popularity is expanding beyond developers to text production. GitHub is loosely equated with an archive in the sense that news developers expect their content to be saved and available in the future. Beyond this, even the largest outfits we spoke to were not archiving their code. However, code management systems cannot provide long-term preservation. In fact, Microsoft acquired GitHub in June 2018, placing the content under private, third-party commercial control. Without access to code repositories, there would be no key to unlock software design and functionality. We might as well be staring at a puzzle without all the pieces.
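A newsroom worried about third-party control can at least keep its own copies of public repositories. As a minimal sketch (the repository names below are hypothetical), GitHub serves a downloadable ZIP archive of any public branch at a predictable URL, which a scheduled script could build and fetch to maintain an independent copy:

```python
def repo_archive_url(owner: str, repo: str, branch: str = "main") -> str:
    """Build the URL of GitHub's downloadable ZIP archive for a branch.

    GitHub serves branch archives under /archive/refs/heads/<branch>.zip;
    fetching this on a schedule gives a newsroom its own copy of the code.
    """
    return f"https://github.com/{owner}/{repo}/archive/refs/heads/{branch}.zip"

# Hypothetical example repository:
url = repo_archive_url("example-newsroom", "election-tracker")
```

Downloading the archive (with any HTTP client) and filing it alongside other story assets keeps the code under the newsroom's custody even if the hosted repository later disappears.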
Comments
By the mid-2000s, online news sites routinely and actively encouraged comments by the public. Visitors to the gender, politics, and culture site Jezebel, for one, prized and encouraged audience comments, a trend that helped shape a culture of commenting that would go on to influence interactions on social media sites. Over time, media outlets began to employ third-party commenting software from Disqus and even Facebook to facilitate this input. Outlets, however, began to phase out website comments about a decade ago when they proved susceptible to hijacking by trolls, anonymous contributors, and offensive and inappropriate submissions.
Interviewees included in this research reported experiencing these problems, and while some shut down their comments, others continued to allow them. Neither group made any provisions for saving comments, though. The sites that opted to shut down their commenting systems could not account for what had happened to their contents. Those who continued to allow comments do not save them.
Microfilming the Internet
In the past, newsrooms, including wire services, preserved the last edition of the day to be published—or, in broadcast terms, aired. But given the atomization and personalization of content and software today, deciding on a final version is anything but straightforward.
One way of thinking about the contrast is the assassination of Abraham Lincoln. The New York Herald printed multiple versions of its April 15, 1865 coverage of the shooting, some of which can still be found for sale. Today the story might arrive as a breaking news story that would be updated more than a dozen times throughout the day, maybe with video, a photo slideshow, or an interactive timeline written in JavaScript.
As one data editor put it, “I always feel like we’re running our press 24 hours a day every time someone clicks on our site.” Metaphorically speaking, the printing press could break at any moment.
[Illustration: Baxter Orr for the Tow Center]
Moving parts
The move online not only required rethinking the nature of content (and producing more of it), but also necessitated changes in staffing. Most media organizations employ IT workers necessary to keep content management systems and websites operational. In addition to these positions, some news organizations invested in specialized divisions of digital news content producers.
Large and medium-sized news organizations produce a variety of data visualizations, tools, and personalized audience applications. Few organizations we spoke to were actively saving these applications, which are complex in terms of preservation because they are dynamic, atomized, and depend on software that is likely to be outdated in a matter of years.
Subscribers increasingly receive distinct news items personalized to their interests and delivered to them over a variety of devices. It’s difficult to begin imagining how an archival effort would tackle personalization of this kind.
Meanwhile, radio and TV news are also producing news content online and for broadcast. Video and photographs, especially, are often stored on drives maintained not by a newsroom but by individuals. This again introduces the question of ownership and access, all before considering that these materials will always be susceptible to loss if the formats in which they are stored become obsolete and the content cannot be displayed. What’s more, some online sites publish as many as 100 articles a day. “There’s so many pieces,” an editor of a high-volume online site said, “and they’re always moving forward.”
The addition of APIs introduces yet another layer of complexity and external control. APIs enable software programs to communicate with each other via applications that are made up of a series of instructions written in computer code. These applications are common elements in stories, often experienced by users as dynamic interactives that draw on Google location data, tweets, or United States Census Bureau American Community Surveys.
Because APIs define how to access that data, when Google, Twitter, or the Census Bureau change instructions, the code no longer works correctly and the application breaks. Developers have no control over these changes, which occur relatively frequently and sometimes without notice.
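One defensive pattern against this kind of breakage, sketched below with hypothetical names, is for a news application to cache every successful API response on the newsroom's own disk, so that when an upstream provider changes or retires an endpoint the interactive can fall back to the last known-good data instead of breaking outright:

```python
import json
from pathlib import Path

def fetch_with_fallback(key: str, fetcher, cache_dir: str = "api_cache"):
    """Try a live API call; on any failure, serve the last cached response.

    `fetcher` is any zero-argument callable returning JSON-serializable
    data (e.g., a wrapper around a Census Bureau or Twitter request).
    """
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    path = cache / f"{key}.json"
    try:
        data = fetcher()
        path.write_text(json.dumps(data))   # refresh the local copy
        return data
    except Exception:
        if path.exists():                   # upstream broke: use the cache
            return json.loads(path.read_text())
        raise
```

The cached copy doubles as a crude archival record of the data the story was built on, independent of the third party's continued cooperation.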
A similar dynamic occurs with web add-ons that newsrooms adopted but which proved to be short lived. The tool producers were either startups that failed or stopped supporting them. This occurred recently when Google decided to cease its support for Fusion Tables, its map visualization tool. Additionally, many newsroom developers noted the current difficulty of playing interactive projects using Adobe Flash software, which Adobe announced it will phase out and ultimately stop distributing by 2020. Newsrooms invested heavily in the early 2000s in Flash-based content. But now, in the recent words of one editor, “Anything in Flash is dead.”
The pace of software and hardware development is so fast that newsrooms are swapping out both every few years. This leaves little time to think about preservation, which requires advance planning and practices. Instead, media outlets may use up to four systems for handling digital-born news, including both a digital management system, such as SCC MediaServer, and a content management system, such as Drupal or WordPress. “On the web, stories change all the time; they don’t stay static,” a newsroom librarian said, describing the instability they experienced, adding, “You can’t microfilm the internet.”
Figure 1: Characteristics of News Publishing, Pre-Internet and Today
Approaches to Preservation
Few newsrooms expressed confidence in their archival practices, or could say that they were taking any steps to make sure that what is published today remains available in, say, 20 years. Rather than a crisis, it may be more useful to think of this as part of a continued transition to a digital infrastructure—one still in flux, with which newsrooms are struggling. While it is true that short-term thinking defines much of the current space, practices for saving content have evolved since the early 1990s. Back then, very few newsrooms were thinking about how to save web files, the result of which meant that news organizations lost early home pages and story posts.
Organizations with the capacity to do so have begun to conceive of in-house systems for saving digital content. The slogan “Lots of Copies Keep Stuff Safe” (known by the acronym, LOCKSS, after the general-purpose digital preservation program of the same name) has helped to integrate into newsrooms the idea that keeping multiple copies of stories and story elements distributed via a peer-to-peer architecture, in which no one participant controls all the copies, provides some assurance against server crashes and CMS migrations.
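The LOCKSS idea can be checked mechanically. As a minimal sketch (file paths are illustrative, not from any newsroom's actual system), comparing checksums across independently stored copies flags a damaged or silently altered copy, because the majority of intact copies will still agree:

```python
import hashlib
from collections import Counter

def sha256_of(path: str) -> str:
    """Compute a SHA-256 fixity checksum of one stored copy."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def odd_ones_out(paths: list) -> list:
    """Return the copies whose checksum disagrees with the majority."""
    digests = {p: sha256_of(p) for p in paths}
    majority, _ = Counter(digests.values()).most_common(1)[0]
    return [p for p, d in digests.items() if d != majority]
```

Run periodically over three or more copies, a check like this turns "lots of copies" from a hope into an auditable practice: a disagreeing copy can be replaced from the majority before the damage spreads through a CMS migration.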
Along these lines, suggested solutions we heard from reporters, data journalists, publishers, editors, and developers were technology-focused endeavors. Although engineering will be an important part of coming to terms with production and preservation, technology is not the only answer or even the best one. Some strategies are further downstream and involve creating policy and preservation standards, for example, around what kind of news content should be preserved and who should have access to it. There is a diffusion of norms that will be required to encourage new thinking about preservation, from the newsroom to the boardroom.
This section considers the benefits and disadvantages of the technology that will underpin digital archiving, and the new models for thinking about digital preservation, from regulation to workflow, that it will force us to confront.
Upstream
Blockchain
Blockchain startups have marketed the software as an archival solution based on the premise that copies of digital files distributed and stored on multiple servers can protect against deletion and undetected changes to data. One such startup is Civil, which caters directly to the journalism industry, using Gawker as a cautionary tale. But marketing strategies do not translate into long-term preservation practices. While blockchain software stores information about articles, it is not well equipped to store the actual articles, photos, videos, or other data.
The InterPlanetary File System (IPFS), a peer-to-peer, distributed network protocol that pairs well with blockchain, seeks to overcome the storage limitations of blockchain applications by adding actual files rather than limited metadata to the datastore. This does, however, slow operations and increase storage costs. Still, marketing efforts around blockchain technologies have opened the door to a discussion about the need to have strategies for keeping digital content accessible.
DWeb
An alternative decentralized network, referred to as the decentralized web or DWeb, is in development in response to perceived weaknesses of the current internet structure, including central control and capture of data by Google, Facebook, and other platforms. In the decentralized scenario, software breaks files into smaller bits. They are then encrypted, distributed, and stored on a network of laptops, desktop computers, and smartphones.
The plan capitalizes on unused storage capacity on users’ computers. At this point, enlisting smartphones in the plan is more aspiration than reality, although supporters see the gap narrowing with the availability of 5G wireless networks. Moreover, the DWeb rationale rests on concerns about surveillance, censorship, and control. Archiving is conceived of as the storage of bits of data broken into pieces and dispersed across multiple computers. While redundancy does protect against deletion and helps to identify that a change has been made (albeit not the actual change), decentralization as imagined by its supporters does not easily reconcile with long-term institutional models for preservation unless the system can be adapted to their controls. For example, the Library of Congress could develop its own decentralized system that would be able to track the location of bits of information.
Better links
In the world of the web, article links are not robust. Broken links that return a 404 error message can damage the credibility of news organizations, and some have invested considerable staff time to confront the problem. Several strategies can help make sure that content, including versions, remains available over time. The Associated Press Style Guide recommends including in a URL (uniform resource locator) the same elements, in the same order, as would be done for a reference to a fixed-media source.
Permalinks and digital object identifiers (DOI) offer options for providing more stability. DOIs are part of a system relatively common among scientific and academic publishing in which a registration agency (the International DOI Foundation) assigns an object (i.e., an article) a unique alphanumeric identifier to content. If a URL changes, the DOI can be redirected to continue to identify and locate the article or other digital object.
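In practice, resolution works through the central doi.org proxy: a persistent link embeds the identifier, and the registration agency updates the redirect target whenever the article moves. A minimal sketch of building such a link (the example DOI, 10.1000/182, is the DOI Handbook's own identifier):

```python
def doi_link(doi: str) -> str:
    """Build the persistent resolver URL for a DOI.

    Every DOI begins with a '10.' prefix assigned by a registration
    agency; the doi.org proxy redirects to wherever the object
    currently lives, so the link survives URL changes.
    """
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return f"https://doi.org/{doi}"
```

A news organization citing (or assigning) DOIs would publish these resolver URLs rather than direct links, shifting link maintenance from every citing page to a single redirect record.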
Ally archivists
In the meantime, collectives are filling the gap by saving digital objects that would otherwise slip through the cracks and become obsolescent before their value can be recognized. For example, NewsGrabber, developed by Archive Team (“a loose collective of rogue archivists, programmers, writers, and loudmouths dedicated to saving our digital heritage”) seeks to enhance the Internet Archive’s efforts to preserve news content. A member of Archive Team also launched an effort to preserve and manage as many Flash games as possible before they’re no longer playable.
These initiatives are not all specific to journalism—and they can be a bit tech-heavy—but the resources offered by these groups can be relevant to saving digital news products, especially interactives. This will likely not only include Flash-based games, but the new applications being built today, such as ProPublica’s “Dollars for Docs,” whose formats will one day be obsolete.
Emulation
Consisting of multiple parts, games and interactive news applications (usually custom-made) pose a particularly vexing challenge to preservation that migration and distributed copies fail to fully address. To preserve not only the content, but also the purpose and functionality of dynamic, interactive applications, Katherine Boss, librarian for journalism, media, culture, and communication at New York University, and Meredith Broussard, assistant professor at the Arthur L. Carter Journalism Institute of New York University, are developing the first emulation-based web archiving tool. The package would save the entirety of a news app while also providing a digital repository to preserve it for future use, thereby facilitating the look and feel of the original interactive experience.
While other emulation projects are being developed—at Rhizome, the Internet Archive, Carnegie Mellon, Yale, Deutsche Nationalbibliothek, and the British Library, among others—what distinguishes Broussard and Boss is not only an institutional infrastructure that takes into account long-term preservation and access by integrating librarians into the system; their project also formally addresses emulation-based web archiving.
Workflows
Data journalists and developers told us that news preservation starts with changing attitudes, but also emphasized that incorporating strategies into workflows would be necessary for success.
My feeling is if you can’t have the archiving built into the production workflow, then it’s not going to happen because people are going to move onto the next thing . . . As soon as their things are published, they don’t care anymore. So unless you can get them to do it on the front end, it’s always going to be killed.
With developer workflows in mind, Ben Welsh, data editor at the Los Angeles Times, has developed the PastPages software toolbox to assist in saving digital-born news by making content easy to save—or, as he calls it, “archive ready.” By way of example, the plugin Memento for WordPress can send preservation requests to third-party institutions like the Internet Archive at publication and each time content is modified.
Another application, Save My News, helps individuals save URLs in triplicate distributed across multiple servers. Welsh describes it as a personalized clipping service that empowers journalists to preserve their work in multiple internet archives. These can be integrated into a web browser, similar to the reference management software Zotero: users click on an icon located in the upper corner of the web browser to save content.
Welsh also recommends configuring web pages and software so they’re easier to archive. HTML headers can include a message that permits third-party organizations to collect the web page or site. Software can be designed modularly to integrate easily with “snap on tools” like Memento for WordPress and content management systems. In this way, organizations like the Internet Archive become a sort of platform that allows users to curate collections and store them on third-party servers.
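The Internet Archive exposes simple public endpoints that make this kind of integration possible: prefixing a URL with web.archive.org/save requests an immediate capture, and the Wayback availability API reports the closest existing snapshot. A minimal sketch of building those requests (actually submitting them requires a live network connection, which is omitted here):

```python
from urllib.parse import quote

def save_request_url(page_url: str) -> str:
    """URL that asks the Wayback Machine to capture `page_url` now."""
    return f"https://web.archive.org/save/{page_url}"

def availability_api_url(page_url: str) -> str:
    """URL of the Wayback availability API for the closest snapshot."""
    return f"https://archive.org/wayback/available?url={quote(page_url, safe='')}"
```

A publishing hook that calls the first endpoint on every story update, and periodically checks the second, is essentially what tools like Memento for WordPress automate.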
PastPages has won an innovation award from the Library of Congress and praise from the Nieman Journalism Lab, The Wall Street Journal, Journalism.co.uk, the Poynter Institute, and other organizations. But uptake faces obstacles within news organizations. The software is open source, lacking the institutional support necessary for the kind of development requisite for long-term integration in news organizations. Also, newsroom managers may have reservations about using non-enterprise software. When we asked if this points to the need for an infrastructure involving government support and regulation, a newsroom developer said, “We need some institutions to step up—if they can.”
Downstream
Regulation
The centerpiece of US regulations for mandatory deposit of news content revolves around print newspapers, and is administered by the US Copyright Office and the Library of Congress. But no mandatory deposit regulation exists as of 2019 for digital news content made available on the internet, although initiatives and rule-making discussions regarding digital-born news continue (see the Appendix of this report for more).
The United States is not alone in its absence of mandatory deposit regulations, but it may be unique in the considerations that will be involved: mainly the scale at which news is produced in the United States, as well as the rate of change in news layouts and content that challenged even a well-funded, technologically equipped project like Google News Archive. The result is a patchwork of participants in the commercial, nonprofit, government, and public sectors with a variety of needs and capacities.
Even if legislation did exist, the Library of Congress alone is not yet equipped for the volume, complexity, cost, and technical challenges that digital-born content presents for meaningful preservation.
State-level deposit requirements vary by region, and some states are working with their local press associations, while the Library of Congress has hosted multiple stakeholder meetings with news organizations to discuss the preservation of digital news.
Integrate news production teams
Media organizations are missing an opportunity to involve developers in the production pipeline, according to a data editor who said that data developers share an orientation with librarians toward gathering data and keeping it organized. It is not unusual to find data developers who worked as librarians and have valuable ideas about preservation but have no institutional control.
Additionally, integrating staff who oversee production and maintenance of web content, including the CMS, can make them aware of archiving priorities and needs. As it is, decisions about technology are made without consulting staff with preservation expertise, including newsroom librarians. Working together, by improving coordination and communication, media organizations could help develop simple solutions that make software “archive ready.” These include:
- Formal practices for attaining long-term durability of custom databases and applications. This begins during design and planning.
- Improving communication between production teams, from editorial to IT.
- Thinking ahead during website redesigns and CMS migrations and employing an incremental and holistic approach so that content fully transfers and URLs remain active.
- Checking new software systems for backwards compatibility with previous file formats.
- Creating static HTML files of dynamic applications in folders.
- Adopting practices and tools with established standards.
- Looking for systems that allow customization (new practices are more likely to take hold).
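The static-file suggestion in the list above can be as simple as writing each published version of a page into a dated folder, so there is always a flat HTML copy that needs no CMS or database to read. A minimal sketch with hypothetical paths:

```python
import datetime
from pathlib import Path

def snapshot_html(slug: str, html: str, root: str = "snapshots") -> Path:
    """Write one static copy of a story under <root>/YYYY-MM-DD/<slug>.html.

    Each publish or edit of a story adds a new dated copy, giving a
    flat-file record of versions that survives CMS migrations.
    """
    day = datetime.date.today().isoformat()
    folder = Path(root) / day
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{slug}.html"
    path.write_text(html, encoding="utf-8")
    return path
```

Hooked into the publishing pipeline, a snapshot step like this costs almost nothing per story but leaves behind exactly the kind of stable, self-describing files that archives can ingest.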
Distributed responsibility
Third-party entities, in particular the Internet Archive, are acting as de facto substitutes for in-house preservation. Some media outlets don’t know that their content is being collected by external parties and may even be blocking those efforts by keeping content behind paywalls. As a result, collection can be haphazard and incomplete, even though we often heard expectations that the Internet Archive was collecting news content in full. The expectation is misplaced, as we discussed in a previous section.
One suggestion we heard from a study participant was to engineer newsroom websites to coordinate collection with third-party entities such as the Internet Archive, libraries, or other heritage organizations. Ideally these services should contribute to in-house collection and preservation. Special care, however, would have to be given to interactives and dynamic content that are more difficult to collect and preserve. Nonetheless, it’s a concept that would enable newsrooms to focus on their core products while still archiving with the help of expert external specialists using the systems that they have established and manage.
One news organization recently adopted Preservica, a cloud-based digital preservation service. The Internet Archive also offers a subscription service called Archive-It that provides tools for capturing and storing digital content, as well as a TV News Archive that houses a collection of upwards of 1.6 million TV news programs. Of course, each of these strategies runs into the issue of third-party control and care.
Specialization
The managers of media organizations often make decisions about investing in new technology, rather than relying on staff whose expertise can contribute to improved systems. We heard arguments for dedicating at least one person in every media organization to preservation efforts and consulting them on decisions. That may be a librarian, or newsrooms can rethink the position to match their needs and production models in a way that aligns with newsroom cultures.
For example, instead of a newsroom librarian in conventional terms, a news organization could think of the job as a content production manager who oversees the implementation and operation of systems according to specified priorities that enhance, rather than diminish, preservation. The person could keep digital morgues, help make digital content archive-ready, negotiate with CMS vendors to make preservation part of the design of a content management system, and help develop ways to effectively monetize archives.
Conclusion
Preservation is a multi-pronged process that technology can assist. But ultimately, maintaining news for the future depends on deliberate practices that involve planning around tasks such as migrating content to new formats, assigning consistent metadata, and indexing. Like most media organizations, the individuals interviewed for this report care about maintaining access to the news. But they are at a loss for what to do and may doubt their ability to prioritize preservation.
For one, the lack of funding and policies for archiving results in a fragmented system that constrains both output and preservation. Add to that the pressure of the ever-faster news cycles and shrinking staffs, and the prospects for long-term preservation of digital content can appear dismal. On top of everything else, there is confusion about what distinguishes long-term preservation from backups and storage.
The staff we interviewed by and large understood the complexity of preservation in these terms but were struggling with the implications of the malleability and volume of digital content, beginning with the first step: informed decision-making about what to save. The bottom line for the majority of news organizations was making long-term preservation as simple as possible, both in terms of budgeting and practices.
“It needs to be cheaper and easier or people will not do it. We know that because they haven’t,” one editor said. That may mean making it part of the workflow by integrating strategies upstream in the software, beginning with content management systems that factor in preservation, and rethinking relationships with third-party vendors.
But that is not the end of the story. Reporters are not accepting responsibility for maintaining the stories they write, and few keep their own work. They trust that the content will be available online when needed, know little about the production pipeline, and have no control over it or the tools involved in publishing their work. For most reporters, a data editor said, “it feels like God is handling these.”
Perhaps the crucial takeaway here is that reporters have few reasons to care. If media outlets recognized the value of preservation, measured in dollars and historical currency, they could take more control over the destiny of the news they produce by negotiating contracts that benefit them, instead of being what Hansen and Paul characterize as captive customers to outside vendors. Some do this already through relationships with NewsBank and ProQuest, but as interviewees told us, they could do much better, beginning with a focus on not purging stories and starting to save social media.
Libraries or other cultural heritage organizations can help newsrooms adapt, but they will not save the day—at least not without the cooperation of media outlets. This cooperation can start with basic stewardship, such as understanding the obstacles to preservation, long-term planning (if not actual management), and using consistent practices for keywords and metadata. News organizations could afford to be poorer custodians in the past, but that no longer holds true in today’s news environment, in which there are no second chances when data disappears and transparency is critical.
The ways in which news workers reconceive of the importance of news preservation, as well as their own responsibility and ability to archive content, can become infectious. Preservation is about history and legacy, and currently many news organizations do not perceive themselves as important enough to act accordingly. Some of the interviewees mentioned The New York Times as an example of a news organization working toward preserving its content and maintaining a proper archive, for print as well as for digital content. All the interviewees who mentioned the Times referred to it as “the paper of record.”
Other news outlets, and in particular digital-only news publications, did not perceive themselves as having the same responsibility or legitimacy. But news organizations should care about preservation, in the same way they care about integrity, reliability, and informing the public not just in the present, but also the future.
Appendix: Additional Resources
For more about digital preservation, consider the following sources:
General Information and Background
NDIIPP/NDSA Meetings and Reports
The National Digital Newspaper Program (NDNP) is a voluntary, grant-based program focused on providing access to digitized, historic newspapers. The Library of Congress advanced digital preservation from 2000–2016 under a number of National Digital Information Infrastructure and Preservation Program (NDIIPP) initiatives through funding research and development, convening subject matter experts to develop best practices and training, and promoting knowledge exchange to inspire sustainable models for supporting the work of current and future practitioners.
The National Digital Stewardship Alliance (NDSA) is a consortium of more than 220 partnering universities, associations, businesses, government agencies, and nonprofits committed to the long-term preservation of digital information.
Reports relevant to these programs:
Digital Preservation Meetings – http://www.digitalpreservation.gov/meetings/
NDSA Case Studies – http://www.digitalpreservation.gov/multimedia/publications.html
Copyright Regulations and Policies Documentation
(Source: Library of Congress and National Endowment for the Humanities)
Additional Programs
- The International Internet Preservation Consortium (IIPC) is a consortium of national libraries, universities, and archives all over the world. IIPC’s main objective is to support long-term preservation and access to internet content by developing tools, techniques, and standards for the creation of internet archives, while raising awareness and encouraging memory and research institutions to contribute: http://netpreserve.org/
- The International Federation of Library Associations and Institutions (IFLA) is the leading international body representing the interests of libraries and information services professionals worldwide: https://www.ifla.org/
Acknowledgments
We would like to thank numerous people, including family and friends, who contributed to this research and inspired this report. First, we’re grateful to all the news organizations and staff members who took the time to share their views about news preservation. We could not have completed this study without you. We would also like to thank those scholars and professionals who shared their expert knowledge and experience of news archiving with us, including Kathleen Hansen, Nora Paul, Meredith Broussard, Katherine Boss, Ben Welsh, Ed McCain, and all the rest of the “Dodging the Memory Hole” community who cares so deeply about the preservation of news.
Second, we would like to thank our supportive colleagues at the University of Haifa, Department of Communication and Department of Information and Knowledge Management, and several people at Columbia Journalism School, especially the faculty members of the Communications Ph.D. program.
We are grateful to the Tow Center for Digital Journalism, which was a true home for this study due to the encouragement and tremendous support from Emily Bell. Our colleagues at Tow were extremely helpful, and we’d like to extend special thanks to Meritxell Roca-Sales, Kathy Zhang, Priyanjana Bengani, and Nushin Rashidian. To the Center’s editors, Samuel Thielman and especially Abigail Hartstone, whose editing made this report so much better than what we could have hoped for, we appreciate your contribution. And last, but certainly not least, to Katie Johnston—there are not enough thanks to express our gratitude for your help and support during this entire journey.
Citations
Sharon Ringel and Angela Woodall are research fellows at the Tow Center for Digital Journalism.
Sharon Ringel is a Postdoctoral researcher at the Columbia University Graduate School of Journalism. Angela Woodall is a Communications Ph.D. candidate at the Columbia University Graduate School of Journalism.