Archiving Community Memories

A special issue of Future Internet (ISSN 1999-5903).

Deadline for manuscript submissions: closed (15 April 2014) | Viewed by 46201

Special Issue Editors

Head of Electronic Services, University Library Johann Christian Senckenberg, Bockenheimer Landstraße 134 - 138, 60325 Frankfurt am Main, Germany
Interests: semantic evolution; service computing; data management in distributed systems; federated search; self-organizing systems
Department of Computer Science, University of Sheffield, Regent Court, Sheffield S1 4DP, UK
Interests: information extraction; knowledge management; web archiving; language technology; semantic resource creation and analysis

Special Issue Information

Dear Colleagues,

Given the ever-increasing importance of the World Wide Web as a source of information, adequate Web archiving and preservation have become a cultural necessity in preserving knowledge. This is especially the case for non-traditional digital publications, e.g., blogs, micro-blogs, and social networks. Given the deluge of digital information created and the rapidity of changes on the Web, a necessary first step is to respond quickly through the timely creation of archives with minimum overhead, enabling more costly preservation actions further down the line and avoiding an irreparable loss of knowledge.

In addition to the “common” challenges of digital preservation, web preservation has to deal with the sheer size and ever-increasing growth and change rate of Web data. Hence, selection of content sources becomes a crucial and challenging task for archival organizations. Instead of following a “collect-all” strategy, archival organizations are trying to build community memories that reflect the diversity of information people are interested in.

Besides the creation of Web archives, their usage in applications plays an increasingly important role. Allowing easy access to information based on different facets and across time is just one aspect. The possibility to look into the past and understand how things evolve opens the space for new application scenarios and analysis approaches.

This special issue of the journal Future Internet contains selected, extended papers presented at the 1st International Workshop on Archiving Community Memories (ARCOMEM 2013, http://www.arcomem.eu/ipres-2013), held in conjunction with the 10th International Conference on Preservation of Digital Objects, 2-6 September 2013, Lisbon, Portugal. However, the special issue is not limited to workshop papers but is open to any submission related to the topic.

Dr. Thomas Risse
Dr. Wim Peters
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Future Internet is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.


Keywords

  • Web and Social Web Harvesting
  • Focused & Topical Crawling
  • Deep Web Capture
  • Social Web Analysis
  • Information Extraction
  • Video and Image Analysis
  • Appraisal and selection of content
  • Applications & Use Cases
  • Semantic Web Technologies
  • Temporal Analytics

Published Papers (6 papers)

Research

723 KiB  
Article
The ARCOMEM Architecture for Social- and Semantic-Driven Web Archiving
by Thomas Risse, Elena Demidova, Stefan Dietze, Wim Peters, Nikolaos Papailiou, Katerina Doka, Yannis Stavrakas, Vassilis Plachouras, Pierre Senellart, Florent Carpentier, Amin Mantrach, Bogdan Cautis, Patrick Siehndel and Dimitris Spiliotopoulos
Future Internet 2014, 6(4), 688-716; https://doi.org/10.3390/fi6040688 - 04 Nov 2014
Cited by 9 | Viewed by 9239
Abstract
The constantly growing amount of Web content and the success of the Social Web lead to increasing needs for Web archiving. These needs go beyond the pure preservation of Web pages. Web archives are turning into “community memories” that aim at building a better understanding of the public view on, e.g., celebrities, court decisions and other events. Due to the size of the Web, the traditional “collect-all” strategy is in many cases not the best method to build Web archives. In this paper, we present the ARCOMEM (From Collect-All Archives to Community Memories) architecture and implementation that uses semantic information, such as entities, topics and events, complemented with information from the Social Web to guide a novel Web crawler. The resulting archives are automatically enriched with semantic meta-information to ease the access and allow retrieval based on conditions that involve high-level concepts.
(This article belongs to the Special Issue Archiving Community Memories)

840 KiB  
Article
ARCOMEM Crawling Architecture
by Vassilis Plachouras, Florent Carpentier, Muhammad Faheem, Julien Masanès, Thomas Risse, Pierre Senellart, Patrick Siehndel and Yannis Stavrakas
Future Internet 2014, 6(3), 518-541; https://doi.org/10.3390/fi6030518 - 19 Aug 2014
Cited by 4 | Viewed by 8479
Abstract
The World Wide Web is the largest information repository available today. However, this information is very volatile and Web archiving is essential to preserve it for the future. Existing approaches to Web archiving are based on simple definitions of the scope of Web pages to crawl and are limited to basic interactions with Web servers. The aim of the ARCOMEM project is to overcome these limitations and to provide flexible, adaptive and intelligent content acquisition, relying on social media to create topical Web archives. In this article, we focus on ARCOMEM’s crawling architecture. We introduce the overall architecture and we describe its modules, such as the online analysis module, which computes a priority for the Web pages to be crawled, and the Application-Aware Helper which takes into account the type of Web sites and applications to extract structure from crawled content. We also describe a large-scale distributed crawler that has been developed, as well as the modifications we have implemented to adapt Heritrix, an open source crawler, to the needs of the project. Our experimental results from real crawls show that ARCOMEM’s crawling architecture is effective in acquiring focused information about a topic and leveraging the information from social media.

1787 KiB  
Article
Should I Care about Your Opinion? Detection of Opinion Interestingness and Dynamics in Social Media
by Diana Maynard, Gerhard Gossen, Adam Funk and Marco Fisichella
Future Internet 2014, 6(3), 457-481; https://doi.org/10.3390/fi6030457 - 13 Aug 2014
Cited by 17 | Viewed by 7501
Abstract
In this paper, we describe a set of reusable text processing components for extracting opinionated information from social media, rating it for interestingness, and for detecting opinion events. We have developed applications in GATE to extract named entities, terms and events and to detect opinions about them, which are then used as the starting point for opinion event detection. The opinions are then aggregated over larger sections of text, to give some overall sentiment about topics and documents, and also some degree of information about interestingness based on opinion diversity. We go beyond traditional opinion mining techniques in a number of ways: by focusing on specific opinion-target extraction related to key terms and events, by examining and dealing with a number of specific linguistic phenomena, by analysing and visualising opinion dynamics over time, and by aggregating the opinions in different ways for a more flexible view of the information contained in the documents.

966 KiB  
Article
Analysing and Enriching Focused Semantic Web Archives for Parliament Applications
by Elena Demidova, Nicola Barbieri, Stefan Dietze, Adam Funk, Helge Holzmann, Diana Maynard, Nikolaos Papailiou, Wim Peters, Thomas Risse and Dimitris Spiliotopoulos
Future Internet 2014, 6(3), 433-456; https://doi.org/10.3390/fi6030433 - 30 Jul 2014
Cited by 11 | Viewed by 7582
Abstract
The web and the social web play an increasingly important role as an information source for Members of Parliament and their assistants, journalists, political analysts and researchers. They provide important background information, such as reactions to political events and comments made by the general public. The case study presented in this paper is driven by two European parliaments (the Greek and the Austrian parliament) and targets an effective exploration of political web archives. In this paper, we describe semantic technologies deployed to ease the exploration of the archived web and social web content and present evaluation results.

1126 KiB  
Article
The Use of Personal Value Estimations to Select Images for Preservation in Public Library Digital Community Collections
by Andrea Copeland
Future Internet 2014, 6(2), 359-377; https://doi.org/10.3390/fi6020359 - 27 May 2014
Cited by 1 | Viewed by 6871
Abstract
A considerable amount of information, particularly in image form, is shared on the web through social networking sites. If any of this content is worthy of preservation, who decides what is to be preserved and based on what criteria? This paper explores the potential for public libraries to assume this role of community digital repositories through the creation of digital collections. Thirty public library users and thirty librarians were solicited from the Indianapolis metropolitan area to evaluate five images selected from Flickr in terms of their value to public library digital collections and their worthiness of long-term preservation. Using a seven-point Likert scale, participants assigned a value to each image in terms of its importance to self, family and society. Participants were then asked to explain the reasoning behind their valuations. Public library users and librarians had similar value estimations of the images in the study. This is perhaps the most significant finding of the study, given the importance of collaboration and forming partnerships for building and sustaining community collections and archives.

3652 KiB  
Article
Exploiting Multimedia in Creating and Analysing Multimedia Web Archives
by Jonathon S. Hare, David P. Dupplaw, Paul H. Lewis, Wendy Hall and Kirk Martinez
Future Internet 2014, 6(2), 242-260; https://doi.org/10.3390/fi6020242 - 24 Apr 2014
Viewed by 5948
Abstract
The data contained on the web and the social web are inherently multimedia and consist of a mixture of textual, visual and audio modalities. Community memories embodied on the web and social web contain a rich mixture of data from these modalities. In many ways, the web is the greatest resource ever created by humankind. However, due to the dynamic and distributed nature of the web, its content changes, appears and disappears on a daily basis. Web archiving provides a way of capturing snapshots of (parts of) the web for preservation and future analysis. This paper provides an overview of techniques we have developed within the context of the EU-funded ARCOMEM (ARchiving COmmunity MEMories) project to allow multimedia web content to be leveraged during the archival process and for post-archival analysis. Through a set of use cases, we explore several practical applications of multimedia analytics within the realm of web archiving, web archive analysis and multimedia data on the web in general.