First Monday

Enhancing user involvement with digital cultural heritage: The usage of social tagging and storytelling by Harry van Vliet and Erik Hekman

This paper focuses on the use of online social tagging and storytelling to enrich digital collections of cultural heritage. Together with several Dutch museums, we examined the question of whether and how social tagging could benefit these museums in disclosing specific digital collections. This led to the development of a social tagging tool as a means of researching behaviour when tagging cultural objects. The results show that tagging and storytelling can help museums enrich their collections and involve their audiences.


Museums and social tagging
Research questions
Research design and material
Social tagging findings
Social storytelling findings
Discussion and conclusions




Society’s collective memory is solidified in artefacts of our cultural heritage: hundreds of collections contain an enormous number of archive items, art objects, books, paintings, archaeological remnants, folkloristic objects and audio–visual objects. These treasures are scattered across a large number of cultural heritage institutes, such as museums, archives and libraries. The government’s challenge is to ensure the accessibility of such cultural treasures for each and every person. This is hardly surprising considering that over 45 million objects are in the custody of museums in the Netherlands alone. The same applies to the wealth of material in archives that remains hidden and hence invisible.

Particularly since the beginning of the 1990s, the growing impact of information technology and digitization has provided a fresh impulse for cultural heritage institutions to deal with these problems (van Vliet, 2009). These efforts, however, were aimed mainly at cultural preservation and, for the time being, have done little to bring us closer to the dream of a Virtual Collection in the Netherlands. In 2008, there were still substantial delays in digital registration: ‘Nationally, digital images have been made for 2 to 4 out of every 10 art objects, which means a digitization rate of 17%–37%. The total number of non–digitalized art objects thus amounts to between 28.4 and 37.3 million art objects.’ [1]

Meanwhile, the need for further progress has become more urgent still. The dominant role of the Internet in recent years has caused a change in the relationship between media producers, suppliers and consumers in the traditional media landscape. As a low–threshold channel that encourages interactivity, the Internet is used heavily among producers and consumers, who use it to meet, collaborate and keep each other informed. The cultural heritage sector has not overlooked these developments entirely. In fact, there is no shortage of multimedia and cross–media presentations of collections at this point. The current availability of digital cultural heritage is characterized by a rich variety of initiatives: a colourful array of Web sites, mobile applications and multimedia interactive compilations. This confirms the willingness of cultural heritage institutions to start using new media resources and the improvements in accessibility that these new media technologies can offer. This being said, the range currently available leaves one with the distinct impression that it has been developed on an ad hoc basis, with public sentiments perhaps more important than a considered strategy.

The question is, then: what can we achieve using today’s digital resources in response to the changing role of the general public, in terms of improving public access to cultural heritage? Involving multiple media resources, particularly resources like the Internet and mobile telephony, seems to be an inevitable step, but how can this best be done? This paper addresses this question by focusing on social tagging. Together with several Dutch museums, we investigated the issue of whether and how social tagging could benefit these museums in disclosing specific digital collections. This research was conducted as part of the PACE (Public Annotation of Cultural Heritage) project. The purpose of the PACE project was to examine the ways in which social tagging could be used as a tool to enrich collections, improve accessibility and increase visitor involvement. This led to the development of a social tagging tool as a means to research several aspects of people’s behaviour when tagging cultural objects. The main results of this research are presented and discussed here (for the full report, see van Vliet, et al., 2010).



Museums and social tagging

Searchability is crucial for the accessibility of our digital cultural treasures. The ability to find these digital cultural treasures begins with an effective description of digital art objects. Therein lies the problem. If any object description can be found at all, it will usually include only a minimal amount of technical data focusing on object management issues such as documentation relating to the acquisition process and storage, or art–historical features such as the time frame, style, artist and art–historical value. A minimum description of an art object usually involves six or seven characteristics, including a description of the art object, the date of acquisition, the reason for acquisition, the name of the museum employee responsible for the object, the name of the institute and the inventory number.

When museums present their art collections on the Internet, it is often the case that museums simply use the object descriptions written for the internal management of their physical collection. As a result of this, any problems in describing the physical art collection are reflected in the digital presentation. Typical problems include the following: 1) the information is tightly structured along the lines of the museum’s specific objectives, e.g., for the purposes of a specific exhibition or educational programme; 2) terminology of a technical or specialist nature is used, e.g., strict annotation standards may mean that the formal description of a painting of a cow does not even include the word ‘cow’ and as a result the painting cannot be found by searching on the word ‘cow’; 3) an art object is embedded in a context that does not include the average visitor’s perspective since it exclusively reflects an art–historical context, e.g., tags which indicate a certain genre or art–historical time frame; or, 4) as the case may be, an art object is taken completely out of context, and only a database record is shown (Trant, 2006a; 2006b). In conclusion, art collections are available, but not readily accessible; descriptions have been made, but these are not always readily comprehensible.

Deploying social tagging is one possible way to engage the public and make object descriptions more generally comprehensible. Tagging involves assigning labels and/or keywords to a specific item, such as a painting tagged with the word ‘beautiful’. It thus creates associations with the structure ‘user, tag, item’. Users tag in order to categorize and describe resources and to maintain a navigational aid for later browsing or retrieval (Strohmaier, et al., 2010).

When several people are engaged in this activity, and tags are mutually visible, we refer to this process as social tagging. For instance, on the social tagging Web site, tags associated with a particular Web site are immediately visible to other users and when tagging, users can see who has added the same kinds of tags for the site, the different tags added, and the names of sites that have been given similar tags. In these cases, social taggers tend to use common vocabulary to describe items. This entire set of associations is semantically coherent in a statistical sense and referred to as a folksonomy, a neologism formed from the words ‘folk’ and ‘taxonomy’ (Marlow, et al., 2006). A folksonomy is thus distinct from taxonomy, which refers to a formal, hierarchical description of items.
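The association structure described above can be illustrated with a minimal sketch. The users, tags and item identifiers below are hypothetical and are not taken from the study's data; the sketch only shows how individual tag assignments could be aggregated into per-item tag frequencies of the kind a folksonomy is built from.

```python
from collections import Counter, defaultdict

# Hypothetical tag assignments, each a (user, tag, item) triple.
assignments = [
    ("anna", "cow", "painting-1"),
    ("ben", "cow", "painting-1"),
    ("ben", "meadow", "painting-1"),
    ("carla", "beautiful", "painting-1"),
    ("anna", "beetle", "photo-7"),
]


def build_folksonomy(triples):
    """Aggregate tag frequencies per item from (user, tag, item) triples."""
    tags_per_item = defaultdict(Counter)
    for _user, tag, item in triples:
        # Lower-casing is an assumed normalisation step, not one the paper describes.
        tags_per_item[item][tag.lower()] += 1
    return tags_per_item


folksonomy = build_folksonomy(assignments)
top_tags = folksonomy["painting-1"].most_common(1)
print(top_tags)
```

A flat list of triples keeps the raw data neutral; any hierarchy or ranking is derived afterwards, which is what distinguishes this from a predefined taxonomy.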

Social tagging offers museums a quick and direct way to learn about visitor experiences, what visitors judge to be significant, and what significance they attach to particular art objects. Various studies suggest that tagging has a positive effect, both on added value for art collections and the level of visitor involvement with those collections (Marlow, et al., 2006; Trant, 2006a; 2006b; Trant and Wyman, 2006; Trant, et al., 2007). The frequently mentioned benefits associated with social tagging include:

  1. Tags provide visitors with access points that are more closely related to the idioms used by visitors than formal object descriptions (Chae and Kim, 2011).

  2. Tags add new information to art collections. In some cases the general public may also have knowledge and information that is not available to a particular institute (Trant, 2006a).

  3. Tagging increases people’s involvement with art collections: taggers contribute by sharing the meanings that collections have to them personally with the museum and other visitors, and consequently may provide insight into visitors’ perceptions of art collections.

  4. Tags can be used to personalize access to art collections by making suggestions, composing virtual exhibitions, providing route maps, or bringing visitors into contact with other visitors (van Setten, et al., 2006; Trant, 2006b).

More and more examples of social tagging can be found in the realm of cultural heritage. Internationally renowned examples include the Steve Initiative, the Powerhouse Museum, the Smithsonian Photography Initiative and the Brooklyn Museum. The latter has a so-called Posse, a group of people active in the ‘Tag! You’re it!’ initiative. This is a playful way of encouraging tagging, showing who has added the most tags. A variety of different initiatives is available in the Netherlands. In 2006, for instance, students at the Utrecht School of the Arts developed the ‘’ Web site (a play on the Dutch words for the Underground scene and the type of ‘canvas’ used by graffiti artists) in partnership with the Institute for Telematics in Enschede. The Web site is dedicated to street art, such as graffiti, posters and stickers. The site allows graffiti artists and others to upload street art photographs and add tags, which may include the location, artist, colour and other keywords. Their contributions have facilitated the creation of online exhibitions on themes like ‘humour’, ‘politics’, or particular artists or locations. The site relies entirely on the community to provide information about any given object.

Social tagging is not without its problems. There is a real risk of corruption through vandalism or nonsense words (tag spam) or simply through erroneous tags added in good faith. Rewarding people for contributing a large number of tags may also lead to sub–standard tags and pollution, which may be counterproductive and make it more difficult to locate something. Just like ‘ordinary’ language, tagging is subject to all kinds of linguistic problems such as synonyms, homonyms and ambiguity (Simons, 2008). One issue is how visitor–generated tags relate to the dominant official taxonomy. With social tagging it becomes possible to imagine a metadata universe where today’s user–generated tag will eventually integrate easily and conveniently alongside descriptions written a century ago (Dalton, 2010). The discussion has centred mainly on the quality of tags submitted by visitors versus the authority and professionalism of institute staff. A call for moderation is expected before long (Beyl, et al., 2008), but the response of professionals is not necessarily negative; there is general recognition of the validity of both (Trant, et al., 2007).

Not all Web site visitors find tags appealing as part of the user interface and tags that are meant for personal use are experienced as particularly distracting. A tag such as ‘in possession’ may be useful for someone to tag a book with, but it holds little meaning to others. While tags could exhibit useful properties, recent research has shown that the navigability of social tagging systems leaves much to be desired. When browsing social tagging systems users often have to navigate through huge lists of potential results before arriving at the desired resource (Trattner, et al., 2011).

Additionally, research has shown that individuals’ tagging behaviour is influenced by other people’s tags. Quite a few tagging systems make use of suggestions, such as showing the most frequently used tags for a particular item. New taggers will often select from these tags, as a result of which fewer new tags will be added over the course of time (Sen, et al., 2006). Moreover, it has become apparent that tags added at a later stage are less likely to become popular than tags added at an earlier stage. When it comes to adding tags, it seems that timing is crucial. Another issue is that an individual or group of people contributing a disproportionally large number of tags will have a bigger voice than those adding fewer tags. This is referred to as the ‘Matthew Effect’. To address some of these objections to tagging, people have also considered introducing an element of play, as used on CamClickr and in the Brooklyn Posse.



Research questions

Although social tagging might help improve the accessibility of digital collections, it remains to be seen whether it is really useful and what effect it may have when used. The question about the useful deployment was translated into the following research question: ‘What choices do museums have to make for the deployment of social tagging?’ Two such choices have been identified for the purpose of this project:

First, in social tagging research, researchers usually work with a clear–cut dichotomy between professionals and laymen. This means that there is little consideration of the differing degree of knowledge among visitors. In addition to the museum curator and the layman, visitors will also include ‘well–informed’ interested people, including amateur scientists and retired professionals (see, for example, Wubs and Huysmans, 2006). This group of experts can be quite extensive. In some areas, hundreds or thousands of experts are involved and they may even be overrepresented among the group of active taggers. It is extremely relevant for museums to cater for and continue to involve this group in their collections. It is possibly more useful to deploy social tagging for a specific group of experts rather than for a broad audience.

A second choice is to move beyond the limited power of expression of tags. No matter how powerful some tags may be, they are still essentially keywords that convey a limited amount of information. It may therefore be interesting to consider other forms of expression, such as digital storytelling. This may open up possibilities for community development and linking all kinds of sub–collections using a common narrative or conversation (see Srinivasan, et al., 2009). The concept of storytelling is not unknown in a cultural context. Chew (2002) traces a development that began at the end of the 1950s, through which stories about people (oral histories) have increasingly obtained their own meaning and role in the presentation of cultural heritage. Such stories may contribute to making exhibitions more accessible and attractive, precisely because of this personal perspective. However, stories also contribute to and emphasize the meaning and interpretation of art objects (see van Vliet, 2009). Social storytelling is potentially more useful than social tagging.

In addition to the question of what is ‘useful’, we also addressed the effects of social tagging. In the theoretical discussion, we encountered three effects in relation to social tagging:

Based on this analysis of the original question, we can create a matrix to compare the decision to use social tagging (independent variables) with the effects (dependent variables). This leads us to a number of research questions, two of which we will report on here: 1) Do laymen tag in different ways compared to experts? and 2) Which ways of presenting stories lead to greater involvement? These research questions were reformulated as the following hypotheses:

Hypothesis 1: Laymen add more tags overall than experts and more unique tags to objects than experts.

Hypothesis 2: Laymen add different kinds of tags to objects than experts do.

Hypothesis 3: Use of vocabulary among laymen is different from use of vocabulary among experts.

Hypothesis 4: Tags added by experts provide more information about objects than tags added by laymen.

Hypothesis 5: Tags added by laymen are more useful for the purposes of object retrievability than tags added by experts.

Hypothesis 6: Presenting a story through video clips elicits stronger involvement than presenting a story through audio recordings alone, which in turn elicits stronger reactions than the use of narrative text alone.

Hypothesis 7: The better a story is told, the more involved people are.

We will briefly discuss the assessment of these hypotheses and the most significant results; we refer to van Vliet, et al. (2010) for the full report, including an explanation of the data, analytical methods and statistics used.



Research design and material

Each of the museums participating in the project (Museon, Naturalis and the Utrecht University Museum) put collections at our disposal. The collections include digital photographs of art objects or digitized drawings, either in colour or black and white. Except for one collection, added at the end of 2009, they all involve objects stored in the museums’ depositories about which little or nothing was known previously. This was one of the main reasons why the three museums chose these collections for our study. Another important reason was that potential experts on these collections could be identified and approached readily.

A controlled research environment allowing the collection of tags and stories was important to our research. Due to the nature of the research, we required an environment in which we could identify and track users throughout the system and which gave us complete control over the interface and the photos. Building our own social tagging tool met these requirements.

Since storytelling plays a role in this research, we also included a function to enable storytelling. In the context of our research, the social tagging tool was aimed primarily at supporting the research questions within this research. The tagging tool tracks various types of information, such as user actions, and stores this information in a database for later analysis. The platform Web site was given the name ‘’ (Dutch for ‘I know what this is’) and launched in September 2008.

Three collections were displayed on the Web site: images of 134 objects from Utrecht University Museum’s dental surgery collection, 145 objects from Museon’s collection of drawings of Japanese internment camps, and 100 photos of Naturalis’ beetle collection (see Figure 1).


Figure 1: Sample images from the different collections used in this study (Utrecht University Museum, Museon and Naturalis).


We used two approaches to recruit subjects. The ‘’ site went online in September 2008, allowing anyone to visit and engage in tagging. We publicized the launch of the site at conferences and in specific publications. Laymen and experts were also approached directly and asked to tag objects. Most of the laymen subjects were recruited from among students of the Faculty of Communication and Journalism at HU University of Applied Sciences Utrecht. To recruit experts, personnel from the museums were approached, such as the Academic Centre for Dentistry in Amsterdam (Academisch Centrum Tandheelkunde). We used the same research environment with the same online instructions for all groups consistently throughout the experiment. In addition to tagging, subjects were also asked to disclose their gender, year of birth, postal code, highest level of education achieved, their profession, the frequency of their visits to a museum or archive and their main reason for visiting a museum or archive. We ultimately collected data for over a year, from September 2008 until December 2009.

We approached the museums and their associates to begin collecting stories for the Web site’s storytelling function. Museon and the Utrecht University Museum each selected four people to tell their stories as part of an exhibition on Japanese internment camps and the dental surgery exhibition. Consideration was given to people who had themselves been held in an internment camp or, in the case of the dental surgery collections, former employees of the Utrecht University Dental School, such as the dean. These eight people were interviewed at the end of 2008, and these interviews produced 62 stories. Because these 62 stories were too long for use in this research, we made a selection from these 62 video clips on the basis of a number of objective criteria. For example, no editing in the clips was allowed and the video clips had to be equal in length (two to three minutes), with good image quality and clear sound. Applying these criteria enabled us to select six video clips: three clips about the dental surgery collection and three clips about the collection of drawings from Japanese internment camps, varying in length from 1:21 to 2:46 minutes.

It was necessary to determine what constitutes a ‘good’ story and a ‘poor’ story to assess hypothesis 7 on the power of ‘better’ stories to involve the viewer or listener. We set up a pre-test to classify the stories during the spring of 2009. A total of eight professionals from each of the participating museums were asked to assess the six selected video clips with stories about objects. For each clip, questions were asked about the story (‘Is the video clip suitable for a broad audience?’, ‘Does the information in the story add another dimension to the drawing/picture?’ etc.) as well as about more subjective experiences such as whether the story was ‘fun’, ‘boring’, ‘instructive’ and so on. The subjects filled in a questionnaire for each video clip, with all answers scored on a five–point Likert scale. To complete the test, the subject was presented with six images of objects in A4 format in a random order. He or she was then asked to rank the images in order from ‘most appealing to persons interested in the collection’ to ‘least appealing to persons interested in the collection’. Finally, we were able to select two video clips simply and unequivocally from the available material, and these were then used for the storytelling experiment.



Social tagging findings

During the period of the experiment a total of 935 people made use of the application. The vast majority of respondents (92 percent) participated anonymously, with only eight percent registering. Of the 158 people who filled in a questionnaire, which could also be done anonymously, 67 percent were male and 33 percent were female. The average age was 39 years, but there was a considerable age spread. Most of the people had had some higher education; 81 percent had a university degree or advanced professional education. More than half visited museums several times annually, and many did so monthly or more often (65 percent); 28 percent of the people visited museums once a year and 12 percent of the people never visited museums. The main reason for their visits was interest (44 percent), followed by work/study (22 percent) and leisure (21 percent). We were able to identify the categories that 924 people belonged to: 89 percent could be characterized as laymen, 10 percent as experts and one percent as professionals.

The participants added a total of 3,592 tags. The largest number of tags (1,349) was added to the photographs of the Utrecht University Museum’s collection of dental surgical instruments. A further 727 tags were added to the other collection from the Utrecht University Museum. The Naturalis collection had 278 tags, and the entire Museon collection 1,238 tags. When examining each collection separately, the number of unique tags amounted to 2,221. This number dropped to 1,892 when we looked at the total number of unique tags across all collections. The total number of tags was 3,592 across 379 objects, which is an average of almost 9.5 tags per object; this average drops to 5.9 tags per object when only unique tags per object per collection are counted. Visitors added an average of 13 tags and a single tagging session lasted an average of 13 minutes and 39 seconds. The tags were entered non–uniformly over the test period, meaning that there were distinct ‘peaks’ in tagging activity after target groups were approached to engage in tagging.
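The totals and averages in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the text; the dictionary labels are shorthand chosen here for illustration.

```python
# Tag counts per collection as reported in the text.
tags_per_collection = {
    "UM dental surgical instruments": 1349,
    "UM other collection": 727,
    "Naturalis": 278,
    "Museon": 1238,
}

# Object count and per-collection unique tag count as reported in the text.
objects = 379
unique_tags_per_collection = 2221

total_tags = sum(tags_per_collection.values())
avg_tags_per_object = total_tags / objects
avg_unique_per_object = unique_tags_per_collection / objects

print(total_tags)                       # 3592
print(round(avg_tags_per_object, 2))    # 9.48
print(round(avg_unique_per_object, 2))  # 5.86
```

The per-collection counts sum to 3,592, and dividing by 379 objects gives the averages of almost 9.5 and 5.9 tags per object quoted above.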

Hypothesis 1 states that laymen add more tags overall and more unique tags to objects than experts do. This hypothesis was tested by examining the distribution of the overall number of tags added and the number of unique tags by laymen and experts, using a single–sample t–test at the five percent significance level. This test was carried out on a sample of data from all the collections combined, and from two sub–collections from the Utrecht University Museum. The other individual collections provided too little data to conduct a reliable test. The basic data are listed in Table 1. Both tests failed to produce a significant result: there was no significant difference in the number of tags added by laymen or experts (t(248) = 0.33; p < 0.37), nor was there any significant difference in the number of unique tags added by laymen or experts within their own groups (t(263) = 0.20; p < 0.42). Additionally, we examined whether there was a difference in the number of unique tags added to the total amount of tags by laymen and experts. Here too, no significant difference was found: as a group, laymen and experts added equal numbers of unique tags to the total number of tags (t(238) = 0.32; p < 0.37).


Table 1: Numbers and averages of tags and unique tags (per collection) among laymen and experts.
All collections; Numbers and avg. of tags; Numbers and avg. of unique tags; Total number; Unique number
UM Dental Surgery 1 
UM Dental Surgery 2 


To assess hypothesis 2, three appraisers were asked to categorize a selection of 50 tags in a separate session by assigning each of the tags to one of three categories. The 50 tags had been added to the photographs of Utrecht University Museum’s dental surgery collection and consisted of the 25 most frequent laymen tags and the 25 most frequent expert tags for this collection. The tags were ranked and presented in alphabetical order. The three tag categories were defined as: 1) Descriptive: tags that provide factual information about an object. Examples here are words like ‘cow’, ‘black and white’ or ‘painting’; 2) Reference or self–reference: tags that are used to retrieve related information/objects. Examples here are words like ‘important’ or ‘interesting’; 3) Attitude: tags that express an opinion or emotion about an object. Examples here are words like ‘beautiful’, ‘scary’ or ‘fun’. The great majority of tags from laymen and experts were classified as ‘descriptive’ (65 and 63 respectively); a few were classified into the ‘self–reference’ category (10 from both groups of taggers); and the ‘attitude’ category was virtually non–existent (only two instances in the group of experts). Based on this it was concluded that there is no difference between the types of tags used by laymen and experts, at least for the most frequently added tags.

We assessed hypothesis 3 (use of vocabulary) by examining the similarity between the words in the tags added by laymen and experts respectively. Similarities are on the lexical level; so the use of ‘caries’ and ‘little holes’ is considered as a vocabulary difference. The proportion of tags used by both laymen and experts was calculated. Subsequently, we determined whether the sequence of these two series of similar words was the same or different. For example, a tag might be used by both laymen and experts, but laymen might use the tag seldom (low–ranking) while experts might use it frequently (high–ranking), or vice versa. Frequency of use tells us something about the ‘weight’ that a user group gives to a word. By determining the frequency with which a word is used with a certain object (co–occurrence), we also have a crude measure of the semantic similarity of words.

In the Utrecht University Museum’s Dental Surgery 1 collection, experts used 67 tags more than once. Thirty–one of these tags, or 46 percent, were also used by laymen. In the Dental Surgery 2 collection, experts used 30 tags more than once. Nineteen of these tags, or 63 percent, were also used by laymen. In Utrecht University Museum’s Dental Surgery 1 collection, laymen used 97 tags more than once. Forty–seven of these tags, or 48 percent, were also used by experts. In the Dental Surgery 2 collection, laymen used 61 tags more than once, and 26 of these tags, or 43 percent, were also used by experts. Evidently, there was a considerable overlap between the tags added by laymen and by experts. We subsequently examined whether the tag rankings among laymen and experts were equal. In order to assess this, we calculated Kendall’s tau–b, a measure of rank correlation at five percent significance. The four values calculated for the Dental Surgery 1 and Dental Surgery 2 collections show three non–significant results (tau–b = 0.25; p < 0.07; tau–b = 0.31; p < 0.06; tau–b = 0.25; p < 0.19) and one significant result (tau–b = 0.32; p < 0.006). With some caution, it can be said that in those instances where laymen and experts use the same words, those words did have a different weight or ranking.
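The two steps in this paragraph, the overlap percentages and the rank correlation, can be sketched as follows. The overlap counts are those reported in the text. The tag frequencies used for the correlation are hypothetical, since the underlying data are not reproduced here. `kendall_tau_b` is a plain-Python illustration of the tau–b statistic with tie correction; it computes only the coefficient and does not reproduce the significance test used in the study.

```python
from math import sqrt


def overlap_percent(shared, total):
    """Percentage of one group's repeated tags that the other group also used."""
    return round(100 * shared / total)


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equally long numeric sequences, with tie correction."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: counted in neither term
            if dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    pairs = concordant + discordant
    denom = sqrt((pairs + ties_x) * (pairs + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")


# (shared tags, tags used more than once) as reported in the text.
reported = {
    "experts, Dental Surgery 1": (31, 67),
    "experts, Dental Surgery 2": (19, 30),
    "laymen, Dental Surgery 1": (47, 97),
    "laymen, Dental Surgery 2": (26, 61),
}
for label, (shared, total) in reported.items():
    print(label, overlap_percent(shared, total), "percent")

# Hypothetical frequencies of tags used by both groups (illustration only).
laymen = {"tooth": 12, "drill": 9, "pliers": 7, "mirror": 3, "caries": 1}
experts = {"tooth": 4, "drill": 6, "pliers": 2, "mirror": 2, "caries": 9, "forceps": 5}
shared_tags = sorted(set(laymen) & set(experts))
tau = kendall_tau_b(
    [laymen[t] for t in shared_tags],
    [experts[t] for t in shared_tags],
)
print(round(tau, 3))
```

A coefficient near 1 would mean both groups weight their shared tags in the same order. Values near 0, as in the made-up example, mean the shared words carry different weights in the two groups.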

Of course, when assessing hypothesis 4 (added information), it is important to define the term ‘information’ clearly. For the purpose of our research, we have used two different interpretations. First of all, we used an objective measure from information theory. Objective measurement research employs a well–defined objective measure to define the proximity of two tag clouds and the coherence between tag words in a tag cloud, which is also referred to as the semantic distance (see Brussee and Wartena, 2008; Wartena and Brussee, 2008; van Vliet, et al., 2010). The outcome of these complex calculations is that the tags added by laymen and those added by experts do not differ significantly in their informativity.

Secondly, we used the professionals’ subjective assessment of the informativity of tags. For this purpose, a supplementary experiment was carried out, which also enabled us to assess hypothesis 5. The central assumption is that different target groups make different contributions through tagging. Laymen may add information to an object that is of little relevance, but they may also contribute tags that are more useful for other users to retrieve objects, such as ‘white’, ‘scary’ or ‘head’. Experts are more likely to add relevant information to an object because of their specific knowledge and expertise. A selection of eight objects containing the most commonly added tags of both experts and laymen was collected from the data; in all cases, these turned out to be objects in Utrecht University Museum’s dental surgery collection. For each of these eight objects, we selected the four most frequently occurring tags added by laymen and experts, making eight tags in total per object. Fourteen professionals from the participating museums were asked to classify the tags into qualitative categories ranging from ‘good’ to ‘bad’, firstly answering the question: ‘Which tag, in your opinion, adds the most information to the object?’; and then the question: ‘Which tag, in your opinion, is the best search term to find the object?’

This method of asking subjects to rank tags for an object on a scale from ‘good’ (1) to ‘bad’ (8) is referred to as ordinal measurement or ranking. Whether such a ranking is coincidental or not can be assessed with a non–parametric test: the Wilcoxon Signed–Rank Test. Turning to the question concerning the added information of tags, we obtained a significant result in six of the eight cases at a five percent significance level: the expert tags were considered to be more informative than the laymen tags (z = 2.53, p < 0.05; z = 3.04, p < 0.05; z = 2.88, p < 0.05; z = 2.86, p < 0.05; z = 1.76, p < 0.05; z = 3.22, p < 0.05). There was no significant difference for two of the objects (z = -0.20; z = 0.83). None of the significant results indicated the contrary. As for the question of retrieving objects using tags, only two of the eight cases yielded a significant result (z = 2.09, p < 0.05; z = 2.24, p < 0.05). In these significant cases, the laymen tags were considered more effective for the retrieval of the object than the expert tags. There was no significant difference for the other six objects (z = 0.14; z = -1.14; z = 1.49; z = 1.47; z = 1.16; z = 1.59), but none of the other results indicated the contrary (i.e., that the laymen tags made the object less easy to retrieve). We also assessed whether the group of professionals did indeed form a homogeneous group of assessors; in all cases, it became apparent that a consensus existed among the professionals in their tag–ranking assessments, as tested by chi–square tests of Kendall’s W for all eight objects.
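The two tests named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis script used in the research: the way expert and laymen tags are paired per professional (mean rank of the four expert tags against mean rank of the four laymen tags) is an assumption, the Wilcoxon z uses the plain normal approximation without tie correction, and the rankings are invented.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def wilcoxon_z(first, second):
    """Normal-approximation z of the Wilcoxon signed-rank test (no tie correction)."""
    diffs = [a - b for a, b in zip(first, second) if a != b]  # zero differences dropped
    n = len(diffs)
    if n == 0:
        return 0.0
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean) / sd


def kendalls_w(rankings):
    """Kendall's W for m assessors who each rank the same n items without ties.

    Returns (W, chi_square), where chi_square = m * (n - 1) * W is compared
    against a chi-square distribution with n - 1 degrees of freedom.
    """
    m = len(rankings)
    n = len(rankings[0])
    totals = [sum(r[j] for r in rankings) for j in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    w = 12 * s / (m ** 2 * (n ** 3 - n))
    return w, m * (n - 1) * w


# Hypothetical rankings by four assessors (1 = 'good', 8 = 'bad').
# Columns 0-3 are assumed to be expert tags, columns 4-7 laymen tags.
rankings = [
    [1, 2, 3, 5, 4, 6, 7, 8],
    [2, 1, 4, 3, 5, 7, 6, 8],
    [1, 3, 2, 4, 6, 5, 8, 7],
    [1, 2, 4, 6, 3, 5, 7, 8],
]
expert_mean = [sum(r[:4]) / 4 for r in rankings]
laymen_mean = [sum(r[4:]) / 4 for r in rankings]

# Positive z: laymen tags received higher (worse) ranks than expert tags.
z = wilcoxon_z(laymen_mean, expert_mean)
w, chi_square = kendalls_w(rankings)
```

With only a handful of assessors the normal approximation is rough; an exact test would be preferable for small samples.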



Social storytelling findings

Three different modalities were made for both stories: 1) the video clips presented during the pre–test; 2) an audio version, for which the narrated story was extracted from the video and converted into an MP3 clip; and, 3) a text version using a transcript of the narrated story in the video clip. This resulted in a 2x3 factorial design (good story/poor story x video/audio/text).
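The six cells of this design can be enumerated as in the sketch below. The rotation used to assign subjects to cells is purely hypothetical; the actual assignment procedure is not described here, and the names are chosen for illustration only.

```python
from collections import Counter
from itertools import product

STORY_QUALITY = ("good", "poor")
MODALITY = ("video", "audio", "text")

# All six cells of the 2x3 design: (story quality, modality).
conditions = list(product(STORY_QUALITY, MODALITY))


def assign_condition(subject_index):
    """Hypothetical rotation: cycle subjects through the six cells in order."""
    return conditions[subject_index % len(conditions)]


# With 61 subjects, a rotation keeps the cell sizes within one subject of each other.
cell_sizes = Counter(assign_condition(i) for i in range(61))
```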

The varying conditions (text, audio, video) were presented in the ‘’ research environment, where the video image of the object was displayed consistently under all conditions. The subjects were students at HU University of Applied Sciences, and this group was identified as ‘laymen’. A total of 65 subjects participated in the experiment; four subjects were removed from the data because the video recording application did not work properly. The 61 remaining subjects included 15 women and 46 men, with an average age of just under 23 years. More than half of the students were studying Digital Communication at the Faculty for Communication and Journalism (36); other degree programmes included Commercial Economy (6); Media Technology (5); and, Cultural and Social Education (4). The experiment was conducted at the Crossmedialab in the Faculty for Communication and Journalism at HU University of Applied Sciences in November and December 2009. The subjects followed onscreen instructions while at their computer monitors. A test manager was present to answer any questions that arose. The experiment lasted an average of 20 to 30 minutes.

We examined the data collected to find whether the ‘good’ versus ‘poor’ manipulation had been effective. Statistical analysis then shows the following:



Discussion and conclusions

On the basis of the data acquired and our analysis of these, we can draw the following conclusions. Hypothesis 1 can be rejected: laymen do not add significantly larger numbers of tags than experts, nor do they add larger numbers of unique tags than experts. In fact, these two groups add virtually the same proportion of unique tags. Hypothesis 2 can also be rejected: laymen do not add different types of tags than experts. Both groups add mainly descriptive tags. While Schmidt and Stock (2009) discovered more attitudinal tags and Strohmaier, et al. (2010) more self–referencing tags, these two types of tags were virtually non–existent in our experiment. Hypothesis 3 cannot be rejected: laymen use different words from experts, but not entirely different ones. On the basis of the data produced by assessing the two collections, we discovered a substantial overlap (almost 50 percent) between the tags added by laymen and experts. Only one of the four cases assessed showed a significant association in the ranking of words used by laymen and experts. In other words, the tags used by both laymen and experts differ in the relative frequency with which they occur. Hypothesis 4, concerning the information added by the two groups of taggers, was assessed in two different ways. Using an objective measurement of the ‘distance’ between the two tag clouds, no difference was found in the degree of ‘informativity’ between the laymen’s tags and the experts’ tags. The post–experiment research carried out with professionals did reveal, however, that the expert tags are considered ‘more informative’: six of the eight objects on the basis of which the hypothesis was assessed produced a significant result. Hypothesis 5 was assessed by having professionals examine the usefulness of tags for retrieving a particular object. In only two of the eight cases examined were the laymen tags considered to be better keywords for object retrieval than the expert tags. In none of the cases were the expert tags considered better keywords for object retrieval. We will discuss these conclusions further after we have presented the results of the storytelling research.

The conclusions from the storytelling research are as follows. Viewing, listening to, or reading the stories changed the participants’ attitudes towards museums in a positive sense: after the experiment, the subjects were more positive about visiting the museum or its Web site, and about recommending the museum to friends and family. Furthermore, after the experiment the subjects indicated that they were more motivated to visit a museum for leisure purposes, and less motivated to visit for work or study. On the other hand, we saw no change in the emotions experienced before and after the experiment: the scores for the emotional pairs indicate that both before and after viewing the clips, the subjects were mainly relaxed, calm, passive, and a little bored but happy nonetheless. Assessing the differences in the emotions before and after the experiment did not lead to a significant result in any of the seven pairs. Hypotheses 6 and 7 were tested with a 2x3 factorial design for measuring media experience. No significant difference in engagement was found for the main effect of modality (video/audio/text) or the main effect of the story (good/poor). Nor was a significant interaction effect (modality x story) found. On the basis of this experiment, therefore, neither hypothesis 6 nor hypothesis 7 can be rejected with regard to experience.

The distinction between different groups (laymen, experts, and professionals) is one contribution that this research makes to the discussion of the value of social tagging for museums. Five hypotheses were formulated in which the distinction between laymen and experts was assigned a decisive role in enriching and facilitating the retrieval of digital objects. On certain important points, our findings show that there is no significant difference between laymen and experts in this respect. But although there was a substantial overlap in the tags used by both groups, each group also added its ‘own’ words, words that were not mentioned by the other group, or that were mentioned less frequently. Moreover, there is a difference between the relative weights assigned to the tags shared by both groups: one group mentioned those tags less frequently than the other group. This is also visible in the tag clouds, in the sense that the laymen did indeed use more ‘everyday’ words to describe the objects in the dental surgery collection, such as ‘tooth’, ‘back tooth’, and ‘hole’, while experts tended to use more technical lexis such as ‘dental caries’ and ‘dental prosthesis’. The most extreme examples of this tendency were found in the beetle collection, where experts used specialist terms to identify beetle species (usually the Latin names of genera and species) to such an extent that there was virtually no overlap with the terms used by the laymen. Laymen’s terms, such as ‘beetle’, ‘bug’, and ‘black’, were not among the words used by the experts at all.
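The two notions discussed here, overlap in vocabulary and differences in the relative weight of shared tags, can be made concrete with a small sketch. The overlap is computed as a Jaccard index, which is an assumption: how the overlap of almost 50 percent was calculated in this research is not specified here. The function names and example tags are likewise illustrative.

```python
from collections import Counter


def vocabulary_overlap(tags_a, tags_b):
    """Share of distinct tags used by both groups (Jaccard index, an assumed measure)."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def relative_frequencies(tags):
    """Relative frequency of each tag within one group's tag cloud."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}


def frequency_gaps(tags_a, tags_b):
    """For tags shared by both groups, relative frequency in a minus that in b."""
    freq_a = relative_frequencies(tags_a)
    freq_b = relative_frequencies(tags_b)
    return {tag: freq_a[tag] - freq_b[tag] for tag in freq_a.keys() & freq_b.keys()}


# Hypothetical tags for one object in the dental surgery collection.
laymen = ["tooth", "tooth", "hole"]
experts = ["tooth", "dental caries"]

overlap = vocabulary_overlap(laymen, experts)
gaps = frequency_gaps(laymen, experts)
```

A positive gap for a shared tag means the first group uses it relatively more often than the second, which corresponds to the difference in weight visible in the tag clouds.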

In a general sense, we can conclude that both laymen and experts provide their own contributions to digital collections through social tagging. The contributions from both groups are well matched in a quantitative sense, but differ to some extent in qualitative terms. Based on this research, therefore, we cannot confirm the assumption that having experts engage in tagging is any more ‘productive’ than having laymen engage in tagging. Both groups contributed in their own, specific ways and the research indicates that experts’ tags contribute more to informativity while laymen’s tags tend to contribute to retrievability. Apart from the distinction between laymen and experts, we may further conclude that tagging does indeed enrich collections, in the sense that it adds keywords that have additional value for enabling access to collections.

This research has also shown that there were very few occurrences of ‘spontaneous’ social tagging. Most tags were added when particular groups (students, dental surgery experts) were actively approached to join the experiment. Familiarity or unfamiliarity with the Web site may have something to do with this; we did draw attention to the Web site among the required target groups, but this was apparently not sufficient. The collections were also uploaded to Flickr, but the results indicated that only a few spontaneous tags were added. In the case of Flickr, the unfamiliarity of this site was not an issue, but a few hundred photographs could have been lost in the vast amount of material found on Flickr.

Another research contribution is the question relating to the role of storytelling in enriching museum collections and its potential for eliciting visitor involvement. As with social tagging, the question must be asked: when is it useful to involve visitors in storytelling and story presentation, and what is the effect of this? Finding no significant results in relation to hypotheses 6 and 7 was quite an unexpected outcome. We had at least expected that viewing a video presentation of a good story would be clearly different from reading the textual presentation of a poor story, to mention the two most extreme modalities in the experiment. One possible explanation for this result is that the difference was simply not large enough to emerge as significant with the measurement method used and the limited number of subjects involved.

Another possibility is that the two selected stories ‘meant nothing’, in an emotional sense, to the subjects; in other words: they were simply indifferent. This interpretation is perhaps backed up by the fact that in the measurement of emotions before and after the experiment, the scores in the seven emotion pairs indicated the subjects were experiencing the same emotions before and after the experiment. In short, there was actually no change in the emotions identified. This leads us to ask whether the manipulation in regard to this aspect (emotional impact) had been sufficient to allow for a significant, measurable result using such a small number of subjects. In previous research using video material to examine emotional experience, ‘fierce and intense’ visual images have often been selected (Lazarus and Folkman, 1984; van Vliet, 1991). For the purpose of follow–up research, it is thus advisable to make a thorough assessment of the visual material’s potential to elicit an emotional response, and the number of subjects required to show a significant difference.

An additional explanation for the absence of significant results is the operationalization of the notion of ‘experience’. For the purpose of this research, we chose to use the validated items from the study by de Haan and Adolfsen (2008). These items were focused on the experience of different types of media, while this research involved not so much different media as different modalities (text, audio, and video). For a follow–up experiment, we should reconsider how to operationalize experience, and also consider alternative experience scales (see for example, Coan and Allen, 2007; Rubin, et al., 2009).

However, in a positive sense, viewing/listening to/reading the stories did result in a change in attitude towards museums after the experiment, making the subjects more positive about visiting the museum, the museum’s Web site, and recommending the museum to friends and family. We were not able to discover the exact cause of this change, or to determine its precise relationship with the fact that there was also a shift in motivation: from ‘for work/study’ to ‘for leisure’. This would mean that exposure to stories in itself has a positive effect on visitor involvement with museums. We did not investigate whether this effect would persist over the longer term and whether it would actually increase visits to museums or their Web sites, but it does mean that continued research in this area is relevant and urgent.

To return to the main question, and taking into account all methodological caveats that we have discussed only partially here (see also van Vliet, et al., 2010), we can conclude that social tagging and storytelling are relevant for museums as tools with which to enrich their collections. Our research has produced more equivocal results in relation to the two other aspects of the benefits of social tagging referred to above, retrievability and involvement, but, in any case, our findings do not contradict the assumption that social tagging and storytelling could contribute to retrievability and promote visitor involvement. An essential aspect here is to consider who is asked to do what: in addition to the museum professional, a distinction is made between laymen and experts, which has proven to be relevant when it comes to interpreting the results clearly. This is inextricably linked to the question of how to reach and win over these target groups so that they will provide the input required.


About the authors

Harry van Vliet is professor of cross media and head of the Crossmedialab at the Utrecht University of Applied Sciences.
E–mail: Harry [dot] vanvliet [at] hu [dot] nl

Erik Hekman is a Ph.D. fellow researching social media and social capital at the Utrecht University of Applied Sciences.
E–mail: Erik [dot] hekman [at] hu [dot] nl

Acknowledgments

This publication could not have been realized without the museums’ participation in this research; we would particularly like to mention: André van Schie and Reina de Raat of the Utrecht University Museum, Hub Kockelkorn at Museon, and Sander Pieterse and Berry van der Hoorn of Naturalis. We would also like to thank the other members of the PACE project who made substantial contributions to the preparation and conduct of this research and, of course, we are grateful to the members of the Crossmedialab for providing critical views and a stimulating atmosphere.

Notes

1. Veeger, 2008, p. 33.

References

J. Beyl, G. Nulens and B. de Nil, 2008. “On–line heritage presentation in Flanders: A new way of searching and presenting heritage content,” In: J. Trant and D. Bearman (editors). Museums and the Web 2008: Proceedings. Toronto: Archives & Museum Informatics.

R. Brussee and C. Wartena, 2008. “Automatic thesaurus generation using co–occurrence,” 20th Belgian Netherlands Conference on Artificial Intelligence (University of Twente, 30–31 October).

G. Chae and J. Kim, 2011. “Can social tagging be a tool to reduce the semantic gap between curators and audiences? Making a semantic structure of tags by implementing the facetted tagging system for online art museums,” In: J. Trant and D. Bearman (editors). Museums and the Web 2011: Proceedings. Philadelphia: Archives & Museum Informatics.

J. Coan and J. Allen (editors), 2007. Handbook of emotion elicitation and assessment. Oxford: Oxford University Press.

J. Dalton, 2010. “Can structured metadata play nice with tagging systems? Parsing new meanings from classification–based descriptions on Flickr Commons,” In: J. Trant and D. Bearman (editors). Museums and the Web 2010: Proceedings. Denver: Archives & Museum Informatics.

J. de Haan and A. Adolfsen, 2008. The virtual cultural visitor. Audience interest in cultural websites (De virtuele cultuurbezoeker. Publieke belangstelling voor cultuurwebsites). The Hague: Social and Cultural Planning Office (Sociaal en Cultureel Planbureau).

R. Lazarus and S. Folkman, 1984. Stress, appraisal, and coping. New York: Springer.

C. Marlow, M. Naaman, d. boyd and M. Davis, 2006. “HT06, tagging paper, taxonomy, Flickr, academic article, ToRead,” Proceedings of Hypertext ’06.

R. Rubin, A. Rubin, E. Graham, E. Perse and D. Seibold, 2009. Communication research measures II: A sourcebook. London: Routledge.

S. Schmidt and W. Stock, 2009. “Collective indexing of emotions in images: A study in emotional information retrieval,” Journal of the American Society for Information Science and Technology, volume 60, number 5, pp. 863–876.

S. Sen, S. Lam, A. Rashid, D. Cosley, D. Frankowski, J. Osterhouse, F. Maxwell Harper and J. Riedl, 2006. “Tagging, communities, vocabulary, evolution,” CSCW ’06: Proceedings of the 2006 20th Anniversary Conference on Computer Supported Cooperative Work.

J. Simons, 2008. “Another take on tags? What tags tell,” In: G. Lovink and S. Niederer (editors). Video vortex reader: Responses to YouTube. Amsterdam: Institute for the Unstable Media, pp. 239–254.

R. Srinivasan, R. Boast, K. Becvar and J. Furner, 2009. “Blobgects: Digital museum catalogs and diverse user communities,” Journal of the American Society for Information Science and Technology, volume 60, number 4, pp. 666–678.

M. Strohmaier, C. Körner and R. Kern, 2010. “Why do users tag? Detecting users’ motivation for tagging in social tagging systems,” ICWSM2010: International AAAI Conference on Weblogs and Social Media, pp. 339–342.

J. Trant, 2006a. “Social classification and folksonomy in art museums: Early data from the tagger prototype,” Proceedings of the 17th ASIST SIG/CR Social Classification Research Workshop.

J. Trant, 2006b. “Exploring the potential for social tagging and folksonomy in art museums: Proof of concept,” New Review of Hypermedia and Multimedia, volume 12, number 1, pp. 83–105.

J. Trant and B. Wyman, 2006. “Investigating social tagging and folksonomy in the art museums with,” paper presented at the Tagging Workshop at the International World Wide Web Conference.

J. Trant, D. Bearman and S. Chun, 2007. “The eye of the beholder: and social tagging of museum collections,” paper presented at the International Cultural Heritage Informatics Meeting.

C. Trattner, C. Körner and D. Helic, 2011. “Enhancing the navigability of social tagging systems with tag taxonomies,” i–KNOW ’11: Proceedings of the 11th International Conference on Knowledge Management and Knowledge Technologies.

M. van Setten, R. Brussee, H. van Vliet, L. Gazendam, Y. van Houten and M. Veenstra, 2006. “On the importance of ‘Who Tagged What’,” paper presented at the Workshop on the Social Navigation and Community–Based Adaptation Technologies, in conjunction with AH ’06: Adaptive Hypermedia and Adaptive Web–Based Systems.

H. van Vliet, 2009. Digital cabinets of curiosity. Cultural heritage & cross-media (De Digitale Kunstkammer. Cultureel Erfgoed en Crossmedia). Utrecht: Utrecht University of Applied Sciences.

H. van Vliet, 1991. Attractive appearances (De Schone Schijn. Een analyse van psychologische processen in de beleving van fictionaliteit en werkelijkheid bij theatrale producten). Amsterdam: Thesis.

H. van Vliet, E. Hekman, N. Veldhoen and M. Rotte, 2010. “Public annotation of cultural heritage (Publieksannotatie van Cultureel Erfgoed),” Research report at the Crossmedialab. Utrecht: Utrecht University of Applied Sciences.

L. Veeger, 2008. Drawing up the balance of collections. Research into the ups and downs of museum collections in The Netherlands (De collectiebalans. Een onderzoek naar het wel en wee van museumcollecties in Nederland). Amsterdam: The Netherlands Institute for Cultural Heritage (Instituut Collectie Nederland).

C. Wartena and R. Brussee, 2008. “Instance based mapping between thesauri and folksonomies,” ISWC ’08: Proceedings of the 7th International Conference on The Semantic Web, pp. 356–370.

H. Wubs and F. Huysmans, 2006. Digging and sniffing around. On target groups of digitally accessible archives (Snuffelen en graven. Over doelgroepen van digitaal toegankelijke archieven). The Hague: Social and Cultural Planning Office.


Editorial history

Received 19 January 2012; revised 8 April 2012; accepted 9 April 2012.

Creative Commons License
This work is licensed under a Creative Commons Attribution–NonCommercial–ShareAlike 3.0 Unported License.

Enhancing user involvement with digital cultural heritage: The usage of social tagging and storytelling
by Harry van Vliet and Erik Hekman
First Monday, Volume 17, Number 5 - 7 May 2012