A Data-based Perspective on Heritage Interpretation

From Lyndsey Twining

There are many possible ways to address the limitations of current Korean cultural heritage interpretation which have nothing to do with a digital or data-based perspective. However, without fundamentally breaking out of the “old-media” and “traditionalist” mindsets, which conceive of heritage interpretation as a one-directional process in which experts educate an unknowledgeable general public at heritage sites or museums via text or guided tours, there is a limit to the extent to which the ideals of interpretation can be realized. Even if there is an overhaul in the bureaucratic structure which oversees interpretive resource creation, even if best practices for the interpretive text creation and translation process are researched, even if in-depth research with test audiences into which content is most effective and understandable for particular demographics is undertaken, and even if current online resources and engagement opportunities are better linked to one another and advertised, the available resources will remain limited to narrative text, video, and audio. These have limits on 1) the personalization of resources for the interests and motivations of each individual, 2) the ability to reuse the interpretive information for multiple purposes including research and content creation, and 3) the ability to navigate through and organize the context of interpretive information in nuanced and multivalent ways. A data-based perspective toward heritage interpretation not only has the potential to address those problems which can be solved via non-digital approaches, but also allows for new and improved functionality that is simply impossible with an old-media, traditionalist mindset. Therefore, this thesis proposes such a data-based model to address the current limitations of Korean cultural heritage interpretation.

However, before the model and its functions are presented, there is a need to understand how heritage interpretation has been approached from a digital perspective thus far, and to look more specifically at the significance of databases – graph databases in particular – and the ontology which makes the graph database possible. Exactly how this data-based perspective addresses the current limitations of Korean cultural heritage interpretation regarding the five ideals of interpretation must also be outlined. These will be presented in the following sections.

Digital Perspectives on Heritage Interpretation

Heritage interpretation scholars have discussed the role of technology in heritage interpretation for over half a century. Even Tilden, in 1950, addressed the place of technological “gadgets” in heritage interpretation (133). Until the late 2000s, most of the discourse revolved only around the use of the kind of offline digital technologies discussed in Section III.2.2 – touch screens, AV devices, video displays, etc. – as tools to further the same traditionalist, old-media framework of heritage interpretation. While some scholars continue to limit themselves to this conception of the role of technology (see Ham 2013, Shaliginova 2012, Beck and Cable 2011), other scholars have begun to grapple with the way computers and the Internet are fundamentally altering how we understand heritage, as well as the public’s expectations around and possibilities for heritage interpretation. Volumes such as Heritage: Critical Approaches (2013), New Heritage: New Media and Cultural Heritage (2008), Heritage and Social Media (2012), and Cultural Heritage Information: Access and Management (2015) are filled with articles by scholars who are researching the changing meaning and role of heritage and heritage interpretation in the digital age.

Among such contemporary scholars, Staiff (2016) best summarizes the way digital technology, including digitization and the Internet, has fundamentally altered the possible ways we might conceive of heritage interpretation:

“Digitalization has reunited physical sources that have been hitherto kept separate by the silos of government bureaucracy and by the different missions and professional practices of geographically separate institutions like libraries, museums (private and public), archives (private and public) and universities. Now all digital users have the capacity to do what twentieth-century historians and archaeologists had to do professionally: that is, re-unite sources artificially separated in the collection/preservation process whether print, visual or material in order to make coherent analyses and narratives. This democratization of knowledge practices gives the visitor unprecedented opportunities to make their own heritage, to participate in the processes once confined to specialists… It allows the user/visitor to bring together for themselves an explanation that draws upon the patchwork of fragments, snippets and layers of information available on the Web; to create (provisional) meaning in the matter of doing a jigsaw puzzle, a temporary assemblage of parts that is a product of, but which contributes to, the constantly evolving digital environment” (loc. 2960).

However, there is still uncertainty as to what kind of resources “digital” users of heritage information need and how they would use such resources (Stiller 2013, Chowdhury S. 2015). While most of the current research has centered around the development of resources suited to the management needs of heritage institutions, some scholars have attempted to implement and test various digital resources which are geared toward non-institutional use (Stiller 2013, Stiller and Petras 2015, Staiff 2016, Clough et al 2015, Shiri 2015). However, even without specific platforms or resources for heritage interpretation, the Internet itself facilitates modes of heritage interpretation which are simply not possible with old media, and gives unprecedented creative agency to those who have traditionally been in the passive “receiving” position of interpretations.

Staiff gives an example of this in action, describing a class project in which he instructed students to “navigate through The Rocks<ref>The “site of the first British settlement on the continent of Australia,” according to Staiff.</ref> as a ‘heritage tourist’ using only the resources available on a smart phone”; students were told in advance only to familiarize themselves “with what was available online” (2016, loc. 2979). Students came up with a variety of different interpretation activities, including replicating historic photos, creating themed itineraries, making videos, attempting to locate the sites of various historic events and buildings, “touring” places which no longer exist, and more. This shows that greater personalization of and engagement with the heritage interpretation process is indeed possible with a digital approach.

In the Korean context, the CHA is also not entirely a stranger to such digital approaches to heritage interpretation. In the 2014 Report (Cultural Heritage Administration 2014a), a large section is dedicated to a recommendation of a digital-based cultural heritage information system. The recommendation calls for the development and operation of an independent cultural heritage interpretive text portal, compilation of digital cultural heritage information content for each heritage site, development of multimedia augmented reality content creation and related services, the design of a cultural heritage knowledge information network, and making cultural heritage site information panels digital, among other more detailed recommendations. Below is a translated summary of the Report's key tasks for a digital-based cultural heritage information system design (252-261):

  1. Development and operation of an independent cultural heritage interpretive text portal
    1. Independent cultural heritage interpretive text portal
    2. Cultural heritage wiki design strategy
    3. Measures to improve the quality of user generated content
  2. Compilation of digital cultural heritage information content for each heritage site
    1. Interpretive text content creation for a digital cultural heritage information system
    2. Provision of extended interpretive information linked to offline information panels
  3. Multimedia AR content creation and service
    1. AR cultural heritage information service
    2. Ubiquitous digital cultural heritage information service
  4. Cultural heritage knowledge information network design
    1. Centralized management of cultural heritage
    2. Design of cultural heritage knowledge information network via the connections of relevant cultural heritage information
    3. Relevant data amalgamation service which coincides with the aims of Government 3.0
  5. Project to improve cultural heritage sites information panels on a digital basis
    1. Digital information panel
    2. A structure for a digital heritage site information system


Despite the seeming embrace of a digital perspective on heritage interpretation by the CHA as presented in the report, it should be noted that the report itself was commissioned by the CHA from the Cultural Informatics Lab at the Academy of Korean Studies, and reflects the opinions of the research team. Therefore, the extent to which the CHA actually intends to implement such suggestions is uncertain.

While the Internet and digital technology in general can, in part, facilitate the realization of the interpretive ideals and the 2014 Report's suggestions for a cultural heritage information system, there is a limit to what they can accomplish. For example, how adaptive augmented reality or text content is for each user depends on the extent to which the cultural heritage information within the system is broken down into detailed parts which can be manipulated and rearranged in real time. Text or photo content merely presented via a digital medium cannot, on its own, accomplish such nuanced personalization or contextualization, nor can it be easily utilized for the secondary purpose of research, as each text is imbued with its author's bias. Although a digital approach may allow for the inclusion of some metadata, it also cannot solve existing problems with redundant content creation and translation. Furthermore, although the Internet allows for access to information by many, from many locations, if this information is not well organized and thoroughly connected, it cannot be fully utilized. This is where the need for a data-based approach to heritage interpretation comes into play.

The Unique Capabilities of the Database

Data-based heritage interpretation can be understood as the practice of organizing, storing, managing, and accessing interpretive information and facilitating the creation of interpretive resources, through the utilization of data, the database, algorithms and interfaces. By extension, interpretive data is a manifestation of abstract interpretive information in a medium which can be processed by computers (or humans) and can be used in the creation of digital interpretive content and resources.

Databases facilitate organization and utilization of information which is simply not possible in the “real world.” Weinberger (2011) explains this through an analogy contrasting the way information was traditionally stored with what is now possible with digital technology. He uses the example of a physical store, how it organizes the products it holds, and how customers can find the products they are looking for or browse for products if they do not have a specific item in mind. He explains how the physical world has limitations regarding information access, such as “in physical space some things are nearer than others,” “physical objects can be in only one spot at any one time,” “physical space is shared” (i.e. there is only one layout which can be used), “human physical abilities are limited,” “the organization of the store needs to be orderly and neat,” and that because each customer has different needs, the store must stock many more items than any individual customer may need, which gets in the way of customers accessing what they are looking for (5-6). However, in the digital world, these limitations are removed. Weinberger says:

“Instead of atoms that take up room, it’s made of bits. Instead of making us walk long aisles, in the digital world everything is only a few clicks away. Instead of having to be the same way for all people, it can instantly rearrange itself for each person and each person’s current task. Instead of being limited by space and operational simplicity in the number of items it can stock, the digital world can include every item and variation the buyers...could possibly want. Instead of items being placed in one area of the store, or occasionally in two, they can be classified in every different category in which users might conceivably expect to find them. Instead of living in the neat, ordered shelves… items can be jumbled digitally and sorted out only when and how a user wants to look for them” (6). [1]

This “shopping at a store” analogy can also be applied to interpretive resources. While some Korean cultural heritage interpretive resources are presented digitally, they fail to fully realize the potential which Weinberger speaks of - namely because they do not utilize a database.

Manovich (2002) discusses the phenomenon in slightly different terms. He argues that, “historically, the artist made a unique work within a particular medium. Therefore the interface and the work were the same; in other words, the level of an interface did not exist. With new media, the content of the work and the interface are separated. It is therefore possible to create different interfaces to the same material” (227). He further states that “in general, creating a work in new media can be understood as the construction of an interface to a database. In the simplest case, the interface simply provides access to the underlying database…But the interface can also translate the underlying database into a very different user experience” (226). Such interfaces can take many forms, including a more traditional narrative, which “creates a cause-and-effect trajectory of seemingly unordered items (events)” (225). Though a database inherently rejects such predetermined trajectories, it can nonetheless recreate them in the user's experience via algorithms and interfaces.

In other words, the information stored within the database remains the same, but it can be accessed in countless juxtapositions and forms via various algorithms and interfaces. In the context of interpretive information, this means the work of developing interpretative content, in terms of translation, definitions, and descriptions of relationships, only needs to be done once, yet multiple interpretive resources can be created from it. Furthermore, if information in the database is changed (either because it was incorrect or needed to be improved), this is reflected in all the resources created via algorithms and displayed via interfaces. Such interfaces can take many forms, from personalized interpretive texts, to network graphs, timelines, and even virtual or augmented reality. These could be pre-curated by experts, or explored/generated organically by users. In sum, data-based interpretation allows nearly endless tailoring of content and media of interpretation, facilitating both prescribed and exploratory interaction with the interpretive information, while also allowing this information to be updated and improved instantly. Staiff (2016) discusses how this is the kind of interpretive resource today’s generations want, stating:
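The separation of stored data from its interfaces can be illustrated with a minimal sketch. It is not an implementation from this thesis: the `EVENTS` records, their dates and translations, and the two rendering functions are hypothetical placeholders, chosen only to show one body of data being presented in two forms and two languages.

```python
# Hypothetical interpretive data: each event is stored once, together with
# its translations. All names, dates, and translations are placeholders.
EVENTS = [
    {"year": 1700, "name": {"en": "Hall rebuilt", "ko": "전각 중건"}},
    {"year": 1600, "name": {"en": "Hall founded", "ko": "전각 창건"}},
]


def render_narrative(events, lang):
    """Interface 1: a short narrative text in the requested language."""
    ordered = sorted(events, key=lambda e: e["year"])
    return " ".join(f"{e['name'][lang]} ({e['year']})." for e in ordered)


def render_timeline(events, lang):
    """Interface 2: a timeline, built from exactly the same records."""
    ordered = sorted(events, key=lambda e: e["year"])
    return [f"{e['year']}: {e['name'][lang]}" for e in ordered]
```

Because both functions read from the same records, correcting a date in `EVENTS` changes the narrative and the timeline alike, in every language, without either interface being rewritten.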

“Web 2.0 and the generation of users who inhabit this experience…are not interested in pre-packaged information that is passively received; rather they want open access to databases so that they, as visitors, can share the content and be co-authors of the interpretation. The digital savvy visitor wants to be a creator of meaning as well as a consumer of meaning. This indicates that the old authoritarian structure will not work because visitors of the Web 2.0 generation are already part of a series of interlocking networks of information flows where they are both producers and consumers, and often both simultaneously….The 'new' generation of visitors will not be satisfied with what is provided on signs because the information on the signs may not relate to the visitor's question or context of experience and it will increasingly become easier, as the new technology becomes an indispensable personal accessory, to use Google to find the answer while walking around the site” (loc. 2880).

Another potential benefit of conveying information via data and a database is that it can be incorporated into the larger Semantic Web. The Semantic Web is an idea presented by the creator of the World Wide Web, Tim Berners-Lee. While the World Wide Web connects documents, the Semantic Web would connect contextual elements in ways which demonstrate their relationship to one another, thus theoretically mapping the relationships among all people, places, events, things, concepts, etc. imaginable. Although the Semantic Web has yet to be fully realized, storing interpretive information via the logic of the Semantic Web not only facilitates the description of relationships among contextual elements within the realm of Korean cultural heritage, but also allows this information to be connected to other heritages and other databases around the world.

What is a Graph Database?

There are various kinds of database models, each of which takes a different approach to storing data. Among these are graph databases. Robinson et al (2005) define graphs as “...a set of nodes and the relationships that connect them. Graphs represent entities as nodes and the ways in which those entities relate to the world as relationships” (1). When connected together, this network of nodes and relationships forms a web of information which can be analyzed and easily visualized. According to Robinson et al, there are several kinds of graph databases, including [labeled] property graphs, hypergraphs, and triples (206). Each of these graph database types has a different structure and therefore different strengths and weaknesses in regard to data analysis. However, the general concept behind them is the same.[2]

In a graph database, nodes are connected to one another via relationships. Nodes and relationships in graph databases can have labels and properties. Labels serve as a way to classify nodes and relationships into types, while properties describe various details of individual nodes or relationships. Nodes and relationships are categorized into labels depending on the nature of the node or relationship within the framework of the database. These labels naturally vary depending on the nature of the information being described within the database. Such labels can be useful when filtering nodes and relationships, and also for easily distinguishing nodes and relationships of different types in a visualization. Properties are used to convey details like the ID of the node or relationship, and other details about the node or relationship itself. These details, too, depend on the nature of the node or relationship being stored; different properties are useful in describing different kinds of nodes and relationships. In a visualization (such as in Neo4j), these properties can act as display names for nodes and relationships.
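The roles of labels and properties described above can be sketched in a few lines of Python. This is an illustration only, not a real graph database or the Neo4j API: the `Node` and `Relationship` classes, the label names, and the placeholder entries ("Heritage A", "Person B", "Event C") are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    node_id: str                      # unique identifier of the node
    label: str                        # classifies the node into a type
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class Relationship:
    source: str                       # node_id of the start node
    target: str                       # node_id of the end node
    label: str                        # classifies the relationship into a type
    properties: Dict[str, str] = field(default_factory=dict)


# Placeholder data; the labels and names are hypothetical.
nodes = [
    Node("H1", "Heritage", {"name": "Heritage A"}),
    Node("P1", "Person", {"name": "Person B"}),
    Node("E1", "Event", {"name": "Event C"}),
]
relationships = [
    Relationship("P1", "H1", "COMMISSIONED"),
    Relationship("H1", "E1", "DAMAGED_IN", {"note": "illustrative only"}),
]


def nodes_with_label(all_nodes: List[Node], label: str) -> List[Node]:
    """Filter nodes by label, as one might when querying or visualizing."""
    return [n for n in all_nodes if n.label == label]


def display_name(node: Node) -> str:
    """Use the 'name' property as the display name, falling back to the ID."""
    return node.properties.get("name", node.node_id)
```

Here `nodes_with_label(nodes, "Heritage")` returns only the heritage node, and `display_name` shows how a property can serve as the caption of a node in a visualization.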

There are various benefits to graph databases when considered within the context of heritage interpretation. Unlike relational databases, which are organized around tables of data (the relationships between which can be accessed via keys in each table), graph databases are centered around relationships. As Tilden (1950) said, interpretation “aims to reveal meanings and relationships” (33). Heritages do not have value in and of themselves, but gain such value from their larger context; in other words, from the relationships the heritage has with various historical and cultural concepts, people, events, etc. Graph databases allow such contextual relationships to be organized, analyzed, and clearly conveyed. Also, because of this emphasis on relationships, the pathways between nodes can be easily traversed, which makes finding related heritages or related contextual elements (people, concepts, etc.) easier. Graph databases are also well-suited to visualization, so the relationships between a heritage and its context can be displayed visually.
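What it means to traverse the pathways between nodes can be shown with a short sketch. It uses a breadth-first search over a small, undirected set of relationships; the node names are placeholders and the functions are hypothetical helpers, not part of any graph database product, which would normally provide such traversal through its own query language.

```python
from collections import deque

# Placeholder relationships, treated as undirected for traversal.
EDGES = [
    ("Heritage A", "Person B"),
    ("Person B", "Event C"),
    ("Event C", "Heritage D"),
    ("Heritage A", "Concept E"),
]


def build_adjacency(edges):
    """Map every node to the set of nodes it is directly related to."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search: return the shortest chain of nodes, or None."""
    if start not in adjacency or goal not in adjacency:
        return None
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(adjacency[node]):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None
```

With this placeholder data, the search connects "Heritage A" to "Heritage D" through a person and an event, which is the kind of contextual pathway a table of isolated heritage records cannot express.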

The current CHA heritage database is not based on the concept of a graph database. In fact, the information provided could easily be stored in a simple spreadsheet. Each heritage has metadata (as shown in section III.2), but there are no relationships between these heritages, nor are any non-heritage contextual elements included. Consequently, there are various limitations on how information about heritages can be presented. For example, heritages can only have one “heritage type” shown in the metadata (for example, only “Buddhist sculpture”), which is categorized hierarchically (under “artifact”).

Graph databases are a possible solution to the current limitations of CHA heritage data storage. In a graph database, the heritage could be connected via relations to two different “heritage types” which are stored as their own nodes (“Buddhist” and “sculpture”) and which can be accessed non-hierarchically (“Buddhist” via religions and “sculpture” via art forms), allowing the heritage to be accessed via multiple pathways and connected in more nuanced ways to other similar heritages. Furthermore, a graph database would go beyond storing data about heritages alone; it could also store data about their contextual elements (such as “Buddhist” and “sculpture,” along with specific historical figures, places, events, and more) and their relations to one another. This would facilitate navigation among concepts and heritages, as well as easier translation of contextual elements (as translations can be stored as node properties).
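The “Buddhist sculpture” example can be sketched as follows. The node identifiers, the `HAS_TYPE` relationship label, the placeholder heritages, and the translations are assumptions made for illustration; they are not taken from the CHA database or from the ontology proposed in this thesis.

```python
from collections import defaultdict

# Hypothetical nodes. Translations are stored once, as node properties.
NODES = {
    "h1": {"label": "Heritage", "name": {"en": "Heritage A", "ko": "문화재 A"}},
    "h2": {"label": "Heritage", "name": {"en": "Heritage B", "ko": "문화재 B"}},
    "h3": {"label": "Heritage", "name": {"en": "Heritage C", "ko": "문화재 C"}},
    "buddhist": {"label": "Religion", "name": {"en": "Buddhist", "ko": "불교"}},
    "sculpture": {"label": "ArtForm", "name": {"en": "sculpture", "ko": "조각"}},
    "painting": {"label": "ArtForm", "name": {"en": "painting", "ko": "회화"}},
}

# Each heritage can point to several type nodes instead of one fixed category.
EDGES = [
    ("h1", "HAS_TYPE", "buddhist"),
    ("h1", "HAS_TYPE", "sculpture"),
    ("h2", "HAS_TYPE", "buddhist"),
    ("h2", "HAS_TYPE", "painting"),
    ("h3", "HAS_TYPE", "sculpture"),
]


def related_heritages(heritage_id):
    """Map each other heritage to the type nodes it shares with heritage_id."""
    own_types = {target for source, _, target in EDGES if source == heritage_id}
    shared = defaultdict(set)
    for source, _, target in EDGES:
        if source != heritage_id and target in own_types:
            shared[source].add(target)
    return {h: sorted(types) for h, types in shared.items()}


def display(node_id, lang):
    """Look up a node's name in the requested language."""
    return NODES[node_id]["name"][lang]
```

In this sketch, "h1" is found to be related to "h2" through the shared “Buddhist” node and to "h3" through the shared “sculpture” node, two pathways that a single hierarchical “Buddhist sculpture” category would collapse into one. Because the Korean and English names live on the type nodes themselves, each is translated once and reused wherever the node appears.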

What is an Ontology?

As mentioned in the previous section about graph databases, nodes and relationships have labels and properties. These labels serve as ways to categorize the data stored in the database, while properties aid in providing useful information about the data. However, in order to make functional use of the database, there needs to be some strategy for its organization: deciding what kinds of labels and properties are included, which nodes and relationships should be categorized under which label, and which properties certain nodes or relationships need. This strategy depends on the nature of the information which will be stored in the database. Therefore, there is a need to make sense of the various elements of that information, as well as the nature of their relationships to one another and their various properties. Making sense of the nature of the information in this way is the objective of an ontology.

According to Kim et al (2016), an ontology (in the realm of information technology) is an agreement made regarding the technological language used in [linked] data for the purposes of facilitating communication across the Web (163). Before putting Korean cultural heritage interpretive information into a database, the nature of that information needs to be understood. Which elements of interpretive information should become nodes, what relationships they have with one another, how these nodes and relationships should be categorized, and what properties are needed to describe them must all be determined before interpretive information can be turned into data and stored in a database. This challenge of creating an ontology to describe Korean cultural heritage interpretive information so that it may be stored as data in a graph database (and, by extension, overcome the limitations of current Korean cultural heritage interpretation by living up to the ideals of heritage interpretation) is the main undertaking of this thesis.
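The decisions an ontology must settle can be made concrete with a small sketch in which the ontology is written down as a set of rules and used to check data before it is stored. The labels, required properties, and relationship rules below are hypothetical and far simpler than a real ontology; they are not the ontology developed in this thesis.

```python
# A hypothetical, heavily simplified ontology: which node labels exist, which
# properties each requires, and which labels each relationship may connect.
ONTOLOGY = {
    "node_labels": {
        "Heritage": {"name"},
        "Person": {"name"},
        "Concept": {"name"},
    },
    "relationship_labels": {
        "CREATED_BY": ("Heritage", "Person"),
        "RELATED_TO_CONCEPT": ("Heritage", "Concept"),
    },
}


def validate_node(label, properties):
    """Return a list of problems with a proposed node (empty if valid)."""
    if label not in ONTOLOGY["node_labels"]:
        return [f"unknown node label: {label}"]
    missing = ONTOLOGY["node_labels"][label] - set(properties)
    return [f"missing property: {p}" for p in sorted(missing)]


def validate_relationship(label, source_label, target_label):
    """Return a list of problems with a proposed relationship (empty if valid)."""
    if label not in ONTOLOGY["relationship_labels"]:
        return [f"unknown relationship label: {label}"]
    expected = ONTOLOGY["relationship_labels"][label]
    if (source_label, target_label) != expected:
        return [f"{label} must connect {expected[0]} -> {expected[1]}"]
    return []
```

A node labeled "Heritage" without a name, or a "CREATED_BY" relationship pointing from a person to a heritage, would be reported as inconsistent with these rules. Agreeing on such rules in advance is what allows data entered by different contributors to be combined and queried consistently.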

The Ideals of Heritage Interpretation from a Data-based Perspective

The possibilities of a data-based perspective on heritage interpretation in consideration of the five ideals of heritage interpretation are numerous. Because information must be stored as discrete entities and relationships, the various interpretive elements must be clearly defined, which reduces the likelihood of vague or wordy phrasing. The separation of the data (i.e. content) and interface (i.e. medium) means that data (including relations, definitions, translations, etc.) need to be made just once, and they can be reused for a variety of ends and in a variety of forms tailored to the needs of users. This separation and reusability also reduces the likelihood of typos or other basic errors when new resources are created; while the initial data must be proofread and fact-checked, it can be reused with confidence. Information can be easily updated or added because it is stored as just one node or relationship within a database (as opposed to hidden in the middle of sentences in many different texts), and any updates or improvements to the data can be immediately reflected in the resources because they are presented to audiences via automated algorithms and interfaces. Also, because of the separation between content and medium, data compilation can begin now, and this content will not go to waste even as technology advances in ways we cannot now anticipate, as it can be drawn upon by multiple new interfaces far into the future. Depending on the interfaces developed, diverse audiences can potentially be involved in data input, storytelling, research and more. This also means that one user can use the database to create content in one language, and it can be displayed in another, which allows the content to reach greater audiences.

The division of content and medium also facilitates the continual growth of the database. While narrative-form resources must be confined to a certain length before they become unwieldy, each user of a database can select to see whatever collection of nodes and relationships in the database they desire. This ability for continued enrichment of content means that, in addition to being a tool for interpretation, such a database has the potential to become a significant tool for scholarly research.

Furthermore, because graph databases are not in narrative form, they facilitate access to information from a variety of starting points – not just heritages, as is the case with most current interpretive resources. Via interfaces, the database can be presented in a narrative form, but this narrative form can be centered around contextual elements rather than heritages alone. Based on the relationships within the database, related information can be discovered via a variety of factors, including contextual elements, with a level of nuance just not possible without a graph database approach. This approach also facilitates the incorporation of links to engagement opportunities which are directly connected to contextual elements in which the audience has an interest.

The following table shows these various benefits of data-based heritage interpretation in comparison with the shortcomings of existing interpretive resources, as discussed in section IV.

Table 15 Overview of the benefits of data-based heritage interpretation

Clear / Accurate
  Existing Interpretive Resources' Shortcomings
    • Lack of transparency of sources, authors, translators
    • Typos, mistranslations, other basic errors
    • Information (descriptions of layout, series of events, etc.) is sometimes unclear when explained in text
  Graph Data-based Perspective
    • Because information has to be stored as discrete entities and relationships, the various interpretive elements must be clearly defined, which reduces the likelihood of vague or unclear phrasing.
    • Separation of “content” and “medium” means that the interface can reuse existing data, which means there are fewer chances for typos or other basic errors with the creation of new resources - the initial data must be proofread, but it can be reused with confidence.
    • The separation of content and medium also means that the same data can be displayed in a variety of forms, not just text.
Personal / Tailored
  Existing Interpretive Resources' Shortcomings
    • One-size-fits-all interpretive texts (apart from language, children's content)
    • Only in narrative form - usually interpretive text, tours, or some video content - or maps
    • Limitation to the depth and length of information presented
  Graph Data-based Perspective
    • Multiple interfaces to the same data mean that the same content can be displayed in many ways, which can potentially be used to create more personalized interpretive content and media for audiences.
Contextualized / Holistic
  Existing Interpretive Resources' Shortcomings
    • Most information only organized around heritages
    • Little information on contextual elements (if there is, it is not linked to)
    • Cannot discover related heritages (or other related contextual information) based on anything other than designation, broad periods, location, and type (as determined by the CHA)
  Graph Data-based Perspective
    • Because they are not in narrative form, databases, especially graph databases, facilitate access to information from a variety of starting points.
    • They can be turned into narrative form via an interface, but this narrative form can be centered around contextual elements other than heritages alone.
    • Based on the relationships within the database, related heritages can be discovered via a variety of factors.
Facilitates Engagement
  Existing Interpretive Resources' Shortcomings
    • Few audience-directed opportunities for engagement
    • Existing opportunities are not well advertised
    • Lack of in-depth opportunities for non-Korean speakers
  Graph Data-based Perspective
    • Depending on the interfaces developed, diverse audiences can potentially be involved in data input, storytelling, research and more.
    • Data-based content (as opposed to both analog and narrative content) facilitates easier translation (as explained in the following sections), which means that it can be engaged with by a wider variety of audiences.
    • The database can potentially incorporate links to existing off-line engagement opportunities which are directly connected to contextual elements in which the audience has an interest.
Sustainable / Innovative
  Existing Interpretive Resources' Shortcomings
    • Multiple versions of the same interpretive texts being written and translated over and over again
    • Lack of innovation of content and medium
    • Difficulty in updating information
  Graph Data-based Perspective
    • The separation of data and interface, as well as the lack of an inherent narrative, means that data (including relations, definitions, translations, etc.) need to be made just once, and they can be reused for a variety of ends (including narrative content).
    • Information can be easily updated because it is stored as just one node or relationship within a database (as opposed to hidden in the middle of sentences in many different texts), and any updates or improvements to the data can be immediately reflected in the resources because they are based on an interface.
    • Because the data and interfaces are separate, in the future as technology advances in ways we cannot yet fully anticipate, various interfaces which make use of the data can be developed (for example, a personalized and real-time augmented reality on-site interpretation experience via a “smart contact lens”).
    • Because there is no limitation in size, as there is with narrative-form resources, the database can be continually enriched with more relations and nodes, which means that, in addition to being a tool for interpretation, it has the potential to become a massive database for academic research.

Footnotes

  1. Weinberger uses the term “digital,” which is not incorrect; however, it would be misleading to assume that the functions he discusses are available via any digital technology or webpage. He is referencing functions that go beyond the mere provision of text and hyperlinks via webpages, and goes on to discuss websites like Amazon which are just interfaces to large databases. Therefore, although he mentions a “digital” world, he is in actuality discussing the functions of a database in particular.
  2. The terminology used to describe these nodes, relationships, labels, and properties varies from model to model. In other graph database frameworks such as RDF/OWL, nodes are referred to as entities or individuals, labels are referred to as classes, node properties are referred to as attributes or datatype properties, and relations are referred to as object properties. However, since the ontology presented later in this thesis is implemented via a labeled property graph as presented in Robinson et al (2005), the terminology of labeled property graphs will be used.