**6. Discussion**

Here, we discuss the current state of research on touchscreen-based image accessibility and the gaps that remain to be investigated, based on the findings from our systematic review and our own experience of designing two systems for artwork and comics accessibility.

### *6.1. From Static Images to Dynamic Images*

Various types of images displayed on touchscreen devices have been studied in terms of accessibility for over a decade, since 2008. However, all but one study [42] supported only still images without motion. Dynamically changing images such as animations and videos (e.g., movies, TV programs, games, video conferences) have rarely been explored in terms of touchscreen accessibility for BLV users. Considering the rapid growth of YouTube [81] and its use for gaining knowledge [82], videos, which are a series of images, are another image type with various accessibility issues. While video accessibility has been explored regardless of the medium [83–85], it would be interesting to examine how it can be supported on touchscreen devices.

### *6.2. Types of Information Supported for Different Image Types*

As prior work has found that BLV people wish to get different types of information depending on the context [21], different types of information were provided for different image types. For example, geographic information such as building locations, directions, and distances was offered for map and graph images. On the other hand, shape, size, and line-length information was conveyed to users for geometric objects. However, few studies have addressed other types of images, although specific locations or spatial relationships of objects within an image, such as a photograph or a touchscreen user interface, are considered important [20,30]. To identify the types of information that users are interested in for each type of image, adopting recommendation techniques [86,87] could be a solution for providing user-specific content based on users' preferences, interests, and needs.

### *6.3. Limited Room for Subjective Interpretations*

The majority of the studies using touchscreen devices have prioritized images that contain useful information (e.g., facts, knowledge) over images that can be interpreted subjectively and differently from one person to another, such as artwork. Even for artwork images, many studies have focused on delivering encyclopedia-style explanations (i.e., title, artist, painting styles) [2–5,88]. AccessArt [16] was an exception, which examined whether its artwork appreciation system could enable BLV people to make their own judgements and criticisms about the artwork they explored. We believe that more investigations are needed to improve BLV people's experience of enjoying the content of images or of making decisions based on subjective judgements (e.g., providing a summary of others' product reviews as a reference).

### *6.4. Automatic Retrieval of Metadata of Images for Scalability*

Most of the studies identified in our systematic review (all but seven of the 33 papers) assumed that image-related information is given. If not, researchers manually created the information. However, many images on the web do not have alt text, although it is recommended by the Web Content Accessibility Guidelines (WCAG) (https://www.w3.org/TR/WCAG21/). Moreover, it is not feasible for a few researchers to generate metadata for every individual image. Thus, automatic approaches such as machine learning techniques have been studied [12,13,89,90]. However, since the accuracy of automatically generated descriptions is not as high as that of descriptions produced by humans, we recommend the crowdsourcing approach for generating descriptions [18,20] if precise annotations are needed. This can serve as a human–AI collaboration for validating auto-generated annotations [13]. Eventually, these data can be used to train machine learning models for implementing a fully automated image description generation system [91–93].

### *6.5. Limited Input and Output Modalities of Touchscreen Devices*

Unlike other assistive systems that require BLV people to physically visit certain locations (e.g., [75,88]) or that require special hardware devices with tactile cues (e.g., [3,4]), touchscreen devices have the benefit of portability: a variety of images can be accessed on a personal device with fewer physical and time constraints. However, the input and output modalities that touchscreen devices can offer are limited to audio and vibration feedback. To provide more intuitive and rich feedback, more in-depth studies should be conducted on how to ease the design of 2.5D or 3D models and how to reduce the cost and time of producing tactile representations of images. One way to do so is open-sourcing the process, as in Instructables (https://www.instructables.com/), an online community where people explore and share instructions for do-it-yourself projects. Meanwhile, we also recommend touchscreen-based approaches to make a larger number of images accessible to a greater number of BLV people.

### *6.6. Limited Involvement of BLV People during Design Process*

The findings of our systematic review revealed that most studies evaluated their proposed systems with BLV participants only after the design and implementation. However, it is important to have target users participate in the design process at an early stage when developing a new technology [94]. A formative study with surveys or semi-structured interviews is recommended to understand the current needs and challenges of BLV people before making design decisions. An iterative participatory design process is also a great way to reflect BLV participants' opinions in the design, especially for users with disabilities [95–97].
