Me-siri
-
Focusing on the DHH (Deaf and Hard of Hearing) community, this project adjusts the accessibility interaction chain of iOS to provide assistance that better matches individual needs and users' most natural behavioral paths.
-
The DHH community faces multiple obstacles in communication, social interaction, and environmental perception. Existing technologies provide only basic captions or one-way translation, which cannot meet the needs of natural interaction in complex scenarios (such as understanding emotion in a real-time conversation or locating a sound source).
-
AI and other technical frameworks are used to optimize how hearing-assistance features process information and structure their operation paths, with the goal of meeting the individual needs of DHH users, simplifying interaction, and improving the accessible communication experience.
-
The project focuses on three major pain points users face with hearing-assistance features: incomplete information transmission and missing context; complex, fragmented interaction flows; and limited environmental awareness and multitasking support:
Improve accessibility for DHH users on iOS
Responsibility overview
Project category
Case Study
“Improving Accessibility for the DHH”
Process overview
1
Research
Desk Research
Empathy map (validating insights from unstructured text)
Identifying Problems
Member
Myself
2
Synthesis
Persona
User journey map
Responsibility
User research
UI design
3
Explore solutions
Concept
Information Architecture
Deliverables
Concept
Resolution
Prototype
4
Designs
Moodboard
High fidelity
Interaction design
Timeline
February-April 2025
Tools
Figma
Google Natural Language AI
5
Conclusion
Personal reflection
Desk research
User voice
Research
My desk research on the user experience of DHH (deaf or hard of hearing) people showed that their communication modes include sign language, text, and speech, and that the areas they care about most are social chat and work. I then conducted a competitive analysis and found that the accessibility features of most smart devices only cover basic pain points.
Empathy map
If these accessibility features meet users' important needs, why do users still feel excluded from the service?
Problem 1: Distorted information
Traditional accessibility features only perform a basic sound-to-text conversion, and in unstable environments users receive fragmented content.
Jane cannot judge whether the conversation was transcribed correctly because noisy background sound gets mixed into the text.
Jane is confused about her surroundings because the transcript carries only a single stream of sound information.
Solution 1.1:
Prioritize information based on context and supplement it with non-verbal information.
Dynamically prioritize audio sources by treating the most recent, continuous human voice as the primary source while classifying non-human sounds as ambient cues.
Audio direction cues are added to help Jane judge the credibility of information sources.
Once the system detects that the user has looked back at the screen, the visual cue fades away so the text is easier to read.
With Apple Intelligence, Live Captions can help in smarter ways.
If the user agrees to the Apple Intelligence privacy policy, the conversation context can be identified so that the way information is output adapts to the situation and relevant help can be offered.
This is the enlarged Live Captions view after the redesign.
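A minimal sketch of how the sound-source prioritization described above could be built on iOS, using the SoundAnalysis framework's built-in classifier to separate continuous speech from ambient sounds; the `CuePriority` type and the 0.7 confidence threshold are my own assumptions, not part of Apple's API.

```swift
import AVFoundation
import SoundAnalysis

/// Hypothetical priority levels for the caption output (my assumption, not an Apple API).
enum CuePriority { case primarySpeech, ambient }

final class SoundPriorityObserver: NSObject, SNResultsObserving {
    /// Called for every analyzed window of audio.
    func request(_ request: SNRequest, didProduce result: SNResult) {
        guard let result = result as? SNClassificationResult,
              let top = result.classifications.first else { return }

        // Treat confident, continuous speech as the primary source;
        // everything else is surfaced as a classified ambient cue.
        let priority: CuePriority = (top.identifier == "speech" && top.confidence > 0.7)
            ? .primarySpeech
            : .ambient
        print("Detected \(top.identifier) (\(top.confidence)) -> \(priority)")
        // A real implementation would route .primarySpeech to the caption stream
        // and .ambient to a secondary, non-verbal cue (icon plus direction hint).
    }
}

final class SoundPrioritizer {
    private let engine = AVAudioEngine()
    private var analyzer: SNAudioStreamAnalyzer?
    private let observer = SoundPriorityObserver()

    func start() throws {
        let input = engine.inputNode
        let format = input.outputFormat(forBus: 0)
        let analyzer = SNAudioStreamAnalyzer(format: format)
        self.analyzer = analyzer

        // Apple's built-in sound classifier (iOS 15+) recognizes "speech",
        // "cough", "siren", and hundreds of other labels.
        let request = try SNClassifySoundRequest(classifierIdentifier: .version1)
        try analyzer.add(request, withObserver: observer)

        input.installTap(onBus: 0, bufferSize: 8192, format: format) { buffer, when in
            analyzer.analyze(buffer, atAudioFramePosition: when.sampleTime)
        }
        try engine.start()
    }
}
```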
Problem 2: Interaction split
Assistive features are split into independent modules such as Live Captions, Live Speech, and Sound Recognition, distributed across different levels of the system. Although this helps users locate a specific module quickly, it ignores their diverse needs and natural behavior patterns. When users need several of these features at once, the flow becomes cumbersome and complicated, disrupting their rhythm and increasing cognitive load.
Jane needs to meet her communication needs quickly.
Jane feels nervous and intimidated when switching back and forth between Live Captions and Live Speech.
Solution 2.1:
Loading Live Speech into a keyboard container lets users speak through typed text while Live Captions is running.
Integrating Live Speech into the keyboard as an interactive input container avoids shortcut-key conflicts, simplifies multimodal interaction through a unified interface, and improves the synchronization of caption input and voice output. It also reserves room for future multimodal technologies such as sign-language input, so the feature can expand seamlessly.
This is Live Captions with Apple Intelligence.
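As a rough illustration of the keyboard-container idea, the sketch below shows a custom keyboard extension that speaks the text the user has typed. The class and button names are assumptions for illustration; Apple's actual Live Speech integration is a system feature, not this code.

```swift
import UIKit
import AVFoundation

/// Sketch of a keyboard extension that speaks typed text aloud,
/// approximating a "Live Speech inside the keyboard" container.
final class SpeakingKeyboardViewController: UIInputViewController {
    private let synthesizer = AVSpeechSynthesizer()
    private let speakButton = UIButton(type: .system)

    override func viewDidLoad() {
        super.viewDidLoad()
        speakButton.setTitle("Speak", for: .normal)
        speakButton.addTarget(self, action: #selector(speakTypedText), for: .touchUpInside)
        speakButton.translatesAutoresizingMaskIntoConstraints = false
        view.addSubview(speakButton)
        NSLayoutConstraint.activate([
            speakButton.centerXAnchor.constraint(equalTo: view.centerXAnchor),
            speakButton.centerYAnchor.constraint(equalTo: view.centerYAnchor)
        ])
    }

    /// Reads the text the user has typed into the current document and speaks it,
    /// so the conversation partner hears a voice while Live Captions keeps
    /// transcribing their replies.
    @objc private func speakTypedText() {
        let typed = textDocumentProxy.documentContextBeforeInput ?? ""
        guard !typed.isEmpty else { return }
        let utterance = AVSpeechUtterance(string: typed)
        utterance.rate = AVSpeechUtteranceDefaultSpeechRate
        synthesizer.speak(utterance)
    }
}
```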
Problem 3: Passive scene response
Live Captions' passive response ignores the need for voice information on special occasions.
Jane has to keep looking down to check whether her phone screen has turned off and whether Live Captions is still running.
While answering the interviewer's questions, Jane didn't realize the interviewer had asked her to stop, because the interviewer was wearing a mask.
Solution 3.1:
Customizable trigger commands
In the Hearing section of iOS Accessibility, I added a sub-page for setting trigger commands.
Jane can set a rule here to start at 2 pm: if speech is detected (defined as a sentence spoken by a person; a cough counts only as ambient sound), the standby Live Captions reminds her and provides a transcription (if Live Captions is already on, it only acts as a reminder). This is mainly for situations where the user cannot rely on vision to understand what is happening, or may happen, in the surrounding environment.
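A small sketch of how such a trigger command could be modeled; the `TriggerRule` type, its fields, and the 2 pm example are illustrative assumptions, not an existing iOS setting.

```swift
import Foundation

/// Illustrative model for a user-defined trigger command
/// (names and fields are assumptions, not an iOS API).
struct TriggerRule {
    var startHour: Int            // e.g. 14 for "start at 2 pm"
    var requiresSpeech: Bool      // true: only human speech counts, not coughs or ambient sound
    var remindOnly: Bool          // true when Live Captions is already running

    enum Action { case remind, remindAndTranscribe, ignore }

    /// Decide what standby Live Captions should do for a detected sound.
    func action(at date: Date = Date(), detectedSpeech: Bool) -> Action {
        let hour = Calendar.current.component(.hour, from: date)
        guard hour >= startHour else { return .ignore }
        guard !requiresSpeech || detectedSpeech else { return .ignore }
        return remindOnly ? .remind : .remindAndTranscribe
    }
}

// Example: Jane's 2 pm rule during an interview.
let rule = TriggerRule(startHour: 14, requiresSpeech: true, remindOnly: false)
let action = rule.action(detectedSpeech: true)   // .remindAndTranscribe after 2 pm
```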
Problem 1 + 2: Distorted information + Interaction split
Jane and Tina would have to keep their phones and Live Captions on at all times to receive audio information, yet even in daily use it is impossible to stare at the screen constantly.
Solution 1.2:
Add visual and vibration prompts.
Tina can set up "textual language instructions". When the device recognizes "Tina", the Live Captions feature turns on automatically, accompanied by a custom vibration and a visual alert. Once the gyroscope detects that the user has picked up the iPhone or Apple Watch, the real-time transcription appears on screen (from the moment the trigger condition is met, Live Captions keeps transcribing in the background).
This mainly provides seamless, multimodal real-time caption assistance, removes the pain points of cumbersome manual operation and delayed information capture, and improves communication autonomy and efficiency.
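A hedged sketch of the pick-up-to-show behavior, watching the device attitude with CoreMotion and firing a haptic cue when the phone is raised; the pitch threshold and the `PickupDetector` name are assumptions.

```swift
import CoreMotion
import UIKit

/// Sketch: once a trigger word is recognized, watch the device attitude
/// and surface the already-running background transcription when the
/// user raises the phone. Thresholds are illustrative assumptions.
final class PickupDetector {
    private let motion = CMMotionManager()
    private let haptics = UINotificationFeedbackGenerator()

    var onPickup: (() -> Void)?

    func startMonitoring() {
        guard motion.isDeviceMotionAvailable else { return }
        motion.deviceMotionUpdateInterval = 0.2
        motion.startDeviceMotionUpdates(to: .main) { [weak self] data, _ in
            guard let self = self, let data = data else { return }
            // A pitch above roughly 35 degrees suggests the phone has been raised toward the user.
            if data.attitude.pitch > 0.6 {
                self.haptics.notificationOccurred(.success)  // custom vibration cue
                self.onPickup?()                             // show the live transcript
                self.motion.stopDeviceMotionUpdates()
            }
        }
    }
}
```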
Information architecture: Mehrabian-Russell sentiment mapped to a visual display
Solution 1.3:
Incorporating emotion and volume fluctuations into real-time captions lets the device recover information that would otherwise be lost in transcription, such as a friend's timbre, intonation, and emotion, giving users a more nuanced reference.
Tina can set up "real-time recording". After Alice consents and records a sample, Alice's voice is stored locally on the device. When the system recognizes Alice's voice, it automatically prompts Tina and starts transcribing in the background.
This mainly addresses information loss and the cost of sorting through it after instant communication, turning conversations from "fragmented capture" into "complete traceability" and significantly improving the sense of information security in social and work situations.
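A minimal sketch of the volume-fluctuation part of this idea, measuring the loudness (RMS) of the audio that feeds the captions so it can be rendered alongside the transcript; the class name is an assumption, and emotion recognition itself would require an additional model.

```swift
import AVFoundation
import Accelerate

/// Sketch: measure the loudness of the audio feeding the captions so that
/// volume fluctuations (whispering versus a raised voice) can be rendered
/// alongside the transcript. Names and thresholds are assumptions.
final class LoudnessMeter {
    private let engine = AVAudioEngine()

    /// Calls back with an RMS level for each audio buffer.
    func start(onLevel: @escaping (Float) -> Void) throws {
        let input = engine.inputNode
        let format = input.outputFormat(forBus: 0)
        input.installTap(onBus: 0, bufferSize: 4096, format: format) { buffer, _ in
            guard let samples = buffer.floatChannelData?[0] else { return }
            var rms: Float = 0
            vDSP_rmsqv(samples, 1, &rms, vDSP_Length(buffer.frameLength))
            onLevel(rms)   // e.g. map to caption weight/size or a small level bar
        }
        try engine.start()
    }
}
```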
The Why (not emoji)
The gradient display restores auditory information but does not make interpretive judgments for the user.
The discrete design of emoji (each symbol stands for one "pure emotion") breaks this emotional ambiguity, whereas the open visual language of gradient colors lets users perceive subtle levels of emotion more naturally.
Scenario Adaptation > General Solutions
In new or weak-relationship scenarios, a gradient's "low stimulation, high inclusiveness" quality better respects social safety distance, while emoji may cause misunderstandings due to cultural or contextual differences.
Three-dimensional fitting using the Mehrabian-Russell model
Gradient colors can transition dynamically through hue, saturation, and brightness while carrying the model's three dimensions: pleasure (warm versus cool tone), arousal (brightness change), and dominance (saturation and softness of transition).
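A sketch of this mapping in code, turning a normalized Mehrabian-Russell (PAD) reading into a gradient color along the rules above; the 0...1 ranges and the specific hue values are assumptions.

```swift
import UIKit

/// A Mehrabian-Russell (PAD) reading, normalized to 0...1 per dimension (an assumption).
struct EmotionReading {
    var pleasure: CGFloat   // 0 = unpleasant, 1 = pleasant
    var arousal: CGFloat    // 0 = calm,       1 = excited
    var dominance: CGFloat  // 0 = submissive, 1 = dominant
}

/// Map a PAD reading onto a gradient color following the rules described above.
func gradientColor(for emotion: EmotionReading) -> UIColor {
    // Pleasure drives hue: cool blue (0.6) for low pleasure, warm orange (0.08) for high.
    let hue = 0.6 - 0.52 * emotion.pleasure
    // Arousal drives brightness: calm speech renders dimmer, excited speech brighter.
    let brightness = 0.55 + 0.45 * emotion.arousal
    // Low dominance keeps saturation soft so the cue stays "low stimulation".
    let saturation = 0.25 + 0.55 * emotion.dominance
    return UIColor(hue: hue, saturation: saturation, brightness: brightness, alpha: 1)
}

// Example: a pleasant, calm, moderately dominant speaker yields a warm, soft, mid-bright tone.
let color = gradientColor(for: EmotionReading(pleasure: 0.8, arousal: 0.3, dominance: 0.5))
```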
Problem 2 + 3: Interaction split + Passive scene response
Live Captions provides only a single caption stream and does not support multitasking across sound sources in adaptive scenarios.
Tina can only follow a single listening task, and any unexpected event interrupts the interaction.
Solution 2.2:
The interface logic of Live Captions' minimized mode has been redesigned. It now offers two channels, media and real-time conversation, enabling users to multitask.
Process: to avoid interrupting the user's current task, new and sudden sound messages are surfaced as pop-up messages. After the user confirms and taps the pop-up, the message is automatically transcribed and a shortcut for replying is provided. Tina can also choose whether media or microphone information is shown first.
This mainly resolves the confusion of handling multiple sound sources at once: new information arrives non-invasively through pop-ups, improving the efficiency of both real-time communication and media consumption.
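A rough sketch of the two-channel routing logic described above; the `AudioEvent` and `Presentation` types and the routing rules are assumptions made to illustrate the concept.

```swift
import Foundation

/// Illustrative two-channel routing for the minimized Live Captions view.
/// All types and rules here are assumptions for this concept, not an iOS API.
enum CaptionChannel { case media, conversation }

struct AudioEvent {
    var sourceLabel: String       // e.g. "podcast audio", "speech", "doorbell"
    var isFromMediaPlayback: Bool // true if the sound comes from on-device media
    var isNewSource: Bool         // first appearance of this source in the session
}

enum Presentation {
    case transcribe(CaptionChannel)  // append to an existing channel
    case popUp(String)               // non-blocking banner; transcribe after the user confirms
}

func route(_ event: AudioEvent) -> Presentation {
    // Sudden, unfamiliar sources should not interrupt the current task:
    // they arrive as a pop-up first and are transcribed only on confirmation.
    if event.isNewSource && !event.isFromMediaPlayback {
        return .popUp("New sound detected: \(event.sourceLabel). Tap to transcribe and reply.")
    }
    return .transcribe(event.isFromMediaPlayback ? .media : .conversation)
}

// Example: a doorbell during media playback becomes a pop-up,
// while the podcast keeps streaming into the media channel.
let doorbell = AudioEvent(sourceLabel: "doorbell", isFromMediaPlayback: false, isNewSource: true)
let decision = route(doorbell)   // .popUp(...)
```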
Problem 3: Passive scene response (loss of environmental awareness)
The system cannot distinguish a "multi-person conversation" from "single-speaker speech" and applies the same caption strategy to both.
Solution 3.2:
An adaptive layout integrates up to four dialog boxes in one frame. Sound triggers dynamic effects that highlight the active source and prioritize the relevant transcript.
The phone's microphones capture sound data, and the chip processes it to estimate the direction of the sound source and provide visual guidance, such as flashing icons on screen. This helps Tina, who is DHH, quickly understand the scene during a conversation and capture the relevant information. To keep the text readable, the solution supports at most four dialog boxes.
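A small sketch of the layout side of this idea, mapping an estimated direction of arrival onto one of the four dialog-box slots. The direction angle is assumed to come from an upstream estimator, since I am not aware of a public iOS API that exposes it directly.

```swift
import Foundation

/// Four fixed dialog-box slots, matching the "at most four boxes" constraint.
/// The ordering and the angle convention below are assumptions.
enum DialogSlot: CaseIterable { case front, right, back, left }

/// Map an estimated direction of arrival (radians, 0 = straight ahead,
/// increasing clockwise) to a slot. This sketch only handles layout logic.
func slot(forAngle angle: Double) -> DialogSlot {
    let twoPi = 2 * Double.pi
    var normalized = angle.truncatingRemainder(dividingBy: twoPi)
    if normalized < 0 { normalized += twoPi }
    // Shift by 45 degrees so each quadrant is centered on front / right / back / left.
    let sector = Int(((normalized + .pi / 4) / (.pi / 2)).rounded(.down)) % 4
    return DialogSlot.allCases[sector]
}

// Example: a voice arriving from roughly 90 degrees clockwise highlights the "right" box,
// so its transcript can flash and move to the top of the reading order.
let active = slot(forAngle: .pi / 2)   // .right
```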
UX Vision
For DHH
Achieve more autonomous and convenient communication and daily tasks
For People around DHH
They no longer have to avoid DHH people out of fear that they cannot communicate well with them; instead, they can hold an equal dialogue.
Thank you, Jane, for your help. Next, I will invite Tina to help me explain how this idea can be improved further.
Findings
So I created two personas and, working with the accessibility features of iOS 18, looked for specific problems and solutions along two user journeys: middle-aged job hunting and young campus social life.
The Why
Choosing Tina and Jane as personas for the entire optimization design
After understanding the group's characteristics through desk research, I went to the largest deaf community website in the UK (The Limping Chicken) to read members' posts and interview records, and extracted the protagonists of several articles to build empathy maps (I know this differs from the usual timing and conditions of an empathy map, but it still gave me insight into their emotions and inner needs). During this process I used Google Natural Language AI to verify my sentiment analysis. Finally, based on the earlier research and analysis, I selected the main social activities of two age groups, work (interviews) and campus life (socializing), and created two personas, Jane and Tina.
empathy maps
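For reference, a minimal sketch of how such a sentiment check can be run against the Google Cloud Natural Language API (v1 `documents:analyzeSentiment`) from Swift; API-key handling and error handling are simplified, and only the documented `score` and `magnitude` fields are read.

```swift
import Foundation

/// Minimal Codable wrapper for the fields used from the analyzeSentiment response.
struct SentimentResponse: Decodable {
    struct Sentiment: Decodable { let score: Double; let magnitude: Double }
    let documentSentiment: Sentiment
}

/// Send a quote from a community post and get back an overall sentiment score.
func analyzeSentiment(of text: String, apiKey: String) async throws -> SentimentResponse {
    var request = URLRequest(
        url: URL(string: "https://language.googleapis.com/v1/documents:analyzeSentiment?key=\(apiKey)")!
    )
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    let body: [String: Any] = [
        "document": ["type": "PLAIN_TEXT", "content": text],
        "encodingType": "UTF8"
    ]
    request.httpBody = try JSONSerialization.data(withJSONObject: body)
    let (data, _) = try await URLSession.shared.data(for: request)
    return try JSONDecoder().decode(SentimentResponse.self, from: data)
}

// A score below 0 suggests the quoted post leans negative,
// which can then be compared against one's own reading for the empathy map.
```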
Storyboarding and low-fidelity ideation
The Why
Choosing to optimize iOS accessibility instead of developing new software
During the desktop research phase, I learned that 21% of DHH people use their phones for an average of 6-8 hours (2022), and most people prefer the system's built-in accessibility features. In addition, the number of people using the iOS system accounts for a large proportion in the field of mobile phone systems. In addition, the choice to optimize the iOS system's accessibility features is based on its mature hardware sensors (such as gyroscopes, microphone arrays), accessibility APIs, and user behavior inertia, avoiding the learning cost of developing new software and ecological fragmentation, ensuring that accessibility features are seamlessly integrated into users' daily operations, and maximizing technology reusability and user reach efficiency.
Design Exploration
Interactive animations
Conclusion
Interface design based on mood boards and using standard components
Try to understand the user group in depth: conduct face-to-face interviews to observe their most authentic reactions and state.
Although I can hear users' voices through the community website, I can only see the emotions and questions they choose to post, not their real situation when using specific products. I think we should be sensitive to potentially unpleasant experiences before users actively complain.
Try to share ideas with relevant technical personnel
Because I didn't fully understand technical feasibility, I slipped into wishful thinking and kept searching for information during the concept stage, which for a while pulled my design ideas away from the user personas.
Allow things to be imperfect at this stage
In fact, sign-language interpretation is a major pain point identified in the user research stage, but it is a technical problem I cannot solve. Within the project's scope and the personas I created, I focused the solution on three general design aspects: interaction logic, semantic expression, and environmental perception. Even though I cannot solve it, I have reserved a place for it in my concept…
Explore more
Home Landing
An AI-powered home design product dedicated to creativity
Current site
A case study on improving accessibility for DHH users on iOS
NF-ALERT
A full-link conceptual design of a forest fire warning system for improving decision efficiency