About NOVAView
My name is Aarush Muthukrishnan, and welcome to my PA Media and Design 2025 project, NOVAView. The app is designed to help blind individuals benefit from AI: it describes their surroundings and answers questions in real time, helping them gain independence and access information. I hope you find NOVAView helpful!
Many blind individuals struggle to access information and navigate their surroundings. I wondered whether there was a straightforward way to help them gain independence and get information in real time, and I also wanted to put the AI capabilities in my “toolset” to work. That was the inspiration for NOVAView, which offers tools that can help blind individuals and make their daily lives easier. I believe the project is genuinely useful and could be developed into something much bigger.
“When I realized that Artificial Intelligence could create an interactive experience to bridge this gap in understanding and empathy, I knew I had to create NOVAView.”
The Journey of Creating NOVAView
Many people struggle to fully grasp the daily realities faced by blind and visually impaired individuals, and often, the potential for Artificial Intelligence to act as a powerful accessibility tool remains an abstract concept. When I realized that technology could create an interactive experience to bridge this gap in understanding and empathy, I knew I had to create NOVAView.
Planning and Conception:
First, I needed a clear plan for how the website would function and feel. To structure this, I sketched out the user flow and architecture, brainstorming various ways to simulate AI assistance. I then consolidated these ideas, focusing on solving the core challenge – demonstrating AI accessibility – through two primary features.
Core Features Implementation:
- The first feature I implemented was the "Real-Time Experience." It lets users engage their camera and microphone and receive live, AI-generated descriptions of their surroundings, simulating how AI can interpret the visual world and offering a direct, interactive glimpse into the technology. To power it, I used Google's Multimodal API for its robust real-time processing, connecting the front end and back end over WebSockets adapted from Google's sample code (a sketch of this streaming loop follows this list).
- The second feature is "NOVABot," a specialized AI chatbot trained to answer a wide range of questions about blindness, visual impairment, and related assistive technologies, serving as an educational resource. It uses Google's Dialogflow CX, chosen for its structured conversational flows, and was trained on information gathered from reputable online documents so its responses are informative and well grounded (a minimal backend call is also sketched below).
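To make the streaming flow concrete, here is a minimal, hypothetical sketch of the browser side: it captures webcam frames onto a canvas and forwards them over a WebSocket to a relay server. The endpoint URL, message shape, and one-frame-per-second rate are illustrative assumptions, not NOVAView's actual code.

```typescript
// Hypothetical sketch: capture webcam frames and stream them to a backend
// WebSocket relay that talks to the multimodal API.
async function startRealTimeDescription(onDescription: (text: string) => void) {
  // Ask the browser for camera and microphone access.
  const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
  const video = document.createElement("video");
  video.srcObject = stream;
  await video.play();

  // Offscreen canvas used to grab downscaled JPEG frames from the video feed.
  const canvas = document.createElement("canvas");
  canvas.width = 640;
  canvas.height = 480;
  const ctx = canvas.getContext("2d")!;

  // Assumed backend relay endpoint; the server forwards frames to the API.
  const ws = new WebSocket("wss://example-backend/realtime");

  ws.onopen = () => {
    // Send roughly one frame per second; a real app would tune this rate.
    setInterval(() => {
      ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
      const frame = canvas.toDataURL("image/jpeg", 0.7); // base64 JPEG
      ws.send(JSON.stringify({ type: "frame", data: frame }));
    }, 1000);
  };

  // Assumed response shape: { type: "description", text: string }.
  ws.onmessage = (event) => {
    const msg = JSON.parse(event.data);
    if (msg.type === "description") onDescription(msg.text);
  };
}
```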
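For NOVABot, a backend call to Dialogflow CX typically looks like the sketch below, using Google's official Node.js client. The project, location, and agent IDs are placeholders, and the helper name `askNovaBot` is my own; NOVAView's actual integration may differ.

```typescript
import { SessionsClient } from "@google-cloud/dialogflow-cx";

// Placeholder identifiers; the real project/agent values differ.
const projectId = "my-project";
const location = "global";
const agentId = "my-agent-id";

// Uses Application Default Credentials. For non-global locations, pass
// { apiEndpoint: `${location}-dialogflow.googleapis.com` } to the constructor.
const client = new SessionsClient();

export async function askNovaBot(sessionId: string, question: string): Promise<string> {
  const sessionPath = client.projectLocationAgentSessionPath(
    projectId, location, agentId, sessionId,
  );

  const [response] = await client.detectIntent({
    session: sessionPath,
    queryInput: {
      text: { text: question },
      languageCode: "en",
    },
  });

  // Concatenate the agent's text responses into a single answer string.
  return (response.queryResult?.responseMessages ?? [])
    .flatMap((m) => m.text?.text ?? [])
    .join(" ");
}
```

Keeping a stable `sessionId` per user lets Dialogflow CX track where the visitor is in the conversational flow across turns.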
Development and Technology:
My next phase involved translating the design into functional code. I outlined the logic, adapted the Google WebSocket example, and built the user interface with React. Coding in Visual Studio Code and deploying via Google App Engine, I spent several weeks on careful integration and troubleshooting to make the components work together smoothly. Iterative testing was crucial: early versions didn't always perform as expected, especially around the real-time API's sensitivity, so continuous refinement was necessary.
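As an illustration of the React side, a component along these lines could subscribe to the same WebSocket and render incoming descriptions. The socket URL and message format follow the earlier sketch and are assumptions rather than the project's actual interface; the `aria-live` attribute hints at how the output can be made screen-reader friendly.

```tsx
import { useEffect, useState } from "react";

// Hypothetical component: listens for descriptions from the backend relay
// and keeps the most recent few on screen.
export function LiveDescriptions() {
  const [descriptions, setDescriptions] = useState<string[]>([]);

  useEffect(() => {
    const ws = new WebSocket("wss://example-backend/realtime");
    ws.onmessage = (event) => {
      const msg = JSON.parse(event.data);
      if (msg.type === "description") {
        // Retain only the last five descriptions to keep the view readable.
        setDescriptions((prev) => [...prev.slice(-4), msg.text]);
      }
    };
    return () => ws.close(); // close the socket when the component unmounts
  }, []);

  return (
    // aria-live="polite" lets screen readers announce each new description.
    <ul aria-live="polite">
      {descriptions.map((d, i) => (
        <li key={i}>{d}</li>
      ))}
    </ul>
  );
}
```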
Feedback and Refinement:
After establishing a functional website, I sought feedback from my family and friends to understand their user experience. They provided valuable insights on the simulation's clarity and the chatbot's usefulness, suggesting improvements to the user interface and the handling of potential API limitations, like background noise sensitivity. Incorporating this feedback, I polished the code and user experience.
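One simple way to address background-noise sensitivity, sketched here under my own assumptions (the threshold value and raw-PCM framing are illustrative, not NOVAView's actual fix), is to gate microphone chunks by their RMS level so near-silent audio is never sent:

```typescript
// Hypothetical noise gate: compute the RMS level of each microphone chunk
// and only forward audio that is clearly above the ambient floor.
const SILENCE_THRESHOLD = 0.02; // tuned empirically per device/environment

function isLoudEnough(chunk: Float32Array): boolean {
  let sumSquares = 0;
  for (const sample of chunk) sumSquares += sample * sample;
  const rms = Math.sqrt(sumSquares / chunk.length);
  return rms >= SILENCE_THRESHOLD;
}

// Drop near-silent chunks instead of streaming them, so ambient noise
// doesn't trigger spurious responses from the real-time API.
function maybeSend(ws: WebSocket, chunk: Float32Array) {
  if (isLoudEnough(chunk)) {
    ws.send(chunk); // raw PCM samples; a real app would encode and frame them
  }
}
```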
Documentation and Finalization:
With the core features refined, I focused on documenting the project, properly citing the resources used for NOVABot's training, and preparing the necessary descriptions and materials. Finally, I conducted thorough final checks to ensure the project was ready.
Challenges and Learnings:
Significant challenges during this process included smoothly integrating the real-time Multimodal API over WebSockets, managing its sensitivity to environmental factors like noise, and making the simulation feel intuitive and respectful. Just as important, the guidance and feedback I received from my family and friends were instrumental in shaping the final project.
Looking Forward:
I hope NOVAView will serve as a valuable tool for the community, fostering greater empathy and showcasing the positive, empowering role AI can play in accessibility in our increasingly technology-driven world!