Why must Sound only be heard? Experiments with AI
Swetha Machanavajhala, Software Engineer 2
Anirudh Koul, Senior Data Scientist
Not long ago, Swetha was in her house and the carbon monoxide detector was beeping. Since she is hard of hearing, she was unaware until a neighbor informed her.
People with disabilities
People who are deaf or hard of hearing
DISABILITY ≠ PERSONAL HEALTH CONDITION
Excluded from hearing culture
Hearing AI Lead
Microsoft AI & R
Exploring meaning of sound through Hearing AI
Seeing Environmental sounds
HEARING AI VISUALIZES MUSIC VIA AR
Transcribe and Translate spoken conversation
Capturing loudness in real time: Hearing AI enables livelier conversation
My account is locked
Can you send me your personal details.
Future: Hearing AI automates relay services
Call Recipient Caller (Deaf or Hard of Hearing)
Hearing AI gives everyone the power to see sounds in a single app
Research project at Microsoft
Deep Learning Revolution
Object Recognition Error Rate
Speech Recognition Word Error Rate
Lip reading to text
Dataset - 100,000 sentences from BBC
Chung, Joon Son, et al. "Lip reading sentences in the wild." arXiv preprint arXiv:1611.05358 (2016).
Lips + Speech Character Error Rate
Ground Truth: IT WILL BE THE CONSUMERS
Speech: IN WILL BE THE CONSUMERS
Lips: IT WILL BE IN THE CONSUMERS
Speech+Lips: IT WILL BE THE CONSUMERS
Lips + Speech recognition examples
Ground Truth: CHILDREN IN EDINBURGH
Speech: CHILDREN AND EDINBURGH
Lips: CHILDREN AND HANDED BROKE
Speech+Lips: CHILDREN IN EDINBURGH
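The error rates named on these slides can be illustrated with the transcripts shown above. The following is a minimal sketch, assuming the standard edit-distance definition of word and character error rate (substitutions, insertions, and deletions divided by the reference length); the function names are hypothetical, and this is not the scoring code used in the cited paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the hypothesis
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def char_error_rate(reference, hypothesis):
    """Character-level edit distance divided by the reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Transcripts copied from the two slide examples.
    examples = {
        "IT WILL BE THE CONSUMERS": {
            "Speech": "IN WILL BE THE CONSUMERS",
            "Lips": "IT WILL BE IN THE CONSUMERS",
            "Speech+Lips": "IT WILL BE THE CONSUMERS",
        },
        "CHILDREN IN EDINBURGH": {
            "Speech": "CHILDREN AND EDINBURGH",
            "Lips": "CHILDREN AND HANDED BROKE",
            "Speech+Lips": "CHILDREN IN EDINBURGH",
        },
    }
    for truth, outputs in examples.items():
        print(truth)
        for source, text in outputs.items():
            wer = word_error_rate(truth, text)
            cer = char_error_rate(truth, text)
            print(f"  {source}: WER={wer:.2f} CER={cer:.2f}")
```

In the first example, the speech-only transcript substitutes one word out of five, so its word error rate is 0.2, while the combined Speech+Lips transcript matches the ground truth and scores 0.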
The Visual Microphone: Passive Recovery of Sound from Video
Davis, Abe, et al. "The visual microphone: passive recovery of sound from video." (2014).
Sound played: “Mary Had a Little Lamb”, then recreated from the video alone
Sergey Tulyakov et al, “Heart Rate Estimation from Real Life Videos”
SoundNet: Learning sound representations from unlabeled video
Aytar, Yusuf, Carl Vondrick, and Antonio Torralba. "Soundnet: Learning sound representations from unlabeled video." Advances in Neural Information Processing Systems. 2016.
Sign language Recognition & Translation with Kinect
Collaboration between Microsoft Research Asia and the Institute of Computing Technology, Chinese Academy of Sciences
HoloLens: Localization and intensity of sound
“Acoustic camera”: Finding acoustic buzz in a complex environment
– Campbell Associates
Dr. Seok Hyung from Korea Advanced Institute of Science & Technology (KAIST)
Real-time environmental sound recognition on mobile device
Strong harmonic content
More/less structured soundscapes
* A. Pillos, K. Alghamidi, N. Alzamel, V. Pavlov, S. Machanavajhala, “Real-Time Environmental Sound Recognition on Android OS”, in Proceedings of the Detection and Classification of Acoustic Scenes and Events, 2016.
I hear the siren, watch out
Localization of sound
How Artificial Intelligence can prove that sound need not be heard!
We want to hear from you.
Connect with us at email@example.com
© Microsoft Corporation. All rights reserved.
Hello everyone, I have a small story to share…
I am not alone
In this world of 7 billion
1.1 billion people have disabilities
Which means 1 in 7 people has a disability
Of those, 360 million are deaf or have profound hearing loss.
In 2001 the W.H.O. revised its definition of disability. This revision is pivotal to anyone designing interactions and products: it’s not the use of a wheelchair that creates the disability; it’s the lack of adequate building access for the diversity of ways humans get around. It’s the mismatch in interactions that creates the disability and excludes people.
When we talk about interaction, here is the thing.
We all depend on sounds, be it environmental sounds like a door knock, or music, or speech.
But the reality is that these 360 million people either can’t hear or can’t interpret sounds.
We feel excluded from hearing culture
Yes, you got me there!
We found the exclusion point: these millions of people are missing the meaning of sound because, so far, it has been meant only to be heard.
Now, the question is how can we design products to include everyone? Have we ever thought about this instead?
“Why must sounds only be heard?” That’s what our talk is going to be about.
Cons: not real-time; just plain text, no emotion
Swetha to talk about this
Swetha to end the talk by summarizing and delivering the message
© 2015 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.