Sumit Basu, Michael Casey, William Gardner, Ali Azarbayejani, Alex Pentland
We present novel techniques for obtaining and producing audio information in an interactive virtual environment using vision information. These techniques are free of mechanisms that would encumber the user, such as clip-on microphones or headphones. Methods are described both for extracting sound from a given position in space and for rendering an "auditory scene," i.e., given a user location, producing sounds that appear to the user to come from an arbitrary point in 3-D space. In both cases, vision information about the user's position guides the algorithms, yielding solutions to problems that are difficult, and often impossible, to solve robustly in the auditory domain alone.
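To make the idea of vision-guided sound extraction concrete, the following is a minimal sketch of delay-and-sum beamforming, a standard microphone-array technique used here purely for illustration; the abstract does not say which method the authors use. The function names, the microphone layout, and the assumption that a vision tracker supplies the target position in metres are all hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate value for dry air near 20 °C


def steering_delays(mic_positions, target, sample_rate_hz):
    """Per-microphone delays (in samples) that time-align sound from `target`.

    `target` is assumed to come from a vision-based tracker (hypothetical).
    """
    distances = [math.dist(mic, target) for mic in mic_positions]
    farthest = max(distances)
    # Delay the closer microphones so each channel lines up with the farthest one.
    return [(farthest - d) / SPEED_OF_SOUND_M_S * sample_rate_hz for d in distances]


def delay_and_sum(channels, delays):
    """Average the channels after shifting each by its delay, rounded to whole samples."""
    shifts = [round(d) for d in delays]
    length = min(len(channel) for channel in channels)
    output = []
    for n in range(length):
        total = 0.0
        for channel, shift in zip(channels, shifts):
            index = n - shift
            if 0 <= index < length:
                total += channel[index]
        output.append(total / len(channels))
    return output
```

Sound arriving from the tracked position adds coherently after alignment, while sound from other directions is attenuated by the averaging. Rounding to whole samples keeps the sketch short; a practical system would likely use fractional-delay filtering.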