Using the same sort of triangulation that some cities use to pinpoint the source of gunfire, I think I could build a 3-D model of the soundscape with three microphones.
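Each microphone records continuously. Downstream, the computer matches peaks and troughs across the three recordings, measures how much earlier or later the same sound reaches each mic, and uses those delays to work out the position of the sound source relative to the three mics.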
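As a rough sketch of the peak-matching step, here is one way it might look for a single pair of mics: slide one recording against the other and keep the shift where they line up best (a brute-force cross-correlation). Everything named here is made up for illustration, including `estimate_delay`, `path_difference`, the synthetic click and the 8 kHz sample rate, and the speed of sound is assumed to be about 343 m/s.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly 20 °C


def estimate_delay(ref, other, max_lag):
    """Return the lag in samples at which `other` best lines up with `ref`.

    A positive lag means the sound reached `other` later than `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Sum of products of overlapping samples: large when peaks meet
        # peaks and troughs meet troughs.
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def path_difference(lag_samples, sample_rate):
    """Convert a lag in samples to a difference in distance (metres)."""
    return lag_samples / sample_rate * SPEED_OF_SOUND


# Synthetic test: the same short decaying tone arrives at two mics,
# 12 samples later at mic B than at mic A.
sample_rate = 8000
click = [math.sin(2 * math.pi * 440 * n / sample_rate) * math.exp(-n / 40)
         for n in range(200)]


def record(delay, length=400):
    """Hypothetical recording: silence with the click starting at `delay`."""
    signal = [0.0] * length
    for n, value in enumerate(click):
        signal[delay + n] = value
    return signal


mic_a = record(50)
mic_b = record(62)

lag = estimate_delay(mic_a, mic_b, max_lag=40)
print(lag, "samples,", round(path_difference(lag, sample_rate), 4), "m")
```

Each pair of mics gives one such distance difference, which puts the source somewhere on a hyperboloid around that pair. The two independent differences from three mics narrow it to a curve in space, or to a point if the source is known to lie in a given plane; a full 3-D fix generally takes a fourth mic outside the plane of the other three.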
I feel confident that I could place the loudest sound, but the others might be trickier 😉