In the late 1800s, scientists realized that migratory birds made species-specific nocturnal flight calls—“acoustic fingerprints.” When microphones became commercially available in the 1950s, scientists began recording birds at night. Farnsworth led some of this acoustic ecology research in the 1990s. But even then it was challenging to spot the short calls, some of which are at the edge of the frequency range humans can hear. Scientists ended up with thousands of tapes they had to scour in real time while looking at spectrograms that visualize audio. Though digital technology made recording easier, the “perpetual problem,” Farnsworth says, “was that it became increasingly easy to collect an enormous amount of audio data, but increasingly difficult to analyze even some of it.”
Then Farnsworth met Juan Pablo Bello, director of NYU’s Music and Audio Research Lab. Fresh off a project using machine learning to identify sources of urban noise pollution in New York City, Bello agreed to take on the problem of nocturnal flight calls. He put together a team including the French machine-listening expert Vincent Lostanlen, and in 2015, the BirdVox project was born to automate the process. “Everyone was like, ‘Eventually, when this nut is cracked, this is going to be a super-rich source of information,’” Farnsworth says. But in the beginning, Lostanlen recalls, “there was not even a hint that this was doable.” It seemed unimaginable that machine learning could approach the listening abilities of experts like Farnsworth.
“Andrew is our hero,” says Bello. “The whole thing that we want to imitate with computers is Andrew.”
They started by training BirdVoxDetect, a neural network, to ignore faults like low buzzes caused by rainwater damage to microphones. Then they trained the system to detect flight calls, which differ between (and even within) species and can easily be confused with the chirp of a car alarm or a spring peeper. The challenge, Lostanlen says, was similar to the one a smart speaker faces when listening for its unique “wake word,” except in this case the distance from the target noise to the microphone is far greater (which means much more background noise to compensate for). And, of course, the scientists couldn’t choose a unique sound like “Alexa” or “Hey Google” for their trigger. “For birds, we don’t really make that choice. Charles Darwin made that choice for us,” he jokes. Luckily, they had a lot of training data to work with—Farnsworth’s team had hand-annotated thousands of hours of recordings collected by the microphones in Ithaca.
With BirdVoxDetect trained to detect flight calls, another difficult task lay ahead: teaching it to classify the detected calls by species, which few expert birders can do by ear. To deal with uncertainty, and because there is no training data for every species, they decided on a hierarchical system. For example, for a given call, BirdVoxDetect might be able to identify the bird’s order and family, even if it’s not sure about the species—just as a birder might at least identify a call as that of a warbler, whether yellow-rumped or chestnut-sided. In training, the neural network was penalized less when it mixed up birds that were closer on the taxonomic tree.
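The idea of penalizing taxonomically close confusions less can be illustrated with a short sketch. This is not the BirdVox code; the species, taxonomy entries, and penalty weights below are hypothetical examples chosen only to show the principle of a taxonomy-aware loss weight.

```python
# Illustrative sketch (not the actual BirdVoxDetect implementation):
# a loss weight that shrinks when the predicted and true species
# share more of the taxonomic tree. Taxonomy and weights are
# hypothetical examples.

TAXONOMY = {
    # species: (order, family)
    "yellow-rumped warbler":  ("Passeriformes", "Parulidae"),
    "chestnut-sided warbler": ("Passeriformes", "Parulidae"),
    "white-throated sparrow": ("Passeriformes", "Passerellidae"),
    "killdeer":               ("Charadriiformes", "Charadriidae"),
}

def taxonomic_penalty(predicted: str, actual: str) -> float:
    """Return a loss weight in [0, 1]: 0 for a correct prediction,
    growing as the two species diverge higher up the tree."""
    if predicted == actual:
        return 0.0
    p_order, p_family = TAXONOMY[predicted]
    a_order, a_family = TAXONOMY[actual]
    if p_family == a_family:
        return 0.25  # right family, wrong species: mild penalty
    if p_order == a_order:
        return 0.5   # right order, wrong family: moderate penalty
    return 1.0       # wrong order entirely: full penalty
```

In an actual training loop, a weight like this would scale the per-example classification loss, so that calling a yellow-rumped warbler a chestnut-sided warbler costs the network far less than calling it a killdeer.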