My GSoC experience

First month of coding out of 3

Who am I

I’m Ambra, an Italian-born Chinese studying in London. I did my undergraduate and Master’s degrees in biomedical engineering, followed by a second Master’s degree in artificial intelligence. In September I will continue my studies as a doctoral candidate in the field of AI for biomedical engineering.

Studying for so long can be extremely daunting at times, so I prefer working on projects I am passionate about. Since I was young, I have always supported the conservation of endangered animals. I was a WWF volunteer during my high school years, and at the time I attended a symposium in northern Italy about wolf preservation. Even though the local Italians were in favour of animal preservation, there were conflicts of interest, because conserving wolves without containing them to avoid damage to local livestock was difficult. In addition, during my university years I worked as a volunteer in London’s Queen Elizabeth Olympic Park. There, I assisted in surveying and cataloguing the park’s fauna and flora in order to monitor and preserve local wildlife. My Orcasound project is really important to me because I am contributing to the preservation of the endangered Southern Resident orcas, even if indirectly from London (or Munich or Verona).

What is our project about

As Orcasound’s hydrophones record orca calls off the coast, the sounds are formatted, transcoded, and uploaded to the machine learning pipeline. The current pipeline uses a fully connected multilayer perceptron network to detect and classify sound inputs. Our project creates an alternative, or enhancements, to the existing pipeline to improve the separation of orca vocalisations from the surrounding noise.

Project timeline

The project consists of 3 stages, one for each month.

Stage 1 (mid June – mid July): Processing of the data
Stage 2 (mid July – mid August): Ablation study
Stage 3 (mid August – mid September): Testing model performance

Processing of the data

What is stage 1 about

In the initial portion of the project, we used denoising techniques to filter out as much noise (from ships and the underwater ambient environment) as possible before sending the audio to the sound separation models. We compared denoising with Fourier transform and least mean squares methods to decide which is best for ocean noise in orca habitat.

First, we took a sample of Orcasound audio and computed its spectrogram. We then applied a range of Fourier-transform-based filtering techniques and plotted the difference between the spectrograms of the filtered and the original audio clip. As the noise is consistent, manually zeroing the spikes in frequency was not an efficient approach.
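
The comparison of spectrograms described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names, the frame length, the hop size and the synthetic two-tone "clip" are all assumptions made for the example. It uses only the Python standard library with a naive DFT, which is fine for a toy signal but far too slow for real recordings, where a numerical library would be the practical choice.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return one list of magnitudes (bins 0..frame_len//2) per frame."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT of the windowed frame (demo only, O(N^2) per frame).
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


def spectrogram_difference(original, filtered, frame_len=64, hop=32):
    """Per-frame, per-bin magnitude removed by the filter."""
    a = stft_magnitude(original, frame_len, hop)
    b = stft_magnitude(filtered, frame_len, hop)
    return [[x - y for x, y in zip(fa, fb)] for fa, fb in zip(a, b)]


# Hypothetical stand-in for a clip: a 1 kHz "call" plus a weaker 3 kHz
# "noise" tone, sampled at 8 kHz. The "filtered" clip is the call alone.
SAMPLE_RATE = 8000
clip = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE)
        + 0.5 * math.sin(2 * math.pi * 3000 * t / SAMPLE_RATE)
        for t in range(512)]
clean = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(512)]

diff = spectrogram_difference(clip, clean)

# The bin where the most energy was removed in the first frame.
# Each bin is SAMPLE_RATE / 64 = 125 Hz wide, so bin 24 is 3000 Hz.
peak_bin = max(range(len(diff[0])), key=lambda k: diff[0][k])
```

Plotting `diff` as an image (time on one axis, frequency bin on the other) gives the kind of difference spectrogram mentioned in the text: bright regions show what the filter took out.
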
The filters we used were the moving average filter (with a step of 100 kHz, accounting for a delay of half a step), the weighted average filter, the Gaussian expansion filter, cubic, quartic and quintic Savitzky-Golay filters, median average filters of polynomial degrees 2, 10 and 12, and the Hampel filter. The least mean squares algorithm was not suitable because it needs target values to adapt, but it can be used in the future once we have successfully denoised sounds to serve as targets.
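
To give a feel for how three of these filters behave, here is a small sketch of a centred moving average, a median filter and a Hampel filter written with the Python standard library only. The window size, the threshold of 3 robust standard deviations and the toy signal are assumptions chosen for illustration, not the settings used in the project.

```python
import statistics


def moving_average(signal, window=5):
    """Centred moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def median_filter(signal, window=5):
    """Replace every sample with the median of its neighbourhood."""
    half = window // 2
    return [statistics.median(signal[max(0, i - half):i + half + 1])
            for i in range(len(signal))]


def hampel_filter(signal, window=5, n_sigmas=3.0):
    """Replace only outliers: samples further than n_sigmas robust
    standard deviations from the local median become that median."""
    k = 1.4826  # scales the median absolute deviation to a Gaussian sigma
    half = window // 2
    out = list(signal)
    for i in range(len(signal)):
        neighbours = signal[max(0, i - half):i + half + 1]
        med = statistics.median(neighbours)
        mad = statistics.median(abs(x - med) for x in neighbours)
        if abs(signal[i] - med) > n_sigmas * k * mad:
            out[i] = med
    return out


# Toy signal: a step from 0 to 1, with a single spike added at index 4.
step = [0.0] * 10 + [1.0] * 10
noisy = list(step)
noisy[4] = 9.0

averaged = moving_average(noisy)   # smears the spike and blurs the step edge
medianed = median_filter(noisy)    # removes the spike, keeps the edge sharp
hampeled = hampel_filter(noisy)    # touches only the spike itself
```

On this toy input the median and Hampel filters both recover the clean step exactly, while the moving average spreads the spike over its neighbours and softens the edge. Real recordings are much less tidy, so this only illustrates the mechanics.
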
To process entire datasets in bulk, we created a graphical user interface (GUI). The user can upload one or multiple mp3 or wav files, plot the spectrogram of the original audio (if only one sound is uploaded), choose the filtering technique, evaluate the performance and download the filtered audio. The functions for separating audio with Spleeter and Zeroshot were kindly added by Devdoot.
The main advantage is that anyone can use our software and process the data easily. We added a spectrogram because visualising the data before choosing the processing technique is helpful. In the test audio clip, the spectrogram looked like a step function, hence the best filters were the median and Hampel filters. However, in the training dataset some audio clips had more noise, and there the median and Hampel filters performed very poorly.

My experience

Have a clear plan
It was fundamental for me to have a plan, including specific goals to accomplish within a given timeline, because many times I was overwhelmed by bugs and sidetracked. Currently, I am still trying to figure out what could be considered good enough denoising, but I find it helpful to constantly remind myself that getting orca calls with any reduction in noise is the main mission. This calms my obsession with achieving a perfectly cleaned sound.
Try new challenges
Before this project, I had never made a graphical user interface, nor had I ever developed open source code. I learned from scratch how to create clickable buttons, make windows pop up, create error messages, and upload and download specific files. Before I started creating the software, I had not even thought about user experience design. As I added more buttons to the screen, I found myself solving small challenges like resizing the window when it is full, creating a scrollbar, or placing, naming and changing things so that a third person could understand the purpose of the software. These new challenges were both annoying and fun at the same time; I was learning something new and excited to be able to visualise our progress!
The numbers could lie, trust yourself
I added evaluation metrics to understand how well each denoising technique performed. The metrics I used were signal-to-noise ratio, mean squared error and root mean squared error. However, a filtered sound that scores very well on the metrics is not necessarily denoised well. Hence, I think it is more useful to listen to the sounds myself before fully believing the numbers.
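
As a rough sketch of how these three metrics can be computed, and of one way they can mislead, consider the following. The exact definitions used in the project are not stated here, so this example assumes the original clip is the reference and treats the difference between the reference and the filtered clip as the noise. The function names and the tiny sample lists are made up for illustration.

```python
import math


def mse(reference, estimate):
    """Mean squared error between two equally long sample lists."""
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def rmse(reference, estimate):
    """Root mean squared error, in the same units as the samples."""
    return math.sqrt(mse(reference, estimate))


def snr_db(reference, estimate):
    """SNR in decibels, treating (reference - estimate) as the noise."""
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if noise_power == 0:
        return math.inf  # identical inputs: no measurable noise
    return 10 * math.log10(signal_power / noise_power)


# Made-up samples: an "original" clip and a filtered version of it.
original = [1.0, -1.0, 1.0, -1.0]
filtered = [0.5, -0.5, 0.5, -0.5]

error = mse(original, filtered)
root_error = rmse(original, filtered)
ratio = snr_db(original, filtered)

# A filter that changes nothing gets the best possible score under this
# assumption, even though it has removed no noise at all.
do_nothing_score = snr_db(original, original)
```

Under this assumed setup, a filter that leaves the audio untouched scores perfectly, which is one concrete reason to listen to the output rather than rely on the numbers alone.
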
Do your research
This month I have read a ton of papers, though some of them proved to be not that useful after all. Part of my paper reading was to confirm or refute some assumptions I had. Nevertheless, I think it was essential because it reassured me that I was trying the best options out there. In addition to papers, there are many more helpful resources on the web. I have found YouTube tutorials, blog posts and other people’s GitHub repositories extremely useful, too!
If you don’t say it, no one knows it
I think communicating with your mentors about what you are up to is very helpful. Sometimes I find myself stuck on bugs that I can’t seem to solve in the moment, but by telling others what I am doing, I can receive helpful suggestions and insights. More specifically, I usually try to tell everyone not only about my challenges, but also which solutions I have tried and what I am thinking of doing next.
Record everything or you will forget
For each challenge I face, I try to record not only all the steps that I have taken, but also all the resources that I used. I also add comments so that when I look back I know which sources were useful, which parts helped, and why others were unhelpful. Reading back over all the challenges I have overcome is actually very satisfying. It proves to me that even though the main mission is not done yet, I am still on track and have improved a lot!

To summarise, this month was intense and packed with challenges, but I can’t wait to see what more is to come!
This is the link to our GitHub repository: https://github.com/orcasound/acoustic-separation

Orca calls analysis for acoustic separation from Ambra