Fusing shallow and deep learning for bioacoustic bird species classification

Bibliographic Details
Published in Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998), pp. 141-145
Main Authors Salamon, Justin; Bello, Juan Pablo; Farnsworth, Andrew; Kelling, Steve
Format Conference Proceeding
Language English
Published IEEE, 01.03.2017
ISSN 2379-190X
DOI 10.1109/ICASSP.2017.7952134

Abstract Automated classification of organisms to species based on their vocalizations would contribute tremendously to abilities to monitor biodiversity, with a wide range of applications in the field of ecology. In particular, automated classification of migrating birds' flight calls could yield new biological insights and conservation applications for birds that vocalize during migration. In this paper we explore state-of-the-art classification techniques for large-vocabulary bird species classification from flight calls. In particular, we contrast a "shallow learning" approach based on unsupervised dictionary learning with a deep convolutional neural network combined with data augmentation. We show that the two models perform comparably on a dataset of 5428 flight calls spanning 43 different species, with both significantly outperforming an MFCC baseline. Finally, we show that by combining the models using a simple late-fusion approach we can further improve the results, obtaining a state-of-the-art classification accuracy of 0.96.
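The abstract names a "simple late-fusion approach" but does not state the fusion rule. As a minimal sketch only, the Python below assumes each model outputs a per-class probability vector and that the two vectors are combined by a weighted arithmetic mean; the function names, the weight parameter, and the three-class toy scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def late_fusion(prob_a: Sequence[float],
                prob_b: Sequence[float],
                weight_a: float = 0.5) -> List[float]:
    """Fuse two per-class probability vectors with a weighted arithmetic mean.

    Assumption: this averaging rule is illustrative; the abstract does not
    specify which fusion rule the authors used.
    """
    if len(prob_a) != len(prob_b):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    fused = [weight_a * a + (1.0 - weight_a) * b
             for a, b in zip(prob_a, prob_b)]
    total = sum(fused)
    if total <= 0.0:
        raise ValueError("fused scores must have a positive sum")
    # Renormalize so the fused scores sum to one.
    return [p / total for p in fused]


def predict(prob_a: Sequence[float],
            prob_b: Sequence[float],
            weight_a: float = 0.5) -> int:
    """Return the index of the class with the highest fused score."""
    fused = late_fusion(prob_a, prob_b, weight_a)
    return max(range(len(fused)), key=fused.__getitem__)


# Toy scores for three hypothetical species (not data from the paper).
shallow_probs = [0.50, 0.40, 0.10]  # e.g. the dictionary-learning model
deep_probs = [0.20, 0.70, 0.10]     # e.g. the convolutional network
fused_probs = late_fusion(shallow_probs, deep_probs)
predicted_class = predict(shallow_probs, deep_probs)
```

In this toy case the shallow model alone would pick class 0, while the fused scores favor class 1. Late fusion of this kind operates only on model outputs, so the two classifiers can be trained independently.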
Author Kelling, Steve
Salamon, Justin
Bello, Juan Pablo
Farnsworth, Andrew
Author_xml – sequence: 1
  givenname: Justin
  surname: Salamon
  fullname: Salamon, Justin
  email: justin.salamon@nyu.edu
  organization: Music & Audio Res. Lab., New York Univ., New York, NY, USA
– sequence: 2
  givenname: Juan Pablo
  surname: Bello
  fullname: Bello, Juan Pablo
  organization: Music & Audio Res. Lab., New York Univ., New York, NY, USA
– sequence: 3
  givenname: Andrew
  surname: Farnsworth
  fullname: Farnsworth, Andrew
  organization: Cornell Lab. of Ornithology, Cornell Univ., Ithaca, NY, USA
– sequence: 4
  givenname: Steve
  surname: Kelling
  fullname: Kelling, Steve
  organization: Cornell Lab. of Ornithology, Cornell Univ., Ithaca, NY, USA
ContentType Conference Proceeding
DOI 10.1109/ICASSP.2017.7952134
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
Discipline Engineering
EISBN 1509041176
9781509041176
EISSN 2379-190X
EndPage 145
ExternalDocumentID 7952134
Genre orig-research
IngestDate Wed Aug 27 02:15:07 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
PageCount 5
ParticipantIDs ieee_primary_7952134
PublicationCentury 2000
PublicationDate 2017-03
PublicationDateYYYYMMDD 2017-03-01
PublicationDate_xml – month: 03
  year: 2017
  text: 2017-03
PublicationDecade 2010
PublicationTitle Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998)
PublicationTitleAbbrev ICASSP
PublicationYear 2017
Publisher IEEE
Publisher_xml – name: IEEE
SourceID ieee
SourceType Publisher
StartPage 141
SubjectTerms bioacoustics
Birds
Convolutional codes
Convolutional neural networks
data augmentation
deep learning
Dictionaries
flight calls
Machine learning
Mel frequency cepstral coefficient
Monitoring
Neural networks
Title Fusing shallow and deep learning for bioacoustic bird species classification
URI https://ieeexplore.ieee.org/document/7952134