ICBHI 2017 Challenge

Respiratory Sound Database

The Respiratory Sound database was originally compiled to support the scientific challenge organized at Int. Conf. on Biomedical Health Informatics - ICBHI 2017. The current version of this database is made freely available for research and contains both the public and the private dataset of the ICBHI challenge.

The Respiratory Sound Database contains audio samples, collected independently by two research teams in two different countries, over several years. Most of the database consists of audio samples recorded by the School of Health Sciences, University of Aveiro (ESSUA) research team at the Respiratory Research and Rehabilitation Laboratory (Lab3R), ESSUA and at Hospital Infante D. Pedro, Aveiro, Portugal. The second research team, from the Aristotle University of Thessaloniki (AUTH) and the University of Coimbra (UC), acquired respiratory sounds at the Papanikolaou General Hospital, Thessaloniki and at the General Hospital of Imathia (Health Unit of Naousa), Greece.

The database consists of a total of 5.5 hours of recordings containing 6898 respiratory cycles, of which 1864 contain crackles, 886 contain wheezes, and 506 contain both crackles and wheezes, in 920 annotated audio samples from 126 subjects.

The cycles were annotated by respiratory experts as including crackles, wheezes, a combination of them, or no adventitious respiratory sounds. The recordings were collected using heterogeneous equipment and their duration ranged from 10s to 90s. The chest locations from which the recordings were acquired is also provided. Noise levels in some respiration cycles is high, which simulate real life conditions.

Each file name is divided into 5 elements, separated with underscores (_).

1. Patient number (101,102,...,226)

2. Recording index

3. Chest location 

      a. Trachea (Tc)

      b. Anterior left (Al)

      c. Anterior right (Ar)

      d. Posterior left (Pl)

      e. Posterior right (Pr)

      f. Lateral left (Ll)

      g. Lateral right (Lr)

4.    Acquisition mode 

     a. sequential/single channel (sc), 

     b. simultaneous/multichannel (mc)

5.    Recording equipment 

     a. AKG C417L Microphone (AKGC417L), 

     b. 3M Littmann Classic II SE Stethoscope (LittC2SE), 

     c. 3M Litmmann 3200 Electronic Stethoscope (Litt3200), 

     d.  WelchAllyn Meditron Master Elite Electronic Stethoscope (Meditron)

The annotation files comprise four columns:

  • Beginning of respiratory cycle(s)
  • End of respiratory cycle(s)
  • Presence/absence of crackles (presence=1, absence=0)
  • Presence/absence of wheezes (presence=1, absence=0)

The diagnosis for each subject can be found here and the distribution of the subjects between training/test set can be found here. The abbreviations used in the diagnosis file are:

  • COPD: Chronic Obstructive Pulmonary Disease
  • LRTI: Lower Respiratory Tract Infection
  • URTI: Upper Respiratory Tract Infection

This database is freely available for research and can be downloaded in zip format from this link or you can browse the database files in this page. Furthermore, you can download the files containing the respective demographic information and the detailed events. (The columns in the text file correspond to the following variables (NA: Not Available): Participant ID, Age, Sex, Adult BMI (kg/m2), Child Weight (kg), Child Height (cm)).

Notice: Due to a bug in the file naming process, 92 files in the initial database version had the wrong recording equipment in the file name. The updated version of the database has the corrected file names. The wrong file names in the initial database version can be found infilename_differences.txt. If you have downloaded the previous database and just want to rename the files, you just need to replace the last 8 characters by Meditron or you can use this bash shell script to rename them automatically.

Publications using this database should cite the following paper in order to identify the database:
Rocha BM et al. (2019) "An open access database for the evaluation of respiratory sound classification algorithms" Physiological Measurement 40 035001

Comments on the experience with the dataset, using the following email, would be highly appreciated. 
IFMBE logoCoimbra University logoAveiro university logoAristotle University of Thessaloniki logo