Extracting pitch track from audio files into a data frame

My task was to extract pitch values from a long list of audio files. Previously I used Praat and R for this task but looping in R was rather slow so I wanted to find another solution. The following analysis was developed on Linux (Ubuntu).

Firstly, aubio (CLI-only Python tool) was used to extract pitch from wav files. aubio has fewer arguments than Praat and it returned awkward values using default settings so I didn’t explore it further. The good thing about it is that it is easy to use and is relatively simple and Python-native. To extract pitch with aubio use:

sudo apt install aubio-tools
aubiopitch -i P17_trim_short_10.000-11.150.wav

Eventually I decided to stick to Praat, which is the workhorse of phonetics and can be used from the command line.

Praat saves all commands that are executed and this can be a great start for creating a script. More information about scripting in Praat is here. My solution is here:

This script will extract .pitch files from all .wav files in the working directory and will save them to a subfolder. Praat scripts can be called from the command line:

praat --run extract_pitch_script.praat

Which will extract pitch tracks from all .wav files in the directory. The pitch extraction will use default settings in Praat. The output will be one .pitch file for each .wav file. The files themselves contain all candidates and are not in a tidy format so they have to be transformed. This step could probably be done in Praat scripting but I did not have patience to achieve it there and I moved to R which could easily produce desired output.

R can be called from the command line using littler. Shebang on the first line means that the script can be called from the command line. The script below transforms .praat files into clean .csv files.

To invoke the R script, run in the command line:

r praat_pitch_analysis_CLI.R untitled_script.pitch

This creates a .csv file with the best candidate pitch above a certain confidence threshold. Pitch extraction algorithm used by Praat was developed by Boersma (1993).

Automatic splitting of audio files on silence in Python

In my previous post I described how to split audio files into chunks using R. This time I wanted to use Python to prepare long audio files (.mp3) for further analysis. The use case would be splitting a long audio file that contains many words/utterances/syllables that need to be then analysed separately, e.g. recorded list of words.

The analysis described here was conducted on Linux (Ubuntu 16.04) and it should be fairly similar on MacOS, but Windows would require quite a few ammendments.

The first step was to turn the original .m4a files into .mp3 and to extract the segment I was interested in. I used ffmpeg for these tasks. This can be skipped if your files are already clean.

ffmpeg -i P17.m4a P17.mp3
ffmpeg -i P17.mp3 -ss 00:17:50 -to 00:23:30 -c copy P17_trim.mp3

The second command created a copy of the original .mp3 file and extracted the segment between 17 min 50 sec and 23 min 30 sec. That’s where speech was recorded in my file.

ffmpeg output

The continuous audio file that I used contained repeated utterances of the same syllable. Use the code below to split this file into segments. Silence detection is conducted using Support-vector machine (SVM):

Install pyAudioAnalysis and run on the command line:

python pyAudioAnalysis/pyAudioAnalysis/audioAnalysis.py silenceRemoval -i P17_trim_short.mp3 --smoothing 1.0 --weight 0.3
pyAudioAnalysis detect silence in audio files
Top row shows the waveform of the audio signal. Y-axis is amplitude, X-axis is time. Bottom row shows the probabily of non-silence, the vertical lines are markers that will be used to split the file.

The result is a list of sliced wav files. The names contain timings of the boundaries.

pyAudioAnalysis silenceRemoval output example.

All files in a given directory can be split using the following script:

Make sure to point the script to the directory where audioAnalysis.py lives. Modifying smoothing and weight parameters will lead to different effects so this should be adjusted depending on a type of audio recording. By default the script will show a pop-up window with the suggested split. This is very useful for monitoring data quality. The Python script can be used from the command line with:

python split_continuous_audio.py

Spectrograms in R – a gallery

Creating a spectrogram is a basic step in every analysis of audio signals. Spectrograms visualise how frequencies change over a time period. Luckily, there is a selection of R packages that can help with this task. I will present a selection of packages that I like to use. This post is not an introduction to spectrograms. If you want to learn more about them then try other resources (e.g. lecture notes from UCL).

The examples shown below came mostly from the official documentation and were kept as simple as possible. The majority of functions allow further customisation of the plots.



seewave and ggplot2





Creating a spectrogram from the scratch is not so difficult, as shown by Hansen Johnson in this blog post. Another solution was provided by Aaron Albin.

Praat is a workhorse of audio analysis. It is a standalone software, but there is also an R controller called PraatR, that allows calling Praat functions from R. It is not the easiest tool to use so I will just mention it here for reference.

I am pretty sure that there are more packages that allow creating spectrograms but I had to stop somewhere. Feel free to leave comments about other examples.

Removing triggers from Hitachi ETG-4000 fNIRS recordings

Homer2 needs a particular format of a .nirs file that cannot have consecutive triggers (also called Marks in Hitachi files).
hitachi2nirs Matlab script also removes the markers but I wanted to recreate the whole process and be sure that I’m doing it correctly. Answering Yes to the question Do you want to remove the marker at the end of each stimulus? y/n will run the following code:

To remove the triggers/markings in R follow the steps below.

Start with loading the packages and files

This will produce a table showing a structure

## 'data.frame': 2500 obs. of 50 variables:
## $ Probe1 : int 1 2 3 4 5 6 7 8 9 10 ...
## $ CH1.703.6. : num 0.1865 0.0182 -0.4738 -0.1521 -0.3078 ...
## $ CH1.829.0. : num 0.412 0.547 0.534 0.314 0.106 ...
## $ CH2.703.9. : num 0.739 0.764 0.746 0.751 0.762 ...
## $ CH2.829.3. : num 1.01 1.01 1.03 1.03 1.03 ...
## $ CH3.703.9. : num 1.57 1.58 1.59 1.59 1.6 ...
## $ CH3.829.3. : num 1.64 1.65 1.65 1.66 1.67 ...
## $ CH4.703.9. : num 1.48 1.45 1.55 1.51 1.47 ...
## $ CH4.828.8. : num 1.63 1.64 1.66 1.66 1.68 ...
## $ CH5.703.6. : num -1.226 -1.743 -0.546 -0.556 -0.75 ...
## $ CH5.829.0. : num 0.00397 -0.23102 -1.11099 -0.64056 -1.01425 ...
## $ CH6.703.1. : num -0.247 -0.335 -0.371 -0.667 -1.064 ...
## $ CH6.828.8. : num 0.987 0.892 0.892 0.933 0.796 ...
## $ CH7.703.9. : num 1.03 1.3 1.11 1.02 1.44 ...
## $ CH7.829.3. : num 1.2 1.22 1.21 1.23 1.23 ...
## $ CH8.702.9. : num 2 2.01 2.03 2.04 2.04 ...
## $ CH8.829.0. : num 1.79 1.81 1.81 1.83 1.85 ...
## $ CH9.703.9. : num 2.07 2.02 2.12 2.12 2.01 ...
## $ CH9.828.8. : num 1.82 1.82 1.82 1.84 1.85 ...
## $ CH10.703.1. : num -0.492 -0.135 -0.598 -0.598 -0.328 ...
## $ CH10.828.8. : num 0.672 0.61 0.823 0.724 0.724 ...
## $ CH11.703.1. : num -1.042 -0.255 -1.773 -1.419 -0.449 ...
## $ CH11.828.8. : num 1.071 1.052 0.804 1.107 1.047 ...
## $ CH12.702.9. : num 0.684 0.771 0.704 0.512 0.905 ...
## $ CH12.829.0. : num 1.02 1.01 1.03 1.08 1.07 ...
## $ CH13.702.9. : num 2.03 2.03 2.05 2.05 2.05 ...
## $ CH13.829.0. : num 1.76 1.78 1.79 1.79 1.81 ...
## $ CH14.703.6. : num -1.719 -1.196 -0.359 -0.883 -1.99 ...
## $ CH14.829.0. : num -0.0832 0.0209 -0.1123 -0.2014 -0.2011 ...
## $ CH15.703.1. : num 1.97 1.89 1.82 1.98 2.09 ...
## $ CH15.828.8. : num 1.81 1.78 1.8 1.81 1.84 ...
## $ CH16.703.4. : num 0.0209 -0.4283 -0.0848 -0.278 0.4996 ...
## $ CH16.829.0. : num 1.36 1.26 1.38 1.23 1.27 ...
## $ CH17.702.9. : num 2.35 2.35 2.36 2.37 2.38 ...
## $ CH17.829.0. : num 2.08 2.09 2.11 2.12 2.13 ...
## $ CH18.703.6. : num 2.1 2.1 2.09 2.1 2.1 ...
## $ CH18.828.5. : num 2.14 2.14 2.14 2.15 2.15 ...
## $ CH19.703.6. : num -1.104 -1.134 -0.658 -0.886 -0.336 ...
## $ CH19.829.0. : num -0.1239 0.09369 0.05463 0.01617 -0.00427 ...
## $ CH20.703.4. : num 1.65 1.55 1.28 1.35 1.56 ...
## $ CH20.829.0. : num 1.77 1.75 1.8 1.81 1.8 ...
## $ CH21.703.4. : num 1.41 1.43 1.31 1.42 1.46 ...
## $ CH21.829.0. : num 1.76 1.77 1.76 1.77 1.79 ...
## $ CH22.703.6. : num 2.11 2.29 2.18 2.24 2.21 ...
## $ CH22.828.5. : num 2.17 2.17 2.18 2.17 2.2 ...
## $ Mark : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Time : num 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 ...
## $ BodyMovement: int 0 0 0 0 0 0 0 0 0 0 ...
## $ RemovalMark : int 0 0 0 0 0 0 0 0 0 0 ...
## $ PreScan : int 1 1 1 1 1 1 1 1 1 1 ...

table of triggers

## .
## 1 2 9 10
## 73 10 4 2

and a plot showing raw data in one channel (quite noisy) with all the triggers.

This shows several triggers (all plotted using red colour). I will only keep trigger ‘2’ to mark the beginning of a block. The first step is cleaning the data by removing all but trigger ‘2’.

Which results in fewer events

It turned out that there were two ‘2’ triggers next to each other. That’s because ETG-4000 does not allow odd triggers next to each other, e.g. 212 is invalid, but 22111122 is valid. I wrote a function (soon incorporated into fnirsr package) that deals with this problem.

The result, only the first block events, is here

Interactive plot of fNIRS data

The easiest way to plot ETG-4000 data in R is by using plot_ETG4000() from fnirsr package. However, if you want to explore your data in more detail, then an interactive plot is more appropriate.

I used dygraphs package to create the chart below. In case of using many channels, the colours in the legend can get a bit mixed up like in my example. I haven’t figured out yet how to add a custom colour palette that could deal with multiple channels.

One way or another, this code snippet should be enough to start generating interactive charts. I haven’t added the interactive chart to the main plotting function (i.e. plot_ETG4000) but I might do it in future releases.

The code used to generate the chart is here:

PS: The dygraph generated correctly in the interactive window, when using R notebooks, and when knitting. When I Saved as Web Page from RStudio, I got a header error that I had to clean by removing a tag (<!DOCTYPE html>) from the generated html file.

fnirsr – Fixing bugs, Travis CI, and detrending

I haven’t worked on fnirsr (my R package for analysing fNIRS data) for a while so I thought it’s time for some improvements. I read a great introduction to Travis CI and decided to make it work this time. After running R CMD check (and devtools::check()) several times to fix multiple bugs, I finally got to see that lovely green badge 🙂

The package still needs more testing, but so far it does its job. On top of that, I finally added a function that removes a linear trend from an fNIRS signal:

For more details and the latest updates see the project’s GitHub page.
CRAN, here I come!

fnirsr – An R package to analyse ETG-4000 fNIRS data

As I mentioned in my previous post, I am trying to get my head around analysing fNIRS data collected using Hitachi ETG-4000. The output of a recording session with ETG-4000 can be saved as a raw csv file (see the example). This file seems to be pretty straightforward to parse: the top section is a header, and raw data starts at line 41.

I created a set of basic R functions that can deal with the initial stages of the analysis and I wrapped them in an R-package. It is still a very early alpha (or rather pre-alpha), as the documentation is still sparse and no unit tests were made. I only have several raw csv files and they seemed to work fine with my functions but I’m not sure how robust they are.
Anyway, I think it will be useful to release it even in the early stage and work on the functions as time goes by.

The package can be found on GitHub and it can be installed with the following command:


A vignette (Rmd) is here.

HTML vignette:

I couldn’t find any other R packages that would deal with these files so feel free to contact me if you work(ed) on something similar. Pull requests are encouraged.

Loading and plotting nirs data in R

Recently I started to learn how to use Hitachi ETG-4000 functional near-infrared spectroscopy (fNIRS) for my research. Very quickly I found out that, as usual in neuroscience, the main data analysis packages are written in MATLAB.

I couldn’t find any script to analyse fNIRS data in R so I decided to write it myself. Apparently there are some Python options, like MNE or NinPy so I will look into them in future.

ETG-4000 records data in a straightforward(ish) .csv files but the most popular MATLAB package for fNIRS data analysis (HOMER2) expects .nirs files.

There is a ready-made MATLAB script that transforms Hitachi data into the nirs format but it’s only available in MATLAB. I will skip the transformation step for now, and will work only with a .nirs file.

The file I used (Simple_Probe.nirs) comes from the HOMER2 package. It is freely available in the package, but I uploaded it here to make the analysis easier to reproduce.

My code is here:

This will produce separate time series plots for each channel with overlapping triggers, e.g.:

The entire analysis workflow:

Other files:
RMarkdown file
html report

I hope this helps.
More to follow.

Automatic pitch extraction from speech recordings

I needed to extract mean pitch values from audio recordings of human speech, but I wanted to automate it and easily recreate my analyses so I wrote a couple of scripts that can do it much faster.

Here is a recipe for extracting pitch from voice recordings.


  • Cleaning audio files

My audio files were stereo recordings of a participant saying /a/ while hearing (near) real-time pitch shifts in their own productions. The left channel contains the shifted pitch (heard by participants) and the right channel contains the original speech productions.

The first step is to examine the audio recordings for any non-speech sounds. I used Audacity for that. Any grunts or sights can mess up the outcome of scripts used in the analysis. Irrelevant parts of the audio track can be silenced (CTRL+L in Audacity). Once the audio track is cleaned, I split the channels and save them in separate wav files.


Acoustic signal used in the analysis. Highlighted part is showing noise that should be removed.

  • Splitting continuous recordings using SFS

My pitch-extracting scripts expects each utterance to be saved in a separate wav file so I need to split the continuous recordings. It could be done manually but for longer recordings it’s cumbersome. Speech Filing System (SFS) has an option that allows splitting the continuous files on silence.


1. Load a sound file


2. Create multiple annotations

Tools > Speech > Annotate > Find multiple endpoints


Specify the values of npoint. More information can be found here. You don’t need to know the exact number of utterances, but a close approximation should work.


Visualise the results of automatic annotation:


Check if the annotations are correct. If not, then tweak the npoint settings to get the effect you need.


3. Chop the files on annotations

Tools > Speech > Export > Chop signal into annotated regions

This will save the files in the sfs format, but PraatR can’t work with these files. They need to be transformed into wav.


4. Convert sfs into wav files

Load the files you want to convert, highlight them, and go to:

File > Export > Speech



If you don’t want to spend hours doing what I’ve just described then a simpler solution is using a program that runs all the commands described above.

Use the batch script that follows the steps described above (plus some extras).


  • Extracting mean pitch using PraatR

Pitch could be extracted manually in Praat by going to

View & Edit > Pitch > Get pitch

but doing this for many files would take a lot of time and would be error-prone.


Luckily, there is a connection between Praat and R (PraatR) which can speed up this task.

I extracted mean pitch and duration of files. The latter can be used to reject any non-speech files. Here’s the script:

Now you should get a nicely formatted csv file.

I hope this will save you a lot of time.


Butterworth Filter Demo in Shiny

I am using EEGLab to process my electroencephalografic data (i.e. brain’s electric activity), but I wanted to have an interactive visualisation showing how different filter settings change my data. I prefer using R to Matlab, so I decided to create a Shiny app that would do just that.

I tried to filter brainstem’s activity during several speech conditions using Butterworth band-pass filter to get rid of the artefacts.

I wrote a butterHz function which is based on butter_filtfilt.m from the EEGLab Matlab package and is using butter function from the signal R package.

Here I used a time-domain waveform of speech-evoked Auditory Brainstem Responses to demostrate the use of the Butterworth filter.

The code is available on GitHub.