My task was to extract pitch values from a long list of audio files. Previously I used Praat and R for this task but looping in R was rather slow so I wanted to find another solution. The following analysis was developed on Linux (Ubuntu).
First I tried aubio, a command-line Python tool, to extract pitch from the wav files. aubio has fewer options than Praat and returned awkward values with the default settings, so I didn't explore it further. On the plus side, it is easy to use and Python-native. To extract pitch with aubio, use:
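A minimal invocation might look like this (assuming aubio is installed; the exact flags can differ between versions, so check `aubiopitch --help`):

```shell
# extract a pitch track (time/frequency pairs) from a wav file
# and redirect it to a text file for later processing
aubiopitch -i input.wav > input_pitch.txt
```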
Eventually I decided to stick with Praat, the workhorse of phonetics, which can also be used from the command line.
Praat saves all executed commands, which can be a great starting point for creating a script. More information about scripting in Praat is here. My solution is here:
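A sketch of such a script (using Praat's modern colon syntax; the To Pitch arguments 0.0, 75, 600 are Praat's defaults, and the pitch/ subfolder name is my choice):

```
createDirectory: "pitch"
wavs = Create Strings as file list: "wavs", "*.wav"
n = Get number of strings
for i to n
    selectObject: wavs
    name$ = Get string: i
    sound = Read from file: name$
    pitch = To Pitch: 0.0, 75, 600
    Save as text file: "pitch/" + name$ - ".wav" + ".pitch"
    removeObject: sound, pitch
endfor
```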
This script will extract .pitch files from all .wav files in the working directory and will save them to a subfolder. Praat scripts can be called from the command line:
praat --run extract_pitch_script.praat
This will extract pitch tracks from all .wav files in the directory, using Praat's default settings. The output is one .pitch file per .wav file. These files contain all pitch candidates and are not in a tidy format, so they have to be transformed. This step could probably be done with Praat scripting, but I didn't have the patience to achieve it there, so I moved to R, which could easily produce the desired output.
R can be called from the command line using littler. A shebang on the first line means that the script can be run directly from the command line. The script below transforms the .pitch files into clean .csv files.
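A simplified sketch of such a script (it assumes the textual .pitch format with "frequency =" and "strength =" lines for each candidate; the 0.5 strength threshold is an arbitrary choice for illustration):

```r
#!/usr/bin/env r
# usage: r praat_pitch_analysis_CLI.R file.pitch
# littler provides argv with the command-line arguments
file <- argv[1]
lines <- readLines(file)

# pull out candidate frequencies and strengths from the Praat text format
freq <- as.numeric(gsub(".*= ", "", grep("frequency = ", lines, value = TRUE)))
strength <- as.numeric(gsub(".*= ", "", grep("strength = ", lines, value = TRUE)))

df <- data.frame(frequency = freq, strength = strength)
# keep voiced candidates above a confidence (strength) threshold
df <- df[df$strength > 0.5 & df$frequency > 0, ]

write.csv(df, gsub("\\.pitch$", ".csv", file), row.names = FALSE)
```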
To invoke the R script, run in the command line:
r praat_pitch_analysis_CLI.R untitled_script.pitch
This creates a .csv file with the best candidate pitch above a certain confidence threshold. The pitch extraction algorithm used by Praat was developed by Boersma (1993).
R might not be the most obvious tool for analysing audio data. However, an increasing number of packages allow analysing and synthesising sounds. One such package is seewave. Jerome Sueur, one of the authors of seewave, has now written a book about working with audio data in R. The book is entitled Sound Analysis and Synthesis with R and was published by Springer in 2018. I highly recommend it to anyone working with audio data.
The book starts with a general explanation of sound. It then introduces R to readers who have no experience using it. Across the 17 chapters the author describes the basic audio analyses that can be conducted with R. The underlying concepts are explained using both mathematical equations and R code. There is also some material on sound synthesis, but it is a minor point compared to the space devoted to analysis. Additional materials include the sound samples used throughout the book.
As mentioned before, the main topic of the book is the analysis of sound, predominantly in scientific settings. Researchers (or data scientists) typically want to load, visualise, play, and quantify a particular sound that they work on. These basic steps are described in this book with code examples that are simple to follow and richly illustrated with R-generated plots. Check the book preview here.
If you ever need to paste, delete, repeat, or reverse audio files with R, then recipes for these tasks can be found in this book. The book contains twenty DIY Boxes which show alternative ways to use existing functions and demonstrate new tasks. These boxes cover topics ranging from loading audio files and plotting to frequency and amplitude analysis.
Even though the author created his own package, the book shows how to use a wide range of audio-specific R packages, such as tuneR or warbleR.
I can only wish that this book had been released earlier. It would have saved me a lot of pain conducting audio analyses.
Creating a spectrogram is a basic step in every analysis of audio signals. Spectrograms visualise how frequencies change over time. Luckily, there is a selection of R packages that can help with this task. I will present a selection of packages that I like to use. This post is not an introduction to spectrograms; if you want to learn more about them, try other resources (e.g. the lecture notes from UCL).
The examples shown below came mostly from the official documentation and were kept as simple as possible. The majority of functions allow further customisation of the plots.
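For instance, seewave ships with an example recording and can draw a spectrogram in a single call:

```r
library(seewave)

data(tico)     # example bird recording bundled with seewave
spectro(tico)  # spectrogram: time on x, frequency on y, amplitude as colour
```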
Creating a spectrogram from scratch is not so difficult, as shown by Hansen Johnson in this blog post. Another solution was provided by Aaron Albin.
Praat is a workhorse of audio analysis. It is standalone software, but there is also an R controller called PraatR that allows calling Praat functions from R. It is not the easiest tool to use, so I will just mention it here for reference.
I am pretty sure that there are more packages that allow creating spectrograms but I had to stop somewhere. Feel free to leave comments about other examples.
I recently came across an interesting package called geofacet, which arranges plots according to their position on a map. Its main function, facet_geo(), replaces facet_wrap() from ggplot2. A Polish map is not yet available in the standard geofacet package, but I hope it will be there soon, as I have added it on GitHub.
I created a grid with the coordinates of the individual voivodeships. Plots made with geofacet can look like this:
The placement of the voivodeships is not perfect, but geofacet allows using custom settings.
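A minimal sketch of how a custom grid is used (the data and grid below are toy examples, not the actual voivodeship grid):

```r
library(geofacet)
library(ggplot2)

# a custom grid is a data frame with row/col positions plus code and name
my_grid <- data.frame(
  row  = c(1, 1, 2),
  col  = c(1, 2, 1),
  code = c("A", "B", "C"),
  name = c("Region A", "Region B", "Region C")
)

toy <- data.frame(
  name  = rep(c("Region A", "Region B", "Region C"), each = 3),
  year  = rep(2015:2017, times = 3),
  value = runif(9)
)

# facets are laid out according to the grid instead of a simple wrap
ggplot(toy, aes(year, value)) +
  geom_line() +
  facet_geo(~ name, grid = my_grid)
```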
Zoopla allows limited access to its API, providing the latest property prices and area indices. I created an R package that allows querying this database. See the GitHub documentation or zooplaR's page for the latest info.
You can easily get prices in the last couple of months or years for a particular postcode, outcode or area:
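The general pattern looks like this (the function name below is hypothetical and simply mirrors the Zoopla API endpoint; check the package documentation for the actual calls):

```r
library(zooplaR)  # available from GitHub

# hypothetical example: average sold prices for an outcode;
# the real function names and arguments may differ
# prices <- average_area_sold_price("EC1V")
```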
Given the limited number of queries, it might be worth double-checking the results with the property widget offered by Zoopla (redirects to zoopla.co.uk).
It doesn't have as many options as the API and obviously is not automatic, but it's worth using for a sanity check.
During the development of another R package I wasted a bit of time figuring out how to add code coverage to my package. I had the same problem last time so I decided to write up the procedure step-by-step.
Provided that you've already written an R package, the next step is to create tests. Luckily, the devtools package makes setting up both testing and code coverage a breeze.
Let's start by adding an infrastructure for tests with devtools:
library(devtools)
use_testthat()
Then add a test file for your_function() to the tests folder:
use_test("your_function")
Then add the scaffolding for code coverage (Codecov):
use_coverage(pkg = ".", type = c("codecov"))
After running this code you will get a snippet that can be added to your README file to display a codecov badge. In my case it's the following: [![Coverage Status](https://img.shields.io/codecov/c/github/erzk/PostcodesioR/master.svg)](https://codecov.io/github/erzk/PostcodesioR?branch=master)
This will create a codecov.yml file that needs to be edited by adding: comment: false
Then add the following line to the after_success section of your .travis.yml, so that coverage is reported on each Travis build:
- Rscript -e 'covr::codecov()'
Now log in to codecov.io using your GitHub account. Give Codecov access to the project whose code you want to cover. This should lead to a screen where you can see a token, which needs to be copied:
Once this is completed, go back to R and run the following commands to use covr:
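Assuming covr is installed, the commands are along these lines (the token placeholder stands for the value copied from the codecov.io screen):

```r
library(covr)

package_coverage()                  # check the coverage locally
codecov(token = "YOUR_TOKEN_HERE")  # upload the results to codecov.io
```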
The last line will connect your package to codecov. If the whole process worked, you should be able to see a percentage of coverage in your badge, like this:
Click on it to see which functions are not fully covered and need more tests:
I hope this will be useful and will save you a lot of frustration.
The easiest way to plot ETG-4000 data in R is by using plot_ETG4000() from the fnirsr package. However, if you want to explore your data in more detail, then an interactive plot is more appropriate.
I used the dygraphs package to create the chart below. When using many channels, the colours in the legend can get a bit mixed up, as in my example. I haven't yet figured out how to add a custom colour palette that could deal with multiple channels.
One way or another, this code snippet should be enough to start generating interactive charts. I haven’t added the interactive chart to the main plotting function (i.e. plot_ETG4000) but I might do it in future releases.
The code used to generate the chart is here:
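A simplified sketch of the approach (with toy sine waves standing in for the actual fNIRS channels, which would come from the ETG-4000 csv):

```r
library(dygraphs)

# toy multichannel signal sampled at 10 Hz, one column per channel
signal <- ts(cbind(CH1 = sin(seq(0, 20, by = 0.1)),
                   CH2 = cos(seq(0, 20, by = 0.1))),
             start = 0, frequency = 10)

# interactive chart with a range selector for zooming into the signal
dyRangeSelector(dygraph(signal, main = "ETG-4000 channels"))
```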
PS: The dygraph rendered correctly in the interactive window, when using R notebooks, and when knitting. When I used Save as Web Page from RStudio, I got a header error that I had to fix by removing a tag (<!DOCTYPE html>) from the generated html file.
I haven't worked on fnirsr (my R package for analysing fNIRS data) for a while, so I thought it was time for some improvements. I read a great introduction to Travis CI and decided to make it work this time. After running R CMD check (and devtools::check()) several times to fix multiple bugs, I finally got to see that lovely green badge 🙂
The package still needs more testing, but so far it does its job. On top of that, I finally added a function that removes a linear trend from an fNIRS signal:
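The idea behind it can be sketched as follows (this is an illustration, not the actual fnirsr implementation, and the function name detrend_channel is made up):

```r
# fit a straight line to the signal and keep the residuals,
# i.e. the signal with the linear trend removed
detrend_channel <- function(x) {
  t <- seq_along(x)
  as.numeric(residuals(lm(x ~ t)))
}

# example: a sine wave with an added linear drift
drifting <- sin(seq(0, 10, by = 0.1)) + 0.5 * seq(0, 10, by = 0.1)
clean <- detrend_channel(drifting)
```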
For more details and the latest updates see the project’s GitHub page.
CRAN, here I come!
As I mentioned in my previous post, I am trying to get my head around analysing fNIRS data collected using a Hitachi ETG-4000. The output of a recording session with the ETG-4000 can be saved as a raw csv file (see the example). This file is pretty straightforward to parse: the top section is a header, and the raw data starts at line 41.
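That structure means base R can read the raw data directly (assuming the layout described above; "example.csv" is a placeholder file name):

```r
# skip the 40-line header so that reading starts at line 41,
# where the column names of the raw data sit
raw <- read.csv("example.csv", skip = 40, stringsAsFactors = FALSE)
```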
I created a set of basic R functions that deal with the initial stages of the analysis and wrapped them in an R package. It is still a very early alpha (or rather pre-alpha): the documentation is sparse and there are no unit tests yet. I only have a handful of raw csv files; they seemed to work fine with my functions, but I'm not sure how robust the functions are.
Anyway, I think it will be useful to release it even at this early stage and improve the functions as time goes by.
The package can be found on GitHub and it can be installed with the following command:
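Installation follows the usual GitHub pattern (the repository path below assumes the package lives under my account, as with my other packages):

```r
# install.packages("devtools")  # if not already installed
devtools::install_github("erzk/fnirsr")
```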