I recently ran into a situation where I needed to adjust the pitch of an MP3 file for a song that I needed to learn. The problem was that the song was recorded in one key, and I needed to play it a half step away from that key. Of course, rehearsing in the original key and transposing on the fly is pretty trivial, but sometimes I prefer to learn a song in the key in which I will be playing it.
In the past I have always used a tool like Cakewalk Sonar to load the MP3 file, adjust the pitch, and then save the adjusted audio. But I thought that was far too prosaic an approach; I wanted a way to script the pitch change. This got me thinking about one of my favorite tools: FFmpeg.
I have mentioned FFmpeg in previous blogs; I use it almost every day for one purpose or another, and I have a large collection of batch files to automate various tasks. Unfortunately, I didn't have anything for adjusting audio pitch. That being said, I have done a lot with various FFmpeg audio and video filters, and after a little while of sifting through the available settings I came up with a way to easily change the pitch of an MP3 file. (And if I ever need to process a whole directory of MP3 files, it would be simple to update this script with a loop.)
Here's the secret to the way this works - there are two audio filters that I am using:
- asetrate - this filter sets a new sample rate without resampling the audio data; altering the sample rate will stretch or shrink the audio, thereby changing both the pitch and the length of the audio.
- atempo - this filter adjusts the tempo of the audio; altering the tempo will change the length of the audio, without changing the pitch.
So the trick is to use these two filters inversely; in other words:
- If you multiply the sample rate by 2, then you need to divide the tempo by 2.
- If you divide the sample rate by 1.5, then you need to multiply the tempo by 1.5.
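To make the inverse relationship concrete, here is a minimal Python sketch of the arithmetic. It does not call FFmpeg; the function name `combined_effect` is made up for illustration, and it simply models asetrate as scaling both pitch and length and atempo as scaling length only.

```python
def combined_effect(rate_factor: float, tempo_factor: float) -> tuple[float, float]:
    """Return (pitch_factor, length_factor) after a sample rate change
    followed by a tempo change. Illustrative model only, not FFmpeg itself."""
    # Multiplying the sample rate scales the pitch by the same factor
    # and divides the length of the audio by that factor.
    pitch_factor = rate_factor
    length_after_rate = 1.0 / rate_factor
    # Multiplying the tempo divides the length again; the pitch is untouched.
    length_factor = length_after_rate / tempo_factor
    return pitch_factor, length_factor


# Multiply the sample rate by 2 and divide the tempo by 2.
print(combined_effect(2.0, 0.5))
# Divide the sample rate by 1.5 and multiply the tempo by 1.5.
print(combined_effect(1 / 1.5, 1.5))
```

In both cases the length factor comes out at 1.0 (give or take floating-point rounding), so only the pitch changes.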
With that in mind, I pulled out one of my favorite math constants: 2^(1/12), which is roughly 1.0594630943592952645618252949463. You might recall from some of my other blogs that this is the value from which every pitch in Equal Temperament is derived; in other words, multiplying a frequency by that value raises it by one half step, and that is how every note is created in the chromatic scale that most modern music uses.
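As a quick sanity check on that constant, the following Python sketch prints the frequency ratio for a few half-step counts. The names `SEMITONE` and `semitone_ratio` are just illustrative, and the 440 Hz starting frequency is an assumed example value rather than anything from the batch file.

```python
# Ratio between two adjacent half steps in Equal Temperament.
SEMITONE = 2 ** (1 / 12)


def semitone_ratio(half_steps: int) -> float:
    """Frequency ratio for moving up (positive) or down (negative) by half steps."""
    return 2 ** (half_steps / 12)


for n in (-2, -1, 0, 1, 2, 12):
    print(f"{n:+3d} half steps -> ratio {semitone_ratio(n):.6f}")

# Assumed example: a 440 Hz tone raised by one half step is about 466.16 Hz.
print(f"{440 * semitone_ratio(1):.2f} Hz")
```

Twelve half steps give a ratio of exactly 2, which is one octave.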
Taking that into account, I looked at the filter settings that were possible for use with FFmpeg:
- If I assume that MP3 files are using a sample rate of 44.1 kHz, then I need to give the asetrate filter a new sample rate of r*2^(n/12), where:
- r is the original sample rate (44100).
- n is the number of half steps to raise (positive) or lower (negative) the pitch.
- The atempo filter accepts values between 0.5 and 2.0, where:
- 0.5 is half-tempo
- 1.0 is the original tempo
- 2.0 is double-tempo
With that in mind, I used the inverse of the sample rate formula for the tempo: the atempo value is 1/2^(n/12), or 2^(-n/12), where n is the same number of half steps used for the sample rate. That way the length of the audio stays the same while the pitch changes.
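The two settings can be combined into a single filter string. The sketch below is one way to compute it in Python, assuming a 44100 Hz source, a rounded integer sample rate for asetrate, and six decimal places for atempo; the function name `pitch_filter` is hypothetical and this is not the original batch file.

```python
def pitch_filter(half_steps: int, sample_rate: int = 44100) -> str:
    """Build an FFmpeg -af string that shifts pitch by the given half steps.

    Assumes the source really is at sample_rate; check your file first.
    """
    ratio = 2 ** (half_steps / 12)   # pitch ratio for n half steps
    tempo = 1 / ratio                # inverse, so the length is unchanged
    # The 0.5 to 2.0 range is the atempo limit described in the text.
    if not 0.5 <= tempo <= 2.0:
        raise ValueError("atempo must stay between 0.5 and 2.0")
    new_rate = round(sample_rate * ratio)
    return f"asetrate={new_rate},atempo={tempo:.6f}"


print(pitch_filter(1))    # asetrate=46722,atempo=0.943874
print(pitch_filter(-2))   # asetrate=39289,atempo=1.122462
```

With these limits, shifts of up to twelve half steps in either direction fit in a single atempo filter.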
The math is a little weird, I'll admit - but it's pretty straightforward. And here's the great part for you: I've already done the math, and I've written a batch file which defines a set of constants that can be used in batch files to script raising or lowering the pitch of an MP3 file.
Here's the code for the batch file:
ffmpeg -y -i "%TMPFILE1%" -af "%RAISE_PITCH_01%" "%TMPFILE2%"
The only parts that you need to configure are:
- TMPFILE1 - set this variable to the name of your original input MP3 file.
- TMPFILE2 - set this variable to the name of your adjusted pitch output MP3 file.
- Specify whether to raise or lower the pitch in the FFmpeg command by choosing one of the constants defined in the batch file; for example:
- RAISE_PITCH_02 would raise the pitch of the original audio file by two half-steps (or one whole step).
- LOWER_PITCH_05 would lower the pitch of the original audio file by five half-steps (or 2½ whole steps).
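The constant definitions themselves are not shown above, so the following Python sketch is only a guess at how values with names like RAISE_PITCH_02 and LOWER_PITCH_05 could be generated. The twelve-step range, the 44100 Hz rate, the rounding, the helper names, and the file names `input.mp3` and `output.mp3` are all assumptions made for illustration.

```python
def pitch_filter(half_steps: int, sample_rate: int = 44100) -> str:
    """Filter string for a pitch shift; the tempo is the inverse of the ratio."""
    ratio = 2 ** (half_steps / 12)
    return f"asetrate={round(sample_rate * ratio)},atempo={1 / ratio:.6f}"


def pitch_constants(max_steps: int = 12) -> dict[str, str]:
    """Map names in the RAISE_PITCH_NN / LOWER_PITCH_NN style to filter strings."""
    constants = {}
    for n in range(1, max_steps + 1):
        constants[f"RAISE_PITCH_{n:02d}"] = pitch_filter(n)
        constants[f"LOWER_PITCH_{n:02d}"] = pitch_filter(-n)
    return constants


def ffmpeg_command(src: str, dst: str, filter_string: str) -> list[str]:
    """Argument list mirroring: ffmpeg -y -i <src> -af <filter> <dst>."""
    return ["ffmpeg", "-y", "-i", src, "-af", filter_string, dst]


constants = pitch_constants()
# Print two of the values as batch-style "set" lines.
for name in ("RAISE_PITCH_02", "LOWER_PITCH_05"):
    print(f'set "{name}={constants[name]}"')
# Hypothetical file names; the list could be handed to subprocess.run().
print(ffmpeg_command("input.mp3", "output.mp3", constants["RAISE_PITCH_01"]))
```

Nothing here runs FFmpeg; it only builds the strings, so it is safe to experiment with before touching any audio files.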
There are, of course, hundreds of other parameters which you can pass to FFmpeg in order to customize how FFmpeg processes the audio, but those are way out of scope for this blog.
With that in mind, that's it for now; have fun!