This page is now deprecated and its articles have been replaced with new ones; see the menu “Basics->Audio->Dacs for Sound”.
In the previous article on playing WAVs (Digitised speech/sound on ESP32 – Playing Wavs) we introduced a library for easily playing short WAV files on an ESP32, with a warning that the library was subject to change, and indeed it has. It has been effectively re-written for one main reason: it was always planned that the library would do more than play single WAV files, for example playing several at once, plus other music and sound effects that are not WAV based. This would have put too much work into the interrupt routine and could have led to issues.
So the interrupt routine was simplified so that its only real job is to read a byte from memory and send it to the DAC at a fixed rate (or frequency). Somewhere in the main loop you then need to ensure that the memory the DAC is reading from is populated with the data to be played. The obvious way to do this is a wrap-around buffer: an area of memory set aside to be filled and read from as required.
What is a buffer?
In computing terms a buffer is an area of memory (usually a contiguous array of bytes) where data is stored for retrieval later. There are various types of buffer: First-in First-out, First-in Last-out and of course the wrap-around. The wrap-around buffer is basically a variant of the First-in First-out type, where the first byte stored in the buffer is the first byte pulled out. The buffer “wraps around” back on itself when it reaches the end. With the other types, if you run out of buffer space you usually get some sort of system crash or program halt (depending on how nicely the system behaves in low-memory conditions). First-in First-out buffers are most often (if not exclusively) wrap-around buffers.
Wrap-Around Buffer – the Dash cam
A good example of a wrap-around buffer is the dash-cam. It records video footage all the time, and if you want to review the last 10 mins or whatever, you can. But it doesn’t have limitless memory; it has enough to store the last 5 mins, 30 mins, 60 mins or whatever, depending on the model you buy. As it’s recording, when it gets to the end of memory it wraps around back to the beginning and records over the oldest data. So you will always have the last 30 mins (or whatever) worth of footage.
Now, our wrap-around buffer is a little more complex. Firstly, we want to start playing the sound while the buffer is still being filled (dash-cams just fill, and you review later when not recording). We also want to fill the buffer as quickly as we can, but we cannot write over data that has not yet been played, and we cannot play from an area of the buffer that doesn’t hold valid data. All of this complicates the buffer and requires us to monitor the fill and play positions quite closely.
Our buffer and its memory pointers
The buffer implemented is shown below and in this example is just 16 bytes in length.
The buffer is across the top and has 16 bytes (0-15). We need 3 pointers to keep track of where things are. They are:
NextPlayPos: The next byte to be sent to the DAC
NextFillPos: The next byte that is free to be filled by the buffer filling routine
LastFillPos: The last byte that is free to be filled by the buffer filling routine
You can see that I’ve also added in some example memory holding some sound data. Before any sound is played, or even told to play, the pointers are set as shown in the image above. The code that plays the sounds is an interrupt routine that is called 50,000 times a second. This means the system cannot play back sounds with sample rates greater than 50 kHz, but that is way more than we’ll need for anything using a simple micro-controller. Part of the interrupt code checks whether “NextPlayPos” equals “NextFillPos”; if it does, the routine does nothing. If they are not equal, there is deemed to be a valid byte at NextPlayPos, so the routine retrieves that byte and sends it to the DAC.
I’m not going to give a blow by blow account of how the routines fill or play from the buffer; if you want to take a look at the source then please do so.
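For a rough feel of the idea without opening the library, here is a minimal sketch of a wrap-around buffer in plain C++. It is not the XT_DAC_Audio source: the struct `WrapBuffer` and the functions `fill`, `playTick` and `advance` are hypothetical names made up for illustration. It also tracks only two of the three pointers and keeps one slot permanently empty to tell "full" from "empty", which may well differ from how the library uses LastFillPos.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical wrap-around buffer, NOT the XT_DAC_Audio source.
// The field names echo the article's pointers; everything else is assumed.
constexpr std::size_t kBufferSize = 16;  // same size as the article's example

struct WrapBuffer {
    std::uint8_t data[kBufferSize] = {};
    std::size_t nextPlayPos = 0;  // next byte to send to the DAC
    std::size_t nextFillPos = 0;  // next free byte for the fill routine

    // Step an index forward by one, wrapping from the last slot back to 0.
    static std::size_t advance(std::size_t pos) { return (pos + 1) % kBufferSize; }

    // Called from the main loop: copy bytes in until the buffer is full.
    // One slot is always left empty so the fill position never catches up
    // with the play position. Returns how many bytes were accepted.
    std::size_t fill(const std::uint8_t* src, std::size_t len) {
        assert(src != nullptr || len == 0);
        std::size_t written = 0;
        while (written < len && advance(nextFillPos) != nextPlayPos) {
            data[nextFillPos] = src[written++];
            nextFillPos = advance(nextFillPos);
        }
        return written;
    }

    // Called on each timer tick: fetch the next byte if there is one.
    // Returns false (and leaves 'out' untouched) when the buffer is empty.
    bool playTick(std::uint8_t& out) {
        if (nextPlayPos == nextFillPos) return false;  // nothing valid to play
        out = data[nextPlayPos];
        nextPlayPos = advance(nextPlayPos);
        return true;
    }
};
```

On real hardware the two positions are shared between the interrupt and the main loop, so they would normally be declared volatile; that is left out here to keep the sketch short.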
The Source code
Click the link below to download the source code. If you’ve already installed the previous version of the Audio DAC library then it will need to be removed first; just delete its folder in the Arduino libraries folder. If you’re not sure what to do then see this article. Once you’ve removed the old version, install the new library using the menu item “Sketch->Include library->Add .zip library”; if you’re not too sure how to do this, see this article.
Example Code
Once the library is installed go to File->Examples->XT_DAC_Audio->Play WAV. This will load up the example which is also listed below.
```cpp
// Playing a digital WAV recording repeatedly using the XTronical DAC Audio library
// prints out to the serial monitor numbers counting up showing that the sound plays
// independently of the main loop
// See www.xtronical.com for write ups on sound
// You will obviously need the appropriate hardware such as ESP32, audio amp and speaker, see
// https://www.xtronical.com/basics/audio/dacs-on-esp32/ for the simple build and specifically
//
// for the write up on digitised speech/sounds

#include "SoundData.h"
#include "XT_DAC_Audio.h"

XT_Wav_Class ForceWithYou(Force);     // create an object of type XT_Wav_Class that is used by
                                      // the dac audio class (below), passing wav data as parameter.

XT_DAC_Audio_Class DacAudio(25,0);    // Create the main player class object.
                                      // Use GPIO 25, one of the 2 DAC pins and timer 0

void setup() {
}

void loop() {
  DacAudio.FillBuffer();              // Fill the sound buffer with data
  if(ForceWithYou.Completed)          // if completed playing, play again
    DacAudio.PlayWav(&ForceWithYou);
}
```
If you’ve followed the previous version of this library you should be able to see that there is very little difference in the code, just one extra line in the main loop, “DacAudio.FillBuffer();”. Yet behind the scenes the code has been significantly re-written. Admittedly the previous version was simpler as you didn’t need this extra line, but it is a small price to pay for the future expansion of the library.
Why did it have to be re-written?
Interrupt routines must complete as quickly as possible, and it was predicted that with the extra workload in the future (as I add in more features) the interrupt routine would be doing too much. So I had to offload some of the work to the main loop.
Changing the size of the buffer
If you open up the file “XT_DAC_Audio.h” within the XT_DAC_Audio folder in the Arduino libraries folder you will see a line that, by default, reads:
#define BUFFER_SIZE 600
This reserves 600 bytes of dynamic RAM for the buffer. Depending upon your code this can be dramatically reduced, or in some cases it may need to be increased; it all depends upon how fast your main loop executes. If it has a lot to do and takes some time to execute then you will need a large buffer (perhaps bigger than 600), but if it executes quickly then you can reduce your buffer size, perhaps to just the minimum (which is a lowly 3 bytes!).
You may ask how you can work out the size of buffer to use, and that would be a good question! The library includes a routine to tell you how much of the buffer is being used on average. So for example if that reports 50 bytes then you could set the buffer size to maybe around 70 bytes rather than the 600 and save yourself heaps of memory!
Buffer usage example code
Below is the example code showing how to use the buffer usage routine AverageBufferUsage(). Note you will need the serial monitor window open to see the result.
```cpp
#include "SoundData.h"
#include "XT_DAC_Audio.h"

XT_Wav_Class ForceWithYou(Force);     // create an object of type XT_Wav_Class that is used by
                                      // the dac audio class (below), passing wav data as parameter.

XT_DAC_Audio_Class DacAudio(25,0);    // Create the main player class object.
                                      // Use GPIO 25, one of the 2 DAC pins and timer 0

void setup() {
  Serial.begin(115200);
}

void loop() {
  DacAudio.FillBuffer();              // Fill the sound buffer with data
  if(ForceWithYou.Completed)          // if completed playing, play again
    DacAudio.PlayWav(&ForceWithYou);
  DacAudio.AverageBufferUsage();
  delay(1);
}
```
Note that there is an artificial delay, delay(1), in the code to deliberately slow it down, simulating a slower main loop; otherwise it would execute so fast that the estimated buffer usage would be 0! So what do we get? Using a 240 MHz ESP32, the serial monitor window showed the following:
Avg Buffer Usage : 49 bytes
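That figure is roughly what a back-of-the-envelope estimate predicts. Assuming the interrupt takes one byte from the buffer on each of its 50,000 calls per second (an assumption, not something checked against the library source), a main loop that stalls for 1 ms needs about 50 bytes queued up. The hypothetical helper below, `bytesConsumed`, is not part of the library; it just does that arithmetic.

```cpp
#include <cstdint>

// Hypothetical helper, not part of XT_DAC_Audio.
// Assumes the interrupt consumes exactly one byte per tick.
// Returns how many bytes are used up while the main loop is busy for
// loopTimeMicros microseconds, rounded up to a whole byte.
constexpr std::uint32_t bytesConsumed(std::uint32_t ticksPerSecond,
                                      std::uint32_t loopTimeMicros) {
    return static_cast<std::uint32_t>(
        (static_cast<std::uint64_t>(ticksPerSecond) * loopTimeMicros + 999999ULL)
        / 1000000ULL);
}

// Worked check: 50,000 ticks per second for 1,000 microseconds is 50 bytes.
static_assert(bytesConsumed(50000, 1000) == 50, "1 ms at 50 kHz is 50 bytes");
```

A real loop takes a little longer than its delay() call and varies from pass to pass, so treat the result as a starting point and leave some headroom.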
Setting the Buffer Memory
So in the “fake” scenario above you would open up the file “XT_DAC_Audio.h” within the XT_DAC_Audio folder in the Arduino libraries folder, where you will see a line that reads:
#define BUFFER_SIZE 600
Change this to something above 49 bytes. If you don’t allocate enough buffer memory then your sounds will start to play slower than the intended speed; nothing worse than that, thankfully! So if this happens, try increasing the buffer size. If you allocate too much memory then you are wasting resources. So on any new code use the AverageBufferUsage() routine shown above to help you set the buffer appropriately, and then obviously remove it from the final code!
In many situations, for most people’s code, you will get a result of “0” for the buffer used, as the main loop will often execute very quickly. But don’t be tempted to set your buffer to 0 or even 1. The minimum buffer size is 3 bytes, otherwise sound will not play. For myself, I do a lot of game coding that sends large amounts of data to display devices. This can make main loops relatively slow and means I need a larger buffer.
What’s Next?
Well, now that the re-write is complete the code is suitable for performing “mixing”, that is, playing more than one sound at once. In the 1980s computers would often be advertised by how many sound channels (sometimes referred to as voices) they had, with 3 being common, for example those using a Texas Instruments SN76489 or a General Instrument AY-3-8912. Commodore had their excellent SID 6581 in their machines, which sported 3 “official” channels but with some clever use of the chip and programming could do 4 and simulate 5. These chips mixed sounds within their hardware, but all we have is an ESP32 with no dedicated sound hardware built in, and we’ve only added an amp externally. We are not going to start adding mixing circuits. As all our sound is digital we can do the mixing digitally as well, using some coding. This takes processor time, but that is something we’re spoilt for with our ESP32s and, to some extent, other MCUs.
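To give an idea of what digital mixing can look like, here is a small sketch that combines two samples into one byte for the DAC. It is only an illustration and not how the library does (or will do) it: `mixSamples` and `clampToInt8` are hypothetical names, and the code assumes unsigned 8-bit samples where 128 means silence, as in 8-bit WAV data.

```cpp
#include <cstdint>

// Hypothetical mixer, not taken from XT_DAC_Audio.
// Assumes unsigned 8-bit samples with 128 as the silent mid-point.

// Limit a value to the signed 8-bit range so loud passages clip
// instead of wrapping round to the opposite extreme.
constexpr int clampToInt8(int v) {
    return v > 127 ? 127 : (v < -128 ? -128 : v);
}

// Re-centre both samples around zero, add them, clip the sum, then
// shift the result back to the unsigned range the DAC expects.
constexpr std::uint8_t mixSamples(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(
        clampToInt8((static_cast<int>(a) - 128) + (static_cast<int>(b) - 128)) + 128);
}
```

Mixing a sample with silence leaves it unchanged, and two loud samples clip at the top or bottom of the range rather than overflowing. Simple addition with clipping is just one option; halving each channel before adding avoids clipping at the cost of volume.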