Crackling sound when loading sound files in parallel threads

May 16, 2014 at 5:00 AM
Edited May 16, 2014 at 5:03 AM
I noticed that when I load multiple sound files in parallel (using threads), the sounds end up crackling when they are played.

I dug into the source code and made a change that fixed the problem. In the file Mdct.cs, I changed the Reverse method to look like this:
    public static void Reverse(float[] samples, int sampleCount)
    {
        Mdct mdct = GetSetup(sampleCount);

        lock (mdct)
        {
            mdct.CalcReverse(samples);
        }
    }
I added the "lock (mdct)". The reason is that multiple threads may be processing with the same mdct object at the same time. The mdct object has a member variable named "buf2Temp" that the CalcReverse method modifies, so if two threads call CalcReverse at the same time on the same mdct object, they will clobber each other's data.

Instead of my solution, it might be better to change the mdct implementation so that "buf2Temp" is not a member variable but is instead created when CalcReverse is called and passed down the chain as needed.
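In case it's useful, here's a rough sketch of what I mean (the names here are illustrative, not the actual NVorbis internals):

```csharp
using System;

// Illustrative sketch only: the scratch buffer is allocated per call and
// passed down to the helper stages, so concurrent calls on the same
// instance can never share mutable state.
class MdctLocalBufferSketch
{
    readonly int _n2;

    public MdctLocalBufferSketch(int n) { _n2 = n / 2; }

    public void CalcReverse(float[] samples)
    {
        float[] buf2Temp = new float[_n2]; // fresh per call: thread-safe by construction
        CalcReverseStage(samples, buf2Temp);
    }

    void CalcReverseStage(float[] samples, float[] scratch)
    {
        // stand-in for the real butterfly stages, which would receive
        // 'scratch' as a parameter instead of reading a shared field
        Array.Copy(samples, scratch, Math.Min(samples.Length, scratch.Length));
    }
}
```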

Anyhow, that change fixed my problem.

I also noticed that the GetSetup() method in the same class is not protecting _setupCache properly. One thread should not be attempting to read from _setupCache while another thread is writing to it (because adding to a dictionary can cause a re-hash of its contents, which repositions everything, and potentially interferes with a read). I changed that to look like this:
    static readonly ReaderWriterLockSlim _cacheLock = new ReaderWriterLockSlim();

    static Mdct GetSetup(int n)
    {
        _cacheLock.EnterUpgradeableReadLock();
        try
        {
            if (!_setupCache.ContainsKey(n))
            {
                _cacheLock.EnterWriteLock();
                try
                {
                    _setupCache[n] = new Mdct(n);
                }
                finally
                {
                    // release even if the Mdct constructor throws
                    _cacheLock.ExitWriteLock();
                }
            }

            return _setupCache[n];
        }
        finally
        {
            _cacheLock.ExitUpgradeableReadLock();
        }
    }
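Alternatively, if a .NET 4 dependency is acceptable, a ConcurrentDictionary would express the same thing more compactly. Rough sketch (I haven't tried this against the codebase):

```csharp
using System.Collections.Concurrent;

// Sketch only: GetOrAdd handles the read/write race internally. Note that
// the value factory can run more than once under contention, but only one
// Mdct instance is ever published to the cache for a given n.
static readonly ConcurrentDictionary<int, Mdct> _setupCache =
    new ConcurrentDictionary<int, Mdct>();

static Mdct GetSetup(int n)
{
    return _setupCache.GetOrAdd(n, size => new Mdct(size));
}
```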
Anyhow, just wanted to let you know about this and see whether any of these fixes might be helpful. It's also possible I've made an incorrect assumption about how things are meant to be used; if so, that would be good to know too.
Coordinator
May 16, 2014 at 12:00 PM
D'oh!! I never considered whether Mdct needed to be thread-safe or not. Obviously it does, but I missed it. Sorry about that.

I'll either put in locking or change the logic to no longer use a member variable. I'll think about GetSetup() and see if I like that locking style there.
Coordinator
May 16, 2014 at 2:36 PM
I've checked in some changes to address the threading. It works, but I'm not done with it (this was a quick fix so others can multi-thread the library when building from source). When I have some time to really fix it, my plan is to split out the lookup tables from the work buffers, handle threading in Reverse(...), and make the whole class "static" so there's no vtable lookup penalty (very slightly better performance).
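Roughly, the per-thread buffer idea looks like this (an illustrative sketch, not the exact code I checked in):

```csharp
using System;

// Illustrative sketch only: the lookup tables stay shared and read-only,
// while each thread gets its own scratch buffer, so Reverse(...) needs
// no lock around the transform itself.
static class MdctSketch
{
    [ThreadStatic] static float[] _scratch; // one work buffer per thread

    public static void Reverse(float[] samples, int sampleCount)
    {
        // [ThreadStatic] fields are null the first time each thread arrives,
        // and must be re-sized if this thread last handled a different n
        int n2 = sampleCount / 2;
        if (_scratch == null || _scratch.Length < n2)
            _scratch = new float[n2];

        CalcReverse(samples, _scratch);
    }

    static void CalcReverse(float[] samples, float[] scratch)
    {
        // ... the actual MDCT butterflies would go here, using 'scratch' ...
    }
}
```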

In any case, I don't want to lock around the call to CalcReverse(...) since it is the single most expensive part of decoding by a nearly 50% margin. Making threads have to wait for it is not good for latency.
May 16, 2014 at 4:14 PM
Wow, thanks for the quick reply! I just took a look at the changes, and they appear quite reasonable.

I know you're not finished, so at the risk of becoming annoying, I will suggest one more change: instead of using a thread-local dictionary to store buffers, you could try using a pool. That would look something like this.
Stack<float[]> _bufferPool = new Stack<float[]>();

float[] GetBuffer()
{
    lock (_bufferPool)
    {
        if (_bufferPool.Count > 0)
        {
            return _bufferPool.Pop();
        }
    }

    // allocate outside the lock; the pool is per-Mdct instance (not static),
    // so every buffer in it has the same length (_n2)
    return new float[_n2];
}

void ReleaseBuffer(float[] buffer)
{
    lock (_bufferPool)
    {
        _bufferPool.Push(buffer);
    }
}
This should be very slightly faster, but the real benefit is that different threads can reuse buffers created by earlier threads, so fewer need to be allocated (and held indefinitely, growing as more threads are spawned). Anyhow, that shouldn't be a big deal for what I'm working on, so feel free to implement it as you see fit.
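One more thought: callers would probably want to pair the two calls in a try/finally so buffers always make it back to the pool, even if the transform throws. Hypothetical usage (CalcReverse here just stands in for wherever the buffer is used):

```csharp
public void CalcReverse(float[] samples)
{
    float[] scratch = GetBuffer();
    try
    {
        // ... run the reverse transform using 'scratch' ...
    }
    finally
    {
        ReleaseBuffer(scratch); // return the buffer to the pool no matter what
    }
}
```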

I really do appreciate the speed of this change. I'll give it a try tonight!