Weak References and Type Erasure

January 2, 2017

In one of my side projects, two Swift problems, heterogeneous arrays and weak references, collided in an interesting way.

I needed to store an array of signal objects.

var signals: [Signal] = []

Easy peasy? Not quite. Signals push updates to subscribers, so we’d really like them to be generic in the type of values that they push:

public class Signal<Value> {
    private var currentValue: Value? = nil {
        didSet {
            // … push new value to subscribers …
        }
    }
    
    func update(to value: Value) {
        currentValue = value
    }
    
    // … subscription machinery elided …
}

The Values here range over all the types stored in the model, plus arrays of those types. We could try something like:

var signals: [Signal<Any>] = []

but variance raises its head: Swift’s generic types are invariant, so a Signal<Int> is not a Signal<Any>, and we can’t actually put more specific signals in the array:

let intSignal = Signal<Int>()
signals.append(intSignal)

Playground execution failed: error: WeakReferencesAndTypeErasure.playground:27:16: error: cannot convert value of type 'Signal<Int>' to expected argument type 'Signal<Any>'

In situations like this, I often reach for type erasure.

For my project, it turns out that there is only one method on Signal that I need to call. For each signal, there’s an associated mapping function of type (DataStorage) -> Value? that takes the global data model and produces the new value for the signal. If that function returns a non-nil value, then the signal should be updated to that value.

So, instead of storing signals directly, let’s store an array of closures that do the right thing:

var signalCallers: [(DataStorage) -> Void] = []

func addSignal<Value>(
    _ signal: Signal<Value>,
    matching matcher: @escaping (DataStorage) -> Value?)
{
    let caller: (DataStorage) -> Void = { storage in
        if let value = matcher(storage) {
            signal.update(to: value)
        }
    }
    signalCallers.append(caller)
}

To store a signal, we now call the addSignal(_:matching:) function. The function captures the signal inside the caller closure and saves the closure in the signalCallers array.

Now when my app mutates the global data storage, the model layer can iterate over the signalCallers and broadcast changes to the subscribers.

var storage: DataStorage {
    didSet {
        for caller in signalCallers {
            caller(storage)
        }
    }
}

Unfortunately, we just created a significant memory leak. The closures in signalCallers retain every signal. My data model retains signalCallers. Because the data model is a singleton, every signal stays in memory and is updated, even if the subscriber to the signal is long gone.
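Here’s a minimal sketch of that leak that can be run as a script. DataStorage (with its count and name fields), the Model class, and the probe variable are hypothetical stand-ins I made up for illustration, not the real project’s types.

```swift
// Hypothetical stand-in for the contents of the global data model.
struct DataStorage {
    var count = 0
    var name = ""
}

final class Signal<Value> {
    private(set) var currentValue: Value?

    func update(to value: Value) {
        currentValue = value
    }
}

// Hypothetical owner of the callers, playing the role of the data model.
final class Model {
    private var signalCallers: [(DataStorage) -> Void] = []

    var storage = DataStorage() {
        didSet {
            for caller in signalCallers {
                caller(storage)
            }
        }
    }

    func addSignal<Value>(
        _ signal: Signal<Value>,
        matching matcher: @escaping (DataStorage) -> Value?)
    {
        // Strong capture: this closure keeps `signal` alive.
        let caller: (DataStorage) -> Void = { storage in
            if let value = matcher(storage) {
                signal.update(to: value)
            }
        }
        signalCallers.append(caller)
    }
}

let model = Model()
var intSignal: Signal<Int>? = Signal<Int>()
let nameSignal = Signal<String>()
weak var intSignalProbe = intSignal

// Signals with different Value types share one array of callers.
model.addSignal(intSignal!, matching: { $0.count })
model.addSignal(nameSignal, matching: { $0.name.isEmpty ? nil : $0.name })

// Mutating the storage pushes the new count to the Int signal.
model.storage.count = 3
let pushedValue = intSignal?.currentValue

// Drop the only outside reference. The caller closure still holds the signal.
intSignal = nil
let signalLeaked = (intSignalProbe != nil)
print("pushed:", pushedValue as Any, "leaked:", signalLeaked)
```

The weak probe stays non-nil after the last outside reference goes away, because the closure stored in the model is still holding the signal strongly.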

Subscribers should be responsible for retaining signals. The data model should just hold the signals weakly. We can change the caller closure in addSignal(_:matching:) to use a capture list:

func addSignal<Value>(
    _ signal: Signal<Value>,
    matching matcher: @escaping (DataStorage) -> Value?)
{
    let caller: (DataStorage) -> Void = { [weak signal] storage in
        guard let signal = signal else { return }
        if let value = matcher(storage) {
            signal.update(to: value)
        }
    }
    signalCallers.append(caller)
}

This keeps us from leaking all the signals. Signals are only retained by their subscribers and, thanks to the guard, the caller closures become no-ops after the signals are deallocated. Unfortunately, we’re still leaking the closures themselves. We also have a performance problem. The signalCallers array is always growing. The longer the app runs, the more no-op callers we have to invoke on every model change.
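A small sketch makes both halves of that visible: the signal goes away, but its caller doesn’t. As before, DataStorage, Model, and the probe are hypothetical names for illustration, and I’ve exposed signalCallers read-only just so the script can count them.

```swift
// Hypothetical stand-in for the contents of the global data model.
struct DataStorage {
    var count = 0
}

final class Signal<Value> {
    private(set) var currentValue: Value?

    func update(to value: Value) {
        currentValue = value
    }
}

// Hypothetical owner of the callers, playing the role of the data model.
final class Model {
    // Readable from outside only so the example can inspect the count.
    private(set) var signalCallers: [(DataStorage) -> Void] = []

    var storage = DataStorage() {
        didSet {
            for caller in signalCallers {
                caller(storage)
            }
        }
    }

    func addSignal<Value>(
        _ signal: Signal<Value>,
        matching matcher: @escaping (DataStorage) -> Value?)
    {
        // Weak capture: the closure no longer keeps `signal` alive.
        let caller: (DataStorage) -> Void = { [weak signal] storage in
            guard let signal = signal else { return }
            if let value = matcher(storage) {
                signal.update(to: value)
            }
        }
        signalCallers.append(caller)
    }
}

let model = Model()
var signal: Signal<Int>? = Signal<Int>()
weak var probe = signal
model.addSignal(signal!, matching: { $0.count })

// The "subscriber" lets go, and this time the signal is deallocated.
signal = nil
let signalDeallocated = (probe == nil)

// The model still invokes the dead caller, which returns early at the guard.
model.storage.count = 1
let callerCount = model.signalCallers.count
print("deallocated:", signalDeallocated, "callers:", callerCount)
```

The signal is gone, yet the array still holds one caller that does nothing on every model change.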

How can we clean up the old callers? We need some logic that tells us when a signal has been deallocated. But we don’t have a direct reference to any signals at all from the data model. We gave that up when we used type erasure. The only references to the signals are inside the callers.

That realization leads to the insight that unlocks this problem. We can capture a weak reference to the signal inside another closure that tells us whether or not a signal has been deallocated:

let isLivePredicate: () -> Bool = { [weak signal] in
    return signal != nil
}

Let’s wrap that predicate and the caller closure up in a little struct:

private struct SignalHolder {
    private let isLivePredicate: () -> Bool
    private let caller: (DataStorage) -> Void
    
    init<Value>(
        signal: Signal<Value>,
        matching matcher: @escaping (DataStorage) -> Value?)
    {
        isLivePredicate = { [weak signal] in
            return signal != nil
        }

        caller = { [weak signal] storage in
            guard let signal = signal else { return }
            if let value = matcher(storage) {
                signal.update(to: value)
            }
        }
    }
    
    var isLive: Bool { return isLivePredicate() }
    
    func update(from storage: DataStorage) {
        caller(storage)
    }
}

We can replace our array of signal callers with signal holders and update addSignal(_:matching:) to do the right thing:

private var signalHolders: [SignalHolder] = []

func addSignal<Value>(
    _ signal: Signal<Value>,
    matching matcher: @escaping (DataStorage) -> Value?)
{
    let holder = SignalHolder(signal: signal, matching: matcher)
    signalHolders.append(holder)
}

Finally, we can update the didSet handler for storage to filter out deallocated signals:

var storage: DataStorage {
    didSet {
        signalHolders = signalHolders.filter { $0.isLive }
        for holder in signalHolders {
            holder.update(from: storage)
        }
    }
}

Now every time there’s a model change, we dispose of the no-longer-useful signal holders, then update the remaining ones. Type erasure lets us store heterogeneous signals. Capture lists let us hold the signals weakly. The isLivePredicate lets us clean up after ourselves.
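To check that the pieces fit together, here’s a self-contained sketch of the whole arrangement as a runnable script. DataStorage, the Model class, and its holderCount property are hypothetical scaffolding I added so the pruning can be observed; the holder follows the shape described above.

```swift
// Hypothetical stand-in for the contents of the global data model.
struct DataStorage {
    var count = 0
}

final class Signal<Value> {
    private(set) var currentValue: Value?

    func update(to value: Value) {
        currentValue = value
    }
}

struct SignalHolder {
    private let isLivePredicate: () -> Bool
    private let caller: (DataStorage) -> Void

    init<Value>(
        signal: Signal<Value>,
        matching matcher: @escaping (DataStorage) -> Value?)
    {
        // Both closures capture the signal weakly, so the holder never retains it.
        isLivePredicate = { [weak signal] in
            return signal != nil
        }
        caller = { [weak signal] storage in
            guard let signal = signal else { return }
            if let value = matcher(storage) {
                signal.update(to: value)
            }
        }
    }

    var isLive: Bool { return isLivePredicate() }

    func update(from storage: DataStorage) {
        caller(storage)
    }
}

// Hypothetical owner of the holders, playing the role of the data model.
final class Model {
    private var signalHolders: [SignalHolder] = []

    // Exposed only so the example can watch the array shrink.
    var holderCount: Int { return signalHolders.count }

    var storage = DataStorage() {
        didSet {
            // Drop holders whose signals are gone, then update the rest.
            signalHolders = signalHolders.filter { $0.isLive }
            for holder in signalHolders {
                holder.update(from: storage)
            }
        }
    }

    func addSignal<Value>(
        _ signal: Signal<Value>,
        matching matcher: @escaping (DataStorage) -> Value?)
    {
        signalHolders.append(SignalHolder(signal: signal, matching: matcher))
    }
}

let model = Model()
let keptSignal = Signal<Int>()
var droppedSignal: Signal<String>? = Signal<String>()
model.addSignal(keptSignal, matching: { $0.count })
model.addSignal(droppedSignal!, matching: { String($0.count) })
let holdersBefore = model.holderCount

// One subscriber goes away; the next model change prunes its holder.
droppedSignal = nil
model.storage.count = 5
let holdersAfter = model.holderCount
print("holders:", holdersBefore, "->", holdersAfter)
```

Note that pruning only happens on a model change, so a dead holder lingers until the next time storage is set.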

I’m mostly pleased with this solution, but I still have a nagging doubt that it’s too clever. Will future me be confused about what’s going on here? Or are type erasure and weak capture standard practice that Swift developers will just expect to understand? Is there a way to simplify this or make it clearer? I’d love to know.