Timo and I agreed previously that we should ditch the class pattern for view models and instead have them be interfaces which are simply created by functions. They're more straightforward to write, mock, and instantiate this way.
The code for media view models and media items is pretty much the last remaining instance of the class pattern. Since I was about to introduce a new media view model for ringing, I wanted to get this refactor out of the way first rather than add to the technical debt.
This refactor also makes things a little easier for https://github.com/element-hq/element-call/pull/3747 by extracting volume controls into their own module.
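For reviewers who haven't seen the pattern, here is a minimal sketch of what "interfaces created by functions" can look like, with volume controls split out as a reusable piece. All names here (`SketchVolumeControls`, `createSketchMediaViewModel`, and so on) are hypothetical and simplified; the real view models expose different, reactive fields.

```typescript
// Hypothetical sketch: illustrative names only, not Element Call's actual API.

/** Volume controls kept as their own small unit so any view model can reuse them. */
interface SketchVolumeControls {
  readonly volume: () => number;
  readonly setVolume: (value: number) => void;
  readonly toggleMute: () => void;
}

function createSketchVolumeControls(initial = 1): SketchVolumeControls {
  let level = initial; // last level chosen while unmuted
  let muted = false;
  return {
    // A muted item reports 0 but remembers its previous level.
    volume: () => (muted ? 0 : level),
    setVolume: (value) => {
      level = Math.min(1, Math.max(0, value)); // clamp to [0, 1]
      muted = false;
    },
    toggleMute: () => {
      muted = !muted;
    },
  };
}

/** The view model is just an interface... */
interface SketchMediaViewModel extends SketchVolumeControls {
  readonly id: string;
  readonly displayName: string;
}

/** ...and a plain function that builds an object satisfying it. */
function createSketchMediaViewModel(
  id: string,
  displayName: string,
): SketchMediaViewModel {
  return { id, displayName, ...createSketchVolumeControls() };
}

/** Mocking needs no subclassing: start from a default object and override fields. */
function mockSketchMediaViewModel(
  overrides: Partial<SketchMediaViewModel> = {},
): SketchMediaViewModel {
  return { ...createSketchMediaViewModel("mock", "Mock User"), ...overrides };
}
```

With this shape, a test can build a mock with `mockSketchMediaViewModel({ displayName: "Bob" })` without a constructor or a subclass.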
It's always worth having logs for when state holders are created or destroyed (these are often the most interesting things happening in the application), so I thought it would be nice to have generateItems always log for you when it's doing that.
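As a rough illustration of the idea (not the actual generateItems implementation, whose signature is not shown here), a helper that owns item creation and destruction can emit the log lines itself, so callers never have to remember to. `syncItemsSketch`, `Destroyable`, and the log format below are assumptions made for this sketch.

```typescript
// Hypothetical sketch: this is NOT the real generateItems, just the logging idea.

interface Destroyable {
  destroy(): void;
}

type LogFn = (message: string) => void;

/**
 * Reconciles `current` with `wantedKeys`: destroys items whose keys are gone,
 * creates items for new keys, and logs both kinds of event.
 */
function syncItemsSketch<T extends Destroyable>(
  name: string,
  current: Map<string, T>,
  wantedKeys: readonly string[],
  create: (key: string) => T,
  log: LogFn = console.log,
): void {
  const wanted = new Set(wantedKeys);

  // Collect stale keys first, then destroy and remove them.
  const stale: string[] = [];
  current.forEach((_item, key) => {
    if (!wanted.has(key)) stale.push(key);
  });
  for (const key of stale) {
    log(`[${name}] destroying item ${key}`);
    current.get(key)!.destroy();
    current.delete(key);
  }

  // Create items for keys we have not seen before.
  for (const key of wantedKeys) {
    if (!current.has(key)) {
      log(`[${name}] creating item ${key}`);
      current.set(key, create(key));
    }
  }
}
```

Because the log calls live inside the helper, every call site gets creation and destruction logs for free.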
The local JWT token needs to be acquired via the right endpoint. The
endpoint defines how our rtcBackendIdentity is computed. Depending on
whether we use sticky events or state events, we also need to use the
right endpoint. This cannot be done generically in the connection
manager. The JWT token is now computed in the localTransport, and the
resolved SFU config is passed to the connection manager.
Add JWT endpoint version and SFU config support

Pin matrix-js-sdk to a specific commit and update dev auth image tag.
Propagate SFU config and JWT endpoint choice through local transport,
ConnectionManager and Connection; add JwtEndpointVersion enum and
LocalTransportWithSFUConfig type. Add NO_MATRIX_2 auth error and locale
string, thread rtcBackendIdentity through UI props, and include related
test, CSS and minor import updates.
On second glance, the way we determined that a media tile was 'waiting for media' was too implicit for my taste. On a surface reading, it appeared to depend on whether a participant was currently publishing any video. In reality, the 'video' object was always defined as long as a LiveKit participant existed, so the state depended only on the participant. We should make this relationship explicit by moving the computation into the view model, where it can depend on the participant directly.
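To make the difference concrete, here is a simplified sketch of the implicit check versus the explicit one. The types and function names (`SketchParticipant`, `SketchVideo`, `waitingForMediaExplicit`, and so on) are hypothetical stand-ins, not the real LiveKit or view model types.

```typescript
// Hypothetical sketch: simplified stand-ins for the real participant and video types.

interface SketchParticipant {
  readonly identity: string;
}

interface SketchVideo {
  readonly enabled: boolean;
}

// The 'video' object is derived purely from the participant: it is defined
// whenever a participant exists, whether or not any video is being published.
function videoForSketch(participant: SketchParticipant | null): SketchVideo | undefined {
  return participant === null ? undefined : { enabled: false };
}

// Before: looks like it depends on published video, but actually does not.
function waitingForMediaImplicit(video: SketchVideo | undefined): boolean {
  return video === undefined;
}

// After: the view model computes the flag from the participant directly.
function waitingForMediaExplicit(participant: SketchParticipant | null): boolean {
  return participant === null;
}
```

Both functions give the same answer for every input; the second simply states the real dependency.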