
It's a fantastic open-source solution. Can be self-hosted.

Note that traffic does go through the server, hence you need decent bandwidth. The server, AFAIK, does have the keys to decrypt traffic.



The server (jitsi videobridge) DOES have the keys to decrypt the traffic, i.e. it's not e2e encrypted. This is par for the course with WebRTC SFUs (there are some "workarounds" to support e2e encryption over WebRTC, which I mentioned in another thread).
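For reference, the main workaround is the (Chrome-only, at the time) "insertable streams" API: you get each encoded frame before it hits the wire and encrypt the payload yourself, so the SFU only ever forwards ciphertext. A minimal sketch, assuming that API shape; the XOR "cipher" is a toy placeholder, not Jitsi's actual E2EE scheme:

    // Sketch only: Chromium's encodedInsertableStreams / createEncodedStreams API.
    const pc = new RTCPeerConnection({ encodedInsertableStreams: true } as any);

    function xorWithKey(data: ArrayBuffer): ArrayBuffer {
      const bytes = new Uint8Array(data);
      for (let i = 0; i < bytes.length; i++) bytes[i] ^= 0x55; // toy key, illustration only
      return bytes.buffer;
    }

    async function encryptOutgoing(sender: RTCRtpSender): Promise<void> {
      // Non-standard API at the time, hence the `any` cast.
      const { readable, writable } = (sender as any).createEncodedStreams();
      const encrypt = new TransformStream({
        transform(frame: any, controller) {
          frame.data = xorWithKey(frame.data); // real code would use e.g. AES-GCM via WebCrypto
          controller.enqueue(frame);
        },
      });
      await readable.pipeThrough(encrypt).pipeTo(writable);
    }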


The Jitsi team says the server needs roughly 5.5 Mbps per Chrome user. Firefox uses a lot more bandwidth and system resources, and degrades the room capacity.

Just something to keep in mind, and after some testing I saw the same results.

My AWS bill was projected to be over $1k/month, so I put it on Linode, where it'll cost between $100 and $200/month. Just about any decent VPS provider would be a better option than AWS due to bandwidth pricing.
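For anyone sanity-checking the numbers, a back-of-the-envelope sketch; the egress price and monthly load here are my assumptions, not measured:

    // 5.5 Mbps sustained per Chrome user, egress only.
    const mbpsPerUser = 5.5;
    const gbPerUserHour = (mbpsPerUser / 8) * 3600 / 1000; // ~2.5 GB per user-hour
    const awsEgressPerGb = 0.09;    // assumed AWS egress tier, $/GB
    const userHoursPerMonth = 5000; // hypothetical monthly load
    console.log((gbPerUserHour * awsEgressPerGb * userHoursPerMonth).toFixed(0)); // ~1114 ($/month)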


> The Jitsi team says the server needs roughly 5.5 Mbps per Chrome user.

That sounds like a lot! I wonder why an almost still image can use that much bandwidth. I guess it has to do with the low-latency requirement, but I would love more details on that.


We use a technique called simulcast. It consists of making every participant "work a bit harder" for the good of the bunch.

That is, every participant sends 3 separate video resolutions to the server: 720p, 480p and 180p (this may change due to bandwidth constraints). Then the server will only forward the appropriate layer to each other participant. So, if you are only seeing me in a thumbnail, it will only forward the 180p layer. If I become the active speaker (or you choose to pin me to the large view) the server will immediately switch to forwarding the 720p layer.
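For the curious, on the sender side this maps to the standard WebRTC sendEncodings API; a sketch, not our actual client code, and the rid names are arbitrary:

    async function startSimulcast(pc: RTCPeerConnection): Promise<void> {
      const stream = await navigator.mediaDevices.getUserMedia({ video: { height: 720 } });
      pc.addTransceiver(stream.getVideoTracks()[0], {
        direction: 'sendonly',
        sendEncodings: [
          { rid: 'l', scaleResolutionDownBy: 4 },   // ~180p thumbnail layer
          { rid: 'm', scaleResolutionDownBy: 1.5 }, // ~480p
          { rid: 'h', scaleResolutionDownBy: 1 },   // full 720p
        ],
      });
    }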

In addition, we use SVC with temporal layers, so thumbnails may be given at just 15fps or even less.

We do have a trick up our sleeve (but IIRC it had to be disabled for the time being) which involves disabling the simulcast layers that nobody is requesting. That is, if nobody is seeing me in the large view, why send the 720p layer at all?
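That trick maps onto the standard setParameters API; a sketch of the idea (illustrative only, the real signalling is more involved):

    async function pauseHighLayer(sender: RTCRtpSender): Promise<void> {
      const params = sender.getParameters();
      const high = params.encodings.find(e => e.rid === 'h');
      if (high && high.active !== false) {
        high.active = false; // stop encoding/uploading 720p while nobody views it large
        await sender.setParameters(params);
      }
    }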

Hope that helps!


SVC spatial and quality layers [1] sound like a really good solution to the bandwidth issue. From my (extremely limited) understanding, if you skip certain packets you basically get a lower resolution/quality stream. A client sends a single stream at the best quality it can tolerate, and then the SFU can forward whichever layers each client needs, depending on what resolution that client wants.
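If I understand the mechanics, the per-receiver choice on the SFU side would look something like this (hypothetical sketch, not Jitsi's actual code; the per-layer bitrates are assumed):

    type Layer = { rid: string; height: number; kbps: number };
    const layers: Layer[] = [
      { rid: 'l', height: 180, kbps: 200 },
      { rid: 'm', height: 480, kbps: 700 },
      { rid: 'h', height: 720, kbps: 2500 },
    ];

    // Highest layer that fits both the receiver's viewport and its bandwidth estimate.
    function pickLayer(viewportHeight: number, availableKbps: number): Layer {
      const fitting = layers.filter(l => l.height <= viewportHeight && l.kbps <= availableKbps);
      return fitting.length > 0 ? fitting[fitting.length - 1] : layers[0];
    }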

What's the state of this in Jitsi? I can only find limited info about SVC [2], and that only covers temporal layers. How much bandwidth does it even save in practice? Maybe it's not worth the complexity trade-off.

[1] http://webrtchacks.staging.wpengine.com/chrome-vp9-svc/

[2] https://github.com/jitsi/jitsi-videobridge/blob/master/doc/s...


Thanks for your interesting answer. It doesn't really address my question though, so I will rephrase it:

If I take a 720p video of a webcam and encode it to be delivered as progressive live streaming, the resulting stream is going to be less than one Mbps: because the image doesn't move much, I don't need many key-frames; one every 4 seconds is more than enough, and I've seen streams with no more than one every 30s (live streaming of harbours' CCTV cameras, don't ask me why). But of course it won't be realtime either, and you're going to have a few seconds of latency. While this is OK for live streaming, it is certainly not for a video chat room.
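To make my intuition concrete, a rough sketch with made-up frame sizes (both numbers are guesses for a near-still 720p scene):

    const fps = 30;
    const pFrameKbits = 3;     // delta frame: almost nothing changes in the scene
    const keyFrameKbits = 600; // full intra-coded image

    function kbps(secondsBetweenKeyframes: number): number {
      const frames = fps * secondsBetweenKeyframes;
      return (keyFrameKbits + (frames - 1) * pFrameKbits) / secondsBetweenKeyframes;
    }

    console.log(kbps(4));  // ~239 kbps with a keyframe every 4 s
    console.log(kbps(30)); // ~110 kbps with one every 30 s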

What I'd like to know is why the latency requirement reduces the encoding efficiency that much. Do you have an idea?


Naive (and perhaps stupid) question here: why don't you request just the 720p feed from each participant and then rescale the streams as clients need them?


The server would have to downscale the received stream. That's at least one round of downscaling for each participant, possibly more if you want to send different-quality versions of the same stream down to different participants.


And while your device is quite capable of rescaling your stream -- even a lowest-tier smartphone will do it with no strain on its GPU -- it's much more of a strain on the server to rescale hundreds or thousands of streams simultaneously.


Additionally: AFAIK you can't easily rescale an encoded stream; you need to decode it, rescale it, and then re-encode it. For every single stream! That would be horribly computationally expensive.
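Conceptually the cost difference looks like this (hypothetical function names, just to show where the work lands):

    type RawFrame = { width: number; height: number; pixels: Uint8Array };
    interface Output { height: number; kbps: number; send(bytes: Uint8Array): void }
    declare function decode(encoded: Uint8Array): RawFrame;           // CPU-heavy
    declare function rescale(raw: RawFrame, height: number): RawFrame;
    declare function encode(raw: RawFrame, kbps: number): Uint8Array; // CPU-heaviest

    // MCU-style: decode + rescale + re-encode, per stream, per output quality.
    function mcuForward(encodedFrame: Uint8Array, outputs: Output[]): void {
      const raw = decode(encodedFrame);
      for (const out of outputs) {
        out.send(encode(rescale(raw, out.height), out.kbps));
      }
    }

    // SFU-style (what the videobridge does): bytes pass through untouched.
    function sfuForward(rtpPacket: Uint8Array, outputs: Output[]): void {
      for (const out of outputs) out.send(rtpPacket);
    }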


RAM bandwidth is easier to manage and provision than network bandwidth.


"Almost still image" is a weird way of saying “numerous still images, regardless of the amount of motion, that have to be stitched together in precise, time-sorted batches to provide a smooth experience for the viewer.”

Thinking about the use case, it’s obvious this isn’t a simple-calendar-app amount of data.

I bet if you create some images in the various resolutions these services support that look good to your eye, then fill folders with sequences of them, you’ll see why it’s a lot of bandwidth.

Better to over-communicate and let the client deal with the organization as it’s designed to, and not let some network admin nickel-and-dime over bits.

Financial economy doesn’t really say much about the literal economy of building all these toxic gadgets.

Given the big picture, I'm not sure why such a trivial concern as bandwidth's ephemeral money cost would foster such strong curiosity.

Even then, code is in a repo. Go learn.


Is there cheaper bandwidth available than Hetzner's €1.19 + VAT per TB? (Linode seems to be at $10/TB.)


Or Hetzner's dedicated hosts, which have zero cost for bandwidth and unlimited traffic as well; I haven't been able to find anything cheaper than that.


Leaseweb can be a lot cheaper; it depends.

I'm starting an ISP again; we will charge around $0.25 per TB.


OVH doesn't limit it, AFAIK.


I set up my own server yesterday and shut it down in the middle of a conversation. The conversation kept going.

Traffic only goes through the server for users behind NAT, triggering the TURN path.


That is only for 2 participants. For 3 or more (video participants), it uses the videobridge SFU.

Source: https://github.com/jitsi/jitsi-meet/blob/master/doc/manual-i...
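That behaviour is controlled by the p2p section of jitsi-meet's config.js; from memory the relevant bit looks roughly like this (shape recalled, not verbatim; check your install's copy):

    var config = {
      // ...
      p2p: {
        enabled: true // direct peer-to-peer for 1:1 calls;
                      // a third participant triggers migration to the JVB
      }
    };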


I had 4 people, so either the docs are wrong or they have forgone the videobridge on my server and are using their own.


If you had 4 people, the video was going through the JVB for sure. Since a Jitsi Meet installation uses multiple components, maybe you didn't shut them all down?

Video shenanigans can be a dark art, but not that magical ;-)


All installed on one server, and I physically shut the entire server down.


Very possible the docs are wrong; I'm not a Jitsi dev, so I wouldn't be able to confirm or deny without poking around. But they do specifically state that (as linked).

That said, it is worth pointing out that this thread is specifically about the videobridge (i.e. scaling beyond a full mesh).



