The concept is simple: instead of each video participant sending one video stream and receiving one that shows either all participants or the last speaker, each participant can send multiple video streams (either from various sources such as cameras, content, and different parts of the room, or as multiple resolutions of the same stream) and receive multiple video streams (from each of the other participants). This makes it possible to compose the desired layout locally: zoom in on one participant, view only the content, and so on.
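For the technically inclined, here is a minimal sketch of what "composing locally" could look like on the receiving side. It assumes every participant offers each source at a few resolutions and that the receiver picks, for each tile in its layout, the smallest offered resolution that still fills the tile. The names `Stream`, `pick_stream`, and `compose_layout` are made up for illustration and do not come from any real product or protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    participant: str
    source: str   # e.g. "camera" or "content" (hypothetical labels)
    height: int   # vertical resolution in pixels


def pick_stream(offered, participant, source, tile_height):
    """Pick the smallest offered resolution that still fills the tile."""
    candidates = sorted(
        (s for s in offered if s.participant == participant and s.source == source),
        key=lambda s: s.height,
    )
    if not candidates:
        return None  # this participant does not offer that source
    for s in candidates:
        if s.height >= tile_height:
            return s
    return candidates[-1]  # nothing is big enough: take the largest available


def compose_layout(offered, layout):
    """layout maps (participant, source) to the tile height wanted on screen."""
    return {
        key: pick_stream(offered, key[0], key[1], tile_height)
        for key, tile_height in layout.items()
    }


# Example: a small thumbnail of Alice next to a large view of Bob's content.
offered = [
    Stream("alice", "camera", 180),
    Stream("alice", "camera", 720),
    Stream("bob", "camera", 180),
    Stream("bob", "camera", 720),
    Stream("bob", "content", 1080),
]
layout = {("alice", "camera"): 150, ("bob", "content"): 900}
chosen = compose_layout(offered, layout)
```

In this example the receiver only asks for Alice's 180-line stream and Bob's 1080-line content, so the 720-line camera streams never have to cross the network for this particular layout.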
But why this is better than single stream might not be so obvious, so alright, I get it, let me try to explain. First of all: what does it mean to you as a user of video? If you use meeting room video systems, personal desktop phones and video systems, video clients on your PCs, and video apps on your tablet or phone, why does it make a difference? Imagine everybody having phone calls with video all the time; let’s say every other call is a video call. My 10-year-old in her room chatting with her friend, my wife reviewing their latest presentation with her colleague, my 7-year-old playing an online game with his friends… And by the way, I have to do a presentation in my 6pm meeting with one of our large service provider partners. And yes, in my family, we all want to see our friends, colleagues, and customers when talking to them! To some, this may be a foreign idea, but in our home it is not. But is it possible for all of us to experience super-high quality video and audio at the same time? Nope, not today. Indeed, you may have heard about how Netflix and other streaming video services have overwhelmed networks and service providers. That is one high-quality video stream, going in one direction, sent from a central source. Imagine that you have 8 times that! Four people, each with a two-way video call, and these streams do not come from a central location, but criss-cross the networks to reach the people we talk to! And the real problem is not really what your family does, but what all your neighbours do, because you share the network capacity with them.
The classic approach to making sure that you get quality audio (and video) is to reserve capacity for your call. This is what happened in the old telephony world and this is what happens when you make a mobile call. It goes way back to the early days when you actually connected two people to each other over the phone using a cable in a manual switchboard. You own the cable end to end for the duration of the call. That makes sense when those cables handle only calls (and not web pages, emails, YouTube snippets, and other data) and when you know that a call always needs the cable’s full capacity. The same was true for early video calls: you spent all the bandwidth you had available, and only a reservation would guarantee that you kept that bandwidth for the duration of the call. However, a modern video call can fluctuate in how much bandwidth it needs, both in how much it requires at any given moment and in how well it can adapt to a restricted bandwidth. And when calls are mixed together with other data on a network, reserving capacity for each call becomes impractical and a waste of resources.
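To make the contrast with reservation concrete, here is a minimal sketch of a call that adapts instead of reserving. It assumes the sender can encode at a handful of fixed bitrates and simply uses the highest one that fits whatever the network offers right now. The bitrate values and the name `elastic_bitrate` are my own illustrative assumptions, not figures from any real codec or product.

```python
# Hypothetical bitrate steps in kbit/s; real values depend on codec and content.
LADDER_KBPS = [150, 500, 1200, 2500]


def elastic_bitrate(available_kbps, ladder=LADDER_KBPS):
    """Return the highest step that fits the available bandwidth.

    Returns None when even the lowest step does not fit, in which case a
    real client might fall back to audio only.
    """
    fitting = [step for step in ladder if step <= available_kbps]
    return max(fitting) if fitting else None


# As the available bandwidth moves up and down, the call follows it.
samples_kbps = [3000, 1000, 400, 100, 2500]
chosen = [elastic_bitrate(kbps) for kbps in samples_kbps]
```

A reserved call would hold on to its full capacity through all five samples. The adaptive one gives bandwidth back to everything else on the network whenever it can manage with less.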
Those of you who have read my blog posts will recognise this as “media elasticity”. So, one of the big values of the new multi stream video architecture is that it makes it technically possible for you and me, and all our neighbours and colleagues, to actually use video integrated into our daily lives. Also, bandwidth costs and the network impact of giving everybody in an enterprise high-quality video have been a big barrier to improving the office lives of millions of people. In fact, the bandwidth costs can be bigger than the investment in the video equipment. The multi stream architecture will allow you to enable everybody in a company to use video for all their work, both for internal meetings and to meet with customers. When in-person experiences can be replicated over video, the way your employees do their work, who reports to whom, what they can be responsible for, and how they engage with customers will change dramatically. We have called this “pervasive video”, and seeing how companies are changing and improving how they do business is one of the most rewarding things you can participate in when working in the collaboration industry!
The second important value to you as a user is that the way you experience video calls will change dramatically. Have you tried to enlarge somebody’s video to full size in Skype or another video service? Did you see a super-crisp, vivid image regardless of how small or big it was? Ever been in a video meeting with many participants? Have you found any value in the small thumbnails of each person when they don’t talk? Do you know who is talking? Have you met everybody before, and do you recognise who they are? Before digital cameras arrived, you would take pictures, wait some days to get to a store (or mail the film), develop the film, pick out the pictures you wanted, put them in an album, and then show everybody your experiences from your latest trip to the Amazon jungle by flipping through the album. Today, various software allows you to put together a slideshow in no time and show it on your 50” television the same day you come back home. But before you got the tools that help you create this experience, you had to have the digital cameras. The ecosystem of software and services for using your pictures in various ways has slowly evolved in the years after digital cameras became available. The multi stream architecture and the video technologies involved have evolved similarly. We have had high-quality video systems and infrastructure for a long time, but only now do we see the ecosystem around them emerging, and over the next few years you are going to see a dramatic shift in how you experience video meetings.
For example, with multi stream, you will have high-quality video streams of all the most active speakers available. They can be shown on your screen with super-crisp boundaries and name tags for each person, and you can move them around the way you want. You can resize them, see the presentation in the size you want, super-sharp, and if you have two screens (or more) available in the meeting room, you can choose what or whom to show where. And of course, each video stream will come with its own audio, so you will hear the sound from the right part of the screen or room. If the meeting room has multiple cameras, you can lock one camera on the presenter and keep the video of the presenter on one screen (or bigger than the others on your PC). You can flip through (using a touch screen) all the thumbnails (with name tags) to find out who is present in the meeting. If you are the owner of the meeting, you can select a participant, mute him or her, assign presenter status, maybe do a one-to-one chat, whisper something to that person, or maybe quickly leave the meeting to do some prep work in a separate video room and return to the whole group later. In a recording of a meeting, you can see a graphical overview of who spoke when, and quickly skip to the parts of the meeting you want to listen to.
There are so many things that can be done to make the video interactions closer to real-life interactions. We will be able to re-create various in-person experiences like team huddles, working in groups in the same room, brainstorming on the whiteboard, and so on! When I say “better than being there”, this is what I imagine!
(Note! Multi stream is just about to enter the market and these features are not yet available or even planned. They are just in my imagination based on what the multi stream architecture can enable…)