Free Live Streaming Cloud Server: DIY WebRTC Streaming on a Budget

Last Sunday, the Media Lab hosted a Public Dialogue on DRM and the future of the Web, featuring Richard Stallman, founder of the Free Software Foundation. Despite significant external interest, the event couldn’t be streamed using MIT’s proprietary video streaming service due to Stallman’s strict adherence to completely free software. This limitation sparked an idea: could we create a free, WebRTC-based streaming solution acceptable to RMS?

Richard Stallman, Danny O’Brien, Joi Ito, and Harry Halpin. Photo: Jon Christian

During a lunch hosted by Joi Ito earlier that day, Tal Achituv and I explored the feasibility of a free WebRTC solution. Our research turned up viable open-source WebRTC broadcasting options, and just hours before the panel we decided to put the idea into action. What followed was two frantic hours of coding, setup, and hunting for the necessary equipment.

The Quick and Dirty Setup

Photo: Tal Achituv

Ironically, creating this diagram took longer than setting up the free streaming solution itself.

Our setup consisted of three main components: a Media Server, a Signaling Server, and the Video Capturing setup. Each played a crucial role in getting the live stream up and running with free software.

Kurento Media Server: The Heart of the Stream

For the media server, we chose Kurento, an open-source media server under the LGPL-2.1 license. Kurento suits this purpose because it implements the WebRTC protocol stack and uses GStreamer for multimedia processing. In our scenario, Kurento functioned as a broadcasting server: it received a single WebRTC audio-video stream from the presenter’s laptop (equipped for video capture) and retransmitted it to multiple viewers, each over a separate WebRTC stream.
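The one-in, many-out pattern described above can be modeled in a few lines. This is a toy sketch of the fan-out idea only, not Kurento's API: the class name `OneToManyRelay`, its methods, and the packet counters are all hypothetical, and real media servers deal with RTP, codecs, and congestion control that are ignored here.

```python
from collections import deque


class OneToManyRelay:
    """Toy model of a broadcast relay: one incoming stream, N outgoing copies."""

    def __init__(self):
        self.viewers = {}      # viewer id -> queue of packets waiting to be sent
        self.packets_in = 0    # packets received from the presenter
        self.packets_out = 0   # packets queued for viewers (one per viewer per packet)

    def add_viewer(self, viewer_id):
        # Each viewer gets its own outgoing queue, mirroring "separate streams".
        self.viewers[viewer_id] = deque()

    def remove_viewer(self, viewer_id):
        self.viewers.pop(viewer_id, None)

    def publish(self, packet):
        # The presenter uploads each packet once; the relay copies it per viewer.
        self.packets_in += 1
        for queue in self.viewers.values():
            queue.append(packet)
            self.packets_out += 1


relay = OneToManyRelay()
for viewer_id in ("alice", "bob", "carol"):
    relay.add_viewer(viewer_id)
relay.publish(b"frame-1")
print(relay.packets_in, relay.packets_out)  # prints "1 3"
```

The point of the design is that the presenter's uplink carries one stream no matter how many people watch, while the relay's outgoing traffic grows with the number of viewers.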

Initially, we intended to deploy Kurento on a cloud service after testing. However, time constraints led us to run Kurento directly on a Linux VM on my laptop. Surprisingly, this debug setup became our production environment.

Key observations about the Media Server:

Post-stream analysis revealed that the VM was running with just 1 core and 1 GB of RAM. Despite these meager resources, it handled the stream without any problems we noticed, which speaks to Kurento’s efficiency.
Network Address Translation (NAT) often complicates WebRTC setups, which then need STUN and TURN servers to establish connectivity. Fortunately, the Media Lab’s wired network assigns a public IP address to each wired device. Because our VM used a bridged network adapter, it received its own public address and sidestepped NAT issues, which simplified the setup considerably.
For future deployments, we discovered the Kurento Dockerfiles. These offer a quick, containerized way to deploy Kurento, which could streamline the process in more complex network environments or on cloud servers.
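Whether a host is likely to need NAT traversal can be estimated from its address. The sketch below uses Python's standard `ipaddress` module; the function name `needs_nat_traversal` is hypothetical, and the check is only a rough heuristic, since firewalls can still block traffic to a publicly addressed machine.

```python
import ipaddress


def needs_nat_traversal(address: str) -> bool:
    """Return True when the address is not globally routable.

    Private ranges (for example 192.168.0.0/16) and carrier-grade NAT space
    (100.64.0.0/10) both report is_global == False, so a media server bound
    to such an address would probably need STUN/TURN to be reachable.
    """
    return not ipaddress.ip_address(address).is_global


print(needs_nat_traversal("192.168.1.10"))  # prints "True"
print(needs_nat_traversal("8.8.8.8"))       # prints "False"
```

A media server on a bridged adapter with a public address would land in the second case, which matches the situation described in the observations above.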

Node.js Signaling Server: Connecting Clients to the Stream

The signaling server has two jobs: serving static web assets and relaying signaling messages over WebSocket connections between clients and the Media Server. For this, we used Node.js and adapted Kurento’s one-to-many video call tutorial. In fact, “adapted” is generous: we essentially copied the tutorial code. Our primary modification was to separate the presenter and viewer pages and strip the viewer page down to the bare essentials, a single video container. The source code for our signaling server is available here.
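The role bookkeeping a one-to-many signaling server has to do can be sketched independently of WebSockets and of Kurento. The version below is an illustration in Python, not the Node.js code we ran: the message names (`presenter`, `viewer`, `stop`) and response shapes are assumptions chosen for the example, and the SDP offer/answer and ICE candidate exchange that a real server forwards to the media server is left out entirely.

```python
def new_state():
    """Fresh signaling state: no presenter, no viewers."""
    return {"presenter": None, "viewers": set()}


def handle_message(state, session_id, message):
    """Update state for one incoming message and return the reply to send back."""
    kind = message.get("id")

    if kind == "presenter":
        # Only one presenter may broadcast at a time.
        if state["presenter"] is not None:
            return {"id": "presenterResponse", "response": "rejected"}
        state["presenter"] = session_id
        return {"id": "presenterResponse", "response": "accepted"}

    if kind == "viewer":
        # Viewers can only join while someone is presenting.
        if state["presenter"] is None:
            return {"id": "viewerResponse", "response": "rejected"}
        state["viewers"].add(session_id)
        return {"id": "viewerResponse", "response": "accepted"}

    if kind == "stop":
        if session_id == state["presenter"]:
            # The presenter leaving ends the broadcast for every viewer.
            dropped = sorted(state["viewers"])
            state["presenter"] = None
            state["viewers"].clear()
            return {"id": "stopCommunication", "notify": dropped}
        state["viewers"].discard(session_id)
        return {"id": "stopped"}

    return {"id": "error", "message": f"Unknown message id: {kind!r}"}


state = new_state()
print(handle_message(state, "p1", {"id": "presenter"})["response"])  # prints "accepted"
print(handle_message(state, "v1", {"id": "viewer"})["response"])     # prints "accepted"
```

Keeping the state in a plain dictionary makes the rules easy to test without opening a socket; a real server would call something like this from its WebSocket message handler.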

Screenshots of the live viewing page. Note the favicon change, a testament to live tweaking.

Interesting aspects of the Signaling Server setup:

Serving files directly from my laptop let us “hot-swap” them in real time, so we could tweak the viewer experience on the fly, even during the live stream. We adjusted the CSS and even changed the favicon while viewers were watching, which shows how flexible this direct-serving approach is.
To avoid exposing my personal laptop directly to the internet (the signaling server ran on it directly, not in a VM), we used ngrok. It provided a secure tunnel that forwarded traffic to my local machine and masked my personal IP address. This is particularly useful for quick demos and setups where ease of deployment matters more than security.

Left: ngrok showing 5 open connections. Right: Output from the signaling server, monitoring connections.

Video Capturing: From Physical to Digital

For video capture, Tal located a budget-friendly video capture card within the Media Lab, similar to this model. We paired it with a compatible laptop (after discovering that our first choice was incompatible). We then used Firefox to open our presenter endpoint, which initiated the WebRTC handshake with the Kurento Media Server and started the stream.

Points to note regarding Video Capturing:

Midway through the stream, we realized the audio feed to the capture card was disconnected, so we had unintentionally been using the laptop’s built-in microphone. Surprisingly, the audio quality remained quite acceptable, a testament to modern laptop microphone technology.
It’s important to acknowledge that this setup lacked robust security. Anyone with the presenter endpoint URL could potentially hijack the stream. This highlights the trade-offs made for speed and simplicity in this impromptu setup. In a production environment, proper authentication and security measures are crucial.
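One lightweight way to narrow the hijacking risk noted above is to require a shared secret before accepting anyone as presenter. This is a minimal sketch under stated assumptions, not something we deployed: the function names are hypothetical, and it assumes the presenter page sends the token along with its first signaling message over an encrypted connection.

```python
import hmac
import secrets


def new_presenter_token() -> str:
    """Generate a random, URL-safe secret to hand to the presenter out of band."""
    return secrets.token_urlsafe(32)


def is_authorized_presenter(supplied_token, expected_token: str) -> bool:
    """Check the supplied token in constant time to avoid timing leaks."""
    if not supplied_token:
        return False
    return hmac.compare_digest(
        supplied_token.encode("utf-8"),
        expected_token.encode("utf-8"),
    )


expected = new_presenter_token()
print(is_authorized_presenter(expected, expected))  # prints "True"
print(is_authorized_presenter("guess", expected))   # prints "False"
```

A token like this only guards the presenter role; it does nothing for viewer access control or transport security, which a production setup would still need to address.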

Surprising Results and Key Takeaways

Our impromptu “free live streaming cloud server” (running on a laptop VM!) peaked at 37 concurrent viewers and averaged around 12 throughout the event. Because of the rapid deployment and the engaging panel discussion, we didn’t set up continuous monitoring or collect detailed load data. Subjectively, however, the system felt capable of handling significantly more viewers: the machine, even the resource-constrained VM, stayed responsive throughout the session. Kudos to Kurento for its efficient performance!
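Peak and average viewer counts like these can be derived from join and leave timestamps, if a signaling server logs them. The sketch below assumes a hypothetical log of `(timestamp_in_seconds, delta)` pairs, where `delta` is `+1` for a join and `-1` for a leave; the sample events are invented for illustration and are not data from the event.

```python
def viewer_stats(events, end_time):
    """Return (peak, time-weighted average) concurrent viewers.

    events:   iterable of (timestamp_seconds, delta), delta = +1 join, -1 leave
    end_time: timestamp at which the stream ended
    """
    ordered = sorted(events)
    if not ordered:
        return 0, 0.0

    current = 0       # viewers connected right now
    peak = 0          # highest value of `current` seen so far
    weighted = 0.0    # sum of (viewer count * seconds at that count)
    start = last = ordered[0][0]

    for timestamp, delta in ordered:
        weighted += current * (timestamp - last)
        current += delta
        peak = max(peak, current)
        last = timestamp

    # Account for the stretch between the last event and the end of the stream.
    weighted += current * (end_time - last)
    duration = end_time - start
    average = weighted / duration if duration > 0 else float(current)
    return peak, average


# Invented sample: three joins, two leaves, stream ends at t=60s.
sample = [(0, +1), (10, +1), (20, +1), (30, -1), (40, -1)]
peak, average = viewer_stats(sample, end_time=60)
print(peak)  # prints "3"
```

The average is weighted by time rather than by number of events, so a viewer who stays for the whole session counts for more than one who drops in for a few seconds.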

Key Lessons Learned:

Parkinson’s Law in Action: “Work expands so as to fill the time available for its completion.” Because we decided to build this only at the last minute, the work took only the couple of hours we had. Had we planned a week in advance, it might have stretched to fill that entire week, potentially leading to over-engineering and unnecessary complexity. Sometimes, “good enough” truly is sufficient.
The Power of Existing Tools: Live streaming setup once seemed like a complex, daunting task. However, we are now in an era where powerful tools are readily available, often open-source and free. The primary limitation is often our own perception of complexity rather than the actual technical barriers.
Embrace the Server Within: The original vision of the internet was a network of equal nodes, where anyone could be both a service consumer and a service provider. While cloud computing and massive data centers have created a client-server divide, the underlying principle remains. Our personal machines are capable of serving content and services. Tools like ngrok empower us to easily bridge this gap and rediscover the original, decentralized spirit of the internet. We should continue to develop and utilize tools that enable this client-and-server duality.

Special thanks to Tal Achituv for the collaborative effort and the reminder of these fundamental principles.