LIVE Fusion of Two Greenscreen Sets Tested

Two greenscreen studios, one virtual set: How live fusion actually works.

Creating a fitting home-story interview setting for two people on a greenscreen set is no problem with some preparation and the right hardware. Even when the backgrounds are keyed out in real time and replaced live with virtual sets in Unreal Engine. But how well does that work when the second person is standing in front of cameras somewhere else in the world? Can two sets be merged into one? And what conditions need to be met so that the dialogue between the protagonists comes across as naturally as possible? For us, this was an exciting setting that we absolutely wanted to test when launching our greenscreen studio.

“We want to offer clients and partner companies more than same-same-but-different when designing events. There are so many more exciting ways to develop brands or share information when you’re willing to think out-of-the-box. In the Urban Studio OpenLab, we continuously develop our portfolio of features and formats - for productions that truly deserve the next-level label.”

Nicolai, Founder and CEO of Jakobs Medien

I. PLANNING

INSPIRATION: FUSION FROM COAST TO COAST

The impulse for the format test came from a making-of video in which talk show host Oprah Winfrey interviews former US President Barack Obama. The two discuss how long it has been since they were last out among people. Then the fusion dissolves layer by layer, and it becomes clear that the questions are being asked from a set on the East Coast, while the answers come from a greenscreen studio on the West Coast.

Beyond the technical setup of the setting, we were naturally interested in whether the format works only as a recording or also as a live format for hybrid events.

SETTING: TEST WITH TWO BERLIN GREENSCREEN STUDIOS

To simulate the situation, we set up the greenscreen set at the Urban Studios in Kreuzberg as the main stage over two days. Parquet flooring was quickly laid to match the loft living room setting that we had previously constructed virtually in Unreal Engine. Two chairs were placed opposite each other, but only one remained occupied - most of the time, at least ;).

Behind the empty chair, we placed a preview monitor that was supposed to show the remotely connected interview partner in a wide shot. In a second greenscreen set at the BMU MediaLab in Berlin Mitte, we then placed identical chairs in front of a green screen and on green flooring. Here, accordingly, the other chair was occupied.

TASKS: THREE FOCUS AREAS FOR OUR TESTS

From the kickoff with all team members involved, three challenges emerged that we had to master before we could incorporate the format into our portfolio:

1. The signals from both sets had to be cleanly captured and virtually merged without excessive delays.

2. The settings had to be arranged so as to create the illusion that both protagonists are in the same location.

3. Additionally, the communication had to run smoothly - meaning without noticeable delay - so that the dialogue feels natural.

II. OPEN LAB

TASK 1. SIGNAL TRANSMISSION: LIVEU OR SRT GATEWAY?

A total of four camera signals are needed to display the interview in the fused set: a wide shot and a close-up of each protagonist. Together with the audio, they arrive at the TriCaster in the Urban Studios in Kreuzberg and at its counterpart in the MediaLab in Berlin Mitte, respectively.

From there, they are sent via the SRT protocol and meet halfway on an Amazon Web Services server, where a Haivision SRT Gateway handles the so-called handshake data exchange. Just under two seconds after capture, all perspectives are thus merged in the control studio.

A latency well below two seconds would have been possible with dedicated LiveU hardware - but that is clearly a cost issue: for four signals, we would have needed four LiveU units. Our solution with the Haivision SRT Gateway handled up to six signals and was therefore the better choice for us at the time of testing.
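To get a feel for where the "just under two seconds" comes from, here is a minimal latency-budget sketch in Python. All timings (RTT, encode/decode) are illustrative assumptions, not measurements from our setup; the 4x-RTT rule is a common SRT configuration rule of thumb, not a fixed property of the gateway.

```python
# Rough glass-to-glass latency budget for one SRT contribution leg.
# All numbers below are illustrative assumptions, not measured values.

def srt_latency_ms(rtt_ms: float, multiplier: float = 4.0) -> float:
    """Common rule of thumb: configure the SRT latency at ~4x the round-trip
    time so lost packets can be retransmitted before the buffer drains."""
    return multiplier * rtt_ms

def glass_to_glass_ms(rtt_ms: float, encode_ms: float, decode_ms: float) -> float:
    """One-way delay estimate: encoding + SRT receive buffer + decoding."""
    return encode_ms + srt_latency_ms(rtt_ms) + decode_ms

# Example: ~40 ms RTT from each studio to the AWS relay,
# plus typical hardware encode/decode times.
per_leg = glass_to_glass_ms(rtt_ms=40, encode_ms=300, decode_ms=150)
total = 2 * per_leg  # studio -> AWS gateway -> control studio (two legs)
print(f"per leg: {per_leg:.0f} ms, end to end: {total:.0f} ms")
```

With these assumed numbers, the two legs add up to roughly 1.2 seconds - the same order of magnitude we observed in the control studio.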

TASK 2. SET DESIGN - PHYSICAL AND VIRTUAL

The recordings from both greenscreen studios came together in our Urban control studio and were keyed out in the TriCaster in near-real time. Beforehand, however, the two physical sets and then the virtual set had to be aligned with each other.

Ideally, identical hardware should be used at both sets. We had motorized PTZ cameras at both locations, aligned at the same distance and angle to both chairs. Fine-tuning using two semi-transparent overlaid camera signals of the wide shot proved practical but required considerable patience.
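The overlay trick from the fine-tuning step can be sketched in a few lines of Python (NumPy assumed). This is an illustration of the principle, not our production pipeline; the difference score is a hypothetical helper we add for the example:

```python
import numpy as np

def alignment_overlay(frame_a: np.ndarray, frame_b: np.ndarray) -> np.ndarray:
    """Blend both wide shots at 50% opacity; any offset between the two sets
    shows up as ghosting around chairs and edges."""
    mix = 0.5 * frame_a.astype(np.float32) + 0.5 * frame_b.astype(np.float32)
    return mix.astype(np.uint8)

def misalignment_score(frame_a: np.ndarray, frame_b: np.ndarray) -> float:
    """Mean absolute pixel difference - approaches 0 as the cameras line up."""
    return float(np.mean(np.abs(frame_a.astype(np.int16) - frame_b.astype(np.int16))))

# Synthetic check: a frame compared with itself is perfectly aligned,
# while a horizontally shifted copy produces a visible ghost and a higher score.
frame = np.random.default_rng(0).integers(0, 256, (1080, 1920, 3), dtype=np.uint8)
shifted = np.roll(frame, 5, axis=1)
print(misalignment_score(frame, frame), misalignment_score(frame, shifted))
```

A numeric score like this could speed up the patience-testing manual alignment by turning "does it look right?" into a number to minimize.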

For the lighting, we had to strike a balance between technical base light for a clean key and creative lighting that underscores the atmosphere of the interview situation. Then camera perspectives, people, chairs, and parquet were transferred into the virtual space.

Since the transition from the physical wooden floor of the main set into the depth of its digital counterpart could hardly be rendered realistically, we decided to lay green flooring after all and to carry the shadow into the alpha channel - and thus into the finished image - using the Ultra Keyer in the TriCaster.
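The idea of keying the shadow into the alpha channel can be illustrated with a toy keyer in Python/NumPy. This is a simplified sketch, not the Ultra Keyer's actual algorithm; `ref_green` and `shadow_strength` are hypothetical parameters:

```python
import numpy as np

def key_with_shadow(frame: np.ndarray, ref_green=(0, 180, 0),
                    shadow_strength=0.6) -> np.ndarray:
    """Toy chroma key: fully transparent on clean green, opaque on the subject,
    and semi-transparent black where a shadow darkens the green floor."""
    f = frame.astype(np.float32)
    r, g, b = f[..., 0], f[..., 1], f[..., 2]
    green_dominant = (g > r) & (g > b)      # pixel belongs to the screen/floor
    brightness = g / ref_green[1]           # darkness relative to the lit green
    alpha = np.where(green_dominant,
                     shadow_strength * np.clip(1.0 - brightness, 0.0, 1.0),
                     1.0)
    rgba = np.zeros(frame.shape[:2] + (4,), dtype=np.float32)
    rgba[..., :3] = np.where(green_dominant[..., None], 0.0, f)  # shadow -> black
    rgba[..., 3] = alpha
    return rgba

# Three pixels: lit green floor, shadowed green floor, and the subject.
frame = np.array([[[0, 180, 0], [0, 90, 0], [200, 100, 100]]], dtype=np.uint8)
print(key_with_shadow(frame)[0, :, 3])  # alpha per pixel: transparent, partial, opaque
```

The half-bright green pixel ends up with a partial alpha, so the virtual floor shows through a darkened spot - which is exactly what sells the protagonist as standing on the digital parquet.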

TASK 3. COMMUNICATION, DIRECTION, AND PROTAGONISTS

Whether the format would ultimately be adopted into our portfolio and offered to clients depended crucially on the naturalness of the conversation. In other words, how well the protagonists could respond to each other and whether reaction delays would stall the conversation.

We knew from the signal transmission tests that the camera and audio signals from the PTZ cameras were displayed on the respective conversation partner’s set monitor via SRT with a two-second delay. If you then add the natural reaction time, the conversation flow starts to stall.

Things looked different with a conversation via Zoom. Here, a conversation in perceived real time is possible, partly because the video is more heavily compressed. Even performing a song together is possible this way.
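The difference between the two talkback paths is easy to put into numbers. A minimal sketch; the 500 ms reaction time and the 150 ms Zoom one-way latency are assumed, typical values, not measurements:

```python
def perceived_gap_ms(one_way_ms: float, reaction_ms: float = 500) -> float:
    """Silence the questioner experiences per exchange: the question travels
    one way, the partner reacts, and the answer travels back."""
    return 2 * one_way_ms + reaction_ms

# Monitoring over SRT (about 2 s each way) vs. a Zoom talkback
# (about 150 ms each way - an assumed, typical value).
print(perceived_gap_ms(2000))  # prints 4500 - the dialogue visibly stalls
print(perceived_gap_ms(150))   # prints 800 - close to a natural conversation
```

Roughly 4.5 seconds of dead air per exchange over SRT versus under a second over Zoom - which is why we split talkback and program signal across the two paths.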

The smartest solution for us was to route the protagonists' talkback and the preview monitors on the greenscreen sets through a Zoom conference, while sending the camera signals (wide and close-up) in parallel via SRT to the control studio. In the NewTek TriCaster 2, the high-quality signals were then merged with a two-second delay. The Zoom audio was needed only for the conversation itself, not for the final product.

III. CONCLUSION & LEARNINGS

Connecting two set locations is entirely feasible for us. The “Virtual Interview” format has the potential to make content fundamentally more attractive and offer our clients real added value that stands out from the masses of virtual events.

Communication between both protagonists (via Zoom or SRT) is possible despite a minor delay and is production-ready. Using our Urban control studio for keying in one location is practical and technically feasible despite compression.

NDI instead of SDI as the output medium is a good choice, since in this scenario a greenscreen feed always has to be brought in from outside and the data transmitted to the control studio. The image quality is optimal for publication via social media platforms.

Dynamic 3D backgrounds provide flexibility in lighting (day/night) and camera perspectives, including depth of field. With Unreal Engine as the background input, the TriCaster can continue to serve as the broadcast system and, if necessary, fall back on a backup solution - such as a static background image - until Unreal is back online.

All information about our Urban Studio Berlin is available here.

Get in touch

Tell us about your next project.