What if I told you that a graphics card could be the quickest way to improve your livestream or podcast audio? It sounds counterintuitive, but think about it: A GPU often has an extreme amount of processing power sitting idle, so why not use that redundant hardware for other things?
Fortunately, NVIDIA is way ahead of us, and has already harnessed the potential of its own GPUs to do things beyond, well, graphics. For example, you might remember RTX Voice, which, as the name implies, is a tool for upping your microphone game. Then the company quietly released Broadcast, a more comprehensive tool aimed squarely at streamers and content creators. Both offer great audio enhancement features, but we’ll focus on Broadcast here as it has effectively (though not entirely) replaced RTX Voice.
Right up top, I should set some expectations. While Broadcast offers some helpful tools for all streamers, the real benefit is for those with more entry-level gear. For example, if you have something like a Blue Yeti and an older webcam, you’re going to get more out of this tool than someone with a Shure SM7B and a Sony A7.
Broadcast specifically “uses Tensor Cores on NVIDIA RTX GPUs to accelerate AI calculations so you can game, livestream and run AI networks at the same time.” It’s compatible with any RTX GPU — “GeForce RTX 2060, Quadro RTX 3000, or higher” — according to a company spokesperson.
If you don’t already own a compatible GPU, now isn’t exactly the best time to be looking for one, thanks to the ongoing chip shortage, but things do seem to be slowly easing up. If you do have a supported card then you can simply download the Broadcast app and get cracking. The better news is, if you have a Logitech headset or Blue mic, as of today, Broadcast is natively supported so you won’t even need to dive into the app.
Supported models at launch are Logitech's G733, Pro X and Pro X Wireless headsets, and Blue's Yeti X, Yeti Classic and Yeti Nano microphones. While that’s only a fraction of the companies’ offerings, it still represents a lot of headsets and microphones that today have new, untapped potential.
The Logitech partnership, to date, only works with some products and only with some of the features on offer in NVIDIA Broadcast. Those looking for the full audio-visual feature set will still need to download the standalone app. Once you’re in Broadcast you’ll see three main tabs: Microphone, Speaker and Camera. We’ll focus mainly on the microphone section, but the other two are just as useful and it all combines into one hub for tweaking your stream, be it video, audio or both.
Under the Microphone tab you’ll find a drop-down on the left to select your input source and a space below for adding effects. The area on the right is given over to a tool for testing these effects before you commit to them.
Right now, there are only two effects to choose from. But both are useful and there’s no novelty chaff (get your robot voices elsewhere!). Broadcast is focused on shaping up your stream, not bending it into something else. And importantly, it does all this in real time, unlike something like iZotope RX, which is incredibly good at repairing sound but aimed at post-production.
The first effect is Room echo removal. For anyone who has a space with less than favorable acoustics, this is going to help you dial down that dreaded “cave” sound you have probably been battling with. Reverb reduction is actually quite a science, given that you’re trying to remove elements of a sound that are… well, very very similar to the source. So you can’t just hack out the errant frequencies and be done with it; you need to leave the original signal intact.
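To see why a simple filter falls short, here is a minimal Python sketch. It is not how Broadcast works (NVIDIA uses trained AI networks); it is a toy model that treats an echo as a delayed, quieter copy of a single pure tone. The function names, the tone and the delay and gain values are all assumptions made up for illustration.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Return the magnitude of each DFT bin (naive O(n^2) transform)."""
    n = len(samples)
    return [
        abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]

def active_bins(samples, threshold=1e-6):
    """Indices of the frequency bins that carry non-negligible energy."""
    return {k for k, mag in enumerate(dft_magnitudes(samples)) if mag > threshold}

N = 64         # samples in the analysis window (assumed)
TONE_BIN = 5   # the "voice" is a single tone sitting in this bin (assumed)
DELAY = 7      # the echo arrives 7 samples late (assumed)
GAIN = 0.5     # the echo is half as loud as the source (assumed)

dry = [math.sin(2 * math.pi * TONE_BIN * t / N) for t in range(N)]
# A circular delay keeps this toy example periodic within the window.
wet = [dry[t] + GAIN * dry[(t - DELAY) % N] for t in range(N)]

print(sorted(active_bins(dry)))  # [5, 59]
print(sorted(active_bins(wet)))  # [5, 59]
```

The dry and echo-laden signals occupy exactly the same bins (5 and its mirror image, 59), so cutting those frequencies would remove the voice along with the echo.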
NVIDIA kept things nice and simple. Other pro tools (like iZotope’s RX) give you a bevy of settings and controls. Broadcast? Just two: on/off and “strength” (amount of reverb to be removed). To test this, I used a condenser microphone as those are most prone to picking up reverb. In the recording below, I start with no effect applied before dialling in about 50 percent and then finally setting the strength to maximum.
As you can hear, the effect is, well, effective. The acoustics in the room I recorded the samples in aren’t terrible, but they’re definitely not optimal. But the difference between the raw recording and the one with the reverb removed is stark. The effect is most obvious in the first recording with the condenser mic. The raw signal is… fine, but things are much improved with the echo removal tool set to around halfway. The recording feels much more present and there’s no distracting room echo. More isn’t always better, though. Once I dial the effect up to 100 percent, reverb might be eliminated, but at the expense of the original signal. In short, play around with the settings to find the best balance for your tastes and recording space.
You’ll notice in the second recording that there’s not really all that much reverb to remove. This is thanks to the dynamic microphone which does a pretty good job of that itself. You can hear a difference once I start adding the effect, but it might not be worth the risk of degrading your source audio for such a minor benefit.
Perhaps more impressive than the echo removal is the second effect on offer: Noise reduction. While reverb tends to be a constant, outside noises are unpredictable. Things might be quiet when you sit down to record, only for a loud motorbike or barking dog to invade your stream moments later. Not with noise removal applied though.
This effect is impressively adept at removing anything but your voice from your stream. Be that a jackhammer, a crying baby next door or even a song played loudly on a speaker right by your microphone. Honestly, listen to the below.
Notice how you can’t hear the song when the effect is applied? That’s coming from a speaker barely a foot behind the microphone. When I recorded it, it was fully audible to my ears, but it’s almost entirely inaudible on the recording. I say almost as those with sharp ears might notice the odd fragment popping through, but you really have to listen closely. I actually asked NVIDIA about this and was given this response:
“The AI networks are trained to recognize some patterns, and as such there will be gray areas where the AI can have doubts. [...] As such we wanted to give our users flexibility to run the effects in a comfortable range, or dial things up in case they needed help in extreme situations. For example, for audio effects we recommend running them at 75-90 percent, depending on how much background noise or room echo there is.”
As mentioned, Broadcast largely replaced RTX Voice, but NVIDIA decided to patch it with support for NVIDIA GeForce GTX GPUs meaning you can still get the noise reduction tools even if your GPU doesn’t support Broadcast. Though, the company says, experiences will vary on older cards.
As with the reverb removal, the strength setting will ultimately impact the quality of the output so trial and error is needed to find the sweet spot. You might also want to consider time of day. If, for example, you live near a noisy road, you could dial in a second pair of settings for when you have to record around rush hour.
The noise removal filter is impressive, but it’s not without limitations. For one, the “strength” slider doesn’t seem to fade out sounds like music gradually; instead, they fall off only once you apply the maximum amount. At least in my testing. I doubt you’re intentionally streaming with music only you want to hear, so that’s a minor thing to be aware of, but if you do need to keep the strength at maximum, watch out for that degradation in signal, which will get worse the more noise Broadcast is eliminating.
As a companion to the vocal effects, the middle tab in Broadcast is “Speakers.” We can sum up this section in one shot: It’s the same as Microphone, just for incoming audio. That’s to say, you can go ahead and remove room echo or background noise from people you are speaking with. If you’ve ever been on a Zoom call during these pandemic times and someone’s baby starts crying, or someone has a really loud road nearby, then you can selfishly spare your ears here.
Conceptually, the effects for your camera work in a similar way to the ones for your microphone. NVIDIA’s AI cores are “looking” out for noise or, depending, your entire background. Yup, with Broadcast you can achieve a green screen effect without the actual screen. You might be thinking “well, Zoom/Meet/TikTok does that” and you’d be right, but those don't do it nearly as well. Below is an example of what background removal/replacement looks like in Broadcast.
You can definitely see where the effect isn’t perfect around my hair, but in general it works really well. I did notice that it struggled with the headrest on my chair, with the image flickering around that area as I moved around, but again, it’s leagues above what you might hope for from a free chat app.
In total, there are five effects for cameras. Three are for your background: blur, removal and replacement (images or videos work!). Then there is Auto Frame, which crops in and then keeps your face in the center. Last, but not least, is Video noise removal, which is apt for when you’re recording in low light and your camera starts to go all grainy. You can see some examples in the picture below. Two were taken using a GoPro and the other pair are from an old DSLR repurposed as a webcam.
If you have good lighting, you probably won’t need the noise removal tool. But if your camera struggles with anything other than an abundance of photons, this filter can help. Again, as with the audio tools, be aware that more is not always better. I found with the effect applied at 100 percent you get the tell-tale “smoothing” that will tell your viewers that you’re papering over the cracks.
The background tools, on the other hand, are a little more forgiving. The blur tool does a good job at obscuring the items behind you, but there’s a very unnatural contrast between you, in focus, and literally anything else in shot. This is similar to how iPhone “portrait” mode photos often appear. The background is blurred, but the subject looks unnatural as there’s almost no transition between foreground and background.
This is, of course, a benefit when you want to remove the background completely. With this filter, NVIDIA does a surprisingly good job of isolating you and deleting everything else. If you want to stream via OBS (or similar) with a game or video on screen while you play in the corner, this is the effect you want and no green screen required.
As seen above, NVIDIA does a much better job of removing the background than, say, Zoom or Google Meet does. There’s almost no bleed around hair or hands meaning your background rarely pokes through and spoils the green-screen illusion. Background replacement is essentially an extension of this effect with the option of choosing your own image or video to replace whatever happens to be behind you.
The last of the video effects is Auto-Frame which, as the name suggests, keeps you locked into the center of your shot. To do this, the video crops in a little bit and it uses the extra space to gently shift the video from left to right or vice versa as you move around. The movement isn’t jarring, it looks pretty smooth, and is a good way to make sure you don’t accidentally drift off to the side if you are a fidgeter (like me) and don’t keep still.
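For the curious, the crop-and-slide idea can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not NVIDIA's implementation: the function name, the `smoothing` value and the list of face positions are all hypothetical, and a real app would get those positions from a face tracker.

```python
def auto_frame_offsets(face_centers, frame_width, crop_width, smoothing=0.2):
    """Return the left edge of the crop window for each frame.

    face_centers: horizontal face position per frame, in pixels (hypothetical
    input standing in for a face tracker).
    smoothing: fraction of the remaining distance covered each frame
    (assumed value; smaller means gentler movement).
    """
    max_left = frame_width - crop_width         # furthest the crop can slide
    left = max_left / 2                         # start with the crop centered
    offsets = []
    for center in face_centers:
        target = center - crop_width / 2        # crop that would center the face
        target = min(max(target, 0), max_left)  # stay inside the full frame
        left += smoothing * (target - left)     # ease toward the target
        offsets.append(round(left))
    return offsets

# A 1920-pixel-wide frame cropped to 1600 leaves 320 pixels of slack.
# The face sits in the middle for three frames, then jumps to the right.
positions = [960] * 3 + [1200] * 5
print(auto_frame_offsets(positions, frame_width=1920, crop_width=1600))
```

Because the crop only covers a fraction of the remaining distance on each frame, a sudden move by the subject turns into a gradual pan, and the clamp stops the window from sliding past the edge of the original image.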
You can also combine these effects if you want. Auto-Frame can be stacked with either noise removal or one of the background effects, for example, or you could use remove background with noise removal applied. Right now you can only stack two, but that’s probably a good thing, as all three at the same time might be a bit much.
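If you are scripting your stream setup, the stacking limits described above boil down to a small check. The Python sketch below is hypothetical: the effect names and the `can_stack` helper are mine, not part of any Broadcast API, and the "one background effect at a time" rule is my reading of how the app behaves.

```python
# Effect names are made up for this sketch; Broadcast exposes no such API here.
BACKGROUND_EFFECTS = {"blur", "removal", "replacement"}
ALL_EFFECTS = BACKGROUND_EFFECTS | {"auto_frame", "noise_removal"}
MAX_STACKED = 2  # the article's current limit on combined camera effects

def can_stack(effects):
    """Return True if this combination of camera effects is allowed."""
    chosen = set(effects)
    if not chosen <= ALL_EFFECTS:
        return False  # unknown effect name
    if len(chosen) > MAX_STACKED:
        return False  # only two effects can run together right now
    # Assumption: blur, removal and replacement are mutually exclusive.
    return len(chosen & BACKGROUND_EFFECTS) <= 1

print(can_stack(["auto_frame", "noise_removal"]))             # True
print(can_stack(["removal", "noise_removal"]))                # True
print(can_stack(["auto_frame", "blur", "noise_removal"]))     # False
```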
The camera suite of effects is currently in beta, and NVIDIA has been adding new features and improving performance at a rapid pace. Version 1.2 in May integrated Broadcast into popular streaming apps like OBS, while the latest 1.3 version released in September added support for many virtual camera apps and made combining video effects much more viable by reducing their VRAM usage by 40 percent.
While performance and memory usage are never going to be an issue for those using Broadcast for their chats, one-PC streamers who are using the app while gaming will be grateful for the additional frames. Perhaps, one day, we can dare to dream about many other features such as multi-mic support (with different effects), live transcribing for closed captions and even smart detection of licensed music so your streams don’t flag a DMCA violation.