ADR Workflow: Replacing Dialogue Without Losing Performance

By Chris Nolan -- Game Audio Implementation Specialist, Wwise/FMOD · 12 min read

Actor recording ADR in a professional dubbing booth

Automatic Dialogue Replacement -- ADR, also called looping -- is the process of re-recording dialogue in a controlled studio environment to replace production audio that is unusable due to noise, technical issues, or creative changes. Approximately 30% of dialogue in modern feature films and television goes through ADR, and in action-heavy productions or films shot in challenging acoustic environments, that number can climb to 60% or higher.

I've spent the past six years implementing dialogue systems in games using Wwise and FMOD, which means I live at the intersection of recorded performance and interactive delivery. The ADR workflow principles I've learned from film projects directly inform how I implement game dialogue -- because whether the medium is linear or interactive, the fundamental challenge is the same: make the replacement sound like it was always there.

When ADR Becomes Necessary

Production sound mixers do extraordinary work under impossible conditions, but there are limits. Aircraft noise during an exterior scene, refrigerator hum in a kitchen close-up, clothing rustle on a lavalier microphone, radio frequency interference from nearby cell towers -- these are all common reasons production dialogue gets replaced in post. Sometimes the issue is creative: the director wants to change a line reading, or the script was rewritten after the scene was shot.

On a recent streaming series I consulted on, we needed to ADR 45% of the dialogue because the production was shot in a converted warehouse in East London, directly under the flight path to London City Airport. Every take had at least one aircraft pass overhead, and even with directional microphones and noise gates, the low-frequency rumble of jet engines was baked into every close-up. The ADR budget for that series was 120% of the original production sound budget.

The ADR Cue Sheet

Every ADR session starts with a cue sheet -- a frame-accurate list of every line that needs to be re-recorded. The cue sheet includes the timecode in and out, the original dialogue line, any modified dialogue, the character name, and notes about emotional context or performance direction. A typical one-hour television episode generates a cue sheet with 40-80 individual cues.

The cue sheet is prepared by the dialogue editor, who identifies unusable production audio during the editorial process. Each cue is flagged with a reason code: noise (unwanted ambient sound), performance (creative change), technical (microphone failure or distortion), or addition (new dialogue not in the original production). This categorization helps the ADR supervisor plan the session efficiently -- performance cues need the most takes, while technical replacements can usually be nailed in two or three.
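
As a rough illustration of the cue sheet fields and reason codes described above, here is a minimal Python sketch. The class name, field names, enum, and sample cue are all hypothetical and invented for this example; real cue sheets usually come out of the dialogue editor's own tooling. Timecodes are assumed to be non-drop HH:MM:SS:FF at 24 fps.

```python
from dataclasses import dataclass
from enum import Enum

FPS = 24  # assumed non-drop frame rate for this sketch


class Reason(Enum):
    NOISE = "noise"              # unwanted ambient sound
    PERFORMANCE = "performance"  # creative change
    TECHNICAL = "technical"      # microphone failure or distortion
    ADDITION = "addition"        # new dialogue not in the original production


def timecode_to_frames(tc: str, fps: int = FPS) -> int:
    """Convert a non-drop HH:MM:SS:FF timecode to an absolute frame count."""
    hours, minutes, seconds, frames = (int(part) for part in tc.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


@dataclass
class AdrCue:
    character: str
    tc_in: str
    tc_out: str
    original_line: str
    reason: Reason
    modified_line: str = ""  # empty when the line is unchanged
    notes: str = ""          # emotional context or performance direction

    def duration_frames(self, fps: int = FPS) -> int:
        """Length of the cue window in frames."""
        return timecode_to_frames(self.tc_out, fps) - timecode_to_frames(self.tc_in, fps)


# Hypothetical cue, for illustration only.
cue = AdrCue(
    character="MAYA",
    tc_in="01:02:10:12",
    tc_out="01:02:13:00",
    original_line="We have to go. Now.",
    reason=Reason.NOISE,
    notes="Out of breath, just stopped running.",
)
print(cue.duration_frames())  # 60 frames, i.e. 2.5 seconds at 24 fps
```

Keeping the reason code as an enum rather than free text makes it easy to sort a session so that performance cues are scheduled while the actor is freshest.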

Matching the Production Microphone Character

The single most important technical factor in successful ADR is matching the tonal character of the production microphone. If the original dialogue was recorded with a Sanken COS-11 lavalier hidden in the actor's hair, re-recording it with a Neumann U 87 in a treated booth will sound obviously different. The proximity effect, the off-axis response, the cable noise -- all of these characteristics need to be replicated or processed to match.

In practice, this means the ADR engineer needs detailed notes about the production sound setup: which microphones were used on which actors, at what distance, in what acoustic environment. With that information, the ADR recording can be equalized and processed to match the production sound's frequency response and ambient character. The match doesn't need to be perfect -- it needs to be good enough that the re-recording mixer can blend the ADR and the production dialogue in the final mix without the audience noticing the transition.

The ADR Recording Session

A professional ADR session runs 2-4 hours per actor and covers 20-50 cues depending on complexity. The actor watches the scene on a screen, hears three beeps along with a streamer (a visual line that sweeps across the screen to indicate timing), and begins the line at the moment the streamer completes its sweep, where a fourth beep would fall and the original dialogue starts.

The ADR supervisor's job during the session is to direct the performance -- not just the words, but the breath, the emotional intensity, the physical energy. A line delivered while running sounds different from the same line delivered while standing still, even if the actor is stationary in the booth. The ADR supervisor will have the actor move their body, change their breathing pattern, or adopt a physical posture that matches the on-screen action.

Each cue typically gets 3-6 takes. The first two takes are warm-up -- the actor is getting into the character's headspace. Takes three through five are usually the strongest. From take six onward, fatigue sets in and performance quality drops. If a cue hasn't been nailed by take six, the ADR supervisor will move on and come back to it later with fresh ears.

Technical Setup and Signal Chain

The standard ADR recording chain: actor performs in an acoustically treated booth (targeting NC-15 or better), microphone is typically a Neumann U 87 or U 67 in cardioid mode at 12-18 inches from the mouth, signal passes through a high-quality preamp (Neve 1073 or equivalent) set for 40-50 dB of gain, and is recorded at 96 kHz / 24-bit into Pro Tools.

The Pro Tools session is organized with the original production audio on reference tracks (muted during recording but available for comparison), the ADR takes on playlisted record tracks, and a dedicated streamer/beep track that generates the three-beep-and-streamer cue. The ADR recordist sets the input gain so that the actor's loudest delivery peaks at -6 dBFS, leaving 6 dB of headroom for louder-than-expected deliveries.
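
The headroom arithmetic above is easy to sanity-check. The sketch below converts between dBFS and linear amplitude and works out how much to trim the input gain when a rehearsal peak lands off target. The function names and the -2.5 dBFS rehearsal reading are hypothetical; only the -6 dBFS target comes from the text.

```python
import math

TARGET_PEAK_DBFS = -6.0  # target from the text: loudest delivery peaks at -6 dBFS


def dbfs_to_linear(dbfs: float) -> float:
    """Convert a dBFS level to linear amplitude, where 1.0 is full scale."""
    return 10 ** (dbfs / 20)


def linear_to_dbfs(amplitude: float) -> float:
    """Convert a linear peak amplitude (0 < amplitude <= 1.0) to dBFS."""
    return 20 * math.log10(amplitude)


def gain_trim_db(measured_peak_dbfs: float, target_dbfs: float = TARGET_PEAK_DBFS) -> float:
    """Gain change in dB that moves a measured peak onto the target peak."""
    return target_dbfs - measured_peak_dbfs


print(round(dbfs_to_linear(-6.0), 3))  # 0.501, about half of full scale
print(gain_trim_db(-2.5))              # -3.5, so pull the preamp down 3.5 dB
```

A peak at -6 dBFS uses roughly half of the available amplitude range, which is why an unexpectedly loud delivery can rise by a further 6 dB before clipping.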

"ADR is not about replacing dialogue. It's about recreating a moment. The actor needs to be back in that emotional place, in that physical situation, even though they're standing in a quiet room staring at a screen. The technology gets you the sound. The acting gets you the truth." -- Randy Thom, supervising sound editor, interviewed at the AES Convention, 2018

ADR Editing: The Invisible Craft

ADR editing is where the technical and creative challenges merge. The editor takes the best ADR takes from the session and aligns them to picture with sample-accurate precision. This means not just matching the start of the line, but matching every syllable, every breath, every consonant transient to the on-screen lip movement. The tolerance for misalignment in professional ADR editing is approximately 2 frames -- about 83 milliseconds at 24 fps.
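
For reference, the frame-to-time conversion behind that figure can be sketched in a few lines. The helper names are hypothetical, and the 25 and 30 fps rows are added only for comparison; the text itself only discusses 24 fps.

```python
def frames_to_ms(frames: float, fps: float) -> float:
    """Duration of a number of frames in milliseconds."""
    return frames / fps * 1000.0


def frames_to_samples(frames: float, fps: float, sample_rate: int) -> int:
    """Duration of a number of frames in audio samples, rounded to the nearest sample."""
    return round(frames / fps * sample_rate)


for fps in (24, 25, 30):
    print(fps, round(frames_to_ms(2, fps), 1))
# 24 83.3
# 25 80.0
# 30 66.7

print(frames_to_samples(2, 24, 96000))  # 8000 samples at 96 kHz
```

At the 96 kHz session rate, the 2-frame tolerance spans 8,000 samples, so the editor can position a take far more finely than the audience could ever perceive.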

After alignment, the editor processes the ADR to match the production sound's acoustic character. This involves EQ matching (using a reference spectrum analyzer to compare the frequency response of the ADR and production sound), reverb matching (convolving the ADR with an impulse response captured from the production location, or using a reverb plugin tuned to match), and level matching (ensuring the ADR sits at the same perceived loudness as the surrounding production dialogue).
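
Of the three steps, level matching is the simplest to sketch in code. The example below uses RMS level as a crude stand-in for perceived loudness; that substitution is an assumption of this sketch, and in practice a proper loudness meter (for example one based on ITU-R BS.1770) is the better tool. The function names and the synthetic sine-wave "dialogue" are hypothetical.

```python
import math


def rms_dbfs(samples):
    """RMS level in dBFS for samples in the range -1.0..1.0."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(max(mean_square, 1e-20))  # floor avoids log(0) on silence


def matching_gain_db(adr, production):
    """Gain in dB to apply to the ADR so its RMS level equals the production level."""
    return rms_dbfs(production) - rms_dbfs(adr)


def apply_gain(samples, gain_db):
    """Scale samples by a gain expressed in dB."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]


def sine(amplitude, freq_hz=220.0, sample_rate=48000, seconds=0.5):
    """Synthetic test tone standing in for a dialogue clip."""
    n = int(sample_rate * seconds)
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


production = sine(0.25)  # quieter location recording
adr = sine(0.5)          # hotter booth recording
gain = matching_gain_db(adr, production)
print(round(gain, 2))    # -6.02
matched = apply_gain(adr, gain)
```

RMS ignores how the ear weights different frequencies, so treat the computed gain as a starting point and finish the match by ear against the surrounding lines.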

EQ Matching Techniques

I use a spectrum analysis approach for EQ matching. First, I play a representative section of clean production dialogue and capture its long-term average spectrum using a real-time analyzer with a 30-second averaging window. This gives me the frequency response curve of the production microphone in its acoustic environment. Then I play the ADR recording through the same analyzer and note the differences.

The most common differences are: ADR has more low-frequency content (the booth is closer-miked than the production lavalier), ADR has less high-frequency air (the booth absorbs high frequencies that a location's hard surfaces would reflect), and ADR has a cleaner midrange (no environmental resonances). I address these with a high-pass filter set to match the production sound's low-frequency roll-off, a high shelf boost to add air, and narrow EQ boosts to introduce subtle resonances that match the location's acoustic signature.
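
The comparison step can be approximated in code. The sketch below is not a real-time analyzer: it measures the averaged energy at a handful of probe frequencies using the Goertzel algorithm and reports, per frequency, how many dB the ADR would need to gain or lose to reach the production level. The probe frequencies, frame length, function names, and the synthetic three-tone signals are all assumptions chosen to keep the example small.

```python
import math


def goertzel_power(frame, freq_hz, sample_rate):
    """Signal power at a single frequency for one frame (Goertzel algorithm)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s_prev = s_prev2 = 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def average_level_db(samples, freq_hz, sample_rate, frame_len):
    """Goertzel power averaged over consecutive frames, in dB (arbitrary reference)."""
    starts = range(0, len(samples) - frame_len + 1, frame_len)
    powers = [goertzel_power(samples[i:i + frame_len], freq_hz, sample_rate) for i in starts]
    return 10.0 * math.log10(max(sum(powers) / len(powers), 1e-20))


def eq_offsets_db(production, adr, probes_hz, sample_rate=48000, frame_len=4800):
    """dB to add to the ADR at each probe frequency to reach the production level."""
    return {
        f: average_level_db(production, f, sample_rate, frame_len)
        - average_level_db(adr, f, sample_rate, frame_len)
        for f in probes_hz
    }


def tones(amplitudes, sample_rate=48000, seconds=1.0):
    """Synthetic signal: a sum of sine tones given as {frequency_hz: amplitude}."""
    n = int(sample_rate * seconds)
    return [
        sum(a * math.sin(2.0 * math.pi * f * i / sample_rate) for f, a in amplitudes.items())
        for i in range(n)
    ]


# Synthetic stand-ins: the "ADR" has more low end and less air than the "production".
production = tones({100: 0.1, 1000: 0.5, 8000: 0.2})
adr = tones({100: 0.2, 1000: 0.5, 8000: 0.1})
offsets = eq_offsets_db(production, adr, [100, 1000, 8000])
for freq, db in offsets.items():
    print(freq, round(db, 1))  # roughly -6 dB at 100 Hz, 0 dB at 1 kHz, +6 dB at 8 kHz
```

On real dialogue the probes would be replaced by bands (third-octave, for example) and the averaging window would be far longer, but the output has the same shape: a list of per-frequency corrections to dial into the EQ.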

ADR vs Production Sound: A Real Comparison

Understanding the strengths and limitations of both ADR and production sound helps you make informed decisions during the editorial process. Neither is inherently better -- they serve different purposes and have different trade-offs.

Table 1: ADR vs Production Sound Characteristics
Characteristic	Production Sound	ADR Recording
Noise floor	Variable (NC-30 to NC-50)	Consistent (NC-15 to NC-20)
Performance authenticity	100% (captured in the moment)	80-95% (recreated in booth)
Acoustic match to location	Perfect (recorded on set)	Requires processing to match
Cost per minute of dialogue	Included in production budget	$200-600 per minute
Flexibility for script changes	None (requires reshoot)	Full (record new lines)

ADR for Games: Interactive Dialogue Systems

While my primary expertise is in game audio implementation using Wwise and FMOD, the ADR workflow principles translate directly to game dialogue recording. The key difference is volume: a game with branching dialogue trees can generate 10-20x more recorded dialogue than a feature film, and the largest RPGs go well beyond that. A large open-world RPG like "The Witcher 3" contains approximately 400,000 words of spoken dialogue -- compared to a 2-hour film at roughly 12,000 words, a ratio of about 33 to 1.

Game dialogue recording sessions are organized differently from film ADR. Instead of watching picture and matching lip sync, the voice actor reads from a script with context notes, and the lines are recorded in batches grouped by emotional state and character relationship. A single session might record 200-400 lines covering every permutation of a dialogue tree: friendly greeting, hostile greeting, neutral greeting, friendly goodbye, hostile goodbye, quest acceptance, quest refusal, and so on.

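The implementation challenge in Wwise is organizing these lines so that the game engine plays the correct variation at the correct moment. Where the order doesn't matter, I use Wwise's Random Containers with probability weighting. Where it does -- if the player approaches an NPC for the third time, the third greeting variation should play, not the first -- a Random Container can't guarantee the result, so the selection has to be deterministic, for example a Sequence Container or a Switch Container driven by a visit count that the game tracks. Either way, this requires meticulous naming conventions and metadata tagging during the recording session.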
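
The two selection behaviors mentioned above, weighted random choice and "the third visit plays the third greeting", can be sketched as plain game-side logic. This is ordinary Python, not the Wwise API, and the function names, asset names, and weights are hypothetical; in a real project the equivalent behavior would be configured in the authoring tool or driven from game state.

```python
import random


def pick_ordered(variations, visit_count):
    """Return the variation for the Nth visit (1-based), holding on the last one."""
    index = min(max(visit_count, 1), len(variations)) - 1
    return variations[index]


def pick_weighted(variations, weights, rng=random):
    """Return one variation at random, biased by the given weights."""
    return rng.choices(variations, weights=weights, k=1)[0]


# Hypothetical asset names following a character_intent_mood_index convention.
greetings = [
    "VO_Innkeeper_Greet_Friendly_01",
    "VO_Innkeeper_Greet_Friendly_02",
    "VO_Innkeeper_Greet_Friendly_03",
]

print(pick_ordered(greetings, 3))   # VO_Innkeeper_Greet_Friendly_03
print(pick_ordered(greetings, 10))  # VO_Innkeeper_Greet_Friendly_03 (clamped to the last line)
print(pick_weighted(greetings, weights=[70, 20, 10]))  # usually the first variation
```

The point of the contrast is that weighting only changes how likely a line is; if the design calls for a specific line on a specific visit, something has to track the visit count.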

Quality Control and the Final Blend

The final test for ADR is the re-recording mix stage. When the dialogue, Foley, effects, and music are all playing together at theatrical levels, the ADR needs to be indistinguishable from production sound. The re-recording mixer will A/B between the original production audio and the ADR replacement, checking for tonal match, spatial match, and performance match.

If the ADR doesn't pass the A/B test, it goes back to the dialogue editor for reprocessing. The most common failure mode is a reverb mismatch -- the ADR sounds too dry compared to production dialogue that was recorded in a space with natural reverberation. The fix is usually to increase the early reflection level in the convolution reverb and adjust the pre-delay to match the distance from the sound source to the nearest wall in the production location.
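
A starting value for that pre-delay can be estimated from geometry. The sketch below assumes the production microphone sat close to the actor, so the first reflection travels roughly to the nearest wall and back, and it takes the speed of sound as 343 m/s (dry air at about 20 degrees C). Both simplifications, the function name, and the 3-metre example are assumptions of this sketch; the final value should still be tuned by ear.

```python
SPEED_OF_SOUND_M_S = 343.0  # dry air at about 20 degrees C


def predelay_ms(distance_to_wall_m: float) -> float:
    """Approximate delay of the first wall reflection behind the direct sound.

    Assumes the microphone is close to the source, so the reflected path is
    about twice the source-to-wall distance.
    """
    round_trip_m = 2.0 * distance_to_wall_m
    return round_trip_m / SPEED_OF_SOUND_M_S * 1000.0


print(round(predelay_ms(3.0), 1))  # 17.5 ms for a wall 3 m away
```

A handy rule of thumb follows from the same numbers: each metre of distance to the wall adds a little under 6 ms of pre-delay.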

The acceptance criterion I use: if a trained audio professional cannot identify which lines are ADR and which are production sound during a continuous playback of the scene, the ADR passes. If they can identify even one line, that line needs to be reprocessed. In practice, a well-executed ADR workflow on a professional production achieves a 95% or higher pass rate on first review.

References: Randy Thom, "ADR: The Performance Challenge," AES Convention Technical Paper (2018) | David Lewis Yewdall, "The Practical Art of Motion Picture Sound," 4th Edition (2020) | Mark Mangini, "The Sound of ADR," MPSE Master Class series (2019) | Audiokinetic, "Wwise Game Audio Implementation Guide," v2024.1 (2024)