
How to Make Remote RPG Sessions Feel Less Like Zoom Meetings

Updated: Mar 19



Online D&D sessions feel like work meetings because the tools are the same: a video grid, equal squares, no visual hierarchy. In person, the GM's presence signals authority physically, spatially, visually. Online, that signal disappears. Several specific adjustments to how you structure the session and present yourself on camera address this directly.

Why does online D&D feel like a work meeting?

The short answer: the software was designed for productivity, not narrative authority.

When you open Discord or Zoom for a session, the platform arranges every participant in identical squares at equal scale. No one is positioned at the head of the table. There are no spatial cues that differentiate the person narrating from the people listening. The GM's face is the same size as the player checking their phone. The visual architecture of a meeting, a grid of interchangeable participants, is working exactly as designed. It was not designed for this.

In an in-person session, players orient toward the GM instinctively. The GM's physical presence occupies a specific position in the room. Their face is visible at a different scale. Their voice fills the actual air in the room. None of that translates through a webcam square. The platform actively equalises everyone in the frame, and equalisation is the opposite of narrative authority.

This is not a vibe problem. It is a structural problem with a structural cause.

What makes in-person sessions feel different?

Spatial positioning is the first thing that disappears. At a physical table, the GM often sits at one end. Players face them. The GM's hands are visible, their body language readable, their attention clearly directed at specific players when needed. None of that is arbitrary; it does real work establishing who is running the session.

Physical props do similar work. A GM who reaches for a miniature, places a map, or rolls visible dice in front of their players is performing. That performance communicates: what is happening here is deliberate. Online, all of those objects disappear from shared space.

Voice carries differently in a room. Not just acoustically but relationally. A GM who lowers their voice at a tense moment in person creates a response in the room. The same GM doing the same thing over a compressed audio stream through laptop speakers lands flat. The spatial cues that made the voice meaningful are not there to support it.

What is actually lost in online play is not immersion as an abstract value. It is the physical environment that made the GM's presence legible as a performance and not as a meeting.

How do you signal that this is a performance, not a meeting?

The most direct lever is the GM's visual frame.

What the camera sees matters in a specific way: it is the only spatial environment players have access to. Lighting, background framing, and camera position all affect whether the person on screen reads as someone who is presenting or someone who is attending. A GM lit from a window behind them, sitting in front of a blank wall, in an unmade room, registers as a call. The same GM who has composed their frame with warm front light, a background with depth, and something visible in the space that belongs to the game will register differently.

Beyond frame composition, the visual identity of the GM on screen does work that most setup guides don't address. A GM who looks like themselves, in their regular clothes, in their regular environment, is asking players to perform the imaginative labour of pretending the session is something other than what it visually appears to be. A GM who has prepared a consistent visual identity, with something that marks them as being in a different mode, reduces that labour. Tools like Faes AR allow GMs to author a persistent character look that holds across the session, reinforcing visually that what players are watching is a performance. That signal is available from the moment the session opens, without requiring any explanation or announcement.

The camera frame is the only stage an online GM has. Composing it deliberately is the same work as arranging the physical space of a table.

Does audio setup make a difference?

Yes, but the contribution is narrower than most setup guides suggest.

A dedicated microphone removes one specific association: the flat, slightly robotic audio quality of a built-in laptop mic is the same quality players associate with work calls. A mic that picks up the voice naturally, with presence and warmth, breaks that association. It does not fix the structural problems above, but it removes a signal that actively codes the session as a call.

The more meaningful audio factor is what is not there. Built-in mics pick up fan noise, keyboard noise, ambient room sound, and the mechanical sound of breathing too close to the microphone. All of that competes with the voice. A dedicated mic with a cardioid pattern isolates the voice. The result is not richer; it is clearer, and clarity is what the voice needs to carry narrative weight.

A condenser mic on a boom arm, positioned correctly, is sufficient. It does not need to be expensive. It needs to be present.

What session structure habits reduce the work-meeting feeling?

The opening five minutes do more work than almost anything else in the session.

A work meeting opens with a check-in, a roll call, or a "can everyone hear me" that confirms the logistics before the meeting begins. An online session that opens the same way has already told players what kind of event this is. The convention is set before anyone rolls a die.

GMs who run sessions that feel different tend to start differently. Not with logistics. Not with housekeeping. With the world. The recap of last session delivered in present tense, as though the events are still unfolding. The ambient audio cue that signals the game is beginning. The question directed at a specific player, not at the group, that requires an in-character response immediately.

The structural logic is simple: the first thing that happens in the session teaches players what kind of thing a session is. If the first thing is a logistics check, the session is a logistics-adjacent event. If the first thing is fiction, the session is fiction.

Transitions between scenes follow the same principle. A GM who says "okay, so that scene is done, you all get a short rest, should we move on?" has surfaced out of the fiction to manage the meeting. A GM who narrates the transition, who makes the cut, stays in the performance mode that the opening established.

What do players need from the GM to stay present?

Consistent identity. The persistent signal that the person on screen is operating in a different register from their normal self.

Players calibrate to the GM's mode within the first few minutes of a session. A GM who has visually composed their frame, who opens in fiction rather than logistics, who narrates transitions rather than managing them, establishes a clear mode. Players respond to that mode. The phone stays down when it is obvious that the person on screen is performing.

The visual and structural work described above serves one function: reducing the amount of imaginative labour players have to do to treat the session as something other than a call. Every signal that says "this is a performance" is one less burden on the players to pretend it is.

That is the work. None of it requires expensive equipment or technical sophistication. It requires treating the camera frame as a stage and the session opening as a directed event.

Learn more at https://faes.ar/ and explore the product here: https://araura.gumroad.com/l/qyoqv.



 
 
 
