How to Maintain GM Presence and Authority in Online Sessions
- Team Faes AR
- 2 days ago

GMs feel less effective online because the tools that communicate authority in person don’t translate to a webcam format: proximity, movement, and physical differentiation all disappear. Lighting, framing, background, and visual identity are the four variables that affect how authority reads on camera. Addressing them alongside voice and preparation is how the format gap gets closed.
Running a session online flattens authority. The visual cues that signal “this person is narrating, not participating” are physical presence, spatial ownership of the room, and the ability to shift attention through movement, and none of them survives a webcam square. Voice technique and preparation help; neither addresses the visual frame.
Why do experienced GMs feel less effective when running games online?
The authority a GM carries at a physical table is partly spatial. You can stand. You can lean in. You can use your body to signal that a scene has changed: a shift in posture when the BBEG speaks, a step back when you’re asking the table a question, a hand on the table when something matters. Players read these cues without thinking about it. They orient toward the person who is clearly running the room.
None of that survives a webcam.
On a video call, every participant occupies the same size rectangle. The GM who has run this campaign for eight months and the player who has been half-listening from a different tab are visually identical. The framing enforces equality. That is useful in a meeting. In a performance, it works against the person whose job it is to hold attention.
This is not a preparation problem. It is not a confidence problem. It is a structural feature of the format. The tools that communicate authority in person are simply absent: proximity, movement, physical differentiation. GMs who feel less effective online are usually reading something real.
What affects GM presence on webcam?
Four variables. Each does something specific.
Lighting. Flat, even lighting, the kind you get from a ring light pointed directly at your face, reads as a work call. It is what people use for HR meetings and job interviews. Directional lighting, where the light source comes from one side, reads as deliberate. It creates shadows, which create depth, which reads as performance rather than participation. The difference is not dramatic. It is a register shift.
Framing. A tight face crop removes body language from the frame. A GM framed from mid-chest upward retains shoulder movement, posture, and the gestural range that survives from in-person presence. If your players can only see your face, they are getting only a fraction of the physical communication you are actually making.
Background. A blank wall or blurred office removes environmental differentiation. It tells the viewer nothing about context. A background that reads as deliberate, whether it is a physical space that signals “this is where this person does something specific” or a chosen digital element, communicates that a choice was made. Choices read as intentionality. Intentionality reads as authority. A bookshelf, a dedicated gaming space, a specific arrangement of objects that signals “this is where this person runs games” does different work than a neutral wall. The viewer’s eye lands somewhere and reads context from it.
Visual identity. A GM who looks exactly like themselves in a home office reads the same as every other participant at the table. A GM who has made a specific choice about what the camera sees reads differently: dressed for the session, wearing something character-appropriate, or presenting any visual cue that preparation happened. Not because the choice needs to be dramatic. Because the deliberateness is legible, and legibility does signaling work at the level of attention.
None of these require a production budget. They require attention to what the camera is actually communicating.
How do you project authority through a camera?
Most GM advice on this subject covers the variables that GMs already control well: voice modulation, pacing, preparation, scene management. That advice is accurate. A GM with strong vocal range and deep session prep will run a better online session than a GM without those things.
What most GM advice does not cover is the visual identity layer: what the camera sees before anyone speaks.
Players form an impression of the table before the session starts. They see thumbnails, they see the grid of participants, they see who is in the call. A GM who looks like they made a deliberate choice about that frame has already communicated something different than a GM who opened a laptop and joined a call.
Voice carries the session. Preparation carries the narrative. The visual frame carries the first impression, and it carries the ambient signal throughout: the background cue that says “this is a performance” while the session is running.
Working at the visual layer does not replace the other work. It addresses a gap that the other work doesn’t reach.
What affects how players perceive the GM in online sessions?
Players take behavioral cues from what they see, not only what they hear. This is not a conscious process. When everyone at the table occupies an identical rectangle with the same webcam-at-desk framing, the visual information is not differentiating. Players have to work harder to maintain the orientation that one participant is narrating and the rest are responding.
This is partly why some online sessions feel like group discussions even when the GM is doing everything right at the narrative level. The visual context is not reinforcing the role structure.
Slight visual differentiation is enough. Not dramatic, not costumed for a stage. The standard is legibility: a GM who clearly made a deliberate choice about that frame. Players orient toward differentiation. It is a much older attention mechanism than TTRPGs.
The GM who has addressed lighting, framing, and visual identity is giving players more to orient toward before the first word is spoken.
Does having a distinct visual look change how players engage?
Yes. A deliberate visual identity signals performance. Players who see that the GM has made a specific choice about what the camera shows receive a cue that this is a performance, not a meeting. That cue operates throughout the session: not loudly, not obtrusively, but continuously. It does ambient work while the GM is focused on the narrative.
For GMs who want to address this layer specifically, who have worked on voice, prep, and table management and are still noticing a gap in how authority reads on camera, there is now a tool built for this exact problem. Faes AR is a desktop app that lets a GM author a character-appropriate look: armor, effects, and visual elements that fit the world they are running. That look holds across a live session through virtual camera output into Discord, Zoom, or OBS. The GM remains visible throughout; it is an enhancement layer, not a replacement. The result is a camera presence that reads unambiguously as “this person is the narrator” for as long as the session runs.
That is the specific gap it addresses: the visual identity layer, in a long-form performance context.
What is the actual fix for the authority gap in online sessions?
GMs who feel less effective online are usually diagnosing correctly. Something changed. The physical tools of authority did not make the transition to a webcam format, and no amount of voice work or session prep fully compensates for a visual frame that treats every participant identically.
The variables are addressable: lighting that signals performance rather than a meeting, framing that keeps body language in frame, a background that reads as deliberate, and a visual identity that is legibly different from the rest of the table. These do not require significant investment. They require attention to what the camera is actually communicating.
GMs who work the visual layer alongside voice and preparation are addressing the problem at the right level. The format changed. The toolkit needs to catch up.


