
AD DS Troubleshooting Guide

Selected public-safe documentation pages from a private technical documentation hub. The focus is documented, controlled and reviewable technical delivery.

Active Directory Domain Services — Troubleshooting & Operations Field Guide

Purpose

This document summarizes practical Active Directory Domain Services troubleshooting concepts and command-line workflows for Microsoft hybrid and on-premises identity environments.

The focus is operational: how to reason through AD DS failures logically instead of checking random components in isolation.

Scope

This is a learning and portfolio-support document for AD DS troubleshooting reasoning.

It uses placeholder names and general operational patterns only. It does not include customer environment data, production domain names, credential identifiers, private incident evidence or privileged operational records.

Troubleshooting principle

Most AD DS issues should be investigated from the lowest dependency layer upward, because a failure in one layer usually causes misleading symptoms in every layer above it:

1. DNS resolution

2. Secure channel

3. Replication

4. FSMO role availability

5. Group Policy processing

6. Event logs and exact failure evidence

DNS should usually be checked first because domain-joined clients and domain controllers depend on DNS SRV records to locate domain controllers and the LDAP, Kerberos and Global Catalog services they host.

Core diagnostic flow

nslookup domain.local
nslookup -type=SRV _ldap._tcp.domain.local
dcdiag /test:dns /v
nltest /sc_query:domain.local
repadmin /showrepl
repadmin /replsummary
netdom query fsmo
gpresult /r
eventvwr.msc
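
The flow above can be sketched as an ordered checklist that stops at the first failing layer. This is a minimal illustration in Python, not a finished tool: the names `CHECKS`, `run_command` and `first_failing_layer` are hypothetical, and treating a zero exit code as "healthy" is a simplifying assumption, since several of these tools exit with 0 even when their output reports errors.

```python
import subprocess
from typing import Callable, Optional

# Layers in dependency order, each paired with a command from the flow above.
# "domain.local" is the guide's placeholder domain name.
CHECKS = [
    ("DNS resolution", "nslookup -type=SRV _ldap._tcp.domain.local"),
    ("DNS health on the domain controller", "dcdiag /test:dns /v"),
    ("Secure channel", "nltest /sc_query:domain.local"),
    ("Replication", "repadmin /replsummary"),
    ("FSMO role availability", "netdom query fsmo"),
    ("Group Policy processing", "gpresult /r"),
]


def run_command(command: str) -> bool:
    """Run one command; success is judged by exit code only (an assumption)."""
    completed = subprocess.run(command, shell=True, capture_output=True, text=True)
    return completed.returncode == 0


def first_failing_layer(runner: Callable[[str], bool] = run_command) -> Optional[str]:
    """Walk the layers in order and return the first one whose check fails."""
    for layer, command in CHECKS:
        if not runner(command):
            return layer
    return None  # every layer passed
```

Passing a different `runner` makes it possible to plug in output parsing later, which is where real triage would read the exact failure text rather than the exit code.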

FSMO roles

FSMO roles are special single-master roles used for operations that cannot safely occur as normal multi-master updates.

Forest-wide roles

Schema Master
Domain Naming Master

Domain-wide roles

PDC Emulator
RID Master
Infrastructure Master

Operational notes

The PDC Emulator is usually the most operationally visible FSMO role. It affects password changes, account lockout behavior, time synchronization and Group Policy editing behavior.

netdom query fsmo

Get-ADDomain | Select-Object PDCEmulator, RIDMaster, InfrastructureMaster
Get-ADForest | Select-Object SchemaMaster, DomainNamingMaster
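
Once the role holders are known, it helps to reason about what an outage of one domain controller would actually affect. The Python sketch below is illustrative only: `FSMO_ROLES` and `impact_of_outage` are hypothetical names, the role keys mirror the property names returned by the two cmdlets above, and the impact descriptions are short summaries rather than a complete list.

```python
from typing import Dict, List

# Scope and typical operational impact of each FSMO role.
FSMO_ROLES = {
    "SchemaMaster": ("forest", "schema updates"),
    "DomainNamingMaster": ("forest", "adding or removing domains in the forest"),
    "PDCEmulator": (
        "domain",
        "password changes, account lockout, time synchronization, Group Policy editing",
    ),
    "RIDMaster": ("domain", "allocation of new RID pools to domain controllers"),
    "InfrastructureMaster": ("domain", "updating cross-domain object references"),
}


def impact_of_outage(role_holders: Dict[str, str], offline_dc: str) -> List[str]:
    """Return one line per FSMO role held by the offline domain controller."""
    lines = []
    for role, holder in role_holders.items():
        if holder.lower() == offline_dc.lower():
            scope, impact = FSMO_ROLES[role]
            lines.append(f"{role} ({scope}-wide): {impact}")
    return lines
```

For example, with a hypothetical `DC1` holding both forest-wide roles, `impact_of_outage(holders, "DC1")` returns two lines, one for `SchemaMaster` and one for `DomainNamingMaster`.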

AD replication

AD DS replication is multi-master for most directory data. Replication issues are commonly caused by DNS failures, network connectivity problems, Kerberos or secure channel problems, excessive time drift or tombstone lifetime issues.

repadmin /replsummary
repadmin /showrepl
repadmin /syncall /AeD
repadmin /queue

Common replication failure causes

1. DNS resolution failure between domain controllers

2. Network connectivity or blocked RPC / LDAP ports

3. Kerberos or secure channel failure

4. Excessive time drift

5. Tombstone lifetime exceeded after a domain controller has been offline too long
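
Causes 4 and 5 are both threshold checks, which the following Python sketch makes explicit. The thresholds are assumptions based on common defaults: Kerberos tolerates 5 minutes of clock skew by default, and forests created on Windows Server 2003 SP1 or later default to a 180-day tombstone lifetime, while older forests may still use 60 days. The actual values should be read from the environment. The function names are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed defaults; confirm the real values in the target forest.
MAX_CLOCK_SKEW = timedelta(minutes=5)
TOMBSTONE_LIFETIME = timedelta(days=180)


def clock_skew_exceeded(dc_time: datetime, reference_time: datetime) -> bool:
    """True when two clocks differ by more than the Kerberos tolerance."""
    return abs(dc_time - reference_time) > MAX_CLOCK_SKEW


def tombstone_lifetime_exceeded(last_successful_replication: datetime, now: datetime) -> bool:
    """True when a DC has not replicated for longer than the tombstone lifetime."""
    return now - last_successful_replication > TOMBSTONE_LIFETIME
```

A domain controller that fails the second check should not simply be reconnected, because it may reintroduce objects that the rest of the forest has already deleted and garbage-collected.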

Secure channel

A secure channel is the trusted machine-account relationship between a domain-joined computer and the domain.

Common symptom:

The trust relationship between this workstation and the primary domain failed.

Useful checks:

nltest /sc_query:domain.local
nltest /sc_reset:domain.local

Test-ComputerSecureChannel -Verbose
Test-ComputerSecureChannel -Repair

Typical cause: a machine has been restored from an old snapshot or backup and its local machine-account password no longer matches the password stored in Active Directory.

Domain controller health

Domain controller health should be checked with both high-level and targeted diagnostics.

dcdiag /v
dcdiag /test:dns /v
dcdiag /s:DC1 /v

Important areas:

DNS SRV records are critical for domain controller discovery:

nslookup -type=SRV _ldap._tcp.domain.local
nslookup -type=SRV _kerberos._tcp.domain.local
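
The two queries above cover only part of the locator records a domain controller registers. As a rough aid, the Python sketch below builds a slightly fuller list of SRV names for a given domain. The helper names are hypothetical, the list is a commonly checked subset rather than the full set of registered records, and `forest_root` falling back to the domain name is an assumption that only holds for a single-domain forest.

```python
from typing import List


def srv_record_names(domain: str, forest_root: str = "") -> List[str]:
    """Return commonly checked AD DS SRV record names for a domain."""
    domain = domain.strip(".").lower()
    # Global Catalog records are registered under the forest root domain.
    forest = (forest_root or domain).strip(".").lower()
    return [
        f"_ldap._tcp.{domain}",
        f"_kerberos._tcp.{domain}",
        f"_ldap._tcp.dc._msdcs.{domain}",   # domain controllers only
        f"_ldap._tcp.pdc._msdcs.{domain}",  # PDC Emulator
        f"_gc._tcp.{forest}",               # Global Catalog servers
    ]


def nslookup_commands(domain: str, forest_root: str = "") -> List[str]:
    """Turn the record names into nslookup command lines."""
    return [f"nslookup -type=SRV {name}" for name in srv_record_names(domain, forest_root)]
```

Calling `nslookup_commands("domain.local")` yields one command line per record, starting with the same `_ldap._tcp.domain.local` query used in this guide.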

Group Policy troubleshooting

Group Policy troubleshooting should verify both policy processing and targeting.

gpupdate /force
gpresult /r
gpresult /h report.html

Common causes of GPO issues:

Group Policy processing order

Local policy
Site
Domain
OU

Later processing usually wins, except where Enforced and Block Inheritance change the normal behavior.
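
The "later wins, unless Enforced" rule can be modelled in a few lines. The Python sketch below is a deliberately simplified model, not a reimplementation of the Group Policy engine: it ignores Block Inheritance, security filtering, WMI filters and loopback processing, and it assumes that GPOs at the same level are passed in their link order. `Gpo` and `resultant_settings` are hypothetical names.

```python
from typing import Dict, List, NamedTuple

LEVEL_ORDER = ["Local", "Site", "Domain", "OU"]


class Gpo(NamedTuple):
    name: str
    level: str                 # one of LEVEL_ORDER
    settings: Dict[str, str]
    enforced: bool = False


def resultant_settings(gpos: List[Gpo]) -> Dict[str, str]:
    """Resolve conflicting settings using the Local, Site, Domain, OU order."""
    # sorted() is stable, so GPOs at the same level keep their input order.
    ordered = sorted(gpos, key=lambda g: LEVEL_ORDER.index(g.level))
    result: Dict[str, str] = {}
    # Normal pass: each later level overwrites earlier ones.
    for gpo in ordered:
        if not gpo.enforced:
            result.update(gpo.settings)
    # Enforced pass: applied afterwards from the lowest level up, so an
    # enforced GPO higher in the hierarchy has the final word.
    for gpo in reversed(ordered):
        if gpo.enforced:
            result.update(gpo.settings)
    return result
```

With a domain-level GPO setting a 15-minute screen lock and an OU-level GPO setting 5 minutes, the OU value wins by default, and the domain value wins once the domain-level link is marked enforced.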

Fine-Grained Password Policy

Fine-Grained Password Policies are applied to users or global security groups, not directly to OUs.

Get-ADFineGrainedPasswordPolicy -Filter *
Get-ADUserResultantPasswordPolicy username

Key rule:

Lower precedence number = higher priority.

If an OU-level targeting model is needed, use a group-based approach instead of trying to link a Password Settings Object directly to an OU.
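
To make the precedence rule concrete, the Python sketch below picks the resultant Password Settings Object for a user. It assumes the commonly documented behavior: a PSO linked directly to the user beats any PSO inherited through a group, the lowest precedence number wins within each set, and ties fall to the smallest GUID. Comparing GUIDs as plain strings is a simplification, and `Pso`, `resultant_pso` and all account and group names are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple, Optional


class Pso(NamedTuple):
    name: str
    precedence: int
    guid: str
    applies_to: FrozenSet[str]  # user or global security group names


def resultant_pso(user: str, user_groups: List[str], psos: List[Pso]) -> Optional[Pso]:
    """Return the PSO that would apply to the user, or None if no PSO applies."""
    direct = [p for p in psos if user in p.applies_to]
    via_group = [p for p in psos if p.applies_to & set(user_groups)]
    # Directly linked PSOs take priority over group-linked ones.
    candidates = direct or via_group
    if not candidates:
        return None  # the domain password policy applies instead
    # Lower precedence number = higher priority; GUID breaks ties.
    return min(candidates, key=lambda p: (p.precedence, p.guid))
```

In a real environment, `Get-ADUserResultantPasswordPolicy` remains the authoritative answer. The sketch only explains why a given PSO was selected.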

FSMO seizure

FSMO seizure is a last-resort recovery operation when the original role holder will not return.

Modern PowerShell approach:

Move-ADDirectoryServerOperationMasterRole `
  -Identity "DC2" `
  -OperationMasterRole SchemaMaster,RIDMaster,PDCEmulator,InfrastructureMaster,DomainNamingMaster `
  -Force

Important warning: if the original role holder comes back online after its roles have been seized, it must not be allowed to rejoin normally. Metadata cleanup and a controlled recovery, typically demoting or rebuilding that server, are required.

Event logs

Important logs:

Common event IDs to recognize:

Event ID      Area                  Meaning
5719          Netlogon              Secure channel / domain controller communication issue
1311          Directory Service     Replication topology inconsistency (KCC)
13568         FRS (NtFrs)           SYSVOL replication journal wrap issue
4768 / 4769   Security / Kerberos   Kerberos authentication events (TGT and service ticket requests)

Root cause priority

Common AD DS root causes in practical troubleshooting order:

1. DNS misconfiguration or missing SRV records

2. Replication failure caused by DNS or network issues

3. Broken secure channel

4. Incorrect Fine-Grained Password Policy targeting

5. Unavailable or misplaced FSMO role holder

6. Group Policy targeting, inheritance or refresh issue

Summary

Active Directory troubleshooting is dependency-driven. DNS, secure channel and replication must be validated before higher-level components such as Group Policy or password policy behavior can be trusted.