A Failover Cluster Instance (FCI) is a high availability (HA) solution in SQL Server that provides automatic failover support for the entire SQL Server instance. It is built on Windows Server Failover Clustering (WSFC) and uses shared storage that is accessible by all cluster nodes—but only one node accesses the storage at a time.
What is an FCI?
In an FCI configuration, two or more Windows servers (nodes) are connected to a shared storage system. At any given time, only one node owns and runs the SQL Server instance, while the others remain on standby to take over in case of a failure.
Unlike Availability Groups, which provide database-level replication, an FCI protects the entire SQL Server instance. This includes user databases, system databases, SQL Server Agent jobs, linked servers, and instance-level configurations.
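Because protection is scoped to the whole instance, a quick way to confirm you are talking to an FCI is the documented IsClustered server property. Below is a minimal sketch in Python, assuming the pyodbc package and ODBC Driver 18 are installed; the server name `sqlfci-vnn` is a hypothetical placeholder:

```python
import pyodbc

# Hypothetical connection details; adjust the driver and server for your environment.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=sqlfci-vnn;DATABASE=master;"
    "Trusted_Connection=yes;TrustServerCertificate=yes;"
)

# SERVERPROPERTY('IsClustered') returns 1 when the instance is a
# failover cluster instance, 0 otherwise.
is_fci = conn.execute(
    "SELECT CAST(SERVERPROPERTY('IsClustered') AS int)"
).fetchone()[0]
print("Failover Cluster Instance:", bool(is_fci))
```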
Key Components of FCI
- Windows Server Failover Clustering (WSFC): Provides health detection, failover orchestration, and resource coordination.
- Shared Storage: Typically a SAN or shared disk that stores the data and log files, accessible by all nodes but attached to only one node at a time.
- Cluster Resource Group: A set of cluster-managed resources including SQL Server services, shared volumes, and network names/IPs.
- Virtual Network Name/IP: A consistent endpoint used by client applications, which automatically follows the active node after failover.
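Client applications should target the virtual network name rather than a physical node name, so their configuration survives a failover unchanged. A minimal sketch, again assuming pyodbc and the hypothetical virtual network name `sqlfci-vnn`:

```python
import pyodbc

# Connect to the FCI's virtual network name, not a physical node name.
# After a failover, this same name follows the new active node.
CONN_STR = (
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=sqlfci-vnn;"  # hypothetical virtual network name
    "DATABASE=master;"
    "Trusted_Connection=yes;TrustServerCertificate=yes;"
)

conn = pyodbc.connect(CONN_STR, timeout=15)
# For a clustered default instance, @@SERVERNAME reports the virtual
# server name rather than the physical host.
print(conn.execute("SELECT @@SERVERNAME").fetchone()[0])
```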
How FCI Works
- SQL Server is installed as a clustered instance across all participating nodes.
- The instance is active on only one node, which owns the SQL Server service and mounts the shared storage.
- When a failure is detected, WSFC initiates a failover to another node.
- The new node takes ownership of the shared disk and starts the SQL Server service.
- The virtual network name and IP are moved to the new active node.
- Clients reconnect using the same virtual network name and IP, with no connection-string changes; in-flight connections are dropped during the failover and must be retried, as the sketch after this list shows.
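Since the service needs time to start on the new node, client code typically wraps the connection in a retry loop against the same virtual name. This is a minimal sketch, assuming pyodbc; the server name, attempt count, and delay are placeholders, not recommendations:

```python
import time
import pyodbc

CONN_STR = (
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=sqlfci-vnn;DATABASE=master;"  # same virtual name before and after failover
    "Trusted_Connection=yes;TrustServerCertificate=yes;"
)

def connect_with_retry(attempts: int = 10, delay_s: float = 5.0) -> pyodbc.Connection:
    """Retry until the instance finishes starting on the new node."""
    last_error = None
    for _ in range(attempts):
        try:
            return pyodbc.connect(CONN_STR, timeout=10)
        except pyodbc.Error as exc:  # login fails while the service is still starting
            last_error = exc
            time.sleep(delay_s)
    raise last_error

conn = connect_with_retry()
# Show which physical node now hosts the instance.
print("Active node:", conn.execute(
    "SELECT CAST(SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS sysname)"
).fetchone()[0])
```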
Diagram: How FCI Operates
The diagram below illustrates a typical two-node FCI with shared storage:
```
+------------------+        +------------------+
|      Node 1      |        |      Node 2      |
|  (Active Node)   |        |  (Passive Node)  |
+------------------+        +------------------+
         |                           |
         |                           |
+---------------------------------------------+
|               Shared Storage                |
|      (Database and Log Files on SAN)        |
+---------------------------------------------+
                      |
                      V
        +--------------------------+
        | Virtual Network Name/IP  |
        | (Moves with active node) |
        +--------------------------+
```

If Node 1 fails → Node 2 becomes active, mounts storage, resumes SQL Server.
Benefits of FCI
- Provides instance-level protection, including system databases and SQL Server Agent jobs
- No need for individual database configuration or replication setup
- Transparent failover using a consistent virtual network name and IP address
- No synchronization overhead: there are no secondary replicas to keep in sync, unlike Availability Groups
- A single copy of the data on shared storage stays consistent by design, with no data replication to configure or monitor
Limitations of FCI
- Requires shared storage infrastructure such as SAN or Storage Spaces Direct
- No support for readable secondaries—only one node is active at a time
- Not ideal for geographically distributed or cloud-native deployments
- Failover is not instantaneous: the SQL Server service must start on the new node and databases must complete crash recovery before accepting connections
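When planning around failover time, it helps to see node state from the instance's point of view. On SQL Server 2012 and later, the sys.dm_os_cluster_nodes DMV exposes one row per cluster node, including which node currently owns the instance. A minimal sketch, assuming pyodbc and the hypothetical virtual name used earlier:

```python
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=sqlfci-vnn;DATABASE=master;"
    "Trusted_Connection=yes;TrustServerCertificate=yes;"
)

# One row per cluster node; is_current_owner marks the active node.
for row in conn.execute(
    "SELECT NodeName, status_description, is_current_owner "
    "FROM sys.dm_os_cluster_nodes"
):
    role = "ACTIVE" if row.is_current_owner else "standby"
    print(f"{row.NodeName}: {row.status_description} ({role})")
```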
FCI vs. Availability Groups
It’s important to understand the key differences between FCIs and Availability Groups (AGs):
| Feature | Failover Cluster Instance (FCI) | Availability Groups (AGs) |
|---|---|---|
| Protection Scope | Entire SQL Server instance | Individual user databases |
| Storage | Shared storage (e.g., SAN) | Local storage per replica |
| Readable Secondaries | No | Yes |
| Quorum Required | Yes (WSFC) | Yes (WSFC) |
| Use Case | Instance-level redundancy (system databases, Agent jobs); avoids replica synchronization overhead; common in on-premises deployments with existing shared storage | Database-level HA/DR, read scale-out with readable secondaries, or geographically distributed deployments using local storage |
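If you need to tell which of the two features an existing instance is using on the AG side, the documented IsHadrEnabled server property and the sys.availability_groups catalog view complement the IsClustered check shown earlier. A minimal sketch, assuming pyodbc and a placeholder server name:

```python
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=sqlserver01;DATABASE=master;"  # hypothetical server name
    "Trusted_Connection=yes;TrustServerCertificate=yes;"
)

# IsHadrEnabled = 1 means the Always On Availability Groups feature is
# enabled on the instance; sys.availability_groups lists any AGs defined.
row = conn.execute(
    "SELECT CAST(SERVERPROPERTY('IsHadrEnabled') AS int) AS hadr_enabled, "
    "       (SELECT COUNT(*) FROM sys.availability_groups) AS ag_count"
).fetchone()
print("Availability Groups feature enabled:", bool(row.hadr_enabled))
print("Availability groups defined:", row.ag_count)
```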