A Failover Cluster Instance (FCI) is a high availability (HA) solution in SQL Server that provides automatic failover support for the entire SQL Server instance. It is built on Windows Server Failover Clustering (WSFC) and uses shared storage that is accessible by all cluster nodes—but only one node accesses the storage at a time.
What is an FCI?
In an FCI configuration, two or more Windows servers (nodes) are connected to a shared storage system. At any given time, only one node owns and runs the SQL Server instance, while the others remain on standby to take over in case of a failure.
Unlike Availability Groups, which provide database-level replication, an FCI protects the entire SQL Server instance. This includes user databases, system databases, SQL Server Agent jobs, linked servers, and instance-level configurations.
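Because protection is scoped to the whole instance, a quick way to confirm you are talking to an FCI is the documented IsClustered server property. Below is a minimal sketch in Python, assuming the pyodbc package and ODBC Driver 18 are installed; the server name `sqlfci-vnn` is a hypothetical placeholder:

```python
import pyodbc

# Hypothetical connection details; adjust the driver and server for your environment.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=sqlfci-vnn;DATABASE=master;"
    "Trusted_Connection=yes;TrustServerCertificate=yes;"
)

# SERVERPROPERTY('IsClustered') returns 1 when the instance is a
# failover cluster instance, 0 otherwise.
is_fci = conn.execute(
    "SELECT CAST(SERVERPROPERTY('IsClustered') AS int)"
).fetchone()[0]
print("Failover Cluster Instance:", bool(is_fci))
```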
Key Components of FCI
- Windows Server Failover Clustering (WSFC): Provides health detection, failover orchestration, and resource coordination.
- Shared Storage: Typically a SAN or shared disk that stores the data and log files, accessible by all nodes but attached to only one node at a time.
- Cluster Resource Group: A set of cluster-managed resources including SQL Server services, shared volumes, and network names/IPs.
- Virtual Network Name/IP: A consistent endpoint used by client applications, which automatically follows the active node after failover.
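Client applications should target the virtual network name rather than a physical node name, so their configuration survives a failover unchanged. A minimal sketch, again assuming pyodbc and the hypothetical virtual network name `sqlfci-vnn`:

```python
import pyodbc

# Connect to the FCI's virtual network name, not a physical node name.
# After a failover, this same name follows the new active node.
CONN_STR = (
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=sqlfci-vnn;"  # hypothetical virtual network name
    "DATABASE=master;"
    "Trusted_Connection=yes;TrustServerCertificate=yes;"
)

conn = pyodbc.connect(CONN_STR, timeout=15)
# For a clustered default instance, @@SERVERNAME reports the virtual
# server name rather than the physical host.
print(conn.execute("SELECT @@SERVERNAME").fetchone()[0])
```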
How FCI Works
- SQL Server is installed as a clustered instance across all participating nodes.
- The instance is active on only one node, which owns the SQL Server service and mounts the shared storage.
- When a failure is detected, WSFC initiates a failover to another node.
- The new node takes ownership of the shared disk and starts the SQL Server service.
- The virtual network name and IP are moved to the new active node.
- Clients reconnect using the same virtual network name and IP, with no connection-string changes; in-flight connections are dropped during the failover and must be retried, as the sketch after this list shows.
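Since the service needs time to start on the new node, client code typically wraps the connection in a retry loop against the same virtual name. This is a minimal sketch, assuming pyodbc; the server name, attempt count, and delay are placeholders, not recommendations:

```python
import time
import pyodbc

CONN_STR = (
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=sqlfci-vnn;DATABASE=master;"  # same virtual name before and after failover
    "Trusted_Connection=yes;TrustServerCertificate=yes;"
)

def connect_with_retry(attempts: int = 10, delay_s: float = 5.0) -> pyodbc.Connection:
    """Retry until the instance finishes starting on the new node."""
    last_error = None
    for _ in range(attempts):
        try:
            return pyodbc.connect(CONN_STR, timeout=10)
        except pyodbc.Error as exc:  # login fails while the service is still starting
            last_error = exc
            time.sleep(delay_s)
    raise last_error

conn = connect_with_retry()
# Show which physical node now hosts the instance.
print("Active node:", conn.execute(
    "SELECT CAST(SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS sysname)"
).fetchone()[0])
```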
Diagram: How FCI Operates
The diagram below illustrates a typical two-node FCI with shared storage:
```
+------------------+        +------------------+
|      Node 1      |        |      Node 2      |
|  (Active Node)   |        |  (Passive Node)  |
+------------------+        +------------------+
         |                           |
         |                           |
+---------------------------------------------+
|               Shared Storage                |
|      (Database and Log Files on SAN)        |
+---------------------------------------------+
                      |
                      V
        +--------------------------+
        | Virtual Network Name/IP  |
        | (Moves with active node) |
        +--------------------------+
```

If Node 1 fails → Node 2 becomes active, mounts storage, resumes SQL Server.
Benefits of FCI
- Provides instance-level protection, including system databases and SQL Server Agent jobs
- No need for individual database configuration or replication setup
- Transparent failover using a consistent virtual network name and IP address
- No synchronization overhead: there are no secondary replicas to keep in sync, unlike Availability Groups
- A single copy of the data on shared storage stays consistent by design, with no data replication to configure or monitor
Limitations of FCI
- Requires shared storage infrastructure such as SAN or Storage Spaces Direct
- No support for readable secondaries—only one node is active at a time
- Not ideal for geographically distributed or cloud-native deployments
- Failover is not instantaneous: the SQL Server service must start on the new node and databases must complete crash recovery before accepting connections
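When planning around failover time, it helps to see node state from the instance's point of view. On SQL Server 2012 and later, the sys.dm_os_cluster_nodes DMV exposes one row per cluster node, including which node currently owns the instance. A minimal sketch, assuming pyodbc and the hypothetical virtual name used earlier:

```python
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=sqlfci-vnn;DATABASE=master;"
    "Trusted_Connection=yes;TrustServerCertificate=yes;"
)

# One row per cluster node; is_current_owner marks the active node.
for row in conn.execute(
    "SELECT NodeName, status_description, is_current_owner "
    "FROM sys.dm_os_cluster_nodes"
):
    role = "ACTIVE" if row.is_current_owner else "standby"
    print(f"{row.NodeName}: {row.status_description} ({role})")
```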
FCI vs. Availability Groups
It’s important to understand the key differences between FCIs and Availability Groups (AGs):
| Feature | Failover Cluster Instance (FCI) | Availability Groups (AGs) |
|---|---|---|
| Protection Scope | Entire SQL Server instance | Individual user databases |
| Storage | Shared storage (e.g., SAN) | Local storage per replica |
| Readable Secondaries | No | Yes |
| Quorum Required | Yes (WSFC) | Yes (WSFC) |
| Use Case | Instance-level redundancy (system databases, Agent jobs); avoids replica synchronization overhead; common in on-premises deployments with existing shared storage | Database-level HA/DR, read scale-out with readable secondaries, or geographically distributed deployments using local storage |
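If you need to tell which of the two features an existing instance is using on the AG side, the documented IsHadrEnabled server property and the sys.availability_groups catalog view complement the IsClustered check shown earlier. A minimal sketch, assuming pyodbc and a placeholder server name:

```python
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=sqlserver01;DATABASE=master;"  # hypothetical server name
    "Trusted_Connection=yes;TrustServerCertificate=yes;"
)

# IsHadrEnabled = 1 means the Always On Availability Groups feature is
# enabled on the instance; sys.availability_groups lists any AGs defined.
row = conn.execute(
    "SELECT CAST(SERVERPROPERTY('IsHadrEnabled') AS int) AS hadr_enabled, "
    "       (SELECT COUNT(*) FROM sys.availability_groups) AS ag_count"
).fetchone()
print("Availability Groups feature enabled:", bool(row.hadr_enabled))
print("Availability groups defined:", row.ag_count)
```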