Tuesday, February 2, 2021

SQL SERVER ALWAYSON AVAILABILITY GROUP COMMON ERRORS OR FAILURES


AlwaysON Availability Groups has been introduced with SQL Server 2012 and has been a very popular and most used feature as it provides both High Availability (HA) and Disaster Recovery (DR) solutions. As it is a new feature, there are many issues which DBA’s face while configuring AlwaysON Availability Groups.

Below are some of the common errors or failures related to SQL Server AlwaysON Availability Groups.

Checking for compatibility of the database file location on the secondary replica resulted in an error. (Microsoft.SqlServer.Management.HadrTasks)
The following folder locations do not exist on the server instance that hosts secondary replica Node1\AGTest: S:\MSSQL11.AG1\SQL_DATA; Microsoft.SqlServer.Management.HadrTasks)

– This error occurs if the drive letters or the folder path does not match between primary and secondary replica.
– To resolve this error, make sure that same drive letter and folder path exists on both the servers or perform manual synchronization of secondary server databases using backup/restore.

Failed to create, join or add replica to availability group ‘AGTest’, because node ‘Node1’ is a possible owner for both replica ‘AGTest\AGTest’ and ‘Node1\AGTest1’. If one replica is failover cluster instance, remove the overlapped node from its possible owners and try again. (Microsoft SQL Server, Error: 19405)
– This error occurs, if SQL Server instance where you are setting up AlwaysON Availability Group is part of SQL Server failover clustering and has both nodes as possible owners and then You tried to add another availability Group for another SQL cluster instance involving these nodes.
– To resolve this error, either follow the solution suggested in the error message “If one replica is failover cluster instance, remove the overlapped node from its possible owners and try again” or instead of using cluster SQL instances, just use standalone instances.

Error while trying to connect to AlwaysON Availability group using Listener Name
A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 – Could not open a connection to SQL Server) (Microsoft SQL Server, Error: 2)

– This error occurs if the port number of the Listener is not 1433. To resolve this issue either use the port number along with the listener name like SQLAGLSTNR,1467 or create an alias.
– Make sure listener port is open to communicate. Test using telnet listenername portnumber
– Communication only works when using TCP network protocol.

Database Mirroring login attempt by user ‘Domain\ComputerName$.’ failed with error:
‘Connection handshake failed. The login ‘Domain\ComputerName$’ does not have CONNECT permission on the endpoint. State 84.’

– This error will be logged in SQL errorlog when trying to setup AlwaysON or Database Mirroring. This error occurs if the SQL Server services are running under local system accounts or if the SQL Service account does not have connect permission on the endpoint.
– To resolve the error, Change the SQL Server services to run under a domain account and then grant connect permission on endpoint to SQL Server service account.
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [Domain\SQLServiceAccount]
GO

Summary for the replica hosted by ServerName\INSTANCE2 Replica mode: Asynchronous commit
This replica will use asynchronous-commit availability mode and support only forced failover
(with possible data loss).
Note: This is a Failover Cluster Instance. Failover Cluster Instances do not support AlwaysOn automatic failover.

– This error occurs, if you are using SQL clustered instances to setup AlwaysON Availability Group.
– Cluster SQL Server instance will not support the automatic failover in availability groups. If you want automatic failover, install SQL server instance as standalone instance.

Cluster network name resource ‘TestAG’ failed to create its associated computer object in domain ‘testdomain.com’ during: Resource online.The text for the associated error code is: A constraint violation occurred.Please work with your domain administrator to ensure that:
The cluster identity ‘SQLClus$’ has Create Computer Objects permissions. By default all computer objects are created in the same container as the cluster identity ‘SQLClus$’.The quota for computer objects has not been reached. If there is an existing computer object, verify the Cluster Identity ‘SQLClus$’ has ‘Full Control’ permission to that computer object using the Active Directory Users and Computers tool.

– This error occurs if Cluster Name Object does not have rights to create a new Virtual Network Name Object. Refer below article for instructions to prestage the Virtual Name Object.
http://technet.microsoft.com/en-us/library/dn466519.aspx

Joining database on secondary replica resulted in an error.
(Microsoft.SqlServer.Management.HadrTasks)
Failed to join the database ‘AGTest’ to the availability group ‘AGGrpTest′ on the
availability replica ‘Replica2′. (Microsoft.SqlServer.Smo)
The connection to the primary replica is not active. The command cannot be processed.
(Microsoft SQL Server, Error: 35250 Level 16, State 7)

– Try below steps to fix the error
o Make sure that the alwaysON endpoint [Hadr_endpoint] is not blocked by firewall
o Verify and make sure that SQL Server service account of primary server is added as a login on all the secondary servers and vice-versa.
o If SQL Server service accunt is “Nt service\” or local system account then ensure system account (Domainname\systemname$) of each replica is added as a login to other replicas.
CREATE LOGIN Domain\replica2$] FROM WINDOWS
o Grant connect permission on alwaysON endpoints on each replicas for SQL Server service account of all other replicas On Secondary replica run below query by changing the domain and replica server names
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [Domain\replica1$]
On primary replica run below query by changing the domain and replica server names
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [Domain\replica2$]
o Make sure SQL Server name (select @@servername) matches with hostname of the server.
o Make sure cluster service startup account is added as SQL Server login.

Error while trying to setup AlwaysOn availability group
Operating System Error 1265(The system cannot contact a domain controller to service the authentication request. Please try again later.).

– This error can occur if the password of inter-domain trust account is not synchronized on both sides of the trust relationship.
– Refer below link to see if it helps fixing the error, else engage your domain Administrator.
http://support.microsoft.com/kb/816577/en-us

This is applicable on below versions of SQL Server

SQL Server 2012
SQL Server 2014

Hope this was helpful.

4 comments:

COMMON SQL SERVER BACKUP FAILURE ERRORS AND ISSUES

  One of the most common task for a DBA’s are to perform installation of new SQL Server versions and installing patches. Most often or not e...