Tuesday, February 2, 2021

COMMON SQL SERVER BACKUP FAILURE ERRORS AND ISSUES

 

One of the most common task for a DBA’s are to perform installation of new SQL Server versions and installing patches. Most often or not everything goes just smooth, but there will be many occasions we end up with failures with various errors and it will be difficult to find the cause and the solution of the failure.

 Below are some of the common errors and their resolutions which can cause the SQL Server setup failure.

SQL Server 2008 R2 setup failed with “Error 1406. Could not write value to key \SOFTWARE. System error. Verify that you have sufficient access to that key, or contact your support personnel”
– Run setup as Administrator
– Make sure Administrator has full control on the registry key reported in the error.
– This issue can happen, if there are any kind of virus/adware/malware/Addons on the system. Scan the system and fix any such issues.
– There is a connect bug raised for the same, please check of there are any latest updates.
https://connect.microsoft.com/SQLServer/feedback/details/998266/could-not-write-value-to-key-software-verify-that-you-have-sufficient-access-to-that-key-or-contact-your-support-personnel

 Error 997. Overlapped I/O operation is in progress during SQL Server installation
– Check the account with which you logged on to the server, make sure it is not a account with temporary profile.
– Run the Setup locally on the server and make sure you run the setup, by right-clicking on the Setup file and choose Run-As-Administrator.
– Disable any Anti-virus and reboot the server.
– Check for any problems with the disk drive where you are installing SQL Server files.

Invalid license data. Reinstall is required Error while trying to connect to SQL Server Management Studio (SSMS)
– Try to repair the SQL Server shared components(If you cannot launch the setup from control panel, use the Setup.exe from setup media.
– After repair, if it SSMS still fail with the same error, then uninstall Visual Studio 2010 or 2012 and then attemt the repar of the shared components again.

Errors while trying run SQL Server 2008 R2 setup to add second SQL node.
Microsoft.SqlServer.Configuration.Agent.InputValidationException: The credentials you provided for the SQL Server Agent service are invalid. To continue, provide a valid account and password for the SQL Server Agent service.

There are various reasons why this error can occur, below are some of the common causes.
– SQL Server installation on the first node may have not been completely successful and may have not created the SQL Server Agent cluster resouce. Check cluster manager and verifiy, if both the SQL Server and SQL Agent cluster resources got created. If SQL Agent resource is not created, then repair the installation on first node and fix any errors returned by the setup on first node.
– Another cause of this error could be that the SQL Server Agent resouce may be in failed or OFFLINE state. Bring it online on first node, before you again attempt the ADD NODE on second server.
– Another option is try and run the setup from command line. Run the below setup from an elevated command prompt on second server.

setup.exe /ACTION=AddNode /INSTANCENAME="" /SQLSVCACCOUNT="" /SQLSVCPASSWORD="" /AGTSVCACCOUNT="" /AGTSVCPASSWORD=""

Setup failed while trying to Add node at the initial steps in the “add Node Rules” setup wizard and returned below error.
Rule Check Result
---------------------------
Rule "SQL Server Database Services feature state" failed.
The SQL Server Database Services feature failed when it was initially installed. The feature must be removed before the current scenario can proceed.

– This error would mostly appear, if there was a failure with the SQL Server installation on the first node. Look into the setup logs on first node and fix the issue and run repair on the first node and make sure SQL Server and SQL Agent resources are online on the first node.
– If after fixing the problems with the first node, still if you are seeing this error on the second node, then follow the below steps.

o Change the below registry key on first node and the second node

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10.MSSQLServer\ConfigurationState
Change all 4 values from 2 to 1

o Run repair on the first node, make sure it complets successfully.
o Run the Add node operation on the second node.

Rule “Setup account privileges” failed.
The account that is running SQL Server Setup does not have one or all of the following rights: the right to back up files and directories, the right to manage auditing and the security log and the right to debug programs. To continue, use an account with both of these rights.

– Ensure that the account which you are using to launch the setup is local administrator on the server.
– Make sure you run the setup by right-clicking and choose Run-as-Administrator.
– Add the account which you are running the setup to below local security policies. Click Start -> Run -> Type secpol.msc -> Select option “Local Security Policy” -> Choose “Local Policies” -> then on right side, go to properties of each of the below mentioned policies and add the account which is used to run the SQL setup.
o Backup files and directories
o Debug Programs
o Manage auditing and security log

Error Code: 1067
MSI (s) (5C!6C) [14:50:06:117]: Product: Microsoft SQL Server 2005 — Error 29503. The SQL Server service failed to start. For more information, see the SQL Server Books Online topics, “How to: View SQL Server 2005 Setup Log Files” and “Starting SQL Server Manually.”
The error is (1067) The process terminated unexpectedly..Error 29503. The SQL Server service failed to start. For more information, see the SQL Server Books Online topics, “How to: View SQL Server 2005 Setup Log Files” and “Starting SQL Server Manually.”

– Above errors occur when you try to install SQL Server on a server which has ratio of number or processors between logical processors is not a power of 2. Example, on a computer which has a one socket with 6 logical processors which is not a power of 2.
– The resolution to this problem is as mentioned in the below KB article
http://support.microsoft.com/kb/954835/en-us

Could not fix registry key HKEY_LOCAL_MACHINE\Software\Microsoft\Microsoft SQL Server\MICROSOFTBCM\MSSQLServer\SuperSocketNetLib\Lpc.The SQL Server registry keys from a prior installation cannot be modified. To continue, see SQL Server Setup documentation about how to fix registry keys.
– Setup may fail complaining about a particular registry key, in which case, cleanup any previous failed installation of SQL Server, grant full permission to the mentioned registry key to your account and to SQL service account and then rerun the setup by choosing Rus-as-Administrator.
– There could be variations of this error like after uninstalling SQL Server and then when you attempt to reinstall again, the following error occurs. 
“Rule ‘Consistency validation for SQL Server registry keys’ failed”. The SQL Server registry keys from a prior installation cannot be modified. Further in the logs, you may notice error “File C:\TEMP\en_sql_server_2008_r2_standard_x86_x64_ia64_dvd_521546\x64\FixSqlRegistryKey_x86.exe does not exist!”
– To fix the above error, download the new setup media, as the existing one appears corrupt or not complete. Another option is to copy the file “FixSqlRegistryKey_x86.exe” into x64 folder from another setup media which has this file in x86 to x64 folders of setup media.

There are many other errors which cause SQL Server setup failure and will try to add as many as possible. Send me any issues you encountered, so that I can add them, which will be helpful for others.

This is applicable on below versions of SQL Server

SQL Server 2005
SQL Server 2008 R2
SQL Server 2012
SQL Server 2014

Hope this was helpful.

 

SQL SERVER ALWAYSON AVAILABILITY GROUP COMMON ERRORS OR FAILURES


AlwaysON Availability Groups has been introduced with SQL Server 2012 and has been a very popular and most used feature as it provides both High Availability (HA) and Disaster Recovery (DR) solutions. As it is a new feature, there are many issues which DBA’s face while configuring AlwaysON Availability Groups.

Below are some of the common errors or failures related to SQL Server AlwaysON Availability Groups.

Checking for compatibility of the database file location on the secondary replica resulted in an error. (Microsoft.SqlServer.Management.HadrTasks)
The following folder locations do not exist on the server instance that hosts secondary replica Node1\AGTest: S:\MSSQL11.AG1\SQL_DATA; Microsoft.SqlServer.Management.HadrTasks)

– This error occurs if the drive letters or the folder path does not match between primary and secondary replica.
– To resolve this error, make sure that same drive letter and folder path exists on both the servers or perform manual synchronization of secondary server databases using backup/restore.

Failed to create, join or add replica to availability group ‘AGTest’, because node ‘Node1’ is a possible owner for both replica ‘AGTest\AGTest’ and ‘Node1\AGTest1’. If one replica is failover cluster instance, remove the overlapped node from its possible owners and try again. (Microsoft SQL Server, Error: 19405)
– This error occurs, if SQL Server instance where you are setting up AlwaysON Availability Group is part of SQL Server failover clustering and has both nodes as possible owners and then You tried to add another availability Group for another SQL cluster instance involving these nodes.
– To resolve this error, either follow the solution suggested in the error message “If one replica is failover cluster instance, remove the overlapped node from its possible owners and try again” or instead of using cluster SQL instances, just use standalone instances.

Error while trying to connect to AlwaysON Availability group using Listener Name
A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 – Could not open a connection to SQL Server) (Microsoft SQL Server, Error: 2)

– This error occurs if the port number of the Listener is not 1433. To resolve this issue either use the port number along with the listener name like SQLAGLSTNR,1467 or create an alias.
– Make sure listener port is open to communicate. Test using telnet listenername portnumber
– Communication only works when using TCP network protocol.

Database Mirroring login attempt by user ‘Domain\ComputerName$.’ failed with error:
‘Connection handshake failed. The login ‘Domain\ComputerName$’ does not have CONNECT permission on the endpoint. State 84.’

– This error will be logged in SQL errorlog when trying to setup AlwaysON or Database Mirroring. This error occurs if the SQL Server services are running under local system accounts or if the SQL Service account does not have connect permission on the endpoint.
– To resolve the error, Change the SQL Server services to run under a domain account and then grant connect permission on endpoint to SQL Server service account.
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [Domain\SQLServiceAccount]
GO

Summary for the replica hosted by ServerName\INSTANCE2 Replica mode: Asynchronous commit
This replica will use asynchronous-commit availability mode and support only forced failover
(with possible data loss).
Note: This is a Failover Cluster Instance. Failover Cluster Instances do not support AlwaysOn automatic failover.

– This error occurs, if you are using SQL clustered instances to setup AlwaysON Availability Group.
– Cluster SQL Server instance will not support the automatic failover in availability groups. If you want automatic failover, install SQL server instance as standalone instance.

Cluster network name resource ‘TestAG’ failed to create its associated computer object in domain ‘testdomain.com’ during: Resource online.The text for the associated error code is: A constraint violation occurred.Please work with your domain administrator to ensure that:
The cluster identity ‘SQLClus$’ has Create Computer Objects permissions. By default all computer objects are created in the same container as the cluster identity ‘SQLClus$’.The quota for computer objects has not been reached. If there is an existing computer object, verify the Cluster Identity ‘SQLClus$’ has ‘Full Control’ permission to that computer object using the Active Directory Users and Computers tool.

– This error occurs if Cluster Name Object does not have rights to create a new Virtual Network Name Object. Refer below article for instructions to prestage the Virtual Name Object.
http://technet.microsoft.com/en-us/library/dn466519.aspx

Joining database on secondary replica resulted in an error.
(Microsoft.SqlServer.Management.HadrTasks)
Failed to join the database ‘AGTest’ to the availability group ‘AGGrpTest′ on the
availability replica ‘Replica2′. (Microsoft.SqlServer.Smo)
The connection to the primary replica is not active. The command cannot be processed.
(Microsoft SQL Server, Error: 35250 Level 16, State 7)

– Try below steps to fix the error
o Make sure that the alwaysON endpoint [Hadr_endpoint] is not blocked by firewall
o Verify and make sure that SQL Server service account of primary server is added as a login on all the secondary servers and vice-versa.
o If SQL Server service accunt is “Nt service\” or local system account then ensure system account (Domainname\systemname$) of each replica is added as a login to other replicas.
CREATE LOGIN Domain\replica2$] FROM WINDOWS
o Grant connect permission on alwaysON endpoints on each replicas for SQL Server service account of all other replicas On Secondary replica run below query by changing the domain and replica server names
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [Domain\replica1$]
On primary replica run below query by changing the domain and replica server names
GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [Domain\replica2$]
o Make sure SQL Server name (select @@servername) matches with hostname of the server.
o Make sure cluster service startup account is added as SQL Server login.

Error while trying to setup AlwaysOn availability group
Operating System Error 1265(The system cannot contact a domain controller to service the authentication request. Please try again later.).

– This error can occur if the password of inter-domain trust account is not synchronized on both sides of the trust relationship.
– Refer below link to see if it helps fixing the error, else engage your domain Administrator.
http://support.microsoft.com/kb/816577/en-us

This is applicable on below versions of SQL Server

SQL Server 2012
SQL Server 2014

Hope this was helpful.

Wednesday, September 16, 2020

Automate refresh of a SQL Server database that is part of an Availability Group

 

 we cannot restore a database that is part of availability group. We need to first remove the SQL Server database from the Availability Group in order to restore the database. In this tip we look at how we can automate this process to ensure that AlwaysOn is re-established after the restore without manual intervention.

First we walk through the steps that need to take place and then talk about how this can be automated using jobs.

Remove SQL Server Database From AlwaysOn Availability Group on Primary

The first step is to remove the database from the Availability Group. We can achieve this using the below command.

Note: you would need to replace the <Availability Group> and <Database Name> with your values.

-- runs on primary server

USE master 
GO
 
ALTER AVAILABILITY GROUP [<Availability Group>] REMOVE DATABASE [<Database Name>]; 
GO 

Restore SQL Server Database on Primary

The next step would be to restore the backup to refresh the database on your primary server. You can also add additional steps for any specific requirements to run after the restore such as granting permissions, etc.

This is the backup that is coming from the other server that will be used to refresh the database on the primary server.

-- runs on primary server

USE master 
GO
 
RESTORE DATABASE [<Database Name>] 
FROM DISK='<Shared Network Location>\<Database Name>.bak' WITH REPLACE 
GO

Ensure Restored SQL Server Database is in Full Recovery on Primary

We need to ensure the database recovery mode is set to FULL in order to meet AlwaysOn requirements.

-- runs on primary server

USE master 
GO
 
ALTER DATABASE [<Database Name>] SET RECOVERY FULL WITH NO_WAIT 
GO 

Backup SQL Server Database and Log on Primary

Next you need to take a full backup and a transaction log backup of this restored database.  To make this easier for the restore you should create the backups on a network share which is accessible from both the PRIMARY and SECONDARY servers. This will avoid having to copy the backup from the PRIMARY to the SECONDARY and will also save time by eliminating the copy step.

-- runs on primary server

BACKUP DATABASE [<database Name>] 
TO DISK='<Shared Network Location>\<Database Name>.bak' WITH FORMAT, INIT, COMPRESSION 
GO
 
BACKUP LOG [<database Name>] 
TO DISK='<Shared Network Location>\<Database Name>.trn' WITH FORMAT, INIT, COMPRESSION  
GO 

Add SQL Server Database to Availability Group on Primary

Next we need to add the database back to the Availability Group.


-- runs on primary server

USE master 
GO
 
ALTER AVAILABILITY GROUP [<Availability Group>] ADD DATABASE [<Database Name>];  
GO

Now you are done with all the steps on the PRIMARY server.

Restore SQL Server Database on Secondary

Next we need to restore the full backup and log backup on the secondary server. 

-- runs on secondary server

USE master 
GO
 
RESTORE DATABASE [<database Name>] 
FROM DISK='<Shared Network Location>\<Database Name>.bak' WITH FILE=1, REPLACE, NORECOVERY 
GO
 
RESTORE LOG [<database Name>] 
FROM DISK='<Shared Network Location>\<Database Name>.trn' WITH FILE=1, REPLACE, NORECOVERY  
GO 

Check Status of SQL Server Database and then Add to Availability Group on Secondary

The next step is important, this enables data synchronization after the database has been restored and is ready to join the Availability Group.

-- runs on secondary server

-- Wait for the replica to start communicating 
begin try 
   declare @conn bit 
   declare @count int 
   declare @replica_id uniqueidentifier 
   declare @group_id uniqueidentifier 
   set @conn = 0 
   set @count = 30 -- wait for 5 minutes 
  
   if (serverproperty('IsHadrEnabled') = 1) 
      and (isnull((select member_state 
                   from master.sys.dm_hadr_cluster_members 
                   where upper(member_name COLLATE Latin1_General_CI_AS) = upper(cast(serverproperty('ComputerNamePhysicalNetBIOS') as nvarchar(256)) COLLATE Latin1_General_CI_AS)), 0) <> 0) 
      and (isnull((select state 
                   from master.sys.database_mirroring_endpoints), 1) = 0) 
   begin 
      select @group_id = ags.group_id 
      from master.sys.availability_groups as ags 
      where name = N'<Availability Group>' 

      select @replica_id = replicas.replica_id 
      from master.sys.availability_replicas as replicas 
      where upper(replicas.replica_server_name COLLATE Latin1_General_CI_AS) = upper(@@SERVERNAME COLLATE Latin1_General_CI_AS) and group_id = @group_id 

      while @conn <> 1 and @count > 0 
      begin 
         set @conn = isnull((select connected_state from master.sys.dm_hadr_availability_replica_states as states where states.replica_id = @replica_id), 1) 
         if @conn = 1 
         begin 
            -- exit loop when the replica is connected, or if the query cannot find the replica status 
            break 
         end 
         waitfor delay '00:00:10' 
         set @count = @count - 1 
      end 
   end 
end try 

begin catch 
   -- If the wait loop fails, do not stop execution of the alter database statement 
end catch 

ALTER DATABASE [<Database Name>] SET HADR AVAILABILITY GROUP = [<Availability Group>]; 
GO 

SQL Server Agent Jobs

To automate this process we can create SQL Server Agent jobs on both servers:

  1. Primary Server - Create one job that does the following:
    1. Removes database from Availability Group
    2. Restores the database
    3. Sets recovery model of database to FULL
    4. Adds database to Availability Group
    5. Creates database backup and log backup
    6. optional - runs job on secondary server using sp_start_job (there are other things you would need to setup to do this)
  2. Secondary Server - Create one job that does the following:
    1. Restores database backup and log backup
    2. Checks database status and adds database to Availability Group

If you don't have the primary server start the job on the secondary server you would need to figure out when to schedule this job to run after the job on the primary server completes.

How to Find which user deleted the database in SQL Server

 SQL Server: How to Find which user deleted the database in SQL Server

There are two different methods by which one can easily find who deleted the database in SQL Server. The first method is using builtin Schema Changes History Report. The second method is to load the SQL Server Default Trace into a table to see who deleted the database.

Steps to find who deleted the User database in SQL Server Using SQL Server Schema Changes History Report

1. Open SQL Server Management Studio and Connect to the SQL Server Instance.

2. Right click SQL Server Instance and Select Reports -> Standard Reports -> Schema Changes History as shown in the below snippet.

SQL Server Schema Changes History Report to Find which user deleted the database in SQL Server
SQL Server Schema Changes History Report to Find which user deleted the database in SQL Server

3. This will open up Scheme Changes History report which will have the details about who deleted the SQL Server Database along with the timestamp when the database was deleted. Refer the below snippet for more information.

Steps to Identify who deleted the user database using Using Default Trace Files

The SQL Server Default Trace file gives very useful information to a DBA to understand what is happening on the SQL Server Instance. 

Execute the below query to find the default path of trace file in SQL Server.

SELECT
	 path AS [Default Trace File]
	,max_size AS [Max File Size of Trace File]
	,max_files AS [Max No of Trace Files]
	,start_time AS [Start Time]
	,last_event_time AS [Last Event Time]
FROM sys.traces WHERE is_default = 1
GO
SQL Server Trace File Location

How to Load SQL Server Trace File in SQL Server Table

Execute the below script to load the default trace file content in a temporary table to read the relevant information with respect to who deleted the user database on the instance of SQL Server. If you don’t find the relevant information in the latest trace file then it is recommended to load the data from all the available trace files on the server to explore the information.

USE tempdb
GO

IF OBJECT_ID('dbo.TraceTable', 'U') IS NOT NULL
	DROP TABLE dbo.TraceTable;

SELECT * INTO TraceTable
FROM ::fn_trace_gettable
('G:\Program Files\Microsoft SQL Server\MSSQL10_50.SQL2008R2\MSSQL\Log\log_12.trc', default)
GO

SELECT
	 DatabaseID
	,DatabaseName
	,LoginName
	,HostName
	,ApplicationName
	,StartTime
	,CASE
		WHEN EventClass = 46 THEN 'Database Created'
		WHEN EventClass = 47 THEN 'Database Dropped'
	ELSE 'NONE'
	END AS EventType
FROM tempdb.dbo.TraceTable
	WHERE DatabaseName = 'MyTechMantra'
		AND (EventClass = 46 /* Event Class 46 refers to Object:Created */
			OR EventClass = 47) /* Event Class 47 refers to Object:Deleted */
GO

From the above snippet you could see that the event class 46 represents the database creation time along with the user who created it and event class 47 represents the database deletion time along with the user who deleted the database.

Conclusion

In this article we have seen how easily one can find out who deleted the user database in SQL Server with the help of in built SQL Server Schema Changes History Report.

COMMON SQL SERVER BACKUP FAILURE ERRORS AND ISSUES

  One of the most common task for a DBA’s are to perform installation of new SQL Server versions and installing patches. Most often or not e...