Thursday, March 12, 2015

common issues and Alerts Configuration in Server

common issues in real time :- 


1. backup failures.
2. job failures.
3. disk space issues
4. log file growth issues.
5. data file growth issues.
6. db  offline and suspect mode issues.
7. cpu and memory  is high issues.
8. Sql server sudden start/stop.
9. performance issues like Blocking and Query slowness and Server slowness.
10. user access issues.
11. High availability issues like log shipping out of sync and cluster fail over and replication issues.

Alerts need to configured in a servers:-


1. disk space alert:- used space>80% --> warning & used space>90% is critical.
2. memory is high
3. cpu is high alerts
4. jobs failures
5. db offline or suspect alerts
6. database which doesn't have any backups.
7. sql server start and stopped.
8. blocking and deadlocks.



RAID Levels and Recommendation for Data and Log Files


RAID Levels:-

RAID 0:- RAID 0 is known as disk striping. This RAID 0  stripes data across disks in the array, means , data will be equally distributed among all the disks.This will  offer better throughput on the read/write operations. However, there is no data protection offered in this RAID level. If one disk fails, the data stored on the disk will be lost.

RAID 1:- RAID 1 is known as disk mirroring. You need a minimum of two disks to form a RAID 1 array. One primary disk is used for read/write operations and the data is replicated to the second disk. This RAID level offers better read performance but slower write performance. The main disadvantage is out of 2 disks only 1 disk used to store the data.
RAID 5:-  RAID 5 will provide good performance throughput and prevent data loss if there is a disk failure. For the database which has high read/write activity, RAID 5 will be a better choice when compared to RAID 0 and RAID 1 in terms of data availability and data protection. For an ideal SQL Server configuration, you should have a combination of RAID 1 and RAID 5. You should mirror the operating system using RAID 1, and place transaction logs on a RAID 1 that is separate from the RAID 1 that hosts the operating system. SQL Server writes data to the transaction logs and maintains serial information of all modifications that occurred in a SQL database. The transaction log files can be used for rollback and roll forward of information from a SQL Server database. The SQL Server files and filegroups should be placed on a RAID 5, because you get best performance throughput by placing database files on RAID 5 disk array.
RAID 10:-   RAID 10 is a better choice than RAID 5, but RAID 10 would be more expensive than RAID 5. RAID 10 is a combination of RAID 0+1, which is known as striping with mirroring. You should first mirror disks and then create a stripe set of mirrored disks. This provides high fault tolerance for data and excellent throughput performance.

Best Practices:-
  • RAID 1 is often chosen to store operating systems, binaries, index file groups and database transaction log files. Critical to log and index performance is fault tolerance and a good write speed.
  • Since log file are written sequentially and are read only for recovery operations, recommended RAID for Log Files is RAID 1 or 10. If your RAID 1 is at a 100% usage, choose RAID 10 for better performance.
  • For data files with random access and read heavy data volumes, striping is important. So recommended RAID is 5 or 10.
  • For data files which require good write performance, RAID 10 is recommended. Use a battery backed caching RAID controller for better write performance
  • For tempdb files with good read/write performance RAID 0, 1 or 10 is recommended. Although tempdb contains temporary data and DBA’s often go in for RAID 0 for tempdb, keep in mind that SQL Server requires tempdb to carry out many of its activities. So if you want your system to be always available, rethink RAID 0.
  • Putting log files or tempdb on a RAID 5 array is not recommended, since RAID 5 does not perform well for write operations. DBA's although have conflicting opinions over this point.

Hiding the SQL Server over the network...


Hiding the SQL Server over the network...

Generally we can browse available SQL Instances on network by choosing "<Browse for More..>" from Server Name drop-down list in "Connect to Server" dialog box in Management Studio:

As you can see all of my instances are visible on the network. If I want to hide this information from people on my network, it can be done easily. SQL Server allows us to hide an instance from being listed on the network. This can be done using SQL Server Configuration Manager.
For example, If I need to hide my SQL Server  instance TIRU01
1. Launch SQL Server Configuration Manager
2. Under "SQL Server Network Configuration" > Right-Click "Protocols for TIRU01" > Select Properties
3. Set "Hide Instance" value to "Yes" from the drop-down list:
4. restart the SQL Server instance TIRU01 to take effect
5.After the restart , the instance is restarted it will not longer appear in the "Network Servers" list:


Migration of SQL Server from one Server to another server with less downtime:-


Migration of SQL Server from one Server to another server with less downtime:-

As newer versions of SQL Server are released, we will plan implement to upgrade to higher versions and during this process, we may also upgrade to new  servers with better hardware and latest Operating System. Also, as part of data center relocation, we migrate the SQL Server instance from one server to another server in new data center. The process of migration involves lot of steps and need to plan and implement the steps so that, at end of migration, we have SQL Server instance running without any problems.
This article covers a scenario where we want to migrate SQL Server instance from one server to another server. This is applicable for scenarios where we are migrating between servers which have same versions of SQL Server installed or new servers having higher version of SQL Server. Goal of the migration is that there should be  minimal downtime when we switch the applications from old server to new server.
Below are the high level steps involved in migration.
1)  we need to perform these steps on test server, so that we can find the issues that can arise during the migration before on production itself and also to test if the applications work without any problems after migration.
2) If we are migrating to a higher version of SQL Server, then need to document the issues that can arise with the migration by running Upgrade Adviser and prepare documentation with remediation steps to fix those issues as part of the migration. If we are migrating to new server with same version of SQL Server, then no requirement of upgrade adviser.
3) Install the required version of SQL Server instance on the new server.
4) Apply latest Service Pack or Cumulative Updates.
5) Script logins from existing server by using revlogin script  and Create those logins on the new server.
6)  Script all the jobs on existing server and create them on the new server, keep them disabled. Create maintenance plans on new server similar to existing server.
7) Configure Database Mail if used.
8) Create any linked servers if required.
9) Configure SQL Server settings like Max Server Memory, MaxDrop, backup compression settings, etc on the new server.
10) Configure OS or Network settings on new server like Lock Pages In Memory, Perform Volume Maintenance Tasks, etc.
11)  Backup all User databases on existing server, copy the backups to new server and restore them. Document the time it took for performing backups, copying backups to new server and for restoring of databases.
12) Fix Orphan Users for all user databases.
13) Enable backup and maintenance jobs like rebuild index, Update stats, etc on new server.
14) Change the compatibility level of user databases to latest version(If migrating to higher version.
15) Perform Application testing to make sure the application is working without any problems and performance is acceptable.
16)Once, everything looks fine and got permission to move ahead, plan for the Go-Live of new server.
Go-Live Planning
– Our goal here is to have minimum downtime during the Go-Live process, so we need to either setup Logshipping or Database Mirroring between existing and new servers.
– Configure Logshipping from existing server to new server for all user databases and make sure backup, copy and restore jobs are running without any issue.
– Before Go-Live, make sure Logins and jobs are same on both the existing and new servers.
Start of Down Time
– Disable all jobs on existing server.
– Run logshipping backup jobs on existing server, once it completes successfully, disabled the job.
– Run logshipping copy jobs on new server, once it completes successfully, disabled the job.
– Run logshipping restore jobs on new server, once it completes successfully, disabled the job.
– Remove the logshipping configuration between existing and new server.
– Bring the all user databases online on new server by running “restore database databasename with recovery”.
– Enable all required SQL Server Agent jobs on new server and disable the same on old server
– Fix the orphan users.
– Change the compatibility level of all user databases to latest version(If migrating to higher version).
– Point the application to use the new server.
End of Down Time
– Test the Application to make sure it is working as expected.
– Later run rebuild indexes and Update Statistics on all user databases.
– If there are any issues, try possibilities of fixing those issue, if it is taking time and decided to rollback, refer below RollBack plan.
– If no rollabck required, take SQL Services offline on old server and decommission the old server.
RollBack Plan
– Verify all the user Databases are online on old server.
– Point the application back to old server.
– Enable all required jobs on old server and disable the same on new server.
– Test the application.
You can also use database mirroring instead of logshipping, just replace the steps related to logshipping with database mirroring related steps.
Hope this was helpful.

Adding article without running the entire database snapshot in Transaction Replication:-


Adding article without running the entire database snapshot in Transaction Replication:-

If Transaction replication was already setup and If we want to add new articles, then a Snapshot has to be generated and applied to the Subscriber, but by default snapshot of all the existing and new articles will get generated and applied to the Subscriber, which is not what we want as it can take long time when the database is large and causes Subscriber to be unavailable while applying the Snapshot. There is a way in which we can avoid this behavior and just generate the Snapshot of the newly added articles and apply them to the Subscriber, which will not cause any problem to the existing articles which can be available during this process.
Let’s see how we can achieve this.
Run the below commands on the Publication database
1
sp_helppublication
If the columns “immediate_sync” and “allow_anonymous” are having a value of 1(Enabled) for both of them, the Snapshot of all the articles will be generated. As we do not want this behavior, we will change these values to 0(Disabled) for both the “immediate_sync” and “allow_anonymous” options.
We can disable these options by running below commands on Publication database
1
2
3
4
5
6
7
8
9
10
11
EXEC sp_changepublication
@publication = 'your publication name',
@property = 'ALLOW_ANONYMOUS' ,
@value = 'False'
GO
 
EXEC sp_changepublication
@publication = 'your publication name',
@property = 'IMMEDIATE_SYNC' ,
@value = 'False'
GO
Now, we can go ahead and add the new article to the publication from GUI, and then we need to start the Snapshot Agent, which will generate new Snapshot just for the new articles. If we have not run the snapshot agent then the newly added article will be in Subscribed state, but will not be active, Inorder to make this active, we need to run the Snapshot agent job.
We can check article status on Subscriber database by running the Subscriber database.
1
sp_helpsubscription

Common Backup errors and Troubleshooting steps



Common Backup errors and Troubleshooting steps:-

You can find here some of the frequent or common errors faced which manually performing a database backup or an automated maintenance plan or custom backup job failed.
Backup Error: 3041, Severity: 16, State: 1
Backup Error: 3041, Severity: 16, State: 1.
Backup BACKUP failed to complete the command BACKUP DATABASE database_name. Check the backup application log for detailed messages.
Backup Error 3041 is a generic error which is returned when a backup fails. Along with this error, there are additional errors returned and can be found in the SQL Server errorlog. Look for other specific backup errors, which will point the cause of the backup failure.
Below are more specific errors which cause the backup failures.
Error: 18204, Severity: 16, State: 1
Error: 18204, Severity: 16, State: 1
BackupDiskFile::CreateMedia: Backup device '\\BackupServer\Backups\Test\Test_Backup.bak' failed to create. Operating system error 53(the network path was not found.)
If you carefully read the complete error message, it gives the cause of the failure “Operating system error 53(the network path was not found”.
– Test If you are able to browse to this backup share \\BackupServer\Backups\Test. You need to test using the same account which is performing the backup, meaning if backups are running from SQL Server job, use the SQL Server and Agent service accounts to test access to the backup share.
– If this is happening intermittently, then possibly it was is network glitch, so work with your network administrator to run a network trace to identify network issues.
Msg 3201, Level 16, State 1, Line 1
Msg 3201, Level 16, State 1, Line 1
Cannot open backup device 'D:\Backups\Test_backup.bak’. Operating system error 3(The system cannot find the path specified.).
– Make sure that the path D:\Backups exists, sometimes with user oversight, the path mentioned could be wrong lie the folder name could be backup and we use backups and will result in error. – Another issue when this can occur is, if there are any space at the end of the folder name.
Msg 3201, Level 16, State 1, Line 1
Msg 3201, Level 16, State 1, Line 1
Cannot open backup device 'D:\Backups\Adv.bak'. Operating system error 5(Access is denied.)
.

– The above error occurred, because of permission issues on the backup folder. Grant Full Control on backup folder to SQL Server service account and to the account with which are you are trying to perform the backups.
– Also, make sure that the account used to perform backup has sufficient permissions to perform backups in SQL Server.
Error: 3043, Severity: 16, State: 1
BACKUP 'Test_DB' detected an error on page (1:12534) in file ’Test_DB.mdf'.
Error: 3043, Severity: 16, State: 1.
– The error 3043 occurs, if there is some sort of corruption in the database, which would again be mostly because if underlying hardware issues.
– Troubleshoot this issue in the direction, by first troubleshooting and fixing the database corruption issue.
Error: 18210, Severity: 16, State: 1
Error: 18210, Severity: 16, State: 1.
BackupMedium::ReportIoError: write failure on backup device '0a158c7d-a7a3-4d5a-8b58-124602e40a14'.
Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.).
– Error 18210 occurs mostly when a third party backup tool is performing the backups of SQL databases using native T-SQL commands or by backing up the .mdf and .ldf files. Third party tools generally use VDI/VSS and use VSSWriter or SQLWriter for performing the backups. Check for any problems with these writers.
– If you are not using any third party tool for performing SQL database backups, then you can try disabling the OS and SQL VSSwriter to see if backup completes successfully after disabling them.
– Check for any known issues with SQL Server or OS.
– Check for any known issues with the third backup software.
Msg 3009, Level 16, State 1, Line 1
Msg 3009, Level 16, State 1, Line 1
Could not insert a backup or restore history/detail record in the msdb database. This may indicate a problem with the msdb database. The backup/restore operation was still successful
.
– Whenever SQL Server database backup is performed, an entry for that backup is made into msdb..backupset table. Check if the data and log files of msdb is set for autogrowth. Verify if there is sufficient free space of the disk drive where msdb files are located.
Msg 3033, Level 16, State 0, Line 1
Msg 3033, Level 16, State 0, Line 1
BACKUP DATABASE cannot be used on a database opened in emergency mode
.
– This error can occur if you try to perform backup of a database which is in emergency mode, which is not supported. Bring your database online and then perform the backups.
Msg 4208, Level 16, State 1, Line 1 
Msg 4208, Level 16, State 1, Line 1
The statement BACKUP LOG is not allowed while the recovery model is SIMPLE. Use BACKUP DATABASE or change the recovery model using ALTER DATABASE.
– This error can occur, if we try to perform transaction log backup of a database whose recovery model is set to “SIMPLE”. Transaction Log backups are not supported for databases in Simple recovery mode. If the database is critical and recovering as much data is required and if point in time recovery is important, then change the recovery model of the database to “FULL” and then schedule job to perform regular transaction log backups.
SQL Server Backups causing operating system returned error 1450
The operating system returned error 1450(Insufficient system resources exist to complete the requested service.) to SQL Server during a write at offset 0x00000000003200 in file with handle 0x0000101C. This is usually a temporary condition and the SQL Server will keep retrying the operation. If the condition persists then immediate action must be taken to correct it.
– Error 1450 is not related to SQL Server, it is related to Hardware/Operating System, so check for any configuration issues and make changes accordingly.
– Remove /3 GB switch if enabled, and try performing the backups.
– Make sure there is no corruption with database.
– Check for any issues with disk subsystem.
Msg 3241, Level 16, State 7, Line 1 and Msg 3013, Level 16, State 1, Line 1
Msg 3241, Level 16, State 7, Line 1
The media family on device 'PathToBackupFile\BackupFile.bak' is incorrectly formed. SQL Server cannot process this media family.
Msg 3013, Level 16, State 1, Line 1
RESTORE DATABASE is terminating abnormally
.

This issue can occur for various reasons, some of them are as mentioned below.
– Make sure the backup file is located on the same server where you are trying to restore. Check if the backup file is not corrupted. Try to move the backup file on to another drive or to another server and see if that works. Run below command to make sure there is no corruption with the backup file.
Restore verifyonly from disk = 'PathToBackupFile\BackupFile.bak'
– This error can also occur if you are trying to restore a database from lower version to higher version. Example, if you are trying to perform a restore from SQL Server 2012 database on SQL Server 2008 R2 instance. This scenario is not supported and cannot do it, so you have to script our all tables/SP’s/Views/Other objects from SQL 2012 and create them on SQL Server 2008 and then export the data from SQL Server 2012 to SQL Server 2008 R2

COMMON SQL SERVER BACKUP FAILURE ERRORS AND ISSUES

  One of the most common task for a DBA’s are to perform installation of new SQL Server versions and installing patches. Most often or not e...