Health Monitoring Multiple SQL Azure Databases Part 5: Analysing the Data – Critical Alerts
Amigos, we’ve come a long way. In Part 1 we pimped the Azure databases to return health data XML, in Part 2 we made a control database that knew where these databases were at, in Part 3 we hoovered out the XML data we wanted from them via PowerShell, and in Part 4 we shredded this ready for analysis. Tidy. So how do you go about analysing it? Man, this part’s up to you.
Me? First thing I want to know is when the databases are going to go boom. There aint no auto-grow in Azure, so if you’ve gone for 10gb, when it gets there it’ll stop. Dead. Silent. Bar the barely audible whistle of your boss’ boot sailing towards your ass.
So we need a view to identify the potential butt killers and fast. This will do the job:
-- db space alert view: > 70% full CREATE VIEW dbhealth.vCheckSpaceUsed AS SELECT azdbs.azDatabaseID, azdbs.dbMaxSize, azdbs.dbPctUsed, 'DB_SIZE' as alertType, 'DB ' + azd.azDatabaseName + ' on server ' + azs.azServerName + ' is ' + convert(varchar(14), convert(decimal(8,2), azdbs.dbPctUsed)) + '% full' as alertMsg FROM dbhealth.azureDBSize azdbs INNER JOIN dbhealth.azureResultSet azrs on azdbs.azResultSetID = azrs.azResultSetID INNER JOIN dbhealth.azureDatabase azd ON azdbs.azDatabaseID = azd.azDatabaseID INNER JOIN dbhealth.azureServer azs ON azd.azServerID = azs.azServerID WHERE azrs.azResultSetDate > DATEADD(hour, -6, getdate()) -- results in the last six hours and azdbs.dbPctUsed > convert(decimal(8,6), 70); GO
I’ve added a third step to my sql agent job: now the first step polls and retrieves the data, the second one shreds it, and the third runs critical alert checks. Skills.
(Dig that the view restricts results to ones in the last six hours. This is because my polling is every six hours, so I’m just checking the latest result set – i.e. the results obtained by job step one of my agent job. Don’t want to be reporting old records that have since been sorted hey. You’ll need to adjust to suit your polling period.)
My job step three executes the following stored procedure (utilising the above view and database mail):
CREATE PROCEDURE [dbhealth].[processCriticalAlerts] AS BEGIN TRY DECLARE @alertstogo nvarchar(max) = '' SELECT @alertstogo = @alertstogo + alertMsg + CHAR(13) + CHAR(10) -- '<br>' for HTML FROM dbhealth.vCheckSpaceUsed IF (@alertstogo <> '') BEGIN EXEC msdb.dbo.sp_send_dbmail @profile_name = 'Your DB Mail Profile name', @recipients = 'a@b.com;c@b.com', @copy_recipients = '', @body = @alertstogo, --@body_format = 'HTML', -- if you want to use this @subject = 'ALERT: SQL Azure Database Size', @importance = 'High' END END TRY BEGIN CATCH declare @ErrorMessage NVARCHAR(4000); declare @ErrorSeverity INT; declare @ErrorState INT; select @ErrorMessage = ERROR_MESSAGE(), @ErrorSeverity = ERROR_SEVERITY(), @ErrorState = ERROR_STATE(); raiserror (@ErrorMessage, @ErrorSeverity, @ErrorState); END CATCH GO GRANT EXECUTE ON dbhealth.processCriticalAlerts to HealthCheck; GO
Job done. On each run of the job an email is sent out for all databases that are more than 70% full. Add more views and more alerts to the procedure as needed but keep them critical only – don’t be an info freako here, save this for SSRS. Job step alerts = need attention right now, SSRS reports = longer-term trend analysis.
The next logical step is then to add custom SSRS reports to Management Studio. I will be doing this in the next few weeks and laying it down for you soon, as there’s good stuff to be had here. But hey, don’t wait for me. Add your critical alerts and get SSRS fired up. Automated alerts, shiny graphs and charts – these are review time gold baby. So go get analysing. You know it makes sense.
As soon as my SSRS Management Studio reports are grooving you’ll be hearing from me. Till then peace people.
Word.
HI James, thanks for the great information. We use a number of these queries in our web based Cotega monitoring service (www.cotega.com) for Azure SQL Databases. Basically we continually look at these factors for a users database and then send notifications to the administrators when there are issues.