The
SQL Server which is installed on the Windows Server is receiving data from “Tandem
Non-Stop System” by using Oracle Golden Gate replication. This downstream data
feed is critical to smooth business functionality. So, it needs to be monitored proactively.
Although monitoring this environment is critical, however, by executing various Golden
Gate commands (STATUS ALL, INFO ALL, LAG) in the GGSCI shell manually to
understand Latency is very annoying and tedious. Therefore, the automated process
comes into the picture to monitor performance in real-time for our die-hard operational
team.
Two REPLICATs we are
interested in:
We
are interested in monitoring two of the REPLICATs’ performance constantly as
they are critical to our healthcare business activities along with some SQL
Server and Windows performance counters.
The
following dashboard has been developed with the open source technology to visualize
performance data in real-time so one can glue his/her eyes 24/7 on the
dashboard. The dashboard process (the Grafana) can also trigger alerts/notifications
if a particular monitoring object exceeds the threshold for a substantial period
of time.
Performance metrics dashboard
in Real-time:
Technology Stack Used:
The
PowerShell script streams data in every
10 seconds interval and sends it to influxDB using the REST API method. The
script runs by the windows Task scheduler continuously.
InfluxDB is a time-series based database
technology which is high performance and can be massively scaled up. It is
written in GO language and is open source.
Grafana is an open source metric
analytics & visualization suite. It is most commonly used for visualizing
time series data for infrastructure and application analytics.
PowerShell Script and Dashboard download:
PowerShell Script:
Following
is the complete PowerShell Script which is feeding data into influxDB. This
script collects two Golden Gate REPLICATs’ latency, as well as couples SQL
Server and Windows Performance metrics every 10 seconds. Where, in the script, the server represents
the following:
SRV0009
– the Windows server where the Golden Gate process is running. The PowerShell
Script is also hosted here.
STV1075
– where the InfluxDB and the Grafana have been installed and are running.
<#####################################################################################
.NOTES
===========================================================================
Created with: PowerShell
ISE
Created on:
12/25/2016 9:39 AM
Created by:
Sarjen Haque
===========================================================================
.DESCRIPTION
function to
collect and parse Golden Gate status data from
a local or
remote server installed on Windows
######################################################################################>
function Get-RealTimeStatus()
{
$result
= winrs -r:SRV0009 "CMD /c echo status all |
H:\OracleGoldenGate\ggsci"
$raw
= $result
-match 'REPLICAT'
[StringSplitOptions]$Options = "RemoveEmptyEntries"
# loop
through each line and break
foreach
($line in
$raw)
{
$wrd
= $line.Split(" ", $Options)
$lg
= $wrd[3].Split(":")
$tm
= $wrd[4].Split(":")
$result
= @{
"Program"
= $wrd[0];
"Status"
= $wrd[1];
"Name"
= $wrd[2];
"Lag"
= $wrd[3];
"LagSec"
= [int]$lg[0] *(60 * 60) + [int]$lg[1] *60 + [int]$lg[2];
"ChkPt"
= $wrd[4];
"ChkPtSec"
= [int]$tm[0]*(60 * 60) + [int]$tm[1] *60 + [int]$tm[2];
}
$obj
= New-Object
-TypeName PSObject
-Property $result
Write-Output
$obj
}
}
#######################################################################################
# FUnction to collect and format Golden gate and Windows/SQL
Server PerfMon data
#######################################################################################
Function Get-ggPerf
{
# Windows
and SQL Server Instance
$WinServer
= 'SRV0009'
$SQLServer
= 'SQLServer'
# use for default instance
#
$SQLServer = 'MSSQL$SQL2014' # use for a named instance. Replace SQL2014 with
the isntance name
# Pull
Windows and SQL stats
[System.Collections.ArrayList]$counters =
@()
# Windows
reelated metrics
$counters.Add("\\$WinServer\logicaldisk(d:)\avg.
disk sec/transfer") | out-null #0
$counters.Add("\\$WinServer\logicaldisk(e:)\avg.
disk sec/transfer") | out-null #1
$counters.Add("\\$WinServer\logicaldisk(h:)\avg.
disk sec/transfer") | out-null #2
$counters.Add("\\$WinServer\logicaldisk(l:)\avg.
disk sec/transfer") | out-null #3
$counters.Add("\\$WinServer\logicaldisk(_total)\avg.
disk sec/transfer") | out-null #4
$counters.Add("\\$WinServer\Memory\Available
MBytes") | out-null #5
$counters.Add("\\$WinServer\Processor(_total)\%
Privileged Time") | out-null #6
$counters.Add("\\$WinServer\Processor(_total)\%
User Time") | out-null #7
$counters.Add("\\$WinServer\Processor(_total)\%
Processor Time") | out-null #8
#SQL Server
related metrics
$counters.Add("\\$WinServer\"
+ $SQLServer
+ ":Resource
Pool Stats(default)\CPU usage %") |
out-null #9
$counters.Add("\\$WinServer\"
+ $SQLServer
+ ":General
Statistics\Processes blocked") |
out-null #10
$counters.Add("\\$WinServer\"
+ $SQLServer
+ ":Buffer
Manager\page life expectancy") |
out-null #11
$counters.Add("\\$WinServer\"
+ $SQLServer
+ ":General
Statistics\User Connections") |
out-null #12
$counters.Add("\\$WinServer\"
+ $SQLServer
+ ":SQL
Statistics\Batch Requests/Sec") |
out-null #13
$counters.Add("\\$WinServer\"
+ $SQLServer
+ ":Memory
Manager\total server memory (kb)") |
out-null #14
$countersall
= (Get-Counter
-Counter $Counters
-SampleInterval 1).countersamples.CookedValue
# Format
SQL data into Influxdb Line Protocol
$postParams
= "sql_perf,host="
+ $WinServer
+ "
avgtransfer_d=" + $countersall[0] + ",avgtransfer_e="
+ $countersall[1] `
+
",avgtransfer_h=" + $countersall[2] + ",avgtransfer_l=" + $countersall[3] `
+
",avgtransfer_total=" + $countersall[4] + ",available_mbytes=" + $countersall[5] `
+
",pct_priviledge_time=" + $countersall[6] + ",pct_user_time=" + $countersall[7] `
+
",pct_processor_time=" + $countersall[8] + ",sql_cpu_time=" + $countersall[9] `
+
",processess_blocked=" + $countersall[10] + ",page_life_expectancy=" + $countersall[11] `
+
",user_connections=" + $countersall[12] + ",batch_resc_per_sec=" + $countersall[13] `
+
",total_server_memory_kb=" + $countersall[14]
# get the
output from the Golden Gate function
$result
= Get-RealTimeStatus
| select Program, Status, Name, Lag, LagSec, ChkPt, ChkptSec
$r1 = $result | ?{ @("RP9CRIT2") -contains
$_.Name
} | select -First 1
$r2 = $result | ?{ @("RP9MAIN2") -contains
$_.Name
} | select -First 1
if ($r1.Status -eq 'RUNNING') {
$r1status =
1 }
else
{ $r1status =
0 }
if ($r2.Status -eq 'RUNNING') {
$r2status =
1 }
else
{ $r2status =
0 }
# Format GG
data into influxdb Line Protocol
$postParams_r1
= "gg_RP9CRIT2,host="
+ $WinServer
+ "
LagSec=" + $r1.LagSec + ",ChkptSec="
+ $r1.ChkptSec + ",Status=" +
$r1status
$postParams_r2
= "gg_RP9MAIN2,host="
+ $WinServer
+ "
LagSec=" + $r2.LagSec + ",ChkptSec="
+ $r2.ChkptSec + ",Status=" +
$r2Status
# Post to
influxdb API
$uri
= 'http://SRV0175:8086/write?db=ggperfdb'
# Create
web connection with influxdb URL
$ServicePoint
= [System.Net.ServicePointManager]::FindServicePoint($uri)
$ServicePoint.ConnectionLimit =
3
# Use
influxdb authentication to logon
$authheader
= "Basic
" + ([Convert]::ToBase64String([System.Text.encoding]::ASCII.GetBytes("admin:admin")))
# Post data
to influxDB
try
{
Invoke-RestMethod
-Headers @{ Authorization = $authheader
} -Uri $uri
-Method POST
-Body $postParams
-DisableKeepAlive
Invoke-RestMethod
-Headers @{ Authorization = $authheader
} -Uri $uri
-Method POST
-Body $postParams_r1
-DisableKeepAlive
Invoke-RestMethod
-Headers @{ Authorization = $authheader
} -Uri $uri
-Method POST
-Body $postParams_r2
-DisableKeepAlive
#Close
web connection
$ServicePoint.CloseConnectionGroup("")
| Out-Null
}
catch
{
throw
'Could not POST to InfluxDB API endpoint'
}
}
#######################################################################################
# Run the collector function in a loop, continuously
#######################################################################################
while ($true)
{
# execute
the collector
Get-ggPerf
# pause for
X seconds
start-sleep
-Seconds 8
}
# end of script
References:
PowerShell Way: Automating Golden Gate Replication Monitoring on Windows
http://sqltouch.blogspot.ca/2014/08/powershell-way-automating-golden-gate.html
Grafana: http://grafana.org/
http://sqltouch.blogspot.ca/2014/08/powershell-way-automating-golden-gate.html
Grafana: http://grafana.org/
InfluxDB:
https://www.influxdata.com/
No comments:
Post a Comment