Data Guard Broker
The broker is part of Oracle's Database Enterprise Manager and an integral part of Data Guard, it is the management framework. You can of course use SQL*Plus to manage Data Guard but the broker when used with the Enterprise Manager becomes a powerful tool offering the following
Before I discuss the integration of the broker into the Enterprise Manager lets first discuss the broker itself. The broker can make changes to a Data Guard setup (physical or logical database), the type of changes are configuration, transport modes, apply setup and role change services and the protection mode. You can also monitor the health of any configuration, it is also responsible for implementing and managing the automatic failover capability also know as Fast-Start Failover (FSFO).
The broker has three parts, a set of background processes on each database, a set of configuration files and a commandline interface (CLI) called DGMGRL.
The broker has a number of processes running, the broker manages these processes automatically you have no control over them
In a broker configuration it is the Data Guard Monitor (DMON) process on the primary database that is the owner of the configuration, all orders will come from the primary even if use a DGMGRL CLI from another server, the standby database will receive all configuration changes via the primary.
When the DMON process communicates with the standby database it uses one of the NSV processes to send the work to the standby, this protects the DMON from a hang if the network should go down. Whenever the DMON needs to execute some SQL it will enlist the aid of the RSM process on the primary database, if the SQL is for the primary database it will execute it directly, however if the SQL is for a standby database the RSM process asks the NSV process to send it to the standby database again this protects the RSM from a network hang.
Each NSV process has a DRC partner process on the standby database, which will perform the work of the NSV process, the DRC process on the standby will return the results or status back to the NSV on the primary.
On startup the DMON will communicate with each standby via the NSV-DRC connection process pair to establish communication and configuration information so that the standby can start the apply services, if there are any problems check the alert.log after the NVS section to see if the are any TNS errors.
When using a RAC environment there are another set of process called the internode servers (INSV) which maintain a connection between the nodes in the cluster to ensure that the broker on each node knows the state of the cluster. A INSV process will always be started on the primary even if the primary is not in a RAC.
Lastly you may see one more process called the Fast-Start-Failover Process (FSFP), which is used only when the primary database is under the control of Data Guard automatic failover feature Fast-Start Failover. We will be discussing FSFP in another section switchover and failover.
Each database has two copies of the configuration, stored in the below parameter locations
By default they are stored in $ORACLE_HOME/dbs directory with the filename of dr1<DB_UNIQUE_NAME>.dat and dr2<DB_UNIQUE_NAME>.dat, if using a RAC these should be keep on a shared location as only one set of copies can be exist for the entire RAC.
The primary has the master copies of the configuration files, all changes to these files are done by the primary, even if you are on the standby server the configuration changes will be passed to the primary via NSV-DRC processes and the DMON process will make the change which is then propagated back to the standby databases. The reason for the multiple copies of the configuration file is if the primary has a problem then each standby database has a copy, when a failover occurs the standby that becomes the primary will then become the master of those configuration files.
You have two choices to interact with Data Guard either Enterprise Manager (EM) or the broker CLI DGMGRL you can swap between the two with a few simple configuration changes. To gain full full functionality of Data Guard through Grid Control you must use the broker.
Broker CLI DGMGRL comes with Enterprise Manager, you can run the broker on any platform you wish it does not need to be the same platform as the primary or standby databases.
You can access the DGMGRL command line using the below
DGMGRL CLI | [primargydg01] > dgmgrl sys/oracle [primargydg01] > dgmgrl sys/oracle@PROD1 |
When you start the DGMGRL commandline it does not connect you to a database, only the Data Guard configuration. Before you start to use the broker there are a number of things to perform first
Even if you use Enterprise Manager you should really have a understanding on how the CLI works, as mentioned above there are two configuration files, both the systems parameters must be set on the primary and all standby databases, you put set this files anywhere you like
data Guard configuration files location | # Directory # ASM Note: if you are using ASM then the directories must be created already, use the asmcmd command to do this, the broker will create symlinks to the actual config files but the directories must be there |
In a RAC environment they should be in a shared area and all nodes should point to the same files, only one set of files must exist in a RAC.
Once the directory has been created you are ready to startup the broker, do this on all databases (primary and standby)
Start the Broker | alter system set DG_BROKER_START=TRUE SCOPE=BOTH; |
When you start the broker it does not configure any files yet because you have to use the DGMGRL command, all it does is start the necessary processes we mentioned earlier. If you need to change the location of the files the steps are below
The broker uses the Oracle Net Services to make connections to the databases, setup both redo transport and archive gap resolution and perform role transitions. We need to create a special static entry in the listener.ora file for each database in the broker configuration, this entry makes it possible for the broker to connect to an idle instance using a remote SYSDBA connection and perform the necessary startup. Here is an example
Broker listener.ora static entry | ## Primary and all standby databases SID_LIST_LISTENER = |
Once the above has been done we are ready to create a broker configuration, there is no difference in a non-RAC or RAC environment, to connect to the configuration you use the dgmgrl command but one word of advise make sure that you use a specific user and password and not the / as the user, this has know to cause problems at a later date
DGMGRL CLI | [primargydg01] > dgmgrl sys/oracle [primargydg01] > dgmgrl sys/oracle@PROD1 |
There are four main area's to DGMGRL CLI
If you used the Enterprise Manager/Grid Control to setup your standby database then the next steps would have already been performed for you, if however you used the Power User Method then you need to follow the next steps to configure the broker.
The first step is to create the base configuration
Create base configuration | # Primary Database server |
The above command would have created a base configuration, you can display the configuration using the below command
Display configuration | # Primary Database server DGMGRL> connect sys/password DGMGRL> show configuration Configuration - prod1 Protection Mode: MaxPerformance Fast-Start Failover: DISABLED Configuration Status: ## Here is the same screen shot after adding a logical server DGMGRL> show configuration; Configuration - prod1 Protection Mode: MaxPerformance Fast-Start Failover: DISABLE Configuration Status: |
Next we add the standby database to the configuration, if you receive an error stating that the properties could not be imported from the database, you need to check DB_UNIQUE_NAME also the TNSNAME and the transport parameter (LOG_ARCHIVE_DEST_n), so what is the proper setup, you must have your redo transport parameter defined using the DB_UNIQUE_NAME method, meaning that each redo transport parameter must contain the DB_UNIQUE_NAME=<name> attribute. The broker will search all of your LOG_ARCHIVE_DEST_n parameters looking for a database with a unique name that matches the database name you entered for the command. If you have not done this then you need to use the full set of arguments to add the database
Add the standby database | # Primary Database server - if you have setup db_unique_name, tnsname and log_archive_dest_n Database "prod1dr" added # if you want to add a logical standby you can use the following DGMGRL> add database prod1lr |
When you display the configuration you should see that the standby database has been added
Display configuration | # Primary Database server Configuration - prod1 Protection Mode: MaxPerformance Fast-Start Failover: DISABLED Configuration Status: ## Because we have not setup any parameters for Broker all parameters defaults will be presumed, Database - prod1 Role: PRIMARY Properties: Database Status: DGMGRL> show database verbose prod1dr Database - prod1dr Role: PHYSICAL STANDBY Properties: Database Status: |
One parameter to watch for is the local archiving parameter on the primary and standby databases, the broker will modify this parameter and if necessary add the VALID_FOR attribute in preparation for the archival of your standby redo log files.
One area to watch is if you add an additional standby database, the broker could presume the wrong parameter values, for instance you may want one standby to use SYNC and another to use ASYNC, but if you configure the second standby database using the short method it may default to SYNC, so when you do add any additional standby databases always double check the parameters are set correctly for the type of configuration that you want, any changes should be done before enabling the additional standby database.
Once we are happy that all standby databases have been added and their parameters have been check we are ready to enable the configuration, the command will issue alter system commands on both the primary and standby databases, start the redo transport to the standby databases and the apply services, before examining the configuration give it sometime to complete as it does take a while, otherwise you see some ORA-16610 errors.
Enabling the configuration | # Primary Database server DGMGRL> connect sys/password DGMGRL> enable configuration Enabled. |
You can tail the alert log file and see what action is taking place, hopefully after a few minutes you should see a successful configuration running, you can also check the redo transport and apply services by viewing the v$managed_standby view.
Display configuration | # Primary Database server Configuration - prod1 Protection Mode: MaxPerformance Fast-Start Failover: DISABLED Configuration Status: # Standby database - check the LGWR - RFS connection process and the MRP0 is in the applying state |
As I mentioned above the broker has a number of properties which can be changed, to change the property we use the EDIT command in the DGMGRL CLI
Edit configuration | EDIT CONFIGURATION SET PROPERTY <name>=<value> EDIT DATABASE <db_name> SET PROPERTY <name>=<value> EDIT INSTANCE <in_name> SET PROPERTY <name>=value> |
Configuration Properties |
|
BystandersFollowRoleChange | |
FastStartFailoverAutoReinstate | |
FastStartFailoverLagLimit | |
FastStartFailoverPmyShutdown | |
FastStartFailoverThreshold | |
CommunicationTimeout | the amount of time the broker will wait for a response before giving up (default 180 seconds), you can remove any timeout by setting this to 0 |
Database Properties |
|
FastStartFailoverTarget | |
ObserverConnectIdentifier | |
ApplyInstanceTimeout | defines how long the broker should wait until moving the apply process to another instance in a standby RAC, default is 0 when means immediately |
PreferredApplyInstance | allows you to tell the broker where you would like the apply to run when you have a RAC, by default it is empty which means it can choose any instance |
ArchiveLagTarget | alters database parameter: ARCHIVE_LAG_TARGET |
DbFileNameConvert | alters database parameter: DB_FILE_NAME_CONVERT |
LogArchiveMaxProcesses | alters database parameter: LOG_ARCHIVE_MAX_PROCESSES |
LogArchiveMinSuccessDest | alters database parameter: LOG_ARCHIVE_IN_SUCCEED_DEST |
LogFileNameConvert | alters database parameter: LOG_FILE_NAME_CONVERT |
LogShipping (standby role only) |
enables or defers redo transport to that standby database. alters database parameter: LOG_ARCHIVE_DEST_STATE_n |
StandbyFileManagement (standby role only) |
alters database parameter: STANDBY_FILE_MANAGEMENT |
Instance Properties |
|
HostName | No explaination needed here |
SidName | No explaination needed here |
LogArchiveTrace | alters database parameter: LOG_ARCHIVE_TRACE |
LogArchiveFormat | alters database parameter: LOG_ARCHIVE_FORMAT |
StandbyArchiveLocation | alters database parameter: LOG_ARCHIVE_DEST_n |
AlternateLocation | alters database parameter: LOG_ARCHIVE_DEST_n |
LsbyMaxSga | alters database parameter: MAX_SGA |
LsbyMaxServers | alters database parameter: MAX_SERVERS |
There are a number of commands that you can use to change the state of the database
turn off/on the redo transport service for all standby databases | Primary |
DGMGRL> edit database prod1 set state=transport-off; DGMGRL> edit database prod1 set state=transport-on; |
turn off/on the apply state | Standby |
DGMGRL> edit database prod1dr set state=apply-off; |
put a database into a real-time query mode | Standby |
DGMGRL> edit database prod1dr set state=apply-off; sql> alter database open read only; DGMGRL> edit database prod1dr set state=apply-on; |
change the protection mode | Primary |
# Choose what level of protection you require # display the configuration |
There are a number of useful monitoring commands and log files that can help with diagnosing problems
configuration | DGMGRL> show configuration; |
database | DGMGRL> show database prod1; DGMGRL> show database prod1dr; # There are a number of specific information commands, here are the most used DGMGRL> show database prod1 statusreport; DGMGRL> show database prod1 inconsistentProperties; DGMGRL> show database prod1 inconsistentlogxptProps; DGMGRL> show database prod1 logxptstatus; DGMGRL> show database prod1 latestlog; |
Logfiles | # change the instance name to reflect the one you have choosen prod1 (alert log): /u01/app/oracle/diag/rdbms/prod1/PROD1/trace/alert_PROD1.log prod1 (DG log): /u01/app/oracle/diag/rdbms/prod1/PROD1/trace/drcPROD1.log prod1dr (alert log): /u01/app/oracle/diag/rdbms/prod1dr/PROD1DR/trace/alert_PROD1DR.log |
Lastly there is a utility called the "DG Menu Utility", this can be downloaded from the Oracle Data Guard 11g handbook web site http://www.dataguardbook.com, I will be covering this in my monitoring section but below is a screen shot of the menu
DG Menu Utility | DG Menu ======================================== # ------------------------------------------------------------------------- # # # # Data Guard Check List - primarydg01 # # # 0. Review database information and status # # 1. Check for password file # # 2. Check for forced logging and unrecoverable activities # # ------------------------------------------------------------------------- # # 3. Check for archive log mode # # 4. Check for standby redo logs # # 5. Check current SCN on primary and standby databases # # 6. Check archive log destinations # # ------------------------------------------------------------------------- # # 7. Check Data Guard Status View for errors and fatal messages # # 8. Check Managed Recovery Process Status # # 9. Check for missing archive logs # # 10. Check archive log gaps on the standby database # # 11. Check average apply rate / active apply rate # # 12. Check transport / apply lag # # 13. How far behind is my Data Guard in terms of time? # # # # ------------------------------------------------------------------------- # # 20. Launch the Logical Standby Data Guard Submenu # # ------------------------------------------------------------------------- # # 21. Generate init.ora entries for primary database # # 22. Generate init.ora entries for standby database # # 23. Generate tnsnames.ora entries for primary and standby databases # # 24. Generate SQL syntax to create standby redo logs # # # # ------------------------------------------------------------------------- # # 30. Generate syntax to duplicate standby database from active database # # # # x. Exit # # ------------------------------------------------------------------------- # # Enter Task Number:
# ------------------------------------------------------------------------- # |
Removing the Broker
Finally if you wish remove the broker follow the steps below
Remove the broker |
|