(This is public version of document. Use internal version, if it is available; it contains few additional details).
Production uses DataGuard oracle cluster for redundancy (and can use Logical Standby for reporting). Creating these databases require very careful configuration, which fail after a single error in the configuration.
Here is step-by-step process to plan and create such database.
2 Linux servers required, both with SLES9 Sp3 Linux.
Names: traindb01, traindb02 – for staging; xxoddb01, xxoddb02 – for production.
Each server should have file systems for:
File systems:
|
File System |
Relative Size % |
Goal |
|
/db01/db0 |
100 |
DataFiles |
|
/db01/redo0 |
2 |
REDO files |
|
/db01/flash0 |
150 |
Flash recovery Area |
|
/db01/local0 |
4 |
Local copy of redo |
|
/db01/backup0 |
50 |
Primary backup area |
|
/db01/arch |
300NFS, big area |
Long term storage for backups |
|
/local/apps |
10 GB |
Applications |
|
|
|
|
Software.
|
SLES9 SP3 Linux x86_64; |
|
ISCSI or another SAN storage access system; |
|
Oracle 10.2.0.3 (with patchset) |
|
Oracle Enterprise Manager agent. |
DataGuard success depends on the careful name selection and following all defaults. We select for the databases names (for production, replace ‘tr’ to ‘pr’):
<pr or tr > - production or training (keep it 2 symbols long);
<xx for xxxx, wf for yyyy, and so on>
<u> for utf-8, nothing for ascii;
<number>
p – primary, s – physical standby, l – logical standby. 7 characters max (we can use 8 maximum in SID).
|
Name |
Server1 |
Server2 |
Role |
|
DB name |
trxx1 |
Database name |
|
|
SID |
trxx1p |
trxx1s |
Oracle SID |
|
UNIQ name |
trxx1p |
trxx1s |
Db_uniq_name |
|
Service |
trxx1_s |
Universal service name |
|
|
|
|
|
|
|
DB name |
trxx2 |
Database name |
|
|
SID |
trxx2s |
trxx2p |
SID |
|
UNIQ name |
trxx2s |
trxx2p |
UNIQ name |
|
Service |
trxx2_s |
Universal service name |
|
|
|
|
|
|
Names for other databases (add suffixes p for primary instance, s for standby instance, l for logical standby):
And so on. Names must be shorter then 6 characters before adding suffixes, because Unix SID have 8 char limitation in some parts of the DataGuard.
Slogin to the server (slogin –X oracle@server) with X11.
Start dbca.
i. First location: /db01/redo0
ii. Second one: /db01/local0
i. Change .dbf onto .arc in file name format;
i. db_unique_name = trxx1p (see table) – use SID name here;
ii. filesystem_options = ‘setall’;
iii. standby_file_management = ‘AUTO’.
iv. control_files
=
(”/db01/redo0/{DB_UNAME_UPPER}/{DB_UNAME_UPPER}-1.ctl”, “/db01/local0/{DB_UNAME_UPPER}/{DB_UNAME_UPPER}-2.ctl”, “/db01/db0/{DB_UNAME_UPPER}/{DB_UNAME_UPPER}-3.ctl “)
v. sga_max_size = 2000000000 (2 GB)
vi. CLOSE
After primary database is created and started:
Now it’s time to create a Physical Standby on the second server. The only way to do it without extra DBA skills and headache is using OEM web interface and following rules. You need a working Oracle Enterprise Manager (web application), version 10.2 or higher.
i. Backup Files Directory: /db01/backup0 (or use disk with enough space);
ii. Enter Oracle user name and password (user Oracle on primary DB server);
iii. Next;
i. Instance Name (for example, trxx1s); <Database Name + s for physical standby, l for logical standby);
ii. Select server (traindb02 for example) other then primary host;
iii. Next.
i. Backup File Directory Area: /db01/backup0
ii. Method = HTTP
iii. Database Area: /db01/db0 (!don’t forget to change!);
iv. Flash Recovery Area: /db01/flash0 (!don’t forget to change);
v. Flash Recovery Size (the same as for primary DB). (!don’t forget to change!).
vi. Multiplex REDO and Controls. Use:
1.
/db01/redo0
2.
/db01/local0
vii. Check everything here!
viii. Next.
i. Database Unique Name – use the same as sid, for example trxx1s ;
ii. Target name: use the same name again – trxx1s (for example);
Now you can drink coffee and watch how system creates standby database. If everything was done correct (names, disk locations, enough space on disks, listeners configured and so on) you will see DataGuard page with ‘Creating in progress’ status. It takes about 10 – 15 minutes for the system to complete database cloning and change status to Normal (if there are not errors).
When you see Normal DataGuardStatus, you should verify configuration and fix few possible glitches. The only problem, known for me, rely to ‘standby file management’ – it must be configured as AUTO, but OEM configure it as manual.
To correct it, do a few steps (I describe using OEM, but you can use sqlplus instead):
FastFailover and Observer must be configured because we don’t want to start databases manually, and the only mode when they are auto-started and auto-failed-over is this mode.
When DataGuard is created (and reasonable before testing), you should configure automated service trigger, which set starts service on primary server and stops it on standby server (so allowing client to select active server). I use such script to do it:
oracle@traindb02:/INST/SQL.d> cat set_trigger.sql
# BEGINDBMS_SERVICE.CREATE_SERVICE('&1','&1');END;/BEGINDBMS_SERVICE.START_SERVICE('&1');END;/ create or replace trigger managed_service after DB_ROLE_CHANGE on databasedeclare role varchar(30);begin select database_role into role from v$database; if role = 'PRIMARY' then DBMS_SERVICE.START_SERVICE('&1'); else DBMS_SERVICE.STOP_SERVICE('&1'); end if;end;/create or replace trigger managed_service_start after startup on databasedeclare role varchar(30);begin select database_role into role from v$database; if role = 'PRIMARY' then DBMS_SERVICE.START_SERVICE('&1'); else DBMS_SERVICE.STOP_SERVICE('&1'); end if;end;/
For example (example is about trwf1 database):
oracle@traindb02:/INST/SQL.d> export ORACLE_SID=trwf1poracle@traindb02:/INST/SQL.d> sqlplus sys as sysdba SQL*Plus: Release 10.2.0.3.0 - Production on Mon Aug 6 19:24:02 2007 Copyright (c) 1982, 2006, Oracle. All Rights Reserved. Enter password: Connected to:Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit ProductionWith the Partitioning, OLAP and Data Mining options SQL> @set_trigger.sql trwf1_sold 2: DBMS_SERVICE.CREATE_SERVICE('&1','&1');new 2: DBMS_SERVICE.CREATE_SERVICE('trwf1_s','trwf1_s'); PL/SQL procedure successfully completed. old 2: DBMS_SERVICE.START_SERVICE('&1');new 2: DBMS_SERVICE.START_SERVICE('trwf1_s'); PL/SQL procedure successfully completed. old 8: DBMS_SERVICE.START_SERVICE('&1');new 8: DBMS_SERVICE.START_SERVICE('trwf1_s');old 10: DBMS_SERVICE.STOP_SERVICE('&1');new 10: DBMS_SERVICE.STOP_SERVICE('trwf1_s'); Trigger created. old 8: DBMS_SERVICE.START_SERVICE('&1');new 8: DBMS_SERVICE.START_SERVICE('trwf1_s');old 10: DBMS_SERVICE.STOP_SERVICE('&1');new 10: DBMS_SERVICE.STOP_SERVICE('trwf1_s'); Trigger created. SQL>
To verify, run lsnrctl services | grep <db_name> on both servers and verify, that your service name exists on primary server only.
Now you can configure tnsnames.ora on the clients (and use similar string in jdbc configurations). For example, here is configuration for trxx1 database:
trxx1 = (DESCRIPTION = (ADDRESS_LIST = (LOAD_BALANCE=OFF)(FAILOVER=ON) (ADDRESS = (PROTOCOL = TCP)(HOST = traindb01.stg2.mycompany.com)(PORT =1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = traindb02.stg2.mycompany.com)(PORT =1521)) ) (CONNECT_DATA = (SERVICE_NAME = trxx1_s) (FAILOVER_MODE= (TYPE=session) (METHOD=basic) (RETRIES=180) (DELAY=5) ) ) )Here is what to test:
You can’t keep these systems without backup because they will run out of archive space. COnfigure periodic backups with removing all archive logs periodically. Full backup script is in development yet.