ORA-12547: TNS:lost contact

Issue:

Unable to connect to database via sqlplus using local authentication on a linux machine.

ukblx247_$ ps -ef | grep pmon
ora_vdis 9955 1 0 Nov 15 ? 130:32 ora_pmon_DIS1P
ora_vcms 11106 1 0 Nov 15 ? 75:15 ora_pmon_CMSP
ora_vcms 28057 1 0 Nov 15 ? 75:18 ora_pmon_CMSPP
ora_vcms 4416 4346 0 07:53:33 pts/7 0:00 grep pmon

cat /var/opt/oracle/oratab

DIS1P:/data/d151/app/oracle/product/10.2.0.5/db_1:N
CMSP:/data/d051/app/oracle/product/10.2.0/db_1:N
CMSPP:/data/d051/app/oracle/product/10.2.0/db_1:N

Getting the following error
ukblx247_$ . oraenv
ORACLE_SID = [DIS1P] ? CMSP
ukblx247_$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.5.0 – Production on Thu Apr 28 07:54:10 2016

Copyright (c) 1982, 2010, Oracle. All Rights Reserved.

ld.so.1: oracle: fatal: relocation error: file /data/d151/app/oracle/product/10.2.0.5/db_1/lib/libjox10.so: symbol joxjwt_: referenced symbol not found
ERROR:
ORA-12547: TNS:lost contact

 

Checks done:

Verified that the correct ORACLE_HOME , PATH and ORACLE_SID was set.

But still not able to connect.

ACLE_SID
CMSP
ukblx247_$ echo $ORACLE_HOME
/data/d051/app/oracle/product/10.2.0/db_1
ukblx247_$ echo $PATH
/usr/local/bin:/usr/bin:/usr/local/sbin:/sbin:/usr/sbin:/usr/dt/bin:/bin:/usr/bin:/usr/ucb:/usr/sbin:/usr/openwin/bin:/usr/local/bin:/zoe/bin:/data/d151/app/oracle/product/10.2.0.5/db_1/bin:/data/d151/app/oracle/product/10.2.0/db_1/bin:/usr/ccs/bin:/usr/ucb:/opt/VRTSdbed/bin:/opt/VRTSvcs/bin:/usr/local/bin:/data/d051/app/oracle/product/10.2.0/db_1/bin

 

 

Fix:

listed all environment variables and it showed that the LD_LIBRARY_PATH is pointing to a different home on the server

ukblx247_$ env
_=/usr/bin/env
LC_MONETARY=en_GB.ISO8859-1
LC_TIME=en_GB.ISO8859-1
SSH_TTY=/dev/pts/7
PATH=/usr/local/bin:/usr/bin:/usr/local/sbin:/sbin:/usr/sbin:/usr/dt/bin:/bin:/usr/bin:/usr/ucb:/usr/sbin:/usr/openwin/bin:/usr/local/bin:/zoe/bin:/data/d151/app/oracle/product/10.2.0.5/db_1/bin:/data/d151/app/oracle/product/10.2.0/db_1/bin:/usr/ccs/bin:/usr/ucb:/opt/VRTSdbed/bin:/opt/VRTSvcs/bin:/usr/local/bin:/data/d051/app/oracle/product/10.2.0/db_1/bin
zoe_logs=/zoe/logs
zoe_syslib=/zoe/syslib
ORACLE_BASE=/data/d051/app/oracle
zoe_shutdown=/zoe/shutdown
zoe_root=/zoe
EDITOR=vi
LOGNAME=ora_vcms
zoe_backup=/zoe/backup
zoe_util=/zoe/util
MAIL=/var/mail//ora_vcms
ORACLE_SID=CMSP
EPC_DISABLED=true
zoeini_dbuser=zoe/BZw7xw8h
USER=ora_vcms
LC_MESSAGES=C
LC_CTYPE=en_GB.ISO8859-1
zoe_temp=/zoe/temp
zoe_startup=/zoe/startup
SHELL=/bin/ksh
ORACLE_TERM=vt220
zoe_bin=/zoe/bin
HOME=/export/home/ora_vcms
LC_COLLATE=en_GB.ISO8859-1
LC_NUMERIC=en_GB.ISO8859-1
SSH_CONNECTION=10.54.82.123 50641 10.146.14.68 22
SSH_CLIENT=10.54.82.123 50641 22
LD_LIBRARY_PATH=/data/d151/app/oracle/product/10.2.0.5/db_1/lib:/usr/lib
TERM=xterm
ORACLE_HOME=/data/d051/app/oracle/product/10.2.0/db_1
PWD=/export/home/ora_vcms
TZ=GB
ORAENV_ASK=N

Changed the LD_LIBRARY_PATH to point to the correct home and tried connecting to the database and the issue was fixed.

ukblx247_$ export LD_LIBRARY_PATH=/data/d051/app/oracle/product/10.2.0/db_1/lib:/usr/lib

ukblx247_$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 – Production on Thu Apr 28 08:08:40 2016

Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 – 64bit Production
With the Partitioning, Data Mining and Real Application Testing options

 

 

 

ORA-01031: insufficient privileges

Issue:

On a windows Server 2008 R2 machine hosting a 11.2.0.3 database when I try to connect to the database as sys with sysdba privs I get the error ‘ORA-01031: insufficient privileges’. Whereas when I provide the sys user password it connects successfully as shown below.

 

ORACLE_SID has been set to ‘DEVC’
ORACLE_HOME has been set to ‘d:\app\oracle\product\11.2.0\dbhome_2’

C:\Users\Deepa.Raja>sqlplus /  as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Thu Jun 18 01:38:08 2015

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

ERROR:
ORA-01031: insufficient privileges

 

C:\Users\Deepa.Raja>sqlplus  "sys/test as sysdba"

SQL*Plus: Release 11.2.0.3.0 Production on Thu Jun 18 01:38:45 2015

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>

 

Reason:

When we do sqlplus / as sysdba from the database server, we are making a local connection to the database instance.

To make local connections to an Oracle instance a user must either:

a) Supply a password
b) Be a user belonging to a special ‘dba’ group.

The ‘DBA’ group is chosen at installation time and is usually the group ‘dba’ by default. In our case it was ora_dba

Fix:

Open control panel and change the group for your user as shown below in the screen shots.

image

 

image

 

image

 

image

 

image

 

 

C:\Users\Deepa.Raja>sqlplus /  as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Thu Jun 18 01:38:45 2015

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>

Recovering lost spfile

Issue:

Today our rman backup for a database failed with the following error message

time: 00:13:56 channel ORA_DISK_1: starting compressed incremental level 1 datafile backup set channel ORA_DISK_1: specifying datafile(s) in backup set including current control file in backup set

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03009: failure of backup command on ORA_DISK_1 channel at 06/12/2015 20:14:06

ORA-01565: error in identifying file ‘/oracle/app/oracle/product/11.2.0.4/db_1/dbs/spfilePKRONOS1.ora’

ORA-27037: unable to obtain file status

Linux-x86_64 Error: 2: No such file or directory Additional information: 3

 

Investigation:

 

Checked whether the database is using spfile and identify the location of the same. But at OS level this file is missing. This was the the reason for the backup failure

SQL> show parameter spfile

NAME                                 TYPE        VALUE

———————————— ———– ——————————

spfile                               string      /oracle/app/oracle/product/11.2.0.4/db_1/dbs/spfilePKRONOS1.ora

SQL> exit

 

[oracle@t1rdkrn701 ~]$ ls /oracle/app/oracle/product/11.2.0.4/db_1/dbs/spfilePKRONOS1.ora

ls: /oracle/app/oracle/product/11.2.0.4/db_1/dbs/spfilePKRONOS1.ora: No such file or directory

 

The instance is up and we are able to query the parameter values from the memory

 

SQL> show parameter buffer

NAME                                 TYPE        VALUE
———————————— ———– ——————————
buffer_pool_keep                     string
buffer_pool_recycle                  string
db_block_buffers                     integer     0
log_buffer                           integer     20922368
use_indirect_data_buffers            boolean     FALSE
SQL> show parameter sga

NAME                                 TYPE        VALUE
———————————— ———– ——————————
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 16G
sga_target                           big integer 0

 

Resolution

With oracle 11g we can create a spfile based on the data in memory but it wouldnt allow you to create a spfile with the name of the spfile currently used by the database. So create a backup spfile from memory value and restart the database with the new spfile.

[oracle@t1rdkrn701 dbs]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Sat Jun 13 03:16:51 2015

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 – 64bit Production

SQL> create spfile =’/oracle/app/oracle/product/11.2.0.4/db_1/dbs/spfilePKRONOS1.ora’ from memory;
create spfile =’/oracle/app/oracle/product/11.2.0.4/db_1/dbs/spfilePKRONOS1.ora’ from memory
*
ERROR at line 1:
ORA-32002: cannot create SPFILE already being used by the instance

 

SQL> create spfile =’/oracle/app/oracle/product/11.2.0.4/db_1/dbs/spfilePKRONOS1.ora.bkp’ from memory;

File created.

 

shutdown the instance

mv ‘/oracle/app/oracle/product/11.2.0.4/db_1/dbs/spfilePKRONOS1.ora.bkp’  ‘/oracle/app/oracle/product/11.2.0.4/db_1/dbs/spfilePKRONOS1.ora’

startup

Viewing alert log for oracle database on windows

Oracle DBAs used to working in linux environment often hate to manage oracle databases running on windows. As a novice to oracle on windows, I find the following difficult in Windows compared to linux

  • listing directories and sub directories along with their size (du – sh in linux)
  • killing sessions (kill –9 in linux)
  • viewing alert log and waiting for latest updates (linux less command with shift f)

There are lot more to add to this list…But let me stop here and I would like to share a  solution for the last issue. I always had trouble reading alert log on windows environment. In cases where there was no log rotation, the files were too big and took a long time to load in notepad. Also when we are looking for latest updates we need to close the notepad and open it again to view the latest alert log entries. This makes it even worse. I found a better solution  with oracle 11g.

Oracle introduced Automatic diagnostic repository (ADR) in 11g, the file based repository for diagnostic data. We can use the command line interface available for ADR called adrci to view the alert logs.

 

Connect to adrci and set the homepath

 

F:\BIAS\scripts>adrci

ADRCI: Release 11.2.0.3.0 – Production on Wed Jun 10 00:56:06 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

ADR base = "C:\APP\ORACLE"
adrci> show homes
ADR Homes:
diag\clients\user_amabesadmin\host_343795160_76
diag\clients\user_anshal.nath\host_343795160_76
diag\clients\user_arun.kumar\host_343795160_76
diag\clients\user_mansoor.naeem\host_343795160_76
diag\clients\user_SYSTEM\host_343795160_76
diag\clients\user_UNKNOWN\host_343795160_76

show alert

This opens the alert log file in a notepad

adrci> show  alert

ADR Home = C:\APP\ORACLE\diag\rdbms\gmcfpr01\prod:
*************************************************************************
Output the results to file: C:\Users\DEEPA~1.RAJ\AppData\Local\Temp\5\alert_7428_6896_prod_1.ado

 

show alert –tail

This shows the last 10 entries from the alert log file.

adrci> show alert -tail
2015-06-09 23:56:50.687000 -04:00
LNS: Standby redo logfile selected for thread 1 sequence 35262 for destination LOG_ARCHIVE_DEST_2
2015-06-09 23:57:03.794000 -04:00
Archived Log entry 88814 added for thread 1 sequence 35261 ID 0xd8ac4e2b dest 1:
Archived Log entry 88815 added for thread 1 sequence 35261 ID 0xd8ac4e2b dest 4:
2015-06-10 00:11:24.650000 -04:00
Thread 1 cannot allocate new log, sequence 35263
Private strand flush not complete
  Current log# 3 seq# 35262 mem# 0: F:\APP\ORACLE\ORADATA\PROD\REDO03.LOG
2015-06-10 00:11:27.742000 -04:00
Thread 1 advanced to log sequence 35263 (LGWR switch)
  Current log# 4 seq# 35263 mem# 0: F:\APP\ORACLE\ORADATA\PROD\REDO04.LOG
LNS: Standby redo logfile selected for thread 1 sequence 35263 for destination LOG_ARCHIVE_DEST_2
2015-06-10 00:11:45.752000 -04:00
Archived Log entry 88817 added for thread 1 sequence 35262 ID 0xd8ac4e2b dest 1:
Archived Log entry 88818 added for thread 1 sequence 35262 ID 0xd8ac4e2b dest 4:
2015-06-10 00:20:00.222000 -04:00
Errors in file C:\APP\ORACLE\diag\rdbms\gmcfpr01\prod\trace\prod_j001_17644.trc:
ORA-12012: error on auto execute of job "SYS"."LIGHTWEIGHT_APP"
ORA-06576: not a valid function or procedure name

 

show alert –tail 15

Shows the last 15 entries from the alert log

adrci> show alert -tail  15
2015-06-09 22:01:08.798000 -04:00
Archived Log entry 88811 added for thread 1 sequence 35260 ID 0xd8ac4e2b dest 1:
Archived Log entry 88812 added for thread 1 sequence 35260 ID 0xd8ac4e2b dest 4:
2015-06-09 22:20:00.097000 -04:00
Errors in file C:\APP\ORACLE\diag\rdbms\gmcfpr01\prod\trace\prod_j001_15832.trc:
ORA-12012: error on auto execute of job "SYS"."LIGHTWEIGHT_APP"
ORA-06576: not a valid function or procedure name
2015-06-09 22:59:07.000000 -04:00
Warning: VKTM detected a time drift.
Time drifts can result in an unexpected behavior such as time-outs. Please check trace file for more details.
2015-06-09 23:00:13.805000 -04:00
End automatic SQL Tuning Advisor run for special tuning task  "SYS_AUTO_SQL_TUNING_TASK"
2015-06-09 23:20:00.220000 -04:00
Errors in file C:\APP\ORACLE\diag\rdbms\gmcfpr01\prod\trace\prod_j001_13464.trc:
ORA-12012: error on auto execute of job "SYS"."LIGHTWEIGHT_APP"
ORA-06576: not a valid function or procedure name
2015-06-09 23:56:47.402000 -04:00
Thread 1 cannot allocate new log, sequence 35262
Private strand flush not complete
  Current log# 7 seq# 35261 mem# 0: F:\APP\ORACLE\ORADATA\PROD\REDO07.LOG
2015-06-09 23:56:50.484000 -04:00
Thread 1 advanced to log sequence 35262 (LGWR switch)
  Current log# 3 seq# 35262 mem# 0: F:\APP\ORACLE\ORADATA\PROD\REDO03.LOG
LNS: Standby redo logfile selected for thread 1 sequence 35262 for destination LOG_ARCHIVE_DEST_2
2015-06-09 23:57:03.794000 -04:00
Archived Log entry 88814 added for thread 1 sequence 35261 ID 0xd8ac4e2b dest 1:
Archived Log entry 88815 added for thread 1 sequence 35261 ID 0xd8ac4e2b dest 4:
2015-06-10 00:11:24.650000 -04:00
Thread 1 cannot allocate new log, sequence 35263
Private strand flush not complete
  Current log# 3 seq# 35262 mem# 0: F:\APP\ORACLE\ORADATA\PROD\REDO03.LOG
2015-06-10 00:11:27.742000 -04:00
Thread 1 advanced to log sequence 35263 (LGWR switch)
  Current log# 4 seq# 35263 mem# 0: F:\APP\ORACLE\ORADATA\PROD\REDO04.LOG
LNS: Standby redo logfile selected for thread 1 sequence 35263 for destination LOG_ARCHIVE_DEST_2
2015-06-10 00:11:45.752000 -04:00
Archived Log entry 88817 added for thread 1 sequence 35262 ID 0xd8ac4e2b dest 1:
Archived Log entry 88818 added for thread 1 sequence 35262 ID 0xd8ac4e2b dest 4:
2015-06-10 00:20:00.222000 -04:00
Errors in file C:\APP\ORACLE\diag\rdbms\gmcfpr01\prod\trace\prod_j001_17644.trc:
ORA-12012: error on auto execute of job "SYS"."LIGHTWEIGHT_APP"
ORA-06576: not a valid function or procedure name

show alert –tail –f

This is unix equivalent of less with a shift f. This basically shows the last 10 entries and waits for the latest updates

adrci> show alert -tail -f
2015-06-09 23:56:50.687000 -04:00
LNS: Standby redo logfile selected for thread 1 sequence 35262 for destination LOG_ARCHIVE_DEST_2
2015-06-09 23:57:03.794000 -04:00
Archived Log entry 88814 added for thread 1 sequence 35261 ID 0xd8ac4e2b dest 1:
Archived Log entry 88815 added for thread 1 sequence 35261 ID 0xd8ac4e2b dest 4:
2015-06-10 00:11:24.650000 -04:00
Thread 1 cannot allocate new log, sequence 35263
Private strand flush not complete
  Current log# 3 seq# 35262 mem# 0: F:\APP\ORACLE\ORADATA\PROD\REDO03.LOG
2015-06-10 00:11:27.742000 -04:00
Thread 1 advanced to log sequence 35263 (LGWR switch)
  Current log# 4 seq# 35263 mem# 0: F:\APP\ORACLE\ORADATA\PROD\REDO04.LOG
LNS: Standby redo logfile selected for thread 1 sequence 35263 for destination LOG_ARCHIVE_DEST_2
2015-06-10 00:11:45.752000 -04:00
Archived Log entry 88817 added for thread 1 sequence 35262 ID 0xd8ac4e2b dest 1:
Archived Log entry 88818 added for thread 1 sequence 35262 ID 0xd8ac4e2b dest 4:
2015-06-10 00:20:00.222000 -04:00
Errors in file C:\APP\ORACLE\diag\rdbms\gmcfpr01\prod\trace\prod_j001_17644.trc:
ORA-12012: error on auto execute of job "SYS"."LIGHTWEIGHT_APP"
ORA-06576: not a valid function or procedure name
2015-06-10 01:20:00.214000 -04:00
Errors in file C:\APP\ORACLE\diag\rdbms\gmcfpr01\prod\trace\prod_j001_15528.trc:
ORA-12012: error on auto execute of job "SYS"."LIGHTWEIGHT_APP"
ORA-06576: not a valid function or procedure name

 

SHOW ALERT -P "MESSAGE_TEXT LIKE ‘%ORA-%’”

view_alert_log

 

SHOW ALERT -P "MESSAGE_TEXT LIKE ‘%ORA-%’ AND originating_timestamp > systimestamp-1"

view_alert_log3

 

SHOW ALERT -P "ORIGINATING_TIMESTAMP >= ‘2015-06-08 00:00:00.000000 -04:00’ AND originating_timestamp < ‘2015-06-09 02:00:00.000000 -04:00’"

view_alert_log1

 

Also ADR data can be viewed from support work bench in OEM. So we can view alert from OEM as well.

view_alert_log2

 

To view the xml log contents click on the go button as shown in the pic.

view_alert_log4

And the output looks like this.

view_alert_log5

 

To view the text base alert log file click on ‘Switch to text alert log contents’ hyperlink.

 

view_alert_log8

 

Choose the time period for viewing the alert log.

view_alert_log6

 

Click on go to fetch results as shown below.

view_alert_log7

GoldenGate dirdat directory filling up

Issue:

Today through manual monitoring we found that the /GG volume was 100% full on a omnlink server

[oracle@AZPRODORA01 ~ JPROD1]$df -h

Filesystem            Size  Used Avail Use% Mounted on

/dev/mapper/VolGroup01-LogVol00

                       67G   46G   18G  73% /

/dev/cciss/c0d0p1      99M   53M   41M  57% /boot

tmpfs                  48G  236M   48G   1% /dev/shm

/dev/mapper/Oracle_GG

                      139G  132G  9.9M 100% /GG

Investigation:

What is eating up the space and why

[oracle@AZPRODORA01 GG JPROD1]$du -h –max-depth=2

16K     ./CPROD/dirpcs

4.0K    ./CPROD/dirtmp

20M     ./CPROD/dirjar

8.0K    ./CPROD/dirwlt

901M    ./CPROD/dirdat

136K    ./CPROD/cfg

4.0K    ./CPROD/dirdef

61M     ./CPROD/dirbdb

48K     ./CPROD/dirprm

452K    ./CPROD/UserExitExamples

32K     ./CPROD/dirsql

4.0K    ./CPROD/dirout

228K    ./CPROD/dirchk

7.7M    ./CPROD/dirrpt

2.1G    ./CPROD

16K     ./JPROD/dirpcs

4.0K    ./JPROD/dirtmp

20M     ./JPROD/dirjar

8.0K    ./JPROD/dirwlt

123G    ./JPROD/dirdat

164K    ./JPROD/cfg

4.0K    ./JPROD/dirdef

112M    ./JPROD/dirbdb

52K     ./JPROD/dirprm

452K    ./JPROD/UserExitExamples

32K     ./JPROD/dirsql

4.0K    ./JPROD/dirout

80K     ./JPROD/dirchk

6.3M    ./JPROD/dirrpt

126G    ./JPROD

20K     ./VPROD/dirpcs

4.0K    ./VPROD/dirtmp

20M     ./VPROD/dirjar

8.0K    ./VPROD/dirwlt

1.3G    ./VPROD/dirdat

160K    ./VPROD/cfg

4.0K    ./VPROD/dirdef

86M     ./VPROD/dirbdb

52K     ./VPROD/dirprm

452K    ./VPROD/UserExitExamples

32K     ./VPROD/dirsql

4.0K    ./VPROD/dirout

12K     ./VPROD/BR

244K    ./VPROD/dirchk

8.8M    ./VPROD/dirrpt

3.6G    ./VPROD

132G    .

[oracle@AZPRODORA01 dirdat JPROD1]$ls -lrth

total 115G

-rw-r—– 1 oracle oinstall 477M May 30 11:29 1r034254

-rw-r—– 1 oracle oinstall 477M May 30 11:50 1r034255

-rw-r—– 1 oracle oinstall 477M May 30 12:17 1r034256

-rw-r—– 1 oracle oinstall 477M May 30 12:46 1r034257

-rw-r—– 1 oracle oinstall 477M May 30 13:13 1r034258

-rw-r—– 1 oracle oinstall 477M May 30 13:41 1r034259

-rw-r—– 1 oracle oinstall 477M May 30 14:09 1r034260

-rw-r—– 1 oracle oinstall 477M May 30 14:36 1r034261

-rw-r—– 1 oracle oinstall 477M May 30 15:04 1r034262

-rw-r—– 1 oracle oinstall 477M May 30 15:33 1r034263

-rw-r—– 1 oracle oinstall 477M May 30 16:03 1r034264

dirdat is 123G in size and it is the folder in which extract files are stored. On normal functioning, a replicat picks these files (equivalent to redolog files in oracle) and applies it to the target database. Once applied, purges them. So probably a replicat is not running or stuck and hence not applying the extract files. So checked status by connecting to ggsci

[oracle@AZPRODORA01 JPROD JPROD1]$ggsci

Oracle GoldenGate Command Interpreter for Oracle

Version 11.2.1.0.7 16934304 OGGCORE_11.2.1.0.7_PLATFORMS_130709.1600.1_FBO

Linux, x64, 64bit (optimized), Oracle 11g on Jul 18 2013 07:04:28

Copyright (C) 1995, 2013, Oracle and/or its affiliates. All rights reserved.

GGSCI (AZPRODORA01) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     STOPPED

JAGENT      STOPPED

EXTRACT     STOPPED     EJPROD      00:00:00      25236:51:45

EXTRACT     STOPPED     PJPRODHS    00:00:00      25236:51:41

EXTRACT     STOPPED     PJPRODP     00:00:00      10739:09:31

REPLICAT    STOPPED     RJPRODHS    00:00:00      490:41:32

REPLICAT    RUNNING     RJPRODP     132:23:52     00:00:00

Resolution:

This showed that the replicat RJPRODP is lagging by 132 hours which is going back to 6 days which corresponds to the oldest file in the dirdat directory. so stopped replicat RJPRODP and started the manager first. Then restarted the replicat.

GGSCI (AZPRODORA01) 4> stop replicat RJPRODP

Sending STOP request to REPLICAT RJPRODP …

STOP request pending end-of-transaction (865 records so far)..

GGSCI (AZPRODORA01) 7> start manager

Manager started.

GGSCI (AZPRODORA01) 8>

GGSCI (AZPRODORA01) 8> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING

JAGENT      STOPPED

EXTRACT     STOPPED     EJPROD      00:00:00      25237:09:00

EXTRACT     STOPPED     PJPRODHS    00:00:00      25237:08:56

EXTRACT     STOPPED     PJPRODP     00:00:00      10739:26:46

REPLICAT    STOPPED     RJPRODHS    00:00:00      490:58:46

REPLICAT    STOPPED     RJPRODP     132:25:00     00:01:25

Sending START request to MANAGER …

REPLICAT RJPRODP starting

GGSCI (AZPRODORA01) 10> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING

JAGENT      STOPPED

EXTRACT     STOPPED     EJPROD      00:00:00      25237:09:19

EXTRACT     STOPPED     PJPRODHS    00:00:00      25237:09:15

EXTRACT     STOPPED     PJPRODP     00:00:00      10739:27:05

REPLICAT    STOPPED     RJPRODHS    00:00:00      490:59:06

REPLICAT    RUNNING     RJPRODP     132:26:40     00:00:00

Monitoring:

Now monitor  the ggserr.log (/GG/JPROD/ggserr.log) and if you see entries like this then it means that the replicat is working through the extract files.

2015-06-05 00:11:05  INFO    OGG-00957  Oracle GoldenGate Manager for Oracle, mgr.prm:  Purged old extract file /GG/JPROD/dirdat/1r034255, applying UseCheckPoints purge rule: Oldest Chkpt Seqno 34256 > 34255.

Also we can monitor the progress in ggsci as shown below.  The output show the current extract file that the replicat is working on along wit the RBA that it currently on. As long as the RBA is ticking (like SCN for oracle) the replicat is working and processing records.

GGSCI (AZPRODORA01) 2> info replicat RJPRODP detail

REPLICAT   RJPRODP   Last Started 2015-06-04 23:43   Status RUNNING

Checkpoint Lag       132:06:54 (updated 00:00:00 ago)

Log Read Checkpoint  File ./dirdat/1r034256

                     2015-05-30 12:13:13.257148  RBA 415340900

  Extract Source                          Begin             End

  ./dirdat/1r034256                       2015-05-30 11:16  2015-05-30 12:13

  ./dirdat/1r034253                       2015-05-30 01:09  2015-05-30 11:16

  ./dirdat/1r034233                       2015-05-15 11:43  2015-05-30 01:09

  ./dirdat/1r033552                       2015-05-15 11:43  2015-05-15 11:43

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-12 11:44  2015-05-15 11:43

  ./dirdat/1r033419                       2015-05-06 04:52  2015-05-12 11:44

  ./dirdat/1r033137                       2015-04-14 09:19  2015-05-06 04:52

  ./dirdat/1r032227                       * Initialized *   2015-04-14 09:19

  ./dirdat/1r031329                       * Initialized *   First Record

  ./dirdat/1r031329                       * Initialized *   First Record

  ./dirdat/1r000000                       * Initialized *   First Record

Current directory    /GG/JPROD

Report file          /GG/JPROD/dirrpt/RJPRODP.rpt

Parameter file       /GG/JPROD/dirprm/rjprodp.prm

Checkpoint file      /GG/JPROD/dirchk/RJPRODP.cpr

Checkpoint table     SYS.GGS_CHECKPOINT

Process file         /GG/JPROD/dirpcs/RJPRODP.pcr

Stdout file          /GG/JPROD/dirout/RJPRODP.out

Error log            /GG/JPROD/ggserr.log

GGSCI (AZPRODORA01) 3> info replicat RJPRODP detail

REPLICAT   RJPRODP   Last Started 2015-06-04 23:43   Status RUNNING

Checkpoint Lag       132:05:00 (updated 00:00:00 ago)

Log Read Checkpoint  File ./dirdat/1r034256

                     2015-05-30 12:16:21.107452  RBA 470300944

  Extract Source                          Begin             End

  ./dirdat/1r034256                       2015-05-30 11:16  2015-05-30 12:16

  ./dirdat/1r034253                       2015-05-30 01:09  2015-05-30 11:16

  ./dirdat/1r034233                       2015-05-15 11:43  2015-05-30 01:09

  ./dirdat/1r033552                       2015-05-15 11:43  2015-05-15 11:43

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-15 11:43  First Record

  ./dirdat/1r033552                       2015-05-12 11:44  2015-05-15 11:43

  ./dirdat/1r033419                       2015-05-06 04:52  2015-05-12 11:44

  ./dirdat/1r033137                       2015-04-14 09:19  2015-05-06 04:52

  ./dirdat/1r032227                       * Initialized *   2015-04-14 09:19

  ./dirdat/1r031329                       * Initialized *   First Record

  ./dirdat/1r031329                       * Initialized *   First Record

  ./dirdat/1r000000                       * Initialized *   First Record

Current directory    /GG/JPROD

Report file          /GG/JPROD/dirrpt/RJPRODP.rpt

Parameter file       /GG/JPROD/dirprm/rjprodp.prm

Checkpoint file      /GG/JPROD/dirchk/RJPRODP.cpr

Checkpoint table     SYS.GGS_CHECKPOINT

Process file         /GG/JPROD/dirpcs/RJPRODP.pcr

Stdout file          /GG/JPROD/dirout/RJPRODP.out

Error log            /GG/JPROD/ggserr.log

ORACLE DBA interview questions

Core DATABASE

compatible parameter

Size of log buffer related to redo log size

Explain logfile switch and logfile sync wait

What happens when you issue an update statement in an oracle database

Difference between an mview and view

How to find the number of cpus in a server to decide on parallelism

Is there an option to by pass inventory and apply patch

 

Export/Import

Can you use expdp against and Read Only database

What happens if you change the name of the master table in the database for expdp

Why is expdp faster than exp?

Expdp is carried out on server side and uses direct path api(streams)/external tables api.

How to speed up exp

how to import without affecting already existing data in a table

What are the precautions to take before refreshing a schema

How to investigate a stuck expdp job

 

Tuning

What SQL tuning have you done

What is a profile?

 

Indexes

How to prevent an SQL from using an index

What is the criteria for rebuilding index

Increasing processes what shared memory do u need to check

how to fix an index corruption

what is index skip scan

What is multi block read count

What are the considerations to be taken before adding a new database to a oracle server that has an ASM instance serving multiple databases.

 

Recovery

What do we need to do if  there is a corruption in an online redo log of a database running in no archive log mode

What are the steps involved in creating an RMAN clone

 

RAC

Rac scan ip how it decides which node

Ctsctl stop CRS stops db in what mode

How to load balance in RAC both on client side and service side

What are the common causes of node eviction?

What is the difference between a 11.1 and 11.2 RAC

 

OEM

OEM Sysman user password change

How do you add targets to OEM

 

Dataguard

Primary adding a tablespace of 60G. But at the standby we get the error that there is not enough space. How to resolve the issue.

how to improve the performance of an redo apply

What are the different protection levels in oracle dataguard.

A standby is behind the primary by about 4 days. What is the quickest way to recover the physical standby database and bring it up to date.

What are the parameters relevant to dataguard

 

Troubleshooting

Unable to connect to a database instance. How to investigate it

Instance crashed and no information in the alert log. How to investigate this.

 

12C New features

Shutting down/Starting physical standby database

Shutdown Standby 

Before shutting down a physical standby database check the open mode for the standby so that we can put the standby back in the same mode after the bounce.

 

SYS@NCAPSBY3> select database_role, open_mode from v$database; 

 

DATABASE_ROLE    OPEN_MODE 

---------------- -------------------- 

PHYSICAL STANDBY READ ONLY WITH APPLY

 

Above physical standby database open for read-only access while Redo Apply is active is called active dataguard with real time query on. A separate oracle license is required to use this feature.

 

Then If the primary database is up and running, defer the log destination corresponding to the standby to be shutdown on the primary database and perform a log switch

 

alter system set log_archive_dest_state_2=defer scope=both; 

 

alter system switch logfile;

   

Shutdown the standby in sqlplus or using srvctl in RAC. SHUTDOWN stops redo apply and shuts down the physical standby database.

 

Start up standby

The STARTUP statement starts the standby database, mounts the database as a physical standby database, and opens the database for read-only access when it is invoked without any arguments. The STARTUP MOUNT statement starts and mounts the database as a physical standby database, but does not open the database. Redo apply is not started automatically in either case.

 

Can verify if apply is running or not, based on the SQL below on a standby database

 

SELECT PROCESS, STATUS FROM V$MANAGED_STANDBY;

SYS@NCAPSBY3> SELECT PROCESS, STATUS FROM V$MANAGED_STANDBY; 

 

PROCESS   STATUS

--------- ------------

ARCH      CLOSING

RFS       IDLE

RFS       IDLE

RFS       IDLE

RFS       IDLE

MRP0      APPLYING_LOG 

 

6 rows selected.

 
We can also check the same from primary as below
 
SYS@NCAP11G3> set linesize 200

column dest_name format a50 trunc

select dest_id, dest_name, status, gap_status,recovery_mode from v$archive_dest_status where dest_id in (‘1′, ‘2’,’3′);

  SYS@NCAP11G3> SYS@NCAP11G3>

   DEST_ID DEST_NAME                                          STATUS    GAP_STATUS               RECOVERY_MODE

———- ————————————————– ——— ———————— ———————–

         1 LOG_ARCHIVE_DEST_1                                 VALID                              IDLE

         2 LOG_ARCHIVE_DEST_2                                 VALID     NO GAP                   MANAGED REAL TIME APPLY

         3 LOG_ARCHIVE_DEST_3                                 INACTIVE                           IDLE

 

You cannot open a standby database when the redo apply is on

 

SYS@NCAPSBY3> select database_role, open_mode from v$database; 

 

DATABASE_ROLE    OPEN_MODE 

---------------- -------------------- 

PHYSICAL STANDBY MOUNTED 

 

SYS@NCAPSBY3> alter database open; 

alter database open 

* 

ERROR at line 1: 

ORA-10456: cannot open standby database; media recovery session may be in progress 

 

SYS@NCAPSBY3> ALTER DATABASE OPEN READ ONLY; 

ALTER DATABASE OPEN READ ONLY 

* 

ERROR at line 1: 

ORA-10456: cannot open standby database; media recovery session may be in progress 

 

So cancel the redo apply and open the database

 

SYS@NCAPSBY3> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; 

 

Database altered. 

 

SYS@NCAPSBY3> ALTER DATABASE OPEN READ ONLY; 

 

Database altered. 

 

Now we can enable active dataguard by starting redo apply

 

SYS@NCAPSBY3> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION; 

 

Database altered. 

 

SYS@NCAPSBY3> select database_role, open_mode from v$database; 

 

DATABASE_ROLE    OPEN_MODE 

---------------- -------------------- 

PHYSICAL STANDBY READ ONLY WITH APPLY

Increase SGA for an oracle database in windows with a dataguard physical standby

Environment:

Oracle: 11.2.0.3 enterprise edition

OS: Windows 2008 R2 Standard

OEM: 12C

 

Task: To increase SGA on a production single instance database with a single instance physical standby database.

Research: Ensure there is enough room for the increase in SGA. In our case the

Total RAM on server: 12GB RAM.
Current SGA: 4GB
Current PGA : 5GB

Required SGA: 6GB

For running a windows 64 bit version, Microsoft recommends a RAM of 2GB for OS. We wanted to increase the SGA to 6GB. But this would take the total RAM requirement to 2GB (OS)+ 6GB (SGA)+5GB (PGA) = 13GB which is more than available 12GB. As there was a shortage, after analyzing the PGA requirement during the peak period for the database (memory advisors), we decided to decrease the PGA to 3GB and increase SGA to 6GB so that there would be enough RAM for OS.

1)Create Blackout in OEM 12C
Blackout both the primary and standby db instance (just primary would do, but didn’t want to get alerted by lag and hence included standby too)

2)Defer Sync: Connect to primary database and defer sync to standby

show parameter archive_dest
alter system set log_archive_dest_state_2=defer scope=both;

3)Check and amend memory parameters on primary

show parameter sga
show parameter pga
alter system set sga_target=6G scope=spfile
alter system set sga_max_size=6G scope=spfile
alter system set pga_aggregate_target=3G scope=spfile
shutdown immediate
start up

4)verify changes on primary

show parameter target
show parameter sga
show parameter pga

5)validate primary database after bounce
check listener listening to the db service
connect via tns entry and verify successful connection
check alert log

6)Re-enable sync to standby

alter system set log_archive_dest_state_2=enable scope=both;
show parameter archive_dest

 

7)verify sync
on primary
archive log list

On standby
archive log list

On Primary
alter system switch logfile
alter system switch logfile
archive log list

On standby
archive log list
Ensure the current log sequence is upto date

SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            G:\oracle\Archive\
Oldest online log sequence     34540
Next log sequence to archive   0
Current log sequence           34547

7)verify alert log on standby

In windows environment viewing alert log for latest updates is a pain as we have to close and reopen the alert log for the latest updates. If the alert log is huge then this is frustrating. ADRCI comes in handy in windows.

adrci> set homepath diag\rdbms\gmcfad01\gmcfad01
adrci> show alert –tail

 

7)Remove blackout in OEM

Time Synchronization in RAC

In RAC environment it is essential that all nodes are in sync in time. Also it recommended to have the time in sync between primary and standby. There are three options to ensure that the time across nodes are in sync

1)Windows Time Service

2)Network Time Protocol

3)Oracle cluster time synchronization service CTSS

 

If one of the first 2 is available on a RAC node then CTSS starts in observer mode

If neither found, CTSS then starts in active mode and synchronizes time across cluster nodes without an external server

 

image

 

To check whether NTP is running on a server use ps command

oracle:dev1:/home/oracle$ ps -ef | grep ntp
ntp       39013      1  0 Apr05 ?        00:03:34 ntpd -u ntp:ntp -p /var/run/ntpd.pid -x
oracle    93293  93226  0 12:54 pts/4    00:00:00 grep ntp

for RAC NTP has to run with –x option. This option means that the time corrections are done gradually in small changes and this is also called as slewing.

Is my database a RAC database?

To check whether a database is a RAC database from SQL Plus

show parameter cluster

image

 

To check the details of the  instances from SQL Plus run the following command

SQL> select * from v$active_instances;

INST_NUMBER
———–
INST_NAME
——————————————————————————————————————————————————————————————————–
          1
testdadb01.net:wqa1

          2
testdadb02.net:wqa2

 

Or use srvctl  command to check the database instance details.

oracle:wqa1:/home/oracle$ srvctl status database -d edwqa -v
Instance wqa1 is running on node testdadb01. Instance status: Open.
Instance wqa2 is running on node testdadb02. Instance status: Open.