
Today's bug: a run-in with Bug 8870547

November 20th, 2011

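After an afternoon of meetings, I got home in the evening and was told that the standby machine for one of our databases had crashed and rebooted. That in itself was not the real problem. The real problem was that after the reboot, Data Guard (DG) ran into several rather painful issues in a row, so I am recording them here.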

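First, the physical standby could not clear its redo logs. A quick look showed that the redo log files did not exist. After I set the log_file_name_convert parameter, the clear succeeded.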
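The post does not show the commands used for this step. A minimal sketch of what it might look like follows, assuming hypothetical directory names (/u01/oradata/primary/ and /u01/oradata/standby/) and a hypothetical group number. As far as I know, log_file_name_convert is a static parameter in 10.2, so the sketch writes it to the spfile and assumes the standby is restarted before the clear.

```sql
-- On the standby: list the online redo log members the control file expects.
SELECT group#, member FROM v$logfile ORDER BY group#;

-- Map primary redo paths to standby paths (both paths are hypothetical).
-- Assumed static in 10.2, hence SCOPE = SPFILE plus a restart.
ALTER SYSTEM SET log_file_name_convert =
  '/u01/oradata/primary/', '/u01/oradata/standby/' SCOPE = SPFILE;

-- After restarting and mounting the standby, stop redo apply if it is
-- running, then clear each group (group 1 is only an example).
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE CLEAR LOGFILE GROUP 1;
```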
The next problem was that archived logs from the primary could not be shipped to the standby. The primary's log reported ORA-16047:

ARC0: Archivelog destination LOG_ARCHIVE_DEST_2 disabled: Data Guard configuration identifier mismatch
ORA-16047: DGID mismatch between destination setting and standby

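LOG_ARCHIVE_DEST_2 looked fine, but I noticed that log_archive_config was not set on either side. I had never paid attention to this parameter when building DG before, because DG can be set up without it. I tried setting log_archive_config on the primary only, and that did not help. Countless Baidu and Google results all said that db_unique_name was misconfigured. Eventually I found a post by someone on itpub: manually setting log_archive_config again fixes it (it must be set on both sides), and with that the archived logs shipped successfully. I then removed log_archive_config from both sides again, and DG still worked fine. Suspecting a bug, I searched Metalink and found Bug 8870547, which looks quite similar: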
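For reference, the fix described above might look roughly like the following. This is a sketch, not the exact commands from the itpub post: PRIM and STBY are hypothetical db_unique_name values, and destination 2 is taken from the error message above.

```sql
-- Run on BOTH primary and standby: confirm each side's unique name.
SELECT value FROM v$parameter WHERE name = 'db_unique_name';

-- Run on BOTH sides; PRIM and STBY are placeholders for the real names.
ALTER SYSTEM SET log_archive_config = 'DG_CONFIG=(PRIM,STBY)' SCOPE = BOTH;

-- On the primary: re-enable the disabled destination and force a log switch.
ALTER SYSTEM SET log_archive_dest_state_2 = ENABLE SCOPE = BOTH;
ALTER SYSTEM SWITCH LOGFILE;

-- On the primary: check whether the destination still reports an error.
SELECT dest_id, status, error FROM v$archive_dest WHERE dest_id = 2;
```

If the error column is empty and status is VALID after the switch, the destination is shipping again.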

Bug 8870547: ORA-16047 DGID MISMATCH BETWEEN PRIMARY AND STANDBY

Type: B - Defect
Fixed in Product Version: -
Severity: 2 - Severe Loss of Service
Product Version: 10.2.0.4
Status: 33 - Suspended, Req'd Info not Avail
Platform: 226 - Linux x86-64

Hdr: 8870547 10.2.0.4 RDBMS 10.2.0.4 DATAGUARD_TRAN PRODID-5 PORTID-226
Abstract: ORA-16047 DGID MISMATCH BETWEEN PRIMARY AND STANDBY

*** 09/03/09 05:37 am ***
TAR:
----
SR 7667127.994

PROBLEM:
--------
++ 9 Node RAC Primary , 9 Node Standby operating in Dataguard Max Performance
using LGWR ASYNC redo transport , without Broker.

++ Itermitentnly PRIMARY nodes (not same node) give ORA-16407 DGID mismatch
and remote archival status errors out

DIAGNOSTIC ANALYSIS:
--------------------
++ Checked LOG_ARCHIVE_CONFIG and DB_UNIQUE_NAME which is set correctly

++ Reset the LOG_ARCHIVE_CONFIG=NO_DGCONFIG and set it back (initially it was
set case sensitive) so we set it all UPPERCASE as db_unique_name

++ select * from gv$dataguard_config; shows same in PRIMARY and STANDBY

++ LOG_ARCHIVE_DEST_2 = SERVICE=GEARP_DR LGWR ASYNC=20480 NOAFFIRM REOPEN=30
NET_TIMEOUT=15 NOMAX_FAILURE OPTIONAL DB_UNIQUE_NAME=GEARP_DR

++ Enabled LAT=1569 (1024 + 1 + 32 + 512) on the failing instance of PRIMARY
and STANDBY apply instance , simulate the issue. & /bdump files from PRIMARY

WORKAROUND:
-----------
None. Just try disabling, enabling the dest_state_2 and do switch, sometime
it works mostly gives same error.

RELATED BUGS:
-------------
4636417

REPRODUCIBILITY:
----------------

TEST CASE:
----------

STACK TRACE:
------------

SUPPORTING INFORMATION:
-----------------------

24 HOUR CONTACT INFORMATION FOR P1 BUGS:
----------------------------------------

DIAL-IN INFORMATION:
--------------------

IMPACT DATE:
------------

*** 09/03/09 05:40 am ***
Files uploaded
1) DG diag scripts
2) PRIMARY.zip
3) Standby.zip
*** 09/07/09 04:01 am ***
*** 09/08/09 03:21 am *** (CHG: Sta->10)
*** 09/08/09 03:21 am ***
*** 09/13/09 11:04 pm ***
Feedback from customer is, From Mid of April onwards nothing was changed.
Customer gone thru storage migration on 23-Aug-2009.

Do you still need alert logs since april? pls let me know
*** 09/13/09 11:04 pm *** (CHG: Sta->16)
*** 09/14/09 12:57 am *** (CHG: Sta->10)
*** 09/14/09 12:57 am ***
*** 10/12/09 12:52 am *** (CHG: Sta->33)
*** 10/12/09 12:52 am ***
*** 10/13/09 08:03 pm ***
In the standby server remote_listener was not set correctly in some of the
nodes, that cause the connection from primary to standby and reverted back to
primary itself. After correcting the remote_listener the issue is fixed.

thanks.

However, Oracle never explicitly confirmed this request as a bug, since its status is "Suspended, Req'd Info not Avail".
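The last update in the bug report attributes the customer's case to remote_listener being set incorrectly on some standby nodes, so that connections from the primary ended up back on the primary. A rough way to check for that condition might be the queries below. They are my own sketch rather than anything from the bug text, and the first one assumes a RAC standby.

```sql
-- On the standby: compare listener settings across all instances.
SELECT inst_id, name, value
  FROM gv$parameter
 WHERE name IN ('local_listener', 'remote_listener')
 ORDER BY inst_id, name;

-- Connected through the same service that LOG_ARCHIVE_DEST_2 uses:
-- confirm the session really landed on the standby, not the primary.
SELECT db_unique_name, database_role FROM v$database;
```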
