bug


Oracle 12.1 – Find OCR Master Node

I’ve been dealing with a problem where, sometimes, a rebooted RAC node is unable to join back the cluster. The issue seems to be with the “Master Node”, which refuses to accept the node.

So I’ve to know which is the “Master Node” (the current known solution is to reboot it, and then all nodes join the cluster).

There is the Oracle note: How to Find OCR Master Node (Doc ID 1281982.1)

And there is this blog entry: 11G R2 RAC: How to identify the master node in RAC

In my case I’m using Oracle 12.1.0.2, the location of the files is a bit different. The location of the OCR Master Node can be found on this version using one of the following ways:

  • Check the crsd logs for “OCR MASTER”
grep "OCR MASTER" ${ORACLE_BASE}/diag/crs/`hostname`/crs/trace/crsd*

and, if the logs did not rotate too much yet, you should see one of the two below:

/u00/app/oracle/diag/crs/anjovm1/crs/trace/crsd_73.trc:2018-01-13 14:05:30.535186 :  OCRMAS:3085: th_master:13: I AM THE NEW OCR MASTER at incar 2. Node Number 1

/u00/app/oracle/diag/crs/anjovm2/crs/trace/crsd_71.trc:2018-01-13 14:05:32.823231 :  OCRMAS:3085: th_master: NEW OCR MASTER IS 1
  • Check the location of the OCR automatic backups

the cluster node currently keeping the backups, is the OCR master node. If you see older backups on other nodes, it was when they were OCR master nodes on its turn.

ls -l /u00/app/12.1.0.2/grid/cdata/<cluster_name>
-rw-r--r--    1 root     system      943266 Jan  14 00:01 backup00.ocr
-rw-r--r--    1 root     system      943266 Jan 13 20:01 backup01.ocr
-rw-r--r--    1 root     system      943266 Jan 13 16:01 backup02.ocr
-rw-r--r--    1 root     system      943266 Jan 13 00:00 day.ocr
-rw-r--r--    1 root     system      943266 Jan  14 00:01 day_.ocr
-rw-r--r--    1 root     system      943266  Dec 31 23:55 week.ocr
-rw-r--r--    1 root     system      943266 Jan 07 23:59 week_.ocr

Note: do not confuse the OCR master node with the Cluster Health Monitor repository master node, which you get using the command:

/u00/app/12.1.0.2/grid/bin/oclumon manage -get MASTER

 


Goldengate 12.2 – ERROR OGG-00041 Data source not specified.

While testing the new Goldengate 12.2 with a training material I’ve been doing, I got the error:

2016-03-29 15:36:00 ERROR OGG-00041 Data source not specified. 
2016-03-29 15:36:00 ERROR OGG-01668 PROCESS ABENDING.

This when configuring the most simple integrated extract.

Looking at the ggserr.log did not gave any other idea.

After opening a SR with Oracle, I decided to try the same configuration on Goldengate 12.1. And to my surprise the error was “clearer”:

2016-03-29 16:15:51 ERROR OGG-00303 Unrecognized parameter (EXTRACT e_hr).

Immediatly it come to my mind that the problem was related to my copy/paste from powerpoint (training material I’m doing) to the putty window. Doing ‘:set list’ in vi did not show any strange character, but after recreating the extract parameter file from scratch with exactly the same parameters, it worked like a charm.

Go back test on Goldengate 12.2 and no error either.

So, if you have OGG-00041 error, it can be just that Goldengate is not interpreting well the parameter file.

Meanwhile, another problem on Goldengate 12.2 is that it crashes when you delete an extract while connected to a database.


Oracle 12.1.0.2 – ORA-600 [QERHJDISABLEHJCBK: NOT A HASH JOIN] on UPDATE

Surprises happen when you go to the latest version of Oracle without deep testing.
My client just hit the bug:
Bug 20513930 : UPDATE FAILS WITH [QERHJDISABLEHJCBK: NOT A HASH JOIN]

Basically the client was getting a ORA-600 [QERHJDISABLEHJCBK: NOT A HASH JOIN] when doing a certain query. Looking at it, it was an UPDATE of a view. This view had a “instead of update” trigger, which was doing several actions. After enabling trace on the session, the bug does not happen!

Fortunately someone had already open the bug at Metalink. The workaround is to disable the new “groundbreaking feature” (dixit Oracle) called Adaptive Plans.

SQL> alter system set "_optimizer_adaptive_plans"=false;
System altered.

Hopefully there will be soon a patch for this problem.


Impossible to change virtual column

On my series of using datapump with remap_schema and virtual columns, I end up yesterday with a new case:

– impossible to change a virtual column after the referenced function does not exist anymore.

This bug is present on 11.2.0.3, corrected on 11.2.0.4.

Example:

SQL> grant connect,resource to u1 identified by u1;
Grant succeeded.

SQL> connect u1/u1
Connected.

SQL> create or replace function f1 (v1 number)
  return number DETERMINISTIC as
begin
  return v1;
end;
/  

Function created.

-- Create table with virtual column referencing the function

SQL> create table t1 (c1 number, c2 number as (f1(c1)));
Table created.

SQL> drop function f1;
Function dropped.

-- Try to change the virtual column. Impossible!
SQL> alter table t1 drop column c2;

alter table t1 drop column c2
*
ERROR at line 1:
ORA-00904: "U1"."F1": invalid identifier

-- Workaround is to recreate the function...

-- clean up
SQL> connect / as sysdba
Connected.

SQL> drop user u1 cascade;
User dropped.