Since a few weeks there was an error during backups of one of my clients databases. They use Commvault to trigger the backups.
In the background, for some reason, Commvault asks the primary instance for some information about the free space in the tablespaces, using the following query:
SELECT 'D,' || c.NAME || ':' || t.NAME || ':' || c.dbId || ',' || d.STATUS || ',' || d.ENABLED || ',' || TO_CHAR(d.BYTES) || ',' || TO_CHAR(d.BYTES - NVL(ff.fbytes,0)) || ',' || TRIM(' ' FROM d.NAME) || ',' || TRIM(' ' FROM d.FILE#) AS PDB_TS_DF
FROM v$datafile d, v$tablespace t, v$CONTAINERS c, (SELECT f.CON_ID CON_ID, f.file_id file_id, SUM(f.bytes) fbytes FROM CDB_FREE_SPACE f GROUP BY f.file_id,f.CON_ID) ff
WHERE c.CON_ID = d.CON_ID and d.CON_ID = t.CON_ID and d.TS#=t.TS# AND ff.file_id (+)= d.FILE#
ORDER BY PDB_TS_DF;
The important part is this one:
SELECT f.CON_ID CON_ID, f.file_id file_id, SUM(f.bytes) fbytes
FROM CDB_FREE_SPACE f
GROUP BY f.file_id,f.CON_ID;
This query was taking quite some long time. This makes me suspicious about the Commvault error:
22452 57b4 12/19 04:57:58 992041 RacBrowser::DoBrowse() - Dataguard client with skip readonly/data backup on standby refresh primary instance for data file status
22452 57b4 12/19 04:57:58 992041 RacBrowser::BrowsePrimaryInstanceForSchema() - commandLineArgs = BROWSE -a 2:640 -jm 1099511627776
22452 57b4 12/19 05:00:44 992041 CCvNetwork::Disconnect() - Error [0x80070306:{CCvNetwork::ReceiveXMLMessage(4236)} + {CCvNetwork::ReceiveXDRMessagePrivate(6032)/W32.774.(One or more errors occurred while processing the request. (ERROR_ERRORS_ENCOUNTERED.774))-Unexpected message from RemoteHost=vm4375.myclient.com, RemoteProcess=cvd, Received Msg=150996226:CVS_PM_REMOTE_EXECUTE_RSP, Expected=150995458:CVS_PM_DISCONNECT_RSP. Connection string: vm4375/vm4375.myclient.com/SockIP(159.232.140.175)/159.232.140.175:cvd/8400/8400 cvd(24022:3e21b700)}] receiving disconnect response Object
22452 57b4 12/19 05:00:44 992041 RacBrowser::BrowsePrimaryInstanceForSchema() - Error [0x80070102:{CCvNetwork::ExecRemoteCmdAndWait(2825)} + {CCvNetwork::ExecRemoteCmdAndWait(2792)} + {CCvNetwork::ReceiveXDRMessageInternal(5514)} + {CCvNetwork::CheckIfDataAvailable(5824)/vm4375/vm4375.myclient.com/SockIP(159.232.140.175)/159.232.140.175:cvd/8400/8400 cvd(24022:3e21b700)} + {CCvNetwork::CheckDataWaiting(6498)} + {CSessionConnectionSocket::CheckDataWaiting(668)/W32.258.(The wait operation timed out. (WAIT_TIMEOUT.258))-Timed out}] returned from executeRemoteCommand [ClOraAgent] on RAC client 428
22452 57b4 12/19 05:00:44 992041 IDAError::SendMsg() - setting JPR for jobId 992041
22452 57b4 12/19 05:00:44 992041 JM Client CVJobClient::getRemoteServer(): Got token [992041:4:11:0:0:15608].
22452 57b4 12/19 05:00:44 992041 JM Client CVJobClient::getRemoteServer(): Got host [VM4435.myclient.com] from input.
22452 57b4 12/19 05:00:44 992041 JM Client CVJobClient::connectToJM(): Got remote host [VM4435.myclient.com].
22452 57b4 12/19 05:00:44 992041 SetSocketOptions() - Couldn't set TCP keep-alive options. Return code:[95=EOPNOTSUPP].
22452 57b4 12/19 05:00:44 992041 SetSocketOptions() - Couldn't set TCP keep-alive options. Return code:[95=EOPNOTSUPP].
22452 57b4 12/19 05:00:44 992041 BrowsePrimaryInstanceForSchema() - errMsg=[Failed to query data file list and enabled status from Primary instance to skip read only tablespaces for standby backup.]
22452 57b4 12/19 05:00:44 992041 BrowsePrimaryInstanceForSchema() - IsSetJPR = [TRUE]
22452 57b4 12/19 05:00:44 992041 BrowsePrimaryInstanceForSchema() - IsAlreadySent = [FALSE]
22452 57b4 12/19 05:00:44 992041 BrowsePrimaryInstanceForSchema() - Error with Stack[0xECCC065B:{RacBrowser::BrowsePrimaryInstanceForSchema(198)/ErrNo.1627.(Unknown error 1627)-Failed to query data file list and enabled status from Primary instance to skip read only tablespaces for standby backup.}]
22452 57b4 12/19 05:00:44 992041 ClOraAgent::Browse(1460) - Browse failed
22452 57b4 12/19 05:00:44 992041 ::Browse() - Browse failed:Failed to query data file list and enabled status from Primary instance to skip read only tablespaces for standby backup..
22452 57b4 12/19 05:00:44 992041 ::ExitHere() - enter: phase=6
When checking on Oracle support we see that “Queries on DBA_FREE_SPACE are Slow (Doc ID 271169.1)” is mostly due to the recycle bin.
I found out that the problem came only from the main PDB, where more than 200 thousand objects were in the recycle bin. Once I deleted them, the backup errors disappeared.
As I was not sure I could simple disable the recycle bin, I took the chance to apply a scheduler job which deletes from the Oracle recycle bin everything older than X days:
alter session set container=my_pdb;
create or replace procedure clean_recyclebin as
/* Purge recyclebin objects in this PDB older than 1 hour */
begin
for obj in (select owner,object_name from dba_recyclebin where type='TABLE' and to_date(droptime,'YYYY-MM-DD:HH24:MI:SS')<sysdate-1/24) loop
execute immediate 'purge table '||obj.owner||'."'||obj.object_name||'"';
end loop;
end;
/
begin
dbms_scheduler.create_job(job_name => 'PURGE_RECYCLEBIN_1_HOUR',
job_type => 'STORED_PROCEDURE',
job_action => 'clean_recyclebin',
start_date => systimestamp,
end_date => null,
repeat_interval => 'freq=hourly; byminute=0; bysecond=0;',
enabled => true,
auto_drop => false,
comments => 'To avoid Commvault problems, due many objects drop, it purges recyclebin oltder than 1 hour');
end;
/
select job_type,job_action,repeat_interval,enabled,state,next_run_date from dba_scheduler_jobs where job_name='PURGE_RECYCLEBIN_1_HOUR';
