Problem
Due to bug 20186278 in 11.2.0.4, Cluster Health Monitor (CHM) files in the Grid home can grow very large. The fix is available starting with the 11.2.0.4.7 (Jul 2015) Grid Infrastructure patch set update, or in 12.2. As a quick workaround, you can stop the ora.crf resource and remove the oversized files. The steps below show how to locate them.
Solution
First, find the CHM repository path:
[oracle@myserver bin]$ ./oclumon manage -get reppath
CHM Repository Path = /u01/app/11.2.0.4/grid/crf/db/db01db01
Done
Now list the files in that directory:
[oracle@myserver db01db01]$ ls -lhtr
total 19G
-rw-r--r-- 1 root root 89K May 29 2014 29-MAY-2014-17:48:14.txt
-rw-r--r-- 1 root root 1.3M May 29 2014 29-MAY-2014-20:02:00.txt
-rw-r--r-- 1 root root 2.1M May 31 2014 31-MAY-2014-11:52:44.txt
-rw-r--r-- 1 root root 1.2M May 31 2014 31-MAY-2014-11:58:47.txt
-rw-r--r-- 1 root root 2.1M Oct 14 2014 14-OCT-2014-16:15:09.txt
-rw-r--r-- 1 root root 1.3M Oct 14 2014 14-OCT-2014-16:29:55.txt
-rw-r--r-- 1 root root 1.9M Oct 23 2014 23-OCT-2014-15:49:22.txt
-rw-r--r-- 1 root root 1.3M Oct 23 2014 23-OCT-2014-16:26:58.txt
-rw-r--r-- 1 root root 1.8M Apr 7 16:30 07-APR-2015-16:30:33.txt
-rw-r--r-- 1 root root 1.2M Apr 7 16:38 07-APR-2015-16:38:42.txt
-rw-r----- 1 root root 8.0K May 29 16:58 repdhosts.bdb
-rw-r----- 1 root root 24K May 29 16:58 __db.001
-rw-r--r-- 1 root root 115M May 29 16:59 db01db01.ldb
-rw-r----- 1 root root 8.0K May 29 16:59 crfconn.bdb
-rw-r----- 1 root root 16M Sep 9 14:59 log.0000014662
-rw-r----- 1 root root 56K Sep 9 15:23 __db.006
-rw-r----- 1 root root 2.1M Sep 9 15:23 __db.004
-rw-r----- 1 root root 16M Sep 9 15:23 log.0000014663
-rw-r----- 1 root root 392K Sep 9 15:23 __db.002
-rw-r----- 1 root root 298M Sep 9 15:23 crfts.bdb
-rw-r----- 1 root root 460M Sep 9 15:23 crfloclts.bdb
-rw-r----- 1 root root 387M Sep 9 15:23 crfhosts.bdb
-rw-r----- 1 root root 424M Sep 9 15:23 crfcpu.bdb
-rw-r----- 1 root root 17G Sep 9 15:23 crfclust.bdb
-rw-r----- 1 root root 382M Sep 9 15:23 crfalert.bdb
-rw-r----- 1 root root 1.2M Sep 9 15:23 __db.005
-rw-r----- 1 root root 2.6M Sep 9 15:23 __db.003
As the listing above shows, crfclust.bdb has grown to 17G because of the bug. The file can be removed once ora.crf is stopped:
[oracle@myserver bin]$ ./crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'db01db01'
CRS-2677: Stop of 'ora.crf' on 'db01db01' succeeded
Now remove crfclust.bdb. This should be done as the root user, since root owns the file:
[root@myserver db01db01]# rm -f crfclust.bdb
Restart ora.crf:
[oracle@myserver bin]$ ./crsctl start res ora.crf -init
CRS-2672: Attempting to start 'ora.crf' on 'db01db01'
CRS-2676: Start of 'ora.crf' on 'db01db01' succeeded
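To spot an oversized repository file before the file system fills up, the manual check of the listing can be scripted. The following is only a minimal sketch: the function name list_large_bdb and the default threshold of 1024 MiB are assumptions for illustration, not part of any Oracle tooling. It prints every .bdb file in the given directory that is larger than the threshold.

```shell
# Hypothetical helper (not an Oracle utility): print the .bdb files in a
# directory whose size exceeds a threshold given in MiB.
#   $1 = directory to scan, e.g. the path reported by "oclumon manage -get reppath"
#   $2 = threshold in MiB (assumed default: 1024)
list_large_bdb() {
    dir=$1
    min_mb=${2:-1024}
    # -maxdepth 1 keeps the search in the repository directory itself;
    # -size +NM matches files larger than N MiB.
    find "$dir" -maxdepth 1 -type f -name '*.bdb' -size +"${min_mb}M"
}

# Example call against the path used in this post:
#   list_large_bdb /u01/app/11.2.0.4/grid/crf/db/db01db01 1024
```

Run against the directory listed earlier, a 1024 MiB threshold would flag only crfclust.bdb; the other .bdb files are all under 500M. Run it as a user that can read the directory.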
If you get file system full alerts on your Grid boxes and you run 11.2.0.4 binaries, make sure you check this file.