HugePages not used when starting DB with srvctl (but works with sqlplus)


Once again I ended up with my client's database swapping. Why? After a quick investigation, I could see that HugePages were not used at the last restart of the database.

oracle@myvm1:./trace/ [oracle19] grep -B1 -A4 PAGESIZE alert*.log
2020-04-14T04:36:34.601494+02:00
  PAGESIZE  AVAILABLE_PAGES  EXPECTED_PAGES  ALLOCATED_PAGES  ERROR(s)
2020-04-14T04:36:34.601550+02:00
        4K       Configured              10              10        NONE
2020-04-14T04:36:34.601642+02:00
     2048K           247816            8193            8193        NONE
--
2020-10-13T22:59:28.856763+02:00
  PAGESIZE  AVAILABLE_PAGES  EXPECTED_PAGES  ALLOCATED_PAGES  ERROR(s)
2020-10-13T22:59:28.856818+02:00
        4K       Configured              10         4186122        NONE
2020-10-13T22:59:28.856925+02:00
     2048K           202479            8193              17        NONE
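
As a sanity check on the second startup, the 4K count can be reproduced from the huge pages that were not allocated. This is only a sketch: it assumes that one 2048K page equals 512 pages of 4K, and the function name small_pages_needed is made up for the illustration.

```shell
# Number of 4K pages needed when some of the expected 2048K pages are missing.
# $1 = expected huge pages, $2 = allocated huge pages, $3 = 4K pages expected anyway
small_pages_needed() {
  echo $(( ($1 - $2) * 512 + $3 ))
}

small_pages_needed 8193 17 10                        # prints 4186122, as in the alert log
echo "$(( (8193 - 17) * 2 )) MB not in HugePages"    # prints 16352 MB not in HugePages
```

So roughly 16 GB of SGA ended up in regular pages instead of HugePages.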

Why was that? I did use a normal start command:

oracle@myvm1:./trace/ [oracle19] srvctl start database -db mydb

Let’s set the context. This is an Oracle Restart server, with role separation between the oracle and grid users.

The database used HugePages in the past, so they are configured (I skip the check here, but I did it).
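
For reference, the check skipped here could look something like the following. It is a minimal sketch, not the exact commands I ran: it only reads the HugePages counters that the Linux kernel exposes in /proc/meminfo, and the helper name hugepages_summary is hypothetical.

```shell
# Print the HugePages counters from a meminfo-style file (default: /proc/meminfo).
hugepages_summary() {
  awk '/^HugePages_(Total|Free|Rsvd):/ || /^Hugepagesize:/ { print }' "${1:-/proc/meminfo}"
}

# Only run against the live system when /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
  hugepages_summary
fi
```

A non-zero HugePages_Total with enough free pages for the SGA means the pool itself is in place.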

I looked around and could see that the ulimits were correctly set for both users:

oracle@myvm1:./trace/ [oracle19] grep 'memlock' /etc/security/limits.d/99-grid-oracle-limits.conf
oracle soft memlock 1342177280
oracle hard memlock 1342177280
grid soft memlock 1342177280
grid hard memlock 1342177280

I see the file was changed after the previous DB restart. The server was not restarted in-between.

But in the alert log I see that the memlock limit seen by the database changed compared to the previous restart:

oracle@myvm1:./trace/ [oracle19] grep -h -B1 "Per proc" alert*.log
2020-04-14T04:36:34.301642+02:00
 Per process system memlock (soft) limit = 1280G
--
2020-10-13T22:59:28.756763+02:00
 Per process system memlock (soft) limit = 64K

I decided to stop the database and start it again, this time with sqlplus. And HugePages were used!

So, where could be the problem?

srvctl uses oraagent.bin to start databases, so it is important to check which limits this process is running with:

grid@myvm1:~/ [grid19] grep memory /proc/$(pgrep oraagent.bin)/limits
Max locked memory         65536                65536                bytes
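
The three outputs use different units, which makes them harder to compare at a glance: memlock in the limits file is expressed in KB, /proc/<pid>/limits reports bytes, and the alert log prints a rounded K or G value. A small sketch of the conversion (the helper names kb_to_bytes and kb_to_gb are only for illustration):

```shell
# memlock in limits.conf is expressed in KB; /proc/<pid>/limits reports bytes.
kb_to_bytes() { echo $(( $1 * 1024 )); }
kb_to_gb()    { echo $(( $1 / 1024 / 1024 )); }

kb_to_bytes 1342177280      # configured value in bytes: 1374389534720
kb_to_gb 1342177280         # configured value in GB: 1280
echo "$(( 65536 / 1024 ))K" # limit seen by oraagent.bin: 64K
```

So the 65536 bytes of oraagent.bin are exactly the 64K reported in the alert log, far from the configured 1280G.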

Here is the problem: oraagent.bin was started before the limits file was changed, and new limits only apply to new sessions.

When we open a new session and start a DB with sqlplus, the instance gets that session's limits. But if we use srvctl, it gets the limits that were in place when oraagent.bin started.
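
To see the difference side by side, something like this can be used. It is a sketch under a few assumptions: the helper memlock_soft is hypothetical, the loop is there in case more than one oraagent.bin is running, and the user running it must be allowed to read the limits file of those processes.

```shell
# Extract the soft "Max locked memory" value from a /proc/<pid>/limits-style file.
memlock_soft() {
  awk '/^Max locked memory/ { print $4 }' "$1"
}

# Soft memlock limit (in bytes) of every running oraagent.bin process.
for pid in $(pgrep oraagent.bin); do
  echo "oraagent.bin pid $pid: $(memlock_soft "/proc/$pid/limits") bytes"
done

# Limit of the current session, for comparison (ulimit -l reports KB).
echo "current shell: $(ulimit -l) KB"
```

If the two values differ, a database started with srvctl will not get the same memlock limit as one started from this session with sqlplus.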

We are on a production server, so restarting the whole clusterware stack would be quite annoying. What is nice is that we can simply “kill” the oraagent.bin process and it will respawn. Even better if we do it with ohasd.bin:

grid@myvm1:~/ [grid19] ps -ef | grep ohasd.bin
grid       2854      1  0 Sep02 ?        03:23:19 /u00/app/grid/product/19/bin/ohasd.bin reboot

grid@myvm1:~/ [grid19] kill -9 2854

grid@myvm1:~/ [grid19] ps -ef | grep ohasd.bin
grid     185684      1 17 14:22 ?        00:00:01 /u00/app/grid/product/19/bin/ohasd.bin restart

grid@myvm1:~/ [grid19] psg pmon
grid       6458      1  0 Sep02 ?        00:03:38 asm_pmon_+ASM
oracle     7069      1  0 Oct13 ?        00:05:18 ora_pmon_mydb

The ASM and database instances keep running, and the new limits are now correctly set for oraagent.bin:

grid@myvm1:~/ [grid19] grep memory /proc/$(pgrep oraagent.bin)/limits
Max locked memory         1374389534720        1374389534720        bytes

And further restarts with srvctl are using HugePages.

Lesson: check that the ulimits are correctly applied to oraagent.bin.

If you have a file with a list of servers separated by spaces or newlines, you can use a loop like this one to check all of them:

for server in $(grep -v '^#' server_list.txt); do
  echo "=== $server"
  ssh -o ConnectTimeout=1 -o ConnectionAttempts=1 -o BatchMode=yes "$server" \
    "echo '99-grid - ' \$(grep 'grid soft memlock' /etc/security/limits.d/99-grid-oracle-limits.conf); echo 'oraagent - ' \$(grep memory /proc/\$(pgrep oraagent.bin)/limits)"
done
