Sunday, July 9, 2017

Registration of an 11.2 database fails against a 12.1 CRS stack

I encountered a problem while registering an 11gR2 (11.2.0.3) standby database against a 12.1 Grid Infrastructure (CRS) stack. The add database command and the failure it produced are shown below.

$ srvctl add database -d prod01 -o /prd/prod01/product/11.2.0/db_1  -p '+PROD01_DATA/prod01/spfileprod01.ora' -r PHYSICAL_STANDBY -s MOUNT -y AUTOMATIC -t IMMEDIATE -a 'PROD01_DATA,PROD01_FRA,PROD01_REDO'

PRCR-1006 : Failed to add resource ora.prod01.db for prod01
PRCD-1184 : Failed to upgrade configuration of database type to version 11.2.0.3.0
PRCR-1071 : Failed to register or update resource type ora.database.type
CRS-0245:  User doesn't have enough privilege to perform the operation


I was surprised to see this problem, but fortunately there is a well-documented MOS note on the issue with both a workaround and a patch that can be applied.

The workaround is to run the following commands after setting your environment to the 12.1 CRS home.

$ crsctl modify type ora.database.type -attr  "ATTRIBUTE=TYPE_VERSION,DEFAULT_VALUE=3.2" "-unsupported"
$ crsctl modify type ora.service.type -attr "ATTRIBUTE=TYPE_VERSION,DEFAULT_VALUE=2.2" "-unsupported"
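
After running the workaround from the 12.1 CRS home, it is worth sanity-checking the type change and then retrying the registration. This is only a sketch; the full attribute listing varies by version, so look for the TYPE_VERSION attribute in the output.

$ crsctl status type ora.database.type -f
$ # re-run the original srvctl add database command shown above, then confirm the registration:
$ srvctl config database -d prod01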


A one-off patch also exists, and the fix is included in the 11.2.0.3.4 GI PSU onwards.

Patch 13460353: REGISTRATION OF 11.2.0.3 DATABASE FAILS AGAINST 12.1 CRS STACK
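
If you want to check whether the one-off patch, or a bundle that contains the fix, is already present in a given home, opatch can be queried. This is only a sketch and assumes your environment is set to the home you want to inspect; older OPatch versions may not support the -bugs_fixed option.

$ $ORACLE_HOME/OPatch/opatch lsinventory -bugs_fixed | grep 13460353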

Wednesday, May 31, 2017

Oracle Data Guard Timezone Considerations

I got a question from a client about Oracle Data Guard timezone requirements: do the time zones on the primary and standby servers need to be the same? The first thing that came to mind is that, in my experience, it simply does not matter. For example, if your primary database server is in Chicago (Central time) and your standby database server is in New York (Eastern time), you should not have any issues with log shipping and replication.

I was talking to an Oracle consultant on the topic today, and he pointed out that the Oracle documentation actually does recommend that BOTH primary and standby database servers consider using the same time zone. This was something new I learned about Data Guard.

Check out the verbiage from the 12c documentation, the Data Guard Concepts and Administration guide:

https://docs.oracle.com/database/121/SBYDB/standby.htm#SBYDB4717

"Because some applications that perform updates involving time-based data cannot handle data entered from multiple time zones, consider setting the time zone for the primary and remote standby systems to be the same to ensure the chronological ordering of records is maintained after a role transition."

Based on this information, it may be a good idea to standardize on one time zone, such as GMT/UTC, for the database servers in a Data Guard configuration. Before adopting that standard, check whether your specific application can handle data entered from multiple time zones; if it cannot, use the same time zone on all database servers in the Data Guard configuration. Another workaround is to have the application record the time zone along with the data as it is written to the database; with that in place, the database servers in a Data Guard configuration can run in different time zones without presenting issues after a role transition.
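
If you want to compare what the primary and standby hosts are actually using, a quick, hedged check like the one below can be run on each server. The query is generic: DBTIMEZONE reports the database time zone setting, while SYSTIMESTAMP reflects the operating system time zone of the server.

$ date
$ sqlplus -S / as sysdba <<'EOF'
select dbtimezone, systimestamp from dual;
EOF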








Friday, March 17, 2017

How to backup the Oracle Supercluster M7 system?

Backing up the Oracle SuperCluster M7 system is a critical task that should be completed at system deployment. Having a good backup and recovery plan is a vital standard practice, and I highly recommend testing any backup/recovery plan on a regular basis to ensure it actually works in your environment.

Oracle Solaris provides a tool to back up and restore the critical configuration components of an Oracle SuperCluster. The osc-config-backup tool makes the process of backing up the SuperCluster component configurations simple. The backup data is stored on the internal ZFS storage appliance of the SuperCluster system.

The following is a link to the SuperCluster M7 osc-config-backup documentation, along with an Oracle Support note that has more details on the tool.

https://docs.oracle.com/cd/E58626_01/html/E69073/z400020e165586.html#scrolltoc

How To Back Up the Oracle SuperCluster Configuration with osc-config-backup (Doc ID 1934129.1)

Currently the osc-config-backup tool is supported on the Oracle SuperCluster M7 and Oracle SuperCluster T5-8.

The following SuperCluster components are backed up by osc-config-backup:
·       Domains – Configuration, ZFS pools (rpool, bpool if present, and u01-pool for Database Domains)
·       Oracle Solaris zones – Configuration, ZFS pools (rpool and u01-pool for Database Zones)
·       IB switches – Configuration
·       Ethernet management switch – Configuration
·       ZFS storage appliance – Configuration
·       SuperCluster – Configuration information (ssc-cu.zip file)
·       Service Processor – Configuration
·       Explorer – Data from each Database and Application Domain

Please note: data located outside the rpool and u01-pool zpools in zones and domains is not backed up by osc-config-backup.


You should still do regular file-system-level backups, as you would for any other system, to protect the other critical filesystems and their contents that osc-config-backup does not cover. Any file-level backup tool you already use should suffice; a simple, hedged example is shown below.
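
As an illustration only (the pool, dataset, and target paths are assumptions, not SuperCluster specifics), a simple ZFS-based file-level backup of a domain could look something like this:

# Create a recursive snapshot of a dataset and stream it to a backup location
# (example names only -- substitute your own datasets and backup target)
zfs snapshot -r rpool/export@cfgbkp_20170317
zfs send -R rpool/export@cfgbkp_20170317 | gzip > /backup/rpool_export_20170317.zfs.gz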

Finally, to recover from a backup taken with the tool, refer to the following Oracle Support note.

How To Recover SuperCluster Components that were Backed Up with osc-config-backup (Doc ID 1934130.1)

Thursday, February 23, 2017

Memory Script I/O Domain Solaris

The following script is useful for viewing memory information on a Solaris I/O Domain. I have used this script on an M7 SuperCluster deployment running Solaris 11. It shows the total available memory, the free memory, and the used memory.

#!/usr/bin/ksh

# Available memory
memory=`prtconf | grep Memory | head -1 | awk 'BEGIN {FS=" "} {print $3}'`
gb_memory=`echo "scale=2; $memory/1024" | bc -l`

# Free memory
pagesize=`pagesize`
kb_pagesize=`echo "scale=2; $pagesize/1024" | bc -l`
sar_freemem=`sar -r 1 1 | tail -1 | awk 'BEGIN {FS=" "} {print $2}'`
gb_freemem=`echo "scale=2; $kb_pagesize*$sar_freemem/1024/1024" | bc -l`

# Used Memory
gb_usedmem=`echo "scale=2; $gb_memory-$gb_freemem" | bc -l`

# Conclusion
echo "Avai Mem: $gb_memory GB"
echo "Free Mem: $gb_freemem GB"
echo "Used Mem: $gb_usedmem GB"


Sample Output below:
--------------------------


$ free.sh
Avail Mem: 96.00 GB
Free Mem: 42.02 GB
Used Mem: 53.98 GB
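
As a cross-check of the numbers above, the kernel's own memory accounting can be printed with mdb. This requires root and is shown here only as a hedged alternative, not as part of the script:

# echo ::memstat | mdb -k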






Sunday, December 4, 2016

Oracle Dataguard Deep Dive Presentation

Thank you to the Dallas OUG for letting me present recently on my "Oracle Dataguard Deep Dive" topic. I also covered 12c new features in this presentation. You may view my presentation at the following link.

http://tinyurl.com/hq5h7s3

Wednesday, September 14, 2016

Important Settings on Every Oracle Database Build!

Today I worked on a problem on an Oracle SuperCluster M7 system. The issue was that we had about 15 databases running from a single I/O DB Domain, which is basically a virtual machine (VM). The total allocated SGA/PGA memory across all of these 12c (12.1.0.2) databases summed to 60GB, and the whole VM has 192GB of memory assigned to it; however, when I checked the free memory, only about 3-4GB was available. This did not add up, since I had only allocated 60GB across all databases. Beyond the limited memory itself, we also saw consistent memory-related errors across the database alert logs.

ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 128 bytes of shared memory
ORA-00700: soft internal error, arguments: [kskvmstatact: excessive swapping observed]
ORA-27300: OS system dependent operation:fork failed with status: 12
ORA-04030: out of process memory when trying to allocate 552 bytes (kxs-heap-p,kxfqlobf)


Please keep in mind that this issue can also apply to other systems, such as Exadata, a physical server, or any VM running Oracle databases.

I started to dig and troubleshoot.

The prstat -t output below shows total memory usage by user on Solaris. Look at the RSS column: it clearly shows 144GB of memory being used by the oracle user. I was surprised to see this.

NPROC USERNAME SWAP RSS MEMORY TIME CPU 
85 root 2520M 2481M 1.3% 87:40:45 0.2% 
5462 oracle 183G 144G 75% 197:27:56 0.2% 
51 grid 17G 12G 6.4% 34:47:13 0.0% 
1 smmsp 4272K 10M 0.0% 0:00:06 0.0% 
7 daemon 9384K 12M 0.0% 0:00:54 0.0% 
1 netcfg 4344K 7240K 0.0% 0:00:24 0.0% 
3 netadm 46M 46M 0.0% 0:00:45 0.0% 

I then wanted a list of processes sorted by memory usage, along with CPU time, the process ID, and the command, sorted by the memory column from largest to smallest, as shown in the following command. The output showed that the largest processes were all oracle processes and that they were about the same size, based on the pmem column (which reports memory usage as a percentage).

# ps -eo pmem,pcpu,pid,comm | sort -k 1 -nr | head

I also wanted a total process count on the server, so I ran the following command; it returned about 6,000 processes, which is a lot.

# ps -ef| wc -l

And the plot thickens. I then thought about checking whether we had too many parallel execution processes running, and I ran the following command to verify this. It returned about 4,800 rows! Now we were on to something: roughly 80% of the processes running on the VM were parallel execution slaves, which is way too many.

# ps -eo pmem,pcpu,pid,comm | sort -k 1 -nr | grep -i _p0 | wc -l
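
To corroborate this from the database side, each instance's active parallel execution servers and the current limit can also be checked. This is a hedged sketch; set your environment to the instance you want to inspect first.

$ sqlplus -S / as sysdba <<'EOF'
show parameter parallel_max_servers
select count(*) as px_servers from v$px_process;
EOF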

The PARALLEL_MAX_SERVERS database init parameter in 12c is a culprit here and should be limited. If this parameter is not set, it takes a default value derived from CPU_COUNT, PARALLEL_THREADS_PER_CPU, and PGA_AGGREGATE_TARGET.

The CPU_COUNT database init parameter should also be limited. If the default is used, the database takes the total number of CPU threads reported by the OS. In my case we have 20 CPU cores assigned to the VM, and on the M7 SuperCluster each processor has 32 cores with 8 threads per core, for a total of 256 threads per processor. So the 20 cores assigned to the VM present 160 threads, and CPU_COUNT can be set accordingly based on workload needs.
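
On Solaris you can confirm how many hardware threads the domain actually presents, which is what CPU_COUNT defaults to. Treat this as a quick sanity check; the output format varies by platform.

# psrinfo | wc -l    # count of virtual CPUs (hardware threads) visible to the domain
# psrinfo -pv        # physical processor and core detail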

The PARALLEL_THREADS_PER_CPU parameter is normally OS dependent and usually defaults to 2; that setting is fine.
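
A hedged sketch of setting the two parameters on one database follows. The values are examples only and must be sized for your own core allocation and workload; scope=both assumes an spfile is in use.

$ sqlplus -S / as sysdba <<'EOF'
alter system set cpu_count=16 scope=both sid='*';
alter system set parallel_max_servers=64 scope=both sid='*';
EOF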

Once I appropriately set the CPU_COUNT and PARALLEL_MAX_SERVERS parameters on each of the databases and restarted them, I was happy to see the overall memory utilization dramatically decrease, as shown in the prstat output below.

# prstat -t 
NPROC USERNAME SWAP RSS MEMORY TIME CPU 
764 oracle 80G 56G 29% 14:33:54 0.1% 
94 root 2525M 2493M 1.3% 88:14:08 0.1% 
53 grid 8239M 6759M 3.4% 34:26:20 0.0% 
1 smmsp 4272K 10M 0.0% 0:00:06 0.0% 
7 daemon 9384K 12M 0.0% 0:00:55 0.0% 
1 netcfg 4344K 7240K 0.0% 0:00:24 0.0% 
3 netadm 46M 46M 0.0% 0:00:45 0.0% 



In summary, the following two parameters should be set appropriately on every new Oracle Database build. As a best practice, please limit these parameters: the default settings can hurt the overall performance of your system and even exhaust available memory, as they did in our case. I have even seen many cases where PARALLEL_MAX_SERVERS was not set and it completely hung an Exadata or other database server, so please be mindful of these two parameters.


  • CPU_COUNT
  • PARALLEL_MAX_SERVERS




Tuesday, September 13, 2016

Dallas Oracle Users Group - Presenting "Oracle Dataguard Deep Dive"

I am very excited and honored to be presenting at the Dallas Oracle Users Group soon on Sept 22, 2016. The topic is "Oracle Dataguard Deep Dive". Session details below.


To RSVP for this meeting:  http://www.doug.org/calendar.html#id=10163&cid=538&wid=201&type=Cal



Location
Texas State Government Facility (education-related)
- Use building Entrance C on SE side
Dallas Room – 1st floor
400 East Spring Valley Road
Richardson, Texas, 75081


Presentation Details
Oracle Data Guard Deep Dive
Ensuring business continuity for critical production databases is of paramount importance. Oracle Data Guard offers highly available synchronization and reporting of your primary database. Oracle Data Guard is the most comprehensive solution available to eliminate single points of failure for mission critical Oracle Databases. It prevents data loss and downtime in the simplest and most economical manner by maintaining a synchronized physical replica of a production database at a remote location. If the production database is unavailable for any reason, client connections can quickly, and in some configurations transparently, fail over to the synchronized replica to restore service. We will walk through Data Guard concepts and several new 12c features such as Far Sync and Fast Sync, learn Data Guard single-instance and RAC setup step by step, cover configuration, broker setup, and monitoring, and even share tips on automating the complete setup. This presentation will be very useful for someone who wants to take a deep dive into Data Guard or simply refresh their skills.