Friday, March 17, 2017

How to back up the Oracle SuperCluster M7 system?

Backing up the Oracle SuperCluster M7 system is a critical task that should be completed upon system deployment. A sound backup and recovery plan is a vital standard practice. I highly recommend testing any backup/recovery plan on a regular basis to ensure it works in your environment.

Oracle Solaris provides a tool that backs up and restores critical components of an Oracle SuperCluster. The osc-config-backup tool makes the process of backing up the SuperCluster component configurations simple. The backup data is stored on the internal ZFS storage appliance within the SuperCluster system.

The following Oracle Support note provides more details on the SuperCluster M7 osc-config-backup tool.

How To Back Up the Oracle SuperCluster Configuration with osc-config-backup (Doc ID 1934129.1)

Currently the osc-config-backup tool is supported on the Oracle SuperCluster M7 and Oracle SuperCluster T5-8.

The following SuperCluster components are backed up by osc-config-backup:
·       Domains – Configuration, ZFS pools (rpool, bpool if present, and u01-pool for Database Domains)
·       Oracle Solaris zones – Configuration, ZFS pools (rpool and u01-pool for Database Zones)
·       IB switches – Configuration
·       Ethernet management switch – Configuration
·       ZFS storage appliance – Configuration
·       SuperCluster – Configuration information ( file)
·       Service Processor – Configuration
·       Explorer – Data from each Database and Application Domain

Please note: Data located outside the rpool and u01-pool zpools in zones and domains is not backed up by osc-config-backup.

You should still perform regular file-system-level backups, as you would on any other system, to protect other critical filesystems and their contents. Any file-level backup tool you would use on another system should suffice.
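As a simple illustration of a file-level backup, here is a generic sketch using tar (the paths here are hypothetical stand-ins, not SuperCluster-specific locations; substitute your real filesystems and backup target):

```shell
# Hypothetical example: archive a directory tree to a backup area with tar.
src_dir=$(mktemp -d)                      # stand-in for a critical filesystem
echo "app data" > "$src_dir/config.txt"
backup_dir=$(mktemp -d)                   # stand-in for a backup target (e.g. an NFS share)
archive="$backup_dir/fs-$(date +%Y%m%d).tar.gz"
tar -czf "$archive" -C "$src_dir" .       # create a compressed archive
tar -tzf "$archive"                       # list contents to verify the backup
```

In practice you would schedule something like this from cron, or use whatever enterprise backup tool your site already standardizes on.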

Finally, to recover from a backup taken with the tool, refer to the following Oracle Support note.

How To Recover SuperCluster Components that were Backed Up with osc-config-backup (Doc ID 1934130.1)

Thursday, February 23, 2017

Memory Script I/O Domain Solaris

The following script is useful for viewing memory information on an I/O Domain running Solaris. I have used this script on an M7 SuperCluster deployment on Solaris 11. It shows the total available memory, free memory, and used memory.


# Available memory
memory=`prtconf | grep Memory | head -1 | awk 'BEGIN {FS=" "} {print $3}'`
gb_memory=`echo "scale=2; $memory/1024" | bc -l`

# Free memory
pagesize=`pagesize`
kb_pagesize=`echo "scale=2; $pagesize/1024" | bc -l`
sar_freemem=`sar -r 1 1 | tail -1 | awk 'BEGIN {FS=" "} {print $2}'`
gb_freemem=`echo "scale=2; $kb_pagesize*$sar_freemem/1024/1024" | bc -l`

# Used Memory
gb_usedmem=`echo "scale=2; $gb_memory-$gb_freemem" | bc -l`

# Conclusion
echo "Avail Mem: $gb_memory GB"
echo "Free Mem: $gb_freemem GB"
echo "Used Mem: $gb_usedmem GB"

Sample Output below:

Avail Mem: 96.00 GB
Free Mem: 42.02 GB
Used Mem: 53.98 GB

Sunday, December 4, 2016

Oracle Data Guard Deep Dive Presentation

Thank you to the Dallas OUG for letting me present recently on my "Oracle Data Guard Deep Dive" topic. I also covered new 12c features in this presentation. You may view my presentation at the following link.

Wednesday, September 14, 2016

Important Settings on Every Oracle Database Build!

Today I worked on a problem on an Oracle SuperCluster M7 system. The issue: we had about 15 databases running from a single I/O DB Domain, which is essentially a virtual machine (VM). The total allocated SGA/PGA memory across all of these 12c databases summed to 60GB, and the VM had 192GB of memory assigned to it. However, when I checked the free memory, only about 3-4GB was available. This did not add up, since I had only allocated 60GB across all databases. Beyond the limited memory, we also saw consistent memory-related errors across the database alert logs.

ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 128 bytes of shared memory
ORA-00700: soft internal error, arguments: [kskvmstatact: excessive swapping observed]
ORA-27300: OS system dependent operation:fork failed with status: 12
ORA-04030: out of process memory when trying to allocate 552 bytes (kxs-heap-p,kxfqlobf)

Please keep in mind that this issue can also apply to other systems, such as Exadata, a physical server, or any VM running Oracle databases.

I started to dig and troubleshoot.

The prstat -t output below shows total memory usage by user on Solaris. Look at the RSS column: it clearly shows 144GB of memory being used by the oracle user. I was surprised to see this.

NPROC USERNAME  SWAP   RSS MEMORY      TIME  CPU
85 root 2520M 2481M 1.3% 87:40:45 0.2% 
5462 oracle 183G 144G 75% 197:27:56 0.2% 
51 grid 17G 12G 6.4% 34:47:13 0.0% 
1 smmsp 4272K 10M 0.0% 0:00:06 0.0% 
7 daemon 9384K 12M 0.0% 0:00:54 0.0% 
1 netcfg 4344K 7240K 0.0% 0:00:24 0.0% 
3 netadm 46M 46M 0.0% 0:00:45 0.0% 

I then wanted a list of processes sorted by memory usage, with CPU time, the process ID, and the command included, sorted from largest memory usage to smallest, as shown in the following command. The output showed that the largest processes were oracle processes, and all of them were about the same size based on the pmem column, which reports usage as a percentage.

# ps -eo pmem,pcpu,pid,comm | sort -k 1 -nr

I also wanted a total process count on the server, so I ran the following command; it returned about 6000 processes. That is a lot.

# ps -ef| wc -l

...And the plot thickens. I then thought to check whether we had too many parallel process threads running, so I ran the following command to verify. It returned about 4800 rows! Now we were on to something: this clearly shows that about 80% of the processes running on the VM were parallel threads, which is way too much.

# ps -eo pmem,pcpu,pid,comm | sort -k 1 -nr | grep -i _p0 | wc -l

The PARALLEL_MAX_SERVERS database init parameter in 12c is the culprit here, and it should be limited. If this parameter is not set, it takes a default value based on the settings of CPU_COUNT, PARALLEL_THREADS_PER_CPU, and PGA_AGGREGATE_TARGET.

The CPU_COUNT database init parameter should also be limited. If the default value is taken, the database uses the total number of CPU threads reported by the OS. In my case we had 20 CPU cores assigned to the VM, and on the M7 SuperCluster each SPARC M7 processor has 32 cores with 8 threads per core, for a total of 256 threads per processor. So the 20 cores assigned to the VM represent 160 threads, and CPU_COUNT can be set accordingly based on the workload's needs.

The PARALLEL_THREADS_PER_CPU parameter is normally OS-dependent and usually 2. This setting is fine.
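To see why the defaults get so large on a thread-rich machine like this, the documented default formula for PARALLEL_MAX_SERVERS can be plugged in as a quick sketch (the concurrent_parallel_users factor of 4 is an assumption that SGA_TARGET or MEMORY_TARGET is set; it is 2 with only PGA_AGGREGATE_TARGET, and 1 otherwise):

```shell
# Documented default formula for PARALLEL_MAX_SERVERS (12c):
#   PARALLEL_THREADS_PER_CPU * CPU_COUNT * concurrent_parallel_users * 5
cpu_count=160                # 20 cores x 8 threads, as reported by the OS
threads_per_cpu=2            # typical default
concurrent_parallel_users=4  # assumption: automatic SGA memory management
default_pms=$((threads_per_cpu * cpu_count * concurrent_parallel_users * 5))
echo "Default PARALLEL_MAX_SERVERS would be: $default_pms"
```

With 160 reported threads, that default works out to 6,400 potential parallel slaves, which is consistent with the thousands of parallel processes observed on this VM.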

Once I appropriately set the CPU_COUNT and PARALLEL_MAX_SERVERS parameters on each of the databases and restarted them, I was happy to see the overall memory utilization dramatically decrease, as shown in the prstat output below.

# prstat -t 
NPROC USERNAME  SWAP   RSS MEMORY      TIME  CPU
764 oracle 80G 56G 29% 14:33:54 0.1% 
94 root 2525M 2493M 1.3% 88:14:08 0.1% 
53 grid 8239M 6759M 3.4% 34:26:20 0.0% 
1 smmsp 4272K 10M 0.0% 0:00:06 0.0% 
7 daemon 9384K 12M 0.0% 0:00:55 0.0% 
1 netcfg 4344K 7240K 0.0% 0:00:24 0.0% 
3 netadm 46M 46M 0.0% 0:00:45 0.0% 

In summary, the two parameters above should be set appropriately on every new Oracle Database build. As a best practice, please limit them: the default settings can hurt the overall performance of your system and even exhaust available memory, as they did in our case. I have seen many cases where PARALLEL_MAX_SERVERS was not set and it completely hung an Exadata or other database server, so please be mindful of these two parameters.


Tuesday, September 13, 2016

Dallas Oracle Users Group - Presenting "Oracle Data Guard Deep Dive"

I am very excited and honored to be presenting at the Dallas Oracle Users Group soon, on Sept 22, 2016. The topic is "Oracle Data Guard Deep Dive". Session details are below.

To RSVP for this meeting:

Texas State Government Facility (education-related)
- Use building Entrance C on SE side
Dallas Room – 1st floor
400 East Spring Valley Road
Richardson, Texas, 75081

Presentation Details
Oracle Data Guard Deep Dive
Ensuring business continuity for critical production databases is of paramount importance. Oracle Data Guard provides highly available synchronization and reporting of your primary database, and it is the most comprehensive solution available to eliminate single points of failure for mission-critical Oracle databases. It prevents data loss and downtime in the simplest and most economical manner by maintaining a synchronized physical replica of a production database at a remote location. If the production database becomes unavailable for any reason, client connections can quickly, and in some configurations transparently, fail over to the synchronized replica to restore service. We will explain Data Guard concepts, cover several new 12c features such as Far Sync and Fast Sync, walk step by step through Data Guard single-instance and RAC setup, configuration, broker setup, and monitoring, and even share tips on automating the complete setup. This presentation will be very useful for anyone who wants to take a deep dive into Data Guard or refresh their skills.


Wednesday, June 15, 2016

Calculate Exadata X6 Storage space available for ASM Normal and High Redundancy

When reviewing the datasheet for the Exadata X6, the latest release, you may want to understand how much usable space will be available for Normal and High redundancy ASM disk groups.

Each Exadata X6 storage cell comes with 12 x 8TB 7,200 RPM SAS disks, a total raw capacity of 96TB per cell. If you want to calculate the Normal or High redundancy usable space of, for example, a Quarter Rack, it is not simply the raw storage divided by 2 for Normal redundancy or by 3 for High redundancy. The formula is below, and I confirmed it with Oracle Support as well.

Please note: The amount of storage that needs to be reserved to recover from the failure of a disk is 15% of the cell disk capacity.

# cellcli -e list celldisk attributes name,size
CD_00_internal 7.1192474365234375T


The calculation is as follows:
Size of one disk: 7.1192474365234375TB
Disks per cell node: 12
Number of cell nodes: 3

Total disk size: 7.119TB * 12 * 3 = 256.284TB

Normal Redundancy  = [ Total size of Disk / 2 ] * .85

= [ 256.284TB / 2 ] * .85

= 128.142TB * .85

=> 108.9207TB or 109TB [ The value in the Exadata X6 data sheet ] 

High Redundancy  = [ Total size of Disk / 3 ] * .85
= [ 256.284TB / 3 ] * .85

= 85.428TB * .85

=> 72.6138TB

A table below also shows this information to easily see the values on the different size configurations.