Recently I came across a script from Oracle Support that checks ASM storage to see whether a disk or cell failure can be tolerated. The script reports a PASS or FAIL status depending on whether a rebalance can complete after the loss of a disk or a cell (12 disks) in the Exadata storage. A cell server failure is unlikely, but it can happen. I personally faced this almost a year ago in a production environment with a Half Rack (7 cell nodes), when we lost a cell node for 2-3 days. We had enough free space for the rebalance to occur, so we could tolerate the lost cell node and there was no downtime for any of the databases.
The Oracle Support note is listed below and the script is also attached to it.
Understanding ASM Capacity and Reservation of Free Space in Exadata (Doc ID 1551288.1)
Some key points
- Keep the FREE_MB column in the ASM lsdg output above the Cell Required Mirror Free MB or Disk Required Mirror Free MB at all times. In other words, the corresponding Usable File MB figure should never go negative.
- Disk Required Mirror Free MB is the amount of space that should be reserved for disk failure coverage.
- One Cell Required Mirror Free MB is the amount of space to reserve for single cell failure coverage, regardless of redundancy type.
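The rule in these key points can be sketched as a simple comparison. This is only an illustration, not the logic of the Oracle Support script itself: the function name `check_reserve` is hypothetical, and treating a margin of exactly zero as PASS is an assumption. The figures are the DATA_EXAD values from the BEFORE run shown further down in this post.

```python
from typing import Tuple


def check_reserve(free_mb: int, required_mirror_free_mb: int) -> Tuple[str, int]:
    """Compare free space with a reserve; the margin is negative when the reserve is not met."""
    margin_mb = free_mb - required_mirror_free_mb
    # Assumption: a margin of exactly zero still counts as PASS.
    status = "PASS" if margin_mb >= 0 else "FAIL"
    return status, margin_mb


# DATA_EXAD figures from the BEFORE run in this post
free_mb = 19_394_532
disk_required_mb = 1_636_279
cell_required_mb = 27_021_312

print("Loss of ONE disk:", check_reserve(free_mb, disk_required_mb))
print("Loss of ONE cell:", check_reserve(free_mb, cell_required_mb))
```

The negative margin for the cell case is the amount of space that would have to be freed before a full cell loss could be absorbed.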
The lsdg and script output below shows BEFORE and AFTER results, including what the script reports when a failure cannot be tolerated.
BEFORE
State    Type    Rebal  Sector  Block  AU       Total_MB  Free_MB   Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N      512     4096   4194304  54042624  20132832  18014208         1059312         0              N             DATA_EXAD/
MOUNTED  NORMAL  N      512     4096   4194304  894240    636448    298080           169184          0              Y             DBFS_DG/
MOUNTED  NORMAL  N      512     4096   4194304  13512384  7173544   4504128          1334708         0              N             RECO_EXAD/
AFTER
State    Type    Rebal  Sector  Block  AU       Total_MB  Free_MB   Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N      512     4096   4194304  54042624  28095768  18014208         5040780         0              N             DATA_EXAD/
MOUNTED  NORMAL  N      512     4096   4194304  894240    636448    298080           169184          0              Y             DBFS_DG/
MOUNTED  NORMAL  N      512     4096   4194304  13512384  7173208   4504128          1
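The Usable_file_MB column in the lsdg output is consistent with subtracting Req_mir_free_MB from Free_MB and dividing by the number of mirror copies (2 for a NORMAL redundancy disk group). The sketch below checks this against the complete rows shown above. The function name `usable_file_mb` and the dictionary layout are illustrative assumptions, and integer division is used only because the differences here happen to be even.

```python
def usable_file_mb(free_mb: int, req_mir_free_mb: int, mirror_copies: int = 2) -> int:
    """Space left for files after holding back the reserve and accounting for mirroring."""
    return (free_mb - req_mir_free_mb) // mirror_copies


# (Free_MB, Req_mir_free_MB, reported Usable_file_MB) from the lsdg rows in this post
lsdg_rows = {
    "DATA_EXAD (before)": (20132832, 18014208, 1059312),
    "DATA_EXAD (after)": (28095768, 18014208, 5040780),
    "DBFS_DG": (636448, 298080, 169184),
    "RECO_EXAD (before)": (7173544, 4504128, 1334708),
}

for name, (free, reserve, reported) in lsdg_rows.items():
    computed = usable_file_mb(free, reserve)
    print(f"{name}: computed {computed}, reported {reported}")
```

Note that lsdg uses its own Req_mir_free_MB reserve, which differs from the Cell Required Mirror Free MB that the script calculates, so the two usable figures are not directly comparable.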
BEFORE
SQL> @check_asm.sql
------ DISK and CELL Failure Diskgroup Space Reserve Requirements ------
This procedure determines how much space you need to survive a DISK or CELL
failure. It also shows the usable space
available when reserving space for disk or cell failure.
Please see MOS note 1551288.1 for more information.
. . .
Description of Derived Values:
Cell Required Mirror Free MB : Free MB needed to permit successful rebalance
after losing largest CELL in a DG
2 Cell Required Mirror Free MB : Free MB needed to permit successful rebalance
after losing 2 largest CELLs in high redundancy DG
Disk Required Mirror Free MB : Free MB needed to rebalance after loss of
single disk (normal redundancy DG) or double disk (high redundancy DG)
Disk Failure Usable File MB : Usable space available after reserving space
for disk failure (1 disk in normal or 2 disks in high redundancy DG) and
accounting for mirroring
Cell Failure Usable File MB : Usable space available after reserving space
for 1 cell failure and accounting for mirroring
2 Cell Failure Usable File MB : Usable space available after reserving space
for 2 cell failures and accounting for mirroring in a HIGH redundancy DG
. . .
ASM Version: 11.2.0.2 - WARNING DISK FAILURE COVERAGE ESTIMATES HAVE NOT BEEN
VERIFIED ON 11.2.0.2 !
. . .
-------------------------------------------------------------------------
DG Name: DATA_EXAD
DG Type: NORMAL
Num Disks: 36
Disk Size MB: 1,501,184
. . .
DG Total MB: 54,042,624
DG Used MB: 34,648,092
DG Free MB: 19,394,532
. . .
Cell Required Mirror Free MB: 27,021,312
. . .
Disk Required Mirror Free MB: 1,636,279
. . .
Disk Failure Usable File MB: 8,879,126
Cell Failure Usable File MB: -3,813,390
. . .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: FAIL
-------------------------------------------------------------------------
DG Name: DBFS_DG
DG Type: NORMAL
Num Disks: 30
Disk Size MB: 29,808
. . .
DG Total MB: 894,240
DG Used MB: 257,792
DG Free MB: 636,448
. . .
Cell Required Mirror Free MB: 447,120
. . .
Disk Required Mirror Free MB: 53,600
. . .
Disk Failure Usable File MB: 291,424
Cell Failure Usable File MB: 94,664
. . .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: PASS
-------------------------------------------------------------------------
DG Name: RECO_EXAD
DG Type: NORMAL
Num Disks: 36
Disk Size MB: 375,344
. . .
DG Total MB: 13,512,384
DG Used MB: 7,484,712
DG Free MB: 6,027,672
. . .
Cell Required Mirror Free MB: 6,756,192
. . .
Disk Required Mirror Free MB: 423,896
. . .
Disk Failure Usable File MB: 2,801,888
Cell Failure Usable File MB: -364,260
. . .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: FAIL
. . .
Script completed.
PL/SQL procedure successfully completed.
SQL> exit
AFTER
SQL> @check_asm.sql
------ DISK and CELL Failure Diskgroup Space Reserve Requirements ------
This procedure determines how much space you need to survive a DISK or CELL
failure. It also shows the usable space
available when reserving space for disk or cell failure.
Please see MOS note 1551288.1 for more information.
. . .
Description of Derived Values:
Cell Required Mirror Free MB : Free MB needed to permit successful rebalance
after losing largest CELL in a DG
2 Cell Required Mirror Free MB : Free MB needed to permit successful rebalance
after losing 2 largest CELLs in high redundancy DG
Disk Required Mirror Free MB : Free MB needed to rebalance after loss of
single disk (normal redundancy DG) or double disk (high redundancy DG)
Disk Failure Usable File MB : Usable space available after reserving space
for disk failure (1 disk in normal or 2 disks in high redundancy DG) and
accounting for mirroring
Cell Failure Usable File MB : Usable space available after reserving space
for 1 cell failure and accounting for mirroring
2 Cell Failure Usable File MB : Usable space available after reserving space
for 2 cell failures and accounting for mirroring in a HIGH redundancy DG
. . .
ASM Version: 11.2.0.2 - WARNING DISK FAILURE COVERAGE ESTIMATES HAVE NOT BEEN
VERIFIED ON 11.2.0.2 !
. . .
-------------------------------------------------------------------------
DG Name: DATA_EXAD
DG Type: NORMAL
Num Disks: 36
Disk Size MB: 1,501,184
. . .
DG Total MB: 54,042,624
DG Used MB: 25,946,856
DG Free MB: 28,095,768
. . .
Cell Required Mirror Free MB: 27,021,312
. . .
Disk Required Mirror Free MB: 1,636,279
. . .
Disk Failure Usable File MB: 13,229,744
Cell Failure Usable File MB: 537,228
. . .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: PASS
-------------------------------------------------------------------------
DG Name: DBFS_DG
DG Type: NORMAL
Num Disks: 30
Disk Size MB: 29,808
. . .
DG Total MB: 894,240
DG Used MB: 257,792
DG Free MB: 636,448
. . .
Cell Required Mirror Free MB: 447,120
. . .
Disk Required Mirror Free MB: 53,600
. . .
Disk Failure Usable File MB: 291,424
Cell Failure Usable File MB: 94,664
. . .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: PASS
-------------------------------------------------------------------------
DG Name: RECO_EXAD
DG Type: NORMAL
Num Disks: 36
Disk Size MB: 375,344
. . .
DG Total MB: 13,512,384
DG Used MB: 6,339,176
DG Free MB: 7,173,208
. . .
Cell Required Mirror Free MB: 6,756,192
. . .
Disk Required Mirror Free MB: 423,896
. . .
Disk Failure Usable File MB: 3,374,656
Cell Failure Usable File MB: 208,508
. . .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: PASS
. . .
Script completed.
PL/SQL procedure successfully completed.
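The two runs differ only in DG Used MB, so the move from FAIL to PASS comes from space being freed in DATA_EXAD and RECO_EXAD. A small sketch, using only the figures printed in the two runs, shows how much was freed and how the margin over the cell reserve changed. The `summarize` helper and the tuple layout are hypothetical names for illustration, not part of the Oracle Support script.

```python
# (used_before_mb, used_after_mb, total_mb, cell_required_mb) from the two runs in this post
runs = {
    "DATA_EXAD": (34_648_092, 25_946_856, 54_042_624, 27_021_312),
    "DBFS_DG": (257_792, 257_792, 894_240, 447_120),
    "RECO_EXAD": (7_484_712, 6_339_176, 13_512_384, 6_756_192),
}


def summarize(used_before, used_after, total_mb, cell_required_mb):
    """Return (MB freed between runs, cell margin before, cell margin after)."""
    freed = used_before - used_after
    # Margin = free space minus the cell reserve; negative means a cell loss is not covered.
    margin_before = (total_mb - used_before) - cell_required_mb
    margin_after = (total_mb - used_after) - cell_required_mb
    return freed, margin_before, margin_after


for name, figures in runs.items():
    freed, before, after = summarize(*figures)
    print(f"{name}: freed {freed:,} MB, cell margin {before:,} -> {after:,} MB")
```

Halving these margins, to account for NORMAL redundancy mirroring, gives the Cell Failure Usable File MB values the script prints. For example, DATA_EXAD after the cleanup has a margin of 1,074,456 MB, and 1,074,456 / 2 = 537,228.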