Saturday, April 20, 2024

Setting the Oracle ASM Disk Repair Timer

 

When an Exadata storage server hits a disk failure and the disk is no longer available, ASM takes the disk offline and the clock starts ticking until the disk_repair_time value is reached. If the issue is not fixed and the disk is still unavailable, ASM drops the disk from the disk group. Once the disk is dropped, a rebalance operation is triggered, which may take a long time to complete depending on many factors, such as the rebalance power limit and the amount of data to rebalance.

 

Once the disk is available to the server again, you add it back to the disk group, and another rebalance operation takes place, its duration again subject to the amount of data and the power limit.
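The countdown described above can be sketched in a few lines of Python. This is a conceptual illustration only, not ASM internals; the 3.6-hour default comes from the Oracle documentation, while the function name and shape are mine:

```python
# Conceptual sketch of the disk_repair_time countdown (not ASM internals).
# ASM's default disk_repair_time is 3.6 hours.

DEFAULT_REPAIR_TIME_HOURS = 3.6

def disk_would_be_dropped(hours_offline, repair_time_hours=DEFAULT_REPAIR_TIME_HOURS):
    """Return True if an offline disk has exceeded its repair timer
    and would therefore be dropped from the disk group."""
    return hours_offline > repair_time_hours

# With the default timer, a disk offline for 4 hours would be dropped,
# but one offline for 2 hours still has time left for repair.
```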

 

SQL> SELECT GROUP_NUMBER, PASS, STATE FROM V$ASM_OPERATION;
 
GROUP_NUMBER PASS      STAT
------------ --------- ----
           1 RESYNC    RUN
           1 REBALANCE WAIT
           1 COMPACT   WAIT

 

The Oracle ASM disk repair timer represents the amount of time a disk can remain offline before it is dropped by Oracle ASM. While the disk is offline, Oracle ASM tracks the changed extents so the disk can be resynchronized when it comes back online. The default disk repair time is 3.6 hours. If the default is inadequate, then the attribute value can be changed to the maximum amount of time it might take to detect and repair a temporary disk failure. The following command is an example of changing the disk repair timer value to 8.5 hours for the DATA disk group:

 

SQL> ALTER DISKGROUP data SET ATTRIBUTE 'disk_repair_time' = '8.5h';
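The attribute accepts a unit suffix of h/H (hours) or m/M (minutes). The hypothetical helper below just shows how such values map to seconds; it is an illustration, not Oracle code:

```python
# Sketch: convert disk_repair_time values such as '8.5h' or '90m' to
# seconds. ASM accepts an h/H (hours) or m/M (minutes) unit suffix.
# Illustrative helper only, not part of any Oracle tool.

def repair_time_seconds(value: str) -> float:
    unit = value[-1].lower()
    number = float(value[:-1])
    if unit == 'h':
        return number * 3600
    if unit == 'm':
        return number * 60
    raise ValueError(f"unsupported unit in {value!r}")

# The '8.5h' example above is 30600 seconds; the 3.6h default is 12960.
```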

 

The disk_repair_time attribute does not change the repair timer for disks currently offline. The repair timer for those offline disks is either the default repair timer or the repair timer specified on the command line when the disks were manually set to offline. To change the repair timer for currently offline disks, use the OFFLINE command and specify a repair timer value. The following command is an example of changing the disk repair timer value for disks that are offline:

 

SQL> ALTER DISKGROUP data OFFLINE DISK data_CD_06_cell11 DROP AFTER 20h;

 

 

To check the repair times for all mounted disk groups, log in to the ASM instance and run the following query:

 

SQL> select dg.name, a.value
     from v$asm_diskgroup dg, v$asm_attribute a
     where dg.group_number = a.group_number
     and a.name = 'disk_repair_time';

 

Note:

 

Vulnerability to a double failure increases as the disk repair time value increases.

 

Exadata: Maintaining PMEM Devices on Oracle Exadata Storage Servers

 

Persistent memory (PMEM) devices reside in Exadata X8M-2 and X9M-2 storage server models with High Capacity (HC) or Extreme Flash (EF) storage.

 

If a PMEM device fails, Oracle Exadata System Software isolates the failed device and automatically recovers the cache contents that were on the device.

 

If the cache is in write-back mode, the recovery operation, also known as resilvering, restores the lost data by reading a mirrored copy. During resilvering, the grid disk status is ACTIVE -- RESILVERING WORKING. If the cache is in write-through mode, then the data in the failed PMEM device is already stored in the data grid disk, and no recovery is required.

 
Replacing a PMEM Device Due to Device Failure

If the PMEM device has a status of Failed, you should replace the PMEM device on the Oracle Exadata Storage Server.

A PMEM fault could cause the server to reboot. The failed device should be replaced with a new PMEM device at the earliest opportunity. Until the PMEM device is replaced, the corresponding cache size is reduced. If the PMEM device is used for commit acceleration (XRMEMLOG or PMEMLOG), then the size of the corresponding commit accelerator is also reduced.

An alert is generated when a PMEM device failure is detected. The alert message includes the slot number and cell disk name. If you have configured the system for alert notification, then an alert is sent by e-mail message to the designated address.

To identify a failed PMEM device, you can also use the following command:

 

CellCLI> LIST PHYSICALDISK WHERE disktype=PMEM AND status=failed DETAIL

 

    name:                          PMEM_0_1

    diskType:                      PMEM

    luns:                          P0_D1

    makeModel:                     "Intel NMA1XBD128GQS"

    physicalFirmware:              1.02.00.5365

    physicalInsertTime:            2019-09-28T11:29:13-07:00

    physicalSerial:                8089-A2-1838-00001234

    physicalSize:                  126.375G

    slotNumber:                    "CPU: 0; DIMM: 1"

    status:                        failed

 

In the above output, the slotNumber shows the socket number and DIMM slot number.
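When scripting around this output, the slotNumber string can be split into its socket and DIMM numbers. The sketch below assumes the exact `"CPU: n; DIMM: m"` format shown above; the function itself is hypothetical:

```python
# Parse a CellCLI slotNumber value such as "CPU: 0; DIMM: 1" into the
# socket number and DIMM slot number (format taken from the output above).
import re

def parse_slot(slot: str) -> tuple:
    match = re.fullmatch(r'CPU:\s*(\d+);\s*DIMM:\s*(\d+)', slot.strip('"'))
    if match is None:
        raise ValueError(f"unrecognized slotNumber: {slot!r}")
    return int(match.group(1)), int(match.group(2))
```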

1.     Locate the storage server that contains the failed PMEM device.

A white Locator LED is lit to help locate the affected storage server. When you have located the server, you can use the Fault Remind button to locate the failed DIMM.

Caution:

Do not attempt to remove a faulty DCPMM DIMM when the Do Not Service LED indicator is illuminated.

2.     Power down the storage server with the failed PMEM device and unplug the power cable for the server.

3.     Replace the failed PMEM device.

·         X9M-2: See "Servicing the DIMMs" in Oracle Exadata Storage Server X9-2 EF, HC, and XT Service Manual at https://docs.oracle.com/en/servers/x86/x9-2l/exa-storage/servicing-dimms.html

·         X8M-2: See "Servicing the DIMMs" in Oracle Exadata Storage Server X8-2 EF, HC, and XT Service Manual at https://docs.oracle.com/en/servers/x86/x8-2l/exadata-storage-service-manual/gqtcm.html

4.     Restart the storage server.

Note:

During the restart, the storage server will shut down a second time to complete the initialization of the new PMEM device.

The new PMEM device is automatically used by the system. If the PMEM device is used for caching, then the effective cache size increases. If the PMEM device is used for commit acceleration, then commit acceleration is enabled on the device.

Replacing a PMEM Device Due to Degraded Performance

If a PMEM device has degraded performance, you might need to replace the module.

If degraded performance is detected on a PMEM device, the module status is set to warning - predictive failure and an alert is generated. The alert includes specific instructions for replacing the PMEM device. If you have configured the system for alert notifications, then the alerts are sent by e-mail message to the designated address.

The predictive failure status indicates that the PMEM device will fail soon, and should be replaced at the earliest opportunity. No new data is cached in the PMEM device until it is replaced.

To identify a PMEM device with the status predictive failure, you can also use the following command:

 

CellCLI> LIST PHYSICALDISK WHERE disktype=PMEM AND status='warning - predictive failure' DETAIL

 

         name:               PMEM_0_6

         diskType:           PMEM

         luns:               P0_D6

         makeModel:          "Intel NMA1XBD128GQS"

         physicalFirmware:   1.02.00.5365

         physicalInsertTime: 2019-11-30T21:24:45-08:00

         physicalSerial:     8089-A2-1838-00001234

         physicalSize:       126.375G

         slotNumber:         "CPU: 0; DIMM: 6"

         status:             warning - predictive failure

 

 

You can also locate the PMEM device using the information in the LIST DISKMAP command:

 

CellCLI> LIST DISKMAP

Name      PhysicalSerial         SlotNumber        Status       PhysicalSize

   CellDisk       DevicePartition    GridDisks

PMEM_0_1  8089-a2-0000-00000460  "CPU: 0; DIMM: 1"  normal      126G

   PM_00_cel01    /dev/dax5.0        PMEMCACHE_PM_00_cel01

PMEM_0_3  8089-a2-0000-000004c2  "CPU: 0; DIMM: 3"  normal      126G

   PM_02_cel01    /dev/dax4.0        PMEMCACHE_PM_02_cel01

PMEM_0_5  8089-a2-0000-00000a77  "CPU: 0; DIMM: 5"  normal      126G

   PM_03_cel01    /dev/dax3.0        PMEMCACHE_PM_03_cel01

PMEM_0_6  8089-a2-0000-000006ff  "CPU: 0; DIMM: 6"  warning -   126G

   PM_04_cel01    /dev/dax0.0        PMEMCACHE_PM_04_cel01

PMEM_0_8  8089-a2-0000-00000750  "CPU: 0; DIMM: 8"  normal      126G

   PM_05_cel01    /dev/dax1.0        PMEMCACHE_PM_05_cel01

PMEM_0_10 8089-a2-0000-00000103  "CPU: 0; DIMM: 10" normal      126G

   PM_01_cel01    /dev/dax2.0        PMEMCACHE_PM_01_cel01

PMEM_1_1  8089-a2-0000-000008f6  "CPU: 1; DIMM: 1"  normal      126G

   PM_06_cel01    /dev/dax11.0       PMEMCACHE_PM_06_cel01

PMEM_1_3  8089-a2-0000-000003bb  "CPU: 1; DIMM: 3"  normal      126G

   PM_08_cel01    /dev/dax10.0       PMEMCACHE_PM_08_cel01

PMEM_1_5  8089-a2-0000-00000708  "CPU: 1; DIMM: 5"  normal      126G

   PM_09_cel01    /dev/dax9.0        PMEMCACHE_PM_09_cel01

PMEM_1_6  8089-a2-0000-00000811  "CPU: 1; DIMM: 6"  normal      126G

   PM_10_cel01    /dev/dax6.0        PMEMCACHE_PM_10_cel01

PMEM_1_8  8089-a2-0000-00000829  "CPU: 1; DIMM: 8"   normal     126G

   PM_11_cel01    /dev/dax7.0        PMEMCACHE_PM_11_cel01

PMEM_1_10 8089-a2-0000-00000435  "CPU: 1; DIMM: 10"   normal    126G

   PM_07_cel01    /dev/dax8.0        PMEMCACHE_PM_07_cel01

 

If the PMEM device is used for write-back caching, then the data is flushed from the PMEM device to the flash cache. To ensure that data is flushed from the PMEM device, check the cachedBy attribute of all the grid disks and ensure that the affected PMEM device is not listed.

1.     Locate the storage server that contains the failing PMEM device.

A white Locator LED is lit to help locate the affected storage server. When you have located the server, you can use the Fault Remind button to locate the failed DIMM.

Caution:

Do not attempt to remove a faulty DCPMM DIMM when the Do Not Service LED indicator is illuminated.

2.     Power down the storage server with the failing PMEM device and unplug the power cable for the server.

3.     Replace the failing PMEM device.

·         X9M-2: See "Servicing the DIMMs" in Oracle Exadata Storage Server X9-2 EF, HC, and XT Service Manual at https://docs.oracle.com/en/servers/x86/x9-2l/exa-storage/servicing-dimms.html

·         X8M-2: See "Servicing the DIMMs" in Oracle Exadata Storage Server X8-2 EF, HC, and XT Service Manual at https://docs.oracle.com/en/servers/x86/x8-2l/exadata-storage-service-manual/gqtcm.html

4.     Restart the storage server.

Note:

During the restart, the storage server will shut down a second time to complete the initialization of the new PMEM device.

The new PMEM device is automatically used by the system. If the PMEM device is used for caching, then the effective cache size increases. If the PMEM device is used for commit acceleration, then commit acceleration is enabled on the device.

Enabling and Disabling Write-Back PMEM Cache

Prior to Oracle Exadata System Software release 23.1.0, you could configure PMEM cache to operate in write-back mode. Also known as write-back PMEM cache, this mode enables the cache to service write operations.

Note:

The best practice recommendation is to configure PMEM cache in write-through mode. This configuration provides the best performance and availability.

Commencing with Oracle Exadata System Software release 23.1.0, PMEM cache only operates in write-through mode.

 

Enable Write-Back PMEM Cache

Write-back PMEM cache is only supported in conjunction with write-back flash cache. Consequently, to enable write-back PMEM cache you must also enable write-back flash cache.

Note:

Commencing with Oracle Exadata System Software release 23.1.0, you cannot enable write-back PMEM cache because PMEM cache only operates in write-through mode.

Note:

To reduce the performance impact on your applications, change the cache mode during a period of reduced workload.

The following command examples use a text file named cell_group that contains the host names of the storage servers that are the subject of the procedure.

1.     Check the current flash cache mode setting (flashCacheMode):

 

# dcli –l root –g cell_group cellcli -e "list cell detail" | grep flashCacheMode

 

2.     If the flash cache is in write-back mode:

 

a.      Validate that all the physical disks are in NORMAL state before modifying the flash cache.

 

# dcli –l root –g cell_group cellcli –e "LIST PHYSICALDISK ATTRIBUTES name,status" | grep –v NORMAL

 

The command should return no rows.

 

b.     Determine amount of dirty data in the flash cache.

 

# dcli –g cell_group –l root cellcli -e "LIST METRICCURRENT ATTRIBUTES name,metricvalue WHERE name LIKE \'FC_BY_DIRTY.*\' "

 

c.      Flush the flash cache.

If the flash cache utilizes all available flash cell disks, you can use the ALL keyword instead of listing the flash disks.

 

# dcli –g cell_group –l root cellcli -e "ALTER FLASHCACHE CELLDISK=\'FD_02_dm01celadm12,

FD_03_dm01celadm12,FD_00_dm01celadm12,FD_01_dm01celadm12\' FLUSH"

 

d.     Check the progress of the flash cache flush operation.

The flushing process is complete when the metric FC_BY_DIRTY is zero.

 

# dcli -g cell_group -l root cellcli -e "LIST METRICCURRENT ATTRIBUTES name, metricvalue WHERE name LIKE \'FC_BY_DIRTY.*\' "

 

Or, you can check to see if the attribute flushstatus is set to Completed.

 

# dcli -g cell_group -l root cellcli -e "LIST CELLDISK ATTRIBUTES name, flushstatus, flusherror" | grep FD

 

e.      After the flash cache is flushed, drop the flash cache.

 

# dcli -g cell_group -l root cellcli -e "drop flashcache"

 

f.       Modify the cell to use flash cache in write-back mode.

 

# dcli -g cell_group -l root cellcli -e "ALTER CELL flashCacheMode=writeback"

 

g.     Re-create the flash cache.

If the flash cache utilizes all available flash cell disks, you can use the ALL keyword instead of listing the cell disks.

If the size attribute is not specified, then the flash cache consumes all available space on each cell disk.

 

# dcli –l root –g cell_group cellcli -e "create flashcache celldisk=\'FD_02_dm01celadm12,

FD_03_dm01celadm12,FD_00_dm01celadm12,FD_01_dm01celadm12\'

 

h.     Verify that flashCacheMode is set to writeback.

 

# dcli –l root –g cell_group cellcli -e "list cell detail" | grep flashCacheMode

3.     Flush the PMEM cache.

If the PMEM cache utilizes all available PMEM cell disks, you can use the ALL keyword as shown here.

 

# dcli –l root –g cell_group cellcli -e "ALTER PMEMCACHE ALL FLUSH"

Otherwise, list the specific disks using the CELLDISK="cdisk1 [,cdisk2] ..." clause.

4.     Drop the PMEM cache.

 

# dcli –l root –g cell_group cellcli -e "DROP PMEMCACHE"

5.     Modify the cell to use PMEM cache in write-back mode.

 

# dcli –l root –g cell_group cellcli -e "ALTER CELL pmemCacheMode=WriteBack"

Starting with Oracle Exadata System Software release 20.1.0, this command warns about the best practice recommendation to use PMEM cache in write-through mode and prompts for confirmation of the change.

6.     Re-create the PMEM cache.

If the PMEM cache utilizes all available PMEM cell disks, you can use the ALL keyword as shown here. Otherwise, list the specific disks using the CELLDISK="cdisk1 [,cdisk2] ..." clause. If the size attribute is not specified, then the PMEM cache consumes all available space on each cell disk.

 

# dcli –l root –g cell_group cellcli -e "CREATE PMEMCACHE ALL"

7.     Verify that pmemCacheMode is set to writeback.

 

# dcli –l root –g cell_group cellcli -e "list cell detail" | grep pmemCacheMode

 

Disable Write-Back PMEM Cache

Use these steps if you need to disable Write-Back PMEM cache on the storage servers.

You do not have to stop the cellsrv process or inactivate grid disks when disabling Write-Back PMEM cache. However, to reduce the performance impact on the application, disable the Write-Back PMEM cache during a period of reduced workload.

1.     Validate that all the physical disks are in NORMAL state before modifying the PMEM cache.

The following command should return no rows:

 

# dcli -l root -g cell_group cellcli -e "LIST PHYSICALDISK ATTRIBUTES name,status" | grep -v NORMAL

2.     Flush the PMEM cache.

 

# dcli –g cell_group –l root cellcli -e "ALTER PMEMCACHE ALL FLUSH"

The PMEM cache flushes the dirty data to the lower layer Write-Back Flash Cache.

3.     Check that the flushing operation for the PMEM cache has completed.

The flushing process is complete when the PMEM devices do not show up in the cachedBy attribute for the grid disks.

 

CellCLI> LIST GRIDDISK ATTRIBUTES name, cachedBy

DATA_CD_00_cel01     FD_00_cel01

DATA_CD_01_cel01     FD_01_cel01

DATA_CD_02_cel01     FD_03_cel01

DATA_CD_03_cel01     FD_02_cel01

DATA_CD_04_cel01     FD_00_cel01

DATA_CD_05_cel01     FD_02_cel01

...
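The completion check in step 3 can also be scripted. The sketch below assumes two-column `name cachedBy` lines like those above, and that PMEM cell disks are named with a PM_ prefix as in the disk map shown earlier:

```python
# Verify that no grid disk is still cached by a PMEM cell disk, i.e.
# the PMEM cache flush is complete. Assumes two-column "name cachedBy"
# lines as in the LIST GRIDDISK output, with PMEM cell disks named PM_*.

def pmem_flush_complete(griddisk_lines):
    for line in griddisk_lines:
        fields = line.split()
        if len(fields) == 2 and fields[1].startswith("PM_"):
            return False  # a grid disk is still cached by a PMEM device
    return True
```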

4.     Drop the PMEM cache.

 

# dcli -g cell_group -l root cellcli -e drop pmemcache all

5.     Set the pmemCacheMode attribute to writethrough.

 

# dcli -g cell_group -l root cellcli -e "ALTER CELL pmemCacheMode=writethrough"

6.     Re-create the PMEM cache.

 

# dcli -l root -g cell_group cellcli -e create pmemcache all

7.     Verify that pmemCacheMode is set to writethrough.

 

CellCLI> LIST CELL ATTRIBUTES pmemcachemode

   WriteThrough