Category: System Operations
2012-02-06 16:49:04
Abstract: This document gives a short overview of the SCSI reservation policies for AIX MPIO and AIX SDDPCM devices. With native AIX MPIO, the default device attributes are generally ‘fail_over’ and ‘single_path’ reserve, so a SCSI-2 reservation is applied when an AIX volume group is varied on. With SDDPCM (e.g. for DS8000, DS6000, DS4000, DS5000, SVC), the default device attributes are ‘load_balance’ and ‘no_reserve’, which means that no reservation is applied to the device. An AIX volume group on SDDPCM devices with default attributes can therefore be varied on and accessed by other initiators, and even by other host systems!
When a non-concurrent AIX volume group is varied on, a SCSI-2 reservation is typically applied to the AIX hdisk. Such a reservation is also applied to the system’s rootvg device and is not released even when the system is shut down. This is important to know when planning to use the rootvg as the target of a FlashCopy operation, which fails when a reservation is present on the target device.

With native AIX MPIO, the default attributes for an AIX hdisk are generally algorithm=fail_over and reserve_policy=single_path, so a SCSI-2 reservation is applied to the device by a varyonvg command! Note that in this case only one active path is used for I/O at a time.

With SDDPCM (e.g. for DS8000, DS6000, DS4000, DS5000, SVC), the default device attributes are algorithm=load_balance and reserve_policy=no_reserve, which means that no reservation is applied to the device. The SCSI-2 reservation that is typically assumed to be in place after a varyonvg of an AIX volume group is not actually applied with the SDDPCM default settings. In this case the AIX volume group might be varied on and accessed by other initiators, and even by other host systems!

If the AIX volume group is shared between hosts (without cluster services such as PowerHA with enhanced concurrent volume groups) and a SCSI-2 reservation should prevent other hosts from accessing it, the device reserve_policy can be set to single_path (SCSI-2 reserve) after first setting the device path selection algorithm to fail_over. Any attempt to set the algorithm to round_robin, load_balance, or load_balance_port with the single_path reserve policy will fail. With the single_path reservation methodology and the fail_over path selection algorithm, only one active path is used for I/O at a time. The active path selected for I/O can be configured with the chpath command by setting the path priority attribute accordingly.
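The compatibility rule between the path selection algorithm and the single_path reserve policy can be written down as a small pre-check. The following is only a sketch: check_reserve_combo is a hypothetical helper, not an AIX or SDDPCM command, and it encodes nothing beyond the single rule stated in the text (single_path requires fail_over). It does not verify which algorithms a particular device actually offers.

```shell
#!/bin/sh
# Hypothetical helper (not part of AIX or SDDPCM).
# Usage: check_reserve_combo <algorithm> <reserve_policy>
# Exit status: 0 = combination allowed by the rule in the text,
#              1 = rejected (single_path with a multi-path algorithm),
#              2 = reserve policy name not known to this sketch.
check_reserve_combo() {
    algorithm=$1
    reserve_policy=$2
    case "$reserve_policy" in
        single_path)
            # A SCSI-2 reserve ties the device to one path,
            # so only fail_over may be combined with it.
            [ "$algorithm" = "fail_over" ]
            ;;
        no_reserve|PR_exclusive|PR_shared)
            # This sketch places no restriction on these policies.
            return 0
            ;;
        *)
            return 2
            ;;
    esac
}
```

Such a check could be run before calling chdev, so that an unsupported combination such as load_balance with single_path is caught in a script before the device change is attempted.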
AIX native MPIO

A DS4800 hdisk using native AIX MPIO, for example, has the default attributes algorithm=fail_over and reserve_policy=single_path. The SCSI-2 reservation methodology is applied to the hdisk when a varyonvg is done on a volume group.

To use multiple paths with native AIX MPIO, the hdisk’s algorithm attribute must be set to round_robin and the reserve_policy attribute to no_reserve (no_reserve is fine for most situations, but the right choice depends on your requirements and your specific SCSI reservation policy) using:

# chdev -l

This change requires a reboot. Alternatively, stop using the hdisks (vary off any volume group that contains them) and omit the -P flag so that the change takes effect without a reboot. Note that a Virtual SCSI disk offers only fail_over as path selection algorithm; round_robin is not available there!

The MPIO reserve_policy defines whether a reservation methodology is employed when the device is opened. The values are as follows:

# lsattr -Rl
no_reserve
single_path
PR_exclusive
PR_shared

no_reserve
Does not apply a reservation methodology for the device. The device might be accessed by other initiators, and these initiators might be on other host systems.

single_path
Applies a SCSI-2 reserve methodology for the device, which means the device can be accessed only by the initiator that issued the reserve. This policy prevents other initiators in the same host or on other hosts from accessing the device. It uses the SCSI-2 reserve to lock the device to a single initiator (path), and commands routed through any other path result in a reservation conflict. Path selection algorithms that alternate commands among multiple paths can result in thrashing when the single_path reserve value is selected. As an example, assume a device-specific PCM has a required attribute that is set to a value that distributes I/O across multiple paths. When single_path reserve is in effect, the disk driver must issue a bus device reset (BDR) to break the previous reservation and then issue a reserve on the new path before sending the next command. Each time a different path is selected, thrashing occurs and performance degrades because of the overhead of sending a BDR and issuing a reserve to the target device. (The AIX PCM does not allow you to select an algorithm that could cause thrashing.)

PR_exclusive
Applies a SCSI-3 persistent-reserve, exclusive-host methodology when the device is opened. The PR_key_value attribute value must be unique for each host system. The PR_key_value attribute is used to prevent access to the device by initiators from other host systems.

PR_shared
Applies a SCSI-3 persistent-reserve, shared-host methodology when the device is opened. The PR_key_value must be a unique value for each host system. Initiators from other host systems are required to register before they can access the device.

The PR_key_value is required only if the device supports any of the persistent reserve policies: PR_exclusive or PR_shared.

For more information on native AIX MPIO and the related device attributes, please take a look at: System p and AIX Information Center – Multiple Path I/O

AIX MPIO with SDDPCM (2.4.x.x)

The default path selection algorithm with SDDPCM is load_balance, with the device reserve policy set to no_reserve, which means that no reservation methodology is applied to the device.
Here is an example of the default attributes of an SVC disk on a system running AIX 6.1 TL3 SP1 with SDDPCM 2.4.0.2:

# lsdev -l hdisk1
hdisk1 Available 17-T1-01 MPIO FC 2145
# lsattr -Rl hdisk1 -a algorithm
fail_over
round_robin
load_balance
load_balance_port
# lsattr -Rl hdisk1 -a reserve_policy
no_reserve
single_path
PR_exclusive
PR_shared
# lsattr -Rl hdisk1 -a hcheck_mode
enabled
failed
nonactive
# lsattr -Rl hdisk1 -a hcheck_interval
0...3600 (+1)

If AIX volume groups are shared between hosts (without cluster services such as PowerHA with enhanced concurrent volume groups) and a SCSI-2 reservation should prevent other hosts from accessing them, the device reserve_policy can be set to single_path (SCSI-2 reserve) after first setting the device path selection algorithm to fail_over. The path selection algorithm can be changed online with the pcmpath set device [device no.] algorithm [fo/rr/lb/lbp] command. The reserve_policy attribute cannot be changed while the device is online (e.g. while the volume group is varied on). Here a varyoffvg of the AIX volume group is required or with a chdev -l

With the single_path reservation methodology and the fail_over path selection algorithm, only one active path is used for I/O at a time. When multiple paths are available, the path priority attribute can be set to determine which paths to use. The path priority can be displayed with the lspath command and changed with the chpath command. The highest path priority is 1, which is the default for all paths. Increasing the value of the path priority attribute lowers the priority. Changes to the path priority can be made online. When the algorithm attribute value is fail_over, the paths are kept in a list. Multiple paths can have the same priority value, but if all paths have the same value, selection is based on when each path was configured.
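The priority rule just described (lowest value wins, ties resolved by configuration order) can be illustrated with a few lines of shell. This is a sketch of the rule as stated in the text, not of the PCM’s actual implementation. pick_preferred_path is a hypothetical helper, and its input format, one "path_id priority" pair per line in configuration order, is an assumption made for this illustration.

```shell
#!/bin/sh
# Hypothetical helper (not an AIX command).
# stdin:  lines of "<path_id> <priority>", listed in configuration order
# stdout: the path_id with the numerically lowest priority value;
#         on a tie the path listed first (configured earliest) is kept.
pick_preferred_path() {
    awk '
        NR == 1 || ($2 + 0) < best { best = $2 + 0; id = $1 }
        END { if (NR > 0) print id }
    '
}

# Illustrative input: path 0 was deprioritized, paths 1 and 2 tie.
printf '0 2\n1 1\n2 1\n' | pick_preferred_path
```

The strict less-than comparison is what keeps the earliest-listed path when several paths share the same priority value.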
Here is an example of setting the device attributes to fail_over with single_path reserve (SCSI-2 reservation) and modifying the path priority:

# pcmpath set device 1 algorithm fo
# chdev -l hdisk1 -a reserve_policy=single_path -P
hdisk1 changed
# pcmpath query device 1
# lspath -l hdisk1 -F "name:path_id:connection:parent:path_status:status"
hdisk1:0:5005076801102d0c,0:fscsi0:Available:Enabled
hdisk1:1:5005076801102cd9,0:fscsi0:Available:Enabled
hdisk1:2:5005076801202d0c,0:fscsi2:Available:Enabled
hdisk1:3:5005076801202cd9,0:fscsi2:Available:Enabled
hdisk1:4:5005076801302d0c,0:fscsi1:Available:Enabled
hdisk1:5:5005076801302cd9,0:fscsi1:Available:Enabled
hdisk1:6:5005076801402d0c,0:fscsi3:Available:Enabled
hdisk1:7:5005076801402cd9,0:fscsi3:Available:Enabled
# lspath -AHE -l hdisk1 -p fscsi1 -w 5005076801302cd9,0 -a priority
attribute value description user_settable
priority 1 Priority True
# chpath -l hdisk1 -p fscsi1 -w 5005076801302cd9,0 -a priority=2
path Changed
# lspath -AHE -l hdisk1 -p fscsi1 -w 5005076801302cd9,0 -a priority
attribute value description user_settable
priority 2 Priority True
# pcmpath query device 1
# lspath -l hdisk1 -F "name path_id connection parent path_status status"|while read hd pth cnt fc rst; do echo "$pth => $(lspath -AE -l hdisk1 -p $fc -w $cnt -a priority)" ; done
0 => priority 1 Priority True
1 => priority 1 Priority True
2 => priority 1 Priority True
3 => priority 1 Priority True
4 => priority 1 Priority True
5 => priority 2 Priority True
6 => priority 1 Priority True
7 => priority 1 Priority True

For more information on path priorities, please refer to System p and AIX Information Center – Path control module attributes

For more information on SDDPCM, please refer to the Multipath Subsystem Device Driver User’s Guide
ftp://ftp.software.ibm.com/storage/subsystem/UG/1.8–2.4/SDD_1.8–2.4_User_Guide_English_version.pdf

The SDDPCM 2.1.0.0 (and later) fileset provides two persistent reserve command tools:

(1) pcmquerypr
The pcmquerypr command provides a set of persistent reserve functions. This command supports the following persistent reserve service actions:
This command can be issued to all system MPIO devices, including MPIO devices not supported by SDDPCM.

(2) pcmgenprkey
The pcmgenprkey command can be used to set or clear the PR_key_value ODM attribute for all SDDPCM MPIO devices. It can also be used to query and display the reservation policy of all SDDPCM MPIO devices and the persistent reserve key, if those devices have a PR key. (see: Multipath Subsystem Device Driver User’s Guide, SDDPCM utility programs, p.133f)

Additionally, the SDDPCM fileset provides the relbootrsv command. The relbootrsv command releases a SCSI-2 reserve on boot devices or on active nonboot volume groups. If you specify a VGname (volume group name), relbootrsv releases the SCSI-2 reserve of the specified non-SAN boot volume group; otherwise, it releases the SCSI-2 reserve of a SAN boot volume group (rootvg). (see: Multipath Subsystem Device Driver User’s Guide, Migrating SDDPCM, p.115 and Multipath SAN boot support, p.128)

SDDPCM supports four types of MPIO reserve policies. You can select one of the four reserve policies based on the configuration environment or application needs. The supported reserve policies are:
No Reserve reservation policy
If you set MPIO devices with this reserve policy, no reserve is made on the MPIO devices. A device without a reservation can be accessed by any initiator at any time. I/O can be sent from all the paths of the MPIO device. This is the default reserve policy of SDDPCM.

Exclusive Host Access single-path reservation policy
This is the SCSI-2 reservation policy. If you set this reserve policy for MPIO devices, only the fail_over path selection algorithm can be selected for the devices. With this reservation policy, an MPIO device has all paths opened; however, only one path makes a SCSI-2 reservation on the device. I/O can only be sent through this path. When this path is broken, the reserve is released, another path is selected, and the SCSI-2 reserve is reissued by this new path. All input and output is now routed to this new path.

Persistent Reserve Exclusive Host Access reservation policy
If you set an MPIO device with this persistent reserve policy, a persistent reservation is made on this device with a persistent reserve (PR) key. Any initiators that register with the same PR key can access this device. Normally, you should pick a unique PR key for a server. Different servers should have different unique PR keys. I/O is routed to all paths of the MPIO device, because all paths of an MPIO device are registered with the same PR key. In a nonconcurrent clustering environment, such as HACMP, this is the reserve policy that you should select. Current HACMP clustering software supports the no_reserve policy with Enhanced Concurrent Mode volume groups. HACMP support for persistent reserve policies for supported storage MPIO devices is not available.

Persistent Reserve Shared Host Access reservation policy
If you set an MPIO device with this persistent reserve policy, a persistent reservation is made on this device with a persistent reserve (PR) key. However, any initiators that implemented persistent registration can access this MPIO device, even if the initiators are registered with different PR keys. In a concurrent clustering environment, such as HACMP, this is the reserve policy that you should select for sharing resources among multiple servers. Current HACMP clustering software supports the no_reserve policy with Enhanced Concurrent Mode volume groups. HACMP support for persistent reserve policies for supported storage MPIO devices is not available.
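Because no_reserve is the SDDPCM default and leaves a device open to other initiators, it can help to list which disks carry no reservation policy. The sketch below only filters text: flag_unreserved is a hypothetical helper, and the "hdisk reserve_policy" input format is an assumption for this illustration. On a real system the pairs would have to be collected first, for example from lsattr output for each hdisk.

```shell
#!/bin/sh
# Hypothetical helper (not an AIX or SDDPCM command).
# stdin:  lines of "<hdisk> <reserve_policy>"
# stdout: names of the disks whose reserve_policy is no_reserve,
#         i.e. the disks on which no reservation is applied.
flag_unreserved() {
    awk '$2 == "no_reserve" { print $1 }'
}

# Illustrative input only; these disk names and values are made up.
printf 'hdisk1 no_reserve\nhdisk2 single_path\nhdisk3 PR_exclusive\n' | flag_unreserved
```

A disk flagged here is not necessarily misconfigured: as noted above, no_reserve is the supported policy for HACMP with Enhanced Concurrent Mode volume groups.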