AIX (Advanced Interactive eXecutive) is a series of proprietary Unix operating systems developed and sold by IBM.
Performance Optimization With Enhanced RISC (POWER) version 7 enables a unique performance advantage for AIX OS.
POWER7 features new capabilities using multiple cores and multiple CPU threads, creating a pool of virtual CPUs.
AIX 7 includes a new built-in clustering capability called Cluster Aware AIX.
POWER7 systems include the Active Memory Expansion feature.

Thursday, September 1, 2011

Cleanup a PVMISSING disk

While wandering around the web I found a blog with comparisons between Solaris 10 and AIX 6. One of them is this blog, which has several AIX articles. One article (scroll down a bit if you follow the link) was on how to trick the ODM into letting you remove a MISSING disk. Anyone who has followed an AIX administration course (well, the advanced one) knows that there is a command to do all this for you! Even if editing the ODM is fun for some of us (RAWR!).
Below is my extended guide for removing a missing PV from the VGDA of the other disks and from the AIX ODM.

Introduction

How a disk becomes PVMISSING is irrelevant. These things happen. Getting the system repaired is what matters! So here is the simpler way to correct the volume group VGDA and the AIX ODM.
The single command we will be using to remove the disk is:
ldeletepv -g VGID -p PVID
But, before we do, there are a number of steps we should follow as a matter of "best practice".
CASE: While the volume group is offline, maintenance is performed on the disks. One disk is/was damaged beyond repair, or replaced during the process. Now back at AIX the volumes are to be reactivated.
root@aix530:[/]lsvg -p vgExport 
0516-010 : Volume group must be varied on; use varyonvg command.
root@aix530:[/]varyonvg vgExport
PV Status:      hdisk1  00c39b8d69c45344        PVACTIVE
                hdisk2  00c39b8d043427b6        PVMISSING


The disk hdisk2 is PVMISSING. We assume hdisk2 with PVID 00c39b8d043427b6 is physically destroyed. All the data is lost; however, the AIX ODM and the VGDA on all the other disks in the volume group do not know this yet.
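
On a volume group with many disks, picking the missing ones out of the varyonvg report by eye gets tedious. Here is a minimal sketch that filters them, assuming the three-column "name, PVID, state" layout shown above. The function name missing_pvs is my own invention, not an AIX command, and the sample text is simply the transcript from this post.

```shell
#!/bin/sh
# Hypothetical helper: print "hdisk PVID" for every PVMISSING entry.
# Assumes the varyonvg "PV Status:" layout shown in the transcript above.
missing_pvs() {
    awk '
        # Drop the "PV Status:" label so every row has the same three fields.
        { sub(/^PV Status:/, "") }
        $3 == "PVMISSING" { print $1, $2 }
    '
}

# Feed it the report captured above; on a live system the varyonvg
# output would be piped into the function instead.
sample='PV Status:      hdisk1  00c39b8d69c45344        PVACTIVE
                hdisk2  00c39b8d043427b6        PVMISSING'
printf '%s\n' "$sample" | missing_pvs
```

The PVID it prints is the one needed later for the -p argument of ldeletepv.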

First, document what is lost. We need to know which logical volumes are (were) on the missing disk. Normally we could use lspv -l hdiskX (new, undocumented variation: lspv -l PVID); however, with the disk missing, this version of the command will not work. Instead, we use the VGID (volume group identifier).
1. Query the VGDA of the working disk to get the VGID and PVID of all disks in the volume group
root@aix530:[/]lqueryvg -p hdisk1 -vPt
Physical:       00c39b8d69c45344                2   0 
                00c39b8d043427b6                1   0 
VGid:           00c39b8d00004c000000011169c45a4b
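
To avoid retyping long hex strings, the two identifiers can be pulled out of that report with awk. This is only a sketch that assumes the "Physical:" and "VGid:" layout printed above; vgid_of and pvids_of are hypothetical names of mine.

```shell
#!/bin/sh
# Hypothetical helpers for the `lqueryvg -p hdiskN -vPt` layout shown above.

# Print the volume group identifier from the "VGid:" line.
vgid_of() {
    awk '$1 == "VGid:" { print $2 }'
}

# Print one PVID per line from the "Physical:" block.
pvids_of() {
    awk '
        $1 == "Physical:" { print $2; in_block = 1; next }
        # Continuation rows carry a 16-digit hex PVID in the first field.
        in_block && $1 ~ /^[0-9a-f]+$/ && length($1) == 16 { print $1; next }
        { in_block = 0 }
    '
}

# The report captured above, used here as sample input.
report='Physical:       00c39b8d69c45344                2   0
                00c39b8d043427b6                1   0
VGid:           00c39b8d00004c000000011169c45a4b'

printf '%s\n' "$report" | vgid_of
printf '%s\n' "$report" | pvids_of
```

On a live system the lqueryvg output would be piped in where the sample text is used.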

2. Get a list of all the logical volumes on the missing disk
root@aix530:[/]lspv -l -v 00c39b8d00004c000000011169c45a4b hdisk2
hdisk2:
LV NAME               LPs   PPs   DISTRIBUTION          MOUNT POINT
lvTest                512   512   109..108..108..108..79 /scratch
loglv00               1     1     00..00..00..00..01    N/A

(Note: lspv -l  00c39b8d043427b6 should give us the same output!)
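
Since this listing is the record of what has to be rebuilt later, it is worth saving in a compact form. A small sketch, assuming the column layout above with the mount point as the last field; lvs_on_pv is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: print "LV-name mount-point" for each row that
# follows the "LV NAME ..." header of an `lspv -l` listing.
lvs_on_pv() {
    awk '
        seen_header && NF > 0 { print $1, $NF }
        $1 == "LV" && $2 == "NAME" { seen_header = 1 }
    '
}

# The listing captured above, used here as sample input.
listing='hdisk2:
LV NAME               LPs   PPs   DISTRIBUTION          MOUNT POINT
lvTest                512   512   109..108..108..108..79 /scratch
loglv00               1     1     00..00..00..00..01    N/A'

printf '%s\n' "$listing" | lvs_on_pv
```
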
3. Verify all filesystems are unmounted.
root@aix530:[/]lsvg -l vgExport
vgExport:
LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
lvExport            jfs2       416   416   1    closed/syncd  /export
lvTest              jfs        512   512   1    closed/syncd  /scratch
loglv00             jfslog     1     1     1    closed/syncd  N/A


With this info I know that any data in /scratch is suspect, and should be restored from a backup.

4. Remove the damaged logical volumes from the volume group before deleting the missing disk from the VGDA of the other disks.
root@aix530:[/]rmfs /scratch
rmfs:  0506-936  Cannot read superblock on /dev/lvTest.
rmfs:  0506-936  Cannot read superblock on /scratch.
rmfs: Unable to clear superblock on /scratch
rmlv: Logical volume lvTest is removed.
root@aix530:[/]rmlv loglv00
Warning, all data contained on logical volume loglv00 will be destroyed.
rmlv: Do you wish to continue? y(es) n(o)? y
rmlv: Logical volume loglv00 is removed.
root@aix530:[/]lsvg -p vgExport
vgExport:
PV_NAME           PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
hdisk1            active            511         95          00..00..00..00..95
hdisk2            missing           542         29          51..18..51..51..51
root@aix530:[/]lsvg -l vgExport
vgExport:
LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
lvExport            jfs2       416   416   1    closed/syncd  /export
5. The volume group has been prepared: all damaged logical volume definitions have been removed. All that remains for cleanup is to remove the definition of the damaged disk from the VGDA of the remaining disk(s).
root@aix530:[/]ldeletepv -g 00c39b8d00004c000000011169c45a4b -p 00c39b8d043427b6

Note: there is no output from the above command when all goes well.
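
Because the command is silent and works on raw identifiers, a typo is easy to miss. Here is a sketch of a sanity check I would run before it, assuming VGIDs are 32 and PVIDs 16 lowercase hex digits, as in the lqueryvg output in this post. The helper is_hex_id is hypothetical, and the script only prints the ldeletepv line rather than running it.

```shell
#!/bin/sh
# Hypothetical guard: succeed only if $1 is exactly $2 lowercase hex digits.
is_hex_id() {
    [ "${#1}" -eq "$2" ] || return 1
    case $1 in
        *[!0-9a-f]*) return 1 ;;
    esac
    return 0
}

vgid=00c39b8d00004c000000011169c45a4b   # 32 digits, from lqueryvg
pvid=00c39b8d043427b6                   # 16 digits, the missing disk

if is_hex_id "$vgid" 32 && is_hex_id "$pvid" 16; then
    # Printed for review, not executed.
    echo "ldeletepv -g $vgid -p $pvid"
else
    echo "refusing: identifiers look malformed" >&2
fi
```
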

Now use the regular AIX commands to verify that the VGDA and ODM are in order.
root@aix530:[/]lsvg -p vgExport                                                
vgExport:
PV_NAME           PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
hdisk1            active            511         95          00..00..00..00..95
root@aix530:[/]mount /export
root@aix530:[/]lsvg -l vgExport
vgExport:
LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
lvExport            jfs2       416   416   1    open/syncd    /export

6. Various steps that I will only list here:
a. add a new disk to the volume group (extendvg)
b. remake the deleted logical volumes (mklv)
c. format, as needed, the log logical volumes (logform)
d. create the filesystems (crfs, or use smit)
e. restore the data from a backup (restore, tar, cpio, etc.)
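
For what it is worth, steps a through d could look roughly like the sketch below. It reuses the names and sizes from the listings above (lvTest, 512 LPs, jfs; loglv00, 1 LP, jfslog). The replacement disk hdisk3 is an assumption, as are the run and rebuild_plan helpers. By default it only prints the commands; check every line against your own system before setting DRY_RUN=0. Note that logform asks for confirmation when run for real.

```shell
#!/bin/sh
# Sketch only: prints the rebuild commands unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}
NEW_DISK=${NEW_DISK:-hdisk3}   # assumption: name of the replacement disk

# Echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rebuild_plan() {
    run extendvg vgExport "$NEW_DISK"                         # a. add the disk
    run mklv -y loglv00 -t jfslog vgExport 1 "$NEW_DISK"      # b. log LV
    run mklv -y lvTest -t jfs vgExport 512 "$NEW_DISK"        # b. data LV
    run logform /dev/loglv00                                  # c. format the log
    run crfs -v jfs -d lvTest -m /scratch -A yes              # d. filesystem
    run mount /scratch
}

rebuild_plan
```

Step e, the restore itself, depends entirely on how the backup was made, so it is left out.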
