Category: Servers and Storage

2010-05-14 23:47:34

NetApp Filers are best known as NAS devices, but I think few people appreciate just what a versatile storage platform they really are. With a single filer you can provide NAS (both CIFS and NFS), Fibre Channel SAN (both Fabric and DAS), and IP SAN (iSCSI and iFCP). There have been times in my past when I knew I needed storage for a project but it wasn't clear at the time of purchase exactly how we would need to access that storage. In those cases, choosing a NetApp Filer gave me the widest range of possible solutions to the problem, and let me keep using the Filer beyond its original intent.

In releases prior to OnTap 7g, volumes were defined much like they are with other tools, creating an intimate relationship between volume and disk. A volume (on which a WAFL filesystem sits that contains data) would be created on top of one or more RAID Groups (RG). Each RAID group is RAID4, containing n data disks + 1 parity disk. In the 6.x releases of OnTap we got a nifty feature called Double Parity (DP), which allows you to have 2 parity disks (this isn't just a mirror of the parity disk). With RAID4 you can lose no more than 1 disk, which means that when a disk fails your pants are down until a spare takes over, and with 300GB drives that can take a while. With RAID-DP you can lose up to 2 disks, which covers you during a spare reconstruction and gives you a little extra security and peace of mind.
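To make the single-parity limit concrete, here is a minimal Python sketch of XOR parity, the general idea behind a RAID4 parity disk. The disk contents are made-up sample bytes, and this is only an illustration of the principle, not how OnTap implements it. It does not attempt RAID-DP's second parity disk.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks, one 4-byte stripe each.
data_disks = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity_disk = xor_blocks(data_disks)

# Simulate losing disk 1 and rebuilding it from the survivors plus parity.
survivors = [data_disks[0], data_disks[2], parity_disk]
rebuilt = xor_blocks(survivors)
```

With one parity block per stripe there is only enough information to solve for one missing disk, which is why a second failure during reconstruction is fatal on plain RAID4.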

The following example is of a traditional volume:

nova> vol create trad_vol0 -t raid_dp 5
Creation of a volume with 5 disks has completed.
nova> vol status
      trad_vol0 online     raid_dp, trad
nova> sysconfig -r
Volume trad_vol0 (online, raid_dp) (zoned checksums)
  Plex /trad_vol0/plex0 (online, normal, active, pool0)
    RAID group /trad_vol0/plex0/rg0 (normal)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      --------- ------  ------------- ---- ---- ---- ----- --------------    --------------
      dparity   7.48    7     6   0   FC:A   0  FCAL 10000 34500/70656000    34732/71132960
      parity    7.41    7     5   1   FC:A   0  FCAL 10000 34500/70656000    34732/71132960
      data      7.33    7     4   1   FC:A   0  FCAL 10000 34500/70656000    34732/71132960
      data      7.49    7     6   1   FC:A   0  FCAL 10000 34500/70656000    34732/71132960
      data      7.42    7     5   2   FC:A   0  FCAL 10000 34500/70656000    34732/71132960
...
nova> df -h trad_vol0
Filesystem               total       used      avail capacity  Mounted on
/vol/trad_vol0/           72GB      132KB       72GB       0%  /vol/trad_vol0/
/vol/trad_vol0/.snapshot       18GB        0KB       18GB       0%  /vol/trad_vol0/.snapshot

When you create a Traditional Volume (simply referred to in pre-7g releases as a "volume"), the volume sits on top of the RAID Groups you created. If you want to increase the size of the volume you must add more disks, and for performance reasons you typically want to add full RAID groups at a time. This method of allocation is fairly inflexible: you can't shrink volumes, and your volume sizing is based on the physical disk sizes, not simply on the capacity you need. To make up for this, you can create a Quota Tree (qtree) within a traditional volume, which by and large can be treated like a volume itself. For instance, you can replicate individual qtrees using snapmirror, define different access types (UNIX, NTFS, or Mixed), etc. But qtrees are only so flexible.

nova> qtree create /vol/trad_vol0/devteam1
nova> qtree security /vol/trad_vol0/devteam1 mixed
nova> rdfile /etc/quotas
/vol/trad_vol0/devteam1 tree 4G 20K
nova> qtree status
Volume   Tree     Style Oplocks  Status
-------- -------- ----- -------- ---------
trad_vol0          unix  enabled  normal
trad_vol0 devteam1 mixed enabled  normal
nova> quota report
                                 K-Bytes             Files
Type       ID    Volume    Tree  Used      Limit     Used    Limit   Quota Specifier
----- -------- -------- -------- --------- --------- ------- ------- ---------------
tree         1 trad_vol0 devteam1         0   4194304       1   20480 /vol/trad_vol0/devteam1
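The limits in the quota report above are just the suffixed values from /etc/quotas expanded with binary multipliers. A small Python sketch of that arithmetic follows. The function names are hypothetical, and it only handles suffixed tree quota lines like the one in this example, not the full quotas file syntax.

```python
SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_count(text):
    """Turn a suffixed value such as '4G' or '20K' into a plain integer."""
    text = text.strip().upper()
    if not text or text[-1] not in SUFFIXES:
        raise ValueError("this sketch expects a K, M or G suffix")
    return int(text[:-1]) * SUFFIXES[text[-1]]


def parse_tree_quota(line):
    """Split a tree quota line into target, disk limit (KB) and file limit."""
    target, qtype, disk, files = line.split()
    if qtype != "tree":
        raise ValueError("this sketch only handles tree quotas")
    return target, parse_count(disk) // 1024, parse_count(files)


quota = parse_tree_quota("/vol/trad_vol0/devteam1 tree 4G 20K")
```

The result carries a disk limit of 4194304 KB and a file limit of 20480, the same figures shown in the K-Bytes and Files limit columns.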

Enter 7g, where we get a new concept called aggregates. This adds a layer of abstraction between the physical disks and the volumes. The idea is simple: allocate a certain number of disks to a big aggregate, which is built on RAID Groups just like traditional volumes. However, there is no WAFL file system within an aggregate; it's just allocatable space. Now within this big aggregate we can create flexible volumes ("flex vols"). Because a flex vol is abstracted from the underlying disk, you can create a volume to meet your capacity needs, without regard for physical layout. Furthermore, you can grow or shrink a flex vol easily at any time. This truly is flexible. One advantage of this is that over time, as your needs and requirements change, you can manipulate your storage as you actually need to by sloshing things around within the aggregate.
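As a rough illustration of how an aggregate's disks divide into RAID groups, here is a Python sketch. It assumes disks fill one group up to the RAID group size before starting the next, and that each RAID4 group gives up one disk to parity. The function name and the fill order are assumptions for illustration, not a description of OnTap's allocator.

```python
def layout_raid_groups(disk_count, group_size, parity_per_group=1):
    """Split disks into RAID groups and count parity and data disks in each."""
    groups = []
    remaining = disk_count
    while remaining > 0:
        size = min(group_size, remaining)
        if size <= parity_per_group:
            raise ValueError("a RAID group needs at least one data disk")
        groups.append({"parity": parity_per_group, "data": size - parity_per_group})
        remaining -= size
    return groups


# Twelve disks in RAID4 groups of six: two groups, ten data disks in total.
groups = layout_raid_groups(12, 6)
data_disks = sum(g["data"] for g in groups)
```

Passing parity_per_group=2 gives the equivalent count for RAID-DP, where each group of six would hold four data disks.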

The following is an example of an aggregate:

nova> aggr create aggr0 -r 6 -t raid4 12@36
Creation of an aggregate with 12 disks has completed.
nova> sysconfig -r
Aggregate aggr0 (online, raid4) (zoned checksums)
  Plex /aggr0/plex0 (online, normal, active, pool0)
    RAID group /aggr0/plex0/rg0 (normal)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      --------- ------  ------------- ---- ---- ---- ----- --------------    --------------
      parity    7.50    7     6   2   FC:A   0  FCAL 10000 34500/70656000    34732/71132960
      data      7.34    7     4   2   FC:A   0  FCAL 10000 34500/70656000    34732/71132960
      data      7.43    7     5   3   FC:A   0  FCAL 10000 34500/70656000    34732/71132960
      data      7.51    7     6   3   FC:A   0  FCAL 10000 34500/70656000    34732/71132960
      data      7.35    7     4   3   FC:A   0  FCAL 10000 34500/70656000    34732/71132960
      data      7.44    7     5   4   FC:A   0  FCAL 10000 34500/70656000    34732/71132960

    RAID group /aggr0/plex0/rg1 (normal)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      --------- ------  ------------- ---- ---- ---- ----- --------------    --------------
      parity    7.36    7     4   4   FC:A   0  FCAL 10000 34500/70656000    34732/71132960
      data      7.45    7     5   5   FC:A   0  FCAL 10000 34500/70656000    34732/71132960
      data      7.52    7     6   4   FC:A   0  FCAL 10000 34500/70656000    34732/71132960
      data      7.37    7     4   5   FC:A   0  FCAL 10000 34500/70656000    34732/71132960
      data      7.46    7     5   6   FC:A   0  FCAL 10000 34500/70656000    34732/71132960
      data      7.53    7     6   5   FC:A   0  FCAL 10000 34500/70656000    34732/71132960
...
nova> vol create flexvol0 aggr0 40g
Creation of volume 'flexvol0' with size 40g on containing aggregate
'aggr0' has completed.
nova> vol create flexvol1 aggr0 1g
Creation of volume 'flexvol1' with size 1g on containing aggregate
'aggr0' has completed.
nova> vol size flexvol0 -20g
vol size: Flexible volume 'flexvol0' size set to 20g.
nova> vol size flexvol1 +9g
vol size: Flexible volume 'flexvol1' size set to 10g.
nova> df -h
Filesystem               total       used      avail capacity  Mounted on
...
/vol/flexvol0/            16GB       92KB       15GB       0%  /vol/flexvol0/
/vol/flexvol0/.snapshot     4096MB        0KB     4096MB       0%  /vol/flexvol0/.snapshot
/vol/flexvol1/          8192MB       96KB     8191MB       0%  /vol/flexvol1/
/vol/flexvol1/.snapshot     2048MB        0KB     2048MB       0%  /vol/flexvol1/.snapshot
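The resize arguments in the session above are either relative (a leading + or -) or absolute. Here is a minimal Python sketch of that arithmetic, using a hypothetical helper that only understands whole gigabytes:

```python
def apply_size_spec(current_gb, spec):
    """Apply a size spec such as '+9g', '-20g' or '10g' to a size in GB."""
    spec = spec.strip().lower()
    if not spec.endswith("g"):
        raise ValueError("this sketch only understands sizes in gigabytes")
    if spec[0] in "+-":
        new_size = current_gb + int(spec[:-1])  # int() accepts the sign
    else:
        new_size = int(spec[:-1])
    if new_size <= 0:
        raise ValueError("a volume cannot shrink to zero or below")
    return new_size


flexvol0_gb = apply_size_spec(40, "-20g")
flexvol1_gb = apply_size_spec(1, "+9g")
```

These give 20 and 10, matching the sizes the filer reported for flexvol0 and flexvol1.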

Flexvols let us do some really cool things. One of my favorites is cloning. We can take an existing flexvol and create a clone from it, either using the volume itself (right now) or using an existing snapshot from some point in the past (like before your server blew up). A clone looks exactly like the volume it was created from (its parent flexvol) but it uses no additional physical storage! Using snapshots and clones together provides a powerful tool.

nova> vol clone create flexclone0 -b flexvol0
Creation of clone volume 'flexclone0' has completed.
nova> df -h
Filesystem               total       used      avail capacity  Mounted on
...
/vol/flexvol0/            16GB      104KB       15GB       0%  /vol/flexvol0/
/vol/flexclone0/          16GB     1552KB       15GB       0%  /vol/flexclone0/
nova> vol size flexvol0 -10g
vol size: Flexible volume 'flexvol0' size set to 10g.
nova> df -h
Filesystem               total       used      avail capacity  Mounted on
...
/vol/flexvol0/          8192MB      104KB     8191MB       0%  /vol/flexvol0/
/vol/flexclone0/          16GB     1552KB       15GB       0%  /vol/flexclone0/
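Why a clone costs no extra space at creation can be shown with a toy copy-on-write model in Python. This is a deliberately simplified sketch with made-up class and function names. It shows only the block-sharing idea and is not a model of WAFL internals.

```python
class Volume:
    """Toy volume: a map from block number to a shared block object."""

    def __init__(self, blocks=None):
        self.blocks = dict(blocks or {})

    def clone(self):
        # Copy only the block map; the block objects themselves are shared.
        return Volume(self.blocks)

    def write(self, block_no, data):
        # A write points this volume at a new block and leaves others alone.
        self.blocks[block_no] = bytearray(data)


def unique_blocks(*volumes):
    """Count distinct block objects referenced by the given volumes."""
    return len({id(b) for v in volumes for b in v.blocks.values()})


parent = Volume({0: bytearray(b"boot"), 1: bytearray(b"data")})
clone = parent.clone()
blocks_before = unique_blocks(parent, clone)  # clone adds no new blocks
clone.write(1, b"test")
blocks_after = unique_blocks(parent, clone)   # one extra block after a write
```

Only blocks that change after the clone is taken consume new space, and the parent still sees its original data.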

Snapshot management itself is really easy. The operation takes milliseconds. Because of the robust snapshotting capabilities, which are used extensively within the filers and paired with WAFL (Write Anywhere File Layout), operations like volume creation, snapshot creation, cloning, etc., require almost no work at all. No more sitting around waiting an hour for volumes to become usable. This is the real magic behind NetApp Filers.

Here's an interesting example. Say your PostgreSQL system got attacked and the database was corrupted. First priority is going to be getting the databases back online, but we'd also like to have a copy of that corrupt data for further analysis when some of our DBAs get into the office in the morning. If we were using a NetApp Filer with OnTap 7g, we could create a clone of the affected volume based on a prior snapshot that's known to be good, make that available to the system (export the NFS share, CIFS share it, or remap the LUN for iSCSI or SAN), and be back online in the time it takes to type the commands. Then we could mount the corrupt volume on a test system for analysis. When we're all done with the analysis, we could split off the clone we created so that it becomes its own unique volume and keep using it as is (not losing the changes made between the time we recovered and now), or we could just delete the clone and revert the whole volume back to that known good snapshot. The only thing to remember about reverting to a previous snapshot is that you will destroy all snapshots made from that point until now, so cloning based on a point-in-time snapshot and then being able to split off that clone gives us some flexibility in getting around that.
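The cost of reverting can be pictured with a short Python sketch. The snapshot names are hypothetical, and the sketch assumes the snapshot you revert to is itself kept while everything newer is destroyed.

```python
def revert_to(snapshots, name):
    """Return (kept, destroyed) snapshot lists after reverting to `name`.

    `snapshots` is ordered oldest to newest.
    """
    if name not in snapshots:
        raise ValueError(f"unknown snapshot: {name}")
    index = snapshots.index(name)
    return snapshots[: index + 1], snapshots[index + 1 :]


# Hypothetical snapshot names, oldest first.
history = ["nightly.2", "nightly.1", "nightly.0", "hourly.0"]
kept, destroyed = revert_to(history, "nightly.1")
```

Cloning from nightly.1 instead would leave all four snapshots on the parent volume untouched, which is the flexibility described above.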

When dealing with iSCSI or FCP (Fibre Channel Protocol, i.e. SAN) we are making block allocations available, which we call LUNs. A LUN is actually just a big allocated file sitting on a volume's filesystem (or so it appears). So in order to create a LUN you must first create a volume as we discussed above. Then when creating the LUN, you'll specify the path within that volume.

nova> vol create qa_luns aggr0 100g
Creation of volume 'qa_luns' with size 100g on containing aggregate
'aggr0' has completed.
nova> df -h /vol/qa_luns
Filesystem               total       used      avail capacity  Mounted on
/vol/qa_luns/             80GB      100KB       79GB       0%  /vol/qa_luns/
/vol/qa_luns/.snapshot       20GB        0KB       20GB       0%  /vol/qa_luns/.snapshot
nova> lun create -s 45g /vol/qa_luns/qa_solaris1
nova> lun create -s 20g /vol/qa_luns/qa_linux1
nova> lun show
        /vol/qa_luns/qa_linux1        20g (21474836480)   (r/w, online)
        /vol/qa_luns/qa_solaris1      45g (48318382080)   (r/w, online)
nova> df -h /vol/qa_luns
Filesystem               total       used      avail capacity  Mounted on
/vol/qa_luns/             80GB       65GB       14GB      81%  /vol/qa_luns/
/vol/qa_luns/.snapshot       20GB       68KB       19GB       0%  /vol/qa_luns/.snapshot
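The byte counts in the lun show output above follow directly from treating "g" as 2**30 bytes. A quick Python check of that arithmetic, with a hypothetical helper name:

```python
GIB = 1024 ** 3


def lun_bytes(size):
    """Convert a size like '45g' into bytes, treating g as 2**30 bytes."""
    size = size.strip().lower()
    if not size.endswith("g"):
        raise ValueError("this sketch only understands sizes in gigabytes")
    return int(size[:-1]) * GIB


luns = {"/vol/qa_luns/qa_solaris1": "45g", "/vol/qa_luns/qa_linux1": "20g"}
sizes = {path: lun_bytes(size) for path, size in luns.items()}
total_gib = sum(sizes.values()) // GIB
remaining_gib = 80 - total_gib  # 80GB is the usable space df reported
```

The two LUNs account for the 65GB that df shows as used. The sketch leaves 15GB free where df reports 14GB, presumably because of rounding and a small amount of filesystem overhead.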

You can see how easy it is to create LUNs. You'll notice that although I created a 100GB volume, 20GB of it is allocated to snapshots. We can limit that easily:

nova> snap reserve qa_luns 5
nova> df -h /vol/qa_luns
Filesystem               total       used      avail capacity  Mounted on
/vol/qa_luns/             95GB       65GB       29GB      69%  /vol/qa_luns/
/vol/qa_luns/.snapshot     5120MB       68KB     5119MB       0%  /vol/qa_luns/.snapshot
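The split between usable space and snapshot space is a simple percentage of the volume size. A minimal Python sketch with a hypothetical function name:

```python
def split_volume(size_gb, reserve_percent):
    """Return (usable_gb, snapshot_gb) for a volume and its snap reserve."""
    if not 0 <= reserve_percent <= 100:
        raise ValueError("reserve must be between 0 and 100 percent")
    snapshot_gb = size_gb * reserve_percent / 100
    return size_gb - snapshot_gb, snapshot_gb


before = split_volume(100, 20)  # the 20% reserve seen on the new volume
after = split_volume(100, 5)    # after `snap reserve qa_luns 5`
```

This reproduces the 80GB/20GB and 95GB/5GB splits in the df output before and after the change.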

Snapshot sizing and the arguments against turning off snapshotting are beyond the scope of this blog entry.

Once you've created a LUN you make it accessible by mapping it to an initiator group. Initiator groups contain one or more iSCSI IQNs or FCP WWNs which map to a LUN on the Filer. This is traditionally known as LUN masking, because you're making sure that only the initiators (clients) that should see certain LUNs can. This sounds a lot more difficult than it is:

nova> lun show
        /vol/qa_luns/qa_linux1        20g (21474836480)   (r/w, online)
        /vol/qa_luns/qa_solaris1      45g (48318382080)   (r/w, online)
nova> igroup create -f QA-Solaris 10:00:00:00:c9:2b:51:2c
nova> lun map /vol/qa_luns/qa_solaris1 QA-Solaris 0
nova> igroup show QA-Solaris
    QA-Solaris (FCP) (ostype: default):
        10:00:00:00:c9:2b:51:2c (not logged in)
nova> lun show /vol/qa_luns/qa_solaris1
        /vol/qa_luns/qa_solaris1      45g (48318382080)   (r/w, online, mapped)

Easy, easy. iSCSI and FCP are handled in exactly the same way, except for a single flag. Notice the -f in the igroup create above: that means FCP. Change it to -i and you're doing iSCSI instead.
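LUN masking boils down to a lookup: an initiator sees a LUN only if it belongs to the igroup that LUN is mapped to. Here is a toy Python model of that relationship. The class and method names are made up for illustration, and the model maps each LUN to a single igroup for simplicity.

```python
class Filer:
    """Toy LUN-masking model: igroups hold initiators, LUNs map to igroups."""

    def __init__(self):
        self.igroups = {}   # igroup name -> (protocol, set of initiators)
        self.lun_maps = {}  # LUN path -> (igroup name, LUN id)

    def igroup_create(self, name, protocol, *initiators):
        if protocol not in ("fcp", "iscsi"):
            raise ValueError("protocol must be 'fcp' or 'iscsi'")
        self.igroups[name] = (protocol, set(initiators))

    def lun_map(self, path, igroup, lun_id):
        if igroup not in self.igroups:
            raise KeyError(f"no such igroup: {igroup}")
        self.lun_maps[path] = (igroup, lun_id)

    def visible_luns(self, initiator):
        """Return the LUN paths this initiator is allowed to see."""
        return sorted(
            path
            for path, (igroup, _lun_id) in self.lun_maps.items()
            if initiator in self.igroups[igroup][1]
        )


filer = Filer()
filer.igroup_create("QA-Solaris", "fcp", "10:00:00:00:c9:2b:51:2c")
filer.lun_map("/vol/qa_luns/qa_solaris1", "QA-Solaris", 0)
```

Any initiator not listed in QA-Solaris gets an empty list back from visible_luns, which is the masking behavior described above.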

This is only a taste of what you can do with a NetApp Filer. But I hope you can see that there is a huge amount of flexibility in these boxes that is either unavailable from other vendors or so expensive as to be impractical (replication on EMC vs Snapmirror on NetApp is a good example). I view NetApp Filers as the Swiss Army Knives of the storage world: they do everything you need to do and a ton of stuff that might come in handy some day (like that little toothpick). If you're just doing straight-up SAN storage, I highly recommend the Sun StorEdge 3510 (search Google for "cuddletech 3510" for more details), but when you need the flexibility to roll with whatever comes your way, nothing beats NetApp: The Versatile Storage Platform.


- - C O M M E N T S - -

Damn good article, Ben. I’m going to forward it to the rest of my sysadmin team. Our 3050C cluster is arriving soon. 8-)