Category: Servers & Storage
2010-05-14 23:47:34
NetApp Filers are best known as NAS devices, but I think few people appreciate just what a versatile storage platform they really are. With a single filer you can provide NAS (both CIFS and NFS), Fibre Channel SAN (both fabric-attached and direct-attached), and IP SAN (iSCSI and iFCP). There have been times in my past when I knew I needed storage for a project but it wasn't clear at the time of purchase exactly how we would need to access that storage, and in those cases choosing a NetApp Filer gave me the widest range of possible solutions to the problem and let me keep using that Filer well beyond its original intent.
In releases prior to OnTap 7g, volumes were defined much like they are with other tools, creating an intimate relationship between volume and disk. A volume (on which a WAFL filesystem sits that contains data) would be created on top of one or more RAID Groups (RG). Each RAID group is RAID4, containing n data disks plus 1 parity disk. In the 6.x releases of OnTap we got a nifty feature called Double Parity (DP), which allows you to have 2 parity disks (this isn't just a mirror of the parity, [PDF]). With RAID4 you can lose no more than 1 disk, which means that when a disk fails your pants are down until a spare takes over, and with 300GB drives that can take a while. By using RAID-DP you can lose up to 2 disks, which covers you during a spare reconstruction and gives you a little extra security and peace of mind.
The following example is of a traditional volume:
nova> vol create trad_vol0 -t raid_dp 5
Creation of a volume with 5 disks has completed.
nova> vol status
trad_vol0 online     raid_dp, trad
nova> sysconfig -r
Volume trad_vol0 (online, raid_dp) (zoned checksums)
  Plex /trad_vol0/plex0 (online, normal, active, pool0)
    RAID group /trad_vol0/plex0/rg0 (normal)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)  Phys (MB/blks)
      --------- ------  ------------- ---- ---- ---- ----- -------------- --------------
      dparity   7.48    7   6     0   FC:A  0   FCAL 10000 34500/70656000 34732/71132960
      parity    7.41    7   5     1   FC:A  0   FCAL 10000 34500/70656000 34732/71132960
      data      7.33    7   4     1   FC:A  0   FCAL 10000 34500/70656000 34732/71132960
      data      7.49    7   6     1   FC:A  0   FCAL 10000 34500/70656000 34732/71132960
      data      7.42    7   5     2   FC:A  0   FCAL 10000 34500/70656000 34732/71132960
...
nova> df -h trad_vol0
Filesystem                total   used  avail capacity  Mounted on
/vol/trad_vol0/            72GB  132KB   72GB       0%  /vol/trad_vol0/
/vol/trad_vol0/.snapshot   18GB    0KB   18GB       0%  /vol/trad_vol0/.snapshot
When you create a Traditional Volume (simply referred to in pre-7g releases as a "volume"), the volume sits on top of the RAID Groups you created. If you want to increase the size of the volume you must add more disks, and for performance reasons you typically want to add full RAID groups at a time. This method of allocation is fairly inflexible: you can't shrink volumes, and your volume sizing is based on the physical disk sizes, not simply the capacity you need. To make up for this, within a traditional volume you can create a Quota Tree (qtree), which by and large can be treated like a volume itself; for instance, you can replicate individual qtrees using SnapMirror (there's a quick sketch of that after the example below), define different security styles (UNIX, NTFS, or Mixed), and so on. But qtrees are only so flexible.
nova> qtree create /vol/trad_vol0/devteam1
nova> qtree security /vol/trad_vol0/devteam1 mixed
nova> rdfile /etc/quotas
/vol/trad_vol0/devteam1    tree    4G    20K
nova> qtree status
Volume    Tree      Style  Oplocks   Status
--------  --------  -----  --------  ---------
trad_vol0           unix   enabled   normal
trad_vol0 devteam1  mixed  enabled   normal
nova> quota report
                                      K-Bytes             Files
Type  ID   Volume    Tree         Used     Limit    Used   Limit  Quota Specifier
----- ---- --------- --------- -------- --------- ------- ------- ---------------
tree  1    trad_vol0 devteam1         0   4194304       1   20480 /vol/trad_vol0/devteam1
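As a minimal sketch of the qtree replication mentioned above, a qtree-level SnapMirror baseline in 7-mode is kicked off from the destination Filer. The second Filer ("backup") and its "mirrors" volume are made up for this example, and it assumes SnapMirror is licensed and enabled on both systems:

backup> snapmirror initialize -S nova:/vol/trad_vol0/devteam1 backup:/vol/mirrors/devteam1
backup> snapmirror status

Once the baseline transfer completes, scheduled updates are driven by /etc/snapmirror.conf on the destination.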
Enter 7g, where we get a new concept called aggregates. An aggregate adds a layer of abstraction between the physical disks and the volumes. The idea is simple: allocate a number of disks to a big aggregate, which is built on RAID Groups just like a traditional volume. However, there is no WAFL filesystem within an aggregate; it's just allocatable space. Within this big aggregate we can then create flexible volumes ("flexvols"). Because a flexvol is abstracted from the underlying disks, you can create a volume to meet your capacity needs without regard for the physical layout. Furthermore, you can grow or shrink a flexvol easily at any time. This truly is flexible. One advantage is that over time, as your needs and requirements change, you can manipulate your storage as you actually need to by sloshing things around within the aggregate.
The following is an example of an aggregate:
nova> aggr create aggr0 -r 6 -t raid4 12@36
Creation of an aggregate with 12 disks has completed.
nova> sysconfig -r
Aggregate aggr0 (online, raid4) (zoned checksums)
  Plex /aggr0/plex0 (online, normal, active, pool0)
    RAID group /aggr0/plex0/rg0 (normal)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)  Phys (MB/blks)
      --------- ------  ------------- ---- ---- ---- ----- -------------- --------------
      parity    7.50    7   6     2   FC:A  0   FCAL 10000 34500/70656000 34732/71132960
      data      7.34    7   4     2   FC:A  0   FCAL 10000 34500/70656000 34732/71132960
      data      7.43    7   5     3   FC:A  0   FCAL 10000 34500/70656000 34732/71132960
      data      7.51    7   6     3   FC:A  0   FCAL 10000 34500/70656000 34732/71132960
      data      7.35    7   4     3   FC:A  0   FCAL 10000 34500/70656000 34732/71132960
      data      7.44    7   5     4   FC:A  0   FCAL 10000 34500/70656000 34732/71132960

    RAID group /aggr0/plex0/rg1 (normal)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)  Phys (MB/blks)
      --------- ------  ------------- ---- ---- ---- ----- -------------- --------------
      parity    7.36    7   4     4   FC:A  0   FCAL 10000 34500/70656000 34732/71132960
      data      7.45    7   5     5   FC:A  0   FCAL 10000 34500/70656000 34732/71132960
      data      7.52    7   6     4   FC:A  0   FCAL 10000 34500/70656000 34732/71132960
      data      7.37    7   4     5   FC:A  0   FCAL 10000 34500/70656000 34732/71132960
      data      7.46    7   5     6   FC:A  0   FCAL 10000 34500/70656000 34732/71132960
      data      7.53    7   6     5   FC:A  0   FCAL 10000 34500/70656000 34732/71132960
...
nova> vol create flexvol0 aggr0 40g
Creation of volume 'flexvol0' with size 40g on containing aggregate 'aggr0' has completed.
nova> vol create flexvol1 aggr0 1g
Creation of volume 'flexvol1' with size 1g on containing aggregate 'aggr0' has completed.
nova> vol size flexvol0 -20g
vol size: Flexible volume 'flexvol0' size set to 20g.
nova> vol size flexvol1 +9g
vol size: Flexible volume 'flexvol1' size set to 10g.
nova> df -h
Filesystem                 total    used   avail capacity  Mounted on
...
/vol/flexvol0/              16GB    92KB    15GB       0%  /vol/flexvol0/
/vol/flexvol0/.snapshot   4096MB     0KB  4096MB       0%  /vol/flexvol0/.snapshot
/vol/flexvol1/            8192MB    96KB  8191MB       0%  /vol/flexvol1/
/vol/flexvol1/.snapshot   2048MB     0KB  2048MB       0%  /vol/flexvol1/.snapshot
Flexvols let us do some really cool things. One of my favorites is cloning. We can take an existing flexvol and create a clone from it, either from the volume as it stands right now or from an existing snapshot from some point in the past (like before your server blew up). A clone looks exactly like the volume it was created from (its parent flexvol), but it uses no additional physical storage! Using snapshots and clones together gives you a powerful tool.
nova> vol clone create flexclone0 -b flexvol0
Creation of clone volume 'flexclone0' has completed.
nova> df -h
Filesystem                 total    used   avail capacity  Mounted on
...
/vol/flexvol0/              16GB   104KB    15GB       0%  /vol/flexvol0/
/vol/flexclone0/            16GB  1552KB    15GB       0%  /vol/flexclone0/
nova> vol size flexvol0 -10g
vol size: Flexible volume 'flexvol0' size set to 10g.
nova> df -h
Filesystem                 total    used   avail capacity  Mounted on
...
/vol/flexvol0/            8192MB   104KB  8191MB       0%  /vol/flexvol0/
/vol/flexclone0/            16GB  1552KB    15GB       0%  /vol/flexclone0/
Snapshot management itself is really easy, and the operation takes milliseconds. Because robust snapshotting is used extensively within the Filers, paired with WAFL (Write Anywhere File Layout), operations like volume creation, snapshot creation, cloning, etc. require almost no work at all. No more sitting around waiting an hour for volumes to become usable. This is the real magic behind NetApp Filers.
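To give you a feel for it, here is a quick sketch of manual snapshot management on a 7-mode Filer; the snapshot name is made up and the snap list output is abbreviated:

nova> snap create flexvol0 pre_maintenance
nova> snap list flexvol0
Volume flexvol0
working...

  %/used       %/total  date          name
----------  ----------  ------------  --------
  0% ( 0%)    0% ( 0%)  May 14 23:40  pre_maintenance
nova> snap delete flexvol0 pre_maintenance

Both creating and deleting the snapshot return essentially instantly, regardless of how much data the volume holds.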
Here's an interesting example. Say your PostgreSQL system got attacked and the database was corrupted. The first priority is going to be getting the databases back online, but we'd also like to keep a copy of that corrupt data for further analysis when some of our DBAs get into the office in the morning. If we were using a NetApp Filer with OnTap 7g, we could create a clone of the affected volume based on a prior snapshot that's known to be good, make it available to the system (export the NFS share, CIFS share it, or remap the LUN for iSCSI or SAN), and be back online in the time it takes to type the commands. Then we could mount the corrupt volume on a test system for analysis. When we're all done with the analysis, we could split off the clone we created so that it becomes its own unique volume and keep using it as-is (not losing the changes between the time we recovered and now), or we could just delete the clone and revert the whole volume back to that known good snapshot. The only thing to remember about reverting to a previous snapshot is that you will destroy all snapshots made from that point until now, so cloning from a point-in-time snapshot and then being able to split off that clone gives us some flexibility in getting around that.
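Roughly, that recovery flow looks like the sketch below. The volume name pgdata, clone name pgdata_recover, and snapshot name nightly.0 are hypothetical, the lines starting with # are my annotations rather than Filer output, and reverting with snap restore assumes the SnapRestore license is installed:

# Clone the affected volume from the last known-good snapshot
nova> vol clone create pgdata_recover -b pgdata nightly.0
# Export/share/map pgdata_recover to the database host and bring PostgreSQL back up,
# then hand the corrupt parent volume over to a test system for analysis.

# Option 1: keep the recovered clone for good by splitting it from its parent
nova> vol clone split start pgdata_recover

# Option 2: discard the clone and revert the original volume to the snapshot
nova> vol offline pgdata_recover
nova> vol destroy pgdata_recover -f
nova> snap restore -t vol -s nightly.0 pgdata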
When dealing with iSCSI or FCP (Fibre Channel Protocol, i.e. SAN) we are making block allocations available, which we call LUNs. A LUN is actually just a big allocated file sitting on a volume's filesystem (or so it appears). So in order to create a LUN you must first create a volume, as we discussed above. Then, when creating the LUN, you specify a path within that volume.
nova> vol create qa_luns aggr0 100g
Creation of volume 'qa_luns' with size 100g on containing aggregate 'aggr0' has completed.
nova> df -h /vol/qa_luns
Filesystem               total   used  avail capacity  Mounted on
/vol/qa_luns/             80GB  100KB   79GB       0%  /vol/qa_luns/
/vol/qa_luns/.snapshot    20GB    0KB   20GB       0%  /vol/qa_luns/.snapshot
nova> lun create -s 45g /vol/qa_luns/qa_solaris1
nova> lun create -s 20g /vol/qa_luns/qa_linux1
nova> lun show
        /vol/qa_luns/qa_linux1    20g (21474836480)   (r/w, online)
        /vol/qa_luns/qa_solaris1  45g (48318382080)   (r/w, online)
nova> df -h /vol/qa_luns
Filesystem               total   used  avail capacity  Mounted on
/vol/qa_luns/             80GB   65GB   14GB      81%  /vol/qa_luns/
/vol/qa_luns/.snapshot    20GB   68KB   19GB       0%  /vol/qa_luns/.snapshot
You can see how easy it is to create LUNs. You'll notice that while I created a 100GB volume, 20GB of it is reserved for snapshots. We can limit that easily:
nova> snap reserve qa_luns 5
nova> df -h /vol/qa_luns
Filesystem               total   used  avail capacity  Mounted on
/vol/qa_luns/             95GB   65GB   29GB      69%  /vol/qa_luns/
/vol/qa_luns/.snapshot  5120MB   68KB 5119MB       0%  /vol/qa_luns/.snapshot
Snapshot sizing and the arguments against turning off snapshotting are beyond the scope of this blog entry.
Once you've created a LUN, you make it accessible by mapping it to an initiator group. Initiator groups contain one or more iSCSI IQNs or FCP WWNs which map to a LUN on the Filer. This is traditionally known as LUN masking, because you're making sure that only the initiators (clients) that should see certain LUNs actually can. This sounds a lot more difficult than it is:
nova> lun show
        /vol/qa_luns/qa_linux1    20g (21474836480)   (r/w, online)
        /vol/qa_luns/qa_solaris1  45g (48318382080)   (r/w, online)
nova> igroup create -f QA-Solaris 10:00:00:00:c9:2b:51:2c
nova> lun map /vol/qa_luns/qa_solaris1 QA-Solaris 0
nova> igroup show QA-Solaris
    QA-Solaris (FCP) (ostype: default):
        10:00:00:00:c9:2b:51:2c (not logged in)
nova> lun show /vol/qa_luns/qa_solaris1
        /vol/qa_luns/qa_solaris1  45g (48318382080)   (r/w, online, mapped)
Easy easy. iSCSI and FCP are handled in exactly the same way, except for a single flag. Notice the -f in the igroup create above; that means FCP. Change it to -i and you're doing iSCSI instead.
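For example, mapping the Linux LUN from above over iSCSI would look almost identical; the initiator IQN shown here is made up, and in practice you'd use the name reported by the host's iSCSI initiator:

nova> igroup create -i QA-Linux iqn.1994-05.com.redhat:qa-linux1
nova> lun map /vol/qa_luns/qa_linux1 QA-Linux 0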
This is only a taste of what you can do with a NetApp Filer. But I hope you can see that there is a huge amount of flexibility found in these boxes that is either unavailable from other vendors or so expensive as to be impractical (replication on EMC vs SnapMirror on NetApp is a good example). I view NetApp Filers as the Swiss Army Knives of the storage world: they do everything you need to do and a ton of stuff that might come in handy some day (like that little toothpick). If you're just doing straight-up SAN storage, I highly recommend the Sun StorEdge 3510 (search Google for "cuddletech 3510" for more details), but when you need the flexibility to roll with whatever comes your way, nothing beats NetApp: The Versatile Storage Platform.