Sunday, October 5, 2014

NetApp Layout

Structure = controller (filer) → shelf → disks → RAID group → aggregate → OS → file system → volumes → qtrees → LUN
 
Storage: the NetApp filer, also known as NetApp Fabric-Attached Storage (FAS), is both an enterprise-class storage area network (SAN) device and a network-attached storage (NAS) appliance. It can serve storage over a network using file-based protocols such as NFS, CIFS, FTP, TFTP, and HTTP, and it can serve data over block-based protocols such as FC, FCoE, and iSCSI.

The most common NetApp configuration consists of a filer (also known as a controller or head node) and disk enclosures (also known as shelves).
 
Filers are connected with two cluster interconnect cables (InfiniBand). This interconnect is used for HA heartbeat.
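A quick way to check the HA relationship is the cf command set; this is a minimal sketch assuming the Data ONTAP 7-Mode CLI:

    cf status      # show whether controller failover is enabled and the partner is up
    cf partner     # show the name of the partner controller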

Each controller is assigned its own disks, which can come from different disk shelves. Each controller should have a hot spare disk of each disk type it uses. Both controllers (filers) keep each other's configuration in sync so that one can take over if the other filer (controller) fails.
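With software disk ownership, ownership and spares can be checked and assigned as below; a minimal 7-Mode sketch, where the disk name 0a.16 is illustrative:

    disk show -n         # list disks not yet owned by either controller
    disk assign 0a.16    # assign a disk to this controller
    vol status -s        # list this controller's spare disks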

The filers run NetApp's own adapted operating system, Data ONTAP, which is highly tuned for storage-serving purposes. Data ONTAP implements a single proprietary file system called WAFL (Write Anywhere File Layout):
 
WAFL provides mechanisms that enable a variety of file systems and technologies to access disk blocks.

As the name suggests, Write Anywhere File Layout does not store data or metadata in pre-determined locations on disk; instead, it lays data out in a way designed to minimize the number of disk operations required to commit data to stable disk storage using single- and dual-parity RAID.

Disks: SATA, FC, SAS, and SSD disks (7.2K rpm up to 15K rpm; from 100 GB SSDs up to 2 TB SATA drives)

There are only four types of disks in Data ONTAP:
·    Data: holds data stored within the RAID group.
·    Spare: does not hold data, but is available to be added to a RAID group in an aggregate.
·    Parity: stores data reconstruction information within the RAID group.
·    dParity: stores double-parity information within the RAID group, if RAID-DP is enabled.

A disk can only be in one aggregate.  So each aggregate has its own drives.  This lets us tune the performance of the aggregate by adding many spindles.
 
RAID: RAID in NetApp terminology is called a RAID group. NetApp works mostly with RAID 4 and RAID-DP, where RAID 4 has one dedicated parity disk and RAID-DP has two. Do not assume that this leads to performance degradation; NetApp has a very efficient implementation of these RAID levels.

We do not build RAID groups ourselves; they are built behind the scenes when you build an aggregate. RAID groups can be adjusted in size: for FC/SAS they can be anywhere from 3 to 28 disks. A large RAID group gives better space utilization (fewer parity disks relative to data disks) and better performance, but more risk (slower rebuilds).
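Because RAID groups are created implicitly, the RAID type and RAID group size are chosen when the aggregate is built. A minimal 7-Mode sketch, with aggregate name and disk counts illustrative:

    # 32 disks, RAID-DP, split into RAID groups of 16 disks each
    aggr create aggr1 -t raid_dp -r 16 32
    aggr status -r aggr1     # verify the resulting RAID group layout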

Aggregate: an aggregate is raw space. It is made up of one or more RAID groups of disks combined into a pool of disk space that can be used to create multiple volumes.

You take a bunch of individual disks and aggregate them together into aggregates, then layer on partitions, which in NetApp land are called volumes. The volumes hold the data. Aggregates can span multiple RAID groups on the same filer, and the more disks (spindles) an aggregate has, the better its performance.

A RAID group cannot span controllers, so aggregates cannot span controllers either.
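Growing an aggregate later simply adds disks to its RAID groups (or starts a new group). A minimal 7-Mode sketch, with names and counts illustrative:

    aggr add aggr1 8     # add 8 more spindles to aggr1
    df -A aggr1          # show the aggregate's total, used, and available space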
 
Volume: volumes hold the data and provide the file system. When you share out a volume it looks like NTFS to a Windows box, or like a UNIX file system to a UNIX box, but in the end it is just WAFL inside the volume. For block access you first make a volume, then you put a LUN in the volume; the LUN looks like one big file in the volume.
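A minimal 7-Mode sketch of creating a flexible volume inside an aggregate, with names and size illustrative:

    vol create vol1 aggr1 200g    # 200 GB FlexVol carved out of aggr1
    vol status vol1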

Qtree: a qtree is similar to a subdirectory. Why use them? To sort data. There are five things you can do with a qtree: oplocks, security style, quotas, SnapVault, and qtree SnapMirror.
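A minimal 7-Mode sketch of some of these qtree-level settings, with paths and styles illustrative:

    qtree create /vol/vol1/projects
    qtree security /vol/vol1/projects ntfs     # per-qtree security style
    qtree oplocks /vol/vol1/projects enable    # per-qtree oplocks
    qtree status                               # list qtrees and their settings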


LUN: a LUN (logical unit number) is a logical representation of block storage. It looks like a hard disk to the client and like a file inside a volume; LUNs look like local disks to the OS. A LUN is necessary to access data via block-level protocols such as FCP and iSCSI.

In the end it is normally the application that determines whether you get your file system access through a LUN or a volume. Some apps will not work across a network; Microsoft SQL Server and Exchange are two examples. Volumes are accessed via NAS protocols (CIFS/NFS); LUNs are accessed via SAN protocols (iSCSI/FCP/FCoE).
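A minimal 7-Mode sketch of putting a LUN inside a volume, with name, size, and OS type illustrative:

    lun create -s 100g -t windows /vol/vol1/lun0
    lun show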
 
Some more interesting (important) entries:
·    Data ONTAP writes all data to a storage system in 4-KB blocks.
·    Vol0: the root volume. It was created when the storage system was initially set up at the factory and contains special directories and configuration files.
·    Igroup: An initiator group specifies which initiators can have access to a LUN. When you map a LUN on a storage system to the initiator group, you grant all the initiators in that group access to that LUN. If a host is not a member of an igroup that is mapped to a LUN, that host does not have access to the LUN.

In the case of iSCSI clients, hosts are identified in an initiator group by their node names. In the case of FCP clients, hosts are identified in an initiator group by their World Wide Port Names (WWPNs).
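A minimal 7-Mode sketch of creating an iSCSI igroup and mapping a LUN to it; the igroup name and initiator node name are illustrative:

    igroup create -i -t windows win_hosts iqn.1991-05.com.microsoft:host1
    lun map /vol/vol1/lun0 win_hosts 0    # present the LUN to that igroup as LUN ID 0
    lun show -m                           # verify the mapping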


·    Dedupe: Data ONTAP has an additional feature called deduplication, which improves physical storage-space usage by eliminating duplicate data blocks within a FlexVol volume. Deduplication can be useful if many similar systems are deployed on the same volume. If dedupe is enabled, a checksum of every written block is created and compared to existing checksums. If the newly created checksum already exists in the checksum catalogue, the write operation is not performed and the block simply references the existing data. Deduplication works at the block level on the active file system. You can configure deduplication operations to run automatically or on a schedule.
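A minimal 7-Mode sketch of enabling dedupe on a volume and checking the savings, with the volume name illustrative:

    sis on /vol/vol1            # enable deduplication on the volume
    sis start -s /vol/vol1      # scan and deduplicate existing data
    df -s /vol/vol1             # show space saved by deduplication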

 ·    NVRAM: non-volatile battery-backed memory (NVRAM) is used for write caching. Before going to the hard drives, all writes are cached in NVRAM. NVRAM is split in half, and each time one half fills up, new writes are cached to the other half while the full half is written to disk. Once data has been written to disk as part of a Consistency Point (CP), the write blocks that were cached in main memory become the first candidates to be evicted and replaced by other data.

 ·    Plex: a plex is a collection of RAID groups and is used for RAID-level mirroring. For instance, if you have two disk shelves and a SyncMirror license, you can create plex0 from the first shelf's drives and plex1 from the second shelf's. This protects you from a disk shelf failure.
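A minimal 7-Mode sketch, assuming a SyncMirror license and enough matching spares; the aggregate name is illustrative:

    aggr mirror aggr1       # add a second plex (plex1) mirroring the existing plex0
    aggr status -r aggr1    # show both plexes and their RAID groups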
