Storage Basics Tutorial to Help You Prepare for a Job Interview
DEFINITIONS OF TERMS USED THROUGHOUT THIS BLOG
A node is a computer attached to a SAN.
A SAN is a high-speed subnetwork of shared storage devices.
Software that manages SAN file system functions, such as file locking, space allocation, and data access authorization, is called the metadata controller. (This is an Apple storage implementation term; other companies use other terms.)
The metadata controller uses callbacks to communicate with file system clients.
Xsan file system client software runs on all nodes in the SAN and communicates with the metadata controller in order to provide Xsan services. The term file system client refers to a node that is running the Xsan file system client software.
A redundant array of independent disks (RAID) device is a category of disk devices that combines two or more drives for increased fault tolerance and performance. There are several RAID levels:
*Level 0 provides data striping, where blocks of a file are spread across multiple disks, increasing performance; this level does not have any provisions that increase fault tolerance.
*Level 1 provides disk mirroring.
*Level 3 provides the data striping of Level 0 and also reserves a disk for storing error correction data, thereby increasing performance and fault tolerance.
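*Level 5 provides data striping at the block level and distributes the error correction (parity) information across all the drives instead of reserving one disk for it, so the array survives the failure of any single drive.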
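To make the striping and error correction ideas concrete, here is a minimal sketch in Python. It is an illustration only, not how any RAID controller is implemented: the function names are made up for this post, the "disks" are just byte strings, and the parity block is kept separately per stripe (Level 3 style) rather than rotated across drives as Level 5 does.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def stripe_with_parity(data, n_data_disks, block_size):
    """Split data into stripes of n_data_disks blocks plus one parity block each."""
    stripe_size = n_data_disks * block_size
    padded_len = -(-len(data) // stripe_size) * stripe_size  # round up to whole stripes
    data = data.ljust(padded_len, b"\0")
    stripes = []
    for off in range(0, len(data), stripe_size):
        blocks = [data[off + i * block_size: off + (i + 1) * block_size]
                  for i in range(n_data_disks)]
        stripes.append((blocks, xor_blocks(blocks)))
    return stripes

def rebuild(blocks, parity, lost_index):
    """Recover the block at lost_index from the surviving blocks and the parity."""
    survivors = [b for i, b in enumerate(blocks) if i != lost_index]
    return xor_blocks(survivors + [parity])

blocks, parity = stripe_with_parity(b"ABCDEFGHIJKL", n_data_disks=3, block_size=4)[0]
print(rebuild(blocks, parity, lost_index=1))  # b'EFGH'
```

The point of the XOR parity is that any one missing block equals the XOR of everything that is left, which is why a single failed drive can be rebuilt.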
A JBOD (just a bunch of disks) is a disk that is not configured for RAID.
A logical unit number (LUN) is an aggregation of physical devices. Applications access LUNs through the special files in the system’s /dev/disk directory. For RAID devices, a LUN is typically a RAID-5 with three or more physical drives making up the LUN. For JBOD devices, one JBOD is one LUN.
A storage pool is a grouping of LUNs that have the same characteristics. Another term for storage pool is stripe group. One or more storage pools form a mountable volume. The number of volumes hosted by a single Xsan metadata controller should not exceed eight.
Stripe depth is the number of disks that have been assigned to a storage pool.
The stripe breadth is the maximum amount of data that is read or written before switching to the next LUN in the storage pool. When the last LUN is reached, I/O operations go back to the first LUN. This is how large logical I/O operations are broken down into stripes across multiple LUNs. For example, if the stripe breadth for a storage pool is set at 4 MB, each I/O operation on that storage pool is physically no more than 4 MB. A 16 MB I/O operation would be broken down into 4 physical I/O operations.
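The 16 MB example can be sketched in a few lines of Python. This is a simplified model, not Xsan code: `split_io` is a hypothetical helper, and the way it computes the offset inside each LUN assumes a plain round-robin layout.

```python
MB = 1024 * 1024

def split_io(offset, length, stripe_breadth, n_luns):
    """Break one logical I/O into (lun_index, lun_offset, size) physical operations."""
    ops = []
    end = offset + length
    while offset < end:
        chunk = offset // stripe_breadth   # index of the stripe-breadth chunk in the pool
        within = offset % stripe_breadth   # position inside that chunk
        size = min(stripe_breadth - within, end - offset)
        lun = chunk % n_luns               # wrap back to the first LUN after the last one
        lun_offset = (chunk // n_luns) * stripe_breadth + within
        ops.append((lun, lun_offset, size))
        offset += size
    return ops

ops = split_io(0, 16 * MB, stripe_breadth=4 * MB, n_luns=4)
print(len(ops))                   # 4
print([lun for lun, _, _ in ops]) # [0, 1, 2, 3]
```

With only 2 LUNs in the pool the same request still becomes 4 physical operations, but they alternate between LUN 0 and LUN 1.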
A stripe line is the stripe breadth multiplied by the number of LUNs in the storage pool. For example, a 4 MB stripe breadth across 4 LUNs gives a 16 MB stripe line. To maximize performance, make I/O requests that are a stripe line in size.
For real-time I/O, well-formed I/O is I/O that is a stripe line in size. This size makes the best utilization of the disks in the storage pool and maximizes the transfer rate. For non-real-time I/O, well-formed I/O is I/O that is memory aligned (modulus 4 bytes), 512-byte sector aligned, and modulus sector sized.
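As a rough sketch, the two definitions of well-formed I/O can be written as simple checks. The function names are hypothetical, and the non-real-time check is one reading of the rule above: buffer address a multiple of 4 bytes, file offset on a 512-byte sector boundary, and size a whole number of sectors.

```python
SECTOR_SIZE = 512
MB = 1024 * 1024

def stripe_line(stripe_breadth, n_luns):
    """Stripe line = stripe breadth x number of LUNs in the storage pool."""
    return stripe_breadth * n_luns

def is_well_formed_realtime(size, stripe_breadth, n_luns):
    """Real-time I/O is well formed when it is exactly one stripe line in size."""
    return size == stripe_line(stripe_breadth, n_luns)

def is_well_formed_non_realtime(buffer_address, file_offset, size):
    """Assumed reading: 4-byte aligned buffer, sector-aligned offset, whole sectors."""
    return (buffer_address % 4 == 0
            and file_offset % SECTOR_SIZE == 0
            and size > 0
            and size % SECTOR_SIZE == 0)

print(stripe_line(4 * MB, 4) // MB)                   # 16
print(is_well_formed_realtime(16 * MB, 4 * MB, 4))    # True
print(is_well_formed_non_realtime(0x1000, 1024, 700)) # False
```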
A block is the smallest number of bytes that can be read or written.
Storage pools can be assigned one or more values, known as affinity identifiers, and a file can be assigned one affinity. When a request is made to allocate space in a file for which the affinity has been set, the space is allocated from the storage pool that has an affinity identifier matching the file’s affinity. For example, consider a SAN with some moderate performance JBOD LUNs and some high performance RAID-5 LUNs. By grouping the RAID-5 LUNs into the same storage pool and assigning them a specific affinity identifier, the developer can steer performance critical data to that storage pool. Files containing less critical data or files that do not have an affinity are assigned to the storage pool that consists of JBOD LUNs.
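The RAID-5 versus JBOD example could be modeled like this. It is only a sketch: the pool names, the "video" affinity, and the `choose_pool` helper are invented for illustration, and it assumes that a file without an affinity goes to the first pool that has no affinity identifiers.

```python
def choose_pool(pools, file_affinity=None):
    """Pick the storage pool whose affinity identifiers match the file's affinity."""
    if file_affinity is not None:
        for pool in pools:
            if file_affinity in pool["affinities"]:
                return pool["name"]
        return None  # assumption: no matching pool means no allocation here
    # Assumption: files with no affinity use a pool that has no affinity identifiers.
    for pool in pools:
        if not pool["affinities"]:
            return pool["name"]
    return None

pools = [
    {"name": "raid5_pool", "affinities": {"video"}},  # high performance RAID-5 LUNs
    {"name": "jbod_pool", "affinities": set()},       # moderate performance JBOD LUNs
]

print(choose_pool(pools, "video"))  # raid5_pool
print(choose_pool(pools))           # jbod_pool
```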
When a storage pool is in real-time I/O mode, file system clients that have processes that do non-real-time I/O must request a non-real-time I/O token. Xsan throttles the I/O of applications that are not in real-time mode so that it does not interfere with real-time I/O. This document uses the term gated to describe processes or file descriptors that are not in real-time I/O mode and the term ungated to describe processes or file descriptors that are in real-time I/O mode.
An extent is a chunk of file data whose allocation is contiguous on a storage pool. A file’s data may be stored in one or more extents. Information about an extent includes its file-relative starting byte offset, its file system starting byte offset, the file system ending byte offset, and the ordinal of the storage pool on which the extent resides. File system clients use extent mapping tables to load information about a file’s extents. Loading extent information improves performance by eliminating a subsequent trip by the file system client to the metadata controller in order to retrieve extent information for the range mapped by an I/O request.
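Here is a small sketch of how a client could use a loaded extent table to turn a file offset into a location on a storage pool without asking the metadata controller again. The `Extent` fields follow the list above; the lookup function itself, and the choice to treat the ending offset as exclusive, are assumptions made for the example.

```python
from bisect import bisect_right
from collections import namedtuple

# file_offset: file-relative starting byte; fs_start/fs_end: file system byte range
# (fs_end treated as exclusive here); pool: ordinal of the storage pool.
Extent = namedtuple("Extent", "file_offset fs_start fs_end pool")

def locate(extents, file_byte):
    """Map a byte offset in the file to (pool ordinal, file system byte offset).

    extents must be sorted by file_offset. Returns None if no extent covers the byte.
    """
    starts = [e.file_offset for e in extents]
    i = bisect_right(starts, file_byte) - 1  # last extent starting at or before the byte
    if i < 0:
        return None
    ext = extents[i]
    delta = file_byte - ext.file_offset
    if delta >= ext.fs_end - ext.fs_start:
        return None  # byte lies past the end of this extent
    return (ext.pool, ext.fs_start + delta)

extents = [Extent(0, 1000, 1100, 0), Extent(100, 5000, 5050, 1)]
print(locate(extents, 120))  # (1, 5020)
```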
When there are two or more storage pools that have the same characteristics, an allocation strategy is needed. The strategy can be to round-robin files through the set of storage pools, balance the remaining space in the storage pools, or fill the first storage pool before going to the next storage pool.
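The three strategies can be sketched as three small functions. The names and the pool dictionaries (with a "free" field for remaining space) are hypothetical and only meant to show the difference between the strategies.

```python
def pick_round_robin(pools, files_created_so_far):
    """Round-robin: rotate through the pools, one new file per pool in turn."""
    return pools[files_created_so_far % len(pools)]["name"]

def pick_balance(pools):
    """Balance: choose the pool with the most remaining space."""
    return max(pools, key=lambda pool: pool["free"])["name"]

def pick_fill(pools, size):
    """Fill: use the first pool until the request no longer fits, then move on."""
    for pool in pools:
        if pool["free"] >= size:
            return pool["name"]
    return None

pools = [{"name": "pool_a", "free": 10}, {"name": "pool_b", "free": 50}]

print([pick_round_robin(pools, n) for n in range(3)])  # ['pool_a', 'pool_b', 'pool_a']
print(pick_balance(pools))                             # pool_b
print(pick_fill(pools, 5))                             # pool_a
```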
A disk file system, such as a UFS or HFS+ file system, resides on the internal drives of a computer or on storage devices that are attached directly to the computer. A network file system allows data on internal drives or on directly attached drives to be shared with other computers on the network. Examples of network file systems include Apple Filing Protocol (AFP), Server Message Block (SMB), Common Internet File System (CIFS), and Network File System (NFS). A distributed file system is a blend of disk file system and network file system used to simplify data sharing through the creation of a single shared name space across a collection of servers. A cluster file system gives multiple computers simultaneous, very high-speed access to all shared data residing on an external, centralized storage pool. The storage pool typically consists of highly available RAID systems. Xsan is a cluster file system.
SAN Architecture