Monday, April 27, 2009

Success Points - Behind the Storage

Multi-path fetching

In any storage subsystem built on a SAN architecture, redundancy on the data path between a node and the storage devices can be achieved through "independent multipath access". The advantage of multipathing is that every path has its own queue for the requests and acknowledgements (seek and respond of data) that the node generates when accessing data, so the storage can be reached without backplane bottlenecks. Each storage device keeps an information table in which all requests are stored, and the data is accessed through the storage device channel allocated to the particular queue. Queued requests are served over the available paths without blocking the FC data path.
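
The per-path queue idea can be illustrated with a small model. This is only a sketch under stated assumptions: the class name `MultipathDevice`, the path names and the "shortest queue first" policy are hypothetical and do not describe any particular vendor's multipath driver.

```python
from collections import deque


class MultipathDevice:
    """Toy model: every path to the storage device has its own request queue."""

    def __init__(self, path_names):
        self.queues = {name: deque() for name in path_names}
        self.healthy = {name: True for name in path_names}

    def submit(self, request):
        """Queue the request on the healthy path with the shortest queue."""
        candidates = [p for p in self.queues if self.healthy[p]]
        if not candidates:
            raise RuntimeError("no healthy path to the storage device")
        path = min(candidates, key=lambda p: len(self.queues[p]))
        self.queues[path].append(request)
        return path

    def fail_path(self, name):
        """Mark a path as failed and requeue its pending requests elsewhere."""
        self.healthy[name] = False
        pending = list(self.queues[name])
        self.queues[name].clear()
        return [self.submit(req) for req in pending]
```

With two paths, requests alternate between the queues, and when one path fails its pending requests move to the surviving path instead of being lost.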

IO operations

RAID (Redundant Array of Independent Disks; originally Redundant Array of Inexpensive Disks) is a way of spreading data across multiple hard disks with redundancy, either by mirroring or by parity structures. This can be achieved using either Vraid (Virtual RAID) or 4D+1P and 8D+1P RAID structures (four or eight data disks plus one parity disk).
When data is stored on multiple disks, the IO operations are distributed across them in a structured, balanced way.
Why use multiple disks?
• Multiple disks with redundancy increase the mean time to data loss, even though adding drives lowers the combined “Mean Time Between Failures” (MTBF) of the array
• Storing the data redundantly increases fault tolerance
• Better bandwidth can be achieved when multiple spindles serve a single file system.
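
To make the parity idea concrete, here is a minimal sketch of a 4D+1P stripe, assuming the common single-parity scheme in which the parity block is the byte-wise XOR of the data blocks. The function names are illustrative only, and real arrays (including Vraid) may lay out and compute parity differently.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block (e.g. 4D+1P)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, lost_index):
    """Recompute the block at lost_index from the surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
stripe = make_stripe(data)            # four data blocks plus one parity block
recovered = rebuild(stripe, 2)        # pretend the third disk has failed
```

Because XOR is its own inverse, any single missing block, data or parity, can be recomputed from the other four. Two simultaneous failures cannot be recovered with a single parity block.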

Dynamic load balancing & Load Sharing

Consider a storage system connected to multiple nodes and accessed concurrently by heterogeneous clients. When any node fails, the load of the failed server fails over to the second node (or to the node with the least load) without disturbing the workflow; this is commonly known as dynamic load balancing. The resource table describing the load is distributed to all the switching modules.
The file systems have to be mapped according to the number of nodes and switching modules in use. If the first node, to which the first file system is primarily mapped, becomes overloaded, the second node takes over load from it (according to the threshold limit set on the nodes). This prevents a bottleneck at the node level, which matters because the nodes play a vital role in serving data to end users over the CIFS or NFS protocol.
The core aim of dynamic load balancing is to remove the need for manual intervention when a node fails or becomes overloaded: balancing or failover happens dynamically without affecting the data flow.
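
The decision logic described above might look roughly like the following sketch. The data structures (a file system to node mapping, a load table, a set of live nodes) and the function names are assumptions made for illustration; an actual cluster would add health checks, hysteresis and session handling.

```python
def pick_target(loads, exclude):
    """Return the least-loaded node other than `exclude`."""
    candidates = {n: l for n, l in loads.items() if n != exclude}
    if not candidates:
        raise RuntimeError("no node available to take over")
    return min(candidates, key=candidates.get)


def rebalance(mapping, loads, alive, threshold):
    """Remap file systems away from failed or overloaded nodes.

    mapping:   file system -> node currently serving it
    loads:     node -> current load (0.0 to 1.0)
    alive:     set of nodes that are still responding
    threshold: load above which a node counts as overloaded
    """
    live_loads = {n: l for n, l in loads.items() if n in alive}
    new_mapping = dict(mapping)
    for fs, node in mapping.items():
        if node not in alive or loads[node] > threshold:
            new_mapping[fs] = pick_target(live_loads, exclude=node)
    return new_mapping
```

For example, with `node1` at 95% load and a threshold of 0.8, a file system mapped to `node1` moves to whichever live node currently has the lowest load, and the same path is taken when `node1` drops out of the `alive` set.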


Self Healing Mechanism

Load balancing distributes the load across the nodes with failover and failsafe behaviour. Failsafe depends on a mechanism called “self healing”, in which failures or errors on a node (errors in storage node mapping configuration, cluster configuration, network mapping, or transfer protocols) are repaired automatically without any manual intervention.
This reduces the management effort in multi-node scenarios and helps keep the nodes active at all times. Self healing does not work for hardware faults.
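
As a rough sketch of the idea, the loop below retries a repair routine for each software fault and leaves everything else, such as hardware faults, for an operator. The fault names, the `repairs` table and the retry count are hypothetical; they only mirror the error categories listed above.

```python
# Fault categories assumed to be repairable in software (hypothetical names).
HEALABLE = {"node_mapping", "cluster_config", "network_mapping", "protocol"}


def self_heal(node, repairs, max_attempts=3):
    """Try to clear software faults on a node.

    node:    dict with a "faults" list
    repairs: fault name -> callable(node) returning True on success
    Returns the faults that still need manual attention.
    """
    unresolved = []
    for fault in list(node["faults"]):
        repair = repairs.get(fault)
        if fault not in HEALABLE or repair is None:
            unresolved.append(fault)      # e.g. a hardware fault
            continue
        for _ in range(max_attempts):
            if repair(node):
                node["faults"].remove(fault)
                break
        else:
            unresolved.append(fault)      # repair kept failing
    return unresolved
```

A node reporting both a cluster configuration error and a hardware fault would have the first cleared automatically, while the second is returned for manual handling.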

Caching

Caching plays a major role in any storage subsystem, low-end or high-end. Caching can be done in the memory of the node that interacts directly with the storage device and in the controller cache. The types of cache involved include read cache, write cache, data cache and controller cache. Once the bandwidth and throughput of the storage system have been decided, the cache is the next important point to consider. A cache speeds up reads when the same data is read by multiple clients at regular intervals. In the media industry, throughput calculations consider reads first and writes second, since the read-to-write ratio is typically around 70:30. So to get better results the storage subsystem should have more cache space, to avoid bottlenecks at read time when multiple workstations and render nodes access the central storage concurrently.
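
The benefit for repeated reads can be shown with a small read cache. This sketch assumes a least-recently-used (LRU) eviction policy, which is a common choice but not necessarily what a given controller uses; `ReadCache` and the `fetch` callback are illustrative names.

```python
from collections import OrderedDict


class ReadCache:
    """Least-recently-used read cache in front of a slower backing store."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity        # maximum number of cached blocks
        self.fetch = fetch              # callable: block id -> data
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.entries:
            self.entries.move_to_end(block_id)   # mark as recently used
            self.hits += 1
            return self.entries[block_id]
        self.misses += 1
        data = self.fetch(block_id)              # slow path: go to the disks
        self.entries[block_id] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)     # evict least recently used
        return data
```

When many render nodes read the same frames, only the first read of each block reaches the disks. A cache that is too small for the working set evicts blocks before they are reused, which is why a read-heavy workload benefits from more cache space.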

Block Size

A block is a sequence of bits or bytes that is read or written as a single unit. The length of that unit is known as the block size.
The process of arranging data into blocks is known as blocking.
Most file systems are built on block devices, where the hardware is responsible for storing and reading data in the specified block size. In traditional file systems a single block may contain only part of a single file, which leads to space inefficiency when the block size does not suit the size of the files stored on it. The last block of each file is usually only partly filled (half empty on average), and this wasted space adds up across the file system. So while designing storage it’s necessary to define the block and stripe sizes of the file systems with respect to the files used by the industry, so that the maximum space on the RAID spindles can be utilized. The logical volumes accessed through block IO normally depend on how the server is connected to the storage subsystem, through either SCSI or Fibre Channel, and on the backplane connections of the spindles in the subsystem (FC to SAS, FC to SATA, FC to FC, etc.).
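
The wasted space in the last block (often called slack) is easy to estimate. The sketch below assumes each file occupies a whole number of blocks and ignores metadata; the function names and file sizes are made up for illustration.

```python
def blocks_needed(file_size, block_size):
    """Number of whole blocks required to hold file_size bytes."""
    return (file_size + block_size - 1) // block_size


def slack_bytes(file_size, block_size):
    """Unused bytes in the last block of a file."""
    return blocks_needed(file_size, block_size) * block_size - file_size


def total_slack(file_sizes, block_size):
    """Total unused bytes across a set of files."""
    return sum(slack_bytes(size, block_size) for size in file_sizes)


small_blocks = slack_bytes(10_000, 4096)     # 3 blocks allocated
large_blocks = slack_bytes(10_000, 65536)    # 1 block allocated
```

A 10,000-byte file wastes 2,288 bytes with 4 KiB blocks but 55,536 bytes with 64 KiB blocks. Large blocks suit large media files, while many small files favour a smaller block size.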

Disk Arrays

As the name implies, a disk array is a system with multiple disk drives (the slots on the array are known as disk bays), for example a JBOD (Just a Bunch Of Disks). RAID can be configured on it for data redundancy across the disks. The RAID is striped across the disks to utilize the entire bandwidth of the spindles.
A basic disk array has components such as enclosures with redundant power supplies, cache and controllers. The more spindles we use, the more performance we can achieve, since the load is shared across the array and the combined bandwidth of the individual disks can be used for serving data. The subsystem can also use SBOD (Switched Bunch Of Disks), depending on the solution required.
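
Striping can be pictured as a round-robin mapping of stripe units onto disks. The sketch below assumes plain striping without parity rotation, and the bandwidth figure is an idealized upper bound that ignores controller, cache and bus limits; all names are illustrative.

```python
def locate(stripe_unit, num_disks):
    """Map a logical stripe unit to (disk index, unit offset on that disk)."""
    return stripe_unit % num_disks, stripe_unit // num_disks


def ideal_bandwidth(per_disk_mb_s, num_disks):
    """Upper bound on sequential bandwidth if every spindle works in parallel."""
    return per_disk_mb_s * num_disks


# Eight consecutive stripe units on a four-disk array.
layout = [locate(unit, 4) for unit in range(8)]
```

Consecutive stripe units land on different disks, so a large sequential read keeps all spindles busy at once instead of queuing on one drive.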

Pipeline

The storage can be chosen according to the customer's requirements and the latest technology available. But to make sure the storage subsystem performs well and is used efficiently, the access flow, or pipeline, of the projects has to be set up accurately. If the project pipeline is fixed with respect to data availability, capacities and concurrent access, the designed storage can deliver high performance.

Networking

The first and final stage of the industry’s requirement is the “network”. Unless the network is designed properly, we can’t expect even costly storage to perform well. The network plays a major role in the media industry for data access from the central storage at any given point of time. It’s better to have a Gigabit network architecture to get higher bandwidth, which in turn helps concurrent clients access the data from the centralized location.
There are multiple ways of designing a Gigabit network architecture. The key points are:
• The network architecture may be blocking or non-blocking
• Teaming / trunking / bonding of links
• LACP, if required, for an active-active dual Gigabit pipeline; mainly used on Apple Mac G5 systems
• Cascading / stacking of switches to get better backplane bandwidth from the switches, with redundancy
• Core switches can be used for costlier solutions; they can be managed from a single window for multiple VLANs / subnets internally, and their internal modules can be sliced on the basis of requirement.
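
The difference between a blocking and a non-blocking design can be estimated with simple arithmetic. In the sketch below the port counts and link speeds are made-up example values, and the per-client figure is a best-case share that ignores protocol overhead.

```python
def oversubscription(access_ports, port_gbps, uplink_gbps):
    """Ratio of total access bandwidth to uplink bandwidth (1.0 = non-blocking)."""
    return (access_ports * port_gbps) / uplink_gbps


def per_client_gbps(clients, port_gbps, uplink_gbps):
    """Best-case bandwidth per client when all clients transfer at once."""
    return min(port_gbps, uplink_gbps / clients)


# 48 Gigabit access ports behind a 4 x 1 Gbit bonded uplink (example values).
ratio = oversubscription(48, 1, 4)
share = per_client_gbps(40, 1, 4)
```

Here the uplink is oversubscribed 12:1, so 40 clients reading at the same time get about 0.1 Gbit/s each. Bonding more links or using a non-blocking core brings the ratio closer to 1:1.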