The IBM General Parallel File System (GPFS) provides unmatched
performance and reliability with scalable access to critical file data. GPFS
distinguishes itself from other cluster file systems by providing concurrent
high-speed file access to applications executing on multiple nodes of an AIX
cluster, a Linux cluster, or a heterogeneous cluster of AIX and Linux nodes. In
addition to providing file storage capabilities, GPFS provides storage
management, information life cycle management tools and centralized
administration, and allows shared access to file systems from remote GPFS clusters.
GPFS provides scalable, high-performance data access from a two-node
cluster serving as a high availability platform for a database application, for
example, to clusters of 2,000 nodes or more used for applications such as
weather modeling. As a general statement, up to 512 Linux nodes or 128 AIX
nodes with access to one or more file systems are supported; larger
configurations exist by special arrangement with IBM, and the largest existing
configurations exceed 2,000 nodes. GPFS has been available on AIX since
1998 and on Linux since 2001, and it has proven time and again on some of
the world's most powerful supercomputers to provide efficient use of disk
bandwidth.
GPFS was designed from the beginning to support high performance
computing (HPC) and has been proven very effective for a variety of
applications. It is installed in clusters supporting relational databases, digital
media and scalable file serving. These deployments span many industries,
including financial services, retail and government. Having been proven in very
demanding, large environments makes GPFS a solid solution for applications
of any size.
GPFS supports various system types including the IBM System p™ family
and machines based on Intel® or AMD processors such as an IBM System
x™ environment. Supported operating systems for GPFS Version 3.2 include
AIX V5.3 and selected versions of Red Hat and SUSE Linux distributions.
This paper introduces a number of GPFS features and describes core
concepts. This includes the file system, high availability features, information
lifecycle management (ILM) tools and various cluster architectures.
A GPFS file system is built from a collection of disks which contain the file
system data and metadata. A file system can be built from a single disk or
contain thousands of disks, storing petabytes of data. A GPFS cluster can
contain up to 256 mounted file systems, and there is no limit placed upon the
number of simultaneously opened files within a single file system. As an
example, current GPFS customers are using single file systems up to 2 PB in
size and others containing tens of millions of files.
Application interfaces
Applications can access files through standard UNIX® file system
interfaces or through enhanced interfaces available for parallel programs.
Parallel and distributed applications can be scheduled on GPFS clusters to
take advantage of the shared access architecture. This makes GPFS a key
component in many grid-based solutions. Parallel applications can
concurrently read or update a common file from multiple nodes in the cluster.
GPFS maintains the coherency and consistency of the file system using
sophisticated byte-level locking, token (lock) management and logging.
In addition to the standard interfaces, GPFS provides a unique set of extended
interfaces which can be used to deliver high performance for applications with
demanding data access patterns. These extended interfaces are more
efficient for traversing a file system, for example, and provide more features
than the standard POSIX interfaces.
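As a small illustration of how such an extended interface is used, the sketch
below passes an access-range hint to GPFS through gpfs_fcntl() before a large
sequential read so that the file system can prefetch aggressively. This is a
minimal sketch only: the file path is hypothetical, and the structure and field
names are recalled from the gpfs_fcntl.h programming interface and should be
verified against the installed header.

/* Minimal sketch: hint to GPFS that a large byte range is about to be read
 * sequentially, using the gpfs_fcntl() extended interface. Structure and
 * field names should be verified against the installed gpfs_fcntl.h. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <gpfs_fcntl.h>                 /* gpfs_fcntl(), hint structures */

int main(void)
{
    int fd = open("/gpfs/fs1/bigfile.dat", O_RDONLY);   /* hypothetical path */
    if (fd < 0) { perror("open"); return 1; }

    struct {
        gpfsFcntlHeader_t hdr;
        gpfsAccessRange_t range;
    } hint;

    hint.hdr.totalLength   = sizeof(hint);
    hint.hdr.fcntlVersion  = GPFS_FCNTL_CURRENT_VERSION;
    hint.hdr.fcntlReserved = 0;

    hint.range.structLen  = sizeof(hint.range);
    hint.range.structType = GPFS_ACCESS_RANGE;
    hint.range.start      = 0;                     /* start of the range   */
    hint.range.length     = 1024LL * 1024 * 1024;  /* about to read 1 GiB  */
    hint.range.isWrite    = 0;                     /* read access          */

    if (gpfs_fcntl(fd, &hint) != 0)
        perror("gpfs_fcntl");                      /* the hint is advisory */

    /* Normal POSIX reads follow; GPFS can now prefetch the declared range. */
    close(fd);
    return 0;
}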
Performance and scalability
GPFS provides unparalleled performance especially for larger data objects
and excellent performance for large aggregates of smaller objects. GPFS
achieves high performance I/O by:
• Striping data across multiple disks attached to multiple nodes.
• Efficient client side caching.
• Supporting a large block size, configurable by the administrator, to fit
I/O requirements.
• Utilizing advanced algorithms that improve read-ahead and write-behind
file functions.
• Using block level locking based on a very sophisticated scalable token
management system to provide data consistency while allowing
multiple application nodes concurrent access to the files.
GPFS recognizes typical access patterns such as sequential, reverse
sequential and random, and optimizes I/O access for these patterns.
GPFS token (lock) management coordinates access to shared disks
ensuring the consistency of file system data and metadata when different
nodes access the same file. GPFS allows multiple nodes to act as token
managers for a single file system, which provides greater scalability for
workloads with high transaction rates.
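The idea behind these range-based tokens can be shown with a small, purely
conceptual sketch that is not GPFS code: a node may cache and modify a range
of a file only while it holds a token for that range, and a requested token
conflicts with a held token only when the two ranges overlap and at least one
of them is a write token. Nodes writing disjoint regions of the same file can
therefore proceed in parallel.

/* Conceptual illustration only -- not GPFS internals. A token grants a node
 * cached access to a byte range of a file in a given mode; two tokens
 * conflict if their ranges overlap and at least one of them is for writing. */
#include <stdbool.h>
#include <stdio.h>

typedef enum { TOKEN_READ, TOKEN_WRITE } token_mode_t;

typedef struct {
    int          node;    /* node holding or requesting the token */
    long long    start;   /* first byte of the range              */
    long long    length;  /* number of bytes in the range         */
    token_mode_t mode;
} range_token_t;

static bool ranges_overlap(const range_token_t *a, const range_token_t *b)
{
    return a->start < b->start + b->length &&
           b->start < a->start + a->length;
}

static bool tokens_conflict(const range_token_t *held, const range_token_t *want)
{
    if (!ranges_overlap(held, want))
        return false;                 /* disjoint ranges never conflict */
    return held->mode == TOKEN_WRITE || want->mode == TOKEN_WRITE;
}

int main(void)
{
    /* Node 1 writes the first 1 MiB, node 2 asks to write the second 1 MiB. */
    range_token_t held = { 1, 0,       1 << 20, TOKEN_WRITE };
    range_token_t want = { 2, 1 << 20, 1 << 20, TOKEN_WRITE };

    printf("conflict: %s\n", tokens_conflict(&held, &want) ? "yes" : "no");
    return 0;
}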
Along with distributed token management, GPFS provides scalable
metadata management by allowing all nodes of the cluster accessing the file
system to perform file metadata operations. This key and unique feature
distinguishes GPFS from other cluster file systems which typically have a
centralized metadata server handling fixed regions of the file namespace. A
centralized metadata server can often become a performance bottleneck for
metadata intensive operations and can represent a single point of failure.
GPFS solves this problem by managing metadata at the node which is using
the file or, in the case of parallel access to the file, at a dynamically selected
node which is using the file.
Administration
GPFS provides an administration model that is consistent with standard
AIX and Linux file system administration while providing extensions for the
clustering aspects of GPFS. These functions support cluster management
and other standard file system administration functions such as quotas,
snapshots, and extended access control lists.
GPFS provides functions that simplify cluster-wide tasks. A single GPFS
command can perform a file system function across the entire cluster and
most can be issued from any node in the cluster. These commands are
typically extensions to the usual AIX and Linux file system commands.
Rolling upgrades allow you to upgrade individual nodes in the cluster while
the file system remains online. With GPFS Version 3.1 you could mix GPFS
3.1 nodes with different patch levels. Continuing that trend, GPFS Version 3.2
allows you to run a cluster with a mix of GPFS Version 3.1 and GPFS Version
3.2 nodes.
Quotas enable the administrator to control and monitor file system usage
by users and groups across the cluster. GPFS provides commands to
generate quota reports including user, group and fileset inode and data block
usage. In addition to traditional quota management, GPFS has an API that
provides high performance metadata access enabling custom reporting
options on very large numbers of files.
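One example of such an interface is the inode scan in gpfs.h, which lets a
reporting tool walk file metadata directly rather than issuing a stat() call per
path. The outline below is a hedged sketch: the mount point is hypothetical,
and the function names and the ia_size attribute field follow the documented
GPFS programming interface as recalled here, so they should be checked
against the installed gpfs.h.

/* Rough sketch of a metadata report using the GPFS inode scan interface.
 * Function and field names are as documented for gpfs.h but should be
 * verified against the installed header; error handling is minimal. */
#include <stdio.h>
#include <gpfs.h>          /* gpfs_open_inodescan(), gpfs_next_inode(), ... */

int main(void)
{
    const char *fsPath = "/gpfs/fs1";            /* hypothetical mount point */
    unsigned long long totalBytes = 0, files = 0;

    gpfs_fssnap_handle_t *fsHandle = gpfs_get_fssnaphandle_by_path(fsPath);
    if (fsHandle == NULL) { perror("gpfs_get_fssnaphandle_by_path"); return 1; }

    /* Scan the live file system; no earlier snapshot is used as a base. */
    gpfs_ino_t maxInode = 0;
    gpfs_iscan_t *scan = gpfs_open_inodescan(fsHandle, NULL, &maxInode);
    if (scan == NULL) { perror("gpfs_open_inodescan"); return 1; }

    const gpfs_iattr_t *attr;
    while (gpfs_next_inode(scan, maxInode, &attr) == 0 && attr != NULL) {
        files++;
        totalBytes += attr->ia_size;             /* assumed attribute field  */
    }

    printf("%llu inodes scanned, %llu bytes in total\n", files, totalBytes);

    gpfs_close_inodescan(scan);
    gpfs_free_fssnaphandle(fsHandle);
    return 0;
}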
A snapshot of an entire GPFS file system may be created to preserve the
file system's contents at a single point in time. This is a very efficient
mechanism because a snapshot contains a map of the file system at the time
it was taken and a copy of only the file system data that has been changed
since the snapshot was created. This is done using a copy-on-write technique.
The snapshot function allows a backup program, for example, to run
concurrently with user updates and still obtain a consistent copy of the file
system as of the time that the snapshot was created. Snapshots provide an
online backup capability that allows files to be recovered easily from common
problems such as accidental deletion of a file.
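The copy-on-write idea can be illustrated with a small conceptual sketch, which
is not GPFS internals: the first time a block is overwritten after a snapshot has
been taken, its old contents are copied aside for the snapshot, so the snapshot
consumes space only for blocks that have actually changed.

/* Conceptual copy-on-write illustration -- not GPFS internals. The snapshot
 * stores a copy of a block only the first time that block is overwritten
 * after the snapshot was taken; unchanged blocks are shared with the live
 * file system. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define NBLOCKS    4
#define BLOCKSIZE 16

static char live[NBLOCKS][BLOCKSIZE];    /* current file system data          */
static char snap[NBLOCKS][BLOCKSIZE];    /* blocks preserved for the snapshot */
static bool saved[NBLOCKS];              /* which blocks the snapshot owns    */

/* Write a block, preserving its old contents for the snapshot if needed. */
static void write_block(int blk, const char data[BLOCKSIZE])
{
    if (!saved[blk]) {                   /* first change since the snapshot */
        memcpy(snap[blk], live[blk], BLOCKSIZE);
        saved[blk] = true;
    }
    memcpy(live[blk], data, BLOCKSIZE);
}

/* Read a block as it looked when the snapshot was taken. */
static const char *read_snapshot_block(int blk)
{
    return saved[blk] ? snap[blk] : live[blk];
}

int main(void)
{
    char before[BLOCKSIZE] = "version 1";
    char after[BLOCKSIZE]  = "version 2";

    memcpy(live[0], before, BLOCKSIZE);  /* data written before the snapshot */
    /* ... snapshot taken here: saved[] is all false, nothing copied yet ... */

    write_block(0, after);               /* old contents copied on first write */
    printf("live: %s, snapshot: %s\n", live[0], read_snapshot_block(0));
    return 0;
}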
An SNMP interface is introduced in GPFS Version 3.2 to allow monitoring
by network management applications. The SNMP agent provides information
on the GPFS cluster and generates traps when, for example, a file system is
mounted or modified or a node fails. In GPFS Version 3.2 the SNMP agent runs only on
Linux. You can monitor a mixed cluster of AIX and Linux nodes as long as the
agent runs on a Linux node.
GPFS provides support for the Data Management API (DMAPI) interface
which is IBM’s implementation of the X/Open data storage management API.
This DMAPI interface allows vendors of storage management applications
such as IBM Tivoli® Storage Manager (TSM) to provide Hierarchical Storage
Management (HSM) support for GPFS.
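To give a flavor of the interface, the sketch below initializes DMAPI and creates
a session, the starting point for an HSM application that registers for events on
migrated files. The function names follow the X/Open XDSM standard; exact
signatures, and the session name used here, are assumptions to be checked
against the dmapi.h shipped with GPFS.

/* Flavor-only sketch of the DMAPI (XDSM) interface used by HSM products.
 * Function names follow the X/Open XDSM standard; exact signatures should
 * be checked against the dmapi.h shipped with GPFS. */
#include <stdio.h>
#include <dmapi.h>

int main(void)
{
    char        *version = NULL;
    char         sessinfo[] = "example-hsm-session";   /* hypothetical name */
    dm_sessid_t  sid;

    /* Initialize the DMAPI library and report the implementation version. */
    if (dm_init_service(&version) != 0) { perror("dm_init_service"); return 1; }
    printf("DMAPI version: %s\n", version ? version : "unknown");

    /* Create a session; an HSM product would then register dispositions for
     * events such as reads of files whose data has been migrated to tape. */
    if (dm_create_session(DM_NO_SESSION, sessinfo, &sid) != 0) {
        perror("dm_create_session");
        return 1;
    }

    /* ... event dispositions and the event-handling loop would go here ... */

    dm_destroy_session(sid);
    return 0;
}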
GPFS enhanced access control protects directories and files by providing a
means of specifying who should be granted access. On AIX, GPFS supports
NFS V4 access control lists (ACLs) in addition to traditional ACL support.
Traditional GPFS ACLs are based on the POSIX model. Access control lists
(ACLs) extend the base permissions, or standard file access modes, of read
(r), write (w), and execute (x) beyond the three categories of file owner, file
group, and other users, to allow the definition of additional users and user
groups. In addition, GPFS introduces a fourth access mode, control (c), which
can be used to govern who can manage the ACL itself.
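A small conceptual example, not the GPFS ACL format or API, shows how the
control mode fits the model: each ACL entry carries the r, w and x bits plus a c
bit, and only an identity whose entry includes c may change the ACL itself. The
entry names below are hypothetical.

/* Conceptual illustration of the GPFS permission model with the additional
 * control (c) mode -- not the on-disk ACL format or the GPFS API. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum { PERM_READ = 1, PERM_WRITE = 2, PERM_EXEC = 4, PERM_CONTROL = 8 };

typedef struct {
    const char  *who;    /* user or group the entry applies to */
    unsigned int perms;  /* combination of the PERM_* bits     */
} acl_entry_t;

/* May this identity modify the ACL itself? Only if its entry grants control. */
static bool may_manage_acl(const acl_entry_t *acl, int n, const char *who)
{
    for (int i = 0; i < n; i++)
        if (strcmp(acl[i].who, who) == 0)
            return (acl[i].perms & PERM_CONTROL) != 0;
    return false;
}

int main(void)
{
    acl_entry_t acl[] = {
        { "owner",   PERM_READ | PERM_WRITE | PERM_EXEC | PERM_CONTROL },
        { "devteam", PERM_READ | PERM_WRITE },  /* may edit the file, not the ACL */
    };

    printf("devteam may manage the ACL: %s\n",
           may_manage_acl(acl, 2, "devteam") ? "yes" : "no");
    return 0;
}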
In addition to providing application file service GPFS file systems may be
exported to clients outside the cluster through NFS or Samba. GPFS has
been used for a long time as the base for a scalable NFS file service
infrastructure. Now that feature is integrated in GPFS Version 3.2 and is called
clustered NFS. Clustered NFS provides all the tools necessary to run a GPFS
Linux cluster as a scalable NFS file server. This allows a GPFS cluster to
provide scalable file service by providing simultaneous access to a common
set of data from multiple nodes. The clustered NFS tools include monitoring of
file services, load balancing and IP address failover.
Data availability
GPFS is fault tolerant and can be configured for continued access to data
even if cluster nodes or storage systems fail. This is accomplished through
robust clustering features and support for data replication.
GPFS continuously monitors the health of the file system components.
When failures are detected, appropriate recovery action is taken automatically.
Extensive logging and recovery capabilities are provided which maintain
metadata consistency when application nodes holding locks or performing
services fail. Data replication is available for journal logs, metadata and data.
Replication allows for continuous operation even if a path to a disk or a disk
itself fails.
GPFS Version 3.2 further enhances clustering robustness with connection
retries. If the LAN connection to a node fails, GPFS will automatically try to
reestablish the connection before making the node unavailable. This provides
for better uptime in environments experiencing network issues.
Using these features along with a high availability infrastructure ensures a
reliable enterprise storage solution.