(mysql.info) mysql-cluster-glossary
Info Catalog
(mysql.info) mysql-cluster-faq
(mysql.info) ndbcluster
15.11 MySQL Cluster Glossary
============================
The following terms are useful to an understanding of MySQL Cluster or
have specialized meanings when used in relation to it.
* *Cluster*:
In its generic sense, a cluster is a set of computers functioning
as a unit and working together to accomplish a single task.
*`NDB Cluster'*:
This is the storage engine used in MySQL to implement data
storage, retrieval, and management distributed among several
computers.
*MySQL Cluster*:
This refers to a group of computers working together using the
`NDB' storage engine to support a distributed MySQL database in a
_shared-nothing architecture_ using _in-memory storage_.
* *Configuration files*:
Text files containing directives and information regarding the
cluster, its hosts, and its nodes. These are read by the cluster's
management nodes when the cluster is started. See
mysql-cluster-config-file, for details.
* *Backup*:
A complete copy of all cluster data, transactions and logs, saved
to disk or other long-term storage.
* *Restore*:
Returning the cluster to a previous state, as stored in a backup.
* *Checkpoint*:
Generally speaking, when data is saved to disk, it is said that a
checkpoint has been reached. More specific to Cluster, it is a
point in time where all committed transactions are stored on disk.
With regard to the `NDB' storage engine, there are two types of
checkpoints which work together to ensure that a consistent view
of the cluster's data is maintained:
* *Local Checkpoint (LCP)*:
This is a checkpoint that is specific to a single node;
however, LCP's take place for all nodes in the cluster more
or less concurrently. An LCP involves saving all of a node's
data to disk, and so usually occurs every few minutes. The
precise interval varies, and depends upon the amount of data
stored by the node, the level of cluster activity, and other
factors.
* *Global Checkpoint (GCP)*:
A GCP occurs every few seconds, when transactions for all
nodes are synchronized and the redo-log is flushed to disk.
* *Cluster host*:
A computer making up part of a MySQL Cluster. A cluster has both a
_physical_ structure and a _logical_ structure. Physically, the
cluster consists of a number of computers, known as cluster hosts
(or more simply as hosts. See also *Node* and *Node group* below.
* *Node*:
This refers to a logical or functional unit of MySQL Cluster, and
is sometimes also referred to as a cluster node. In the context of
MySQL Cluster, we use the term `node' to indicate a _process_
rather than a physical component of the cluster. There are three
node types required to implement a working MySQL Cluster:
* *Management (MGM) nodes*:
Manages the other nodes within the MySQL Cluster. It provides
configuration data to the other nodes; starts and stops
nodes; handles network partitioning; creates backups and
restores from them, and so forth.
* *SQL (MySQL server) nodes*:
Instances of MySQL Server which serve as front ends to data
kept in the cluster's *data nodes*. Clients desiring to
store, retrieve, or update data can access an SQL node just
as they would any other MySQL Server, employing the usual
authentication methods and API's; the underlying distribution
of data between node groups is transparent to users and
applications. SQL nodes access the cluster's databases as a
whole without regard to the data's distribution across
different data nodes or cluster hosts.
* *Data nodes*:
These nodes store the actual data. Table data fragments are
stored in a set of node groups; each node group stores a
different subset of the table data. Each of the nodes making
up a node group stores a replica of the fragment for which
that node group is responsible. Currently, a single cluster
can support up to 48 data nodes total.
It is possible for more than one node to co-exist on a single
machine. (In fact, it is even possible to set up a complete
cluster on one machine, although one would almost certainly _not_
want to do this in a production environment.) It may be helpful to
remember that, when working with MySQL Cluster, the term _host_
refers to a physical component of the cluster whereas a _node_ is
a logical or functional component (that is, a process).
_Regarding Obsolete Terms* In older versions of the MySQL
Cluster documentation, data nodes were sometimes referred to as
`database nodes,' `DB nodes,' or occasionally `storage nodes.' In
addition, SQL nodes were sometimes known as `client nodes' or `API
nodes.' This older terminology has been deprecated to minimize
confusion, and for these reasons should be avoided.
* *Node group*:
A set of data nodes. All data nodes in a node group contain the
same data (fragments), and all nodes in a single group should
reside on different hosts. It is possible to control which nodes
belong to which node groups.
* *Node failure*:
MySQL Cluster is not solely dependent upon the functioning of any
single node making up the cluster; the cluster can continue to run
if one or more nodes fail. The precise number of node failures
that a given cluster can tolerate depends upon the number of nodes
and the cluster's configuration.
* *Node restart*:
The process of restarting a failed cluster node.
* *Initial node restart*:
The process of starting a cluster node with its filesystem
removed. This is sometimes used in the course of software upgrades
and in other special circumstances.
* *System crash* (or *system failure*):
This can occur when so many cluster nodes have failed that the
cluster's state can no longer be guaranteed.
* *System restart*:
The process of restarting the cluster and reinitializing its state
from disk logs and checkpoints. This is required after either a
planned or an unplanned shutdown of the cluster.
* *Fragment*:
A portion of a database table; in the `NDB' storage engine, a
table is broken up into and stored as a number of fragments. A
fragment is sometimes also called a `partition'; however,
`fragment' is the preferred term. Tables are fragmented in MySQL
Cluster in order to facilitate load balancing between machines and
nodes.
* *Replica*:
Under the `NDB' storage engine, each table fragment has number of
replicas stored on other data nodes in order to provide
redundancy. Currently, there may be up four replicas per fragment.
* *Transporter*:
A protocol providing data transfer between nodes. MySQL Cluster
currently supports four different types of transporter connections:
* TCP/IP
This is, of course, the familiar network protocol that
underlies HTTP, FTP (and so on) on the Internet. TCP/IP can
be used for both local and remote connections.
* SCI
*S*calable *C*oherent *I*nterface is a high-speed protocol
used in building multiprocessor systems and
parallel-processing applications. Use of SCI with MySQL
Cluster requires specialized hardware, as discussed in
sci-sockets. For a basic introduction to SCI, see this
essay at dolphinics.com
(http://www.dolphinics.com/corporate/scitech.html).
* SHM
Unix-style *sh*ared *m*emory segments. Where supported, SHM
is used automatically to connect nodes running on the same
host. The Unix man page for `shmop(2)'
(http://www.scit.wlv.ac.uk/cgi-bin/mansec?2+shmop) is a good
place to begin obtaining additional information about this
topic.
* The cluster transporter is internal to the cluster.
Applications using MySQL Cluster communicate with SQL nodes just
as they do with any other version of MySQL Server (via TCP/IP, or
through the use of Unix socket files or Windows named pipes).
Queries can be sent and results retrieved using the standard MySQL
client APIs.
* *`NDB'*:
This stands for *N*etwork *D*ata*b*ase, and refers to the storage
engine used to enable MySQL Cluster. The `NDB' storage engine
supports all the usual MySQL data types and SQL statements, and is
ACID-compliant. This engine also provides full support for
transactions (commits and rollbacks).
* *Share-nothing architecture*:
The ideal architecture for a MySQL Cluster. In a true
share-nothing setup, each node runs on a separate host. The
advantage such an arrangement is that there no single host or node
can act as single point of failure or as a performance bottle neck
for the system as a whole.
* *In-memory storage*:
All data stored in each data node is kept in memory on the node's
host computer. For each data node in the cluster, you must have
available an amount of RAM equal to the size of the database times
the number of replicas, divided by the number of data nodes. Thus,
if the database takes up 1GB of memory, and you want to set up the
cluster with four replicas and eight data nodes, a minimum of
500MB memory will be required per node. Note that this is in
addition to any requirements for the operating system and any
other applications that might be running on the host.
* *Table*:
As is usual in the context of a relational database, the term
`table' denotes a set of identically structured records. In MySQL
Cluster, a database table is stored in a data node as a set of
fragments, each of which is replicated on additional data nodes.
The set of data nodes replicating the same fragment or set of
fragments is referred to as a _node group_.
* *Cluster programs*:
These are command-line programs used in running, configuring, and
administering MySQL Cluster. They include both server daemons:
* `ndbd':
The data node daemon (runs a data node process)
* `ndb_mgmd':
The management server daemon (runs a management server
process)
and client programs:
* `ndb_mgm':
The management client (provides an interface for executing
management commands)
* `ndb_waiter':
Used to verify status of all nodes in a cluster
* `ndb_restore':
Restores cluster data from backup
For more about these programs and their uses, see
mysql-cluster-process-management.
* *Event log*:
MySQL Cluster logs events by category (startup, shutdown, errors,
checkpoints, and so on), priority, and severity. A complete
listing of all reportable events may be found in
mysql-cluster-event-reports. Event logs are of two types:
* *Cluster log*:
Keeps a record of all desired reportable events for the
cluster as a whole.
* *Node log*:
A separate log which is also kept for each individual node.
Under normal circumstances, it is necessary and sufficient to keep
and examine only the cluster log. The node logs need be consulted
only for application development and debugging purposes.
Info Catalog
(mysql.info) mysql-cluster-faq
(mysql.info) ndbcluster
automatically generated byinfo2html