• HP NonStop™ TUXEDO 

 

HP NonStop™ TUXEDO enables the development of scalable, fault-tolerant OLTP applications that are portable and that interoperate with generic TUXEDO installations on many industry-standard platforms. HP NonStop TUXEDO is the proven platform for simplifying distributed transaction processing and message-based application development, while delivering unlimited scalability and standards-based interoperability, both of which are fundamental to service-oriented architecture.

NonStop TUXEDO Architecture

Because NonStop TUXEDO is targeted specifically at the Himalaya platform, general portability is not a requirement. Hence NonStop TUXEDO is internally modified to adapt to and exploit the unique attributes of the underlying Himalaya hardware and software infrastructure.

Note: "Portability" in this case refers to the NonStop TUXEDO system software itself. Portability at the application level is unaffected: the externals of NonStop TUXEDO are equivalent to those of BEA TUXEDO (in terms of use, programming, interoperability, and administration).

Note: In an MSSQ-set, several identical server processes read from a single request queue.

Note: The weighting is user configurable, and allows for differences in the "capacity" of each machine to be taken into account (e.g. the relative network bandwidth between the client machine and each possible server machine, the class of each server machine (workstation, mainframe), etc).

The main differentiating characteristics of Himalaya systems (in this context) are as follows:

  •  The Himalaya processors are loosely-coupled MP machines, which have very different characteristics from SMP machines:
       •  Each machine ("node" in Tandem terminology) can have up to 16 cpus.
       •  Each cpu has its own memory. Memory can only be shared by processes in the same cpu.
       •  Up to 255 nodes can be clustered (using the Tandem Expand networking architecture). Himalaya clusters may therefore consist of up to 4080 cpus (255*16), hence the term "Massively Parallel Processors" (MPP) is often used to describe these configurations.
       •  Any cpu in a cluster can access any device attached to any other cpu in the cluster (for example, all disks in the cluster are directly accessible from any process on any cpu).
       •  Processes send messages to each other via the NonStop Kernel Message System. Any process in the cluster can communicate with any other process in the cluster. The location of the target process is transparent to the caller.
       •  A node has a single-system image (SSI): the fact that a node comprises many cpus is transparent to users and application processes.
  •  Unlike BEA TUXEDO, NonStop TUXEDO is built upon an operating system and subsystem infrastructure that is explicitly designed to support the needs of OLTP applications. This same infrastructure is used to support Tandem RMs (for example, NonStop SQL/MP (RDBMS) and Enscribe (ISAM file system)) and Pathway. Hence, many internal functions which BEA TUXEDO has to provide to make up for the generalities of Unix are provided by underlying services, and are not required in the NonStop TUXEDO implementation. These underlying services comprise primarily the following components:
       •  NonStop Transaction Manager/MP (NonStop TM/MP). Provides the transaction management services necessary to associate requests made to RMs with the appropriate transaction, and to coordinate completion (commit/rollback) of the transaction across all RMs involved. Performs transaction recovery processing in the event of failure. Provides transaction and data logging services both for itself (the TM) and RMs. NonStop TM/MP is tightly integrated with the NonStop Kernel message system and the RMs.
       •  NonStop Transaction Services/MP (NonStop TS/MP). Performs server process management: the services necessary to ensure application server processes are always available to clients (detecting server failures and automatically restarting the failed servers). Ensures server processes are efficiently utilised by load-balancing client requests among them.
       •  NonStop Kernel Message System. Used to communicate between processes. Provides transparency of target process location (same cpu, same node, or across nodes). Automatically propagates transaction information with the message (the receiving process is "infected" with the transaction currently associated with the sending process).

Shared Memory

On Himalaya machines, memory may only be shared between processes in the same cpu. If the BEA TUXEDO architecture were maintained, this would constrain execution of all NonStop TUXEDO client, server, and system processes to a single cpu.

This would yield poor performance and availability, so this constraint has to be removed.

The "simplest" solution would be to treat each cpu as a TUXEDO machine, and replicate every shared-memory data structure on every cpu. But such an implementation would also have serious shortcomings: it would be difficult to configure and manage (every cpu would require an entry in the UBBCONFIG *MACHINES section), and would still offer poor performance (for example, servers in an MSSQ-set could not be distributed across cpus).

The actual solution implemented is as follows:

  •  The Bulletin Board (BB) and the Bulletin Board Liaison (BBL) are replicated across every cpu. This enables clients and servers to be located on any cpu (a necessity), without requiring rework of large tracts of BEA code. Since there is little change to the global part of the BB during normal operation, the overhead of propagating such changes to multiple copies of the BB is not considered a problem.
  •  The Global Transaction Table (GTT) is deleted (see the Transaction Management section below for details).
  •  This architecture maintains the key attribute that a single Himalaya node appears equivalent to a TUXEDO machine, preserving BEA TUXEDO externals (in terms of configuration and runtime administration).
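The replication scheme above can be pictured with a small model. This is only a sketch under stated assumptions: the class names, the `advertise` and `lookup` methods, and the idea that the BB holds a service-to-server-class table are illustrative, not the actual NonStop TUXEDO data structures. It shows why infrequent global changes make replication cheap: a change is written once per cpu, while every lookup stays local.

```python
from dataclasses import dataclass, field


@dataclass
class BulletinBoard:
    """Hypothetical per-cpu copy of the BB: service name -> server class name."""
    services: dict = field(default_factory=dict)


class Node:
    """Toy model of one node holding one BB copy per cpu."""

    def __init__(self, cpu_count: int):
        self.copies = [BulletinBoard() for _ in range(cpu_count)]

    def advertise(self, service: str, server_class: str) -> int:
        """Apply a global change to every copy; return how many copies were written."""
        for bb in self.copies:
            bb.services[service] = server_class
        return len(self.copies)

    def lookup(self, cpu: int, service: str) -> str:
        """A process reads only the copy held in its own cpu's memory."""
        return self.copies[cpu].services[service]
```

In this model the cost of a global change grows with the number of cpus, whereas the cost of a lookup does not, which matches the trade-off the text describes.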

Interprocess Messages

The NonStop Kernel message system is used for all NonStop TUXEDO interprocess communication. Because the message system spans all cpus and nodes in a cluster seamlessly, whether the destination process is on the same cpu, the same node, or a remote node is transparent to the caller and recipient. This contrasts with the Unix message system, which is limited to a single machine (hence the need in BEA TUXEDO for Bridge processes, to logically extend the Unix message system across machines). NonStop TUXEDO is therefore able to do away with Bridge processes completely.

Because the message system spans cpus, NonStop TUXEDO has no constraints on server instance co-location with its "message queue", and hence server processes in a server class (MSSQ-set) may be distributed across cpus.
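Location transparency can be illustrated with a toy model. The `MessageSystem` class and its methods below are hypothetical and are not the NonStop Kernel API; the sketch only shows the property described above, namely that a sender names the target process and never supplies or inspects its cpu or node.

```python
from collections import deque


class MessageSystem:
    """Toy stand-in for a cluster-wide message system (not the real NonStop API)."""

    def __init__(self):
        self._location = {}  # process name -> (node, cpu)
        self._inbox = {}     # process name -> queued messages

    def register(self, name, node, cpu):
        """Record where a process runs and give it an empty inbox."""
        self._location[name] = (node, cpu)
        self._inbox[name] = deque()

    def send(self, target, message):
        """Callers address a process by name only; its location is never supplied."""
        self._inbox[target].append(message)

    def receive(self, name):
        """Return the oldest message queued for the named process."""
        return self._inbox[name].popleft()

    def where(self, name):
        """Administrative view of the placement; senders do not need it."""
        return self._location[name]
```

Because `send` takes no location argument, servers of one server class can be registered on different cpus without any change on the client side.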

Transaction Management

NonStop TUXEDO replaces the BEA TUXEDO transaction manager function with NonStop TM/MP. As a result, the GTT is not required (NonStop TM/MP has its own internal data structures), and neither is the functionality of the TMS. The GTT and the TMS processes are therefore deleted in NonStop TUXEDO.

NonStop TM/MP functionality is tightly integrated with the message system and the RMs. Transaction creation, propagation, and commitment are fundamental functions of the NonStop Kernel infrastructure. NonStop TM/MP processes exist on each node in the cluster, and manage all the recoverable resources (RMs) on that node. There is no notion of separate groups for each RM, each with its own TMS process, as with BEA TUXEDO.

Beginning a NonStop TUXEDO transaction starts a NonStop TM/MP transaction, and associates the process with that transaction. When the process (client) sends a request to a server (via tp*call(), tpconnect(), or tpforward()), the message system propagates the NonStop TM/MP transaction to the server process. All work done in the server process (for example, RM access) is then associated with that transaction. When the client commits the transaction, NonStop TM/MP broadcasts the two-phase commit (2PC) sequence to all nodes in the cluster. NonStop TM/MP on each node then interacts with the local RMs to prepare and commit or roll back any changes made on that node on behalf of the transaction.

Note: TUXEDO client, server, and system processes all need access to the BB shared-memory data structure.

Note: Minimum change to BEA code is a high-priority design requirement.
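The commit sequence described above follows the general two-phase commit pattern, which can be sketched as follows. The `ResourceManager`, `NodeTM`, and `commit_transaction` names are invented for illustration and do not correspond to NonStop TM/MP interfaces; the sketch assumes one transaction manager per node that drives only its local RMs.

```python
class ResourceManager:
    """Hypothetical RM that votes in phase 1 and applies the outcome in phase 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        self.state = "prepared" if self.can_commit else "failed"
        return self.can_commit

    def finish(self, commit):
        self.state = "committed" if commit else "rolled back"


class NodeTM:
    """Hypothetical per-node transaction manager driving its local RMs."""

    def __init__(self, rms):
        self.rms = rms

    def prepare(self):
        # Build the full list first so that every RM records its vote.
        return all([rm.prepare() for rm in self.rms])

    def finish(self, commit):
        for rm in self.rms:
            rm.finish(commit)


def commit_transaction(node_tms):
    """Phase 1: every node prepares. Phase 2: commit only if all nodes voted yes."""
    votes = [tm.prepare() for tm in node_tms]
    outcome = all(votes)
    for tm in node_tms:
        tm.finish(outcome)
    return outcome
```

A single negative vote on any node causes every node to roll back, which is the all-or-nothing behaviour the prepare and commit/rollback steps are there to guarantee.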

A key difference in this NonStop TM/MP to RM interaction, compared with the BEA TUXEDO to RM interaction, is that NonStop TM/MP acts both as a "global transaction coordinator" and as an "RM local transaction coordinator" simultaneously (logically, NonStop TM/MP provides the function of both the TM and RM roles as defined by the XA interface). NonStop TM/MP provides a single "common" transaction log per node, which is used to save all the necessary transaction recovery information for both NonStop TM/MP and the RMs. A single synchronous log write per node occurs at transaction commitment, irrespective of the number of RMs involved in the transaction. For example, if a NonStop TUXEDO service routine updates a NonStop SQL/MP file, an Enscribe file, and a Queue file, all the information necessary for recovery is logged with a single synchronous write.
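The effect of a common log can be shown with a minimal model. `CommonLog` is a hypothetical class, not the NonStop TM/MP log format or API; it assumes that the TM and all local RMs append recovery records to one buffer and that commit forces that buffer to disk once.

```python
class CommonLog:
    """Hypothetical per-node log shared by the TM and every local RM."""

    def __init__(self):
        self.buffer = []
        self.sync_writes = 0

    def append(self, source, record):
        """Buffer a recovery record; nothing is durable yet."""
        self.buffer.append((source, record))

    def flush(self):
        """One synchronous write makes all buffered records durable together."""
        written = len(self.buffer)
        self.buffer.clear()
        self.sync_writes += 1
        return written
```

In the example from the text, records from SQL/MP, Enscribe, the Queue file, and the TM itself would all be covered by one call to `flush`, so the count of synchronous writes stays at one however many RMs took part.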

Process Management

In NonStop TUXEDO, process management is performed by NonStop TS/MP. NonStop TUXEDO servers are implemented as NonStop TS/MP server classes (the internal use by NonStop TUXEDO of NonStop TS/MP server classes is transparent). A server class may consist of a single server process, or many (if the servers are defined as an MSSQ-set). NonStop TUXEDO administration commands such as tmboot and tmshutdown are transformed (internally) into NonStop TS/MP commands to start and stop server classes.

NonStop TS/MP is responsible for server failure detection and restart. The BEA TUXEDO server process "surveillance" mechanism is replaced by the NonStop TS/MP mechanism. This is considerably different from the BEA mechanism because of the "OLTP-optimised" nature of the NonStop Kernel operating system. In this case, the feature of note is that any process can request notification of certain events (for example, the abnormal death of another process, or a cpu failure). NonStop TS/MP receives such notification for any server process in a server class which abends, and will automatically restart the failed server. If NonStop TS/MP receives notification that a cpu has failed, then it will automatically restart all the server processes from the failed cpu on another cpu.
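The event-driven restart behaviour can be sketched as a small monitor. Everything here is an assumption made for illustration: the `ServerClassMonitor` class, its method names, and the policy of spreading displaced servers over the surviving cpus in rotation are not taken from NonStop TS/MP.

```python
class ServerClassMonitor:
    """Toy monitor for one server class; names and policy are illustrative only."""

    def __init__(self, cpus):
        self.cpus = set(cpus)  # cpus currently up
        self.placement = {}    # server name -> cpu it runs on
        self.restarts = 0

    def start(self, server, cpu):
        self.placement[server] = cpu

    def on_process_abend(self, server):
        """Restart the failed server on the cpu it was already assigned to."""
        self.restarts += 1
        return self.placement[server]

    def on_cpu_down(self, cpu):
        """Restart every server from the failed cpu on a surviving cpu."""
        self.cpus.discard(cpu)
        if not self.cpus:
            raise RuntimeError("no surviving cpu")
        survivors = sorted(self.cpus)
        moved = [s for s, c in self.placement.items() if c == cpu]
        for i, server in enumerate(moved):
            self.placement[server] = survivors[i % len(survivors)]
            self.restarts += 1
        return moved
```

The monitor never polls its servers; it only reacts to the two notifications, which is the contrast with a surveillance-style mechanism that the text draws.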

In BEA TUXEDO, for request-response servers, the number of server instances is static. For NonStop TUXEDO, NonStop TS/MP supports the notion of dynamic servers, which provides an automatic response to increased load. The administrator can configure the MIN and MAX number of instances of a particular server. When the server is first booted, NonStop TUXEDO will start MIN instances of the server in the server class. If queuing starts to occur at the

Load Balancing

There is a mapping made (in the BB) between the service name requested, and the server class name of the server(s) offering that service [12]. The server class name is thus obtained from the BB and used to address the message.

In the case of a service offered only by a single-server server class, client requests are sent directly to the server process (where queuing occurs if necessary). No load balancing is required.

In the case of a service offered by a multiple-server server class, load balancing between servers in the server class is performed by NonStop TS/MP, via a distributor-server (there is a separate distributor-server for each server class) [13]. Logically, the distributor-server represents the single queue in an MSSQ-set to which all client requests are sent, and the servers in the server class are single servers each with their own queue, to which the distributor-server forwards the request. When the server replies, the reply is routed via the distributor-server back to the client. Because it handles all client requests and replies, the distributor-server "knows" precisely the current state of all server processes in the server class (busy, idle, suspended, etc), and hence can perform accurate load-balancing when forwarding the request (to an idle server). Queuing occurs at the distributor-server if necessary (for example, if all the servers in the server class are busy).
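The distributor-server logic described above can be sketched as follows. This is a simplified model, not the NonStop TS/MP implementation: the `Distributor` class and its `request` and `reply` methods are hypothetical, and server states are reduced to idle and busy.

```python
from collections import deque


class Distributor:
    """Toy distributor for one server class: one queue, forwards to idle servers."""

    def __init__(self, servers):
        self.idle = deque(servers)
        self.busy = {}          # server -> request being processed
        self.waiting = deque()  # requests queued while all servers are busy

    def request(self, req):
        """Forward to an idle server, or queue the request; return server or None."""
        if not self.idle:
            self.waiting.append(req)
            return None
        server = self.idle.popleft()
        self.busy[server] = req
        return server

    def reply(self, server):
        """Server finished: hand it the next queued request, or mark it idle."""
        done = self.busy.pop(server)
        if self.waiting:
            self.busy[server] = self.waiting.popleft()
        else:
            self.idle.append(server)
        return done
```

Because every request and reply passes through the same object, its view of which servers are idle is always exact, and queuing happens in one place only.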

If a service is offered by more than one server class, NonStop TUXEDO will select the server class using the same round-robin algorithm as used by BEA TUXEDO when the same service is offered by servers on different machines (selection of the server instance within the server class then occurs via NonStop TS/MP as described above).
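Round-robin selection between server classes can be sketched in a few lines. The `ServiceRouter` class and the service and server class names are hypothetical, and the sketch assumes a plain rotation; any details of the BEA algorithm beyond "round-robin" are not given in the text and are not modelled here.

```python
from itertools import cycle


class ServiceRouter:
    """Toy service table: service name -> server classes, chosen in rotation."""

    def __init__(self, table):
        # One independent rotation per service name.
        self._next = {svc: cycle(classes) for svc, classes in table.items()}

    def pick(self, service):
        """Return the server class that should receive the next request."""
        return next(self._next[service])
```

The router only chooses the server class; choosing a server instance inside that class is left to the per-class distributor.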

[12] This mapping is derived from information defined in the UBBCONFIG file. In NonStop TUXEDO, the *SERVERS RQADDR parameter specifies the server class name.

[13] The distributor-server is implemented as a process-pair.

 

NonStop TUXEDO Advantages vs. BEA TUXEDO (on a Unix SMP)

BEA TUXEDO does an excellent job of bringing OLTP capability to Unix-based applications and databases. However, it is somewhat limited by the shortcomings of the Unix SMP environment itself, and by the constraints imposed to achieve portability across multiple Unix implementations and hardware platforms. Because NonStop TUXEDO is specifically designed to run only on Himalaya systems, and has been modified to adapt to and exploit the OLTP-optimised system infrastructure (as described above), the result is a NonStop TUXEDO implementation which exhibits additional benefits over BEA TUXEDO running on a Unix SMP. These benefits accrue in the following areas:

  •  Application availability.
  •  Performance and scalability.
  •  Administration.

Availability

NonStop TUXEDO applications offer higher levels of availability for the following reasons:

  •  Fault-tolerance of the underlying hardware.

Because of the replication of components, and their isolation (for example, separate memory per cpu), no single point of failure will render an entire Himalaya machine unavailable. Failure of a cpu or its memory only affects those processes running in that cpu; no single hardware error could occur which would make NonStop TUXEDO services on a node completely unavailable.

 

NonStop Tuxedo Components