Waltz Storage

Waltz Storage is a storage server that provides persistence to Waltz. Its functionality is deliberately limited to a relatively simple set of commands, because Waltz centralizes responsibility for transaction consistency and fault recovery in the Waltz Server. Waltz Storage has the following ten request/response patterns.

  1. Open Request → { Success Response | Failure Response }
  2. Last Session Info Request → { Last Session Info Response | Failure Response }
  3. Max Transaction ID Request → { Max Transaction ID Response | Failure Response }
  4. Truncate Request → { Success Response | Failure Response }
  5. Set Low-water Mark Request → { Success Response | Failure Response }
  6. Append Request → { Success Response | Failure Response }
  7. Record Header Request → { Record Header Response | Failure Response }
  8. Record Request → { Record Response | Failure Response }
  9. Record Header List Request → { Record Header List Response | Failure Response }
  10. Record List Request → { Record List Response | Failure Response }
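
The ten patterns above can be summarized as a lookup from each request to the responses it may receive. This is only an illustrative sketch: the dictionary `SUCCESS_RESPONSE` and the function `allowed_responses` are hypothetical names, not part of Waltz.

```python
# Hypothetical lookup table: request name -> name of its non-failure response.
# The names mirror the list above; the table itself is illustrative only.
SUCCESS_RESPONSE = {
    "Open": "Success",
    "Last Session Info": "Last Session Info",
    "Max Transaction ID": "Max Transaction ID",
    "Truncate": "Success",
    "Set Low-water Mark": "Success",
    "Append": "Success",
    "Record Header": "Record Header",
    "Record": "Record",
    "Record Header List": "Record Header List",
    "Record List": "Record List",
}


def allowed_responses(request):
    """Return the set of response kinds a given request may receive."""
    return {SUCCESS_RESPONSE[request] + " Response", "Failure Response"}
```

Every request can fail, so `Failure Response` appears in every result set.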

Open Request

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | -1 |
| Sequence number | long | -1 |
| Partition ID | int | -1 |
| Cluster key | UUID | A unique ID of the cluster |
| Number of partitions | int | The number of partitions |

The Open request is the first request a Waltz server makes after a connection to a storage server is opened. The Waltz server sends a cluster key, which is a UUID assigned to the cluster when it is created by an admin tool. The storage server compares this key with the one in its control file. A mismatch indicates a configuration error.
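
The cluster key comparison can be sketched as follows. This is a minimal illustration, not the Waltz implementation: `ClusterKeyMismatch` and `check_cluster_key` are hypothetical names, and how the key is read from the control file is left out.

```python
import uuid


class ClusterKeyMismatch(Exception):
    """Hypothetical error type signalling a configuration error."""


def check_cluster_key(request_key, control_file_key):
    """Compare the key from an Open request with the key in the control file."""
    if request_key != control_file_key:
        raise ClusterKeyMismatch(
            "cluster key %s does not match control file key %s"
            % (request_key, control_file_key)
        )


# Usage: a matching key passes silently, a different key raises.
key = uuid.uuid4()
check_cluster_key(key, key)
```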

Last Session Info Request

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | The store session ID |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |

The Last Session Info request is sent by the recovery manager when a replica connection is opened. It requests information about the last store session of the storage partition on this storage server. This information comes from the Partition Info in the control file. The low-water mark is the high-water mark of the partition at the time the store session started.

Last Session Info Response

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | The store session ID |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |
| Session ID | long | The last store session ID |
| Low-water Mark | long | The low-water mark of the last session |

Max Transaction ID Request

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | The store session ID |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |

This request asks for the max transaction ID of the specified storage partition. The transaction with that ID may not be committed yet.

Max Transaction ID Response

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | The store session ID |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |
| Transaction ID | long | The max transaction ID of the partition on the storage server |

Truncate Request

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | The store session ID |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |
| Transaction ID | long | The max transaction ID to retain. Any transaction after this transaction ID will be removed. |
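
The effect of a Truncate request on a partition can be illustrated with a small in-memory model. This is only a sketch: the function `truncate` and the dictionary-based store are assumptions made for illustration and say nothing about how Waltz Storage lays out data on disk.

```python
def truncate(records_by_id, max_retained_id):
    """Return a copy of the store keeping only transactions up to max_retained_id.

    records_by_id is a hypothetical in-memory stand-in for a storage partition,
    mapping transaction ID -> record.
    """
    return {
        txn_id: record
        for txn_id, record in records_by_id.items()
        if txn_id <= max_retained_id
    }


# Usage: transactions 3 and 4 come after the retained ID 2 and are removed.
partition = {0: "r0", 1: "r1", 2: "r2", 3: "r3", 4: "r4"}
partition = truncate(partition, 2)
```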

Set Low-water Mark Request

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | The store session ID |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |
| Low-water mark | long | The low-water mark, which is the max transaction ID at the time this store session started |

Append Request

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | The store session ID |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |
| Record List Length | int | The number of transaction records |
| Record List | Record[] | The list of records |

RecordHeader

| Field | Data Type | Description |
| --- | --- | --- |
| Transaction ID | long | The transaction ID |
| Request ID | ReqId | Client generated unique request ID |
| Transaction Header | int | The transaction header |

Record

| Field | Data Type | Description |
| --- | --- | --- |
| Transaction ID | long | The transaction ID |
| Request ID | ReqId | Client generated unique request ID |
| Transaction Header | int | The transaction header |
| Transaction Data Length | int | The length of transaction data |
| Transaction Data | byte[] | A byte array |
| Checksum | int | CRC32 of transaction data |
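
To make the Record layout and its checksum concrete, here is a hedged sketch that packs and unpacks a record. Several details are assumptions not stated above: big-endian byte order, fields written in table order, and a Request ID treated as an opaque 16-byte value (`REQ_ID_SIZE`). The function names are hypothetical. Only the use of CRC32 over the transaction data comes from the table.

```python
import struct
import zlib

REQ_ID_SIZE = 16  # assumption: ReqId is treated as 16 opaque bytes


def encode_record(transaction_id, req_id, header, data):
    """Pack a record in table order, appending the CRC32 of the data."""
    if len(req_id) != REQ_ID_SIZE:
        raise ValueError("req_id must be %d bytes" % REQ_ID_SIZE)
    checksum = zlib.crc32(data) & 0xFFFFFFFF
    return (
        struct.pack(">q", transaction_id)        # long transaction ID
        + req_id                                 # opaque request ID
        + struct.pack(">ii", header, len(data))  # header and data length
        + data
        + struct.pack(">I", checksum)            # CRC32 of the data
    )


def decode_record(buf):
    """Unpack a record and verify the checksum against the data."""
    (transaction_id,) = struct.unpack_from(">q", buf, 0)
    offset = 8
    req_id = bytes(buf[offset:offset + REQ_ID_SIZE])
    offset += REQ_ID_SIZE
    header, length = struct.unpack_from(">ii", buf, offset)
    offset += 8
    data = bytes(buf[offset:offset + length])
    offset += length
    (checksum,) = struct.unpack_from(">I", buf, offset)
    if zlib.crc32(data) & 0xFFFFFFFF != checksum:
        raise ValueError("checksum mismatch")
    return transaction_id, req_id, header, data
```

Verifying the checksum on read lets a reader detect corrupted transaction data before handing it to the caller.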

Record Header Request

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | The store session ID |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |
| Transaction ID | long | The transaction ID |

Record Header Response

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | The store session ID |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |
| Record header | RecordHeader | The transaction record header |

Record Request

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | The store session ID |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |
| Transaction ID | long | The transaction ID |

Record Response

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | The store session ID |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |
| Record | Record | The transaction record |

Record Header List Request

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | The store session ID |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |
| Transaction ID | long | The transaction ID |
| Max number of records | int | The maximum number of records to fetch |

Record Header List Response

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | The store session ID |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |
| Record header list Length | int | The number of record headers |
| Record header list | RecordHeader[] | The list of transaction record headers |
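
A list request pairs a transaction ID with a maximum count. The sketch below shows one plausible reading, which is an assumption and not confirmed by the tables: the transaction ID is the first ID to fetch, IDs are contiguous, and the response holds at most the requested number of headers. The function `record_header_list` and the in-memory dictionary are hypothetical.

```python
def record_header_list(headers_by_id, transaction_id, max_records):
    """Collect up to max_records headers starting at transaction_id.

    Assumes transaction IDs are contiguous; collection stops at the
    first missing ID or when max_records headers have been gathered.
    """
    result = []
    next_id = transaction_id
    while len(result) < max_records and next_id in headers_by_id:
        result.append(headers_by_id[next_id])
        next_id += 1
    return result


# Usage: only two headers exist from ID 2 onward, so two are returned.
headers = {1: "h1", 2: "h2", 3: "h3"}
batch = record_header_list(headers, 2, 5)
```

Under this reading, the length field in the response can be smaller than the requested maximum when the partition has fewer headers available.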

Record List Request

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | The store session ID |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |
| Transaction ID | long | The transaction ID |
| Max number of records | int | The maximum number of records to fetch |

Record List Response

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | The store session ID |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |
| Record List Length | int | The number of records |
| Record List | Record[] | The list of transaction records |

Success Response

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | The store session ID |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |

Failure Response

| Field | Data Type | Description |
| --- | --- | --- |
| Session ID | long | The store session ID |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |
| Exception | StorageRpcException | The exception information |

StorageRpcException

| Field | Data Type | Description |
| --- | --- | --- |
| Message | String | The exception message |
| Stack Trace Length | int | The number of stack trace elements |
| Stack Trace Element (repeated) | | |
| Class name | String | The class name |
| Method name | String | The method name |
| File name | String | The file name |
| Line number | int | The line number |
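
The nested shape of StorageRpcException, a message followed by a counted list of stack trace elements, can be modeled as below. This is an illustrative Python mirror of the table, with hypothetical class and attribute names. It is not the Waltz class itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StackTraceElement:
    class_name: str
    method_name: str
    file_name: str
    line_number: int


@dataclass
class StorageRpcException:
    message: str
    stack_trace: List[StackTraceElement] = field(default_factory=list)

    @property
    def stack_trace_length(self):
        """The count that precedes the repeated elements in the table."""
        return len(self.stack_trace)


# Usage with made-up values.
exc = StorageRpcException(
    "partition not found",
    [StackTraceElement("ExampleClass", "exampleMethod", "Example.java", 42)],
)
```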

Replica Assignments

The assignment of replicas to the storage servers is stored in Zookeeper. The ZNode path is <cluster root>/store/assignment. This data is initialized when a cluster is configured. It can be updated dynamically with the Zookeeper CLI tool.

Replica Assignments

| Field | Data Type | Description |
| --- | --- | --- |
| Map of Storage to Partition ID List (repeated) | | |
| Connect String | String | The connect string (host:port) |
| Replica IDs | int[] | The array of Replica IDs |
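
As a rough illustration of the data shape, the assignments can be viewed as a map from connect string to a list of replica IDs. The host names, port, and helper functions below are hypothetical, and how the data is serialized in the ZNode is not covered here.

```python
def parse_connect_string(connect_string):
    """Split a host:port connect string into (host, port)."""
    host, sep, port = connect_string.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected host:port, got %r" % connect_string)
    return host, int(port)


# Made-up assignment data in the shape described by the table.
replica_assignments = {
    "storage-1.example.com:6000": [0, 1],
    "storage-2.example.com:6000": [1, 2],
}


def storages_for_replica(assignments, replica_id):
    """Return the connect strings whose replica ID list contains replica_id."""
    return sorted(cs for cs, ids in assignments.items() if replica_id in ids)
```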

Group Descriptor

The descriptor of replica group assignments is stored in Zookeeper. The ZNode path is <cluster root>/store/group. This data is initialized when a cluster is configured. It can be updated dynamically with the Zookeeper CLI tool.

Group Descriptor

| Field | Data Type | Description |
| --- | --- | --- |
| Map of Connect String to Group ID (repeated) | | |
| Connect String | String | The connect string (host:port) |
| Group ID | Integer | The group ID it belongs to |
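
Because the descriptor maps each connect string to one group ID, listing the members of each group means inverting the map. The sketch below does that with made-up data. The function name `storages_by_group` is hypothetical.

```python
from collections import defaultdict


def storages_by_group(group_descriptor):
    """Invert a connect-string -> group-ID map into group-ID -> connect strings."""
    groups = defaultdict(list)
    for connect_string, group_id in group_descriptor.items():
        groups[group_id].append(connect_string)
    return {group_id: sorted(members) for group_id, members in groups.items()}


# Made-up descriptor in the shape described by the table.
group_descriptor = {
    "storage-1.example.com:6000": 0,
    "storage-2.example.com:6000": 0,
    "storage-3.example.com:6000": 1,
}
members = storages_by_group(group_descriptor)
```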

Storage Administration

The Waltz storage server also exposes a separate port for administrative operations such as marking the storage node offline or online, or assigning a partition to the storage node. The admin client is built on the same client implementation as the normal storage client, but uses a different set of administrative messages as its protocol.

Admin Open Request

| Field | Data Type | Description |
| --- | --- | --- |
| Sequence number | long | -1 |
| Cluster key | UUID | A unique ID of the cluster |
| Number of partitions | int | The number of partitions |

Partition Assignment Request

| Field | Data Type | Description |
| --- | --- | --- |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |
| Toggled | boolean | Create (true) or delete (false) the partition on the storage node |

Partition Read Request

| Field | Data Type | Description |
| --- | --- | --- |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |
| Toggled | boolean | Mark the partition as readable (true) or unreadable (false) on the storage node |

Partition Write Request

| Field | Data Type | Description |
| --- | --- | --- |
| Sequence number | long | The message sequence number |
| Partition ID | int | The partition ID |
| Toggled | boolean | Mark the partition as writable (true) or unwritable (false) on the storage node |
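
The three toggle requests above can be pictured as operations on per-partition flags. The model below is a sketch under stated assumptions: a newly assigned partition starts neither readable nor writable, and toggling a flag on an unassigned partition raises `KeyError`. Neither behavior is specified above, and the class names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PartitionState:
    readable: bool = False  # assumption: off until explicitly enabled
    writable: bool = False  # assumption: off until explicitly enabled


class StorageNodeAdmin:
    """In-memory stand-in for the admin view of one storage node."""

    def __init__(self):
        self.partitions = {}

    def partition_assignment(self, partition_id, toggled):
        # True creates the partition, False deletes it.
        if toggled:
            self.partitions.setdefault(partition_id, PartitionState())
        else:
            self.partitions.pop(partition_id, None)

    def partition_read(self, partition_id, toggled):
        self.partitions[partition_id].readable = toggled

    def partition_write(self, partition_id, toggled):
        self.partitions[partition_id].writable = toggled


# Usage: assign partition 3, then open it for reads only.
node = StorageNodeAdmin()
node.partition_assignment(3, True)
node.partition_read(3, True)
```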

Admin Success Response

| Field | Data Type | Description |
| --- | --- | --- |
| Sequence number | long | The message sequence number |

Admin Failure Response

| Field | Data Type | Description |
| --- | --- | --- |
| Sequence number | long | The message sequence number |
| Exception | StorageRpcException | The exception information |
Copyright © 2019 WePay Inc.