bus1, Version 1

The Internal Bus1 API


Table of Contents

1. Bus1 Overview
2. Bus1 Peer
struct bus1_peer — peer context
bus1_peer_acquire — acquire active reference to peer
bus1_peer_release — release an active reference
bus1_peer_new — allocate new peer
bus1_peer_free — destroy peer
bus1_peer_ioctl — handle peer ioctls
3. Bus1 Message
struct bus1_factory — message factory
struct bus1_message — data messages
bus1_message_ref — acquire object reference
bus1_message_unref — release object reference
bus1_factory_new — create new message factory
bus1_factory_free — destroy message factory
bus1_factory_seal — charge and commit local resources
bus1_factory_instantiate — instantiate a message from a factory
bus1_message_free — destroy message
bus1_message_stage — stage message
bus1_message_install — install message payload into target process
4. Bus1 Transaction
enum bus1_tx_bits — transaction flags
struct bus1_tx — transaction context
bus1_tx_init — initialize transaction context
bus1_tx_deinit — deinitialize transaction context
bus1_tx_stage_sync — stage message
bus1_tx_stage_later — postpone message
bus1_tx_join — HIC SUNT DRACONES!
bus1_tx_commit — commit transaction
5. Bus1 Handle
enum bus1_handle_bits — node flags
struct bus1_handle — object handle
bus1_handle_is_anchor — check whether handle is an anchor
bus1_handle_is_live — check whether handle is live
bus1_handle_is_public — check whether handle is public
bus1_handle_ref — acquire object reference
bus1_handle_unref — release object reference
bus1_handle_acquire — acquire weak/strong reference
bus1_handle_release — release weak/strong reference
bus1_handle_release_n — release multiple references
bus1_handle_new_anchor — allocate new anchor handle
bus1_handle_new_remote — allocate new remote handle
bus1_handle_free — free handle
bus1_handle_acquire_owner — acquire owner of a handle
bus1_handle_ref_by_other — lookup handle on a peer
bus1_handle_acquire_locked — acquire strong reference
bus1_handle_acquire_slow — slow-path of handle acquisition
bus1_handle_release_slow — slow-path of handle release
bus1_handle_destroy_locked — stage node destruction
bus1_handle_is_live_at — check whether handle is live at a given time
bus1_handle_import — import handle
bus1_handle_identify — identify handle
bus1_handle_export — export handle
bus1_handle_forget — forget handle
bus1_handle_forget_keep — forget handle but keep rb-tree order
6. Bus1 User
struct bus1_user_usage — usage counters
struct bus1_user_limits — resource limit counters
struct bus1_user — resource accounting for users
bus1_user_modexit — clean up global resources of user accounting
bus1_user_limits_init — initialize resource limit counter
bus1_user_limits_deinit — deinitialize resource limit counter
bus1_user_ref_by_uid — get a user object for a uid
bus1_user_ref — acquire reference
bus1_user_unref — release reference
bus1_user_charge — charge a user resource
bus1_user_discharge — discharge a user resource
bus1_user_charge_quota — charge quota resources
bus1_user_discharge_quota — discharge quota resources
bus1_user_commit_quota — commit quota resources
7. Bus1 Active Reference
struct bus1_active — active references
bus1_active_acquire — acquire active reference
bus1_active_release — release active reference
bus1_active_init_private — initialize object
bus1_active_deinit — destroy object
bus1_active_is_new — check whether object is new
bus1_active_is_active — check whether object is active
bus1_active_is_deactivated — check whether object was deactivated
bus1_active_is_drained — check whether object is drained
bus1_active_activate — activate object
bus1_active_deactivate — deactivate object
bus1_active_drain — drain active references
bus1_active_cleanup — cleanup drained object
bus1_active_lockdep_acquired — acquire lockdep reader
bus1_active_lockdep_released — release lockdep reader
8. Bus1 Fixed List
struct bus1_flist — fixed list
bus1_flist_inline_size — calculate required inline size
bus1_flist_init — initialize an flist
bus1_flist_deinit — deinitialize an flist
bus1_flist_next — flist iterator
bus1_flist_walk — walk flist in batches
bus1_flist_populate — populate an flist
bus1_flist_new — allocate new flist
bus1_flist_free — free flist
9. Bus1 Pool
struct bus1_pool_slice — pool slice
struct bus1_pool — client pool
bus1_pool_slice_is_public — check whether a slice is public
bus1_pool_init — create memory pool
bus1_pool_deinit — destroy pool
bus1_pool_alloc — allocate memory
bus1_pool_release_kernel — release kernel-owned slice reference
bus1_pool_publish — publish a slice
bus1_pool_release_user — release a public slice
bus1_pool_flush — flush all user references
bus1_pool_mmap — mmap the pool
bus1_pool_write_iovec — copy user memory to a slice
bus1_pool_write_kvec — copy kernel memory to a slice
10. Bus1 Queue
struct bus1_queue_node — node into message queue
struct bus1_queue — message queue
bus1_queue_node_init — initialize queue node
bus1_queue_node_deinit — destroy queue node
bus1_queue_node_get_type — query node type
bus1_queue_node_get_timestamp — query node timestamp
bus1_queue_node_is_queued — check whether a node is queued
bus1_queue_node_is_staging — check whether a node is marked staging
bus1_queue_tick — increment queue clock
bus1_queue_sync — sync queue clock
bus1_queue_is_readable_rcu — check whether a queue is readable
bus1_queue_compare — comparator for queue ordering
bus1_queue_init — initialize queue
bus1_queue_deinit — destroy queue
bus1_queue_flush — flush message queue
bus1_queue_stage — stage queue entry with fresh timestamp
bus1_queue_commit_staged — commit staged queue entry with new timestamp
bus1_queue_commit_unstaged — commit unstaged queue entry with new timestamp
bus1_queue_commit_synthetic — commit synthetic entry
bus1_queue_remove — remove entry from queue
bus1_queue_peek — peek first available entry

Chapter 1. Bus1 Overview

bus1 is a local IPC system, which provides a decentralized infrastructure to share objects between local peers. The main building blocks are nodes and handles. Nodes represent objects of a local peer, while handles represent descriptors that point to a node. Nodes can be created and destroyed by any peer, and they will always remain owned by their respective creator. Handles, on the other hand, are used to refer to nodes and can be passed around with messages as auxiliary data. Whenever a handle is transferred, the receiver will get its own handle allocated, pointing to the same node as the original handle.

Any peer can send messages directed at one of their handles. This will transfer the message to the owner of the node the handle points to. If a peer does not possess a handle to a given node, it cannot send a message to that node. That is, handles provide exclusive access management. Anyone who has somehow acquired a handle to a node is privileged to pass this handle on to other peers. As such, access management is transitive. Once a peer has acquired a handle, it cannot be revoked. However, a node owner can, at any time, destroy a node. This will effectively unbind all existing handles to that node on any peer, notifying each one of the destruction.

Unlike nodes and handles, peers cannot be addressed directly. In fact, peers are completely disconnected entities. A peer is merely an anchor for a set of nodes and handles, together with an incoming message queue for any of those. Whether multiple nodes are part of the same peer or of different peers does not affect the remote view of them. Peers exist solely as a management entity and command dispatcher for local processes.

The set of actors on a system is completely decentralized. There is no global component involved that provides a central registry or discovery mechanism. Furthermore, communication between peers only involves those peers, and does not affect any other peer in any way. No global communication lock is taken. However, any communication is still globally ordered, including unicasts, multicasts, and notifications.

Chapter 2. Bus1 Peer

Table of Contents

struct bus1_peer — peer context
bus1_peer_acquire — acquire active reference to peer
bus1_peer_release — release an active reference
bus1_peer_new — allocate new peer
bus1_peer_free — destroy peer
bus1_peer_ioctl — handle peer ioctls

A peer context provides access to the bus1 system. A peer itself is not a routable entity, but rather only a local anchor to serve as gateway to the bus. To participate on the bus, you need to allocate a peer. This peer manages all your state on the bus, including all allocated nodes, owned handles, incoming messages, and more.

A peer is split into three sections:

- A static section that is initialized at peer creation and never changes.
- A peer-local section that is only ever accessed by ioctls done by the peer itself.
- A data section that might be accessed by remote peers when interacting with this peer.

All peers on the system operate on the same level. There is no context a peer is linked into. Hence, you can never lock multiple peers at the same time. Instead, peers provide active references. Before performing an operation on a peer, an active reference must be acquired and held for as long as the operation is in progress. When done, the reference is released again. When a peer is disconnected, no more active references can be acquired, and any outstanding operation is waited for before the peer is destroyed.

In addition to active references, there are two locks: a peer-local lock and a data lock. The peer-local lock is used to synchronize operations done by the peer itself. It is never acquired by a remote peer. The data lock protects the data of the peer, which might be modified by remote peers. The data lock nests underneath the local lock. Furthermore, data-lock critical sections must be kept small and must never block indefinitely: remote peers might wait on the data lock, so they must be able to rely on not being DoSed. The local peer lock, however, is private to the peer itself; no such restrictions apply. It is mostly used to give the impression of atomic operations (i.e., making the API appear consistent and coherent).
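
As an orientation, the acquire/hold/release pattern described above might look like the following caller-side sketch. It is illustrative only; some_operation() is a hypothetical helper, not part of bus1, and the error code is merely an example.

/* Illustrative sketch: guard an operation with an active reference. */
static int sketch_peer_op(struct bus1_peer *peer)
{
        int r;

        if (!bus1_peer_acquire(peer))
                return -ESHUTDOWN; /* never activated, or already deactivated */

        r = some_operation(peer); /* hypothetical work done on the peer */

        bus1_peer_release(peer);
        return r;
}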

struct bus1_peer — peer context

Synopsis

struct bus1_peer {
  u64 id;
  u64 flags;
  const struct cred * cred;
  struct pid_namespace * pid_ns;
  struct bus1_user * user;
  struct rcu_head rcu;
  wait_queue_head_t waitq;
  struct bus1_active active;
  struct dentry * debugdir;
  struct {...} local;
};  

Members

id

peer ID

flags

peer flags

cred

pinned credentials

pid_ns

pinned pid-namespace

user

pinned user

rcu

rcu-delayed kfree of peer

waitq

peer wide wait queue

active

active references

debugdir

debugfs root of this peer, or NULL/ERR_PTR

local

handle ID allocator


bus1_peer_acquire — acquire active reference to peer

Synopsis

struct bus1_peer * bus1_peer_acquire (struct bus1_peer * peer);

Arguments

peer

peer to operate on, or NULL

Description

Acquire a new active reference to the given peer. If the peer was not activated yet, or if it was already deactivated, this will fail.

If NULL is passed, this is a no-op.

Return

Pointer to peer, NULL on failure.


bus1_peer_release — release an active reference

Synopsis

struct bus1_peer * bus1_peer_release (struct bus1_peer * peer);

Arguments

peer

handle to release, or NULL

Description

This releases an active reference to a peer, acquired previously via bus1_peer_acquire.

If NULL is passed, this is a no-op.

Return

NULL is returned.


bus1_peer_new — allocate new peer

Synopsis

struct bus1_peer * bus1_peer_new ( void);

Arguments

void

no arguments

Description

Allocate a new peer. It is immediately activated and ready for use. It is not linked into any context. The caller gets exclusive access to the peer object on success.

Note that the peer is opened on behalf of 'current'. That is, it pins its credentials and namespaces.

Return

Pointer to peer, ERR_PTR on failure.


bus1_peer_free — destroy peer

Synopsis

struct bus1_peer * bus1_peer_free (struct bus1_peer * peer);

Arguments

peer

peer to destroy, or NULL

Description

Destroy a peer object that was previously allocated via bus1_peer_new. This synchronously waits for any outstanding operations on this peer to finish, then releases all linked resources and deallocates the peer in an rcu-delayed manner.

If NULL is passed, this is a no-op.

Return

NULL is returned.


bus1_peer_ioctl — handle peer ioctls

Synopsis

long bus1_peer_ioctl (struct file * file, unsigned int cmd, unsigned long arg);

Arguments

file

file the ioctl is called on

cmd

ioctl command

arg

ioctl argument

Description

This handles the given ioctl (cmd+arg) on a peer. This expects the peer to be stored in the private_data field of file.

Multiple ioctls can be called in parallel just fine. No locking is needed.

Return

0 on success, negative error code on failure.

Chapter 3. Bus1 Message

Table of Contents

struct bus1_factory — message factory
struct bus1_message — data messages
bus1_message_ref — acquire object reference
bus1_message_unref — release object reference
bus1_factory_new — create new message factory
bus1_factory_free — destroy message factory
bus1_factory_seal — charge and commit local resources
bus1_factory_instantiate — instantiate a message from a factory
bus1_message_free — destroy message
bus1_message_stage — stage message
bus1_message_install — install message payload into target process

XXX

struct bus1_factory — message factory

Synopsis

struct bus1_factory {
  struct bus1_peer * peer;
  struct bus1_cmd_send * param;
  const struct cred * cred;
  struct pid * pid;
  struct pid * tid;
  bool on_stack:1;
  bool has_secctx:1;
  size_t length_vecs;
  size_t n_vecs;
  size_t n_handles;
  size_t n_handles_charge;
  size_t n_files;
  u32 n_secctx;
  struct iovec * vecs;
  struct file ** files;
  char * secctx;
  struct bus1_flist handles[];
};  

Members

peer

sending peer

param

factory parameters

cred

sender credentials

pid

sender PID

tid

sender TID

on_stack

whether object lives on stack

has_secctx

whether secctx has been set

length_vecs

total length of data in vectors

n_vecs

number of vectors

n_handles

number of handles

n_handles_charge

number of handles to charge on commit

n_files

number of files

n_secctx

length of secctx

vecs

vector array

files

file array

secctx

allocated secctx

handles[]

handle array


struct bus1_message — data messages

Synopsis

struct bus1_message {
  struct kref ref;
  struct bus1_queue_node qnode;
  struct bus1_handle * dst;
  struct bus1_user * user;
  u64 flags;
  uid_t uid;
  gid_t gid;
  pid_t pid;
  pid_t tid;
  size_t n_bytes;
  size_t n_handles;
  size_t n_handles_charge;
  size_t n_files;
  size_t n_secctx;
  struct bus1_pool_slice * slice;
  struct file ** files;
  struct bus1_flist handles[];
};  

Members

ref

reference counter

qnode

embedded queue node

dst

destination handle

user

sending user

flags

message flags

uid

sender UID

gid

sender GID

pid

sender PID

tid

sender TID

n_bytes

number of user-bytes transmitted

n_handles

number of handles transmitted

n_handles_charge

number of handle charges

n_files

number of files transmitted

n_secctx

number of bytes of security context transmitted

slice

actual message data

files

passed file descriptors

handles[]

passed handles


bus1_message_ref — acquire object reference

Synopsis

struct bus1_message * bus1_message_ref (struct bus1_message * m);

Arguments

m

message to operate on, or NULL

Description

This acquires a single reference to m. The caller must already hold a reference when calling this.

If m is NULL, this is a no-op.

Return

m is returned.


bus1_message_unref — release object reference

Synopsis

struct bus1_message * bus1_message_unref (struct bus1_message * m);

Arguments

m

message to operate on, or NULL

Description

This releases a single object reference to m. If the reference counter drops to 0, the message is destroyed.

If m is NULL, this is a no-op.

Return

NULL is returned.


bus1_factory_new — create new message factory

Synopsis

struct bus1_factory * bus1_factory_new (struct bus1_peer * peer, struct bus1_cmd_send * param, void * stack, size_t n_stack);

Arguments

peer

peer to operate as

param

factory parameters

stack

optional stack for factory, or NULL

n_stack

size of space at stack

Description

This allocates a new message factory. It imports data from param and prepares the factory for a transaction. From this factory, messages can be instantiated. This is used both for unicasts and multicasts.

If stack is given, this tries to place the factory on the specified stack space. The caller must guarantee that the factory does not outlive the stack frame. If this is not wanted, pass 0 as n_stack. In either case, if the stack frame is too small, this will allocate the factory on the heap.

Return

Pointer to factory, or ERR_PTR on failure.


bus1_factory_free — destroy message factory

Synopsis

struct bus1_factory * bus1_factory_free (struct bus1_factory * f);

Arguments

f

factory to operate on, or NULL

Description

This destroys the message factory f, previously created via bus1_factory_new. All pinned resources are freed. Messages created via the factory are unaffected.

If f is NULL, this is a no-op.

Return

NULL is returned.


bus1_factory_seal — charge and commit local resources

Synopsis

int bus1_factory_seal (struct bus1_factory * f);

Arguments

f

factory to use

Description

The factory needs to pin and possibly create local peer resources. This commits those resources. You should call this after you have instantiated all messages, since it cannot easily be undone.

Return

0 on success, negative error code on failure.


bus1_factory_instantiate — instantiate a message from a factory

Synopsis

struct bus1_message * bus1_factory_instantiate (struct bus1_factory * f, struct bus1_handle * handle, struct bus1_peer * peer);

Arguments

f

factory to use

handle

destination handle

peer

destination peer

Description

This instantiates a new message targeted at handle, based on the plans in the message factory f.

The newly created message is not linked into any contexts, but is available for free use to the caller.

Return

Pointer to new message, or ERR_PTR on failure.
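
Putting the factory calls together, building a single message might look roughly like the sketch below. This is not the actual bus1 send path; error handling is reduced to a minimum, and the destination handle/peer pair is assumed to be pinned by the caller.

/* Illustrative sketch: build one message from a bus1_cmd_send parameter set. */
static struct bus1_message *
sketch_build_message(struct bus1_peer *peer, struct bus1_cmd_send *param,
                     struct bus1_handle *dst_handle, struct bus1_peer *dst_peer)
{
        struct bus1_factory *f;
        struct bus1_message *m;
        int r;

        f = bus1_factory_new(peer, param, NULL, 0); /* heap-allocated factory */
        if (IS_ERR(f))
                return ERR_CAST(f);

        m = bus1_factory_instantiate(f, dst_handle, dst_peer);
        if (IS_ERR(m)) {
                bus1_factory_free(f);
                return m;
        }

        r = bus1_factory_seal(f); /* commit local resources after instantiation */
        bus1_factory_free(f);     /* messages outlive the factory */
        if (r < 0) {
                bus1_message_unref(m);
                return ERR_PTR(r);
        }

        return m;
}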


bus1_message_free — destroy message

Synopsis

void bus1_message_free (struct kref * k);

Arguments

k

kref belonging to a message

Description

This frees the message belonging to the reference counter k. It is supposed to be used with kref_put. See bus1_message_unref. Like all queue nodes, the memory deallocation is rcu-delayed.


bus1_message_stage — stage message

Synopsis

void bus1_message_stage (struct bus1_message * m, struct bus1_tx * tx);

Arguments

m

message to operate on

tx

transaction to stage on

Description

This acquires all resources of the message m and then stages the message on tx. Like all stage operations, this cannot be undone. Hence, you must make sure you can continue to commit the transaction without erroring-out in between.

This consumes the caller's reference on m, plus the active reference on the destination peer.


bus1_message_install — install message payload into target process

Synopsis

int bus1_message_install (struct bus1_message * m, struct bus1_cmd_recv * param);

Arguments

m

message to operate on

param

-- undescribed --

Description

This installs the payload FDs and handles of the message m into the receiving peer and the calling process. Handles are always installed, FDs are only installed if explicitly requested via param.

Return

0 on success, negative error code on failure.

Chapter 4. Bus1 Transaction

Table of Contents

enum bus1_tx_bits — transaction flags
struct bus1_tx — transaction context
bus1_tx_init — initialize transaction context
bus1_tx_deinit — deinitialize transaction context
bus1_tx_stage_sync — stage message
bus1_tx_stage_later — postpone message
bus1_tx_join — HIC SUNT DRACONES!
bus1_tx_commit — commit transaction


enum bus1_tx_bits — transaction flags

Synopsis

enum bus1_tx_bits {
  BUS1_TX_BIT_SEALED
};  

Constants

BUS1_TX_BIT_SEALED

The transaction is sealed, no new messages can be added to the transaction. The commit of all staged messages is ongoing.


struct bus1_tx — transaction context

Synopsis

struct bus1_tx {
  struct bus1_peer * origin;
  struct bus1_queue_node * sync;
  struct bus1_queue_node * async;
  struct bus1_queue_node * postponed;
  unsigned long flags;
  u64 timestamp;
  u64 async_ts;
};  

Members

origin

origin of this transaction

sync

unlocked list of staged messages

async

locked list of staged messages

postponed

unlocked list of unstaged messages

flags

transaction flags

timestamp

unlocked timestamp of this transaction

async_ts

locked timestamp cache of async list


bus1_tx_init — initialize transaction context

Synopsis

void bus1_tx_init (struct bus1_tx * tx, struct bus1_peer * origin);

Arguments

tx

transaction context to operate on

origin

origin of this transaction

Description

This initializes a transaction context. The initiating peer must be pinned by the caller for the entire lifetime of tx (until bus1_tx_deinit is called) and given as origin.


bus1_tx_deinit — deinitialize transaction context

Synopsis

void bus1_tx_deinit (struct bus1_tx * tx);

Arguments

tx

transaction context to operate on

Description

This deinitializes a transaction context previously created via bus1_tx_init. This is merely for debugging, as no resources are pinned on the transaction. However, if any message was staged on the transaction, it must be committed via bus1_tx_commit before it is deinitialized.


bus1_tx_stage_sync — stage message

Synopsis

void bus1_tx_stage_sync (struct bus1_tx * tx, struct bus1_queue_node * qnode);

Arguments

tx

transaction to operate on

qnode

message to stage

Description

This stages qnode on the transaction tx. It is an error to call this on a qnode that is already staged. The caller must set qnode->owner to the destination peer and acquire it. If it is NULL, it is assumed to be the same as the origin of the transaction.

The caller must hold the data-lock of the destination peer.

This consumes qnode. The caller must increment the required reference counts to make sure qnode does not vanish.


bus1_tx_stage_later — postpone message

Synopsis

void bus1_tx_stage_later (struct bus1_tx * tx, struct bus1_queue_node * qnode);

Arguments

tx

transaction to operate on

qnode

message to postpone

Description

This queues qnode on tx, but does not stage it. It will be staged just before the transaction is committed. This can be used instead of bus1_tx_stage_sync if no immediate staging is necessary, or if the required locks cannot be taken.

It is a caller-error if qnode is already part of a transaction.


bus1_tx_join — HIC SUNT DRACONES!

Synopsis

bool bus1_tx_join (struct bus1_queue_node * whom, struct bus1_queue_node * qnode);

Arguments

whom

whom to join

qnode

who joins

Description

This makes qnode join the ongoing transaction of whom. That is, it is semantically equivalent to calling:

bus1_tx_stage_sync(whom->group, qnode);

However, you can only dereference whom->group while it is still ongoing. Once committed, it might be a stale pointer. This function safely checks for the required conditions and bails out if too late.

The caller must hold the data locks of both peers (the target of whom and of qnode). qnode->owner must not be NULL! Furthermore, qnode must not have been staged into any transaction, yet.

In general, this function is not what you want. There is no guarantee that you can join the transaction, hence a failed join must be expected by the caller and handled gracefully. In that case, this function guarantees that the clock of the holder of qnode is synced with the transaction of whom, and as such is correctly ordered against the transaction.

If this function returns false, you must settle on the transaction before visibly reacting to it. That is, user-space must not see that you failed to join the transaction before the transaction is settled!

Return

True if successful, false if too late.


bus1_tx_commit — commit transaction

Synopsis

u64 bus1_tx_commit (struct bus1_tx * tx);

Arguments

tx

transaction to operate on

Description

Commit a transaction. First all postponed entries are staged, then we commit all messages that belong to this transaction. This works with any number of messages.

Return

This returns the commit timestamp used.
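
As an orientation, a transaction carrying a single already-instantiated message might be driven roughly as sketched below. The peer locking required around staging (see bus1_tx_stage_sync and bus1_message_stage above) is assumed to be handled by the caller and is omitted here.

/* Illustrative sketch: send one instantiated message m from peer. */
static u64 sketch_send_one(struct bus1_peer *peer, struct bus1_message *m)
{
        struct bus1_tx tx;
        u64 ts;

        bus1_tx_init(&tx, peer); /* peer must stay pinned until deinit */

        /*
         * Staging consumes our reference on m as well as the active
         * reference on the destination peer.
         */
        bus1_message_stage(m, &tx);

        ts = bus1_tx_commit(&tx); /* assigns the final commit timestamp */
        bus1_tx_deinit(&tx);

        return ts;
}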

Chapter 5. Bus1 Handle

Table of Contents

enum bus1_handle_bits — node flags
struct bus1_handle — object handle
bus1_handle_is_anchor — check whether handle is an anchor
bus1_handle_is_live — check whether handle is live
bus1_handle_is_public — check whether handle is public
bus1_handle_ref — acquire object reference
bus1_handle_unref — release object reference
bus1_handle_acquire — acquire weak/strong reference
bus1_handle_release — release weak/strong reference
bus1_handle_release_n — release multiple references
bus1_handle_new_anchor — allocate new anchor handle
bus1_handle_new_remote — allocate new remote handle
bus1_handle_free — free handle
bus1_handle_acquire_owner — acquire owner of a handle
bus1_handle_ref_by_other — lookup handle on a peer
bus1_handle_acquire_locked — acquire strong reference
bus1_handle_acquire_slow — slow-path of handle acquisition
bus1_handle_release_slow — slow-path of handle release
bus1_handle_destroy_locked — stage node destruction
bus1_handle_is_live_at — check whether handle is live at a given time
bus1_handle_import — import handle
bus1_handle_identify — identify handle
bus1_handle_export — export handle
bus1_handle_forget — forget handle
bus1_handle_forget_keep — forget handle but keep rb-tree order

The object system on a bus is based on 'nodes' and 'handles'. Any peer can allocate new, local objects at any time. The creator automatically becomes the sole owner of the object. References to objects can be passed as payload of messages. The recipient will then gain their own reference to the object as well. Additionally, an object can be the destination of a message, in which case the message is always sent to the original creator (and thus the owner) of the object.

Internally, objects are called 'nodes'. A reference to an object is a 'handle'. Whenever a new node is created, the owner implicitly gains a handle as well. In fact, handles are the only way to refer to a node. The node itself is entirely hidden in the implementation and is visible in the API as an anchor handle.

Whenever a handle is passed as payload of a message, the target peer will gain a handle linked to the same underlying node. This works regardless of whether the sender is the owner of the underlying node, or not.

Each peer can identify all its handles (both owned and un-owned) by a 64-bit integer. The namespace is local to each peer, and the numbers cannot be compared with the numbers of other peers (in fact, they are very likely to clash, but might still refer to *different* underlying nodes). However, if a peer receives a reference to the same node multiple times, the resulting handle will be the same. The kernel keeps count of how many references each peer holds on a handle.

If a peer no longer requires a specific handle, it can release it. If the peer releases its last reference to a handle, the handle will be destroyed.

The owner of a node (and *only* the owner) can trigger the destruction of a node (even if other peers still own handles to it). In this case, all peers that own a handle are notified of this fact. Once all handles to a specific node have been released (except for the handle internally pinned in the node itself), the owner of the node is notified of this, so it can potentially destroy both any linked state and the node itself.

Node destruction is fully synchronized with any transaction. That is, a node and all its handles are valid in every message that is transmitted *before* the notification of its destruction. Furthermore, no message after this notification will carry the ID of such a destroyed node. Note that message transactions are asynchronous. That is, there is no unique point in time that a message is synchronized with another message. Hence, whether a specific handle passed with a message is still valid or not, cannot be predicted by the sender, but only by one of the receivers.
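
To make the terminology concrete, the following sketch allocates a node (represented by its anchor handle) for one peer and a remote handle for another peer referring to the same node. It is illustrative only; acquiring, exporting, and importing the handles is left out.

/* Illustrative sketch: a node on 'owner' and a handle to it on 'other'. */
static int sketch_make_node(struct bus1_peer *owner, struct bus1_peer *other)
{
        struct bus1_handle *anchor, *remote;

        anchor = bus1_handle_new_anchor(owner); /* node plus the owner's handle */
        if (IS_ERR(anchor))
                return PTR_ERR(anchor);

        remote = bus1_handle_new_remote(other, anchor); /* same underlying node */
        if (IS_ERR(remote)) {
                bus1_handle_unref(anchor);
                return PTR_ERR(remote);
        }

        /* ... acquire/export the handles as needed ... */

        bus1_handle_unref(remote);
        bus1_handle_unref(anchor);
        return 0;
}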

enum bus1_handle_bits — node flags

Synopsis

enum bus1_handle_bits {
  BUS1_HANDLE_BIT_RELEASED,
  BUS1_HANDLE_BIT_DESTROYED
};  

Constants

BUS1_HANDLE_BIT_RELEASED

The anchor handle has been released. Any further attach operation will still work, but result in a stale attach, even in case of re-attach of the anchor itself.

BUS1_HANDLE_BIT_DESTROYED

A destruction has already been scheduled for this node.


struct bus1_handle — object handle

Synopsis

struct bus1_handle {
  struct kref ref;
  atomic_t n_weak;
  atomic_t n_user;
  struct bus1_peer * holder;
  struct bus1_handle * anchor;
  struct bus1_handle * tlink;
  struct rb_node rb_to_peer;
  u64 id;
  struct bus1_queue_node qnode;
  union {unnamed_union};
};  

Members

ref

object reference counter

n_weak

number of weak references

n_user

number of user references

holder

holder of this handle

anchor

anchor handle

tlink

singly-linked list for free use

rb_to_peer

rb-link into peer by ID

id

current ID

qnode

queue node for notifications

{unnamed_union}

anonymous


bus1_handle_is_anchor — check whether handle is an anchor

Synopsis

bool bus1_handle_is_anchor (struct bus1_handle * h);

Arguments

h

handle to check

Description

This checks whether h is an anchor. That is, h was created via bus1_handle_new_anchor, rather than via bus1_handle_new_remote.

Return

True if it is an anchor, false if not.


bus1_handle_is_live — check whether handle is live

Synopsis

bool bus1_handle_is_live (struct bus1_handle * h);

Arguments

h

handle to check

Description

This checks whether the given handle is still live. That is, its anchor was not destroyed, yet.

Return

True if it is live, false if already destroyed.


bus1_handle_is_public — check whether handle is public

Synopsis

bool bus1_handle_is_public (struct bus1_handle * h);

Arguments

h

handle to check

Description

This checks whether the given handle is public. That is, it was exported to user-space and at least one public reference is left.

Return

True if it is public, false if not.


bus1_handle_ref — acquire object reference

Synopsis

struct bus1_handle * bus1_handle_ref (struct bus1_handle * h);

Arguments

h

handle to operate on, or NULL

Description

This acquires an object reference to h. The caller must already hold a reference. Otherwise, the behavior is undefined.

If NULL is passed, this is a no-op.

Return

h is returned.


bus1_handle_unref — release object reference

Synopsis

struct bus1_handle * bus1_handle_unref (struct bus1_handle * h);

Arguments

h

handle to operate on, or NULL

Description

This releases an object reference. If the reference count drops to 0, the object is released (rcu-delayed).

If NULL is passed, this is a no-op.

Return

NULL is returned.


bus1_handle_acquire — acquire weak/strong reference

Synopsis

struct bus1_handle * bus1_handle_acquire (struct bus1_handle * h, bool strong);

Arguments

h

handle to operate on, or NULL

strong

whether to acquire a strong reference

Description

This acquires a weak/strong reference to the node h is attached to. This always succeeds. However, if a conflict is detected, h is unreferenced and the conflicting handle is returned (with an object reference taken and strong reference acquired).

If NULL is passed, this is a no-op.

Return

Pointer to the acquired handle is returned.
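
Since a conflict replaces the passed handle, callers typically reassign the return value, roughly as in this fragment (h is assumed to be an object reference owned by the caller):

/* On conflict, h is dropped and replaced by the conflicting handle. */
h = bus1_handle_acquire(h, true);
/* ... use the node through h ... */
bus1_handle_release(h, true);
h = bus1_handle_unref(h); /* drop the object reference when done */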


bus1_handle_release — release weak/strong reference

Synopsis

struct bus1_handle * bus1_handle_release (struct bus1_handle * h, bool strong);

Arguments

h

handle to operate on, or NULL

strong

whether to release a strong reference

Description

This releases a weak or strong reference to the node h is attached to.

If NULL is passed, this is a no-op.

Return

NULL is returned.


bus1_handle_release_n — release multiple references

Synopsis

struct bus1_handle * bus1_handle_release_n (struct bus1_handle * h, unsigned int n, bool strong);

Arguments

h

handle to operate on, or NULL

n

number of references to release

strong

whether to release strong references

Description

This releases n weak or strong references to the node h is attached to.

If NULL is passed, this is a no-op.

Return

NULL is returned.


bus1_handle_new_anchor — allocate new anchor handle

Synopsis

struct bus1_handle * bus1_handle_new_anchor (struct bus1_peer * holder);

Arguments

holder

peer to set as holder

Description

This allocates a new, fresh, anchor handle for free use to the caller.

Return

Pointer to handle, or ERR_PTR on failure.


bus1_handle_new_remote — allocate new remote handle

Synopsis

struct bus1_handle * bus1_handle_new_remote (struct bus1_peer * holder, struct bus1_handle * other);

Arguments

holder

peer to set as holder

other

other handle to link to

Description

This allocates a new, fresh, remote handle for free use to the caller. The handle will use the same anchor as other (or other in case it is an anchor).

Return

Pointer to handle, or ERR_PTR on failure.


bus1_handle_free — free handle

Synopsis

void bus1_handle_free (struct kref * k);

Arguments

k

kref of handle to free

Description

This frees the handle belonging to the kref k. It is meant to be used as callback for kref_put. The actual memory release is rcu-delayed so the handle stays around at least until the next grace period.


bus1_handle_acquire_owner — acquire owner of a handle

Synopsis

struct bus1_peer * bus1_handle_acquire_owner (struct bus1_handle * handle);

Arguments

handle

handle to operate on

Description

This tries to acquire the owner of a handle. If the owner is already detached, this will return NULL.

Return

Pointer to owner on success, NULL on failure.


bus1_handle_ref_by_other — lookup handle on a peer

Synopsis

struct bus1_handle * bus1_handle_ref_by_other (struct bus1_peer * peer, struct bus1_handle * handle);

Arguments

peer

peer to lookup handle for

handle

other handle to match for

Description

This looks for a handle held by peer which points to the same node as handle (i.e., it is linked to handle->anchor). If peer does not hold such a handle, this returns NULL. Otherwise, an object reference is acquired and returned as pointer.

The caller must hold an active reference to peer.

Return

Pointer to handle if found, NULL if not found.


bus1_handle_acquire_locked — acquire strong reference

Synopsis

struct bus1_handle * bus1_handle_acquire_locked (struct bus1_handle * handle, bool strong);

Arguments

handle

handle to operate on, or NULL

strong

whether to acquire a strong reference

Description

This is the same as bus1_handle_acquire_slow, but requires the caller to hold the data lock of holder and the owner.

Return

Acquired handle (possibly a conflict).


bus1_handle_acquire_slow — slow-path of handle acquisition

Synopsis

struct bus1_handle * bus1_handle_acquire_slow (struct bus1_handle * handle, bool strong);

Arguments

handle

handle to acquire

strong

whether to acquire a strong reference

Description

This is the slow-path of bus1_handle_acquire. See there for details.

Return

Acquired handle (possibly a conflict).


bus1_handle_release_slow — slow-path of handle release

Synopsis

void bus1_handle_release_slow (struct bus1_handle * handle, bool strong);

Arguments

handle

handle to release

strong

whether to release a strong reference

Description

This is the slow-path of bus1_handle_release. See there for details.


bus1_handle_destroy_locked — stage node destruction

Synopsis

void bus1_handle_destroy_locked (struct bus1_handle * handle, struct bus1_tx * tx);

Arguments

handle

handle to destroy

tx

transaction to use

Description

This stages a destruction on handle. That is, it marks handle as destroyed and stages a release-notification for all live handles via tx. It is the responsibility of the caller to commit tx.

The given handle must be an anchor and not destroyed, yet. Furthermore, the caller must hold the local-lock and data-lock of the owner.


bus1_handle_is_live_at — check whether handle is live at a given time

Synopsis

bool bus1_handle_is_live_at (struct bus1_handle * h, u64 timestamp);

Arguments

h

handle to check

timestamp

timestamp to check

Description

This checks whether the handle h is live at the time of timestamp. The caller must make sure that timestamp was acquired on the clock of the holder of h.

Note that this does not synchronize on the node owner. That is, usually you want to call this at the time of RECV, so it is guaranteed that there is no staging message in front of timestamp. Otherwise, a node owner might acquire a commit-timestamp for the destruction of h lower than timestamp.

The caller must hold the data-lock of the holder of h.

Return

True if live at the given timestamp, false if destroyed.


bus1_handle_import — import handle

Synopsis

struct bus1_handle * bus1_handle_import (struct bus1_peer * peer, u64 id, bool * is_newp);

Arguments

peer

peer to operate on

id

ID of handle

is_newp

store whether handle is new

Description

This searches the ID-namespace of peer for a handle with the given ID. If found, it is referenced, returned to the caller, and is_newp is set to false.

If not found and id is a remote ID, then an error is returned. But if it is a local ID, a new handle is created and placed in the lookup tree. In this case is_newp is set to true.

Return

Pointer to referenced handle is returned.


bus1_handle_identify — identify handle

Synopsis

u64 bus1_handle_identify (struct bus1_handle * h);

Arguments

h

handle to operate on

Description

This returns the ID of h. If no ID was assigned, yet, a new one is picked.

Return

The ID of h is returned.


bus1_handle_export — export handle

Synopsis

void bus1_handle_export (struct bus1_handle * handle);

Arguments

handle

handle to operate on

Description

This exports handle into the ID namespace of its holder. That is, if handle is not linked into the ID namespace yet, it is linked into it.

If handle is already linked, nothing is done.


bus1_handle_forget — forget handle

Synopsis

void bus1_handle_forget (struct bus1_handle * h);

Arguments

h

handle to operate on, or NULL

Description

If h is not public, but linked into the ID-lookup tree, this will remove it from the tree and clear the ID of h. It basically undoes what bus1_handle_import and bus1_handle_export do.

Note that there is no counter in bus1_handle_import or bus1_handle_export. That is, if you call bus1_handle_import multiple times, a single bus1_handle_forget undoes it. It is the caller's responsibility not to release the local-lock randomly, and to properly detect cases where the same handle is used multiple times.
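
A rough sketch of how import and forget pair up on an error path is shown below. The peer-local lock is assumed to be held across both calls, as described above; sketch_validate() and the error code are hypothetical.

/* Illustrative sketch: resolve a user-supplied ID to a handle of 'peer'. */
static struct bus1_handle *sketch_resolve(struct bus1_peer *peer, u64 id)
{
        struct bus1_handle *h;
        bool is_new;

        h = bus1_handle_import(peer, id, &is_new);
        if (IS_ERR(h))
                return h;

        if (sketch_validate(h) < 0) {
                /* undoes a fresh import; a no-op if the handle is public */
                bus1_handle_forget(h);
                bus1_handle_unref(h);
                return ERR_PTR(-ENXIO);
        }

        return h; /* caller owns the object reference */
}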


bus1_handle_forget_keep — forget handle but keep rb-tree order

Synopsis

void bus1_handle_forget_keep (struct bus1_handle * h);

Arguments

h

handle to operate on, or NULL

Description

This is like bus1_handle_forget, but does not modify the ID-namespace rb-tree. That is, the backlink in h is cleared (h->rb_to_peer), but the rb-tree is not rebalanced. As such, you can use it with rbtree_postorder_for_each_entry_safe to drop all entries.

Chapter 6. Bus1 User

Table of Contents

struct bus1_user_usage — usage counters
struct bus1_user_limits — resource limit counters
struct bus1_user — resource accounting for users
bus1_user_modexit — clean up global resources of user accounting
bus1_user_limits_init — initialize resource limit counter
bus1_user_limits_deinit — deinitialize resource limit counter
bus1_user_ref_by_uid — get a user object for a uid
bus1_user_ref — acquire reference
bus1_user_unref — release reference
bus1_user_charge — charge a user resource
bus1_user_discharge — discharge a user resource
bus1_user_charge_quota — charge quota resources
bus1_user_discharge_quota — discharge quota resources
bus1_user_commit_quota — commit quota resources

Different users can communicate via bus1, and many resources are shared between multiple users. The bus1_user object represents the UID of a user, like struct user_struct does in the kernel core. It is used to account global resources, apply limits, and calculate quotas if different UIDs communicate with each other.

All dynamic resources have global per-user limits, which cannot be exceeded by a user. They prevent a single user from exhausting local resources. Each peer that is created is always owned by the user that initialized it. All resources allocated on that peer are accounted on that pinned user. In addition to global resources, there are local limits per peer, which can be controlled by each peer individually (e.g., specifying a maximum pool size). Those local limits allow a user to distribute the globally available resources across its peer instances.

Since bus1 allows communication across UID boundaries, any such transmission of resources must be properly accounted. Bus1 employs dynamic quotas to fairly distribute available resources. Those quotas make sure that available resources of a peer cannot be exhausted by remote UIDs, but are fairly divided among all communicating peers.

struct bus1_user_usage — usage counters

Synopsis

struct bus1_user_usage {
  atomic_t n_slices;
  atomic_t n_handles;
  atomic_t n_bytes;
  atomic_t n_fds;
};  

Members

n_slices

number of used slices

n_handles

number of used handles

n_bytes

number of used bytes

n_fds

number of used fds


struct bus1_user_limits — resource limit counters

Synopsis

struct bus1_user_limits {
  atomic_t n_slices;
  atomic_t n_handles;
  atomic_t n_inflight_bytes;
  atomic_t n_inflight_fds;
  unsigned int max_slices;
  unsigned int max_handles;
  unsigned int max_inflight_bytes;
  unsigned int max_inflight_fds;
  struct idr usages;
};  

Members

n_slices

number of remaining quota for owned slices

n_handles

number of remaining quota for owned handles

n_inflight_bytes

number of remaining quota for inflight bytes

n_inflight_fds

number of remaining quota for inflight FDs

max_slices

maximum number of owned slices

max_handles

maximum number of owned handles

max_inflight_bytes

maximum number of inflight bytes

max_inflight_fds

maximum number of inflight FDs

usages

idr of usage entries per uid


struct bus1_user — resource accounting for users

Synopsis

struct bus1_user {
  struct kref ref;
  kuid_t uid;
  struct mutex lock;
  union {unnamed_union};
};  

Members

ref

reference counter

uid

UID of the user

lock

object lock

{unnamed_union}

anonymous


bus1_user_modexit — clean up global resources of user accounting

Synopsis

void bus1_user_modexit ( void);

Arguments

void

no arguments

Description

This function cleans up any remaining global resources that were allocated by the user accounting helpers. The caller must make sure that no user object is referenced anymore, before calling this. This function just clears caches and verifies nothing is leaked.

This is meant to be called on module-exit.


bus1_user_limits_init — initialize resource limit counter

Synopsis

void bus1_user_limits_init (struct bus1_user_limits * limits, struct bus1_user * source);

Arguments

limits

object to initialize

source

source to initialize from, or NULL

Description

This initializes the resource-limit counter limits. The initial limits are taken from source, if given. If NULL, the global default limits are taken.


bus1_user_limits_deinit — deinitialize resource limit counter

Synopsis

void bus1_user_limits_deinit (struct bus1_user_limits * limits);

Arguments

limits

object to deinitialize

Description

This should be called on destruction of limits. It verifies the correctness of the limits and emits warnings if something went wrong.


bus1_user_ref_by_uid — get a user object for a uid

Synopsis

struct bus1_user * bus1_user_ref_by_uid (kuid_t uid);

Arguments

uid

uid of the user

Description

Find and return the user object for the uid if it exists, otherwise create it first.

Return

A user object for the given uid, ERR_PTR on failure.


bus1_user_ref — acquire reference

Synopsis

struct bus1_user * bus1_user_ref (struct bus1_user * user);

Arguments

user

user to acquire, or NULL

Description

Acquire an additional reference to a user-object. The caller must already own a reference.

If NULL is passed, this is a no-op.

Return

user is returned.


bus1_user_unref — release reference

Synopsis

struct bus1_user * bus1_user_unref (struct bus1_user * user);

Arguments

user

user to release, or NULL

Description

Release a reference to a user-object.

If NULL is passed, this is a no-op.

Return

NULL is returned.


bus1_user_charge — charge a user resource

Synopsis

int bus1_user_charge (atomic_t * global, atomic_t * local, int charge);

Arguments

global

global resource to charge on

local

local resource to charge on

charge

charge to apply

Description

This charges charge on two resource counters. It succeeds only if both charges apply. It is an error to call this with negative charges.

Return

0 on success, negative error code on failure.


bus1_user_discharge — discharge a user resource

Synopsis

void bus1_user_discharge (atomic_t * global, atomic_t * local, int charge);

Arguments

global

global resource to charge on

local

local resource to charge on

charge

charge to apply

Description

This discharges charge on two resource counters. This always succeeds. It is an error to call this with a negative charge.
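
A minimal sketch of the charge/rollback pattern these two helpers support follows; the counter pair is passed in directly, and sketch_do_work() is a hypothetical fallible step.

/* Illustrative sketch: pair bus1_user_charge() with a rollback on failure. */
static int sketch_charged_op(atomic_t *global, atomic_t *local)
{
        int r;

        r = bus1_user_charge(global, local, 1);
        if (r < 0)
                return r; /* quota exhausted on either counter */

        r = sketch_do_work(); /* hypothetical fallible step */
        if (r < 0)
                bus1_user_discharge(global, local, 1); /* roll back the charge */

        return r;
}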


bus1_user_charge_quota — charge quota resources

Synopsis

int bus1_user_charge_quota (struct bus1_user * user, struct bus1_user * actor, struct bus1_user_limits * limits, int n_slices, int n_handles, int n_bytes, int n_fds);

Arguments

user

user to charge on

actor

user to charge as

limits

local limits to charge on

n_slices

number of slices to charge

n_handles

number of handles to charge

n_bytes

number of bytes to charge

n_fds

number of FDs to charge

Description

This charges the given resources on user and limits. It does both local and remote charges. It is all charged for user actor.

Negative charges always succeed. Positive charges might fail if quota is denied. Note that a single call is always atomic, so either all succeed or all fail. Hence, it makes little sense to mix negative and positive charges in a single call.

Return

0 on success, negative error code on failure.


bus1_user_discharge_quota — discharge quota resources

Synopsis

void bus1_user_discharge_quota (struct bus1_user * user, struct bus1_user * actor, struct bus1_user_limits * l_local, int n_slices, int n_handles, int n_bytes, int n_fds);

Arguments

user

user to charge on

actor

user to charge as

l_local

local limits to charge on

n_slices

number of slices to charge

n_handles

number of handles to charge

n_bytes

number of bytes to charge

n_fds

number of FDs to charge

Description

This discharges the given resources on user and limits. It does both local and remote charges. It is all discharged for user actor.


bus1_user_commit_quota — commit quota resources

Synopsis

void bus1_user_commit_quota (struct bus1_user * user, struct bus1_user * actor, struct bus1_user_limits * l_local, int n_slices, int n_handles, int n_bytes, int n_fds);

Arguments

user

user to charge on

actor

user to charge as

l_local

local limits to charge on

n_slices

number of slices to charge

n_handles

number of handles to charge

n_bytes

number of bytes to charge

n_fds

number of FDs to charge

Description

This commits the given resources on user and limits. Committing a quota means discharging the usage objects but leaving the limits untouched.

Chapter 7. Bus1 Active Reference

Table of Contents

struct bus1_active — active references
bus1_active_acquire — acquire active reference
bus1_active_release — release active reference
bus1_active_init_private — initialize object
bus1_active_deinit — destroy object
bus1_active_is_new — check whether object is new
bus1_active_is_active — check whether object is active
bus1_active_is_deactivated — check whether object was deactivated
bus1_active_is_drained — check whether object is drained
bus1_active_activate — activate object
bus1_active_deactivate — deactivate object
bus1_active_drain — drain active references
bus1_active_cleanup — cleanup drained object
bus1_active_lockdep_acquired — acquire lockdep reader
bus1_active_lockdep_released — release lockdep reader

The bus1_active object implements active references. They work similarly to plain object reference counters, but allow disabling any new references from being taken.

Each bus1_active object goes through a set of states:

NEW: Initial state, no active references can be acquired
ACTIVE: Live state, active references can be acquired
DRAINING: Deactivated but lingering, no active references can be acquired
DRAINED: Deactivated and all active references were dropped
RELEASED: Fully drained and synchronously released

Initially, all bus1_active objects are in state NEW. As soon as they're activated, they enter ACTIVE and active references can be acquired. This is the normal, live state. Once the object is deactivated, it enters state DRAINING. No new active references can be acquired, but some threads might still own active references. Once all those are dropped, the object enters state DRAINED. Now the object can be released a *single* time, before it enters state RELEASED and is finished. It cannot be re-used anymore.

Active-references are very useful to track threads that call methods on an object. As long as a method is running, an active reference is held, and as such the object is usually protected from being destroyed. The destructor of the object needs to deactivate *and* drain the object, before releasing resources.

Note that active-references cannot be used to manage their own backing memory. That is, they do not replace normal reference counts.
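
The full lifecycle might look roughly like the sketch below for an object embedding a bus1_active. bus1_active_init is the public macro referred to in bus1_active_init_private below; the cleanup callback and the surrounding object are hypothetical.

/* Illustrative sketch: lifecycle of an embedded bus1_active object. */
struct sketch_object {
        wait_queue_head_t waitq;
        struct bus1_active active;
};

static void sketch_cleanup(struct bus1_active *active, void *userdata)
{
        /* hypothetical: release resources of the parent object */
}

static void sketch_lifecycle(struct sketch_object *o)
{
        init_waitqueue_head(&o->waitq);
        bus1_active_init(&o->active);     /* state NEW */
        bus1_active_activate(&o->active); /* state ACTIVE */

        if (bus1_active_acquire(&o->active)) { /* like down_read_trylock() */
                /* ... call methods on the object ... */
                bus1_active_release(&o->active, &o->waitq);
        }

        bus1_active_deactivate(&o->active);       /* DRAINING: no new references */
        bus1_active_drain(&o->active, &o->waitq); /* wait until DRAINED */
        bus1_active_cleanup(&o->active, &o->waitq, sketch_cleanup, NULL);
        bus1_active_deinit(&o->active);
}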

struct bus1_active — active references

Synopsis

struct bus1_active {
  atomic_t count;
#ifdef CONFIG_DEBUG_LOCK_ALLOC
  struct lockdep_map dep_map;
#endif
};  

Members

count

active reference counter

dep_map

lockdep annotations

Description

This object should be treated like a simple atomic_t. It will only contain more fields in the case of lockdep-enabled compilations.

Users must embed this object into their parent structures and create/destroy it via bus1_active_init and bus1_active_deinit.


bus1_active_acquire — acquire active reference

Synopsis

struct bus1_active * bus1_active_acquire (struct bus1_active * active);

Arguments

active

object to acquire active reference to, or NULL

Description

This acquires an active reference to the passed object. If the object was not activated, yet, or if it was already deactivated, this will fail and return NULL. If a reference was successfully acquired, this will return active.

If NULL is passed, this is a no-op and always returns NULL.

This behaves as a down_read_trylock. Use bus1_active_release to release the reference again and get the matching up_read.

Return

active if reference was acquired, NULL if not.


bus1_active_release — release active reference

Synopsis

struct bus1_active * bus1_active_release (struct bus1_active * active, wait_queue_head_t * waitq);

Arguments

active

object to release active reference of, or NULL

waitq

wait-queue linked to active, or NULL

Description

This releases an active reference that was previously acquired via bus1_active_acquire.

This is a no-op if NULL is passed.

This behaves like an up_read.

Return

NULL is returned.


bus1_active_init_private — initialize object

Synopsis

void bus1_active_init_private (struct bus1_active * active);

Arguments

active

object to initialize

Description

This initializes an active-object. The initial state is NEW, and as such no active reference can be acquired. The object must be activated first.

This is an internal helper. Always use the public bus1_active_init macro which does proper lockdep initialization for private key classes.


bus1_active_deinit — destroy object

Synopsis

void bus1_active_deinit (struct bus1_active * active);

Arguments

active

object to destroy

Description

Destroy an active-object. The object must have been initialized via bus1_active_init, deactivated via bus1_active_deactivate, drained via bus1_active_drain and cleaned via bus1_active_cleanup, before you can destroy it. Alternatively, it can also be destroyed if still in state NEW.

This function only does sanity checks, it does not modify the object itself. There is no allocated memory, so there is nothing to do.


bus1_active_is_new — check whether object is new

Synopsis

bool bus1_active_is_new (struct bus1_active * active);

Arguments

active

object to check

Description

This checks whether the object is new, that is, it was never activated nor deactivated.

Return

True if new, false if not.


bus1_active_is_active — check whether object is active

Synopsis

bool bus1_active_is_active (struct bus1_active * active);

Arguments

active

object to check

Description

This checks whether the given active-object is active. That is, the object was already activated, but not deactivated, yet.

Note that this function does not give any guarantee that the object is still active/inactive at the time this call returns. It only serves as a barrier.

Return

True if active, false if not.


bus1_active_is_deactivated — check whether object was deactivated

Synopsis

bool bus1_active_is_deactivated (struct bus1_active * active);

Arguments

active

object to check

Description

This checks whether the given active-object was already deactivated. That is, the object was actively deactivated (state NEW does *not* count as deactivated) via bus1_active_deactivate.

Once this function returns true, it cannot change again on this object.

Return

True if already deactivated, false if not.


bus1_active_is_drained — check whether object is drained

Synopsis

bool bus1_active_is_drained (struct bus1_active * active);

Arguments

active

object to check

Description

This checks whether the given object was already deactivated and is fully drained. That is, no active references to the object exist, nor can they be acquired, anymore.

Return

True if drained, false if not.


bus1_active_activate — activate object

Synopsis

bool bus1_active_activate (struct bus1_active * active);

Arguments

active

object to activate

Description

This activates the given object, if it is still in state NEW. Otherwise, it is a no-op (and the object might already be deactivated).

Once this returns successfully, active references can be acquired.

Return

True if this call activated it, false if it was already activated, or deactivated.


bus1_active_deactivate — deactivate object

Synopsis

bool bus1_active_deactivate (struct bus1_active * active);

Arguments

active

object to deactivate

Description

This deactivates the given object, if not already done by someone else. Once this returns, no new active references can be acquired.

Return

True if this call deactivated the object, false if it was already deactivated by someone else.


bus1_active_drain — drain active references

Synopsis

void bus1_active_drain (struct bus1_active * active, wait_queue_head_t * waitq);

Arguments

active

object to drain

waitq

wait-queue linked to active

Description

This waits for all active-references on active to be dropped. It uses the passed wait-queue to sleep. It must be the same wait-queue that is used when calling bus1_active_release.

The caller must guarantee that bus1_active_deactivate was called before.

This function can be safely called in parallel on multiple CPUs.

Semantically (and also enforced by lockdep), this call behaves like a down_write, followed by an up_write, on this active object.


bus1_active_cleanup — cleanup drained object

Synopsis

bool bus1_active_cleanup (struct bus1_active * active, wait_queue_head_t * waitq, void (*cleanup) (struct bus1_active *, void *), void * userdata);

Arguments

active

object to release

waitq

wait-queue linked to active, or NULL

cleanup

cleanup callback, or NULL

userdata

userdata for callback

Description

This performs the final object cleanup. The caller must guarantee that the object is drained, by calling bus1_active_drain.

This function invokes the passed cleanup callback on the object. However, it guarantees that this is done exactly once. If there are multiple parallel callers, this will pick one randomly and make all others wait until it is done. If you call this after it was already cleaned up, this is a no-op and only serves as a barrier.

If waitq is NULL, the wait is skipped and the call returns immediately. In this case, another thread has entered before, but there is no guarantee that it has finished executing the cleanup callback, yet.

If waitq is non-NULL, this call behaves like a down_write, followed by an up_write, just like bus1_active_drain. If waitq is NULL, this rather behaves like a down_write_trylock, optionally followed by an up_write.

Return

True if this is the thread that released it, false otherwise.
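
The calls above tie together into a fairly rigid owner-side lifecycle: initialize, activate, hand out references, deactivate, drain, clean up. The following is a minimal sketch of that flow, assuming the bus1_active_* declarations documented in this chapter; the surrounding my_object type, its wait-queue, and my_object_cleanup are hypothetical placeholders, not part of the bus1 sources:

    /* Sketch only: 'my_object' and 'my_object_cleanup' are hypothetical. */
    struct my_object {
            struct bus1_active active;
            wait_queue_head_t waitq;
    };

    static void my_object_cleanup(struct bus1_active *active, void *userdata)
    {
            /* final teardown; runs exactly once, after the object is drained */
    }

    static void my_object_lifecycle(struct my_object *o)
    {
            bus1_active_init_private(&o->active);
            init_waitqueue_head(&o->waitq);

            /* make the object live; active references can now be acquired */
            bus1_active_activate(&o->active);

            if (bus1_active_acquire(&o->active)) {
                    /* ... use the object ... */
                    bus1_active_release(&o->active, &o->waitq);
            }

            /* teardown: refuse new references, wait for old ones, clean up */
            bus1_active_deactivate(&o->active);
            bus1_active_drain(&o->active, &o->waitq);
            bus1_active_cleanup(&o->active, &o->waitq, my_object_cleanup, NULL);
            bus1_active_deinit(&o->active);
    }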


bus1_active_lockdep_acquired — acquire lockdep reader

Synopsis

void bus1_active_lockdep_acquired (struct bus1_active * active);

Arguments

active

object to acquire lockdep reader of, or NULL

Description

Whenever you acquire an active reference via bus1_active_acquire, this function is implicitly called afterwards. It enables lockdep annotations and tells lockdep that you acquired the active reference.

However, lockdep cannot support arbitrary depths; hence, we allow temporarily dropping the lockdep annotation via bus1_active_lockdep_released, and re-acquiring it later via bus1_active_lockdep_acquired.

Example

   If you need to pin a large number of objects, you would acquire each of
   them individually via bus1_active_acquire. Then you would perform state
   tracking, etc. on that object. Before you continue with the next, you call
   bus1_active_lockdep_released, to pretend you released the lock (but you
   still retain your active reference). Now you continue with pinning the next
   object, and so on, until you pinned all objects you need.

   If you later need to access one of your pinned objects (or want to release
   them eventually), you call bus1_active_lockdep_acquired before accessing
   the object. This enables the lockdep annotations again. This cannot fail,
   ever. You still own the active reference at all times. Once you are done
   with the single object, you either release your entire active reference via
   bus1_active_release, or you temporarily disable lockdep via
   bus1_active_lockdep_released again, in case you need the pinned object
   again later.

   Note that you can acquire multiple active references just fine. These
   lockdep helpers are only provided for the case where you need to acquire a
   *large* number of references at the same time. Lockdep is usually limited
   to a depth of 64, so you cannot hold more locks at the same time.
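As a rough illustration of the pattern just described, the sketch below pins an array of active-objects while dropping the lockdep annotation between acquisitions. The helpers pin_all and unpin_one, and the objs array, are hypothetical; only the bus1_active_* calls follow the API documented in this chapter:

    /* Sketch: pin many objects without exceeding lockdep's depth limit. */
    static void pin_all(struct bus1_active **objs, size_t n)
    {
            size_t i;

            for (i = 0; i < n; ++i) {
                    if (!bus1_active_acquire(objs[i]))
                            continue; /* object already deactivated */
                    /* ... per-object state tracking ... */
                    /* drop only the lockdep annotation, keep the reference */
                    bus1_active_lockdep_released(objs[i]);
            }
    }

    static void unpin_one(struct bus1_active *obj, wait_queue_head_t *waitq)
    {
            /* re-enable the lockdep annotation before touching the object */
            bus1_active_lockdep_acquired(obj);
            /* ... access the pinned object ... */
            bus1_active_release(obj, waitq);
    }
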

bus1_active_lockdep_released — release lockdep reader

Synopsis

void bus1_active_lockdep_released (struct bus1_active * active);

Arguments

active

object to release lockdep reader of, or NULL

Description

This is the counterpart of bus1_active_lockdep_acquired. See its documentation for details.

Chapter 8. Bus1 Fixed List

Table of Contents

struct bus1_flist — fixed list
bus1_flist_inline_size — calculate required inline size
bus1_flist_init — initialize an flist
bus1_flist_deinit — deinitialize an flist
bus1_flist_next — flist iterator
bus1_flist_walk — walk flist in batches
bus1_flist_populate — populate an flist
bus1_flist_new — allocate new flist
bus1_flist_free — free flist

This implements a fixed-size list called bus1_flist. The size of the list must be constant over the lifetime of the list. The list can hold one arbitrary pointer per node.

Fixed lists are a combination of a linked list and a static array. That is, fixed lists behave like linked lists (no random access, but arbitrary size), but compare in speed with arrays (consecutive accesses are fast). Unlike fixed arrays, fixed lists can hold huge numbers of elements without requiring vmalloc, relying solely on small-size kmalloc allocations.

Internally, fixed lists are a singly-linked list of static arrays. This guarantees that iterations behave almost like on an array, except when crossing a batch-border.

Fixed lists can replace fixed-size arrays whenever you need to support large numbers of elements but do not need random access. Fixed lists have ALMOST the same memory requirements as fixed-size arrays, except for one pointer of state per 'BUS1_FLIST_BATCH' elements. If only a small number of elements is stored in a fixed list (i.e., only one batch is required), its memory requirements and iteration times are equivalent to those of fixed-size arrays.
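
For orientation, a minimal heap-allocated usage sketch, assuming the bus1_flist declarations documented in this chapter (the function name and the element count are arbitrary placeholders):

    /* Sketch: allocate an flist for n pointers, fill it, and free it. */
    static int flist_example(void)
    {
            size_t pos, n = 1024;
            struct bus1_flist *e, *list;

            list = bus1_flist_new(n, GFP_KERNEL);
            if (!list)
                    return -ENOMEM;

            /* each node stores one arbitrary pointer in e->ptr */
            for (pos = 0, e = list; pos < n; e = bus1_flist_next(e, &pos))
                    e->ptr = NULL;

            list = bus1_flist_free(list, n); /* returns NULL */
            return 0;
    }
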

struct bus1_flist — fixed list

Synopsis

struct bus1_flist {
  union {unnamed_union};
};  

Members

{unnamed_union}

anonymous


bus1_flist_inline_size — calculate required inline size

Synopsis

size_t bus1_flist_inline_size (size_t n);

Arguments

n

number of entries

Description

When allocating storage for an flist, this calculates the size of the initial array in bytes. Use bus1_flist_new directly if you want to allocate an flist on the heap. This helper is only needed if you embed an flist into another struct like this:

struct foo { ... struct bus1_flist list[]; };

In that case the flist must be the last element, and the size in bytes required by it is returned by this function.

The inline-size of an flist is always bound to a fixed maximum. That is, regardless of n, this will always return a reasonable number that can be allocated via kmalloc.

Return

Size in bytes required for the initial batch of an flist.
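
For the embedded case, a sketch could look like the following; struct foo and its constructor are hypothetical, and the bus1_flist_* helpers used here are documented further down in this chapter:

    /* Sketch: embed an flist as the last member of another object. */
    struct foo {
            size_t n;
            struct bus1_flist list[]; /* must be the last member */
    };

    static struct foo *foo_new(size_t n)
    {
            struct foo *f;
            int r;

            f = kmalloc(sizeof(*f) + bus1_flist_inline_size(n), GFP_KERNEL);
            if (!f)
                    return ERR_PTR(-ENOMEM);

            f->n = n;
            bus1_flist_init(f->list, n);

            /* allocate any batches beyond the inline one */
            r = bus1_flist_populate(f->list, n, GFP_KERNEL);
            if (r < 0) {
                    bus1_flist_deinit(f->list, n);
                    kfree(f);
                    return ERR_PTR(r);
            }
            return f;
    }

    static void foo_free(struct foo *f)
    {
            bus1_flist_deinit(f->list, f->n);
            kfree(f);
    }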


bus1_flist_init — initialize an flist

Synopsis

void bus1_flist_init (struct bus1_flist * list, size_t n);

Arguments

list

flist to initialize

n

number of entries

Description

This initializes an flist of size n. It does NOT preallocate the memory, but only initializes list in a way that bus1_flist_deinit can be called on it. Use bus1_flist_populate to populate the flist.

This is only needed if your backing memory of list is shared with another object. If possible, use bus1_flist_new to allocate an flist on the heap and avoid this dance.


bus1_flist_deinit — deinitialize an flist

Synopsis

void bus1_flist_deinit (struct bus1_flist * list, size_t n);

Arguments

list

flist to deinitialize

n

number of entries

Description

This deallocates an flist and releases all resources. If already deinitialized, this is a no-op. This is only needed if you called bus1_flist_populate.


bus1_flist_next — flist iterator

Synopsis

struct bus1_flist * bus1_flist_next (struct bus1_flist * iter, size_t * pos);

Arguments

iter

iterator

pos

current position

Description

This advances an flist iterator by one position. iter must point to the current position, and the new position is returned by this function. pos must point to a variable that contains the current index position. That is, pos must be initialized to 0 and iter to the flist head.

Neither pos nor iter may be modified by anyone but this helper. Within the loop body, you can use iter->ptr to access the current element.

This iterator is normally used like this:

    size_t pos, n = 128;
    struct bus1_flist *e, *list = bus1_flist_new(n, GFP_KERNEL);

    ...

    for (pos = 0, e = list; pos < n; e = bus1_flist_next(e, &pos)) {
            ... access e->ptr ...
    }

Return

Next iterator position.


bus1_flist_walk — walk flist in batches

Synopsis

size_t bus1_flist_walk (struct bus1_flist * list, size_t n, struct bus1_flist ** iter, size_t * pos);

Arguments

list

list to walk

n

number of entries

iter

iterator

pos

current position

Description

This walks an flist in batches of size up to BUS1_FLIST_BATCH. It is normally used like this:

    size_t pos, z, n = 65536;
    struct bus1_flist *e, *list = bus1_flist_new(n, GFP_KERNEL);

    ...

    pos = 0;
    while ((z = bus1_flist_walk(list, n, &e, &pos)) > 0) {
            ... access e[0...z].ptr ...
            ... invariant: z <= BUS1_FLIST_BATCH ...
            ... invariant: e[i].ptr is the i'th element of this batch ...
    }

Return

Size of batch at iter.


bus1_flist_populate — populate an flist

Synopsis

int bus1_flist_populate (struct bus1_flist * list, size_t n, gfp_t gfp);

Arguments

list

flist to operate on

n

number of elements

gfp

GFP to use for allocations

Description

Populate an flist. This pre-allocates the backing memory for an flist that was statically initialized via bus1_flist_init. This is NOT needed if the list was allocated via bus1_flist_new.

Return

0 on success, negative error code on failure.


bus1_flist_new — allocate new flist

Synopsis

struct bus1_flist * bus1_flist_new (size_t n, gfp_t gfp);

Arguments

n

number of elements

gfp

GFP to use for allocations

Description

This allocates a new flist ready to store n elements.

Return

Pointer to flist, NULL if out-of-memory.


bus1_flist_free — free flist

Synopsis

struct bus1_flist * bus1_flist_free (struct bus1_flist * list, size_t n);

Arguments

list

flist to operate on, or NULL

n

number of elements

Description

This deallocates an flist previously created via bus1_flist_new.

If NULL is passed, this is a no-op.

Return

NULL is returned.

Chapter 9. Bus1 Pool

Table of Contents

struct bus1_pool_slice — pool slice
struct bus1_pool — client pool
bus1_pool_slice_is_public — check whether a slice is public
bus1_pool_init — create memory pool
bus1_pool_deinit — destroy pool
bus1_pool_alloc — allocate memory
bus1_pool_release_kernel — release kernel-owned slice reference
bus1_pool_publish — publish a slice
bus1_pool_release_user — release a public slice
bus1_pool_flush — flush all user references
bus1_pool_mmap — mmap the pool
bus1_pool_write_iovec — copy user memory to a slice
bus1_pool_write_kvec — copy kernel memory to a slice

A pool is a shmem-backed memory pool shared between userspace and the kernel. The pool is used to transfer memory from the kernel to userspace without requiring userspace to allocate the memory.

The pool is managed in slices, which are published to userspace when they are ready to be read and must be released by userspace when userspace is done with them.

Userspace has read-only access to its pools and the kernel has read-write access, but published slices are not altered.
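
A minimal sketch of the kernel-side flow, assuming the bus1_pool_* declarations documented in this chapter (the function name and the payload are placeholders, and the caller must serialize all pool operations as noted below):

    /* Sketch: copy a payload into a fresh slice and hand it to userspace. */
    static int pool_send_example(struct bus1_pool *pool,
                                 const void *data, size_t size)
    {
            struct bus1_pool_slice *slice;
            struct kvec vec = { .iov_base = (void *)data, .iov_len = size };
            ssize_t l;

            /* allocate a slice; we now hold the kernel reference */
            slice = bus1_pool_alloc(pool, size);
            if (IS_ERR(slice))
                    return PTR_ERR(slice);

            /* copy the payload into the slice */
            l = bus1_pool_write_kvec(pool, slice, 0, &vec, 1, size);
            if (l < 0) {
                    bus1_pool_release_kernel(pool, slice);
                    return l;
            }

            /* make it visible to userspace, then drop our kernel reference */
            bus1_pool_publish(pool, slice);
            bus1_pool_release_kernel(pool, slice);
            return 0;
    }
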

struct bus1_pool_slice — pool slice

Synopsis

struct bus1_pool_slice {
  u32 offset;
  u32 free:1;
  u32 ref_kernel:1;
  u32 ref_user:1;
  struct list_head entry;
  struct rb_node rb;
};  

Members

offset

relative offset in parent pool

free

whether this slice is in-use or not

ref_kernel

whether a kernel reference exists

ref_user

whether a user reference exists

entry

link into linear list of slices

rb

link to busy/free rb-tree

Description

Each chunk of memory in the pool is managed as a slice. A slice can be accessible by both the kernel and user-space, and their access rights are managed independently. As long as the kernel has a reference to a slice, its offset and size can be accessed freely and will not change. Once the kernel drops its reference, it must not access the slice, anymore.

To allow user-space access, the slice must be published. This marks the slice as referenced by user-space. Note that all slices are always readable by user-space, since the entire pool can be mapped. Publishing a slice only marks the slice as referenced by user-space, so it will not be modified or removed. Once user-space releases its reference, it should no longer access the slice as it might be modified and/or overwritten by other data.

A slice is released only once neither the kernel nor user-space holds a reference to it. The kernel reference can only be acquired/released once, but user-space references can be published/released several times. In particular, if the kernel retains a reference while a slice is published and later released by userspace, the same slice can be published again in the future.

Note that both kernel-space and user-space must be aware that slice references are not ref-counted. They are simple booleans. For the kernel-side this is obvious, as no ref/unref functions are provided. But user-space must be aware that the same slice being published several times does not increase the reference count.


struct bus1_pool — client pool

Synopsis

struct bus1_pool {
  struct file * f;
  size_t allocated_size;
  struct list_head slices;
  struct rb_root slices_busy;
  struct rb_root slices_free;
};  

Members

f

backing shmem file

allocated_size

currently allocated memory in bytes

slices

all slices sorted by address

slices_busy

tree of allocated slices

slices_free

tree of free slices

Description

A pool is used to allocate memory slices that can be shared between kernel-space and user-space. A pool is always backed by a shmem-file and puts a simple slice-allocator on top. User-space gets read-only access to the entire pool, kernel-space gets read/write access via accessor-functions.

Pools are used to transfer large sets of data to user-space, without requiring a round-trip to ask user-space for a suitable memory chunk. Instead, the kernel simply allocates slices in the pool and tells user-space where it put the data.

All pool operations must be serialized by the caller. No internal lock is provided. Slices can be queried/modified unlocked. But any pool operation (allocation, release, flush, ...) must be serialized.


bus1_pool_slice_is_public — check whether a slice is public

Synopsis

bool bus1_pool_slice_is_public (struct bus1_pool_slice * slice);

Arguments

slice

slice to check

Description

This checks whether slice is public. That is, bus1_pool_publish has been called and the user has not released their reference, yet.

Note that if you need reliable results, make sure this cannot race against calls to bus1_pool_publish or bus1_pool_release_user.

Return

True if public, false if not.


bus1_pool_init — create memory pool

Synopsis

int bus1_pool_init (struct bus1_pool * pool, const char * filename);

Arguments

pool

pool to operate on

filename

name to use for the shmem-file (only visible via /proc)

Description

Initialize a new pool object.

Return

0 on success, negative error code on failure.


bus1_pool_deinit — destroy pool

Synopsis

void bus1_pool_deinit (struct bus1_pool * pool);

Arguments

pool

pool to destroy, or NULL

Description

This destroys a pool that was previously created via bus1_pool_init. If NULL is passed, or if pool->f is NULL (i.e., the pool was initialized to 0 but not yet created via bus1_pool_init), then this is a no-op.

The caller must make sure that no kernel reference to any slice exists. Any pending user-space reference to any slice is dropped by this function.


bus1_pool_alloc — allocate memory

Synopsis

struct bus1_pool_slice * bus1_pool_alloc (struct bus1_pool * pool, size_t size);

Arguments

pool

pool to allocate memory from

size

number of bytes to allocate

Description

This allocates a new slice of size bytes from the memory pool at pool. The slice must be released via bus1_pool_release_kernel by the caller. All slices are aligned to 8 bytes (both offset and size).

If no suitable slice can be allocated, an error is returned.

Each pool slice can have two different references: a kernel reference and a user-space reference. Initially, it only has a kernel reference, which must be dropped via bus1_pool_release_kernel. However, if you previously published the slice via bus1_pool_publish, it will also have a user-space reference, which user-space must (indirectly) release via a call to bus1_pool_release_user. A slice is only actually freed once neither reference exists anymore. Hence, a pool slice can be held by both the kernel and user-space, and both can rely on it staying around as long as they wish.

Return

Pointer to new slice, or ERR_PTR on failure.


bus1_pool_release_kernel — release kernel-owned slice reference

Synopsis

struct bus1_pool_slice * bus1_pool_release_kernel (struct bus1_pool * pool, struct bus1_pool_slice * slice);

Arguments

pool

pool to free memory on

slice

slice to release

Description

This releases the kernel reference to a slice that was previously allocated via bus1_pool_alloc. Only the kernel reference is dropped; if the slice was already published to user-space, the user-space reference is left untouched. Once both references are gone, the memory is actually freed.

Return

NULL is returned.


bus1_pool_publish — publish a slice

Synopsis

void bus1_pool_publish (struct bus1_pool * pool, struct bus1_pool_slice * slice);

Arguments

pool

pool to operate on

slice

slice to publish

Description

Publish a pool slice to user-space, so user-space can get access to it via the mapped pool memory. If the slice was already published, this is a no-op. Otherwise, the slice is marked as public and will only get freed once both the user-space reference *and* kernel-space reference are released.


bus1_pool_release_user — release a public slice

Synopsis

int bus1_pool_release_user (struct bus1_pool * pool, size_t offset, size_t * n_slicesp);

Arguments

pool

pool to operate on

offset

offset of slice to release

n_slicesp

output variable to store number of released slices, or NULL

Description

Release the user-space reference to a pool-slice, specified via the offset of the slice. If both the user-space reference *and* the kernel-space reference to the slice are gone, the slice is actually freed.

If no slice exists with the given offset, or if there is no user-space reference to the specified slice, an error is returned.

Return

0 on success, negative error code on failure.


bus1_pool_flush — flush all user references

Synopsis

void bus1_pool_flush (struct bus1_pool * pool, size_t * n_slicesp);

Arguments

pool

pool to flush

n_slicesp

output variable to store number of released slices, or NULL

Description

This flushes all user-references to any slice in pool. Kernel references are left untouched.


bus1_pool_mmap — mmap the pool

Synopsis

int bus1_pool_mmap (struct bus1_pool * pool, struct vm_area_struct * vma);

Arguments

pool

pool to operate on

vma

VMA to map to

Description

This maps the pool's shmem file into the provided VMA. Only read-only mappings are allowed.

Return

0 on success, negative error code on failure.


bus1_pool_write_iovec — copy user memory to a slice

Synopsis

ssize_t bus1_pool_write_iovec (struct bus1_pool * pool, struct bus1_pool_slice * slice, loff_t offset, struct iovec * iov, size_t n_iov, size_t total_len);

Arguments

pool

pool to operate on

slice

slice to write to

offset

relative offset into slice memory

iov

iovec array, pointing to data to copy

n_iov

number of elements in iov

total_len

total number of bytes to copy

Description

This copies the memory pointed to by iov into the memory slice slice at relative offset offset (relative to the beginning of the slice).

Return

Number of bytes copied, negative error code on failure.


bus1_pool_write_kvec — copy kernel memory to a slice

Synopsis

ssize_t bus1_pool_write_kvec (struct bus1_pool * pool, struct bus1_pool_slice * slice, loff_t offset, struct kvec * iov, size_t n_iov, size_t total_len);

Arguments

pool

pool to operate on

slice

slice to write to

offset

relative offset into slice memory

iov

kvec array, pointing to data to copy

n_iov

number of elements in iov

total_len

total number of bytes to copy

Description

This copies the memory pointed to by iov into the memory slice slice at relative offset offset (relative to the beginning of the slice).

Return

Number of bytes copied, negative error code on failure.

Chapter 10. Bus1 Queue

Table of Contents

struct bus1_queue_node — node into message queue
struct bus1_queue — message queue
bus1_queue_node_init — initialize queue node
bus1_queue_node_deinit — destroy queue node
bus1_queue_node_get_type — query node type
bus1_queue_node_get_timestamp — query node timestamp
bus1_queue_node_is_queued — check whether a node is queued
bus1_queue_node_is_staging — check whether a node is marked staging
bus1_queue_tick — increment queue clock
bus1_queue_sync — sync queue clock
bus1_queue_is_readable_rcu — check whether a queue is readable
bus1_queue_compare — comparator for queue ordering
bus1_queue_init — initialize queue
bus1_queue_deinit — destroy queue
bus1_queue_flush — flush message queue
bus1_queue_stage — stage queue entry with fresh timestamp
bus1_queue_commit_staged — commit staged queue entry with new timestamp
bus1_queue_commit_unstaged — commit unstaged queue entry with new timestamp
bus1_queue_commit_synthetic — commit synthetic entry
bus1_queue_remove — remove entry from queue
bus1_queue_peek — peek first available entry

(You are highly encouraged to read up on 'Lamport Timestamps', the concept of 'happened-before', and 'causal ordering'. The queue implementation has its roots in Lamport Timestamps, treating a set of local CPUs as a distributed system to avoid any global synchronization.)

A message queue is a FIFO, i.e., messages are linearly ordered by the time they were sent. Moreover, atomic delivery of messages to multiple queues is supported, without any global synchronization, i.e., the order of message delivery is consistent across queues.

Messages can be destined for multiple queues, hence, we need to be careful that all queues get a consistent order of incoming messages. We define the concept of `global order' to provide a basic set of guarantees. This global order is a partial order on the set of all messages. The order is defined as:

1) If a message B was queued *after* a message A, then: A < B

2) If a message B was queued *after* a message A was dequeued, then: A < B

3) If a message B was dequeued *after* a message A on the same queue, then: A < B

(Note: Causality is honored. `after' and `before' do not refer to the same task, nor the same queue, but rather any kind of synchronization between the two operations.)

The queue object implements this global order in a lockless fashion. It solely relies on a distributed clock on each queue. Each message to be sent causes a clock tick on the local clock and on all destination clocks. Furthermore, all clocks are synchronized, meaning they're fast-forwarded in case they're behind the highest of all participating peers. No global state tracking is involved.

During a message transaction, we first queue a message as 'staging' entry in each destination with a preliminary timestamp. This timestamp is explicitly odd numbered. Any odd numbered timestamp is considered 'staging' and causes *any* message ordered after it to be blocked until it is no longer staging. This allows us to queue the message in parallel with any racing multicast, and be guaranteed that all possible conflicts are blocked until we eventually commit a transaction. To commit a transaction (after all staging entries are queued), we choose the highest timestamp we have seen across all destinations and re-queue all our entries on each peer using that timestamp. Here we use a commit timestamp (even numbered).

With this in mind, we define that a client can only dequeue messages from its queue that have an even timestamp. Furthermore, if there is a message queued with an odd timestamp that is lower than the even timestamp of another message, then neither message can be dequeued. They're considered to be in-flight conflicts. This guarantees that two concurrent multicast messages can be queued without any *global* locks, but either can only be dequeued by a peer if their ordering has been established (via commit timestamps).

NOTE: A fully committed message is not guaranteed to be ready to be dequeued, as it may be blocked by a staging entry. This means that, for an arbitrary (though bounded) time after a message transaction completes, the queue may still appear to be empty. In other words, message transmission is not instantaneous. It would be possible to change this, at the cost of shortly blocking each message transaction on all other conflicting tasks.

The queue implementation uses an rb-tree (ordered by timestamps and sender), with a cached pointer to the front of the queue.
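
The staging/commit scheme described above can be sketched roughly as follows for a two-destination multicast. This is an illustration of the timestamp handling only, not the actual transaction code (see Chapter 4 for that); all names besides the bus1_queue_* calls are hypothetical, and locking and error handling are omitted:

    /* Sketch: deliver one message to two queues with a consistent order. */
    static void multicast_sketch(struct bus1_queue *sender,
                                 struct bus1_queue *d1, struct bus1_queue *d2,
                                 struct bus1_queue_node *n1,
                                 struct bus1_queue_node *n2,
                                 wait_queue_head_t *w1, wait_queue_head_t *w2)
    {
            u64 ts, commit;

            /* stage on every destination with a preliminary (odd) timestamp */
            ts = bus1_queue_stage(d1, n1, bus1_queue_tick(sender));
            ts = max(ts, bus1_queue_stage(d2, n2, ts - 1));

            /* pick the final (even) commit timestamp: highest seen, ticked */
            bus1_queue_sync(sender, ts + 1);
            commit = bus1_queue_tick(sender);

            /* sync every destination clock, then re-queue with that stamp */
            bus1_queue_sync(d1, commit);
            bus1_queue_commit_staged(d1, w1, n1, commit);
            bus1_queue_sync(d2, commit);
            bus1_queue_commit_staged(d2, w2, n2, commit);
    }
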

struct bus1_queue_node — node into message queue

Synopsis

struct bus1_queue_node {
  union {unnamed_union};
  u64 timestamp_and_type;
  struct bus1_queue_node * next;
  void * group;
  void * owner;
};  

Members

{unnamed_union}

anonymous

timestamp_and_type

message timestamp and type of parent object

next

single-linked utility list

group

group association

owner

node owner


struct bus1_queue — message queue

Synopsis

struct bus1_queue {
  u64 clock;
  u64 flush;
  struct rb_node * leftmost;
  struct rb_node __rcu * front;
  struct rb_root messages;
};  

Members

clock

local clock (used for Lamport Timestamps)

flush

last flush timestamp

leftmost

cached left-most entry

front

cached front entry

messages

queued messages


bus1_queue_node_init — initialize queue node

Synopsis

void bus1_queue_node_init (struct bus1_queue_node * node, unsigned int type);

Arguments

node

node to initialize

type

message type

Description

This initializes a previously unused node, and prepares it for use with a message queue.


bus1_queue_node_deinit — destroy queue node

Synopsis

void bus1_queue_node_deinit (struct bus1_queue_node * node);

Arguments

node

node to destroy

Description

This destroys a previously initialized queue node. This is a no-op and only serves as debugger, testing whether the node was properly unqueued before.


bus1_queue_node_get_type — query node type

Synopsis

unsigned int bus1_queue_node_get_type (struct bus1_queue_node * node);

Arguments

node

node to query

Description

This queries the node type that was provided via the node constructor. A node never changes its type during its entire lifetime.

Return

Type of node is returned.


bus1_queue_node_get_timestamp — query node timestamp

Synopsis

u64 bus1_queue_node_get_timestamp (struct bus1_queue_node * node);

Arguments

node

node to query

Description

This queries the node timestamp that is currently set on this node.

Return

Timestamp of node is returned.


bus1_queue_node_is_queued — check whether a node is queued

Synopsis

bool bus1_queue_node_is_queued (struct bus1_queue_node * node);

Arguments

node

node to query

Description

This checks whether a node is currently queued in a message queue. That is, the node was linked and has not been dequeued, yet.

Return

True if node is currently queued.


bus1_queue_node_is_staging — check whether a node is marked staging

Synopsis

bool bus1_queue_node_is_staging (struct bus1_queue_node * node);

Arguments

node

node to query

Description

This checks whether a given node is queued, but still marked staging. That means, the node has been put on the queue but there is still a transaction that pins it to commit it later.

Return

True if node is queued as staging entry.


bus1_queue_tick — increment queue clock

Synopsis

u64 bus1_queue_tick (struct bus1_queue * queue);

Arguments

queue

queue to operate on

Description

This performs a clock-tick on queue. The clock is incremented by a full interval (+2). The caller is free to use both the new value (even numbered) and its successor (odd numbered). Both are uniquely allocated to the caller.

Return

New clock value is returned.


bus1_queue_sync — sync queue clock

Synopsis

u64 bus1_queue_sync (struct bus1_queue * queue, u64 timestamp);

Arguments

queue

queue to operate on

timestamp

timestamp to sync on

Description

This synchronizes the clock of queue with the externally provided timestamp timestamp. That is, the queue clock is fast-forwarded to timestamp, in case it is newer than the queue clock. Otherwise, nothing is done.

The passed in timestamp must be even.

Return

New clock value is returned.


bus1_queue_is_readable_rcu — check whether a queue is readable

Synopsis

bool bus1_queue_is_readable_rcu (struct bus1_queue * queue);

Arguments

queue

queue to operate on

Description

This checks whether the given queue is readable.

This does not require any locking, except for an rcu-read-side critical section.

Return

True if the queue is readable, false if not.


bus1_queue_compare — comparator for queue ordering

Synopsis

int bus1_queue_compare (u64 a_ts, void * a_g, u64 b_ts, void * b_g);

Arguments

a_ts

timestamp of first node to compare

a_g

group of first node to compare

b_ts

timestamp of second node to compare against

b_g

group of second node to compare against

Description

Messages on a message queue are ordered. This function implements the comparator used for all message ordering in queues. Two tags are used for ordering, the timestamp and the group-tag of a node. Both must be passed to this function.

This compares the tuples (a_ts, a_g) and (b_ts, b_g).

Return

<0 if (a_ts, a_g) is ordered before, >0 if after, 0 if same.


bus1_queue_init — initialize queue

Synopsis

void bus1_queue_init (struct bus1_queue * queue);

Arguments

queue

queue to initialize

Description

This initializes a new queue. The queue memory is considered uninitialized; any previous content is unrecoverable.


bus1_queue_deinit — destroy queue

Synopsis

void bus1_queue_deinit (struct bus1_queue * queue);

Arguments

queue

queue to destroy

Description

This destroys a queue that was previously initialized via bus1_queue_init. The caller must make sure the queue is empty before calling this.

This function is a no-op, and only does safety checks on the queue. It is safe to call this function multiple times on the same queue.

The caller must guarantee that the backing memory of queue is freed in an rcu-delayed manner.


bus1_queue_flush — flush message queue

Synopsis

struct bus1_queue_node * bus1_queue_flush (struct bus1_queue * queue, u64 ts);

Arguments

queue

queue to flush

ts

flush timestamp

Description

This flushes all committed entries from queue and returns them as singly-linked list for the caller to clean up. Staged entries are left in the queue.

You must acquire a timestamp before flushing the queue (e.g., tick the clock). This timestamp must be given as ts. Only entries lower than, or equal to, this timestamp are flushed. The timestamp is remembered as queue->flush.

Return

Single-linked list of flushed entries.


bus1_queue_stage — stage queue entry with fresh timestamp

Synopsis

u64 bus1_queue_stage (struct bus1_queue * queue, struct bus1_queue_node * node, u64 timestamp);

Arguments

queue

queue to operate on

node

queue entry to stage

timestamp

minimum timestamp for node

Description

Link a queue entry with a new timestamp. The staging entry blocks all messages with timestamps synced on this queue in the future, as well as any messages with a timestamp greater than timestamp. However, it does not block any messages already committed to this queue.

The caller must provide an even timestamp and the entry may not already have been committed.

Return

The timestamp used.


bus1_queue_commit_staged — commit staged queue entry with new timestamp

Synopsis

void bus1_queue_commit_staged (struct bus1_queue * queue, wait_queue_head_t * waitq, struct bus1_queue_node * node, u64 timestamp);

Arguments

queue

queue to operate on

waitq

wait-queue to wake up on change, or NULL

node

queue entry to commit

timestamp

new timestamp for node

Description

Update a staging queue entry according to timestamp. The timestamp must be even and the entry may not already have been committed.

Furthermore, the queue clock must be synced with the new timestamp *before* staging an entry. Similarly, the timestamp of an entry can only be increased, never decreased.


bus1_queue_commit_unstaged — commit unstaged queue entry with new timestamp

Synopsis

void bus1_queue_commit_unstaged (struct bus1_queue * queue, wait_queue_head_t * waitq, struct bus1_queue_node * node);

Arguments

queue

queue to operate on

waitq

wait-queue to wake up on change, or NULL

node

queue entry to commit

Description

Directly commit an unstaged queue entry to the destination queue. The entry must not be queued, yet.

The destination queue is ticked and the resulting timestamp is used to commit the queue entry.


bus1_queue_commit_synthetic — commit synthetic entry

Synopsis

bool bus1_queue_commit_synthetic (struct bus1_queue * queue, struct bus1_queue_node * node, u64 timestamp);

Arguments

queue

queue to operate on

node

entry to commit

timestamp

timestamp to use

Description

This inserts the unqueued entry node into the queue with the commit timestamp timestamp (just like bus1_queue_commit_unstaged). However, it only does so if the new entry would NOT become the new front. It thus allows inserting fake synthetic entries somewhere in the middle of a queue, but accepts the possibility of failure.

Return

True if committed, false if not.


bus1_queue_remove — remove entry from queue

Synopsis

void bus1_queue_remove (struct bus1_queue * queue, wait_queue_head_t * waitq, struct bus1_queue_node * node);

Arguments

queue

queue to operate on

waitq

wait-queue to wake up on change, or NULL

node

queue entry to remove

Description

This unlinks node and fully removes it from the queue queue. If you want to re-insert the node into a queue, you must re-initialize it first.

It is an error to call this on an unlinked entry.


bus1_queue_peek — peek first available entry

Synopsis

struct bus1_queue_node * bus1_queue_peek (struct bus1_queue * queue, bool * morep);

Arguments

queue

queue to operate on

morep

where to store group-state

Description

This returns a pointer to the first available entry in the given queue, or NULL if there is none. The queue stays unmodified and the returned entry remains on the queue.

This only returns entries that are ready to be dequeued. Entries that are still in staging mode will not be considered.

If a node is returned, its group-state is stored in morep. That is, if more messages are queued as part of the same transaction, true is stored in morep; if the returned node is the last part of the transaction, false is stored.

Return

Pointer to first available entry, NULL if none available.
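
On the receiving side, peeking and removing entries pair up naturally. A minimal drain loop, assuming the bus1_queue_* declarations from this chapter (the wait-queue, the per-node handling, and whatever lock serializes queue access are left to the caller and are purely illustrative):

    /* Sketch: pop every currently dequeuable entry off a queue. */
    static void drain_sketch(struct bus1_queue *queue, wait_queue_head_t *waitq)
    {
            struct bus1_queue_node *node;
            bool more;

            while ((node = bus1_queue_peek(queue, &more))) {
                    /* 'more' is true while further entries of the same
                     * transaction are still queued behind this one */
                    bus1_queue_remove(queue, waitq, node);
                    /* ... handle the message backing 'node' ... */
            }
    }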