# Mojo Core Overview

[TOC]

## Overview

Mojo Core implements an IPC mechanism based on Message routing over a network
of "Nodes". A Node in typical usage corresponds to a system process.
"Messages" are exchanged between pairs of "Ports", where a pair of Ports
represent a "MessagePipe" from the outside. A MessagePipe provides
reliable in-order delivery of Messages.
Messages can be used to transport Ports and platform handles (file descriptors,
Mach ports, Windows handles, etc.) between Nodes.
Communication between Nodes is done through platform IPC channels, such as
AF_UNIX sockets, pipes, or Fuchsia channels. A new Node has to be invited into
the network by giving it an IPC channel to an existing Node, or else a way to
establish one, such as the name or path of a listening pipe or an AF_UNIX
socket.

Mojo Core also provides a shared memory buffer primitive and assists in the
creation and transport of shared memory buffers as required by the underlying
operating system and process permissions.

A single Node in the network is termed the "Broker".
The Broker has some special responsibilities, notably:
  - Introducing a pair of Nodes in the network to one another to establish a
    direct IPC link between them.
  - Copying handles between Nodes, where they cannot themselves do so. This is
    e.g. the case for sandboxed processes on Windows.
  - Creating shared memory objects for processes that can't themselves do so.

Mojo Core exposes two API sets:
  1. The [C system API](../public/c/system/README.md).
  1. The [embedder API](embedder/README.md).

This document describes how these APIs are implemented by Mojo Core.
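
As a concrete taste of the C system API, here is a minimal sketch that creates
a MessagePipe and round-trips a small payload. It assumes the embedder has
already initialized Mojo Core, and omits error handling (every call returns a
MojoResult):

```c++
#include <cstdint>
#include <cstring>

#include "mojo/public/c/system/core.h"

void MessagePipeRoundTrip() {
  // Create a message pipe. |a| and |b| are the two conjugate endpoints.
  MojoHandle a, b;
  MojoCreateMessagePipe(/*options=*/nullptr, &a, &b);

  // Build a message and attach (and commit) a 5-byte payload.
  MojoMessageHandle message;
  MojoCreateMessage(/*options=*/nullptr, &message);
  MojoAppendMessageDataOptions options{};
  options.struct_size = sizeof(options);
  options.flags = MOJO_APPEND_MESSAGE_DATA_FLAG_COMMIT_SIZE;
  void* buffer;
  uint32_t buffer_size;
  MojoAppendMessageData(message, 5, /*handles=*/nullptr, /*num_handles=*/0,
                        &options, &buffer, &buffer_size);
  memcpy(buffer, "hello", 5);

  // Writing transfers ownership of |message| to the pipe.
  MojoWriteMessage(a, message, /*options=*/nullptr);

  // Read it back on the conjugate endpoint.
  MojoMessageHandle received;
  MojoReadMessage(b, /*options=*/nullptr, &received);

  MojoDestroyMessage(received);
  MojoClose(a);
  MojoClose(b);
}
```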

## Layering

This image provides an overview of how the Mojo Core implementation is layered:

![Mojo Core Layering](doc/layering.png)

The C system API is a stable, versioned API, exposed through a singleton
structure containing C function thunks. This allows multiple clients in the same
application to use a single embedder-provided instance of Mojo, and allows these
clients to be versioned and distributed independently of Mojo.
Using C thunks also makes it easier to provide compatibility with programming
languages other than C and C++.
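
The thunk pattern looks roughly like the following illustrative reduction.
`ExampleSystemThunks` is invented for this sketch; the real table lives in the
public C API's thunks header and has a different layout:

```c++
#include <cstdint>

#include "mojo/public/c/system/core.h"

// Illustrative only: a table of C function pointers, populated once by the
// embedder and handed to each client. Clients call through the table rather
// than linking against Mojo Core's symbols directly.
struct ExampleSystemThunks {
  // Size of this struct. A client compiled against an older, smaller version
  // of the table can be detected (and supported) by comparing sizes.
  uint32_t size;
  MojoResult (*Close)(MojoHandle handle);
  MojoResult (*CreateMessagePipe)(const MojoCreateMessagePipeOptions* options,
                                  MojoHandle* handle0,
                                  MojoHandle* handle1);
  // ...one function pointer per C system API entry point.
};
```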

When clients invoke the Mojo public API, they go through the dispatch
functions implemented in [dispatcher.cc](dispatcher.cc), which in turn forward
the call to a global instance of [mojo::core::Core](core.h).
Core further dispatches the calls to the implementation instance, which is
either a [mojo::core::Dispatcher](dispatcher.h), or a
[mojo::core::UserMessageImpl](user_message_impl.h).
In the case of a Dispatcher, the incoming MojoHandle is looked up in the
[mojo::core::HandleTable](handle_table.h), which returns the Dispatcher
corresponding to the MojoHandle.

The public API exposes the following concepts as MojoHandles:
  1. [DataPipeConsumer](data_pipe_consumer_dispatcher.h).
  1. [DataPipeProducer](data_pipe_producer_dispatcher.h).
  1. [Invitation](invitation_dispatcher.h).
  1. [MessagePipe](message_pipe_dispatcher.h).
  1. [PlatformHandle](platform_handle_dispatcher.h).
  1. [SharedBuffer](shared_buffer_dispatcher.h).
  1. [Watcher](watcher_dispatcher.h) - known as a "Trap" in the public API.

These are implemented as subclasses of [mojo::core::Dispatcher](dispatcher.h).
The Dispatcher class implements the union of all possible operations on a
MojoHandle, and the various subclasses implement the subset appropriate to their
type.
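
Putting the last two paragraphs together, the dispatch path can be modeled
roughly as below. This is a toy reduction with stand-in types (the real
`Core`/`HandleTable` code uses ref-counted Dispatchers under a lock):

```c++
#include <cstdint>
#include <map>
#include <memory>

// Stand-ins for the real Mojo types, defined here so the sketch is
// self-contained.
using MojoHandle = uint64_t;
using MojoResult = uint32_t;
constexpr MojoResult MOJO_RESULT_OK = 0;
constexpr MojoResult MOJO_RESULT_INVALID_ARGUMENT = 3;

// The base class declares the union of all operations a MojoHandle could
// support and fails them by default; each subclass overrides its subset.
class Dispatcher {
 public:
  virtual ~Dispatcher() = default;
  virtual MojoResult WriteMessage() { return MOJO_RESULT_INVALID_ARGUMENT; }
  virtual MojoResult ReadData() { return MOJO_RESULT_INVALID_ARGUMENT; }
};

class MessagePipeDispatcher : public Dispatcher {
 public:
  // Message pipes support WriteMessage but not, say, ReadData.
  MojoResult WriteMessage() override { return MOJO_RESULT_OK; }
};

class Core {
 public:
  MojoResult WriteMessage(MojoHandle handle) {
    auto it = handles_.find(handle);        // HandleTable lookup.
    if (it == handles_.end())
      return MOJO_RESULT_INVALID_ARGUMENT;  // Unknown or closed handle.
    return it->second->WriteMessage();      // Type-specific implementation.
  }

 private:
  std::map<MojoHandle, std::unique_ptr<Dispatcher>> handles_;
};
```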

Additionally, the public API exposes Messages as MojoMessageHandles, each of
which is simply an instance of a
[mojo::core::UserMessageImpl](user_message_impl.h).

## Threading model

Mojo Core is multi-threaded, so any API can be used from any thread in a
process. Receiving Messages from other Nodes is done on the "IO Thread", which
is an [embedder-supplied task runner](embedder/README.md#ipc-initialization).
Sending Messages to other Nodes is typically done on the sending thread.
However, if the native IPC channel's buffer is full, the outgoing Message will
be queued and subsequently sent on the "IO Thread".

## Event Dispatching

During the processing of an API call and during activity in the
NodeController, various entities may change state, which can generate
[TrapEvents](../public/c/system/README.md#Signals-Traps) for the user.
The [mojo::core::RequestContext](request_context.h) takes care of aggregating
Watchers that need dispatching and issuing outstanding TrapEvents as the last
RequestContext in a thread goes out of scope.

This avoids calling out to user code from deep within Core, and the resulting
reentrancy that might ensue.
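
The pattern can be modeled as a thread-local event collector that flushes only
when the outermost scope on the thread unwinds. This is a toy sketch of the
idea, not the real RequestContext API:

```c++
#include <functional>
#include <utility>
#include <vector>

// Toy sketch: TrapEvents generated while Mojo Core holds internal locks are
// queued on the current thread's outermost context and dispatched to user
// code only when that context unwinds, with no Core locks held.
class ToyRequestContext {
 public:
  ToyRequestContext() {
    if (!current_)
      current_ = this;  // This is the outermost context on this thread.
  }
  ~ToyRequestContext() {
    if (current_ != this)
      return;  // Nested context: the outermost one will flush.
    current_ = nullptr;
    for (auto& event : pending_events_)
      event();  // Safe to re-enter user code now.
  }
  // Assumes a context exists on the stack; in the sketch (as in Mojo Core)
  // one is entered at every API entry point and on IO-thread activity.
  static void QueueTrapEvent(std::function<void()> event) {
    current_->pending_events_.push_back(std::move(event));
  }

 private:
  static thread_local ToyRequestContext* current_;
  std::vector<std::function<void()>> pending_events_;
};
thread_local ToyRequestContext* ToyRequestContext::current_ = nullptr;
```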

## Messaging implementation

Mojo Core is implemented on top of "[Ports](ports)"; it is an embedder of the
Ports library.

Ports provides the base abstractions of addressing,
[mojo::core::ports::Node](ports/node.h) and
[mojo::core::ports::Port](ports/port.h) as well as the messaging and event
handling for maintaining and transporting Port pairs across Nodes.

All routing and IPC, however, is delegated to the embedder, Mojo Core, via
the [mojo::core::NodeController](node_controller.h), which owns an instance of
Node and implements its delegate interface.

### Naming and Addressing

Nodes and Ports are each named by a large (128-bit) random number.
Messages are directed at a Port, and so are addressed by a (Node:Port) pair.

Each Port has precisely one "conjugate" Port at any point in time. When a
Message is sent through a Port, its conjugate Port is the implicit destination,
and this is a symmetrical relationship. The address of the conjugate Port can
change as it is moved between nodes.
In practice a Port is renamed each time it's moved between Nodes, so both
components of its (Node:Port) address change.

Note that since each Port has a single conjugate Port at any time, en-route
Messages are addressed only with the destination Port address, as the source
is implicit.
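
In outline, the naming scheme looks like the following. The field names are
approximate (see [ports/name.h](ports/name.h) for the real definitions), and
`PortAddress` is invented here to illustrate the (Node:Port) pair:

```c++
#include <cstdint>

// Approximate shape of the 128-bit random names.
struct Name {
  uint64_t v1;  // Generated from a cryptographically secure random source.
  uint64_t v2;
};
struct NodeName : Name {};
struct PortName : Name {};

// A full Message destination: which Node, and which Port on that Node. The
// source is implicit, since every Port has exactly one conjugate.
struct PortAddress {
  NodeName node;
  PortName port;
};
```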

### Routing and Topology

A process network initially starts with only the Broker process Node.
New Nodes must be introduced to the Broker by some means. This can be done by
inheriting an IPC handle into a newly created process, or by providing the new
Node with the name of a listening IPC channel it can connect to.

Each Node thus starts with only a direct IPC connection to a single other Node
in the network. The first time a Node tries to forward a Message to a Node it
doesn't have a direct IPC channel with, it will send a message to the Broker
to request an introduction to the new peer Node.

If the Broker knows of the peer Node, it will construct a new native IPC
channel, then hand one end of it to the requesting Node and one end to the
peer Node. This results in a direct Node-to-Node link between the two, and
so the topology of a process network will trend towards a fully connected graph.
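
The request-an-introduction flow can be sketched as follows. This is a toy
model with invented names (`ToyNodeController`, the helper methods, and the
`int` channel handle); the real logic lives in NodeController and NodeChannel:

```c++
#include <map>
#include <queue>
#include <string>
#include <utility>

struct Event { std::string payload; };
using NodeName = std::string;  // Stand-in for the 128-bit name.

class ToyNodeController {
 public:
  void SendEvent(const NodeName& destination, Event event) {
    auto it = peers_.find(destination);
    if (it != peers_.end()) {
      SendOnChannel(it->second, std::move(event));  // Direct channel exists.
      return;
    }
    // No direct channel yet: buffer the event and ask the Broker for an
    // introduction. Buffered events flush in OnIntroduced().
    pending_[destination].push(std::move(event));
    RequestIntroductionFromBroker(destination);
  }

  // Called when the Broker hands us one end of a fresh channel to |peer|.
  void OnIntroduced(const NodeName& peer, int channel) {
    peers_[peer] = channel;
    auto& queue = pending_[peer];
    while (!queue.empty()) {
      SendOnChannel(channel, std::move(queue.front()));
      queue.pop();
    }
    pending_.erase(peer);
  }

 private:
  void SendOnChannel(int channel, Event event) { /* platform IPC write */ }
  void RequestIntroductionFromBroker(const NodeName& peer) { /* ... */ }

  std::map<NodeName, int> peers_;  // Established Node-to-Node channels.
  std::map<NodeName, std::queue<Event>> pending_;
};
```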

### Messages

A UserMessageImpl is the unit of transport in Mojo Core.
This is a collection of:
  1. Handles,
  1. Ports,
  1. User data.

Mojo Core takes care of transferring, serializing, and deserializing Handles
and Ports.
The user data, on the other hand, has to be serialized and deserialized by the
user. Mojo Core optimizes this by allowing the user data to be serialized only
when needed, e.g. when a Message is routed to another Node.
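
Reduced to a sketch, the lazy-serialization idea looks like this.
`ToyUserMessage` and its methods are invented; the real hooks are on
mojo::core::ports::UserMessage and UserMessageImpl:

```c++
#include <cstdint>
#include <vector>

// Toy sketch: user data stays in its in-memory form until the Message must
// actually cross a Node boundary.
class ToyUserMessage {
 public:
  // Same-Node delivery never pays serialization cost: the live object graph
  // is handed to the receiving Port directly.
  bool is_serialized() const { return !bytes_.empty(); }

  // Called by Mojo Core just before the Message leaves the Node.
  void SerializeIfNecessary() {
    if (!is_serialized())
      bytes_ = SerializeUserData();  // User/embedder-supplied serializer.
  }

 private:
  std::vector<uint8_t> SerializeUserData();  // Defined by the embedder.
  std::vector<uint8_t> bytes_;  // Wire form; empty until first needed.
};
```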

### Port

A Node maintains a set of Ports local to that Node.

Logically, each Port has a conjugate Port where all its Messages are destined.
However, when a Port is transferred to a different Node, the newly created Port
on the destination Node will have the transferred Port as its next-hop "Proxy"
Port it sends Messages to. The Proxy Port is responsible for delivering
in-progress Messages across to the new destination Port, and may buffer Messages
to that end. Once a Proxy Port has delivered all in-progress Messages, it
is dissolved and Messages start flowing through the next Peer Port in turn,
which may be the conjugate Port or another Proxy Port.

Each Port maintains state (sketched below) relating to:
  - Its current Peer Port, which may be a Proxy Port, but in the stable state
    will be its conjugate Port.
  - Its current messaging state, which changes as Ports move across Nodes.
  - Its outgoing Message state, notably Message serial numbers.
  - Its incoming Message queue.
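
In reduced form, reusing the `Name` types sketched earlier, that state looks
roughly like this. `ToyPortState` and its field names are approximate; the
real state lives in [ports/port.h](ports/port.h):

```c++
#include <cstdint>

// Illustrative reduction of per-Port state.
struct ToyPortState {
  // The current Peer: a Proxy Port while moves are in progress, otherwise
  // the conjugate Port.
  NodeName peer_node_name;
  PortName peer_port_name;
  // Lifecycle state, which changes as the Port (or its Peer) moves.
  enum class State { kReceiving, kBuffering, kProxying } state;
  // Outgoing Message state: the serial number stamped on the next send.
  uint64_t next_sequence_num_to_send;
  // Incoming Message queue: re-orders Messages by sequence number (see the
  // MessageQueue sketch below).
};
```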

The unit of messaging at this level is the
[mojo::core::ports::Event](ports/event.h), which has several sub-types relating
to the business of maintaining and transporting Ports and keeping their
Peer Port address up to date.

The UserMessageEvent subclass is the event type that carries all user messaging.
A UserMessageEvent owns a
[mojo::core::ports::UserMessage](ports/user_message.h), which is the interface
to the embedder's Message type. Mojo Core implements this in
[mojo::core::UserMessageImpl](user_message_impl.h). Note that at the Ports
level, this is simply opaque user data. It's the Mojo Core embedder that is
aware of the handles and data attached to a UserMessageImpl.

Note that if a Port is transferred across multiple Nodes, it may end up with a
multi-leg Peer relationship. As a result, and because different Messages for
the same Port may take different routes through the process network, Messages
can be delivered out of order. The incoming
[mojo::core::ports::MessageQueue](ports/message_queue.h) takes care of
re-ordering incoming Messages.
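
The re-ordering amounts to holding Messages until the next expected sequence
number arrives. A toy model (invented names; the real class is
mojo::core::ports::MessageQueue):

```c++
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Toy sketch: Messages may arrive out of order but are only released to the
// Port in sequence order.
class ToyMessageQueue {
 public:
  void Accept(uint64_t sequence_num, std::string message) {
    held_[sequence_num] = std::move(message);
  }

  // Returns the next in-order Message, or nothing if it hasn't arrived yet.
  std::optional<std::string> GetNext() {
    auto it = held_.find(next_expected_);
    if (it == held_.end())
      return std::nullopt;  // A gap: an earlier Message is still in flight.
    std::string message = std::move(it->second);
    held_.erase(it);
    ++next_expected_;
    return message;
  }

 private:
  uint64_t next_expected_ = 1;  // The sketch numbers Messages from 1.
  std::map<uint64_t, std::string> held_;  // Keyed by sequence number.
};
```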

The Node does all Message handling and delivery for its local Ports, but
delegates all routing to its delegate through the
[mojo::core::ports::NodeDelegate](ports/node_delegate.h) interface.
Mojo Core implements this in [mojo::core::NodeController](node_controller.h).

### Node to Node IPC

Node-to-Node IPC is handled primarily by
[mojo::core::NodeChannel](node_channel.h), together with
[mojo::core::Channel](channel.h) and its OS-specific subclasses, by exchanging
[mojo::core::Channel::Message](channel.h)s.

Note that a Channel::Message always goes from a source Node to a destination
Node through a Channel (strictly speaking one on each side). As the Channels on
either side (or the OS-provided IPC mechanism) conspire to copy the handles
across, the processes involved in the handle transfer are always implicit.
This means that a low-privilege or sandboxed process can only send its own
handles in a Message, and it's not possible to abuse this to acquire another
process's handles.

A NodeChannel wraps a Channel that communicates to and from another given
Node. It implements all messaging to that specific Node, both general
messaging (see NodeChannel::OnEventMessage) and messaging relating to
invitations and Port management.
Since a NodeChannel is associated with a single given Node, the peer Node name
is implicit in any incoming Message from the Channel.

The Channel takes care of maintaining and servicing a queue of Messages outgoing
to the peer Node, as well as reading and parsing the next incoming Message from
the IPC channel.
Depending on the platform and the Node topology, it may also take care of moving
outgoing and incoming handles between the processes hosting the Nodes on either
side of the Channel.

Note that on Windows, the Broker always takes care of handle copying, as the
Broker will typically be the only process with sufficient privilege to copy
handles. This means that any Message carrying handles is routed through the
Broker on Windows.

### Packet Framing

An in-transit Message on an IPC channel will be serialized as a
[mojo::core::Channel::Message](channel.h).

For user Messages, this will contain (an illustrative layout follows the
list):
  1. A [mojo::core::Channel::Message::Header](channel.h) containing:
    - The length of the full Message.
    - The length of the header.
    - The Message type.
    - The number of handles.
  2. The platform handles.
  3. The serialized [mojo::core::ports::Event](ports/event.h). Specifically for
     a UserMessageEvent:
    - [mojo::core::ports::SerializedHeader](ports/event.cc).
    - [mojo::core::ports::UserMessageEventData](ports/event.cc).
    - The Ports: num_ports instances of mojo::core::ports::PortDescriptor.
    - The serialized user data.
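
An illustrative (not authoritative) rendering of that framing;
`ToyMessageHeader` is invented, and the actual field widths, ordering, and
padding are defined in [channel.h](channel.h) and vary by platform:

```c++
#include <cstdint>

#pragma pack(push, 1)
struct ToyMessageHeader {
  uint32_t num_bytes;         // Total size of the framed Message.
  uint16_t num_header_bytes;  // Header size, allowing the header to grow.
  uint8_t message_type;       // E.g. "event" for a serialized ports::Event.
  uint8_t num_handles;        // Platform handles attached to this Message.
};
#pragma pack(pop)
// The header is followed by the platform handles (where the platform allows
// in-band transfer), then the serialized ports::Event and user payload.
```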

## Buffering

Message buffering in Mojo Core primarily occurs at two levels:
  - In the incoming Message queue of a receiving Port.
  - In the outgoing queue for an IPC Channel to a peer Node.

There is also transient buffering of Messages destined for a new peer Node
while introductions are in flight.
Finally, while a Port is in transit to a different Node, it buffers inbound
data until the destination Node accepts the new Peer, at which time any
buffered data is forwarded to the new Port on the destination Node.

Mojo Core does implement quotas on a Port's receive queue, on both the number
of Messages and their total byte size. If a receive quota is exceeded on a
Port, an over-quota TrapEvent is signaled on the receiving Port. This doesn't,
however, limit buffering, as the caller needs to handle the TrapEvent in some
way, most likely by simply closing the Port.
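
Wiring up such a quota through the C system API might look roughly like this
(a sketch; `MojoSetQuota`, traps, and the quota-exceeded signal are declared
in the public C API headers, and `SketchReceiveQuota` is an invented helper):

```c++
#include <cstdint>

#include "mojo/public/c/system/quota.h"
#include "mojo/public/c/system/trap.h"

// Sketch: cap a pipe endpoint's receive queue at 100 unread Messages and
// arrange a trap event when the quota is exceeded. |receiver| is a message
// pipe endpoint handle; |handler| is supplied by the caller.
void SketchReceiveQuota(MojoHandle receiver, MojoTrapEventHandler handler) {
  MojoSetQuota(receiver, MOJO_QUOTA_TYPE_RECEIVE_QUEUE_LENGTH,
               /*limit=*/100, /*options=*/nullptr);

  MojoHandle trap;
  MojoCreateTrap(handler, /*options=*/nullptr, &trap);
  uintptr_t context = 0;  // Opaque value echoed back in the trap event.
  MojoAddTrigger(trap, receiver, MOJO_HANDLE_SIGNAL_QUOTA_EXCEEDED,
                 MOJO_TRIGGER_CONDITION_SIGNALS_SATISFIED, context,
                 /*options=*/nullptr);
  // The trap must subsequently be armed (MojoArmTrap) before it dispatches
  // quota-exceeded events to |handler|.
}
```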

Since Mojo Core does not limit buffering, the producers and consumers in a
process network must be balanced.
If a producer runs wild, the producer's process may buffer an arbitrarily large
number of Messages (and bytes) in the outgoing Message queue on the IPC channel
to the consumer's Node.
Alternatively the consumer's process may buffer an arbitrarily large
number of Messages (and bytes) in the incoming Message queue on the consumer's
Port.

## Security

Mojo can be viewed as a Capability system, where a Port is a Capability.
If a Node can name a Port, that Port can be considered a Capability granted
to that Node. Many Ports will grant other Ports upon request, and so the
transitive closure of those Ports can be considered part of a Node's set of
granted Capabilities.

Native handles can also be viewed as Capabilities, and so any native handle
reachable from the set of Ports granted to a Node can also be considered in
the Node's set of granted Capabilities.

There's however a significant difference between Ports and native handles, in
that native handles are kernel-mediated capabilities. This means there's no way
for a process that doesn't hold a handle to operate on it.

Mojo Ports, in contrast, are a pure user-mode construct and any Node can send
any Message to any Port of any other Node so long as it has knowledge of the
Port and Node names. If it doesn't already have an IPC channel to that Node,
it can either send the Message through the Broker, or request an introduction
to the destination Node from the Broker and then send the Message directly.

It is therefore important not to leak Port names into Nodes that shouldn't be
granted the corresponding Capability.