Documentation

VMware Event Router

The VMware Event Router is responsible for connecting to event stream sources, such as VMware vCenter, and forwarding events to an event processor. To allow for extensibility and different event sources and processors, both are abstracted via Go interfaces.

Currently, one VMware Event Router is deployed per appliance (1:1 mapping). Only one vCenter event stream can be processed per appliance, and only one event stream (source) and one processor can be configured. The list of supported event sources and processors can be found below.

We are evaluating options to support multiple event sources (vCenter servers) and processors per appliance (scale up) or, alternatively, multi-node appliance deployments (scale out), which might be required in large deployments (performance, throughput).

Note: We have not done any extensive performance and scalability testing to understand the limits of the single appliance model.

Supported Event Sources

Supported Event Processors

Event Handling

As described in the architecture section, due to the microservices architecture used in the VMware Event Broker Appliance, one always has to consider message delivery problems such as timeouts, delays, reordering, and loss. These challenges are fundamental to distributed systems and must be understood and considered by function authors.

Supported Event Types

For the supported event stream source, e.g. VMware vCenter, all events provided by that source can be used. Since event types are environment specific (vSphere version, extensions), a list of events for vCenter as an event source can be generated as described in this blog post.

Message Delivery Guarantees

Consider the following most basic form of messaging between two systems:

[PRODUCER] --[MESSAGE]--> [CONSUMER]
[PRODUCER] <--[MESSAGE_ACK]-- [CONSUMER]

Even though this example looks simple, a lot of things can go wrong when transferring a message over the network (as opposed to in-process communication):

  • The message might never be received by the consumer
  • The message might arrive out of order (previous message not shown here)
  • The message might be delayed during transport
  • The message might be duplicated during transport
  • The consumer might be slow acknowledging the message
  • The consumer might receive the message and then crash before acknowledging it
  • The consumer acknowledges the message but the acknowledgement is lost, delayed, or arrives out of order
  • The producer crashes immediately after receiving the acknowledgement

Note: For our example, it doesn’t really matter whether the packet (message) actually leaves the machine or the destination (consumer) is on the same host. Of course, having a physical network in between the actors increases the chances of messaging failures. The network protocol in use was intentionally left unspecified.

One of the following message delivery semantics is typically used to describe the messaging characteristics of a distributed system such as the VMware Event Broker Appliance:

  • At most once semantics: a message will be delivered once or not at all to the consumer
  • At least once semantics: a message will be delivered once or multiple times to the consumer
  • Exactly once semantics: a message will be delivered exactly once to the consumer

Note: Exactly once semantics is not supported by all messaging systems as it requires significant engineering effort to implement. It is considered the gold standard in messaging while at the same time being a highly debated topic.
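The difference between at most once and at least once delivery can be illustrated with a small simulation. This is a sketch under simplifying assumptions: the `consumer` type and its `dropMsgs`/`dropAcks` counters are hypothetical stand-ins for network failures and do not model any real transport.

```go
package main

import (
	"errors"
	"fmt"
)

var errLost = errors.New("message or acknowledgement lost")

// consumer counts how often a message was actually delivered to it.
// dropMsgs simulates messages lost before arrival, dropAcks simulates
// acknowledgements lost after the message was already processed.
type consumer struct {
	delivered int
	dropMsgs  int
	dropAcks  int
}

func (c *consumer) Receive(msg string) error {
	if c.dropMsgs > 0 {
		c.dropMsgs--
		return errLost // message never arrived
	}
	c.delivered++ // message arrived and was processed
	if c.dropAcks > 0 {
		c.dropAcks--
		return errLost // processed, but the producer never learns about it
	}
	return nil
}

// sendAtMostOnce makes a single attempt and never retries.
func sendAtMostOnce(c *consumer, msg string) error {
	return c.Receive(msg)
}

// sendAtLeastOnce retries until an acknowledgement is seen or the
// attempts are exhausted. Retries can cause duplicate deliveries.
func sendAtLeastOnce(c *consumer, msg string, maxAttempts int) error {
	var err error
	for i := 0; i < maxAttempts; i++ {
		if err = c.Receive(msg); err == nil {
			return nil
		}
	}
	return err
}

func main() {
	a := &consumer{dropMsgs: 1}
	_ = sendAtMostOnce(a, "event")
	fmt.Println("at most once, message lost: deliveries =", a.delivered)

	b := &consumer{dropAcks: 1}
	_ = sendAtLeastOnce(b, "event", 3)
	fmt.Println("at least once, ack lost: deliveries =", b.delivered)
}
```

In the first case the event is never delivered (0 deliveries); in the second it is delivered twice, which is why consumers of an at least once system need to tolerate duplicates.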

As of today, the VMware Event Broker Appliance guarantees at most once delivery. While this might sound like a huge limitation in the appliance (and it might be, depending on your use case), in practice the chance of message delivery failures can be reduced by:

  • Using TCP/IP as the underlying communication protocol which provides certain ordering (sequencing), back-pressure and retry capabilities at the transmission layer (default in the appliance)
  • Using asynchronous function invocation (defaults to “off”, i.e. “synchronous”, in the appliance) which internally uses a message queue for event processing
  • Following best practices for writing functions

Note: The VMware Event Broker Appliance currently does not persist (to disk) or retry event delivery in case of failure during function invocation or upstream (external system, such as Slack) communication issues. For introspection and debugging purposes invocations are logged to standard output by the OpenFaaS vcenter-connector (“sync” invocation mode) or OpenFaaS queue-worker (“async” invocation mode).

We are currently investigating options to support at least once delivery semantics. However, this requires significant changes to the event router such as:

  • Tracking and checkpointing (to disk) successfully processed vCenter events (stream history position)
  • Buffering events in the connector (incl. queue management to protect from overflows)
  • Raising awareness (docs, tutorials) for function authors to deal with duplicated, delayed or out of order arriving event messages
  • High-availability deployments (active-active/active-passive) to continue to retrieve the event stream during appliance downtime (maintenance, crash)
  • Describing mitigation strategies for data loss in the appliance (snapshots, backups)
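For function authors, the most relevant consequence of at least once delivery is that duplicated events have to be tolerated. One common approach is to make handlers idempotent by remembering which events were already handled. The following is a minimal in-memory sketch; the `Deduper` type is hypothetical, it assumes every event carries a stable unique ID, and it keeps no state across restarts.

```go
package main

import (
	"fmt"
	"sync"
)

// Deduper remembers event IDs that were already handled.
// Assumption: every event carries a stable, unique ID.
type Deduper struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

func NewDeduper() *Deduper {
	return &Deduper{seen: make(map[string]struct{})}
}

// HandleOnce runs fn only the first time id is observed and reports
// whether fn was executed. The ID is recorded before fn runs, so a
// failing fn is not retried in this sketch.
func (d *Deduper) HandleOnce(id string, fn func()) bool {
	d.mu.Lock()
	if _, dup := d.seen[id]; dup {
		d.mu.Unlock()
		return false
	}
	d.seen[id] = struct{}{}
	d.mu.Unlock()

	fn()
	return true
}

func main() {
	d := NewDeduper()
	calls := 0
	// "evt-1" arrives twice, as it could under at least once delivery.
	for _, id := range []string{"evt-1", "evt-2", "evt-1"} {
		d.HandleOnce(id, func() { calls++ })
	}
	fmt.Println("side effects executed:", calls)
}
```

The map grows without bound here; a real function would likely need an expiry policy or an external store shared between function replicas.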

Invocation

Functions in OpenFaaS can be invoked synchronously or asynchronously:

synchronous: The function is called and the caller, e.g. OpenFaaS vcenter-connector, waits until the function returns (successfully or with an error) or the timeout threshold is hit.

asynchronous: The function is not directly called. Instead, HTTP status code 202 (“accepted”) is returned and the request, including the event payload, is stored in a NATS Streaming queue. One or more “queue-workers” process the queue items.
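The two invocation modes can be contrasted with a small in-process sketch. It is not how OpenFaaS is implemented: the `Queue` type below is a hypothetical in-memory stand-in for the NATS Streaming queue, a single goroutine plays the role of a queue-worker, and timeouts are omitted.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// Function is a hypothetical stand-in for a deployed function.
type Function func(payload string) error

// invokeSync calls fn and only returns once fn has finished.
func invokeSync(fn Function, payload string) int {
	if err := fn(payload); err != nil {
		return http.StatusInternalServerError
	}
	return http.StatusOK
}

// Queue is an in-memory stand-in for the message queue.
type Queue struct {
	items chan string
	wg    sync.WaitGroup
}

// NewQueue starts a single worker that drains the queue and calls fn.
func NewQueue(size int, fn Function) *Queue {
	q := &Queue{items: make(chan string, size)}
	q.wg.Add(1)
	go func() {
		defer q.wg.Done()
		for p := range q.items {
			_ = fn(p) // the original caller never sees this result
		}
	}()
	return q
}

// invokeAsync stores the payload and returns 202 without waiting for
// the function. It blocks only if the queue buffer is full.
func (q *Queue) invokeAsync(payload string) int {
	q.items <- payload
	return http.StatusAccepted
}

// Close stops accepting work and waits for the worker to drain the queue.
func (q *Queue) Close() {
	close(q.items)
	q.wg.Wait()
}

func main() {
	handler := func(p string) error {
		fmt.Println("handled", p)
		return nil
	}
	fmt.Println("sync status:", invokeSync(handler, "event-1"))

	q := NewQueue(8, handler)
	fmt.Println("async status:", q.invokeAsync("event-2"))
	q.Close()
}
```

The trade-off is visible in the return values: the synchronous caller learns whether the function succeeded, while the asynchronous caller only learns that the request was accepted.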

If you directly invoke your functions deployed in the appliance you can decide which invocation mode is used (per function). More details can be found here.

The VMware Event Broker Appliance uses synchronous invocation mode by default. If you experience performance issues due to long-running, slow, or blocking functions, consider running the VMware Event Router in asynchronous mode by setting the "async" option to "true" (quotes required) in the configuration file for the VMware Event Router deployment:

{
    "type": "processor",
    "provider": "openfaas",
    "address": "http://127.0.0.1:8080",
    "auth": {
          ...skipped
        }
    },
    "options": {
        "async": "true"
    }
}
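One plausible reason the quotes are required is that the options are decoded as string values and converted afterwards. The sketch below illustrates that behavior under this assumption; the `processorConfig` struct and the `asyncEnabled` helper are hypothetical and not taken from the router's source code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// processorConfig mirrors part of the configuration shown above.
// Assumption: options are decoded as a map of string values.
type processorConfig struct {
	Type     string            `json:"type"`
	Provider string            `json:"provider"`
	Address  string            `json:"address"`
	Options  map[string]string `json:"options"`
}

// asyncEnabled reports whether the "async" option is set to a true value.
// A missing option means synchronous mode (the default).
func asyncEnabled(raw []byte) (bool, error) {
	var cfg processorConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return false, err
	}
	v, ok := cfg.Options["async"]
	if !ok {
		return false, nil
	}
	return strconv.ParseBool(v)
}

func main() {
	for _, raw := range []string{
		`{"options": {"async": "true"}}`, // quoted: decodes as a string
		`{"options": {"async": true}}`,   // unquoted: JSON bool, rejected
	} {
		enabled, err := asyncEnabled([]byte(raw))
		fmt.Println(enabled, err)
	}
}
```

With a string-typed options map, the unquoted form fails during decoding because a JSON boolean cannot be stored in a Go string.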

When the AWS EventBridge event processor is used, events are only forwarded for the patterns configured in the AWS event rule ARN. For example, if the rule is configured with this event pattern:

{
  "detail": {
    "subject": [
      "VmPoweredOnEvent",
      "VmPoweredOffEvent",
      "VmReconfiguredEvent"
    ]
  }
}

Only these three vCenter event types would be forwarded. Other events are discarded to save network bandwidth and costs.
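The filtering described above can be sketched as a simple allow-list check. This is a simplified illustration only: it handles just the `detail.subject` list from the example pattern, whereas real EventBridge event patterns support more matching rules. The `eventPattern`, `allowedSubjects` and `shouldForward` names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// eventPattern covers only the part of the pattern used in the example.
type eventPattern struct {
	Detail struct {
		Subject []string `json:"subject"`
	} `json:"detail"`
}

// allowedSubjects builds a lookup set from the pattern's subject list.
func allowedSubjects(raw []byte) (map[string]struct{}, error) {
	var p eventPattern
	if err := json.Unmarshal(raw, &p); err != nil {
		return nil, err
	}
	allowed := make(map[string]struct{}, len(p.Detail.Subject))
	for _, s := range p.Detail.Subject {
		allowed[s] = struct{}{}
	}
	return allowed, nil
}

// shouldForward reports whether an event with the given subject matches.
func shouldForward(allowed map[string]struct{}, subject string) bool {
	_, ok := allowed[subject]
	return ok
}

func main() {
	raw := []byte(`{"detail": {"subject": [
		"VmPoweredOnEvent", "VmPoweredOffEvent", "VmReconfiguredEvent"]}}`)
	allowed, err := allowedSubjects(raw)
	if err != nil {
		fmt.Println("invalid pattern:", err)
		return
	}
	for _, s := range []string{"VmPoweredOnEvent", "VmRemovedEvent"} {
		fmt.Println(s, shouldForward(allowed, s))
	}
}
```

Checking the subject before sending is what saves bandwidth and cost: events that would not match the rule never leave the appliance.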

Get Started

Explore the capabilities that the VMware Event Router enables