WoT standards developed by the W3C Web of Things (WoT) Working Group and their latest status:

Specification | Current status
WoT Architecture | CR (Candidate Recommendation)
WoT Thing Description | CR (Candidate Recommendation)
WoT Scripting API | WD (Working Draft)
WoT Binding Templates | Working Group Note
WoT Security and Privacy Considerations | Working Group Note

This series starts with the WoT standards themselves and gives a quick walkthrough of WoT Architecture and WoT Thing Description, both currently at the CR stage (see the figure below for the W3C standardization stages), and of the WoT Scripting API.

As the figure below shows, a standard that has reached the CR stage is relatively stable, a WD still carries considerable uncertainty, and a Working Group Note can change a great deal. The Architecture and Thing Description specifications at the CR stage are therefore worth the time to understand, since they have a good chance of becoming official Recommendations (REC). The Scripting API, still at the WD stage, recently (28 October 2019) went through a major overhaul that almost completely scrapped the previous version. It is not yet close to a steady state, but programming APIs are always something developers love, so this series will cover it as well.

W3C Process Document, www.w3.org/2019/Proces…

  1. WoT architecture

The core of the WoT Architecture specification describes WoT-related terms and their relationships from seven aspects. The essence of an architecture is its terms and their relationships. Terms stand for agreed concepts and form the skeleton and muscle of the architecture; relationships describe how the terms interact and what roles they play, and form its blood and nerves.

  • Overview
  • Affordances
  • Web Thing
  • Interaction Model
  • Hypermedia Controls
  • Protocol Bindings
  • WoT System Components and Their Interconnectivity

  2. Overview

This part defines, at a high level, the three basic concepts of Thing, Consumer, and Intermediary and how they relate to each other.

  • Thing: an abstract representation of a physical or virtual entity, such as a device or a room, described by standardized metadata. In W3C WoT, this descriptive metadata must be a WoT Thing Description (TD).
  • Consumer: must be able to parse and process JSON-based TDs.
  • Intermediary: can act as a proxy for a Thing. In that case the Intermediary has a TD similar to that of the Thing behind it, but pointing to the WoT interface provided by the Intermediary. An Intermediary can also add capabilities to the Things behind it, or combine several available Things into a virtual entity. To a Consumer, an Intermediary looks like a Thing because it has a TD and provides a WoT interface.
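
To make the Consumer requirement concrete, here is a minimal sketch of a Consumer-side step that parses a JSON-based TD and lists what the Thing offers. The TD below is a made-up example (the id, title, and example.com URLs are assumptions for illustration), and the helper name `summarize_td` is hypothetical rather than part of any WoT API.

```python
import json

# Hypothetical TD for illustration only; id, title, and URLs are invented.
TD_JSON = """
{
  "@context": "https://www.w3.org/2019/wot/td/v1",
  "id": "urn:example:lamp-1",
  "title": "ExampleLamp",
  "securityDefinitions": {"nosec_sc": {"scheme": "nosec"}},
  "security": ["nosec_sc"],
  "properties": {
    "status": {
      "type": "string",
      "forms": [{"href": "https://lamp.example.com/status"}]
    }
  },
  "actions": {
    "toggle": {"forms": [{"href": "https://lamp.example.com/toggle"}]}
  },
  "events": {
    "overheating": {
      "data": {"type": "string"},
      "forms": [{"href": "https://lamp.example.com/oh"}]
    }
  }
}
"""


def summarize_td(td_text: str) -> dict:
    """Parse a TD document and return the names of its interaction affordances."""
    td = json.loads(td_text)
    return {
        "title": td["title"],
        "properties": sorted(td.get("properties", {})),
        "actions": sorted(td.get("actions", {})),
        "events": sorted(td.get("events", {})),
    }


summary = summarize_td(TD_JSON)
print(summary)
```

A real Consumer would also validate the TD and configure a protocol stack from its forms; this sketch stops at parsing.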

Direct interaction between a Consumer and a Thing is the simplest and most intuitive arrangement:

A Thing's TD can contain links to other Things or resources:

A typical problem in real deployments is that a local network cannot be reached from the Internet, usually because of IPv4 Network Address Translation (NAT) or firewalls. To solve this problem, W3C WoT allows an Intermediary between Things and Consumers.

The above concepts apply to all layers of IoT applications: the device layer, the edge layer, and the cloud layer. Applying them at these layers helps create common interfaces and APIs across layers, and enables different interaction patterns between IoT applications, such as Thing-to-Thing, Thing-to-gateway, Thing-to-cloud, gateway-to-cloud, and even cloud federation (the interconnection of cloud computing environments between two or more service providers).

  3. Affordances

Affordance is a term borrowed from cognitive psychology, used “to describe the perceived and actual properties of something, and specifically the fundamental properties that determine how it could be used” (Donald Norman). For example, if you see a flat plate on a door, you know that you need to push it.

(see: communitywiki.org/wiki/WhatIs…).

Affordances describe what a Thing can do, and they are declared in its TD. WoT classifies the following four concepts as affordances:

  • Navigation
  • Properties
  • Actions
  • Events

Navigation uses hyperlinks. The descriptions attached to a hyperlink, such as the link relation type and the link target attributes, are affordances that tell Web clients how to navigate and how to handle the linked resource. In other words, links provide navigation affordances.

Properties, Actions, and Events are described using another hypermedia control, the form. Forms enable more complex interactions with Things than hyperlinks do.
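
The split between links (navigation) and forms (Properties, Actions, Events) can be sketched by walking a TD and listing every hypermedia control it contains. The TD fragment, the example.com URLs, and the function name `list_controls` are assumptions made for this illustration.

```python
# Hypothetical TD fragment; the URLs and the link relation are invented for illustration.
td = {
    "title": "ExampleLamp",
    "links": [{"rel": "controlledBy", "href": "https://hub.example.com/td"}],
    "properties": {
        "status": {"forms": [{"href": "https://lamp.example.com/status"}]}
    },
    "actions": {
        "toggle": {"forms": [{"href": "https://lamp.example.com/toggle"}]}
    },
    "events": {},
}


def list_controls(td: dict) -> list:
    """Return (affordance kind, name, control type, target) for each control in a TD."""
    controls = []
    # Navigation affordances are expressed as links.
    for link in td.get("links", []):
        controls.append(("navigation", link.get("rel", ""), "link", link["href"]))
    # Properties, Actions, and Events are expressed as forms.
    for kind in ("properties", "actions", "events"):
        for name, affordance in td.get(kind, {}).items():
            for form in affordance.get("forms", []):
                controls.append((kind, name, "form", form["href"]))
    return controls


controls = list_controls(td)
for control in controls:
    print(control)
```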

  4. Web Thing

A Web Thing has four architectural aspects:

  • Behavior
  • Interaction Affordances
  • Security Configuration
  • Protocol Bindings

As shown in the figure below, every aspect except behavior (that is, interaction affordances, security configuration, and protocol bindings) needs to be reflected in the TD. Behavior here can be understood as scenario-specific control logic built on top of the other three aspects.

The interaction affordances described in the TD give Consumers an interaction model that does not depend on a specific network protocol or data encoding. The mapping from interaction affordances to protocol-specific messages is provided by protocol bindings. In general, different protocols may be used to support different interaction affordances. The security configuration of a Thing represents the access control mechanisms for its interaction affordances and the management of the associated public or private metadata.

Architecture of a Thing

  5. Interaction model

As mentioned earlier, in addition to navigation, WoT formalizes three interaction affordances:

  • Properties
  • Actions
  • Events

A Property is an interaction affordance that exposes the state of a Thing. The state exposed by a Property must be retrievable (readable). Optionally, it can also be updated (writable). A Thing can make a Property observable by pushing the new state after each change (see Observing Resources [RFC7641]). Write-only state should be updated through an Action instead.

Examples of Properties are sensor values (read-only), stateful actuators (read-write), configuration parameters (read-write), Thing status (read-only or read-write), and computation results (read-only).

An Action is an interaction affordance that invokes a function of the Thing. An Action can manipulate state that is not directly exposed (compare Properties), manipulate several Properties at once, or manipulate Properties based on internal logic (for example, toggling). Invoking an Action can also trigger a process on the Thing that manipulates state over time, including physical state changed through an actuator.

Examples of Actions include changing several Properties at once, changing Properties over time (such as dimming a light) or through a process that should not be disclosed (such as a proprietary control loop algorithm), and starting a long-running process (such as printing a document).

An Event is an interaction affordance that describes an event source which asynchronously pushes data from the Thing to the Consumer. What is communicated is not state but state transitions (that is, events). Events can be triggered by conditions that are not exposed as Properties.

Examples of Events include discrete events such as alarms, and samples of a time series that are sent at regular intervals.

Properties, Actions, and Events can contain additional data schemas if the data in the interaction is not fully specified by the protocol binding (through the media type).
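
The three interaction affordances can be illustrated with a toy in-memory model. This is only a sketch of the semantics described above, not a WoT runtime: the class `LampModel`, its method names, and the 80-degree threshold are all assumptions invented for the example.

```python
from typing import Any, Callable, Dict, List

Listener = Callable[[Any], None]


class LampModel:
    """Toy Thing with two Properties, one Action, and one Event (illustrative only)."""

    def __init__(self) -> None:
        self._state: Dict[str, Any] = {"on": False, "brightness": 0}
        self._writable = {"brightness"}  # "on" is read-only; it changes via the toggle Action
        self._observers: Dict[str, List[Listener]] = {}
        self._subscribers: Dict[str, List[Listener]] = {}

    # Properties: always readable, optionally writable, optionally observable.
    def read_property(self, name: str) -> Any:
        return self._state[name]

    def write_property(self, name: str, value: Any) -> None:
        if name not in self._writable:
            raise PermissionError(f"property {name!r} is read-only")
        self._set(name, value)

    def observe_property(self, name: str, callback: Listener) -> None:
        self._observers.setdefault(name, []).append(callback)

    # Action: changes state based on internal logic.
    def invoke_action(self, name: str) -> Any:
        if name != "toggle":
            raise KeyError(name)
        self._set("on", not self._state["on"])
        return self._state["on"]

    # Event: pushed asynchronously; here triggered by a condition that is not a Property.
    def subscribe_event(self, name: str, callback: Listener) -> None:
        self._subscribers.setdefault(name, []).append(callback)

    def sense_temperature(self, celsius: float) -> None:
        if celsius > 80:  # hypothetical threshold
            for callback in self._subscribers.get("overheating", []):
                callback({"celsius": celsius})

    def _set(self, name: str, value: Any) -> None:
        self._state[name] = value
        for callback in self._observers.get(name, []):
            callback(value)  # push the new state to observers


lamp = LampModel()
observed, alerts = [], []
lamp.observe_property("on", observed.append)
lamp.subscribe_event("overheating", alerts.append)
lamp.invoke_action("toggle")
lamp.write_property("brightness", 40)
lamp.sense_temperature(95.0)
```

Here the temperature is deliberately kept out of the readable state, to show an Event triggered by a condition that is not exposed as a Property.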

  6. Hypermedia controls

W3C WoT uses two types of hypermedia controls: the well-known Web links [RFC8288] used for navigating the Web, and the more powerful Web forms, which can support arbitrary operations. Links are already used in other IoT standards and platforms, such as CoRE Link Format [RFC6690], OMA LWM2M [LWM2M], and OCF [OCF]. Forms are a newer concept here; aside from W3C WoT, the IETF's Constrained RESTful Application Language (CoRAL) also uses them.

6.1 Links

Links let Consumers (or Web clients more generally) change the current context (the set of resource representations currently rendered by a Web browser) or include additional resources in the current context, depending on the relation between the context and the link target. The Consumer does this by dereferencing the target URI, that is, by following the link to fetch the resource representation.

W3C WoT follows the definition of a Web link in [RFC8288], which means that a link consists of:

  • Link context
  • Relation type
  • Link target
  • Target attributes (optional)

In WoT, links are used for discovery and to express relations between Things, such as hierarchical or functional relations, as well as relations between Things and other documents on the Web, such as manuals or alternative representations like CAD models.
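
The four parts of a link listed above map naturally onto a small data structure. The sketch below models them and serializes a link in the style of an RFC 8288 Link header value, where the context is implicit (it is the resource that carries the header). The class name `WebLink` and the example URLs are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class WebLink:
    context: str                 # link context: the resource the link starts from
    relation: str                # relation type, e.g. "alternate"
    target: str                  # link target URI
    target_attributes: Dict[str, str] = field(default_factory=dict)  # optional

    def to_header_value(self) -> str:
        """Serialize as a Link header value; the context is implied by the carrier."""
        parts = [f"<{self.target}>", f'rel="{self.relation}"']
        parts += [f'{key}="{value}"' for key, value in self.target_attributes.items()]
        return "; ".join(parts)


# Hypothetical example: a Thing's TD pointing to its manual.
manual = WebLink(
    context="https://lamp.example.com/td",
    relation="alternate",
    target="https://lamp.example.com/manual.pdf",
    target_attributes={"type": "application/pdf"},
)
print(manual.to_header_value())
```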

6.2 Forms

Forms let Consumers (or Web clients more broadly) perform operations that go beyond dereferencing a URI, such as manipulating the state of a Thing. The Consumer does this by filling in the form and submitting it to the submission target. This usually requires more detail about the request message (such as the method, header fields, or other protocol options) than a link can carry. A form can be thought of as a request template: the provider pre-fills some of the information based on its interface and state, and the remaining fields are filled in by the Consumer (or Web client).

W3C WoT defines forms as a new hypermedia control. Note that the definition in CoRAL [CoRAL] is similar and can be considered consistent. A form consists of:

  • Form context
  • Operation type
  • Submission target
  • Request method
  • Form fields (optional)

A form can be read as the statement “to perform an <operation type> operation on <form context>, make a <request method> request to <submission target>”, with the optional form fields further describing the request.
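
The request-template idea can be sketched as follows: the provider supplies the form, and the Consumer fills in the payload to obtain a concrete request. The `Form` class, the `fill_in` and `describe` helpers, and the example URL and header are assumptions invented for this illustration; no network call is made.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Form:
    context: str                 # form context: the affordance the form belongs to
    op: str                      # operation type, e.g. "writeproperty"
    href: str                    # submission target
    method: str                  # request method
    fields: Dict[str, str] = field(default_factory=dict)  # optional form fields


def describe(form: Form) -> str:
    """Render the form as the statement it stands for."""
    return (f"To perform a {form.op} operation on {form.context}, "
            f"make a {form.method} request to {form.href}")


def fill_in(form: Form, payload: Optional[Any] = None) -> Dict[str, Any]:
    """Combine the provider's template with the Consumer's payload."""
    request: Dict[str, Any] = {
        "method": form.method,
        "url": form.href,
        "headers": dict(form.fields),
    }
    if payload is not None:
        request["body"] = json.dumps(payload)
    return request


# Hypothetical form for writing a "brightness" Property over HTTP.
form = Form(
    context="property brightness",
    op="writeproperty",
    href="https://lamp.example.com/brightness",
    method="PUT",
    fields={"Content-Type": "application/json"},
)
request = fill_in(form, 40)
print(describe(form))
print(request)
```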

Common WoT operation types

Operation type | Description
readproperty | Identifies the read operation on a Property affordance, used to retrieve the corresponding data
writeproperty | Identifies the write operation on a Property affordance, used to update the corresponding data
observeproperty | Identifies the observe operation on a Property affordance, used to receive a notification with the new data whenever the Property is updated
unobserveproperty | Identifies the unobserve operation on a Property affordance, used to stop the corresponding notifications
invokeaction | Identifies the invoke operation on an Action affordance, used to perform the corresponding action
subscribeevent | Identifies the subscribe operation on an Event affordance, used to receive a notification from the Thing when the event occurs
unsubscribeevent | Identifies the unsubscribe operation on an Event affordance, used to stop the corresponding notifications
readallproperties | Identifies the read-all-properties operation on a Thing, used to retrieve the data of all Properties in a single interaction
writeallproperties | Identifies the write-all-properties operation on a Thing, used to update the data of all Properties in a single interaction
readmultipleproperties | Identifies the read-multiple-properties operation on a Thing, used to retrieve the data of selected Properties in a single interaction
writemultipleproperties | Identifies the write-multiple-properties operation on a Thing, used to update the data of selected Properties in a single interaction
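
As a quick cross-check of the table, the operation types can be grouped by the kind of affordance they concern. The grouping labels below are an assumption of this sketch (the "all" and "multiple" operations act on the Thing as a whole but deal with Properties).

```python
from collections import Counter

# Operation types from the table above, grouped by the affordance kind they concern.
OPERATION_TYPES = {
    "readproperty": "property",
    "writeproperty": "property",
    "observeproperty": "property",
    "unobserveproperty": "property",
    "readallproperties": "property",
    "writeallproperties": "property",
    "readmultipleproperties": "property",
    "writemultipleproperties": "property",
    "invokeaction": "action",
    "subscribeevent": "event",
    "unsubscribeevent": "event",
}


def operations_for(kind: str) -> list:
    """Return the operation types that concern the given affordance kind."""
    return sorted(op for op, k in OPERATION_TYPES.items() if k == kind)


counts = Counter(OPERATION_TYPES.values())
print(dict(counts))
print(operations_for("event"))
```
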

  7. Protocol bindings

A protocol binding is the mapping from an interaction affordance to concrete messages of a specific protocol, such as HTTP [RFC7231], CoAP [RFC7252], or MQTT [MQTT]. It tells the Consumer how to activate the interaction affordance through a network-facing interface. To support interoperability, protocol bindings follow the uniform interface constraint of REST [REST].

WoT constrains protocol bindings in the following ways.

7.1 Hypermedia-driven

An interaction affordance must include one or more protocol bindings. Protocol bindings must be serialized as hypermedia controls that describe how to activate the interaction affordance. The hypermedia controls must be provided by the authority of the Thing that offers the affordance. The authority can be the Thing itself, either producing the TD document at run time (based on its current state and including network parameters such as its IP address) or serving it from memory (with only the current network parameters filled in). The authority can also be an external entity (such as a software stack) that has full and up-to-date knowledge of the Thing, including its network parameters and internal structure. This keeps Things and Consumers loosely coupled, so that they can have separate life cycles and evolve independently. Hypermedia controls can be cached outside the Thing and used offline, provided caching metadata is available.

7.2 URIs

Protocols that are W3C WoT compliant must have a URI scheme registered with IANA ([RFC4395]). Hypermedia controls rely on URIs to identify link targets and submission targets. Thus, the URI scheme (the first component, before the colon) identifies the protocol used to communicate with the Thing's interaction affordances. W3C WoT calls these protocols transfer protocols.
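
Since the URI scheme identifies the transfer protocol, a Consumer can pick a protocol stack by looking at the scheme of a form's target. The sketch below does this with Python's standard `urllib.parse`; the scheme table is a small illustrative subset and the function name `transfer_protocol` is an assumption of this example.

```python
from urllib.parse import urlsplit

# Illustrative subset of URI schemes and the protocols they stand for.
SCHEME_TO_PROTOCOL = {
    "http": "HTTP",
    "https": "HTTP over TLS",
    "coap": "CoAP",
    "coaps": "CoAP over DTLS",
}


def transfer_protocol(href: str) -> str:
    """Return the transfer protocol implied by the scheme of a form or link target."""
    scheme = urlsplit(href).scheme  # urlsplit normalizes the scheme to lower case
    if scheme not in SCHEME_TO_PROTOCOL:
        raise ValueError(f"no protocol stack configured for scheme {scheme!r}")
    return SCHEME_TO_PROTOCOL[scheme]


print(transfer_protocol("coap://lamp.example.com/status"))
print(transfer_protocol("https://lamp.example.com/status"))
```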

7.3 Standard Methods

A W3C WoT compliant protocol must be based on a well-known, standard set of methods. A standard set of methods makes messages self-descriptive, so that intermediaries can process them, for example through proxies or by translating from one protocol binding to another [REST]. It also lets Consumers reuse a generic protocol stack, such as HTTP, CoAP, or MQTT, and avoids the need for Thing-specific code or plug-ins in the Consumer.

7.4 Media Types

All data (content) exchanged when activating interaction affordances must be identified by a media type [RFC6838] in the protocol binding. The media type indicates the content format, for example application/json for JSON [RFC8259] or application/cbor for CBOR [RFC7049]. Media types are managed by IANA.

Some media types need additional parameters to fully specify the format in use, for example text/plain; charset=utf-8 or application/ld+json; profile="http://www.w3.org/ns/json-ld#compacted". This needs particular attention when describing data to be sent to a Thing. Standard transformations of the data, such as content coding [RFC7231], may also be needed during transfer. A protocol binding can carry additional information that specifies the content format in more detail than the media type alone.

Note that many media types only identify a generic serialization format, such as XML, JSON, or CBOR, and say little about the semantics of the content. The corresponding interaction affordance should therefore declare a data schema that provides more specific, syntax-level metadata for the data exchanged.
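
A Consumer usually has to separate the media type from its parameters before choosing a decoder. The following sketch does that with the standard library's `email.message.Message`, which understands this header syntax; the helper name `parse_media_type` is an assumption of this example.

```python
from email.message import Message
from typing import Dict, Tuple


def parse_media_type(value: str) -> Tuple[str, Dict[str, str]]:
    """Split a media type string into (type/subtype, parameters)."""
    msg = Message()
    msg["Content-Type"] = value
    # get_params() returns the type itself as the first pair, so skip it.
    params = dict(msg.get_params()[1:])
    return msg.get_content_type(), params


print(parse_media_type("text/plain; charset=utf-8"))
print(parse_media_type(
    'application/ld+json; profile="http://www.w3.org/ns/json-ld#compacted"'
))
```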

  8. System components and their interconnectivity

The overview described the WoT architecture in terms of abstract architecture components: Things, Consumers, and Intermediaries. When these abstract components are implemented as a software stack that takes on a specific role in the WoT architecture, that stack is called a Servient.

This section uses diagrams to illustrate how Servients work together to form systems based on the WoT architecture.

First, a Thing can be implemented by a Servient. In a Thing, the Servient software stack contains a representation of the Thing, called the Exposed Thing, and makes its WoT interface available to the Consumers of the Thing. The Exposed Thing can be used by other software components on the Servient to implement the behavior of the Thing.

Servient

Second, Consumers are always implemented by Servients, because they must be able to process TDs and must have a protocol stack that can be configured with the protocol binding information contained in the TDs.

In a Consumer, the Servient software stack provides a representation of a Thing, called the Consumed Thing, and makes it available to applications running on the Servient that need to process TDs in order to interact with Things.

Consumer Servient

The Consumed Thing instance in the Servient software stack separates protocol-level complexity from the application: it communicates with Exposed Things on behalf of the application.

Finally, an Intermediary is also a WoT architecture component implemented by a Servient. The Intermediary sits between a Thing and its Consumers, and plays the roles of both a Consumer (towards the Thing) and a Thing (towards the Consumers). In an Intermediary, the Servient software stack therefore contains both a Consumer side (Consumed Thing) and a Thing side (Exposed Thing).

Intermediary Servient

8.1 Direct Communication

The figure below shows direct communication between a Thing and a Consumer. The Thing exposes its interaction affordances through a TD, and the Consumer uses the Thing through those interaction affordances. Direct communication applies when both Servients use the same network protocol and can reach each other.

A Consumer communicating directly with a Thing

An Exposed Thing is the software representation of a Thing abstraction; it serves the WoT interface of the interaction affordances that the Thing provides.

A Consumed Thing is the software representation of a remote Thing being consumed by a Consumer, and serves as the interface through which an application accesses the remote Thing. A Consumer can generate a Consumed Thing instance by parsing and processing a TD. The interaction between the Consumer and the Thing is carried out by the Consumed Thing and the Exposed Thing exchanging data over a direct connection.
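
The relationship between the two representations can be sketched in a few lines. In this toy version the "network" is replaced by a direct function call, and only reading a Property is supported; the class names mirror the terms above, but the code, its method names, and the example TD are assumptions for illustration and not the WoT Scripting API.

```python
from typing import Any, Callable, Dict


class ExposedThing:
    """Thing side: holds the state and answers requests addressed to its forms."""

    def __init__(self, td: Dict[str, Any], state: Dict[str, Any]) -> None:
        self.td = td
        self._state = dict(state)

    def handle_read(self, href: str) -> Any:
        # Find the Property whose form points at the requested target.
        for name, prop in self.td["properties"].items():
            if any(form["href"] == href for form in prop["forms"]):
                return self._state[name]
        raise LookupError(f"no property served at {href}")


class ConsumedThing:
    """Consumer side: built from a TD, hides protocol details from the application."""

    def __init__(self, td: Dict[str, Any], transport: Callable[[str], Any]) -> None:
        self._td = td
        self._transport = transport  # stands in for a real protocol stack

    def read_property(self, name: str) -> Any:
        form = self._td["properties"][name]["forms"][0]
        return self._transport(form["href"])


# Hypothetical TD; the URL is invented for illustration.
td = {
    "title": "ExampleLamp",
    "properties": {
        "status": {"forms": [{"href": "https://lamp.example.com/status"}]}
    },
}
exposed = ExposedThing(td, {"status": "on"})
consumed = ConsumedThing(td, transport=exposed.handle_read)
print(consumed.read_property("status"))
```

The application only calls `read_property`; the lookup of the form and the exchange with the Exposed Thing stay inside the Consumed Thing.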

8.2 Indirect Communication

The figure below shows a Consumer and a Thing connected through an Intermediary. If the Servients that implement the Consumer and the Thing use different protocols, or if they are on different networks that require authentication for access, they must communicate through an Intermediary.

Communicating through an Intermediary

An Intermediary combines the functions of an Exposed Thing and a Consumed Thing. Its functions include relaying messages for the interaction affordances between the Consumer and the Thing, optionally caching the Thing's data for faster responses, and transforming the communication when the Intermediary extends the Thing's functionality. In the Intermediary, a Consumed Thing creates a proxy object for the Thing's Exposed Thing, and the Consumer accesses that proxy object (the Exposed Thing of the Intermediary) through its own Consumed Thing.

The protocol the Intermediary uses to communicate with the Consumer may differ from the one it uses to communicate with the Thing. For example, an Intermediary can bridge between a Thing that uses CoAP and a Consumer that uses HTTP.

Even if several different protocols are used between the Intermediary and the Things, the Consumer can communicate with all of those Things through the Intermediary using a single protocol. The same goes for authentication. The Consumed Thing of a Consumer needs only one security mechanism to authenticate with the Exposed Things of the Intermediary, while the Intermediary may need several security mechanisms to authenticate with different Things.

Typically, an Intermediary creates the TD of its proxy object based on the TD of the original Thing. Depending on the requirements of the use case, the TD of the proxy object can use the same identifier as the original TD or be assigned a new one. If necessary, a TD generated by the Intermediary can contain interfaces that support other communication protocols.
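
One way an Intermediary might derive a proxy TD is to copy the original TD and rewrite every form target so that it points at the Intermediary, while remembering where each request must be forwarded. This is a sketch under stated assumptions: the function name `make_proxy_td`, the URL layout under the gateway, and the example hosts are all invented, and it assumes a single form per affordance.

```python
import copy
from typing import Any, Dict, Optional, Tuple


def make_proxy_td(
    original_td: Dict[str, Any],
    gateway_base: str,
    new_id: Optional[str] = None,
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Return (proxy TD, routing table from public URL to original target)."""
    proxy = copy.deepcopy(original_td)  # never modify the original TD
    if new_id is not None:
        proxy["id"] = new_id            # optionally assign a new identifier
    routes: Dict[str, str] = {}
    for kind in ("properties", "actions", "events"):
        for name, affordance in proxy.get(kind, {}).items():
            for form in affordance.get("forms", []):
                public_url = f"{gateway_base}/{kind}/{name}"
                routes[public_url] = form["href"]  # where to forward requests
                form["href"] = public_url          # what Consumers will see
    return proxy, routes


# Hypothetical Thing that speaks CoAP on a local network.
original = {
    "id": "urn:example:lamp-1",
    "title": "ExampleLamp",
    "properties": {
        "status": {"forms": [{"href": "coap://192.0.2.10/status"}]}
    },
}
proxy_td, routes = make_proxy_td(original, "https://gateway.example.com/lamp-1")
print(proxy_td["properties"]["status"]["forms"][0]["href"])
print(routes)
```

With this layout the Consumer only ever sees HTTPS targets on the gateway, while the Intermediary uses the routing table to reach the CoAP Thing behind it.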

  9. Summary

This article, the first part of the W3C Universal IoT Standard Parsing series, focused on the core concepts of the WoT architecture. The essence of an architecture is its concepts and their relationships. The WoT architecture describes many concepts, including Things, Intermediaries, Consumers, affordances (Properties, Actions, Events), hypermedia controls (links, forms), Servients, Exposed Things, Consumed Things, and more. A Thing is an abstraction of a physical or virtual device and represents server-side functionality. A Consumer is an abstraction of a WoT application and represents client-side functionality. An Intermediary is an abstraction of both a Thing and an application. The implementations of Things, Consumers, and Intermediaries are called Servients, and they communicate either directly or indirectly.

At the heart of the WoT architecture is the specification of the Thing Description (TD), which covers interaction affordances, security configuration, and protocol bindings. A Thing can have four kinds of affordances: navigation, Properties, Actions, and Events, all provided through hypermedia controls (links, forms). This is called the interaction model. The WoT architecture currently defines 11 common operation types, eight of which are Property-related, one Action-related, and two Event-related.

A protocol binding is the mapping from interaction affordances to protocol-specific messages. The security configuration represents the access control mechanisms for interaction affordances.

  10. Further reading

For details on the WoT Working Group that the W3C set up to develop the WoT standards, see the following:

  • “Introduction to W3C Universal IoT Standard”
  • An answer by Zhihu user “yao to jung”: www.zhihu.com/question/26…

  11. Links

  • WoT Architecture: www.w3.org/TR/wot-arch…
  • WoT Thing Description: www.w3.org/TR/wot-thin…
  • WoT Scripting API: www.w3.org/TR/wot-scri…
  • WoT Binding Templates: www.w3.org/TR/wot-bind…
  • WoT Security and Privacy Considerations: www.w3.org/TR/wot-secu…
  • WoT Interest Group: www.w3.org/2019/07/wot…
  • WoT Working Group: www.w3.org/2016/12/wot…