This specification defines the syntax and semantics of a behavior definition language. A service may send instructions encoded in this format for clients such as 3D game engines to realize. These messages include instructions about positioning objects, activating or deactivating systems, changing scenes or contexts, controlling visual properties, animating characters, and playing audio.
The message format outlined in this specification allows for fine-grained control over the execution of agents in a virtual or live environment. JSON is one possible serialization for these messages.
In progress...
[Exposed] interface Behavior { attribute long id; attribute DOMString subject; attribute DOMString action; attribute Params params; attribute double delay; attribute Dictionary start; attribute SequenceOfStrings cc; };
typedef sequence<DOMString> SequenceOfStrings;
The id attribute specifies a unique identifier for a Behavior within its block. Note that ids are not unique across different blocks.
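As a non-normative illustration, a client might validate this rule with a check like the following sketch, which assumes a block is a plain list of Behavior objects decoded from JSON.

```python
# Sketch: ids must be unique within a block, but the same id may be
# reused in a different block. Helper name is illustrative only.

def ids_unique_within_block(block):
    """Return True if no id appears twice in the given block."""
    ids = [b["id"] for b in block if "id" in b]
    return len(ids) == len(set(ids))

block_a = [{"id": 0, "subject": "goose", "action": "say"},
           {"id": 1, "subject": "fox", "action": "animate"}]
block_b = [{"id": 0, "subject": "pig", "action": "say"}]  # id 0 reused: allowed
```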
The subject attribute is the name of the agent that is performing an action. In grammar, this is often called the subject of a verb. Examples include:
"boy"
And even abstract nouns such as the following.
"system"
Each subject affords a set of actions. The action attribute specifies one of those available actions. For example, if the subject is "boy", a possible action may look as follows.
"say"
And if the subject is "system", a possible action may be an abstract concept as follows.
"initialize"
The params attribute specifies the data structure containing all data to be received and manipulated by the client. See Params below.
[Exposed] interface Params { attribute String intent; attribute String ssml; attribute String name; attribute String url; attribute Phonemes phonemes; attribute String language; attribute Dictionary context; };
The intent attribute specifies what a user is trying to accomplish. An intent is specified within a String data type.
"statement.i-can-help"
The ssml attribute specifies the speech synthesis markup language for the specified intent. The ssml is specified within a String data type.
"I can help you with that!"
The name attribute specifies the name of the audio file to be played.
"GO_0020"
The url attribute specifies the source location of the audio file to be played. The url attribute is specified within a String data type.
"http://audioURL..."
The language attribute specifies the language in which any text or audio will be communicated. The language is specified by the three-letter code defined in ISO 639.
The following example shows the three-letter code for the English language.
"eng"
The phonemes attribute is an object containing the phonetic translation and total number of frames belonging to the corresponding ssml text. See Phonemes below.
{ "segments": [ { "phonemeLabel": "IY", "startFrame": 3 } ], "framecount": 202 }
typedef sequence< Dictionary > dictionarySequence; [Exposed] interface Phonemes { attribute dictionarySequence segments; attribute long framecount; };
A custom typedef was defined such that a sequence of Dictionarys corresponds to a dictionarySequence. Each Dictionary object within the segments attribute contains a pair of key:value attributes. The first item in each nested Dictionary is specified by a String:String pairing denoting a phonetic label. The second Dictionary item corresponds to a String:double pairing denoting the start frame of the phonetic label.
{ "phonemeLabel": "IY", "startFrame": 3 }
The framecount attribute specifies the total number of feature frames of the transcribed ssml text.
202
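As a non-normative sketch, a client could convert the phonemes payload into timestamps for lip synchronization. The frame rate below is a hypothetical assumption for illustration; this specification does not define one.

```python
# Sketch: turn the phonemes payload into (label, start-time) pairs and a
# total duration. FRAMES_PER_SECOND is an assumption, not part of the spec.

FRAMES_PER_SECOND = 100.0

def phoneme_times(phonemes):
    """Map each phoneme label to its start time in seconds."""
    return [(seg["phonemeLabel"], seg["startFrame"] / FRAMES_PER_SECOND)
            for seg in phonemes["segments"]]

def total_duration(phonemes):
    """Total duration of the transcribed utterance in seconds."""
    return phonemes["framecount"] / FRAMES_PER_SECOND

payload = {"segments": [{"phonemeLabel": "IY", "startFrame": 3}],
           "framecount": 202}
```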
The context attribute specifies a Dictionary as defined in [[WebIDL-1]]. The dictionary members are specified in key:value pairs where both key and value are of type String.
{ "category": "colors", "gamestate" : "1", "target" : "blue" }
The delay attribute specifies the number of seconds to wait before performing the corresponding action, specified as a double data type.
0.0
The start attribute is an optional Dictionary argument that specifies when the Behavior should begin its execution, specified by a trigger and an id.
The trigger attribute specifies the point at which the Behavior corresponding to the specified id should begin its execution. The trigger value may be "start" or "end".
The id key specifies the Behavior that will be affected by the corresponding trigger value.
0
{ "trigger" : "end", "id": 0 }
In defining the cc attribute, a custom typedef was used such that a sequence of Strings corresponds to a SequenceOfStrings data type. The String data type defined in [[WebIDL-1]] is used. The cc attribute is then specified by a sequence of DCMP-standard-abiding Strings, containing the textual representation of any corresponding events within the message.
The following example describes the closed captioning for the event where a character named Frog excitedly says 'Hi there!'.
[ "[Frog excited]", "Hi there!" ]
The following example describes the closed captioning for the event where a character named Pig says 'Welcome back!' in a Southern Accent.
[ "[Pig Southern Accent]", "Welcome back!" ]
The following example shows the closed captioning for the event where three characters are simultaneously cheering.
"[Frog, Pig, Goose cheering]"
The following example shows the closed captioning for the event where a Cow is mooing.
"[Cow moos]"
In order to synchronize behavior (especially the "say" and "animate" actions), we propose to use the "message block". A message block is an ordered sequence of Behavior objects, where each object conforms to the Behavior message format specified within this document. The server sends sequences of blocks, and the client (e.g. Unity) must fully process one block of Behaviors before reading the next block. While processing a single block of Behaviors, the client should schedule Behaviors by putting them on "tracks" (every action of every object has a track); then all the tracks start simultaneously.
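The block-processing model above can be sketched as follows. This is a non-normative illustration: a real client would run each track asynchronously and wait for all tracks of a block to finish before reading the next block; here the callback name and data shapes are assumptions.

```python
# Sketch of the message-block model: consume one block at a time, group
# each block's Behaviors into (subject, action) tracks, and start all
# tracks of a block together before moving to the next block.

def process_blocks(blocks, run_track):
    """Process blocks in order; behaviors within a block start together."""
    for block in blocks:
        tracks = {}
        for behavior in block:
            key = (behavior["subject"], behavior["action"])
            tracks.setdefault(key, []).append(behavior)
        # All tracks of this block start simultaneously; the next block
        # is not read until every track of this one has finished.
        for key, behaviors in tracks.items():
            run_track(key, behaviors)

started = []
process_blocks(
    [[{"subject": "goose", "action": "say"},
      {"subject": "fox", "action": "animate"}],
     [{"subject": "fox", "action": "animate"}]],
    lambda key, behaviors: started.append(key))
```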
Suppose a humanoid agent named goose stands near a landmark labelled Landmark_1.
Goose walks towards a landmark named Landmark_1.
[ { "subject": "goose", "action": "do", "params": { "name": "walk", "context": ["Landmark_1"] } } ]
Goose points at a landmark named Landmark_1.
[ { "subject": "goose", "action": "do", "params": { "name": "point", "context": ["Landmark_1", "enable"] } } ]
Goose points at a landmark named Landmark_1, and stops after 2 seconds.
[ { "subject": "goose", "action": "do", "params": { "name": "point", "context": ["Landmark_1", "enable"] } }, { "subject": "goose", "action": "do", "params": { "name": "point", "context": ["Landmark_1", "disable"] }, "delay": 2.0 } ]
Goose looks at a landmark named Landmark_1.
[ { "subject": "goose", "action": "do", "params": { "name": "look", "context": ["Landmark_1", "enable"] } } ]
Goose looks at a landmark named Landmark_1, and stops after 2 seconds.
[ { "subject": "goose", "action": "do", "params": { "name": "look", "context": ["Landmark_1", "enable"] } }, { "subject": "goose", "action": "do", "params": { "name": "look", "context": ["Landmark_1", "disable"] }, "delay": 2.0 } ]
Goose turns its body to a landmark named Landmark_1.
[ { "subject": "goose", "action": "do", "params": { "name": "turn", "context": ["Landmark_1"] } } ]
Goose walks towards a landmark named Landmark_1 while pointing at it.
[ { "subject": "goose", "action": "do", "params": { "name": "point", "context": ["Landmark_1", "enable"] } }, { "subject": "goose", "action": "do", "params": { "name": "walk", "context": ["Landmark_1"] } } ]
Goose walks towards a landmark named Landmark_1 while pointing at it, then stops pointing.
[ { "subject": "goose", "action": "do", "params": { "name": "point", "context": ["Landmark_1", "enable"] } }, { "subject": "goose", "action": "do", "params": { "name": "walk", "context": ["Landmark_1"] } } ]
[ { "subject": "goose", "action": "do", "params": { "name": "point", "context": ["Landmark_1", "disable"] } } ]
In the following example we examine events A and B: we want to enforce that event B can't start until event A has started.
Right when Goose starts speaking, Fox jumps.
[ { "subject": "goose", "action": "say", "params": {"name": "GO_0020" , "ssml": "I can help you with that!"} }, { "subject": "fox", "action": "animate", "params": {"name": "FX_Jump"} } ]
In this example we want to run the statements in a particular sequence. Specifically, B can't start until A has finished. In order to achieve the desired order, we place each behavior in a separate block.
Only when Goose is done speaking is the Fox able to jump.
[ { "subject": "goose", "action": "say", "params": {"name": "GO_0020", "intent": "statement.i-can-help", "ssml": "I can help you with that!"} } ]
[ { "subject": "fox", "action": "animate", "params": {"name": "FX_Jump"} } ]
In this example we show how to sequence actions together.
Goose waves hands as it starts speaking. After the hand-waving animation completes, the sound effect plays.
[ {"id": 0, "subject": "goose", "action": "animate", "params": {"name": "wave-hands"}}, {"subject": "goose", "action": "say", "params": {"name": "GO_0020", "intent": "greeting"}}, {"subject": "sound", "action": "play", "params": {"name": "sfx.boom", "intent": "sfx.boom"}, "start": {"trigger": "end", "id": 0}} ]
In this example we show how to sequence actions together.
Goose waves hands as it starts speaking. After the audio of the utterance completes, the sound effect plays.
[ {"subject": "goose", "action": "animate", "params": {"name": "wave-hands"}}, {"id": 1, "subject": "goose", "action": "say", "params": {"name": "GO_0020", "intent": "greeting"}}, {"subject": "sound", "action": "play", "params": {"name": "sfx.boom"}, "start": {"trigger": "end", "id": 1}} ]
In this example we show how to control different subjects in parallel.
Goose and fox cheer "yay!" together at the same time.
[ {"subject": "goose", "action": "say", "params": {"name": "GO_0020", "intent": "yay"}}, {"subject": "fox", "action": "say", "params": {"name": "FX_0020", "intent": "yay"}} ]
In this example we show how to control different subjects in parallel.
Goose and fox cheer together, and then jump together.
[ {"subject": "goose", "action": "say", "params": {"name": "GO_0020", "intent": "yay"}}, {"subject": "fox", "action": "say", "params": {"name": "FX_0020", "intent": "yay"}} ]
[ {"subject": "goose", "action": "animate", "params": {"name": "jump"}}, {"subject": "fox", "action": "animate", "params": {"name": "jump"}} ]
In the following examples we show how to control different subjects in parallel.
Goose and fox cheer together (goose finishes before fox). Goose's jump animation starts right after cheering. Same for Fox.
[ {"id": 0, "subject": "goose", "action": "say", "params": {"name": "GO_0020", "intent": "yay"}}, {"id": 1, "subject": "fox", "action": "say", "params": {"name": "FX_0020", "intent": "yay"}}, {"subject": "goose", "action": "animate", "params": {"name": "jump"}, "start": {"trigger": "end", "id": 0}}, {"subject": "fox", "action": "animate", "params": {"name": "jump"}, "start": {"trigger": "end", "id": 1}} ]
The Goose says 'hello' and waves a hand at the Fox. The Fox says 'hi' back (without interrupting the goose's utterance).
[ {"subject": "goose", "action": "say", "params": {"name": "GO_0020", "intent": "greeting"}}, {"id": 1, "subject": "goose", "action": "animate", "params": {"name": "wave-hands"}}, {"subject": "fox", "action": "say", "params": {"name": "FX_0020", "intent": "greeting"}, "start": {"trigger": "start", "id": 1}, "delay": 0.5} ]
The following are examples of objects and their associated behaviors. Note that the specified parameters for each of the following actions are normative.
Animate
An animate payload specifies an object to be animated by the client.
{ "subject": "goose", "action": "animate", "delay": 0, "params": { "name": "FR_Maracas_1" } }
Say
A say payload specifies what the desired subject should say.
{ "subject": "frog", "action": "say", "delay": 0, "params": { "intent": "statement.i-can-help", "ssml": "I can help you with that!", "name": "GO_0020", "url": "http://audioURL...", "phonemes": { "segments": [ { "phonemeLabel": "IY", "startFrame": 3 } ], "framecount": 202 } } }
Do
A do payload specifies the name of the action the specified target should perform.
{ "subject": "goose", "action": "do", "delay": 0, "params": { "name": "walk", "context" : ["pig"]} }
Play
A play payload specifies the audio clip the client should play.
{ "subject": "sound", "action": "play", "delay": 0, "params": { "name": "sfx.bonusround.wav" } }
Spawn
A spawn payload notifies the client to display the specified object.
{ "subject": "system", "action": "spawn", "delay": 0, "params": {"name": "frog", "context": ["character2"]} }
Destroy
A destroy payload notifies the client to remove the specified object.
{ "subject": "system", "action": "destroy", "delay": 0, "params": {"name": "frog"} }
The client may store blocks in a queue and realize the behaviors at its own pace.
In the following example, the queue is cleared by calling the clear action on a system subject.
{ "subject": "system", "action": "clear" }
Clearing the queue only removes future actions, but doesn't interrupt the current actions.
The following example shows how to immediately terminate actions that are currently being performed.
The params.ignore field is a list of subjects to ignore/skip, if desired.
{ "subject": "system", "action": "interrupt", "params": { "ignore": ["music"] } }
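A non-normative sketch of the clear and interrupt semantics described above: clear drops only queued blocks, while interrupt stops currently running actions except those whose subjects are listed in ignore. The class and attribute names here are illustrative, not part of the specification.

```python
# Sketch: "clear" empties the pending queue without touching current
# actions; "interrupt" stops current actions, skipping ignored subjects.

class BehaviorQueue:
    def __init__(self):
        self.pending = []   # blocks not yet started
        self.running = []   # behaviors currently being performed

    def clear(self):
        # Removes future actions but does not interrupt current ones.
        self.pending = []

    def interrupt(self, ignore=()):
        # Stops current actions, except those belonging to ignored subjects.
        self.running = [b for b in self.running if b["subject"] in ignore]

q = BehaviorQueue()
q.pending = [[{"subject": "goose", "action": "say"}]]
q.running = [{"subject": "music", "action": "play"},
             {"subject": "goose", "action": "animate"}]
q.clear()
q.interrupt(ignore=["music"])
```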
Clearing the queue and interrupting current actions are two behaviors that tend to be performed together. The following block shows how to send both behaviors at once.
[ { "subject": "system", "action": "clear" }, { "subject": "system", "action": "interrupt", "params": { "ignore": ["music"] } } ]
The following BML snippet simultaneously starts an animation and a speech utterance.
<bml character="Alice"> <pointing target="blueBox" mode="RIGHT_HAND" start="speech1:start"/> <speech id="speech1"> <text>Look there!</text> </speech> </bml>
The equivalent behavior can be represented as follows.
[ {"subject": "Alice", "action": "do", "params": {"name": "point", "context": ["blueBox", "RIGHT_HAND"]}}, {"subject": "Alice", "action": "say", "params": {"intent": "look"}} ]
BML documentation recommends using <wait> to align behavior with a condition or an event.
<bml character="Alice"> <gesture id="g1" type="point" target="object1"/> <body id="b1" posture="sit"/> <wait id="w1" condition="g1:end AND b1:end"/> <gaze target="object2" start="w1:end"/> </bml>
The <wait> is unnecessary since we can synchronize behaviors by placing them in different blocks.
[ {"subject": "Alice", "action": "do", "params": {"name": "point", "context": ["object1"]}}, {"subject": "Alice", "action": "do", "params": {"name": "body", "context": ["sit"]}} ] [ {"subject": "Alice", "action": "animate", "params": {"name": "gaze"}} ]
Multi-party behavior synchronization is limited in BML. It is non-trivial to have two characters say something at the same time.
<bml character="Alice"> <speech><text>Yay!</text></speech> </bml> <bml character="Bob"> <speech><text>Yay!</text></speech> </bml>
The Act message format allows fine-grained control of multi-party behaviors, with a natural syntax.
[ {"subject": "Alice", "action": "say", "params": {"intent": "yay"}}, {"subject": "Bob", "action": "say", "params": {"intent": "yay"}} ]