Spatio-Temporal Control for Masked Motion Synthesis

Anonymous

Compared to STMC for Body Part Timeline Control

MaskControl (ours)

STMC

STMC produces no "pick something" action and no "wipe" action, and the character walks to the wrong side

Compared to SOTA - Multiple Joints

a person crosses their arms for chest fly

MaskControl (ours)

OmniControl

MotionLCM

a person jumps in the air once

MaskControl (ours)

OmniControl

MotionLCM

a person walks in a circle clockwise

MaskControl (ours)

OmniControl

MotionLCM

a person walks forward and waves his hands

MaskControl (ours)

OmniControl

MotionLCM

Compared to SOTA - Pelvis Only

a person walks forward and waves his hands

MaskControl (ours)

GMD

a person dances to salsa music

MaskControl (ours)

GMD

a person walks forward and come back to the same position from where we started

MaskControl (ours)

GMD

Compared to SOTA - Zero-shot Objective Control

a man slowly leaps forward, turns around and leaps again and then repeats the forward leap.

MaskControl (ours)

ProMoGen

Dense Signals

the person draws a heart with hand

person walks down and up in a figure 8 pattern

A figure walks forward in a zig zag pattern

a person waves both his arms

someone is lifting something up

a person dances to salsa music

a person stands and waving

a person stands and bows

a man walks in a curved line with his hands at his sides

a person walks with support

a person walks

a person walks in a circle

Sparse Signals

A person walks forward with their hands up in a surrender pose

person walks over and sits down in a chair.

A person jumps and kicks a football in the air with their head

a person walks slowly

A person walks forward, casually greeting others with a wave or hello

A person walks forward and raises both arms high.

a man walks left and right

A person walks forward giving a high five

A person walks, pauses, and performs a high kick in the air.

Body Part Timeline Control

Upper Body: a person puts hands in the air.
Left Foot: a person kicks left legs.
Lower Body: a person jumps forward.
Timeline (frames): 0, 60, 120
The upper body motion is generated for frames 0 to 120 from the prompt “a person puts hands in the air.” The lower body motion is generated in two parts: frames 0 to 60 from the prompt “a person kicks left legs.” and frames 60 to 120 from the prompt “a person jumps forward.”
Upper Body: the person is bending over forward
Left Foot: shake with their left leg
Timeline (frames): 0, 60, 120
The upper body motion is generated for frames 0 to 120 from the prompt "the person is bending over forward". Simultaneously, the lower body motion is generated for frames 0 to 120 from the prompt "shake with their left leg".
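
The body part timelines above can be written down as plain data. The following is a minimal sketch, assuming half-open frame intervals and two body-part tracks. The names `TimelineSegment` and `prompts_at` are hypothetical and are not taken from MaskControl or STMC. It only shows how to look up which prompt drives each body part at a given frame, not how the motion itself is generated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimelineSegment:
    """One text prompt applied to one body part over a frame interval."""
    body_part: str  # illustrative label, e.g. "upper body" or "lower body"
    start: int      # first frame, inclusive
    end: int        # last frame, exclusive (assumption: half-open intervals)
    prompt: str


def prompts_at(timeline, frame):
    """Return {body_part: prompt} for every segment active at `frame`."""
    active = {}
    for seg in timeline:
        if seg.start <= frame < seg.end:
            active[seg.body_part] = seg.prompt
    return active


# First example above: one upper-body prompt, two consecutive lower-body prompts.
timeline = [
    TimelineSegment("upper body", 0, 120, "a person puts hands in the air."),
    TimelineSegment("lower body", 0, 60, "a person kicks left legs."),
    TimelineSegment("lower body", 60, 120, "a person jumps forward."),
]

for frame in (30, 90):
    print(frame, prompts_at(timeline, frame))
```

With this timeline, frame 30 maps the lower body to the kick prompt and frame 90 maps it to the jump prompt, while the upper body keeps the same prompt throughout.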

Zero-shot Objective Control - Obstacle Avoidance

the man walks zig zag.
the man walks forward in a straight line.

Diversity

a person walks in a circle clockwise

the man walks zig zag.