Examples

Pose Format Conversion with Sirens

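The pose_to_siren_to_pose.py script shows how to fit a Siren neural network (a sinusoidal representation network) to a 3D pose sequence and then reconstruct the pose from the network, optionally at a different frame rate.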

Note

The function follows these steps:

Fills missing data in the body with zeros.
Normalizes the pose distribution.
Converts the pose using the Siren neural network.
Constructs the new Pose with the predicted data.
Unnormalizes the pose distribution.
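
The zero-filling, normalization, and unnormalization steps can be illustrated with plain NumPy. This is a minimal sketch, not the library's implementation: the array shape (frames, people, keypoints, dimensions), the choice of axes for the statistics, and all values are assumptions made for illustration.

```python
import numpy as np
from numpy import ma

# Toy pose data with assumed shape (frames, people, keypoints, dims); values are made up.
rng = np.random.default_rng(0)
raw = rng.normal(loc=100.0, scale=20.0, size=(5, 1, 3, 2))

# Mark one keypoint in one frame as missing, the way a masked array would.
mask = np.zeros(raw.shape, dtype=bool)
mask[2, 0, 1, :] = True
data = ma.array(raw, mask=mask)

# Fill missing values with zeros.
filled = data.filled(0)

# Normalize per keypoint and dimension (assumption: statistics over frames and people).
mu = filled.mean(axis=(0, 1))
std = filled.std(axis=(0, 1))
normalized = (filled - mu) / std

# Unnormalize with the saved statistics to get back to the original scale.
restored = normalized * std + mu
```

Keeping mu and std around is what makes the last step possible: the network works on normalized values, and the saved statistics map its predictions back to the original coordinate range.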

Before you start, make sure you have a pose file and know its path. A pose file uses the standardized format in which pose_format stores 3D pose information. If you don’t have one, you need to either obtain one from a relevant dataset or convert your existing pose data into this format.

Step-by-step Guide:

Preparation

Begin by importing the necessary modules:

import numpy as np
from numpy import ma
import pose_format.utils.siren as siren
from pose_format import Pose
from pose_format.numpy import NumPyPoseBody
from pose_format.pose_visualizer import PoseVisualizer

Define the Conversion Function

The function pose_to_siren_to_pose performs the conversion. The complete script is listed in the section "Overview of pose_to_siren_to_pose.py".

def pose_to_siren_to_pose(p: Pose, fps=None) -> Pose:
    """Converts a given Pose object to its Siren representation and back to Pose."""

    # Fills missing values with 0's
    p.body.zero_filled()

    # Normalizes the pose distribution and keeps the statistics for later
    mu, std = p.normalize_distribution()

    # Use siren net
    net = siren.get_pose_siren(p, total_steps=3000, steps_til_summary=100, learning_rate=1e-4, cuda=True)

    new_fps = fps if fps is not None else p.body.fps
    coords = siren.PoseDataset.get_coords(time=len(p.body.data) / p.body.fps, fps=new_fps)

    # Get predictions of new Pose data
    pred = net(coords).cpu().numpy()

    # Construct new Body out of predictions
    pose_body = NumPyPoseBody(fps=new_fps, data=ma.array(pred), confidence=np.ones(shape=tuple(pred.shape[:3])))
    p = Pose(header=p.header, body=pose_body)

    # Revert normalization and give back the pose instance
    p.unnormalize_distribution(mu, std)
    return p

The function does the following operations:

Fills missing data in the pose body with zeros.
Normalizes the pose distribution.
Uses the Siren neural network to transform the pose.
Constructs a new Pose with the predicted data.
Reverts the normalization on the pose distribution.
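
The optional fps argument determines how densely the trained network is sampled: the clip duration stays the same, but the number of time points changes. The sketch below illustrates that relationship in plain Python. The helper time_grid is hypothetical and only mirrors the arguments passed to siren.PoseDataset.get_coords in the function above; the library's actual implementation may differ, for example in how it scales time or shapes its output.

```python
from typing import List


def time_grid(duration_s: float, fps: float) -> List[float]:
    """Hypothetical helper: timestamps (in seconds) of frames sampled at `fps`."""
    n_frames = int(duration_s * fps)
    return [i / fps for i in range(n_frames)]


# A 50-frame clip recorded at 25 fps lasts 2 seconds.
duration = 50 / 25

original = time_grid(duration, fps=25)   # one timestamp per original frame
upsampled = time_grid(duration, fps=50)  # twice as many timestamps over the same 2 seconds

print(len(original), len(upsampled))
```

Under these assumptions, calling pose_to_siren_to_pose(p, fps=50) on a 25 fps pose would yield a reconstruction with twice as many frames covering the same duration.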

Usage

After defining the function, you can use it in your main script:

if __name__ == "__main__":
    pose_path = "/home/nlp/amit/PhD/PoseFormat/sample-data/1.pose"  # your own file path to a `.pose` file

    buffer = open(pose_path, "rb").read()
    p = Pose.read(buffer)
    print("Poses loaded")

    p = pose_to_siren_to_pose(p)

    info = p.header.normalization_info(
        p1=("pose_keypoints_2d", "RShoulder"),
        p2=("pose_keypoints_2d", "LShoulder")
    )
    p.normalize(info, scale_factor=300)
    p.focus()

    v = PoseVisualizer(p)
    v.save_video("reconstructed.mp4", v.draw(max_frames=3000))

The main script performs these tasks:

Reads the pose data from a file.
Applies the pose_to_siren_to_pose function to the read pose.
Normalizes and focuses the pose.
Visualizes the converted pose using the PoseVisualizer.

Execution

To run the script:

$ python pose_to_siren_to_pose.py

Together, pose_format and Siren neural networks let you re-encode a 3D pose sequence as a function of time and sample it again at the frame rate you need. The functions and methods used in this script, from normalization through reconstruction to visualization, can serve as a starting point for manipulating and visualizing 3D poses to suit your own requirements.

Overview of `pose_to_siren_to_pose.py`

"""
pose_format_converter
---------------------
A utility to convert poses using Siren neural networks and visualize the results.

Modules:
- numpy
- pose_format
- pose_format.utils.siren
- pose_format.numpy
- pose_format.pose_visualizer

Functions:
- pose_to_siren_to_pose(p: Pose, fps=None) -> Pose

Example usage:
$ python pose_to_siren_to_pose.py
"""

import numpy as np
from numpy import ma

import pose_format.utils.siren as siren
from pose_format import Pose
from pose_format.numpy import NumPyPoseBody
from pose_format.pose_visualizer import PoseVisualizer


def pose_to_siren_to_pose(p: Pose, fps=None) -> Pose:
    """
    Converts a given Pose object to its Siren representation and back to Pose.

    Parameters
    ----------
    p : Pose
        Input pose to be converted.
    fps : int, optional
        Frames per second for the Siren representation. If None, uses the fps of the input Pose.

    Returns
    -------
    Pose
        The Pose representation after converting it through the Siren neural network.

    """
    p.body.zero_filled()
    mu, std = p.normalize_distribution()

    net = siren.get_pose_siren(p, total_steps=3000, steps_til_summary=100, learning_rate=1e-4, cuda=True)

    new_fps = fps if fps is not None else p.body.fps
    coords = siren.PoseDataset.get_coords(time=len(p.body.data) / p.body.fps, fps=new_fps)
    pred = net(coords).cpu().numpy()

    pose_body = NumPyPoseBody(fps=new_fps, data=ma.array(pred), confidence=np.ones(shape=tuple(pred.shape[:3])))
    p = Pose(header=p.header, body=pose_body)
    p.unnormalize_distribution(mu, std)
    return p


if __name__ == "__main__":
    # Example usage of the pose_to_siren_to_pose function.
    pose_path = "/home/nlp/amit/PhD/PoseFormat/sample-data/1.pose"

    buffer = open(pose_path, "rb").read()
    p = Pose.read(buffer)
    print("Poses loaded")

    p = pose_to_siren_to_pose(p)

    info = p.header.normalization_info(
        p1=("pose_keypoints_2d", "RShoulder"),
        p2=("pose_keypoints_2d", "LShoulder")
    )
    p.normalize(info, scale_factor=300)
    p.focus()

    v = PoseVisualizer(p)
    v.save_video("reconstructed.mp4", v.draw(max_frames=3000))
