tbp.monty.frameworks.environments#

tbp.monty.frameworks.environments.embodied_data#

tbp.monty.frameworks.environments.environment#

class Environment(*args, **kwargs)[source]#

Bases: Protocol

Base protocol for all environments that support steppable actions.

__init__(*args, **kwargs)#
close() → None[source]#

Close the environment and release all resources.

After close() has been called, any call to any other environment method may raise an exception.

Return type:

None

step(actions: Sequence[Action]) → tuple[Observations, ProprioceptiveState][source]#

Apply the given actions to the environment.

Parameters:

actions – The actions to apply to the environment.

Returns:

The current observations and proprioceptive state.

Note

If the actions are an empty sequence, the current observations are returned.
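A minimal sketch of what satisfying this protocol looks like. The Action, Observations, and ProprioceptiveState types below are simplified stand-ins for the framework’s actual types, used only for illustration:

```python
from typing import Protocol, Sequence, runtime_checkable

# Simplified stand-ins for the framework's types (assumptions, not the real definitions).
Action = str
Observations = dict
ProprioceptiveState = dict


@runtime_checkable
class Environment(Protocol):
    """Base protocol for steppable environments, as documented above."""

    def step(self, actions: Sequence[Action]) -> tuple[Observations, ProprioceptiveState]: ...
    def close(self) -> None: ...


class CountingEnvironment:
    """Toy implementation that just counts the actions applied to it."""

    def __init__(self) -> None:
        self.num_actions = 0
        self.closed = False

    def step(self, actions: Sequence[Action]) -> tuple[Observations, ProprioceptiveState]:
        # An empty action sequence leaves the state unchanged and returns
        # the current observations, per the note above.
        self.num_actions += len(actions)
        return {"num_actions": self.num_actions}, {"position": (0.0, 0.0, 0.0)}

    def close(self) -> None:
        # After this, other methods may raise.
        self.closed = True


env = CountingEnvironment()
observations, proprioceptive_state = env.step(["move_forward", "look_down"])
```

Because `Environment` is a `Protocol`, `CountingEnvironment` satisfies it structurally without inheriting from it.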

class ObjectEnvironment(*args, **kwargs)[source]#

Bases: Protocol

Protocol for environments that support adding and removing objects.

__init__(*args, **kwargs)#
add_object(name: str, position: VectorXYZ = (0.0, 0.0, 0.0), rotation: QuaternionWXYZ = (1.0, 0.0, 0.0, 0.0), scale: VectorXYZ = (1.0, 1.0, 1.0), semantic_id: SemanticID | None = None, primary_target_object: ObjectID | None = None) → ObjectID[source]#

Add an object to the environment.

Parameters:
  • name (str) – The name of the object to add.

  • position (VectorXYZ) – The initial absolute position of the object.

  • rotation (QuaternionWXYZ) – The initial rotation WXYZ quaternion of the object. Defaults to (1,0,0,0).

  • scale (VectorXYZ) – The scale of the object to add. Defaults to (1,1,1).

  • semantic_id (SemanticID | None) – Optional override for the object semantic ID. Defaults to None.

  • primary_target_object (ObjectID | None) – The ID of the primary target object. If not None, the added object is positioned so that it does not obscure the initial view of the primary target object (something that collision avoidance alone cannot guarantee). Used when adding multiple objects. Defaults to None.

Return type:

ObjectID

Returns:

The ID of the added object.
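A toy in-memory sketch of the add_object contract (incrementing integer IDs, stored parameters). The real environments place objects in a simulator, and the occlusion handling for primary_target_object is omitted here:

```python
import itertools

ObjectID = int    # matches the alias documented below
SemanticID = int


class InMemoryObjectEnvironment:
    """Illustrative object store, not the real implementation."""

    def __init__(self) -> None:
        self._ids = itertools.count()
        self.objects: dict = {}

    def add_object(
        self,
        name: str,
        position=(0.0, 0.0, 0.0),
        rotation=(1.0, 0.0, 0.0, 0.0),  # identity quaternion, WXYZ order
        scale=(1.0, 1.0, 1.0),
        semantic_id=None,
        primary_target_object=None,
    ) -> int:
        # A real implementation would reposition the object here when
        # primary_target_object is set, to keep the target's initial view clear.
        object_id = next(self._ids)
        self.objects[object_id] = {
            "name": name,
            "position": position,
            "rotation": rotation,
            "scale": scale,
            "semantic_id": semantic_id,
        }
        return object_id

    def remove_all_objects(self) -> None:
        self.objects.clear()


env = InMemoryObjectEnvironment()
mug_id = env.add_object("mug", position=(0.0, 1.5, -0.1), semantic_id=2)
bowl_id = env.add_object("bowl", primary_target_object=mug_id)
```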

remove_all_objects() → None[source]#

Remove all objects from the environment.

Return type:

None

TODO: This remove_all_objects interface is elevated from HabitatSim.remove_all_objects and is quite specific to the HabitatSim implementation. We should consider refactoring it to be more generic.

class ObjectID#

Unique identifier for an object in the environment.

alias of int

class ObjectInfo(object_id: ObjectID, semantic_id: SemanticID | None) → None[source]#

Bases: object

Contains the identifying information of an object created in the environment.

__init__(object_id: ObjectID, semantic_id: SemanticID | None) → None#
object_id: ObjectID#
semantic_id: SemanticID | None#
class ResettableEnvironment(*args, **kwargs)[source]#

Bases: Protocol

Protocol for environments that can be reset to their initial state.

__init__(*args, **kwargs)#
reset() → tuple[Observations, ProprioceptiveState][source]#

Reset the environment to its initial state.

Returns:

The environment’s initial observations and proprioceptive state.
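Combined with Environment.step, this supports the usual episode-driver shape. A sketch with a toy environment and a hypothetical policy callable (both illustrative stand-ins):

```python
def run_episode(env, policy, max_steps: int = 100) -> int:
    """Reset, then step until the policy emits no actions; returns steps taken."""
    observations, proprioceptive_state = env.reset()
    for step in range(max_steps):
        actions = policy(observations, proprioceptive_state)
        if not actions:
            return step
        observations, proprioceptive_state = env.step(actions)
    return max_steps


class ToyResettableEnv:
    """Counts steps; reset() restores the initial state."""

    def __init__(self) -> None:
        self.t = 0

    def reset(self):
        self.t = 0
        return {"t": self.t}, {}

    def step(self, actions):
        self.t += 1
        return {"t": self.t}, {}


steps_taken = run_episode(
    ToyResettableEnv(),
    policy=lambda obs, proprio: ["move"] if obs["t"] < 3 else [],
)
```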

class SemanticID#

Unique identifier for an object’s semantic class.

alias of int

class SimulatedEnvironment(*args, **kwargs)[source]#

Bases: Environment, ResettableEnvironment, Protocol

__init__(*args, **kwargs)#
class SimulatedObjectEnvironment(*args, **kwargs)[source]#

Bases: Environment, ObjectEnvironment, ResettableEnvironment, Protocol

__init__(*args, **kwargs)#

tbp.monty.frameworks.environments.object_init_samplers#

class Default[source]#

Bases: object

class MultiObjectNames(*args, **kwargs)[source]#

Bases: dict

num_distractors: int#
source_object_list: Sequence[str]#
targets_list: Sequence[str]#
class ObjectInitParams(*args, **kwargs)[source]#

Bases: dict

euler_rotation: npt.NDArray[np.float64] | EulerAnglesXYZ#
position: VectorXYZ#
quat_rotation: NotRequired[npt.NDArray[np.float64]]#
rotation: QuaternionWXYZ#
scale: VectorXYZ#
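Since ObjectInitParams subclasses dict with typed keys, instances are plain dictionaries. A simplified sketch (field types reduced to tuples here; the real class uses VectorXYZ, QuaternionWXYZ, EulerAnglesXYZ, and NumPy arrays):

```python
from typing import TypedDict


class ObjectInitParams(TypedDict, total=False):
    """Simplified sketch of the documented keys."""

    position: tuple        # VectorXYZ in the real class
    euler_rotation: tuple  # EulerAnglesXYZ or an ndarray
    rotation: tuple        # QuaternionWXYZ
    quat_rotation: tuple   # NotRequired in the real class
    scale: tuple


params: ObjectInitParams = {
    "position": (0.0, 1.5, 0.0),
    "rotation": (1.0, 0.0, 0.0, 0.0),
    "scale": (1.0, 1.0, 1.0),
}
```

Because TypedDicts are ordinary dicts at runtime, such params can be unpacked with ** into calls that accept the matching keyword arguments.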
class Predefined(positions: Sequence[VectorXYZ] | None = None, rotations: Sequence[EulerAnglesXYZ] | None = None, scales: Sequence[VectorXYZ] | None = None, change_every_episode: bool | None = None)[source]#

Bases: Default

__init__(positions: Sequence[VectorXYZ] | None = None, rotations: Sequence[EulerAnglesXYZ] | None = None, scales: Sequence[VectorXYZ] | None = None, change_every_episode: bool | None = None)[source]#
all_combinations_of_params()[source]#
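The name suggests enumerating the Cartesian product of the configured positions, rotations, and scales. An illustrative sketch of that enumeration (an assumption about the behavior, not the actual implementation):

```python
import itertools


def all_param_combinations(positions, rotations, scales):
    """One init-params dict per (position, rotation, scale) combination."""
    return [
        {"position": p, "euler_rotation": r, "scale": s}
        for p, r, s in itertools.product(positions, rotations, scales)
    ]


combos = all_param_combinations(
    positions=[(0.0, 1.5, 0.0)],
    rotations=[(0.0, 0.0, 0.0), (0.0, 90.0, 0.0)],
    scales=[(1.0, 1.0, 1.0)],
)
# 1 position x 2 rotations x 1 scale -> 2 combinations
```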
class RandomRotation(position: VectorXYZ | None = None, scale: VectorXYZ | None = None)[source]#

Bases: Default

__init__(position: VectorXYZ | None = None, scale: VectorXYZ | None = None)[source]#

tbp.monty.frameworks.environments.positioning_procedures#

class GetGoodView(agent_id: AgentID, good_view_distance: float, good_view_percentage: float, multiple_objects_present: bool, sensor_id: SensorID, target_semantic_id: SemanticID, allow_translation: bool = True, max_orientation_attempts: int = 1) → None[source]#

Bases: PositioningProcedure

Positioning procedure to get a good view of the object before an episode.

Used to position the distant agent at the beginning of an episode so that it has a good initial view of the object with respect to a given sensor (the surface agent is positioned using the TouchObject positioning procedure instead). It is also currently used by the distant agent after a “jump” has been initiated by a model-based policy.

First, the agent is moved toward the target object until the object fills at least a minimum percentage (given by good_view_percentage) of the sensor’s field of view, or until the closest point of the object is less than good_view_distance from the sensor. This ensures that large and small objects fill a similar amount of the sensor’s field of view; otherwise, small objects may be too small to perform saccades on, or the sensor may end up inside large objects. This step is performed by default but can be skipped by setting allow_translation=False.

Second, the agent is oriented toward the object so that the sensor’s central pixel is on-object. In multi-object experiments (i.e., when multiple_objects_present=True), an additional orientation step is performed prior to the translational movement step.

__init__(agent_id: AgentID, good_view_distance: float, good_view_percentage: float, multiple_objects_present: bool, sensor_id: SensorID, target_semantic_id: SemanticID, allow_translation: bool = True, max_orientation_attempts: int = 1) → None[source]#

Initialize the GetGoodView positioning procedure.

Parameters:
  • agent_id (AgentID) – The ID of the agent to generate actions for.

  • good_view_distance (float) – The desired distance to the object for a good view.

  • good_view_percentage (float) – The percentage of the sensor’s field of view that should be filled by the object.

  • multiple_objects_present (bool) – Whether there are multiple objects in the scene.

  • sensor_id (SensorID) – The ID of the sensor to use for positioning.

  • target_semantic_id (SemanticID) – The semantic ID of the target object.

  • allow_translation (bool) – Whether to allow movement toward the object via the motor system’s move_close_enough method. If False, only orienting movements are performed. Defaults to True.

  • max_orientation_attempts (int) – The maximum number of orientation attempts allowed before giving up and truncating the procedure, indicating that the sensor is not on the target object.

compute_look_amounts(relative_location: np.ndarray, state: MotorSystemState) → tuple[float, float][source]#

Compute the amount to look down and left given a relative location.

This function computes the amount needed to look down and left in order for the sensor to be aimed at the target. The returned amounts are relative to the agent’s current position and rotation. Looking up and right is done by returning negative amounts.

TODO: Test whether this function works when the agent is facing in the positive z-direction. It may be fine, but there were some adjustments to accommodate the z-axis positive direction pointing opposite the body’s initial orientation (e.g., using negative z in left_amount = -np.degrees(np.arctan2(x_rot, -z_rot))).

Parameters:
  • relative_location – the x,y,z coordinates of the target with respect to the sensor.

  • state – The current state of the motor system.

Returns:

down_amount – Amount to look down (degrees). left_amount – Amount to look left (degrees).

Return type:

tuple[float, float]
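The core geometry can be sketched with two arctangents. This simplified version ignores the agent-rotation bookkeeping the real method performs, and assumes, per the -z term noted in the TODO above, that the sensor faces the negative z-direction:

```python
import numpy as np


def look_amounts(relative_location: np.ndarray) -> tuple:
    """Degrees to look down and left to aim a -z-facing sensor at (x, y, z)."""
    x, y, z = relative_location
    # Yaw: angle of the target off the -z axis in the horizontal plane.
    left_amount = -np.degrees(np.arctan2(x, -z))
    # Pitch: angle above/below the horizontal plane; negative means look up.
    horizontal_distance = np.hypot(x, z)
    down_amount = -np.degrees(np.arctan2(y, horizontal_distance))
    return float(down_amount), float(left_amount)


# A target straight ahead needs no rotation; one above needs a negative
# down_amount, i.e. looking up.
ahead = look_amounts(np.array([0.0, 0.0, -1.0]))
above = look_amounts(np.array([0.0, 1.0, -1.0]))
```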

find_location_to_look_at(sem3d_obs: np.ndarray, image_shape: tuple[int, int], state: MotorSystemState) → np.ndarray[source]#

Find the location to look at in the observation.

Takes in a semantic 3D observation and returns an x,y,z location.

The location is on the object and surrounded by pixels that are also on the object. This is done by smoothing the on_object image and then taking the maximum of this smoothed image.

Parameters:
  • sem3d_obs – The location of each pixel and the semantic ID associated with that location.

  • image_shape – The shape of the camera image.

  • state – The current state of the motor system.

Returns:

The x,y,z coordinates of the target with respect to the sensor.
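The smooth-then-argmax step described above can be sketched in plain NumPy on the on-object mask; the real method additionally converts the winning pixel back into sensor-relative x, y, z coordinates using the semantic 3D observation:

```python
import numpy as np


def most_on_object_pixel(on_object: np.ndarray, kernel: int = 3):
    """(row, col) whose kernel x kernel neighborhood is most on-object."""
    h, w = on_object.shape
    pad = kernel // 2
    padded = np.pad(on_object.astype(float), pad)
    smoothed = np.zeros((h, w))
    for dr in range(kernel):        # box-filter smoothing via shifted sums
        for dc in range(kernel):
            smoothed += padded[dr:dr + h, dc:dc + w]
    return np.unravel_index(np.argmax(smoothed), smoothed.shape)


mask = np.zeros((5, 5))
mask[1:4, 1:4] = 1.0  # a 3x3 on-object blob
row, col = most_on_object_pixel(mask)
# The blob's center is the pixel best surrounded by on-object pixels.
```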

is_on_target_object(observation: Mapping) → bool[source]#

Check if a sensor is on the target object.

Parameters:

observation (Mapping) – The observation to use for positioning.

Return type:

bool

Returns:

Whether the sensor is on the target object.

move_close_enough(observation: Mapping) → Action | None[source]#

Move closer to the object until we are close enough.

Parameters:

observation (Mapping) – The observation to use for positioning.

Return type:

Action | None

Returns:

The next action to take, or None if we are already close enough to the object.

Raises:

ValueError – If the object is not visible.

orient_to_object(observation: Mapping, state: MotorSystemState) → list[Action][source]#

Rotate sensors so that they are centered on the object using the view finder.

The view finder needs to be in the same position as the sensor patch, and the object needs to be somewhere in the view finder’s view.

Parameters:
  • observation – The observation to use for positioning.

  • state – The current state of the motor system.

Returns:

A list of two actions needed to get the sensor onto the target object.

sensor_rotation_relative_to_world(state: MotorSystemState) → Any[source]#

Derives the positioning sensor’s rotation relative to the world.

Parameters:

state (MotorSystemState) – The current state of the motor system.

Return type:

Any

Returns:

The positioning sensor’s rotation relative to the world.

class GetGoodViewFactory(agent_id: AgentID, sensor_id: SensorID, allow_translation: bool = True, good_view_distance: float = 0.03, good_view_percentage: float = 0.5, max_orientation_attempts: int = 1, multiple_objects_present: bool = False)[source]#

Bases: PositioningProcedureFactory

Factory for creating GetGoodView positioning procedures.

__init__(agent_id: AgentID, sensor_id: SensorID, allow_translation: bool = True, good_view_distance: float = 0.03, good_view_percentage: float = 0.5, max_orientation_attempts: int = 1, multiple_objects_present: bool = False)[source]#
create(target_semantic_id: SemanticID) → GetGoodView[source]#

Create a positioning procedure.

Parameters:

target_semantic_id (SemanticID) – The semantic ID of the target object.

Return type:

GetGoodView

Returns:

A positioning procedure.

class PositioningProcedure(*args, **kwargs)[source]#

Bases: Protocol

Positioning procedure to position the agent in the scene.

The positioning procedure should be called repeatedly until the procedure result indicates that the procedure has terminated or truncated.
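A sketch of the repeated-call driver this contract implies. The result type here is a stand-in mirroring PositioningProcedureResult documented below, and the procedure and environment are toy callables for illustration:

```python
from dataclasses import dataclass, field


@dataclass
class ProcedureResult:  # stand-in mirroring PositioningProcedureResult
    actions: list = field(default_factory=list)
    success: bool = False
    terminated: bool = False
    truncated: bool = False


def drive_positioning(procedure, env, observations) -> bool:
    """Call the procedure until it terminates or truncates; return success."""
    while True:
        result = procedure(observations)
        for action in result.actions:
            observations, _ = env.step([action])
        if result.terminated or result.truncated:
            return result.success


class ToyEnv:
    def step(self, actions):
        return {"last_actions": list(actions)}, {}


calls = []


def toy_procedure(observations):
    calls.append(observations)
    if len(calls) < 3:
        return ProcedureResult(actions=["orient_horizontal"])
    return ProcedureResult(terminated=True, success=True)


succeeded = drive_positioning(toy_procedure, ToyEnv(), observations={})
```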

__init__(*args, **kwargs)#
static depth_at_center(agent_id: AgentID, observations: Observations, sensor_id: SensorID) → float[source]#

Determine the depth of the central pixel for the sensor.

Parameters:
  • agent_id (AgentID) – The ID of the agent to use.

  • observations (Observations) – The observations to use.

  • sensor_id (SensorID) – The ID of the sensor to use.

Return type:

float

Returns:

The depth of the central pixel for the sensor.
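Assuming observations are nested as observations[agent_id][sensor_id]["depth"] (a plausible layout for this framework, but an assumption not confirmed by this page), the central-pixel lookup might look like:

```python
import numpy as np


def depth_at_center(agent_id: str, observations: dict, sensor_id: str) -> float:
    """Depth of the central pixel of the named sensor's depth image."""
    # The nesting below is a hypothetical layout for illustration.
    depth_image = np.asarray(observations[agent_id][sensor_id]["depth"])
    center_row = depth_image.shape[0] // 2
    center_col = depth_image.shape[1] // 2
    return float(depth_image[center_row, center_col])


observations = {"agent_id_0": {"view_finder": {"depth": np.full((4, 4), 0.25)}}}
center_depth = depth_at_center("agent_id_0", observations, "view_finder")
```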

class PositioningProcedureFactory(*args, **kwargs)[source]#

Bases: Protocol

Factory for creating positioning procedures.

__init__(*args, **kwargs)#
create(target_semantic_id: SemanticID) → PositioningProcedure[source]#

Create a positioning procedure.

Parameters:

target_semantic_id (SemanticID) – The semantic ID of the target object.

Return type:

PositioningProcedure

Returns:

A positioning procedure.

class PositioningProcedureResult(actions: list[Action] = &lt;factory&gt;, success: bool = False, terminated: bool = False, truncated: bool = False) → None[source]#

Bases: object

Result of a positioning procedure.

For more on the terminated/truncated terminology, see https://farama.org/Gymnasium-Terminated-Truncated-Step-API.

__init__(actions=<factory>, success=False, terminated=False, truncated=False)#
actions: list[Action]#

Actions to take.

success: bool = False#

Whether the procedure succeeded in its positioning goal.

terminated: bool = False#

Whether the procedure reached a terminal state with success or failure.

truncated: bool = False#

Whether the procedure was truncated due to a limit on the number of attempts or other criteria.
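A minimal reconstruction of this dataclass, illustrating the terminated/truncated distinction from the Gymnasium terminology linked above (the Action type is simplified to a string here):

```python
from dataclasses import dataclass, field


@dataclass
class PositioningProcedureResult:
    actions: list = field(default_factory=list)  # Action simplified to str
    success: bool = False
    terminated: bool = False
    truncated: bool = False


# Terminated: the procedure reached a terminal state (here, with success).
reached_goal = PositioningProcedureResult(success=True, terminated=True)
# Truncated: cut off by a limit (e.g. max attempts) before reaching its goal.
gave_up = PositioningProcedureResult(truncated=True)
```

The `default_factory` ensures each result gets its own actions list rather than a shared mutable default.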

tbp.monty.frameworks.environments.two_d_data#