tbp.monty.frameworks.environments#
tbp.monty.frameworks.environments.embodied_data#
tbp.monty.frameworks.environments.environment#
- class Environment(*args, **kwargs)[source]#
Bases: Protocol
Base protocol for all environments that support steppable actions.
- __init__(*args, **kwargs)#
- close() None[source]#
Close the environment and release all resources.
After closing, any call to any other environment method may raise an exception.
- Return type:
None
- step(actions: Sequence[Action]) tuple[Observations, ProprioceptiveState][source]#
Apply the given actions to the environment.
- Parameters:
actions – The actions to apply to the environment.
- Returns:
The current observations and proprioceptive state.
Note
If the actions are an empty sequence, the current observations are returned.
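Because Environment is a structural (Protocol) interface, any class with matching step and close methods satisfies it. A minimal sketch, using placeholder stand-ins for the library's Action, Observations, and ProprioceptiveState types (the stand-in types below are illustrative, not the real ones):

```python
from typing import Sequence

# Hypothetical stand-ins for the library's types -- illustration only.
Action = str
Observations = dict
ProprioceptiveState = dict


class NullEnvironment:
    """A do-nothing environment that structurally satisfies Environment."""

    def __init__(self) -> None:
        self._observations: Observations = {"sensor": None}
        self._state: ProprioceptiveState = {"position": (0.0, 0.0, 0.0)}

    def step(
        self, actions: Sequence[Action]
    ) -> tuple[Observations, ProprioceptiveState]:
        # An empty action sequence returns the current observations unchanged.
        for action in actions:
            pass  # apply each action to the simulation here
        return self._observations, self._state

    def close(self) -> None:
        # Release simulator handles; later method calls may raise.
        pass


env = NullEnvironment()
obs, proprio = env.step([])  # empty sequence: current observations only
env.close()
```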
- class ObjectEnvironment(*args, **kwargs)[source]#
Bases: Protocol
Protocol for environments that support adding and removing objects.
- __init__(*args, **kwargs)#
- add_object(name: str, position: VectorXYZ = (0.0, 0.0, 0.0), rotation: QuaternionWXYZ = (1.0, 0.0, 0.0, 0.0), scale: VectorXYZ = (1.0, 1.0, 1.0), semantic_id: SemanticID | None = None, primary_target_object: ObjectID | None = None) ObjectID[source]#
Add an object to the environment.
- Parameters:
name (str) – The name of the object to add.
position (VectorXYZ) – The initial absolute position of the object.
rotation (QuaternionWXYZ) – The initial rotation WXYZ quaternion of the object. Defaults to (1,0,0,0).
scale (VectorXYZ) – The scale of the object to add. Defaults to (1,1,1).
semantic_id (SemanticID | None) – Optional override for the object semantic ID. Defaults to None.
primary_target_object (ObjectID | None) – The ID of the primary target object. If not None, the added object will be positioned so that it does not obscure the initial view of the primary target object (which avoiding collision alone cannot guarantee). Used when adding multiple objects. Defaults to None.
- Return type:
ObjectID
- Returns:
The ID of the added object.
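A sketch of how add_object might be used when placing multiple objects, adding the primary target first and then a distractor. The FakeObjectEnvironment below is a hypothetical stand-in that only records calls; it is not part of the library:

```python
class FakeObjectEnvironment:
    """Minimal stand-in that records add_object calls (illustration only)."""

    def __init__(self):
        self._next_id = 0
        self.objects = {}

    def add_object(self, name, position=(0.0, 0.0, 0.0),
                   rotation=(1.0, 0.0, 0.0, 0.0), scale=(1.0, 1.0, 1.0),
                   semantic_id=None, primary_target_object=None):
        object_id = self._next_id
        self._next_id += 1
        self.objects[object_id] = (name, position, rotation, scale, semantic_id)
        return object_id


env = FakeObjectEnvironment()
# Add the primary target first, then position a second object relative to it.
target_id = env.add_object("mug")
distractor_id = env.add_object(
    "bowl",
    position=(0.2, 0.0, 0.0),
    primary_target_object=target_id,  # avoid obscuring the mug's initial view
)
```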
- class ObjectInfo(object_id: ObjectID, semantic_id: SemanticID | None) None[source]#
Bases: object
Contains the identifying information of an object created in the environment.
- __init__(object_id: ObjectID, semantic_id: SemanticID | None) None#
- object_id: ObjectID#
- semantic_id: SemanticID | None#
- class ResettableEnvironment(*args, **kwargs)[source]#
Bases: Protocol
Protocol for environments that can be reset to their initial state.
- __init__(*args, **kwargs)#
- reset() tuple[Observations, ProprioceptiveState][source]#
Reset the environment to its initial state.
- Returns:
The environment’s initial observations and proprioceptive state.
- class SimulatedEnvironment(*args, **kwargs)[source]#
Bases: Environment, ResettableEnvironment, Protocol
- __init__(*args, **kwargs)#
- class SimulatedObjectEnvironment(*args, **kwargs)[source]#
Bases: Environment, ObjectEnvironment, ResettableEnvironment, Protocol
- __init__(*args, **kwargs)#
tbp.monty.frameworks.environments.object_init_samplers#
- class ObjectInitParams(*args, **kwargs)[source]#
Bases: dict
- euler_rotation: npt.NDArray[np.float64] | EulerAnglesXYZ#
- position: VectorXYZ#
- quat_rotation: NotRequired[npt.NDArray[np.float64]]#
- rotation: QuaternionWXYZ#
- scale: VectorXYZ#
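Since ObjectInitParams derives from dict, a plain dict with the fields above is one way to sketch its construction. The concrete values here are illustrative only, and quat_rotation is marked NotRequired, so it can be omitted or filled in later:

```python
import numpy as np

# Illustrative object initialization parameters, keyed by the fields above.
params = {
    "position": (0.0, 1.5, 0.0),          # VectorXYZ
    "euler_rotation": (0.0, 45.0, 0.0),   # XYZ Euler angles
    "rotation": (0.924, 0.0, 0.383, 0.0), # WXYZ quaternion (~45 deg about y)
    "scale": (1.0, 1.0, 1.0),             # VectorXYZ
}

# quat_rotation is NotRequired, so it may be added separately:
params["quat_rotation"] = np.array([0.924, 0.0, 0.383, 0.0])
```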
- class Predefined(positions: Sequence[VectorXYZ] | None = None, rotations: Sequence[EulerAnglesXYZ] | None = None, scales: Sequence[VectorXYZ] | None = None, change_every_episode: bool | None = None)[source]#
Bases: Default
tbp.monty.frameworks.environments.positioning_procedures#
- class GetGoodView(agent_id: AgentID, good_view_distance: float, good_view_percentage: float, multiple_objects_present: bool, sensor_id: SensorID, target_semantic_id: SemanticID, allow_translation: bool = True, max_orientation_attempts: int = 1) None[source]#
Bases: PositioningProcedure
Positioning procedure to get a good view of the object before an episode.
Used to position the distant agent so that it finds the initial view of an object at the beginning of an episode with respect to a given sensor (the surface agent is positioned using the TouchObject positioning procedure instead). Also currently used by the distant agent after a “jump” has been initialized by a model-based policy.
First, the agent is moved toward the target object until the object fills at least a minimum percentage (given by good_view_percentage) of the sensor’s field of view, or until the closest point of the object is less than good_view_distance from the sensor. This ensures that large and small objects fill a similar amount of the sensor’s field of view; otherwise, small objects may be too small to perform saccades on, and the sensor may end up inside large objects. This step is performed by default but can be skipped by setting allow_translation=False.
Second, the agent is oriented toward the object so that the sensor’s central pixel is on the object. In multi-object experiments (i.e., when multiple_objects_present=True), an additional orientation step is performed prior to the translational movement step.
- __init__(agent_id: AgentID, good_view_distance: float, good_view_percentage: float, multiple_objects_present: bool, sensor_id: SensorID, target_semantic_id: SemanticID, allow_translation: bool = True, max_orientation_attempts: int = 1) None[source]#
Initialize the GetGoodView positioning procedure.
- Parameters:
agent_id (AgentID) – The ID of the agent to generate actions for.
good_view_distance (float) – The desired distance to the object for a good view.
good_view_percentage (float) – The percentage of the sensor’s field of view that should be filled with the object.
multiple_objects_present (bool) – Whether there are multiple objects in the scene.
sensor_id (SensorID) – The ID of the sensor to use for positioning.
target_semantic_id (SemanticID) – The semantic ID of the target object.
allow_translation (bool) – Whether to allow movement toward the object via the motor system’s move_close_enough method. If False, only orienting movements are performed. Defaults to True.
max_orientation_attempts (int) – The maximum number of orientation attempts allowed before giving up and truncating the procedure, indicating that the sensor is not on the target object. Defaults to 1.
- compute_look_amounts(relative_location: np.ndarray, state: MotorSystemState) tuple[float, float][source]#
Compute the amount to look down and left given a relative location.
This function computes the amount needed to look down and left in order for the sensor to be aimed at the target. The returned amounts are relative to the agent’s current position and rotation. Looking up and right is done by returning negative amounts.
TODO: Test whether this function works when the agent is facing in the positive z-direction. It may be fine, but there were some adjustments to accommodate the z-axis positive direction pointing opposite the body’s initial orientation (e.g., using negative z in left_amount = -np.degrees(np.arctan2(x_rot, -z_rot))).
- Parameters:
relative_location – the x,y,z coordinates of the target with respect to the sensor.
state – The current state of the motor system.
- Returns:
down_amount – Amount to look down (degrees). left_amount – Amount to look left (degrees).
- Return type:
tuple[float, float]
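The formula fragment quoted in the TODO above suggests one plausible derivation of the two angles. The following is a hypothetical re-implementation for intuition only, not the library's code; it assumes the relative location is already expressed in the sensor's frame, with -z as the view axis, x to the right, and y up:

```python
import numpy as np


def look_amounts_sketch(relative_location):
    """Hypothetical derivation of (down_amount, left_amount) in degrees."""
    x, y, z = relative_location
    # A target to the right (x > 0) needs a rightward (negative left) turn;
    # -z appears because the sensor looks along the negative z axis.
    left_amount = -np.degrees(np.arctan2(x, -z))
    # A target above (y > 0) needs an upward (negative down) pitch, measured
    # against the horizontal distance to the target.
    horizontal_dist = np.hypot(x, z)
    down_amount = -np.degrees(np.arctan2(y, horizontal_dist))
    return down_amount, left_amount


# A target straight ahead requires no rotation at all.
down, left = look_amounts_sketch((0.0, 0.0, -1.0))
```

A target to the agent's left (negative x) yields a positive left_amount, matching the convention that looking up and right is expressed with negative amounts.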
- find_location_to_look_at(sem3d_obs: np.ndarray, image_shape: tuple[int, int], state: MotorSystemState) np.ndarray[source]#
Find the location to look at in the observation.
Takes in a semantic 3D observation and returns an x,y,z location.
The location is on the object and surrounded by pixels that are also on the object. This is done by smoothing the on_object image and then taking the maximum of this smoothed image.
- Parameters:
sem3d_obs – The location of each pixel and the semantic ID associated with that location.
image_shape – The shape of the camera image.
state – The current state of the motor system.
- Returns:
The x,y,z coordinates of the target with respect to the sensor.
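The smooth-then-argmax idea described above can be sketched with a naive box filter; the library's actual smoothing kernel and observation layout are not specified here, so this is purely illustrative:

```python
import numpy as np


def box_smooth(mask, k=3):
    """Naive k-by-k box filter via padded shifted sums (illustration only)."""
    padded = np.pad(mask, k // 2)
    out = np.zeros_like(mask, dtype=float)
    for dr in range(k):
        for dc in range(k):
            out += padded[dr:dr + mask.shape[0], dc:dc + mask.shape[1]]
    return out / (k * k)


# A binary on-object mask with a 4x4 patch of on-object pixels.
on_object = np.zeros((8, 8))
on_object[2:6, 2:6] = 1.0

smoothed = box_smooth(on_object)
# The maximum of the smoothed mask lands inside the patch, so the chosen
# pixel is on the object and surrounded by on-object pixels.
row, col = np.unravel_index(np.argmax(smoothed), smoothed.shape)
```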
- move_close_enough(observation: Mapping) Action | None[source]#
Move closer to the object until we are close enough.
- Parameters:
observation (Mapping) – The observation to use for positioning.
- Return type:
Action | None
- Returns:
The next action to take, or None if we are already close enough to the object.
- Raises:
ValueError – If the object is not visible.
- orient_to_object(observation: Mapping, state: MotorSystemState) list[Action][source]#
Rotate sensors so that they are centered on the object using the view finder.
The view finder needs to be in the same position as the sensor patch, and the object needs to be somewhere in the view finder’s view.
- Parameters:
observation – The observation to use for positioning.
state – The current state of the motor system.
- Returns:
A list of two actions needed to center the sensor on the target object.
- sensor_rotation_relative_to_world(state: MotorSystemState) Any[source]#
Derives the positioning sensor’s rotation relative to the world.
- Parameters:
state (MotorSystemState) – The current state of the motor system.
- Return type:
Any
- Returns:
The positioning sensor’s rotation relative to the world.
- class GetGoodViewFactory(agent_id: AgentID, sensor_id: SensorID, allow_translation: bool = True, good_view_distance: float = 0.03, good_view_percentage: float = 0.5, max_orientation_attempts: int = 1, multiple_objects_present: bool = False)[source]#
Bases: PositioningProcedureFactory
Factory for creating GetGoodView positioning procedures.
- __init__(agent_id: AgentID, sensor_id: SensorID, allow_translation: bool = True, good_view_distance: float = 0.03, good_view_percentage: float = 0.5, max_orientation_attempts: int = 1, multiple_objects_present: bool = False)[source]#
- create(target_semantic_id: SemanticID) GetGoodView[source]#
Create a positioning procedure.
- Parameters:
target_semantic_id (SemanticID) – The semantic ID of the target object.
- Return type:
GetGoodView
- Returns:
A positioning procedure.
- class PositioningProcedure(*args, **kwargs)[source]#
Bases: Protocol
Positioning procedure to position the agent in the scene.
The positioning procedure should be called repeatedly until the procedure result indicates that the procedure has terminated or truncated.
- __init__(*args, **kwargs)#
- static depth_at_center(agent_id: AgentID, observations: Observations, sensor_id: SensorID) float[source]#
Determine the depth of the central pixel for the sensor.
- Parameters:
agent_id (AgentID) – The ID of the agent to use.
observations (Observations) – The observations to use.
sensor_id (SensorID) – The ID of the sensor to use.
- Return type:
float
- Returns:
The depth of the central pixel for the sensor.
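A plausible sketch of depth_at_center, assuming observations are nested mappings keyed by agent and sensor IDs with a per-sensor "depth" image (this observation layout is an assumption, not documented above):

```python
import numpy as np


def depth_at_center_sketch(agent_id, observations, sensor_id):
    """Return the depth of the sensor's central pixel (hypothetical layout)."""
    depth = np.asarray(observations[agent_id][sensor_id]["depth"])
    rows, cols = depth.shape
    return float(depth[rows // 2, cols // 2])


# A fabricated observation: one agent, one sensor, a constant-depth image.
obs = {"agent_id_0": {"view_finder": {"depth": np.full((4, 4), 0.25)}}}
d = depth_at_center_sketch("agent_id_0", obs, "view_finder")
```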
- class PositioningProcedureFactory(*args, **kwargs)[source]#
Bases: Protocol
Factory for creating positioning procedures.
- __init__(*args, **kwargs)#
- create(target_semantic_id: SemanticID) PositioningProcedure[source]#
Create a positioning procedure.
- Parameters:
target_semantic_id (SemanticID) – The semantic ID of the target object.
- Return type:
PositioningProcedure
- Returns:
A positioning procedure.
- class PositioningProcedureResult(actions: list[Action] = <factory>, success: bool = False, terminated: bool = False, truncated: bool = False) None[source]#
Bases: object
Result of a positioning procedure.
For more on the terminated/truncated terminology, see https://farama.org/Gymnasium-Terminated-Truncated-Step-API.
- __init__(actions=<factory>, success=False, terminated=False, truncated=False)#
- actions: list[Action]#
Actions to take.
- success: bool = False#
Whether the procedure succeeded in its positioning goal.
- terminated: bool = False#
Whether the procedure reached a terminal state with success or failure.
- truncated: bool = False#
Whether the procedure was truncated due to a limit on the number of attempts or other criteria.
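The repeated-call contract described under PositioningProcedure can be sketched as a driver loop over these results. The stand-ins below are hypothetical (the real procedure's stepping method and signature may differ); only the terminated/truncated semantics mirror the API above:

```python
from dataclasses import dataclass, field


@dataclass
class PositioningProcedureResult:
    """Mirror of the result fields documented above."""
    actions: list = field(default_factory=list)
    success: bool = False
    terminated: bool = False
    truncated: bool = False


class CountdownProcedure:
    """Fake procedure that terminates successfully after three calls."""

    def __init__(self, steps=3):
        self._remaining = steps

    def step(self):  # the real protocol's stepping method may differ
        self._remaining -= 1
        done = self._remaining == 0
        return PositioningProcedureResult(
            actions=[] if done else ["move_forward"],
            success=done,
            terminated=done,
        )


procedure = CountdownProcedure()
executed = []
while True:
    result = procedure.step()
    executed.extend(result.actions)  # forward actions to the environment
    if result.terminated or result.truncated:
        break
```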