tbp.monty.frameworks.environments#

tbp.monty.frameworks.environments.embodied_data#

class EnvironmentInterface(env: SimulatedObjectEnvironment, motor_system: MotorSystem, rng, seed: int, experiment_mode: ExperimentMode, transform=None)[source]#

Bases: object

Provides an interface to an embodied environment.

The observations are based on the actions returned by the motor_system.

The first values returned by this iterator are the observations of the environment’s initial state; subsequent observations are returned after the action returned by motor_system is applied.

env#

An instance of a class that implements SimulatedObjectEnvironment.

motor_system#

MotorSystem

rng#

Random number generator to use.

seed#

The configured random seed.

experiment_mode#

The experiment mode that this environment interface is used in.

transform#

Callable used to transform the observations returned by the environment.

Note

If the amount variable returned by motor_system is None, the amount used by Habitat will be the default for the actuator, e.g. PanTiltZoomCamera.translation_step.

Note

This class does not work on its own; use one of its subclasses.

Raises:

TypeError – If motor_system is not an instance of MotorSystem.

__init__(env: SimulatedObjectEnvironment, motor_system: MotorSystem, rng, seed: int, experiment_mode: ExperimentMode, transform=None)[source]#
apply_transform(transform, observation: Observations, state: ProprioceptiveState) Observations[source]#
Return type:

Observations

post_episode()[source]#
post_epoch()[source]#
pre_episode(rng: numpy.random.RandomState)[source]#
pre_epoch()[source]#
reset(rng: numpy.random.RandomState)[source]#
step(ctx: RuntimeContext, first: bool = False) Observations[source]#

Request actions from the motor system and step the environment.

Parameters:
  • ctx (RuntimeContext) – The runtime context.

  • first (bool) – Whether this is the first step of the episode. If True, then return the initial observation without requesting actions from the motor system or stepping the environment. TODO: This is a hack to preserve the behavior that the first call to the environment interface returns the observation that is returned by the environment’s reset method. Once the EnvironmentInterface stops invoking motor_system(ctx), this can be removed as the runtime/experiment will initialize the runtime loop by calling step(ctx, actions=[]) instead.

Return type:

Observations

Returns:

The observations.
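
The first-step behavior described above can be sketched with a toy stand-in (ToyEnvInterface and ToyEnv below are hypothetical, heavily simplified; the real classes take more arguments and return Observations objects):

```python
class ToyEnvInterface:
    """Toy stand-in illustrating the `first` flag semantics of step()."""

    def __init__(self, env, motor_system):
        self.env = env
        self.motor_system = motor_system
        self._initial_obs = None

    def reset(self):
        # Cache the observation produced by the environment's reset method.
        self._initial_obs, _state = self.env.reset()
        return self._initial_obs

    def step(self, ctx, first=False):
        if first:
            # The documented hack: return the reset() observation without
            # requesting actions or stepping the environment.
            return self._initial_obs
        actions = self.motor_system(ctx)      # request actions
        obs, _state = self.env.step(actions)  # apply them
        return obs


class ToyEnv:
    """Counts applied actions; the observation is the running count."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return {"t": self.t}, None

    def step(self, actions):
        self.t += len(actions)
        return {"t": self.t}, None


interface = ToyEnvInterface(ToyEnv(), motor_system=lambda ctx: ["move"])
interface.reset()
first_obs = interface.step(ctx=None, first=True)  # initial observation
next_obs = interface.step(ctx=None)               # motor-driven step
```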

class EnvironmentInterfacePerObject(object_names, object_init_sampler, parent_to_child_mapping=None, *args, **kwargs)[source]#

Bases: EnvironmentInterface

Interface for testing in an environment with one “primary target” object.

Interface for testing in an environment where we load one “primary target” object at a time; in addition, we can optionally add other “distractor” objects to the environment.

Maintains a list of primary target objects, swapping these objects in and out between episodes without resetting the environment. The objects are initialized with parameters such that we can vary their location, rotation, and scale.

After the primary target is added to the environment, other distractor objects, sampled from the same object list, can be added.

__init__(object_names, object_init_sampler, parent_to_child_mapping=None, *args, **kwargs)[source]#

Initialize environment interface.

Parameters:
  • object_names –

    List of objects if doing a simple experiment with primary target objects only; dict for experiments with multiple objects, with the following keys:

    targets_list : the list of primary target objects.

    source_object_list : the original object list from which the primary target objects were sampled; used to sample distractor objects.

    num_distractors : the number of distractor objects to add to the environment.

  • object_init_sampler – Function that returns dict with position, rotation, and scale of objects when re-initializing.

  • parent_to_child_mapping – dictionary mapping parent objects to their child objects. Used for logging.

  • *args – passed to super() call

  • **kwargs – passed to super() call

Raises:

TypeError – If object_names is not a list or dictionary
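
The two accepted forms of object_names might look like this (object names and values below are illustrative; only the three dict keys come from the parameter description above, and validate_object_names is a hypothetical sketch of the documented type check):

```python
# Simple experiment: primary target objects only.
object_names_simple = ["mug", "bowl", "spoon"]

# Multi-object experiment: dict with the three documented keys.
object_names_multi = {
    "targets_list": ["mug", "bowl"],
    "source_object_list": ["mug", "bowl", "spoon", "fork"],
    "num_distractors": 2,
}


def validate_object_names(object_names):
    """Sketch of the documented check: raises TypeError for other types."""
    if not isinstance(object_names, (list, dict)):
        raise TypeError("object_names must be a list or dictionary")
    return object_names
```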

add_distractor_objects(primary_target_obj: ObjectID, init_params, primary_target_name)[source]#

Add arbitrarily many “distractor” objects to the environment.

Parameters:
  • primary_target_obj (NewType()(ObjectID, int)) – The ID of the object which is the primary target in the scene.

  • init_params – Parameters used to initialize the object, e.g. orientation; for now, these are identical to the primary target except for the object ID.

  • primary_target_name – name of the primary target object

change_object_by_idx(idx)[source]#

Update the primary target object in the scene based on the given index.

The given idx is the index of the object in the self.object_names list, which should correspond to the index of the object in the self.object_params list.

Also add any distractor objects if required.

Parameters:

idx – Index of the new object and its parameters in object_params

create_semantic_mapping()[source]#

Create a unique semantic ID (positive integer) for each object.

Used by Habitat for the semantic sensor.

In addition, create a dictionary mapping back and forth between these IDs and the corresponding name of the object

cycle_object()[source]#

Remove the previous object(s) from the scene and add a new primary target.

Also add any potential distractor objects.

post_episode()[source]#
post_epoch()[source]#
pre_episode(rng: numpy.random.RandomState)[source]#
pre_epoch()[source]#
class InformedEnvironmentInterface(*args, good_view_distance: float = 0.03, good_view_percentage: float = 0.5, **kwargs) None[source]#

Bases: EnvironmentInterfacePerObject

Env interface that supports a policy which makes use of previous observation(s).

Extension of the EnvironmentInterface where the actions can be informed by the observations. It passes the observation to the InformedPolicy class (which is an extension of the BasePolicy). This policy can then make use of the observation to decide on the next action.

Also has the following additional functionality; TODO: refactor/separate these out as appropriate:

i) This environment interface allows for early stopping by adding the set_done method, which can for example be called when the object is recognized.

ii) The motor_only_step can be set such that the sensory module can later determine whether perceptual data should be sent to the learning module, or just fed back to the motor policy.

iii) Handles different environment interface updates depending on whether the policy is based on the surface agent or the distant agent.

iv) Supports the hypothesis-testing “jump” policy.

__init__(*args, good_view_distance: float = 0.03, good_view_percentage: float = 0.5, **kwargs) None[source]#

Initialize environment interface.

Parameters:
  • object_names –

    List of objects if doing a simple experiment with primary target objects only; dict for experiments with multiple objects, with the following keys:

    targets_list : the list of primary target objects.

    source_object_list : the original object list from which the primary target objects were sampled; used to sample distractor objects.

    num_distractors : the number of distractor objects to add to the environment.

  • object_init_sampler – Function that returns dict with position, rotation, and scale of objects when re-initializing.

  • parent_to_child_mapping – dictionary mapping parent objects to their child objects. Used for logging.

  • good_view_distance – The desired distance to the object for a good view. Defaults to 0.03.

  • good_view_percentage – The percentage of the sensor that should be filled with the object. Defaults to 0.5.

  • *args – passed to super() call

  • **kwargs – passed to super() call

Raises:

TypeError – If object_names is not a list or dictionary

execute_jump_attempt()[source]#

Attempt a hypothesis-testing “jump” onto a location of the object.

Delegates to motor policy directly to determine specific jump actions.

Returns:

The observation from the jump attempt.

first_step()[source]#

Carry out particular motor-system state updates required on the first step.

TODO: can get rid of this by appropriately initializing motor_only_step

Returns:

The observation from the first step.

get_good_view(sensor_id: SensorID, allow_translation: bool = True, max_orientation_attempts: int = 1) bool[source]#

Invoke the GetGoodView positioning procedure.

Parameters:
  • sensor_id (NewType()(SensorID, str)) – The ID of the sensor to use for positioning.

  • allow_translation (bool) – Whether to allow movement toward the object via the motor system’s move_close_enough method. If False, only orienting movements are performed. Defaults to True.

  • max_orientation_attempts (int) – The maximum number of orientation attempts allowed before giving up and truncating the procedure, indicating that the sensor is not on the target object.

Return type:

bool

Returns:

Whether the sensor is on the target object.

get_good_view_with_patch_refinement() bool[source]#

Policy to get a good view of the object for the central patch.

Used by the distant agent to move and orient toward an object such that the central patch is on-object. This is done by first moving and orienting the agent toward the object using the view finder. Then orienting movements are performed using the central patch (i.e., the sensor module with id “patch” or “patch_0”) to ensure that the patch’s central pixel is on-object. Up to 3 reorientation attempts are performed using the central patch.

Return type:

bool

Returns:

Whether the sensor is on the object.

handle_failed_jump(pre_jump_state, first_sensor)[source]#

Deal with the results of a failed hypothesis-testing jump.

A failed jump is “off-object”, i.e. the object is not perceived by the sensor.

handle_successful_jump()[source]#

Deal with the results of a successful hypothesis-testing jump.

A successful jump is “on-object”, i.e. the object is perceived by the sensor.

pre_episode(rng: numpy.random.RandomState)[source]#
step(ctx: RuntimeContext, first: bool = False) Observations[source]#

Request actions from the motor system and step the environment.

Parameters:
  • ctx (RuntimeContext) – The runtime context.

  • first (bool) –

    Whether this is the first step of the episode. If True, then return the initial observation without requesting actions from the motor system or stepping the environment. TODO: This is a hack to preserve the behavior that the first call

    to the environment interface returns the observation that is returned by the environment’s reset method. Once the EnvironmentInterface stops invoking motor_system(ctx), this can be removed as the runtime/experiment will initialize the runtime loop by calling step(ctx, actions=[]) instead.

Return type:

Observations

Returns:

The observations.

class OmniglotEnvironmentInterface(alphabets, characters, versions, env: OmniglotEnvironment, motor_system: MotorSystem, rng, transform=None, parent_to_child_mapping=None, *_args, **_kwargs)[source]#

Bases: EnvironmentInterfacePerObject

Environment interface for Omniglot dataset.

__init__(alphabets, characters, versions, env: OmniglotEnvironment, motor_system: MotorSystem, rng, transform=None, parent_to_child_mapping=None, *_args, **_kwargs)[source]#

Initialize environment interface.

Parameters:
  • alphabets – List of alphabets.

  • characters – List of characters.

  • versions – List of versions.

  • env (OmniglotEnvironment) – An instance of a class that implements OmniglotEnvironment.

  • motor_system (MotorSystem) – The motor system.

  • rng – Random number generator to use.

  • transform – Callable used to transform the observations returned by the environment.

  • parent_to_child_mapping – dictionary mapping parent objects to their child objects. Used for logging.

  • *args – Unused?

  • **kwargs – Unused?

Raises:

TypeError – If motor_system is not an instance of MotorSystem.

change_object_by_idx(idx)[source]#

Update the object in the scene given its index in the object params.

Parameters:

idx – Index of the new object and its parameters in object params

cycle_object()[source]#

Switch to the next character image.

post_episode()[source]#
post_epoch()[source]#
class SaccadeOnImageEnvironmentInterface(scenes, versions, env: SaccadeOnImageEnvironment, motor_system: MotorSystem, rng, transform=None, parent_to_child_mapping=None, *_args, **_kwargs)[source]#

Bases: EnvironmentInterfacePerObject

Environment interface for moving over a 2D image with depth channel.

__init__(scenes, versions, env: SaccadeOnImageEnvironment, motor_system: MotorSystem, rng, transform=None, parent_to_child_mapping=None, *_args, **_kwargs)[source]#

Initialize environment interface.

Parameters:
  • scenes – List of scenes

  • versions – List of versions

  • env (SaccadeOnImageEnvironment) – An instance of a class that implements SaccadeOnImageEnvironment.

  • motor_system (MotorSystem) – The motor system.

  • rng – Random number generator to use.

  • transform – Callable used to transform the observations returned by the environment.

  • parent_to_child_mapping – dictionary mapping parent objects to their child objects. Used for logging.

  • *args – Unused?

  • **kwargs – Unused?

Raises:

TypeError – If motor_system is not an instance of MotorSystem.

change_object_by_idx(idx)[source]#

Update the object in the scene given its index in the object params.

Parameters:

idx – Index of the new object and its parameters in object params

cycle_object()[source]#

Switch to the next scene image.

post_episode()[source]#
post_epoch()[source]#
class SaccadeOnImageFromStreamEnvironmentInterface(env: SaccadeOnImageFromStreamEnvironment, motor_system: MotorSystem, rng, transform=None, *_args, **_kwargs)[source]#

Bases: SaccadeOnImageEnvironmentInterface

Environment interface for moving over a 2D image with depth channel.

__init__(env: SaccadeOnImageFromStreamEnvironment, motor_system: MotorSystem, rng, transform=None, *_args, **_kwargs)[source]#

Initialize environment interface.

Parameters:
  • env (SaccadeOnImageFromStreamEnvironment) – An instance of a class that implements SaccadeOnImageFromStreamEnvironment.

  • motor_system (MotorSystem) – The motor system.

  • rng – Random number generator to use.

  • transform – Callable used to transform the observations returned by the environment.

  • *args – Unused?

  • **kwargs – Unused?

Raises:

TypeError – If motor_system is not an instance of MotorSystem.

change_scene_by_idx(idx)[source]#

Update the scene given its index in the scene params.

Parameters:

idx – Index of the new scene and its parameters in scene params

cycle_scene()[source]#

Switch to the next scene image.

post_episode()[source]#
post_epoch()[source]#
pre_epoch()[source]#

tbp.monty.frameworks.environments.environment#

class Environment(*args, **kwargs)[source]#

Bases: Protocol

Base protocol for all environments that support steppable actions.

__init__(*args, **kwargs)#
close() None[source]#

Close the environment and release all resources.

After the environment is closed, any call to any other environment method may raise an exception.

Return type:

None

step(actions: Sequence[Action]) tuple[Observations, ProprioceptiveState][source]#

Apply the given actions to the environment.

Parameters:

actions – The actions to apply to the environment.

Returns:

The current observations and proprioceptive state.

Note

If the actions are an empty sequence, the current observations are returned.
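
A minimal structural sketch of this protocol (SteppableEnvironment and CounterEnv are illustrative stand-ins with simplified types, not the library’s classes):

```python
from typing import Protocol, Sequence


class SteppableEnvironment(Protocol):
    """Simplified restatement of the Environment protocol."""

    def step(self, actions: Sequence) -> tuple: ...

    def close(self) -> None: ...


class CounterEnv:
    """Toy environment whose observation is a count of applied actions.

    With an empty action sequence the current observations are returned
    unchanged, matching the note above.
    """

    def __init__(self):
        self.count = 0

    def step(self, actions):
        self.count += len(actions)
        # Returns (observations, proprioceptive state); state elided here.
        return {"count": self.count}, None

    def close(self):
        self.count = 0


env: SteppableEnvironment = CounterEnv()
obs, _ = env.step(["a", "b"])
same_obs, _ = env.step([])  # empty sequence: observations unchanged
```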

class ObjectEnvironment(*args, **kwargs)[source]#

Bases: Protocol

Protocol for environments that support adding and removing objects.

__init__(*args, **kwargs)#
add_object(name: str, position: VectorXYZ = (0.0, 0.0, 0.0), rotation: QuaternionWXYZ = (1.0, 0.0, 0.0, 0.0), scale: VectorXYZ = (1.0, 1.0, 1.0), semantic_id: SemanticID | None = None, primary_target_object: ObjectID | None = None) ObjectID[source]#

Add an object to the environment.

Parameters:
  • name (str) – The name of the object to add.

  • position (VectorXYZ) – The initial absolute position of the object.

  • rotation (QuaternionWXYZ) – The initial rotation WXYZ quaternion of the object. Defaults to (1,0,0,0).

  • scale (VectorXYZ) – The scale of the object to add. Defaults to (1,1,1).

  • semantic_id (SemanticID | None) – Optional override for the object semantic ID. Defaults to None.

  • primary_target_object (ObjectID | None) – The ID of the primary target object. If not None, the added object will be positioned so that it does not obscure the initial view of the primary target object (which avoiding collision alone cannot guarantee). Used when adding multiple objects. Defaults to None.

Return type:

ObjectID

Returns:

The ID of the added object.
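
A toy sketch of the add_object contract (the bookkeeping, ID scheme, and object names below are illustrative; the real implementation also handles placement relative to primary_target_object):

```python
class ToyObjectEnvironment:
    """Hypothetical, minimal object-tracking environment."""

    def __init__(self):
        self._next_id = 0
        self.objects = {}

    def add_object(self, name, position=(0.0, 0.0, 0.0),
                   rotation=(1.0, 0.0, 0.0, 0.0), scale=(1.0, 1.0, 1.0),
                   semantic_id=None, primary_target_object=None):
        # Assign a fresh integer ObjectID and record the init parameters.
        object_id = self._next_id
        self._next_id += 1
        self.objects[object_id] = {
            "name": name, "position": position,
            "rotation": rotation, "scale": scale,
            "semantic_id": semantic_id,
        }
        return object_id

    def remove_all_objects(self):
        self.objects.clear()


env = ToyObjectEnvironment()
target_id = env.add_object("mug")
distractor_id = env.add_object("bowl", position=(0.2, 0.0, 0.0),
                               primary_target_object=target_id)
```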

remove_all_objects() None[source]#

Remove all objects from the environment.

Return type:

None

TODO: This remove_all_objects interface is elevated from HabitatSim.remove_all_objects and is quite specific to the HabitatSim implementation. We should consider refactoring this to be more generic.

class ObjectID#

Unique identifier for an object in the environment.

alias of int
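
These ID aliases can be reproduced with typing.NewType, consistent with the NewType()(ObjectID, int) annotations shown in the method signatures above (SemanticID, defined below, follows the same pattern):

```python
from typing import NewType

# Distinct int-based aliases: static type checkers treat them as separate
# types, but at runtime they are plain ints.
ObjectID = NewType("ObjectID", int)
SemanticID = NewType("SemanticID", int)

obj = ObjectID(3)    # at runtime this is just the int 3
sem = SemanticID(1)
```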

class ObjectInfo(object_id: ObjectID, semantic_id: SemanticID | None) None[source]#

Bases: object

Contains the identifying information of an object created in the environment.

__init__(object_id: ObjectID, semantic_id: SemanticID | None) None#
object_id: ObjectID#
semantic_id: SemanticID | None#
class ResettableEnvironment(*args, **kwargs)[source]#

Bases: Protocol

Protocol for environments that can be reset to their initial state.

__init__(*args, **kwargs)#
reset() tuple[Observations, ProprioceptiveState][source]#

Reset the environment to its initial state.

Returns:

The environment’s initial observations and proprioceptive state.

class SemanticID#

Unique identifier for an object’s semantic class.

alias of int

class SimulatedEnvironment(*args, **kwargs)[source]#

Bases: Environment, ResettableEnvironment, Protocol

__init__(*args, **kwargs)#
class SimulatedObjectEnvironment(*args, **kwargs)[source]#

Bases: Environment, ObjectEnvironment, ResettableEnvironment, Protocol

__init__(*args, **kwargs)#

tbp.monty.frameworks.environments.object_init_samplers#

class Default[source]#

Bases: object

class Predefined(positions=None, rotations=None, scales=None, change_every_episode=None)[source]#

Bases: Default

__init__(positions=None, rotations=None, scales=None, change_every_episode=None)[source]#
all_combinations_of_params()[source]#
class RandomRotation(position=None, scale=None)[source]#

Bases: Default

__init__(position=None, scale=None)[source]#
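
A hypothetical custom sampler following the documented contract for object_init_sampler (a callable returning a dict with position, rotation, and scale; the key names, rotation encoding, and default position below are illustrative assumptions):

```python
import random


def make_random_rotation_sampler(rng, position=(0.0, 1.5, 0.0)):
    """Build a sampler that randomizes rotation about one axis (sketch)."""

    def sampler():
        angle = rng.uniform(0.0, 360.0)  # illustrative: Euler degrees
        return {
            "position": position,
            "rotation": (0.0, angle, 0.0),
            "scale": (1.0, 1.0, 1.0),
        }

    return sampler


sampler = make_random_rotation_sampler(random.Random(0))
params = sampler()
```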

tbp.monty.frameworks.environments.positioning_procedures#

class GetGoodView(agent_id: AgentID, good_view_distance: float, good_view_percentage: float, multiple_objects_present: bool, sensor_id: SensorID, target_semantic_id: SemanticID, allow_translation: bool = True, max_orientation_attempts: int = 1) None[source]#

Bases: PositioningProcedure

Positioning procedure to get a good view of the object before an episode.

Used to position the distant agent so that it finds the initial view of an object at the beginning of an episode with respect to a given sensor (the surface agent is positioned using the TouchObject positioning procedure instead). Also currently used by the distant agent after a “jump” has been initialized by a model-based policy.

First, the agent is moved toward the target object until the object fills at least a minimum percentage (given by good_view_percentage) of the sensor’s field of view, or the closest point of the object is less than good_view_distance from the sensor. This makes sure that big and small objects all fill a similar amount of space in the sensor’s field of view. Otherwise, small objects may be too small to perform saccades on, or the sensor may end up inside big objects. This step is performed by default but can be skipped by setting allow_translation=False.

Second, the agent will then be oriented towards the object so that the sensor’s central pixel is on-object. In the case of multi-object experiments, (i.e., when multiple_objects_present=True), there is an additional orientation step performed prior to the translational movement step.
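
The stopping rule of the translation phase described above can be sketched as a simple predicate (a hypothetical simplification; the defaults mirror the class signature, but the real procedure derives depth and fill fraction from sensor observations):

```python
def close_enough(closest_depth, on_object_fraction,
                 good_view_distance=0.03, good_view_percentage=0.5):
    """Return True when no further forward movement is needed.

    closest_depth: distance from the sensor to the nearest on-object point.
    on_object_fraction: fraction of the sensor's field of view on-object.
    """
    return (closest_depth < good_view_distance
            or on_object_fraction >= good_view_percentage)
```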

__init__(agent_id: AgentID, good_view_distance: float, good_view_percentage: float, multiple_objects_present: bool, sensor_id: SensorID, target_semantic_id: SemanticID, allow_translation: bool = True, max_orientation_attempts: int = 1) None[source]#

Initialize the GetGoodView positioning procedure.

Parameters:
  • agent_id (NewType()(AgentID, str)) – The ID of the agent to generate actions for.

  • good_view_distance (float) – The desired distance to the object for a good view.

  • good_view_percentage (float) – The percentage of the sensor that should be filled with the object.

  • multiple_objects_present (bool) – Whether there are multiple objects in the scene.

  • sensor_id (NewType()(SensorID, str)) – The ID of the sensor to use for positioning.

  • target_semantic_id (NewType()(SemanticID, int)) – The semantic ID of the target object.

  • allow_translation (bool) – Whether to allow movement toward the object via the motor system’s move_close_enough method. If False, only orienting movements are performed. Defaults to True.

  • max_orientation_attempts (int) – The maximum number of orientation attempts allowed before giving up and truncating the procedure, indicating that the sensor is not on the target object.

compute_look_amounts(relative_location: np.ndarray, state: MotorSystemState) tuple[float, float][source]#

Compute the amount to look down and left given a relative location.

This function computes the amount needed to look down and left in order for the sensor to be aimed at the target. The returned amounts are relative to the agent’s current position and rotation. Looking up and right is done by returning negative amounts.

TODO: Test whether this function works when the agent is facing in the positive z-direction. It may be fine, but there were some adjustments to accommodate the z-axis positive direction pointing opposite the body’s initial orientation (e.g., using negative z in left_amount = -np.degrees(np.arctan2(x_rot, -z_rot))).

Parameters:
  • relative_location – the x,y,z coordinates of the target with respect to the sensor.

  • state – The current state of the motor system. Defaults to None.

Returns:

down_amount – Amount to look down (degrees).

left_amount – Amount to look left (degrees).
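
A hedged reconstruction of the geometry this describes, using the arctan2 form quoted in the TODO above (the sign conventions and frame handling are assumptions; the library’s actual computation also accounts for the agent’s current rotation):

```python
import math


def look_amounts(relative_location):
    """Sketch: degrees to look down and left toward a target.

    relative_location: (x, y, z) of the target w.r.t. the sensor, with the
    camera assumed to look along -z. Negative amounts mean look up / right.
    """
    x, y, z = relative_location
    # Horizontal pan, matching the quoted -degrees(atan2(x_rot, -z_rot)) form.
    left_amount = -math.degrees(math.atan2(x, -z))
    # Vertical tilt relative to the horizontal distance to the target.
    horizontal = math.hypot(x, z)
    down_amount = -math.degrees(math.atan2(y, horizontal))
    return down_amount, left_amount


# Target straight ahead needs no rotation; a target above needs looking up.
straight = look_amounts((0.0, 0.0, -1.0))
down, left = look_amounts((0.0, 1.0, -1.0))
```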

find_location_to_look_at(sem3d_obs: np.ndarray, image_shape: tuple[int, int], state: MotorSystemState) np.ndarray[source]#

Find the location to look at in the observation.

Takes in a semantic 3D observation and returns an x,y,z location.

The location is on the object and surrounded by pixels that are also on the object. This is done by smoothing the on_object image and then taking the maximum of this smoothed image.

Parameters:
  • sem3d_obs – The location of each pixel and the semantic ID associated with that location.

  • image_shape – The shape of the camera image.

  • state – The current state of the motor system.

Returns:

The x,y,z coordinates of the target with respect to the sensor.
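
The “smooth the on-object image, then take its maximum” step can be illustrated in pixel space (a simple 3x3 box filter stands in for whatever smoothing the library uses, and the function name is hypothetical):

```python
import numpy as np


def find_pixel_to_look_at(on_object):
    """Return (row, col) of an on-object pixel surrounded by on-object pixels."""
    mask = on_object.astype(float)
    padded = np.pad(mask, 1)
    # 3x3 box smoothing via shifted sums (pure NumPy, zero-padded borders).
    smoothed = sum(
        padded[i:i + mask.shape[0], j:j + mask.shape[1]]
        for i in range(3) for j in range(3)
    ) / 9.0
    # The maximum of the smoothed mask lies inside the object blob.
    return np.unravel_index(np.argmax(smoothed), smoothed.shape)


mask = np.zeros((5, 5), dtype=bool)
mask[1:4, 1:4] = True  # a 3x3 object blob
center = find_pixel_to_look_at(mask)
```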

is_on_target_object(observation: Mapping) bool[source]#

Check if a sensor is on the target object.

Parameters:

observation (Mapping) – The observation to use for positioning.

Return type:

bool

Returns:

Whether the sensor is on the target object.

move_close_enough(observation: Mapping) Action | None[source]#

Move closer to the object until we are close enough.

Parameters:

observation (Mapping) – The observation to use for positioning.

Return type:

Action | None

Returns:

The next action to take, or None if we are already close enough to the object.

Raises:

ValueError – If the object is not visible.

orient_to_object(observation: Mapping, state: MotorSystemState) list[Action][source]#

Rotate sensors so that they are centered on the object using the view finder.

The view finder needs to be in the same position as the sensor patch, and the object needs to be somewhere in the view finder’s view.

Parameters:
  • observation – The observation to use for positioning.

  • state – The current state of the motor system.

Returns:

A list of two actions needed to get the sensor onto the target object.

sensor_rotation_relative_to_world(state: MotorSystemState) Any[source]#

Derives the positioning sensor’s rotation relative to the world.

Parameters:

state (MotorSystemState) – The current state of the motor system.

Return type:

Any

Returns:

The positioning sensor’s rotation relative to the world.

class PositioningProcedure(*args, **kwargs)[source]#

Bases: Protocol

Positioning procedure to position the agent in the scene.

The positioning procedure should be called repeatedly until the procedure result indicates that the procedure has terminated or truncated.

__init__(*args, **kwargs)#
static depth_at_center(agent_id: AgentID, observations: Observations, sensor_id: SensorID) float[source]#

Determine the depth of the central pixel for the sensor.

Parameters:
  • agent_id (NewType()(AgentID, str)) – The ID of the agent to use.

  • observations (Observations) – The observations to use.

  • sensor_id (NewType()(SensorID, str)) – The ID of the sensor to use.

Return type:

float

Returns:

The depth of the central pixel for the sensor.

class PositioningProcedureResult(actions: list[Action] = <factory>, success: bool = False, terminated: bool = False, truncated: bool = False) None[source]#

Bases: object

Result of a positioning procedure.

For more on the terminated/truncated terminology, see https://farama.org/Gymnasium-Terminated-Truncated-Step-API.

__init__(actions=<factory>, success=False, terminated=False, truncated=False)#
actions: list[Action]#

Actions to take.

success: bool = False#

Whether the procedure succeeded in its positioning goal.

terminated: bool = False#

Whether the procedure reached a terminal state with success or failure.

truncated: bool = False#

Whether the procedure was truncated due to a limit on the number of attempts or other criteria.
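
The call-until-terminated-or-truncated pattern described for positioning procedures can be sketched with a stand-in result type mirroring the fields above (run_procedure and toy_procedure are illustrative; the real loop executes result.actions in the environment between calls):

```python
from dataclasses import dataclass, field


@dataclass
class ProcedureResult:
    """Stand-in mirroring PositioningProcedureResult's documented fields."""
    actions: list = field(default_factory=list)
    success: bool = False
    terminated: bool = False
    truncated: bool = False


def run_procedure(procedure, max_steps=10):
    """Call the procedure repeatedly until it terminates or truncates."""
    for _ in range(max_steps):
        result = procedure()
        # (the real driver would apply result.actions to the environment here)
        if result.terminated or result.truncated:
            return result
    return ProcedureResult(truncated=True)


# Toy procedure that reaches its goal on the third call.
calls = {"n": 0}


def toy_procedure():
    calls["n"] += 1
    done = calls["n"] >= 3
    return ProcedureResult(success=done, terminated=done)


outcome = run_procedure(toy_procedure)
```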

tbp.monty.frameworks.environments.two_d_data#

class OmniglotEnvironment(patch_size=10, data_path=None)[source]#

Bases: SimulatedEnvironment

Environment for Omniglot dataset.

__init__(patch_size=10, data_path=None)[source]#

Initialize environment.

Parameters:
  • patch_size – height and width of patch in pixels, defaults to 10

  • data_path – path to the omniglot dataset. If None, defaults to ~/tbp/data/omniglot/python/

close() None[source]#

Close the environment and release all resources.

After the environment is closed, any call to any other environment method may raise an exception.

Return type:

None

get_image_patch(img, loc, patch_size)[source]#
get_state() ProprioceptiveState[source]#
Return type:

ProprioceptiveState

load_new_character_data()[source]#
motor_to_locations(motor)[source]#
reset() tuple[Observations, ProprioceptiveState][source]#

Reset the environment to its initial state.

Returns:

The environment’s initial observations and proprioceptive state.

step(actions: Sequence[Action]) tuple[Observations, ProprioceptiveState][source]#

Retrieve the next observation.

Since the omniglot dataset includes stroke information (the order in which the character was drawn as a list of x,y coordinates) we use that for movement. This means we start at the first x,y coordinate saved in the move path and then move in increments specified by amount through this list. Overall there are usually several hundred points (~200-400) but it varies between characters and versions. If we reach the end of a move path and the episode is not finished, we start from the beginning again. If len(move_path) % amount != 0, we will sample different points on the second pass.

Parameters:

actions – Not used at the moment since we just follow the draw path. However, we do use the rotation_degrees to determine the amount of pixels to move at each step.

Returns:

The observations and proprioceptive state.
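
The move-path traversal described above reduces to modular index arithmetic (a sketch; traverse is a hypothetical helper, and real paths hold x,y stroke coordinates rather than the integer ids used here):

```python
def traverse(move_path, amount, num_steps):
    """Walk the stroke path in steps of `amount`, wrapping at the end."""
    idx = 0
    visited = []
    for _ in range(num_steps):
        visited.append(move_path[idx])
        idx = (idx + amount) % len(move_path)
    return visited


path = list(range(7))  # 7 stroke points, ids 0..6
first_pass = traverse(path, amount=2, num_steps=4)
# len(path) % amount != 0, so the wrap-around visits new points:
second_pass = traverse(path, amount=2, num_steps=8)[4:]
```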

switch_to_object(alphabet_id, character_id, version_id)[source]#
class SaccadeOnImageEnvironment(patch_size=64, data_path=None)[source]#

Bases: SimulatedEnvironment

Environment for moving over a 2D image with depth channel.

Images should be stored in .png format for rgb and .data format for depth.

__init__(patch_size=64, data_path=None)[source]#

Initialize environment.

Parameters:
  • patch_size – height and width of patch in pixels, defaults to 64

  • data_path – path to the image dataset. If None, defaults to ~/tbp/data/worldimages/labeled_scenes/

close() None[source]#

Close the environment and release all resources.

After the environment is closed, any call to any other environment method may raise an exception.

Return type:

None

get_3d_coordinates_from_pixel_indices(pixel_idx)[source]#

Retrieve 3D coordinates of a pixel.

Returns:

The 3D coordinates of the pixel.

get_3d_scene_point_cloud()[source]#

Turn 2D depth image into 3D pointcloud using DepthTo3DLocations.

This point cloud is used to estimate the sensor displacement in 3D space between two subsequent steps. Without this we get displacements in pixel space which does not work with our 3D models.

Returns:

current_scene_point_cloud – The 3D scene point cloud.

current_sf_scene_point_cloud – The 3D scene point cloud in sensor frame.

get_image_patch(loc)[source]#

Extract 2D image patch from a location in pixel space.

Returns:

depth_patch – The depth patch.

rgb_patch – The RGB patch.

depth3d_patch – The depth3d patch.

sensor_frame_patch – The sensor frame patch.

get_move_area()[source]#

Calculate area in which patch can move on the image.

Returns:

The move area.

get_next_loc(action_name, amount)[source]#

Calculate next location in pixel space given the current action.

Returns:

The next location in pixel space.
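
A hedged sketch of this step: move the patch center in pixel space and keep it inside the move area (the action names, (row, col) convention, and bounds tuple are assumptions for illustration):

```python
def next_location(loc, action_name, amount, move_area):
    """Return the next (row, col) patch location, clamped to move_area."""
    row, col = loc
    min_row, max_row, min_col, max_col = move_area
    if action_name == "move_up":
        row -= amount
    elif action_name == "move_down":
        row += amount
    elif action_name == "move_left":
        col -= amount
    elif action_name == "move_right":
        col += amount
    # Clamp so the patch stays within the movable area of the image.
    row = max(min_row, min(row, max_row))
    col = max(min_col, min(col, max_col))
    return row, col


loc = next_location((10, 10), "move_right", 25, move_area=(0, 63, 0, 63))
```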

get_state() ProprioceptiveState[source]#
Return type:

ProprioceptiveState

load_depth_data(depth_path, height, width)[source]#

Load depth image from .data file.

Returns:

The depth image.

load_new_scene_data()[source]#

Load depth and rgb data for next scene environment.

Returns:

current_depth_image – The depth image.

current_rgb_image – The RGB image.

start_location – The start location.

load_rgb_data(rgb_path)[source]#

Load RGB image and put into np array.

Returns:

The RGB image.

process_depth_data(depth)[source]#

Process depth data by reshaping, clipping and flipping.

Returns:

The processed depth image.
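
An illustrative version of “reshaping, clipping and flipping” raw depth values read from a flat buffer (the clip limit and bottom-up row order are assumptions, not the library’s actual values):

```python
import numpy as np


def process_depth(raw, height, width, max_depth=1.0):
    """Sketch: flat depth buffer -> 2D image, clipped and flipped."""
    depth = np.asarray(raw, dtype=np.float32).reshape(height, width)
    depth = np.clip(depth, 0.0, max_depth)  # bound invalid / far readings
    return np.flipud(depth)                 # assume rows stored bottom-up


raw = [0.1, 2.5, 0.3, 0.4]  # 2x2 flat depth buffer
depth = process_depth(raw, 2, 2)
```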

reset() tuple[Observations, ProprioceptiveState][source]#

Reset environment and extract image patch.

TODO: clean up. Do we need this? No reset is required in this environment interface, so this should be indicated more clearly.

Returns:

The observation from the image patch.

step(actions: Sequence[Action]) tuple[Observations, ProprioceptiveState][source]#

Retrieve the next observation.

Parameters:

actions – moving up, down, left or right from current location.

Returns:

The observation and proprioceptive state.

switch_to_object(scene_id, scene_version_id)[source]#

Load new image to be used as environment.

class SaccadeOnImageFromStreamEnvironment(patch_size=64, data_path=None)[source]#

Bases: SaccadeOnImageEnvironment

Environment for moving over a 2D streamed image with depth channel.

__init__(patch_size=64, data_path=None)[source]#

Initialize environment.

Parameters:
  • patch_size – height and width of patch in pixels, defaults to 64

  • data_path – path to the image dataset. If None, defaults to ~/tbp/data/worldimages/world_data_stream/

load_new_scene_data()[source]#

Load depth and rgb data for next scene environment.

Returns:

current_depth_image – The depth image.

current_rgb_image – The RGB image.

start_location – The start location.

switch_to_scene(scene_id)[source]#