tbp.monty.frameworks.utils#

tbp.monty.frameworks.utils.communication_utils#

get_percept_from_channel(percepts: list[Message], channel_name: str) Message[source]#

Given a list of percepts, return the percept of the specified channel.

Parameters:
  • percepts – List of percepts

  • channel_name – The name of the channel to return the percept for

Returns:

The percept of the specified channel

Raises:

ValueError – If the channel name is not found in the percepts
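A minimal sketch of the intended lookup, with plain dicts standing in for the Message type (the "channel" key below is an illustrative assumption, not the real Message API):

```python
def get_percept_from_channel(percepts, channel_name):
    # Linear scan over the percepts; each stand-in dict records which
    # channel it came from under the "channel" key (an assumption here).
    for percept in percepts:
        if percept["channel"] == channel_name:
            return percept
    raise ValueError(f"Channel {channel_name!r} not found in percepts")


percepts = [{"channel": "patch", "depth": 0.3}, {"channel": "view_finder"}]
match = get_percept_from_channel(percepts, "patch")
```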

tbp.monty.frameworks.utils.dataclass_utils#

class Dataclass(*args, **kwargs)[source]#

Bases: Protocol

A protocol for dataclasses to be used in type hints.

This exists because dataclasses.dataclass is not a valid type.

__init__(*args, **kwargs)#
as_dataclass_dict(obj)[source]#

Convert a dataclass instance to a serializable dataclass dict.

Parameters:

obj – The dataclass instance to convert.

Returns:

A dictionary with the dataclass fields and values.

Raises:

TypeError – If the object is not a dataclass instance.

create_dataclass_args(dataclass_name: str, function: Callable, base: type | None = None)[source]#

Create configuration dataclass args from given function arguments.

When the function arguments have type annotations, these annotations will be passed to the dataclass fields. Otherwise, the type will be inferred from any argument default value.

For example:

SingleSensorAgentArgs = create_dataclass_args(
    "SingleSensorAgentArgs", SingleSensorAgent.__init__
)

This is equivalent to:

@dataclass(frozen=True)
class SingleSensorAgentArgs:
    agent_id: AgentID
    sensor_id: str
    position: Tuple[float, float, float] = (0.0, 1.5, 0.0)
    rotation: Tuple[float, float, float, float] = (1.0, 0.0, 0.0, 0.0)
    height: float = 0.0
Parameters:
  • dataclass_name (str) – The name of the new dataclass.

  • function (Callable) – The function used to extract the parameters for the dataclass.

  • base (type | None) – Optional base class for the newly created dataclass.

Returns:

New dataclass with fields defined by the function arguments.

from_dataclass_dict(datadict)[source]#

Convert a serializable dataclass dict back into the original dataclass.

Assumes that the serializable dataclass dict was created via asdict().

Parameters:

datadict – The serializable dataclass dict to convert.

Returns:

The original dataclass instance.

Raises:

TypeError – If the object is not a dict instance.
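A sketch of the intended round trip, assuming the serialized dict records the dataclass type alongside its field values (the key name and registry below are illustrative, not the actual implementation):

```python
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class SensorArgs:
    sensor_id: str
    height: float = 0.0


def to_serializable(obj):
    # Record the class name alongside the field values so the dict
    # can later be turned back into the original dataclass.
    return {"__dataclass__": type(obj).__name__, **asdict(obj)}


def from_serializable(datadict, registry):
    # Look up the original class and rebuild it from its fields.
    cls = registry[datadict["__dataclass__"]]
    kwargs = {f.name: datadict[f.name] for f in fields(cls)}
    return cls(**kwargs)


d = to_serializable(SensorArgs(sensor_id="patch", height=1.5))
restored = from_serializable(d, {"SensorArgs": SensorArgs})
```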

tbp.monty.frameworks.utils.edge_detection#

tbp.monty.frameworks.utils.graph_matching_utils#

add_pose_features_to_tolerances(tolerances, default_tolerances=20) dict[source]#

Add default pose_vectors tolerances if not set.

Return type:

dict

Returns:

Tolerances dictionary with added pose_vectors if not set.

create_exponential_kernel(size, decay_rate)[source]#

Create an exponentially decaying kernel.

Used to convolve, for example, evidence history when determining whether we are on a new object.

Parameters:
  • size – Size of the kernel.

  • decay_rate – Decay rate of the kernel.

Returns:

Exponentially decaying kernel.
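One plausible construction of such a kernel, sketched in numpy (the exact normalization used by Monty may differ):

```python
import numpy as np


def exponential_kernel(size, decay_rate):
    # Weights decay exponentially with distance into the past and are
    # normalized to sum to 1, so convolving with an evidence history
    # yields a recency-weighted average. (Sketch only.)
    weights = np.exp(-decay_rate * np.arange(size))
    return weights / weights.sum()


kernel = exponential_kernel(size=5, decay_rate=3)
```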

detect_new_object_exponential(max_ev_per_step, detection_threshold=-1.0, decay_rate=3)[source]#

Detect that we’re on a new object using exponentially decaying evidence changes.

Evidence changes from multiple steps into the past are convolved by exponentially decaying constants, such that more recent steps carry more significance.

Parameters:
  • max_ev_per_step – List of the maximum evidence (across all locations/poses) for the current MLH object, across all steps of the current episode

  • detection_threshold – The total amount of negative evidence in the counter/sum that needs to be exceeded to determine that the LM has moved on to a new object.

  • decay_rate – The rate of decay that determines how much past evidence drops contribute to the current estimate of change.

Returns:

True if the total amount of negative evidence is less than or equal to the detection threshold; False otherwise.

detect_new_object_k_steps(max_ev_per_step, detection_threshold=-1.0, k=3, reset_at_positive_jump=False)[source]#

Detect that we’re on a new object using the evidence changes from multiple steps.

Evidence changes from multiple steps into the past are considered. We look at the change in evidence over k discrete steps, weighing these equally.

Parameters:
  • max_ev_per_step – List of the maximum evidence (across all locations/poses) for the current MLH object, across all steps of the current episode

  • detection_threshold – The total amount of negative evidence in the counter/sum that needs to be exceeded to determine that the LM has moved on to a new object.

  • k – How many steps into the past to look when summing the negative change in evidence.

  • reset_at_positive_jump – Boolean to reset the accumulated changes when there is a positive increase in evidence, i.e., k is further limited by this history.

Returns:

True if the total evidence change is less than or equal to the detection threshold; False otherwise.

find_step_on_new_object(stepwise_targets, primary_target, n_steps_off_primary_target)[source]#

Returns the episode step at which we’ve moved off the primary target object.

The returned episode step is the first step at which we’ve been off the primary target for a total of n_steps_off_primary_target consecutive steps.

get_correct_k_n(k_n, num_datapoints)[source]#

Determine k_n given the number of datapoints.

The k_n specified in the hyperparameter may not be possible to achieve with the number of data points collected, so this function checks and adjusts k_n.

Parameters:
  • k_n – Current number of k nearest neighbors specified for graph building.

  • num_datapoints – Number of observations available to build the graph.

Returns:

Adjusted k_n.

get_custom_distances(nearest_node_locs, search_locs, search_sns, search_curvature)[source]#

Calculate custom distances modulated by surface normal and curvature.

Parameters:
  • nearest_node_locs – Locations of nearest nodes to search_locs (shape=(num_hyp, max_nneighbors, 3)).

  • search_locs – Search locations for each hypothesis (shape=(num_hyp, 3)).

  • search_sns – Sensed surface normal rotated by the hypothesis pose (shape=(num_hyp, 3)).

  • search_curvature – Magnitude of sensed curvature (maximum if using two principal curvatures) used to modulate the search sphere thickness in the direction of the surface normal (shape=1).

Returns:

Custom distance of each nearest location from its search location, taking into account the hypothesis point normal and sensed curvature. shape=(num_hyp, max_nneighbors).

Return type:

custom_nearest_node_dists

get_initial_possible_poses(initial_possible_pose_type) list[Rotation] | None[source]#

Initialize initial_possible_poses to test based on initial_possible_pose_type.

Parameters:

initial_possible_pose_type – How to sample initial possible poses. Options are:

  • “uniform”: Sample uniformly from the space of possible poses.

  • “informed”: Sample poses that are likely to be possible based on the object’s geometry and the first observation.

  • list of euler angles: Use a list of predefined poses to test (useful for debugging).

Returns:

List of initial possible poses to test, or None if initial_possible_pose_type is “informed”.

get_relevant_curvature(features)[source]#

Get the relevant curvature from features. Used to scale search sphere.

In the case of principal_curvatures and principal_curvatures_log we use the maximum absolute curvature between the two values. Otherwise we just return the curvature value.

Note

Not sure if other curvatures work as well as log curvatures since they may have too big of a range.

Returns:

Magnitude of sensed curvature (maximum if using two principal curvatures).

get_scaled_evidences(evidences, per_object=False)[source]#

Scale evidences to be in range [-1, 1] for voting.

This is useful so that the absolute evidence values don’t distort the votes (for example if one LM has already had a lot more matching steps than another and the evidence is not bounded). It is also useful to keep the evidence added from a single voting step in a reasonable range.

By default we normalize using the maximum and minimum evidence over all objects. There is also an option to scale the evidence for each object independently but that might give low evidence objects too much of a boost. We could probably remove this option.

Returns:

Scaled evidences.
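A sketch of the default (global) scaling described above, mapping the extreme evidence values across all objects to -1 and 1:

```python
import numpy as np


def scale_evidences(evidences):
    # Min-max scale all values into [-1, 1] using the global extremes,
    # so unbounded absolute evidence levels do not dominate votes.
    ev = np.asarray(evidences, dtype=float)
    lo, hi = ev.min(), ev.max()
    return 2 * (ev - lo) / (hi - lo) - 1


scaled = scale_evidences([0.0, 5.0, 10.0])
```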

get_uniform_initial_possible_poses(n_degrees_sampled=9)[source]#

Get an initial list of possible poses.

Parameters:

n_degrees_sampled – Number of degrees sampled for each axis. Defaults to 9, which means tested degrees are in [0., 45., 90., 135., 180., 225., 270., 315.]. This results in 512 unique pose combinations.

Depending on the displacement vector, some poses are equivalent (e.g., [0, 0, 0] and [180, 180, 180] or [0, 45, 90] and [180, 135, 270]). (a, b, c) == (a + 180, -b + 180, c + 180) (see https://books.google.gr/books?id=rn3OBQAAQBAJ p.267).

Returns:

List of poses to test.
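The sampling described above can be sketched as follows: nine points from 0 to 360 degrees include a duplicate endpoint, leaving eight distinct angles per axis and 8^3 = 512 combinations (a sketch, not Monty's exact implementation):

```python
import itertools

import numpy as np


def uniform_euler_poses(n_degrees_sampled=9):
    # n_degrees_sampled evenly spaced points from 0 to 360 inclusive;
    # the endpoint duplicates 0 degrees, so it is dropped before
    # taking the Cartesian product over the three axes.
    angles = np.linspace(0.0, 360.0, n_degrees_sampled)[:-1]
    return list(itertools.product(angles, repeat=3))


poses = uniform_euler_poses()
```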

get_unique_paths(possible_paths, threshold=0.01)[source]#

Get all unique paths in a list of possible paths.

Parameters:
  • possible_paths – List of possible paths (locations).

  • threshold – Minimum distance between two paths to be considered different. Defaults to 0.01.

Returns:

List of unique paths (locations).

is_in_ranges(array, ranges)[source]#

Check for each element in an array whether it is in its specified range.

Each element can have a different tolerance range.

Parameters:
  • array – Array of elements to check.

  • ranges – Sequence of (min, max) tuples defining the valid range for each element.

Returns:

True if all elements are in their respective ranges, False otherwise.
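A minimal sketch of the per-element range check (illustrative, not the actual implementation):

```python
import numpy as np


def in_ranges(array, ranges):
    # Each element is checked against its own (min, max) tolerance
    # range; the result is True only if every element falls inside.
    return all(lo <= x <= hi for x, (lo, hi) in zip(array, ranges))


ok = in_ranges(np.array([0.5, 2.0]), [(0.0, 1.0), (1.5, 2.5)])
```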

possible_sensed_directions(sensed_directions: np.ndarray, num_hyps_per_node: int) list[np.ndarray][source]#

Return the possible sensed directions for all nodes.

This function determines the possible sensed directions for a given set of sensed directions. It relies on two different behaviors depending on the value of num_hyps_per_node.

  • If num_hyps_per_node equals 2: the pose is well defined (i.e., PC1 != PC2). A well-defined pose does not distinguish between mirrored directions of PC1 and PC2 (e.g., the object can be upside down), therefore we sample both directions.

  • If num_hyps_per_node is not 2: this function samples additional poses in the plane perpendicular to the sensed surface normal.

Parameters:
  • sensed_directions – An array of sensed directions.

  • num_hyps_per_node – Number of rotations to get for each node.

Returns:

Possible sensed directions for all nodes at each rotation.

Return type:

possible_s_d

process_delta_evidence_values(max_ev_per_step)[source]#

Pre-process the max evidence values to get the change in evidence across steps.

Clip the values to be less than or equal to 0 and return the index of the most recent positive change in evidence.

Parameters:

max_ev_per_step – Sequence of maximum evidence values.

Returns:

clipped_ev_changes – Clipped change in evidence across steps.

positive_jump_loc – Index of the most recent positive change in evidence.

Return type:

clipped_ev_changes

tbp.monty.frameworks.utils.live_plotter#

class LivePlotter[source]#

Bases: object

Class for plotting sensor observations during an experiment.

Set the show_sensor_output flag in the experiment config to True to enable live plotting.

WARNING: This plotter makes a number of assumptions right now. For example, it assumes that:

  • a sensor with ID “view_finder” exists

  • a sensor with ID “patch” exists

  • an “rgba” modality exists in the “view_finder” sensor observation

  • a “depth” modality exists in the “patch” sensor observation

__init__()[source]#
add_text(mlh, pos, possible_matches, graph_ids, evidences)[source]#
hardcoded_assumptions(observation: Observations, model: Monty)[source]#

Extract some of the hardcoded assumptions from the observation.

TODO: Don’t do this. It is here for now to highlight the fragility of the live plotter implementation at the call site. We should make this less fragile by passing the necessary information to the live plotter.

Parameters:
  • observation (Observations) – The observation from the environment interface.

  • model (Monty) – The model.

Returns:

A tuple of the first learning module, the first sensor module raw observations, the patch depth, and the view finder rgba.

initialize_online_plotting()[source]#
setup_camera_ax()[source]#
setup_mlh_ax()[source]#
setup_sensor_ax()[source]#
show_mlh(mlh, mlh_model)[source]#
show_observations(first_learning_module, first_sensor_module_raw_observations, first_sensor_depth, view_finder_rgba, mlh, mlh_model, step: int, is_saccade_on_image_data_loader=False) None[source]#
Return type:

None

show_patch(first_sensor_depth)[source]#
show_view_finder(first_sensor_module_raw_observations, first_learning_module, first_sensor_depth, view_finder_rgba, is_saccade_on_image_data_loader)[source]#

tbp.monty.frameworks.utils.logging_utils#

accuracy_stats_for_compositional_objects(eval_stats_for_lm, parent_to_child_mapping)[source]#
add_evidence_lm_episode_stats(lm, stats, consistent_child_objects)[source]#
add_policy_episode_stats(lm, stats)[source]#
add_pose_lm_episode_stats(lm, stats)[source]#

Add possible poses of an LM to episode stats.

Parameters:
  • lm – LM instance from which to add the statistics.

  • stats – Statistics dictionary to update.

Returns:

Updated stats dictionary.

calculate_fpr(fp, tn)[source]#

Calculate the False Positive Rate (equal to 1 - specificity).

Parameters:
  • fp – false positives

  • tn – true negatives

Returns:

False Positive Rate

calculate_performance(stats, performance_type, lm, target_object)[source]#

Calculate performance of an LM on a given target object.

Parameters:
  • stats – Statistics dictionary to update.

  • performance_type – performance type index into stats

  • lm – Learning module for which to generate stats.

  • target_object – target (primary or stepwise) object for the LM to have converged to

Returns:

Updated stats dictionary.

calculate_tpr(tp, fn)[source]#

Calculate True Positive Rate, aka sensitivity.

Parameters:
  • tp – true positives

  • fn – false negatives

Returns:

True Positive Rate
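Both rates follow the standard confusion-matrix definitions; a sketch (the function names here are illustrative):

```python
def tpr(tp, fn):
    # True Positive Rate (sensitivity): fraction of actual positives
    # that were correctly recovered.
    return tp / (tp + fn)


def fpr(fp, tn):
    # False Positive Rate: fraction of actual negatives incorrectly
    # flagged as positive. Specificity is 1 - FPR, not FPR itself.
    return fp / (fp + tn)


sens = tpr(8, 2)
rate = fpr(1, 9)
```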

check_detection_accuracy_at_step(stats, last_n_step=1)[source]#
check_rotation_accuracy(stats, last_n_step=1)[source]#
compositional_stats_for_all_lms(eval_stats, all_lm_ids, parent_to_child_mapping)[source]#
compute_pose_error(predicted_rotation: scipy.spatial.transform.Rotation, target_rotation: scipy.spatial.transform.Rotation, return_degrees: bool = False) float[source]#

Computes the minimum angular pose error between predicted and target rotations.

See compute_pose_errors for more details.

Parameters:
  • predicted_rotation (Rotation) – Predicted rotation(s). Can be a single or list of rotation.

  • target_rotation (Rotation) – Target rotation. Must represent a single rotation.

  • return_degrees (bool) – Whether to return the error in degrees. If False, returns the error in radians. Defaults to False.

Return type:

float

Returns:

The minimum angular error in radians (or degrees if return_degrees is True).

compute_pose_errors(predicted_rotation: Rotation, target_rotation: Rotation) npt.NDArray[np.float64] | float[source]#

Computes the angular pose errors between predicted and target rotations.

Both inputs must be instances of scipy.spatial.transform.Rotation. The predicted_rotation may contain a single rotation or a list of rotations, while target_rotation must be exactly one rotation.

The pose error is defined as the geodesic distance on SO(3) — the angle of the relative rotation between predicted and target. If predicted_rotation contains multiple rotations, this function returns the errors among them.

Note that the .inv() operation in this method is due to how geodesic distance between two rotations is calculated, not a side-effect of whether the target rotation is stored in its normal form, or as its inverse. The function therefore assumes that the orientations are already in the same coordinate system before the comparison.

Parameters:
  • predicted_rotation (Rotation) – Predicted rotation(s). Can be a single or list of rotation.

  • target_rotation (Rotation) – Target rotation. Must represent a single rotation.

Return type:

npt.NDArray[np.float64] | float

Returns:

The angular errors in radians.
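The geodesic error described above can be computed directly with scipy's Rotation type; a sketch independent of Monty's own implementation:

```python
import numpy as np
from scipy.spatial.transform import Rotation

# Geodesic distance on SO(3): the angle of the relative rotation
# between each predicted rotation and the single target rotation.
predicted = Rotation.from_euler("xyz", [[0, 0, 30], [0, 0, 90]], degrees=True)
target = Rotation.from_euler("xyz", [0, 0, 0], degrees=True)

errors = (predicted * target.inv()).magnitude()  # radians, one per prediction
min_error_deg = np.degrees(errors.min())
```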

compute_unsupervised_stats(possible_matches, target, graph_to_target, target_to_graph)[source]#

Compute statistics like how many graphs are built per object.

Parameters:
  • possible_matches – container of str that key into graph_to_target

  • target – str ground truth name of object being presented

  • graph_to_target – dict mapping each graph to the set of objects used to build

  • target_to_graph – dict mapping each object to the set of graphs that used it

Returns:

?

Return type:

dict

consistent_child_objects_accuracy(eval_stats_for_lm, parent_to_child_mapping)[source]#

Check whether most_likely_object is consistent with the parent_to_child_mapping.

Classified object is consistent if it is one of the children in the set of objects corresponding to the compositional object.

NOTE: This function is only called in compositional_stats_for_all_lms, which is a logging util called from a notebook; hence none of our previous experiments should be affected by this or raise the ValueError, even if they don’t have a parent_to_child_mapping.

Returns:

The percentage of episodes in which a consistent child object is detected.

Raises:
ValueError – If the target object of an episode is not in the parent_to_child_mapping.

deserialize_json_chunks(json_file, start=0, stop=None, episodes=None)[source]#

Deserialize one episode at a time from json file.

Only get episodes specified by arguments, which follow list / numpy like semantics.

Note

Assumes the line counter is exactly in line with the episode keys.

Parameters:
  • json_file – full path to the json file to load

  • start – int, get data starting at this episode

  • stop – int, get data ending at this episode, not inclusive as usual in Python

  • episodes – iterable of ints with episodes to pull

Returns:

dict containing contents of file_handle

Return type:

detailed_json

format_columns_for_wandb(lm_dict)[source]#

Various columns break wandb because we are playing fast and loose with types.

Put any standardizations here.

Parameters:

lm_dict – dict, part of a larger dict ~ {LM_0: lm_dict, LM_1: lm_dict}

Returns:

formatted lm_dict

get_graph_lm_episode_stats(lm)[source]#

Populate stats dictionary for one episode for an LM.

Parameters:

lm – Learning module for which to generate stats.

Returns:

dict with stats of one episode.

get_object_graph_stats(graph_to_target, target_to_graph)[source]#
get_reverse_rotation(rotation)[source]#
get_rgba_frames_single_sm(observations)[source]#

Convert a time series of rgba observations into format for wandb.Video.

Parameters:

observations – episode_stats[sm][___observations]

Returns:

formatted observations

get_stats_per_lm(model, target, episode_seed: int)[source]#

Loop through lms and get stats.

Parameters:
  • model – model instance

  • target – target object

  • episode_seed (int) – RNG seed used for the episode

Returns:

dict with stats per lm

Return type:

performance_dict

get_time_stats(all_ds, all_conditions) pandas.DataFrame[source]#

Get summary of run times in a dataframe for each condition.

Parameters:
  • all_ds – detailed stats (dict) for each condition

  • all_conditions – name of each condition

Return type:

DataFrame

Returns:

Runtime stats.

get_unique_euler_poses(poses)[source]#

Get unique poses for an object from possible poses per path.

Returns:

array of unique poses

Return type:

unique_poses

lm_stats_to_dataframe(stats, format_for_wandb=False)[source]#

Take in a dictionary and format into a dataframe.

Example:

{0: {LM_0: stats, LM_1: stats...}, 1:...} --> dataframe

Currently we are reporting once per episode, so the loop over episodes is only over a single key, value pair, but leaving it here because it is backward compatible.

Returns:

dataframe

load_models_from_dir(exp_path, pretrained_dict=None)[source]#
load_stats(exp_path, load_train=True, load_eval=True, load_detailed=True, load_models=True, pretrained_dict=None)[source]#

Load experiment statistics from an experiment for analysis.

Returns:

train_stats – pandas DataFrame with training statistics.

eval_stats – pandas DataFrame with evaluation statistics.

detailed_stats – dict with detailed statistics.

lm_models – dict with loaded learning module models.

Return type:

train_stats

matches_to_target_str(possible_matches, graph_to_target)[source]#

Get the possible target objects associated with each possible match.

Targets are concatenated into a single string name for easy saving in a csv.

Returns:

?

Return type:

dict

maybe_rename_existing_dir(dirpath: Path) None[source]#

If the given log directory already exists, rename it to <dirname>_old.

Raises:

ValueError – If dirpath is not a directory.

Return type:

None

maybe_rename_existing_file(filepath: Path) None[source]#

If the given log file already exists, rename it to <filename>_old.

Return type:

None

mean_num_steps_for_lm(eval_stats, lm_id)[source]#
overall_accuracy(eval_stats)[source]#
print_overall_stats(stats)[source]#
print_unsupervised_stats(stats, epoch_len)[source]#

Print stats of unsupervised learning experiment.

target_data_to_dict(target)[source]#

Format target params to dict.

Parameters:

target – target params

Returns:

dict with target params

total_size(o)[source]#

Returns the approximate memory footprint of an object and all of its contents.

Automatically finds the contents of the following builtin containers and their subclasses: tuple, list, deque, dict, set and frozenset. To search other containers, add handlers to iterate over their contents:

handlers = {SomeContainerClass: iter,
            OtherContainerClass: OtherContainerClass.get_elements}

This is the recursive recipe widely cited on Stack Exchange and blogs for gauging the size of Python objects in memory.

tbp.monty.frameworks.utils.object_model_utils#

class NumpyGraph(my_dict)[source]#

Bases: object

Alternative way to represent graphs without using torch.

Speeds up runtime significantly.

__init__(my_dict)[source]#
already_in_list(existing_points, new_point, features, clean_ids, query_id, graph_delta_thresholds) bool[source]#

Check if a given point is already in a list of points.

Parameters:
  • existing_points – List of x,y,z locations

  • new_point – new location

  • features – all features (both existing and candidate points)

  • clean_ids – indices (w.r.t “features”) that have been accepted into the graph and are compared to

  • query_id – index (w.r.t “features”) that is currently being considered

  • graph_delta_thresholds – Dictionary of thresholds used to determine whether a point should be considered sufficiently different so as to be included in the graph

Return type:

bool

Returns:

Whether the point is already in the list

build_point_cloud_graph(locations, features, feature_mapping)[source]#

Build a graph from observations without edges.

Parameters:
  • locations – array of x, y, z positions in space

  • features – dictionary of features at locations

  • feature_mapping

    ?

Returns:

A NumpyGraph containing the observed features at locations.

circular_mean(values)[source]#

Calculate the mean of a circular value such as hue where 0==1.

Returns:

Mean value.
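A common way to compute such a circular mean is to average unit vectors on the circle; a sketch (not necessarily the exact implementation):

```python
import numpy as np


def circular_mean(values):
    # Map values in [0, 1) onto the unit circle, average the resulting
    # unit vectors, and map the mean angle back into [0, 1). This makes
    # values just either side of the wrap point average to ~0, not 0.5.
    angles = 2 * np.pi * np.asarray(values)
    mean_angle = np.arctan2(np.sin(angles).mean(), np.cos(angles).mean())
    return (mean_angle / (2 * np.pi)) % 1.0


m = circular_mean([0.95, 0.05])  # hues just either side of the wrap point
```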

expand_index_dims(indices_3d, last_dim_size)[source]#

Expand 3d indices to 4d indices by adding a 4th dimension with size.

Parameters:
  • indices_3d – 3d indices that should be converted to 4d

  • last_dim_size – desired size of the 4th dimension (will be filled with arange indices from 0 to last_dim_size-1)

Returns:

Tensor of 4d indices.

get_cubic_patches(arr_shape, centers, size)[source]#

Cut a cubic patch around a center id out of a 3d array.

NOTE: Currently not used. Was implemented for draft of nn search in grid.

Returns:

New centers and mask.

get_most_common_bool(booleans)[source]#

Get most common value out of a list of boolean values.

Returns:

True when we have equally many True as False entries.

get_most_common_value(values)[source]#

Get most common value out of a list of values (i.e., the mode).

Returns:

Most common value.

get_values_from_dense_last_dim(tensor, index_3d)[source]#

Get values from 4d tensor at indices in last dimension.

This function assumes that the entries in the last dimension are dense. This is the case in all our sparse tensors where the first 3 dimensions represent the 3d location (sparse) and the 4th represents values at this location (dense).

Returns:

List of values.

increment_sparse_tensor_by_count(old_tensor, indices)[source]#
pose_vector_mean(pose_vecs, pose_fully_defined)[source]#

Calculate mean of pose vectors.

This takes into account that surface normals may contain observations from two surface sides and curvature directions have an ambiguous direction. It also enforces them to stay orthogonal.

If not pose_fully_defined, the curvature directions are meaningless, and we just return the first observation. Theoretically this shouldn’t matter, but it can save some computation time.

Returns:

Tuple containing the representative pose vector mean and a bool indicating whether we used curvature directions to update it.

remove_close_points(point_cloud, features, graph_delta_thresholds, old_graph_index)[source]#

Remove points from a point cloud unless sufficiently far away.

Points are removed unless sufficiently far away either by Euclidean distance, or feature-space.

Parameters:
  • point_cloud – List of 3D points

  • features

    ?

  • graph_delta_thresholds – dictionary of thresholds; if the L-2 distance between the locations of two observations (or other feature-distance measure) is below all of the given thresholds, then a point will be considered insufficiently interesting to be added

  • old_graph_index – If the graph is not new, the index associated with the final point in the old graph; we will skip this when checking for sameness, as they will already have been compared in the past to one-another, saving computation.

Returns:

List of 3D points that are sufficiently novel w.r.t one-another, along with their associated indices.

torch_graph_to_numpy(torch_graph)[source]#

Turn a torch geometric data structure into a dict with numpy arrays.

Parameters:

torch_graph – Torch geometric data structure.

Returns:

NumpyGraph.

tbp.monty.frameworks.utils.plot_utils#

A collection of plot utilities used during normal platform runtime.

add_patch_outline_to_view_finder(view_finder_image, center_pixel_id, patch_size)[source]#
mark_obs(vis_obs, patch_obs)[source]#

Mark vis_obs with the observations from a patch.

Returns:

Marked observations.

tbp.monty.frameworks.utils.plot_utils_analysis#

tbp.monty.frameworks.utils.plot_utils_dev#

tbp.monty.frameworks.utils.profile_utils#

bar_chart_cumtime(df, n_functions=None)[source]#
bar_chart_tottime(df, n_functions=None)[source]#
drop_filename(string)[source]#

Drop filename for shorter strings and easier viz.

We do this because strings for code calls are long.

Returns:

String without filename.

get_data_from_df(df, sortby='cumtime')[source]#
get_total_time(df)[source]#
linebreak_long_strings(string, chars_per_line=40)[source]#

Strings with filename are long, try to get them more readable in bar plots.

Parameters:
  • string – String to format.

  • chars_per_line – Number of characters per line. Defaults to 40.

Returns:

Formatted string.

print_top_k_functions(func_names, k=20)[source]#
sort_by_cumtime(df)[source]#
sort_by_tottime(df)[source]#

tbp.monty.frameworks.utils.sensor_processing#

arc_from_projection(tangent_projection: float, curvature: float, threshold: float = 0.001) float[source]#

Correct displacement to true arc length on a curved surface.

When a sensor moves along a curved surface, the straight-line displacement measured in the tangent plane underestimates the true distance traveled along the curve. This function corrects that by converting the displacement projection back to the actual arc length.

The correction assumes that the surface is locally a circle. This approximation holds well when curvature is approximately constant over the displacement, but is inaccurate for surfaces with rapidly varying curvature.

The relationship between arc length and its tangent-plane projection on a circle of curvature k is:

tangent_projection = sin(k * arc_length) / k

arc_length = arcsin(k * tangent_projection) / k

Reference:

Do Carmo, M.P. “Differential Geometry of Curves and Surfaces”, 2nd ed., Dover, 2016, Section 3-2.

Note

The formula works for both convex and concave surfaces because the arc-to-projection geometry on a circle is the same regardless of the sign of curvature.

Parameters:
  • tangent_projection (float) – Signed displacement component projected onto a tangent-plane basis direction.

  • curvature (float) – Normal curvature along the basis direction (from Euler’s formula). May be positive (convex) or negative (concave).

  • threshold (float) – Skip correction when |k * p| < threshold (the flat approximation is already accurate).

Return type:

float

Returns:

Estimated signed arc length. Returns tangent_projection unchanged if |k * p| < threshold (arc-chord difference is negligible) or |k * p| >= 1.0 (arcsin domain guard).
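A worked sketch of the correction, including the threshold and arcsin domain guards described above:

```python
import numpy as np


def arc_from_projection(tangent_projection, curvature, threshold=1e-3):
    # On a circle of curvature k, a tangent-plane projection p
    # corresponds to arc length arcsin(k * p) / k.
    kp = curvature * tangent_projection
    if abs(kp) < threshold or abs(kp) >= 1.0:
        # Flat approximation is already accurate, or arcsin domain guard.
        return tangent_projection
    return np.arcsin(kp) / curvature


# Quarter circle of radius 1 (k = 1): projection sin(pi/4) maps back to pi/4.
arc = arc_from_projection(np.sin(np.pi / 4), 1.0)
```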

arc_length_corrected_displacement(du: float, dv: float, basis_u: np.ndarray, basis_v: np.ndarray, principal_curvatures: np.ndarray, curvature_pose_vectors: np.ndarray) tuple[float, float][source]#

Convert chord-length displacements to arc-length along each basis axis.

Uses Euler’s formula to find the normal curvature in each basis direction, then corrects the flat-plane displacement to the corresponding arc length.

Parameters:
  • du – Displacement along basis_u (chord length).

  • dv – Displacement along basis_v (chord length).

  • basis_u – First tangent-frame basis vector.

  • basis_v – Second tangent-frame basis vector.

  • principal_curvatures – Array [k1, k2] of principal curvature magnitudes.

  • curvature_pose_vectors – Pose matrix whose rows [1] and [2] are the principal curvature directions.

Returns:

Arc-length-corrected displacements.

Return type:

(arc_u, arc_v)

center_neighbors(point_cloud, center_id, neighbor_patch_frac)[source]#

Get neighbors within a given neighborhood of the patch center.

Returns:

Locations and semantic IDs of all points within a given neighborhood of the patch center that lie on an object.

curvature_at_point(point_cloud, center_id, normal)[source]#

Compute principal curvatures from a point cloud.

Computes the two principal curvatures of a 2D surface and the corresponding principal directions.

Parameters:
  • point_cloud – Point cloud (2D numpy array) on which the local surface is approximated.

  • center_id – Center point around which the local curvature is estimated.

  • normal – Surface normal at the center point.

Returns:

k1 – First principal curvature.

k2 – Second principal curvature.

dir1 – First principal direction.

dir2 – Second principal direction.

Return type:

k1

directional_curvature(movement_direction: numpy.typing.ArrayLike, k1: float, k2: float, pc1_dir: numpy.ndarray, pc2_dir: numpy.ndarray) float[source]#

Compute normal curvature in a given direction via Euler’s curvature formula.

Returns the scalar normal curvature of the surface along movement_direction, given the two principal curvatures and their directions.

k(theta) = k1 * cos^2(theta) + k2 * sin^2(theta)

where theta is the angle between movement_direction and pc1_dir.

This formula is only valid when pc1_dir and pc2_dir are the principal curvature directions and not for arbitrary orthonormal vectors.

Reference: Weisstein, Eric W. “Euler Curvature Formula.” MathWorld. https://mathworld.wolfram.com/EulerCurvatureFormula.html

Parameters:
  • movement_direction (ArrayLike) – Direction vector (will be normalized).

  • k1 (float) – First principal curvature (corresponds to pc1_dir).

  • k2 (float) – Second principal curvature (corresponds to pc2_dir).

  • pc1_dir (ndarray) – First principal curvature direction (unit vector in tangent plane).

  • pc2_dir (ndarray) – Second principal curvature direction (unit vector in tangent plane).

Return type:

float

Returns:

Normal curvature in the given direction.

Raises:

ValueError – If pc1_dir and pc2_dir are not orthogonal, or if movement_direction does not lie in the plane spanned by pc1_dir and pc2_dir.
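
Euler's formula above can be sketched in a few lines of numpy (the function name is ours, and the orthogonality and coplanarity checks raised as ValueError above are omitted for brevity):

```python
import numpy as np

def directional_curvature_sketch(movement_direction, k1, k2, pc1_dir, pc2_dir):
    """Normal curvature along movement_direction via Euler's curvature formula."""
    d = np.asarray(movement_direction, dtype=float)
    d = d / np.linalg.norm(d)  # normalize the movement direction
    cos_theta = np.dot(d, pc1_dir)  # cos of the angle to the first principal direction
    sin_theta = np.dot(d, pc2_dir)  # sin of the same angle, since pc2_dir is orthogonal
    return k1 * cos_theta**2 + k2 * sin_theta**2
```

For example, on a cylinder of radius r (k1 = 1/r around the circumference, k2 = 0 along the axis), moving at 45 degrees between the two principal directions yields a normal curvature of 1/(2r).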

is_coplanar(basis_1: numpy.typing.ArrayLike, basis_2: numpy.typing.ArrayLike, vector: numpy.typing.ArrayLike, tolerance: float = 1e-06) bool[source]#
Return type:

bool

is_orthogonal(v1: numpy.typing.ArrayLike, v2: numpy.typing.ArrayLike, tolerance: float = 1e-06) bool[source]#
Return type:

bool

is_unit_vector(vector: numpy.typing.ArrayLike, tolerance: float = 1e-06) bool[source]#
Return type:

bool

log_sign(to_scale)[source]#

Apply symlog to the input array, preserving sign.

This implementation ensures that the sign of the input values is preserved and avoids extreme outputs when values are close to 0.

Parameters:

to_scale – Array to scale.

Returns:

Scaled values of the array.
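
The docstring does not spell out the exact scaling; a common sign-preserving symlog that matches the description (roughly linear near 0, logarithmic for large magnitudes) is:

```python
import numpy as np

def log_sign_sketch(to_scale):
    # Assumed variant: sign(x) * log(1 + |x|). Behaves like x near zero
    # (avoiding extreme outputs) and compresses large magnitudes.
    to_scale = np.asarray(to_scale, dtype=float)
    return np.sign(to_scale) * np.log1p(np.abs(to_scale))
```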

pixel_dist_to_center(n_points, patch_width, center_id)[source]#

Extract the relative distance of each pixel to the patch center (in pixel space).

Parameters:
  • n_points – Total number of points in the patch.

  • patch_width – Width of the square patch.

  • center_id – ID of the patch center.

Returns:

Relative distance of each pixel to the patch center (in pixel space).

point_pair_features(pos_i, pos_j, normal_i, normal_j)[source]#

Return point pair features between two points.

Parameters:
  • pos_i – Location of point 1.

  • pos_j – Location of point 2.

  • normal_i – Surface normal of point 1.

  • normal_j – Surface normal of point 2.

Returns:

Point pair features.
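
The docstring does not list the feature layout. A common convention for point pair features is the pair distance plus three angles; the sketch below (name and layout are our assumptions, not taken from the source) follows that convention:

```python
import numpy as np

def point_pair_features_sketch(pos_i, pos_j, normal_i, normal_j):
    # d: vector from point i to point j
    d = np.asarray(pos_j, dtype=float) - np.asarray(pos_i, dtype=float)
    dist = np.linalg.norm(d)
    d_hat = d / dist

    def angle(a, b):
        # clip guards against floating-point drift outside [-1, 1]
        return np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))

    # (||d||, angle(n_i, d), angle(n_j, d), angle(n_i, n_j))
    return np.array([dist, angle(normal_i, d_hat),
                     angle(normal_j, d_hat), angle(normal_i, normal_j)])
```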

principal_curvatures(point_cloud_base, center_id, n_dir, neighbor_patch_frac=2.13, weighted=True, fit_intercept=True)[source]#

Compute principal curvatures from a point cloud.

Computes the two principal curvatures of a 2D surface and the corresponding principal directions.

Parameters:
  • point_cloud_base – Point cloud (2D numpy array) based on which the 2D surface is approximated.

  • center_id – Center point around which the local curvature is estimated.

  • n_dir – Surface normal at the center point.

  • neighbor_patch_frac – Fraction of the patch width that defines the standard deviation of the Gaussian distribution used to sample the weights; this defines a local neighborhood for principal curvature computation.

  • weighted – Boolean flag that determines if regression is weighted. The weighting scheme is defined in weight_matrix().

  • fit_intercept – Boolean flag that determines whether to fit an intercept term for the regression.

Returns:

k1: First principal curvature.

k2: Second principal curvature.

dir1: First principal direction.

dir2: Second principal direction.

scale_clip(to_scale, clip)[source]#

Clip values into a range and scale with the square root.

This can be used to bring Gaussian and mean curvatures into a reasonable range and remove outliers, which makes it easier to handle noise. The sign is preserved before applying the square root.

Parameters:
  • to_scale – Array where each element should be scaled.

  • clip – Range to which the array values should be clipped.

Returns:

Scaled values of the array.
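
A minimal sketch of such sign-preserving clip-and-square-root scaling, assuming a symmetric clip range [-clip, clip]:

```python
import numpy as np

def scale_clip_sketch(to_scale, clip):
    # Clip into [-clip, clip], then take a square root that preserves the sign.
    clipped = np.clip(np.asarray(to_scale, dtype=float), -clip, clip)
    return np.sign(clipped) * np.sqrt(np.abs(clipped))
```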

surface_normal_naive(point_cloud, patch_radius_frac=2.5)[source]#

Estimate surface normal.

This is a very simplified alternative to open3d’s estimate_normals that makes use of several assumptions specific to our case:

  • we know which locations are neighboring locations from the camera patch arrangement

  • we only need the surface normal at the center of the patch

TODO: Calculate surface normal from multiple points at different distances (tan_len values) and then take the average of them. Test if this improves robustness to raw sensor noise.

Parameters:
  • point_cloud – List of 3D coordinates with flags indicating whether each point lies on the object. Shape = [n, 4].

  • patch_radius_frac – Fraction of observation size to use for SN calculation. Default of 2.5 means that we look half_obs_dim//2.5 to the left, right, up and down. With a resolution of 64x64 that would be 12 pixels. The calculated tan_len (in this example 12) describes the distance of pixels used to span up the two tangent vectors to calculate the surface normals. These two vectors are then used to calculate the surface normal by taking the cross product. If we set tan_len to a larger value, the surface normal is more influenced by the global shape of the patch.

Returns:

norm: Estimated surface normal at the center of the patch.

valid_sn: Boolean indicating whether the surface normal was valid (True by default); an invalid surface normal means there were not enough points in the patch to make any estimate of the surface normal.

surface_normal_ordinary_least_squares(sensor_frame_data, cam_to_world, center_id, neighbor_patch_frac=3.2)[source]#

Extracts the surface normal direction from a noisy point cloud.

Uses ordinary least-squares fitting with error minimization along the view direction.

Parameters:
  • sensor_frame_data – Point cloud in sensor coordinates (assumes the full patch is provided, i.e., no preliminary filtering of off-object points).

  • cam_to_world – Matrix defining the sensor-to-world frame transformation.

  • center_id – ID of the center point in the point cloud.

  • neighbor_patch_frac – Fraction of the patch width that defines the local neighborhood within which to perform the least-squares fitting.

Returns:

surface_normal: Estimated surface normal at the center of the patch.

valid_sn: Boolean indicating whether the surface normal was valid; defaults to True. An invalid surface normal means there were not enough points in the patch to make any estimate of the surface normal.

surface_normal_total_least_squares(point_cloud_base, center_id, view_dir, neighbor_patch_frac=3.2)[source]#

Extracts the surface normal direction from a noisy point cloud.

Uses total least-squares fitting. Error minimization is independent of the view direction.

Parameters:
  • point_cloud_base – Point cloud in world coordinates (assumes the full patch is provided, i.e., no preliminary filtering of off-object points).

  • center_id – ID of the center point in the point cloud.

  • view_dir – Viewing direction used to adjust the sign of the estimated surface normal.

  • neighbor_patch_frac – Fraction of the patch width that defines the local neighborhood within which to perform the least-squares fitting.

Returns:

norm: Estimated surface normal at the center of the patch.

valid_sn: Boolean indicating whether the surface normal was valid; defaults to True. An invalid surface normal means there were not enough points in the patch to make any estimate of the surface normal.
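
The heart of a total-least-squares plane fit can be sketched with an SVD: the right singular vector with the smallest singular value of the centered neighborhood is the plane normal. The simplified sketch below (our naming) omits the neighborhood selection and validity handling:

```python
import numpy as np

def tls_normal_sketch(points, view_dir):
    # Center the neighborhood; the direction of least variance is the
    # total-least-squares plane normal (smallest singular value of the SVD).
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[-1]  # right singular vector with the smallest singular value
    # Flip the sign so the normal points against the viewing direction.
    if np.dot(normal, view_dir) > 0:
        normal = -normal
    return normal
```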

weight_matrix(n_points, center_id, neighbor_patch_frac=2.13)[source]#

Extract individual pixel weights for least-squares fitting.

Each pixel weight is sampled from a Gaussian distribution based on its distance to the patch center.

Parameters:
  • n_points – Total number of points in the full RGB-D square patch.

  • center_id – ID of the center point in the point cloud.

  • neighbor_patch_frac – Fraction of the patch width that defines the standard deviation of the Gaussian distribution used to sample the weights.

Returns:

Diagonal weight matrix of shape (n_points, 1).
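
A sketch of Gaussian distance-based pixel weights for a square patch (the exact distance metric and standard-deviation convention here are assumptions):

```python
import numpy as np

def gaussian_pixel_weights_sketch(patch_width, neighbor_patch_frac=2.13):
    # Euclidean distance of every pixel to the patch center, in pixel space.
    idx = np.arange(patch_width)
    yy, xx = np.meshgrid(idx, idx, indexing="ij")
    center = patch_width // 2
    dist = np.sqrt((yy - center) ** 2 + (xx - center) ** 2).ravel()
    # Gaussian falloff with std = patch_width / neighbor_patch_frac.
    sigma = patch_width / neighbor_patch_frac
    weights = np.exp(-0.5 * (dist / sigma) ** 2)
    return weights.reshape(-1, 1)  # one weight per pixel, shape (n_points, 1)
```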

tbp.monty.frameworks.utils.spatial_arithmetics#

class TangentFrame(surface_normal: numpy.ndarray) None[source]#

Bases: object

Orthonormal tangent frame on a surface.

Maintains a right-handed (u, v, n) basis where n is the surface normal, u is the horizontal tangent direction, and v is the vertical tangent direction. As the sensor moves across a curved surface, transport() rotates the tangent frame to match the new normal.

See:

https://en.wikipedia.org/wiki/Parallel_transport

Parameters:

surface_normal (ndarray) – Unit surface normal at the initial point.

__init__(surface_normal: numpy.ndarray) None[source]#

Initialize an orthonormal (u, v) basis in the tangent plane of a surface.

A surface normal defines a tangent plane but not a unique basis. We choose basis_u as the cross product of some_axis and the surface_normal, giving a horizontal tangent direction. basis_v follows as the cross product of the surface_normal and basis_u.

If the surface_normal is nearly parallel to some_axis (|cos(theta)| > 0.95), we fall back to using [0, 0, 1] to avoid a degenerate cross product.

Parameters:

surface_normal (ndarray) – Unit surface normal at the initial point.

transport(new_normal: numpy.ndarray) None[source]#

Parallel-transport the frame to a new surface normal.

As the sensor moves along a curved surface, the tangent plane rotates with the curvature (e.g. around a cylinder). Parallel transport transforms the basis (u, v) by exactly the rotation needed to stay in the new tangent plane. This is analogous to “unrolling” the curved surface.

Parameters:

new_normal (ndarray) – Unit surface normal at the new point.

Return type:

None

property basis_u: numpy.ndarray#

Horizontal tangent basis vector.

property basis_v: numpy.ndarray#

Vertical tangent basis vector.

property normal: numpy.ndarray#

Surface normal associated with this tangent frame.
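
The transport step can be sketched as the minimal rotation taking the old normal to the new one, applied to both basis vectors. This is a simplified stand-in for the class’s internal update, not its actual implementation:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def transport_basis_sketch(basis_u, basis_v, old_normal, new_normal):
    # Minimal rotation taking old_normal to new_normal: axis = n_old x n_new.
    axis = np.cross(old_normal, new_normal)
    norm = np.linalg.norm(axis)
    if norm < 1e-12:
        return basis_u, basis_v  # normals already aligned; nothing to do
    angle = np.arctan2(norm, np.dot(old_normal, new_normal))
    rot = Rotation.from_rotvec(axis / norm * angle)
    return rot.apply(basis_u), rot.apply(basis_v)
```

Because the same rotation is applied to u, v, and n, the transported frame stays orthonormal and right-handed.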

align_multiple_orthonormal_vectors(ms1, ms2, as_scipy=True)[source]#

Calculate rotations between multiple orthonormal vector sets.

Parameters:
  • ms1 – Multiple orthonormal vector sets with shape = (N, 3, 3).

  • ms2 – Orthonormal vectors to align with, shape = (3, 3).

  • as_scipy – Whether to return a list of N scipy.Rotation objects or a np.array of rotation matrices (N, 3, 3).

Returns:

List of N Rotations that align ms2 with each element in ms1.

align_orthonormal_vectors(m1, m2, as_scipy=True)[source]#

Calculate the rotation that aligns two sets of orthonormal vectors.

Parameters:
  • m1 – First set of orthonormal vectors.

  • m2 – Second set of orthonormal vectors to align with.

  • as_scipy – Whether to return a scipy rotation object or a rotation matrix. Defaults to True.

Returns:

If as_scipy is True, a tuple (Rotation, float) containing the alignment rotation and the corresponding alignment error. Otherwise returns (np.ndarray, None), where the array is the rotation matrix aligning the vectors.
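
One way such an alignment can be computed (not necessarily how this function does it) is scipy’s Rotation.align_vectors, which returns the best-fit rotation together with the root-sum-squared alignment error:

```python
import numpy as np
from scipy.spatial.transform import Rotation

# m1, m2: rows are orthonormal vectors (e.g. pose vectors of two observations).
m2 = np.eye(3)
m1 = Rotation.from_euler("z", 90, degrees=True).apply(m2)

# Find the rotation r such that r.apply(m2) best matches m1;
# rssd is the root-sum-squared distance of the aligned vectors.
r, rssd = Rotation.align_vectors(m1, m2)
```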

apply_rf_transform_to_points(locations, features, location_rel_model, object_location_rel_body, object_rotation, object_scale=1)[source]#

Apply location and rotation transform to locations and features.

These transforms tell us how to transform new observations into the existing model reference frame. They are calculated from the detected object pose.

Parameters:
  • locations – Locations to transform (in body reference frame). Shape (N, 3)

  • features – Features to transform (in body reference frame). Shape (N, F)

  • location_rel_model – Detected location of the sensor on the object (object reference frame).

  • object_location_rel_body – Location of the sensor in the body reference frame.

  • object_rotation – Rotation of the object in the world relative to the learned model of the object. Expresses how the object model needs to be rotated to be consistent with the observations. To transform the observed locations (rel. body) into the model’s reference frame, the inverse of this rotation is applied.

  • object_scale – Scale of the object relative to the model. Not used yet.

Note

The function can also be used in contexts other than transforming points from the body-centric to the object-centric reference frame.

Returns:

transformed_locations: Transformed locations.

features: Transformed features.
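
One plausible composition of these transforms is sketched below. The exact order of operations is our assumption based on the parameter descriptions, not taken from the source:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def rf_transform_sketch(locations, location_rel_model,
                        object_location_rel_body, object_rotation):
    # Shift the observed locations so the sensed point sits at the origin,
    # undo the detected object rotation (inverse, per the docstring), then
    # anchor the result at the detected location in the model reference frame.
    shifted = np.asarray(locations, dtype=float) - object_location_rel_body
    rotated = object_rotation.inv().apply(shifted)
    return rotated + location_rel_model
```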

check_orthonormal(matrix)[source]#

euler_to_quats(euler_rots, invert=False)[source]#

Convert Euler rotations to quaternions.

Parameters:
  • euler_rots – Euler rotations

  • invert – Whether to invert the rotation. Defaults to False.

Returns:

Quaternions

get_angle(vec1, vec2)[source]#

Get angle between two vectors.

NOTE: For efficiency reasons we assume vec1 and vec2 are already normalized (which is the case for surface normals and curvature directions).

Parameters:
  • vec1 – Vector 1

  • vec2 – Vector 2

Returns:

angle in radians
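
For already-normalized inputs this reduces to an arccos of the dot product; a minimal sketch (clipping guards against floating-point drift outside [-1, 1]):

```python
import numpy as np

def get_angle_sketch(vec1, vec2):
    # Assumes vec1 and vec2 are unit length, as surface normals and
    # curvature directions are.
    return np.arccos(np.clip(np.dot(vec1, vec2), -1.0, 1.0))
```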

get_angle_beefed_up(v1, v2)[source]#

Return the angle in radians between vectors ‘v1’ and ‘v2’.

If one of the vectors is undefined, return an arbitrarily large distance.

If one of the vectors is the zero vector, return an arbitrarily large distance.

Also enforces that vectors are unit vectors, which makes it less efficient than the standard get_angle.

>>> get_angle_beefed_up((1, 0, 0), (0, 1, 0))
1.5707963267948966
>>> get_angle_beefed_up((1, 0, 0), (1, 0, 0))
0.0
>>> get_angle_beefed_up((1, 0, 0), (-1, 0, 0))
3.141592653589793

get_angle_torch(v1, v2)[source]#

Get angle between two torch vectors.

Parameters:
  • v1 – Vector 1

  • v2 – Vector 2

Returns:

angle in radians

get_angles_for_all_hypotheses(hyp_f, query_f)[source]#

Get all angles for hypotheses and their neighbors at once.

hyp_f shape = (num_hyp, num_nn, 3)

query_f shape = (num_hyp, 3)

For each hypothesis we want to get num_nn angles.

Return shape = (num_hyp, num_nn)

Parameters:
  • hyp_f – Hypotheses features three pose vectors

  • query_f – Query features three pose vectors

Returns:

Angles between hypotheses and query pose vectors

get_more_directions_in_plane(vecs, n_poses) list[numpy.ndarray][source]#

Get a list of unit vectors, evenly spaced in a plane orthogonal to vecs[0].

This is used to sample possible poses orthogonal to the surface normal when the curvature directions are undefined (like on a flat surface).

Parameters:
  • vecs – Vectors; the returned directions lie in the plane orthogonal to vecs[0]

  • n_poses – Number of poses to get

Returns:

List of vectors evenly spaced in a plane orthogonal to vecs[0]
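
Sampling evenly spaced directions in the plane can be sketched by building an orthonormal basis of the plane and sweeping the angle. The helper-axis fallback mirrors the degenerate-case handling described for TangentFrame above; the details here are our assumptions:

```python
import numpy as np

def directions_in_plane_sketch(normal, n_poses):
    # Build any orthonormal basis (u, v) of the plane orthogonal to `normal`.
    normal = np.asarray(normal, dtype=float)
    normal = normal / np.linalg.norm(normal)
    helper = np.array([1.0, 0.0, 0.0])
    if abs(np.dot(helper, normal)) > 0.95:  # avoid a near-parallel helper axis
        helper = np.array([0.0, 0.0, 1.0])
    u = np.cross(normal, helper)
    u /= np.linalg.norm(u)
    v = np.cross(normal, u)
    # Sweep evenly spaced angles around the plane.
    angles = np.linspace(0.0, 2 * np.pi, n_poses, endpoint=False)
    return [np.cos(a) * u + np.sin(a) * v for a in angles]
```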

get_right_hand_angle(v1, v2, surface_normal)[source]#

get_unique_rotations(poses, similarity_th, get_reverse_r=True)[source]#

Get unique scipy.Rotations out of a list, given a similarity threshold.

Parameters:
  • poses – List of poses to get unique rotations from

  • similarity_th – Similarity threshold

  • get_reverse_r – Whether to get the reverse rotation. Defaults to True.

Returns:

euler_poses: Unique Euler poses.

r_poses: Unique rotations corresponding to euler_poses.

is_parallel(v1: numpy.typing.ArrayLike, v2: numpy.typing.ArrayLike, tolerance: float = 1e-06) bool[source]#

True when v1 and v2 point in the same or opposite direction.

Assumes unit-length inputs. The metric 1 - |cos(theta)| is compared against tolerance.

Parameters:
  • v1 (ArrayLike) – First unit vector.

  • v2 (ArrayLike) – Second unit vector.

  • tolerance (float) – Maximum value of 1 - |cos(theta)| to consider parallel.

Return type:

bool

Returns:

True if v1 and v2 are parallel (same or opposite direction).
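
The stated metric in a couple of lines (assuming unit-length inputs, as the docstring does):

```python
import numpy as np

def is_parallel_sketch(v1, v2, tolerance=1e-6):
    # 1 - |cos(theta)| is ~0 when the unit vectors point in the same
    # or opposite direction.
    return 1.0 - abs(np.dot(v1, v2)) <= tolerance
```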

non_singular_mat(a)[source]#

Return True if a matrix is non-singular, i.e. can be inverted.

Uses the condition number of the matrix, which approaches a very large value, 1 / sys.float_info.epsilon (where epsilon is the smallest possible floating-point difference), as the matrix becomes singular.

normalize(v: numpy.typing.ArrayLike, epsilon: float = 1e-06) numpy.ndarray[source]#

Normalize a vector to unit length.

Parameters:
  • v (ArrayLike) – Input vector to normalize.

  • epsilon (float) – Small epsilon value below which the vector is considered zero.

Return type:

ndarray

Returns:

Unit vector in the direction of v, with the same dtype as v.

Raises:

ValueError – If the vector has near-zero length (norm < epsilon).

pose_is_new(all_poses, new_pose, similarity_th) bool[source]#

Check if a pose is different from a list of poses.

Use the magnitude of the difference between quaternions as a measure of similarity and check that it is below similarity_th.

Return type:

bool

Returns:

True if the pose is new, False otherwise
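
The quaternion-difference check can be sketched as below. Taking the smaller of ||q - q'|| and ||q + q'|| accounts for q and -q encoding the same rotation; whether the library does this is our assumption:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def pose_is_new_sketch(all_poses, new_pose, similarity_th):
    # Small quaternion-difference magnitude means the poses are similar.
    nq = new_pose.as_quat()
    for pose in all_poses:
        q = pose.as_quat()
        # q and -q represent the same rotation, so compare both signs.
        diff = min(np.linalg.norm(q - nq), np.linalg.norm(q + nq))
        if diff < similarity_th:
            return False  # a similar pose already exists
    return True
```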

project_onto_tangent_plane(v: numpy.typing.ArrayLike, n: numpy.typing.ArrayLike) numpy.ndarray[source]#

Project a vector onto the tangent plane perpendicular to a normal.

Removes the component of v that is parallel to n, leaving only the component that lies in the plane perpendicular to n.

Parameters:
  • v (ArrayLike) – Vector to project.

  • n (ArrayLike) – Normal vector defining the tangent plane. Normalized internally.

Return type:

ndarray

Returns:

The projection of v onto the plane perpendicular to n.
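
This is the standard rejection of v from n, sketched here with internal normalization of the normal as the docstring describes:

```python
import numpy as np

def project_onto_tangent_plane_sketch(v, n):
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)  # normalize the plane normal
    v = np.asarray(v, dtype=float)
    return v - np.dot(v, n) * n  # remove the component parallel to n
```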

rot_mats_to_quats(rot_mats, invert=False)[source]#

Convert rotation matrices to quaternions.

Parameters:
  • rot_mats – Rotation matrices

  • invert – Whether to invert the rotation. Defaults to False.

Returns:

Quaternions

rotate_multiple_pose_dependent_features(features, ref_frame_rot) dict[source]#

Rotate surface normals and curvature directions given a rotation matrix.

Parameters:
  • features – dict of features with pose vectors to rotate. Pose vectors have shape (N, 9)

  • ref_frame_rot – scipy rotation to rotate pose vectors with.

Return type:

dict

Returns:

Features with rotated pose vectors

rotate_pose_dependent_features(features, ref_frame_rots) dict[source]#

Rotate pose_vectors given a list of rotation matrices.

Parameters:
  • features – Dict of features with pose vectors to rotate. Pose vectors have shape (3, 3).

  • ref_frame_rots – Rotation matrices to rotate pose features by. Can be either a single scipy rotation (as used in FeatureGraphLM) or an array of rotation matrices of shape (N, 3, 3) or (3, 3) (as used in EvidenceGraphLM).

Return type:

dict

Returns:

Original features but with the pose_vectors rotated. If multiple rotations were given, pose_vectors entry will now contain multiple entries of shape (N, 3, 3).

rotations_to_quats(rotations, invert=False)[source]#

tbp.monty.frameworks.utils.transform_utils#

numpy_to_scipy_quat(quat)[source]#

Convert from wxyz to xyzw format of quaternions.

i.e. identity rotation in scipy is (0,0,0,1).

Parameters:

quat – A quaternion in wxyz format

Returns:

A quaternion in xyzw format
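
The conversion is just a reordering of the scalar (w) component from the front to the back, e.g. identity (1, 0, 0, 0) in wxyz becomes (0, 0, 0, 1) in xyzw:

```python
import numpy as np

def wxyz_to_xyzw(quat):
    # Move the scalar (w) component from the front to the back.
    w, x, y, z = quat
    return np.array([x, y, z, w])
```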

rotation_as_quat(rot: scipy.spatial.transform.Rotation, scalar_first: bool = True) numpy.ndarray[source]#

Convert a scipy rotation to its quaternion representation.

Scipy added a scalar_first argument to Rotation.as_quat in version 1.14.0. (https://scipy.github.io/devdocs/release/1.14.0-notes.html). This function backports that behavior. Note, however, that scipy defaults to scalar-last format.

Parameters:
  • rot (Rotation) – The scipy rotation object to convert.

  • scalar_first (bool) – Whether to return the array in (w, x, y, z) or (x, y, z, w) order. Defaults to True, i.e., (w, x, y, z) order.

Return type:

ndarray

Returns:

An array with shape (4,) representing a single quaternion, or an array with shape (N, 4) representing N quaternions.
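
Such a backport can be sketched as a roll of scipy’s scalar-last output; this is one way to implement the described behavior, not necessarily the library’s:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def rotation_as_quat_sketch(rot, scalar_first=True):
    # Rotation.as_quat() returns scalar-last (x, y, z, w); roll the last
    # axis by one to move w to the front when scalar_first is requested.
    q = rot.as_quat()
    return np.roll(q, 1, axis=-1) if scalar_first else q
```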

rotation_from_quat(quat: numpy.typing.ArrayLike, scalar_first: bool = True) scipy.spatial.transform.Rotation[source]#

Create a scipy rotation object from a quaternion.

Scipy added a scalar_first argument to Rotation.from_quat in version 1.14.0. (https://scipy.github.io/devdocs/release/1.14.0-notes.html). This function backports that behavior. Note, however, that scipy defaults to scalar-last format.

Parameters:
  • quat (ArrayLike) – An array with shape (4,) for a single quaternion, or an array with shape (N, 4) for N quaternions.

  • scalar_first (bool) – Whether the scalar component is first or last. Defaults to True, i.e., (w, x, y, z) order.

Return type:

Rotation

Returns:

The scipy rotation object.

scipy_to_numpy_quat(quat: numpy.ndarray) quaternion.quaternion[source]#
Return type:

quaternion