Tracking

Functions to help the tracking process.

The idea is to provide production-ready functions that are preconfigured with the best-fitting machine learning algorithms, hyperparameters and features.

make_detection_score_fun(dw_truth, frame_diff=1, radius=110, clf=None, **kwargs)[source]

Function to generate a scoring function that scores tracks and matching detections.

Note

The default classifier, parameters and features should be set to the best default for beesbook tracking.

This score function uses a LinearSVC SVM Classifier from scikit-learn and the features distance via distances_positions_v() and id similarity via score_id_sim_orientation_v().

Parameters:

dw_truth (DataWrapperTruth) – DataWrapperTruth with truth data

Keyword Arguments:
 
  • clf (scikit-learn classifier) – a scikit-learn classifier that will be trained
  • frame_diff (int) – a track is closed after n frames if no matching object is found
  • radius (int) – radius in image coordinates to restrict neighborhood search
  • **kwargs (dict) – keyword arguments for train_bin_clf()
Returns:

tuple containing:

  • score_fun (func): scoring function to score tracks and matching detections
  • clf (scikit-learn classifier): the trained scikit-learn classifier

Return type:

tuple

make_track_score_fun(dw_truth, frame_diff=15, radius=<sphinx.ext.autodoc._MockModule object>, clf=None, **kwargs)[source]

Function to generate a scoring function that scores tracks and matching tracks.

Note

The default classifier, parameters and features should be set to the best default for beesbook tracking.

Parameters:
  • dw_truth (DataWrapperTruthTracks) – DataWrapperTruthTracks with truth data
  • frame_diff (int) – a track is closed after n frames if no matching object is found
  • radius (int) – radius in image coordinates to restrict neighborhood search
Keyword Arguments:
 
  • clf (scikit-learn classifier) – a scikit-learn classifier that will be trained
  • **kwargs (dict) – keyword arguments for train_bin_clf()
Returns:

tuple containing:

  • score_fun (func): scoring function to score matching tracks
  • clf (scikit-learn classifier): the trained scikit-learn classifier

Return type:

tuple

Todo:

  • set default classifier, parameters and features!

Scoring

Defines some functions that compare features and score them.

Deprecated scoring functions are not imported to the parent module!

Note

Functions with suffix _v are vectorized with numpy.

score_ids_best_fit(ids1, ids2, length=12)[source]

Compares two lists of ids by choosing the pair with the best score.

Deprecated since version September 2016: This scoring is used with data from the old Computer Vision Pipeline.

Parameters:
  • ids1 (list of int) – Iterable with ids as integers as base ids
  • ids2 (list of int) – Iterable with ids as integers to compare base to
Keyword Arguments:
 

length (Optional int) – number of bits in the bit array used to compare ids

Returns:

Uses Hamming distance of best matching pair

Return type:

float
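
As a rough illustration of the best-fit idea, the sketch below compares every pair of ids and keeps the smallest Hamming distance. The names hamming_distance and best_fit_distance are hypothetical, and returning the raw number of differing bits is an assumption; the deprecated library function may scale or invert this value.

```python
from itertools import product


def hamming_distance(id1, id2, length=12):
    """Count the differing bits in the lowest `length` bits of two integer ids."""
    mask = (1 << length) - 1
    return bin((id1 ^ id2) & mask).count("1")


def best_fit_distance(ids1, ids2, length=12):
    """Smallest Hamming distance over all (base id, candidate id) pairs.

    Assumption: the raw bit count is returned as a float.
    """
    return float(min(hamming_distance(a, b, length)
                     for a, b in product(ids1, ids2)))


# 0b101 and 0b100 differ in exactly one bit, which is the best pair here.
example_score = best_fit_distance([0b101, 0b111], [0b100, 0b010])
```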

score_ids_best_fit_rotating(ids1, ids2, rotation_penalty=1, length=12)[source]

Compares two lists of ids by choosing the pair with the best score while also rotating the bits.

Deprecated since version September 2016: This scoring is used with data from the old Computer Vision Pipeline.

Parameters:
  • ids1 (list of int) – Iterable with ids as integers as base ids
  • ids2 (list of int) – Iterable with ids as integers to compare base to
Keyword Arguments:
 
  • rotation_penalty (Optional float) – the penalty that is added for a rotation of 1 to the left or right
  • length (Optional int) – number of bits in the bit array used to compare ids
Returns:

Uses Hamming distance of best matching (rotated) pair

Return type:

float
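
The rotation variant can be sketched for a single pair of ids as follows. This is an illustration under assumptions, not the library implementation: rotate_bits, rotating_distance and the max_shift parameter are hypothetical names, and the number of rotation steps the library actually tries is not stated in this documentation.

```python
def rotate_bits(value, shift, length=12):
    """Rotate the lowest `length` bits left by `shift` (negative shifts rotate right)."""
    mask = (1 << length) - 1
    value &= mask
    shift %= length
    return ((value << shift) | (value >> (length - shift))) & mask


def rotating_distance(id1, id2, rotation_penalty=1, length=12, max_shift=1):
    """Best Hamming distance between id1 and rotated copies of id2.

    Each rotation step adds `rotation_penalty` to the distance.
    Assumption: rotations of up to `max_shift` steps in both directions are tried.
    """
    mask = (1 << length) - 1
    best = None
    for shift in range(-max_shift, max_shift + 1):
        differing = bin((id1 ^ rotate_bits(id2, shift, length)) & mask).count("1")
        score = differing + rotation_penalty * abs(shift)
        if best is None or score < best:
            best = score
    return float(best)
```

With these assumptions, an id that matches perfectly after one rotation step scores exactly one rotation_penalty, which is better than the two differing bits of the unrotated comparison.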

score_ids_and_orientation(id_orientation_tuple1, id_orientation_tuple2, length=12, range_bonus_orientation=30)[source]

Compares lists of ids by choosing the pair with the best score and considering orientation.

The bonus is equal to the negative score of one non matching bit.

Deprecated since version September 2016: This scoring is used with data from the old Computer Vision Pipeline.

Parameters:
  • id_orientation_tuple1 (tuple) – (Iterable with ids, Iterable with orientations)
  • id_orientation_tuple2 (tuple) – (Iterable with ids, Iterable with orientations)
Keyword Arguments:
 
  • length (int) – number of bits in the bit array used to compare ids
  • range_bonus_orientation (float) – range in degrees, so that two orientations get a bonus
Returns:

Uses Hamming distance of best matching pair with bonus for same orientation

Return type:

float

score_id_sim(id1, id2)[source]

Compares two id frequency distributions for similarity

Parameters:
  • id1 (list of float) – id bit frequency distribution of base id
  • id2 (list of float) – id bit frequency distribution of id to compare base to
Returns:

Use Manhattan distance \(\sum_i |id1_i - id2_i|\)

Return type:

float
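
The Manhattan distance from the formula above can be written in a few lines. The function name manhattan_distance is hypothetical and the sketch works on plain lists of floats rather than on library objects.

```python
def manhattan_distance(id1, id2):
    """Sum of absolute differences between two bit frequency distributions."""
    return sum(abs(a - b) for a, b in zip(id1, id2))


# Only the first bit differs, by 0.5.
example_distance = manhattan_distance([1.0, 0.0, 0.5], [0.5, 0.0, 0.5])
```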

score_id_sim_v(detections1, detections2)[source]

Compares two id frequency distributions for similarity (vectorized)

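Parameters:
  • detections1 (list of Detection) – Iterable with Detection objects
  • detections2 (list of Detection) – Iterable with Detection objects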
Returns:

Use Manhattan distance \(\sum_i |id1_i - id2_i|\)

Return type:

np.array

score_id_sim_orientation(id1, or1, id2, or2, range_bonus_orientation=30, value_bonus_orientation=1.0)[source]

Compares two id frequency distributions for similarity

Parameters:
  • id1 (list of float) – id bit frequency distribution of base id
  • or1 (float) – orientation belonging to id1
  • id2 (list of float) – id bit frequency distribution of id to compare base to
  • or2 (float) – orientation belonging to id2
Keyword Arguments:
 
  • range_bonus_orientation (Optional float) – range in degrees, so that two orientations get a bonus
  • value_bonus_orientation (Optional float) – value to add if orientations are within range_bonus_orientation
Returns:

Use Manhattan distance \(\sum_i |id1_i - id2_i|\)

Return type:

float

score_id_sim_orientation_v(detections1, detections2, range_bonus_orientation=0.5235987755982988, value_bonus_orientation=1.0)[source]

Compares id frequency distributions for similarity (vectorized)

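Parameters:
  • detections1 (list of Detection) – Iterable with Detection objects
  • detections2 (list of Detection) – Iterable with Detection objects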
Keyword Arguments:
 
  • range_bonus_orientation (Optional float) – range in radians, so that two orientations get a bonus (the default 0.5235987755982988 is π/6, i.e. 30 degrees)
  • value_bonus_orientation (Optional float) – value to add if orientations are within range_bonus_orientation
Returns:

Use Manhattan distance \(\sum_i |id1_i - id2_i|\)

Return type:

np.array

score_id_sim_rotating(id1, id2, rotation_penalty=0.5)[source]

Compares two id frequency distributions for similarity by rotating them

Instead of only using the distance metric, one id is rotated to check for better results.

Parameters:
  • id1 (list) – id bit frequency distribution of base id
  • id2 (list) – id bit frequency distribution of id to compare base to
Keyword Arguments:
 

rotation_penalty (float) – the penalty that is added for a rotation of 1 to the left or right

Returns:

Manhattan distance but also rotated with added rotation_penalty.

Return type:

float
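
A minimal sketch of the rotating comparison for bit frequency distributions is given below. The helper names and the max_shift parameter are hypothetical; trying one step to the left and one to the right is an assumption based on the wording of rotation_penalty.

```python
def rotated(seq, shift):
    """Return a copy of seq rotated right by `shift` positions (negative = left)."""
    shift %= len(seq)
    return seq[-shift:] + seq[:-shift]


def rotating_manhattan(id1, id2, rotation_penalty=0.5, max_shift=1):
    """Smallest Manhattan distance between id1 and rotated copies of id2.

    Every rotation step adds `rotation_penalty` to the distance.
    """
    scores = []
    for shift in range(-max_shift, max_shift + 1):
        candidate = rotated(id2, shift)
        distance = sum(abs(a - b) for a, b in zip(id1, candidate))
        scores.append(distance + rotation_penalty * abs(shift))
    return min(scores)
```

In this sketch the penalty keeps a rotated match from beating an equally good unrotated match, so rotation only wins when it removes more distance than it costs.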

score_id_sim_rotating_v(detections1, detections2, rotation_penalty=0.5)[source]

Compares id frequency distributions for similarity by rotating them (vectorized)

Instead of only using the distance metric, one id is rotated to check for better results.

Note

The calculation of this feature is quite expensive compared to score_id_sim_v()!

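Parameters:
  • detections1 (list of Detection) – Iterable with Detection objects
  • detections2 (list of Detection) – Iterable with Detection objects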
Keyword Arguments:
 

rotation_penalty (Optional float) – the penalty that is added for a rotation of 1 to the left or right

Returns:

Manhattan distance but also rotated with added rotation_penalty.

Return type:

np.array

score_id_sim_tracks_median_v(tracks1, tracks2)[source]

Compares id frequency distributions of tracks by comparing the median (vectorized)

Parameters:
  • tracks1 (list of Track) – Iterable with Tracks
  • tracks2 (list of Track) – Iterable with Tracks
Returns:

Use Manhattan distance \(\sum_i |Median(ids1)_i - Median(ids2)_i|\)

Return type:

np.array

distance_orientations(rad1, rad2)[source]

Calculates the distance between two orientations.

Orientations in rad are converted to degrees then the difference is calculated.

Parameters:
  • rad1 (float) – orientation of first detection in rad
  • rad2 (float) – orientation of second detection in rad
Returns:

distance between two orientations in degrees, e.g. 90 for -pi/2 and pi

Return type:

float
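
The example value of 90 for -pi/2 and pi implies that the difference wraps around the circle. One way to obtain that behaviour is sketched below; the function name orientation_distance_deg is hypothetical and the wrap-around via modulo is an assumption about how the library computes it.

```python
import math


def orientation_distance_deg(rad1, rad2):
    """Smallest angle in degrees between two orientations given in radians."""
    diff = abs(math.degrees(rad1) - math.degrees(rad2)) % 360.0
    # Going the other way around the circle may be shorter.
    return min(diff, 360.0 - diff)


# -90 degrees and 180 degrees are 270 degrees apart, or 90 the short way round.
example_angle = orientation_distance_deg(-math.pi / 2, math.pi)
```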

distance_orientations_v(detections1, detections2, meta_key=None)[source]

Calculates the distances between orientations (vectorized)

Orientations are expected to be in rad.

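Parameters:
  • detections1 (list of Detection) – Iterable with Detection objects
  • detections2 (list of Detection) – Iterable with Detection objects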
Keyword Arguments:
 

meta_key (Optional str) – Instead of Detection.orientation use value from Detection.meta

Returns:

distance between orientations of detections line by line in rad

Return type:

np.array

distance_positions_v(detections1, detections2)[source]

Calculates the euclidean distances between the x and y positions (vectorized)

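Parameters:
  • detections1 (list of Detection) – Iterable with Detection objects
  • detections2 (list of Detection) – Iterable with Detection objects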
Returns:

Euclidean distance between detections line by line

Return type:

np.array
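
"Line by line" means that the i-th entry of the first list is compared with the i-th entry of the second list. The sketch below shows this pairing with plain (x, y) tuples and without numpy; the function name and the tuple representation are assumptions, since the library works on Detection objects and returns an np.array.

```python
import math


def pairwise_position_distances(positions1, positions2):
    """Euclidean distance between the i-th entries of two lists of (x, y) tuples."""
    return [math.hypot(x1 - x2, y1 - y2)
            for (x1, y1), (x2, y2) in zip(positions1, positions2)]


example_distances = pairwise_position_distances([(0, 0), (1, 1)], [(3, 4), (1, 1)])
```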

bit_array_to_int_v(bit_arrays, threshold=0.5, endian='little')[source]

Converts the bit frequency distribution of the id to an integer representation.

Note

Instead of bit_arrays you could also pass lists with Detection objects. In this case the Detection.beeId is used and interpreted as bit array.

Parameters:

bit_arrays (list of arrays or Detection) – Iterable with ids to decode.

Keyword Arguments:
 
  • threshold (Optional float) – values >= threshold are interpreted as 1
  • endian (Optional str) – Either little for little endianness or big for big endianness.
Returns:

the decoded ids represented as integer

Return type:

list of int
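
The decoding step can be sketched for a single bit array as shown below. The function name bit_array_to_int is hypothetical, and the interpretation of endian is an assumption: here little means the first entry is the least significant bit and big means the first entry is the most significant bit.

```python
def bit_array_to_int(bit_array, threshold=0.5, endian="little"):
    """Decode one bit frequency distribution into an integer id."""
    bits = [1 if value >= threshold else 0 for value in bit_array]
    if endian == "big":
        # Reverse so that the first entry ends up as the most significant bit.
        bits = bits[::-1]
    elif endian != "little":
        raise ValueError("endian must be 'little' or 'big'")
    return sum(bit << position for position, bit in enumerate(bits))


little_id = bit_array_to_int([0.9, 0.2, 0.1])            # bits 1, 0, 0
big_id = bit_array_to_int([0.9, 0.2, 0.1], endian="big")
```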

calc_median_ids(tracks)[source]

Helper to calculate the median bit for all the ids in the given track.

Note

For performance reasons the median id is saved as meta key and only is recalculated if the length of the track changes.

Parameters:

tracks (list of Track) – Iterable with Tracks

Returns:

median for all the bits in the given track

Return type:

np.array

calc_track_ids(tracks)[source]

Function to calculate an id for a Track.

This function calculates the median for each bit of the detections in the track and then uses a threshold to decide whether a bit is set or not.

Note

Used as default implementation to calculate Score.calc_id.

Parameters:

tracks (list of Track) – A list of Track objects; the id of each Track is calculated from its list of Detection objects.

Returns:

the calculated id for the Track

Return type:

int
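
The median-then-threshold step can be illustrated for a single track as follows. The sketch represents a track as a plain list of bit frequency distributions, which is an assumption; the function names and the threshold of 0.5 are hypothetical as well.

```python
from statistics import median


def median_bits(bit_arrays):
    """Median of every bit position over all detections of one track."""
    return [median(column) for column in zip(*bit_arrays)]


def track_bits(bit_arrays, threshold=0.5):
    """Decide per bit whether it is set, based on the median frequency."""
    return [1 if value >= threshold else 0 for value in median_bits(bit_arrays)]


# Three detections with two bits each.
detections = [[0.9, 0.1], [0.8, 0.4], [0.2, 0.6]]
example_bits = track_bits(detections)
```

Using the median rather than the mean keeps a single badly decoded detection from flipping a bit of the track id.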

Training

Code to help train and evaluate models, generate scoring functions and training data.

To help generate learning data we provide some helpers. The provided functions combine the generation of learning data with the training and evaluation of classifiers. You may also use the generated learning data and scoring functions for your own training and evaluation process.

generate_learning_data(dw_truth, features, frame_diff, radius)[source]

Function to generate learning data using truth data.

The features are the same that could be used to train a binary classifier.

Warning

It is recommended to use a collections.OrderedDict instead of a regular dict for features because the order in dict is not guaranteed.

Parameters:
  • dw_truth (DataWrapperTruth) – DataWrapperTruth with truth data
  • features (dict) – {feature: score_fun(tracks, frame_objects_test)} mapping
  • frame_diff (int) – after n frames a track is closed if no matching object is found
  • radius (int) – radius in image coordinates to restrict neighborhood search
Returns:

tuple containing:

  • x_data (np.array): The learning data

  • y_data (np.array): The correct classes for the learning data

  • tracks (list of Track): list of tracks generated while generating learning data

Return type:

tuple
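
The warning about ordering matters because each feature becomes one column of the learning data. The sketch below shows how an OrderedDict fixes the column order. It is a simplified illustration: the feature functions here score a single pair of dictionaries, whereas the library's score functions take lists of tracks and frame objects, and all names in the sketch are hypothetical.

```python
import math
from collections import OrderedDict


def position_distance(a, b):
    """Euclidean distance between two objects with x and y entries."""
    return math.hypot(a["x"] - b["x"], a["y"] - b["y"])


def id_distance(a, b):
    """Manhattan distance between two bit frequency distributions."""
    return sum(abs(p - q) for p, q in zip(a["id"], b["id"]))


# The insertion order defines the column order of the feature matrix.
features = OrderedDict([
    ("distance", position_distance),
    ("id_sim", id_distance),
])


def build_feature_rows(features, pairs):
    """One row per pair, one column per feature, in the order of `features`."""
    return [[score(a, b) for score in features.values()] for a, b in pairs]


pairs = [({"x": 0, "y": 0, "id": [1.0, 0.0]},
          {"x": 3, "y": 4, "id": [0.5, 0.0]})]
rows = build_feature_rows(features, pairs)
```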

train_and_evaluate(clf, x_data, y_data, verbose=False, **kwargs)[source]

Function to train and evaluate a Classifier.

When verbose is True then several metrics are calculated and printed. For training only 90 percent of the dataset is used!

Parameters:
  • clf (scikit-learn classifier) – a scikit-learn classifier that will be trained
  • x_data (np.array) – learning data
  • y_data (np.array) – the classes for the learning data
Keyword Arguments:
 
  • verbose (Optional bool) – if true calculates accuracy, 10-fold cross validation...
  • **kwargs (dict) – Keyword arguments for clf.fit().

train_bin_clf(clf, dw_truth, features, frame_diff, radius, **kwargs)[source]

Function to train a binary classifier using truth data.

The features are used to train a binary classifier to corresponding frame objects.

Warning

It is recommended to use a collections.OrderedDict instead of a regular dict for features because the order in dict is not guaranteed.

You will also get a generic scoring function that is compatible with SimpleWalker and the generated learning data for training the classifier. Use this data if you want to use some custom training methods.

Parameters:
  • clf (scikit-learn classifier) – a scikit-learn classifier that will be trained
  • dw_truth (DataWrapperTruth) – DataWrapperTruth with truth data
  • features (dict) – {feature: score_fun(tracks, frame_objects_test)} mapping
  • frame_diff (int) – after n frames a track is closed if no matching object is found
  • radius (int) – radius in image coordinates to restrict neighborhood search
Keyword Arguments:
 
  • verbose (Optional bool) – if true prints some information about training success
  • **kwargs (dict) – Keyword arguments for train_and_evaluate() that are also passed to clf.fit().
Returns:

tuple containing:

  • x_data (np.array): The learning data
  • y_data (np.array): The correct classes for the learning data
  • score_fun (func): A generic scoring function for the Classifier

Return type:

tuple

Walker

Walkers provide a simple framework for walking through the beesbook data.

It will provide detections, manage tracks, assign detections and close tracks.

You basically just have to provide a scoring function. Examples and helpers to generate scoring functions are provided in training.

class SimpleWalker(data_wrapper, score_fun, frame_diff, radius, track_prefix=None)[source]

Bases: object

Class for walking through the beesbook data.

track_id_count = 0

int: counter to increment unique track ids

max_weight = 1000.0

float: used to mark non assignable pairs in a cost matrix

prune_weight = 1000.0

float: used to ignore claims with bad weights

min_track_start_length = 1

int: minimum length of track to start a new track as base of a path. Only relevant when assigning Track to other Track objects.

__init__(data_wrapper, score_fun, frame_diff, radius, track_prefix=None)[source]

Initialization of a simple Walker to calculate tracks.

This Walker runs through the data one time step at a time, without jumps or sliding windows, hence the name.

The Walker can be used multiple times, but all uses share the track_id_count to generate unique Track ids.

Parameters:
  • data_wrapper (DataWrapper) – a DataWrapper object to access frame objects
  • score_fun (func) – scoring function to calculate the weights between two frame objects
  • frame_diff (int) – after n frames a track is closed if no matching object is found
  • radius (int) – radius in image coordinates to restrict neighborhood search
Keyword Arguments:

track_prefix (Optional str) – prefix for Track.id for unique track ids

data = None

DataWrapper: the DataWrapper object to access detections

score_fun = None

func: scoring function to calculate the weights between two frame objects

frame_diff = None

int: after n frames a track is closed if no matching frame object is found

radius = None

int: radius in image coordinates to restrict neighborhood search

track_prefix = None

str: prefix for Track.id for unique track ids over several instances

assigned_tracks = None

set: keeps track of all the tracks that are already assigned

calc_tracks(start=None, stop=None)[source]

Merge frame objects to bigger Track objects.

Note

At the moment this walker only considers the frame objects on each camera separately. This will have to be adjusted once the stitching is completed.

Keyword Arguments:
 
  • start (timestamp) – restrict to frames with timestamp >= start
  • stop (timestamp) – restrict to frames with timestamp < stop
Returns:

list of merged Track

Return type:

list of Track
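
To make the roles of radius, max_weight and prune_weight more concrete, the sketch below builds a cost matrix between open tracks and candidate detections and then assigns them greedily. This is an illustration under assumptions and not the SimpleWalker implementation: the documentation does not say which assignment algorithm the walker uses, so the greedy strategy is a swapped-in simplification, and positions are plain (x, y) tuples with hypothetical function names.

```python
import math

MAX_WEIGHT = 1000.0  # same value as the documented max_weight attribute


def build_cost_matrix(track_ends, detections, score_fun, radius):
    """Rows are open tracks, columns are candidate detections.

    Pairs further apart than `radius` are marked as not assignable.
    """
    matrix = []
    for tx, ty in track_ends:
        row = []
        for dx, dy in detections:
            if math.hypot(tx - dx, ty - dy) > radius:
                row.append(MAX_WEIGHT)
            else:
                row.append(score_fun((tx, ty), (dx, dy)))
        matrix.append(row)
    return matrix


def greedy_assign(matrix, prune_weight=1000.0):
    """Assign each row to at most one column, cheapest pairs first.

    Pairs with a weight >= prune_weight are ignored.
    """
    candidates = sorted((weight, r, c)
                        for r, row in enumerate(matrix)
                        for c, weight in enumerate(row)
                        if weight < prune_weight)
    used_rows, used_cols, assignment = set(), set(), {}
    for weight, r, c in candidates:
        if r in used_rows or c in used_cols:
            continue
        assignment[r] = c
        used_rows.add(r)
        used_cols.add(c)
    return assignment


def euclidean(a, b):
    """Hypothetical scoring function: lower weight means a better match."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


track_ends = [(0, 0), (100, 100)]
detections = [(3, 4), (500, 500), (101, 100)]
matrix = build_cost_matrix(track_ends, detections, euclidean, radius=110)
assignment = greedy_assign(matrix)
```

In this sketch a track that receives no assignment would stay open until frame_diff frames have passed without a match.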