Dataset

A Dataset combines images and annotations to facilitate processing.

Use a Dataset to:

  • query images and annotations in remo

  • annotate

  • export annotations

  • feed data to a training model

class remo.Dataset

Remo dataset

documentation
class remo.Dataset(id: int = None, name: str = None, quantity: int = 0, \*\*kwargs)
  • Parameters

    • id – dataset id

    • name – dataset name

    • quantity – number of images


add_annotations

Adds annotations to the Dataset. If annotation_set_id is not specified, annotations are added to the default Annotation Set. Note: this method is currently slow and will be improved in the future. For faster uploads, use .add_data(), which requires converting your annotations to a file format supported by Remo.

documentation
add_annotations(annotations: List[remo.domain.annotation.Annotation], annotation_set_id: int = None)
  • Parameters

    • annotations – list of annotations objects

    • annotation_set_id – annotation set id


add_annotations_from_file

Uploads annotations from a custom annotation file to an annotation set. If your file is in a supported annotation format, you can use add_data() directly.

documentation
add_annotations_from_file(file_path: str, parser_function: Callable[[str], List[remo.domain.annotation.Annotation]], annotation_set_id: int = None)
  • Parameters

    • file_path – path to annotation file to upload

    • parser_function – function which receives file_path and returns a List[remo.Annotation]

    • annotation_set_id – id of the annotation set to use

Example:

import csv

import remo
from remo import Annotation


def parser_function(file_path):
    '''
    File example:
    file_name,class_name
    000012dasd21e.jpg,Dog
    000012dasd221.jpg,Cat
    '''
    annotations = []
    with open(file_path, 'r', newline='') as f:
        csv_file = csv.reader(f, delimiter=',')
        next(csv_file, None)  # skip the header row shown in the file example
        for row in csv_file:
            file_name, class_name = row
            # One annotation per image, carrying a single class label
            annotation = Annotation(img_filename=file_name)
            annotation.add_item(classes=[class_name])
            annotations.append(annotation)
    return annotations


# The parser must be defined before it is passed to add_annotations_from_file
ds = remo.create_dataset(...)
ds.add_annotations_from_file('annotations.csv', parser_function)

add_data

Adds images and/or annotations to the dataset. To add annotations, you need to specify an annotation task.

documentation
add_data(local_files: List[str] = None, paths_to_upload: List[str] = None, urls: List[str] = None, annotation_task: str = None, folder_id: int = None, annotation_set_id: int = None, class_encoding=None)
  • Parameters

    • local_files – list of files or directories containing annotation and image files. These files will be linked. Folders will be recursively scanned for image files: jpg, jpeg, png, tif.

    • paths_to_upload – list of files or directories. These files will be copied. Supported files: images, annotations and archives.

      • image files: jpg, png, tif.

      • annotation files: json, xml, csv.

      • archive files: zip, tar, gzip.

      Unpacked archive will be scanned for images, annotations and nested archives.

    • urls – list of URLs pointing to downloadable targets, each of which can be an image, an annotation file or an archive.

    • annotation_task – specifies annotation task. See also: remo.task.

    • folder_id – specifies target folder in the dataset.

    • annotation_set_id – specifies target annotation set in the dataset.

    • class_encoding – specifies how to convert class labels in annotation files to classes. See also: remo.class_encodings.

  • Returns

    Dictionary with the results of linking files, uploading files and uploading URLs:

    {
        'files_link_result': ...,
        'files_upload_result': ...,
        'urls_upload_result': ...
    }
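
As a rough sketch of how the three source arguments might be combined in one call, the helper below forwards only the sources that were actually supplied. The name add_sources is hypothetical and not part of remo; the sketch relies only on the add_data signature and the result keys listed above, and `dataset` stands for any remo.Dataset.

```python
from typing import Any, Dict, List, Optional


def add_sources(
    dataset: Any,
    local_files: Optional[List[str]] = None,
    paths_to_upload: Optional[List[str]] = None,
    urls: Optional[List[str]] = None,
    annotation_task: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: call dataset.add_data() with the non-empty sources only."""
    sources = {
        'local_files': local_files,
        'paths_to_upload': paths_to_upload,
        'urls': urls,
    }
    # Drop sources that were not supplied so add_data() keeps its own defaults.
    kwargs = {name: value for name, value in sources.items() if value}
    if not kwargs:
        raise ValueError('provide at least one of local_files, paths_to_upload or urls')
    if annotation_task is not None:
        kwargs['annotation_task'] = annotation_task
    # The returned dictionary carries the three *_result keys documented above.
    return dataset.add_data(**kwargs)
```

The returned dictionary can then be inspected key by key, for example result['files_link_result'] after linking a local folder.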
    

annotation_sets

Lists the annotation sets within the dataset

documentation
annotation_sets()
  • Returns

    List[remo.AnnotationSet]


annotations

Returns all annotations for a given annotation set. If no annotation set is specified, the default annotation set will be used

documentation
annotations(annotation_set_id: int = None)
  • Parameters

    annotation_set_id – annotation set id

  • Returns

    List[remo.Annotation]
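
A minimal sketch of how annotation_sets() and annotations() might be used together, counting the annotations held by each annotation set. The function name is hypothetical, and the sketch assumes that remo.AnnotationSet exposes its id as an `id` attribute, which this page does not state.

```python
from typing import Any, Dict


def count_annotations_per_set(dataset: Any) -> Dict[int, int]:
    """Map each annotation set id to the number of annotations it holds."""
    counts = {}
    for annotation_set in dataset.annotation_sets():
        # Assumption: remo.AnnotationSet exposes its id as an `id` attribute.
        annotations = dataset.annotations(annotation_set_id=annotation_set.id)
        counts[annotation_set.id] = len(annotations)
    return counts
```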


classes

Lists all the classes within the dataset

documentation
classes(annotation_set_id: int = None)
  • Parameters

    annotation_set_id – annotation set id. If not specified the default annotation set is considered.

  • Returns

    List of classes


create_annotation_set

Creates a new annotation set. If path_to_annotation_file is provided, it populates it with the given annotations.

documentation
create_annotation_set(annotation_task: str, name: str, classes: List[str], path_to_annotation_file: str = None)
  • Parameters

    • annotation_task – annotation task. See also: remo.task

    • name – annotation set name

    • classes – list of classes. Example: ['Cat', 'Dog']

    • path_to_annotation_file – path to .csv annotation file

  • Returns

    remo.AnnotationSet


default_annotation_set

Returns the default annotation set if one exists. Otherwise, sets the first annotation set as the default and returns it.

documentation
default_annotation_set()
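
The fallback rule can be illustrated with plain ids. This is only a sketch of the documented behaviour, not the library's implementation: the function name is hypothetical, and both the membership check and the result for a dataset with no annotation sets are assumptions.

```python
from typing import List, Optional


def resolve_default_set_id(set_ids: List[int], default_id: Optional[int] = None) -> Optional[int]:
    """Return the annotation set id that would act as the default."""
    if default_id is not None and default_id in set_ids:
        # A default already exists: keep it.
        return default_id
    if set_ids:
        # No default yet: the first annotation set becomes the default.
        return set_ids[0]
    # Assumption: with no annotation sets there is nothing to return.
    return None
```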

export_annotations

Export annotations for a given annotation set

documentation
export_annotations(annotation_set_id: int = None, annotation_format: str = 'json', export_coordinates: str = 'pixel', full_path: str = 'true')
  • Parameters

    • annotation_set_id – annotation set id. By default, the default annotation set is used.

    • annotation_format – one of ['json', 'coco', 'csv'], default='json'

    • export_coordinates – whether output values are expressed in pixels or percentages, one of ['pixel', 'percent'], default='pixel'

    • full_path – whether to use the full image path (e.g. local path), one of ['true', 'false'], default='true'

  • Returns

    annotation file content
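
Since full_path is a string rather than a boolean, it can help to validate the options before calling export_annotations. The sketch below is a hypothetical helper, not part of remo, that checks the documented values and builds the keyword arguments.

```python
from typing import Any, Dict, Optional

ANNOTATION_FORMATS = ('json', 'coco', 'csv')
COORDINATE_MODES = ('pixel', 'percent')


def export_kwargs(
    annotation_format: str = 'json',
    export_coordinates: str = 'pixel',
    full_path: bool = True,
    annotation_set_id: Optional[int] = None,
) -> Dict[str, Any]:
    """Validate export options and return keyword arguments for export_annotations()."""
    if annotation_format not in ANNOTATION_FORMATS:
        raise ValueError(f'annotation_format must be one of {ANNOTATION_FORMATS}')
    if export_coordinates not in COORDINATE_MODES:
        raise ValueError(f'export_coordinates must be one of {COORDINATE_MODES}')
    kwargs = {
        'annotation_format': annotation_format,
        'export_coordinates': export_coordinates,
        # full_path is documented as the string 'true' or 'false', not a bool.
        'full_path': 'true' if full_path else 'false',
    }
    if annotation_set_id is not None:
        kwargs['annotation_set_id'] = annotation_set_id
    return kwargs
```

With a dataset ds, the result could be passed as ds.export_annotations(**export_kwargs('csv', full_path=False)).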


export_annotations_to_file

Exports annotations in the given format and saves them to the output file

documentation
export_annotations_to_file(output_file: str, annotation_set_id: int = None, annotation_format: str = 'json', export_coordinates: str = 'pixel', full_path: str = 'true')
  • Parameters

    • output_file – output file to save

    • annotation_set_id – annotation set id

    • annotation_format – one of ['json', 'coco', 'csv'], default='json'

    • full_path – whether to use the full image path (e.g. local path), one of ['true', 'false'], default='true'

    • export_coordinates – whether output values are expressed in pixels or percentages, one of ['pixel', 'percent'], default='pixel'
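
As an illustration, the sketch below writes one file per supported format into a folder. The helper and the file names are hypothetical; the only assumption about the output is that 'coco' exports are JSON and therefore get a .json extension.

```python
import os
from typing import Any, Dict, Optional

# Hypothetical file names; 'coco' output is JSON, hence the .json extension.
FILE_NAMES = {
    'json': 'annotations.json',
    'coco': 'annotations_coco.json',
    'csv': 'annotations.csv',
}


def export_all_formats(
    dataset: Any, output_dir: str, annotation_set_id: Optional[int] = None
) -> Dict[str, str]:
    """Export one file per supported format and return the paths written."""
    os.makedirs(output_dir, exist_ok=True)
    written = {}
    for annotation_format, file_name in FILE_NAMES.items():
        output_file = os.path.join(output_dir, file_name)
        dataset.export_annotations_to_file(
            output_file,
            annotation_set_id=annotation_set_id,
            annotation_format=annotation_format,
        )
        written[annotation_format] = output_file
    return written
```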


fetch

Updates dataset information from server

documentation
fetch()

get_annotation

Retrieves annotation for a given image

documentation
get_annotation(annotation_set_id: int, image_id: int)
  • Parameters

    • annotation_set_id – annotation set id

    • image_id – image id

  • Returns

    remo.Annotation
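
A small sketch of combining images() with get_annotation() to collect the annotations of the first few images. The function name is hypothetical, and it assumes remo.Image exposes its id as an `id` attribute, which this page does not state.

```python
from typing import Any, Dict


def annotations_for_images(dataset: Any, annotation_set_id: int, limit: int = 10) -> Dict[int, Any]:
    """Fetch the annotation of the first `limit` images, keyed by image id."""
    result = {}
    for image in dataset.images(limit=limit):
        # Assumption: remo.Image exposes its id as an `id` attribute.
        result[image.id] = dataset.get_annotation(annotation_set_id, image.id)
    return result
```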


get_annotation_set

Retrieves annotation set with given id. If no annotation set id is passed, it returns the default annotation set.

documentation
get_annotation_set(annotation_set_id: int = None)
  • Parameters

    annotation_set_id – annotation set id

  • Returns

    remo.AnnotationSet


get_annotation_statistics

Prints annotation statistics of all the available annotation sets within the dataset

documentation
get_annotation_statistics(annotation_set_id: int = None)
  • Returns

    list of dictionaries with fields annotation set id, name, num of images, num of classes, num of objects, top3 classes, release and update dates


images

Lists images within the dataset

documentation
images(limit: int = None, offset: int = None)
  • Parameters

    • limit – the number of images to be listed

    • offset – specifies offset

  • Returns

    List[remo.Image]
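
For large datasets, limit and offset can be used to page through the images. The generator below is a minimal sketch with a hypothetical name; it assumes that offset is the number of images to skip and that an empty or short page marks the end of the dataset.

```python
from typing import Any, Iterator


def iter_images(dataset: Any, page_size: int = 100) -> Iterator[Any]:
    """Yield every image in the dataset, fetching page_size images per call."""
    if page_size <= 0:
        raise ValueError('page_size must be positive')
    offset = 0
    while True:
        page = dataset.images(limit=page_size, offset=offset)
        if not page:
            return  # nothing left to fetch
        yield from page
        if len(page) < page_size:
            return  # a short page means this was the last one
        offset += len(page)
```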


search

Given a list of classes and an annotation task, returns a list of all the images with matching annotations

documentation
search(classes=None, task: str = None)
  • Parameters

    • classes – string or list of strings - search for images which match all given classes

    • task – annotation task. See also: remo.task

  • Returns

    subset of the dataset


set_default_annotation_set

Sets the default annotation set for a dataset. Important: the default annotation set is not stored in Remo, so each time a script runs, the default reverts to the first annotation set that was created.

documentation
set_default_annotation_set(annotation_set_id: int)
  • Parameters

    annotation_set_id – annotation set id


view

Opens browser on dataset page

documentation
view()

view_annotate

Opens browser on the annotation tool for the given annotation set

documentation
view_annotate(annotation_set_id: int = None)
  • Parameters

    annotation_set_id – annotation set id. If not specified, the default one is used.


view_annotation_stats

Opens browser on annotation set insights page

documentation
view_annotation_stats(annotation_set_id: int = None)
  • Parameters

    annotation_set_id – annotation set id. If not specified, the default one is used.


view_image

Opens browser on image view page for the given image

documentation
view_image(image_id: int)
  • Parameters

    image_id – image id