Dataset

A Dataset combines images and annotations and provides quick ways to manage the data.

Use a Dataset to:

  • query images and annotations in remo

  • annotate

  • export annotations

  • feed data to a training model

  • upload model predictions

class remo.Dataset

Remo dataset

documentation
class remo.Dataset(id: int = None, name: str = None, quantity: int = 0, \*\*kwargs)
  • Parameters

    • id – dataset id

    • name – dataset name

    • quantity – number of images


add_annotations

Fast upload of annotations to the Dataset.

If annotation_set_id is not provided, annotations will be added to:

  • the only annotation set present, if the Dataset has exactly one Annotation Set and the tasks match

  • a new annotation set, if the Dataset doesn’t have any Annotation Sets or if create_new_annotation_set = True

Otherwise, annotations will be added to the Annotation Set specified by annotation_set_id.

Example::

import remo

# Create a dataset from a sample archive hosted online
urls = ['https://remo-scripts.s3-eu-west-1.amazonaws.com/open_images_sample_dataset.zip']
my_dataset = remo.create_dataset(name='D1', urls=urls)
image_name = '000a1249af2bc5f0.jpg'
annotations = []

# First bounding box on the image
annotation = remo.Annotation()
annotation.img_filename = image_name
annotation.classes = 'Human hand'
annotation.bbox = [227, 284, 678, 674]
annotations.append(annotation)

# Second bounding box on the same image
annotation = remo.Annotation()
annotation.img_filename = image_name
annotation.classes = 'Fashion accessory'
annotation.bbox = [496, 322, 544, 370]
annotations.append(annotation)

# Upload both annotations in one call
my_dataset.add_annotations(annotations)
documentation
add_annotations(annotations: List[remo.domain.annotation.Annotation], annotation_set_id: int = None, create_new_annotation_set: bool = False)
  • Parameters

    • annotations – list of Annotation objects

    • annotation_set_id ((optional)) – annotation set id

    • create_new_annotation_set ((optional)) – if True, a new annotation set will be created
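
The selection rules described for add_annotations can be sketched as a small standalone function. This is an illustration only, not Remo's implementation: the function name choose_target, the (id, task) tuples and the ValueError raised for cases the text does not cover are all assumptions.

```python
from typing import List, Optional, Tuple


def choose_target(
    existing_sets: List[Tuple[int, str]],  # hypothetical (annotation_set_id, task) pairs
    task: str,
    annotation_set_id: Optional[int] = None,
    create_new_annotation_set: bool = False,
) -> Tuple[str, Optional[int]]:
    """Return ("existing", id) or ("new", None) following the documented rules."""
    # An explicit annotation_set_id always wins.
    if annotation_set_id is not None:
        return ("existing", annotation_set_id)
    # A new set is created on request or when the dataset has none.
    if create_new_annotation_set or not existing_sets:
        return ("new", None)
    # Exactly one set with a matching task: reuse it.
    if len(existing_sets) == 1 and existing_sets[0][1] == task:
        return ("existing", existing_sets[0][0])
    # Assumption: the documentation does not say what happens here.
    raise ValueError("pass annotation_set_id or create_new_annotation_set=True")
```

For instance, choose_target([(7, "Object detection")], "Object detection") picks the existing set 7, while the same call on an empty list asks for a new set.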


add_data

Adds images and/or annotations to the dataset.

Use the parameters as follows:

  • Use local_files to link (rather than copy) images.

  • Use paths_to_upload if you want to copy image files or archive files.

  • Use urls to download images, annotations or archives from the web.

In terms of supported formats:

  • Adding images: support for jpg, jpeg, png, tif

  • Adding annotations: to add annotations, you need to specify the annotation task and make sure the specific file format is one of those supported. See documentation here: https://remo.ai/docs/annotation-formats/

  • Adding archive files: support for zip, tar, gzip
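
As a rough pre-check before calling add_data, paths can be sorted by the formats listed above. The sketch below is not part of Remo: the helper name classify is hypothetical, and treating the listed format names as file suffixes (plus accepting .gz for gzip) is an assumption.

```python
from pathlib import Path

# Suffixes derived from the supported formats listed above (assumed mapping).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif"}
ARCHIVE_SUFFIXES = {".zip", ".tar", ".gzip", ".gz"}


def classify(path: str) -> str:
    """Return "image", "archive" or "other" based on the file suffix."""
    suffix = Path(path).suffix.lower()
    if suffix in IMAGE_SUFFIXES:
        return "image"
    if suffix in ARCHIVE_SUFFIXES:
        return "archive"
    # Anything else may be an annotation file or an unsupported file.
    return "other"
```

Files classified as "other" would still need to be checked against the supported annotation formats for the chosen annotation task.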

Example::

urls = ['https://remo-scripts.s3-eu-west-1.amazonaws.com/open_images_sample_dataset.zip']
my_dataset = remo.create_dataset(name='D1', urls=urls)
# annotation_files is a list of paths to local annotation files, defined elsewhere
my_dataset.add_data(local_files=annotation_files, annotation_task='Object detection')
documentation
add_data(local_files: List[str] = None, paths_to_upload: List[str] = None, urls: List[str] = None, annotation_task: str = None, folder_id: int = None, annotation_set_id: int = None, class_encoding=None, wait_for_complete=True)
  • Parameters

    • dataset_id – id of the dataset to add data to

    • local_files – list of files or directories containing annotations and image files. Remo will create smaller copies of your images for quick previews, but it will point at the original files to show original-resolution images. Folders will be recursively scanned for image files.

    • paths_to_upload – list of files or directories containing images, annotations and archives. These files will be copied into the .remo folder. Folders will be recursively scanned for image files. Unpacked archives will be scanned for images, annotations and nested archives.

    • urls – list of URLs, each pointing to a downloadable image, annotation file or archive.

    • annotation_task – annotation tasks tell remo how to parse annotations. See also: remo.task.

    • folder_id – specifies target virtual folder in the remo dataset. If None, it adds to the root level.

    • annotation_set_id – specifies target annotation set in the dataset. If None, it adds to the default annotation set.

    • class_encoding – specifies how to convert labels in annotation files to readable labels. If None, Remo will try to interpret the encoding automatically; standard words are read as they are. See also: remo.class_encodings.

    • wait_for_complete – if True, the function waits for the data upload to complete

  • Returns

    Dictionary with results for linking files, upload files and upload urls:

    {
        'files_link_result': ...,
        'files_upload_result': ...,
        'urls_upload_result': ...
    }
    

annotation_sets

Lists the annotation sets within the dataset.

documentation
annotation_sets()
  • Returns

    List[remo.AnnotationSet]


annotations

Returns all annotations for a given annotation set. If no annotation set is specified, the default annotation set is used.

documentation
annotations(annotation_set_id: int = None)
  • Parameters

    annotation_set_id – annotation set id

  • Returns

    List[remo.Annotation]


classes

Lists all the classes within the dataset

documentation
classes(annotation_set_id: int = None)
  • Parameters

    annotation_set_id – annotation set id. If not specified the default annotation set is considered.

  • Returns

    List of classes


create_annotation_set

Creates a new annotation set within the dataset. If path_to_annotation_file is provided, the new annotation set is populated with the given annotations. The first annotation set created for a dataset is considered the default one.

documentation
create_annotation_set(annotation_task: str, name: str, classes: List[str] = [], path_to_annotation_file: str = None)
  • Parameters

    • annotation_task – annotation task. See also: remo.task

    • name – annotation set name

    • classes – list of classes to prepopulate the annotation set. Example: ['Cat', 'Dog']. Default is no classes

    • path_to_annotation_file – path to .csv annotation file

  • Returns

    remo.AnnotationSet


default_annotation_set

If the dataset has only one annotation set, it returns that annotation set. Otherwise, it raises an exception.

documentation
default_annotation_set()

delete

Deletes dataset

documentation
delete()

export_annotations

Export annotations for a given annotation set

documentation
export_annotations(annotation_set_id: int = None, annotation_format: str = 'json', export_coordinates: str = 'pixel', full_path: bool = True, export_tags: bool = True)
  • Parameters

    • annotation_set_id – annotation set id. If not specified, the default annotation set is used

    • annotation_format – can be one of ['json', 'coco', 'csv'], default='json'

    • export_coordinates – converts output values to percentage or pixels, can be one of ['pixel', 'percent'], default='pixel'

    • full_path – uses full image path (e.g. local path), it can be one of [True, False], default=True

    • export_tags – exports the tags to a CSV file, it can be one of [True, False], default=True

  • Returns

    annotation file content
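
To illustrate the difference between the two export_coordinates options, here is a minimal sketch of a pixel-to-percent conversion. It rests on assumptions the text does not confirm: that a bounding box is ordered [xmin, ymin, xmax, ymax], and that 'percent' means each coordinate expressed on a 0-100 scale relative to the image width or height. The helper name bbox_to_percent is hypothetical.

```python
from typing import List, Sequence


def bbox_to_percent(bbox: Sequence[float], width: int, height: int) -> List[float]:
    """Convert a pixel bbox to percentages of the image size (assumed convention)."""
    xmin, ymin, xmax, ymax = bbox  # assumed ordering
    return [
        100 * xmin / width,   # horizontal values are relative to image width
        100 * ymin / height,  # vertical values are relative to image height
        100 * xmax / width,
        100 * ymax / height,
    ]
```

Under these assumptions, a box of [250, 200, 750, 600] on a 1000 x 800 image becomes [25.0, 25.0, 75.0, 75.0].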


export_annotations_to_file

Exports annotations in the given format and saves them to the output file

documentation
export_annotations_to_file(output_file: str, annotation_set_id: int = None, annotation_format: str = 'json', export_coordinates: str = 'pixel', full_path: bool = True, export_tags: bool = True)
  • Parameters

    • output_file – output file to save

    • annotation_set_id – annotation set id

    • annotation_format – can be one of ['json', 'coco', 'csv'], default='json'

    • full_path – uses full image path (e.g. local path), it can be one of [True, False], default=True

    • export_coordinates – converts output values to percentage or pixels, can be one of ['pixel', 'percent'], default='pixel'

    • export_tags – exports the tags to a CSV file, it can be one of [True, False], default=True
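
A possible way to use export_annotations_to_file is to write one file per supported format. The sketch below relies only on the signature documented above; the helper name export_all_formats, the output file names and the choice of a .json extension for the 'coco' format are assumptions. The dataset argument is expected to be a remo.Dataset, or any object exposing the same method.

```python
from pathlib import Path
from typing import Dict


def export_all_formats(dataset, output_dir: str) -> Dict[str, str]:
    """Export the default annotation set once per format; return format -> path."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = {}
    # (annotation_format, file extension) pairs; extensions are an assumption.
    for fmt, ext in (("json", "json"), ("coco", "json"), ("csv", "csv")):
        target = out / f"annotations_{fmt}.{ext}"
        dataset.export_annotations_to_file(
            output_file=str(target),
            annotation_format=fmt,
            export_coordinates="pixel",
            full_path=True,
            export_tags=True,
        )
        written[fmt] = str(target)
    return written
```

Calling export_all_formats(my_dataset, 'exports') would then leave three files in the exports folder, one for each annotation format.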


fetch

Updates dataset information from server

documentation
fetch()

get_annotation_set

Retrieves annotation set with given id. If no annotation set id is passed, it returns the default annotation set.

documentation
get_annotation_set(annotation_set_id: int = None)
  • Parameters

    annotation_set_id – annotation set id

  • Returns

    remo.AnnotationSet


get_annotation_statistics

Prints annotation statistics of all the available annotation sets within the dataset

documentation
get_annotation_statistics(annotation_set_id: int = None)
  • Returns

    list of dictionaries with fields annotation set id, name, num of images, num of classes, num of objects, top3 classes, release and update dates


image

Returns the remo.Image with matching img_filename or img_id. Pass either img_filename or img_id.

documentation
image(img_filename=None, img_id=None)
  • Parameters

    • img_filename – filename of the Image to retrieve

    • img_id – id of the Image to retrieve

  • Returns

    remo.Image


images

Lists images within the dataset

documentation
images(limit: int = None, offset: int = None)
  • Parameters

    • limit – the number of images to be listed

    • offset – specifies offset

  • Returns

    List[remo.Image]

Example::

my_dataset.images()
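
For large datasets, limit and offset can be combined to read images page by page. The generator below is a sketch under two assumptions not stated above: that offset is the number of images to skip, and that an empty or short page marks the end of the listing. The name iter_images is hypothetical; dataset is expected to be a remo.Dataset or any object with the same images method.

```python
from typing import Any, Iterator


def iter_images(dataset, page_size: int = 100) -> Iterator[Any]:
    """Yield images one by one, fetching them in pages of page_size."""
    offset = 0
    while True:
        page = dataset.images(limit=page_size, offset=offset)
        if not page:
            break  # nothing left to read
        yield from page
        if len(page) < page_size:
            break  # a short page is assumed to be the last one
        offset += len(page)
```

This keeps only one page in memory at a time, at the cost of one call per page.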

info

Prints basic info about the dataset:

  • Dataset name

  • Dataset ID

  • Number of images contained in the dataset

  • Number of annotation sets contained in the dataset

documentation
info()

list_image_annotations

Retrieves annotations for a given image

documentation
list_image_annotations(annotation_set_id: int, image_id: int)
  • Parameters

    • annotation_set_id – annotation set id

    • image_id – image id

  • Returns

    List[remo.Annotation]


search

Given a list of classes and an annotation task, returns all the images with matching annotations

documentation
search(classes=None, task: str = None)
  • Parameters

    • classes – string or list of strings - search for images which match all given classes

    • task – annotation task. See also: remo.task

  • Returns

    subset of the dataset
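
The "match all given classes" rule can be illustrated without a Remo server. In this sketch, the mapping from image filename to a set of class names and the function name match_all_classes are illustrative assumptions; it only mirrors the documented behaviour that classes may be a single string or a list of strings and that every given class must be present.

```python
from typing import Dict, Iterable, List, Set, Union


def match_all_classes(
    image_classes: Dict[str, Set[str]],
    classes: Union[str, Iterable[str]],
) -> List[str]:
    """Return the filenames whose annotations contain every requested class."""
    # Accept a single class name as well as a list of names.
    wanted = {classes} if isinstance(classes, str) else set(classes)
    return sorted(
        name for name, present in image_classes.items() if wanted <= present
    )
```

An image annotated only with 'Human hand' is returned for a search on 'Human hand', but not for a search on both 'Human hand' and 'Fashion accessory'.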


view

Opens browser on dataset page

documentation
view()

view_annotate

Opens browser on the annotation tool for the given annotation set

documentation
view_annotate(annotation_set_id: int = None)
  • Parameters

    annotation_set_id – annotation set id. If not specified, the default one is used.


view_annotation_stats

Opens browser on annotation set insights page

documentation
view_annotation_stats(annotation_set_id: int = None)
  • Parameters

    annotation_set_id – annotation set id. If not specified, the default one is used.


view_image

Opens browser on image view page for the given image

documentation
view_image(image_id: int)
  • Parameters

    image_id – image id