Importing Cloud (Remote) Assets

Overview of importing remote assets, both public and private, into Ango Hub

Administrators and project managers can import files (i.e. assets) to Ango Hub without them ever leaving their current storage location.

You may import publicly available assets this way (e.g. open links to files such as https://i.imgur.com/CzXTtJV.jpg) as well as assets located in private buckets in AWS S3, Google Cloud Platform, or Microsoft Azure.

The main difference between importing data from cloud storage services and using drag-and-drop is the file location. When importing assets with drag-and-drop, the assets are uploaded to iMerit's private AWS S3 storage in Germany. When importing assets from your own cloud storage, however, your assets are left in your own storage and are never copied anywhere else.

Steps for Importing Cloud (Remote) Assets

If you are importing public assets, you may start at Step 1: Preparing the JSON.

If you are importing private assets (i.e. assets located in private storage buckets in AWS S3, GCP, or Azure), please start at Step 0: Set Up the Storage Integration.

Step 0: Set Up the Storage Integration (private files only)

This step is only necessary if you are importing files located in private cloud storage.

Please create a Storage Integration between Ango Hub and your bucket. Instructions on how to do so can be found here:

After creating the storage integration and saving it in Ango Hub as per the linked instructions, and before moving on to the next steps, please ensure that the bucket you are connecting has a CORS policy which allows Ango Hub to display the files. More information on how to set up CORS for your bucket can be found here: Set Up CORS.

Step 1: Preparing the JSON

Before importing the assets to Ango Hub, you will need to prepare a JSON file containing each asset’s external ID as well as the asset's full absolute path.

Ensure your URLs are percent-encoded using UTF-8.

For example, if your filenames contain spaces, ensure the spaces are encoded as %20 and not as pluses (+).
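The encoding rule above can be sketched in Python with the standard library. This is only an illustration and not part of Ango Hub: the helper name encode_asset_url is hypothetical, and the sketch assumes the input URL has not been percent-encoded yet (an already-encoded URL would be encoded a second time).

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_asset_url(url: str) -> str:
    """Percent-encode the path of a raw (not yet encoded) URL.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    # quote() uses UTF-8 by default and writes a space as %20, never as "+".
    # "/" is kept as-is so the folder structure of the path is preserved.
    encoded_path = quote(parts.path, safe="/")
    return urlunsplit(
        (parts.scheme, parts.netloc, encoded_path, parts.query, parts.fragment)
    )
```

For instance, encode_asset_url("https://url-to-asset.com/my folder/cute cat.jpg") returns "https://url-to-asset.com/my%20folder/cute%20cat.jpg".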

For private files in AWS S3 buckets, ensure the region information is present in the URLs you provide in the JSON.

The URL must be in the format https://<bucket-name>.s3.<region>.amazonaws.com/<path-to-file>

If you do not provide region information, Hub will assume the region is us-east-1.
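As a rough sketch of the URL format above, the following Python builds a region-qualified S3 URL and checks whether an existing URL carries a region. The function names s3_asset_url and s3_region_of, as well as the bucket name in the usage note, are hypothetical and chosen for illustration only; the regular expression is an assumption about what a region-qualified URL looks like, based on the format stated above.

```python
import re
from typing import Optional
from urllib.parse import quote

# Matches https://<bucket-name>.s3.<region>.amazonaws.com/<path-to-file>
_S3_URL = re.compile(r"^https://([^/]+?)\.s3\.([a-z0-9-]+)\.amazonaws\.com/")


def s3_asset_url(bucket: str, region: str, key: str) -> str:
    """Build a region-qualified URL for an object key (hypothetical helper)."""
    # The key is percent-encoded with UTF-8; "/" is kept as the path separator.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


def s3_region_of(url: str) -> Optional[str]:
    """Return the region embedded in the URL, or None if it has none."""
    match = _S3_URL.match(url)
    return match.group(2) if match else None
```

A URL for which s3_region_of returns None, such as "https://my-bucket.s3.amazonaws.com/a.jpg", is one where Hub would fall back to us-east-1 as described above.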

This is what the minimal working JSON looks like:

[
  {
   "data":"https://url-to-asset.com/CzXTtJV.jpg",
   "externalId":"cute-cat.jpg"
  },
  {
   "data":"https://url-to-asset.com/OB0y6MR.jpg",
   "externalId":"cute-dog.jpg"
  }
]
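If you have many assets, you may prefer to generate this file with a script. The sketch below is one possible approach in Python, not an official tool: the function names are hypothetical, and using the file name from the URL as the externalId is an assumption made for the example (any external ID of your choosing works the same way).

```python
import json
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


def build_import_entries(urls):
    """Turn a list of asset URLs into minimal import entries."""
    entries = []
    for url in urls:
        # Assumption for this example: the decoded file name is the external ID.
        filename = PurePosixPath(unquote(urlsplit(url).path)).name
        entries.append({"data": url, "externalId": filename})
    return entries


def write_import_json(urls, out_path):
    """Write the entries to out_path as a JSON array."""
    with open(out_path, "w", encoding="utf-8") as handle:
        json.dump(build_import_entries(urls), handle, indent=2)
```

Calling write_import_json(["https://url-to-asset.com/CzXTtJV.jpg"], "import.json") produces a file with one entry whose data is the URL and whose externalId is "CzXTtJV.jpg".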

Considerations when importing video files

When you import video files in one of the supported formats, Ango Hub performs a check to extract metadata such as width, height, frame rate, and total frame count. For large video files, this check can take a few seconds per file, so importing a very large number of large video files can take a significant amount of time.

If you already have each video's metadata at hand, however, you may provide it in the import JSON so that Ango Hub skips the metadata check and uses the values you provided.

The JSON with the optional metadata is as follows:

[
  {
   "data":"https://url-to-asset.com/CzXTtJV.mp4",
   "externalId":"cute-cat.mp4",
   "metadata": {
     "frameRate": 24,   // Can be an integer or a floating point value
     "frameCount": 150, // Must be an integer
     "width": 450,      // Can be an integer or a floating point value
     "height": 450      // Can be an integer or a floating point value
   }
  },
  ...
]

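Videos provided with all of the above metadata will not be processed for metadata and will therefore be available earlier for annotation. Note that the // comments in the example above are explanatory only: standard JSON does not allow comments, so it is safest to leave them out of the file you upload.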
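When the metadata comes from your own tooling, it can help to check the value types before writing the JSON. The following Python sketch mirrors the type notes in the example above (frameCount must be an integer, while the other fields may be integers or floating point values). The function name video_entry is hypothetical, and the checks are an illustration rather than Ango Hub's own validation.

```python
from numbers import Real


def video_entry(url, external_id, frame_rate, frame_count, width, height):
    """Build one import entry with video metadata, checking value types."""
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(frame_count, bool) or not isinstance(frame_count, int):
        raise TypeError("frameCount must be an integer")
    for name, value in (("frameRate", frame_rate), ("width", width), ("height", height)):
        if isinstance(value, bool) or not isinstance(value, Real):
            raise TypeError(f"{name} must be an integer or a floating point value")
    return {
        "data": url,
        "externalId": external_id,
        "metadata": {
            "frameRate": frame_rate,
            "frameCount": frame_count,
            "width": width,
            "height": height,
        },
    }
```

Passing a non-integer frame count, such as 150.0, raises a TypeError instead of silently producing an entry that does not match the format described above.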

Adding custom data to assets on import (contextData)

You may add custom key-value pairs when uploading each asset. This data, while not displayed in the UI in Ango Hub, will be stored and returned when obtaining the final export. We call this contextData.

Please refer to the docs page on contextData for more details about this feature.

Assigning individual assets to batches on import

Batches allow project managers to logically group assets. They work like tags: you may assign any asset to any number of batches.

If you wish to add all assets you are uploading with the JSON to the same batch, you may do so in Step 2 from the UI.

If, however, you'd like to specify batch(es) for individual assets as you import them, you may do so by adding a batches array property to the JSON like so:

[
  {
   "data":"https://url-to-asset.com/CzXTtJV.mp4",
   "externalId":"cute-cat.mp4",
   "batches": [
      "batchId1"
    ]
  },
  {
   "data":"https://url-to-asset.com/2.mp4",
   "externalId":"asset-2.mp4",
   "batches": [
      "batchId1", "batchId2"
    ]
  }
]

In the case above, the first asset will be added to the batch called batchId1 and the second asset to two batches: batchId1 and batchId2.

See the docs page on batches for more information.
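If you generate the import JSON with a script, the batches property can be attached per asset in the same pass. The Python sketch below assumes you already have a list of import entries and a mapping from externalId to batch identifiers; the function name with_batches and the mapping are hypothetical and only illustrate the structure shown above.

```python
def with_batches(entries, batches_by_external_id):
    """Return copies of the entries with a "batches" array where one is mapped."""
    result = []
    for entry in entries:
        updated = dict(entry)  # copy so the input entries are left unchanged
        batch_ids = batches_by_external_id.get(entry["externalId"])
        if batch_ids:
            updated["batches"] = list(batch_ids)
        result.append(updated)
    return result
```

Entries whose externalId is missing from the mapping are returned without a batches property, so they can still be assigned to a common batch from the UI in Step 2.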

Importing pre-labels during asset import

You may include pre-labels as you prepare the JSON for importing assets.

See this docs page on how to accomplish this: Importing Annotations during Asset Import.

Importing attachments during asset import

You may add video, image, or text attachments to the assets you are uploading as you prepare the JSON. Attachments will be displayed next to the main asset.

See this docs page on how to accomplish this: Importing Attachments during Asset Import and read more about attachments here.

Importing multiple images as a single asset

You may have multiple images appear as a carousel to annotators. Images uploaded this way will form 'pages' of a larger asset.

This is useful if you wish for the same annotator to annotate a series of images in a row. It also brings additional benefits such as being able to quickly clone annotations from one image to another, and more.

Please see this docs page for instructions and more details about multi-page assets: Importing Multiple Images as One Multi-Page Asset

Importing multiple Markdown (HTML) files as a single asset

You may have multiple Markdown files appear as a bundle to annotators. Files uploaded this way will form 'pages' of a larger asset.

This is useful if you wish for the same annotator to annotate a series of files in a row.

Please see this docs page for instructions and more details about this feature: Importing Multiple Markdown Files as One Multi-Page Asset

Importing reference images which can be overlaid on top of the main image

You may add reference images that the annotator can overlay on top of the main image while annotating it.

Please see this docs page for instructions and more details about this feature: Importing Reference Images as Overlay

Step 2: Uploading the JSON to Ango Hub

From your project’s dashboard, enter the Assets tab and click on Add Data.

A dialog will pop up. Click on “Cloud Storage” at the top.

Drag the JSON file you would like to upload to the box in the center. Alternatively, click on the box to open your system’s file explorer and select it from there.

If you are uploading a JSON containing links to private assets, you must also pick the storage integration you created in Step 0 from the Storage Method dropdown on the left.

Click on the Upload button. Your assets will appear in the Assets tab.

It is also possible to upload assets through our API.
