Readme Example 4: Suggestions #574

Description

@BenjaminEliaMueller

Describe your problem:

1. Missing required Python packages when working in a fresh environment / venv

  • You need to have matplotlib and torch installed for plotting image sprites
  • You need to have pydantic, uvicorn and fastapi installed for the embedding viewport
  • When running the example as a script without all of these packages installed, you have to rerun all of the code after installing them (which is frustrating, especially for beginners)

Suggestion:

  • Just mention that these packages need to be installed or provide the pip command for installing them
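For reference, the pip command for the packages listed above would be `pip install matplotlib torch pydantic uvicorn fastapi`. A small check at the top of the script could also catch a missing package before any expensive step runs. This is only a sketch: `REQUIRED` and `missing_packages` are names I made up for illustration, not part of docarray.

```python
import importlib.util

# Packages named in this issue; for these five the import name matches the pip name.
REQUIRED = ["matplotlib", "torch", "pydantic", "uvicorn", "fastapi"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    # In a real script you would stop here, before any expensive work starts.
    print("Missing packages. Install them with:")
    print("    pip install " + " ".join(missing))
else:
    print("All extra packages for the example are installed.")
```

`find_spec` only looks the package up and does not import it, so the check stays fast even for something as large as torch.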

2. RAM issues when working with the whole Totally-Looks-Like dataset

  • The dataset contains over 6000 images, so running the example requires a lot of RAM
  • I tried using 1000 images by adding [:1000] when loading left_da and right_da and ended up using around 8.5 GB of RAM

Suggestion:

  • Reduce the size of the DocumentArrays in the example by using [:1000]. This also shows intuitively that you can slice DocumentArrays like Python lists, without having to mention it explicitly.
  • Try out how much RAM you need to process the whole dataset and add a description like:

If you have at least X GB of RAM and want to try the whole dataset instead of just the first 1000 images, remove [:1000] when loading the files into the DocumentArrays left_da and right_da.
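To make the switch between the subset and the full dataset a one-line change, the limit could live in a single constant. A minimal sketch, assuming the loaded collection supports slice syntax the way a DocumentArray does; `MAX_IMAGES` and `take_first` are hypothetical names, and a plain list of made-up file names stands in for the loaded data so the snippet runs without the dataset.

```python
# Hypothetical knob: number of documents to keep, or None for the whole dataset.
MAX_IMAGES = 1000


def take_first(docs, limit=MAX_IMAGES):
    """Return the first `limit` items of a sliceable collection, or all of them if limit is None."""
    return docs if limit is None else docs[:limit]


# Stand-in for the loaded data: a plain list instead of a DocumentArray.
# 6001 is an arbitrary size that just matches "over 6000 images".
all_left = [f"left/{i:05d}.jpg" for i in range(6001)]

left_subset = take_first(all_left)
print(len(left_subset))  # 1000
```

In the README example, the same slice would be applied to left_da and right_da right after loading, and setting the limit to None would correspond to removing [:1000].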

3. Super long, not very useful match printouts

  • The two code snippets after the paragraph "Let's inspect what's inside left_da matches now:" print the matches of all images in left_da.
  • In my opinion, printing all of them is not very useful. Printing the matches of a single image in left_da would be more useful.
  • When running the example in Google Colab, the very long output causes problems with running the notebook. This may also happen in other Jupyter notebook environments.

Suggestion:

  • Change the printing code to print only the matches for one image in left_da
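Roughly what I have in mind for the printing code is below. It is a sketch under assumptions: I am assuming each document exposes `.uri` and `.matches`, and each match exposes `.scores['cosine'].value`, as in the README loop. `format_matches` and `fake_match` are hypothetical helpers, and the documents and scores are made-up stand-ins so the snippet runs without the dataset.

```python
from types import SimpleNamespace


def format_matches(doc, metric="cosine", limit=None):
    """Build one line per match of a single document: query URI, match URI, score."""
    matches = doc.matches if limit is None else doc.matches[:limit]
    return [f"{doc.uri} {m.uri} {m.scores[metric].value}" for m in matches]


def fake_match(uri, score):
    """Stand-in for a matched document; the values are invented for illustration."""
    return SimpleNamespace(uri=uri, scores={"cosine": SimpleNamespace(value=score)})


# Stand-in for left_da[0] after matching against right_da.
query = SimpleNamespace(
    uri="left/00001.jpg",
    matches=[fake_match("right/00042.jpg", 0.12), fake_match("right/00007.jpg", 0.19)],
)

# Print the matches of one image only, instead of looping over all of left_da.
for line in format_matches(query):
    print(line)
```

With the real data, the call would be something like `format_matches(left_da[0])`, which keeps the output to a handful of lines no matter how many images are loaded.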

Environment:

Ubuntu 22.04, PyCharm venv, Python 3.10, docarray 0.17.0
