Describe your problem:
1. Missing required Python packages when working in a fresh (venv) environment
- You need to have matplotlib and torch installed for plotting image sprites
- You need to have pydantic, uvicorn and fastapi installed for the embedding viewport
- When you run the example as a script without all of these packages installed, you have to run all of the code again (which is frustrating, especially for beginners)
Suggestion:
- Just mention that these packages need to be installed or provide the pip command for installing them
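
As an illustration of what the example could run up front, here is a minimal sketch of a dependency check. The package list is taken from the points above; the helper name `missing_packages` is hypothetical, and the sketch assumes that each import name is also the pip package name.

```python
from importlib.util import find_spec

# Packages named in this report; adjust the list if the example changes.
REQUIRED = ["matplotlib", "torch", "pydantic", "uvicorn", "fastapi"]


def missing_packages(names):
    """Return the names that cannot be imported in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    # Assumption: the pip package name equals the import name.
    print("Missing packages, install them with: pip install " + " ".join(missing))
else:
    print("All required packages are installed.")
```

Running a check like this first would surface every missing package at once, instead of one failure per rerun.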
2. RAM issues when working with the whole Totally-Looks-Like dataset
- The dataset contains over 6000 images, so a lot of RAM is required to run the example
- I tried using 1000 images by adding `[:1000]` when loading `left_da` and `right_da` and ended up using around 8.5 GB of RAM
Suggestion:
- Reduce the size of the DocumentArrays in the example by using `[:1000]`. This also shows intuitively that you can slice DocumentArrays like Python lists, without having to mention it explicitly.
- Find out how much RAM is needed to process the whole dataset and add a note like: "If you have at least X GB of RAM and want to use the whole dataset instead of just the first 1000 images, remove `[:1000]` when loading the files into the DocumentArrays `left_da` and `right_da`."
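
The slicing idea can be sketched as follows. This is only an illustration: a plain Python list stands in for the DocumentArray so the snippet runs without docarray installed, the helper `first_n` and the file names are hypothetical, and 6000 is a stand-in for the real dataset size. In the example itself, the same `[:1000]` slice would be applied to `left_da` and `right_da` when they are loaded.

```python
def first_n(items, n=1000):
    """Return the first n elements of any sequence that supports slicing."""
    return items[:n]


# Stand-in data: a plain list plays the role of the DocumentArray here.
all_images = [f"image_{i}.jpg" for i in range(6000)]

subset = first_n(all_images)
print(len(all_images), len(subset))
```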
3. Very long, not very useful match printouts
- The two code snippets after the paragraph "Let's inspect what's inside left_da matches now:" print the matches of all images in `left_da`.
- In my opinion it's not very useful to print all of them. I think it's more useful to print the matches of one image in left_da instead.
- When running the example in Google Colab, the very long output causes issues with running the notebook. This issue may appear in other Jupyter notebook environments as well.
Suggestion:
- Change the printing code to just print the matches for one image of left_da
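
A rough sketch of what printing the matches of a single image could look like. It assumes only that each document has a `uri` and an iterable `matches` attribute whose items also have a `uri`. The `SimpleNamespace` objects and file names below are hypothetical stand-ins so the snippet runs without docarray, and the helper name `print_matches_of_one` is made up for illustration.

```python
from itertools import islice
from types import SimpleNamespace


def print_matches_of_one(docs, index=0, limit=5):
    """Print up to `limit` match URIs of one document and return them."""
    doc = docs[index]
    uris = [match.uri for match in islice(doc.matches, limit)]
    print(f"Matches for {doc.uri}:")
    for uri in uris:
        print(f"  {uri}")
    return uris


# Hypothetical stand-in for left_da after matching (not real dataset paths).
left_da = [
    SimpleNamespace(
        uri="left/a.jpg",
        matches=[SimpleNamespace(uri="right/x.jpg"), SimpleNamespace(uri="right/y.jpg")],
    )
]

print_matches_of_one(left_da)
```

Limiting the output to one image and a handful of matches keeps the printout short enough for notebook environments.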
Environment:
Ubuntu 22.04, PyCharm venv, Python 3.10, docarray 0.17.0