# AMMICO Demonstration Notebook
With ammico, you can analyze the text on images and the image content at the same time. This notebook demonstrates the capabilities of ammico.
You can run this notebook on Google Colab, locally, or on your own HPC resource. The analysis can be quite slow on the default Google Colab runtime. For production data processing, we recommend running the analysis locally on a machine with GPU support. You can also use the Colab GPU runtime or purchase additional runtime. However, Google Colab comes with pre-installed libraries that can lead to dependency conflicts. The Colab environment changes frequently, so this demonstration notebook is only guaranteed to run on the default runtime.

The first code cell only runs on Google Colab; on all other machines, you need to create a conda environment first and install ammico from the Python Package Index using  
```pip install ammico```  
Alternatively, you can install the development version from the GitHub repository  
```pip install git+https://github.com/ssciwr/AMMICO.git```

In [None]:
# if running on Google Colab
# PLEASE RUN THIS ONLY AS CPU RUNTIME
# for a GPU runtime, there are conflicts with pre-installed packages - 
# you first need to uninstall them (prepare a clean environment with no pre-installs) and then install ammico
# flake8-noqa-cell

if "google.colab" in str(get_ipython()):
    # update python version
    # install setuptools
    # %pip install setuptools==61 -qqq
    # uninstall some pre-installed packages due to incompatibility
    %pip uninstall --yes tensorflow-probability dopamine-rl lida pandas-gbq torchaudio torchdata torchtext orbax-checkpoint flex-y jax jaxlib -qqq
    # install ammico
    %pip install git+https://github.com/ssciwr/ammico.git -qqq
    # install older version of jax to support transformers use of diffusers
    # mount google drive for data and API key
    from google.colab import drive

    drive.mount("/content/drive")

## Use a test dataset

You can download this dataset for test purposes. Skip this step if you use your own data. If a dataset on Hugging Face is gated or private, Hugging Face will ask you for a login token. The default dataset in this notebook does not require one.

In [None]:
from datasets import load_dataset
from pathlib import Path

# If the dataset is gated/private, make sure you have run huggingface-cli login
dataset = load_dataset("iulusoy/test-images")

Next you need to provide a path for the saved images - a folder where the data is stored locally. This directory is automatically created if it does not exist.

In [None]:
data_path = "./data-test"
data_path = Path(data_path)
print(data_path)
data_path.mkdir(parents=True, exist_ok=True)
# now save the files from the Huggingface dataset as images into the data_path folder
for i, image in enumerate(dataset["train"]["image"]):
    filename = "img" + str(i) + ".png"
    image.save(data_path / filename)

## Import the ammico package

In [None]:
# NBVAL_IGNORE_OUTPUT
# ignore output of this cell for automated testing
import os
# jax also sometimes leads to problems on google colab
# if this is the case, try restarting the kernel and executing this 
# and the above two code cells again
import ammico
# for displaying a progress bar
from tqdm import tqdm

Sometimes you may need to restart the session after installing the correct versions of the packages, because `TensorFlow` and the `EmotionDetector` may otherwise fail with an error. You can check this by running the following code:
```
import tensorflow as tf
tf.ones([2, 2])
```
If this code generates an error, you need to restart the session: click `Runtime` -> `Restart session` and rerun the notebook. All required packages will already be installed, so the execution will be much faster.

## Image Multimodal Search

This module shows how to carry out an image multimodal search with the [LAVIS](https://github.com/salesforce/LAVIS) library. 

### Indexing and extracting features from images in selected folder

First you need to select a model. You can choose one of the following models: 
- [blip](https://github.com/salesforce/BLIP)
- [blip2](https://huggingface.co/docs/transformers/main/model_doc/blip-2) 
- [albef](https://github.com/salesforce/ALBEF) 
- [clip_base](https://github.com/openai/CLIP/blob/main/model-card.md)
- [clip_vitl14](https://github.com/mlfoundations/open_clip) 
- [clip_vitl14_336](https://github.com/mlfoundations/open_clip)

In [None]:
model_type = "blip"
# model_type = "blip2"
# model_type = "albef"
# model_type = "clip_base"
# model_type = "clip_vitl14"
# model_type = "clip_vitl14_336"

To process the loaded images using the selected model, use the code below and substitute the path to your images:

In [None]:
image_dict = ammico.find_files(
    path=data_path,
)

In [None]:
image_dict

In [None]:
my_obj = ammico.MultimodalSearch(image_dict)

In [None]:
(
    model,
    vis_processors,
    txt_processors,
    image_keys,
    image_names,
    features_image_stacked,
) = my_obj.parsing_images(
    model_type,
    path_to_save_tensors=data_path,
)

The images are then processed and stored in a numerical representation, a tensor. These tensors do not change for the same image and the same model. If you run this analysis once and pass a path with the keyword `path_to_save_tensors`, a file named `<Number_of_images>_<model_name>_saved_features_image.pt` is placed there.

This can save you time if you want to analyse the same images with the same model but different questions. To run using the saved tensors, execute the code below, giving the path and name of the tensor file. Any subsequent query of the model will then run in a fraction of the time of the initial run.

In [None]:
# uncomment the code below if you want to load the tensors from the drive
# and just want to ask different questions for the same set of images
# (
#     model,
#     vis_processors,
#     txt_processors,
#     image_keys,
#     image_names,
#     features_image_stacked,
# ) = my_obj.parsing_images(
#     model_type,
#     path_to_load_tensors="/content/drive/MyDrive/misinformation-data/5_clip_base_saved_features_image.pt",
# )

Here we have already processed our image folder with 5 images using the `clip_base` model. You therefore only need to pass the name of the saved file, `5_clip_base_saved_features_image.pt`, which contains the tensors of all images, as the keyword argument `path_to_load_tensors`.
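
As a minimal sketch, the file name for a given number of images and model can be built and checked before deciding whether to load it. The helper `saved_tensor_path` is hypothetical and not part of ammico; it only reproduces the naming pattern described above.

```python
from pathlib import Path


def saved_tensor_path(folder, n_images, model_type):
    """Build the tensor file name from the pattern described above (hypothetical helper)."""
    return Path(folder) / f"{n_images}_{model_type}_saved_features_image.pt"


candidate = saved_tensor_path("./data-test", 5, "clip_base")
print(candidate.name)
# decide whether the saved tensors can be reused
print("reuse saved tensors" if candidate.exists() else "compute features first")
```

If the file exists, its path can be passed as `path_to_load_tensors`; otherwise the features have to be computed first.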

### Formulate your search queries

Next, you need to formulate search queries. You can search either by image or by text, and you can submit a single query or several queries at once; the computation time should not differ much. The format of the queries is as follows:

In [None]:
image_example_query = data_path / "img0.png"  

search_query = [
    # an image query: `image_example_query` is the path to the query image, e.g. "data/test-crop-image.png"
    {"image": str(image_example_query)},
]
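
The cell above only shows an image query. A text query could be mixed into the same list, as sketched below. The key `text_input` does not appear in this notebook; treat it as an assumption and check it against the AMMICO documentation before use. The query string is purely illustrative.

```python
from pathlib import Path

data_path = Path("./data-test")

search_query = [
    # text query; the key name "text_input" is an assumption, see the note above
    {"text_input": "politician press conference"},
    # image query, in the same format as in the cell above
    {"image": str(data_path / "img0.png")},
]

# each query is a dictionary with exactly one entry: the query type and its value
for query in search_query:
    ((kind, value),) = query.items()
    print(kind, "->", value)
```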

You can filter your results in 3 different ways:
- `filter_number_of_images` limits the number of images found. For example, with `filter_number_of_images = 10`, only the 10 images that best match the query are shown. The ranks of all other images are set to `None` and their similarity values to `0`.
- `filter_val_limit` sets a minimum similarity value. For example, with `filter_val_limit = 0.2`, all images with a similarity below 0.2 are discarded.
- `filter_rel_error` (in percent) keeps only images whose similarity is close to the best match, that is, images for which `100 * abs(current_similarity_value - best_similarity_value_in_current_search) / best_similarity_value_in_current_search < filter_rel_error`. For example, with `filter_rel_error = 30` and a top-ranked image with a similarity of 0.5, all images with a similarity below 0.35 are discarded.
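
The three filters can be illustrated with a small, self-contained sketch on a plain list of similarity values. The function `apply_filters` and its argument names are hypothetical and not part of ammico. The description above says that filtered images get rank `None` and similarity `0`, whereas this sketch simply drops them.

```python
def apply_filters(similarities, number_of_images=None, val_limit=None, rel_error=None):
    """Return {index: similarity} for the images that pass all active filters."""
    best = max(similarities)
    # rank image indices from most to least similar
    ranked = sorted(
        range(len(similarities)), key=lambda i: similarities[i], reverse=True
    )
    # filter 1: keep only the top-N images
    if number_of_images is not None:
        ranked = ranked[:number_of_images]
    kept = {}
    for i in ranked:
        value = similarities[i]
        # filter 2: discard images below the absolute similarity limit
        if val_limit is not None and value < val_limit:
            continue
        # filter 3: discard images too far (in percent) from the best match
        if rel_error is not None and 100 * abs(value - best) / best >= rel_error:
            continue
        kept[i] = value
    return kept


scores = [0.50, 0.40, 0.34, 0.10]  # made-up similarity values
print(apply_filters(scores, number_of_images=1))
print(apply_filters(scores, val_limit=0.2))
print(apply_filters(scores, rel_error=30))
```

With these made-up scores, the relative-error filter keeps the images with similarity 0.50 and 0.40 and drops 0.34, which lies below the 0.35 threshold from the example above.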

In [None]:
similarity, sorted_lists = my_obj.multimodal_search(
    model,
    vis_processors,
    txt_processors,
    model_type,
    image_keys,
    features_image_stacked,
    search_query,
    filter_number_of_images=20,
)

In [None]:
similarity

In [None]:
sorted_lists 

After running the `multimodal_search` function, the results of each query are added to the source dictionary.

In [None]:
image_dict

The `show_results` method presents the search results conveniently.

In [None]:
my_obj.show_results(
    search_query[0], # you can change the index to see the results for other queries
)

## Formulate your search queries: Search for the best match using multiple reference images, for example, of a person

In [None]:
# Here goes the code that reads in multiple images as reference
# then you will loop over these multiple images and find the best matches
# in the end, the best matches will be averaged over for each picture and a list of averaged best matches will be provided

In [None]:
image_example_query = data_path / "img0.png"  # creating the path to the image for the image query example
image_example_query2 = data_path / "img1.png"

search_query = [
    # image queries: each entry holds the path to one reference image
    {"image": str(image_example_query)},
    {"image": str(image_example_query2)},
]

In [None]:
similarity, sorted_lists = my_obj.multimodal_search(
    model,
    vis_processors,
    txt_processors,
    model_type,
    image_keys,
    features_image_stacked,
    search_query,
    filter_number_of_images=20,
)

In [None]:
similarity # now a 2D tensor

In [None]:
# average similarities
print(similarity.mean(dim=1))

In [None]:
# add the similarity average to the image_dict
for key in image_dict.keys():
    # find the similarities for each image in the search query
    similarities = [image_dict[key][query["image"]] for query in search_query]
    image_dict[key]["similarity_average"] = sum(similarities)/len(similarities)
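
Once every image has a `similarity_average`, the images can be ranked by it. The sketch below uses a small made-up dictionary in place of `image_dict`. The keys, paths, and values are purely illustrative, and the nested layout is assumed from the cell above.

```python
# made-up stand-in for image_dict: per image, one similarity value per query path
toy_dict = {
    "img0": {"query_a.png": 0.9, "query_b.png": 0.3},
    "img1": {"query_a.png": 0.5, "query_b.png": 0.8},
    "img2": {"query_a.png": 0.2, "query_b.png": 0.1},
}
query_paths = ["query_a.png", "query_b.png"]

# average the per-query similarities for each image
for entry in toy_dict.values():
    values = [entry[path] for path in query_paths]
    entry["similarity_average"] = sum(values) / len(values)

# rank the image keys from best to worst average match
ranking = sorted(
    toy_dict, key=lambda key: toy_dict[key]["similarity_average"], reverse=True
)
print(ranking)
```

Here `img1` ranks first because it matches both reference images reasonably well, even though `img0` is the better match for the first query alone.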

In [None]:
# Not yet compatible with show_results due to dictionary design

In [None]:
# convert to dataframe
df = ammico.get_dataframe(image_dict)
df.head(10)

In [None]:
# save to csv
df.to_csv(data_path / "data_out.csv")