Image Multimodal Search

This notebook shows how to carry out an image multimodal search with the LAVIS library.

The first cell is only run on Google Colab and installs the ammico package.

After that, we can import ammico and read in the files given a folder path.

[1]:
# if running on google colab
# flake8-noqa-cell
import os

if "google.colab" in str(get_ipython()):
    # update python version
    # install setuptools
    # %pip install setuptools==61 -qqq
    # install ammico
    %pip install git+https://github.com/ssciwr/ammico.git -qqq
    # mount google drive for data and API key
    from google.colab import drive

    drive.mount("/content/drive")
[2]:
import ammico.utils as mutils
import ammico.multimodal_search as ms
[3]:
images = mutils.find_files(
    path="data/",
    limit=10,
)
[4]:
images
[4]:
['data/106349S_por.png', 'data/102141_2_eng.png', 'data/102730_eng.png']
[5]:
mydict = mutils.initialize_dict(images)
[6]:
mydict
[6]:
{'106349S_por': {'filename': 'data/106349S_por.png'},
 '102141_2_eng': {'filename': 'data/102141_2_eng.png'},
 '102730_eng': {'filename': 'data/102730_eng.png'}}

Indexing and extracting features from images in selected folder

First you need to select a model. You can choose one of the following models:
- blip
- blip2
- albef
- clip_base
- clip_vitl14
- clip_vitl14_336

[7]:
model_type = "blip"
# model_type = "blip2"
# model_type = "albef"
# model_type = "clip_base"
# model_type = "clip_vitl14"
# model_type = "clip_vitl14_336"

To process the loaded images using the selected model, run the code below:

[8]:
my_obj = ms.MultimodalSearch(mydict)
[9]:
my_obj.subdict
[9]:
{'106349S_por': {'filename': 'data/106349S_por.png'},
 '102141_2_eng': {'filename': 'data/102141_2_eng.png'},
 '102730_eng': {'filename': 'data/102730_eng.png'}}
[10]:
(
    model,
    vis_processors,
    txt_processors,
    image_keys,
    image_names,
    features_image_stacked,
) = my_obj.parsing_images(
    model_type,
    path_to_save_tensors="data/",
    )
Downloading (…)solve/main/vocab.txt: 232kB [00:00, 12.6MB/s]
Downloading (…)okenizer_config.json: 100%|██████████| 28.0/28.0 [00:00<00:00, 7.80kB/s]
Downloading (…)lve/main/config.json: 100%|██████████| 570/570 [00:00<00:00, 144kB/s]
100%|██████████| 1.97G/1.97G [00:09<00:00, 212MB/s]
[11]:
features_image_stacked
[11]:
tensor([[ 1.1012e-01, -7.5168e-02,  5.1168e-02, -1.7778e-01, -1.6888e-01,
         -9.1135e-04,  1.3566e-02, -1.4314e-01,  6.6218e-02, -4.5880e-02,
          1.4472e-02,  4.8086e-02,  6.6029e-03,  4.4415e-02,  5.3860e-03,
          4.4499e-02, -3.1550e-02,  1.0571e-02, -5.8567e-02,  3.1155e-02,
          5.4091e-02, -1.0610e-01, -2.5944e-02, -7.5799e-03,  6.8304e-02,
         -6.6986e-02,  8.0149e-02, -1.2928e-02, -6.3677e-02, -5.2397e-02,
         -1.3488e-01, -8.1277e-02,  1.1877e-03, -5.3062e-02,  7.8236e-02,
          5.2934e-02,  3.3611e-03, -6.9611e-02, -3.2997e-02,  5.7090e-02,
         -8.5948e-02, -9.3056e-02,  5.7117e-02, -1.2415e-01, -5.9904e-02,
         -5.9758e-02, -1.3205e-01, -7.9004e-02, -2.5256e-02, -1.0186e-01,
          6.6683e-02,  3.1179e-02, -8.6700e-02, -2.4749e-02,  5.9429e-02,
          5.7969e-02,  4.3389e-02,  1.4305e-02,  4.1522e-02,  4.5499e-02,
          5.9855e-02,  5.0948e-02, -9.5958e-02,  5.9531e-04,  6.9768e-02,
          4.8947e-02,  5.3179e-02,  3.9013e-02,  2.1069e-02, -7.2380e-03,
          5.0662e-02,  1.7943e-02,  6.6895e-03,  1.5344e-02, -4.6793e-02,
         -1.7770e-02,  4.4943e-02, -6.1835e-03,  4.5840e-02,  7.2644e-02,
         -2.3762e-02, -4.3159e-02,  8.2409e-02,  1.5021e-02,  4.9489e-02,
          6.1911e-02,  8.3408e-03, -1.7154e-02, -3.0081e-02, -4.1671e-02,
         -3.1439e-02, -6.8992e-02,  1.2365e-02,  3.2730e-02, -1.3427e-02,
          4.4939e-02, -6.8477e-02,  7.6106e-02, -8.8382e-02, -4.0990e-02,
          1.9999e-02, -5.8816e-02,  1.7368e-02,  5.7964e-02,  1.5909e-01,
         -3.2197e-02,  5.1099e-02, -2.6202e-02,  1.6105e-03, -3.2063e-03,
          1.6105e-02,  5.3417e-02,  5.7979e-02, -2.6439e-02, -3.4372e-03,
         -1.7876e-02, -5.7793e-03, -1.1739e-02, -7.6935e-02,  7.4784e-02,
         -5.2521e-02,  1.2977e-02,  2.0117e-02, -9.6784e-02,  1.6069e-02,
          2.9233e-02,  4.5277e-02,  1.0033e-03,  6.7313e-03,  1.0139e-02,
         -8.3055e-02,  1.5429e-02,  8.7240e-02,  7.2129e-03,  4.7280e-02,
          7.6404e-02,  3.3901e-02, -4.7372e-02, -1.1716e-03, -3.7976e-02,
          2.1521e-02,  7.8778e-04, -2.3503e-02,  4.6782e-03, -5.0352e-03,
         -2.6523e-02,  5.5778e-02,  2.1003e-02,  2.4718e-02,  4.3634e-03,
          2.5560e-02, -7.9926e-02,  1.3990e-01, -9.4285e-03, -3.3241e-02,
         -3.3887e-03, -2.0582e-03,  5.1940e-02,  3.9994e-02, -8.9044e-03,
         -3.7222e-02,  5.0615e-02,  5.8512e-02, -1.1737e-01,  1.2046e-01,
         -9.1137e-02, -6.2802e-02, -1.8729e-02,  4.4493e-02,  4.8080e-02,
          6.7830e-02,  4.6744e-02,  6.2280e-02, -4.1152e-02, -1.7641e-02,
          2.6493e-02,  7.3633e-02,  1.9432e-02, -4.4554e-02, -1.5420e-02,
          6.6406e-02, -3.5878e-02,  4.6508e-02, -2.7463e-02, -1.1229e-01,
         -3.1511e-02, -2.2031e-02,  5.2967e-02,  8.9757e-03,  5.1510e-02,
         -2.7554e-02, -1.4970e-01, -1.6232e-02,  1.0702e-01,  3.6492e-02,
         -3.0948e-03, -3.6520e-02,  4.5058e-02,  5.2594e-02,  7.0829e-02,
          6.3175e-02, -5.2307e-02,  4.4408e-02, -2.3004e-02, -5.8003e-02,
          6.6977e-03, -3.7382e-02,  6.8639e-02, -4.4470e-02,  5.5630e-02,
          2.7618e-03,  1.3580e-01,  5.9816e-02,  3.3938e-02,  5.5501e-02,
          1.1331e-01,  5.3495e-02, -8.7937e-02, -1.5506e-02,  2.1425e-02,
         -2.4320e-02,  7.5178e-02, -2.0411e-02, -7.3793e-02, -8.3548e-02,
         -6.6768e-02, -7.9829e-02, -1.1950e-02, -1.1924e-01,  4.3731e-04,
         -9.8474e-02, -2.9941e-02,  2.1787e-02, -8.7719e-02, -4.3941e-03,
         -3.0280e-02,  2.0636e-02,  8.6543e-02, -4.2707e-02, -1.4635e-01,
         -2.1618e-02,  2.5519e-01,  6.3774e-02, -9.4064e-02, -5.5964e-02,
          8.5568e-02, -2.5098e-02, -1.8287e-03, -5.7677e-03, -4.8354e-02,
          1.1468e-01, -1.3158e-01,  1.5233e-02, -9.3690e-03, -1.5281e-01,
          2.1163e-02],
        [-3.1164e-02,  8.1989e-03,  4.1795e-02, -1.3446e-01, -1.2907e-01,
         -2.6754e-02, -2.6497e-02, -9.9805e-02,  6.9648e-02, -8.9566e-02,
         -6.6048e-02,  5.2083e-02,  1.1241e-03, -9.1322e-03,  9.2742e-03,
          9.7092e-02, -8.1129e-02,  7.2043e-02, -3.2057e-02,  9.2243e-02,
          4.2411e-02, -1.5072e-01, -5.1422e-02,  2.3339e-02,  7.7359e-02,
         -7.0478e-02,  8.9051e-03, -7.7342e-03, -1.1392e-02, -1.1554e-02,
         -9.6200e-02, -7.8713e-02, -4.3739e-02, -1.0089e-02,  5.2780e-02,
         -1.5134e-02, -8.3012e-02, -5.7259e-05, -5.0367e-02,  8.0765e-02,
         -3.9783e-02, -6.5552e-02,  5.4163e-02, -1.3714e-01,  1.9904e-02,
         -3.0555e-02, -1.2708e-01, -6.9254e-02, -1.0282e-01, -1.4026e-01,
          5.2235e-02,  1.2107e-01, -6.0284e-02, -1.5995e-02,  1.1738e-01,
         -2.6826e-02,  7.0068e-02, -6.1171e-02,  7.3894e-02, -1.1002e-02,
          1.2297e-02,  4.2592e-02, -3.2391e-02, -5.0095e-02,  1.6830e-03,
          8.7965e-03,  4.5043e-03,  8.0837e-02, -2.0403e-03, -3.2196e-02,
          2.5989e-02,  3.6446e-02,  3.4951e-02,  8.8740e-03, -7.0723e-02,
         -6.5689e-02, -5.2307e-03, -4.5166e-02, -1.3355e-03, -3.6439e-03,
          5.6545e-02, -2.6859e-02,  8.0501e-02,  4.1611e-02,  1.0065e-02,
         -1.7620e-02,  9.6369e-02,  1.9216e-02, -3.0261e-02,  4.2976e-02,
         -4.3991e-02,  2.1417e-04,  4.0874e-02,  5.5617e-02, -6.6902e-02,
          4.4322e-02, -7.8471e-02,  8.3969e-02, -1.1869e-01, -1.4385e-02,
          4.8134e-02, -1.3211e-01,  3.4555e-02,  4.0508e-02,  1.1793e-01,
          5.4095e-02,  6.8710e-02,  2.4825e-02, -7.3113e-02,  2.0910e-02,
          7.2264e-02, -1.9415e-02, -9.1016e-03,  5.3200e-03,  1.1442e-02,
         -3.9044e-02, -1.1724e-02,  6.3622e-03,  3.5475e-03, -2.3111e-03,
         -1.4227e-02,  7.5602e-02,  2.4797e-02, -6.4644e-02, -2.7618e-02,
          5.0614e-02,  6.6187e-04,  1.5339e-02,  1.0458e-01,  4.0525e-02,
         -1.2838e-01,  1.1920e-02,  1.7468e-02,  7.7678e-03, -2.4481e-02,
          1.6154e-01, -4.8973e-02, -7.4852e-02,  3.4362e-02, -2.1357e-02,
          2.6962e-02, -1.7112e-02,  6.1105e-03,  1.0571e-01,  6.9005e-05,
         -1.4003e-02,  7.1748e-02,  3.0738e-02,  1.7992e-02,  5.1729e-02,
          6.6664e-02, -6.7228e-02,  1.2088e-01, -5.2015e-02,  4.6751e-02,
          1.0251e-02, -3.9151e-02,  9.5366e-03,  1.2293e-02, -8.0279e-02,
         -1.0735e-01, -5.9143e-02,  5.0980e-03,  6.0568e-02,  5.3911e-02,
         -8.4256e-02,  2.4663e-02, -9.1507e-02, -2.6754e-02,  4.6592e-02,
          6.8756e-02,  1.1072e-02,  3.4203e-02, -8.8727e-03, -1.2427e-01,
          4.0695e-02, -6.4650e-02,  3.3053e-02, -1.5730e-02, -4.8715e-02,
          9.6889e-02,  3.7285e-02, -2.9729e-02, -1.6436e-02, -5.3257e-02,
         -3.4825e-02,  1.8575e-02,  4.2490e-02,  8.4915e-02,  6.9943e-02,
         -1.4868e-02, -8.4394e-02, -6.9336e-02, -1.7557e-02, -5.7259e-03,
          1.0456e-02, -1.3997e-02,  8.5069e-03,  3.6159e-02,  1.1070e-01,
          1.0540e-01, -1.0440e-03,  4.3275e-02,  2.9507e-02,  8.7414e-02,
         -4.4689e-03, -1.4300e-02,  1.1346e-01, -4.2881e-02,  9.7738e-02,
          1.9278e-02,  1.8649e-01,  9.2842e-02,  1.5103e-02,  6.8677e-02,
          6.4610e-02,  5.9507e-02, -5.2550e-02, -4.7533e-02,  4.8374e-02,
         -2.0671e-03,  1.8360e-02, -8.2667e-02, -1.1774e-01, -1.2933e-01,
         -2.9302e-02, -3.2352e-02, -2.5486e-02, -3.8304e-02, -4.6321e-02,
         -3.9816e-02,  6.5332e-02,  7.1257e-02, -9.2949e-02,  8.0231e-02,
          5.7684e-03, -3.4453e-02,  9.1015e-02, -1.0268e-01, -8.8126e-02,
          8.1906e-02,  1.2352e-01,  6.5130e-02, -8.0906e-02, -4.3627e-02,
          2.7933e-02,  3.3885e-02,  3.5205e-02,  2.2513e-03, -7.7013e-02,
          4.9526e-02, -1.0929e-01, -3.8326e-02, -2.5682e-03, -6.6640e-02,
          6.7460e-02],
        [ 3.8704e-02, -6.5872e-03,  6.4091e-02, -1.2137e-01, -8.6449e-02,
         -1.4590e-02, -7.4130e-03, -1.0394e-01,  2.8826e-02, -4.1845e-02,
          3.7168e-02,  1.1118e-01, -1.3505e-02,  6.0235e-03,  8.9227e-05,
          3.2005e-02, -1.3496e-01,  2.9082e-03, -3.8615e-02,  3.0804e-02,
          7.7103e-02, -1.4979e-01, -2.8390e-02, -1.6990e-02,  6.3808e-02,
         -1.3135e-01,  1.0513e-01,  1.4381e-02, -5.8167e-02, -1.9642e-03,
         -1.5241e-01, -4.5724e-02, -2.1128e-02, -1.0526e-01,  3.5515e-02,
          2.2537e-02, -4.3570e-02, -3.4177e-03, -7.5509e-03,  1.2463e-02,
          8.1264e-03, -1.1059e-01,  4.8502e-02, -8.1062e-02, -8.1252e-02,
         -2.7807e-02, -1.4570e-01, -8.6797e-02, -6.8093e-02, -9.7331e-02,
          1.1802e-01,  6.9852e-02, -7.5976e-02, -2.7399e-02,  1.1688e-01,
         -5.0666e-02,  5.2366e-02, -3.3355e-02,  3.2395e-02, -5.4470e-04,
          3.9203e-02,  2.2737e-02, -9.1094e-02,  2.4388e-02,  2.9964e-02,
         -3.6205e-02,  3.7497e-04,  6.1142e-02,  1.5916e-02, -6.7279e-02,
          3.6079e-03,  4.0223e-04, -2.4583e-02,  3.1065e-02, -4.2247e-02,
         -5.3042e-02,  2.7570e-02,  5.3230e-02,  1.4304e-02,  6.8737e-02,
          1.4924e-02, -4.0730e-03,  1.5391e-02,  1.3633e-02, -6.9916e-02,
          4.9008e-03, -2.1637e-02, -3.9184e-02, -3.0116e-02,  5.9534e-02,
         -4.2737e-02, -3.0022e-02,  8.3971e-03,  5.3988e-02, -1.3125e-01,
          1.0547e-02, -1.4760e-01,  6.2876e-02, -1.5544e-01,  5.5300e-02,
         -2.2773e-02, -2.4887e-02, -5.4123e-03,  7.0171e-02,  9.8391e-02,
         -3.3291e-02,  7.9825e-02, -7.9297e-02, -3.3894e-02,  4.6762e-02,
          3.0376e-02,  3.0549e-02,  4.6353e-02, -4.4347e-02,  1.0420e-01,
         -1.1635e-01,  3.8742e-02,  2.5165e-02, -6.6759e-02,  1.1625e-01,
          7.1682e-03,  5.8881e-02,  8.9747e-02, -6.0257e-02,  1.2392e-02,
          8.8685e-02,  7.9759e-02,  7.4902e-02,  3.3598e-02,  4.5776e-03,
         -1.3680e-01, -3.9998e-02,  2.0899e-02, -1.9931e-02,  1.1591e-02,
          6.0340e-02, -2.6890e-02, -5.2974e-02,  1.3127e-02,  6.9916e-02,
         -3.3791e-03, -4.8345e-02, -4.2392e-02,  5.4170e-02, -2.1096e-02,
         -2.6964e-02, -3.7906e-02,  4.5741e-02,  3.0892e-02,  8.3127e-02,
          8.3920e-03, -8.2779e-02,  1.7831e-01,  9.0946e-03,  3.0161e-02,
          6.9507e-02, -3.2588e-02,  1.7632e-02,  5.8314e-03, -2.9669e-02,
         -5.3066e-02, -4.6275e-02,  2.9985e-02, -2.7543e-02, -1.5128e-02,
         -7.7447e-02,  4.6040e-02, -2.1461e-02, -5.4988e-02,  1.0719e-03,
          5.9942e-02, -2.4077e-03,  9.9754e-02,  9.7726e-03, -7.7468e-02,
          1.1552e-02,  8.3116e-02,  5.7766e-02,  1.7199e-02,  1.4087e-02,
          1.3115e-01, -4.9539e-02,  2.2415e-02, -4.1460e-02, -7.4067e-02,
         -7.4913e-03, -9.0612e-02, -3.9120e-02,  1.6216e-02,  1.1568e-01,
         -1.9876e-02, -1.1729e-01, -4.7536e-04, -1.9798e-03,  1.2053e-02,
         -4.6838e-02,  4.5577e-02,  4.3319e-02, -1.1650e-02,  9.4887e-02,
          1.0965e-01,  1.1613e-04, -1.4307e-03,  2.2078e-02,  3.4985e-02,
          5.8727e-02, -5.4662e-02,  3.3419e-02, -3.0432e-02,  1.1281e-01,
          7.0955e-02,  1.2149e-01,  3.5411e-02,  2.6957e-02,  4.6307e-02,
          7.8418e-02,  4.0232e-02,  2.0678e-02, -8.7644e-02,  3.9371e-02,
          3.8554e-02,  8.6980e-02, -8.4941e-02, -5.5180e-02, -9.9368e-02,
          3.6541e-02,  1.7492e-02, -5.1596e-02, -1.1429e-01, -2.5058e-02,
         -9.1812e-02,  9.4610e-02,  3.1208e-02, -1.0094e-01,  1.8692e-02,
          1.2707e-02, -2.2691e-02,  1.1470e-01,  3.1514e-03,  1.7054e-02,
          7.1725e-02,  1.2500e-01,  6.4919e-02, -3.3964e-02,  2.3052e-03,
          4.3615e-02,  3.2325e-02, -4.9830e-02, -8.4111e-02, -6.9008e-02,
          1.9245e-02, -4.8980e-02, -7.1298e-02, -2.9685e-02, -6.5941e-02,
          4.7879e-02]])

The images are then processed and stored in a numerical representation, a tensor. These tensors do not change for the same image and the same model, so if you run this analysis once and save the tensors by providing a path with the keyword path_to_save_tensors, a file named <Number_of_images>_<model_name>_saved_features_image.pt will be placed there.

This will save you a lot of time if you want to analyse the same images with the same model but different queries. To run with the saved tensors, execute the code below, providing the path and name of the tensor file.

[12]:
# (
#     model,
#     vis_processors,
#     txt_processors,
#     image_keys,
#     image_names,
#     features_image_stacked,
# ) = my_obj.parsing_images(
#     model_type,
#     path_to_load_tensors="/content/drive/MyDrive/misinformation-data/5_clip_base_saved_features_image.pt",
# )

In this example, an image folder with 5 images was already processed with the clip_base model, so you only need to pass the path and name of the saved file, 5_clip_base_saved_features_image.pt, which contains the tensors of all images, as the keyword argument path_to_load_tensors.
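If you want to automate this choice, the following minimal sketch (not part of the original notebook) checks whether a saved tensor file already exists and passes either path_to_load_tensors or path_to_save_tensors accordingly. The file name is constructed from the naming convention described above and is an assumption; adjust it to match the file actually written on your system.

import os

tensor_dir = "data/"
# assumed name, derived from <Number_of_images>_<model_name>_saved_features_image.pt
tensor_file = os.path.join(tensor_dir, f"{len(mydict)}_{model_type}_saved_features_image.pt")

if os.path.exists(tensor_file):
    # reuse the previously computed image features
    (
        model,
        vis_processors,
        txt_processors,
        image_keys,
        image_names,
        features_image_stacked,
    ) = my_obj.parsing_images(model_type, path_to_load_tensors=tensor_file)
else:
    # compute the image features and save them for later runs
    (
        model,
        vis_processors,
        txt_processors,
        image_keys,
        image_names,
        features_image_stacked,
    ) = my_obj.parsing_images(model_type, path_to_save_tensors=tensor_dir)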

Formulate your search queries

Next, you need to formulate your search queries. You can search either by image or by text, and you can run a single query or several queries at once; the computational time should not differ much. The format of the queries is as follows:

[13]:
search_query3 = [
    {"text_input": "politician press conference"},
    {"text_input": "a world map"},
    {"text_input": "a dog"},
]
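The example above uses text queries only. To search by image instead, each query entry references an image file; note that the "image" key and the file used here are assumptions for illustration, not taken from the original notebook.

# hypothetical image-based query; the "image" key is an assumption
search_query_by_image = [
    {"image": "data/106349S_por.png"},
]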

You can filter your results in 3 different ways:

- filter_number_of_images limits the number of images found. That is, if filter_number_of_images = 10, only the 10 images that best match the query are shown; the ranks of the remaining images are set to None and their similarity values to 0.
- filter_val_limit sets a lower bound on the similarity value. That is, if filter_val_limit = 0.2, all images with a similarity value below 0.2 are discarded.
- filter_rel_error (a percentage) discards images whose similarity deviates too much from the best match in the current search, i.e. images for which 100 * abs(current_similarity_value - best_similarity_value_in_current_search) / best_similarity_value_in_current_search exceeds filter_rel_error. That is, if filter_rel_error = 30 and the top-1 image has a similarity value of 0.5, all images with a similarity value below 0.35 are discarded.
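For illustration, the other two filters can be passed to the same function; the threshold values below are arbitrary examples and not part of the original notebook.

# illustrative only: the threshold values are arbitrary examples
# similarity, sorted_lists = my_obj.multimodal_search(
#     model,
#     vis_processors,
#     txt_processors,
#     model_type,
#     image_keys,
#     features_image_stacked,
#     search_query3,
#     filter_val_limit=0.2,  # discard images with similarity below 0.2
#     filter_rel_error=30,  # discard images more than 30% below the best match
# )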

[14]:
similarity, sorted_lists = my_obj.multimodal_search(
    model,
    vis_processors,
    txt_processors,
    model_type,
    image_keys,
    features_image_stacked,
    search_query3,
    filter_number_of_images=20,
)
[15]:
similarity
[15]:
tensor([[0.1135, 0.1063, 0.0490],
        [0.1441, 0.1311, 0.1008],
        [0.1666, 0.0935, 0.1086]])
[16]:
sorted_lists
[16]:
[[2, 1, 0], [1, 0, 2], [2, 1, 0]]
[17]:
mydict
[17]:
{'106349S_por': {'filename': 'data/106349S_por.png',
  'rank politician press conference': 0,
  'politician press conference': 0.16655394434928894,
  'rank a world map': 2,
  'a world map': 0.09352913498878479,
  'rank a dog': 0,
  'a dog': 0.10862952470779419},
 '102141_2_eng': {'filename': 'data/102141_2_eng.png',
  'rank politician press conference': 2,
  'politician press conference': 0.1135006919503212,
  'rank a world map': 1,
  'a world map': 0.10633404552936554,
  'rank a dog': 2,
  'a dog': 0.04904000088572502},
 '102730_eng': {'filename': 'data/102730_eng.png',
  'rank politician press conference': 1,
  'politician press conference': 0.14405082166194916,
  'rank a world map': 0,
  'a world map': 0.13108758628368378,
  'rank a dog': 1,
  'a dog': 0.10083307325839996}}

After running the multimodal_search function, the results of each query are added to the source dictionary.

[18]:
mydict["106349S_por"]
[18]:
{'filename': 'data/106349S_por.png',
 'rank politician press conference': 0,
 'politician press conference': 0.16655394434928894,
 'rank a world map': 2,
 'a world map': 0.09352913498878479,
 'rank a dog': 0,
 'a dog': 0.10862952470779419}

The show_results function presents the search results in a convenient form.

[19]:
my_obj.show_results(
    search_query3[0],
)
'Your search query: politician press conference'
'--------------------------------------------------'
'Results:'
'Rank: 0 Val: 0.16655394434928894'
'106349S_por'
../_images/notebooks_Example_multimodal_29_5.png
'--------------------------------------------------'
'Rank: 1 Val: 0.14405082166194916'
'102730_eng'
../_images/notebooks_Example_multimodal_29_9.png
'--------------------------------------------------'
'Rank: 2 Val: 0.1135006919503212'
'102141_2_eng'
../_images/notebooks_Example_multimodal_29_13.png
'--------------------------------------------------'

Improve the search results

A slightly different approach can further improve the search results. It is quite resource-intensive, so it is applied only after the main algorithm has found the most relevant images, and it works only with text queries. You can choose one of 3 models: "blip_base", "blip_large", "blip2_coco". If you get an Out of Memory error, try reducing the batch_size value (minimum = 1), which is the number of images processed simultaneously. With the parameter need_grad_cam = True/False you can enable the calculation of a heat map for each processed image. The image_text_match_reordering function then calculates new similarity values and new ranks for each image, and the resulting values are added to the general dictionary.

[20]:
itm_model = "blip_base"
# itm_model = "blip_large"
# itm_model = "blip2_coco"
[21]:
itm_scores, image_gradcam_with_itm = my_obj.image_text_match_reordering(
    search_query3,
    itm_model,
    image_keys,
    sorted_lists,
    batch_size=1,
    need_grad_cam=True,
)
100%|██████████| 1.78G/1.78G [00:12<00:00, 151MB/s]

Then, using the same output function, you can add the itm=True argument to output the new image order. You can also pass the image_gradcam_with_itm argument to output the heat maps of the processed images.

[22]:
my_obj.show_results(
    search_query3[0], itm=True, image_gradcam_with_itm=image_gradcam_with_itm
)
'Your search query: politician press conference'
'--------------------------------------------------'
'Results:'
'Rank: 0 Val: 0.058296382427215576'
'106349S_por'
../_images/notebooks_Example_multimodal_34_5.png
'--------------------------------------------------'
'Rank: 1 Val: 0.0018555393908172846'
'102730_eng'
../_images/notebooks_Example_multimodal_34_9.png
'--------------------------------------------------'
'Rank: 2 Val: 0.0012622162466868758'
'102141_2_eng'
../_images/notebooks_Example_multimodal_34_13.png
'--------------------------------------------------'

Save search results to csv

Convert the dictionary of dictionaries into a dictionary of lists:

[23]:
outdict = mutils.append_data_to_dict(mydict)
df = mutils.dump_df(outdict)

Check the dataframe:

[24]:
df.head(10)
[24]:
|   | filename | rank politician press conference | politician press conference | rank a world map | a world map | rank a dog | a dog | itm politician press conference | itm_rank politician press conference | itm a world map | itm_rank a world map | itm a dog | itm_rank a dog |
| 0 | data/106349S_por.png | 0 | 0.166554 | 2 | 0.093529 | 0 | 0.108630 | 0.058296 | 0 | 0.000794 | 2 | 0.000091 | 2 |
| 1 | data/102141_2_eng.png | 2 | 0.113501 | 1 | 0.106334 | 2 | 0.049040 | 0.001262 | 2 | 0.085762 | 0 | 0.000175 | 1 |
| 2 | data/102730_eng.png | 1 | 0.144051 | 0 | 0.131088 | 1 | 0.100833 | 0.001856 | 1 | 0.004548 | 1 | 0.000812 | 0 |

Write the csv file:

[25]:
df.to_csv("data/data_out.csv")
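To inspect the saved results later, the csv file can be read back with pandas; this is a minimal sketch, assuming the path written above.

import pandas as pd

# read the saved search results back into a dataframe
df_loaded = pd.read_csv("data/data_out.csv", index_col=0)
df_loaded.head()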