This commit is contained in:
Petr Andriushchenko 2023-03-23 17:09:07 +01:00
parents 16fb5a02ee 0ca9366980
commit cc3d7f3aa8
No known key found for this signature
GPG key ID: 4C4A5DCF634115B6
28 changed files with 1458 additions and 158 deletions

9
.github/workflows/ci.yml vendored

@@ -32,14 +32,7 @@ jobs:
- name: Run pytest
run: |
cd misinformation
python -m pytest -vv test/test_cropposts.py --cov=. --cov-report=xml
python -m pytest -vv test/test_display.py --cov=. --cov-report=xml
python -m pytest -vv test/test_faces.py --cov=. --cov-report=xml
python -m pytest -vv test/test_multimodal_search.py --cov=. --cov-report=xml
python -m pytest -vv test/test_objects.py --cov=. --cov-report=xml
python -m pytest -vv test/test_summary.py --cov=. --cov-report=xml
python -m pytest -vv test/test_text.py -m "not gcv" --cov=. --cov-report=xml
python -m pytest -vv test/test_utils.py --cov=. --cov-report=xml
python -m pytest -m "not gcv" --cov=. --cov-report=xml
- name: Upload coverage
if: matrix.os == 'ubuntu-22.04' && matrix.python-version == '3.9'
uses: codecov/codecov-action@v3

7
.github/workflows/docs.yml vendored

@@ -21,6 +21,13 @@ jobs:
run: |
pip install -e .
python -m pip install -r requirements-dev.txt
- name: set google auth
uses: 'google-github-actions/auth@v0.4.0'
with:
credentials_json: '${{ secrets.GOOGLE_APPLICATION_CREDENTIALS }}'
- name: get pandoc
run: |
sudo apt-get install -y pandoc
- name: Build documentation
run: |
cd docs


@@ -1,4 +1,4 @@
# Misinformation campaign analysis
# AMMICO - AI Media and Misinformation Content Analysis Tool
![License: MIT](https://img.shields.io/github/license/ssciwr/misinformation)
![GitHub Workflow Status](https://img.shields.io/github/actions/workflow/status/ssciwr/misinformation/ci.yml?branch=main)
@@ -6,44 +6,42 @@
![Quality Gate Status](https://sonarcloud.io/api/project_badges/measure?project=ssciwr_misinformation&metric=alert_status)
![Language](https://img.shields.io/github/languages/top/ssciwr/misinformation)
Extract data from social media images and texts in disinformation campaigns.
This package extracts data from images such as social media images, and the accompanying text, i.e. the text that is included in the image. The analysis can extract a very large number of features, depending on the user input.
**_This project is currently under development!_**
Use the pre-processed social media posts (image files) and process to collect information:
1. Cropping images to remove comments from posts
Use pre-processed image files such as social media posts with comments and process them to collect information:
1. Text extraction from the images
1. Language recognition, translation into English, cleaning of the text/spell-check
1. Sentiment and subjectivity analysis
1. Performing person and face recognition in images, emotion recognition
1. Extraction of other non-human objects in the image
1. Language detection
1. Translation into English or other languages
1. Cleaning of the text, spell-check
1. Sentiment analysis
1. Subjectivity analysis
1. Named entity recognition
1. Topic analysis
1. Content extraction from the images
1. Textual summary of the image content ("image caption") that can be analyzed further using the above tools
1. Feature extraction from the images: User inputs query and images are matched to that query (both text and image query)
1. Question answering
1. Performing person and face recognition in images
1. Face mask detection
1. Age, gender and race detection
1. Emotion recognition
1. Object detection in images
1. Detection of position and number of objects in the image; currently person, bicycle, car, motorcycle, airplane, bus, train, truck, boat, traffic light, cell phone
1. Cropping images to remove comments from posts
This development will serve the fight to combat misinformation, by providing more comprehensive data about its content and techniques.
The ultimate goal of this project is to develop a computer-assisted toolset to investigate the content of disinformation campaigns worldwide.
# Installation
## Installation
The `misinformation` package can be installed using pip: Navigate into your package folder `misinformation/` and execute
The `AMMICO` package can be installed using pip: Navigate into your package folder `misinformation/` and execute
```
pip install .
```
This will install the package and its dependencies locally.
## Installation on Windows
Some modules use [LAVIS](https://github.com/salesforce/LAVIS) to analyse image content. To enable this functionality on Windows, you need to install some dependencies that are not available by default or from the command line:
1. Download [Visual C++](https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170) and install (see also [here](https://github.com/philferriere/cocoapi)).
1. Then install the COCO API from GitHub:
```
pip install "git+https://github.com/philferriere/cocoapi.git#egg=pycocotools&subdirectory=PythonAPI"
```
1. Now you can install the package by navigating to the misinformation directory and typing
```
pip install .
```
in the command prompt.
# Usage
## Usage
There are sample notebooks in the `misinformation/notebooks` folder for you to explore the package:
1. Text analysis: Use the notebook `get-text-from-image.ipynb` to extract any text from the images. The text is directly translated into English. If the text should be further analysed, set the keyword `analyse_text` to `True` as demonstrated in the notebook.\
@@ -56,8 +54,8 @@ Place the data files in your google drive to access the data.**
There are further notebooks that are currently of exploratory nature (`colors_expression.ipynb` to identify certain colors on the image).
# Features
## Text extraction
## Features
### Text extraction
The text is extracted from the images using [`google-cloud-vision`](https://cloud.google.com/vision). For this, you need an API key. Set up your Google account following the instructions on the Google Vision AI website.
You then need to export the location of the API key as an environment variable:
`export GOOGLE_APPLICATION_CREDENTIALS="location of your .json"`
@@ -67,8 +65,18 @@ The extracted text is then stored under the `text` key (column when exporting a
If you further want to analyse the text, you have to set the `analyse_text` keyword to `True`. In doing so, the text is then processed using [spacy](https://spacy.io/) (tokenized, part-of-speech, lemma, ...). The English text is cleaned from numbers and unrecognized words (`text_clean`), spelling of the English text is corrected (`text_english_correct`), and further sentiment and subjectivity analysis are carried out (`polarity`, `subjectivity`). The latter two steps are carried out using [TextBlob](https://textblob.readthedocs.io/en/dev/index.html). For more information on the sentiment analysis using TextBlob see [here](https://towardsdatascience.com/my-absolute-go-to-for-sentiment-analysis-textblob-3ac3a11d524).
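A minimal sketch of this text pipeline, following the example notebooks in this repository (the data path and key location are placeholders):
```
import os
import misinformation
from misinformation import utils as mutils

# point google-cloud-vision to your API key
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "location of your .json"

# find images and set up the dictionary that collects all analysis results
images = mutils.find_files(path="data/", limit=10)
mydict = mutils.initialize_dict(images)

# extract the text and run the spacy/TextBlob analysis
for key in mydict:
    mydict[key] = misinformation.text.TextDetector(
        mydict[key], analyse_text=True
    ).analyse_image()
```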
## Emotion recognition
### Content extraction
## Object detection
The image content ("caption") is extracted using the [LAVIS](https://github.com/salesforce/LAVIS) library. This library enables vision intelligence extraction using several state-of-the-art models, depending on the task. Further, it allows feature extraction from the images, where users can input textual and image queries, and the images in the database are matched to that query (multimodal search). Another option is question answering, where the user inputs a text question and the library finds the images that match the query.
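A short sketch of caption generation and question answering, based on the example notebooks in this commit (the data path and question are placeholders):
```
import misinformation.summary as sm
from misinformation import utils as mutils

images = mutils.find_files(path="data/", limit=10)
mydict = mutils.initialize_dict(images)

for key in mydict:
    # one deterministic caption plus three non-deterministic ones per image
    mydict[key] = sm.SummaryDetector(mydict[key]).analyse_image()
    # free-form visual question answering
    mydict[key] = sm.SummaryDetector(mydict[key]).analyse_questions(
        ["How many persons are in the picture?"]
    )
```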
## Cropping of posts
### Emotion recognition
Emotion recognition is carried out using the [deepface](https://github.com/serengil/deepface) and [retinaface](https://github.com/serengil/retinaface) libraries. These libraries detect the presence of faces and predict age, gender, emotion and race based on several state-of-the-art models. They also detect whether a person is wearing a face mask; if so, no further detection is carried out, as the mask prevents an accurate prediction.
### Object detection
Object detection is carried out using [cvlib](https://github.com/arunponnusamy/cvlib) and the [YOLOv4](https://github.com/AlexeyAB/darknet) model. This library detects faces, people, and several inanimate objects; we currently have restricted the output to person, bicycle, car, motorcycle, airplane, bus, train, truck, boat, traffic light, cell phone.
### Cropping of posts
Social media posts can automatically be cropped to remove further comments on the page and restrict the textual content to the first comment only.
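The face/emotion and object detectors follow the same pattern; a combined sketch mirroring the example notebooks, including export of all results to a csv file:
```
import misinformation
from misinformation import utils as mutils
import misinformation.objects as ob

images = mutils.find_files(path="data/", limit=10)
mydict = mutils.initialize_dict(images)

for key in mydict:
    # face detection plus age, gender, emotion and mask detection
    mydict[key] = misinformation.faces.EmotionDetector(mydict[key]).analyse_image()
    # object detection with cvlib/YOLOv4
    mydict[key] = ob.ObjectDetector(mydict[key]).analyse_image()

# flatten the nested dictionary and write one csv row per image
outdict = mutils.append_data_to_dict(mydict)
df = mutils.dump_df(outdict)
df.to_csv("data_out.csv")
```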


@@ -12,7 +12,7 @@ sys.path.insert(0, os.path.abspath("../../misinformation/"))
# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
project = "misinformation"
project = "AMMICO"
copyright = "2022, Scientific Software Center, Heidelberg University"
author = "Scientific Software Center, Heidelberg University"
release = "0.0.1"
@@ -20,7 +20,8 @@ release = "0.0.1"
# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
extensions = ["sphinx.ext.autodoc", "sphinx.ext.napoleon", "myst_parser"]
extensions = ["sphinx.ext.autodoc", "sphinx.ext.napoleon", "myst_parser", "nbsphinx"]
nbsphinx_allow_errors = True
templates_path = ["_templates"]
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]


@@ -3,14 +3,19 @@
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
Welcome to misinformation's documentation!
==========================================
Welcome to AMMICO's documentation!
==================================
.. toctree::
:maxdepth: 2
:caption: Contents:
readme_link
notebooks/Example faces
notebooks/Example text
notebooks/Example summary
notebooks/Example multimodal
notebooks/Example objects
modules
license_link


@@ -1,5 +1,5 @@
misinformation package modules
==============================
AMMICO package modules
======================
.. toctree::
:maxdepth: 4

220
docs/source/notebooks/Example faces.ipynb Normal file

@@ -0,0 +1,220 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "d2c4d40d-8aca-4024-8d19-a65c4efe825d",
"metadata": {},
"source": [
"# Facial Expression recognition with DeepFace"
]
},
{
"cell_type": "markdown",
"id": "51f8888b-d1a3-4b85-a596-95c0993fa192",
"metadata": {},
"source": [
"This notebooks shows some preliminary work on detecting facial expressions with DeepFace. It is mainly meant to explore its capabilities and to decide on future research directions. We package our code into a `misinformation` package that is imported here:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b21e52a5-d379-42db-aae6-f2ab9ed9a369",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import misinformation\n",
"from misinformation import utils as mutils\n",
"from misinformation import display as mdisplay"
]
},
{
"cell_type": "markdown",
"id": "a2bd2153",
"metadata": {},
"source": [
"We select a subset of image files to try facial expression detection on. The `find_files` function finds image files within a given directory:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "afe7e638-f09d-47e7-9295-1c374bd64c53",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"images = mutils.find_files(\n",
" path=\"data/\",\n",
" limit=10,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "e149bfe5-90b0-49b2-af3d-688e41aab019",
"metadata": {},
"source": [
"If you want to fine tune the discovery of image files, you can provide more parameters:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f38bb8ed-1004-4e33-8ed6-793cb5869400",
"metadata": {},
"outputs": [],
"source": [
"?mutils.find_files"
]
},
{
"cell_type": "markdown",
"id": "705e7328",
"metadata": {},
"source": [
"We need to initialize the main dictionary that contains all information for the images and is updated through each subsequent analysis:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b37c0c91",
"metadata": {},
"outputs": [],
"source": [
"mydict = mutils.initialize_dict(images)"
]
},
{
"cell_type": "markdown",
"id": "a9372561",
"metadata": {},
"source": [
"To check the analysis, you can inspect the analyzed elements here. Loading the results takes a moment, so please be patient. If you are sure of what you are doing, you can skip this and directly export a csv file in the step below.\n",
"Here, we display the face recognition results provided by the DeepFace library. Click on the tabs to see the results in the right sidebar:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "992499ed-33f1-4425-ad5d-738cf565d175",
"metadata": {},
"outputs": [],
"source": [
"mdisplay.explore_analysis(mydict, identify=\"faces\")"
]
},
{
"cell_type": "markdown",
"id": "6f974341",
"metadata": {},
"source": [
"Directly carry out the analysis and export the result into a csv: Analysis - "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6f97c7d0",
"metadata": {},
"outputs": [],
"source": [
"for key in mydict.keys():\n",
" mydict[key] = misinformation.faces.EmotionDetector(mydict[key]).analyse_image()"
]
},
{
"cell_type": "markdown",
"id": "174357b1",
"metadata": {},
"source": [
"Convert the dictionary of dictionarys into a dictionary with lists:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "604bd257",
"metadata": {},
"outputs": [],
"source": [
"outdict = mutils.append_data_to_dict(mydict)\n",
"df = mutils.dump_df(outdict)"
]
},
{
"cell_type": "markdown",
"id": "8373d9f8",
"metadata": {},
"source": [
"Check the dataframe:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aa4b518a",
"metadata": {},
"outputs": [],
"source": [
"df.head(10)"
]
},
{
"cell_type": "markdown",
"id": "579cd59f",
"metadata": {},
"source": [
"Write the csv file:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4618decb",
"metadata": {},
"outputs": [],
"source": [
"df.to_csv(\"data/data_out.csv\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b1a80023",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
},
"vscode": {
"interpreter": {
"hash": "da98320027a74839c7141b42ef24e2d47d628ba1f51115c13da5d8b45a372ec2"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}


@@ -0,0 +1,341 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "22df2297-0629-45aa-b88c-6c61f1544db6",
"metadata": {},
"source": [
"# Image Multimodal Search"
]
},
{
"cell_type": "markdown",
"id": "9eeeb302-296e-48dc-86c7-254aa02f2b3a",
"metadata": {},
"source": [
"This notebooks shows some preliminary work on Image Multimodal Search with lavis library. It is mainly meant to explore its capabilities and to decide on future research directions. We package our code into a `misinformation` package that is imported here:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f10ad6c9-b1a0-4043-8c5d-ed660d77be37",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import misinformation\n",
"import misinformation.multimodal_search as ms"
]
},
{
"cell_type": "markdown",
"id": "acf08b44-3ea6-44cd-926d-15c0fd9f39e0",
"metadata": {},
"source": [
"Set an image path as input file path."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8d3fe589-ff3c-4575-b8f5-650db85596bc",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"images = misinformation.utils.find_files(\n",
" path=\"data/\",\n",
" limit=10,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "adf3db21-1f8b-4d44-bbef-ef0acf4623a0",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"mydict = misinformation.utils.initialize_dict(images)"
]
},
{
"cell_type": "markdown",
"id": "987540a8-d800-4c70-a76b-7bfabaf123fa",
"metadata": {},
"source": [
"## Indexing and extracting features from images in selected folder"
]
},
{
"cell_type": "markdown",
"id": "66d6ede4-00bc-4aeb-9a36-e52d7de33fe5",
"metadata": {},
"source": [
"You can choose one of the following models: blip, blip2, albef, clip_base, clip_vitl14, clip_vitl14_336"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7bbca1f0-d4b0-43cd-8e05-ee39d37c328e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"model_type = \"blip\"\n",
"# model_type = \"blip2\"\n",
"# model_type = \"albef\"\n",
"# model_type = \"clip_base\"\n",
"# model_type = \"clip_vitl14\"\n",
"# model_type = \"clip_vitl14_336\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ca095404-57d0-4f5d-aeb0-38c232252b17",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"(\n",
" model,\n",
" vis_processors,\n",
" txt_processors,\n",
" image_keys,\n",
" image_names,\n",
" features_image_stacked,\n",
") = ms.MultimodalSearch.parsing_images(mydict, model_type)"
]
},
{
"cell_type": "markdown",
"id": "9ff8a894-566b-4c4f-acca-21c50b5b1f52",
"metadata": {},
"source": [
"The tensors of all images `features_image_stacked` was saved in `<Number_of_images>_<model_name>_saved_features_image.pt`. If you run it once for current model and current set of images you do not need to repeat it again. Instead you can load this features with the command:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "56c6d488-f093-4661-835a-5c73a329c874",
"metadata": {},
"outputs": [],
"source": [
"# (\n",
"# model,\n",
"# vis_processors,\n",
"# txt_processors,\n",
"# image_keys,\n",
"# image_names,\n",
"# features_image_stacked,\n",
"# ) = ms.MultimodalSearch.parsing_images(mydict, model_type,\"18_clip_base_saved_features_image.pt\")"
]
},
{
"cell_type": "markdown",
"id": "309923c1-d6f8-4424-8fca-bde5f3a98b38",
"metadata": {},
"source": [
"Here we already processed our image folder with 18 images with `clip_base` model. So you need just write the name `18_clip_base_saved_features_image.pt` of the saved file that consists of tensors of all images as a 3rd argument to the previous function. "
]
},
{
"cell_type": "markdown",
"id": "162a52e8-6652-4897-b92e-645cab07aaef",
"metadata": {},
"source": [
"Next, you need to form search queries. You can search either by image or by text. You can search for a single query, or you can search for several queries at once, the computational time should not be much different. The format of the queries is as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c4196a52-d01e-42e4-8674-5712f7d6f792",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"search_query3 = [\n",
" {\"text_input\": \"politician press conference\"},\n",
" {\"text_input\": \"a person wearing a mask\"},\n",
" {\"image\": \"data/106349S_por.png\"},\n",
"]"
]
},
{
"cell_type": "markdown",
"id": "8bcf3127-3dfd-4ff4-b9e7-a043099b1418",
"metadata": {},
"source": [
"You can filter your results in 3 different ways:\n",
"- `filter_number_of_images` limits the number of images found. That is, if the parameter `filter_number_of_images = 10`, then the first 10 images that best match the query will be shown. The other images ranks will be set to `None` and the similarity value to `0`.\n",
"- `filter_val_limit` limits the output of images with a similarity value not bigger than `filter_val_limit`. That is, if the parameter `filter_val_limit = 0.2`, all images with similarity less than 0.2 will be discarded.\n",
"- `filter_rel_error` (percentage) limits the output of images with a similarity value not bigger than `100 * abs(current_simularity_value - best_simularity_value_in_current_search)/best_simularity_value_in_current_search < filter_rel_error`. That is, if we set filter_rel_error = 30, it means that if the top1 image have 0.5 similarity value, we discard all image with similarity less than 0.35."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7f7dc52f-7ee9-4590-96b7-e0d9d3b82378",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"similarity = ms.MultimodalSearch.multimodal_search(\n",
" mydict,\n",
" model,\n",
" vis_processors,\n",
" txt_processors,\n",
" model_type,\n",
" image_keys,\n",
" features_image_stacked,\n",
" search_query3,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "e1cf7e46-0c2c-4fb2-b89a-ef585ccb9339",
"metadata": {},
"source": [
"After launching `multimodal_search` function, the results of each query will be added to the source dictionary. "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9ad74b21-6187-4a58-9ed8-fd3e80f5a4ed",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"mydict[\"106349S_por\"]"
]
},
{
"cell_type": "markdown",
"id": "cd3ee120-8561-482b-a76a-e8f996783325",
"metadata": {},
"source": [
"A special function was written to present the search results conveniently. "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4324e4fd-e9aa-4933-bb12-074d54e0c510",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"ms.MultimodalSearch.show_results(mydict, search_query3[0])"
]
},
{
"cell_type": "markdown",
"id": "d86ab96b-1907-4b7f-a78e-3983b516d781",
"metadata": {
"tags": []
},
"source": [
"## Save search results to csv"
]
},
{
"cell_type": "markdown",
"id": "4bdbc4d4-695d-4751-ab7c-d2d98e2917d7",
"metadata": {
"tags": []
},
"source": [
"Convert the dictionary of dictionarys into a dictionary with lists:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6c6ddd83-bc87-48f2-a8d6-1bd3f4201ff7",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"outdict = misinformation.utils.append_data_to_dict(mydict)\n",
"df = misinformation.utils.dump_df(outdict)"
]
},
{
"cell_type": "markdown",
"id": "ea2675d5-604c-45e7-86d2-080b1f4559a0",
"metadata": {
"tags": []
},
"source": [
"Check the dataframe:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e78646d6-80be-4d3e-8123-3360957bcaa8",
"metadata": {},
"outputs": [],
"source": [
"df.head(10)"
]
},
{
"cell_type": "markdown",
"id": "05546d99-afab-4565-8f30-f14e1426abcf",
"metadata": {},
"source": [
"Write the csv file:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "185f7dde-20dc-44d8-9ab0-de41f9b5734d",
"metadata": {},
"outputs": [],
"source": [
"df.to_csv(\"./data_out.csv\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

174
docs/source/notebooks/Example objects.ipynb Normal file

@@ -0,0 +1,174 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Objects Expression recognition"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebooks shows some preliminary work on detecting objects expressions with cvlib. It is mainly meant to explore its capabilities and to decide on future research directions. We package our code into a `misinformation` package that is imported here:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import misinformation\n",
"from misinformation import utils as mutils\n",
"from misinformation import display as mdisplay\n",
"import misinformation.objects as ob"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Set an image path as input file path."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"images = mutils.find_files(\n",
" path=\"data/\",\n",
" limit=10,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"mydict = mutils.initialize_dict(images)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Manually inspect what was detected\n",
"\n",
"To check the analysis, you can inspect the analyzed elements here. Loading the results takes a moment, so please be patient. If you are sure of what you are doing."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"mdisplay.explore_analysis(mydict, identify=\"objects\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Detect objects and directly write to csv"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for key in mydict:\n",
" mydict[key] = ob.ObjectDetector(mydict[key]).analyse_image()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Convert the dictionary of dictionarys into a dictionary with lists:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"outdict = mutils.append_data_to_dict(mydict)\n",
"df = mutils.dump_df(outdict)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Check the dataframe:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df.head(10)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Write the csv file:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df.to_csv(\"./data_out.csv\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
},
"vscode": {
"interpreter": {
"hash": "f1142466f556ab37fe2d38e2897a16796906208adb09fea90ba58bdf8a56f0ba"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}

283
docs/source/notebooks/Example summary.ipynb Normal file

@@ -0,0 +1,283 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Image summary and visual question answering"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebooks shows some preliminary work on Image Captioning and Visual question answering with lavis. It is mainly meant to explore its capabilities and to decide on future research directions. We package our code into a `misinformation` package that is imported here:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import misinformation\n",
"from misinformation import utils as mutils\n",
"from misinformation import display as mdisplay\n",
"import misinformation.summary as sm"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Set an image path as input file path."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"images = mutils.find_files(\n",
" path=\"data/\",\n",
" limit=10,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"mydict = mutils.initialize_dict(images)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create captions for images and directly write to csv"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here you can choose between two models: \"base\" or \"large\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"summary_model, summary_vis_processors = mutils.load_model(\"base\")\n",
"# summary_model, summary_vis_processors = mutils.load_model(\"large\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for key in mydict:\n",
" mydict[key] = sm.SummaryDetector(mydict[key]).analyse_image(\n",
" summary_model, summary_vis_processors\n",
" )"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"Convert the dictionary of dictionarys into a dictionary with lists:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"outdict = mutils.append_data_to_dict(mydict)\n",
"df = mutils.dump_df(outdict)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Check the dataframe:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df.head(10)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Write the csv file:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df.to_csv(\"./data_out.csv\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Manually inspect the summaries\n",
"\n",
"To check the analysis, you can inspect the analyzed elements here. Loading the results takes a moment, so please be patient. If you are sure of what you are doing.\n",
"\n",
"`const_image_summary` - the permanent summarys, which does not change from run to run (analyse_image).\n",
"\n",
"`3_non-deterministic summary` - 3 different summarys examples that change from run to run (analyse_image). "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"mdisplay.explore_analysis(mydict, identify=\"summary\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Generate answers to free-form questions about images written in natural language. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Set the list of questions"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"list_of_questions = [\n",
" \"How many persons on the picture?\",\n",
" \"Are there any politicians in the picture?\",\n",
" \"Does the picture show something from medicine?\",\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for key in mydict:\n",
" mydict[key] = sm.SummaryDetector(mydict[key]).analyse_questions(list_of_questions)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"mdisplay.explore_analysis(mydict, identify=\"summary\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Convert the dictionary of dictionarys into a dictionary with lists:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"outdict2 = mutils.append_data_to_dict(mydict)\n",
"df2 = mutils.dump_df(outdict2)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df2.head(10)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df2.to_csv(\"./data_out2.csv\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
},
"vscode": {
"interpreter": {
"hash": "f1142466f556ab37fe2d38e2897a16796906208adb09fea90ba58bdf8a56f0ba"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}

212
docs/source/notebooks/Example text.ipynb Normal file

@@ -0,0 +1,212 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "dcaa3da1",
"metadata": {},
"source": [
"# Text extraction on image\n",
"Inga Ulusoy, SSC, July 2022"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f43f327c",
"metadata": {},
"outputs": [],
"source": [
"# if running on google colab\n",
"# flake8-noqa-cell\n",
"import os\n",
"\n",
"if \"google.colab\" in str(get_ipython()):\n",
" # update python version\n",
" # install setuptools\n",
" !pip install setuptools==61 -qqq\n",
" # install misinformation\n",
" !pip install git+https://github.com/ssciwr/misinformation.git -qqq\n",
" # mount google drive for data and API key\n",
" from google.colab import drive\n",
"\n",
" drive.mount(\"/content/drive\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cf362e60",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from IPython.display import Image, display\n",
"import misinformation\n",
"from misinformation import utils as mutils\n",
"from misinformation import display as mdisplay\n",
"import tensorflow as tf"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "27675810",
"metadata": {},
"outputs": [],
"source": [
"# download the models if they are not there yet\n",
"!python -m spacy download en_core_web_md\n",
"!python -m textblob.download_corpora"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6da3a7aa",
"metadata": {},
"outputs": [],
"source": [
"images = mutils.find_files(path=\"data\", limit=10)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bf811ce0",
"metadata": {},
"outputs": [],
"source": [
"for i in images:\n",
" display(Image(filename=i))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8b32409f",
"metadata": {},
"outputs": [],
"source": [
"mydict = mutils.initialize_dict(images)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "7b8b929f",
"metadata": {},
"source": [
"## google cloud vision API\n",
"First 1000 images per month are free."
]
},
{
"cell_type": "markdown",
"id": "0891b795-c7fe-454c-a45d-45fadf788142",
"metadata": {},
"source": [
"## Inspect the elements per image"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7c6ecc88",
"metadata": {},
"outputs": [],
"source": [
"mdisplay.explore_analysis(mydict, identify=\"text-on-image\")"
]
},
{
"cell_type": "markdown",
"id": "9c3e72b5-0e57-4019-b45e-3e36a74e7f52",
"metadata": {},
"source": [
"## Or directly analyze for further processing"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "365c78b1-7ff4-4213-86fa-6a0a2d05198f",
"metadata": {},
"outputs": [],
"source": [
"for key in mydict:\n",
" print(key)\n",
" mydict[key] = misinformation.text.TextDetector(\n",
" mydict[key], analyse_text=True\n",
" ).analyse_image()"
]
},
{
"cell_type": "markdown",
"id": "3c063eda",
"metadata": {},
"source": [
"## Convert to dataframe and write csv"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5709c2cd",
"metadata": {},
"outputs": [],
"source": [
"outdict = mutils.append_data_to_dict(mydict)\n",
"df = mutils.dump_df(outdict)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c4f05637",
"metadata": {},
"outputs": [],
"source": [
"# check the dataframe\n",
"df.head(10)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bf6c9ddb",
"metadata": {},
"outputs": [],
"source": [
"# Write the csv\n",
"df.to_csv(\"./data_out.csv\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.5"
},
"vscode": {
"interpreter": {
"hash": "da98320027a74839c7141b42ef24e2d47d628ba1f51115c13da5d8b45a372ec2"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}

Binary data
docs/source/notebooks/data/102141_2_eng.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 287 KiB

Binary data
docs/source/notebooks/data/102730_eng.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 501 KiB

Binary data
docs/source/notebooks/data/106349S_por.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 673 KiB


@@ -141,7 +141,7 @@ class EmotionDetector(utils.AnalysisMethod):
DeepFace.analyze(
img_path=face,
actions=actions,
prog_bar=False,
silent=True,
detector_backend="skip",
)
)


@@ -1,5 +1,7 @@
import cv2
import cvlib as cv
import numpy as np
from PIL import Image
def objects_from_cvlib(objects_list: list) -> dict:
@@ -50,7 +52,11 @@ class ObjectCVLib(ObjectsMethod):
image_path: The path to the local file.
"""
img = cv2.imread(image_path)
bbox, label, conf = cv.detect_common_objects(img)
# preimg = Image.open(image_path).convert("RGB")
# preimg2 = np.asarray(preimg)
# img = cv2.cvtColor(preimg2, cv2.COLOR_BGR2RGB)
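# only the labels are needed below; bounding boxes and confidences are discarded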
_, label, _ = cv.detect_common_objects(img)
# output_image = draw_bbox(im, bbox, label, conf)
objects = objects_from_cvlib(label)
return objects


@@ -1,5 +1,5 @@
from misinformation.utils import AnalysisMethod
import torch
from torch import device, cuda, no_grad
from PIL import Image
from lavis.models import load_model_and_preprocess
@@ -8,7 +8,7 @@ class SummaryDetector(AnalysisMethod):
def __init__(self, subdict: dict) -> None:
super().__init__(subdict)
summary_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
summary_device = device("cuda" if cuda.is_available() else "cpu")
summary_model, summary_vis_processors, _ = load_model_and_preprocess(
name="blip_caption",
model_type="base_coco",
@@ -16,6 +16,34 @@ class SummaryDetector(AnalysisMethod):
device=summary_device,
)
def load_model_base(self):
summary_device = device("cuda" if cuda.is_available() else "cpu")
summary_model, summary_vis_processors, _ = load_model_and_preprocess(
name="blip_caption",
model_type="base_coco",
is_eval=True,
device=summary_device,
)
return summary_model, summary_vis_processors
def load_model_large(self):
summary_device = device("cuda" if cuda.is_available() else "cpu")
summary_model, summary_vis_processors, _ = load_model_and_preprocess(
name="blip_caption",
model_type="large_coco",
is_eval=True,
device=summary_device,
)
return summary_model, summary_vis_processors
def load_model(self, model_type):
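# dispatch to the base or large BLIP captioning model as requested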
select_model = {
"base": SummaryDetector.load_model_base,
"large": SummaryDetector.load_model_large,
}
summary_model, summary_vis_processors = select_model[model_type](self)
return summary_model, summary_vis_processors
def analyse_image(self, summary_model=None, summary_vis_processors=None):
if summary_model is None and summary_vis_processors is None:
@@ -29,7 +57,7 @@ class SummaryDetector(AnalysisMethod):
.unsqueeze(0)
.to(self.summary_device)
)
with torch.no_grad():
with no_grad():
self.subdict["const_image_summary"] = summary_model.generate(
{"image": image}
)[0]
@@ -62,7 +90,7 @@ class SummaryDetector(AnalysisMethod):
batch_size = len(list_of_questions)
image_batch = image.repeat(batch_size, 1, 1, 1)
with torch.no_grad():
with no_grad():
answers_batch = self.summary_VQA_model.predict_answers(
samples={"image": image_batch, "text_input": question_batch},
inference_method="generate",

18
misinformation/test/conftest.py Normal file

@@ -0,0 +1,18 @@
import os
import pytest
@pytest.fixture
def get_path(request):
mypath = os.path.dirname(request.module.__file__)
mypath = mypath + "/data/"
return mypath
@pytest.fixture
def set_environ(request):
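# point google-cloud-vision to the service account key used by the gcv-marked tests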
mypath = os.path.dirname(request.module.__file__)
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = (
mypath + "/../../data/seismic-bonfire-329406-412821a70264.json"
)
print(os.environ.get("GOOGLE_APPLICATION_CREDENTIALS"))

Binary data
misinformation/test/data/IMG_2809.png

Binary file not shown.

Before

Width:  |  Height:  |  Size: 1.2 MiB

After

Width:  |  Height:  |  Size: 1.2 MiB


@@ -1 +1 @@
{"filename": "./test/data/IMG_2809.png", "person": "yes", "bicycle": "no", "car": "yes", "motorcycle": "no", "airplane": "no", "bus": "yes", "train": "no", "truck": "no", "boat": "no", "traffic light": "no", "cell phone": "no"}
{"filename": "IMG_2809.png", "person": "yes", "bicycle": "no", "car": "yes", "motorcycle": "no", "airplane": "no", "bus": "yes", "train": "no", "truck": "no", "boat": "no", "traffic light": "no", "cell phone": "no"}


@@ -3,10 +3,10 @@ The Quantum Theory of
Nonrelativistic Collisions
JOHN R. TAYLOR
University of Colorado
ostaliga Lanbidean
postaldia Lanbidean
1 ilde
ballenger stor goin
gdĐOL, SIVI 23 TL 02
gd OOL, STVÍ 23 TL 02
de in obl
och yd badalang
a


@@ -3,12 +3,12 @@ The Quantum Theory of
Nonrelativistic Collisions
JOHN R. TAYLOR
University of Colorado
ostaliga Lanbidean
postaldia Lanbidean
1 ilde
balloons big goin
gdĐOL, SIVI 23 TL
there in obl
och yd change
ballenger stor goin
gd OOL, STVÍ 23 TL 02
de in obl
och yd badalang
a
Ber
ook Sy-RW isn't going anywhere
ook Sy-RW enot go baldus


@@ -6,8 +6,8 @@ import misinformation.objects_cvlib as ob_cvlib
OBJECT_1 = "cell phone"
OBJECT_2 = "motorcycle"
OBJECT_3 = "traffic light"
TEST_IMAGE_1 = "./test/data/IMG_2809.png"
JSON_1 = "./test/data/example_objects_cvlib.json"
TEST_IMAGE_1 = "IMG_2809.png"
JSON_1 = "example_objects_cvlib.json"
@pytest.fixture()
@@ -25,11 +25,11 @@ def test_objects_from_cvlib(default_objects):
assert str(objects) == str(out_objects)
def test_analyse_image_cvlib():
mydict = {"filename": TEST_IMAGE_1}
def test_analyse_image_cvlib(get_path):
mydict = {"filename": get_path + TEST_IMAGE_1}
ob_cvlib.ObjectCVLib().analyse_image(mydict)
with open(JSON_1, "r") as file:
with open(get_path + JSON_1, "r") as file:
out_dict = json.load(file)
for key in mydict.keys():
assert mydict[key] == out_dict[key]
@@ -54,37 +54,37 @@ def test_init_default_objects():
assert init_objects[obj] == "no"
def test_analyse_image_from_file_cvlib():
file_path = TEST_IMAGE_1
objs = ob_cvlib.ObjectCVLib().analyse_image_from_file(file_path)
def test_analyse_image_from_file_cvlib(get_path):
file_path = get_path + TEST_IMAGE_1
objs = ob_cvlib.ObjectCVLib().analyse_image_from_file(file_path)
with open(JSON_1, "r") as file:
with open(get_path + JSON_1, "r") as file:
out_dict = json.load(file)
for key in objs.keys():
assert objs[key] == out_dict[key]
def test_detect_objects_cvlib():
file_path = TEST_IMAGE_1
def test_detect_objects_cvlib(get_path):
file_path = get_path + TEST_IMAGE_1
objs = ob_cvlib.ObjectCVLib().detect_objects_cvlib(file_path)
with open(JSON_1, "r") as file:
with open(get_path + JSON_1, "r") as file:
out_dict = json.load(file)
for key in objs.keys():
assert objs[key] == out_dict[key]
def test_set_keys(default_objects):
mydict = {"filename": TEST_IMAGE_1}
def test_set_keys(default_objects, get_path):
mydict = {"filename": get_path + TEST_IMAGE_1}
key_objs = ob.ObjectDetector(mydict).set_keys()
assert str(default_objects) == str(key_objs)
def test_analyse_image():
mydict = {"filename": TEST_IMAGE_1}
def test_analyse_image(get_path):
mydict = {"filename": get_path + TEST_IMAGE_1}
ob.ObjectDetector.set_client_to_cvlib()
ob.ObjectDetector(mydict).analyse_image()
with open(JSON_1, "r") as file:
with open(get_path + JSON_1, "r") as file:
out_dict = json.load(file)
assert str(mydict) == str(out_dict)


@@ -2,31 +2,30 @@ import os
import pytest
import spacy
import misinformation.text as tt
import misinformation
import pandas as pd
TESTDICT = {
"IMG_3755": {
"filename": "./test/data/IMG_3755.jpg",
},
"IMG_3756": {
"filename": "./test/data/IMG_3756.jpg",
},
"IMG_3757": {
"filename": "./test/data/IMG_3757.jpg",
},
}
LANGUAGES = ["de", "om", "en"]
os.environ[
"GOOGLE_APPLICATION_CREDENTIALS"
] = "../data/seismic-bonfire-329406-412821a70264.json"
def test_TextDetector():
for item in TESTDICT:
test_obj = tt.TextDetector(TESTDICT[item])
@pytest.fixture
def set_testdict(get_path):
testdict = {
"IMG_3755": {
"filename": get_path + "IMG_3755.jpg",
},
"IMG_3756": {
"filename": get_path + "IMG_3756.jpg",
},
"IMG_3757": {
"filename": get_path + "IMG_3757.jpg",
},
}
return testdict
LANGUAGES = ["de", "en", "en"]
def test_TextDetector(set_testdict):
for item in set_testdict:
test_obj = tt.TextDetector(set_testdict[item])
assert test_obj.subdict["text"] is None
assert test_obj.subdict["text_language"] is None
assert test_obj.subdict["text_english"] is None
@@ -34,30 +33,30 @@ def test_TextDetector():
@pytest.mark.gcv
def test_analyse_image():
for item in TESTDICT:
test_obj = tt.TextDetector(TESTDICT[item])
def test_analyse_image(set_testdict, set_environ):
for item in set_testdict:
test_obj = tt.TextDetector(set_testdict[item])
test_obj.analyse_image()
test_obj = tt.TextDetector(TESTDICT[item], analyse_text=True)
test_obj = tt.TextDetector(set_testdict[item], analyse_text=True)
test_obj.analyse_image()
@pytest.mark.gcv
def test_get_text_from_image():
for item in TESTDICT:
test_obj = tt.TextDetector(TESTDICT[item])
def test_get_text_from_image(set_testdict, get_path, set_environ):
for item in set_testdict:
test_obj = tt.TextDetector(set_testdict[item])
test_obj.get_text_from_image()
ref_file = "./test/data/text_" + item + ".txt"
ref_file = get_path + "text_" + item + ".txt"
with open(ref_file, "r", encoding="utf8") as file:
reference_text = file.read()
assert test_obj.subdict["text"] == reference_text
def test_translate_text():
for item, lang in zip(TESTDICT, LANGUAGES):
test_obj = tt.TextDetector(TESTDICT[item])
ref_file = "./test/data/text_" + item + ".txt"
trans_file = "./test/data/text_translated_" + item + ".txt"
def test_translate_text(set_testdict, get_path):
for item, lang in zip(set_testdict, LANGUAGES):
test_obj = tt.TextDetector(set_testdict[item])
ref_file = get_path + "text_" + item + ".txt"
trans_file = get_path + "text_translated_" + item + ".txt"
with open(ref_file, "r", encoding="utf8") as file:
reference_text = file.read()
with open(trans_file, "r", encoding="utf8") as file:
@@ -77,9 +76,9 @@ def test_remove_linebreaks():
assert test_obj.subdict["text_english"] == "This is another test."
def test_run_spacy():
test_obj = tt.TextDetector(TESTDICT["IMG_3755"], analyse_text=True)
ref_file = "./test/data/text_IMG_3755.txt"
def test_run_spacy(set_testdict, get_path):
test_obj = tt.TextDetector(set_testdict["IMG_3755"], analyse_text=True)
ref_file = get_path + "text_IMG_3755.txt"
with open(ref_file, "r") as file:
reference_text = file.read()
test_obj.subdict["text_english"] = reference_text
@@ -87,10 +86,10 @@ def test_run_spacy():
assert isinstance(test_obj.doc, spacy.tokens.doc.Doc)
def test_clean_text():
def test_clean_text(set_testdict):
nlp = spacy.load("en_core_web_md")
doc = nlp("I like cats and fjejg")
test_obj = tt.TextDetector(TESTDICT["IMG_3755"])
test_obj = tt.TextDetector(set_testdict["IMG_3755"])
test_obj.doc = doc
test_obj.clean_text()
result = "I like cats and"
@@ -117,30 +116,35 @@ def test_sentiment_analysis():
assert test_obj.subdict["subjectivity"] == 0.6
def test_PostprocessText():
def test_PostprocessText(set_testdict, get_path):
reference_dict = "THE\nALGEBRAIC\nEIGENVALUE\nPROBLEM\nDOM\nNVS TIO\nMINA\nMonographs\non Numerical Analysis\nJ.. H. WILKINSON"
reference_df = "Mathematische Formelsammlung\nfür Ingenieure und Naturwissenschaftler\nMit zahlreichen Abbildungen und Rechenbeispielen\nund einer ausführlichen Integraltafel\n3., verbesserte Auflage"
obj = tt.PostprocessText(mydict=TESTDICT)
# make sure test works on windows where end-of-line character is \r\n
img_numbers = ["IMG_3755", "IMG_3756", "IMG_3757"]
for image_ref in img_numbers:
ref_file = get_path + "text_" + image_ref + ".txt"
with open(ref_file, "r") as file:
reference_text = file.read()
set_testdict[image_ref]["text_english"] = reference_text
obj = tt.PostprocessText(mydict=set_testdict)
test_dict = obj.list_text_english[2].replace("\r", "")
assert test_dict == reference_dict
for key in TESTDICT.keys():
TESTDICT[key].pop("text_english")
for key in set_testdict.keys():
set_testdict[key].pop("text_english")
with pytest.raises(ValueError):
tt.PostprocessText(mydict=TESTDICT)
obj = tt.PostprocessText(use_csv=True, csv_path="./test/data/test_data_out.csv")
tt.PostprocessText(mydict=set_testdict)
obj = tt.PostprocessText(use_csv=True, csv_path=get_path + "test_data_out.csv")
# make sure test works on windows where end-of-line character is \r\n
test_df = obj.list_text_english[0].replace("\r", "")
assert test_df == reference_df
with pytest.raises(ValueError):
tt.PostprocessText(use_csv=True, csv_path="./test/data/test_data_out_nokey.csv")
tt.PostprocessText(use_csv=True, csv_path=get_path + "test_data_out_nokey.csv")
with pytest.raises(ValueError):
tt.PostprocessText()
def test_analyse_topic():
def test_analyse_topic(get_path):
_, topic_df, most_frequent_topics = tt.PostprocessText(
use_csv=True, csv_path="./test/data/topic_analysis_test.csv"
use_csv=True, csv_path=get_path + "topic_analysis_test.csv"
).analyse_topic()
# since this is not deterministic we cannot be sure we get the same result twice
assert len(topic_df) == 2


@@ -3,38 +3,36 @@ import pandas as pd
import misinformation.utils as ut
def test_find_files():
result = ut.find_files(
path="./test/data/", pattern="*.png", recursive=True, limit=10
)
def test_find_files(get_path):
result = ut.find_files(path=get_path, pattern="*.png", recursive=True, limit=10)
assert len(result) > 0
def test_initialize_dict():
def test_initialize_dict(get_path):
result = [
"./test/data/image_faces.jpg",
"./test/data/image_objects.jpg",
]
mydict = ut.initialize_dict(result)
with open("./test/data/example_utils_init_dict.json", "r") as file:
with open(get_path + "example_utils_init_dict.json", "r") as file:
out_dict = json.load(file)
assert mydict == out_dict
def test_append_data_to_dict():
with open("./test/data/example_append_data_to_dict_in.json", "r") as file:
def test_append_data_to_dict(get_path):
with open(get_path + "example_append_data_to_dict_in.json", "r") as file:
mydict = json.load(file)
outdict = ut.append_data_to_dict(mydict)
print(outdict)
with open("./test/data/example_append_data_to_dict_out.json", "r") as file:
with open(get_path + "example_append_data_to_dict_out.json", "r") as file:
example_outdict = json.load(file)
assert outdict == example_outdict
def test_dump_df():
with open("./test/data/example_append_data_to_dict_out.json", "r") as file:
def test_dump_df(get_path):
with open(get_path + "example_append_data_to_dict_out.json", "r") as file:
outdict = json.load(file)
df = ut.dump_df(outdict)
out_df = pd.read_csv("./test/data/example_dump_df.csv", index_col=[0])
out_df = pd.read_csv(get_path + "example_dump_df.csv", index_col=[0])
pd.testing.assert_frame_equal(df, out_df)

4
notebooks/image_summary.ipynb generated

@@ -83,7 +83,7 @@
"metadata": {},
"outputs": [],
"source": [
"summary_model, summary_vis_processors = mutils.load_model(\"base\")\n",
"summary_model, summary_vis_processors = sm.SummaryDetector.load_model(mydict, \"base\")\n",
"# summary_model, summary_vis_processors = mutils.load_model(\"large\")"
]
},
@@ -279,7 +279,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
"version": "3.9.0"
},
"vscode": {
"interpreter": {


@@ -22,33 +22,34 @@ classifiers = [
"License :: OSI Approved :: MIT License",
]
dependencies = [
"google-cloud-vision",
"bertopic",
"cvlib",
"deepface<=0.0.75",
"deepface @ git+https://github.com/iulusoy/deepface.git",
"googletrans==3.1.0a0",
"grpcio",
"importlib_metadata",
"ipython",
"ipywidgets",
"ipykernel",
"matplotlib",
"numpy<=1.23.4",
"pandas",
"Pillow",
"pooch",
"protobuf",
"retina_face",
"setuptools",
"tensorflow",
"keras",
"openpyxl",
"pytest",
"pytest-cov",
"matplotlib",
"pytest",
"opencv-contrib-python <= 4.6",
"googletrans==3.1.0a0",
"requests",
"retina_face @ git+https://github.com/iulusoy/retinaface.git",
"salesforce-lavis @ git+https://github.com/iulusoy/LAVIS.git",
"spacy",
"jupyterlab",
"spacytextblob",
"tensorflow",
"textblob",
"torch",
"salesforce-lavis",
"bertopic",
"grpcio",
"google-cloud-vision",
"setuptools",
"opencv-contrib-python",
]
[project.scripts]


@@ -1,4 +1,5 @@
sphinx
myst-parser
sphinx_rtd_theme
sphinxcontrib-napoleon
sphinxcontrib-napoleon
nbsphinx