{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# AMMICO Demonstration Notebook\n", "With ammico, you can analyze text on images and image content at the same time. This is a demonstration notebook to showcase the capabilities of ammico.\n", "You can run this notebook on Google Colab, locally, or on your own HPC resource. The first cell only runs on Google Colab; on all other machines, you need to create a conda environment first and install ammico from the Python Package Index using \n", "```pip install ammico``` \n", "Alternatively, you can install the development version from the GitHub repository \n", "```pip install git+https://github.com/ssciwr/AMMICO.git```" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:42:44.420408Z", "iopub.status.busy": "2024-02-19T08:42:44.420216Z", "iopub.status.idle": "2024-02-19T08:42:44.428568Z", "shell.execute_reply": "2024-02-19T08:42:44.428037Z" } }, "outputs": [], "source": [ "# if running on google colab\n", "# flake8-noqa-cell\n", "\n", "if \"google.colab\" in str(get_ipython()):\n", " # update python version\n", " # install setuptools\n", " # %pip install setuptools==61 -qqq\n", " # uninstall some pre-installed packages due to incompatibility\n", " %pip uninstall tensorflow-probability dopamine-rl lida pandas-gbq torchaudio torchdata torchtext orbax-checkpoint flex-y -qqq\n", " # install ammico\n", " %pip install git+https://github.com/ssciwr/ammico.git -qqq\n", " # mount google drive for data and API key\n", " from google.colab import drive\n", "\n", " drive.mount(\"/content/drive\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Use a test dataset\n", "You can download a dataset for test purposes. Skip this step if you use your own data." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:42:44.430935Z", "iopub.status.busy": "2024-02-19T08:42:44.430571Z", "iopub.status.idle": "2024-02-19T08:42:51.757352Z", "shell.execute_reply": "2024-02-19T08:42:51.756689Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", " from .autonotebook import tqdm as notebook_tqdm\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\r", "Downloading readme: 0%| | 0.00/21.0 [00:00 `Restart session`. And rerun the notebook again. All required packages will already be installed, so the execution will be very fast. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Step 0: Create and set a Google Cloud Vision Key\n", "\n", "Please note that for the [Google Cloud Vision API](https://cloud.google.com/vision/docs/setup) (the TextDetector class) you need to set a key in order to process the images. A key is generated following [these instructions](../set_up_credentials.md). This key is ideally set as an environment variable using for example\n", "```\n", "os.environ[\n", " \"GOOGLE_APPLICATION_CREDENTIALS\"\n", "] = \"/content/drive/MyDrive/misinformation-data/misinformation-campaign-981aa55a3b13.json\"\n", "```\n", "where you place the key on your Google Drive if running on colab, or place it in a local folder on your machine." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:43:01.943248Z", "iopub.status.busy": "2024-02-19T08:43:01.942735Z", "iopub.status.idle": "2024-02-19T08:43:01.945881Z", "shell.execute_reply": "2024-02-19T08:43:01.945338Z" } }, "outputs": [], "source": [ "# os.environ[\"GOOGLE_APPLICATION_CREDENTIALS\"] = \"/content/drive/MyDrive/misinformation-data/misinformation-campaign-981aa55a3b13.json\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Step 1: Read your data into AMMICO\n", "The ammico package reads in one or several input files given in a folder for processing. The user can choose to read in all image files in a folder, include subfolders via the `recursive` option, and restrict the file extensions that are considered (for example, only \"jpg\" files, or both \"jpg\" and \"png\" files). For reading in the files, the ammico function `find_files` is used, with optional keywords:\n", "\n", "| input key | input type | possible input values |\n", "| --------- | ---------- | --------------------- |\n", "| `path` | `str` | the directory containing the image files (defaults to the location set by environment variable `AMMICO_DATA_HOME`) |\n", "| `pattern` | `str\\|list` | the file extensions to consider (defaults to \"png\", \"jpg\", \"jpeg\", \"gif\", \"webp\", \"avif\", \"tiff\") |\n", "| `recursive` | `bool` | include subdirectories recursively (defaults to `True`) |\n", "| `limit` | `int` | maximum number of files to read (defaults to `20`; set to `None` or `-1` to read all images) |\n", "| `random_seed` | `str` | the random seed for shuffling the images; applies when only a few images are read and the selection should be preserved (defaults to `None`) |\n", "\n", "The `find_files` function returns a nested dict that contains only the file ids and the paths to the files. 
This dict is filled step by step with more data as each detector class is run on the data (see below).\n", "\n", "If you downloaded the test dataset above, you can directly provide the path you already set for the test directory, `data_path`." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:43:01.948209Z", "iopub.status.busy": "2024-02-19T08:43:01.947847Z", "iopub.status.idle": "2024-02-19T08:43:01.952312Z", "shell.execute_reply": "2024-02-19T08:43:01.951836Z" } }, "outputs": [], "source": [ "image_dict = ammico.find_files(\n", " # path=\"/content/drive/MyDrive/misinformation-data/\",\n", " path=data_path.as_posix(),\n", " limit=15,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 2: Inspect the input files using the graphical user interface\n", "A Dash user interface is available to select the most suitable options for the analysis before running a complete analysis on the whole data set. The options for each detector module are explained below in the corresponding sections; for example, different models can be selected that will provide slightly different results. This way, the user can interactively explore which settings provide the most accurate results. The nested `image_dict` is passed to the `AnalysisExplorer` class. The interface runs on the port given by the `port` keyword; if that port is already in use, an error message is returned and the user should select a different port number. \n", "The interface opens a Dash app inside the Jupyter Notebook and allows selection of the input file in the top left dropdown menu, as well as selection of the detector type in the top right, with options for each detector type as explained below. The output of the detector is shown directly on the right next to the image. 
This way, the user can directly inspect how updating the options for each detector changes the computed results, and find the best settings for a production run." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:43:01.954766Z", "iopub.status.busy": "2024-02-19T08:43:01.954396Z", "iopub.status.idle": "2024-02-19T08:43:01.978861Z", "shell.execute_reply": "2024-02-19T08:43:01.977627Z" } }, "outputs": [], "source": [ "analysis_explorer = ammico.AnalysisExplorer(image_dict)\n", "analysis_explorer.run_server(port=8055)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 3: Analyze all images\n", "The analysis can be run in production on all images in the data set. Depending on the size of the data set and the computing resources available, this can take some time. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is also possible to write the calculated data to a dump file: set the file name via `dump_file`, and the data is saved every `dump_every` images. 
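
The dump logic can be sketched as follows. This is a minimal sketch, not the ammico implementation: `write_dump`, `analyse_with_dumps`, and the `analyse` callable are hypothetical names introduced for illustration, and the standard-library `csv` module stands in for whatever writer the analysis actually uses.

```python
import csv


def write_dump(image_dict, dump_file):
    # Flatten the nested dict into one CSV row per image.
    rows = list(image_dict.values())
    # Images that are not analysed yet lack some keys; DictWriter leaves
    # those fields empty.
    fieldnames = sorted({name for row in rows for name in row})
    with open(dump_file, 'w', newline='') as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def analyse_with_dumps(image_dict, analyse, dump_file, dump_every):
    # 'analyse' is a hypothetical stand-in for a detector call: it takes the
    # per-image dict and returns it with the new results added.
    dumps_written = 0
    for num, key in enumerate(image_dict):
        image_dict[key] = analyse(image_dict[key])
        # Dump after every dump_every-th image and after the last image.
        if (num + 1) % dump_every == 0 or num == len(image_dict) - 1:
            write_dump(image_dict, dump_file)
            dumps_written += 1
    return dumps_written
```

With `dump_every = 10` and 25 images, this sketch rewrites the dump file after images 10, 20, and 25, so an interrupted run loses at most `dump_every - 1` results.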
" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:43:01.981825Z", "iopub.status.busy": "2024-02-19T08:43:01.981158Z", "iopub.status.idle": "2024-02-19T08:43:01.984935Z", "shell.execute_reply": "2024-02-19T08:43:01.983983Z" } }, "outputs": [], "source": [ "# dump file name\n", "dump_file = \"dump_file.csv\"\n", "# dump every N images \n", "dump_every = 10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The desired detector modules are called sequentially in any order, for example the `EmotionDetector`:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:43:01.988248Z", "iopub.status.busy": "2024-02-19T08:43:01.987632Z", "iopub.status.idle": "2024-02-19T08:44:04.645259Z", "shell.execute_reply": "2024-02-19T08:44:04.644561Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\r", " 0%| | 0/6 [00:00=3.7.2 in /opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages (from en-core-web-md==3.7.1) (3.7.4)\n", "Requirement already satisfied: spacy-legacy<3.1.0,>=3.0.11 in /opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-md==3.7.1) (3.0.12)\n", "Requirement already satisfied: spacy-loggers<2.0.0,>=1.0.0 in /opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-md==3.7.1) (1.0.5)\n", "Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-md==3.7.1) (1.0.10)\n", "Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-md==3.7.1) (2.0.8)\n", "Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages (from 
You can do this by selecting the\n", "'Restart kernel' or 'Restart runtime' option.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n", "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.0.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.0\u001b[0m\n", "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\r", "config.json: 0%| | 0.00/1.80k [00:00\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
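<table>\n",
"<thead>\n",
"<tr><th></th><th>filename</th><th>face</th><th>multiple_faces</th><th>no_faces</th><th>wears_mask</th><th>age</th><th>gender</th><th>race</th><th>emotion</th><th>emotion (category)</th><th>...</th><th>text_language</th><th>text_english</th><th>text_clean</th><th>text_summary</th><th>sentiment</th><th>sentiment_score</th><th>entity</th><th>entity_type</th><th>const_image_summary</th><th>3_non-deterministic_summary</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>data-test/img4.png</td><td>No</td><td>No</td><td>0</td><td>[No]</td><td>[None]</td><td>[None]</td><td>[None]</td><td>[None]</td><td>[None]</td><td>...</td><td>en</td><td>MOODOVIN XI</td><td>XI</td><td>MOODOVIN XI XI: Vladimir Putin, Vladimir Vlad...</td><td>POSITIVE</td><td>0.66</td><td>[MOODOVIN XI]</td><td>[ORG]</td><td>a river running through a city next to tall bu...</td><td>[buildings near a waterway with small boats pa...</td></tr>\n",
"<tr><th>1</th><td>data-test/img1.png</td><td>No</td><td>No</td><td>0</td><td>[No]</td><td>[None]</td><td>[None]</td><td>[None]</td><td>[None]</td><td>[None]</td><td>...</td><td>en</td><td>SCATTERING THEORY The Quantum Theory of Nonrel...</td><td>THEORY The Quantum Theory of Collisions JOHN R...</td><td>SCATTERING THEORY The Quantum Theory of Nonre...</td><td>POSITIVE</td><td>0.91</td><td>[Non, ##vist, Col, ##N, R, T, ##AYL, Universit...</td><td>[MISC, MISC, MISC, ORG, PER, PER, ORG, ORG]</td><td>a close up of a piece of paper with writing on it</td><td>[a white paper with some black writing on it, ...</td></tr>\n",
"<tr><th>2</th><td>data-test/img2.png</td><td>No</td><td>No</td><td>0</td><td>[No]</td><td>[None]</td><td>[None]</td><td>[None]</td><td>[None]</td><td>[None]</td><td>...</td><td>en</td><td>THE ALGEBRAIC EIGENVALUE PROBLEM DOM NVS TIO M...</td><td>THE PROBLEM DOM NVS TIO MINA Monographs on Num...</td><td>H. H. W. WILKINSON: The Algebri</td><td>NEGATIVE</td><td>0.97</td><td>[ALGEBRAIC EIGENVAL, NVS TIO MI, J, H, WILKINSON]</td><td>[MISC, ORG, ORG, ORG, ORG]</td><td>a yellow book with green lettering on it</td><td>[an old book with a picture of the slogan of t...</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>3 rows × 21 columns</p>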
\n", "" ], "text/plain": [ " filename face multiple_faces no_faces wears_mask age \\\n", "0 data-test/img4.png No No 0 [No] [None] \n", "1 data-test/img1.png No No 0 [No] [None] \n", "2 data-test/img2.png No No 0 [No] [None] \n", "\n", " gender race emotion emotion (category) ... text_language \\\n", "0 [None] [None] [None] [None] ... en \n", "1 [None] [None] [None] [None] ... en \n", "2 [None] [None] [None] [None] ... en \n", "\n", " text_english \\\n", "0 MOODOVIN XI \n", "1 SCATTERING THEORY The Quantum Theory of Nonrel... \n", "2 THE ALGEBRAIC EIGENVALUE PROBLEM DOM NVS TIO M... \n", "\n", " text_clean \\\n", "0 XI \n", "1 THEORY The Quantum Theory of Collisions JOHN R... \n", "2 THE PROBLEM DOM NVS TIO MINA Monographs on Num... \n", "\n", " text_summary sentiment \\\n", "0 MOODOVIN XI XI: Vladimir Putin, Vladimir Vlad... POSITIVE \n", "1 SCATTERING THEORY The Quantum Theory of Nonre... POSITIVE \n", "2 H. H. W. WILKINSON: The Algebri NEGATIVE \n", "\n", " sentiment_score entity \\\n", "0 0.66 [MOODOVIN XI] \n", "1 0.91 [Non, ##vist, Col, ##N, R, T, ##AYL, Universit... \n", "2 0.97 [ALGEBRAIC EIGENVAL, NVS TIO MI, J, H, WILKINSON] \n", "\n", " entity_type \\\n", "0 [ORG] \n", "1 [MISC, MISC, MISC, ORG, PER, PER, ORG, ORG] \n", "2 [MISC, ORG, ORG, ORG, ORG] \n", "\n", " const_image_summary \\\n", "0 a river running through a city next to tall bu... \n", "1 a close up of a piece of paper with writing on it \n", "2 a yellow book with green lettering on it \n", "\n", " 3_non-deterministic_summary \n", "0 [buildings near a waterway with small boats pa... \n", "1 [a white paper with some black writing on it, ... \n", "2 [an old book with a picture of the slogan of t... 
\n", "\n", "[3 rows x 21 columns]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "image_df.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or write to a csv file:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:50:22.391547Z", "iopub.status.busy": "2024-02-19T08:50:22.391161Z", "iopub.status.idle": "2024-02-19T08:50:25.022235Z", "shell.execute_reply": "2024-02-19T08:50:25.021403Z" } }, "outputs": [ { "ename": "OSError", "evalue": "Cannot save file into a non-existent directory: '/content/drive/MyDrive/misinformation-data'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mOSError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[15], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mimage_df\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mto_csv\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43m/content/drive/MyDrive/misinformation-data/data_out.csv\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/pandas/util/_decorators.py:333\u001b[0m, in \u001b[0;36mdeprecate_nonkeyword_arguments..decorate..wrapper\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 327\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(args) \u001b[38;5;241m>\u001b[39m num_allow_args:\n\u001b[1;32m 328\u001b[0m warnings\u001b[38;5;241m.\u001b[39mwarn(\n\u001b[1;32m 329\u001b[0m msg\u001b[38;5;241m.\u001b[39mformat(arguments\u001b[38;5;241m=\u001b[39m_format_argument_list(allow_args)),\n\u001b[1;32m 330\u001b[0m \u001b[38;5;167;01mFutureWarning\u001b[39;00m,\n\u001b[1;32m 331\u001b[0m stacklevel\u001b[38;5;241m=\u001b[39mfind_stack_level(),\n\u001b[1;32m 332\u001b[0m 
)\n\u001b[0;32m--> 333\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mfunc\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/pandas/core/generic.py:3961\u001b[0m, in \u001b[0;36mNDFrame.to_csv\u001b[0;34m(self, path_or_buf, sep, na_rep, float_format, columns, header, index, index_label, mode, encoding, compression, quoting, quotechar, lineterminator, chunksize, date_format, doublequote, escapechar, decimal, errors, storage_options)\u001b[0m\n\u001b[1;32m 3950\u001b[0m df \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(\u001b[38;5;28mself\u001b[39m, ABCDataFrame) \u001b[38;5;28;01melse\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mto_frame()\n\u001b[1;32m 3952\u001b[0m formatter \u001b[38;5;241m=\u001b[39m DataFrameFormatter(\n\u001b[1;32m 3953\u001b[0m frame\u001b[38;5;241m=\u001b[39mdf,\n\u001b[1;32m 3954\u001b[0m header\u001b[38;5;241m=\u001b[39mheader,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 3958\u001b[0m decimal\u001b[38;5;241m=\u001b[39mdecimal,\n\u001b[1;32m 3959\u001b[0m )\n\u001b[0;32m-> 3961\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mDataFrameRenderer\u001b[49m\u001b[43m(\u001b[49m\u001b[43mformatter\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mto_csv\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 3962\u001b[0m \u001b[43m \u001b[49m\u001b[43mpath_or_buf\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3963\u001b[0m \u001b[43m \u001b[49m\u001b[43mlineterminator\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mlineterminator\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3964\u001b[0m \u001b[43m 
\u001b[49m\u001b[43msep\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43msep\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3965\u001b[0m \u001b[43m \u001b[49m\u001b[43mencoding\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mencoding\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3966\u001b[0m \u001b[43m \u001b[49m\u001b[43merrors\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43merrors\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3967\u001b[0m \u001b[43m \u001b[49m\u001b[43mcompression\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcompression\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3968\u001b[0m \u001b[43m \u001b[49m\u001b[43mquoting\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mquoting\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3969\u001b[0m \u001b[43m \u001b[49m\u001b[43mcolumns\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcolumns\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3970\u001b[0m \u001b[43m \u001b[49m\u001b[43mindex_label\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mindex_label\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3971\u001b[0m \u001b[43m \u001b[49m\u001b[43mmode\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmode\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3972\u001b[0m \u001b[43m \u001b[49m\u001b[43mchunksize\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mchunksize\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3973\u001b[0m \u001b[43m \u001b[49m\u001b[43mquotechar\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mquotechar\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3974\u001b[0m \u001b[43m \u001b[49m\u001b[43mdate_format\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdate_format\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3975\u001b[0m \u001b[43m \u001b[49m\u001b[43mdoublequote\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdoublequote\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3976\u001b[0m \u001b[43m 
\u001b[49m\u001b[43mescapechar\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mescapechar\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3977\u001b[0m \u001b[43m \u001b[49m\u001b[43mstorage_options\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mstorage_options\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3978\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/pandas/io/formats/format.py:1014\u001b[0m, in \u001b[0;36mDataFrameRenderer.to_csv\u001b[0;34m(self, path_or_buf, encoding, sep, columns, index_label, mode, compression, quoting, quotechar, lineterminator, chunksize, date_format, doublequote, escapechar, errors, storage_options)\u001b[0m\n\u001b[1;32m 993\u001b[0m created_buffer \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mFalse\u001b[39;00m\n\u001b[1;32m 995\u001b[0m csv_formatter \u001b[38;5;241m=\u001b[39m CSVFormatter(\n\u001b[1;32m 996\u001b[0m path_or_buf\u001b[38;5;241m=\u001b[39mpath_or_buf,\n\u001b[1;32m 997\u001b[0m lineterminator\u001b[38;5;241m=\u001b[39mlineterminator,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 1012\u001b[0m formatter\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mfmt,\n\u001b[1;32m 1013\u001b[0m )\n\u001b[0;32m-> 1014\u001b[0m \u001b[43mcsv_formatter\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43msave\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 1016\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m created_buffer:\n\u001b[1;32m 1017\u001b[0m \u001b[38;5;28;01massert\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(path_or_buf, StringIO)\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/pandas/io/formats/csvs.py:251\u001b[0m, in \u001b[0;36mCSVFormatter.save\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 247\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 248\u001b[0m \u001b[38;5;124;03mCreate the writer & 
save.\u001b[39;00m\n\u001b[1;32m 249\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 250\u001b[0m \u001b[38;5;66;03m# apply compression and byte/text conversion\u001b[39;00m\n\u001b[0;32m--> 251\u001b[0m \u001b[38;5;28;01mwith\u001b[39;00m \u001b[43mget_handle\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 252\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfilepath_or_buffer\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 253\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmode\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 254\u001b[0m \u001b[43m \u001b[49m\u001b[43mencoding\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mencoding\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 255\u001b[0m \u001b[43m \u001b[49m\u001b[43merrors\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43merrors\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 256\u001b[0m \u001b[43m \u001b[49m\u001b[43mcompression\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mcompression\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 257\u001b[0m \u001b[43m \u001b[49m\u001b[43mstorage_options\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mstorage_options\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 258\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m \u001b[38;5;28;01mas\u001b[39;00m handles:\n\u001b[1;32m 259\u001b[0m \u001b[38;5;66;03m# Note: self.encoding is irrelevant here\u001b[39;00m\n\u001b[1;32m 260\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mwriter \u001b[38;5;241m=\u001b[39m csvlib\u001b[38;5;241m.\u001b[39mwriter(\n\u001b[1;32m 261\u001b[0m 
handles\u001b[38;5;241m.\u001b[39mhandle,\n\u001b[1;32m 262\u001b[0m lineterminator\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mlineterminator,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 267\u001b[0m quotechar\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mquotechar,\n\u001b[1;32m 268\u001b[0m )\n\u001b[1;32m 270\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_save()\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/pandas/io/common.py:749\u001b[0m, in \u001b[0;36mget_handle\u001b[0;34m(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)\u001b[0m\n\u001b[1;32m 747\u001b[0m \u001b[38;5;66;03m# Only for write methods\u001b[39;00m\n\u001b[1;32m 748\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mr\u001b[39m\u001b[38;5;124m\"\u001b[39m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;129;01min\u001b[39;00m mode \u001b[38;5;129;01mand\u001b[39;00m is_path:\n\u001b[0;32m--> 749\u001b[0m \u001b[43mcheck_parent_directory\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mstr\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43mhandle\u001b[49m\u001b[43m)\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 751\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m compression:\n\u001b[1;32m 752\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m compression \u001b[38;5;241m!=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mzstd\u001b[39m\u001b[38;5;124m\"\u001b[39m:\n\u001b[1;32m 753\u001b[0m \u001b[38;5;66;03m# compression libraries do not like an explicit text-mode\u001b[39;00m\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/pandas/io/common.py:616\u001b[0m, in \u001b[0;36mcheck_parent_directory\u001b[0;34m(path)\u001b[0m\n\u001b[1;32m 614\u001b[0m parent \u001b[38;5;241m=\u001b[39m Path(path)\u001b[38;5;241m.\u001b[39mparent\n\u001b[1;32m 615\u001b[0m 
\u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m parent\u001b[38;5;241m.\u001b[39mis_dir():\n\u001b[0;32m--> 616\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mOSError\u001b[39;00m(\u001b[38;5;124mrf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mCannot save file into a non-existent directory: \u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mparent\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n", "\u001b[0;31mOSError\u001b[0m: Cannot save file into a non-existent directory: '/content/drive/MyDrive/misinformation-data'" ] } ], "source": [ "image_df.to_csv(\"/content/drive/MyDrive/misinformation-data/data_out.csv\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# The detector modules\n", "The different detector modules with their options are explained in more detail in this section.\n", "## Text detector\n", "Text on the images can be extracted using the `TextDetector` class (`text` module). The text is initially extracted using the Google Cloud Vision API and then translated into English with googletrans. The translated text is cleaned of whitespace, linebreaks, and numbers using Python syntax and spaCy. \n", "\n", "\n", "\n", "To additionally summarize the text and analyze it for sentiment and named entities, set the keyword `analyse_text` to `True` (the default is `False`). If set, the transformers pipeline is used for each of these tasks, with the default models as of 03/2023. Other models can be selected by setting the optional keyword `model_names` to a list of selected models, one for each task: `model_names=[\"sshleifer/distilbart-cnn-12-6\", \"distilbert-base-uncased-finetuned-sst-2-english\", \"dbmdz/bert-large-cased-finetuned-conll03-english\"]` for summary, sentiment, and NER. 
To be even more specific, revision numbers can also be selected by setting the optional keyword `revision_numbers` to a list of revision numbers, one for each model, for example `revision_numbers=[\"a4f8f3e\", \"af0f99b\", \"f2482bf\"]`. \n", "\n", "Please note that for the Google Cloud Vision API (the TextDetector class) you need to set a key in order to process the images. This key is ideally set as an environment variable, for example:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:50:25.032719Z", "iopub.status.busy": "2024-02-19T08:50:25.032386Z", "iopub.status.idle": "2024-02-19T08:50:25.035336Z", "shell.execute_reply": "2024-02-19T08:50:25.034770Z" } }, "outputs": [], "source": [ "# os.environ[\"GOOGLE_APPLICATION_CREDENTIALS\"] = \"/content/drive/MyDrive/misinformation-data/misinformation-campaign-981aa55a3b13.json\"\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "where you place the key on your Google Drive if running on Colab, or place it in a local folder on your machine.\n", "\n", "In summary, text detection is carried out using the following method call and keywords, where `analyse_text`, `model_names`, and `revision_numbers` are optional:\n" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:50:25.037648Z", "iopub.status.busy": "2024-02-19T08:50:25.037344Z", "iopub.status.idle": "2024-02-19T08:51:21.184249Z", "shell.execute_reply": "2024-02-19T08:51:21.183549Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\r", " 0%| | 0/6 [00:00\n", "\n", "This module is based on the [LAVIS](https://github.com/salesforce/LAVIS) library. Since the models can be quite large, an initial object is created which will load the necessary models into RAM/VRAM and then use them in the analysis. The user can specify the type of analysis to be performed using the `analysis_type` keyword. 
Setting it to `summary` will generate a caption (summary); `questions` will prepare answers (VQA) to a list of questions set by the user; `summary_and_questions` will do both. Note that the desired analysis type needs to be set here in the initialization of the \n", "detector object, and not when running the analysis for each image; the same holds true for the selected model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The implemented models are listed below.\n", "\n", "| input model name | model |\n", "| ---------------- | ----- |\n", "| base | BLIP image captioning base, ViT-B/16, pretrained on COCO dataset |\n", "| large | BLIP image captioning large, ViT-L/16, pretrained on COCO dataset |\n", "| vqa | BLIP base model fine-tuned on VQA v2.0 dataset |\n", "| blip2_t5_pretrain_flant5xxl | BLIP2 pretrained on FlanT5XXL | \n", "| blip2_t5_pretrain_flant5xl | BLIP2 pretrained on FlanT5XL | \n", "| blip2_t5_caption_coco_flant5xl | BLIP2 pretrained on FlanT5XL, fine-tuned on COCO | \n", "| blip2_opt_pretrain_opt2.7b | BLIP2 pretrained on OPT-2.7b |\n", "| blip2_opt_pretrain_opt6.7b | BLIP2 pretrained on OPT-6.7b | \n", "| blip2_opt_caption_coco_opt2.7b | BLIP2 pretrained on OPT-2.7b, fine-tuned on COCO | \n", "| blip2_opt_caption_coco_opt6.7b | BLIP2 pretrained on OPT-6.7b, fine-tuned on COCO |\n", "\n", "Please note that the `base`, `large` and `vqa` models can be run on the basic GPU runtime in Google Colab.\n", "To run any of the advanced `BLIP2` models you need more than 20 GB of video memory, so you need to connect to a paid A100 in Google Colab." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we can run only the `summary` analysis type. You can choose a `base` or a `large` model_type. 
" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:51:21.186991Z", "iopub.status.busy": "2024-02-19T08:51:21.186599Z", "iopub.status.idle": "2024-02-19T08:51:27.937584Z", "shell.execute_reply": "2024-02-19T08:51:27.936965Z" } }, "outputs": [], "source": [ "image_summary_detector = ammico.SummaryDetector(image_dict, analysis_type=\"summary\", model_type=\"base\")" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:51:27.945516Z", "iopub.status.busy": "2024-02-19T08:51:27.945076Z", "iopub.status.idle": "2024-02-19T08:52:37.388418Z", "shell.execute_reply": "2024-02-19T08:52:37.387691Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\r", " 0%| | 0/6 [00:00 1\u001b[0m image_summary_vqa_detector \u001b[38;5;241m=\u001b[39m \u001b[43mammico\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mSummaryDetector\u001b[49m\u001b[43m(\u001b[49m\u001b[43mimage_dict\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43manalysis_type\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mquestions\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\n\u001b[1;32m 2\u001b[0m \u001b[43m \u001b[49m\u001b[43mmodel_type\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mvqa\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n\u001b[1;32m 4\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m num, key \u001b[38;5;129;01min\u001b[39;00m tqdm(\u001b[38;5;28menumerate\u001b[39m(image_dict\u001b[38;5;241m.\u001b[39mkeys()),total\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mlen\u001b[39m(image_dict)):\n\u001b[1;32m 5\u001b[0m image_dict[key] \u001b[38;5;241m=\u001b[39m image_summary_vqa_detector\u001b[38;5;241m.\u001b[39manalyse_image(subdict\u001b[38;5;241m=\u001b[39mimage_dict[key], \n\u001b[1;32m 
6\u001b[0m analysis_type\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mquestions\u001b[39m\u001b[38;5;124m\"\u001b[39m, \n\u001b[1;32m 7\u001b[0m list_of_questions \u001b[38;5;241m=\u001b[39m list_of_questions)\n", "File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/summary.py:141\u001b[0m, in \u001b[0;36mSummaryDetector.__init__\u001b[0;34m(self, subdict, model_type, analysis_type, list_of_questions, summary_model, summary_vis_processors, summary_vqa_model, summary_vqa_vis_processors, summary_vqa_txt_processors, summary_vqa_model_new, summary_vqa_vis_processors_new, summary_vqa_txt_processors_new, device_type)\u001b[0m\n\u001b[1;32m 127\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msummary_vis_processors \u001b[38;5;241m=\u001b[39m summary_vis_processors\n\u001b[1;32m 128\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m (\n\u001b[1;32m 129\u001b[0m model_type \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mallowed_model_types\n\u001b[1;32m 130\u001b[0m \u001b[38;5;129;01mand\u001b[39;00m (summary_vqa_model \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m)\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 135\u001b[0m )\n\u001b[1;32m 136\u001b[0m ):\n\u001b[1;32m 137\u001b[0m (\n\u001b[1;32m 138\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msummary_vqa_model,\n\u001b[1;32m 139\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msummary_vqa_vis_processors,\n\u001b[1;32m 140\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msummary_vqa_txt_processors,\n\u001b[0;32m--> 141\u001b[0m ) \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mload_vqa_model\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 142\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 143\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msummary_vqa_model 
\u001b[38;5;241m=\u001b[39m summary_vqa_model\n", "File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/summary.py:232\u001b[0m, in \u001b[0;36mSummaryDetector.load_vqa_model\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 216\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mload_vqa_model\u001b[39m(\u001b[38;5;28mself\u001b[39m):\n\u001b[1;32m 217\u001b[0m \u001b[38;5;250m \u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 218\u001b[0m \u001b[38;5;124;03m Load blip_vqa model and preprocessors for visual and text inputs from lavis.models.\u001b[39;00m\n\u001b[1;32m 219\u001b[0m \n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 226\u001b[0m \n\u001b[1;32m 227\u001b[0m \u001b[38;5;124;03m \"\"\"\u001b[39;00m\n\u001b[1;32m 228\u001b[0m (\n\u001b[1;32m 229\u001b[0m summary_vqa_model,\n\u001b[1;32m 230\u001b[0m summary_vqa_vis_processors,\n\u001b[1;32m 231\u001b[0m summary_vqa_txt_processors,\n\u001b[0;32m--> 232\u001b[0m ) \u001b[38;5;241m=\u001b[39m \u001b[43mload_model_and_preprocess\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 233\u001b[0m \u001b[43m \u001b[49m\u001b[43mname\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mblip_vqa\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 234\u001b[0m \u001b[43m \u001b[49m\u001b[43mmodel_type\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mvqav2\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 235\u001b[0m \u001b[43m \u001b[49m\u001b[43mis_eval\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\u001b[43m,\u001b[49m\n\u001b[1;32m 236\u001b[0m \u001b[43m \u001b[49m\u001b[43mdevice\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43msummary_device\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 237\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 
238\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m summary_vqa_model, summary_vqa_vis_processors, summary_vqa_txt_processors\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/__init__.py:195\u001b[0m, in \u001b[0;36mload_model_and_preprocess\u001b[0;34m(name, model_type, is_eval, device)\u001b[0m\n\u001b[1;32m 192\u001b[0m model_cls \u001b[38;5;241m=\u001b[39m registry\u001b[38;5;241m.\u001b[39mget_model_class(name)\n\u001b[1;32m 194\u001b[0m \u001b[38;5;66;03m# load model\u001b[39;00m\n\u001b[0;32m--> 195\u001b[0m model \u001b[38;5;241m=\u001b[39m \u001b[43mmodel_cls\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfrom_pretrained\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmodel_type\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmodel_type\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 197\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_eval:\n\u001b[1;32m 198\u001b[0m model\u001b[38;5;241m.\u001b[39meval()\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/base_model.py:70\u001b[0m, in \u001b[0;36mBaseModel.from_pretrained\u001b[0;34m(cls, model_type)\u001b[0m\n\u001b[1;32m 60\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 61\u001b[0m \u001b[38;5;124;03mBuild a pretrained model from default configuration file, specified by model_type.\u001b[39;00m\n\u001b[1;32m 62\u001b[0m \n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 67\u001b[0m \u001b[38;5;124;03m - model (nn.Module): pretrained or finetuned model, depending on the configuration.\u001b[39;00m\n\u001b[1;32m 68\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 69\u001b[0m model_cfg \u001b[38;5;241m=\u001b[39m OmegaConf\u001b[38;5;241m.\u001b[39mload(\u001b[38;5;28mcls\u001b[39m\u001b[38;5;241m.\u001b[39mdefault_config_path(model_type))\u001b[38;5;241m.\u001b[39mmodel\n\u001b[0;32m---> 70\u001b[0m model \u001b[38;5;241m=\u001b[39m 
\u001b[38;5;28;43mcls\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfrom_config\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmodel_cfg\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 72\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m model\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/blip_models/blip_vqa.py:373\u001b[0m, in \u001b[0;36mBlipVQA.from_config\u001b[0;34m(cls, cfg)\u001b[0m\n\u001b[1;32m 364\u001b[0m max_txt_len \u001b[38;5;241m=\u001b[39m cfg\u001b[38;5;241m.\u001b[39mget(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmax_txt_len\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;241m35\u001b[39m)\n\u001b[1;32m 366\u001b[0m model \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mcls\u001b[39m(\n\u001b[1;32m 367\u001b[0m image_encoder\u001b[38;5;241m=\u001b[39mimage_encoder,\n\u001b[1;32m 368\u001b[0m text_encoder\u001b[38;5;241m=\u001b[39mtext_encoder,\n\u001b[1;32m 369\u001b[0m text_decoder\u001b[38;5;241m=\u001b[39mtext_decoder,\n\u001b[1;32m 370\u001b[0m max_txt_len\u001b[38;5;241m=\u001b[39mmax_txt_len,\n\u001b[1;32m 371\u001b[0m )\n\u001b[0;32m--> 373\u001b[0m \u001b[43mmodel\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mload_checkpoint_from_config\u001b[49m\u001b[43m(\u001b[49m\u001b[43mcfg\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 375\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m model\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/base_model.py:95\u001b[0m, in \u001b[0;36mBaseModel.load_checkpoint_from_config\u001b[0;34m(self, cfg, **kwargs)\u001b[0m\n\u001b[1;32m 91\u001b[0m finetune_path \u001b[38;5;241m=\u001b[39m cfg\u001b[38;5;241m.\u001b[39mget(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mfinetuned\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;28;01mNone\u001b[39;00m)\n\u001b[1;32m 92\u001b[0m \u001b[38;5;28;01massert\u001b[39;00m (\n\u001b[1;32m 93\u001b[0m finetune_path \u001b[38;5;129;01mis\u001b[39;00m 
\u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[1;32m 94\u001b[0m ), \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mFound load_finetuned is True, but finetune_path is None.\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[0;32m---> 95\u001b[0m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mload_checkpoint\u001b[49m\u001b[43m(\u001b[49m\u001b[43murl_or_filename\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mfinetune_path\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 96\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 97\u001b[0m \u001b[38;5;66;03m# load pre-trained weights\u001b[39;00m\n\u001b[1;32m 98\u001b[0m pretrain_path \u001b[38;5;241m=\u001b[39m cfg\u001b[38;5;241m.\u001b[39mget(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mpretrained\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;28;01mNone\u001b[39;00m)\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/base_model.py:37\u001b[0m, in \u001b[0;36mBaseModel.load_checkpoint\u001b[0;34m(self, url_or_filename)\u001b[0m\n\u001b[1;32m 30\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 31\u001b[0m \u001b[38;5;124;03mLoad from a finetuned checkpoint.\u001b[39;00m\n\u001b[1;32m 32\u001b[0m \n\u001b[1;32m 33\u001b[0m \u001b[38;5;124;03mThis should expect no mismatch in the model keys and the checkpoint keys.\u001b[39;00m\n\u001b[1;32m 34\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 36\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_url(url_or_filename):\n\u001b[0;32m---> 37\u001b[0m cached_file \u001b[38;5;241m=\u001b[39m \u001b[43mdownload_cached_file\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 38\u001b[0m \u001b[43m \u001b[49m\u001b[43murl_or_filename\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcheck_hash\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mFalse\u001b[39;49;00m\u001b[43m,\u001b[49m\u001b[43m 
\u001b[49m\u001b[43mprogress\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\n\u001b[1;32m 39\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 40\u001b[0m checkpoint \u001b[38;5;241m=\u001b[39m torch\u001b[38;5;241m.\u001b[39mload(cached_file, map_location\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mcpu\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 41\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m os\u001b[38;5;241m.\u001b[39mpath\u001b[38;5;241m.\u001b[39misfile(url_or_filename):\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/common/dist_utils.py:132\u001b[0m, in \u001b[0;36mdownload_cached_file\u001b[0;34m(url, check_hash, progress)\u001b[0m\n\u001b[1;32m 129\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m cached_file\n\u001b[1;32m 131\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_main_process():\n\u001b[0;32m--> 132\u001b[0m \u001b[43mtimm_hub\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mdownload_cached_file\u001b[49m\u001b[43m(\u001b[49m\u001b[43murl\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcheck_hash\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mprogress\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 134\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_dist_avail_and_initialized():\n\u001b[1;32m 135\u001b[0m dist\u001b[38;5;241m.\u001b[39mbarrier()\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/timm/models/hub.py:51\u001b[0m, in \u001b[0;36mdownload_cached_file\u001b[0;34m(url, check_hash, progress)\u001b[0m\n\u001b[1;32m 49\u001b[0m r \u001b[38;5;241m=\u001b[39m HASH_REGEX\u001b[38;5;241m.\u001b[39msearch(filename) \u001b[38;5;66;03m# r is Optional[Match[str]]\u001b[39;00m\n\u001b[1;32m 50\u001b[0m hash_prefix \u001b[38;5;241m=\u001b[39m r\u001b[38;5;241m.\u001b[39mgroup(\u001b[38;5;241m1\u001b[39m) \u001b[38;5;28;01mif\u001b[39;00m r 
\u001b[38;5;28;01melse\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[0;32m---> 51\u001b[0m \u001b[43mdownload_url_to_file\u001b[49m\u001b[43m(\u001b[49m\u001b[43murl\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcached_file\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mhash_prefix\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mprogress\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mprogress\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 52\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m cached_file\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/torch/hub.py:636\u001b[0m, in \u001b[0;36mdownload_url_to_file\u001b[0;34m(url, dst, hash_prefix, progress)\u001b[0m\n\u001b[1;32m 634\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(buffer) \u001b[38;5;241m==\u001b[39m \u001b[38;5;241m0\u001b[39m:\n\u001b[1;32m 635\u001b[0m \u001b[38;5;28;01mbreak\u001b[39;00m\n\u001b[0;32m--> 636\u001b[0m \u001b[43mf\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mwrite\u001b[49m\u001b[43m(\u001b[49m\u001b[43mbuffer\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 637\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m hash_prefix \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m 638\u001b[0m sha256\u001b[38;5;241m.\u001b[39mupdate(buffer)\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/tempfile.py:478\u001b[0m, in \u001b[0;36m_TemporaryFileWrapper.__getattr__..func_wrapper\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 476\u001b[0m \u001b[38;5;129m@_functools\u001b[39m\u001b[38;5;241m.\u001b[39mwraps(func)\n\u001b[1;32m 477\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mfunc_wrapper\u001b[39m(\u001b[38;5;241m*\u001b[39margs, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs):\n\u001b[0;32m--> 478\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m 
\u001b[43mfunc\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n", "\u001b[0;31mOSError\u001b[0m: [Errno 28] No space left on device" ] } ], "source": [ "image_summary_vqa_detector = ammico.SummaryDetector(image_dict, analysis_type=\"questions\", \n", "                                                    model_type=\"vqa\")\n", "\n", "for num, key in tqdm(enumerate(image_dict.keys()),total=len(image_dict)):\n", "    image_dict[key] = image_summary_vqa_detector.analyse_image(subdict=image_dict[key], \n", "                                                               analysis_type=\"questions\", \n", "                                                               list_of_questions = list_of_questions)\n", "    # dump every N images and after the last image; `or` is needed here because `|` binds tighter than `==`\n", "    if num % dump_every == 0 or num == len(image_dict) - 1: \n", "        image_df = ammico.get_dataframe(image_dict)\n", "        image_df.to_csv(dump_file)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, you can set the analysis type to `summary_and_questions`; then both a caption and answers to the questions are generated for each image. In this case, you can choose a `base` or a `large` model_type. 
" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:52:56.408315Z", "iopub.status.busy": "2024-02-19T08:52:56.407932Z", "iopub.status.idle": "2024-02-19T08:53:18.175891Z", "shell.execute_reply": "2024-02-19T08:53:18.175071Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\r", " 0%| | 0.00/1.35G [00:00 1\u001b[0m image_summary_vqa_detector \u001b[38;5;241m=\u001b[39m \u001b[43mammico\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mSummaryDetector\u001b[49m\u001b[43m(\u001b[49m\u001b[43mimage_dict\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43manalysis_type\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43msummary_and_questions\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\n\u001b[1;32m 2\u001b[0m \u001b[43m \u001b[49m\u001b[43mmodel_type\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mbase\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n\u001b[1;32m 3\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m num, key \u001b[38;5;129;01min\u001b[39;00m tqdm(\u001b[38;5;28menumerate\u001b[39m(image_dict\u001b[38;5;241m.\u001b[39mkeys()),total\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mlen\u001b[39m(image_dict)):\n\u001b[1;32m 4\u001b[0m image_dict[key] \u001b[38;5;241m=\u001b[39m image_summary_vqa_detector\u001b[38;5;241m.\u001b[39manalyse_image(subdict\u001b[38;5;241m=\u001b[39mimage_dict[key], \n\u001b[1;32m 5\u001b[0m analysis_type\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124msummary_and_questions\u001b[39m\u001b[38;5;124m\"\u001b[39m, \n\u001b[1;32m 6\u001b[0m list_of_questions \u001b[38;5;241m=\u001b[39m list_of_questions)\n", "File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/summary.py:141\u001b[0m, in \u001b[0;36mSummaryDetector.__init__\u001b[0;34m(self, subdict, model_type, analysis_type, 
list_of_questions, summary_model, summary_vis_processors, summary_vqa_model, summary_vqa_vis_processors, summary_vqa_txt_processors, summary_vqa_model_new, summary_vqa_vis_processors_new, summary_vqa_txt_processors_new, device_type)\u001b[0m\n\u001b[1;32m 127\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msummary_vis_processors \u001b[38;5;241m=\u001b[39m summary_vis_processors\n\u001b[1;32m 128\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m (\n\u001b[1;32m 129\u001b[0m model_type \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mallowed_model_types\n\u001b[1;32m 130\u001b[0m \u001b[38;5;129;01mand\u001b[39;00m (summary_vqa_model \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m)\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 135\u001b[0m )\n\u001b[1;32m 136\u001b[0m ):\n\u001b[1;32m 137\u001b[0m (\n\u001b[1;32m 138\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msummary_vqa_model,\n\u001b[1;32m 139\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msummary_vqa_vis_processors,\n\u001b[1;32m 140\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msummary_vqa_txt_processors,\n\u001b[0;32m--> 141\u001b[0m ) \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mload_vqa_model\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 142\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 143\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msummary_vqa_model \u001b[38;5;241m=\u001b[39m summary_vqa_model\n", "File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/summary.py:232\u001b[0m, in \u001b[0;36mSummaryDetector.load_vqa_model\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 216\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mload_vqa_model\u001b[39m(\u001b[38;5;28mself\u001b[39m):\n\u001b[1;32m 217\u001b[0m \u001b[38;5;250m 
\u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 218\u001b[0m \u001b[38;5;124;03m Load blip_vqa model and preprocessors for visual and text inputs from lavis.models.\u001b[39;00m\n\u001b[1;32m 219\u001b[0m \n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 226\u001b[0m \n\u001b[1;32m 227\u001b[0m \u001b[38;5;124;03m \"\"\"\u001b[39;00m\n\u001b[1;32m 228\u001b[0m (\n\u001b[1;32m 229\u001b[0m summary_vqa_model,\n\u001b[1;32m 230\u001b[0m summary_vqa_vis_processors,\n\u001b[1;32m 231\u001b[0m summary_vqa_txt_processors,\n\u001b[0;32m--> 232\u001b[0m ) \u001b[38;5;241m=\u001b[39m \u001b[43mload_model_and_preprocess\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 233\u001b[0m \u001b[43m \u001b[49m\u001b[43mname\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mblip_vqa\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 234\u001b[0m \u001b[43m \u001b[49m\u001b[43mmodel_type\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mvqav2\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 235\u001b[0m \u001b[43m \u001b[49m\u001b[43mis_eval\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\u001b[43m,\u001b[49m\n\u001b[1;32m 236\u001b[0m \u001b[43m \u001b[49m\u001b[43mdevice\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43msummary_device\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 237\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 238\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m summary_vqa_model, summary_vqa_vis_processors, summary_vqa_txt_processors\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/__init__.py:195\u001b[0m, in \u001b[0;36mload_model_and_preprocess\u001b[0;34m(name, model_type, is_eval, device)\u001b[0m\n\u001b[1;32m 192\u001b[0m 
model_cls \u001b[38;5;241m=\u001b[39m registry\u001b[38;5;241m.\u001b[39mget_model_class(name)\n\u001b[1;32m 194\u001b[0m \u001b[38;5;66;03m# load model\u001b[39;00m\n\u001b[0;32m--> 195\u001b[0m model \u001b[38;5;241m=\u001b[39m \u001b[43mmodel_cls\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfrom_pretrained\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmodel_type\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmodel_type\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 197\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_eval:\n\u001b[1;32m 198\u001b[0m model\u001b[38;5;241m.\u001b[39meval()\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/base_model.py:70\u001b[0m, in \u001b[0;36mBaseModel.from_pretrained\u001b[0;34m(cls, model_type)\u001b[0m\n\u001b[1;32m 60\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 61\u001b[0m \u001b[38;5;124;03mBuild a pretrained model from default configuration file, specified by model_type.\u001b[39;00m\n\u001b[1;32m 62\u001b[0m \n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 67\u001b[0m \u001b[38;5;124;03m - model (nn.Module): pretrained or finetuned model, depending on the configuration.\u001b[39;00m\n\u001b[1;32m 68\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 69\u001b[0m model_cfg \u001b[38;5;241m=\u001b[39m OmegaConf\u001b[38;5;241m.\u001b[39mload(\u001b[38;5;28mcls\u001b[39m\u001b[38;5;241m.\u001b[39mdefault_config_path(model_type))\u001b[38;5;241m.\u001b[39mmodel\n\u001b[0;32m---> 70\u001b[0m model \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mcls\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfrom_config\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmodel_cfg\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 72\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m model\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/blip_models/blip_vqa.py:373\u001b[0m, in 
\u001b[0;36mBlipVQA.from_config\u001b[0;34m(cls, cfg)\u001b[0m\n\u001b[1;32m 364\u001b[0m max_txt_len \u001b[38;5;241m=\u001b[39m cfg\u001b[38;5;241m.\u001b[39mget(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmax_txt_len\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;241m35\u001b[39m)\n\u001b[1;32m 366\u001b[0m model \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mcls\u001b[39m(\n\u001b[1;32m 367\u001b[0m image_encoder\u001b[38;5;241m=\u001b[39mimage_encoder,\n\u001b[1;32m 368\u001b[0m text_encoder\u001b[38;5;241m=\u001b[39mtext_encoder,\n\u001b[1;32m 369\u001b[0m text_decoder\u001b[38;5;241m=\u001b[39mtext_decoder,\n\u001b[1;32m 370\u001b[0m max_txt_len\u001b[38;5;241m=\u001b[39mmax_txt_len,\n\u001b[1;32m 371\u001b[0m )\n\u001b[0;32m--> 373\u001b[0m \u001b[43mmodel\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mload_checkpoint_from_config\u001b[49m\u001b[43m(\u001b[49m\u001b[43mcfg\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 375\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m model\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/base_model.py:95\u001b[0m, in \u001b[0;36mBaseModel.load_checkpoint_from_config\u001b[0;34m(self, cfg, **kwargs)\u001b[0m\n\u001b[1;32m 91\u001b[0m finetune_path \u001b[38;5;241m=\u001b[39m cfg\u001b[38;5;241m.\u001b[39mget(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mfinetuned\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;28;01mNone\u001b[39;00m)\n\u001b[1;32m 92\u001b[0m \u001b[38;5;28;01massert\u001b[39;00m (\n\u001b[1;32m 93\u001b[0m finetune_path \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[1;32m 94\u001b[0m ), \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mFound load_finetuned is True, but finetune_path is None.\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[0;32m---> 95\u001b[0m 
\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mload_checkpoint\u001b[49m\u001b[43m(\u001b[49m\u001b[43murl_or_filename\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mfinetune_path\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 96\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 97\u001b[0m \u001b[38;5;66;03m# load pre-trained weights\u001b[39;00m\n\u001b[1;32m 98\u001b[0m pretrain_path \u001b[38;5;241m=\u001b[39m cfg\u001b[38;5;241m.\u001b[39mget(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mpretrained\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;28;01mNone\u001b[39;00m)\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/base_model.py:37\u001b[0m, in \u001b[0;36mBaseModel.load_checkpoint\u001b[0;34m(self, url_or_filename)\u001b[0m\n\u001b[1;32m 30\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 31\u001b[0m \u001b[38;5;124;03mLoad from a finetuned checkpoint.\u001b[39;00m\n\u001b[1;32m 32\u001b[0m \n\u001b[1;32m 33\u001b[0m \u001b[38;5;124;03mThis should expect no mismatch in the model keys and the checkpoint keys.\u001b[39;00m\n\u001b[1;32m 34\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 36\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_url(url_or_filename):\n\u001b[0;32m---> 37\u001b[0m cached_file \u001b[38;5;241m=\u001b[39m \u001b[43mdownload_cached_file\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 38\u001b[0m \u001b[43m \u001b[49m\u001b[43murl_or_filename\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcheck_hash\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mFalse\u001b[39;49;00m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mprogress\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\n\u001b[1;32m 39\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 40\u001b[0m checkpoint \u001b[38;5;241m=\u001b[39m 
torch\u001b[38;5;241m.\u001b[39mload(cached_file, map_location\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mcpu\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 41\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m os\u001b[38;5;241m.\u001b[39mpath\u001b[38;5;241m.\u001b[39misfile(url_or_filename):\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/common/dist_utils.py:132\u001b[0m, in \u001b[0;36mdownload_cached_file\u001b[0;34m(url, check_hash, progress)\u001b[0m\n\u001b[1;32m 129\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m cached_file\n\u001b[1;32m 131\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_main_process():\n\u001b[0;32m--> 132\u001b[0m \u001b[43mtimm_hub\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mdownload_cached_file\u001b[49m\u001b[43m(\u001b[49m\u001b[43murl\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcheck_hash\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mprogress\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 134\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_dist_avail_and_initialized():\n\u001b[1;32m 135\u001b[0m dist\u001b[38;5;241m.\u001b[39mbarrier()\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/timm/models/hub.py:51\u001b[0m, in \u001b[0;36mdownload_cached_file\u001b[0;34m(url, check_hash, progress)\u001b[0m\n\u001b[1;32m 49\u001b[0m r \u001b[38;5;241m=\u001b[39m HASH_REGEX\u001b[38;5;241m.\u001b[39msearch(filename) \u001b[38;5;66;03m# r is Optional[Match[str]]\u001b[39;00m\n\u001b[1;32m 50\u001b[0m hash_prefix \u001b[38;5;241m=\u001b[39m r\u001b[38;5;241m.\u001b[39mgroup(\u001b[38;5;241m1\u001b[39m) \u001b[38;5;28;01mif\u001b[39;00m r \u001b[38;5;28;01melse\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[0;32m---> 51\u001b[0m \u001b[43mdownload_url_to_file\u001b[49m\u001b[43m(\u001b[49m\u001b[43murl\u001b[49m\u001b[43m,\u001b[49m\u001b[43m 
\u001b[49m\u001b[43mcached_file\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mhash_prefix\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mprogress\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mprogress\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 52\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m cached_file\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/torch/hub.py:636\u001b[0m, in \u001b[0;36mdownload_url_to_file\u001b[0;34m(url, dst, hash_prefix, progress)\u001b[0m\n\u001b[1;32m 634\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(buffer) \u001b[38;5;241m==\u001b[39m \u001b[38;5;241m0\u001b[39m:\n\u001b[1;32m 635\u001b[0m \u001b[38;5;28;01mbreak\u001b[39;00m\n\u001b[0;32m--> 636\u001b[0m \u001b[43mf\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mwrite\u001b[49m\u001b[43m(\u001b[49m\u001b[43mbuffer\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 637\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m hash_prefix \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m 638\u001b[0m sha256\u001b[38;5;241m.\u001b[39mupdate(buffer)\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/tempfile.py:478\u001b[0m, in \u001b[0;36m_TemporaryFileWrapper.__getattr__..func_wrapper\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 476\u001b[0m \u001b[38;5;129m@_functools\u001b[39m\u001b[38;5;241m.\u001b[39mwraps(func)\n\u001b[1;32m 477\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mfunc_wrapper\u001b[39m(\u001b[38;5;241m*\u001b[39margs, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs):\n\u001b[0;32m--> 478\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mfunc\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m 
\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n", "\u001b[0;31mOSError\u001b[0m: [Errno 28] No space left on device" ] } ], "source": [ "image_summary_vqa_detector = ammico.SummaryDetector(image_dict, analysis_type=\"summary_and_questions\",\n", "    model_type=\"base\")\n", "for num, key in tqdm(enumerate(image_dict.keys()), total=len(image_dict)):\n", "    image_dict[key] = image_summary_vqa_detector.analyse_image(subdict=image_dict[key],\n", "        analysis_type=\"summary_and_questions\",\n", "        list_of_questions=list_of_questions)\n", "    # dump intermediate results every `dump_every` images and after the last image\n", "    if num % dump_every == 0 or num == len(image_dict) - 1:\n", "        image_df = ammico.get_dataframe(image_dict)\n", "        image_df.to_csv(dump_file)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The output is given as a dictionary with the following keys and data types:\n", "\n", "| output key | output type | output value |\n", "| ---------- | ----------- | ------------ |\n", "| `const_image_summary` | `str` | when `analysis_type=\"summary\"` or `\"summary_and_questions\"`, constant image caption (does not change upon re-running the analysis for the same model) |\n", "| `3_non-deterministic_summary` | `list[str]` | when `analysis_type=\"summary\"` or `\"summary_and_questions\"`, three different captions generated with different random seeds |\n", "| *a user-defined input question* | `str` | when `analysis_type=\"questions\"` or `\"summary_and_questions\"`, the answer to the user-defined input question |\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### BLIP2 models\n", "These are very heavy models. They require approximately 60 GB of RAM and can use more than 20 GB of GPU memory." 
] }, { "cell_type": "code", "execution_count": 23, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:53:18.184352Z", "iopub.status.busy": "2024-02-19T08:53:18.183915Z", "iopub.status.idle": "2024-02-19T08:53:49.441778Z", "shell.execute_reply": "2024-02-19T08:53:49.440627Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\r", " 0%| | 0.00/1.89G [00:00 1\u001b[0m obj \u001b[38;5;241m=\u001b[39m \u001b[43mammico\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mSummaryDetector\u001b[49m\u001b[43m(\u001b[49m\u001b[43msubdict\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mimage_dict\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43manalysis_type\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43msummary_and_questions\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mmodel_type\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mblip2_t5_caption_coco_flant5xl\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n\u001b[1;32m 2\u001b[0m \u001b[38;5;66;03m# list of the new models that can be used:\u001b[39;00m\n\u001b[1;32m 3\u001b[0m \u001b[38;5;66;03m# \"blip2_t5_pretrain_flant5xxl\",\u001b[39;00m\n\u001b[1;32m 4\u001b[0m \u001b[38;5;66;03m# \"blip2_t5_pretrain_flant5xl\",\u001b[39;00m\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 14\u001b[0m \n\u001b[1;32m 15\u001b[0m \u001b[38;5;66;03m#also you can perform all calculation on cpu if you set device_type= \"cpu\" or gpu if you set device_type= \"cuda\"\u001b[39;00m\n", "File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/summary.py:156\u001b[0m, in \u001b[0;36mSummaryDetector.__init__\u001b[0;34m(self, subdict, model_type, analysis_type, list_of_questions, summary_model, summary_vis_processors, summary_vqa_model, summary_vqa_vis_processors, 
summary_vqa_txt_processors, summary_vqa_model_new, summary_vqa_vis_processors_new, summary_vqa_txt_processors_new, device_type)\u001b[0m\n\u001b[1;32m 145\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msummary_vqa_txt_processors \u001b[38;5;241m=\u001b[39m summary_vqa_txt_processors\n\u001b[1;32m 146\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m (\n\u001b[1;32m 147\u001b[0m model_type \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mallowed_new_model_types\n\u001b[1;32m 148\u001b[0m \u001b[38;5;129;01mand\u001b[39;00m (summary_vqa_model_new \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m)\n\u001b[1;32m 149\u001b[0m \u001b[38;5;129;01mand\u001b[39;00m (summary_vqa_vis_processors_new \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m)\n\u001b[1;32m 150\u001b[0m \u001b[38;5;129;01mand\u001b[39;00m (summary_vqa_txt_processors_new \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m)\n\u001b[1;32m 151\u001b[0m ):\n\u001b[1;32m 152\u001b[0m (\n\u001b[1;32m 153\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msummary_vqa_model_new,\n\u001b[1;32m 154\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msummary_vqa_vis_processors_new,\n\u001b[1;32m 155\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msummary_vqa_txt_processors_new,\n\u001b[0;32m--> 156\u001b[0m ) \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mload_new_model\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmodel_type\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmodel_type\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 157\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 158\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msummary_vqa_model_new \u001b[38;5;241m=\u001b[39m summary_vqa_model_new\n", "File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/summary.py:479\u001b[0m, in 
\u001b[0;36mSummaryDetector.load_new_model\u001b[0;34m(self, model_type)\u001b[0m\n\u001b[1;32m 455\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 456\u001b[0m \u001b[38;5;124;03mLoad new BLIP2 models.\u001b[39;00m\n\u001b[1;32m 457\u001b[0m \n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 464\u001b[0m \u001b[38;5;124;03m txt_processors (dict): preprocessors for text inputs.\u001b[39;00m\n\u001b[1;32m 465\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 466\u001b[0m select_model \u001b[38;5;241m=\u001b[39m {\n\u001b[1;32m 467\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mblip2_t5_pretrain_flant5xxl\u001b[39m\u001b[38;5;124m\"\u001b[39m: SummaryDetector\u001b[38;5;241m.\u001b[39mload_model_blip2_t5_pretrain_flant5xxl,\n\u001b[1;32m 468\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mblip2_t5_pretrain_flant5xl\u001b[39m\u001b[38;5;124m\"\u001b[39m: SummaryDetector\u001b[38;5;241m.\u001b[39mload_model_blip2_t5_pretrain_flant5xl,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 473\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mblip2_opt_caption_coco_opt6.7b\u001b[39m\u001b[38;5;124m\"\u001b[39m: SummaryDetector\u001b[38;5;241m.\u001b[39mload_model_base_blip2_opt_caption_coco_opt67b,\n\u001b[1;32m 474\u001b[0m }\n\u001b[1;32m 475\u001b[0m (\n\u001b[1;32m 476\u001b[0m summary_vqa_model,\n\u001b[1;32m 477\u001b[0m summary_vqa_vis_processors,\n\u001b[1;32m 478\u001b[0m summary_vqa_txt_processors,\n\u001b[0;32m--> 479\u001b[0m ) \u001b[38;5;241m=\u001b[39m \u001b[43mselect_model\u001b[49m\u001b[43m[\u001b[49m\u001b[43mmodel_type\u001b[49m\u001b[43m]\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m)\u001b[49m\n\u001b[1;32m 480\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m summary_vqa_model, summary_vqa_vis_processors, summary_vqa_txt_processors\n", "File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/summary.py:543\u001b[0m, in 
\u001b[0;36mSummaryDetector.load_model_blip2_t5_caption_coco_flant5xl\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 528\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mload_model_blip2_t5_caption_coco_flant5xl\u001b[39m(\u001b[38;5;28mself\u001b[39m):\n\u001b[1;32m 529\u001b[0m \u001b[38;5;250m \u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 530\u001b[0m \u001b[38;5;124;03m Load BLIP2 model with caption_coco_flant5xl architecture.\u001b[39;00m\n\u001b[1;32m 531\u001b[0m \n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 537\u001b[0m \u001b[38;5;124;03m txt_processors (dict): preprocessors for text inputs.\u001b[39;00m\n\u001b[1;32m 538\u001b[0m \u001b[38;5;124;03m \"\"\"\u001b[39;00m\n\u001b[1;32m 539\u001b[0m (\n\u001b[1;32m 540\u001b[0m summary_vqa_model,\n\u001b[1;32m 541\u001b[0m summary_vqa_vis_processors,\n\u001b[1;32m 542\u001b[0m summary_vqa_txt_processors,\n\u001b[0;32m--> 543\u001b[0m ) \u001b[38;5;241m=\u001b[39m \u001b[43mload_model_and_preprocess\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 544\u001b[0m \u001b[43m \u001b[49m\u001b[43mname\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mblip2_t5\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 545\u001b[0m \u001b[43m \u001b[49m\u001b[43mmodel_type\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mcaption_coco_flant5xl\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 546\u001b[0m \u001b[43m \u001b[49m\u001b[43mis_eval\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\u001b[43m,\u001b[49m\n\u001b[1;32m 547\u001b[0m \u001b[43m \u001b[49m\u001b[43mdevice\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43msummary_device\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 548\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 
549\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m summary_vqa_model, summary_vqa_vis_processors, summary_vqa_txt_processors\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/__init__.py:195\u001b[0m, in \u001b[0;36mload_model_and_preprocess\u001b[0;34m(name, model_type, is_eval, device)\u001b[0m\n\u001b[1;32m 192\u001b[0m model_cls \u001b[38;5;241m=\u001b[39m registry\u001b[38;5;241m.\u001b[39mget_model_class(name)\n\u001b[1;32m 194\u001b[0m \u001b[38;5;66;03m# load model\u001b[39;00m\n\u001b[0;32m--> 195\u001b[0m model \u001b[38;5;241m=\u001b[39m \u001b[43mmodel_cls\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfrom_pretrained\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmodel_type\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmodel_type\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 197\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_eval:\n\u001b[1;32m 198\u001b[0m model\u001b[38;5;241m.\u001b[39meval()\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/base_model.py:70\u001b[0m, in \u001b[0;36mBaseModel.from_pretrained\u001b[0;34m(cls, model_type)\u001b[0m\n\u001b[1;32m 60\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 61\u001b[0m \u001b[38;5;124;03mBuild a pretrained model from default configuration file, specified by model_type.\u001b[39;00m\n\u001b[1;32m 62\u001b[0m \n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 67\u001b[0m \u001b[38;5;124;03m - model (nn.Module): pretrained or finetuned model, depending on the configuration.\u001b[39;00m\n\u001b[1;32m 68\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 69\u001b[0m model_cfg \u001b[38;5;241m=\u001b[39m OmegaConf\u001b[38;5;241m.\u001b[39mload(\u001b[38;5;28mcls\u001b[39m\u001b[38;5;241m.\u001b[39mdefault_config_path(model_type))\u001b[38;5;241m.\u001b[39mmodel\n\u001b[0;32m---> 70\u001b[0m model \u001b[38;5;241m=\u001b[39m 
\u001b[38;5;28;43mcls\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfrom_config\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmodel_cfg\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 72\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m model\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/blip2_models/blip2_t5.py:368\u001b[0m, in \u001b[0;36mBlip2T5.from_config\u001b[0;34m(cls, cfg)\u001b[0m\n\u001b[1;32m 364\u001b[0m max_txt_len \u001b[38;5;241m=\u001b[39m cfg\u001b[38;5;241m.\u001b[39mget(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmax_txt_len\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;241m32\u001b[39m)\n\u001b[1;32m 366\u001b[0m apply_lemmatizer \u001b[38;5;241m=\u001b[39m cfg\u001b[38;5;241m.\u001b[39mget(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mapply_lemmatizer\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;28;01mFalse\u001b[39;00m)\n\u001b[0;32m--> 368\u001b[0m model \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mcls\u001b[39;49m\u001b[43m(\u001b[49m\n\u001b[1;32m 369\u001b[0m \u001b[43m \u001b[49m\u001b[43mvit_model\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mvit_model\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 370\u001b[0m \u001b[43m \u001b[49m\u001b[43mimg_size\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mimg_size\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 371\u001b[0m \u001b[43m \u001b[49m\u001b[43mdrop_path_rate\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdrop_path_rate\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 372\u001b[0m \u001b[43m \u001b[49m\u001b[43muse_grad_checkpoint\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43muse_grad_checkpoint\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 373\u001b[0m \u001b[43m \u001b[49m\u001b[43mvit_precision\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mvit_precision\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 374\u001b[0m \u001b[43m 
\u001b[49m\u001b[43mfreeze_vit\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mfreeze_vit\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 375\u001b[0m \u001b[43m \u001b[49m\u001b[43mnum_query_token\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mnum_query_token\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 376\u001b[0m \u001b[43m \u001b[49m\u001b[43mt5_model\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mt5_model\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 377\u001b[0m \u001b[43m \u001b[49m\u001b[43mprompt\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mprompt\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 378\u001b[0m \u001b[43m \u001b[49m\u001b[43mmax_txt_len\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmax_txt_len\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 379\u001b[0m \u001b[43m \u001b[49m\u001b[43mapply_lemmatizer\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mapply_lemmatizer\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 380\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 381\u001b[0m model\u001b[38;5;241m.\u001b[39mload_checkpoint_from_config(cfg)\n\u001b[1;32m 383\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m model\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/blip2_models/blip2_t5.py:61\u001b[0m, in \u001b[0;36mBlip2T5.__init__\u001b[0;34m(self, vit_model, img_size, drop_path_rate, use_grad_checkpoint, vit_precision, freeze_vit, num_query_token, t5_model, prompt, max_txt_len, apply_lemmatizer)\u001b[0m\n\u001b[1;32m 57\u001b[0m \u001b[38;5;28msuper\u001b[39m()\u001b[38;5;241m.\u001b[39m\u001b[38;5;21m__init__\u001b[39m()\n\u001b[1;32m 59\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mtokenizer \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39minit_tokenizer()\n\u001b[0;32m---> 61\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mvisual_encoder, 
\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mln_vision \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43minit_vision_encoder\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 62\u001b[0m \u001b[43m \u001b[49m\u001b[43mvit_model\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mimg_size\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mdrop_path_rate\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43muse_grad_checkpoint\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mvit_precision\u001b[49m\n\u001b[1;32m 63\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 64\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m freeze_vit:\n\u001b[1;32m 65\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m name, param \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mvisual_encoder\u001b[38;5;241m.\u001b[39mnamed_parameters():\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/blip2_models/blip2.py:72\u001b[0m, in \u001b[0;36mBlip2Base.init_vision_encoder\u001b[0;34m(cls, model_name, img_size, drop_path_rate, use_grad_checkpoint, precision)\u001b[0m\n\u001b[1;32m 67\u001b[0m \u001b[38;5;28;01massert\u001b[39;00m model_name \u001b[38;5;129;01min\u001b[39;00m [\n\u001b[1;32m 68\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124meva_clip_g\u001b[39m\u001b[38;5;124m\"\u001b[39m,\n\u001b[1;32m 69\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mclip_L\u001b[39m\u001b[38;5;124m\"\u001b[39m,\n\u001b[1;32m 70\u001b[0m ], \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mvit model must be eva_clip_g or clip_L\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 71\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m model_name \u001b[38;5;241m==\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124meva_clip_g\u001b[39m\u001b[38;5;124m\"\u001b[39m:\n\u001b[0;32m---> 72\u001b[0m visual_encoder 
\u001b[38;5;241m=\u001b[39m \u001b[43mcreate_eva_vit_g\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 73\u001b[0m \u001b[43m \u001b[49m\u001b[43mimg_size\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mdrop_path_rate\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43muse_grad_checkpoint\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mprecision\u001b[49m\n\u001b[1;32m 74\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 75\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m model_name \u001b[38;5;241m==\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mclip_L\u001b[39m\u001b[38;5;124m\"\u001b[39m:\n\u001b[1;32m 76\u001b[0m visual_encoder \u001b[38;5;241m=\u001b[39m create_clip_vit_L(img_size, use_grad_checkpoint, precision)\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/eva_vit.py:430\u001b[0m, in \u001b[0;36mcreate_eva_vit_g\u001b[0;34m(img_size, drop_path_rate, use_checkpoint, precision)\u001b[0m\n\u001b[1;32m 416\u001b[0m model \u001b[38;5;241m=\u001b[39m VisionTransformer(\n\u001b[1;32m 417\u001b[0m img_size\u001b[38;5;241m=\u001b[39mimg_size,\n\u001b[1;32m 418\u001b[0m patch_size\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m14\u001b[39m,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 427\u001b[0m use_checkpoint\u001b[38;5;241m=\u001b[39muse_checkpoint,\n\u001b[1;32m 428\u001b[0m ) \n\u001b[1;32m 429\u001b[0m url \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mhttps://storage.googleapis.com/sfr-vision-language-research/LAVIS/models/BLIP2/eva_vit_g.pth\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[0;32m--> 430\u001b[0m cached_file \u001b[38;5;241m=\u001b[39m \u001b[43mdownload_cached_file\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 431\u001b[0m \u001b[43m \u001b[49m\u001b[43murl\u001b[49m\u001b[43m,\u001b[49m\u001b[43m 
\u001b[49m\u001b[43mcheck_hash\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mFalse\u001b[39;49;00m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mprogress\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\n\u001b[1;32m 432\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 433\u001b[0m state_dict \u001b[38;5;241m=\u001b[39m torch\u001b[38;5;241m.\u001b[39mload(cached_file, map_location\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mcpu\u001b[39m\u001b[38;5;124m\"\u001b[39m) \n\u001b[1;32m 434\u001b[0m interpolate_pos_embed(model,state_dict)\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/common/dist_utils.py:132\u001b[0m, in \u001b[0;36mdownload_cached_file\u001b[0;34m(url, check_hash, progress)\u001b[0m\n\u001b[1;32m 129\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m cached_file\n\u001b[1;32m 131\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_main_process():\n\u001b[0;32m--> 132\u001b[0m \u001b[43mtimm_hub\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mdownload_cached_file\u001b[49m\u001b[43m(\u001b[49m\u001b[43murl\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcheck_hash\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mprogress\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 134\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_dist_avail_and_initialized():\n\u001b[1;32m 135\u001b[0m dist\u001b[38;5;241m.\u001b[39mbarrier()\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/timm/models/hub.py:51\u001b[0m, in \u001b[0;36mdownload_cached_file\u001b[0;34m(url, check_hash, progress)\u001b[0m\n\u001b[1;32m 49\u001b[0m r \u001b[38;5;241m=\u001b[39m HASH_REGEX\u001b[38;5;241m.\u001b[39msearch(filename) \u001b[38;5;66;03m# r is Optional[Match[str]]\u001b[39;00m\n\u001b[1;32m 50\u001b[0m hash_prefix \u001b[38;5;241m=\u001b[39m 
r\u001b[38;5;241m.\u001b[39mgroup(\u001b[38;5;241m1\u001b[39m) \u001b[38;5;28;01mif\u001b[39;00m r \u001b[38;5;28;01melse\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[0;32m---> 51\u001b[0m \u001b[43mdownload_url_to_file\u001b[49m\u001b[43m(\u001b[49m\u001b[43murl\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcached_file\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mhash_prefix\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mprogress\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mprogress\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 52\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m cached_file\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/torch/hub.py:636\u001b[0m, in \u001b[0;36mdownload_url_to_file\u001b[0;34m(url, dst, hash_prefix, progress)\u001b[0m\n\u001b[1;32m 634\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(buffer) \u001b[38;5;241m==\u001b[39m \u001b[38;5;241m0\u001b[39m:\n\u001b[1;32m 635\u001b[0m \u001b[38;5;28;01mbreak\u001b[39;00m\n\u001b[0;32m--> 636\u001b[0m \u001b[43mf\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mwrite\u001b[49m\u001b[43m(\u001b[49m\u001b[43mbuffer\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 637\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m hash_prefix \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m 638\u001b[0m sha256\u001b[38;5;241m.\u001b[39mupdate(buffer)\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/tempfile.py:478\u001b[0m, in \u001b[0;36m_TemporaryFileWrapper.__getattr__..func_wrapper\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 476\u001b[0m \u001b[38;5;129m@_functools\u001b[39m\u001b[38;5;241m.\u001b[39mwraps(func)\n\u001b[1;32m 477\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mfunc_wrapper\u001b[39m(\u001b[38;5;241m*\u001b[39margs, 
\u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs):\n\u001b[0;32m--> 478\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mfunc\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n", "\u001b[0;31mOSError\u001b[0m: [Errno 28] No space left on device" ] } ], "source": [ "obj = ammico.SummaryDetector(subdict=image_dict, analysis_type = \"summary_and_questions\", model_type = \"blip2_t5_caption_coco_flant5xl\")\n", "# list of the new models that can be used:\n", "# \"blip2_t5_pretrain_flant5xxl\",\n", "# \"blip2_t5_pretrain_flant5xl\",\n", "# \"blip2_t5_caption_coco_flant5xl\",\n", "# \"blip2_opt_pretrain_opt2.7b\",\n", "# \"blip2_opt_pretrain_opt6.7b\",\n", "# \"blip2_opt_caption_coco_opt2.7b\",\n", "# \"blip2_opt_caption_coco_opt6.7b\",\n", "\n", "# You can use `pretrain_` model types for zero-shot image-to-text generation with prompts.\n", "# Or you can use `caption_coco_` model types to generate COCO-style captions.\n", "# `flant5` and `opt` mean that the model is equipped with the FlanT5 or OPT LLM, respectively.\n", "\n", "# You can also run all calculations on the CPU by setting device_type=\"cpu\", or on the GPU by setting device_type=\"cuda\"." ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:53:49.445792Z", "iopub.status.busy": "2024-02-19T08:53:49.445585Z", "iopub.status.idle": "2024-02-19T08:53:49.470611Z", "shell.execute_reply": "2024-02-19T08:53:49.470071Z" } }, "outputs": [ { "ename": "NameError", "evalue": "name 'obj' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[24], line 2\u001b[0m\n\u001b[1;32m 
1\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m key \u001b[38;5;129;01min\u001b[39;00m image_dict:\n\u001b[0;32m----> 2\u001b[0m image_dict[key] \u001b[38;5;241m=\u001b[39m \u001b[43mobj\u001b[49m\u001b[38;5;241m.\u001b[39manalyse_image(subdict \u001b[38;5;241m=\u001b[39m image_dict[key], analysis_type\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124msummary_and_questions\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 4\u001b[0m \u001b[38;5;66;03m# analysis_type can be \u001b[39;00m\n\u001b[1;32m 5\u001b[0m \u001b[38;5;66;03m# \"summary\",\u001b[39;00m\n\u001b[1;32m 6\u001b[0m \u001b[38;5;66;03m# \"questions\",\u001b[39;00m\n\u001b[1;32m 7\u001b[0m \u001b[38;5;66;03m# \"summary_and_questions\".\u001b[39;00m\n", "\u001b[0;31mNameError\u001b[0m: name 'obj' is not defined" ] } ], "source": [ "for key in image_dict:\n", " image_dict[key] = obj.analyse_image(subdict = image_dict[key], analysis_type=\"summary_and_questions\")\n", "\n", "# analysis_type can be \n", "# \"summary\",\n", "# \"questions\",\n", "# \"summary_and_questions\"." 
] }, { "cell_type": "code", "execution_count": 25, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:53:49.473138Z", "iopub.status.busy": "2024-02-19T08:53:49.472942Z", "iopub.status.idle": "2024-02-19T08:53:49.480775Z", "shell.execute_reply": "2024-02-19T08:53:49.480208Z" } }, "outputs": [ { "data": { "text/plain": [ "{'img4': {'filename': 'data-test/img4.png',\n", " 'face': 'No',\n", " 'multiple_faces': 'No',\n", " 'no_faces': 0,\n", " 'wears_mask': ['No'],\n", " 'age': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'emotion': [None],\n", " 'emotion (category)': [None],\n", " 'text': 'MOODOVIN XI',\n", " 'text_language': 'en',\n", " 'text_english': 'MOODOVIN XI',\n", " 'text_clean': 'XI',\n", " 'text_summary': ' MOODOVIN XI XI: Vladimir Putin, Vladimir Vladmir Zelizer, Vladimir',\n", " 'sentiment': 'POSITIVE',\n", " 'sentiment_score': 0.66,\n", " 'entity': ['MOODOVIN XI'],\n", " 'entity_type': ['ORG'],\n", " 'const_image_summary': 'a river running through a city next to tall buildings',\n", " '3_non-deterministic_summary': ['there is a pretty house that sits above the water',\n", " 'there is a building with a balcony and lots of plants on the side of it',\n", " 'several buildings with a river flowing through it']},\n", " 'img1': {'filename': 'data-test/img1.png',\n", " 'no_faces': 0,\n", " 'age': [None],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': [None],\n", " 'multiple_faces': 'No',\n", " 'emotion': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'face': 'No',\n", " 'text': 'SCATTERING THEORY The Quantum Theory of Nonrelativistic Collisions JOHN R. TAYLOR University of Colorado',\n", " 'text_language': 'en',\n", " 'text_english': 'SCATTERING THEORY The Quantum Theory of Nonrelativistic Collisions JOHN R. TAYLOR University of Colorado',\n", " 'text_clean': 'THEORY The Quantum Theory of Collisions JOHN R. 
TAYLOR University of Colorado',\n", " 'text_summary': ' SCATTERING THEORY The Quantum Theory of Nonrelativistic Collisions',\n", " 'sentiment': 'POSITIVE',\n", " 'sentiment_score': 0.91,\n", " 'entity': ['Non',\n", " '##vist',\n", " 'Col',\n", " '##N',\n", " 'R',\n", " 'T',\n", " '##AYL',\n", " 'University of Colorado'],\n", " 'entity_type': ['MISC', 'MISC', 'MISC', 'ORG', 'PER', 'PER', 'ORG', 'ORG'],\n", " 'const_image_summary': 'a close up of a piece of paper with writing on it',\n", " '3_non-deterministic_summary': ['a book opened to the book title for a novel',\n", " 'there are many text on this page',\n", " 'the text in a book is a handwritten poem']},\n", " 'img2': {'filename': 'data-test/img2.png',\n", " 'no_faces': 0,\n", " 'age': [None],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': [None],\n", " 'multiple_faces': 'No',\n", " 'emotion': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'face': 'No',\n", " 'text': 'THE ALGEBRAIC EIGENVALUE PROBLEM DOM NVS TIO MINA Monographs on Numerical Analysis J.. H. WILKINSON',\n", " 'text_language': 'en',\n", " 'text_english': 'THE ALGEBRAIC EIGENVALUE PROBLEM DOM NVS TIO MINA Monographs on Numerical Analysis J.. H. WILKINSON',\n", " 'text_clean': 'THE PROBLEM DOM NVS TIO MINA Monographs on Numerical Analysis J .. H. WILKINSON',\n", " 'text_summary': ' H. H. W. 
WILKINSON: The Algebri',\n", " 'sentiment': 'NEGATIVE',\n", " 'sentiment_score': 0.97,\n", " 'entity': ['ALGEBRAIC EIGENVAL', 'NVS TIO MI', 'J', 'H', 'WILKINSON'],\n", " 'entity_type': ['MISC', 'ORG', 'ORG', 'ORG', 'ORG'],\n", " 'const_image_summary': 'a yellow book with green lettering on it',\n", " '3_non-deterministic_summary': ['a book cover with green writing on a black background',\n", " 'the title page of a book with information from its authors',\n", " 'a book about the age - related engineering and engineering']},\n", " 'img3': {'filename': 'data-test/img3.png',\n", " 'no_faces': 0,\n", " 'age': [None],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': [None],\n", " 'multiple_faces': 'No',\n", " 'emotion': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'face': 'No',\n", " 'text': 'm OOOO 0000 www.',\n", " 'text_language': 'en',\n", " 'text_english': 'm OOOO 0000 www.',\n", " 'text_clean': 'm www .',\n", " 'text_summary': ' www. m OOOO 0000 0000 www.m.m OOOo 0000',\n", " 'sentiment': 'NEGATIVE',\n", " 'sentiment_score': 0.62,\n", " 'entity': [],\n", " 'entity_type': [],\n", " 'const_image_summary': 'a bus that is sitting on the side of a road',\n", " '3_non-deterministic_summary': ['there are cars and a bus on the side of the road',\n", " 'a bus that is sitting in the middle of a street',\n", " 'an aerial view of an empty city street with two large buses passing by']},\n", " 'img0': {'filename': 'data-test/img0.png',\n", " 'no_faces': 0,\n", " 'age': [None],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': [None],\n", " 'multiple_faces': 'No',\n", " 'emotion': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'face': 'No',\n", " 'text': 'Mathematische Formelsammlung für Ingenieure und Naturwissenschaftler Mit zahlreichen Abbildungen und Rechenbeispielen und einer ausführlichen Integraltafel 3., verbesserte Auflage',\n", " 'text_language': 'de',\n", " 'text_english': 'Mathematical formula collection for engineers and 
scientists With numerous illustrations and calculation examples and a detailed integral table 3rd, improved edition',\n", " 'text_clean': 'Mathematical formula collection for engineers and scientists With numerous illustrations and calculation examples and a detailed integral table 3rd , improved edition',\n", " 'text_summary': ' Mathematical formula collection for engineers and scientists . Includes numerous illustrations and calculation examples . Includes',\n", " 'sentiment': 'POSITIVE',\n", " 'sentiment_score': 1.0,\n", " 'entity': [],\n", " 'entity_type': [],\n", " 'const_image_summary': 'a close up of an open book with writing on it',\n", " '3_non-deterministic_summary': ['a close up of a book with many languages',\n", " 'a book that is opened up in german',\n", " 'book about mathemarche formulals and their meaning']},\n", " 'img5': {'filename': 'data-test/img5.png',\n", " 'no_faces': 1,\n", " 'age': [26],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': ['Negative'],\n", " 'multiple_faces': 'No',\n", " 'emotion': ['sad'],\n", " 'gender': ['Man'],\n", " 'race': [None],\n", " 'face': 'Yes',\n", " 'text': None,\n", " 'text_language': 'en',\n", " 'text_english': '',\n", " 'text_clean': '',\n", " 'text_summary': ' CNN.com will feature iReporter photos in a weekly Travel Snapshots gallery .',\n", " 'sentiment': 'POSITIVE',\n", " 'sentiment_score': 0.75,\n", " 'entity': [],\n", " 'entity_type': [],\n", " 'const_image_summary': 'a person running on a beach near a rock formation',\n", " '3_non-deterministic_summary': ['a woman is running down the beach next to some rocks',\n", " 'a woman running along the beach by the ocean',\n", " 'there is a person running on the beach next to the ocean']}}" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "image_dict" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also pass a list of questions to this cell if `analysis_type=\"summary_and_questions\"` or 
`analysis_type=\"questions\"`. Note that the question format has changed for the new models. \n", "\n", "Here is an example of a list of questions:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:53:49.483664Z", "iopub.status.busy": "2024-02-19T08:53:49.483466Z", "iopub.status.idle": "2024-02-19T08:53:49.486232Z", "shell.execute_reply": "2024-02-19T08:53:49.485670Z" } }, "outputs": [], "source": [ "list_of_questions = [\n", " \"Question: Are there people in the image? Answer:\",\n", " \"Question: What is this picture about? Answer:\",\n", "]" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:53:49.488981Z", "iopub.status.busy": "2024-02-19T08:53:49.488787Z", "iopub.status.idle": "2024-02-19T08:53:49.509942Z", "shell.execute_reply": "2024-02-19T08:53:49.509438Z" } }, "outputs": [ { "ename": "NameError", "evalue": "name 'obj' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[27], line 2\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m key \u001b[38;5;129;01min\u001b[39;00m image_dict:\n\u001b[0;32m----> 2\u001b[0m image_dict[key] \u001b[38;5;241m=\u001b[39m \u001b[43mobj\u001b[49m\u001b[38;5;241m.\u001b[39manalyse_image(subdict \u001b[38;5;241m=\u001b[39m image_dict[key], analysis_type\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mquestions\u001b[39m\u001b[38;5;124m\"\u001b[39m, list_of_questions\u001b[38;5;241m=\u001b[39mlist_of_questions)\n", "\u001b[0;31mNameError\u001b[0m: name 'obj' is not defined" ] } ], "source": [ "for key in image_dict:\n", " image_dict[key] = obj.analyse_image(subdict = image_dict[key], analysis_type=\"questions\", list_of_questions=list_of_questions)" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "You can also include earlier questions and their answers as context in a question string, which may yield a more accurate answer, as in the example below.\n", "\n", "You can pass as many questions as you want in a single query as a list." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:53:49.513195Z", "iopub.status.busy": "2024-02-19T08:53:49.512999Z", "iopub.status.idle": "2024-02-19T08:53:49.515828Z", "shell.execute_reply": "2024-02-19T08:53:49.515286Z" } }, "outputs": [], "source": [ "list_of_questions = [\n", " \"Question: What country is in the picture? Answer: USA. Question: Why? Answer: Because there is an American flag in the background. Question: Where does it come from? Answer:\",\n", " \"Question: Which city is this? Answer: Frankfurt. Question: Why? Answer:\",\n", "]" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:53:49.520071Z", "iopub.status.busy": "2024-02-19T08:53:49.519734Z", "iopub.status.idle": "2024-02-19T08:53:49.540462Z", "shell.execute_reply": "2024-02-19T08:53:49.539940Z" } }, "outputs": [ { "ename": "NameError", "evalue": "name 'obj' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[29], line 2\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m key \u001b[38;5;129;01min\u001b[39;00m image_dict:\n\u001b[0;32m----> 2\u001b[0m image_dict[key] \u001b[38;5;241m=\u001b[39m \u001b[43mobj\u001b[49m\u001b[38;5;241m.\u001b[39manalyse_image(subdict \u001b[38;5;241m=\u001b[39m image_dict[key], analysis_type\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mquestions\u001b[39m\u001b[38;5;124m\"\u001b[39m, 
list_of_questions\u001b[38;5;241m=\u001b[39mlist_of_questions)\n", "\u001b[0;31mNameError\u001b[0m: name 'obj' is not defined" ] } ], "source": [ "for key in image_dict:\n", " image_dict[key] = obj.analyse_image(subdict = image_dict[key], analysis_type=\"questions\", list_of_questions=list_of_questions)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:53:49.544909Z", "iopub.status.busy": "2024-02-19T08:53:49.544431Z", "iopub.status.idle": "2024-02-19T08:53:49.551819Z", "shell.execute_reply": "2024-02-19T08:53:49.551270Z" } }, "outputs": [ { "data": { "text/plain": [ "{'img4': {'filename': 'data-test/img4.png',\n", " 'face': 'No',\n", " 'multiple_faces': 'No',\n", " 'no_faces': 0,\n", " 'wears_mask': ['No'],\n", " 'age': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'emotion': [None],\n", " 'emotion (category)': [None],\n", " 'text': 'MOODOVIN XI',\n", " 'text_language': 'en',\n", " 'text_english': 'MOODOVIN XI',\n", " 'text_clean': 'XI',\n", " 'text_summary': ' MOODOVIN XI XI: Vladimir Putin, Vladimir Vladmir Zelizer, Vladimir',\n", " 'sentiment': 'POSITIVE',\n", " 'sentiment_score': 0.66,\n", " 'entity': ['MOODOVIN XI'],\n", " 'entity_type': ['ORG'],\n", " 'const_image_summary': 'a river running through a city next to tall buildings',\n", " '3_non-deterministic_summary': ['there is a pretty house that sits above the water',\n", " 'there is a building with a balcony and lots of plants on the side of it',\n", " 'several buildings with a river flowing through it']},\n", " 'img1': {'filename': 'data-test/img1.png',\n", " 'no_faces': 0,\n", " 'age': [None],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': [None],\n", " 'multiple_faces': 'No',\n", " 'emotion': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'face': 'No',\n", " 'text': 'SCATTERING THEORY The Quantum Theory of Nonrelativistic Collisions JOHN R. 
TAYLOR University of Colorado',\n", " 'text_language': 'en',\n", " 'text_english': 'SCATTERING THEORY The Quantum Theory of Nonrelativistic Collisions JOHN R. TAYLOR University of Colorado',\n", " 'text_clean': 'THEORY The Quantum Theory of Collisions JOHN R. TAYLOR University of Colorado',\n", " 'text_summary': ' SCATTERING THEORY The Quantum Theory of Nonrelativistic Collisions',\n", " 'sentiment': 'POSITIVE',\n", " 'sentiment_score': 0.91,\n", " 'entity': ['Non',\n", " '##vist',\n", " 'Col',\n", " '##N',\n", " 'R',\n", " 'T',\n", " '##AYL',\n", " 'University of Colorado'],\n", " 'entity_type': ['MISC', 'MISC', 'MISC', 'ORG', 'PER', 'PER', 'ORG', 'ORG'],\n", " 'const_image_summary': 'a close up of a piece of paper with writing on it',\n", " '3_non-deterministic_summary': ['a book opened to the book title for a novel',\n", " 'there are many text on this page',\n", " 'the text in a book is a handwritten poem']},\n", " 'img2': {'filename': 'data-test/img2.png',\n", " 'no_faces': 0,\n", " 'age': [None],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': [None],\n", " 'multiple_faces': 'No',\n", " 'emotion': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'face': 'No',\n", " 'text': 'THE ALGEBRAIC EIGENVALUE PROBLEM DOM NVS TIO MINA Monographs on Numerical Analysis J.. H. WILKINSON',\n", " 'text_language': 'en',\n", " 'text_english': 'THE ALGEBRAIC EIGENVALUE PROBLEM DOM NVS TIO MINA Monographs on Numerical Analysis J.. H. WILKINSON',\n", " 'text_clean': 'THE PROBLEM DOM NVS TIO MINA Monographs on Numerical Analysis J .. H. WILKINSON',\n", " 'text_summary': ' H. H. W. 
WILKINSON: The Algebri',\n", " 'sentiment': 'NEGATIVE',\n", " 'sentiment_score': 0.97,\n", " 'entity': ['ALGEBRAIC EIGENVAL', 'NVS TIO MI', 'J', 'H', 'WILKINSON'],\n", " 'entity_type': ['MISC', 'ORG', 'ORG', 'ORG', 'ORG'],\n", " 'const_image_summary': 'a yellow book with green lettering on it',\n", " '3_non-deterministic_summary': ['a book cover with green writing on a black background',\n", " 'the title page of a book with information from its authors',\n", " 'a book about the age - related engineering and engineering']},\n", " 'img3': {'filename': 'data-test/img3.png',\n", " 'no_faces': 0,\n", " 'age': [None],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': [None],\n", " 'multiple_faces': 'No',\n", " 'emotion': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'face': 'No',\n", " 'text': 'm OOOO 0000 www.',\n", " 'text_language': 'en',\n", " 'text_english': 'm OOOO 0000 www.',\n", " 'text_clean': 'm www .',\n", " 'text_summary': ' www. m OOOO 0000 0000 www.m.m OOOo 0000',\n", " 'sentiment': 'NEGATIVE',\n", " 'sentiment_score': 0.62,\n", " 'entity': [],\n", " 'entity_type': [],\n", " 'const_image_summary': 'a bus that is sitting on the side of a road',\n", " '3_non-deterministic_summary': ['there are cars and a bus on the side of the road',\n", " 'a bus that is sitting in the middle of a street',\n", " 'an aerial view of an empty city street with two large buses passing by']},\n", " 'img0': {'filename': 'data-test/img0.png',\n", " 'no_faces': 0,\n", " 'age': [None],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': [None],\n", " 'multiple_faces': 'No',\n", " 'emotion': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'face': 'No',\n", " 'text': 'Mathematische Formelsammlung für Ingenieure und Naturwissenschaftler Mit zahlreichen Abbildungen und Rechenbeispielen und einer ausführlichen Integraltafel 3., verbesserte Auflage',\n", " 'text_language': 'de',\n", " 'text_english': 'Mathematical formula collection for engineers and 
scientists With numerous illustrations and calculation examples and a detailed integral table 3rd, improved edition',\n", " 'text_clean': 'Mathematical formula collection for engineers and scientists With numerous illustrations and calculation examples and a detailed integral table 3rd , improved edition',\n", " 'text_summary': ' Mathematical formula collection for engineers and scientists . Includes numerous illustrations and calculation examples . Includes',\n", " 'sentiment': 'POSITIVE',\n", " 'sentiment_score': 1.0,\n", " 'entity': [],\n", " 'entity_type': [],\n", " 'const_image_summary': 'a close up of an open book with writing on it',\n", " '3_non-deterministic_summary': ['a close up of a book with many languages',\n", " 'a book that is opened up in german',\n", " 'book about mathemarche formulals and their meaning']},\n", " 'img5': {'filename': 'data-test/img5.png',\n", " 'no_faces': 1,\n", " 'age': [26],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': ['Negative'],\n", " 'multiple_faces': 'No',\n", " 'emotion': ['sad'],\n", " 'gender': ['Man'],\n", " 'race': [None],\n", " 'face': 'Yes',\n", " 'text': None,\n", " 'text_language': 'en',\n", " 'text_english': '',\n", " 'text_clean': '',\n", " 'text_summary': ' CNN.com will feature iReporter photos in a weekly Travel Snapshots gallery .',\n", " 'sentiment': 'POSITIVE',\n", " 'sentiment_score': 0.75,\n", " 'entity': [],\n", " 'entity_type': [],\n", " 'const_image_summary': 'a person running on a beach near a rock formation',\n", " '3_non-deterministic_summary': ['a woman is running down the beach next to some rocks',\n", " 'a woman running along the beach by the ocean',\n", " 'there is a person running on the beach next to the ocean']}}" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "image_dict" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also ask sequential questions if you pass the argument `consequential_questions=True`. 
This means that the answers to previous questions will be passed as context to the next question. However, this method is somewhat slower, because the answers for each image are computed sequentially rather than simultaneously. " ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:53:49.554718Z", "iopub.status.busy": "2024-02-19T08:53:49.554288Z", "iopub.status.idle": "2024-02-19T08:53:49.557147Z", "shell.execute_reply": "2024-02-19T08:53:49.556606Z" } }, "outputs": [], "source": [ "list_of_questions = [\n", " \"Question: Is this picture taken inside or outside? Answer:\",\n", " \"Question: Why? Answer:\",\n", "]" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:53:49.559854Z", "iopub.status.busy": "2024-02-19T08:53:49.559416Z", "iopub.status.idle": "2024-02-19T08:53:49.581420Z", "shell.execute_reply": "2024-02-19T08:53:49.580718Z" } }, "outputs": [ { "ename": "NameError", "evalue": "name 'obj' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[32], line 2\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m key \u001b[38;5;129;01min\u001b[39;00m image_dict:\n\u001b[0;32m----> 2\u001b[0m image_dict[key] \u001b[38;5;241m=\u001b[39m \u001b[43mobj\u001b[49m\u001b[38;5;241m.\u001b[39manalyse_image(subdict \u001b[38;5;241m=\u001b[39m image_dict[key], analysis_type\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mquestions\u001b[39m\u001b[38;5;124m\"\u001b[39m, list_of_questions\u001b[38;5;241m=\u001b[39mlist_of_questions, consequential_questions\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m)\n", "\u001b[0;31mNameError\u001b[0m: name 'obj' is not defined" ] } ], "source": 
[ "for key in image_dict:\n", " image_dict[key] = obj.analyse_image(subdict = image_dict[key], analysis_type=\"questions\", list_of_questions=list_of_questions, consequential_questions=True)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:53:49.584997Z", "iopub.status.busy": "2024-02-19T08:53:49.584485Z", "iopub.status.idle": "2024-02-19T08:53:49.592760Z", "shell.execute_reply": "2024-02-19T08:53:49.592039Z" } }, "outputs": [ { "data": { "text/plain": [ "{'img4': {'filename': 'data-test/img4.png',\n", " 'face': 'No',\n", " 'multiple_faces': 'No',\n", " 'no_faces': 0,\n", " 'wears_mask': ['No'],\n", " 'age': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'emotion': [None],\n", " 'emotion (category)': [None],\n", " 'text': 'MOODOVIN XI',\n", " 'text_language': 'en',\n", " 'text_english': 'MOODOVIN XI',\n", " 'text_clean': 'XI',\n", " 'text_summary': ' MOODOVIN XI XI: Vladimir Putin, Vladimir Vladmir Zelizer, Vladimir',\n", " 'sentiment': 'POSITIVE',\n", " 'sentiment_score': 0.66,\n", " 'entity': ['MOODOVIN XI'],\n", " 'entity_type': ['ORG'],\n", " 'const_image_summary': 'a river running through a city next to tall buildings',\n", " '3_non-deterministic_summary': ['there is a pretty house that sits above the water',\n", " 'there is a building with a balcony and lots of plants on the side of it',\n", " 'several buildings with a river flowing through it']},\n", " 'img1': {'filename': 'data-test/img1.png',\n", " 'no_faces': 0,\n", " 'age': [None],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': [None],\n", " 'multiple_faces': 'No',\n", " 'emotion': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'face': 'No',\n", " 'text': 'SCATTERING THEORY The Quantum Theory of Nonrelativistic Collisions JOHN R. TAYLOR University of Colorado',\n", " 'text_language': 'en',\n", " 'text_english': 'SCATTERING THEORY The Quantum Theory of Nonrelativistic Collisions JOHN R. 
TAYLOR University of Colorado',\n", " 'text_clean': 'THEORY The Quantum Theory of Collisions JOHN R. TAYLOR University of Colorado',\n", " 'text_summary': ' SCATTERING THEORY The Quantum Theory of Nonrelativistic Collisions',\n", " 'sentiment': 'POSITIVE',\n", " 'sentiment_score': 0.91,\n", " 'entity': ['Non',\n", " '##vist',\n", " 'Col',\n", " '##N',\n", " 'R',\n", " 'T',\n", " '##AYL',\n", " 'University of Colorado'],\n", " 'entity_type': ['MISC', 'MISC', 'MISC', 'ORG', 'PER', 'PER', 'ORG', 'ORG'],\n", " 'const_image_summary': 'a close up of a piece of paper with writing on it',\n", " '3_non-deterministic_summary': ['a book opened to the book title for a novel',\n", " 'there are many text on this page',\n", " 'the text in a book is a handwritten poem']},\n", " 'img2': {'filename': 'data-test/img2.png',\n", " 'no_faces': 0,\n", " 'age': [None],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': [None],\n", " 'multiple_faces': 'No',\n", " 'emotion': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'face': 'No',\n", " 'text': 'THE ALGEBRAIC EIGENVALUE PROBLEM DOM NVS TIO MINA Monographs on Numerical Analysis J.. H. WILKINSON',\n", " 'text_language': 'en',\n", " 'text_english': 'THE ALGEBRAIC EIGENVALUE PROBLEM DOM NVS TIO MINA Monographs on Numerical Analysis J.. H. WILKINSON',\n", " 'text_clean': 'THE PROBLEM DOM NVS TIO MINA Monographs on Numerical Analysis J .. H. WILKINSON',\n", " 'text_summary': ' H. H. W. 
WILKINSON: The Algebri',\n", " 'sentiment': 'NEGATIVE',\n", " 'sentiment_score': 0.97,\n", " 'entity': ['ALGEBRAIC EIGENVAL', 'NVS TIO MI', 'J', 'H', 'WILKINSON'],\n", " 'entity_type': ['MISC', 'ORG', 'ORG', 'ORG', 'ORG'],\n", " 'const_image_summary': 'a yellow book with green lettering on it',\n", " '3_non-deterministic_summary': ['a book cover with green writing on a black background',\n", " 'the title page of a book with information from its authors',\n", " 'a book about the age - related engineering and engineering']},\n", " 'img3': {'filename': 'data-test/img3.png',\n", " 'no_faces': 0,\n", " 'age': [None],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': [None],\n", " 'multiple_faces': 'No',\n", " 'emotion': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'face': 'No',\n", " 'text': 'm OOOO 0000 www.',\n", " 'text_language': 'en',\n", " 'text_english': 'm OOOO 0000 www.',\n", " 'text_clean': 'm www .',\n", " 'text_summary': ' www. m OOOO 0000 0000 www.m.m OOOo 0000',\n", " 'sentiment': 'NEGATIVE',\n", " 'sentiment_score': 0.62,\n", " 'entity': [],\n", " 'entity_type': [],\n", " 'const_image_summary': 'a bus that is sitting on the side of a road',\n", " '3_non-deterministic_summary': ['there are cars and a bus on the side of the road',\n", " 'a bus that is sitting in the middle of a street',\n", " 'an aerial view of an empty city street with two large buses passing by']},\n", " 'img0': {'filename': 'data-test/img0.png',\n", " 'no_faces': 0,\n", " 'age': [None],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': [None],\n", " 'multiple_faces': 'No',\n", " 'emotion': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'face': 'No',\n", " 'text': 'Mathematische Formelsammlung für Ingenieure und Naturwissenschaftler Mit zahlreichen Abbildungen und Rechenbeispielen und einer ausführlichen Integraltafel 3., verbesserte Auflage',\n", " 'text_language': 'de',\n", " 'text_english': 'Mathematical formula collection for engineers and 
scientists With numerous illustrations and calculation examples and a detailed integral table 3rd, improved edition',\n", " 'text_clean': 'Mathematical formula collection for engineers and scientists With numerous illustrations and calculation examples and a detailed integral table 3rd , improved edition',\n", " 'text_summary': ' Mathematical formula collection for engineers and scientists . Includes numerous illustrations and calculation examples . Includes',\n", " 'sentiment': 'POSITIVE',\n", " 'sentiment_score': 1.0,\n", " 'entity': [],\n", " 'entity_type': [],\n", " 'const_image_summary': 'a close up of an open book with writing on it',\n", " '3_non-deterministic_summary': ['a close up of a book with many languages',\n", " 'a book that is opened up in german',\n", " 'book about mathemarche formulals and their meaning']},\n", " 'img5': {'filename': 'data-test/img5.png',\n", " 'no_faces': 1,\n", " 'age': [26],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': ['Negative'],\n", " 'multiple_faces': 'No',\n", " 'emotion': ['sad'],\n", " 'gender': ['Man'],\n", " 'race': [None],\n", " 'face': 'Yes',\n", " 'text': None,\n", " 'text_language': 'en',\n", " 'text_english': '',\n", " 'text_clean': '',\n", " 'text_summary': ' CNN.com will feature iReporter photos in a weekly Travel Snapshots gallery .',\n", " 'sentiment': 'POSITIVE',\n", " 'sentiment_score': 0.75,\n", " 'entity': [],\n", " 'entity_type': [],\n", " 'const_image_summary': 'a person running on a beach near a rock formation',\n", " '3_non-deterministic_summary': ['a woman is running down the beach next to some rocks',\n", " 'a woman running along the beach by the ocean',\n", " 'there is a person running on the beach next to the ocean']}}" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "image_dict" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Detection of faces and facial expression analysis\n", "Faces and facial expressions are detected and 
analyzed using the `EmotionDetector` class from the `faces` module. First, RetinaFace detects whether faces are present in the image; next, Face-Mask-Detection checks whether face masks are worn. Age, gender, race, and emotions are then detected with deepface.\n", "\n", "The content of the analysis depends on what is found in the image. If no faces are found, all further steps are skipped and the result `\"face\": \"No\", \"multiple_faces\": \"No\", \"no_faces\": 0, \"wears_mask\": [\"No\"], \"age\": [None], \"gender\": [None], \"race\": [None], \"emotion\": [None], \"emotion (category)\": [None]` is returned. If one or several faces are found, up to three faces are checked for partial concealment by a face mask. For masked faces, only age and gender are detected; for unmasked faces, race, emotion, and dominant emotion are detected as well. In the latter case, the output could look like this: `\"face\": \"Yes\", \"multiple_faces\": \"Yes\", \"no_faces\": 2, \"wears_mask\": [\"No\", \"No\"], \"age\": [27, 28], \"gender\": [\"Man\", \"Man\"], \"race\": [\"asian\", None], \"emotion\": [\"angry\", \"neutral\"], \"emotion (category)\": [\"Negative\", \"Neutral\"]`. Here two faces are detected (given by `no_faces`), and some values are returned as lists, with the first item for the first (largest) face and the second item for the second (smaller) face. For example, `\"emotion\"` returns the list `[\"angry\", \"neutral\"]`, meaning that the first face expresses anger and the second face has a neutral expression.\n", "\n", "The emotion detection reports seven facial expressions: angry, fear, neutral, sad, disgust, happy, and surprise. These emotions are assigned based on the confidence returned by the model (between 0 and 1), with a high confidence signifying a high likelihood that the detected emotion is correct. 
Emotion recognition is not an easy task, even for a human; therefore, we have added a keyword `emotion_threshold` that sets the percentage value above which an emotion is counted as detected. The default is 50%, so that a confidence above 0.5 results in an emotion being assigned. If the confidence is lower, no emotion is assigned. \n", "\n", "From the seven facial expressions, an overall dominant emotion category is identified: negative, positive, or neutral. The negative category comprises the facial expressions angry, disgust, fear, and sad; the positive category comprises happy; and the neutral category comprises surprise and neutral.\n", "\n", "A similar threshold, `race_threshold`, is set for the race detection. Its default is also 50%, so that a value is only returned in the analysis if the confidence for the race is above 0.5. \n", "\n", "In summary, the face detection is carried out using the following method call and keywords, where `emotion_threshold` and \n", "`race_threshold` are optional:" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:53:49.596180Z", "iopub.status.busy": "2024-02-19T08:53:49.595807Z", "iopub.status.idle": "2024-02-19T08:54:11.388417Z", "shell.execute_reply": "2024-02-19T08:54:11.387832Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/1 [==============================] - ETA: 0s" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "1/1 [==============================] - 1s 535ms/step\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/1 [==============================] - ETA: 0s" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "1/1 
[==============================] - 0s 343ms/step\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/1 [==============================] - ETA: 0s" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "1/1 [==============================] - 0s 226ms/step\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/1 [==============================] - ETA: 0s" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "1/1 [==============================] - 0s 233ms/step\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\r", "1/1 [==============================] - ETA: 0s" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\r", "1/1 [==============================] - 0s 21ms/step\n" ] } ], "source": [ "for key in image_dict.keys():\n", " image_dict[key] = ammico.EmotionDetector(image_dict[key], emotion_threshold=50, race_threshold=50).analyse_image()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The thresholds can be adapted interactively in the notebook interface and the optimal value can then be used in a subsequent analysis of the whole data set.\n", "\n", "The output keys that are generated are\n", "\n", "| output key | output type | output value |\n", "| ---------- | ----------- | ------------ |\n", "| `face` | `str` | if a face is detected |\n", "| `multiple_faces` | `str` | if multiple faces are detected |\n", "| `no_faces` | `int` | the number of detected faces |\n", "| `wears_mask` | `list[str]` | if each of the detected faces wears a face covering, up to three faces |\n", "| `age` | `list[int]` | the detected age, up to three faces |\n", "| `gender` | `list[str]` | the detected gender, up to three 
faces |\n", "| `race` | `list[str]` | the detected race, up to three faces, if above the confidence threshold |\n", "| `emotion` | `list[str]` | the detected emotion, up to three faces, if above the confidence threshold |\n", "| `emotion (category)` | `list[str]` | the detected emotion category (positive, negative, or neutral), up to three faces, if above the confidence threshold |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Image Multimodal Search" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This module shows how to carry out a multimodal image search with the [LAVIS](https://github.com/salesforce/LAVIS) library. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Indexing and extracting features from images in selected folder" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, you need to select a model. You can choose one of the following models: \n", "- [blip](https://github.com/salesforce/BLIP)\n", "- [blip2](https://huggingface.co/docs/transformers/main/model_doc/blip-2) \n", "- [albef](https://github.com/salesforce/ALBEF) \n", "- [clip_base](https://github.com/openai/CLIP/blob/main/model-card.md)\n", "- [clip_vitl14](https://github.com/mlfoundations/open_clip) \n", "- [clip_vitl14_336](https://github.com/mlfoundations/open_clip)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:11.393676Z", "iopub.status.busy": "2024-02-19T08:54:11.393305Z", "iopub.status.idle": "2024-02-19T08:54:11.396416Z", "shell.execute_reply": "2024-02-19T08:54:11.395833Z" } }, "outputs": [], "source": [ "model_type = \"blip\"\n", "# model_type = \"blip2\"\n", "# model_type = \"albef\"\n", "# model_type = \"clip_base\"\n", "# model_type = \"clip_vitl14\"\n", "# model_type = \"clip_vitl14_336\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To process the loaded images using the selected model, use the code below:" ] }, { 
"cell_type": "code", "execution_count": 36, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:11.399454Z", "iopub.status.busy": "2024-02-19T08:54:11.399132Z", "iopub.status.idle": "2024-02-19T08:54:11.402161Z", "shell.execute_reply": "2024-02-19T08:54:11.401648Z" } }, "outputs": [], "source": [ "my_obj = ammico.MultimodalSearch(image_dict)" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:11.404913Z", "iopub.status.busy": "2024-02-19T08:54:11.404591Z", "iopub.status.idle": "2024-02-19T08:54:26.394925Z", "shell.execute_reply": "2024-02-19T08:54:26.394163Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\r", " 0%| | 0.00/1.97G [00:00 8\u001b[0m ) \u001b[38;5;241m=\u001b[39m \u001b[43mmy_obj\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mparsing_images\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 9\u001b[0m \u001b[43m \u001b[49m\u001b[43mmodel_type\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\n\u001b[1;32m 10\u001b[0m \u001b[43m \u001b[49m\u001b[43mpath_to_save_tensors\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43m/content/drive/MyDrive/misinformation-data/\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 11\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n", "File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/multimodal_search.py:363\u001b[0m, in \u001b[0;36mMultimodalSearch.parsing_images\u001b[0;34m(self, model_type, path_to_save_tensors, path_to_load_tensors)\u001b[0m\n\u001b[1;32m 349\u001b[0m select_extract_image_features \u001b[38;5;241m=\u001b[39m {\n\u001b[1;32m 350\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mblip2\u001b[39m\u001b[38;5;124m\"\u001b[39m: MultimodalSearch\u001b[38;5;241m.\u001b[39mextract_image_features_blip2,\n\u001b[1;32m 351\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mblip\u001b[39m\u001b[38;5;124m\"\u001b[39m: 
MultimodalSearch\u001b[38;5;241m.\u001b[39mextract_image_features_basic,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 355\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mclip_vitl14_336\u001b[39m\u001b[38;5;124m\"\u001b[39m: MultimodalSearch\u001b[38;5;241m.\u001b[39mextract_image_features_clip,\n\u001b[1;32m 356\u001b[0m }\n\u001b[1;32m 358\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m model_type \u001b[38;5;129;01min\u001b[39;00m select_model\u001b[38;5;241m.\u001b[39mkeys():\n\u001b[1;32m 359\u001b[0m (\n\u001b[1;32m 360\u001b[0m model,\n\u001b[1;32m 361\u001b[0m vis_processors,\n\u001b[1;32m 362\u001b[0m txt_processors,\n\u001b[0;32m--> 363\u001b[0m ) \u001b[38;5;241m=\u001b[39m \u001b[43mselect_model\u001b[49m\u001b[43m[\u001b[49m\n\u001b[1;32m 364\u001b[0m \u001b[43m \u001b[49m\u001b[43mmodel_type\u001b[49m\n\u001b[1;32m 365\u001b[0m \u001b[43m \u001b[49m\u001b[43m]\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mMultimodalSearch\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmultimodal_device\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 366\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 367\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mSyntaxError\u001b[39;00m(\n\u001b[1;32m 368\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mPlease, use one of the following models: blip2, blip, albef, clip_base, clip_vitl14, clip_vitl14_336\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 369\u001b[0m )\n", "File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/multimodal_search.py:55\u001b[0m, in \u001b[0;36mMultimodalSearch.load_feature_extractor_model_blip\u001b[0;34m(self, device)\u001b[0m\n\u001b[1;32m 43\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mload_feature_extractor_model_blip\u001b[39m(\u001b[38;5;28mself\u001b[39m, device: \u001b[38;5;28mstr\u001b[39m \u001b[38;5;241m=\u001b[39m 
\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mcpu\u001b[39m\u001b[38;5;124m\"\u001b[39m):\n\u001b[1;32m 44\u001b[0m \u001b[38;5;250m \u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 45\u001b[0m \u001b[38;5;124;03m Load base blip_feature_extractor model and preprocessors for visual and text inputs from lavis.models.\u001b[39;00m\n\u001b[1;32m 46\u001b[0m \n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 53\u001b[0m \u001b[38;5;124;03m txt_processors (dict): preprocessors for text inputs.\u001b[39;00m\n\u001b[1;32m 54\u001b[0m \u001b[38;5;124;03m \"\"\"\u001b[39;00m\n\u001b[0;32m---> 55\u001b[0m model, vis_processors, txt_processors \u001b[38;5;241m=\u001b[39m \u001b[43mload_model_and_preprocess\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 56\u001b[0m \u001b[43m \u001b[49m\u001b[43mname\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mblip_feature_extractor\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 57\u001b[0m \u001b[43m \u001b[49m\u001b[43mmodel_type\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mbase\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 58\u001b[0m \u001b[43m \u001b[49m\u001b[43mis_eval\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\u001b[43m,\u001b[49m\n\u001b[1;32m 59\u001b[0m \u001b[43m \u001b[49m\u001b[43mdevice\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdevice\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 60\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 61\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m model, vis_processors, txt_processors\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/__init__.py:195\u001b[0m, in \u001b[0;36mload_model_and_preprocess\u001b[0;34m(name, model_type, is_eval, device)\u001b[0m\n\u001b[1;32m 192\u001b[0m model_cls 
\u001b[38;5;241m=\u001b[39m registry\u001b[38;5;241m.\u001b[39mget_model_class(name)\n\u001b[1;32m 194\u001b[0m \u001b[38;5;66;03m# load model\u001b[39;00m\n\u001b[0;32m--> 195\u001b[0m model \u001b[38;5;241m=\u001b[39m \u001b[43mmodel_cls\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfrom_pretrained\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmodel_type\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmodel_type\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 197\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_eval:\n\u001b[1;32m 198\u001b[0m model\u001b[38;5;241m.\u001b[39meval()\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/base_model.py:70\u001b[0m, in \u001b[0;36mBaseModel.from_pretrained\u001b[0;34m(cls, model_type)\u001b[0m\n\u001b[1;32m 60\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 61\u001b[0m \u001b[38;5;124;03mBuild a pretrained model from default configuration file, specified by model_type.\u001b[39;00m\n\u001b[1;32m 62\u001b[0m \n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 67\u001b[0m \u001b[38;5;124;03m - model (nn.Module): pretrained or finetuned model, depending on the configuration.\u001b[39;00m\n\u001b[1;32m 68\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 69\u001b[0m model_cfg \u001b[38;5;241m=\u001b[39m OmegaConf\u001b[38;5;241m.\u001b[39mload(\u001b[38;5;28mcls\u001b[39m\u001b[38;5;241m.\u001b[39mdefault_config_path(model_type))\u001b[38;5;241m.\u001b[39mmodel\n\u001b[0;32m---> 70\u001b[0m model \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mcls\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfrom_config\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmodel_cfg\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 72\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m model\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/blip_models/blip_feature_extractor.py:208\u001b[0m, in 
\u001b[0;36mBlipFeatureExtractor.from_config\u001b[0;34m(cls, cfg)\u001b[0m\n\u001b[1;32m 206\u001b[0m pretrain_path \u001b[38;5;241m=\u001b[39m cfg\u001b[38;5;241m.\u001b[39mget(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mpretrained\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;28;01mNone\u001b[39;00m)\n\u001b[1;32m 207\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m pretrain_path \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[0;32m--> 208\u001b[0m msg \u001b[38;5;241m=\u001b[39m \u001b[43mmodel\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mload_from_pretrained\u001b[49m\u001b[43m(\u001b[49m\u001b[43murl_or_filename\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mpretrain_path\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 209\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 210\u001b[0m warnings\u001b[38;5;241m.\u001b[39mwarn(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mNo pretrained weights are loaded.\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/models/blip_models/blip.py:30\u001b[0m, in \u001b[0;36mBlipBase.load_from_pretrained\u001b[0;34m(self, url_or_filename)\u001b[0m\n\u001b[1;32m 28\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mload_from_pretrained\u001b[39m(\u001b[38;5;28mself\u001b[39m, url_or_filename):\n\u001b[1;32m 29\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_url(url_or_filename):\n\u001b[0;32m---> 30\u001b[0m cached_file \u001b[38;5;241m=\u001b[39m \u001b[43mdownload_cached_file\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 31\u001b[0m \u001b[43m \u001b[49m\u001b[43murl_or_filename\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcheck_hash\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mFalse\u001b[39;49;00m\u001b[43m,\u001b[49m\u001b[43m 
\u001b[49m\u001b[43mprogress\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\n\u001b[1;32m 32\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 33\u001b[0m checkpoint \u001b[38;5;241m=\u001b[39m torch\u001b[38;5;241m.\u001b[39mload(cached_file, map_location\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mcpu\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 34\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m os\u001b[38;5;241m.\u001b[39mpath\u001b[38;5;241m.\u001b[39misfile(url_or_filename):\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/lavis/common/dist_utils.py:132\u001b[0m, in \u001b[0;36mdownload_cached_file\u001b[0;34m(url, check_hash, progress)\u001b[0m\n\u001b[1;32m 129\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m cached_file\n\u001b[1;32m 131\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_main_process():\n\u001b[0;32m--> 132\u001b[0m \u001b[43mtimm_hub\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mdownload_cached_file\u001b[49m\u001b[43m(\u001b[49m\u001b[43murl\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcheck_hash\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mprogress\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 134\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_dist_avail_and_initialized():\n\u001b[1;32m 135\u001b[0m dist\u001b[38;5;241m.\u001b[39mbarrier()\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/timm/models/hub.py:51\u001b[0m, in \u001b[0;36mdownload_cached_file\u001b[0;34m(url, check_hash, progress)\u001b[0m\n\u001b[1;32m 49\u001b[0m r \u001b[38;5;241m=\u001b[39m HASH_REGEX\u001b[38;5;241m.\u001b[39msearch(filename) \u001b[38;5;66;03m# r is Optional[Match[str]]\u001b[39;00m\n\u001b[1;32m 50\u001b[0m hash_prefix \u001b[38;5;241m=\u001b[39m r\u001b[38;5;241m.\u001b[39mgroup(\u001b[38;5;241m1\u001b[39m) \u001b[38;5;28;01mif\u001b[39;00m r 
\u001b[38;5;28;01melse\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m\n\u001b[0;32m---> 51\u001b[0m \u001b[43mdownload_url_to_file\u001b[49m\u001b[43m(\u001b[49m\u001b[43murl\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcached_file\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mhash_prefix\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mprogress\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mprogress\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 52\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m cached_file\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/torch/hub.py:636\u001b[0m, in \u001b[0;36mdownload_url_to_file\u001b[0;34m(url, dst, hash_prefix, progress)\u001b[0m\n\u001b[1;32m 634\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(buffer) \u001b[38;5;241m==\u001b[39m \u001b[38;5;241m0\u001b[39m:\n\u001b[1;32m 635\u001b[0m \u001b[38;5;28;01mbreak\u001b[39;00m\n\u001b[0;32m--> 636\u001b[0m \u001b[43mf\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mwrite\u001b[49m\u001b[43m(\u001b[49m\u001b[43mbuffer\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 637\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m hash_prefix \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m 638\u001b[0m sha256\u001b[38;5;241m.\u001b[39mupdate(buffer)\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/tempfile.py:478\u001b[0m, in \u001b[0;36m_TemporaryFileWrapper.__getattr__..func_wrapper\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 476\u001b[0m \u001b[38;5;129m@_functools\u001b[39m\u001b[38;5;241m.\u001b[39mwraps(func)\n\u001b[1;32m 477\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mfunc_wrapper\u001b[39m(\u001b[38;5;241m*\u001b[39margs, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs):\n\u001b[0;32m--> 478\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m 
\u001b[43mfunc\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n", "\u001b[0;31mOSError\u001b[0m: [Errno 28] No space left on device" ] } ], "source": [ "(\n", " model,\n", " vis_processors,\n", " txt_processors,\n", " image_keys,\n", " image_names,\n", " features_image_stacked,\n", ") = my_obj.parsing_images(\n", " model_type, \n", " path_to_save_tensors=\"/content/drive/MyDrive/misinformation-data/\",\n", " )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The images are then processed and stored in a numerical representation, a tensor. These tensors do not change for the same image and the same model. So if you run this analysis once and save the tensors by giving a path with the keyword `path_to_save_tensors`, a file named after the number of images and the model, followed by `_saved_features_image.pt` (for example, `5_clip_base_saved_features_image.pt`), will be placed there.\n", "\n", "This can save you time if you want to analyse the same images with the same model but different questions. To run using the saved tensors, execute the code below, giving the path and name of the tensor file." ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:26.399067Z", "iopub.status.busy": "2024-02-19T08:54:26.398613Z", "iopub.status.idle": "2024-02-19T08:54:26.401763Z", "shell.execute_reply": "2024-02-19T08:54:26.401225Z" } }, "outputs": [], "source": [ "# (\n", "# model,\n", "# vis_processors,\n", "# txt_processors,\n", "# image_keys,\n", "# image_names,\n", "# features_image_stacked,\n", "# ) = my_obj.parsing_images(\n", "# model_type,\n", "# path_to_load_tensors=\"/content/drive/MyDrive/misinformation-data/5_clip_base_saved_features_image.pt\",\n", "# )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we already processed our image folder with 5 images and the `clip_base` model. 
So you only need to give the saved file `5_clip_base_saved_features_image.pt`, which contains the tensors of all images, as the keyword argument `path_to_load_tensors`. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Formulate your search queries\n", "\n", "Next, you need to formulate search queries. You can search either by image or by text. You can search for a single query or for several queries at once; the computational time should not differ much. The format of the queries is as follows:" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:26.406155Z", "iopub.status.busy": "2024-02-19T08:54:26.405642Z", "iopub.status.idle": "2024-02-19T08:54:26.409654Z", "shell.execute_reply": "2024-02-19T08:54:26.408999Z" } }, "outputs": [], "source": [ "import importlib_resources # only required for the image query example\n", "image_example_query = str(importlib_resources.files(\"ammico\") / \"data\" / \"test-crop-image.png\") # build the path to the image for the image query example\n", "\n", "search_query = [\n", " {\"text_input\": \"politician press conference\"}, \n", " {\"text_input\": \"a world map\"},\n", " {\"text_input\": \"a dog\"}, # this is what a text query looks like\n", " {\"image\": image_example_query}, # this is what an image query looks like; here `image_example_query` is the path to the query image, such as \"data/test-crop-image.png\"\n", "]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can filter your results in three different ways:\n", "- `filter_number_of_images` limits the number of images found. That is, if the parameter `filter_number_of_images = 10`, then the first 10 images that best match the query will be shown. The ranks of the other images will be set to `None` and their similarity values to `0`.\n", "- `filter_val_limit` limits the output to images with a similarity value of at least `filter_val_limit`. 
That is, if the parameter `filter_val_limit = 0.2`, all images with similarity less than 0.2 will be discarded.\n", "- `filter_rel_error` (percentage) limits the output to images that satisfy `100 * abs(current_similarity_value - best_similarity_value_in_current_search)/best_similarity_value_in_current_search < filter_rel_error`. That is, if we set `filter_rel_error = 30` and the top-ranked image has a similarity value of 0.5, all images with a similarity of less than 0.35 are discarded." ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:26.412858Z", "iopub.status.busy": "2024-02-19T08:54:26.412664Z", "iopub.status.idle": "2024-02-19T08:54:26.436288Z", "shell.execute_reply": "2024-02-19T08:54:26.435775Z" } }, "outputs": [ { "ename": "NameError", "evalue": "name 'model' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[40], line 2\u001b[0m\n\u001b[1;32m 1\u001b[0m similarity, sorted_lists \u001b[38;5;241m=\u001b[39m my_obj\u001b[38;5;241m.\u001b[39mmultimodal_search(\n\u001b[0;32m----> 2\u001b[0m \u001b[43mmodel\u001b[49m,\n\u001b[1;32m 3\u001b[0m vis_processors,\n\u001b[1;32m 4\u001b[0m txt_processors,\n\u001b[1;32m 5\u001b[0m model_type,\n\u001b[1;32m 6\u001b[0m image_keys,\n\u001b[1;32m 7\u001b[0m features_image_stacked,\n\u001b[1;32m 8\u001b[0m search_query,\n\u001b[1;32m 9\u001b[0m filter_number_of_images\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m20\u001b[39m,\n\u001b[1;32m 10\u001b[0m )\n", "\u001b[0;31mNameError\u001b[0m: name 'model' is not defined" ] } ], "source": [ "similarity, sorted_lists = my_obj.multimodal_search(\n", " model,\n", " vis_processors,\n", " txt_processors,\n", " model_type,\n", " image_keys,\n", " features_image_stacked,\n", " search_query,\n", " 
filter_number_of_images=20,\n", ")" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:26.443401Z", "iopub.status.busy": "2024-02-19T08:54:26.442981Z", "iopub.status.idle": "2024-02-19T08:54:26.462491Z", "shell.execute_reply": "2024-02-19T08:54:26.461996Z" } }, "outputs": [ { "ename": "NameError", "evalue": "name 'similarity' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[41], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43msimilarity\u001b[49m\n", "\u001b[0;31mNameError\u001b[0m: name 'similarity' is not defined" ] } ], "source": [ "similarity" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:26.465916Z", "iopub.status.busy": "2024-02-19T08:54:26.465540Z", "iopub.status.idle": "2024-02-19T08:54:26.484725Z", "shell.execute_reply": "2024-02-19T08:54:26.484175Z" } }, "outputs": [ { "ename": "NameError", "evalue": "name 'sorted_lists' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[42], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43msorted_lists\u001b[49m\n", "\u001b[0;31mNameError\u001b[0m: name 'sorted_lists' is not defined" ] } ], "source": [ "sorted_lists" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After running the `multimodal_search` function, the results of each query are added to the source dictionary. 
" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:26.487891Z", "iopub.status.busy": "2024-02-19T08:54:26.487505Z", "iopub.status.idle": "2024-02-19T08:54:26.494730Z", "shell.execute_reply": "2024-02-19T08:54:26.494169Z" } }, "outputs": [ { "data": { "text/plain": [ "{'img4': {'filename': 'data-test/img4.png',\n", " 'face': 'No',\n", " 'multiple_faces': 'No',\n", " 'no_faces': 0,\n", " 'wears_mask': ['No'],\n", " 'age': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'emotion': [None],\n", " 'emotion (category)': [None],\n", " 'text': 'MOODOVIN XI',\n", " 'text_language': 'en',\n", " 'text_english': 'MOODOVIN XI',\n", " 'text_clean': 'XI',\n", " 'text_summary': ' MOODOVIN XI XI: Vladimir Putin, Vladimir Vladmir Zelizer, Vladimir',\n", " 'sentiment': 'POSITIVE',\n", " 'sentiment_score': 0.66,\n", " 'entity': ['MOODOVIN XI'],\n", " 'entity_type': ['ORG'],\n", " 'const_image_summary': 'a river running through a city next to tall buildings',\n", " '3_non-deterministic_summary': ['there is a pretty house that sits above the water',\n", " 'there is a building with a balcony and lots of plants on the side of it',\n", " 'several buildings with a river flowing through it']},\n", " 'img1': {'filename': 'data-test/img1.png',\n", " 'no_faces': 0,\n", " 'age': [None],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': [None],\n", " 'multiple_faces': 'No',\n", " 'emotion': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'face': 'No',\n", " 'text': 'SCATTERING THEORY The Quantum Theory of Nonrelativistic Collisions JOHN R. TAYLOR University of Colorado',\n", " 'text_language': 'en',\n", " 'text_english': 'SCATTERING THEORY The Quantum Theory of Nonrelativistic Collisions JOHN R. TAYLOR University of Colorado',\n", " 'text_clean': 'THEORY The Quantum Theory of Collisions JOHN R. 
TAYLOR University of Colorado',\n", " 'text_summary': ' SCATTERING THEORY The Quantum Theory of Nonrelativistic Collisions',\n", " 'sentiment': 'POSITIVE',\n", " 'sentiment_score': 0.91,\n", " 'entity': ['Non',\n", " '##vist',\n", " 'Col',\n", " '##N',\n", " 'R',\n", " 'T',\n", " '##AYL',\n", " 'University of Colorado'],\n", " 'entity_type': ['MISC', 'MISC', 'MISC', 'ORG', 'PER', 'PER', 'ORG', 'ORG'],\n", " 'const_image_summary': 'a close up of a piece of paper with writing on it',\n", " '3_non-deterministic_summary': ['a book opened to the book title for a novel',\n", " 'there are many text on this page',\n", " 'the text in a book is a handwritten poem']},\n", " 'img2': {'filename': 'data-test/img2.png',\n", " 'no_faces': 0,\n", " 'age': [None],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': [None],\n", " 'multiple_faces': 'No',\n", " 'emotion': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'face': 'No',\n", " 'text': 'THE ALGEBRAIC EIGENVALUE PROBLEM DOM NVS TIO MINA Monographs on Numerical Analysis J.. H. WILKINSON',\n", " 'text_language': 'en',\n", " 'text_english': 'THE ALGEBRAIC EIGENVALUE PROBLEM DOM NVS TIO MINA Monographs on Numerical Analysis J.. H. WILKINSON',\n", " 'text_clean': 'THE PROBLEM DOM NVS TIO MINA Monographs on Numerical Analysis J .. H. WILKINSON',\n", " 'text_summary': ' H. H. W. 
WILKINSON: The Algebri',\n", " 'sentiment': 'NEGATIVE',\n", " 'sentiment_score': 0.97,\n", " 'entity': ['ALGEBRAIC EIGENVAL', 'NVS TIO MI', 'J', 'H', 'WILKINSON'],\n", " 'entity_type': ['MISC', 'ORG', 'ORG', 'ORG', 'ORG'],\n", " 'const_image_summary': 'a yellow book with green lettering on it',\n", " '3_non-deterministic_summary': ['a book cover with green writing on a black background',\n", " 'the title page of a book with information from its authors',\n", " 'a book about the age - related engineering and engineering']},\n", " 'img3': {'filename': 'data-test/img3.png',\n", " 'no_faces': 0,\n", " 'age': [None],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': [None],\n", " 'multiple_faces': 'No',\n", " 'emotion': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'face': 'No',\n", " 'text': 'm OOOO 0000 www.',\n", " 'text_language': 'en',\n", " 'text_english': 'm OOOO 0000 www.',\n", " 'text_clean': 'm www .',\n", " 'text_summary': ' www. m OOOO 0000 0000 www.m.m OOOo 0000',\n", " 'sentiment': 'NEGATIVE',\n", " 'sentiment_score': 0.62,\n", " 'entity': [],\n", " 'entity_type': [],\n", " 'const_image_summary': 'a bus that is sitting on the side of a road',\n", " '3_non-deterministic_summary': ['there are cars and a bus on the side of the road',\n", " 'a bus that is sitting in the middle of a street',\n", " 'an aerial view of an empty city street with two large buses passing by']},\n", " 'img0': {'filename': 'data-test/img0.png',\n", " 'no_faces': 0,\n", " 'age': [None],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': [None],\n", " 'multiple_faces': 'No',\n", " 'emotion': [None],\n", " 'gender': [None],\n", " 'race': [None],\n", " 'face': 'No',\n", " 'text': 'Mathematische Formelsammlung für Ingenieure und Naturwissenschaftler Mit zahlreichen Abbildungen und Rechenbeispielen und einer ausführlichen Integraltafel 3., verbesserte Auflage',\n", " 'text_language': 'de',\n", " 'text_english': 'Mathematical formula collection for engineers and 
scientists With numerous illustrations and calculation examples and a detailed integral table 3rd, improved edition',\n", " 'text_clean': 'Mathematical formula collection for engineers and scientists With numerous illustrations and calculation examples and a detailed integral table 3rd , improved edition',\n", " 'text_summary': ' Mathematical formula collection for engineers and scientists . Includes numerous illustrations and calculation examples . Includes',\n", " 'sentiment': 'POSITIVE',\n", " 'sentiment_score': 1.0,\n", " 'entity': [],\n", " 'entity_type': [],\n", " 'const_image_summary': 'a close up of an open book with writing on it',\n", " '3_non-deterministic_summary': ['a close up of a book with many languages',\n", " 'a book that is opened up in german',\n", " 'book about mathemarche formulals and their meaning']},\n", " 'img5': {'filename': 'data-test/img5.png',\n", " 'no_faces': 1,\n", " 'age': [26],\n", " 'wears_mask': ['No'],\n", " 'emotion (category)': ['Negative'],\n", " 'multiple_faces': 'No',\n", " 'emotion': ['sad'],\n", " 'gender': ['Man'],\n", " 'race': [None],\n", " 'face': 'Yes',\n", " 'text': None,\n", " 'text_language': 'en',\n", " 'text_english': '',\n", " 'text_clean': '',\n", " 'text_summary': ' CNN.com will feature iReporter photos in a weekly Travel Snapshots gallery .',\n", " 'sentiment': 'POSITIVE',\n", " 'sentiment_score': 0.75,\n", " 'entity': [],\n", " 'entity_type': [],\n", " 'const_image_summary': 'a person running on a beach near a rock formation',\n", " '3_non-deterministic_summary': ['a woman is running down the beach next to some rocks',\n", " 'a woman running along the beach by the ocean',\n", " 'there is a person running on the beach next to the ocean']}}" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "image_dict" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A special function was written to present the search results conveniently. 
" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:26.498308Z", "iopub.status.busy": "2024-02-19T08:54:26.498116Z", "iopub.status.idle": "2024-02-19T08:54:26.561629Z", "shell.execute_reply": "2024-02-19T08:54:26.560724Z" } }, "outputs": [ { "data": { "text/plain": [ "'Your search query: politician press conference'" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "'--------------------------------------------------'" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "'Results:'" ] }, "metadata": {}, "output_type": "display_data" }, { "ename": "KeyError", "evalue": "'politician press conference'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[44], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mmy_obj\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mshow_results\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 2\u001b[0m \u001b[43m \u001b[49m\u001b[43msearch_query\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;66;43;03m# you can change the index to see the results for other queries\u001b[39;49;00m\n\u001b[1;32m 3\u001b[0m \u001b[43m)\u001b[49m\n", "File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/multimodal_search.py:970\u001b[0m, in \u001b[0;36mMultimodalSearch.show_results\u001b[0;34m(self, query, itm, image_gradcam_with_itm)\u001b[0m\n\u001b[1;32m 967\u001b[0m current_querry_val \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mlist\u001b[39m(query\u001b[38;5;241m.\u001b[39mvalues())[\u001b[38;5;241m0\u001b[39m]\n\u001b[1;32m 968\u001b[0m current_querry_rank \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mrank 
\u001b[39m\u001b[38;5;124m\"\u001b[39m \u001b[38;5;241m+\u001b[39m \u001b[38;5;28mlist\u001b[39m(query\u001b[38;5;241m.\u001b[39mvalues())[\u001b[38;5;241m0\u001b[39m]\n\u001b[0;32m--> 970\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m s \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28;43msorted\u001b[39;49m\u001b[43m(\u001b[49m\n\u001b[1;32m 971\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43msubdict\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mitems\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mkey\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mlambda\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[43mt\u001b[49m\u001b[43m:\u001b[49m\u001b[43m \u001b[49m\u001b[43mt\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m1\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m[\u001b[49m\u001b[43mcurrent_querry_val\u001b[49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mreverse\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\n\u001b[1;32m 972\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m:\n\u001b[1;32m 973\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m s[\u001b[38;5;241m1\u001b[39m][current_querry_rank] \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m 974\u001b[0m \u001b[38;5;28;01mbreak\u001b[39;00m\n", "File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/multimodal_search.py:971\u001b[0m, in \u001b[0;36mMultimodalSearch.show_results..\u001b[0;34m(t)\u001b[0m\n\u001b[1;32m 967\u001b[0m current_querry_val \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mlist\u001b[39m(query\u001b[38;5;241m.\u001b[39mvalues())[\u001b[38;5;241m0\u001b[39m]\n\u001b[1;32m 968\u001b[0m current_querry_rank \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mrank \u001b[39m\u001b[38;5;124m\"\u001b[39m \u001b[38;5;241m+\u001b[39m 
\u001b[38;5;28mlist\u001b[39m(query\u001b[38;5;241m.\u001b[39mvalues())[\u001b[38;5;241m0\u001b[39m]\n\u001b[1;32m 970\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m s \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28msorted\u001b[39m(\n\u001b[0;32m--> 971\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msubdict\u001b[38;5;241m.\u001b[39mitems(), key\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mlambda\u001b[39;00m t: \u001b[43mt\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m1\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m[\u001b[49m\u001b[43mcurrent_querry_val\u001b[49m\u001b[43m]\u001b[49m, reverse\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m\n\u001b[1;32m 972\u001b[0m ):\n\u001b[1;32m 973\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m s[\u001b[38;5;241m1\u001b[39m][current_querry_rank] \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m 974\u001b[0m \u001b[38;5;28;01mbreak\u001b[39;00m\n", "\u001b[0;31mKeyError\u001b[0m: 'politician press conference'" ] } ], "source": [ "my_obj.show_results(\n", " search_query[0], # you can change the index to see the results for other queries\n", ")" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:26.564539Z", "iopub.status.busy": "2024-02-19T08:54:26.564124Z", "iopub.status.idle": "2024-02-19T08:54:26.742817Z", "shell.execute_reply": "2024-02-19T08:54:26.742251Z" } }, "outputs": [ { "data": { "text/plain": [ "'Your search query: '" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/jpeg": "", "image/png": "", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "'--------------------------------------------------'" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "'Results:'" ] }, "metadata": {}, "output_type": "display_data" }, { "ename": "KeyError", "evalue": 
"'/home/runner/work/AMMICO/AMMICO/ammico/data/test-crop-image.png'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[45], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mmy_obj\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mshow_results\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 2\u001b[0m \u001b[43m \u001b[49m\u001b[43msearch_query\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m3\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;66;43;03m# you can change the index to see the results for other queries\u001b[39;49;00m\n\u001b[1;32m 3\u001b[0m \u001b[43m)\u001b[49m\n", "File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/multimodal_search.py:970\u001b[0m, in \u001b[0;36mMultimodalSearch.show_results\u001b[0;34m(self, query, itm, image_gradcam_with_itm)\u001b[0m\n\u001b[1;32m 967\u001b[0m current_querry_val \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mlist\u001b[39m(query\u001b[38;5;241m.\u001b[39mvalues())[\u001b[38;5;241m0\u001b[39m]\n\u001b[1;32m 968\u001b[0m current_querry_rank \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mrank \u001b[39m\u001b[38;5;124m\"\u001b[39m \u001b[38;5;241m+\u001b[39m \u001b[38;5;28mlist\u001b[39m(query\u001b[38;5;241m.\u001b[39mvalues())[\u001b[38;5;241m0\u001b[39m]\n\u001b[0;32m--> 970\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m s \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28;43msorted\u001b[39;49m\u001b[43m(\u001b[49m\n\u001b[1;32m 971\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43msubdict\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mitems\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\u001b[43m 
\u001b[49m\u001b[43mkey\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mlambda\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[43mt\u001b[49m\u001b[43m:\u001b[49m\u001b[43m \u001b[49m\u001b[43mt\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m1\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m[\u001b[49m\u001b[43mcurrent_querry_val\u001b[49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mreverse\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\n\u001b[1;32m 972\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m:\n\u001b[1;32m 973\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m s[\u001b[38;5;241m1\u001b[39m][current_querry_rank] \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m 974\u001b[0m \u001b[38;5;28;01mbreak\u001b[39;00m\n", "File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/multimodal_search.py:971\u001b[0m, in \u001b[0;36mMultimodalSearch.show_results..\u001b[0;34m(t)\u001b[0m\n\u001b[1;32m 967\u001b[0m current_querry_val \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mlist\u001b[39m(query\u001b[38;5;241m.\u001b[39mvalues())[\u001b[38;5;241m0\u001b[39m]\n\u001b[1;32m 968\u001b[0m current_querry_rank \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mrank \u001b[39m\u001b[38;5;124m\"\u001b[39m \u001b[38;5;241m+\u001b[39m \u001b[38;5;28mlist\u001b[39m(query\u001b[38;5;241m.\u001b[39mvalues())[\u001b[38;5;241m0\u001b[39m]\n\u001b[1;32m 970\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m s \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28msorted\u001b[39m(\n\u001b[0;32m--> 971\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msubdict\u001b[38;5;241m.\u001b[39mitems(), key\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mlambda\u001b[39;00m t: \u001b[43mt\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m1\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m[\u001b[49m\u001b[43mcurrent_querry_val\u001b[49m\u001b[43m]\u001b[49m, 
reverse\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m\n\u001b[1;32m 972\u001b[0m ):\n\u001b[1;32m 973\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m s[\u001b[38;5;241m1\u001b[39m][current_querry_rank] \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m 974\u001b[0m \u001b[38;5;28;01mbreak\u001b[39;00m\n", "\u001b[0;31mKeyError\u001b[0m: '/home/runner/work/AMMICO/AMMICO/ammico/data/test-crop-image.png'" ] } ], "source": [ "my_obj.show_results(\n", " search_query[3], # you can change the index to see the results for other queries\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Improve the search results\n", "\n", "For even better results, a slightly different approach has been prepared that can improve the search results. It is quite resource-intensive, so it is applied only after the main algorithm has found the most relevant images. This approach works only with text queries; image queries are skipped. You can choose among three models: `\"blip_base\"`, `\"blip_large\"`, `\"blip2_coco\"`. If you get an `Out of Memory` error, try reducing the `batch_size` value (minimum = 1), which is the number of images processed simultaneously. With the parameter `need_grad_cam = True/False` you can enable the calculation of a heat map for each processed image; the heat maps are saved in `image_gradcam_with_itm`. The `image_text_match_reordering()` function then calculates new similarity values and new ranks for each image. The resulting values are added to the general dictionary."
] }, { "cell_type": "code", "execution_count": 46, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:26.746808Z", "iopub.status.busy": "2024-02-19T08:54:26.746601Z", "iopub.status.idle": "2024-02-19T08:54:26.749457Z", "shell.execute_reply": "2024-02-19T08:54:26.748884Z" } }, "outputs": [], "source": [ "itm_model = \"blip_base\"\n", "# itm_model = \"blip_large\"\n", "# itm_model = \"blip2_coco\"" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:26.752555Z", "iopub.status.busy": "2024-02-19T08:54:26.752084Z", "iopub.status.idle": "2024-02-19T08:54:26.773035Z", "shell.execute_reply": "2024-02-19T08:54:26.772413Z" } }, "outputs": [ { "ename": "NameError", "evalue": "name 'image_keys' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[47], line 4\u001b[0m\n\u001b[1;32m 1\u001b[0m itm_scores, image_gradcam_with_itm \u001b[38;5;241m=\u001b[39m my_obj\u001b[38;5;241m.\u001b[39mimage_text_match_reordering(\n\u001b[1;32m 2\u001b[0m search_query,\n\u001b[1;32m 3\u001b[0m itm_model,\n\u001b[0;32m----> 4\u001b[0m \u001b[43mimage_keys\u001b[49m,\n\u001b[1;32m 5\u001b[0m sorted_lists,\n\u001b[1;32m 6\u001b[0m batch_size\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m1\u001b[39m,\n\u001b[1;32m 7\u001b[0m need_grad_cam\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m,\n\u001b[1;32m 8\u001b[0m )\n", "\u001b[0;31mNameError\u001b[0m: name 'image_keys' is not defined" ] } ], "source": [ "itm_scores, image_gradcam_with_itm = my_obj.image_text_match_reordering(\n", " search_query,\n", " itm_model,\n", " image_keys,\n", " sorted_lists,\n", " batch_size=1,\n", " need_grad_cam=True,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then using the same output function you can add the 
`itm=True` argument to output the new image order. Remember that for image queries, an error will be thrown if `itm=True` is set. You can also pass `image_gradcam_with_itm` along with `itm=True` to output the heat maps of the processed images." ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:26.775851Z", "iopub.status.busy": "2024-02-19T08:54:26.775512Z", "iopub.status.idle": "2024-02-19T08:54:26.795328Z", "shell.execute_reply": "2024-02-19T08:54:26.794723Z" } }, "outputs": [ { "ename": "NameError", "evalue": "name 'image_gradcam_with_itm' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[48], line 2\u001b[0m\n\u001b[1;32m 1\u001b[0m my_obj\u001b[38;5;241m.\u001b[39mshow_results(\n\u001b[0;32m----> 2\u001b[0m search_query[\u001b[38;5;241m0\u001b[39m], itm\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m, image_gradcam_with_itm\u001b[38;5;241m=\u001b[39m\u001b[43mimage_gradcam_with_itm\u001b[49m\n\u001b[1;32m 3\u001b[0m )\n", "\u001b[0;31mNameError\u001b[0m: name 'image_gradcam_with_itm' is not defined" ] } ], "source": [ "my_obj.show_results(\n", " search_query[0], itm=True, image_gradcam_with_itm=image_gradcam_with_itm\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Save search results to csv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convert the dictionary of dictionaries into a dictionary with lists:" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:26.799498Z", "iopub.status.busy": "2024-02-19T08:54:26.799139Z", "iopub.status.idle": "2024-02-19T08:54:26.818722Z", "shell.execute_reply": "2024-02-19T08:54:26.818206Z" } }, "outputs": [ { "ename": 
"AttributeError", "evalue": "module 'ammico' has no attribute 'append_data_to_dict'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[49], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m outdict \u001b[38;5;241m=\u001b[39m \u001b[43mammico\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mappend_data_to_dict\u001b[49m(image_dict)\n\u001b[1;32m 2\u001b[0m df \u001b[38;5;241m=\u001b[39m ammico\u001b[38;5;241m.\u001b[39mdump_df(outdict)\n", "\u001b[0;31mAttributeError\u001b[0m: module 'ammico' has no attribute 'append_data_to_dict'" ] } ], "source": [ "outdict = ammico.append_data_to_dict(image_dict)\n", "df = ammico.dump_df(outdict)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check the dataframe:" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:26.822275Z", "iopub.status.busy": "2024-02-19T08:54:26.821902Z", "iopub.status.idle": "2024-02-19T08:54:26.841826Z", "shell.execute_reply": "2024-02-19T08:54:26.841214Z" } }, "outputs": [ { "ename": "NameError", "evalue": "name 'df' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[50], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mdf\u001b[49m\u001b[38;5;241m.\u001b[39mhead(\u001b[38;5;241m10\u001b[39m)\n", "\u001b[0;31mNameError\u001b[0m: name 'df' is not defined" ] } ], "source": [ "df.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Write the csv file:" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:26.845662Z", "iopub.status.busy": 
"2024-02-19T08:54:26.845309Z", "iopub.status.idle": "2024-02-19T08:54:26.864880Z", "shell.execute_reply": "2024-02-19T08:54:26.864321Z" } }, "outputs": [ { "ename": "NameError", "evalue": "name 'df' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[51], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mdf\u001b[49m\u001b[38;5;241m.\u001b[39mto_csv(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m/content/drive/MyDrive/misinformation-data/data_out.csv\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n", "\u001b[0;31mNameError\u001b[0m: name 'df' is not defined" ] } ], "source": [ "df.to_csv(\"/content/drive/MyDrive/misinformation-data/data_out.csv\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Color analysis of pictures\n", "\n", "This module performs a primary color analysis of color images using the K-Means algorithm.\n", "The output is the N primary colors and their corresponding percentages." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To check the analysis, you can inspect the analyzed elements here. Loading the results takes a moment, so please be patient. If you are sure of what you are doing, you can skip this and directly export a csv file in the step below.\n", "Here, we display the color detection results provided by the `colorgram` and `colour` libraries. Click on the tabs to see the results in the right sidebar. You may need to increment the `port` number if you are already running several notebook instances on the same server."
] }, { "cell_type": "code", "execution_count": 52, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:26.868479Z", "iopub.status.busy": "2024-02-19T08:54:26.868084Z", "iopub.status.idle": "2024-02-19T08:54:26.911851Z", "shell.execute_reply": "2024-02-19T08:54:26.911132Z" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "analysis_explorer = ammico.AnalysisExplorer(image_dict)\n", "analysis_explorer.run_server(port = 8057)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Instead of inspecting each of the images, you can also directly carry out the analysis and export the result into a csv. This may take a while depending on how many images you have loaded." ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:26.917049Z", "iopub.status.busy": "2024-02-19T08:54:26.916702Z", "iopub.status.idle": "2024-02-19T08:54:38.080384Z", "shell.execute_reply": "2024-02-19T08:54:38.079754Z" } }, "outputs": [], "source": [ "for key in image_dict.keys():\n", " image_dict[key] = ammico.colors.ColorDetector(image_dict[key]).analyse_image()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These steps are required to convert the dictionary of dictionaries into a dictionary with lists, which can be converted into a pandas dataframe and exported to a csv file."
] }, { "cell_type": "code", "execution_count": 54, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:38.085405Z", "iopub.status.busy": "2024-02-19T08:54:38.084973Z", "iopub.status.idle": "2024-02-19T08:54:38.089091Z", "shell.execute_reply": "2024-02-19T08:54:38.088517Z" } }, "outputs": [], "source": [ "df = ammico.get_dataframe(image_dict)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check the dataframe:" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:38.097290Z", "iopub.status.busy": "2024-02-19T08:54:38.096874Z", "iopub.status.idle": "2024-02-19T08:54:38.119692Z", "shell.execute_reply": "2024-02-19T08:54:38.119109Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
filenamefacemultiple_facesno_faceswears_maskagegenderraceemotionemotion (category)...blueyellowcyanorangepurplepinkbrowngreywhiteblack
0data-test/img4.pngNoNo0[No][None][None][None][None][None]...0.160.00000.0000.100.420.050.21
1data-test/img1.pngNoNo0[No][None][None][None][None][None]...0.000.00000.0000.000.960.000.04
2data-test/img2.pngNoNo0[No][None][None][None][None][None]...0.000.75000.0000.040.150.000.02
3data-test/img3.pngNoNo0[No][None][None][None][None][None]...0.000.00000.0200.060.920.010.00
4data-test/img0.pngNoNo0[No][None][None][None][None][None]...0.000.00000.0000.000.980.000.02
5data-test/img5.pngYesNo1[No][26][Man][None][sad][Negative]...0.120.00000.0000.020.500.000.00
\n", "

6 rows × 33 columns

\n", "
" ], "text/plain": [ " filename face multiple_faces no_faces wears_mask age \\\n", "0 data-test/img4.png No No 0 [No] [None] \n", "1 data-test/img1.png No No 0 [No] [None] \n", "2 data-test/img2.png No No 0 [No] [None] \n", "3 data-test/img3.png No No 0 [No] [None] \n", "4 data-test/img0.png No No 0 [No] [None] \n", "5 data-test/img5.png Yes No 1 [No] [26] \n", "\n", " gender race emotion emotion (category) ... blue yellow cyan orange \\\n", "0 [None] [None] [None] [None] ... 0.16 0.00 0 0 \n", "1 [None] [None] [None] [None] ... 0.00 0.00 0 0 \n", "2 [None] [None] [None] [None] ... 0.00 0.75 0 0 \n", "3 [None] [None] [None] [None] ... 0.00 0.00 0 0 \n", "4 [None] [None] [None] [None] ... 0.00 0.00 0 0 \n", "5 [Man] [None] [sad] [Negative] ... 0.12 0.00 0 0 \n", "\n", " purple pink brown grey white black \n", "0 0.00 0 0.10 0.42 0.05 0.21 \n", "1 0.00 0 0.00 0.96 0.00 0.04 \n", "2 0.00 0 0.04 0.15 0.00 0.02 \n", "3 0.02 0 0.06 0.92 0.01 0.00 \n", "4 0.00 0 0.00 0.98 0.00 0.02 \n", "5 0.00 0 0.02 0.50 0.00 0.00 \n", "\n", "[6 rows x 33 columns]" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Write the csv file - here you should provide a file path and file name for the csv file to be written." 
] }, { "cell_type": "code", "execution_count": 56, "metadata": { "execution": { "iopub.execute_input": "2024-02-19T08:54:38.124073Z", "iopub.status.busy": "2024-02-19T08:54:38.123661Z", "iopub.status.idle": "2024-02-19T08:54:38.203962Z", "shell.execute_reply": "2024-02-19T08:54:38.203276Z" } }, "outputs": [ { "ename": "OSError", "evalue": "Cannot save file into a non-existent directory: '/content/drive/MyDrive/misinformation-data'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mOSError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[56], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mdf\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mto_csv\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43m/content/drive/MyDrive/misinformation-data/data_out.csv\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/pandas/util/_decorators.py:333\u001b[0m, in \u001b[0;36mdeprecate_nonkeyword_arguments..decorate..wrapper\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 327\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(args) \u001b[38;5;241m>\u001b[39m num_allow_args:\n\u001b[1;32m 328\u001b[0m warnings\u001b[38;5;241m.\u001b[39mwarn(\n\u001b[1;32m 329\u001b[0m msg\u001b[38;5;241m.\u001b[39mformat(arguments\u001b[38;5;241m=\u001b[39m_format_argument_list(allow_args)),\n\u001b[1;32m 330\u001b[0m \u001b[38;5;167;01mFutureWarning\u001b[39;00m,\n\u001b[1;32m 331\u001b[0m stacklevel\u001b[38;5;241m=\u001b[39mfind_stack_level(),\n\u001b[1;32m 332\u001b[0m )\n\u001b[0;32m--> 333\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mfunc\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m 
\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/pandas/core/generic.py:3961\u001b[0m, in \u001b[0;36mNDFrame.to_csv\u001b[0;34m(self, path_or_buf, sep, na_rep, float_format, columns, header, index, index_label, mode, encoding, compression, quoting, quotechar, lineterminator, chunksize, date_format, doublequote, escapechar, decimal, errors, storage_options)\u001b[0m\n\u001b[1;32m 3950\u001b[0m df \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(\u001b[38;5;28mself\u001b[39m, ABCDataFrame) \u001b[38;5;28;01melse\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mto_frame()\n\u001b[1;32m 3952\u001b[0m formatter \u001b[38;5;241m=\u001b[39m DataFrameFormatter(\n\u001b[1;32m 3953\u001b[0m frame\u001b[38;5;241m=\u001b[39mdf,\n\u001b[1;32m 3954\u001b[0m header\u001b[38;5;241m=\u001b[39mheader,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 3958\u001b[0m decimal\u001b[38;5;241m=\u001b[39mdecimal,\n\u001b[1;32m 3959\u001b[0m )\n\u001b[0;32m-> 3961\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mDataFrameRenderer\u001b[49m\u001b[43m(\u001b[49m\u001b[43mformatter\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mto_csv\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 3962\u001b[0m \u001b[43m \u001b[49m\u001b[43mpath_or_buf\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3963\u001b[0m \u001b[43m \u001b[49m\u001b[43mlineterminator\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mlineterminator\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3964\u001b[0m \u001b[43m \u001b[49m\u001b[43msep\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43msep\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3965\u001b[0m \u001b[43m 
\u001b[49m\u001b[43mencoding\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mencoding\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3966\u001b[0m \u001b[43m \u001b[49m\u001b[43merrors\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43merrors\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3967\u001b[0m \u001b[43m \u001b[49m\u001b[43mcompression\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcompression\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3968\u001b[0m \u001b[43m \u001b[49m\u001b[43mquoting\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mquoting\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3969\u001b[0m \u001b[43m \u001b[49m\u001b[43mcolumns\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcolumns\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3970\u001b[0m \u001b[43m \u001b[49m\u001b[43mindex_label\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mindex_label\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3971\u001b[0m \u001b[43m \u001b[49m\u001b[43mmode\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmode\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3972\u001b[0m \u001b[43m \u001b[49m\u001b[43mchunksize\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mchunksize\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3973\u001b[0m \u001b[43m \u001b[49m\u001b[43mquotechar\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mquotechar\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3974\u001b[0m \u001b[43m \u001b[49m\u001b[43mdate_format\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdate_format\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3975\u001b[0m \u001b[43m \u001b[49m\u001b[43mdoublequote\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdoublequote\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3976\u001b[0m \u001b[43m \u001b[49m\u001b[43mescapechar\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mescapechar\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3977\u001b[0m \u001b[43m 
\u001b[49m\u001b[43mstorage_options\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mstorage_options\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 3978\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/pandas/io/formats/format.py:1014\u001b[0m, in \u001b[0;36mDataFrameRenderer.to_csv\u001b[0;34m(self, path_or_buf, encoding, sep, columns, index_label, mode, compression, quoting, quotechar, lineterminator, chunksize, date_format, doublequote, escapechar, errors, storage_options)\u001b[0m\n\u001b[1;32m 993\u001b[0m created_buffer \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mFalse\u001b[39;00m\n\u001b[1;32m 995\u001b[0m csv_formatter \u001b[38;5;241m=\u001b[39m CSVFormatter(\n\u001b[1;32m 996\u001b[0m path_or_buf\u001b[38;5;241m=\u001b[39mpath_or_buf,\n\u001b[1;32m 997\u001b[0m lineterminator\u001b[38;5;241m=\u001b[39mlineterminator,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 1012\u001b[0m formatter\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mfmt,\n\u001b[1;32m 1013\u001b[0m )\n\u001b[0;32m-> 1014\u001b[0m \u001b[43mcsv_formatter\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43msave\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 1016\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m created_buffer:\n\u001b[1;32m 1017\u001b[0m \u001b[38;5;28;01massert\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(path_or_buf, StringIO)\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/pandas/io/formats/csvs.py:251\u001b[0m, in \u001b[0;36mCSVFormatter.save\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 247\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 248\u001b[0m \u001b[38;5;124;03mCreate the writer & save.\u001b[39;00m\n\u001b[1;32m 249\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 250\u001b[0m \u001b[38;5;66;03m# apply compression and byte/text 
conversion\u001b[39;00m\n\u001b[0;32m--> 251\u001b[0m \u001b[38;5;28;01mwith\u001b[39;00m \u001b[43mget_handle\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 252\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfilepath_or_buffer\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 253\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmode\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 254\u001b[0m \u001b[43m \u001b[49m\u001b[43mencoding\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mencoding\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 255\u001b[0m \u001b[43m \u001b[49m\u001b[43merrors\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43merrors\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 256\u001b[0m \u001b[43m \u001b[49m\u001b[43mcompression\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mcompression\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 257\u001b[0m \u001b[43m \u001b[49m\u001b[43mstorage_options\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mstorage_options\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 258\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m \u001b[38;5;28;01mas\u001b[39;00m handles:\n\u001b[1;32m 259\u001b[0m \u001b[38;5;66;03m# Note: self.encoding is irrelevant here\u001b[39;00m\n\u001b[1;32m 260\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mwriter \u001b[38;5;241m=\u001b[39m csvlib\u001b[38;5;241m.\u001b[39mwriter(\n\u001b[1;32m 261\u001b[0m handles\u001b[38;5;241m.\u001b[39mhandle,\n\u001b[1;32m 262\u001b[0m lineterminator\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mlineterminator,\n\u001b[0;32m 
(...)\u001b[0m\n\u001b[1;32m 267\u001b[0m quotechar\u001b[38;5;241m=\u001b[39m\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mquotechar,\n\u001b[1;32m 268\u001b[0m )\n\u001b[1;32m 270\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_save()\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/pandas/io/common.py:749\u001b[0m, in \u001b[0;36mget_handle\u001b[0;34m(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)\u001b[0m\n\u001b[1;32m 747\u001b[0m \u001b[38;5;66;03m# Only for write methods\u001b[39;00m\n\u001b[1;32m 748\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mr\u001b[39m\u001b[38;5;124m\"\u001b[39m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;129;01min\u001b[39;00m mode \u001b[38;5;129;01mand\u001b[39;00m is_path:\n\u001b[0;32m--> 749\u001b[0m \u001b[43mcheck_parent_directory\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mstr\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43mhandle\u001b[49m\u001b[43m)\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 751\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m compression:\n\u001b[1;32m 752\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m compression \u001b[38;5;241m!=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mzstd\u001b[39m\u001b[38;5;124m\"\u001b[39m:\n\u001b[1;32m 753\u001b[0m \u001b[38;5;66;03m# compression libraries do not like an explicit text-mode\u001b[39;00m\n", "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/pandas/io/common.py:616\u001b[0m, in \u001b[0;36mcheck_parent_directory\u001b[0;34m(path)\u001b[0m\n\u001b[1;32m 614\u001b[0m parent \u001b[38;5;241m=\u001b[39m Path(path)\u001b[38;5;241m.\u001b[39mparent\n\u001b[1;32m 615\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m parent\u001b[38;5;241m.\u001b[39mis_dir():\n\u001b[0;32m--> 616\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m 
\u001b[38;5;167;01mOSError\u001b[39;00m(\u001b[38;5;124mrf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mCannot save file into a non-existent directory: \u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mparent\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n", "\u001b[0;31mOSError\u001b[0m: Cannot save file into a non-existent directory: '/content/drive/MyDrive/misinformation-data'" ] } ], "source": [ "df.to_csv(\"/content/drive/MyDrive/misinformation-data/data_out.csv\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Further detector modules\n", "Further detector modules exist. It is also possible to carry out a topic analysis on the text data and to crop social media posts automatically. These features are more experimental and have their own demonstration notebooks." ] }, { "cell_type": "markdown", "metadata": {}, "source": [] } ], "metadata": { "kernelspec": { "display_name": "ammico", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.18" } }, "nbformat": 4, "nbformat_minor": 2 }