AMMICO/build/html/notebooks/Example text.ipynb

5838 строки
254 KiB
Plaintext
Исходник Ответственный История

Этот файл содержит невидимые символы Юникода

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "dcaa3da1",
"metadata": {},
"source": [
"# Notebook for text extraction on image\n",
"\n",
"The text extraction and analysis is carried out using a variety of tools: \n",
"\n",
"1. Text extraction from the image using [google-cloud-vision](https://cloud.google.com/vision) \n",
"1. Language detection of the extracted text using [Googletrans](https://py-googletrans.readthedocs.io/en/latest/) \n",
"1. Translation into English or other languages using [Googletrans](https://py-googletrans.readthedocs.io/en/latest/) \n",
"1. Cleaning of the text using [spacy](https://spacy.io/) \n",
"1. Spell-check using [TextBlob](https://textblob.readthedocs.io/en/dev/index.html) \n",
"1. Subjectivity analysis using [TextBlob](https://textblob.readthedocs.io/en/dev/index.html) \n",
"1. Text summarization using [transformers](https://huggingface.co/docs/transformers/index) pipelines\n",
"1. Sentiment analysis using [transformers](https://huggingface.co/docs/transformers/index) pipelines \n",
"1. Named entity recognition using [transformers](https://huggingface.co/docs/transformers/index) pipelines \n",
"1. Topic analysis using [BERTopic](https://github.com/MaartenGr/BERTopic) \n",
"\n",
"The first cell is only run on google colab and installs the [ammico](https://github.com/ssciwr/AMMICO) package.\n",
"\n",
"After that, we can import `ammico` and read in the files given a folder path."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "f43f327c",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-26T09:36:45.337109Z",
"iopub.status.busy": "2023-05-26T09:36:45.336832Z",
"iopub.status.idle": "2023-05-26T09:36:45.346856Z",
"shell.execute_reply": "2023-05-26T09:36:45.345710Z"
}
},
"outputs": [],
"source": [
"# if running on google colab\n",
"# flake8-noqa-cell\n",
"import os\n",
"\n",
"if \"google.colab\" in str(get_ipython()):\n",
" # update python version\n",
" # install setuptools\n",
" # %pip install setuptools==61 -qqq\n",
" # install ammico\n",
" %pip install git+https://github.com/ssciwr/ammico.git -qqq\n",
" # mount google drive for data and API key\n",
" from google.colab import drive\n",
"\n",
" drive.mount(\"/content/drive\")"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "cf362e60",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-26T09:36:45.350065Z",
"iopub.status.busy": "2023-05-26T09:36:45.349632Z",
"iopub.status.idle": "2023-05-26T09:37:00.245786Z",
"shell.execute_reply": "2023-05-26T09:37:00.245076Z"
}
},
"outputs": [],
"source": [
"import os\n",
"import ammico\n",
"from ammico import utils as mutils\n",
"from ammico import display as mdisplay"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "fddba721",
"metadata": {},
"source": [
"We select a subset of image files to try the text extraction on, see the `limit` keyword. The `find_files` function finds image files within a given directory: "
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "27675810",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-26T09:37:00.250044Z",
"iopub.status.busy": "2023-05-26T09:37:00.249223Z",
"iopub.status.idle": "2023-05-26T09:37:00.253515Z",
"shell.execute_reply": "2023-05-26T09:37:00.252836Z"
}
},
"outputs": [],
"source": [
"# Here you need to provide the path to your google drive folder\n",
"# or local folder containing the images\n",
"images = mutils.find_files(\n",
" path=\"data/\",\n",
" limit=10,\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "3a7dfe11",
"metadata": {},
"source": [
"We need to initialize the main dictionary that contains all information for the images and is updated through each subsequent analysis:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "8b32409f",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-26T09:37:00.256652Z",
"iopub.status.busy": "2023-05-26T09:37:00.256215Z",
"iopub.status.idle": "2023-05-26T09:37:00.259492Z",
"shell.execute_reply": "2023-05-26T09:37:00.258807Z"
}
},
"outputs": [],
"source": [
"mydict = mutils.initialize_dict(images)"
]
},
{
"cell_type": "markdown",
"id": "7b8b929f",
"metadata": {},
"source": [
"## Google cloud vision API\n",
"\n",
"For this you need an API key and have the app activated in your google console. The first 1000 images per month are free (July 2022)."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "cbf74c0b-52fe-4fb8-b617-f18611e8f986",
"metadata": {},
"source": [
"```\n",
"os.environ[\n",
" \"GOOGLE_APPLICATION_CREDENTIALS\"\n",
"] = \"your-credentials.json\"\n",
"```"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "0891b795-c7fe-454c-a45d-45fadf788142",
"metadata": {},
"source": [
"## Inspect the elements per image\n",
"To check the analysis, you can inspect the analyzed elements here. Loading the results takes a moment, so please be patient. If you are sure of what you are doing, you can skip this and directly export a csv file in the step below.\n",
"Here, we display the text extraction and translation results provided by the above libraries. Click on the tabs to see the results in the right sidebar. You may need to increment the `port` number if you are already running several notebook instances on the same server."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "7c6ecc88",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-26T09:37:00.262694Z",
"iopub.status.busy": "2023-05-26T09:37:00.262269Z",
"iopub.status.idle": "2023-05-26T09:37:00.311831Z",
"shell.execute_reply": "2023-05-26T09:37:00.311083Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dash is running on http://127.0.0.1:8054/\n",
"\n"
]
},
{
"data": {
"text/html": [
"\n",
" <iframe\n",
" width=\"100%\"\n",
" height=\"650\"\n",
" src=\"http://127.0.0.1:8054/\"\n",
" frameborder=\"0\"\n",
" allowfullscreen\n",
" \n",
" ></iframe>\n",
" "
],
"text/plain": [
"<IPython.lib.display.IFrame at 0x7fcb9e371640>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"analysis_explorer = mdisplay.AnalysisExplorer(mydict, identify=\"text-on-image\")\n",
"analysis_explorer.run_server(port=8054)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "9c3e72b5-0e57-4019-b45e-3e36a74e7f52",
"metadata": {},
"source": [
"## Or directly analyze for further processing\n",
"Instead of inspecting each of the images, you can also directly carry out the analysis and export the result into a csv. This may take a while depending on how many images you have loaded. Set the keyword `analyse_text` to `True` if you want the text to be analyzed (spell check, subjectivity, text summary, sentiment, NER)."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "365c78b1-7ff4-4213-86fa-6a0a2d05198f",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-26T09:37:00.315445Z",
"iopub.status.busy": "2023-05-26T09:37:00.314976Z",
"iopub.status.idle": "2023-05-26T09:38:33.112221Z",
"shell.execute_reply": "2023-05-26T09:38:33.102145Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collecting en-core-web-md==3.5.0\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_md-3.5.0/en_core_web_md-3.5.0-py3-none-any.whl (42.8 MB)\n",
"\u001b[?25l 0.0/42.8 MB ? eta -:--:--\r",
"\u001b[2K 0.1/42.8 MB 2.6 MB/s eta 0:00:17\r",
"\u001b[2K 0.5/42.8 MB 6.5 MB/s eta 0:00:07\r",
"\u001b[2K ╸ 0.8/42.8 MB 8.0 MB/s eta 0:00:06\r",
"\u001b[2K ━ 1.3/42.8 MB 9.0 MB/s eta 0:00:05\r",
"\u001b[2K ━╸ 1.8/42.8 MB 10.0 MB/s eta 0:00:05"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"\u001b[2K ━━ 2.3/42.8 MB 10.9 MB/s eta 0:00:04\r",
"\u001b[2K ━━╸ 2.9/42.8 MB 11.9 MB/s eta 0:00:04\r",
"\u001b[2K ━━━ 3.7/42.8 MB 13.0 MB/s eta 0:00:04\r",
"\u001b[2K ━━━━ 4.5/42.8 MB 14.1 MB/s eta 0:00:03\r",
"\u001b[2K ━━━━╸ 5.3/42.8 MB 15.0 MB/s eta 0:00:03\r",
"\u001b[2K ━━━━━╸ 6.3/42.8 MB 16.3 MB/s eta 0:00:03"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"\u001b[2K ━━━━━━━ 7.5/42.8 MB 17.7 MB/s eta 0:00:02\r",
"\u001b[2K ━━━━━━━━ 8.8/42.8 MB 19.1 MB/s eta 0:00:02\r",
"\u001b[2K ━━━━━━━━━╸ 10.3/42.8 MB 21.9 MB/s eta 0:00:02\r",
"\u001b[2K ━━━━━━━━━━━ 11.9/42.8 MB 28.6 MB/s eta 0:00:02\r",
"\u001b[2K ━━━━━━━━━━━━╸ 13.8/42.8 MB 35.7 MB/s eta 0:00:01\r",
"\u001b[2K ━━━━━━━━━━━━━━━ 16.0/42.8 MB 44.4 MB/s eta 0:00:01"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"\u001b[2K ━━━━━━━━━━━━━━━━━ 18.5/42.8 MB 53.6 MB/s eta 0:00:01\r",
"\u001b[2K ━━━━━━━━━━━━━━━━━━━╸ 21.3/42.8 MB 64.1 MB/s eta 0:00:01\r",
"\u001b[2K ━━━━━━━━━━━━━━━━━━━━━━╸ 24.5/42.8 MB 75.8 MB/s eta 0:00:01\r",
"\u001b[2K ━━━━━━━━━━━━━━━━━━━━━━━━━━ 28.2/42.8 MB 89.3 MB/s eta 0:00:01\r",
"\u001b[2K ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 32.2/42.8 MB 104.0 MB/s eta 0:00:01\r",
"\u001b[2K ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 36.5/42.8 MB 115.8 MB/s eta 0:00:01"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"\u001b[2K ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸ 41.5/42.8 MB 132.1 MB/s eta 0:00:01\r",
"\u001b[2K ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸ 42.8/42.8 MB 134.8 MB/s eta 0:00:01\r",
"\u001b[2K ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸ 42.8/42.8 MB 134.8 MB/s eta 0:00:01\r",
"\u001b[2K ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸ 42.8/42.8 MB 134.8 MB/s eta 0:00:01\r",
"\u001b[2K ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸ 42.8/42.8 MB 134.8 MB/s eta 0:00:01\r",
"\u001b[2K ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 42.8/42.8 MB 47.4 MB/s eta 0:00:00\n",
"\u001b[?25h"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: spacy<3.6.0,>=3.5.0 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from en-core-web-md==3.5.0) (3.5.3)\n",
"Requirement already satisfied: spacy-legacy<3.1.0,>=3.0.11 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (3.0.12)\n",
"Requirement already satisfied: spacy-loggers<2.0.0,>=1.0.0 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (1.0.4)\n",
"Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (1.0.9)\n",
"Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (2.0.7)\n",
"Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (3.0.8)\n",
"Requirement already satisfied: thinc<8.2.0,>=8.1.8 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (8.1.10)\n",
"Requirement already satisfied: wasabi<1.2.0,>=0.9.1 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (1.1.1)\n",
"Requirement already satisfied: srsly<3.0.0,>=2.4.3 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (2.4.6)\n",
"Requirement already satisfied: catalogue<2.1.0,>=2.0.6 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (2.0.8)\n",
"Requirement already satisfied: typer<0.8.0,>=0.3.0 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (0.7.0)\n",
"Requirement already satisfied: pathy>=0.10.0 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (0.10.1)\n",
"Requirement already satisfied: smart-open<7.0.0,>=5.2.1 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (6.3.0)\n",
"Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (4.65.0)\n",
"Requirement already satisfied: numpy>=1.15.0 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (1.23.4)\n",
"Requirement already satisfied: requests<3.0.0,>=2.13.0 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (2.31.0)\n",
"Requirement already satisfied: pydantic!=1.8,!=1.8.1,<1.11.0,>=1.7.4 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (1.10.8)\n",
"Requirement already satisfied: jinja2 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (3.1.2)\n",
"Requirement already satisfied: setuptools in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (58.1.0)\n",
"Requirement already satisfied: packaging>=20.0 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (23.1)\n",
"Requirement already satisfied: langcodes<4.0.0,>=3.2.0 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (3.3.0)\n",
"Requirement already satisfied: typing-extensions>=4.2.0 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from pydantic!=1.8,!=1.8.1,<1.11.0,>=1.7.4->spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (4.6.2)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from requests<3.0.0,>=2.13.0->spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (3.1.0)\n",
"Requirement already satisfied: idna<4,>=2.5 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from requests<3.0.0,>=2.13.0->spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (2.10)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from requests<3.0.0,>=2.13.0->spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (1.26.16)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from requests<3.0.0,>=2.13.0->spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (2023.5.7)\n",
"Requirement already satisfied: blis<0.8.0,>=0.7.8 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from thinc<8.2.0,>=8.1.8->spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (0.7.9)\n",
"Requirement already satisfied: confection<1.0.0,>=0.0.1 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from thinc<8.2.0,>=8.1.8->spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (0.0.4)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: click<9.0.0,>=7.1.1 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from typer<0.8.0,>=0.3.0->spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (8.1.3)\n",
"Requirement already satisfied: MarkupSafe>=2.0 in /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages (from jinja2->spacy<3.6.0,>=3.5.0->en-core-web-md==3.5.0) (2.1.2)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Installing collected packages: en-core-web-md\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Successfully installed en-core-web-md-3.5.0\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"[notice] A new release of pip is available: 22.0.4 -> 23.1.2\n",
"[notice] To update, run: pip install --upgrade pip\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[38;5;2m✔ Download and installation successful\u001b[0m\n",
"You can now load the package via spacy.load('en_core_web_md')\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "5237f9c534e24937b5376273ce58b8df",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading (…)/a4f8f3e/config.json: 0%| | 0.00/1.80k [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "1f2193c670c640c88001c750be5ba3e4",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading pytorch_model.bin: 0%| | 0.00/1.22G [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "6ab131dea3604ea9b53980bc469559ad",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading (…)okenizer_config.json: 0%| | 0.00/26.0 [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "3b5394cc3f5c47beafb332747fba8918",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading (…)e/a4f8f3e/vocab.json: 0%| | 0.00/899k [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "3850a5b6f6d642ff8e4bf76bdde5c740",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading (…)e/a4f8f3e/merges.txt: 0%| | 0.00/456k [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"400 DEATHS GET E-BOOK X AN Corporation ncy Services A municipal worker sprays disinfectant on his colleague after they cremated the body of a man who died due to COVID-19 at a crematorium in Ahmedabad on April 12, 2020. | Photo Credit: REUTERS\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "c79b6fc5d57846fb85d5a4abe6f25d1f",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading (…)/af0f99b/config.json: 0%| | 0.00/629 [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "0440bf3b407c4dc4b87333caabc50b84",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading pytorch_model.bin: 0%| | 0.00/268M [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "bc3946fbb9334b6ba7265b4cb6bb7d54",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading (…)okenizer_config.json: 0%| | 0.00/48.0 [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "659303a039f64e3380e704a92deb30c2",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading (…)ve/af0f99b/vocab.txt: 0%| | 0.00/232k [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "662898f7cc534c1a83ea051f1abdf1d9",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading (…)/f2482bf/config.json: 0%| | 0.00/998 [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "2b22abf7e7144f499902318115731078",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading pytorch_model.bin: 0%| | 0.00/1.33G [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "796df27060bd4a6883869d7076db72fc",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading (…)okenizer_config.json: 0%| | 0.00/60.0 [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "c0631177f2a44954a838cd01e33c16f2",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading (…)ve/f2482bf/vocab.txt: 0%| | 0.00/213k [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"CORONAVIRUS QUARANTINE CORONAVIRUS OUTBREAK BEYOND THIS POINT 2019-nCov Coronavirus\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"NEWS URGENT SAMSUNG LIVE Rio de Janeiro NEW COUNTING METHOD RJ CITY HALL EXCLUDES 1,177 DEAD FROM COVID-19 STATISTICS G1 ARE TARGET OF PF OPERATION AGAINST FAKE NEWS OPERATION IS SEEN ON THE PLANALTO AT 10:09\n"
]
}
],
"source": [
"for key in mydict:\n",
" mydict[key] = ammico.text.TextDetector(\n",
" mydict[key], analyse_text=True\n",
" ).analyse_image()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "3c063eda",
"metadata": {},
"source": [
"## Convert to dataframe and write csv\n",
"These steps are required to convert the dictionary of dictionarys into a dictionary with lists, that can be converted into a pandas dataframe and exported to a csv file."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "5709c2cd",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-26T09:38:33.139127Z",
"iopub.status.busy": "2023-05-26T09:38:33.137692Z",
"iopub.status.idle": "2023-05-26T09:38:33.218456Z",
"shell.execute_reply": "2023-05-26T09:38:33.217705Z"
}
},
"outputs": [],
"source": [
"outdict = mutils.append_data_to_dict(mydict)\n",
"df = mutils.dump_df(outdict)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "ae182eb7",
"metadata": {},
"source": [
"Check the dataframe:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "c4f05637",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-26T09:38:33.223277Z",
"iopub.status.busy": "2023-05-26T09:38:33.222734Z",
"iopub.status.idle": "2023-05-26T09:38:33.335444Z",
"shell.execute_reply": "2023-05-26T09:38:33.334613Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>filename</th>\n",
" <th>text</th>\n",
" <th>text_language</th>\n",
" <th>text_english</th>\n",
" <th>text_clean</th>\n",
" <th>text_english_correct</th>\n",
" <th>polarity</th>\n",
" <th>subjectivity</th>\n",
" <th>text_summary</th>\n",
" <th>sentiment</th>\n",
" <th>sentiment_score</th>\n",
" <th>entity</th>\n",
" <th>entity_type</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>data/102730_eng.png</td>\n",
" <td>400 DEATHS GET E-BOOK X AN Corporation ncy Ser...</td>\n",
" <td>en</td>\n",
" <td>400 DEATHS GET E-BOOK X AN Corporation ncy Ser...</td>\n",
" <td>DEATHS GET E - BOOK X AN Corporation Services ...</td>\n",
" <td>400 DEATHS GET E-BOOK X of Corporation ney Ser...</td>\n",
" <td>-0.125000</td>\n",
" <td>0.375000</td>\n",
" <td>A municipal worker sprays disinfectant on his...</td>\n",
" <td>NEGATIVE</td>\n",
" <td>0.991692</td>\n",
" <td>[AN Corporation ncy Services, Ahmedabad, RE, #...</td>\n",
" <td>[ORG, LOC, PER, ORG]</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>data/102141_2_eng.png</td>\n",
" <td>CORONAVIRUS QUARANTINE CORONAVIRUS OUTBREAK BE...</td>\n",
" <td>en</td>\n",
" <td>CORONAVIRUS QUARANTINE CORONAVIRUS OUTBREAK BE...</td>\n",
" <td>CORONAVIRUS QUARANTINE CORONAVIRUS OUTBREAK BE...</td>\n",
" <td>CORONAVIRUS QUARANTINE CORONAVIRUS OUTBREAK BE...</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>Coronavirus QUARANTINE CORONAVIRUS OUTBREAK</td>\n",
" <td>NEGATIVE</td>\n",
" <td>0.976247</td>\n",
" <td>[CORONAVIRUS, ##AR, ##TI, ##RONAVIR, ##C, Co]</td>\n",
" <td>[ORG, MISC, MISC, ORG, MISC, MISC]</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>data/106349S_por.png</td>\n",
" <td>NEWS URGENTE SAMSUNG AO VIVO Rio de Janeiro NO...</td>\n",
" <td>pt</td>\n",
" <td>NEWS URGENT SAMSUNG LIVE Rio de Janeiro NEW CO...</td>\n",
" <td>NEWS URGENT SAMSUNG LIVE Rio de Janeiro NEW CO...</td>\n",
" <td>NEWS URGENT SAMSUNG LIVE Rio de Janeiro NEW CO...</td>\n",
" <td>-0.106818</td>\n",
" <td>0.588636</td>\n",
" <td>NEW COUNTING METHOD RJ City HALL EXCLUDES 1,1...</td>\n",
" <td>NEGATIVE</td>\n",
" <td>0.990659</td>\n",
" <td>[Rio de Janeiro, C, ##IT, P, ##NA, ##LTO]</td>\n",
" <td>[LOC, ORG, LOC, LOC, ORG, LOC]</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" filename text \n",
"0 data/102730_eng.png 400 DEATHS GET E-BOOK X AN Corporation ncy Ser... \\\n",
"1 data/102141_2_eng.png CORONAVIRUS QUARANTINE CORONAVIRUS OUTBREAK BE... \n",
"2 data/106349S_por.png NEWS URGENTE SAMSUNG AO VIVO Rio de Janeiro NO... \n",
"\n",
" text_language text_english \n",
"0 en 400 DEATHS GET E-BOOK X AN Corporation ncy Ser... \\\n",
"1 en CORONAVIRUS QUARANTINE CORONAVIRUS OUTBREAK BE... \n",
"2 pt NEWS URGENT SAMSUNG LIVE Rio de Janeiro NEW CO... \n",
"\n",
" text_clean \n",
"0 DEATHS GET E - BOOK X AN Corporation Services ... \\\n",
"1 CORONAVIRUS QUARANTINE CORONAVIRUS OUTBREAK BE... \n",
"2 NEWS URGENT SAMSUNG LIVE Rio de Janeiro NEW CO... \n",
"\n",
" text_english_correct polarity subjectivity \n",
"0 400 DEATHS GET E-BOOK X of Corporation ney Ser... -0.125000 0.375000 \\\n",
"1 CORONAVIRUS QUARANTINE CORONAVIRUS OUTBREAK BE... 0.000000 0.000000 \n",
"2 NEWS URGENT SAMSUNG LIVE Rio de Janeiro NEW CO... -0.106818 0.588636 \n",
"\n",
" text_summary sentiment \n",
"0 A municipal worker sprays disinfectant on his... NEGATIVE \\\n",
"1 Coronavirus QUARANTINE CORONAVIRUS OUTBREAK NEGATIVE \n",
"2 NEW COUNTING METHOD RJ City HALL EXCLUDES 1,1... NEGATIVE \n",
"\n",
" sentiment_score entity \n",
"0 0.991692 [AN Corporation ncy Services, Ahmedabad, RE, #... \\\n",
"1 0.976247 [CORONAVIRUS, ##AR, ##TI, ##RONAVIR, ##C, Co] \n",
"2 0.990659 [Rio de Janeiro, C, ##IT, P, ##NA, ##LTO] \n",
"\n",
" entity_type \n",
"0 [ORG, LOC, PER, ORG] \n",
"1 [ORG, MISC, MISC, ORG, MISC, MISC] \n",
"2 [LOC, ORG, LOC, LOC, ORG, LOC] "
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head(10)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "eedf1e47",
"metadata": {},
"source": [
"Write the csv file - here you should provide a file path and file name for the csv file to be written."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "bf6c9ddb",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-26T09:38:33.339548Z",
"iopub.status.busy": "2023-05-26T09:38:33.339261Z",
"iopub.status.idle": "2023-05-26T09:38:33.367365Z",
"shell.execute_reply": "2023-05-26T09:38:33.366733Z"
}
},
"outputs": [],
"source": [
"# Write the csv\n",
"df.to_csv(\"./data_out.csv\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "4bc8ac0a",
"metadata": {},
"source": [
"## Topic analysis\n",
"The topic analysis is carried out using [BERTopic](https://maartengr.github.io/BERTopic/index.html) using an embedded model through a [spaCy](https://spacy.io/) pipeline."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "4931941b",
"metadata": {},
"source": [
"BERTopic takes a list of strings as input. The more items in the list, the better for the topic modeling. If the below returns an error for `analyse_topic()`, the reason can be that your dataset is too small.\n",
"\n",
"You can pass which dataframe entry you would like to have analyzed. The default is `text_english`, but you could for example also select `text_summary` or `text_english_correct` setting the keyword `analyze_text` as so:\n",
"\n",
"`ammico.text.PostprocessText(mydict=mydict, analyze_text=\"text_summary\").analyse_topic()`\n",
"\n",
"### Option 1: Use the dictionary as obtained from the above analysis."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "a3450a61",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-26T09:38:33.371027Z",
"iopub.status.busy": "2023-05-26T09:38:33.370573Z",
"iopub.status.idle": "2023-05-26T09:38:47.672161Z",
"shell.execute_reply": "2023-05-26T09:38:47.670959Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Reading data from dict.\n"
]
},
{
"ename": "TypeError",
"evalue": "Cannot use scipy.linalg.eigh for sparse A with k >= N. Use scipy.linalg.eigh(A.toarray()) or reduce k.",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/bertopic/_bertopic.py:2868\u001b[0m, in \u001b[0;36mBERTopic._reduce_dimensionality\u001b[0;34m(self, embeddings, y, partial_fit)\u001b[0m\n\u001b[1;32m 2867\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m-> 2868\u001b[0m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mumap_model\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfit\u001b[49m\u001b[43m(\u001b[49m\u001b[43membeddings\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43my\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43my\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 2869\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m:\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/umap/umap_.py:2684\u001b[0m, in \u001b[0;36mUMAP.fit\u001b[0;34m(self, X, y)\u001b[0m\n\u001b[1;32m 2683\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mtransform_mode \u001b[38;5;241m==\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124membedding\u001b[39m\u001b[38;5;124m\"\u001b[39m:\n\u001b[0;32m-> 2684\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39membedding_, aux_data \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_fit_embed_data\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 2685\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_raw_data\u001b[49m\u001b[43m[\u001b[49m\u001b[43mindex\u001b[49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2686\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mn_epochs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2687\u001b[0m \u001b[43m \u001b[49m\u001b[43minit\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2688\u001b[0m \u001b[43m \u001b[49m\u001b[43mrandom_state\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;66;43;03m# JH why raw data?\u001b[39;49;00m\n\u001b[1;32m 2689\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 2690\u001b[0m \u001b[38;5;66;03m# Assign any points that are fully disconnected from our manifold(s) to have embedding\u001b[39;00m\n\u001b[1;32m 2691\u001b[0m \u001b[38;5;66;03m# coordinates of np.nan. These will be filtered by our plotting functions automatically.\u001b[39;00m\n\u001b[1;32m 2692\u001b[0m \u001b[38;5;66;03m# They also prevent users from being deceived a distance query to one of these points.\u001b[39;00m\n\u001b[1;32m 2693\u001b[0m \u001b[38;5;66;03m# Might be worth moving this into simplicial_set_embedding or _fit_embed_data\u001b[39;00m\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/umap/umap_.py:2717\u001b[0m, in \u001b[0;36mUMAP._fit_embed_data\u001b[0;34m(self, X, n_epochs, init, random_state)\u001b[0m\n\u001b[1;32m 2714\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"A method wrapper for simplicial_set_embedding that can be\u001b[39;00m\n\u001b[1;32m 2715\u001b[0m \u001b[38;5;124;03mreplaced by subclasses.\u001b[39;00m\n\u001b[1;32m 2716\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[0;32m-> 2717\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43msimplicial_set_embedding\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 2718\u001b[0m \u001b[43m \u001b[49m\u001b[43mX\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2719\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mgraph_\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2720\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mn_components\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2721\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_initial_alpha\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2722\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_a\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2723\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_b\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2724\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mrepulsion_strength\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2725\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mnegative_sample_rate\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2726\u001b[0m \u001b[43m \u001b[49m\u001b[43mn_epochs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2727\u001b[0m \u001b[43m \u001b[49m\u001b[43minit\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2728\u001b[0m \u001b[43m \u001b[49m\u001b[43mrandom_state\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2729\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_input_distance_func\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2730\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_metric_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2731\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mdensmap\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2732\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_densmap_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2733\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43moutput_dens\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2734\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_output_distance_func\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2735\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_output_metric_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2736\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43moutput_metric\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;129;43;01min\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43meuclidean\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43ml2\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2737\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mrandom_state\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;129;43;01mis\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[38;5;28;43;01mNone\u001b[39;49;00m\u001b[43m,\u001b[49m\n\u001b[1;32m 2738\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mverbose\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2739\u001b[0m \u001b[43m \u001b[49m\u001b[43mtqdm_kwds\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mtqdm_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2740\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/umap/umap_.py:1078\u001b[0m, in \u001b[0;36msimplicial_set_embedding\u001b[0;34m(data, graph, n_components, initial_alpha, a, b, gamma, negative_sample_rate, n_epochs, init, random_state, metric, metric_kwds, densmap, densmap_kwds, output_dens, output_metric, output_metric_kwds, euclidean_output, parallel, verbose, tqdm_kwds)\u001b[0m\n\u001b[1;32m 1076\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(init, \u001b[38;5;28mstr\u001b[39m) \u001b[38;5;129;01mand\u001b[39;00m init \u001b[38;5;241m==\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mspectral\u001b[39m\u001b[38;5;124m\"\u001b[39m:\n\u001b[1;32m 1077\u001b[0m \u001b[38;5;66;03m# We add a little noise to avoid local minima for optimization to come\u001b[39;00m\n\u001b[0;32m-> 1078\u001b[0m initialisation \u001b[38;5;241m=\u001b[39m \u001b[43mspectral_layout\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 1079\u001b[0m \u001b[43m \u001b[49m\u001b[43mdata\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1080\u001b[0m \u001b[43m \u001b[49m\u001b[43mgraph\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1081\u001b[0m \u001b[43m \u001b[49m\u001b[43mn_components\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1082\u001b[0m \u001b[43m \u001b[49m\u001b[43mrandom_state\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1083\u001b[0m \u001b[43m \u001b[49m\u001b[43mmetric\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmetric\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1084\u001b[0m \u001b[43m \u001b[49m\u001b[43mmetric_kwds\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmetric_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1085\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 1086\u001b[0m expansion \u001b[38;5;241m=\u001b[39m \u001b[38;5;241m10.0\u001b[39m \u001b[38;5;241m/\u001b[39m np\u001b[38;5;241m.\u001b[39mabs(initialisation)\u001b[38;5;241m.\u001b[39mmax()\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/umap/spectral.py:332\u001b[0m, in \u001b[0;36mspectral_layout\u001b[0;34m(data, graph, dim, random_state, metric, metric_kwds)\u001b[0m\n\u001b[1;32m 331\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m L\u001b[38;5;241m.\u001b[39mshape[\u001b[38;5;241m0\u001b[39m] \u001b[38;5;241m<\u001b[39m \u001b[38;5;241m2000000\u001b[39m:\n\u001b[0;32m--> 332\u001b[0m eigenvalues, eigenvectors \u001b[38;5;241m=\u001b[39m \u001b[43mscipy\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43msparse\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mlinalg\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43meigsh\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 333\u001b[0m \u001b[43m \u001b[49m\u001b[43mL\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 334\u001b[0m \u001b[43m \u001b[49m\u001b[43mk\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 335\u001b[0m \u001b[43m \u001b[49m\u001b[43mwhich\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mSM\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 336\u001b[0m \u001b[43m \u001b[49m\u001b[43mncv\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mnum_lanczos_vectors\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 337\u001b[0m \u001b[43m \u001b[49m\u001b[43mtol\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;241;43m1e-4\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 338\u001b[0m \u001b[43m \u001b[49m\u001b[43mv0\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mnp\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mones\u001b[49m\u001b[43m(\u001b[49m\u001b[43mL\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mshape\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 339\u001b[0m \u001b[43m \u001b[49m\u001b[43mmaxiter\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mgraph\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mshape\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;241;43m5\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 340\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 341\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/scipy/sparse/linalg/_eigen/arpack/arpack.py:1597\u001b[0m, in \u001b[0;36meigsh\u001b[0;34m(A, k, M, sigma, which, v0, ncv, maxiter, tol, return_eigenvectors, Minv, OPinv, mode)\u001b[0m\n\u001b[1;32m 1596\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m issparse(A):\n\u001b[0;32m-> 1597\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mCannot use scipy.linalg.eigh for sparse A with \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 1598\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mk >= N. Use scipy.linalg.eigh(A.toarray()) or\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 1599\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m reduce k.\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 1600\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(A, LinearOperator):\n",
"\u001b[0;31mTypeError\u001b[0m: Cannot use scipy.linalg.eigh for sparse A with k >= N. Use scipy.linalg.eigh(A.toarray()) or reduce k.",
"\nDuring handling of the above exception, another exception occurred:\n",
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[10], line 2\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;66;03m# make a list of all the text_english entries per analysed image from the mydict variable as above\u001b[39;00m\n\u001b[0;32m----> 2\u001b[0m topic_model, topic_df, most_frequent_topics \u001b[38;5;241m=\u001b[39m \u001b[43mammico\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mtext\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mPostprocessText\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 3\u001b[0m \u001b[43m \u001b[49m\u001b[43mmydict\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmydict\u001b[49m\n\u001b[1;32m 4\u001b[0m \u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43manalyse_topic\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/text.py:217\u001b[0m, in \u001b[0;36mPostprocessText.analyse_topic\u001b[0;34m(self, return_topics)\u001b[0m\n\u001b[1;32m 215\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m:\n\u001b[1;32m 216\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mBERTopic excited with an error - maybe your dataset is too small?\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m--> 217\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mtopics, \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mprobs \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mtopic_model\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfit_transform\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mlist_text_english\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 218\u001b[0m \u001b[38;5;66;03m# return the topic list\u001b[39;00m\n\u001b[1;32m 219\u001b[0m topic_df \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mtopic_model\u001b[38;5;241m.\u001b[39mget_topic_info()\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/bertopic/_bertopic.py:356\u001b[0m, in \u001b[0;36mBERTopic.fit_transform\u001b[0;34m(self, documents, embeddings, y)\u001b[0m\n\u001b[1;32m 354\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mseed_topic_list \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39membedding_model \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m 355\u001b[0m y, embeddings \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_guided_topic_modeling(embeddings)\n\u001b[0;32m--> 356\u001b[0m umap_embeddings \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_reduce_dimensionality\u001b[49m\u001b[43m(\u001b[49m\u001b[43membeddings\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43my\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 358\u001b[0m \u001b[38;5;66;03m# Cluster reduced embeddings\u001b[39;00m\n\u001b[1;32m 359\u001b[0m documents, probabilities \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_cluster_embeddings(umap_embeddings, documents, y\u001b[38;5;241m=\u001b[39my)\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/bertopic/_bertopic.py:2872\u001b[0m, in \u001b[0;36mBERTopic._reduce_dimensionality\u001b[0;34m(self, embeddings, y, partial_fit)\u001b[0m\n\u001b[1;32m 2869\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m:\n\u001b[1;32m 2870\u001b[0m logger\u001b[38;5;241m.\u001b[39minfo(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mThe dimensionality reduction algorithm did not contain the `y` parameter and\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 2871\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m therefore the `y` parameter was not used\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m-> 2872\u001b[0m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mumap_model\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfit\u001b[49m\u001b[43m(\u001b[49m\u001b[43membeddings\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 2874\u001b[0m umap_embeddings \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mumap_model\u001b[38;5;241m.\u001b[39mtransform(embeddings)\n\u001b[1;32m 2875\u001b[0m logger\u001b[38;5;241m.\u001b[39minfo(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mReduced dimensionality\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/umap/umap_.py:2684\u001b[0m, in \u001b[0;36mUMAP.fit\u001b[0;34m(self, X, y)\u001b[0m\n\u001b[1;32m 2681\u001b[0m \u001b[38;5;28mprint\u001b[39m(ts(), \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mConstruct embedding\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 2683\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mtransform_mode \u001b[38;5;241m==\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124membedding\u001b[39m\u001b[38;5;124m\"\u001b[39m:\n\u001b[0;32m-> 2684\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39membedding_, aux_data \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_fit_embed_data\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 2685\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_raw_data\u001b[49m\u001b[43m[\u001b[49m\u001b[43mindex\u001b[49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2686\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mn_epochs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2687\u001b[0m \u001b[43m \u001b[49m\u001b[43minit\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2688\u001b[0m \u001b[43m \u001b[49m\u001b[43mrandom_state\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;66;43;03m# JH why raw data?\u001b[39;49;00m\n\u001b[1;32m 2689\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 2690\u001b[0m \u001b[38;5;66;03m# Assign any points that are fully disconnected from our manifold(s) to have embedding\u001b[39;00m\n\u001b[1;32m 2691\u001b[0m \u001b[38;5;66;03m# coordinates of np.nan. These will be filtered by our plotting functions automatically.\u001b[39;00m\n\u001b[1;32m 2692\u001b[0m \u001b[38;5;66;03m# They also prevent users from being deceived a distance query to one of these points.\u001b[39;00m\n\u001b[1;32m 2693\u001b[0m \u001b[38;5;66;03m# Might be worth moving this into simplicial_set_embedding or _fit_embed_data\u001b[39;00m\n\u001b[1;32m 2694\u001b[0m disconnected_vertices \u001b[38;5;241m=\u001b[39m np\u001b[38;5;241m.\u001b[39marray(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mgraph_\u001b[38;5;241m.\u001b[39msum(axis\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m1\u001b[39m))\u001b[38;5;241m.\u001b[39mflatten() \u001b[38;5;241m==\u001b[39m \u001b[38;5;241m0\u001b[39m\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/umap/umap_.py:2717\u001b[0m, in \u001b[0;36mUMAP._fit_embed_data\u001b[0;34m(self, X, n_epochs, init, random_state)\u001b[0m\n\u001b[1;32m 2713\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21m_fit_embed_data\u001b[39m(\u001b[38;5;28mself\u001b[39m, X, n_epochs, init, random_state):\n\u001b[1;32m 2714\u001b[0m \u001b[38;5;250m \u001b[39m\u001b[38;5;124;03m\"\"\"A method wrapper for simplicial_set_embedding that can be\u001b[39;00m\n\u001b[1;32m 2715\u001b[0m \u001b[38;5;124;03m replaced by subclasses.\u001b[39;00m\n\u001b[1;32m 2716\u001b[0m \u001b[38;5;124;03m \"\"\"\u001b[39;00m\n\u001b[0;32m-> 2717\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43msimplicial_set_embedding\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 2718\u001b[0m \u001b[43m \u001b[49m\u001b[43mX\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2719\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mgraph_\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2720\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mn_components\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2721\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_initial_alpha\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2722\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_a\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2723\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_b\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2724\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mrepulsion_strength\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2725\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mnegative_sample_rate\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2726\u001b[0m \u001b[43m \u001b[49m\u001b[43mn_epochs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2727\u001b[0m \u001b[43m \u001b[49m\u001b[43minit\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2728\u001b[0m \u001b[43m \u001b[49m\u001b[43mrandom_state\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2729\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_input_distance_func\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2730\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_metric_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2731\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mdensmap\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2732\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_densmap_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2733\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43moutput_dens\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2734\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_output_distance_func\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2735\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_output_metric_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2736\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43moutput_metric\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;129;43;01min\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43meuclidean\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43ml2\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2737\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mrandom_state\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;129;43;01mis\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[38;5;28;43;01mNone\u001b[39;49;00m\u001b[43m,\u001b[49m\n\u001b[1;32m 2738\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mverbose\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2739\u001b[0m \u001b[43m \u001b[49m\u001b[43mtqdm_kwds\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mtqdm_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2740\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/umap/umap_.py:1078\u001b[0m, in \u001b[0;36msimplicial_set_embedding\u001b[0;34m(data, graph, n_components, initial_alpha, a, b, gamma, negative_sample_rate, n_epochs, init, random_state, metric, metric_kwds, densmap, densmap_kwds, output_dens, output_metric, output_metric_kwds, euclidean_output, parallel, verbose, tqdm_kwds)\u001b[0m\n\u001b[1;32m 1073\u001b[0m embedding \u001b[38;5;241m=\u001b[39m random_state\u001b[38;5;241m.\u001b[39muniform(\n\u001b[1;32m 1074\u001b[0m low\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m-\u001b[39m\u001b[38;5;241m10.0\u001b[39m, high\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m10.0\u001b[39m, size\u001b[38;5;241m=\u001b[39m(graph\u001b[38;5;241m.\u001b[39mshape[\u001b[38;5;241m0\u001b[39m], n_components)\n\u001b[1;32m 1075\u001b[0m )\u001b[38;5;241m.\u001b[39mastype(np\u001b[38;5;241m.\u001b[39mfloat32)\n\u001b[1;32m 1076\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(init, \u001b[38;5;28mstr\u001b[39m) \u001b[38;5;129;01mand\u001b[39;00m init \u001b[38;5;241m==\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mspectral\u001b[39m\u001b[38;5;124m\"\u001b[39m:\n\u001b[1;32m 1077\u001b[0m \u001b[38;5;66;03m# We add a little noise to avoid local minima for optimization to come\u001b[39;00m\n\u001b[0;32m-> 1078\u001b[0m initialisation \u001b[38;5;241m=\u001b[39m \u001b[43mspectral_layout\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 1079\u001b[0m \u001b[43m \u001b[49m\u001b[43mdata\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1080\u001b[0m \u001b[43m \u001b[49m\u001b[43mgraph\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1081\u001b[0m \u001b[43m \u001b[49m\u001b[43mn_components\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1082\u001b[0m \u001b[43m \u001b[49m\u001b[43mrandom_state\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1083\u001b[0m \u001b[43m \u001b[49m\u001b[43mmetric\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmetric\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1084\u001b[0m \u001b[43m \u001b[49m\u001b[43mmetric_kwds\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmetric_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1085\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 1086\u001b[0m expansion \u001b[38;5;241m=\u001b[39m \u001b[38;5;241m10.0\u001b[39m \u001b[38;5;241m/\u001b[39m np\u001b[38;5;241m.\u001b[39mabs(initialisation)\u001b[38;5;241m.\u001b[39mmax()\n\u001b[1;32m 1087\u001b[0m embedding \u001b[38;5;241m=\u001b[39m (initialisation \u001b[38;5;241m*\u001b[39m expansion)\u001b[38;5;241m.\u001b[39mastype(\n\u001b[1;32m 1088\u001b[0m np\u001b[38;5;241m.\u001b[39mfloat32\n\u001b[1;32m 1089\u001b[0m ) \u001b[38;5;241m+\u001b[39m random_state\u001b[38;5;241m.\u001b[39mnormal(\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 1092\u001b[0m np\u001b[38;5;241m.\u001b[39mfloat32\n\u001b[1;32m 1093\u001b[0m )\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/umap/spectral.py:332\u001b[0m, in \u001b[0;36mspectral_layout\u001b[0;34m(data, graph, dim, random_state, metric, metric_kwds)\u001b[0m\n\u001b[1;32m 330\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[1;32m 331\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m L\u001b[38;5;241m.\u001b[39mshape[\u001b[38;5;241m0\u001b[39m] \u001b[38;5;241m<\u001b[39m \u001b[38;5;241m2000000\u001b[39m:\n\u001b[0;32m--> 332\u001b[0m eigenvalues, eigenvectors \u001b[38;5;241m=\u001b[39m \u001b[43mscipy\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43msparse\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mlinalg\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43meigsh\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 333\u001b[0m \u001b[43m \u001b[49m\u001b[43mL\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 334\u001b[0m \u001b[43m \u001b[49m\u001b[43mk\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 335\u001b[0m \u001b[43m \u001b[49m\u001b[43mwhich\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mSM\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 336\u001b[0m \u001b[43m \u001b[49m\u001b[43mncv\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mnum_lanczos_vectors\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 337\u001b[0m \u001b[43m \u001b[49m\u001b[43mtol\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;241;43m1e-4\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 338\u001b[0m \u001b[43m \u001b[49m\u001b[43mv0\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mnp\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mones\u001b[49m\u001b[43m(\u001b[49m\u001b[43mL\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mshape\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 339\u001b[0m \u001b[43m \u001b[49m\u001b[43mmaxiter\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mgraph\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mshape\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;241;43m5\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 340\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 341\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 342\u001b[0m eigenvalues, eigenvectors \u001b[38;5;241m=\u001b[39m scipy\u001b[38;5;241m.\u001b[39msparse\u001b[38;5;241m.\u001b[39mlinalg\u001b[38;5;241m.\u001b[39mlobpcg(\n\u001b[1;32m 343\u001b[0m L, random_state\u001b[38;5;241m.\u001b[39mnormal(size\u001b[38;5;241m=\u001b[39m(L\u001b[38;5;241m.\u001b[39mshape[\u001b[38;5;241m0\u001b[39m], k)), largest\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mFalse\u001b[39;00m, tol\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m1e-8\u001b[39m\n\u001b[1;32m 344\u001b[0m )\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/scipy/sparse/linalg/_eigen/arpack/arpack.py:1597\u001b[0m, in \u001b[0;36meigsh\u001b[0;34m(A, k, M, sigma, which, v0, ncv, maxiter, tol, return_eigenvectors, Minv, OPinv, mode)\u001b[0m\n\u001b[1;32m 1592\u001b[0m warnings\u001b[38;5;241m.\u001b[39mwarn(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mk >= N for N * N square matrix. \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 1593\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mAttempting to use scipy.linalg.eigh instead.\u001b[39m\u001b[38;5;124m\"\u001b[39m,\n\u001b[1;32m 1594\u001b[0m \u001b[38;5;167;01mRuntimeWarning\u001b[39;00m)\n\u001b[1;32m 1596\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m issparse(A):\n\u001b[0;32m-> 1597\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mCannot use scipy.linalg.eigh for sparse A with \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 1598\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mk >= N. Use scipy.linalg.eigh(A.toarray()) or\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 1599\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m reduce k.\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 1600\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(A, LinearOperator):\n\u001b[1;32m 1601\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mCannot use scipy.linalg.eigh for LinearOperator \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 1602\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mA with k >= N.\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n",
"\u001b[0;31mTypeError\u001b[0m: Cannot use scipy.linalg.eigh for sparse A with k >= N. Use scipy.linalg.eigh(A.toarray()) or reduce k."
]
}
],
"source": [
"# make a list of all the text_english entries per analysed image from the mydict variable as above\n",
"topic_model, topic_df, most_frequent_topics = ammico.text.PostprocessText(\n",
" mydict=mydict\n",
").analyse_topic()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "95667342",
"metadata": {},
"source": [
"### Option 2: Read in a csv\n",
"Not to analyse too many images on google Cloud Vision, use the csv output to obtain the text (when rerunning already analysed images)."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "5530e436",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-26T09:38:47.677525Z",
"iopub.status.busy": "2023-05-26T09:38:47.676956Z",
"iopub.status.idle": "2023-05-26T09:38:49.270294Z",
"shell.execute_reply": "2023-05-26T09:38:49.269131Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Reading data from df.\n"
]
},
{
"ename": "TypeError",
"evalue": "Cannot use scipy.linalg.eigh for sparse A with k >= N. Use scipy.linalg.eigh(A.toarray()) or reduce k.",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/bertopic/_bertopic.py:2868\u001b[0m, in \u001b[0;36mBERTopic._reduce_dimensionality\u001b[0;34m(self, embeddings, y, partial_fit)\u001b[0m\n\u001b[1;32m 2867\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m-> 2868\u001b[0m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mumap_model\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfit\u001b[49m\u001b[43m(\u001b[49m\u001b[43membeddings\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43my\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43my\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 2869\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m:\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/umap/umap_.py:2684\u001b[0m, in \u001b[0;36mUMAP.fit\u001b[0;34m(self, X, y)\u001b[0m\n\u001b[1;32m 2683\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mtransform_mode \u001b[38;5;241m==\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124membedding\u001b[39m\u001b[38;5;124m\"\u001b[39m:\n\u001b[0;32m-> 2684\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39membedding_, aux_data \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_fit_embed_data\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 2685\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_raw_data\u001b[49m\u001b[43m[\u001b[49m\u001b[43mindex\u001b[49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2686\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mn_epochs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2687\u001b[0m \u001b[43m \u001b[49m\u001b[43minit\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2688\u001b[0m \u001b[43m \u001b[49m\u001b[43mrandom_state\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;66;43;03m# JH why raw data?\u001b[39;49;00m\n\u001b[1;32m 2689\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 2690\u001b[0m \u001b[38;5;66;03m# Assign any points that are fully disconnected from our manifold(s) to have embedding\u001b[39;00m\n\u001b[1;32m 2691\u001b[0m \u001b[38;5;66;03m# coordinates of np.nan. These will be filtered by our plotting functions automatically.\u001b[39;00m\n\u001b[1;32m 2692\u001b[0m \u001b[38;5;66;03m# They also prevent users from being deceived a distance query to one of these points.\u001b[39;00m\n\u001b[1;32m 2693\u001b[0m \u001b[38;5;66;03m# Might be worth moving this into simplicial_set_embedding or _fit_embed_data\u001b[39;00m\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/umap/umap_.py:2717\u001b[0m, in \u001b[0;36mUMAP._fit_embed_data\u001b[0;34m(self, X, n_epochs, init, random_state)\u001b[0m\n\u001b[1;32m 2714\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"A method wrapper for simplicial_set_embedding that can be\u001b[39;00m\n\u001b[1;32m 2715\u001b[0m \u001b[38;5;124;03mreplaced by subclasses.\u001b[39;00m\n\u001b[1;32m 2716\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[0;32m-> 2717\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43msimplicial_set_embedding\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 2718\u001b[0m \u001b[43m \u001b[49m\u001b[43mX\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2719\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mgraph_\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2720\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mn_components\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2721\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_initial_alpha\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2722\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_a\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2723\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_b\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2724\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mrepulsion_strength\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2725\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mnegative_sample_rate\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2726\u001b[0m \u001b[43m \u001b[49m\u001b[43mn_epochs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2727\u001b[0m \u001b[43m \u001b[49m\u001b[43minit\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2728\u001b[0m \u001b[43m \u001b[49m\u001b[43mrandom_state\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2729\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_input_distance_func\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2730\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_metric_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2731\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mdensmap\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2732\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_densmap_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2733\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43moutput_dens\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2734\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_output_distance_func\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2735\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_output_metric_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2736\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43moutput_metric\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;129;43;01min\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43meuclidean\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43ml2\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2737\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mrandom_state\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;129;43;01mis\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[38;5;28;43;01mNone\u001b[39;49;00m\u001b[43m,\u001b[49m\n\u001b[1;32m 2738\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mverbose\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2739\u001b[0m \u001b[43m \u001b[49m\u001b[43mtqdm_kwds\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mtqdm_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2740\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/umap/umap_.py:1078\u001b[0m, in \u001b[0;36msimplicial_set_embedding\u001b[0;34m(data, graph, n_components, initial_alpha, a, b, gamma, negative_sample_rate, n_epochs, init, random_state, metric, metric_kwds, densmap, densmap_kwds, output_dens, output_metric, output_metric_kwds, euclidean_output, parallel, verbose, tqdm_kwds)\u001b[0m\n\u001b[1;32m 1076\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(init, \u001b[38;5;28mstr\u001b[39m) \u001b[38;5;129;01mand\u001b[39;00m init \u001b[38;5;241m==\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mspectral\u001b[39m\u001b[38;5;124m\"\u001b[39m:\n\u001b[1;32m 1077\u001b[0m \u001b[38;5;66;03m# We add a little noise to avoid local minima for optimization to come\u001b[39;00m\n\u001b[0;32m-> 1078\u001b[0m initialisation \u001b[38;5;241m=\u001b[39m \u001b[43mspectral_layout\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 1079\u001b[0m \u001b[43m \u001b[49m\u001b[43mdata\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1080\u001b[0m \u001b[43m \u001b[49m\u001b[43mgraph\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1081\u001b[0m \u001b[43m \u001b[49m\u001b[43mn_components\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1082\u001b[0m \u001b[43m \u001b[49m\u001b[43mrandom_state\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1083\u001b[0m \u001b[43m \u001b[49m\u001b[43mmetric\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmetric\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1084\u001b[0m \u001b[43m \u001b[49m\u001b[43mmetric_kwds\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmetric_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1085\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 1086\u001b[0m expansion \u001b[38;5;241m=\u001b[39m \u001b[38;5;241m10.0\u001b[39m \u001b[38;5;241m/\u001b[39m np\u001b[38;5;241m.\u001b[39mabs(initialisation)\u001b[38;5;241m.\u001b[39mmax()\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/umap/spectral.py:332\u001b[0m, in \u001b[0;36mspectral_layout\u001b[0;34m(data, graph, dim, random_state, metric, metric_kwds)\u001b[0m\n\u001b[1;32m 331\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m L\u001b[38;5;241m.\u001b[39mshape[\u001b[38;5;241m0\u001b[39m] \u001b[38;5;241m<\u001b[39m \u001b[38;5;241m2000000\u001b[39m:\n\u001b[0;32m--> 332\u001b[0m eigenvalues, eigenvectors \u001b[38;5;241m=\u001b[39m \u001b[43mscipy\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43msparse\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mlinalg\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43meigsh\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 333\u001b[0m \u001b[43m \u001b[49m\u001b[43mL\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 334\u001b[0m \u001b[43m \u001b[49m\u001b[43mk\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 335\u001b[0m \u001b[43m \u001b[49m\u001b[43mwhich\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mSM\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 336\u001b[0m \u001b[43m \u001b[49m\u001b[43mncv\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mnum_lanczos_vectors\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 337\u001b[0m \u001b[43m \u001b[49m\u001b[43mtol\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;241;43m1e-4\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 338\u001b[0m \u001b[43m \u001b[49m\u001b[43mv0\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mnp\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mones\u001b[49m\u001b[43m(\u001b[49m\u001b[43mL\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mshape\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 339\u001b[0m \u001b[43m \u001b[49m\u001b[43mmaxiter\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mgraph\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mshape\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;241;43m5\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 340\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 341\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/scipy/sparse/linalg/_eigen/arpack/arpack.py:1597\u001b[0m, in \u001b[0;36meigsh\u001b[0;34m(A, k, M, sigma, which, v0, ncv, maxiter, tol, return_eigenvectors, Minv, OPinv, mode)\u001b[0m\n\u001b[1;32m 1596\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m issparse(A):\n\u001b[0;32m-> 1597\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mCannot use scipy.linalg.eigh for sparse A with \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 1598\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mk >= N. Use scipy.linalg.eigh(A.toarray()) or\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 1599\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m reduce k.\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 1600\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(A, LinearOperator):\n",
"\u001b[0;31mTypeError\u001b[0m: Cannot use scipy.linalg.eigh for sparse A with k >= N. Use scipy.linalg.eigh(A.toarray()) or reduce k.",
"\nDuring handling of the above exception, another exception occurred:\n",
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[11], line 2\u001b[0m\n\u001b[1;32m 1\u001b[0m input_file_path \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mdata_out.csv\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[0;32m----> 2\u001b[0m topic_model, topic_df, most_frequent_topics \u001b[38;5;241m=\u001b[39m \u001b[43mammico\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mtext\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mPostprocessText\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 3\u001b[0m \u001b[43m \u001b[49m\u001b[43muse_csv\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcsv_path\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43minput_file_path\u001b[49m\n\u001b[1;32m 4\u001b[0m \u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43manalyse_topic\u001b[49m\u001b[43m(\u001b[49m\u001b[43mreturn_topics\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;241;43m10\u001b[39;49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/text.py:217\u001b[0m, in \u001b[0;36mPostprocessText.analyse_topic\u001b[0;34m(self, return_topics)\u001b[0m\n\u001b[1;32m 215\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m:\n\u001b[1;32m 216\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mBERTopic excited with an error - maybe your dataset is too small?\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m--> 217\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mtopics, \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mprobs \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mtopic_model\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfit_transform\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mlist_text_english\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 218\u001b[0m \u001b[38;5;66;03m# return the topic list\u001b[39;00m\n\u001b[1;32m 219\u001b[0m topic_df \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mtopic_model\u001b[38;5;241m.\u001b[39mget_topic_info()\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/bertopic/_bertopic.py:356\u001b[0m, in \u001b[0;36mBERTopic.fit_transform\u001b[0;34m(self, documents, embeddings, y)\u001b[0m\n\u001b[1;32m 354\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mseed_topic_list \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39membedding_model \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m 355\u001b[0m y, embeddings \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_guided_topic_modeling(embeddings)\n\u001b[0;32m--> 356\u001b[0m umap_embeddings \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_reduce_dimensionality\u001b[49m\u001b[43m(\u001b[49m\u001b[43membeddings\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43my\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 358\u001b[0m \u001b[38;5;66;03m# Cluster reduced embeddings\u001b[39;00m\n\u001b[1;32m 359\u001b[0m documents, probabilities \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_cluster_embeddings(umap_embeddings, documents, y\u001b[38;5;241m=\u001b[39my)\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/bertopic/_bertopic.py:2872\u001b[0m, in \u001b[0;36mBERTopic._reduce_dimensionality\u001b[0;34m(self, embeddings, y, partial_fit)\u001b[0m\n\u001b[1;32m 2869\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m:\n\u001b[1;32m 2870\u001b[0m logger\u001b[38;5;241m.\u001b[39minfo(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mThe dimensionality reduction algorithm did not contain the `y` parameter and\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 2871\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m therefore the `y` parameter was not used\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m-> 2872\u001b[0m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mumap_model\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfit\u001b[49m\u001b[43m(\u001b[49m\u001b[43membeddings\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 2874\u001b[0m umap_embeddings \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mumap_model\u001b[38;5;241m.\u001b[39mtransform(embeddings)\n\u001b[1;32m 2875\u001b[0m logger\u001b[38;5;241m.\u001b[39minfo(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mReduced dimensionality\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/umap/umap_.py:2684\u001b[0m, in \u001b[0;36mUMAP.fit\u001b[0;34m(self, X, y)\u001b[0m\n\u001b[1;32m 2681\u001b[0m \u001b[38;5;28mprint\u001b[39m(ts(), \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mConstruct embedding\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 2683\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mtransform_mode \u001b[38;5;241m==\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124membedding\u001b[39m\u001b[38;5;124m\"\u001b[39m:\n\u001b[0;32m-> 2684\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39membedding_, aux_data \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_fit_embed_data\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 2685\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_raw_data\u001b[49m\u001b[43m[\u001b[49m\u001b[43mindex\u001b[49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2686\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mn_epochs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2687\u001b[0m \u001b[43m \u001b[49m\u001b[43minit\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2688\u001b[0m \u001b[43m \u001b[49m\u001b[43mrandom_state\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;66;43;03m# JH why raw data?\u001b[39;49;00m\n\u001b[1;32m 2689\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 2690\u001b[0m \u001b[38;5;66;03m# Assign any points that are fully disconnected from our manifold(s) to have embedding\u001b[39;00m\n\u001b[1;32m 2691\u001b[0m \u001b[38;5;66;03m# coordinates of np.nan. These will be filtered by our plotting functions automatically.\u001b[39;00m\n\u001b[1;32m 2692\u001b[0m \u001b[38;5;66;03m# They also prevent users from being deceived a distance query to one of these points.\u001b[39;00m\n\u001b[1;32m 2693\u001b[0m \u001b[38;5;66;03m# Might be worth moving this into simplicial_set_embedding or _fit_embed_data\u001b[39;00m\n\u001b[1;32m 2694\u001b[0m disconnected_vertices \u001b[38;5;241m=\u001b[39m np\u001b[38;5;241m.\u001b[39marray(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mgraph_\u001b[38;5;241m.\u001b[39msum(axis\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m1\u001b[39m))\u001b[38;5;241m.\u001b[39mflatten() \u001b[38;5;241m==\u001b[39m \u001b[38;5;241m0\u001b[39m\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/umap/umap_.py:2717\u001b[0m, in \u001b[0;36mUMAP._fit_embed_data\u001b[0;34m(self, X, n_epochs, init, random_state)\u001b[0m\n\u001b[1;32m 2713\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21m_fit_embed_data\u001b[39m(\u001b[38;5;28mself\u001b[39m, X, n_epochs, init, random_state):\n\u001b[1;32m 2714\u001b[0m \u001b[38;5;250m \u001b[39m\u001b[38;5;124;03m\"\"\"A method wrapper for simplicial_set_embedding that can be\u001b[39;00m\n\u001b[1;32m 2715\u001b[0m \u001b[38;5;124;03m replaced by subclasses.\u001b[39;00m\n\u001b[1;32m 2716\u001b[0m \u001b[38;5;124;03m \"\"\"\u001b[39;00m\n\u001b[0;32m-> 2717\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43msimplicial_set_embedding\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 2718\u001b[0m \u001b[43m \u001b[49m\u001b[43mX\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2719\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mgraph_\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2720\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mn_components\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2721\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_initial_alpha\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2722\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_a\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2723\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_b\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2724\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mrepulsion_strength\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2725\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mnegative_sample_rate\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2726\u001b[0m \u001b[43m \u001b[49m\u001b[43mn_epochs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2727\u001b[0m \u001b[43m \u001b[49m\u001b[43minit\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2728\u001b[0m \u001b[43m \u001b[49m\u001b[43mrandom_state\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2729\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_input_distance_func\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2730\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_metric_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2731\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mdensmap\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2732\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_densmap_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2733\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43moutput_dens\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2734\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_output_distance_func\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2735\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_output_metric_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2736\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43moutput_metric\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;129;43;01min\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43meuclidean\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43ml2\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2737\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mrandom_state\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;129;43;01mis\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[38;5;28;43;01mNone\u001b[39;49;00m\u001b[43m,\u001b[49m\n\u001b[1;32m 2738\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mverbose\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2739\u001b[0m \u001b[43m \u001b[49m\u001b[43mtqdm_kwds\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mtqdm_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2740\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/umap/umap_.py:1078\u001b[0m, in \u001b[0;36msimplicial_set_embedding\u001b[0;34m(data, graph, n_components, initial_alpha, a, b, gamma, negative_sample_rate, n_epochs, init, random_state, metric, metric_kwds, densmap, densmap_kwds, output_dens, output_metric, output_metric_kwds, euclidean_output, parallel, verbose, tqdm_kwds)\u001b[0m\n\u001b[1;32m 1073\u001b[0m embedding \u001b[38;5;241m=\u001b[39m random_state\u001b[38;5;241m.\u001b[39muniform(\n\u001b[1;32m 1074\u001b[0m low\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m-\u001b[39m\u001b[38;5;241m10.0\u001b[39m, high\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m10.0\u001b[39m, size\u001b[38;5;241m=\u001b[39m(graph\u001b[38;5;241m.\u001b[39mshape[\u001b[38;5;241m0\u001b[39m], n_components)\n\u001b[1;32m 1075\u001b[0m )\u001b[38;5;241m.\u001b[39mastype(np\u001b[38;5;241m.\u001b[39mfloat32)\n\u001b[1;32m 1076\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(init, \u001b[38;5;28mstr\u001b[39m) \u001b[38;5;129;01mand\u001b[39;00m init \u001b[38;5;241m==\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mspectral\u001b[39m\u001b[38;5;124m\"\u001b[39m:\n\u001b[1;32m 1077\u001b[0m \u001b[38;5;66;03m# We add a little noise to avoid local minima for optimization to come\u001b[39;00m\n\u001b[0;32m-> 1078\u001b[0m initialisation \u001b[38;5;241m=\u001b[39m \u001b[43mspectral_layout\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 1079\u001b[0m \u001b[43m \u001b[49m\u001b[43mdata\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1080\u001b[0m \u001b[43m \u001b[49m\u001b[43mgraph\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1081\u001b[0m \u001b[43m \u001b[49m\u001b[43mn_components\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1082\u001b[0m \u001b[43m \u001b[49m\u001b[43mrandom_state\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1083\u001b[0m \u001b[43m \u001b[49m\u001b[43mmetric\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmetric\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1084\u001b[0m \u001b[43m \u001b[49m\u001b[43mmetric_kwds\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmetric_kwds\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 1085\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 1086\u001b[0m expansion \u001b[38;5;241m=\u001b[39m \u001b[38;5;241m10.0\u001b[39m \u001b[38;5;241m/\u001b[39m np\u001b[38;5;241m.\u001b[39mabs(initialisation)\u001b[38;5;241m.\u001b[39mmax()\n\u001b[1;32m 1087\u001b[0m embedding \u001b[38;5;241m=\u001b[39m (initialisation \u001b[38;5;241m*\u001b[39m expansion)\u001b[38;5;241m.\u001b[39mastype(\n\u001b[1;32m 1088\u001b[0m np\u001b[38;5;241m.\u001b[39mfloat32\n\u001b[1;32m 1089\u001b[0m ) \u001b[38;5;241m+\u001b[39m random_state\u001b[38;5;241m.\u001b[39mnormal(\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 1092\u001b[0m np\u001b[38;5;241m.\u001b[39mfloat32\n\u001b[1;32m 1093\u001b[0m )\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/umap/spectral.py:332\u001b[0m, in \u001b[0;36mspectral_layout\u001b[0;34m(data, graph, dim, random_state, metric, metric_kwds)\u001b[0m\n\u001b[1;32m 330\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[1;32m 331\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m L\u001b[38;5;241m.\u001b[39mshape[\u001b[38;5;241m0\u001b[39m] \u001b[38;5;241m<\u001b[39m \u001b[38;5;241m2000000\u001b[39m:\n\u001b[0;32m--> 332\u001b[0m eigenvalues, eigenvectors \u001b[38;5;241m=\u001b[39m \u001b[43mscipy\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43msparse\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mlinalg\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43meigsh\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 333\u001b[0m \u001b[43m \u001b[49m\u001b[43mL\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 334\u001b[0m \u001b[43m \u001b[49m\u001b[43mk\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 335\u001b[0m \u001b[43m \u001b[49m\u001b[43mwhich\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mSM\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 336\u001b[0m \u001b[43m \u001b[49m\u001b[43mncv\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mnum_lanczos_vectors\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 337\u001b[0m \u001b[43m \u001b[49m\u001b[43mtol\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;241;43m1e-4\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 338\u001b[0m \u001b[43m \u001b[49m\u001b[43mv0\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mnp\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mones\u001b[49m\u001b[43m(\u001b[49m\u001b[43mL\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mshape\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 339\u001b[0m \u001b[43m \u001b[49m\u001b[43mmaxiter\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mgraph\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mshape\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;241;43m5\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 340\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 341\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 342\u001b[0m eigenvalues, eigenvectors \u001b[38;5;241m=\u001b[39m scipy\u001b[38;5;241m.\u001b[39msparse\u001b[38;5;241m.\u001b[39mlinalg\u001b[38;5;241m.\u001b[39mlobpcg(\n\u001b[1;32m 343\u001b[0m L, random_state\u001b[38;5;241m.\u001b[39mnormal(size\u001b[38;5;241m=\u001b[39m(L\u001b[38;5;241m.\u001b[39mshape[\u001b[38;5;241m0\u001b[39m], k)), largest\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mFalse\u001b[39;00m, tol\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m1e-8\u001b[39m\n\u001b[1;32m 344\u001b[0m )\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/scipy/sparse/linalg/_eigen/arpack/arpack.py:1597\u001b[0m, in \u001b[0;36meigsh\u001b[0;34m(A, k, M, sigma, which, v0, ncv, maxiter, tol, return_eigenvectors, Minv, OPinv, mode)\u001b[0m\n\u001b[1;32m 1592\u001b[0m warnings\u001b[38;5;241m.\u001b[39mwarn(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mk >= N for N * N square matrix. \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 1593\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mAttempting to use scipy.linalg.eigh instead.\u001b[39m\u001b[38;5;124m\"\u001b[39m,\n\u001b[1;32m 1594\u001b[0m \u001b[38;5;167;01mRuntimeWarning\u001b[39;00m)\n\u001b[1;32m 1596\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m issparse(A):\n\u001b[0;32m-> 1597\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mCannot use scipy.linalg.eigh for sparse A with \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 1598\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mk >= N. Use scipy.linalg.eigh(A.toarray()) or\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 1599\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m reduce k.\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 1600\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(A, LinearOperator):\n\u001b[1;32m 1601\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mCannot use scipy.linalg.eigh for LinearOperator \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 1602\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mA with k >= N.\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n",
"\u001b[0;31mTypeError\u001b[0m: Cannot use scipy.linalg.eigh for sparse A with k >= N. Use scipy.linalg.eigh(A.toarray()) or reduce k."
]
}
],
"source": [
"input_file_path = \"data_out.csv\"\n",
"topic_model, topic_df, most_frequent_topics = ammico.text.PostprocessText(\n",
" use_csv=True, csv_path=input_file_path\n",
").analyse_topic(return_topics=10)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "0b6ef6d7",
"metadata": {},
"source": [
"### Access frequent topics\n",
"A topic of `-1` stands for an outlier and should be ignored. Topic count is the number of occurence of that topic. The output is structured from most frequent to least frequent topic."
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "43288cda-61bb-4ff1-a209-dcfcc4916b1f",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-26T09:38:49.275195Z",
"iopub.status.busy": "2023-05-26T09:38:49.274521Z",
"iopub.status.idle": "2023-05-26T09:38:49.313341Z",
"shell.execute_reply": "2023-05-26T09:38:49.312622Z"
}
},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'topic_df' is not defined",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[12], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[43mtopic_df\u001b[49m)\n",
"\u001b[0;31mNameError\u001b[0m: name 'topic_df' is not defined"
]
}
],
"source": [
"print(topic_df)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "b3316770",
"metadata": {},
"source": [
"### Get information for specific topic\n",
"The most frequent topics can be accessed through `most_frequent_topics` with the most occuring topics first in the list."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "db14fe03",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-26T09:38:49.316604Z",
"iopub.status.busy": "2023-05-26T09:38:49.316142Z",
"iopub.status.idle": "2023-05-26T09:38:49.353752Z",
"shell.execute_reply": "2023-05-26T09:38:49.353034Z"
}
},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'most_frequent_topics' is not defined",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[13], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m topic \u001b[38;5;129;01min\u001b[39;00m \u001b[43mmost_frequent_topics\u001b[49m:\n\u001b[1;32m 2\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mTopic:\u001b[39m\u001b[38;5;124m\"\u001b[39m, topic)\n",
"\u001b[0;31mNameError\u001b[0m: name 'most_frequent_topics' is not defined"
]
}
],
"source": [
"for topic in most_frequent_topics:\n",
" print(\"Topic:\", topic)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "d10f701e",
"metadata": {},
"source": [
"### Topic visualization\n",
"The topics can also be visualized. Careful: This only works if there is sufficient data (quantity and quality)."
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "2331afe6",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-26T09:38:49.357562Z",
"iopub.status.busy": "2023-05-26T09:38:49.357109Z",
"iopub.status.idle": "2023-05-26T09:38:49.391859Z",
"shell.execute_reply": "2023-05-26T09:38:49.391145Z"
}
},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'topic_model' is not defined",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[14], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mtopic_model\u001b[49m\u001b[38;5;241m.\u001b[39mvisualize_topics()\n",
"\u001b[0;31mNameError\u001b[0m: name 'topic_model' is not defined"
]
}
],
"source": [
"topic_model.visualize_topics()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "f4eaf353",
"metadata": {},
"source": [
"### Save the model\n",
"The model can be saved for future use."
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "e5e8377c",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-26T09:38:49.394940Z",
"iopub.status.busy": "2023-05-26T09:38:49.394501Z",
"iopub.status.idle": "2023-05-26T09:38:49.428075Z",
"shell.execute_reply": "2023-05-26T09:38:49.427305Z"
}
},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'topic_model' is not defined",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[15], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mtopic_model\u001b[49m\u001b[38;5;241m.\u001b[39msave(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmisinfo_posts\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n",
"\u001b[0;31mNameError\u001b[0m: name 'topic_model' is not defined"
]
}
],
"source": [
"topic_model.save(\"misinfo_posts\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7c94edb9",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
},
"vscode": {
"interpreter": {
"hash": "da98320027a74839c7141b42ef24e2d47d628ba1f51115c13da5d8b45a372ec2"
}
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"state": {
"0440bf3b407c4dc4b87333caabc50b84": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_567e12cc410b45969b21e7cd4bb3cb2f",
"IPY_MODEL_fe0529b874484ae097de6747ffd66ca5",
"IPY_MODEL_6878d5284f044778b84d1f36af58d041"
],
"layout": "IPY_MODEL_8061186407454de094c81415f704b7bb",
"tabbable": null,
"tooltip": null
}
},
"061c33c4780443b4b57a477064a04d3f": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"06333b8093784188ad9702b2344b9791": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_fcb58fe880424ed9ad5d1575eda055e3",
"placeholder": "",
"style": "IPY_MODEL_8828a33b1bc04ba5989c56f460332a56",
"tabbable": null,
"tooltip": null,
"value": " 456k/456k [00:00&lt;00:00, 27.4MB/s]"
}
},
"0f55b926768449f9a969c120442349c1": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"1062c73f02c44db99d8abe6a89702904": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"15e26d521938428a93008c39201e5684": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"1a5ecb435ab046df9eadcd6d185aa4fc": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"1ba6109f89264048baac7068e26a07ba": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_e4563c8cebad4c739e9e139408437ec3",
"placeholder": "",
"style": "IPY_MODEL_6f02f462b6494e9799d6c6ed94e9c083",
"tabbable": null,
"tooltip": null,
"value": " 1.33G/1.33G [00:17&lt;00:00, 75.6MB/s]"
}
},
"1cbf283ddb9f4df6a9c18b3f2d6c3e6c": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"1e0ecfdcb4144ae7822c2298444fcddd": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"1f2193c670c640c88001c750be5ba3e4": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_aefcf59c477d4d8388e6389559c92f92",
"IPY_MODEL_8da9fe8dba9544fdb3ca5924f4225891",
"IPY_MODEL_35008830141a470a82cb0f5b6816e1b8"
],
"layout": "IPY_MODEL_2c62130d52c34cee9e185624412c2f82",
"tabbable": null,
"tooltip": null
}
},
"1fc85ed7d97e43f8afea604ae571e083": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"22117190beb746bbb147fff67aaf0555": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_45b55e3e6c2e4f6fb2f3798d1c072a5c",
"placeholder": "",
"style": "IPY_MODEL_ba840150a18d4892b69abab2b81df577",
"tabbable": null,
"tooltip": null,
"value": " 48.0/48.0 [00:00&lt;00:00, 2.81kB/s]"
}
},
"235d2e616ddd4ac0979efd3e09dcbb3d": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_5c79b2c4c9a541bab94f0e42278c795e",
"max": 231508.0,
"min": 0.0,
"orientation": "horizontal",
"style": "IPY_MODEL_3d7141cb7e56467aa773cb18fc436645",
"tabbable": null,
"tooltip": null,
"value": 231508.0
}
},
"2672cb20a6d340a4a547cf0ef8bd263e": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_1fc85ed7d97e43f8afea604ae571e083",
"max": 898822.0,
"min": 0.0,
"orientation": "horizontal",
"style": "IPY_MODEL_ada9507ce35045bbb5ca330ee9dcc2a6",
"tabbable": null,
"tooltip": null,
"value": 898822.0
}
},
"27192f2cef0a4e2e962e38448ec45f98": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_595ca027d75940a9bb35f89586088f2e",
"max": 456318.0,
"min": 0.0,
"orientation": "horizontal",
"style": "IPY_MODEL_c30a51e35c4f4bfb8e643146f7099988",
"tabbable": null,
"tooltip": null,
"value": 456318.0
}
},
"29435150e1e54f9e82dcf506395d6c3e": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"2b22abf7e7144f499902318115731078": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_b9d3599015ae4adc8959e9e71bb9ae46",
"IPY_MODEL_bd8b6aa2fe38417f8a4cf0b1b5ba9fc4",
"IPY_MODEL_1ba6109f89264048baac7068e26a07ba"
],
"layout": "IPY_MODEL_6d9513cd408f440cb33339fce1d957d5",
"tabbable": null,
"tooltip": null
}
},
"2c62130d52c34cee9e185624412c2f82": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"30be2b75761c44b9bac54d8b548035e3": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"30eebb8044934401a2bef123e186bdf2": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"314fa5543918487680a2d9ec7b5c900e": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"33fe87d050614686b16afcd1bc1218c5": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_ffa644e148d04829bd2dd245c2418c2e",
"placeholder": "",
"style": "IPY_MODEL_1cbf283ddb9f4df6a9c18b3f2d6c3e6c",
"tabbable": null,
"tooltip": null,
"value": "Downloading (…)ve/f2482bf/vocab.txt: 100%"
}
},
"35008830141a470a82cb0f5b6816e1b8": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_a999a74625824b70a89220fc9391ef0c",
"placeholder": "",
"style": "IPY_MODEL_4570ce92cc0046f1b5cdea6388afc9a6",
"tabbable": null,
"tooltip": null,
"value": " 1.22G/1.22G [00:07&lt;00:00, 185MB/s]"
}
},
"37e5b6f4878549e3883ea8e67a963f86": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_a66fd706e4a44f7e930c588f47adefe7",
"placeholder": "",
"style": "IPY_MODEL_c511a30027cf4eee990602355a68e5c8",
"tabbable": null,
"tooltip": null,
"value": "Downloading (…)/af0f99b/config.json: 100%"
}
},
"3850a5b6f6d642ff8e4bf76bdde5c740": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_f9e66d0314524529bd4a6ee9d8e3faef",
"IPY_MODEL_27192f2cef0a4e2e962e38448ec45f98",
"IPY_MODEL_06333b8093784188ad9702b2344b9791"
],
"layout": "IPY_MODEL_b3411c6a3797495fbf87f322feb15661",
"tabbable": null,
"tooltip": null
}
},
"3b5394cc3f5c47beafb332747fba8918": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_d22e5118daf740ee840f35215054b9e0",
"IPY_MODEL_2672cb20a6d340a4a547cf0ef8bd263e",
"IPY_MODEL_7b28bc9518f04dd796c85d487bbb0d36"
],
"layout": "IPY_MODEL_0f55b926768449f9a969c120442349c1",
"tabbable": null,
"tooltip": null
}
},
"3bd4002857b44b18943bc8f05037321b": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"3d5ff7fe38484b2f9a765b4588e19f2a": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"3d7141cb7e56467aa773cb18fc436645": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"4570ce92cc0046f1b5cdea6388afc9a6": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"45b55e3e6c2e4f6fb2f3798d1c072a5c": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"5237f9c534e24937b5376273ce58b8df": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_a081fbeb31a4415faac874ce463a5ea4",
"IPY_MODEL_8837dbee76df412bb4530a21311e9481",
"IPY_MODEL_d0cec4b81fee40d0ae86486979881acb"
],
"layout": "IPY_MODEL_5ce370cb3f154f148ed5fb6c2db20f3e",
"tabbable": null,
"tooltip": null
}
},
"53cddafe2c82467fb766bb43cf58f4e6": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"567e12cc410b45969b21e7cd4bb3cb2f": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_a1c0eebc376b4cbd8c3001a8e8e3ab14",
"placeholder": "",
"style": "IPY_MODEL_1062c73f02c44db99d8abe6a89702904",
"tabbable": null,
"tooltip": null,
"value": "Downloading pytorch_model.bin: 100%"
}
},
"57078343da114159bca3534f2c6bf4ee": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"595ca027d75940a9bb35f89586088f2e": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"5c79b2c4c9a541bab94f0e42278c795e": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"5ce370cb3f154f148ed5fb6c2db20f3e": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"5e4cdaae4f124324b76a8bd44aa9b5fc": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"635d9e61bfd049568c7d6d3cd8b42cd1": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"63be3ca0362b4f75ba7b25bdf3d1500f": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"659303a039f64e3380e704a92deb30c2": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_c86d4560b38d496e89f99b22c55bfad6",
"IPY_MODEL_235d2e616ddd4ac0979efd3e09dcbb3d",
"IPY_MODEL_ebe26a353dd6400bbcddcc1d83f740ee"
],
"layout": "IPY_MODEL_c12474a0dc6a44d3be8a688f968ca09e",
"tabbable": null,
"tooltip": null
}
},
"65df592b2aa54852b12bdf44b61b32a9": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"662898f7cc534c1a83ea051f1abdf1d9": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_e5dc6af7ae5a41e78285a08bf015b73f",
"IPY_MODEL_b0b21a3ecfc349b99732f8a0b667f81c",
"IPY_MODEL_7ffe408e60734da5878dc87324ecd4eb"
],
"layout": "IPY_MODEL_f2f07a5fe6de4d1881d205f2261a9d93",
"tabbable": null,
"tooltip": null
}
},
"68302cc18692413bbf4d28e6649e953d": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_9968658c3cbb4e40b3be10efaefbb110",
"placeholder": "",
"style": "IPY_MODEL_6b5abf71e6c44601bb62b00985249c29",
"tabbable": null,
"tooltip": null,
"value": " 60.0/60.0 [00:00&lt;00:00, 4.23kB/s]"
}
},
"6878d5284f044778b84d1f36af58d041": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_bebf0b50b7384d92b9dccadfa64c6e3c",
"placeholder": "",
"style": "IPY_MODEL_e8844fb5198e4371b3ce3304238bbcf5",
"tabbable": null,
"tooltip": null,
"value": " 268M/268M [00:01&lt;00:00, 173MB/s]"
}
},
"69c1671508d74c349abe5573c2165279": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_15e26d521938428a93008c39201e5684",
"max": 26.0,
"min": 0.0,
"orientation": "horizontal",
"style": "IPY_MODEL_6b5075275e264bbf9a2a3176c5bc1419",
"tabbable": null,
"tooltip": null,
"value": 26.0
}
},
"6ab131dea3604ea9b53980bc469559ad": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_9c33e8e64533494c8606ae6333f46ee6",
"IPY_MODEL_69c1671508d74c349abe5573c2165279",
"IPY_MODEL_e0815bfe97394843ba7c86bf176f0d3a"
],
"layout": "IPY_MODEL_30eebb8044934401a2bef123e186bdf2",
"tabbable": null,
"tooltip": null
}
},
"6b5075275e264bbf9a2a3176c5bc1419": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"6b5abf71e6c44601bb62b00985249c29": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"6d9513cd408f440cb33339fce1d957d5": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"6f02f462b6494e9799d6c6ed94e9c083": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"6f0c5124a0cc4e539c83095ab12edc1b": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"73bafac92f4f4698b861de3ae1d28c25": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"75bf3872b92f4e8d886b30dcafff93c9": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"796df27060bd4a6883869d7076db72fc": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_a55be32388c1480aa6d394846c0e11d1",
"IPY_MODEL_9b09f02d33b54aa6b34a479491f427ae",
"IPY_MODEL_68302cc18692413bbf4d28e6649e953d"
],
"layout": "IPY_MODEL_7985c41a378b4d0a8816fb4fea868cf1",
"tabbable": null,
"tooltip": null
}
},
"7985c41a378b4d0a8816fb4fea868cf1": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"7b28bc9518f04dd796c85d487bbb0d36": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_3bd4002857b44b18943bc8f05037321b",
"placeholder": "",
"style": "IPY_MODEL_962453eea1084aac9468c99623034497",
"tabbable": null,
"tooltip": null,
"value": " 899k/899k [00:00&lt;00:00, 9.70MB/s]"
}
},
"7eec2c152d964702956897fd6c8d027f": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"7ffe408e60734da5878dc87324ecd4eb": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_998b090e4cf147d98a56ef59d7ab135b",
"placeholder": "",
"style": "IPY_MODEL_6f0c5124a0cc4e539c83095ab12edc1b",
"tabbable": null,
"tooltip": null,
"value": " 998/998 [00:00&lt;00:00, 59.0kB/s]"
}
},
"8061186407454de094c81415f704b7bb": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"81105190fb5646ef8ae0ba604990f3dd": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_a01d8655902f465eb6990f39e8fd4f6a",
"placeholder": "",
"style": "IPY_MODEL_b5c8d9d7e9f846f3b55d7f6073ac7dbc",
"tabbable": null,
"tooltip": null,
"value": " 213k/213k [00:00&lt;00:00, 14.0MB/s]"
}
},
"8828a33b1bc04ba5989c56f460332a56": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"8837dbee76df412bb4530a21311e9481": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_cc47c911b0db4fdbb81586f9bd8856d8",
"max": 1802.0,
"min": 0.0,
"orientation": "horizontal",
"style": "IPY_MODEL_75bf3872b92f4e8d886b30dcafff93c9",
"tabbable": null,
"tooltip": null,
"value": 1802.0
}
},
"8bc8dc23a1424259b777a736361cd5b7": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"8da9fe8dba9544fdb3ca5924f4225891": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_e9e6af85584a424eadf38a4bf0c283f2",
"max": 1222317369.0,
"min": 0.0,
"orientation": "horizontal",
"style": "IPY_MODEL_8bc8dc23a1424259b777a736361cd5b7",
"tabbable": null,
"tooltip": null,
"value": 1222317369.0
}
},
"8dabf8ecf7ed4d2db3948c0cfbba1cdc": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"8f2f1578ff644163a3c40635ca7bb712": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_1e0ecfdcb4144ae7822c2298444fcddd",
"placeholder": "",
"style": "IPY_MODEL_b8aaca13d4a742149ec0cd1fe31dd58e",
"tabbable": null,
"tooltip": null,
"value": " 629/629 [00:00&lt;00:00, 35.3kB/s]"
}
},
"9229f237dd5b4552ae198d9c6de78363": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"962453eea1084aac9468c99623034497": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"9787d683433343949ef2a9c3acaf109d": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"9968658c3cbb4e40b3be10efaefbb110": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"998b090e4cf147d98a56ef59d7ab135b": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"9b09f02d33b54aa6b34a479491f427ae": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_de2c47e67300445f92715fccac8ddb0a",
"max": 60.0,
"min": 0.0,
"orientation": "horizontal",
"style": "IPY_MODEL_73bafac92f4f4698b861de3ae1d28c25",
"tabbable": null,
"tooltip": null,
"value": 60.0
}
},
"9c33e8e64533494c8606ae6333f46ee6": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_c5528279daf540579fc74cb55bdd4f71",
"placeholder": "",
"style": "IPY_MODEL_e0f87c3d7b804219b4bd2229fd383858",
"tabbable": null,
"tooltip": null,
"value": "Downloading (…)okenizer_config.json: 100%"
}
},
"9e107d247362411799bbafbe5e9c6267": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"a01d8655902f465eb6990f39e8fd4f6a": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"a081fbeb31a4415faac874ce463a5ea4": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_314fa5543918487680a2d9ec7b5c900e",
"placeholder": "",
"style": "IPY_MODEL_635d9e61bfd049568c7d6d3cd8b42cd1",
"tabbable": null,
"tooltip": null,
"value": "Downloading (…)/a4f8f3e/config.json: 100%"
}
},
"a13481911c17402ca64eb5ca1289c6c5": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"a1c0eebc376b4cbd8c3001a8e8e3ab14": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"a2d5be26434a4fb883153c81cdf43430": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"a3d0252a125b4190bd7bf7656d5861bd": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"a488ba4004804c7699ba8c06940a3667": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"a55be32388c1480aa6d394846c0e11d1": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_c803c42eb9cc466ab69970a21f140ee6",
"placeholder": "",
"style": "IPY_MODEL_65df592b2aa54852b12bdf44b61b32a9",
"tabbable": null,
"tooltip": null,
"value": "Downloading (…)okenizer_config.json: 100%"
}
},
"a66fd706e4a44f7e930c588f47adefe7": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"a999a74625824b70a89220fc9391ef0c": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"ac22b32266bd4553a44f8175675d7f66": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"ad16e36578ef48989933e0367c380548": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_f1e9babfe8324c6c94b6d1650cb05cff",
"max": 48.0,
"min": 0.0,
"orientation": "horizontal",
"style": "IPY_MODEL_30be2b75761c44b9bac54d8b548035e3",
"tabbable": null,
"tooltip": null,
"value": 48.0
}
},
"ada9507ce35045bbb5ca330ee9dcc2a6": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"aefcf59c477d4d8388e6389559c92f92": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_a13481911c17402ca64eb5ca1289c6c5",
"placeholder": "",
"style": "IPY_MODEL_9229f237dd5b4552ae198d9c6de78363",
"tabbable": null,
"tooltip": null,
"value": "Downloading pytorch_model.bin: 100%"
}
},
"b0b21a3ecfc349b99732f8a0b667f81c": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_b13ded60d2ed4fe684772eeb642f44da",
"max": 998.0,
"min": 0.0,
"orientation": "horizontal",
"style": "IPY_MODEL_f9c7aa5668294434a78254c88991556e",
"tabbable": null,
"tooltip": null,
"value": 998.0
}
},
"b13ded60d2ed4fe684772eeb642f44da": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"b1c5920efd324ed58cbf4a32c65d1a8b": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"b1dbc2fa988c4c7c93f51f4ef9738536": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_061c33c4780443b4b57a477064a04d3f",
"max": 213450.0,
"min": 0.0,
"orientation": "horizontal",
"style": "IPY_MODEL_dc288ea3cb884eb88b3fe592e3b13a86",
"tabbable": null,
"tooltip": null,
"value": 213450.0
}
},
"b3411c6a3797495fbf87f322feb15661": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"b5c8d9d7e9f846f3b55d7f6073ac7dbc": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"b8aaca13d4a742149ec0cd1fe31dd58e": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"b9d3599015ae4adc8959e9e71bb9ae46": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_9e107d247362411799bbafbe5e9c6267",
"placeholder": "",
"style": "IPY_MODEL_29435150e1e54f9e82dcf506395d6c3e",
"tabbable": null,
"tooltip": null,
"value": "Downloading pytorch_model.bin: 100%"
}
},
"ba840150a18d4892b69abab2b81df577": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"bc3946fbb9334b6ba7265b4cb6bb7d54": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_f9088b55c29048769ace7f88e2bc2280",
"IPY_MODEL_ad16e36578ef48989933e0367c380548",
"IPY_MODEL_22117190beb746bbb147fff67aaf0555"
],
"layout": "IPY_MODEL_b1c5920efd324ed58cbf4a32c65d1a8b",
"tabbable": null,
"tooltip": null
}
},
"bd8b6aa2fe38417f8a4cf0b1b5ba9fc4": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_1a5ecb435ab046df9eadcd6d185aa4fc",
"max": 1334448817.0,
"min": 0.0,
"orientation": "horizontal",
"style": "IPY_MODEL_8dabf8ecf7ed4d2db3948c0cfbba1cdc",
"tabbable": null,
"tooltip": null,
"value": 1334448817.0
}
},
"bebf0b50b7384d92b9dccadfa64c6e3c": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"c0631177f2a44954a838cd01e33c16f2": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_33fe87d050614686b16afcd1bc1218c5",
"IPY_MODEL_b1dbc2fa988c4c7c93f51f4ef9738536",
"IPY_MODEL_81105190fb5646ef8ae0ba604990f3dd"
],
"layout": "IPY_MODEL_53cddafe2c82467fb766bb43cf58f4e6",
"tabbable": null,
"tooltip": null
}
},
"c12474a0dc6a44d3be8a688f968ca09e": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"c30a51e35c4f4bfb8e643146f7099988": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"c4e22bad14e140adb164718dd4f8a8a7": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"c511a30027cf4eee990602355a68e5c8": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"c5528279daf540579fc74cb55bdd4f71": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"c79b6fc5d57846fb85d5a4abe6f25d1f": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_37e5b6f4878549e3883ea8e67a963f86",
"IPY_MODEL_d9f370109a9642cb8db246eaa30a1ca6",
"IPY_MODEL_8f2f1578ff644163a3c40635ca7bb712"
],
"layout": "IPY_MODEL_c4e22bad14e140adb164718dd4f8a8a7",
"tabbable": null,
"tooltip": null
}
},
"c803c42eb9cc466ab69970a21f140ee6": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"c86d4560b38d496e89f99b22c55bfad6": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_d3b8fbb05a4d4282a2fdb2822b5fdd56",
"placeholder": "",
"style": "IPY_MODEL_a3d0252a125b4190bd7bf7656d5861bd",
"tabbable": null,
"tooltip": null,
"value": "Downloading (…)ve/af0f99b/vocab.txt: 100%"
}
},
"cc47c911b0db4fdbb81586f9bd8856d8": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"d009854f1b9c40b49a85270261f7893e": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"d0cec4b81fee40d0ae86486979881acb": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_a488ba4004804c7699ba8c06940a3667",
"placeholder": "",
"style": "IPY_MODEL_ac22b32266bd4553a44f8175675d7f66",
"tabbable": null,
"tooltip": null,
"value": " 1.80k/1.80k [00:00&lt;00:00, 103kB/s]"
}
},
"d22e5118daf740ee840f35215054b9e0": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_3d5ff7fe38484b2f9a765b4588e19f2a",
"placeholder": "",
"style": "IPY_MODEL_fbf5f3073a71494295fb9ae1f8954b1e",
"tabbable": null,
"tooltip": null,
"value": "Downloading (…)e/a4f8f3e/vocab.json: 100%"
}
},
"d30b96c66d564f049a7e3960afb361c9": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"d3b8fbb05a4d4282a2fdb2822b5fdd56": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"d9f370109a9642cb8db246eaa30a1ca6": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_d009854f1b9c40b49a85270261f7893e",
"max": 629.0,
"min": 0.0,
"orientation": "horizontal",
"style": "IPY_MODEL_a2d5be26434a4fb883153c81cdf43430",
"tabbable": null,
"tooltip": null,
"value": 629.0
}
},
"dc288ea3cb884eb88b3fe592e3b13a86": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"dd6bd2457b794574961d237522f66e71": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"de2c47e67300445f92715fccac8ddb0a": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"deb2d8dc0e5f4cf9aaef2fa097628744": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"e0815bfe97394843ba7c86bf176f0d3a": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_e1f681a21d7f473795c2d83314b502d8",
"placeholder": "",
"style": "IPY_MODEL_63be3ca0362b4f75ba7b25bdf3d1500f",
"tabbable": null,
"tooltip": null,
"value": " 26.0/26.0 [00:00&lt;00:00, 1.83kB/s]"
}
},
"e0f87c3d7b804219b4bd2229fd383858": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"e1f681a21d7f473795c2d83314b502d8": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"e4563c8cebad4c739e9e139408437ec3": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"e5dc6af7ae5a41e78285a08bf015b73f": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_5e4cdaae4f124324b76a8bd44aa9b5fc",
"placeholder": "",
"style": "IPY_MODEL_f2ff342e0a214b5eb62ae17a5df0ca38",
"tabbable": null,
"tooltip": null,
"value": "Downloading (…)/f2482bf/config.json: 100%"
}
},
"e8844fb5198e4371b3ce3304238bbcf5": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"e9e6af85584a424eadf38a4bf0c283f2": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"ebe26a353dd6400bbcddcc1d83f740ee": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_9787d683433343949ef2a9c3acaf109d",
"placeholder": "",
"style": "IPY_MODEL_deb2d8dc0e5f4cf9aaef2fa097628744",
"tabbable": null,
"tooltip": null,
"value": " 232k/232k [00:00&lt;00:00, 13.6MB/s]"
}
},
"f171b742f36b4558a8ecf8ef11190d8e": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"f1e9babfe8324c6c94b6d1650cb05cff": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"f2f07a5fe6de4d1881d205f2261a9d93": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"f2ff342e0a214b5eb62ae17a5df0ca38": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"f9088b55c29048769ace7f88e2bc2280": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_57078343da114159bca3534f2c6bf4ee",
"placeholder": "",
"style": "IPY_MODEL_7eec2c152d964702956897fd6c8d027f",
"tabbable": null,
"tooltip": null,
"value": "Downloading (…)okenizer_config.json: 100%"
}
},
"f9c7aa5668294434a78254c88991556e": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"f9e66d0314524529bd4a6ee9d8e3faef": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "HTMLView",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_d30b96c66d564f049a7e3960afb361c9",
"placeholder": "",
"style": "IPY_MODEL_f171b742f36b4558a8ecf8ef11190d8e",
"tabbable": null,
"tooltip": null,
"value": "Downloading (…)e/a4f8f3e/merges.txt: 100%"
}
},
"fbf5f3073a71494295fb9ae1f8954b1e": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "HTMLStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "HTMLStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "StyleView",
"background": null,
"description_width": "",
"font_size": null,
"text_color": null
}
},
"fcb58fe880424ed9ad5d1575eda055e3": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"fe0529b874484ae097de6747ffd66ca5": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "2.0.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "2.0.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "2.0.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_allow_html": false,
"layout": "IPY_MODEL_ff35d4081cfd4dd1b749e0515aa4264d",
"max": 267844284.0,
"min": 0.0,
"orientation": "horizontal",
"style": "IPY_MODEL_dd6bd2457b794574961d237522f66e71",
"tabbable": null,
"tooltip": null,
"value": 267844284.0
}
},
"ff35d4081cfd4dd1b749e0515aa4264d": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"ffa644e148d04829bd2dd245c2418c2e": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
}
},
"version_major": 2,
"version_minor": 0
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}