AMMICO/build/doctrees/nbsphinx/notebooks/Example faces.ipynb

{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "d2c4d40d-8aca-4024-8d19-a65c4efe825d",
   "metadata": {},
   "source": [
    "# Facial Expression recognition with DeepFace"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "51f8888b-d1a3-4b85-a596-95c0993fa192",
   "metadata": {},
   "source": [
    "Facial expressions can be detected using [DeepFace](https://github.com/serengil/deepface) and [RetinaFace](https://github.com/serengil/retinaface).\n",
    "\n",
    "The first cell is only run on google colab and installs the [ammico](https://github.com/ssciwr/AMMICO) package.\n",
    "\n",
    "After that, we can import `ammico` and read in the files given a folder path."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "50c1c1c7",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-10-20T12:16:48.375131Z",
     "iopub.status.busy": "2023-10-20T12:16:48.374644Z",
     "iopub.status.idle": "2023-10-20T12:16:48.383786Z",
     "shell.execute_reply": "2023-10-20T12:16:48.383130Z"
    }
   },
   "outputs": [],
   "source": [
    "# if running on google colab\n",
    "# flake8-noqa-cell\n",
    "import os\n",
    "\n",
    "if \"google.colab\" in str(get_ipython()):\n",
    "    # update python version\n",
    "    # install setuptools\n",
    "    # %pip install setuptools==61 -qqq\n",
    "    # install ammico\n",
    "    %pip install git+https://github.com/ssciwr/ammico.git -qqq\n",
    "    # mount google drive for data and API key\n",
    "    from google.colab import drive\n",
    "\n",
    "    drive.mount(\"/content/drive\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "b21e52a5-d379-42db-aae6-f2ab9ed9a369",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-10-20T12:16:48.387617Z",
     "iopub.status.busy": "2023-10-20T12:16:48.387148Z",
     "iopub.status.idle": "2023-10-20T12:17:00.113930Z",
     "shell.execute_reply": "2023-10-20T12:17:00.113134Z"
    }
   },
   "outputs": [],
   "source": [
    "import ammico\n",
    "from ammico import utils as mutils\n",
    "from ammico import display as mdisplay"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "a2bd2153",
   "metadata": {},
   "source": [
    "We select a subset of image files to try facial expression detection on, see the `limit` keyword. The `find_files` function finds image files within a given directory:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "afe7e638-f09d-47e7-9295-1c374bd64c53",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-10-20T12:17:00.119500Z",
     "iopub.status.busy": "2023-10-20T12:17:00.117662Z",
     "iopub.status.idle": "2023-10-20T12:17:00.124041Z",
     "shell.execute_reply": "2023-10-20T12:17:00.123399Z"
    }
   },
   "outputs": [],
   "source": [
    "# Here you need to provide the path to your google drive folder\n",
    "# or local folder containing the images\n",
    "images = mutils.find_files(\n",
    "    path=\"data/\",\n",
    "    limit=10,\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "705e7328",
   "metadata": {},
   "source": [
    "We need to initialize the main dictionary that contains all information for the images and is updated through each subsequent analysis:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "b37c0c91",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-10-20T12:17:00.127351Z",
     "iopub.status.busy": "2023-10-20T12:17:00.126919Z",
     "iopub.status.idle": "2023-10-20T12:17:00.130440Z",
     "shell.execute_reply": "2023-10-20T12:17:00.129758Z"
    }
   },
   "outputs": [],
   "source": [
    "mydict = mutils.initialize_dict(images)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "a9372561",
   "metadata": {},
   "source": [
    "To check the analysis, you can inspect the analyzed elements here. Loading the results takes a moment, so please be patient. If you are sure of what you are doing, you can skip this and directly export a csv file in the step below.\n",
    "Here, we display the face recognition results provided by the DeepFace and RetinaFace libraries. Click on the tabs to see the results in the right sidebar. You may need to increment the `port` number if you are already running several notebook instances on the same server."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "992499ed-33f1-4425-ad5d-738cf565d175",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-10-20T12:17:00.133475Z",
     "iopub.status.busy": "2023-10-20T12:17:00.133238Z",
     "iopub.status.idle": "2023-10-20T12:17:01.353924Z",
     "shell.execute_reply": "2023-10-20T12:17:01.353117Z"
    }
   },
   "outputs": [
    {
     "ename": "TypeError",
     "evalue": "__init__() got an unexpected keyword argument 'identify'",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mTypeError\u001b[0m                                 Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[5], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m analysis_explorer \u001b[38;5;241m=\u001b[39m \u001b[43mmdisplay\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mAnalysisExplorer\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmydict\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43midentify\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mfaces\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n\u001b[1;32m      2\u001b[0m analysis_explorer\u001b[38;5;241m.\u001b[39mrun_server(port \u001b[38;5;241m=\u001b[39m \u001b[38;5;241m8050\u001b[39m)\n",
      "\u001b[0;31mTypeError\u001b[0m: __init__() got an unexpected keyword argument 'identify'"
     ]
    }
   ],
   "source": [
    "analysis_explorer = mdisplay.AnalysisExplorer(mydict, identify=\"faces\")\n",
    "analysis_explorer.run_server(port = 8050)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "6f974341",
   "metadata": {},
   "source": [
    "Instead of inspecting each of the images, you can also directly carry out the analysis and export the result into a csv. This may take a while depending on how many images you have loaded."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "6f97c7d0",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-10-20T12:17:01.357789Z",
     "iopub.status.busy": "2023-10-20T12:17:01.357181Z",
     "iopub.status.idle": "2023-10-20T12:17:04.842760Z",
     "shell.execute_reply": "2023-10-20T12:17:04.842009Z"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Downloading data from 'https://github.com/serengil/deepface_models/releases/download/v1.0/retinaface.h5' to file '/home/runner/.cache/pooch/3be32af6e4183fa0156bc33bda371147-retinaface.h5'.\n"
     ]
    },
    {
     "ename": "ValueError",
     "evalue": "('Input image file path (', '102141_2_eng', ') does not exist.')",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[6], line 2\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m key \u001b[38;5;129;01min\u001b[39;00m mydict\u001b[38;5;241m.\u001b[39mkeys():\n\u001b[0;32m----> 2\u001b[0m     mydict[key] \u001b[38;5;241m=\u001b[39m \u001b[43mammico\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfaces\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mEmotionDetector\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmydict\u001b[49m\u001b[43m[\u001b[49m\u001b[43mkey\u001b[49m\u001b[43m]\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43manalyse_image\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/faces.py:139\u001b[0m, in \u001b[0;36mEmotionDetector.analyse_image\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m    132\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21manalyse_image\u001b[39m(\u001b[38;5;28mself\u001b[39m) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m \u001b[38;5;28mdict\u001b[39m:\n\u001b[1;32m    133\u001b[0m \u001b[38;5;250m    \u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m    134\u001b[0m \u001b[38;5;124;03m    Performs facial expression analysis on the image.\u001b[39;00m\n\u001b[1;32m    135\u001b[0m \n\u001b[1;32m    136\u001b[0m \u001b[38;5;124;03m    Returns:\u001b[39;00m\n\u001b[1;32m    137\u001b[0m \u001b[38;5;124;03m        dict: The updated subdict dictionary with analysis results.\u001b[39;00m\n\u001b[1;32m    138\u001b[0m \u001b[38;5;124;03m    \"\"\"\u001b[39;00m\n\u001b[0;32m--> 139\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfacial_expression_analysis\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/faces.py:188\u001b[0m, in \u001b[0;36mEmotionDetector.facial_expression_analysis\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m    185\u001b[0m \u001b[38;5;66;03m# Find (multiple) faces in the image and cut them\u001b[39;00m\n\u001b[1;32m    186\u001b[0m retinaface_model\u001b[38;5;241m.\u001b[39mget()\n\u001b[0;32m--> 188\u001b[0m faces \u001b[38;5;241m=\u001b[39m \u001b[43mRetinaFace\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mextract_faces\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43msubdict\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mfilename\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    189\u001b[0m \u001b[38;5;66;03m# If no faces are found, we return empty keys\u001b[39;00m\n\u001b[1;32m    190\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(faces) \u001b[38;5;241m==\u001b[39m \u001b[38;5;241m0\u001b[39m:\n",
      "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/retinaface/RetinaFace.py:190\u001b[0m, in \u001b[0;36mextract_faces\u001b[0;34m(img_path, threshold, model, align, allow_upscaling)\u001b[0m\n\u001b[1;32m    186\u001b[0m resp \u001b[38;5;241m=\u001b[39m []\n\u001b[1;32m    188\u001b[0m \u001b[38;5;66;03m#---------------------------\u001b[39;00m\n\u001b[0;32m--> 190\u001b[0m img \u001b[38;5;241m=\u001b[39m \u001b[43mget_image\u001b[49m\u001b[43m(\u001b[49m\u001b[43mimg_path\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    192\u001b[0m \u001b[38;5;66;03m#---------------------------\u001b[39;00m\n\u001b[1;32m    194\u001b[0m obj \u001b[38;5;241m=\u001b[39m detect_faces(img_path \u001b[38;5;241m=\u001b[39m img, threshold \u001b[38;5;241m=\u001b[39m threshold, model \u001b[38;5;241m=\u001b[39m model, allow_upscaling \u001b[38;5;241m=\u001b[39m allow_upscaling)\n",
      "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/retinaface/RetinaFace.py:43\u001b[0m, in \u001b[0;36mget_image\u001b[0;34m(img_path)\u001b[0m\n\u001b[1;32m     41\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mtype\u001b[39m(img_path) \u001b[38;5;241m==\u001b[39m \u001b[38;5;28mstr\u001b[39m:  \u001b[38;5;66;03m# Load from file path\u001b[39;00m\n\u001b[1;32m     42\u001b[0m     \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m os\u001b[38;5;241m.\u001b[39mpath\u001b[38;5;241m.\u001b[39misfile(img_path):\n\u001b[0;32m---> 43\u001b[0m         \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mInput image file path (\u001b[39m\u001b[38;5;124m\"\u001b[39m, img_path, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m) does not exist.\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m     44\u001b[0m     img \u001b[38;5;241m=\u001b[39m cv2\u001b[38;5;241m.\u001b[39mimread(img_path)\n\u001b[1;32m     46\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(img_path, np\u001b[38;5;241m.\u001b[39mndarray):  \u001b[38;5;66;03m# Use given NumPy array\u001b[39;00m\n",
      "\u001b[0;31mValueError\u001b[0m: ('Input image file path (', '102141_2_eng', ') does not exist.')"
     ]
    }
   ],
   "source": [
    "for key in mydict.keys():\n",
    "    mydict[key] = ammico.faces.EmotionDetector(mydict[key]).analyse_image()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "174357b1",
   "metadata": {},
   "source": [
    "These steps are required to convert the dictionary of dictionarys into a dictionary with lists, that can be converted into a pandas dataframe and exported to a csv file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "604bd257",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-10-20T12:17:04.846149Z",
     "iopub.status.busy": "2023-10-20T12:17:04.845908Z",
     "iopub.status.idle": "2023-10-20T12:17:05.565679Z",
     "shell.execute_reply": "2023-10-20T12:17:05.564801Z"
    }
   },
   "outputs": [
    {
     "ename": "ValueError",
     "evalue": "All arrays must be of the same length",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[7], line 2\u001b[0m\n\u001b[1;32m      1\u001b[0m outdict \u001b[38;5;241m=\u001b[39m mutils\u001b[38;5;241m.\u001b[39mappend_data_to_dict(mydict)\n\u001b[0;32m----> 2\u001b[0m df \u001b[38;5;241m=\u001b[39m \u001b[43mmutils\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mdump_df\u001b[49m\u001b[43m(\u001b[49m\u001b[43moutdict\u001b[49m\u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m~/work/AMMICO/AMMICO/ammico/utils.py:222\u001b[0m, in \u001b[0;36mdump_df\u001b[0;34m(mydict)\u001b[0m\n\u001b[1;32m    220\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mdump_df\u001b[39m(mydict: \u001b[38;5;28mdict\u001b[39m) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m DataFrame:\n\u001b[1;32m    221\u001b[0m \u001b[38;5;250m    \u001b[39m\u001b[38;5;124;03m\"\"\"Utility to dump the dictionary into a dataframe.\"\"\"\u001b[39;00m\n\u001b[0;32m--> 222\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mDataFrame\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfrom_dict\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmydict\u001b[49m\u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/pandas/core/frame.py:1816\u001b[0m, in \u001b[0;36mDataFrame.from_dict\u001b[0;34m(cls, data, orient, dtype, columns)\u001b[0m\n\u001b[1;32m   1810\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\n\u001b[1;32m   1811\u001b[0m         \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mExpected \u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mindex\u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124m, \u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mcolumns\u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124m or \u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mtight\u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124m for orient parameter. \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m   1812\u001b[0m         \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mGot \u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;132;01m{\u001b[39;00morient\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124m instead\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m   1813\u001b[0m     )\n\u001b[1;32m   1815\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m orient \u001b[38;5;241m!=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtight\u001b[39m\u001b[38;5;124m\"\u001b[39m:\n\u001b[0;32m-> 1816\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mcls\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43mdata\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mindex\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mindex\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcolumns\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcolumns\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mdtype\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdtype\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m   1817\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m   1818\u001b[0m     realdata \u001b[38;5;241m=\u001b[39m data[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mdata\u001b[39m\u001b[38;5;124m\"\u001b[39m]\n",
      "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/pandas/core/frame.py:736\u001b[0m, in \u001b[0;36mDataFrame.__init__\u001b[0;34m(self, data, index, columns, dtype, copy)\u001b[0m\n\u001b[1;32m    730\u001b[0m     mgr \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_init_mgr(\n\u001b[1;32m    731\u001b[0m         data, axes\u001b[38;5;241m=\u001b[39m{\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mindex\u001b[39m\u001b[38;5;124m\"\u001b[39m: index, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mcolumns\u001b[39m\u001b[38;5;124m\"\u001b[39m: columns}, dtype\u001b[38;5;241m=\u001b[39mdtype, copy\u001b[38;5;241m=\u001b[39mcopy\n\u001b[1;32m    732\u001b[0m     )\n\u001b[1;32m    734\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(data, \u001b[38;5;28mdict\u001b[39m):\n\u001b[1;32m    735\u001b[0m     \u001b[38;5;66;03m# GH#38939 de facto copy defaults to False only in non-dict cases\u001b[39;00m\n\u001b[0;32m--> 736\u001b[0m     mgr \u001b[38;5;241m=\u001b[39m \u001b[43mdict_to_mgr\u001b[49m\u001b[43m(\u001b[49m\u001b[43mdata\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mindex\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcolumns\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mdtype\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdtype\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcopy\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcopy\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mtyp\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmanager\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    737\u001b[0m \u001b[38;5;28;01melif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(data, ma\u001b[38;5;241m.\u001b[39mMaskedArray):\n\u001b[1;32m    738\u001b[0m     \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01mnumpy\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mma\u001b[39;00m \u001b[38;5;28;01mimport\u001b[39;00m mrecords\n",
      "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/pandas/core/internals/construction.py:503\u001b[0m, in \u001b[0;36mdict_to_mgr\u001b[0;34m(data, index, columns, dtype, typ, copy)\u001b[0m\n\u001b[1;32m    499\u001b[0m     \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m    500\u001b[0m         \u001b[38;5;66;03m# dtype check to exclude e.g. range objects, scalars\u001b[39;00m\n\u001b[1;32m    501\u001b[0m         arrays \u001b[38;5;241m=\u001b[39m [x\u001b[38;5;241m.\u001b[39mcopy() \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mhasattr\u001b[39m(x, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mdtype\u001b[39m\u001b[38;5;124m\"\u001b[39m) \u001b[38;5;28;01melse\u001b[39;00m x \u001b[38;5;28;01mfor\u001b[39;00m x \u001b[38;5;129;01min\u001b[39;00m arrays]\n\u001b[0;32m--> 503\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43marrays_to_mgr\u001b[49m\u001b[43m(\u001b[49m\u001b[43marrays\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcolumns\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mindex\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mdtype\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdtype\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mtyp\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mtyp\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mconsolidate\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcopy\u001b[49m\u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/pandas/core/internals/construction.py:114\u001b[0m, in \u001b[0;36marrays_to_mgr\u001b[0;34m(arrays, columns, index, dtype, verify_integrity, typ, consolidate)\u001b[0m\n\u001b[1;32m    111\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m verify_integrity:\n\u001b[1;32m    112\u001b[0m     \u001b[38;5;66;03m# figure out the index, if necessary\u001b[39;00m\n\u001b[1;32m    113\u001b[0m     \u001b[38;5;28;01mif\u001b[39;00m index \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[0;32m--> 114\u001b[0m         index \u001b[38;5;241m=\u001b[39m \u001b[43m_extract_index\u001b[49m\u001b[43m(\u001b[49m\u001b[43marrays\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    115\u001b[0m     \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m    116\u001b[0m         index \u001b[38;5;241m=\u001b[39m ensure_index(index)\n",
      "File \u001b[0;32m/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/pandas/core/internals/construction.py:677\u001b[0m, in \u001b[0;36m_extract_index\u001b[0;34m(data)\u001b[0m\n\u001b[1;32m    675\u001b[0m lengths \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mlist\u001b[39m(\u001b[38;5;28mset\u001b[39m(raw_lengths))\n\u001b[1;32m    676\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(lengths) \u001b[38;5;241m>\u001b[39m \u001b[38;5;241m1\u001b[39m:\n\u001b[0;32m--> 677\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mAll arrays must be of the same length\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m    679\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m have_dicts:\n\u001b[1;32m    680\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\n\u001b[1;32m    681\u001b[0m         \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mMixing dicts with non-Series may lead to ambiguous ordering.\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m    682\u001b[0m     )\n",
      "\u001b[0;31mValueError\u001b[0m: All arrays must be of the same length"
     ]
    }
   ],
   "source": [
    "outdict = mutils.append_data_to_dict(mydict)\n",
    "df = mutils.dump_df(outdict)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "8373d9f8",
   "metadata": {},
   "source": [
    "Check the dataframe:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "aa4b518a",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-10-20T12:17:05.569578Z",
     "iopub.status.busy": "2023-10-20T12:17:05.568999Z",
     "iopub.status.idle": "2023-10-20T12:17:05.608066Z",
     "shell.execute_reply": "2023-10-20T12:17:05.607399Z"
    }
   },
   "outputs": [
    {
     "ename": "NameError",
     "evalue": "name 'df' is not defined",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mNameError\u001b[0m                                 Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[8], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mdf\u001b[49m\u001b[38;5;241m.\u001b[39mhead(\u001b[38;5;241m10\u001b[39m)\n",
      "\u001b[0;31mNameError\u001b[0m: name 'df' is not defined"
     ]
    }
   ],
   "source": [
    "df.head(10)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "579cd59f",
   "metadata": {},
   "source": [
    "Write the csv file - here you should provide a file path and file name for the csv file to be written."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "4618decb",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-10-20T12:17:05.611905Z",
     "iopub.status.busy": "2023-10-20T12:17:05.611437Z",
     "iopub.status.idle": "2023-10-20T12:17:05.645473Z",
     "shell.execute_reply": "2023-10-20T12:17:05.644763Z"
    }
   },
   "outputs": [
    {
     "ename": "NameError",
     "evalue": "name 'df' is not defined",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mNameError\u001b[0m                                 Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[9], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mdf\u001b[49m\u001b[38;5;241m.\u001b[39mto_csv(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mdata_out.csv\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n",
      "\u001b[0;31mNameError\u001b[0m: name 'df' is not defined"
     ]
    }
   ],
   "source": [
    "df.to_csv(\"data_out.csv\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1a80023",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.18"
  },
  "vscode": {
   "interpreter": {
    "hash": "da98320027a74839c7141b42ef24e2d47d628ba1f51115c13da5d8b45a372ec2"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}