зеркало из
https://github.com/ssciwr/AMMICO.git
synced 2025-10-29 13:06:04 +02:00
* [pre-commit.ci] pre-commit autoupdate updates: - [github.com/kynan/nbstripout: 0.6.1 → 0.7.1](https://github.com/kynan/nbstripout/compare/0.6.1...0.7.1) * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
184 строки
5.7 KiB
Plaintext
184 строки
5.7 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"id": "0",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Crop posts module"
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"id": "1",
|
|
"metadata": {},
|
|
"source": [
|
|
"Crop posts from social media posts images, to keep import text informations from social media posts images.\n",
|
|
"We can set some manually cropped views from social media posts as reference for cropping the same type social media posts images."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "2",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Please ignore this cell: extra install steps that are only executed when running the notebook on Google Colab\n",
|
|
"# flake8-noqa-cell\n",
|
|
"import os\n",
|
|
"if 'google.colab' in str(get_ipython()):\n",
|
|
" # we're running on colab\n",
|
|
" # first install pinned version of setuptools (latest version doesn't seem to work with this package on colab)\n",
|
|
" %pip install setuptools==61 -qqq\n",
|
|
" # install the moralization package\n",
|
|
" %pip install git+https://github.com/ssciwr/AMMICO.git -qqq\n",
|
|
"\n",
|
|
" # prevent loading of the wrong opencv library\n",
|
|
" %pip uninstall -y opencv-contrib-python\n",
|
|
" %pip install opencv-contrib-python\n",
|
|
"\n",
|
|
" from google.colab import drive\n",
|
|
" drive.mount('/content/drive')\n",
|
|
"\n",
|
|
" if not os.path.isdir('/content/ref'):\n",
|
|
" !wget https://github.com/ssciwr/AMMICO/archive/refs/heads/ref-data.zip -q\n",
|
|
" !unzip -qq ref-data.zip -d . && mv -f AMMICO-ref-data/data/ref . && rm -rf AMMICO-ref-data ref-data.zip"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "3",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"import ammico.cropposts as crpo\n",
|
|
"import ammico.utils as utils\n",
|
|
"import matplotlib.pyplot as plt\n",
|
|
"import cv2\n",
|
|
"import importlib_resources\n",
|
|
"pkg = importlib_resources.files(\"ammico\")"
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"id": "4",
|
|
"metadata": {},
|
|
"source": [
|
|
"The cropping is carried out by finding reference images on the image to be cropped. If a reference matches a region on the image, then everything below the matched region is removed. Manually look at a reference and an example post with the code below."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "5",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# load ref view for cropping the same type social media posts images.\n",
|
|
"# substitute the below paths for your samples\n",
|
|
"path_ref = pkg / \"data\" / \"ref\" / \"ref-00.png\"\n",
|
|
"ref_view = cv2.imread(path_ref.as_posix())\n",
|
|
"RGB_ref_view = cv2.cvtColor(ref_view, cv2.COLOR_BGR2RGB)\n",
|
|
"plt.figure(figsize=(10, 15))\n",
|
|
"plt.imshow(RGB_ref_view)\n",
|
|
"plt.show()\n",
|
|
"\n",
|
|
"path_post = pkg / \"data\" / \"test-crop-image.png\"\n",
|
|
"view = cv2.imread(path_post.as_posix())\n",
|
|
"RGB_view = cv2.cvtColor(view, cv2.COLOR_BGR2RGB)\n",
|
|
"plt.figure(figsize=(10, 15))\n",
|
|
"plt.imshow(RGB_view)\n",
|
|
"plt.show()"
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"id": "6",
|
|
"metadata": {},
|
|
"source": [
|
|
"You can now crop the image and check on the way that everything looks fine. `plt_match` will plot the matches on the image and below which line content will be cropped; `plt_crop` will plot the cropped text part of the social media post with the comments removed; `plt_image` will plot the image part of the social media post if applicable."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "7",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# crop a posts from reference view, check the cropping \n",
|
|
"# this will only plot something if the reference is found on the image\n",
|
|
"crop_view = crpo.crop_posts_from_refs(\n",
|
|
" [ref_view], view, \n",
|
|
" plt_match=True, plt_crop=True, plt_image=True,\n",
|
|
")"
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"id": "8",
|
|
"metadata": {},
|
|
"source": [
|
|
"Batch crop images from the image folder given in `crop_dir`. The cropped images will save in `save_crop_dir` folder with the same file name as the original file. The reference images with the items to match are provided in `ref_dir`.\n",
|
|
"\n",
|
|
"Sometimes the cropping will be imperfect, due to improper matches on the image. It is sometimes easier to first categorize the social media posts and then set different references in the reference folder `ref_dir`."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "9",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"\n",
|
|
"crop_dir = \"data/\"\n",
|
|
"ref_dir = pkg / \"data\" / \"ref\" \n",
|
|
"save_crop_dir = \"data/crop/\"\n",
|
|
"\n",
|
|
"files = utils.find_files(path=crop_dir,limit=10,)\n",
|
|
"ref_files = utils.find_files(path=ref_dir.as_posix(), limit=100)\n",
|
|
"\n",
|
|
"crpo.crop_media_posts(files, ref_files, save_crop_dir, plt_match=True, plt_crop=False, plt_image=False)\n",
|
|
"print(\"Batch cropping images done\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "10",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": []
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "Python 3 (ipykernel)",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.9.16"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 5
|
|
}
|