Misinformation campaign analysis
Extract data from social media images and texts in disinformation campaigns.
This project is currently under development!
The package takes pre-processed social media posts (image files) and processes them to collect information:
- Text extraction from the images
- Improving the preparation of the text for the data analysis (e.g., text cleaning)
- Performing person and face recognition in images, facial expression recognition, and extraction of any other available individual characteristics (e.g., gender, clothes)
- Extraction of other non-human objects in the image
- 5-Color analysis of the images
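As an illustration of the text cleaning step listed above, here is a minimal sketch using only the Python standard library. The function name `clean_text` is hypothetical and is not part of the misinformation package; the choice of cleaning steps (Unicode normalization, control character removal, whitespace collapsing) is an assumption about what is useful for text extracted from images.

```python
import re
import unicodedata


def clean_text(raw: str) -> str:
    """Hypothetical helper: normalize text extracted from an image."""
    # Fold compatibility characters (e.g. ligatures) into their plain forms.
    text = unicodedata.normalize("NFKC", raw)
    # Replace control and format characters (newlines, tabs, ...) with spaces.
    text = "".join(
        " " if unicodedata.category(ch).startswith("C") else ch for ch in text
    )
    # Collapse runs of whitespace into a single space and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


print(clean_text("  Breaking\n\tnews:  share   now  "))
```

Further steps such as lowercasing or stop word removal depend on the downstream analysis and are left out here.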
This development will support the fight against misinformation by providing more comprehensive data about its content and techniques. The ultimate goal of this project is to develop a computer-assisted toolset to investigate the content of disinformation campaigns worldwide.
Installation
The misinformation package can be installed using pip. Navigate into your package folder misinformation/ and execute:
pip install .
This will install the package and its dependencies locally.
Usage
There are sample notebooks in the misinformation/notebooks folder for you to explore the package usage:
- Facial analysis: Use the notebook facial_expressions.ipynb to identify whether there are faces in the image and whether they are wearing masks. For faces without masks, it also reports race, gender and dominant emotion.
- Object analysis: Use the notebook ojects_expression.ipynb to identify certain objects in the image. Currently, the following objects are being identified: person, bicycle, car, motorcycle, airplane, bus, train, truck, boat, traffic light, cell phone.
There are further notebooks that are currently exploratory in nature: colors_expression, to identify certain colors in the image, and get-text-from-image, to extract text that is contained in an image.
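To give a rough idea of what a color analysis can look like, the sketch below finds the most frequent colors in a list of RGB pixels by grouping them into coarse buckets. It is not the method used in the colors_expression notebook; the function name `dominant_colors`, the bucket size `step`, and the reading of "5-Color analysis" as "the five most common colors" are all assumptions made for illustration.

```python
from collections import Counter


def dominant_colors(pixels, n=5, step=64):
    """Hypothetical helper: return the n most frequent coarse color buckets.

    Each (r, g, b) pixel is mapped to the center of a bucket of width `step`
    per channel, so that similar shades are counted together.
    """
    counts = Counter(
        tuple((channel // step) * step + step // 2 for channel in pixel)
        for pixel in pixels
    )
    return [color for color, _ in counts.most_common(n)]


# Toy input: mostly red, some blue, a single green pixel.
pixels = [(250, 10, 10)] * 6 + [(10, 10, 250)] * 3 + [(12, 240, 8)]
print(dominant_colors(pixels, n=2))  # [(224, 32, 32), (32, 32, 224)]
```

In practice the pixel list would come from an image loaded with an imaging library, and a clustering method could replace the fixed buckets.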