diff --git a/ADMIN_HOWTO_decide_on_Tool_Needs.md b/ADMIN_HOWTO_decide_on_Tool_Needs.md
index 5882aa2..2b00741 100644
--- a/ADMIN_HOWTO_decide_on_Tool_Needs.md
+++ b/ADMIN_HOWTO_decide_on_Tool_Needs.md
@@ -1,9 +1,25 @@
-# ADMIN_Howto decide on Tool Needs
+# Choosing Disinformation Deployment Tools
+
+## Disinformation deployments are not all the same
+
+CogSecCollab supports deployments of different sizes and types. Deployments we've run include:
+* Single analyst working with open datasets, looking for patterns of behavior and extending knowledge about disinformation creators' assets
+* Small team (4-6 people) collaborating on a short-term data task
+* Mid-sized team (20-50 people) collecting data and doing initial analysis on disinformation incidents lasting days
+* Large team (50+ people) going from collection to response
+
+Some of the factors in tool and process choice will include:
+* Team size: big teams need different resources to keep them organised
+* Tempo: longer, more academic investigations will need different tooling from real-time responses
+* Environment: expected output forms/formats will affect tool choice, as will any tool constraints created by being embedded in a larger team or system
+* Localisation: is this a single language or multiple? If maps are being used, do the map tools include the areas of interest (e.g. non-US regions, disputed territories, etc.)?
+
+## Example: larger deployment
 
 We will have n people distributed around the world, adding data to our system. That data is:
-* Instances of disinformation. Those instances are mostly going to be single examples of text, images, video or audio, with associated metadata, but could be groups of examples. The meta will need to include the date & time that an instance appeared, where it appeared (twitter, facebook, etc), user, group, etc it appeared from/in, hashtags and other information added to the instance (e.g. image descriptions), and a URL to the original instance if possible.
-* New disinformation narratives, e.g. “black people can’t catch Covid19”.
+* Instances of disinformation. Those instances are mostly going to be single examples of text, images, video or audio, with associated metadata, but could be groups of examples. The metadata will need to include the date & time that an instance appeared, where it appeared (Twitter, Facebook, etc.), the user, group, etc. it appeared from/in, hashtags and other information added to the instance (e.g. image descriptions), and a URL to the original instance if possible.
+* New disinformation narratives, e.g. “black people can’t catch Covid19”.
 * New hashtags and other high-level information associated with disinformation incidents.
 
 We would like to take these instances into our system as raw data, and:
@@ -13,7 +29,7 @@ We would like to take these instances into our system as raw data, and:
 
 We will also have people in our communities who will add datasets to our system. That data is:
 * Results of searches for messages, images, video and audio associated with disinformation-related phrases, images, groups etc. Much of this data will be in a format specific to the social media channel it was scraped from, which is often json or csv formatted.
-* Datasets provided by external disinformation researchers, relevant to Covid19; e.g. datasets released by social media platforms and other providers.
+* Datasets provided by external disinformation researchers, relevant to Covid19, e.g. datasets released by social media platforms and other providers.
 We would like to take these datasets into our system as raw datasets, and:
 * Add them to our datastore
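
To make the metadata requirements above concrete, here is a minimal sketch of what one ingested instance record could look like, assuming a Python-based collection pipeline. The `DisinfoInstance` class and its field names are illustrative, not an agreed CogSecCollab schema; only the fields themselves (date & time, platform, source user/group, hashtags, added descriptions, original URL) come from the requirements above.

```python
# Illustrative sketch only: the class and field names are assumptions, not an
# agreed schema. The fields mirror the metadata listed above: when and where an
# instance appeared, who it appeared from/in, hashtags and added text, and a
# link back to the original where possible.
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

@dataclass
class DisinfoInstance:
    appeared_at: datetime          # date & time the instance appeared
    platform: str                  # e.g. "twitter", "facebook"
    source: str                    # user, group, page, etc. it appeared from/in
    content_type: str              # "text", "image", "video" or "audio"
    hashtags: List[str] = field(default_factory=list)
    added_text: str = ""           # e.g. image descriptions added to the instance
    url: Optional[str] = None      # URL to the original instance, if possible

# Example record with made-up values:
example = DisinfoInstance(
    appeared_at=datetime(2020, 4, 1, 13, 30),
    platform="twitter",
    source="@example_account",
    content_type="text",
    hashtags=["#covid19"],
    added_text="screenshot caption describing the original post",
    url="https://twitter.com/example_account/status/1",
)
```

Scraped search results and externally provided datasets (json or csv, in whatever shape the platform or researcher supplies) would still go into the datastore as raw datasets; a record like the one above is just one possible normalised view built on top of them.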