Fiery-snap is a componentized scraper/processor that uses Python microservices to transform data for analysis. Simply put, each service reads data from one or more inbound queues, performs some operation on the data, and then puts the result into an outbound queue for additional operations.
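The read/operate/forward pattern described above can be sketched as follows. This is a minimal illustration, not Fiery-snap's actual code: it assumes Python 3 and uses the standard library's in-process `queue.Queue` as a stand-in for whatever queue backend the project uses, and the `run_service` helper is a hypothetical name.

```python
import queue


def run_service(inbound, outbound, operation):
    """Drain each inbound queue, apply `operation`, and forward the result.

    `inbound` is a list of one or more queues; `outbound` is a single queue.
    Returns the number of items processed. (Hypothetical helper, not part of
    Fiery-snap.)
    """
    processed = 0
    for q in inbound:
        while True:
            try:
                item = q.get_nowait()
            except queue.Empty:
                break  # this inbound queue is drained, move to the next one
            outbound.put(operation(item))
            processed += 1
    return processed


# Usage: one service that lower-cases text before the next stage sees it.
tweets = queue.Queue()
tweets.put("Check HXXP://Example.COM now")
normalized = queue.Queue()
count = run_service([tweets], normalized, str.lower)
```

Chaining services then amounts to passing one service's outbound queue as another service's inbound queue.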
Current implementation details
– The current implementation reads Twitter user timelines and attempts to extract common artifacts (e.g. domains, hashes, URLs, URIs, and potential URLs). It has three main components (discussed below) that can be run as a series of Docker instances, as remote services, or all locally.
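To give a rough idea of what artifact extraction from tweet text involves, here is a minimal sketch using regular expressions. The patterns and the `extract_artifacts` function are illustrative assumptions, not the expressions Fiery-snap itself uses, and they are deliberately simple (for example, a file name such as `a.bin` would also match the domain pattern).

```python
import re

# Illustrative patterns only; real-world extraction needs more care.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)
DOMAIN_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}\b",
    re.IGNORECASE,
)
# Hex strings with the lengths of SHA-256 (64), SHA-1 (40), and MD5 (32).
HASH_RE = re.compile(
    r"\b(?:[a-f0-9]{64}|[a-f0-9]{40}|[a-f0-9]{32})\b",
    re.IGNORECASE,
)


def extract_artifacts(text):
    """Return URLs, unique lower-cased domains, and hash-like strings."""
    return {
        "urls": URL_RE.findall(text),
        "domains": sorted(set(d.lower() for d in DOMAIN_RE.findall(text))),
        "hashes": HASH_RE.findall(text),
    }


# Usage
result = extract_artifacts(
    "Payload at http://evil.example.com/drop "
    "md5 d41d8cd98f00b204e9800998ecf8427e"
)
```

A function like this would be the "operation" step of one service, with the extracted dictionary placed on the outbound queue.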
+ Python 2.7 and pip installed.
+ Twitter API keys (see: https://apps.twitter.com/)
git clone https://github.com/deeso/fiery-snap && cd fiery-snap
sudo python setup.py install (installs dependencies)
Edit sample-prod-config.toml (to add your Twitter API keys)
grep -n CREATE fiery-snap/scripts/sample-prod-config.toml
read-twitter-data-n-days.py -days 1 -host 10.18.123.19 -processed -content_artifacts -content -all
Please read README.md for details.