from the DataAnywhere team / OccupyData
Public data tends to be available only in scattered pieces, while the aggregated versions are held by companies that charge for access. Other data sets are free in aggregate form but cannot be queried live or extended with contributions from the public. This project aims to solve these problems, one data set at a time.
Using open source tools, the Data Anywhere solution is to set up a simple database, which replicates itself, and simple scrapers on various virtual machines. Virtual machines are cheap (from about $5/month on DigitalOcean), and many go unused or underutilized.
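To give a feel for what a "simple scraper" might look like, here is a minimal sketch using only the Python standard library. It is an illustration, not the project's actual code: the `TableScraper` class, the `scrape` function, the `source` and `fetched_at` fields, and the sample page are all hypothetical, and a real scraper would fetch the page over the network and write the records to the database.

```python
from datetime import datetime, timezone
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect every table row as a list of cell strings (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished rows
        self._row = None        # cells of the row being read, or None
        self._in_cell = False   # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only text inside a cell is kept; whitespace between tags is ignored.
        if self._in_cell:
            self._row[-1] += data.strip()


def scrape(html, source):
    """Turn the first row into field names and each later row into a record."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    fetched = datetime.now(timezone.utc).isoformat()
    # Timestamping each record supports chronological comparison later on.
    return [dict(zip(header, row), source=source, fetched_at=fetched) for row in body]


# Made-up sample page, standing in for a real public data source.
SAMPLE_PAGE = """
<table>
  <tr><th>station</th><th>reading</th></tr>
  <tr><td>A</td><td>12</td></tr>
  <tr><td>B</td><td>7</td></tr>
</table>
"""

records = scrape(SAMPLE_PAGE, source="sample-page")
```

Each record comes back as a plain dictionary, which is a convenient shape for a document database.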
The immediate goal is for the servers to aggregate any type of data and make it accessible to the public. The longer-term vision of this project will appeal to the data geek: we’d like to use the data to examine unexpected relationships, chronologically at first, though data sets could be compared along any index.
Although it is just taking off, the Data Anywhere project has the potential to help many organizations. Its data model is persistent: if one machine is shut down, the data set suffers no permanent loss, since it has replicated itself to several other machines. These servers can be used to aggregate any type of data and make it accessible to the public at large through a simple RESTful web interface.
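The idea of surviving a lost machine while still serving data over a web interface can be sketched in a few lines of Python. This is a toy model under loud assumptions: the `Replica` and `Cluster` classes, the `/records` path, and the `vm-1` style names are invented for illustration, the data lives in memory, and every write is simply copied to all live replicas. A real replicated database handles this very differently (including catching up machines that come back), and the sketch uses a bare WSGI function from the standard library rather than a web framework so it runs without installing anything.

```python
import json


class Replica:
    """One server's copy of the data set (in memory, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.records = []
        self.alive = True


class Cluster:
    """Hypothetical group of replicas that all receive every write."""

    def __init__(self, names):
        self.replicas = [Replica(n) for n in names]

    def write(self, record):
        # Copy the record to every replica that is currently up.
        for replica in self.replicas:
            if replica.alive:
                replica.records.append(dict(record))

    def read(self):
        # Serve from the first replica that is still up.
        for replica in self.replicas:
            if replica.alive:
                return list(replica.records)
        raise RuntimeError("no replica available")


def make_app(cluster):
    """Return a WSGI app exposing GET /records as JSON."""

    def app(environ, start_response):
        if environ.get("REQUEST_METHOD") == "GET" and environ.get("PATH_INFO") == "/records":
            status = "200 OK"
            body = json.dumps(cluster.read()).encode("utf-8")
        else:
            status = "404 Not Found"
            body = json.dumps({"error": "not found"}).encode("utf-8")
        start_response(status, [("Content-Type", "application/json"),
                                ("Content-Length", str(len(body)))])
        return [body]

    return app


def get(app, path):
    """Call the WSGI app directly, without opening a network socket."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    chunks = app({"REQUEST_METHOD": "GET", "PATH_INFO": path}, start_response)
    return captured["status"], json.loads(b"".join(chunks))


cluster = Cluster(["vm-1", "vm-2", "vm-3"])
cluster.write({"station": "A", "reading": 12})
cluster.replicas[0].alive = False            # one machine is shut down
app = make_app(cluster)
status, data = get(app, "/records")          # still answered by a surviving replica
```

To try the same app in a browser, it could be passed to `wsgiref.simple_server.make_server` from the standard library.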
We are actively looking for more individuals and community partners to grow the Data Anywhere community. Our very first workshop was at the March Occupy Data hackathon. We had two groups initiate projects, and we’re planning our next workshop for a summer Occupy Data hackathon. At these events, participants are given simple instructions on how to set up and secure a server, and how to set up databases that maintain and replicate themselves. Knowledge of Linux or Python is helpful but not necessary. Patience and a willingness to learn are MUCH more important.
About us: The Data Anywhere team is led by an EXTRA-ordinary, no less than amazing software developer and Linux admin, who teaches basic Linux system administration, MongoDB setup and usage, and Flask web APIs. The opportunity to work with her will alone be well worth it.
More Info: Hope to see you in June! That’s when we’re planning the next Occupy Data hackathon. For Data Anywhere announcements, subscribe to our discussion list, follow @occupydata on Twitter, or join us on Meetup.com. More information on Occupy Data can be found at OccupyResearch.net and occupydatanyc.org.