DataPatterns.org: tricks and best practice for data work
London, 5 August 2011
(by Daniel Dietrich)
DataPatterns.org is a collection of tips and tricks for data work. It is not a finished document but an evolving set of opinions and best practices. Its purpose is not to present all available options and technologies but to pick one and follow it through. DataPatterns is also a collaborative effort: if you have some good hacks and would like to share them, please contribute a patch to the DataPatterns repository. In a blog post, Rufus Pollock writes:
"How do you scrape a massive online archive? How do you fix a broken CSV file? How do you normalize entity names in a large collection of records?
There is a lot of practical skill in handling newly opened data, and the implicit promise of the open data movement is that we will help more people to access and re-use data. And while it would be desirable to be able to offer simple web-based tools for data wrangling, the truth is that what’s required is often a wild mix of web tools, desktop and command-line tools and programming skills. So what we need is the other half of the Open Data Manual."


