Finding Open Datasets


July 31, 2021

The best way to build a skill is to practice it, and data analytics is no different. That means you need to find datasets to analyse, and luckily there are plenty of open datasets available. There are always more ways to find datasets, but here are a few places to get inspiration.

I’m looking for a particular dataset

Google Dataset Search is a great place to start.

I’m looking for inspiration

I want data about my community

As well as publishing open data portals, governments run censuses and track macroeconomic, social, agricultural and environmental indicators, and so on (which may or may not be on those portals). Statistics agencies, such as the Australian Bureau of Statistics (ABS) and the UK’s Office for National Statistics, often have a lot of useful aggregate information (although with the ABS it takes some skill to find it).

OpenStreetMap has a lot of geographical data, and varying amounts of data about structures. In Australia, the G-NAF (Geocoded National Address File) contains address data that’s not in OpenStreetMap.

I want big data

I want something special

You can always collect or build your own dataset. If you’ve got an actual problem you’re trying to solve, this is often the best way. It gives you experience not only in analysing a dataset, but also in collecting and processing data, both of which are very useful skills to have.

The web contains a ton of public data that can be processed into datasets. For example, I built a job ad dataset from Common Crawl. You could further annotate data like this to create your own dataset.

Another good method is to collect your own data and, if it doesn’t contain any potentially damaging information, share it as an open dataset. For example, in Victoria there’s a way to get your own energy usage data and analyse it. Or you could analyse your email data, stick some sensors in your garden and record measurements that relate to your plants’ growth, or run a survey on a topic of interest to you.
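To make the email idea a bit more concrete, here’s a minimal sketch of one way you might start, assuming you’ve exported your mail to an mbox file (many mail clients and providers can do this). It uses only Python’s standard library; the function name `count_senders` and the file name in the usage comment are placeholders made up for illustration, not part of any particular tool.

```python
import mailbox
from collections import Counter
from email.utils import parseaddr


def count_senders(mbox_path):
    """Count how many messages each sender address has in an mbox file."""
    counts = Counter()
    # create=False means a wrong path raises an error instead of
    # silently creating an empty mailbox file.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            # parseaddr splits 'Name <addr>' into (name, addr);
            # str() guards against non-string header objects.
            _, address = parseaddr(str(message.get("From", "")))
            if address:
                # Lowercase so 'Alice@x.com' and 'alice@x.com' count together.
                counts[address.lower()] += 1
    finally:
        box.close()
    return counts


# Example usage (hypothetical file name):
#   for address, n in count_senders("my_mail.mbox").most_common(10):
#       print(n, address)
```

From a table like this you could go on to look at when mail arrives, how long threads run, and so on. Remember that email is exactly the kind of data that may contain damaging information, so think carefully before sharing anything derived from it.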