Reflecting on Research: Data – when is enough enough
For context:
My research focuses on a case study: a single river, the tributary of a
much larger catchment, with a catchment of its own of more than 100 km². The
landscape is, in description, on the borders of the lowlands and the uplands: hilly
but not mountainous, diverse, predominantly agricultural, rural and wet.
It is wet both in weather and in leakage: the hills leak.
(Not a geography term. Probably.)
My project combines the social and physical sciences by
incorporating land manager and landowner perspectives and expertise with traditional GIS
(Geographic Information Systems) and computer modelling, to explore tree
planting for NFM (natural flood management) as a land use change.
Collecting 'Data':
I have between one year and 18 months for data collection.
This seems like a long time, but it really isn't.
My project breaks down into three key areas:
- Explore social perceptions, preferences and expertise to co-create understanding and knowledge of the catchment
- Model a known flood event and develop scenarios of alternative land use
- Analyse and evaluate the case study, including participant evaluation
Each one of these stages could involve an epic amount of
data collection.
So when is enough enough?
I spent all summer walking farms, gathering data and, most
importantly, listening. But when you are considering people’s opinions, their
perceptions and preferences, how many people do you talk to? When everyone is
of equal value, shouldn’t you talk to everyone?! If that’s not possible (which
it isn’t: not everyone would want to talk to me, I’m sure, let alone the time limits):
How do you decide who to target?
How do you decide when you have done enough?
Actually, the first of those two questions is the more easily
tackled. There’s a vast amount of literature on how to select
participants: random selection, snowball selection and so on. In reality it often
comes down to who you can contact, who answers the phone or who happens to be
in when you knock on the door. I used a number of different methods to try to
speak to a diverse, roughly representative range of people.
The second question is much, much harder. Strictly speaking, I
think I have enough qualitative data to draw conclusions from
what I’ve been told. Officially I have ‘finished’ this initial aspect of data
collection, but there are areas of the catchment I’d really like to know more
about, and people and landscapes of interest I think might be really important.
I’m not going to not speak to these people if I get the chance, but I also have
to move on to the next stage of the project; computer models do not run
themselves. Well, yes, they do, but I have to tell them what to do first.
Actually, first I have to work out what I want the model to do, then work out how to
tell it to do that. Urgh.
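One rule of thumb sometimes used for the "when have I done enough?" question is thematic saturation: you stop when new conversations stop raising new themes. The sketch below is only an illustration of that idea, not the method I actually used. The interview themes are invented, and the function names and the "two quiet interviews in a row" threshold are assumptions made for the example.

```python
def new_themes_per_interview(interviews):
    """Count how many previously unseen themes each interview adds."""
    seen = set()
    counts = []
    for themes in interviews:
        fresh = set(themes) - seen
        counts.append(len(fresh))
        seen |= fresh
    return counts


def saturation_point(counts, run=2):
    """Return the 1-based interview number that completes `run` consecutive
    interviews with no new themes, or None if that never happens."""
    quiet = 0
    for number, count in enumerate(counts, start=1):
        quiet = quiet + 1 if count == 0 else 0
        if quiet == run:
            return number
    return None


# Invented example data: the themes coded from five hypothetical interviews.
interviews = [
    {"flooding", "drainage"},
    {"drainage", "grants"},
    {"flooding", "tree cover"},
    {"grants"},
    {"drainage", "tree cover"},
]

counts = new_themes_per_interview(interviews)
print(counts)
print(saturation_point(counts))
```

A count like this can only say that the themes have stopped changing among the people already interviewed. It says nothing about the parts of the catchment nobody has spoken for yet, which is exactly the worry above.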
As I accept the situation and move on, I find that I am still
tackling the same question. How much data is enough? I am using a ‘physically based
spatially distributed’ model; more simply, it uses information about the
physical characteristics and where things are in relation to each other to work
out what’s happening. The alternative is to use a conceptual lumped model,
which I’m not going to explain, but it’s more ‘pure maths’ and doesn’t allow me
to explore land use change in quite the same detail. The ‘conceptual’ bit is a
bit of a red herring, as the model I’m using is obviously conceptual too; I am not,
for example, actually going to fill my computer with soil when I do the soil
input. I do, however, have to put in values for the soil (quality, quantity,
depth etc). Here lies the difficulty: I am using physical data to represent the
catchment and, to an extent, the finer the detail and the data, the more accurate
the model… sort of.
But there’s a line where the quality and quantity of the
data runs out, and by running a more complex model I’m more likely to create
errors and uncertainties which I wouldn’t have had with the simpler model. So I
have to ask not only ‘how much data is enough?’ but also ‘how much is too much?’!
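A toy illustration of that trade-off, with nothing to do with my actual catchment model: the numbers are invented (a steady trend plus "measurement noise" of plus or minus one), and the two "models" are a straight line and a polynomial forced through every data point. Each point is held out in turn and predicted from the rest, which is one common way of checking whether extra complexity is earning its keep. Treat it as a sketch under those assumptions.

```python
# Invented readings: a steady upward trend with alternating noise of +1 / -1.
xs = list(range(8))
ys = [x + (1 if x % 2 == 0 else -1) for x in xs]


def fit_line(px, py):
    """Simple model: least-squares straight line (two parameters)."""
    n = len(px)
    mean_x, mean_y = sum(px) / n, sum(py) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(px, py))
             / sum((x - mean_x) ** 2 for x in px))
    return lambda x: mean_y + slope * (x - mean_x)


def fit_interpolant(px, py):
    """Complex model: the polynomial that passes through every training
    point (Lagrange form), i.e. one parameter per observation."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(px, py)):
            term = yi
            for j, xj in enumerate(px):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def leave_one_out_error(fit, px, py):
    """Mean squared error when each point is held out and then predicted."""
    squared_errors = []
    for k in range(len(px)):
        model = fit(px[:k] + px[k + 1:], py[:k] + py[k + 1:])
        squared_errors.append((model(px[k]) - py[k]) ** 2)
    return sum(squared_errors) / len(squared_errors)


simple_error = leave_one_out_error(fit_line, xs, ys)
complex_error = leave_one_out_error(fit_interpolant, xs, ys)
print(f"straight line: {simple_error:.2f}")
print(f"through every point: {complex_error:.2f}")
```

The polynomial fits the data it has seen perfectly and predicts the held-out points far worse than the straight line does, because it has faithfully modelled the noise. That is the small-scale version of a detailed model outrunning the data that feeds it.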
I am still tackling this one – the endless problem with a
PhD is that no one else really knows what you’re doing, so no one can really
tell you what you do or don’t need until you need it.
For now, I would rather have the data and not need it than
need it and not have it.
So I’ve ordered a pair of waders and I’m either going to sit
in front of my supervisor and cry until he tells me what to do or (more likely)
I’m going to get some help and go and walk in the river with some kit and
collect some data that I may or may not need.
This is normal, right?