The Questionnaire Challenge

How should we model questionnaires in our health data?

This is something that @ianmcnicoll and I have grappled with for years. We have reached a conclusion in recent times, and our approach, perhaps rather controversially, is not to model them! Yes, you read me right – as a general principle, don’t archetype questionnaires.

Now of course there will be some situations where there are standardised and ubiquitous questionnaires and perhaps it will be reasonable to lock these down as fixed data elements and value sets in an archetype, and maybe even govern them within a CKM environment.

But… Consider the number of questionnaires out ‘in the wild’ at any point in time. Should each of these be archetyped?

Let’s think it through. If there are 5000 questionnaires in the world (and clearly there are way more questionnaires than that out in the health ecosystem) then we would need a corresponding whopping 5000 archetypes. And, as we all know, no two questionnaires will be alike even if they have a common parent document – it is always human nature to ‘tweak’ each one for local use because ‘our situation’ is unique. It’s just the way it is. The consequence is that data captured using this myriad of archetypes will not be interoperable, even where the archetypes are similar. We will have a huge number of archetypes with a huge variation in content and intent – not a lot of upside from my point of view.

Another alternative could be to define a generic archetype pattern for a questionnaire, and re-use that. In fact we tried this back in 2007 with our work in the NHS – you can see a reasonable example here. The resulting questionnaire pattern is pretty simple and relies on using the templating layer to document the questions, and either templating or use of a terminology subset to record the answers. The equivalent FHIR resource appears similar in principle and intended use. This kind of pattern provides a common framework for a questionnaire but really doesn’t give us a lot more interoperability for questionnaire data – the actual questionnaire content will vary enormously and the results can still be chaotic.

So, still somewhat clunky and awkward – not an elegant solution at all.

Then we got to thinking: Do we always need to actually record the questions and their answers in the EHR? This is a critical question. Sometimes the answer is yes, but most often I think we will find that we don’t need to record the actual question and (often) check box response. What we really want to record is the outcome, the real health information meaning.

Think about the practical aspects of this…

Clinicians have a systematic questioning process for history taking – we all have a similar pattern but ask subtly different questions – resulting in zillions of permutations and combinations and levels of granularity. Questions could range from: “Have you had any abdominal symptoms?” to “Have you had any nausea, vomiting, reflux, abdominal pain, diarrhoea, constipation etc” to whatever combination is relevant for a given clinician in a given clinical situation. Many will ask the same kind of question slightly differently. Every resulting questionnaire will be slightly different.

And what do they record? They don’t record each question and corresponding Yes/No answers in their paper health records. They record the positive responses or the relevant negatives, and/or maybe a quick note that there were no positive responses to systematic questioning about current symptoms, problems, past history, family history etc.

So we need to ask ourselves: Do we need to record the exact question, the potential alternative answers and the actual answer?

If the answer is yes, then it is a very good reason to lock in the questionnaire in an archetype, or at least a template.

If the answer is no, then what is the best way to record the relevant data – the relevant positives and the relevant negatives? For example, with abdominal pain – record the details about their diarrhoea and colicky pain in the right lower abdomen, PLUS that the patient has NEVER had an Appendicectomy.

So I’m suggesting that we need to record the positive presence of something identified in the questionnaire, for example a symptom, diagnosis or previous procedure, and the positive absence of related things. In this case, record the details about the diarrhoea and abdominal pain as the positive presence of symptoms using the Symptom archetype and positive exclusion of a previous Appendicectomy procedure in the Exclusion of Procedure archetype. We don’t need to record the actual question and corresponding Yes/No answer.

So our current approach in Ocean is to use the software UI as the means to display the checklist or questionnaire, but only record in the electronic health record any relevant answers – both the positive presence of symptoms, signs, diagnoses, procedures and tests etc, and also the positive exclusion of any of these things – all using standardised archetypes.

Let’s face it, it is not often that we go back to look at the raw questionnaire data again. So now we tend not to record the raw data (with some exceptions, where it may be required or useful) but use a transform so that a patient’s or clinician’s positive tick in a box for ‘Past History of Epilepsy?’ will be converted into a positive statement of ‘Epilepsy’ within an EHR, using the Diagnosis archetype. Any additional ‘other’ free text or ‘details’ or ‘date of diagnosis’ from the questionnaire can be captured using other relevant data fields for the Diagnosis archetype. The benefit from this approach is that this data can then be potentially re-used into the future as part of a comprehensive Problem List, not just buried as a ticked check box within a questionnaire from years ago, perhaps never to see the light of day ever again.
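To make the idea of a transform a little more concrete, here is a minimal Python sketch. It is only an illustration, not how any Ocean product actually does it: the item ids, the `ItemMapping` and `EhrEntry` classes, the `transform` function and the plain ‘Procedure’ target are all hypothetical names invented for this example, and a real system would write proper archetype-based entries rather than simple Python objects.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ItemMapping:
    """Hypothetical rule linking one questionnaire tick box to archetypes."""
    concept: str
    archetype_if_yes: Optional[str]  # where a positive answer is recorded
    archetype_if_no: Optional[str]   # where a relevant negative is recorded


@dataclass(frozen=True)
class EhrEntry:
    """Simplified stand-in for an archetype-based EHR entry (assumption)."""
    archetype: str
    concept: str
    details: Optional[str] = None


# Illustrative mappings only; the item ids are made up for this sketch.
MAPPINGS: Dict[str, ItemMapping] = {
    "past_history_epilepsy": ItemMapping("Epilepsy", "Diagnosis", None),
    "previous_appendicectomy": ItemMapping(
        "Appendicectomy", "Procedure", "Exclusion of Procedure"
    ),
}


def transform(
    responses: Dict[str, bool],
    details: Optional[Dict[str, str]] = None,
) -> List[EhrEntry]:
    """Convert tick-box answers into entries worth persisting.

    The questions and raw Yes/No answers are not stored; only the
    positive presence or positive exclusion they imply.
    """
    details = details or {}
    entries: List[EhrEntry] = []
    for item_id, ticked in responses.items():
        mapping = MAPPINGS.get(item_id)
        if mapping is None:
            continue  # unmapped question: nothing to persist
        archetype = mapping.archetype_if_yes if ticked else mapping.archetype_if_no
        if archetype is None:
            continue  # a negative that is not clinically relevant
        entries.append(EhrEntry(archetype, mapping.concept, details.get(item_id)))
    return entries
```

With this sketch, a ticked ‘Past History of Epilepsy?’ box plus an unticked ‘Previous Appendicectomy?’ box would produce one Diagnosis entry for Epilepsy and one Exclusion of Procedure entry for Appendicectomy, while an unticked epilepsy box alone would produce nothing at all.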

Consider the questionnaire as what it really is – just a clinical communication tool, a checklist. It is absolutely not the best means to record, persist and re-use good quality health data. What we really want to record in a consistent way are those critical pieces of health information in a formal archetype so that the data can be utilised for long term health records, decision support, exchange or analysis. Recording the check box answers from a questionnaire doesn’t really do that job!

3 thoughts on “The Questionnaire Challenge”

  1. This is in alignment with FHIR’s approach. Questionnaire is an important means of data capture and is sometimes useful in understanding context around data, particularly for things like clinical trials where skewing of results due to the nature or order of the questions may be relevant. However, for data retrieval and navigation purposes, the expectation is that data initially gathered by questionnaire will be propagated into the appropriate resources for use in general care, decision support, etc. The raw questionnaire data will be looked at rarely – if at all.

  2. Heather, I agree with your philosophy but in practice it’d be really difficult to extract ‘health information that matters’ from questionnaires. They are mostly designed very poorly and may include different kinds of information, as you indicate, such as symptom, procedure, reaction, patient preferences, history, results/other observations etc. What you are saying is take pertinent bits and then chuck them into corresponding type archetypes. But remember most of these items will not have any context, not even dates! Hence they may not really be credible clinical information worth saving using EHR-type archetypes but rather a loose collection of bits and pieces. I agree that the overall synthesis should be archetyped but I’d always want to see the raw questionnaire should I wish to look at it. Maybe the proposed generic ENTRY archetype would be a good mechanism.

    Another difficulty might be the ‘outcome / real health information’ might change from clinician to clinician depending on how you interpret a large combination of items. So the danger of storing the ‘synthesis’ of a questionnaire in (possibly an evaluation type) archetype will be to lose the details which might be interpreted differently by a different clinician. Observation type information should be OK though.


    • Hi Koray,

      Our experience is that the data is the priority and so work towards re-designing questionnaires to support capture of good quality data.
      If we are trying to capture data from the majority of existing questionnaires then good luck – questionnaires notoriously ask questions badly. They work variably as far as human interpretation but usually very badly wrt computer interpretation.
      We do have experience in taking previous paper questionnaires, analysing the data requirements sought in terms of what we want to persist and then designing the UI/questions to match the data desired. This enhances the process rather than just trying to create an electronic version of the original paper questionnaire.
