How do I do a statistical analysis of survey data?
I have analyzed a lot of surveys over the years, and can say that it all depends on your purpose. For example, do you have some specific research questions in mind? Or, is this purely exploratory, perhaps a summary report for management?
If you have specific research questions, you might be able to skip quickly to more advanced techniques such as regression analysis. If you are doing a first-level summary analysis, you might stop at the descriptive analysis stage.
Nonetheless, here are some basic steps to consider (it is a sort of ladder of complexity, so you might stop at any point, depending on your experience, your client and your resources of time or money):
- If you weren’t involved with designing the survey, research what its overall purpose was and how it was constructed. If it is a survey that gets repeated on an annual basis (or some other basis) there may be research papers published on the results, to guide your thinking. If it is a one-off survey that you are given to analyze, find out what you can from the people who sponsored and designed the survey.
- Read the instrument thoroughly and consider the target population. Think about how you might answer the questions yourself, if it is a general survey.
- Consult subject-matter experts, if you aren’t one yourself (and even if you are one, it is good to get other opinions). Keep in mind what the goals of your clients are - they pay your salary. That said, don’t be afraid to give them advice or set them straight if they are recommending things that don’t make sense to you.
- Go through the items with a statistical package, summarizing them with descriptive univariate measures, like means, standard deviations, and other distributional measures. Have a look at the actual distributions with histograms (you may have to bin numerical variables first), to see whether they are normally distributed (important for many statistical methods), are skewed, follow something like a power law, or take some other shape.
- For categorical data, look at the counts for each category and see whether they should be re-coded to make for a more manageable number of categories. Graphs of various kinds (box-plots, histograms, bar graphs, etc.) are very helpful - as they say “you can learn a lot just by looking”.
- If there are open-ended text-based questions, skim at least a sample of responses, to get a feel for things. You might consider text analysis (e.g. word clouds, sentiment analysis, content analysis) at some later stage.
- Consider whether some variables that “go together” can be collapsed into a single measure (a scale). For example, you might sum a number of variables that all seem to be related to social status into one measure. Advanced techniques like factor analysis would be helpful here, though common sense will often take you a long way.
- Try some bivariate statistics, such as correlations or X-Y (scatter) charts, on pairs of variables that seem likely to show interesting and relevant relationships.
- If you have a primary dependent variable of interest in the survey (e.g. income), you might use more advanced methods, such as regression analysis, to explore which of your other variables have a significant influence on that variable.
- You might also consider methods like cluster analysis, to see if there are some unknown grouping factors in your population.
- Don’t neglect the step of writing up the results. That can be anything from a formal research paper/corporate white paper to a list of comments/observations on a spreadsheet tab. It is a good idea to have some sort of “executive summary”, since your clients might be busy people. Also, keep in mind the sophistication of the clients. A very technical explanation might not be very useful to a non-quantitative client, while an oversimplified one won’t do for a technical audience.
- If you come up with a real knock-out graph that summarizes/supports your case, keep it front and center. That may be the item that actually gets around to an important audience (e.g. a Board of Directors, Deans’ Council, Assistant Deputy Minister or whatever).
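To make the descriptive steps above a bit more concrete, here is a minimal sketch in plain Python (standard library only). Everything in it is an assumption made for illustration: the toy responses, the field names (age, region, q1 to q3, income) and the helper names (describe, bin_counts, pearson) are all made up. In practice you would probably do this in whatever statistical package you already use. It covers univariate summaries, binning for a histogram, category counts, a simple summed scale, and one bivariate correlation.

```python
from collections import Counter
from statistics import mean, median, stdev

# Hypothetical toy responses, invented purely for illustration.
# q1..q3 are 1-5 ratings assumed to measure the same underlying idea.
responses = [
    {"age": 23, "region": "North", "q1": 4, "q2": 5, "q3": 4, "income": 31000},
    {"age": 35, "region": "South", "q1": 2, "q2": 3, "q3": 2, "income": 42000},
    {"age": 41, "region": "North", "q1": 5, "q2": 4, "q3": 5, "income": 58000},
    {"age": 29, "region": "East", "q1": 3, "q2": 3, "q3": 3, "income": 36000},
    {"age": 52, "region": "South", "q1": 1, "q2": 2, "q3": 2, "income": 61000},
    {"age": 38, "region": "North", "q1": 4, "q2": 4, "q3": 5, "income": 50000},
]


def describe(values):
    """Univariate summary: count, mean, median, sample standard deviation."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),
    }


def bin_counts(values, width):
    """Histogram counts keyed by the lower edge of each bin."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


ages = [r["age"] for r in responses]
incomes = [r["income"] for r in responses]

# Univariate: summary measures and a binned view of the distribution.
age_summary = describe(ages)
age_bins = bin_counts(ages, 10)

# Categorical: counts per category, to judge whether re-coding is needed.
region_counts = Counter(r["region"] for r in responses)

# Scale: sum items that "go together" into a single measure.
scale_scores = [r["q1"] + r["q2"] + r["q3"] for r in responses]

# Bivariate: strength of the linear relationship between two variables.
age_income_r = pearson(ages, incomes)

print("Age summary:", age_summary)
print("Age bins:", age_bins)
print("Region counts:", dict(region_counts))
print("Scale scores:", scale_scores)
print("Age vs. income correlation:", round(age_income_r, 3))
```

With only six made-up rows the numbers mean nothing in themselves. The point is the order of operations: look at each variable alone, then at categories, then at combined scales, and only then at relationships between variables.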
Anyway, good luck with your survey analysis.
How NOT to use survey results: