Cleaning Data Collected with Survey Software I

Posted on: January 24, 2018

Good researchers take steps to make sure they have clean data – data without obvious errors. You should be able to prevent some kinds of errors from ever being entered, but other times you may need to check data after it is entered.

Capable survey software should let you prevent impossible data. Usually this isn’t a problem with the answers to a multiple-choice question, since someone cannot pick a choice that isn’t shown. There are cases, though, where even with multiple-choice data you need your data collection program’s help. One such situation is when you want to present more choices than the number of answers you will accept. Most programs will let you say whether you want to let people pick only one answer to a question or as many choices as they wish among those presented. These options cover most cases, but not all. For example, you may present a list of 20 cities and ask people to pick the three they would most like to visit. In this case your program should let you specify that you want to accept up to three answers. Some programs don’t give you that flexibility. Even fewer programs will let you specify that you want exactly three picks from each person, rather than no more than three picks.
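The "up to three" and "exactly three" rules can be sketched in a few lines of Python. This is only an illustration of the logic, not the behavior of any particular survey package; the function name check_pick_count and its parameters are assumptions made for this example.

```python
def check_pick_count(picks, min_picks=0, max_picks=None):
    """Return True if the number of distinct picks is within the allowed range.

    Hypothetical helper: real survey tools usually expose this as a
    question setting rather than as code.
    """
    count = len(set(picks))  # count each choice once, even if submitted twice
    if count < min_picks:
        return False
    if max_picks is not None and count > max_picks:
        return False
    return True


# "Pick up to three cities": anything from zero to three picks is accepted.
print(check_pick_count(["Paris", "Tokyo"], max_picks=3))               # True
# "Pick exactly three cities": two picks are rejected.
print(check_pick_count(["Paris", "Tokyo"], min_picks=3, max_picks=3))  # False
```

Setting the minimum and maximum to the same value is what turns "no more than three" into "exactly three."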

You also need to be able to ensure that questions asking people to enter a number have only valid answers. For example, if you ask people how many days they worked in the previous week, your data collection tool should not allow people to enter more than seven. Your tool should also let you check a combination of answers. For example, if you ask people what percentage of their income they spend on food, clothing and housing, you should be able to make sure the answers do not total more than 100. In other cases you should be able to require that the answers total exactly 100, or any other predetermined value. Your tool should also be able to enforce a rule that a series of numbers adds up to a number the participant has already entered. For example, you may ask farmers how many acres they farm. You could then ask them how many acres they plant with corn, wheat, vegetables and other crops. You should be able to ensure that the numbers entered for those four categories add up to the total number of acres they say they farm.
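As a rough sketch, the three numeric checks described here (a valid range, a total that must not exceed a limit, and a total that must match a target) might look like this in Python. The function names are hypothetical, and the small tolerance used for the exact-total check is an assumption added so that decimal answers do not fail because of rounding.

```python
def in_range(value, low, high):
    """True if a single numeric answer lies between low and high, inclusive."""
    return low <= value <= high


def total_at_most(values, limit):
    """True if the answers add up to no more than the limit."""
    return sum(values) <= limit


def total_equals(values, target, tolerance=1e-9):
    """True if the answers add up to the target (within a tiny tolerance)."""
    return abs(sum(values) - target) <= tolerance


# Days worked last week cannot exceed seven.
print(in_range(8, 0, 7))                         # False
# Percentages of income for food, clothing and housing.
print(total_at_most([30, 10, 40], 100))          # True
# Acres of corn, wheat, vegetables and other crops vs. 500 acres farmed.
print(total_equals([200, 150, 100, 50], 500))    # True
```

In the farm example the target is not fixed in advance: it is whatever the participant entered for total acres, passed in as the second argument.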

When a check like this fails, you should be able to show participants a generic message such as “your answers must total 100,” with the required number filled in. You should also be able to show a custom message for a particular question, such as “you said you farmed X acres, please make sure your answers for each of the kinds of crops add up to that total.”
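One way to picture the difference between a generic and a custom message is a default template that a particular question can override. The sketch below is an assumption about how such a feature could work, not a description of any specific product; the names GENERIC_MESSAGE and total_error_message are invented for the example.

```python
GENERIC_MESSAGE = "Your answers must total {target}."


def total_error_message(values, target, custom_template=None):
    """Return None if the answers add up to target, otherwise an error message.

    custom_template, if given, replaces the generic wording; it may use
    the placeholder {target} for the required total.
    """
    if sum(values) == target:
        return None
    template = custom_template or GENERIC_MESSAGE
    return template.format(target=target)


# Generic message for a percentage question.
print(total_error_message([50, 40], 100))
# Custom message for the farm question; 500 is the acreage the participant gave.
acres_message = (
    "You said you farmed {target} acres. Please make sure your answers "
    "for each of the kinds of crops add up to that total."
)
print(total_error_message([200, 150, 100], 500, acres_message))
```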

A more sophisticated example: if you ask people to rank a series of options, your program should be able to make sure that people cannot enter the same rank for two or more options in the series. Taking that example a step further, your tool should be able to make sure that all the ranks are used. And then, of course, you should be able to have people rank their top three choices out of a larger list, ensuring both that the same rank isn’t picked twice and that 1, 2 and 3 are all used. Only more sophisticated survey software allows the last kind of check.
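The ranking checks can be expressed compactly. The following Python sketch assumes answers arrive as a mapping from option to rank, with None for options left unranked; that data shape, the function name check_ranks, and the message wording are all assumptions for illustration.

```python
def check_ranks(ranks, top_n=None):
    """Validate a ranking question and return an error message or None.

    ranks: dict mapping each option to its rank (int), or None if unranked.
    top_n: how many ranks must be used; defaults to the number of options.
    """
    given = [r for r in ranks.values() if r is not None]
    expected = top_n if top_n is not None else len(ranks)
    # Rule 1: the same rank cannot be given to two options.
    if len(given) != len(set(given)):
        return "Each rank can be used only once."
    # Rule 2: every rank from 1 to expected must be used, and nothing else.
    if set(given) != set(range(1, expected + 1)):
        return f"Please use each rank from 1 to {expected} exactly once."
    return None


# Top three out of five cities: valid, because 1, 2 and 3 each appear once.
answers = {"Paris": 2, "Tokyo": None, "Rome": 1, "Cairo": None, "Lima": 3}
print(check_ranks(answers, top_n=3))  # None
```

Leaving top_n unset covers the full-ranking case, where every option must receive a distinct rank.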