triple-s - Two can talk - survey description standards

December 2000

It's such a relief, when you are in a foreign land and cannot understand a word anyone is saying, to find someone who speaks the same language. Suddenly, this single common point of reference can mean the difference between chaos, confusion and uncertainty on the one hand, and order, understanding and the ability to regain control of your situation on the other.

It is with a similar sense of relief that survey researchers can now start to benefit from a different kind of civilized communication that is breaking out among the tools of the trade. Rival software packages for collecting, analysing and processing survey data are starting to incorporate open interfaces and agreed standards that allow work to flow between them.

Until recently, there were no agreed standards for how different computer packages would treat their data. Once you signed up for one system, you were effectively locked into that system. It was easy to believe you were being punished for choosing to move data from one manufacturer's package to another, such was the amount of work involved in reformatting data and re-specifying the texts and layout of that data.

In the modern world of high speed communications, e-mail and the Internet, the need to move data around quickly, inexpensively and reliably is critical. You may wish to work with data that someone else has collected; you may be working on a joint research project where the different partners use different systems; or you may wish to use solutions from more than one supplier within your own organization.

Peter Wills of Snap Surveys originally proposed a standard to facilitate the exchange of research and survey data at a time when no standards existed, and joined with other manufacturers to create an agreed cross-industry standard.

Characteristics

The problem with survey data is that every set of data is unique and every value within that data has a unique meaning too. All survey packages have a way of associating codes and textual descriptions with the questions and answers. The difficulty is that they all do it in their own way.

The first widely agreed standard for survey data is triples, the standard Steve Jenkins of Snap Surveys helped to create along with two other manufacturers: Keith Hughes of Merlinco and Geoff Wright, originally of Pulse Train and latterly at Computable Functions Ltd. Triples deals with both the data and the descriptions - the metadata, as it is called. This largely solves the problem of moving data from one system to another, as any system that can understand triples can load in the data along with all its descriptions, saving hours of tedious and error-prone typing. There are now around twenty different packages that support triples. See www.triple-s.org for more information.

Dr Steve Jenkins

Several other open standard initiatives are taking place today. As a manufacturer committed to providing its users with open software, Snap Surveys is actively involved in many of these initiatives.

Some would argue that the earliest and most enduring standard is the old 80-column punch card - a format that many survey packages still support in electronic form. The punch card was invented at the end of the 19th century by Herman Hollerith, whose tabulating machine business later became part of IBM, to process US Census data, and the format has changed little since the 80-column layout was established. However, as a standard, card-format data does not solve the metadata problem.
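To illustrate the gap, here is a minimal Python sketch. The record, the column positions, the variable names and the question texts are all invented for illustration and do not come from any real survey or from the triples specification. The point is only that a card-format record is a string of codes until a separate description says what each column means.

```python
# Hypothetical card-format record: nothing in the data says what the columns mean.
record = "001223"

# Hypothetical metadata: name -> (start column, end column, question text, code labels).
# Columns are 1-based and inclusive, as on a punch card.
metadata = {
    "serial": (1, 4, "Respondent serial number", None),
    "gender": (5, 5, "Gender of respondent", {"1": "Male", "2": "Female"}),
    "region": (6, 6, "Region", {"1": "North", "2": "South", "3": "East", "4": "West"}),
}


def decode(record, metadata):
    """Turn a fixed-column record into labelled answers using the metadata."""
    answers = {}
    for name, (start, end, text, labels) in metadata.items():
        raw = record[start - 1:end]  # convert 1-based inclusive columns to a slice
        # Use the code label when one is defined, otherwise keep the raw value.
        answers[name] = (text, labels.get(raw, raw) if labels else raw)
    return answers


decoded = decode(record, metadata)
for name, (text, answer) in decoded.items():
    print(f"{text}: {answer}")
```

Without the `metadata` dictionary, the same six characters could be loaded by another package but not interpreted, which is the work a metadata standard is meant to remove.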

Almost all the recent initiatives are based on XML, which is not a survey standard in itself, but is a computer-readable language that encourages the development of standard interfaces and definitions. XML stands for "extensible markup language". It is a close relative of HTML, the stuff of most web pages: both are markup languages descended from SGML, but XML lets each application define its own tags.

The OpenSurvey group (www.opensurvey.org) is proposing two new standards for surveys: askML, which will allow questionnaires to be exchanged between different data collection systems, including all their routing logic, validation rules and much more; and tabML, a standard to make it easy to transfer volumes of completed tables between systems.

Triples too has adapted to XML, and manufacturers are starting to incorporate the XML version of triples.
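As a rough idea of what an XML-based survey description makes possible, the sketch below reads a small description with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`survey`, `variable`, `label`, `value`, `start`, `finish`, `code`) are assumptions made for this example only; they are not taken from the published triples schema, which should be consulted at www.triple-s.org.

```python
import xml.etree.ElementTree as ET

# Illustrative description only: these element and attribute names are
# assumptions for this sketch, not the published triples schema.
DESCRIPTION = """
<survey>
  <variable name="Q1" start="5" finish="5">
    <label>Gender of respondent</label>
    <value code="1">Male</value>
    <value code="2">Female</value>
  </variable>
</survey>
"""


def load_variables(xml_text):
    """Read question texts, column positions and code labels from the XML."""
    root = ET.fromstring(xml_text.strip())
    variables = {}
    for var in root.findall("variable"):
        variables[var.get("name")] = {
            "label": var.findtext("label"),
            "columns": (int(var.get("start")), int(var.get("finish"))),
            "values": {v.get("code"): v.text for v in var.findall("value")},
        }
    return variables


variables = load_variables(DESCRIPTION)
print(variables["Q1"]["label"], variables["Q1"]["values"])
```

Because the tags name what each item is, any package with an XML parser can pick out the question wording and answer labels without knowing how the originating package stores them internally.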

The European Union is funding a number of initiatives to aid standardization in metadata for statistical data, such as the Metanet and Tedesco projects, which involve a number of national statistical offices working in partnership. A US Government-backed project organized by the University of Michigan is DDI, the Data Documentation Initiative, which is concerned with metadata for archiving survey data.

Case Study

OPERA Group of companies (www.operagrp.com) is a full-service market research agency based in Norwich, in the East of England. As the agency grew in size, it expanded into telephone interviewing and found itself looking for a way to integrate different software systems. When the company's CATI unit expanded to 30 stations, it was necessary to introduce a new software system. For Val Keel, MAP's Research Services Manager, it was preferable for this solution to be one that would integrate with the existing Snap software used at OPERA Group of companies to enter and analyze its face-to-face work.

Apart from offering the specific functionality that Val and her colleagues are seeking from different software packages, one requirement is that they support the triples standard for data transfer. This makes it easy to move completed interviews into Snap for analysis. It also means that the researchers can use a single tool to carry out all analysis, regardless of the form of data collection used for the study.

Val Keel explained, "After the interviews, we generate codeframes and code in our interviewing system. We then extract and save as triples and then move it into Snap to carry out all the analysis. It is a standard and we convert from one format into another format - there isn't very much to say about it". The triples link transfers not only the data, but also all of the definitions such as the wording of the questions and the labels for each answer code.

"Without triples we would probably have had to choose a single package that did everything. We would have been thrown in the deep end. It has enabled me to retain the analysis package I am comfortable with, which our clients know and which gives us the analysis we want."

One drawback is that routing instructions, which are essential for filtering and establishing the right totals in the analysis, are lost in the transition. Routing is not supported in triples at present, though the triples group aims to introduce it in the future.
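The effect of losing routing on totals can be shown with a small Python sketch. The questions, answers and the `percent` helper are hypothetical and are not part of triples or of any package mentioned here; the example only shows why the base of a percentage depends on knowing who was asked the question.

```python
# Hypothetical responses: Q2 is only asked when Q1 == "yes"; None means "not asked".
responses = [
    {"Q1": "yes", "Q2": "satisfied"},
    {"Q1": "yes", "Q2": "dissatisfied"},
    {"Q1": "no", "Q2": None},
    {"Q1": "no", "Q2": None},
]


def percent(responses, question, answer, routing=None):
    """Percentage giving `answer`, based on respondents who pass `routing`."""
    base = [r for r in responses if routing is None or routing(r)]
    hits = [r for r in base if r[question] == answer]
    return 100.0 * len(hits) / len(base) if base else 0.0


# Without the routing, everyone is in the base, including people never asked Q2.
unfiltered = percent(responses, "Q2", "satisfied")

# With the routing re-specified, the base is only those who were asked Q2.
filtered = percent(responses, "Q2", "satisfied", routing=lambda r: r["Q1"] == "yes")

print(unfiltered, filtered)  # 25.0 50.0
```

When the routing does not travel with the data, the analyst has to re-enter a filter like the one passed as `routing` by hand in the analysis package.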

When examining the different systems, Val Keel was surprised at the complexity of some solutions in the area of defining data - the area that triples handles automatically.

Summary

Triples is so far the most successful of the emerging cross-industry standards for surveys and questionnaires, in that it has been adopted by a number of different manufacturers and it is in daily use by an increasing number of research departments and agencies throughout the world. It is just one of several government-led or vendor-led initiatives to break down the barriers between different systems and working methods in the move towards open systems and open standards. At present, many of the other standards are either on the drawing board, or are in use in a fairly limited context.

The Internet has fuelled both interest and growth in connectivity between different software packages, and the arrival of XML on the scene, while not a standard in itself, is making it easier to define and agree standards. We are about to go through a phase where standards will proliferate, many of them geared towards quite specific purposes, such as archiving or exchanging finished tabular reports. Over time, these standards are bound to converge.

Snap Surveys has already played a key role in the development and promotion of industry standards, and as a part of its commitment to giving its users open software, will continue to play an active part in all appropriate open survey initiatives.