Data preparation
CDSP is expert in preparing survey data for dissemination: anonymisation, matching, correction for nonresponse and weighting calculation.
Data anonymisation and protection
As an expert in disseminating data in compliance with current legal standards, CDSP has introduced measures to protect researchers and interviewees while maintaining data richness. It complies with data protection law, copyright law, the law on statistical confidentiality, and French regulations governing the dissemination of public archives. The issues involved in balancing scientific and legal considerations are discussed with the relevant lawyers and expert committees in order to establish the appropriate procedures. In addition, obligations regarding intellectual integrity, citation and confidentiality are guaranteed by legal documents that CDSP has introduced, such as the letter of commitment for data sharing.
Checking data consistency
CDSP’s engineers also have expertise in the most advanced methods of data cleaning. CDSP is thus responsible for processing the French data in the European Social Survey (ESS). In addition to collecting data, this also involves meticulous checking of data consistency as part of ESS’s European coordination role. These data are then deposited with Sikt - Norwegian Agency for Shared Services in Education and Research, for subsequent documentation and dissemination.
With regard to the data produced or deposited with CDSP by external researchers and institutions, several operations are undertaken before Data Documentation Initiative (DDI) metadata are added to the file. The quality of the data (consistency, missing values, etc.) and of the materials accompanying them (questionnaire, coding frames, reports, etc.) is checked, then the necessary recoding or correction is done. CDSP’s engineers maintain a dialogue with researchers and the depositor teams throughout this process.
These data are then documented and disseminated by CDSP.
In addition, CDSP’s engineers convert the files into several formats for dissemination and storage, including at least one open format, and name them in accordance with the guidelines applicable within the professional research data community.
Matching and calculating weightings
CDSP has also developed expertise in data matching and nonresponse correction. All the data collected in the ELIPSS survey process are enriched with additional variables sourced from the annual ELIPSS survey. This survey is specifically designed on the basis of consultations with the different stakeholders and the users of the data in order to provide key information for social science research. Because the survey is regularly repeated, nonresponse can be corrected for by a process of allocation, in which missing information is replaced by information collected in a previous iteration. This process helps to substantially mitigate the problem of missing data.
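The allocation step described above can be illustrated with a minimal sketch. It assumes each survey iteration is held as a mapping from a respondent identifier to that respondent's answers, with None marking a missing answer. The function name, variable names and sample records are hypothetical and chosen for illustration only; they do not describe CDSP's actual tooling or the ELIPSS data.

```python
from typing import Optional

# Hypothetical layout: respondent id -> {variable name -> answer or None}.
Wave = dict[str, dict[str, Optional[str]]]


def fill_from_previous_wave(current: Wave, previous: Wave) -> Wave:
    """Return a copy of `current` in which each missing answer is replaced by
    the answer the same respondent gave in `previous`, when one exists."""
    filled: Wave = {}
    for respondent_id, answers in current.items():
        # Respondents absent from the previous iteration have nothing to reuse.
        earlier = previous.get(respondent_id, {})
        filled[respondent_id] = {
            variable: value if value is not None else earlier.get(variable)
            for variable, value in answers.items()
        }
    return filled


# Invented example records, not real survey data.
previous_wave: Wave = {
    "r1": {"education": "tertiary", "region": "Bretagne"},
    "r2": {"education": "secondary", "region": None},
}
current_wave: Wave = {
    "r1": {"education": None, "region": "Bretagne"},
    "r2": {"education": "secondary", "region": None},
    "r3": {"education": None, "region": "Occitanie"},
}

completed = fill_from_previous_wave(current_wave, previous_wave)
```

In this sketch a value stays missing when the respondent did not answer in the earlier iteration either, or did not take part in it, which is why the process mitigates missing data rather than eliminating it.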
In addition, CDSP’s engineers have developed methods for calculating the different types of weighting used to correct for nonresponse in each ELIPSS panel survey. CDSP has long worked on this process with experts from Insee and INED. In parallel, the Centre’s engineers are always on the lookout for innovative solutions to help them tackle the challenges associated with ageing and attrition in the panel sample.
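The page does not say which weighting methods are used, so the following is only a sketch of one common textbook approach, post-stratification, offered as an assumption and not as CDSP's method. Each group's weight is its share of the target population divided by its share of the respondents. The function name, the age groups and the population shares are all hypothetical.

```python
from collections import Counter


def poststratification_weights(respondent_groups, population_shares):
    """Return one weight per group: population share / respondent share.

    Groups that are under-represented among respondents get a weight above 1,
    over-represented groups a weight below 1.
    """
    counts = Counter(respondent_groups)
    total = len(respondent_groups)
    return {
        group: population_shares[group] / (count / total)
        for group, count in counts.items()
    }


# Invented example: age group of each of ten respondents.
respondents = [
    "18-34", "35-64", "35-64", "65+", "35-64",
    "65+", "35-64", "18-34", "35-64", "65+",
]
# Illustrative population shares, not real figures.
shares = {"18-34": 0.3, "35-64": 0.5, "65+": 0.2}

weights = poststratification_weights(respondents, shares)
# Sum of the weights over all respondents; equals the sample size by construction.
weighted_total = sum(weights[group] for group in respondents)
```

A design choice worth noting: because the weights are ratios of shares, the weighted sample keeps its original size while its group composition matches the assumed population shares.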