Access options
We offer research data at several levels of anonymization. Each level carries its own sensitivity and protection requirements and is therefore provided through a different access path. In general, the lower the level of anonymization, the greater the effort involved in requesting and using the data. The individual variants are described below, in descending order of anonymization:
not yet available
Public Use File (PUF)
Short description
Dataset is reduced and anonymized to such an extent that there are no restrictions on use, and the data may be shared outside of scientific research.
Request and usage
Public Use Files can be downloaded from our website without a request or justification.
not yet available
Campus Use File (CUF)
Short description
Dataset is less reduced and anonymized than a PUF. It can be used for academic courses, for example.
Request and usage
To download a Campus Use File, you must register on our website with an academic institution's email address. Subsequently, a form must be filled out with information on the exact use. After verification of the information, the corresponding download link will be sent by email.
Download
Short description
Dataset is less reduced and anonymized than a CUF. It may only be used for scientific research purposes.
Request and usage
To download a download dataset, you must register on our website with the email address of a scientific institution. A form must then be filled out with information about the research activity and the use of the dataset. After verification of the information, the corresponding download link will be sent by email.
Remote
Short description
Dataset is less reduced and anonymized than a download dataset. It may only be used for scientific research purposes.
Request and usage
To use a remote dataset, registration on our website with the email address of a scientific institution is required. A form must then be filled out with information about the research activity and the intended use of the dataset. After verification of the information, a Secure Virtual Desktop (SVD) is set up and an email with access data is sent. Please note that SVD access can be granted for a total period of up to 3 years. Within this period, users must book their usage periods, drawing on a budget of up to 60 days per year. DeZIM employees also have the option of simplified access via a Secure Local Repository (SLR).
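As a planning aid, the 60-day yearly budget can be checked with a short script before booking. The following is only a sketch under stated assumptions: it counts days per calendar year and treats both the start and end date of a booking as usage days. How the budget is actually counted is not specified here, so please confirm the details during registration. The function names and the example bookings are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta

YEARLY_BUDGET_DAYS = 60  # from the text above: up to 60 days per year


def booked_days_per_year(bookings):
    """Count booked days per calendar year.

    `bookings` is a list of (start, end) date pairs, both inclusive.
    Counting by calendar year is an assumption of this sketch, and
    overlapping bookings would be counted twice.
    """
    days = Counter()
    for start, end in bookings:
        if end < start:
            raise ValueError(f"booking ends before it starts: {start} > {end}")
        current = start
        while current <= end:
            days[current.year] += 1
            current += timedelta(days=1)
    return dict(days)


def over_budget(bookings, budget=YEARLY_BUDGET_DAYS):
    """Return {year: days over budget} for every year that exceeds the budget."""
    return {
        year: used - budget
        for year, used in booked_days_per_year(bookings).items()
        if used > budget
    }


# Hypothetical bookings, for illustration only.
bookings = [
    (date(2025, 3, 1), date(2025, 3, 30)),   # 30 days
    (date(2025, 9, 1), date(2025, 10, 5)),   # 35 days
    (date(2026, 1, 10), date(2026, 1, 19)),  # 10 days
]

print(booked_days_per_year(bookings))  # {2025: 65, 2026: 10}
print(over_budget(bookings))           # {2025: 5}
```

In this example the 2025 bookings exceed the budget by 5 days, so one of the periods would need to be shortened or moved.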
Attention: Access to the virtual desktop requires the installation of a VPN program on your computer and a smartphone app for two-factor authentication (similar to a TAN procedure in online banking). More detailed information is provided in the registration process.
We strongly recommend that you only use these datasets if the information in the download dataset is not sufficient for your research!
Download SVD user manual (please also read the VPN configuration instructions)
Details: packages and SVD work environment
Within the SVD working environment, a Linux system with the statistical software R / RStudio and Stata is provided. Additionally, a browser (Firefox), a text editor (gedit), and a file manager (Thunar) are installed.
Please note that you do NOT have Internet access within the virtual environment. This protects both the input data and your result data. As a consequence, you cannot install software yourself. If you need additional software or R / Stata packages, please contact us by email.
R Packages: acepack, aod, arm, betareg, biglm, boot, bootstrap, brglm, broom.mixed, car, cardx, caret, class, cluster, dispmod, dplyr, dr, e1071, ebal, effects, ergm, fastICA, fixest, flextable, foreign, gam, gcookbook, gee, geepack, ggalluvial, ggplot2, ggrepel, ggridges, ggvis, glmnet, gmodels, gnm, gss, gsynth, gtsummary, here, Hmisc, hrbrthemes, igraph, influence.ME, interflex, janitor, knitr, labelled, latentnet, leaps, lme4, lmeSplines, lmm, lmtest, locfit, lsmeans, lubridate, mapproj, maps, MCMCglmm, mediation, memisc, mgcv, mi, mice, mitools, mix, mlogit, MNP, modelsummary, multcomp, multgee, multiplex, network, nlme, nlstools, nnet, norm, np, openxlsx, optmatch, pacman, PAFit, pan, plm, plotly, pscl, PSAgraphics, psych, quantreg, qvcalc, randomForest, remotes, rgl, rmarkdown, rms, RSiena, rstantools, sandwich, simpleboot, sm, sna, spatial, stargazer, statnet.common, stringr, survey, survival, tidymodels, tidyr, tidyverse, vcd, VGAM, VIM, viridis, viridisLite, visreg, writexl, xtable
Stata-Packages: anogi, asdoc, avar, barplot, barplot2, bayesmlogit, bcoeff, bcoeffs, bcuse, binscatter, catplot, cdfplot, cem, center, cluster, clustergram, coefplot, collapse2, combineplot, crtest, decomp, decompose, delta, devcon, devnplot, dfl, diff, distinct, egenmore, estout, expand_n, fairlie, filelist, fitstat, fre, ftools, gllamm, gologit, goprobit, gpfoble, grep, grfreq, grinter, grlogit, grnote, grstyle, gsa, gsample, gsum, gtools, hammock, hausman, hbar, hbox, hist3, histbox, historaj, histplot, hte, ice, igraph, ivreg2, ivreg210, ivreg28, ivreg29, jmpierce, jmpierce2, kdens, kdens2, kdmany, keeporder, keepvar, kernel, khb, kmatch, kountry, ldecomp, linkplot, lmcol, lmtest, logout, logtest, margeff, margfx, margin, marginscontplot2, mdesc, mgof, mice, missing, mmsel, mrtab, mvdcmp, network, networkDynamic, nlcheck, oaxaca, oaxaca9, outreg, outreg2, overid, palettes, psmatch2, ranktest, raschtest, rd, reghdfe, renames, reorder, ritest, robreg, rvlrplot, rvpplot2, sadi, smithwelch, sna, spost13_ado, spmap, sq, sum2, sum2docx, summout, summtab, sumstats, texdoc, tscollap, unique, violin, vioplot, webdoc, wgttest, winsor, winsor2, xttest2, xttest3
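Because nothing can be installed from inside the environment, it can help to compare the packages your analysis needs against the lists above before you start, and to request anything missing by email. The following is a minimal sketch: the installed sets are only an excerpt of the lists above, and the needed packages and the function name are hypothetical examples.

```python
# Excerpt of the preinstalled package lists above; extend as needed.
INSTALLED = {
    "R": {"dplyr", "fixest", "ggplot2", "lme4", "mice", "survey", "tidyverse"},
    "Stata": {"coefplot", "estout", "ivreg2", "oaxaca", "psmatch2", "reghdfe"},
}


def missing_packages(needed, installed=INSTALLED):
    """Return, per software, the sorted needed packages that are not preinstalled.

    Names are compared exactly, because R package names are case-sensitive.
    """
    missing = {}
    for software, packages in needed.items():
        available = installed.get(software, set())
        absent = sorted(p for p in packages if p not in available)
        if absent:
            missing[software] = absent
    return missing


# Hypothetical requirements of an analysis project.
needed = {
    "R": ["dplyr", "brms", "fixest"],
    "Stata": ["reghdfe", "rdrobust"],
}

for software, packages in missing_packages(needed).items():
    print(f"{software}: please request {', '.join(packages)}")
```

The printed lines can be pasted into the email request for additional packages.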
Onsite
Short description
The Onsite dataset is the least reduced and anonymized dataset. It may only be used for scientific research purposes.
Request and usage
To use an Onsite dataset, registration on our website with the email address of a scientific institution is required. A form must then be filled out with information about the research activity and how the dataset will be used. After a review of the information, we will contact the applicant and arrange one or more appointments to work on a dedicated workstation at the DeZIM premises. Please note that the Onsite workstation is not available to students and requires a lead time of up to 5 weeks before it can be made available. Because the workstation is shared with other requests, the waiting time may be longer if capacity is already exhausted.
We strongly recommend that you only use these datasets if the information in the download dataset is not sufficient for your research!
Details: packages and work environment
Within the working environment, a full Linux system with the statistical software R / RStudio and Stata is provided. Additionally, a browser (Firefox), a text editor (gedit), and a file manager (Nautilus) are installed.
Please note that you do NOT have Internet access within the working environment. This protects both the input data and your result data. As a consequence, you cannot install software yourself. If you need additional data, software, or R / Stata packages, please contact us by email.
R Packages: acepack, aod, arm, betareg, biglm, boot, bootstrap, brglm, broom.mixed, car, cardx, caret, class, cluster, dispmod, dplyr, dr, e1071, ebal, effects, ergm, fastICA, fixest, flextable, foreign, gam, gcookbook, gee, geepack, ggalluvial, ggplot2, ggrepel, ggridges, ggvis, glmnet, gmodels, gnm, gss, gsynth, gtsummary, here, Hmisc, hrbrthemes, igraph, influence.ME, interflex, janitor, knitr, labelled, latentnet, leaps, lme4, lmeSplines, lmm, lmtest, locfit, lsmeans, lubridate, mapproj, maps, MCMCglmm, mediation, memisc, mgcv, mi, mice, mitools, mix, mlogit, MNP, modelsummary, multcomp, multgee, multiplex, network, nlme, nlstools, nnet, norm, np, openxlsx, optmatch, pacman, PAFit, pan, plm, plotly, pscl, PSAgraphics, psych, quantreg, qvcalc, randomForest, remotes, rgl, rmarkdown, rms, RSiena, rstantools, sandwich, simpleboot, sm, sna, spatial, stargazer, statnet.common, stringr, survey, survival, tidymodels, tidyr, tidyverse, vcd, VGAM, VIM, viridis, viridisLite, visreg, writexl, xtable
Stata-Packages: anogi, asdoc, avar, barplot, barplot2, bayesmlogit, bcoeff, bcoeffs, bcuse, binscatter, catplot, cdfplot, cem, center, cluster, clustergram, coefplot, collapse2, combineplot, crtest, decomp, decompose, delta, devcon, devnplot, dfl, diff, distinct, egenmore, estout, expand_n, fairlie, filelist, fitstat, fre, ftools, gllamm, gologit, goprobit, gpfoble, grep, grfreq, grinter, grlogit, grnote, grstyle, gsa, gsample, gsum, gtools, hammock, hausman, hbar, hbox, hist3, histbox, historaj, histplot, hte, ice, igraph, ivreg2, ivreg210, ivreg28, ivreg29, jmpierce, jmpierce2, kdens, kdens2, kdmany, keeporder, keepvar, kernel, khb, kmatch, kountry, ldecomp, linkplot, lmcol, lmtest, logout, logtest, margeff, margfx, margin, marginscontplot2, mdesc, mgof, mice, missing, mmsel, mrtab, mvdcmp, network, networkDynamic, nlcheck, oaxaca, oaxaca9, outreg, outreg2, overid, palettes, psmatch2, ranktest, raschtest, rd, reghdfe, renames, reorder, ritest, robreg, rvlrplot, rvpplot2, sadi, smithwelch, sna, spost13_ado, spmap, sq, sum2, sum2docx, summout, summtab, sumstats, texdoc, tscollap, unique, violin, vioplot, webdoc, wgttest, winsor, winsor2, xttest2, xttest3