Access

Access options

We offer research data at several levels of anonymization. Each level carries its own sensitivities and protection requirements and is therefore made available through a different access path. In general, the lower the level of anonymization, the greater the effort involved in requesting and using the data. The individual variants are described below, in descending order of anonymization:

not yet available

Public Use File (PUF)

Short description

The dataset is reduced and anonymized to such an extent that there are no restrictions on use, and the data may be shared outside of scientific research.

Request and usage

Public Use Files can be downloaded from our website without submitting a request or stating a reason.



not yet available

Campus Use File (CUF)

Short description

The dataset is less reduced and anonymized than a PUF. It can be used, for example, in academic courses.

Request and usage

To download a Campus Use File, you must register on our website with an academic institution's email address. You must then fill out a form describing the intended use. Once the information has been verified, the corresponding download link is sent by email.



Download

Short description

The dataset is less reduced and anonymized than a CUF. It may only be used for scientific research purposes.

Request and usage

To download a download dataset, you must register on our website with the email address of a scientific institution. You must then fill out a form with information about the research activity and the intended use of the dataset. Once the information has been verified, the corresponding download link is sent by email.



Remote

Short description

The dataset is less reduced and anonymized than a download dataset. It may only be used for scientific research purposes.

Request and usage

To use a remote dataset, you must register on our website with the email address of a scientific institution. You must then fill out a form with information about the research activity and the intended use of the dataset. Once the information has been verified, a Secure Virtual Desktop (SVD) is set up and an email with the access data is sent. Please note that SVD access can be granted for a total period of up to 3 years. Within this period, users must book usage periods from a budget of up to 60 days per year. DeZIM employees also have the option of simplified access via a Secure Local Repository (SLR).

Attention: Access to the virtual desktop requires the installation of a VPN program on your computer and a smartphone app for two-factor authentication (similar to a TAN procedure in online banking). More detailed information is provided in the registration process.

We strongly recommend that you only use these datasets if the information in the download dataset is not sufficient for your research!
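The booking budget mentioned above (up to 60 days per year) can be checked with simple date arithmetic before usage periods are requested. The following is a minimal sketch in Python. It assumes that both the first and the last day of a booking count as usage days and that all bookings fall within the same budget year; the text does not specify either point. The names `booked_days` and `remaining_budget` are hypothetical and not part of any DeZIM tooling.

```python
from datetime import date

ANNUAL_BUDGET_DAYS = 60  # from the text: a budget of up to 60 days per year


def booked_days(periods):
    """Total number of days in a list of (start, end) bookings.

    Assumption: both the start and the end date count as usage days.
    """
    return sum((end - start).days + 1 for start, end in periods)


def remaining_budget(periods, budget=ANNUAL_BUDGET_DAYS):
    """Days left in the yearly budget; negative if the bookings exceed it."""
    return budget - booked_days(periods)


if __name__ == "__main__":
    # Hypothetical bookings within one budget year.
    bookings = [
        (date(2025, 3, 3), date(2025, 3, 14)),  # 12 days
        (date(2025, 6, 2), date(2025, 6, 20)),  # 19 days
    ]
    print(booked_days(bookings))       # 31
    print(remaining_budget(bookings))  # 29
```

How the budget is actually counted is determined during the booking process; the sketch only illustrates the arithmetic.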


Download SVD user manual (Also read VPN config instructions)

Download SLR user manual

Details: packages and SVD work environment

The SVD working environment provides a Linux system with the statistical software R / RStudio and Stata. A browser (Firefox), a text editor (gedit), and a file manager (Thunar) are also installed.

Please note that you do NOT have Internet access within the virtual environment. This protects both the input data and your result data. As a consequence, you cannot install software yourself. If you need additional software or additional R or Stata packages, please contact us by email.

R Packages: acepack, aod, arm, betareg, biglm, boot, bootstrap, brglm, broom.mixed, car, cardx, caret, class, cluster, dispmod, dplyr, dr, e1071, ebal, effects, ergm, fastICA, fixest, flextable, foreign, gam, gcookbook, gee, geepack, ggalluvial, ggplot2, ggrepel, ggridges, ggvis, glmnet, gmodels, gnm, gss, gsynth, gtsummary, here, Hmisc, hrbrthemes, igraph, influence.ME, interflex, janitor, knitr, labelled, latentnet, leaps, lme4, lmeSplines, lmm, lmtest, locfit, lsmeans, lubridate, mapproj, maps, MCMCglmm, mediation, memisc, mgcv, mi, mice, mitools, mix, mlogit, MNP, modelsummary, multcomp, multgee, multiplex, network, nlme, nlstools, nnet, norm, np, openxlsx, optmatch, pacman, PAFit, pan, plm, plotly, pscl, PSAgraphics, psych, quantreg, qvcalc, randomForest, remotes, rgl, rmarkdown, rms, RSiena, rstantools, sandwich, simpleboot, sm, sna, spatial, stargazer, statnet.common, stringr, survey, survival, tidymodels, tidyr, tidyverse, vcd, VGAM, VIM, viridis, viridisLite, visreg, writexl, xtable

Stata-Packages: anogi, asdoc, avar, barplot, barplot2, bayesmlogit, bcoeff, bcoeffs, bcuse, binscatter, catplot, cdfplot, cem, center, cluster, clustergram, coefplot, collapse2, combineplot, crtest, decomp, decompose, delta, devcon, devnplot, dfl, diff, distinct, egenmore, estout, expand_n, fairlie, filelist, fitstat, fre, ftools, gllamm, gologit, goprobit, gpfoble, grep, grfreq, grinter, grlogit, grnote, grstyle, gsa, gsample, gsum, gtools, hammock, hausman, hbar, hbox, hist3, histbox, historaj, histplot, hte, ice, igraph, ivreg2, ivreg210, ivreg28, ivreg29, jmpierce, jmpierce2, kdens, kdens2, kdmany, keeporder, keepvar, kernel, khb, kmatch, kountry, ldecomp, linkplot, lmcol, lmtest, logout, logtest, margeff, margfx, margin, marginscontplot2, mdesc, mgof, mice, missing, mmsel, mrtab, mvdcmp, network, networkDynamic, nlcheck, oaxaca, oaxaca9, outreg, outreg2, overid, palettes, psmatch2, ranktest, raschtest, rd, reghdfe, renames, reorder, ritest, robreg, rvlrplot, rvpplot2, sadi, smithwelch, sna, spost13_ado, spmap, sq, sum2, sum2docx, summout, summtab, sumstats, texdoc, tscollap, unique, violin, vioplot, webdoc, wgttest, winsor, winsor2, xttest2, xttest3
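Because software cannot be installed inside the SVD, it can help to compare the packages an analysis needs against the lists above before requesting access. The following is a minimal sketch in Python that could be run on your own computer. The helper `missing_packages` is hypothetical, and `PREINSTALLED_R` is only a shortened excerpt of the R list above, used for illustration.

```python
# Shortened excerpt of the preinstalled R packages listed above (illustration only).
PREINSTALLED_R = {"dplyr", "fixest", "ggplot2", "lme4", "mice", "survey", "tidyverse"}


def missing_packages(needed, installed):
    """Return the needed packages that are not preinstalled, sorted by name.

    Package names are compared exactly, since R package names are
    case-sensitive (for example "Hmisc" or "VGAM").
    """
    return sorted(set(needed) - set(installed))


if __name__ == "__main__":
    # Hypothetical project requirements; "brms" is not in the list above.
    needed = ["dplyr", "lme4", "brms"]
    print(missing_packages(needed, PREINSTALLED_R))  # ['brms']
```

Any package reported as missing would have to be requested by email, as described above.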




Onsite

Short description

The Onsite dataset is the least reduced and anonymized dataset. It may only be used for scientific research purposes.

Request and usage

To use an Onsite dataset, you must register on our website with the email address of a scientific institution. You must then fill out a form with information about the research activity and the intended use of the dataset. Once the information has been reviewed, we contact the applicant and arrange one or more appointments to work on a dedicated workstation at the DeZIM premises. Please note that the Onsite Workstation is not available to students and requires a lead time of up to 5 weeks before it can be made available. Because the Onsite Workstation is shared with other requests, the waiting time may be longer if capacity is already exhausted.

We strongly recommend that you only use these datasets if the information in the download dataset is not sufficient for your research!


Download OSW user manual

Details: packages and Onsite work environment

The working environment provides a full Linux system with the statistical software R / RStudio and Stata. A browser (Firefox), a text editor (gedit), and a file manager (Nautilus) are also installed.

Please note that you do NOT have Internet access within the working environment. This protects both the input data and your result data. As a consequence, you cannot install software yourself. If you need additional data, additional software, or additional R or Stata packages, please contact us by email.

R Packages: acepack, aod, arm, betareg, biglm, boot, bootstrap, brglm, broom.mixed, car, cardx, caret, class, cluster, dispmod, dplyr, dr, e1071, ebal, effects, ergm, fastICA, fixest, flextable, foreign, gam, gcookbook, gee, geepack, ggalluvial, ggplot2, ggrepel, ggridges, ggvis, glmnet, gmodels, gnm, gss, gsynth, gtsummary, here, Hmisc, hrbrthemes, igraph, influence.ME, interflex, janitor, knitr, labelled, latentnet, leaps, lme4, lmeSplines, lmm, lmtest, locfit, lsmeans, lubridate, mapproj, maps, MCMCglmm, mediation, memisc, mgcv, mi, mice, mitools, mix, mlogit, MNP, modelsummary, multcomp, multgee, multiplex, network, nlme, nlstools, nnet, norm, np, openxlsx, optmatch, pacman, PAFit, pan, plm, plotly, pscl, PSAgraphics, psych, quantreg, qvcalc, randomForest, remotes, rgl, rmarkdown, rms, RSiena, rstantools, sandwich, simpleboot, sm, sna, spatial, stargazer, statnet.common, stringr, survey, survival, tidymodels, tidyr, tidyverse, vcd, VGAM, VIM, viridis, viridisLite, visreg, writexl, xtable

Stata-Packages: anogi, asdoc, avar, barplot, barplot2, bayesmlogit, bcoeff, bcoeffs, bcuse, binscatter, catplot, cdfplot, cem, center, cluster, clustergram, coefplot, collapse2, combineplot, crtest, decomp, decompose, delta, devcon, devnplot, dfl, diff, distinct, egenmore, estout, expand_n, fairlie, filelist, fitstat, fre, ftools, gllamm, gologit, goprobit, gpfoble, grep, grfreq, grinter, grlogit, grnote, grstyle, gsa, gsample, gsum, gtools, hammock, hausman, hbar, hbox, hist3, histbox, historaj, histplot, hte, ice, igraph, ivreg2, ivreg210, ivreg28, ivreg29, jmpierce, jmpierce2, kdens, kdens2, kdmany, keeporder, keepvar, kernel, khb, kmatch, kountry, ldecomp, linkplot, lmcol, lmtest, logout, logtest, margeff, margfx, margin, marginscontplot2, mdesc, mgof, mice, missing, mmsel, mrtab, mvdcmp, network, networkDynamic, nlcheck, oaxaca, oaxaca9, outreg, outreg2, overid, palettes, psmatch2, ranktest, raschtest, rd, reghdfe, renames, reorder, ritest, robreg, rvlrplot, rvpplot2, sadi, smithwelch, sna, spost13_ado, spmap, sq, sum2, sum2docx, summout, summtab, sumstats, texdoc, tscollap, unique, violin, vioplot, webdoc, wgttest, winsor, winsor2, xttest2, xttest3