Data Files

This page provides the public-use microdata file for the Kauffman Firm Survey (KFS), which contains data from the baseline (2004) through the seventh follow-up (2011) surveys. The dataset can be downloaded in SAS, SPSS, or Stata format.

Public-use KFS Survey Data

*Note: SAS users need to remove the dashes at the beginning of the filename before opening the file. Apologies for the inconvenience.

New KFS Public Data File Formats

To help researchers use the KFS public data more efficiently, we produced public data files (in both wide and long format) based on the KFS logically imputed data. These files are available in Stata format only.

To learn more about the KFS logically imputed data, please read Section 2.4, "Logical Imputation (Data Editing)," on page 45 of Applied Survey Data Analysis Using Stata: The Kauffman Firm Survey Data, available at SSRN.

For more information, please read the "READ ME FIRST" file included in the ZIP file.

Logically imputed KFS Public Data (ZIP)

The major differences between the raw public data and the new logically imputed public data file are:

  1. The new file codes the data into soft missing and hard missing data.
  2. To use loops, reshape data efficiently and ensure consistency of variable names (type) across all years, we renamed some of the variables.
  3. For Survival Analysis, we constructed the Duration and Event variables.
  4. Most variables have labels in the new files.
  5. We created continuous variables from the range variables, using the midpoints of their class intervals. This will allow you to use the Stata code in Applied Survey Data Analysis Using Stata: The Kauffman Firm Survey Data, available at SSRN.
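As a rough illustration of items 2 and 5, the Python sketch below reshapes one wide record into long format and then replaces a range code with the midpoint of its class interval. Everything named here is an assumption made for illustration: the variable names (firm_id, revenue_range), the wave suffixes, and the dollar intervals are hypothetical and are not taken from the KFS codebook or the "READ ME FIRST" file.

```python
# Hypothetical sketch: variable names, wave suffixes, and class intervals
# are illustrative assumptions, not the actual KFS codebook values.

# Assumed class intervals for a range variable: code -> (lower, upper) in dollars.
RANGE_BOUNDS = {
    1: (0, 500),
    2: (501, 1000),
    3: (1001, 3000),
}


def midpoint(code):
    """Return the midpoint of the class interval for a range code.

    Returns None when the code has no known interval (e.g. a missing value).
    """
    bounds = RANGE_BOUNDS.get(code)
    if bounds is None:
        return None
    lower, upper = bounds
    return (lower + upper) / 2


def wide_to_long(row, stubs, waves, id_key="firm_id"):
    """Turn one wide record (e.g. revenue_range_0, revenue_range_1)
    into a list with one record per survey wave.

    This relies on variable names following one consistent pattern,
    "<stub>_<wave>", across all waves.
    """
    long_rows = []
    for wave in waves:
        record = {id_key: row[id_key], "wave": wave}
        for stub in stubs:
            record[stub] = row.get(f"{stub}_{wave}")
        long_rows.append(record)
    return long_rows


# One made-up firm observed in two waves.
wide = {"firm_id": 101, "revenue_range_0": 1, "revenue_range_1": 3}

long_rows = wide_to_long(wide, stubs=["revenue_range"], waves=[0, 1])
for r in long_rows:
    # Continuous approximation of the range variable.
    r["revenue_mid"] = midpoint(r["revenue_range"])
```

The reshape step only works because every wave uses the same name pattern, which is the kind of consistency the renaming in item 2 is meant to provide. A midpoint is an approximation of the underlying value, so results based on it should be read with that in mind.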

Two other more detailed forms of the KFS data are also available to qualified researchers but require additional procedures to obtain access:

NORC Data Enclave

Researchers wishing to access a more detailed data file and to engage with a community of researchers in analysis of the KFS should consider applying for access to the University of Chicago NORC Data Enclave. The NORC Data Enclave provides secure remote access to the KFS confidential microdata file, which contains more detailed industry codes, geographical codes (zip code, metropolitan statistical area, and state), and many additional continuous variables (in addition to categorical variables). The KFS confidential microdata may only be accessed through the NORC Data Enclave.

U.S. Census Bureau Research Data Centers

We are pleased to announce that the KFS is now available to qualified researchers on approved projects using the U.S. Census Bureau's Research Data Centers (RDCs). This will allow the KFS data to be matched, within a secure Census Bureau research facility, to other microdata about the businesses in the sample. Learn more about applying to a Census RDC.