Frequently Asked Questions

Q1. How was this variable constructed?

A1. For any “this variable,” the answer can usually be found in Lynne G. Zucker, Michael R. Darby and Jason Fong, “Communitywide Database Designs for Tracking Innovation Impact: COMETS, STARS and Nanobank,” National Bureau of Economic Research Working Paper No. 17404, September 2011, available here (PDF). Other queries can be answered by clicking on the Reference Sheet link, if any, next to the variable name, type, and description in the Codebook.

Q2. Why aren’t the variables I need up at COMETS yet?

A2. There are two main reasons for this, one or both of which may be applicable: (a) Producing an integrated multi-terabyte database from numerous legacy databases is an incredibly labor-intensive activity which is progressing as fast and as far as funding will allow. COMETS 1.0 contains the first major files to emerge from this process including iterative beta-testing. (b) Many of the most interesting files are the most commercially valuable ones and their vendors protect their investment in building them with stringent limits on what can be made public. Over time, useful data constructed from such databases will appear that serve research purposes without undue burden on the vendors. In addition, if funding is available, emergent alternative sources of information will be utilized in future versions of COMETS. See Zucker, Darby, and Fong (2011) (PDF) for a further explanation of both constraints and planned future enhancements to COMETS.

Q3. I merged some variables into an Excel file using BEA Economic Areas as my unit of geography, and the values seem very low for about a third of the observations. I tried the same merge in STATA and got the same values. Is there something wrong with the COMETS data?

A3. We’ve run into the same problem with data merging and data use, particularly with regard to BEAs. BEA codes should always be 3 digits, with leading zeroes for BEAs 1-99 (for example, 001 and 045). This becomes problematic in STATA and Excel because, if BEAs are specified as numbers or numeric variables, both programs drop the leading zeroes. In STATA, BEAs must be IMPORTED as STRING variables. In Excel, you must IMPORT them as TEXT columns. If you import into STATA or Excel and then try to change the variable type, you will not regain those leading zeroes; they must be imported as string/text variables. If you do not specify these variables correctly, BEAs will not match up and you will lose data.
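The mismatch described in A3 can be reproduced in a few lines. The following is only a minimal sketch in Python, not part of COMETS or its documentation: the column names (bea_code, patents, population), the sample values, and the helper functions read_rows and merge_on_bea are all hypothetical and chosen for illustration. It shows a merge keyed on 3-character text codes matching every row, while the same merge fails for BEAs 1-99 once one side has been read as numbers.

```python
import csv
import io

# Hypothetical sample data; column names and values are illustrative only.
COMETS_CSV = "bea_code,patents\n001,12\n045,7\n120,30\n"
OTHER_CSV = "bea_code,population\n001,500\n045,800\n120,900\n"


def read_rows(text):
    """Read CSV text; the csv module keeps every field as a string."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_on_bea(left, right):
    """Inner-join two lists of row dicts on the 'bea_code' string key."""
    lookup = {row["bea_code"]: row for row in right}
    return [
        {**row, **lookup[row["bea_code"]]}
        for row in left
        if row["bea_code"] in lookup
    ]


comets = read_rows(COMETS_CSV)
other = read_rows(OTHER_CSV)

# Codes kept as text on both sides: all three areas match.
merged_ok = merge_on_bea(comets, other)

# Simulate a numeric import on one side: int("001") is 1, and "1" != "001".
damaged = [{**row, "bea_code": str(int(row["bea_code"]))} for row in comets]
merged_bad = merge_on_bea(damaged, other)

print(len(merged_ok), len(merged_bad))  # prints: 3 1
```

Only BEA 120 survives the second merge, which is the kind of silent data loss described above. If you work in pandas rather than the standard library, the analogous precaution would be to pass dtype={"bea_code": str} to read_csv so that the column is never parsed as a number.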