5/26/2010 3:00:00 PM By
E.J. Reedy
Martin Kenney and Donald Patton of the University of California, Davis are making available to other researchers the "Firm Database of Initial Public Offerings (IPOs): from June 1996 through 2006, Version A." Interested scholars who would like to be emailed the data should contact Don. We featured this database at the 2007 Kauffman Symposium on Entrepreneurship and Innovation Data, but at that time the data wasn't complete. Now Version A is done, and Don and Martin have plans for several updates to the database in the coming year. So, without further ado, here is the database summary included with the data file:
This database is comprised of all de novo initial public offerings (IPOs) on American stock exchanges and filed with the Securities and Exchange Commission (SEC) from June 1996 through December 2006. In assembling the set of firms to be included we initially relied on Thomson Financial Venture Expert to generate a list of all IPOs over this time period. From this list the following types of firms and filings were excluded: mutual funds, real estate investment trusts (REITs), asset acquisition or blank check companies, foreign F-1 filers, all small business (SB-2) IPOs with the exception of Internet firms, and all spin-offs and other firms that were not true de novo firms.
Every firm going public must file a prospectus with the U.S. Securities and Exchange Commission prior to its initial public stock offering. The IPO is a defining event in the history of any firm, and it performs two functions. First, it provides the firm with capital so that it can continue its expansion. Second, after the IPO, the stakes of both management and investors (subject to certain lock-up delays) become liquid. In return, the firm must conform to the reporting and transparency requirements imposed by the SEC under the Securities Act of 1933. One of the primary objectives of the Securities Act of 1933 is to require companies making a public offering of their securities to publicly disclose relevant business and financial information about their company so that potential investors can make an informed investment decision regarding the offering. To achieve this end, the 1933 Act requires companies going public to file disclosure documents with the Securities and Exchange Commission, the most important of which are the general form S-1 registration statement and the 424B prospectus.
This database has been constructed directly from these registration statements and prospectuses. These documents were found on the SEC's Electronic Data Gathering, Analysis, and Retrieval (EDGAR) website. Up until the advent of the SEC's EDGAR system, IPO registration statements and other SEC documents were filed in paper form in officially designated locations and libraries. Beginning in the 1980s the SEC began to provide Internet access to these documents through its EDGAR program, but it wasn't until June 1996 that public firms were required to file all of their documents in this format. Therefore a complete record of all IPO documents for firms going public only begins in June 1996, and this is the starting point of this database.
5/25/2010 8:00:00 AM By
E.J. Reedy
Last year I posted on an effort at Eurostat to do a large-scale survey on Access to Finance. More than a year later, it looks like Eurostat is much further along in developing this survey. I haven't been tied very closely to their efforts since then because, over time, what they are attempting to do and what we have done with the Kauffman Firm Survey (KFS) have diverged. In a way, I am actually really pleased that this has happened: Eurostat made some very important decisions early on in conceptualizing their study that limited its ability to gather data comparable to the KFS, and the new version of their survey seems much more realistic in the content it collects and more likely to get comparable data across European countries.
The minutes from the most recent expert meeting on the survey and the draft survey instrument show a survey effort that has gelled around determining how the relationship between small firms and debt/equity markets changed between 2007 and 2010. This survey is notable because it is one of the few that asks businesses about attempts to access equity markets (and debt), not just actual equity or debt received. Now, while this is a solid instrument and they face a lot of limits on the length of the survey, I would have concerns about the complexity of concepts used here with no definitions (like factoring) and the reliability of the data as a result. The saving grace may be that the survey skirts most respondent fears by not actually asking for specific dollars received or other sensitive topics, only overall use. The other downside is that there is no way to gather data on the experience of the companies that didn't survive 2007 to 2010, an obviously tumultuous set of years in which access to finance may have been a critical factor in their sustainability. Additionally, these national statistical offices, which already have a significant amount of rich microdata, are gathering data through this survey that is least helpful to them in matched microdata analysis. While they get a lot of directional feedback from businesses, they don't get specifics - like amounts - which for research purposes would have been so much more valuable when matched with other business data already on record. But I have to keep reminding myself that the national statistical offices are not actually charged with performing the research that could really inform policy in this area; they need to be good stewards of the data and try to achieve high response rates so that tabular information can be published and sent to Eurostat.
Eurostat has a tough task in organizing a survey like this since they never actually have the microdata and don't carry out the country-level collections. I admire them for taking on this important and tough subject, but I do so wish that the European regulations were different and that this enormous effort could be used to create a rich research data set that would go so much further in advancing research and understanding in this area than the aggregate tables and limited reports this effort will produce. Don't even get me started on the fact that no non-European countries will be fielding replications of this effort (to my knowledge) when it happens, even though this has been in the planning stages for more than three years.
5/24/2010 3:00:00 PM By
E.J. Reedy
The Economic Development Administration (EDA) at the U.S. Department of Commerce has issued a call for proposals on the “Mapping Regional Innovation Clusters Project.” I am still reading over all the details and thinking about some of its implications, but in seeking proposals in the $1 million/year range for three years of support, this should bring out a large and diverse set of applicants.
The short stated intent of the project:
…EDA, pursuant to its Research and Evaluation program, solicits applications for an economic development research project aimed at developing a replicable method for identifying and mapping regional innovation clusters, providing resources on best practices, and providing recommendations on metrics for the evaluation of regional innovation clusters.
For further details:
http://www.grants.gov/search/search.do?mode=VIEW&oppId=54670
My comments here will be fairly simple.
- Missed opportunity. It’s a real shame that significant efforts like this aren’t actually better thought out across agencies by groups like the Office of Management and Budget, the Office of Science and Technology Policy, or through informal task forces. This effort could be so much better if it were coming on the tail of a one-year or two-year research effort across U.S. statistical agencies to develop new and relevant regional innovation statistics from the underlying microdata. Instead, whoever wins will be forced to use many of the same fairly worn sets of indicators. So much could be done in this regard at Census and BLS, at a minimum, but it takes effort, time, and some funding. The U.S. statistical agencies are becoming increasingly aware that they need to produce better regional statistics (BEA is really taking the lead here, but only after some rough years).
- Web visuals aren’t so different. Having just gone through my first project in online data visualization with the Kauffman Index of Entrepreneurial Activity, I can say that doing online data visualization is not cheap and online visualizations are only as good as the traditional analysis completed. Unfortunately, I don’t feel like the science behind regional clusters or what is actually important to be measured when looking at regional strengths and weaknesses is specified enough to offer a fully-coherent base of knowledge for visualization.
The solicitation specifically identifies a couple of prior EDA-funded projects which the agency wants to be a component of the new project:
5/20/2010 9:00:00 AM By
E.J. Reedy
Today we released the 2009 Kauffman Index of Entrepreneurial Activity. This piece is important because it is the earliest indicator we have of how the composition of who is becoming an entrepreneur is changing. Given the recession, the 2009 numbers are of particular interest. As quick background, the Kauffman Index measures entrepreneurship as the percentage of the adult, non-business owner population that starts a business each month. The Kauffman Index thus captures all new business owners, including those who own incorporated or unincorporated businesses, and those who are employers or non-employers.
You can find more of an overview of the findings on the main Kauffman website, but 2009 is notable in that it shows the highest index rating for the U.S. generally, African-Americans, and men. But rather than regurgitate what we have released there, I wanted to post a few additional tables that I found very interesting, showing how the composition of who is becoming an entrepreneur has changed over time. Here Rob Fairlie (the study's author) and I have applied the Kauffman Index rates of entry for the different demographic groups to the Current Population Survey weights for these populations over time. This isn't information we focus on in the current report, but I do think we might add it next year, as we only really went down this route in the last few days.
Composition of New Entrepreneurs by Age
| Year | Ages 20-34 | Ages 35-44 | Ages 45-54 | Ages 55-64 |
|------|------------|------------|------------|------------|
| 1996 | 35% | 27% | 24% | 14% |
| 1997 | 35% | 28% | 21% | 16% |
| 1998 | 34% | 29% | 21% | 16% |
| 1999 | 33% | 30% | 22% | 16% |
| 2000 | 29% | 27% | 26% | 18% |
| 2001 | 30% | 27% | 25% | 18% |
| 2002 | 29% | 28% | 26% | 17% |
| 2003 | 26% | 30% | 25% | 19% |
| 2004 | 30% | 26% | 24% | 21% |
| 2005 | 31% | 25% | 23% | 20% |
| 2006 | 28% | 25% | 27% | 20% |
| 2007 | 28% | 26% | 27% | 19% |
| 2008 | 28% | 25% | 26% | 21% |
| 2009 | 25% | 27% | 26% | 23% |
Composition of New Entrepreneurs by Race
| Year | White | Black | Latino | Asian |
|------|-------|-------|--------|-------|
| 1996 | 77% | 9% | 11% | 4% |
| 1997 | 77% | 9% | 11% | 3% |
| 1998 | 78% | 8% | 11% | 4% |
| 1999 | 75% | 10% | 11% | 4% |
| 2000 | 74% | 11% | 12% | 3% |
| 2001 | 73% | 10% | 13% | 5% |
| 2002 | 73% | 11% | 12% | 4% |
| 2003 | 70% | 9% | 16% | 5% |
| 2004 | 72% | 9% | 15% | 5% |
| 2005 | 71% | 10% | 14% | 5% |
| 2006 | 70% | 10% | 15% | 5% |
| 2007 | 67% | 9% | 18% | 5% |
| 2008 | 65% | 8% | 21% | 6% |
| 2009 | 66% | 10% | 19% | 5% |
Composition of New Entrepreneurs by Nativity
| Year | Native-Born | Immigrant |
|------|-------------|-----------|
| 1996 | 86% | 14% |
| 1997 | 87% | 13% |
| 1998 | 86% | 14% |
| 1999 | 85% | 15% |
| 2000 | 84% | 16% |
| 2001 | 84% | 16% |
| 2002 | 82% | 18% |
| 2003 | 81% | 19% |
| 2004 | 79% | 21% |
| 2005 | 82% | 18% |
| 2006 | 80% | 20% |
| 2007 | 76% | 24% |
| 2008 | 74% | 26% |
| 2009 | 76% | 24% |
Composition of New Entrepreneurs by Education Level
| Year | Less than High School | High School Graduate | Some College | College Graduate |
|------|-----------------------|----------------------|--------------|------------------|
| 1996 | 17% | 33% | 27% | 23% |
| 1997 | 17% | 31% | 28% | 24% |
| 1998 | 15% | 33% | 26% | 26% |
| 1999 | 14% | 33% | 27% | 27% |
| 2000 | 16% | 33% | 28% | 24% |
| 2001 | 14% | 30% | 25% | 31% |
| 2002 | 14% | 31% | 25% | 29% |
| 2003 | 17% | 29% | 26% | 28% |
| 2004 | 14% | 29% | 27% | 30% |
| 2005 | 16% | 29% | 28% | 28% |
| 2006 | 14% | 29% | 28% | 29% |
| 2007 | 15% | 28% | 24% | 32% |
| 2008 | 16% | 31% | 24% | 29% |
| 2009 | 16% | 31% | 23% | 30% |
So, what we see here is that, partly because of changing propensities to enter entrepreneurship (the Kauffman Index) and partly because of changing demographic patterns of the labor force, new entrepreneurs are getting older, more educated, less white, and more likely to be immigrants.
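For readers who want to see how the two forces combine, here is a minimal sketch of the arithmetic as I understand it: multiply each group's entry rate by its weighted population to estimate new entrepreneurs per group, then divide by the total. The function name and every number below are hypothetical placeholders for illustration only; they are not Kauffman Index rates or Current Population Survey weights.

```python
def composition(entry_rates, populations):
    """Return each group's share of all new entrepreneurs.

    entry_rates: group -> monthly entry rate (fraction of adult
                 non-business owners who start a business)
    populations: group -> weighted population count for that group
    """
    # Estimated new entrepreneurs per group = rate x population.
    entrants = {g: entry_rates[g] * populations[g] for g in entry_rates}
    total = sum(entrants.values())
    return {g: n / total for g, n in entrants.items()}


# Hypothetical inputs (NOT real Kauffman Index or CPS figures).
example_rates = {"Ages 20-34": 0.0025, "Ages 55-64": 0.0040}
example_pops = {"Ages 20-34": 60_000_000, "Ages 55-64": 30_000_000}

for group, share in composition(example_rates, example_pops).items():
    print(f"{group}: {share:.0%}")
```

A group's share can rise either because its entry rate rises or because the group itself grows, which is why the tables above reflect both the index and demographics.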
Kauffman Index microdata for 2009 will be made available in the next couple of months. Currently, microdata is available through the website through 2008.
5/18/2010 8:00:00 AM By
E.J. Reedy
A lot of times I find out about new data sources through working papers or conference presentations. In this case, Ben Hallen at the University of Maryland and Rory McDonald have a working paper on super angel investors that uses a new database, CrunchBase, and Ben seemed enthusiastic about the data, so I thought I’d take more of a look. Incidentally, keep an eye out for the updated version of this paper, as it was really interesting; for now the paper is not posted online, and interested scholars should contact the authors directly.
CrunchBase, which advertises itself as the “free tech company database,” is a great concept and one that can only become more powerful as more users see it and use it. It’s essentially technology company data collected via wiki. Here were the overview stats as of 5/14/2010:
CrunchBase Stats
Companies - 39,866
People - 54,684
Financial Organizations - 4,705
Service Providers - 2,305
Funding Rounds - 14,944
Acquisitions - 2,996
While many researchers will have concerns about data gathered using a bottom-up process, I suspect the data is actually much more accurate than we would expect. Now, this isn’t to say that the data should be taken as is, because even CrunchBase acknowledges the following on their FAQ web page:
You do not know if the data is accurate. As multiple people edit CrunchBase profiles of companies, financial organizations and people, some mistakes might be added. Information might also be out of date. If you notice anything that needs changing you can go ahead and edit the page.
Most large data sets (even government data) have a significant amount of error at the individual firm level which, if random, washes itself out as the data gets aggregated up. Now, the true test of CrunchBase as a research tool will be to see if they close the cycle with researchers providing data and receiving updates back. In my experience, scholars are great at taking data, complaining about it, and spending tons of time cleaning it, but rarely do they go to the next step of showing data producers where there were errors or things that could be improved. I hope for CrunchBase’s sake that this paradigm begins to change.
In any case, for those looking at technology companies CrunchBase definitely seems worth a further exploration. I hope that those of you who have explored this data further than I have will offer comments related to how good or bad CrunchBase is at curating the data to allow for longitudinal analysis.
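The point above about random firm-level error washing out in aggregates can be illustrated with a small simulation. This is only a sketch under assumptions I am making up for the purpose: firm values drawn uniformly, and independent, unbiased reporting error with a 30 percent standard deviation. It says nothing about the actual error structure in CrunchBase or any other data set, and error that is systematic rather than random would not wash out this way.

```python
import random


def simulate(n_firms, error_sd=0.3, seed=0):
    """Compare firm-level error with error in the aggregate total.

    Returns (mean absolute relative error per firm,
             absolute relative error of the summed total).
    """
    rng = random.Random(seed)
    # Hypothetical "true" firm values, e.g. funding in $ millions.
    true_values = [rng.uniform(1.0, 10.0) for _ in range(n_firms)]
    # Each reported value carries independent, unbiased random error.
    reported = [v * (1 + rng.gauss(0, error_sd)) for v in true_values]

    firm_error = sum(
        abs(r - v) / v for r, v in zip(reported, true_values)
    ) / n_firms
    total_error = abs(sum(reported) - sum(true_values)) / sum(true_values)
    return firm_error, total_error


firm_error, total_error = simulate(10_000)
print(f"typical firm-level error: {firm_error:.1%}")
print(f"error in aggregate total: {total_error:.1%}")
```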
5/17/2010 3:00:00 PM By
E.J. Reedy
Hi, my name is E.J., and my household is a nightmare for survey research firms. We are a cell phone-only household. Even worse, I maintain the same cell phone number which I first received in college which means that I maintain an area code which is at least two household moves ago.
The National Center for Health Statistics reports that I am like a quarter of Americans who no longer maintain a landline phone, choosing to have cell phones only. Furthermore, they found that one in seven homes has a landline but rarely uses it. It used to be that random digit dialing was the best way to get a nationally representative sample of households so that survey firms could talk to small samples of people and extrapolate to large populations. But these continuing trends away from landlines (particularly for younger households) make that a real challenge. I looked around a bit to see if I could find a similar study for businesses (if you know of one, let me know!) but all I found was a 2004 NFIB study which showed "seventy-eight percent of small-business owners use a cell phone for business purposes." I suspect that almost all self-employed businesses now predominantly use cell service rather than landlines, while it will be less of an issue for brick-and-mortar stores.
If you want to follow this story more, then AAPOR or the JSM are likely to be the most helpful places to look for additional guidance.
Related previous posts:
The Impact of Cell Phones on Entrepreneurship Surveys,
Exploring Mode Effects in Establishment Surveys
Update 5/24/2010: I saw a course listing at the Joint Program on Survey Methodology on this issue for March 23, 2011.
5/17/2010 9:43:43 AM By
E.J. Reedy
A new paper out from the National Bureau of Economic Research - “Dynamic Text-Based Industry Classifications and Endogenous Product Differentiation” by Gordon M. Phillips and Gerard Hoberg - discusses the power large-scale text analysis can provide in examining industrial classifications and other traditionally nebulous areas of differentiation among firms and markets.
Although it is convenient to use existing industry classifications such as SIC or NAICS for research purposes, these measures have limitations. Both do not adjust significantly over time as product markets evolve. Innovations can also create new product markets that do not exist in fixed classifications. In the late 1990s, hundreds of new technology and web-based firms were grouped into a large and nondescript SIC-based “business services” industry. More generally, fixed classifications like SIC and NAICS have at least four shortcomings: they only rarely re-classify firms into different industries as firm product offerings change, they do not allow for product markets themselves to evolve over time, they do not allow for the possibility that two firms that are rivals to a third firm might not directly compete against each other, and lastly, they do not allow for within industry continuous measures of similarity to be computed.
This is a timely publication as the Office of Management and Budget (OMB) is in the final stages of seeking approval (and feedback) about the 2012 revisions to the North American Industry Classification System. While a lot of effort is made to update these industry classifications, unfortunately I do not believe that government officials are yet taking advantage of some of the methods described in this paper, which mine existing data to look for discontinuities in how industries are defined, when firms change industries, or other aspects of industrial organization.
Now, the prospect of the government performing large-scale text analysis like this might scare some, but in my mind, there are groups like the Center for Economic Studies at the Census Bureau or other places like the Statistics of Income Division at the Internal Revenue Service who could do this responsibly if given the mandate, funding, and some lead time. These places house large quantities of text data yet maintain separate research functions and most importantly they maintain processes for seeking outside researcher proposals for cutting edge research which would benefit the agencies through improved data products. I’ve never heard staff at either of these locations discuss this NAICS redesign as a high priority but perhaps if OMB were using their coordinating powers and discretionary funding with more force, that could change.
Identifying new industries, clusters, and other big changes in industrial organization faster and more accurately remains a key deficiency in the current national statistical system. The U.S. regions that are on the front line of economic development rely too much on private data to try to understand change in their economies because the federal system has too often missed the data needs of the diffused customers here. Coincidentally, the Council for Community and Economic Research annual conference starts today in Washington, DC. This is the most organized group of individuals advocating for improved regional economic statistics in the United States.
I should note that while there is great potential power in the methods employed by Phillips and Hoberg, the authors also note the potential for gaming by firms if they felt the text they were sharing could be manipulated to affect government policies to the firms' advantage. “We also note that while our new measures are interesting for research or scientific purposes, they would not be good for policy and antitrust purposes as they could be manipulated by firms fairly easily if firms believed they were being used by policy makers.” I think these methods would be best added to an existing review process and not seen as a substitute. In that case, the ability to game the system could be reduced.
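To make the idea of a continuous, text-based measure of firm similarity more concrete, here is a toy sketch that scores pairs of firms by the cosine similarity of the words in their product descriptions. It illustrates the general technique only and is not the authors' implementation or data; the firm names and descriptions are invented, and a real application would need far more careful text processing (for example, dropping common words).

```python
import math
from collections import Counter


def word_vector(text):
    """Count the words in a product description (lowercased)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors, from 0 to 1."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented firms and descriptions, for illustration only.
descriptions = {
    "FirmA": "online search advertising software",
    "FirmB": "online retail software platform",
    "FirmC": "steel pipe manufacturing",
}
vectors = {name: word_vector(text) for name, text in descriptions.items()}

for x, y in [("FirmA", "FirmB"), ("FirmA", "FirmC")]:
    print(x, y, round(cosine_similarity(vectors[x], vectors[y]), 2))
```

Unlike a fixed SIC or NAICS code, a score like this can be recomputed each year as descriptions change, and it allows two firms to each be close to a third without being close to one another.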
5/14/2010 4:00:00 PM By
E.J. Reedy
The Council for Community and Economic Research (C2ER) and the Council of Development Finance Agencies (CDFA) recently released the C2ER-CDFA State Business Finance & Incentives Resource Center. C2ER and CDFA members can use this to research business incentives and development finance programs across the country. The C2ER-CDFA State Business Finance & Incentives Resource Center is a national database with more than 1,700 programs from all 50 states and federal agencies. Programs are cataloged and searchable by state, type (e.g., bonds, grants, loans, loan guarantees, tax credits), category (e.g., tax, direct business financing, indirect business finance), and business need.
I've been lamenting for the last couple of months the dearth of good policy data sets that could enable analysis of actual policy impacts across states on important topics. Today, I am pleased to report the introduction of a new database that does just this, although I wish it did so over time and was also open to all scholars, not just the members of the associations that sponsor it. But all good things come with time, I hope, and I am a huge supporter of C2ER and would recommend membership. I've downloaded two sample documents from the website so people can see more of what they'd actually get in this database: listings at the state level and detail about individual policy. With state-level, longitudinal business tabulations/databases now available from Census and rich, although not yet longitudinal, files from the Bureau of Labor Statistics, we should see examinations in this area of scholarship expanding greatly.
5/12/2010 9:12:58 AM By
E.J. Reedy
5/11/2010 2:24:58 PM By
E.J. Reedy