ASSESSING METHODS FOR EVALUATING STATE TECHNOLOGY DEVELOPMENT PROGRAMS: RECOMMENDATIONS FOR THE GEORGIA RESEARCH ALLIANCE

Completed April 1997. Presented at Annual Meeting, Technology Transfer Society, Denver, CO, July 1997.


Jan Youtie, Economic Development Institute, Georgia Institute of Technology, Atlanta, GA 30332-0640 USA, Email: jan.youtie@edi.gatech.edu; Barry Bozeman, School of Public Policy, Georgia Institute of Technology, Atlanta, GA 30332-0345 USA, Email: barry.bozeman@pubpolicy.gatech.edu; Philip Shapira, School of Public Policy, Georgia Institute of Technology, Atlanta, GA 30332-0345 USA, Email: ps25@prism.gatech.edu


Introduction

U.S. states have been increasing their investments in technology development programs in recent years. From 1992 to 1994, state investments in university/non-profit centers, joint industry-university research partnerships, direct financing grants, incubators, and near-term assistance programs using science and technology for economic development grew by more than 25 percent, approaching $400 million in 1994 (Coburn and Berglund, 1995). These state investments are augmented, in most cases, by multiple other funders, including the federal government, industry, venture capital, consortia, and private sources.

The 1990s have also been a period in which more attention is being paid to government program performance. Thirty-five states have some type of performance-based budgeting initiative, either through legislation, executive order, or budget agency initiative. The field of technology development has not been immune from this growing desire for performance measurement. A recent survey of such programs found that 95 percent of states have some method for collecting performance data or conducting a program evaluation. Yet despite the prevalence of performance measurement or evaluation efforts among state technology development programs, few states have well-conceived evaluation plans. For example, activity reporting, client survey data, and informal client contact are the most commonly used evaluation methods (Melkers and Cozzens, 1996). More systematic evaluation approaches are less common. This is only partly due to lack of funding or interest; there are also complex issues about how best to apply evaluation methodologies to assess the often diffuse and indirect effects of technology promotion policies.

This paper reports on a study that examined the appropriateness of different evaluation methods for assessing the performance of one of Georgia’s major technology development programs, the Georgia Research Alliance (GRA). The GRA is a collaborative initiative among six research universities in the state to use research infrastructure invested in targeted industry areas to generate economic development results. Research infrastructure investments in advanced telecommunications, environmental technologies, and human genetics are administered by three centers. A recent GRA programmatic addition funds the university side of industry-university collaborative research projects with significant commercial potential. GRA management acts as a "holding company" for the program, developing strategy and finding financial resources. In the past five years, the state has invested approximately $126 million in eminent scholar endowments and equipment and facilities.

Evaluability Assessment

To further its performance-based budgeting initiatives, the Governor’s Office of Planning and Budget of the state of Georgia sought evaluative information about GRA’s impacts. However, to what extent is this actually possible? What should be measured? And what actually can be measured, given the constraints on the resources that can reasonably be allocated to evaluation? Such questions raise the issue of the "evaluability" of technology development programs: the degree to which the particular characteristics of the program affect the ability to provide effective evaluation.

The elements and factors that affect GRA’s evaluability include the following:

Recommended Methods

The set of methods available for evaluation is considerable. Almost every methodological approach employed in the social and behavioral sciences has, at some point, been adapted to the purpose of evaluation. In addition, some methods have been developed specifically for evaluation purposes. Rather than deal comprehensively with the range of available evaluation methods (a task more suited to textbook writing than to evaluation design), we give more extensive treatment to those techniques we recommend as most appropriate for evaluating GRA. Each evaluation method has specific strengths and weaknesses, both in general and for any specific application. This section discusses some of the advantages and disadvantages of the recommended methods with specific reference to GRA needs. It also recommends combinations of evaluation methods, on the presumption that the weaknesses of one method can often be offset by using another in combination.

Recommended Evaluation Approaches

We recommend two different evaluation regimes for assessing GRA and its activities. We refer to these as "routine evaluation" and "comprehensive evaluation." Routine evaluation implies the investment of modest resources and does not require the expertise of external consultants. Comprehensive evaluation is more thoroughgoing but requires greater resources and the use of external evaluation consultants. We recommend that each be pursued, but at different intervals. Routine evaluation should be performed annually or every two years; comprehensive evaluation should be performed on a three- to four-year cycle.

"Routine" Evaluation

Typically, valid evaluation requires considerable technical expertise and commitment of substantial resources. But often it is possible to engage in useful evaluation activity even when evaluation is performed on a modest budget and by persons who are not highly trained evaluators. In the GRA context, two types of useful evaluation activities can be performed at very little expense and without the need for great evaluation expertise. We recommend that GRA be evaluated every year or two on the basis of (1) performance indicators and (2) flow analysis.

"Comprehensive" Evaluation

OPB and GRA may wish to consider setting aside a percentage of program money to devote to evaluation. This is a common practice and has led to the production of high-quality, highly usable evaluations. Two familiar examples are the evaluation resources invested by NIST and state manufacturing assistance programs and those invested for the Department of Energy’s Energy Related Inventions Program (Brown, Curlee and Elliott, 1995). With a sum set aside, resources would be available every three to four years for a comprehensive evaluation. While each of the recommended methods presented in Table 2 is worthy of consideration, we feel that a good balance is provided by using (in addition to the methods employed in the routine evaluation): (1) a survey-based cost-benefit analysis; (2) case studies; (3) content analysis; and (4) external and peer review. Used in combination for a comprehensive evaluation, these approaches complement one another: case studies provide an in-depth portrait of program activities with attention to the details that contribute to success; cost-benefit analysis provides objective, monetary-based measures of impact; and content analysis gives some insight into the crucial issue of changes in perceptions (e.g., business climate). To properly conduct a cost-benefit analysis, it would be necessary to conduct interviews and surveys with firms to define, identify, and appropriately attribute costs and benefits. A full set of public benefits and costs would also need to be identified.

Peer review remains the best approach to providing valid, credible evaluation of scientific research. We recommend that each major GRA program be submitted to peer review every four to five years. The primary use of peer review should be evaluation of the quality of the scientific work, and the peer panels should be composed of scientific experts with intimate knowledge of the scientific fields addressed by GRA researchers. Peer review is much less useful in determining the economic potential of scientific work. Thus, external reviews should accompany peer reviews. Panels of industry advisors are more appropriate for assessing the economic utility of GRA work. The periodic use of peer review and external review will provide an assessment of scientific quality and of the relevance of quality scientific and technical outcomes to the long-term economic development goals of GRA.

The balance provided by this combination of methods is further illustrated by considering the extent to which the use of these methods addresses the issues identified in the interviews of GRA stakeholders.

Relevance to Evaluability Assessment

In this section, we revisit some of the issues raised in the evaluability assessment. The section examines the evaluation design recommendations in light of how easily, and how well, a measured assessment of impacts can be conducted.

Evaluation Expertise Requirements

We do not feel that a highly expert evaluation team is needed for the routine evaluations. It may be useful to employ a professional evaluation team for the one-shot review and establishment of performance indicators.

The comprehensive evaluation should not be undertaken unless professional evaluation research personnel are employed. The skills required for more comprehensive evaluation reside in a number of institutional contexts including consulting firms, universities, and professional associations. Sometimes government agencies (more common in federal than state government) have their own highly expert evaluation units. There are familiar trade-offs involved in the choice of one or another source of evaluation expertise (see Bozeman, 1979) including, for example, contextual knowledge vs. perceived disinterestedness, rigidity vs. over-eagerness to please, and expertise vs. availability. But regardless of the institutional provider and particular actors chosen for the evaluation, the evaluators should have experience in a wide variety of methods. Too often evaluators have an evaluation "hammer" used on every policy "nail." The mix of techniques and methods recommended here suggests a "one-tool" evaluation team is inappropriate. Similarly, the evaluators should have considerable knowledge of technology-based economic development programs. The complexities of GRA are such that evaluators not well-versed in such programs are much less likely to provide valid information.

References

Bozeman, B. (1979). Public Management and Policy Analysis. New York: St. Martin’s Press.

Bozeman, B. (1993). "Peer Review and the Evaluation of R&D Impacts," in B. Bozeman and J. Melkers, (eds.), Evaluating R&D Impacts, New York: Kluwer Publishing, pp. 36-49.

Bozeman, B. and D. Coursey (1992). "Benefits and Problems in Technology Transfer: A National Survey of U.S. University and Government Laboratories," IEEE Transactions on Engineering Management, 132-141.

Brown, M.A., T.R. Curlee, and S.R. Elliott (1995). Evaluating Technology Innovation Programs: The Use of Comparison Groups to Identify Impacts. Research Policy 24(5):669-685.

Brown, M.A., L.G. Berry and R. Goel (1991). "Guidelines for Successfully Transferring Government-Sponsored Innovations," Research Policy, 20, 121-143.

Chubin, D. and E. Hackett (1990). Peerless Science: Peer Review and U.S. Science Policy. Albany, NY: State University of New York Press.

Coburn, C. and Berglund, D. (1995). Partnerships: A Compendium of State and Federal Cooperative Technology Programs. Columbus, OH: Battelle Memorial Institute.

Cook, T. and D. Campbell (1979). Quasi-experimentation. Boston: Houghton Mifflin.

Cosmos Corporation (1996). A Day in the Life of the Manufacturing Partnerships: Case Studies of Exemplary Engagements with Clients by MEP Centers. Gaithersburg, MD: National Institute of Standards and Technology.

Dunn, W. (1994). Public Policy Analysis, second edition. Englewood Cliffs, NJ: Prentice-Hall.

Irvine, J., and B.R. Martin (1983). Assessing Basic Research: Some Partial Indicators of Scientific Progress in Radio Astronomy. Research Policy 12(2):61-90.

Melkers, J. and S. Cozzens (1996). Performance Measurement in State Science and Technology Programs. Paper prepared for the Annual Meeting of the American Evaluation Association, Atlanta, Georgia.

Oldsman, E. (1997). "The Impact of the New York Manufacturing Extension Program: A Quasi-Experiment," in Shapira, P. and J. Youtie (Eds.) Manufacturing Modernization: Learning from Evaluation Practices and Results, Atlanta, Georgia: Georgia Tech Research Corporation.

Riall, B. William (1991). "Local Economic Impact: Costs and Benefits of Development." Atlanta, Georgia: Georgia Tech Research Corporation.

Rip, A. (1988). "Mapping of Science," in A. Van Raan (ed.) Handbook of Quantitative Studies of Science and Technology. New York: North Holland.

Roessner, J.D., Y. Lee, P. Shapira, and B. Bozeman (1996). "Evaluation of Iowa State University’s Center for Advanced Technology Development." Atlanta, Georgia: Georgia Tech Research Corporation.

Shapira, P. and J. Youtie (1995). "Georgia Manufacturing Extension Alliance: Overview of the Evaluation Plan," in Shapira, P. and J. Youtie, Evaluating Industrial Modernization. Atlanta, GA: Georgia Tech Research Corporation.

Stokey, E. and R. Zeckhauser (1978). A Primer for Policy Analysis. New York: W.W. Norton and Company.

Tornatzky, L., P. Waugaman, L. Casson, S. Crowell, C. Spahr, and F. Wong (1995). Benchmarking Best Practices for University-Industry Technology Transfer, Southern Technology Council.

Weimer, D. and A. Vining (1989). Policy Analysis. Englewood Cliffs, NJ: Prentice Hall.

Yin, R. (1989). Case Study Research. Newbury Park: Sage Publications.

Youtie, J. (1997). "Toward a Cross-Case Analysis of Outcomes of Exemplary Engagements by MEP Centers," in Shapira, P. and J. Youtie (Eds.) Manufacturing Modernization: Learning from Evaluation Practices and Results, Atlanta, Georgia: Georgia Tech Research Corporation.