
Center for Science, Mathematics & Computer Education

Data Connections: Developing a Coherent Picture of Mathematics Teaching and Learning

Data Connections Conference

Data Connections held a conference April 24-25, 2014, in Lincoln, Nebraska, sharing findings on the use of value-added models to characterize the impact of teacher professional development on K-8 student mathematics achievement. The conference featured plenary presentations by the Data Connections team and by nationally recognized statisticians and psychometricians, Dr. Mark Reckase and Dr. Sharon Lohr. It also offered concurrent breakout sessions and workshop time to explore key questions related to modeling teacher and student data from the perspectives of statisticians and psychometricians, school district personnel, and other educational researchers.

Slides, handouts, and notes from Data Connections conference sessions were collected and are available here.

Travel support was available to attend this conference. Attendees from Math Science Partnership projects were given preference for travel support (up to 3 attendees per MSP project). MSPs were encouraged to send teams composed of statisticians/psychometricians, K-12 district personnel, and educational researchers.


Data Connections: Developing a Coherent Picture of Mathematics Teaching and Learning is a 3-year, $1.2 million Research, Evaluation and Technical Assistance (RETA) grant funded by the National Science Foundation. The grant is a partnership among the University of Nebraska-Lincoln, the Lincoln Public Schools, and RMC Research. The purpose of this RETA study is to develop statistical models to create a coherent picture of mathematics teaching and learning. Data Connections works closely with two other NSF-funded Math Science Partnership (MSP) programs: Math in the Middle Institute Partnership and NebraskaMATH. Utilizing data already collected through these MSP programs, the three-year study builds on newly developed models to help these and other MSP programs and their evaluators evaluate and interpret student and teacher data in statistically productive and meaningful ways.

Specifically, the project focuses on layered value-added models to estimate teacher effects, and how those models can be adapted for use with "messy" data, for example when different tests are administered in different grade levels at different times of year. The project is investigating the use of Z-scores, parallel processing, and binning by quantile to address issues arising with available student achievement data. Project research scope includes:

  • methods to estimate student achievement trajectories over time
  • methods for best connecting these trajectories to measures of teaching quality
  • connecting measures of student and teacher attitudes to each other and to measures of student achievement and teaching quality
  • estimating the impact of MSP interventions on all of the above
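To illustrate the kind of rescaling these methods involve, the sketch below (an illustrative toy example with hypothetical data, not the project's actual code) standardizes scores from two differently scaled tests to Z-scores within each administration and assigns students to quantile bins:

```python
from statistics import mean, pstdev

def z_scores(scores):
    """Standardize raw scores to mean 0, SD 1 within one test administration."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]

def quantile_bin(scores, n_bins=4):
    """Assign each score a bin (0..n_bins-1) according to its rank quantile."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    bins = [0] * len(scores)
    for rank, i in enumerate(order):
        bins[i] = min(rank * n_bins // len(scores), n_bins - 1)
    return bins

# Hypothetical scores from two tests on different scales:
test_a = [52, 61, 70, 75, 88]       # e.g., a norm-referenced test
test_b = [310, 340, 355, 390, 420]  # e.g., a criterion-referenced test
za, zb = z_scores(test_a), z_scores(test_b)
bins_a = quantile_bin(test_a)
```

After standardizing, a student's relative standing can be compared across tests that share no common developmental scale, which is the basic move behind the Z-score and quantile-binning approaches named above.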

Statistical modeling theory and associated computing technology have seen rapid advances in the past several years. There is a great need to develop statistical models that take full advantage of these advances to connect various forms of data into a coherent picture of mathematics teaching and learning. MSP projects are grappling with how to evaluate the effects of their sponsored professional development programs for teachers on student learning. Few studies have addressed how to use value-added models to analyze achievement data that are not on a single developmental scale, and even fewer, perhaps none, have discussed how to use information from multiple instruments in a single year that are on different scales. In addition to this work, the research findings will be disseminated widely through presentations at national conferences, articles published in peer-reviewed journals, and a dissemination conference targeting MSP project and evaluation personnel. Utilizing value-added models increases the potential to provide evidence of high-quality teacher impact in high-need schools. By helping MSP projects use the methods developed in this study, Data Connections will build their capacity to inform the nation of how their projects impact teaching and learning.


The overall goal of Data Connections is to develop, refine, and disseminate statistical models that support a coherent picture of mathematics teaching and learning, particularly in regard to MSP programs. Specifically, a main goal is to investigate the use of Z-scores, parallel processing, nonparametric ranking, and binning by quantile to create a coherent picture of student achievement trajectories across different assessments given at different points in time. Once we address this first goal, we will move to connecting various measures of teaching quality and teacher and student attitudes in order to create a coherent picture of mathematics teaching and learning.

Our goals will be accomplished through four project objectives:

Objective 1: Determine how value-added models can be effectively used to estimate teacher effects when student achievement data are from a variety of measures at different times.
Objective 2: Determine how to connect student achievement trajectories to measures of teaching quality (mathematical knowledge for teaching, mathematical quality of instruction).
Objective 3: Determine how to best connect measures of student and teacher attitudes to each other and to measures of student achievement and teaching quality.
Objective 4: Determine how to best connect measures of teaching quality and student achievement to teacher professional networks.


  • Dr. Walt Stroup, PI, Dept. of Statistics
  • Dr. Jennifer Green, Co-PI, Montana State University
  • Dr. Wendy Smith, Co-PI, Center for Science, Mathematics and Computer Education
  • Dr. Leslie Lukin, Co-PI, Lincoln Public Schools
  • Dr. John Sutton, Evaluator, RMC Research
  • Dr. Traci Kutaka, Postdoctoral research associate, Center for Science, Mathematics & Computer Education
  • Pam Fellers, Graduate research assistant, Dept. of Statistics
  • Lixin Ren, Graduate research assistant, Dept. of Psychology
  • Dr. Xin Wang, Evaluator, RMC Research

National Advisory Board Members

  • Dr. William Sanders, senior research fellow, University of North Carolina; senior manager of value-added assessment and research, SAS Institute Inc.: Dr. Sanders is a leading scholar in developing statistical methods for working with longitudinal student achievement data. Dr. Sanders also has been deeply involved in developing statistical software to meet the needs of educational research data analyses.
  • Dr. Heather Hill, associate professor, Harvard Graduate School of Education: Dr. Hill's expertise in developing the Mathematical Quality of Instruction instrument and her work on the Mathematical Knowledge for Teaching instruments provide valuable guidance on our uses of data from these instruments. Dr. Hill is also seeking ways to connect data from a variety of measures of teacher quality and student achievement.
  • Dr. Paul Eakin, professor, University of Kentucky: Dr. Eakin is an experienced mathematician and the PI for the Appalachian Math Science Partnership. He has spent much of the last 15 years dealing with issues of connecting teaching practices to student learning.

RESEARCH BASE (Why Value-Added Models?)

The Common Core State Standards (CCSS) set out ambitious academic standards for K-12 mathematics and language arts. There is a significant need to ensure teachers have high-quality professional development opportunities that enable them to teach to such high standards. Long before the CCSS, NSF's MSP programs were providing substantial content-based professional development to teachers through many projects. Through the MSP program, NSF, mathematicians, and mathematics educators are making a significant investment in the education of our nation's mathematics teachers. Increasingly, MSP projects are challenged to document the benefits of their programs on the teaching and learning of mathematics. This proposal responds to that challenge.

Good teaching matters. Quantitative studies have found that the statistical effect size of good teaching is greater than that of any other educational variable, including students' socioeconomic status (e.g., Wenglinsky, 2002). "Teachers' and students' practices mediate between the conventional resources that schools and school systems deploy, on the one hand, and learning accomplishments, on the other" (Cohen, Raudenbush, & Ball, 2002, p. 86). Key variables in student learning include the cognitive demands of tasks, the questions teachers ask to probe student thinking, and teachers' high expectations of and support for student learning (e.g., Kilpatrick, Swafford, & Findell, 2001). "Teachers are crucial to students' opportunities to learn mathematics, and substantial differences in the mathematics achievement of students are attributable to differences among teachers" (National Mathematics Advisory Panel Report, 2008, Ch. 5, p. ix). However, the field still lacks effective ways to connect measures of teaching quality to measures of student learning.

There is certain mathematical knowledge that teachers need and other users of mathematics do not (such as the knowledge required to diagnose student errors and misconceptions); too many practicing teachers lack sufficient mathematical knowledge for teaching to effectively build deep student understanding of mathematics (e.g., Ball & Bass, 2003; Ball, Thames, & Phelps, 2008; Davis & Simmt, 2006; Kilpatrick, Swafford, & Findell, 2001; Ma, 1999). Mathematical knowledge for teaching is the particular form of mathematical knowledge that is usable in, and useful for, the work that teachers do as they teach mathematics (Stylianides & Ball, 2008). Teachers with greater mathematical knowledge for teaching are better able to listen to student reasoning and to help students build conceptual understanding of mathematical concepts (e.g., Ball, Thames, & Phelps, 2008).

Since the enactment of No Child Left Behind (NCLB) in 2001, education systems, in theory, have held students to higher academic standards by holding states accountable for assessing measurable student outcomes. Value-added modeling is an alternative to test-based accountability systems that focus on the proportion of students scoring at or above pre-determined proficiency levels. Value-added modeling techniques aim to estimate the contribution of educational factors, such as schools and teachers, to growth in student achievement, while allowing for the possibility of controlling for non-educational factors, such as socioeconomic status. These methods provide opportunities to estimate the proportion of variability in achievement or student growth attributable to teachers, as well as to estimate an individual teacher's effect on student learning. Of urgent concern is developing value-added modeling techniques that remain effective with less-than-ideal (i.e., real) school district data on student achievement.
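As a toy illustration of the growth-not-status idea, the sketch below computes a naive "teacher effect" as each teacher's mean student gain relative to the overall mean gain. This is a hypothetical simplification for intuition only; actual layered value-added models (e.g., EVAAS) are far more elaborate, handling multiple years, missing data, and shrinkage of estimates:

```python
from collections import defaultdict
from statistics import mean

def teacher_effects(records):
    """
    Crude value-added estimate: each teacher's mean student gain minus the
    overall mean gain, so effects reflect growth rather than achievement status.
    records: list of (teacher, prior_score, current_score) tuples.
    """
    gains = [(t, cur - pre) for t, pre, cur in records]
    overall = mean(g for _, g in gains)
    by_teacher = defaultdict(list)
    for t, g in gains:
        by_teacher[t].append(g)
    return {t: mean(gs) - overall for t, gs in by_teacher.items()}

# Hypothetical data: (teacher, fall score, spring score)
data = [("A", 50, 60), ("A", 55, 68), ("B", 50, 55), ("B", 60, 63)]
effects = teacher_effects(data)  # teacher A above average, B below
```

Note how a teacher with high-achieving but slow-growing students would receive a low estimate here, which is exactly why issues such as ceiling effects and scale comparability matter for these models.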

School districts and policymakers desire to use teacher effect estimates for a variety of purposes, from informing educational systems how students are affected by current practices and conditions to making high-stakes decisions regarding teacher salary and/or employment. However, even though value-added modeling methods infer causal effects of teachers on student growth, the assessment data are not obtained from randomized, experimental studies. Consequently, several questions arise when defining what teacher effects really describe. Defining teacher effects requires identifying to what a particular teacher's impact on a student's growth in achievement will be compared, such as other teachers in the school, district, or entire state. The definition also depends on the outcomes used to measure achievement; the scope and purpose of the instruments can limit what is measured and, consequently, restrict the part of a teacher's total impact on a student that can be estimated (McCaffrey et al., 2003). Other factors, such as characteristics of classrooms and schools that affect students' growth in achievement can be confounded with teacher effect estimates, so the purpose for obtaining such estimates needs to be clearly defined and should dictate how precisely the effects need to be estimated. Typically, teacher effects merely account for unexplained classroom-level heterogeneity (Lockwood, McCaffrey, Mariano, & Setodji, 2007).

Past research has examined the relationship between teaching quality and student learning, considering the effects that teachers' degrees (Ackerman, Heafner, & Bartz, 2006; Rowan et al., 2002; Wayne & Youngs, 2003), coursework (Hill, Rowan, & Ball, 2005; Wayne & Youngs, 2003), certification status (Ackerman et al., 2006; Presley, White, & Gong, 2005; Hill et al., 2005; Rowan et al., 2002; Wayne & Youngs, 2003), teaching experience (Ackerman et al., 2006; Hill et al., 2005; Presley et al., 2005; Rowan et al., 2002; Wayne & Youngs, 2003), licensure examination scores (Presley et al., 2005; Wayne & Youngs, 2003), pedagogical practices (Ackerman et al., 2006; Rowan et al., 2002), and amount of professional development (Ackerman et al., 2006) have on student achievement. Recent efforts measure not only teachers' mathematical content knowledge but also their mathematical knowledge for teaching (Ball, Hill, & Bass, 2005; Hill, 2007b; Hill, Schilling, & Ball, 2004), examining its relationship to student achievement (Hill et al., 2005). Various covariate adjustment models and gain score models have been used with methods such as analysis of variance (ANOVA), analysis of covariance (ANCOVA) (Sanders, 2006), and hierarchical linear models (HLM) (Raudenbush & Bryk, 2002; Rowan et al., 2002; Wright, Sanders, & Rivers, 2006) to estimate these effects. However, researchers criticize these procedures for underestimating teacher effects and modeling students' achievement status instead of changes in achievement (Rowan et al., 2002; Sanders, 2006), as well as for their inability to estimate cumulative teacher effects (Rowan et al., 2002) and deal with missing data (Sanders, 2006). Unfortunately, "assertions about the magnitude of teacher effects on student achievement depend…on the methods used to estimate these effects and on how the findings are interpreted" (Rowan et al., 2002, p. 9).
Cross-classified models (Raudenbush & Bryk, 2002) and the Educational Value-Added Assessment System (EVAAS) model (Sanders, Saxton, & Horn, 1997) are currently recommended over other models to provide estimates of teacher effectiveness. The EVAAS model is a longitudinal linear mixed effects model that has each student serve as his or her own control, similar to the cross-classified model, which models individual growth curves (Sanders, 2006). Using the EVAAS model, Sanders et al. have been able "to produce estimates of school and teacher effects that are free of socioeconomic confoundings and do not require direct measures of these concomitant variables" (Wright, Horn, & Sanders, 1997, p. 58). In fact, Sanders (2000) has shown "that differences in teacher effectiveness is the single largest factor affecting [students'] academic growth" (p. 334); teachers are the dominant factor impacting student progress (Sanders, 2004; Sanders & Horn, 1998; Wright et al., 1997). Darling-Hammond (2000) adds, "Effects of well-prepared teachers on student achievement can be stronger than the influences of student background factors, such as poverty, language background, and minority status" (Conclusions and Implications, ¶ 6). With teacher effectiveness linked to student achievement, questions remain about what factors influence the quality of teaching (Ackerman et al., 2006; Carey, 2004; Frome, Lasater, & Cooney, 2005). Professional development is one factor thought to influence teaching quality (Carey, 2004). Desimone et al. (2002) write, "Professional development is considered an essential mechanism for deepening teachers' content knowledge and developing their teaching practices" (p. 81). 
Various authors list characteristics of effective professional development programs (Desimone et al., 2002; Garet, Porter, Desimone, Birman, & Yoon, 2001; Guskey, 1994; Loucks-Horsley et al., 1996), but rigorous evaluations are needed to determine whether these programs actually affect teaching quality (Blank et al., 2007; Carey, 2004; Guskey, 1994; Hill, 2007a; Loucks-Horsley et al., 1996; NMAP, 2008; Rowan et al., 2002; Shaha et al., 2004). Previous research has examined the relationship between teacher quality and student learning (Ackerman et al., 2006; Blank et al., 2007; Darling-Hammond, 2000; Hill, Rowan, et al., 2005; Presley et al., 2005; Rowan et al., 2002; Wayne & Youngs, 2003), as well as estimated value-added teacher effects (Rowan et al., 2002; Sanders, 2000; Sanders, 2006; Sanders & Horn, 1998; Sanders & Rivers, 1996; Sanders et al., 1997; Wright et al., 2006; Wright et al., 1997). Yet, there is still a need to explore the relationship between teacher development and teacher practices, as well as student learning (Cooney & Bottoms, 2002; Fishman, Marx, Best, & Tal, 2003; Frome et al., 2005; Garet et al., 2001; NMAP, 2008). Existing research tries to estimate the effect of professional development programs with a dichotomous variable (Blank et al., 2007; Desimone et al., 2002; Garet et al., 2001; Shaha, et al., 2004; Stroup, 2007; Stroup & Fang, 2006). Typically, teachers are assigned either a value of one to indicate their participation or a value of zero to indicate their absence of participation in a program. However, the discrete nature of this approach neglects the interactive nature of teachers and the possibility of creeping excellence, where program participants share their newly acquired knowledge and ideas with non-participating teachers. This approach also disregards the teachers' varying degrees of participation and changes in practice. 
Hill (2007a) writes, "Although teachers might be required to engage in professional development, they are not required to learn from it" (p. 123). A teacher's participation in professional development opportunities does not guarantee actual learning or changes in teacher beliefs and practices. Likewise, a teacher's absence from a professional development program does not imply a lack of teaching quality. Instead, estimating the change in a teacher's effect on student achievement after participating in a professional development program offers an alternative approach for estimating the impact of such a program; this approach allows each teacher to serve as his or her own control and helps address the complexities ignored by merely comparing the effects of participating teachers on student achievement to those of non-participating teachers. Professional development programs focus on preparing teachers to meet the recent demands of providing students with quality instruction, but rigorous evaluations are needed to determine whether these programs are actually effective. Value-added modeling techniques provide opportunities to estimate the relationship between teacher development and student learning, but most require student achievement data to be on a single developmental scale over time (McCaffrey, Lockwood, Koretz, & Hamilton, 2003). Typically, available assessment data do not meet such requirements, limiting the analyses that can be conducted. We have already developed an alternative value-added methodology, specifically the use of Z-scores, for analyzing less-than-ideal longitudinal student achievement data collected from a mixture of norm- and criterion-referenced assessments to estimate the impact of a professional development program on student learning. Having applied this model to one district's less-than-ideal student achievement data, we next need to expand to include data from multiple districts.
We also need to explore alternatives to the Z-score approach, including nonparametric ranking and binning by quantile. Few studies have addressed how to use value-added models to analyze achievement data that are not on a single developmental scale (Green, Smith, Heaton, Jiao, & Stroup, under review; Rivkin, Hanushek, & Kain, 2005), and even fewer, perhaps none, have discussed how to use information from multiple instruments in a single year that are on different scales, potentially both within and across instruments. When modeling multiple outcome measures across time, instead of a single measure, parallel-process (multivariate) growth curve models can estimate the relationship between the growth trajectories of the parallel measures and allow researchers to investigate changes in latent factors over time instead of changes in observed scores. We have explored the use of parallel-process methodology, specifically curve-of-factors models, to analyze longitudinal student achievement data simulated from two different assessments in a single subject, such as mathematics, and to estimate teachers' effects on student learning (Green, 2010). The computing power necessary to run these analyses with less-than-ideal data is substantial. We need to build on our simulations to conduct these analyses with actual student achievement data. Our prior work using Z-scores to estimate teacher effects was applied to data from M2, an NSF-funded MSP (2004-2011) (Green, 2010; Green et al., under review). One of the participating school districts uses a combination of norm-referenced and criterion-referenced tests from year to year, making it impossible to construct student achievement trajectories with existing statistical methodology. The estimated impact of participation in M2 did not indicate a significant change in participants' effects on student learning.
One reason for this outcome may be that teacher change is a slow process, and changes to student achievement can take years to develop (Cohen & Hill, 2001). Of notable concern is the potential for ceiling effects with criterion-referenced tests. These tests are constructed to determine whether students meet pre-determined proficiency levels on specific mathematics criteria, and it is possible for students to consistently answer most or all questions correctly. In such instances, the tests are unable to detect changes in a student's achievement across time and, consequently, limit discernment of value-added teacher effects. Thus, some teachers' value-added effects could be underestimated because many students in their classrooms reach the ceiling of the criterion-referenced tests, a limiting feature of the instrument, but not necessarily of the model. This supports the need to further develop value-added models. It may be that using means is really asking the wrong question; the right question may be more along the lines of identifying teachers who do show improvement during and after participation in professional development, and then examining how these teachers differ from those who do not. Ideally, more than student achievement data would go into models for estimating value-added teacher effects. However, no statistical models currently exist that can tie measures of teacher knowledge, beliefs, and practices to student knowledge and beliefs. We plan to develop methodology to connect the Survey of Mathematical Knowledge for Teaching (Hill, Schilling, & Ball, 2004) with student achievement, to investigate whether teachers' gains in content knowledge for teaching mathematics during participation in professional development can be linked to changes in teacher effects on student learning.
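A simple diagnostic for the ceiling problem described above is the fraction of a classroom scoring at or near the test maximum; the sketch below is a hypothetical illustration (the threshold, tolerance, and data are invented for the example):

```python
def ceiling_rate(scores, max_score, tol=0):
    """
    Fraction of students scoring within `tol` points of the test maximum.
    A high rate suggests the test cannot register further growth for this
    class, which can bias value-added estimates downward for teachers of
    high-achieving students.
    """
    at_ceiling = sum(1 for s in scores if s >= max_score - tol)
    return at_ceiling / len(scores)

# Hypothetical criterion-referenced test with a maximum score of 50:
classroom = [48, 50, 50, 49, 50, 47]
rate = ceiling_rate(classroom, max_score=50, tol=1)  # 4 of 6 at or near ceiling
```

Flagging classrooms like this one before fitting a value-added model helps separate limitations of the instrument from limitations of the model, as the paragraph above argues.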
Because curricula and test content vary across grades, as do mobility rates, we plan to explore whether there are notable changes in a student's Z-scores from year to year that could potentially be confounded with changes in mobility rates, curricula, and/or test content. When utilizing value-added models to estimate teacher effects, it is essential to determine whether the goals of the program align with what the instruments assess and acknowledge any limitations that exist. We will address the potential issue of censoring, and carefully consider what data are needed and how much baseline data should be obtained when estimating the impact of a professional development program. Ideally, these methods can be extended to other VAM approaches, as well as other professional development programs, and could eventually be used to establish potential relationships between changes in a teacher's mathematical knowledge for teaching mathematics and changes in student achievement.