- Using Cochran's Z Statistic to Test the Kernel-Smoothed Item Response Functio...
by Yinggan Zheng, Gierl, M. J., Ying Cui
This study combined the kernel smoothing procedure and a nonparametric differential item functioning statistic, Cochran’s Z, to statistically test the difference between the kernel-smoothed item response functions for reference and focal groups. Simulation studies were conducted to investigate the Type I error and power of the proposed kernel-smoothed Cochran’s Z. For the purpose of comparison, the Type I error and power rates with no correction and with regression correction were also included in the simulation. The results of this study suggest that the Type I error and power performance of Cochran’s Z improved when kernel smoothing was applied.
- Factor Loading Estimation Error and Stability Using Exploratory Factor Analysis
by Sass, D. A.
Exploratory factor analysis (EFA) is commonly employed to evaluate the factor structure of measures with dichotomously scored items. Generally, only the estimated factor loadings are provided with no reference to significance tests, confidence intervals, and/or estimated factor loading standard errors. This simulation study assessed factor loading estimation error under several experimental conditions and whether hypothesis testing and/or confidence intervals should be employed. Results revealed that the promax rotation performed well with small interfactor correlations and simple structure data, whereas varimax only performed well with orthogonal factors. Neither rotation method performed well with larger interfactor correlations or approximate simple structure data. An explanation of these results is provided.
- Investigating an Invariant Item Ordering for Polytomously Scored Items
by Ligtvoet, R., van der Ark, L. A., te Marvelde, J. M., Sijtsma, K.
This article discusses the concept of an invariant item ordering (IIO) for polytomously scored items and proposes methods for investigating an IIO in real test data. Method manifest IIO is proposed for assessing whether item response functions intersect. Coefficient HT is defined for polytomously scored items. Given that an IIO holds, coefficient HT expresses the accuracy of the item ordering. Method manifest IIO and coefficient HT are used together to analyze a real data set. Topics for future research are discussed.
- Random Responding as a Threat to the Validity of Effect Size Estimates in Cor...
by Crede, M.
Random responding to psychological inventories is a long-standing concern among clinical practitioners and researchers interested in interpreting idiographic data, but it is typically viewed as having only a minor impact on the statistical inferences drawn from nomothetic data. This article explores the impact of random responding on the size and direction of correlations observed between multi-item inventory scores. Random responses to individual items result in nonrandomly distributed inventory-level scores. Therefore, even low base rates of random responding can significantly affect the statistical inferences made from inventory-level data. Study 1 uses simulations to show that even low base rates of random responding can significantly affect observed correlations, especially when the inventories in question assess low or high base rate phenomena. Study 2 uses archival data to illustrate the moderating effect of random responding on observed correlations in two samples.
- Differential Relationships Between WISC-IV and WIAT-II Scales: An Evaluation ...
by Konold, T. R., Canivez, G. L.
Considerable debate exists regarding the accuracy of intelligence tests with members of different groups. This study investigated differential predictive validity of the Wechsler Intelligence Scale for Children—Fourth Edition. Participants from the WISC-IV—WIAT-II standardization linking sample (N = 550) ranged in age from 6 through 16 years (M = 11.6, SD = 3.2) and varied by the demographic variables of gender, race/ethnicity (Caucasian, African American, and Hispanic), and parent education level (8-11, 12, 13-15, and 16 years). Full Scale IQ and General Ability Index scores from the WISC-IV were used to predict scores on Mathematics, Oral Language, Reading, Written Language, and the total composite on the Wechsler Individual Achievement Test—Second Edition. Differences in prediction were evaluated between demographic subgroups via Potthoff’s technique. Of the 30 simultaneous tests, 25 revealed no statistically significant between group differences. The remaining statistically significant differences were found to have little practical or clinical influence when effect size estimates were considered. Results are discussed in the context of other ability measures that were previously investigated for differential validity as well as educational implications for clinicians.
- The Motivation at Work Scale: Validation Evidence in Two Languages
by Gagne, M., Forest, J., Gilbert, M.-H., Aube, C., Morin, E., Malorni, A.
The Motivation at Work Scale (MAWS) was developed in accordance with the multidimensional conceptualization of motivation postulated in self-determination theory. The authors examined the structure of the MAWS in a group of 1,644 workers in two different languages, English and French. Results obtained from these samples suggested that the structure of motivation at work across languages is consistently organized into four different types: intrinsic motivation, identified regulation, introjected regulation, and external regulation. The MAWS subscales were predictably associated with organizational behavior constructs. The importance of this new multidimensional scale to the development of new work motivation research is discussed.
- Measuring Situational Interest in Academic Domains
by Linnenbrink-Garcia, L., Durik, A. M., Conley, A. M., Barron, K. E., Tauer, J. M., Karabenick, S. A., Harackiewicz, J. M.
Three studies were conducted to develop and validate scores on a new measure appropriate for assessing adolescents’ situational interest (SI) across various academic settings. In Study 1 (n = 858), a self-report questionnaire was administered to undergraduates in introductory psychology. Confirmatory factor analyses (CFA) supported a three-factor model that differentiated between interest generated by (a) the presentation of course material that grabbed students’ attention (triggered-SI), (b) the extent to which the material itself was enjoyable and engaging (maintained-SI-feeling), and (c) whether the material was viewed as important and valuable (maintained-SI-value). CFA analyses in Study 2 (n = 284) and Study 3 (n = 246) also supported the three-factor situational interest model for middle and high school students in mathematics. Moreover, situational interest was shown to be distinct from individual interest and was a statistically significant predictor of change in individual interest across the school year.
- Discriminant Validity of Self-Reported Emotional Intelligence: A Multitrait-M...
by Joseph, D. L., Newman, D. A.
A major stumbling block for emotional intelligence (EI) research has been the lack of adequate evidence for discriminant validity. In a sample of 280 dyads, self- and peer-reports of EI and Big Five personality traits were used to confirm an a priori four-factor model for the Wong and Law Emotional Intelligence Scale (WLEIS) and a five-factor model for Goldberg’s International Personality Item Pool (IPIP). After demonstrating measurement equivalence between self-report and peer-report for both scales, the authors show discriminant validity between the four EI subfacets and Big Five personality traits. This is accomplished through a series of structural equation models fit to the multitrait-multimethod matrix. Despite their conclusion of discriminant validity, the authors note strong latent correlations between Others’ Emotion Appraisal and trait Agreeableness ( = .87), between Use of Emotion and trait Conscientiousness ( = .73), between Regulation of Emotion and trait Neuroticism ( = –.66), and between Self Emotion Appraisal and trait Neuroticism ( = –.66). There is also post hoc evidence of potential leniency in self-reported emotion regulation. Results point to the utility of peer-report methods as well as the relative construct validity of various subfacets of self-reported emotional competence.
- Factor Structure Analysis of the Schutte Self-Report Emotional Intelligence S...
by Ng, K.-M., Chuang Wang, Kim, D.-H., Bodenhorn, N.
The authors investigated the factor structure of the Schutte Self-Report Emotional Intelligence (SSREI) scale on international students. Via confirmatory factor analysis, the authors tested the fit of the models reported by Schutte et al. and five other studies to data from 640 international students in the United States. Results show that although Gignac, Palmer, Manocha, and Stough’s modified hypothesized nested model fit the sample data, this model was not parsimonious. As a result, this study proposed a new model that also fitted the data. Results further indicate convergent and concurrent criterion-related validities and reliability of the model. The findings support the use of the modified SSREI for international students.
- A Comparison of Approaches for Improving the Reliability of Objective Level S...
by Skorupski, W. P., Carvajal, J.
This study is an evaluation of the psychometric issues associated with estimating objective level scores, often referred to as “subscores.” The article begins by introducing the concepts of reliability and validity for subscores from statewide achievement tests. These issues are discussed with reference to popular scaling techniques, classical test theory, and item response theory. Methods for increasing the reliability of subscore estimates that have been suggested in the literature are then reviewed. Based on this review, an empirical study comparing some of the more promising procedures was conducted. Test score data from a large statewide testing program were analyzed in this study. The comparison of subscore augmentation approaches found that generally all methods were very successful in dramatically increasing the reliability of subscore estimates. However, this increase was accompanied by near-perfect correlations among the subscore estimates. This finding called into question the validity of the resultant subscores, and therefore the usefulness of the subscore augmentation process. Implications for practice are discussed.
- A Monte Carlo Study of Eight Confidence Interval Methods for Coefficient Alpha
by Romano, J. L., Kromrey, J. D., Hibbard, S. T.
The purpose of this research is to examine eight of the different methods for computing confidence intervals around alpha that have been proposed to determine which of these, if any, is the most accurate and precise. Monte Carlo methods were used to simulate samples under known and controlled population conditions. In general, the differences in the accuracy and precision of the eight methods examined were negligible in many conditions. For the breadth of conditions examined in this simulation study, the methods that proved to be the most accurate were those proposed by Bonett and Fisher. Larger sample sizes and larger coefficient alphas also resulted in better interval coverage, whereas smaller numbers of items resulted in poorer interval coverage.
- Initial Scale Development: Sample Size for Pilot Studies
by Johanson, G. A., Brooks, G. P.
Pilot studies are often recommended by scholars and consultants to address a variety of issues, including preliminary scale or instrument development. Specific concerns such as item difficulty, item discrimination, internal consistency, response rates, and parameter estimation in general are all relevant. Unfortunately, there is little discussion in the extant literature of how to determine appropriate sample sizes for these types of pilot studies. This article investigates the choice of sample size for pilot studies from a perspective particularly related to instrument development. Specific recommendations are made for researchers regarding how many participants they should use in a pilot study for initial scale development.
- The Multilevel Crossed Random Effects Growth Model for Estimating Teacher and...
by Palardy, G. J.
This article examines the multilevel linear crossed random effects growth model for estimating teacher and school effects from repeated measurements of student achievement. Results suggest that even a small degree of unmodeled nonlinearity can result in a substantial upward bias in the magnitude of the teacher effect, which raises concerns about its appropriateness for estimating teacher effects. To address this issue, a piecewise linear crossed random effect growth model is proposed. A comparison with the linear growth form shows that the piecewise specification provides more accurate estimates of teacher effects when achievement growth departs from linear growth across grade levels or over summer, which are prevalent conditions. Fitted examples using nationally representative data and Bayesian estimation methods are provided.
- Bayesian Estimation of Graded Response Multilevel Models Using Gibbs Sampling...
by Natesan, P., Limbers, C., Varni, J. W.
The present study presents the formulation of graded response models in the multilevel framework (as nonlinear mixed models) and demonstrates their use in estimating item parameters and investigating the group-level effects for specific covariates using Bayesian estimation. The graded response multilevel model (GRMM) combines the formulation of graded response models with the discrimination parameter fixed at one for all items by Tuerlinckx and Wang and of two parameter models by Rijmen and Briggs to offer graded response models with item-specific discrimination parameters. Apart from the contribution to the body of knowledge by formulating GRMMs, the significance of the present study includes providing a meeting point between psychometrics and statistics, overcoming the Neyman-Scott problem by using Bayesian estimation, estimation of abilities of persons with extreme scores, and demonstration of general purpose software for estimating item response theory parameters. Data from the emotional functioning scale on 11,158 healthy and chronically ill children and adolescents were used from the PedsQL 4.0 Generic Core Scales database to illustrate the model. Estimates for the item parameters from WINBUGS using Bayesian priors and Multilog were compared for the GRMM and the ordinary graded response models, respectively.
- Students' Group and Member Attachment to Their University: A Construct Validi...
by France, M. K., Finney, S. J., Swerdzewski, P.
This study examined the psychometric properties of scores from the University Attachment Scale, a measure that operationalizes group and member attachment as two separate dimensions of attachment to a university. A two-factor model was championed over a one-factor model, providing evidence of a distinction between university attachment and member attachment. Relationships with external criteria provided further support for this distinction and construct validity evidence. As predicted, “involved” students had practically and statistically significantly higher group attachment than “noninvolved” students. Furthermore, transfer students had practically and statistically significantly lower member attachment than nontransfer students. Additionally, there was a statistically significant positive relationship between students’ perceived cohesion to the university and both group and member attachment. Overall, the authors believe that this is a promising new measure of university attachment.