Volume 4, Article 7
                    October 2000
 
TUTORIAL

 
 
STRUCTURAL EQUATION MODELING AND REGRESSION:
GUIDELINES FOR RESEARCH PRACTICE

David Gefen
Management Department
LeBow College of Business
Drexel University

gefend@drexel.edu

Detmar W. Straub
Department of Computer Information Systems
Robinson College of Business
Georgia State University

Marie-Claude Boudreau
Management Information System Department
Terry College of Business
University of Georgia

ABSTRACT

The growing interest in Structural Equation Modeling (SEM) techniques and recognition of their importance in IS research suggest the need to compare and contrast different types of SEM techniques so that research designs can be selected appropriately. After assessing the extent to which these techniques are currently being used in IS research, the article presents a running example which analyzes the same dataset via three very different statistical techniques. It then compares two classes of SEM: covariance-based SEM and partial-least-squares-based SEM. Finally, the article discusses linear regression models and offers guidelines as to when SEM techniques and when regression techniques should be used. The article concludes with heuristics and rule-of-thumb thresholds to guide practice, and a discussion of the extent to which practice is in accord with these guidelines.
Keywords: IS research methods; measurement; metrics; guidelines; heuristics; structural equation modeling (SEM); LISREL; PLS; regression; research techniques; theory development; construct validity; research rules of thumb and heuristics; formative constructs; reflective constructs.

Note: The paper is written in such a way that readers with basic knowledge of multivariate statistics can follow the logic and examples. It does not assume the reader is already conversant with LISREL, PLS, or other SEM tools. This tutorial contains:
 


Because of the large number of notes associated with this paper, they are presented as end notes at the end of this paper rather than as footnotes.


I. INTRODUCTION

Structural Equation Modeling (SEM) techniques such as LISREL and Partial Least Squares (PLS) are second generation data analysis techniques that can be used to test the extent to which IS research meets recognized standards for high quality statistical analysis. That is to say, they test for statistical conclusion validity. Contrary to first generation statistical tools such as regression, SEM enables researchers to answer a set of interrelated research questions in a

 
 

Figure 1. The TAM Model

Unlike first generation regression tools, SEM not only assesses
THE EXTENT TO WHICH SEM IS BEING USED
Not surprisingly, SEM tools are increasingly being used in behavioral science research for the causal modeling of complex, multivariate data sets in which the researcher gathers multiple measures of proposed constructs. Indeed, even a casual glance at the IT literature suggests that SEM has become de rigueur in validating instruments and testing linkages between constructs.

Before describing in greater depth the methods and approaches adopted in SEM vis-à-vis regression, it is useful to know the extent to which SEM is currently being used in IS research. The results of analyzing techniques used in empirical articles in three major IS journals (MIS Quarterly, Information & Management and Information Systems Research) during the four-year period between January 1994 and December 1997 are shown in Table 1. Consistent with Straub, the qualifying criteria for the sample were that the article employed either:

  • correlation or statistical manipulation of variables or
  • some form of data analysis, even if the data analysis was simply descriptive statistics.

    Table 1. Use of Structural Equation Modeling Tools 1994-1997

    SEM Approaches | I&M (n=106) | ISR (n=27) | MISQ (n=38) | All Three Journals
    PLS | 2% | 19% | 11% | 7%
    LISREL | 3% | 15% | 11% | 7%
    Other * | 3% | 11% | 3% | 4%
    Total % | 8% | 45% | 25% | 18%
    * Other includes SEM techniques such as AMOS and EQS.

    Studies using archival data (e.g., citation analysis) or unobtrusive measures (e.g., computer system accounting measures) were omitted from the sample unless it was clear from the methodological description that key variable relationships being studied could have been submitted to validation procedures. The number of articles published by each journal (n) and the percentage using SEM techniques are shown in the table. Most of the 171 articles selected were field studies (74%); the remainder were field experiments (6%), laboratory experiments (15%) and case studies (5%) that used quantitative data.

    Table 1 clearly shows that SEM has been used with some frequency for validating instruments and testing linkages between constructs in two of three widely known IS journals. In ISR, 45% of the positivist, empirically-based articles used SEM; in MISQ, it was 25%. From the first appearance of SEM in 1990 in the major IS journals, usage grew steadily. By the mid-1990s SEM was being used in about 18% of empirical articles across the three journals, with PLS and LISREL being the two most common techniques. Other SEM tools, such as EQS and AMOS, were used less often, but this is most likely because of the slowness of diffusion of innovation and is not a statement about the power or capability of these particular packages.

              WHAT IS IN THIS PAPER
    To help the reader understand the differences among LISREL, PLS, and linear regression, this article presents a running example of the analysis of a Technology Acceptance Model (TAM) dataset that uses these three statistical techniques. The running example begins in Section II. It can be skimmed or skipped by readers familiar with the three techniques.

    Despite increased interest and the growing literature on individual SEM models, there is no comprehensive guide for researchers on when a specific form of SEM should be employed. To inform research practice and to explore the dimensions of the problem, Section III compares the two most widely used SEM models in the IT literature: LISREL and PLS. LISREL and PLS represent the two distinct SEM techniques, covariance-based SEM and partial-least-squares-based SEM, respectively.

    In Section IV, the paper summarizes the major assumptions of the two SEM models. Based on this analysis, guidelines are presented in Section V for when to choose one of the two SEM models or one of the first generation regression models.

    A summary of the major guidelines in Sections III, IV, and V, is presented below in Tables 2 and 3. Table 2 summarizes the objective behind each technique and limitations relating to sample size and distribution. A detailed discussion with citations on these issues can be found in Overview of Analytical Techniques in Section III. Table 3 summarizes guidelines based on the capabilities of each technique. These guidelines are discussed in detail and with citations in The SEM Model, also in Section III.

    Table 2. Comparative Analysis between Techniques

    Issue | LISREL | PLS | Linear Regression
    Objective of Overall Analysis | Show that the null hypothesis of the entire proposed model is plausible, while rejecting path-specific null hypotheses of no effect. | Reject a set of path-specific null hypotheses of no effect. | Reject a set of path-specific null hypotheses of no effect.
    Objective of Variance Analysis | Overall model fit, such as insignificant chi-square or high AGFI. | Variance explanation (high R-square). | Variance explanation (high R-square).
    Required Theory Base | Requires sound theory base. Supports confirmatory research. | Does not necessarily require sound theory base. Supports both exploratory and confirmatory research. | Does not necessarily require sound theory base. Supports both exploratory and confirmatory research.
    Assumed Distribution | Multivariate normal, if estimation is through ML. Deviations from multivariate normal are supported with other estimation techniques. | Relatively robust to deviations from a multivariate normal distribution. | Relatively robust to deviations from a multivariate normal distribution, with established methods of handling non-normal distributions.
    Required Minimal Sample Size | At least 100-150 cases. | At least 10 times the number of items in the most complex construct. | Supports smaller sample sizes, although a sample of at least 30 is required.
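
The sample-size row of Table 2 can be turned into a quick check. The sketch below is illustrative only: the function names are hypothetical, "most complex construct" is read here as the construct with the most items (an assumption), and the item counts and sample size are those of the running example described in Section II.

```python
# Rule-of-thumb minimum sample sizes from Table 2 (heuristics, not guarantees).
LISREL_MIN = 100      # lower end of the "at least 100-150 cases" range
REGRESSION_MIN = 30   # "a sample of at least 30 is required"

def pls_min_sample(items_per_construct):
    """PLS heuristic: 10 times the number of items in the most complex construct.

    Assumption: 'most complex' is taken to mean the construct with the most items.
    """
    return 10 * max(items_per_construct.values())

def adequate_sample(n, items_per_construct):
    """Return, per technique, whether n meets the Table 2 rule of thumb."""
    return {
        "LISREL": n >= LISREL_MIN,
        "PLS": n >= pls_min_sample(items_per_construct),
        "Linear regression": n >= REGRESSION_MIN,
    }

# Item counts of the running example: PU1-PU6, EOU1-EOU6, IUSE1-IUSE3.
tam_items = {"PU": 6, "EOU": 6, "IUSE": 3}
print(pls_min_sample(tam_items))
print(adequate_sample(160, tam_items))
```

Meeting these minimums is a necessary screen rather than evidence that the sample is large enough for a particular model.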

    Table 3. Capabilities by Research Approach

    Capabilities | LISREL | PLS | Regression
    Maps paths to many dependent (latent or observed) variables in the same research model and analyzes all the paths simultaneously rather than one at a time. | Supported | Supported | Not supported
    Maps specific and error variance of the observed variables into the research model. | Supported | Not supported | Not supported
    Maps reflective observed variables. | Supported | Supported | Supported
    Maps formative observed variables. | Not supported | Supported | Not supported
    Permits rigorous analysis of all the variance components of each observed variable (common, specific, and error) as an integral part of assessing the structural model. | Supported | Not supported | Not supported
    Allows setting of non-common variance of an observed variable to a given value in the research model. | Supported | Not supported | Supported by adjusting the correlation matrix.
    Analyzes all the paths, both measurement and structural, in one analysis. | Supported | Supported | Not supported
    Can perform a confirmatory factor analysis. | Supported | Supported | Not supported
    Provides a statistic to compare alternative confirmatory factor analysis models. | Supported | Not supported | Not supported

     
    II. RUNNING EXAMPLE OF USE OF SEM VERSUS FIRST GENERATION STATISTICAL TECHNIQUES
    For those IS researchers who are not familiar with SEM, this section presents a sample analysis of a typical dataset that uses the three techniques discussed in this article:
        1. linear regression
        2. LISREL
        3. PLS
    TAM AS DOMAIN FOR RUNNING EXAMPLE
    The domain of the running example is the Technology Acceptance Model (TAM), a widely researched theoretical model that attempts to explain the adoption of new information technologies. A partial listing of previous TAM studies, presented in Appendix A, shows the extent to which this model has been examined in IS research. TAM, based on the Theory of Reasoned Action, is a straightforward model of IT adoption that contends that beliefs such as system perceived usefulness (PU) and perceived ease-of-use (EOU) impact:
        1. attitudes toward use,
        2. intentions to use (IUSE), and ultimately
        3. IT acceptance (most often measured as utilization).
    Figure 1, shown in Section I and repeated below, illustrates the basic research model used throughout this tutorial. The causal linkages in TAM are thoroughly explained in the literature and need not be repeated here. Suffice it to say, TAM studies typically involve up to three hypotheses


    Figure 1. Basic TAM Model Used as Running Example


    associated with these fundamental constructs (Table 4). First, PU is expected to influence outcome variables such as intention to use the system (see H1). Researchers in this research stream choose outcomes depending on the questions they are investigating and the research methods they have selected. Attitudes toward use are also chosen as DVs (dependent variables), as are several standard IT use variables. The latter relationship is, perhaps, the most consistent finding in TAM studies with self-reported usage variables (see Straub, Limayem, and Karahanna; this relationship raises a serious question about the possibility of common methods variance in most TAM studies). Moreover, it has come to represent the most interesting derivative work trying to explain the conditions and antecedents to PU and EOU.
    Table 4. Typical TAM Hypotheses
    Hypothesis
    H1 PU will impact the system outcome construct, Intention to Use the System.
    H2 EOU will impact the system outcome construct, Intention to Use the System.
    H3 EOU will impact PU.
    In the original TAM studies by Davis and Davis et al., EOU was also thought to influence User Acceptance (a surrogate for IT Usage). With respect to H2 in Table 4, these studies and subsequent studies did not find consistent results. One empirically-derived explanation for why EOU did not produce invariant effects on system outcomes was offered by Davis. He argued that EOU may affect system outcomes only through the intermediate or intervening variable PU (i.e., H3). His experiment confirmed this statistical explanation, which has also been posited and confirmed by later research (e.g., Adams et al., 1992, Gefen, 2000, Gefen and Straub, 2000, Keil et al., 1995, Venkatesh and Davis, 1994).

    While a literature review and in-depth discussion of the TAM research model are not necessary here, elaboration of the measurement and data gathering are relevant. The instrument used to collect the data is shown in Appendix B. While the measures are based on previously validated instruments in the literature, the current study re-validates these measures, as recommended by Straub.

     METHODOLOGY

    To test TAM via the three statistical techniques, we conducted a free simulation experiment with student subjects. As indicated in Appendix B, subjects were asked to use the Internet during the laboratory experiment to access Travelocity.com, thoroughly review the site, and then answer questions about it. In free simulation experiments, subjects are placed in a real-world situation and then asked to make decisions and choices as part of the experiment. Since there are no preprogrammed treatments, the experiment allows the values of the IVs (independent variables) to range over the natural range of the subject’s experience. In effect, the experimental tasks induce subject responses, which are then measured via the research instrument.

    Subjects were students taking MBA courses at the LeBow College of Business at Drexel University, a large accredited urban research university in Philadelphia. Most of the subjects were well acquainted with commercial Web sites where products and services are offered for sale, so the technology itself was not a novelty to them. Many were also familiar with the specific Web site selected for study, Travelocity.com. To permit controlling for possible effects from prior experience, we also measured the extent of this activity for each subject. One hundred and sixty subjects took part in the experiment. The exercise was optional for the course, which can be interpreted to mean that there should be no confounding effects from coercing subjects into participation. Participation in the experiment was voluntary and the students were not rewarded for taking part in it. Even so, 93% of the students volunteered to take part in the study.

    DATA ANALYSIS USING LINEAR REGRESSION
    Because linear regression cannot test all three relationships in a single statistical test, it is necessary to use two separate regressions to test the model fully. In regression #1, IUSE is the dependent variable and PU and EOU are independent variables. In regression #2, PU is regressed on EOU as the only independent variable. To perform linear regression analysis on the data, the researcher must first create an index for each of the constructs or variables. As shown in Appendix B, the index represents the value of the construct by averaging the subject responses to items PU1-PU6 for PU, items EOU1-EOU6 for EOU, and items IUSE1-IUSE3 for IUSE.
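
The two-regression procedure described above can be sketched in a few lines. This is a minimal illustration, not the analysis reported in this tutorial: the responses below are hypothetical (they are not the study's data), the helper names `construct_index` and `ols` are made up for the example, and the coefficients are unstandardized with no significance tests.

```python
from statistics import mean

def construct_index(item_responses):
    """Average one subject's item responses into a single construct score."""
    return mean(item_responses)

def ols(y, predictors):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    n = len(y)
    cols = [[1.0] * n] + [[float(v) for v in col] for col in predictors]
    k = len(cols)
    # Augmented matrix [X'X | X'y]
    a = [[sum(ci[r] * cj[r] for r in range(n)) for cj in cols]
         + [sum(ci[r] * y[r] for r in range(n))]
         for ci in cols]
    # Gauss-Jordan elimination with partial pivoting
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[p] = a[p], a[i]
        pivot = a[i][i]
        a[i] = [v / pivot for v in a[i]]
        for r in range(k):
            if r != i:
                factor = a[r][i]
                a[r] = [vr - factor * vi for vr, vi in zip(a[r], a[i])]
    return [row[-1] for row in a]

# Hypothetical 7-point responses for six subjects (NOT the study's data).
pu_items = [[5, 6, 5, 6, 6, 5], [3, 4, 3, 3, 4, 4], [6, 7, 6, 7, 7, 6],
            [2, 3, 2, 2, 3, 3], [4, 5, 4, 4, 5, 5], [5, 5, 6, 5, 6, 6]]
eou_items = [[4, 5, 4, 5, 5, 4], [4, 4, 3, 4, 4, 4], [6, 6, 7, 6, 6, 7],
             [3, 2, 3, 3, 2, 3], [5, 4, 5, 5, 4, 4], [3, 4, 3, 4, 3, 4]]
iuse_items = [[5, 6, 5], [3, 3, 4], [7, 6, 7], [2, 2, 3], [4, 4, 5], [5, 6, 6]]

# Step 1: one index (average) per construct per subject.
pu = [construct_index(s) for s in pu_items]
eou = [construct_index(s) for s in eou_items]
iuse = [construct_index(s) for s in iuse_items]

# Step 2: two separate regressions, because one run cannot cover all three paths.
reg1 = ols(iuse, [pu, eou])   # regression #1: IUSE on PU and EOU (H1, H2)
reg2 = ols(pu, [eou])         # regression #2: PU on EOU (H3)
print("Regression #1 [intercept, PU, EOU]:", [round(b, 3) for b in reg1])
print("Regression #2 [intercept, EOU]:", [round(b, 3) for b in reg2])
```

Note that nothing links the two runs: each is estimated as if the other did not exist, which is the limitation that the SEM techniques discussed later address.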

    The findings from the statistical tests are shown in Figure 2. As is common in the literature, H1 and H3 are significant and in the posited directions while H2 is not. Using an index (average) for the constructs in the TAM testing is acceptable because the items making up the instrument scales were tested to ensure that they formed strong unities and demonstrate good measurement properties (construct validity and reliability). The tests most frequently used are factor and reliability analyses. In this case, a Principal Components Analysis (PCA) of the primary research constructs showed extremely clean loadings in the factor structure, as depicted in Table 5. The only loading that was marginal was PU1, which was still above the commonly cited .40 minimum loading level. The reliabilities reported are Cronbach’s αs, and all are well above the cited minimums of .60 or .70. Note that in the example all six PU items are included. Had PU1 been dropped, the factor analyses in PCA, LISREL, and PLS would have shown a cleaner factor-loading pattern. (The same item also cross-loaded on the EOU factor in other e-commerce studies.) The item was included because dropping it does not change the regression patterns and the objective is to use established scales "as is" in this demonstration.
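
For readers who have not computed a reliability coefficient by hand, the following sketch shows the usual formula for Cronbach's α: k/(k-1) times one minus the ratio of the summed item variances to the variance of the summed scale. The function name is hypothetical and the responses are made up for illustration; they are not the study's data and will not reproduce the values in Table 5.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a scale.

    rows: one list of item responses per subject (same items, same order).
    """
    k = len(rows[0])
    # Sample variance of each item across subjects
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of each subject's summed score
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses of six subjects to a three-item scale (NOT the study's data).
iuse_items = [[5, 6, 5], [3, 3, 4], [7, 6, 7], [2, 2, 3], [4, 4, 5], [5, 6, 6]]
print(round(cronbach_alpha(iuse_items), 2))
```

The resulting value would then be compared with the .60 or .70 minimums cited above.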



     
     

    Figure 2. TAM Causal Path Findings via Linear Regression Analysis

    Table 5. Factor Analysis and Reliabilities for Example Dataset

    Construct | Item | Factor 1 | Factor 2 | Factor 3 | Cronbach’s Alpha
    Perceived Usefulness (PU) | PU1 | .543 | .277 | .185 | .91
     | PU2 | .771 | .178 | .053 |
     | PU3 | .827 | .315 | .185 |
     | PU4 | .800 | .268 | .234 |
     | PU5 | .762 | .352 | .236 |
     | PU6 | .844 | .437 | .290 |
    Perceived Ease-of-Use (EOU) | EOU1 | .265 | .751 | .109 | .93
     | EOU2 | .217 | .774 | .150 |
     | EOU3 | .270 | .853 | .103 |
     | EOU4 | .303 | .787 | .105 |
     | EOU5 | .248 | .831 | .179 |
     | EOU6 | .242 | .859 | .152 |
    Intention To Use (IUSE) | IUSE1 | .183 | .147 | .849 | .80
     | IUSE2 | .224 | .062 | .835 |
     | IUSE3 | .139 | .226 | .754 |
    Rotation Method: Varimax with Kaiser Normalization (Rotation converged in 6 iterations)

    DATA ANALYSIS USING LISREL

    To estimate coefficients, researchers employing LISREL typically use a different algorithm than the algorithm used for linear regression. Instead of minimizing variance as in regression, the most common LISREL estimation method maximizes likelihood. The differences between the typical LISREL approach and that of regression will be examined in greater detail later in the paper. For the moment, it is sufficient to say that the preliminary factor and reliability analyses that are required to legitimate indices in linear regression are not necessary in SEM techniques like LISREL and PLS because the testing of measurement properties of the instruments is simultaneous with the testing of hypotheses. The coefficients in LISREL can be read in a manner very similar to regression, that is, the standardized coefficients, known as betas and gammas, indicate the relative strength of the statistical relationships. And the loadings from the instrument items to the constructs (termed "latent" variables in SEM) can, once one recalibrates the scaling and examines the t-values, be interpreted in a similar manner to factor analysis.
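
As a compact statement of the contrast drawn above, the two estimation criteria can be written side by side. The notation is the conventional one from the SEM literature rather than anything defined in this tutorial, so the symbols are assumptions introduced for illustration: S is the sample covariance matrix of the p observed items, Σ(θ) is the covariance matrix implied by the model parameters θ, and N is the sample size.

```latex
% Ordinary least squares (regression): minimize the sum of squared residuals
\hat{\beta}_{OLS} = \arg\min_{\beta} \sum_{i=1}^{N} \left( y_i - \mathbf{x}_i^{\top}\beta \right)^2

% Maximum likelihood (covariance-based SEM): minimize the discrepancy between
% the sample covariance matrix S and the model-implied covariance matrix
F_{ML}(\theta) = \ln\lvert\Sigma(\theta)\rvert
               + \operatorname{tr}\!\left( S\,\Sigma(\theta)^{-1} \right)
               - \ln\lvert S\rvert - p

% The overall model chi-square is a rescaling of the minimized discrepancy
\chi^2 = (N - 1)\, F_{ML}(\hat{\theta})
```

The first criterion is evaluated case by case on a single dependent variable; the second is evaluated on the whole covariance matrix, which is why one run can cover every path in the model.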

    We will discuss the LISREL findings in the same order in which the findings were discussed in the regression analysis. Unlike regression, however, it is only necessary to conduct a single LISREL run, in that the technique can consider the underlying structural relationships of all the latent variables at once. Moreover, it can also estimate the strength of the measurement items in loading on their posited latent variable or construct. Using the same dataset as in the regression runs (plus factor analysis and reliability tests), a single LISREL run produced the results shown in Figure 3 and Table 6. The SMC in Figure 3 is the LISREL equivalent of an R-square in linear regression. It shows the percent of explained variance in the latent variable [Bollen, 1989].
     
     

    Figure 3. TAM Standardized Causal Path Findings via LISREL Analysis

    Table 6. Standardized Loadings and Reliabilities in LISREL Analysis

    Construct | Item | PU loading (error) | EOU loading (error) | IUSE loading (error) | Reliability Coefficient
    Perceived Usefulness (PU) | PU1 | 0.99 (.50) | | | .95
     | PU2 | 1.10 (.39)** | | |
     | PU3 | 0.93 (.45)** | | |
     | PU4 | 1.07 (.26)** | | |
     | PU5 | 1.10 (.29)** | | |
     | PU6 | 1.11 (.24)** | | |
    Perceived Ease-of-Use (EOU) | EOU1 | | 0.78 (.45) | | .94
     | EOU2 | | 0.95 (.38)** | |
     | EOU3 | | 0.92 (.25)** | |
     | EOU4 | | 0.99 (.31)** | |
     | EOU5 | | 1.00 (.27)** | |
     | EOU6 | | 0.94 (.21)** | |
    Intention To Use (IUSE) | IUSE1 | | | 1.36 (.34) | .95
     | IUSE2 | | | 2.17 (.38)** |
     | IUSE3 | | | 1.15 (.53)** |
    The first item loading in each latent variable is fixed at 1.00 and does not have a t-value.
    ** Significant at the .01 level
     
    As in the regression analysis, H1 and H3 are significant and in the posited directions. H2, likewise, is not significant. Moreover, LISREL provides several indications of the extent to which the sampled data fits the researcher-specified model. In this case, both the ratio of the chi-square to the degrees of freedom (160.17/87=1.84) and the adjusted goodness of fit (AGFI) index (.84) tell the researcher that the model is a reasonably good-fitting model. Finally, due to the low standardized root mean square residual (RMR), it is not unreasonable to conclude that the data fits the model. Dropping PU1 significantly improves the fit indexes (almost all the published LISREL analyses of TAM have dropped items). So that readers can make straightforward comparisons, we will use the same tabular format as Table 5 to present the LISREL-generated factor loadings and reliabilities. Table 6 shows that the measurement properties for the instrument items using the confirmatory factor analysis (CFA) capability of LISREL are remarkably similar to those of the PCA performed earlier. All meet a standard for significance at the .01 level. The reliabilities are likewise respectable.
    More details about each of these statistics are given below, but it is sufficient to point out at this time that the results of the LISREL analysis are in complete accord with those of the regression analysis. The primary difference that the reader may wish to take note of is that when all of the causal paths are tested in the same model, there is not a statistical issue with the lack of connection between runs, which characterizes all regression analyses. It is possible in regression, for example, to misinterpret the underlying causality in that no single run can partial out all the variance in complex research models.
    DATA ANALYSIS USING PLS
    In estimating its coefficients, PLS uses algorithms that have elements in common with both linear regression and LISREL. Like regression, it works with the variance of the individual data item from the means. In partialing out variance for the entire research model via iterative analysis, PLS resembles LISREL. In fact, it is this latter characteristic, that it works with the entire structure of the research model, that allows it to be categorized as a SEM technique.

    Coefficients in PLS, shown in Figure 4, can be read in a manner very similar to regression and LISREL, that is, the standardized coefficients indicate the relative strength of the statistical relationships. Moreover, loadings from the instrument items to the constructs can also be interpreted in a similar manner to the PCA that precedes regression runs and the CFA that is utilized in LISREL. Using the same dataset as in the two previous analyses, a single PLS run produced the results shown in Figure 4 and Tables 7 and 8.
     


    Figure 4. TAM Causal Path Findings via PLS Analysis

    Table 7. Loadings in PLS Analysis

    Construct | Item | PU | EOU | IUSE
    Perceived Usefulness (PU) | PU1 | .776** | .613 | .405
     | PU2 | .828** | .498 | .407
     | PU3 | .789** | .448 | .302
     | PU4 | .886** | .558 | .353
     | PU5 | .862** | .591 | .451
     | PU6 | .879** | .562 | .406
    Perceived Ease-of-Use (EOU) | EOU1 | .534 | .802** | .323
     | EOU2 | .557 | .839** | .338
     | EOU3 | .467 | .886** | .260
     | EOU4 | .562 | .843** | .289
     | EOU5 | .542 | .865** | .304
     | EOU6 | .508 | .889** | .288
    Intention To Use (IUSE) | IUSE1 | .350 | .270 | .868**
     | IUSE2 | .380 | .234 | .858**
     | IUSE3 | .336 | .280 | .814**
    N.B. A reliability statistic is not automatically produced in PLS.
    ** Significant at the .01 level

    Table 8. AVE and Correlation Among Constructs in PLS Analysis

    AVE/Correlation | IUSE | PU | EOU
    IUSE | .721 | |
    PU | .468 | .742 |
    EOU | .359 | .632 | .738
    As before, H1 and H3 are significant while H2 is not. While there are no overall model fit statistics produced by PLS, it can estimate t-values for the loadings utilizing either a jackknife or bootstrap technique. The loadings and the significance level of their t-values are shown in Table 7. Note that item loadings on their respective construct are presented by PLS, but that cross-loadings need to be calculated as the correlation of each standardized item with its factor scores on the constructs. Assessing the confirmatory factor analysis in PLS is then done by verifying that the AVE (discussed later) of each construct is larger than its correlations with the other constructs and that each item loading in the factor analysis is much higher on its assigned construct (factor) than on the other constructs. Table 8 shows the correlation and AVE table. The AVE is presented on the diagonal.
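
The two checks just described can be applied mechanically to the values in Tables 7 and 8. The sketch below is one way to do so; the function names are hypothetical, the numbers are copied from the tables, and "much higher" is reduced here to a simple "higher than" comparison plus the smallest observed gap, since the text gives no numeric threshold.

```python
# AVE (diagonal) and construct correlations (off-diagonal) as reported in Table 8.
ave = {"IUSE": 0.721, "PU": 0.742, "EOU": 0.738}
corr = {("PU", "IUSE"): 0.468, ("EOU", "IUSE"): 0.359, ("EOU", "PU"): 0.632}

# Loadings and cross-loadings as reported in Table 7: item -> (own construct, loadings).
loadings = {
    "PU1": ("PU", {"PU": 0.776, "EOU": 0.613, "IUSE": 0.405}),
    "PU2": ("PU", {"PU": 0.828, "EOU": 0.498, "IUSE": 0.407}),
    "PU3": ("PU", {"PU": 0.789, "EOU": 0.448, "IUSE": 0.302}),
    "PU4": ("PU", {"PU": 0.886, "EOU": 0.558, "IUSE": 0.353}),
    "PU5": ("PU", {"PU": 0.862, "EOU": 0.591, "IUSE": 0.451}),
    "PU6": ("PU", {"PU": 0.879, "EOU": 0.562, "IUSE": 0.406}),
    "EOU1": ("EOU", {"PU": 0.534, "EOU": 0.802, "IUSE": 0.323}),
    "EOU2": ("EOU", {"PU": 0.557, "EOU": 0.839, "IUSE": 0.338}),
    "EOU3": ("EOU", {"PU": 0.467, "EOU": 0.886, "IUSE": 0.260}),
    "EOU4": ("EOU", {"PU": 0.562, "EOU": 0.843, "IUSE": 0.289}),
    "EOU5": ("EOU", {"PU": 0.542, "EOU": 0.865, "IUSE": 0.304}),
    "EOU6": ("EOU", {"PU": 0.508, "EOU": 0.889, "IUSE": 0.288}),
    "IUSE1": ("IUSE", {"PU": 0.350, "EOU": 0.270, "IUSE": 0.868}),
    "IUSE2": ("IUSE", {"PU": 0.380, "EOU": 0.234, "IUSE": 0.858}),
    "IUSE3": ("IUSE", {"PU": 0.336, "EOU": 0.280, "IUSE": 0.814}),
}

def ave_exceeds_correlations(ave, corr):
    """For each construct: is its AVE larger than every correlation it takes part in?"""
    return {c: all(a > r for pair, r in corr.items() if c in pair)
            for c, a in ave.items()}

def loads_highest_on_own(loadings):
    """For each item: gap between its own loading and its largest cross-loading."""
    gaps = {}
    for item, (own, values) in loadings.items():
        cross = max(v for name, v in values.items() if name != own)
        gaps[item] = values[own] - cross
    return gaps

print(ave_exceeds_correlations(ave, corr))
gaps = loads_highest_on_own(loadings)
weakest = min(gaps, key=gaps.get)
print("all items load highest on own construct:", all(g > 0 for g in gaps.values()))
print("smallest gap:", weakest, round(gaps[weakest], 3))
```

Reporting the smallest gap alongside the pass/fail result makes it easier to judge whether "much higher" is a fair description for the weakest item.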
    SUMMARY AND CAVEAT
    What do these three analyses of this sample dataset show? It is clear that in this particular circumstance, the analyses produced remarkably similar results. The reader should not generalize that this will always be the case, however. When certain endogenous constructs are added to this basic model, for example, the SEM analytical techniques, LISREL and PLS, come to different conclusions than linear regression. As developed by Straub, Gefen and Straub, and Karahanna and Straub, the construct social presence-information richness (SPIR) has been found to predict PU. But in the dataset used for the running example, SPIR is statistically significant in two separate SEM analyses, but not in a regression analysis. Whether this difference is obtained because regression cannot partial out variance for the entire model whereas SEM can, or for some other reason, is not easy to determine. In spite of the fact that the measurement properties of the instrument seem to be acceptable, no instrument perfectly captures the phenomenon and the interaction between the measurement characteristics and the statistical technique may spell the difference. Then, again, as we shall shortly see, the assumptions and algorithms used in each of the techniques vary quite a bit and this could be the explanation.

    The point is not to resolve this particular issue here. What is critical to note is that there may be subtle or even gross differences between analytical inferences about statistical conclusion validity depending on the researchers’ choices: in sample, in instrument, in method, and in analytical technique.


    III. SEM RESEARCH MODELS

    Given the heavy increase in the use of SEM in well known IS journals, how does one know when the SEM statistics confirm or disconfirm hypotheses? Before addressing this key question, it is important to understand the central characteristics of the SEM techniques and what distinguishes them from ordinary least squares regression (linear regression models).
    DIAGRAMMATIC SYNTAX
    One of the most notable differences between SEM and its first generation predecessors, a difference that also indicates the nature of the analysis being performed, is the special diagrammatic syntax used in SEM. A sample of this syntax is shown in the theoretical model in Figure 5.


    Figure 5. Generic Theoretical Network with Constructs and Measures

    In LISREL terminology, the structural model contains the following:
    To illustrate, IUSE and PU would be considered to be endogenous constructs in the TAM running example used earlier. Both are predicted by one or more other variables, or latent constructs. EOU, however, would be considered to be an exogenous latent construct in that no other variable in this particular model predicts it. The causal path PU (η1) ⇒ IUSE (η2) was estimated as a β coefficient. The causal path EOU (ξ1) ⇒ PU (η1) was estimated as a γ coefficient (see end note 9).

    In addition, the measurement model consists of:

    The Θδ and Θε matrices are diagonal by default, meaning that an error term is supposed to load only on its corresponding item. The Λx and Λy matrices are full and fixed, requiring the researcher to connect each item to its latent construct.
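
For readers who want the notation in one place, the structural and measurement models are conventionally written as three matrix equations. What follows is the standard LISREL formulation from the SEM literature, sketched here as an aid, with η denoting endogenous and ξ exogenous latent constructs. The disturbance term ζ, the coefficient matrices B and Γ, and the index assignments in the TAM mapping are assumptions introduced for illustration.

```latex
% Structural model: paths among latent constructs
\eta = B\,\eta + \Gamma\,\xi + \zeta

% Measurement model: items of exogenous (x) and endogenous (y) constructs
x = \Lambda_x\,\xi + \delta, \qquad \operatorname{Cov}(\delta) = \Theta_\delta
y = \Lambda_y\,\eta + \varepsilon, \qquad \operatorname{Cov}(\varepsilon) = \Theta_\varepsilon

% TAM running example, with \xi_1 = EOU, \eta_1 = PU, \eta_2 = IUSE
\eta_1 = \gamma_{11}\,\xi_1 + \zeta_1                          % H3: EOU -> PU
\eta_2 = \beta_{21}\,\eta_1 + \gamma_{21}\,\xi_1 + \zeta_2     % H1: PU -> IUSE, H2: EOU -> IUSE
```

Read this way, a β is any path between two endogenous constructs and a γ is any path from an exogenous to an endogenous construct, which matches the PU and EOU examples given above.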

    THE TWO PRIMARY METHODS OF SEM ANALYSIS

    The holistic analysis that SEM is capable of performing is carried out via one of two distinct statistical techniques:

      1. covariance analysis – employed in LISREL, EQS and AMOS – and
      2. partial least squares – employed in PLS and PLS-Graph.
    These two distinct types of SEM differ in the objectives of their analyses, the statistical assumptions they are based on, and the nature of the fit statistics they produce.

    The statistical objective of PLS is, overall, the same as that of linear regression, i.e., to show high R-square and significant t-values, thus rejecting the null hypothesis of no effect. The objective of covariance-based SEM, on the other hand, is to show that the test of the null hypothesis (the assumed research model with all its paths) is insignificant, meaning that the complete set of paths as specified in the model that is being analyzed is plausible, given the sample data. Moreover, its goodness-of-fit tests, such as chi-square, test the restrictions implied by a model. In other words, the objective of covariance-based SEM is to show that the operationalization of the theory being examined is corroborated and not disconfirmed by the data.

    Another important difference between the two SEM techniques is that covariance-based SEM techniques, unlike PLS, enable an assessment of unidimensionality. Unidimensionality is the degree to which items load only on their respective constructs without having "parallel correlational pattern(s)". In factor analysis terms, unidimensionality means that the items reflecting a single factor have only that one shared underlying factor among them. Accordingly, there should be no significant correlational patterns among measures within a set of measures (presumed to be making up the same construct) except for the correlation associated with the construct itself. Unidimensionality cannot be assessed using factor analysis or Cronbach’s α.

    An example of unidimensionality and parallel correlational patterns can clarify these terms. A student’s GPA is the average of his or her course grades. Assume there are only 10 courses in a narrow subject area and all students take all 10 courses. All things being equal other than instructor, course grades in a factor analysis should all load onto one factor, the GPA for this set of courses. This can be verified using a factor analysis. It is possible, however, that some of the grades are related to each other beyond their loading onto the GPA factor. Such a circumstance could occur, for example, when two course sections are taught by a very lenient professor who tries to help his students by giving them higher grades than other professors in this same course. As a result, his two course sections would show a parallel correlational pattern. They would share variance with the overall course grades (the GPA factor), but would also have a significant shared variance between them. Likewise, if several of the courses were graded based on a take-home exam rather than on a traditional in-class examination, it is unlikely that the 10 courses would show unidimensionality because the courses with the take-home exam would probably share a factor among themselves beyond the factor that is associated with all the grades of all the courses. In this hypothetical circumstance, it is likely that the take-home exam courses would share the "GPA" factor with the other courses, but would, in addition, have another shared factor among themselves reflecting the unique variance relating to take-home grades.

    Unidimensionality testing can uncover such cases. When there is unidimensionality, there is no significant shared variance among the items beyond the construct which they reflect. In addition, while both methods of SEM provide for factor analysis, covariance-based SEM also provides the ability to compare alternative pre-specified measurement models and examine, through statistical significance, which is better supported by the data. Assuming that the models are nested, this type of CFA enables the comparison of two separate measurement models for the same data and yields a significance statistic indicating which model is superior. Finally, covariance-based SEM provides a set of overall model-fit indices covering a wide range of types of fit (unlike the single F statistic in linear regression and the R² that is derived from this F-value). Covariance-based SEM is thought to provide better coefficient estimates and more accurate model analyses.

    OVERVIEW OF ANALYTICAL TECHNIQUES

    Differences between SEM methods are the result of the varying algorithms for the analytical technique. Covariance-based SEM uses model fitting to compare the covariance structure fit of the researcher’s model to a best possible fit covariance structure. Indices and residuals provided tell how closely the proposed model fits the data as opposed to a best-fitting covariance structure. Covariance-based SEM tests the a priori specified model against population estimates derived from the sample. When the research model has a sound theoretical base, its overall objective is theory testing. Thus, these types of modeling examine whether the data is statistically congruous with an assumed multivariate distribution. Covariance-based SEM techniques emphasize the overall fit of the entire observed covariance matrix with the hypothesized covariance model; for this reason, they are best suited for confirmatory research.

    Our running example provides a straightforward translation of these terms. The TAM research model expresses certain causal paths that are specified in the theory or represent refinements or testable propositions by IS researchers. If this model is an accurate description of the system use/technology acceptance phenomenon, then the relationships between observed measures of these constructs in the theoretical model should be superior to a LISREL-generated model of no-fit. In other words, data gathered from the field or from experimental subjects should correspond well to patterns that are hypothesized by the research model. By comparing the sample data and its various path-, item loading-, and error variance-estimates to a null model, it is possible to see how good the researcher’s TAM theoretical model really is.

    PLS, the second major SEM technique, is designed to explain variance, i.e., to examine the significance of the relationships and their resulting R², as in linear regression. Consequently, PLS is more suited for predictive applications and theory building, in contrast to covariance-based SEM. Some researchers, thus, suggest that PLS should be regarded as a complementary technique to covariance-based SEM techniques, possibly even a forerunner to the more rigorous covariance-based SEM. Using OLS (Ordinary Least Squares) as its estimation technique, PLS performs an iterative set of factor analyses combined with path analyses until the difference in the average R² of the constructs becomes insignificant. Once the measurement and structural paths have been estimated in this way, PLS applies either a jackknife or a bootstrap approach to estimate the significance (t-values) of the paths.
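
    The bootstrap step can be sketched in a few lines. This is a minimal illustration, not the PLS algorithm itself: it assumes that construct scores have already been computed, uses simulated scores for two constructs with hypothetical names (`eou`, `pu`), and estimates a single standardized path by OLS. The function names and the number of resamples are likewise assumptions made for the example.

```python
import numpy as np

def standardized_slope(x, y):
    """OLS slope of y on x after standardizing both (equals their correlation)."""
    zx = (x - x.mean()) / x.std()
    zy = (y - y.mean()) / y.std()
    return float(zx @ zy / len(zx))

def bootstrap_t(x, y, n_resamples=500, seed=0):
    """Estimate a path, its bootstrap standard error, and the resulting t-value."""
    rng = np.random.default_rng(seed)
    n = len(x)
    estimate = standardized_slope(x, y)
    draws = np.empty(n_resamples)
    for i in range(n_resamples):
        idx = rng.integers(0, n, size=n)  # resample cases with replacement
        draws[i] = standardized_slope(x[idx], y[idx])
    se = draws.std(ddof=1)
    return estimate, se, estimate / se

# Simulated construct scores (hypothetical data, not the article's dataset).
rng = np.random.default_rng(1)
eou = rng.normal(size=200)
pu = 0.5 * eou + rng.normal(size=200)

path, se, t_value = bootstrap_t(eou, pu)
print(f"path = {path:.2f}, bootstrap SE = {se:.3f}, t = {t_value:.1f}")
```

    Because the standard error comes from the spread of the resampled estimates rather than from a distributional formula, no parametric assumption about the data is needed to obtain the t-value.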

    Neither of these PLS significance estimation methods requires parametric assumptions. PLS is thus especially suited for the analysis of small data samples and for data that does not necessarily exhibit the multivariate normal distribution required by covariance-based SEM. This characteristic of PLS is in contrast to covariance-based SEM, which requires a sample of at least 100 or 150 because of the sensitivity of the chi-square statistic to sample size. Nonetheless, even in PLS the sample size should be a large multiple of the number of constructs in the model since PLS is based on linear regression. One guideline for such a sample size in PLS is that the sample should have at least ten times more data-points than the number of items in the most complex construct in the model.
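
    The ten-times guideline is simple arithmetic, as the following sketch shows. The function name and the item counts are hypothetical and chosen only for illustration; they are not taken from the running example.

```python
def min_sample_size_pls(items_per_construct, multiple=10):
    """Rule-of-thumb minimum n: `multiple` times the item count of the
    most complex construct (the one measured by the most items)."""
    return multiple * max(items_per_construct.values())

# Hypothetical item counts per construct.
item_counts = {"PU": 6, "EOU": 6, "IUSE": 3}
minimum_n = min_sample_size_pls(item_counts)
print(minimum_n)  # 60
```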

    Just as the objectives of the two types of SEM differ, so do their analysis algorithms. Covariance-based SEM uses estimation functions based on second-order derivatives, such as Maximum Likelihood (ML), to arrive at its parameter estimates. Though LISREL uses ML estimates as a default, it can also be set to estimate these coefficients using other established estimation techniques, including Unweighted Least Squares (ULS), Generalized Least Squares (GLS), and Weighted Least Squares (WLS), among others. ULS can be used when the observed variables have the same units; GLS and ML are appropriate when the observed variables are known to be multivariate-normal, although they are applicable even when the observed variables deviate from this assumption. As to WLS, this estimation method should be used when polychoric correlations have been generated or when there are substantial deviations from a multivariate-normal distribution.
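
    Each of these estimators minimizes a discrepancy between the sample covariance matrix S and the covariance matrix Σ implied by the model. The sketch below writes out the common textbook forms of the ML, ULS, and GLS discrepancy functions; these forms are supplied here for illustration and are not quoted from the article or from LISREL's documentation. The one-factor model, its loadings, and the function names are hypothetical.

```python
import numpy as np

def f_ml(S, Sigma):
    """ML discrepancy: ln|Sigma| + tr(S Sigma^-1) - ln|S| - p."""
    p = S.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    return logdet_sigma + np.trace(S @ np.linalg.inv(Sigma)) - logdet_s - p

def f_uls(S, Sigma):
    """ULS discrepancy: 0.5 * tr[(S - Sigma)^2]."""
    diff = S - Sigma
    return 0.5 * np.trace(diff @ diff)

def f_gls(S, Sigma):
    """GLS discrepancy: 0.5 * tr[(I - Sigma S^-1)^2]."""
    m = np.eye(S.shape[0]) - Sigma @ np.linalg.inv(S)
    return 0.5 * np.trace(m @ m)

def implied_cov(loadings, error_vars, factor_var=1.0):
    """Covariance implied by a one-factor model: lambda * phi * lambda' + Theta."""
    lam = np.asarray(loadings, dtype=float)[:, None]
    return factor_var * (lam @ lam.T) + np.diag(error_vars)

# Pretend the "sample" covariance was generated by these hypothetical values.
S = implied_cov([0.8, 0.7, 0.6], [0.36, 0.51, 0.64])
good = implied_cov([0.8, 0.7, 0.6], [0.36, 0.51, 0.64])   # correct parameters
bad = implied_cov([0.3, 0.3, 0.3], [0.91, 0.91, 0.91])    # poor parameters

for name, f in [("ML", f_ml), ("ULS", f_uls), ("GLS", f_gls)]:
    print(f"{name}: good = {f(S, good):.4f}, bad = {f(S, bad):.4f}")
```

    All three functions are zero (up to rounding) when the implied matrix reproduces S exactly and grow as the fit worsens. An estimation routine searches for the parameter values that make the chosen function as small as possible.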

    PLS, on the other hand, applies an iterative sequence of OLS and multiple linear regressions, analyzing one construct at a time. Rather than estimating the variance of all the observed variables, as in covariance-based SEM, PLS estimates the parameters in such a way that will minimize the residual variance of all the dependent variables in the model. Consequently, PLS is less affected by small sample sizes, as in the case of linear regression models in general. PLS, like linear regression models, is also less influenced by deviations from multivariate normal distribution, although sample size considerations influence the strength of the statistical test. Comparisons based on all three aspects discussed were presented in Table 2 in Section I.

    In the running example, it is clear that the data gathered from the free simulation experiment produces normalized/standardized path coefficients and R² values that are similar across all three techniques. In minimizing the residual variance between the indicators of the latent variables PU and IUSE, EOU and IUSE, and EOU and PU, the statistical linkages in PLS between these constructs prove to be consistent with TAM theory. Moreover, despite the use of different estimation methods, the regression approaches reached comparable percentages of explained variance (R² and SMC) and comparable standardized path coefficients.

    THE SEM MODEL
     
    The SEM model contains two inter-related models: the measurement model and the structural model. Both models are explicitly defined by the researcher. Pragmatically speaking, the researcher expresses which items load onto which latent variables and which latent constructs predict which other constructs through software packages specifically designed for these techniques, or by expressing the equations in generalized packages like SAS. The measurement model defines the constructs (latent variables) that the model will use, and assigns observed variables to each. The structural model then defines the causal relationship among these latent variables (see Figure 5; the arrows between the latent variables represent these structural connections). The measurement model uses factor analysis to assess the degree that the observed variables load on their latent constructs (ξ and η, for exogenous and endogenous constructs, respectively). The manifest or observed variables are identified as Xs and Ys, for items reflecting the exogenous and endogenous constructs, respectively. SEM estimates item loading (λ) and measurement error for each observed item (Θδ and Θε, respectively for X and Y items).


    The item loadings provided by SEM are analogous to a factor analysis where each factor is, in effect, a latent variable. SEM techniques also explicitly assume that each of the observed variables has unique measurement error. Measurement error represents both inaccuracy in participant responses and their measurement, as well as inaccuracies in the representation of the theoretical concept by the observed variables. Consequently, covariance-based techniques are well suited for the analysis of models containing variables with measurement error, facilitating a transition from exploratory to confirmatory analysis.

    Typically, a latent variable will be estimated based on multiple observed variables. Nonetheless, SEM does permit the use of constructs represented by single items. In such cases, in covariance-based SEM alone, the researcher explicitly sets parameters for the reliability and loading of the observed variable. Having a single item reflect a construct would be appropriate when the researcher uses an established scale with a known reliability and wishes to use an index of the scale as a whole, or when there is, indeed, only one item with little or no assumed measurement error, as with gender or age.
     

    The structural model estimates the assumed causal and covariance linear relationships among the exogenous (ξ) and endogenous (η) latent constructs.[18] (As explained earlier, these paths are called γ when they link exogenous and endogenous latent constructs, and β when they link endogenous latent constructs.) SEM also estimates the shared measurement error for the constructs (φ and ψ, for exogenous and endogenous latent constructs respectively).[19] By allowing the researcher to specify these γ and ψ paths, SEM can support multi-layered causal models.
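
    For readers who prefer to see the two sub-models as equations, the following block gives the conventional LISREL-style matrix form. It is a sketch in standard textbook notation and is assumed, rather than confirmed by the article, to correspond to the symbols used in the text: Λ collects the item loadings, B and Γ collect the β and γ paths, Φ holds the variances and covariances of the exogenous constructs, Ψ those of the structural disturbances ζ, and Θ the measurement error terms.

```latex
\begin{aligned}
x    &= \Lambda_x \xi + \delta,          & \operatorname{Cov}(\delta)      &= \Theta_\delta \\
y    &= \Lambda_y \eta + \varepsilon,    & \operatorname{Cov}(\varepsilon) &= \Theta_\varepsilon \\
\eta &= B\eta + \Gamma\xi + \zeta,       & \operatorname{Cov}(\xi)         &= \Phi, \quad \operatorname{Cov}(\zeta) = \Psi
\end{aligned}
```

    The first two lines are the measurement model for the X and Y items; the third line is the structural model linking the latent constructs.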


    Covariance-based SEM and PLS differ, however, in the types of relationship they support between the observed variables and their associated latent constructs. PLS supports two types of relationship, formative and reflective. Formative observed variables, as their name implies, "cause" the latent construct, i.e., represent different dimensions of it. Latent variables attached to formative measures are the summation of the formative observed variables associated with them. These observed variables are not assumed to be correlated with each other or to represent the same underlying dimension.

    The latent construct "Technological Environment," for example, might be measured by the extent of the IT infrastructure, but also by the level of technical support. These measures could be uncorrelated, but each viewed as "forming" the construct.

    Reflective observed variables, on the other hand, reflect the latent variable and as a representation of the construct should be unidimensional and correlated. To emphasize this difference, formative items are drawn with an arrow leading to the latent construct, while reflective items are drawn with an arrow leading away from the latent construct. PLS supports both types of observed variables whereas covariance-based SEM has been interpreted to support only reflective observed variables. According to one interpretation, reflective observed variables should be preferred to formative ones when there is a relevant theory and when the objective is theory testing rather than theory building.

    An example might better clarify the difference between reflective and formative observed variables. When a construct, such as intelligence, cannot be measured directly, researchers measure it indirectly using several indicator variables. In the case of intelligence these indicator variables might be scores obtained from a test. When the scores are assumed to measure the same underlying aspect of intelligence, they are reflective. This situation would occur, for example, when a researcher is measuring algebraic intelligence and the indicator variables chosen evaluate aptitudes for addition, division, subtraction, and multiplication. On the other hand, when more than one aspect of intelligence is being measured, such as when the exam tests both algebraic and linguistic intelligence using one indicator variable each, then the indicator variables would be formative of a construct for "intelligence." It is conceivable and often the case that an individual’s algebraic and linguistic intelligence can be reasonably thought of as composite elements (or sub-constructs/meso-level constructs) of the molar-level construct "intelligence," but not necessarily highly correlated with each other. Therefore, they are formative rather than reflective of the molar construct "intelligence." Whereas both algebraic intelligence and linguistic intelligence are viable sub-constructs in this situation, the nature of constructs chosen by the researcher in other situations will determine whether the measures are better seen as formative or reflective.
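
    The intelligence example can be mimicked with simulated data. The sketch below is illustrative only: the loading of 0.8, the equal weights in the formative sum, the sample size, and all variable names are hypothetical. It generates four reflective indicators from one latent trait and builds a formative construct as the sum of two independent aspects, then compares the average inter-item correlation in each case.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000

# Reflective: one latent trait (algebraic ability) drives every indicator,
# e.g. scores for addition, subtraction, multiplication, and division.
algebra = rng.normal(size=n)
reflective_items = np.column_stack(
    [0.8 * algebra + 0.6 * rng.normal(size=n) for _ in range(4)]
)

# Formative: two largely independent aspects are summed into the construct.
algebraic = rng.normal(size=n)
linguistic = rng.normal(size=n)
formative_items = np.column_stack([algebraic, linguistic])
intelligence = formative_items.sum(axis=1)  # equal weights (an assumption)

def mean_offdiag_corr(items):
    """Average correlation between distinct items."""
    c = np.corrcoef(items, rowvar=False)
    k = c.shape[0]
    return (c.sum() - k) / (k * (k - 1))

reflective_r = mean_offdiag_corr(reflective_items)
formative_r = mean_offdiag_corr(formative_items)

print(f"mean inter-item correlation, reflective: {reflective_r:.2f}")
print(f"mean inter-item correlation, formative:  {formative_r:.2f}")
```

    The reflective indicators correlate substantially with one another because they share a cause, whereas the formative indicators need not correlate at all even though each contributes to the composite.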

    The ability to analyze complex models (like that shown in Figure 5) in a single, unified process is a major advantage of both types of SEM over first generation regression models. In first generation regression models, item loadings on the latent variables must be analyzed in a separate step (as shown in the TAM running example in Section II) and the linkage to each dependent variable must be assessed independently (other than MANOVA, of course). SEM analysis also generally results in a more rigorous variance analysis, and enables the researcher to include not only common variance but also specific and error variance explicitly into the research model.

    Some SEM techniques, such as LISREL, also permit the researcher to specify how the specific and error variance of each observed variable relates to those of other observed variables. Accordingly, LISREL allows the setting and fixing of the item loading and measurement error of the observed variables. Setting the item loadings, however, should not be exercised unless there is a good reason for doing so, such as comparing samples or when it is known that there is little or no measurement error (e.g., when measuring gender or age). Table 3 in Section II presented guidelines based on capabilities by research approach.

    APPLYING CRITERIA TO THE RUNNING EXAMPLE

    How would these criteria for analytical method choice apply in the case of the TAM running example? In the first case, as indicated earlier, TAM is a mature theoretical research stream in IS research. As such, the relationships between the basic constructs are relatively well understood. Based on Table 2, therefore, TAM testing should use confirmatory analytical techniques, which, in this case, means that any of the three methods would be appropriate although LISREL and regression are to be preferred as they are especially suited for testing theory. Given that the sample size exceeds the minimal requirements for LISREL, which is the most demanding in this regard, any of these techniques would also be appropriate with regard to this criterion.

    There are, however, conditions where the use of linear regression and PLS would be the most appropriate choices for the TAM running example. If the sample size for the TAM researchers had been low, then the power of a LISREL analysis would have suffered badly and PLS, which can work with much smaller samples, would have been a better choice. The tradeoff in this situation would be that PLS is best used for exploratory research, but can, when necessary, serve for confirmatory work.

    Regression might have been an appropriate choice if the researcher wished to make specific and direct comparisons to other studies that used this technique in the research tradition. By the same token, ANOVA or MANOVA might be employed for these same reasons. The statistics generated by regression and older statistical techniques seem to be more amenable to meta-analysis, which might also be a factor in its selection. Researchers who want to add to the research tradition and meta-analyze the cumulative effect of TAM studies would find it simpler to work with regression, ANOVA, t-tests, and simple or partial correlations.

    Finally, if the LISREL TAM model had refused to converge, as it did in some of the runs with our sample data when the SPIR variables were included, PLS or regression might also have been a better choice. One should never conclude that the refusal of LISREL to converge represents anything other than the inability of the matrices to be reduced, which is the mathematical method used for maximum likelihood estimation. Lack of convergence does not suggest anything definitive about the model itself (as is obvious in the TAM case presented here) or its hypothesized causal paths. If LISREL reports that the reason for non-convergence is that a matrix is not positive definite, then two rows (item measures) are likely so similar that matrix reduction cannot be carried out, but this would imply more about measurement than about the underlying theory being tested and relationships between constructs. Moving to another technique is a perfectly acceptable alternative in such a case.

    STATISTICS IN SEM

    Just as the two types of SEM techniques differ in their underlying statistical assumptions and estimation methods, so do the statistics they produce. First, it is important to note in this respect that covariance-based SEM, unlike linear regression models and PLS, does not always converge and produce interpretable results. A covariance-based SEM model that does not converge will have to be modified or the theory base reassessed when the model:

    Lack of convergence notwithstanding, the next few paragraphs describe SEM statistics, starting with covariance-based SEM statistics.

    Covariance-based SEM packages generate statistics at three levels:

      1. at the individual path and construct level.
      2. at the overall model fit level.
      3. at the individual path modification index level.
    At the individual path level, SEM estimates item loadings and measurement error along with their respective t-values. Construct reliability, the analog of a Cronbach’s α, can then be derived from these statistics.[25] As with Cronbach’s α statistics, construct reliability should be above .70 [Hair et al., 1998, Segars, 1997]. SEM also estimates the coefficients and t-values representing the relationships among the latent constructs γs, βs,