STATISTICAL DATA ANALYSIS (Cont.)

Specifying variables for analysis:

In addressing a particular question we will need to specify both the outcome variable and the exposure variable or variables.

*In observational studies, the control of confounding is a key issue in the analysis, and so we should identify:
1. Variables believed in advance to confound the exposure–outcome association (a priori confounders).
2. Other variables to be investigated as possible confounders, since a plausible argument can be made concerning their relationship with the exposure and outcome variables, but for which there is little or no existing evidence.

*We should also specify any variables considered to be possible effect-modifiers, in that they modify the size or even the direction of the exposure–outcome association. Effect modification is examined using tests for interaction.

In practice, variables may play more than one role in an analysis. For example, a variable may confound the effect of one of the main exposures of interest, but its effect may also be of interest in its own right. A variable may be a confounder for one exposure variable and an effect-modifier for another. Many studies have an exploratory element, in that data are collected on some variables which may turn out to be important exposures, but if they do not they may still need to be considered as potential confounders or effect-modifiers.

Data reduction:

Before commencing formal statistical analyses, it may be necessary to derive new variables by grouping the values of some of the original variables. Note that the original variables should always be retained in the dataset; they should never be overwritten.

Grouping of categorical exposure variables is necessary when there are large numbers of categories (for example, if occupation is recorded in detail). If there is an unexposed category, then this should generally be treated as a separate group (e.g. non-smokers).
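The data-reduction advice above (derive a grouped variable, keep the original, and leave the unexposed category as its own group) might be sketched as follows. This is only an illustration: the records, the field names `smoking_detail` and `smoking_group`, the category labels, and the helper `add_grouped_variable` are all hypothetical and do not come from any real dataset.

```python
# Hypothetical records; field names and category labels are illustrative only.
records = [
    {"id": 1, "smoking_detail": "non-smoker"},
    {"id": 2, "smoking_detail": "1-4/day"},
    {"id": 3, "smoking_detail": "15-19/day"},
    {"id": 4, "smoking_detail": "30+/day"},
    {"id": 5, "smoking_detail": "5-9/day"},
]

# Assumed grouping: seven detailed categories collapsed to four,
# with the unexposed category (non-smokers) kept as a group of its own.
GROUP_OF = {
    "non-smoker": "non-smoker",
    "1-4/day": "1-9/day",
    "5-9/day": "1-9/day",
    "10-14/day": "10-19/day",
    "15-19/day": "10-19/day",
    "20-29/day": "20+/day",
    "30+/day": "20+/day",
}


def add_grouped_variable(rows, source, target, mapping):
    """Return copies of rows with a derived variable; the source field is kept."""
    if source == target:
        raise ValueError("target must differ from source so the original is not overwritten")
    out = []
    for row in rows:
        new_row = dict(row)                    # copy, so the input rows stay untouched
        new_row[target] = mapping[row[source]]  # derived (grouped) variable
        out.append(new_row)
    return out


grouped = add_grouped_variable(records, "smoking_detail", "smoking_group", GROUP_OF)
```

Writing the grouped values to a new field, rather than recoding in place, means the detailed categories remain available if a different grouping is wanted later.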
The exposed categories should be divided into several groups; four or five is usually sufficient to give a reasonable picture of the risk relationship.

Grouping of numerical exposure variables may be necessary in order to:
1. Use methods based on stratification, as recommended for the initial examination of confounding.
2. Use graphical methods to examine how the level of a non-numerical outcome changes with exposure level.
3. Examine whether there is a linear association between a numerical exposure variable and a non-numerical outcome.

Note that grouping entails loss of information: after checking linearity assumptions or performing initial analyses using the grouped variable, it may be appropriate to use the original variable, or a transformation of the original variable, in the final analysis.

One strategy for numerical exposures is to divide the range of the variable using, say, quintiles, to give five groups with approximately equal numbers of subjects in each group. This helps to ensure that estimates of effect for each category are reasonably precise, but can sometimes obscure an important effect if a few subjects with very high levels are grouped with others with more moderate levels. Alternatively, cut-off points may be chosen on the basis of data from previous studies, the aim being to define categories within which there is thought to be relatively little variation in risk. Using standard cut-off points has the advantage of making comparisons between studies easier.

Univariable analyses:

It is usually helpful to begin with a univariable analysis, in which we examine the association of the outcome with each exposure of interest, ignoring all other variables. This is often called the crude association between the exposure and the outcome. Although later analyses, controlling for the effects of other variables, will supersede this one, it is still a useful stage of the analysis because:
1. Examination of simple tables or graphs, as well as the estimated association, can give useful information about the data set.
2. These analyses will give an initial idea of those variables that are strongly related to the disease outcome.
3. The degree to which the crude estimate of effect is altered when we control for the confounding effects of other variables is a useful indication of the amount of confounding present (or at least, the amount that has been measured and successfully removed).

For exposures with more than two levels, one of the levels has to be chosen as the baseline. Often this will be the unexposed group or, if everyone is exposed to some extent, the group with the lowest level of exposure. If there are very few persons in this group, however, this will produce exposure effect estimates with large standard errors. It is then preferable to choose a larger group to be the baseline group.

QUESTION: In statistical analysis, cut-off points may be chosen on the basis of data from previous studies, and using standard cut-off points has the advantage of making comparisons between studies easier. The aim is to define categories within which there is thought to be relatively lengthy variation in risk. Yes or No?
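As a rough sketch of the quintile strategy and the crude (univariable) comparison against a baseline group described in this post, the example below groups a numerical exposure into five groups and computes a crude odds ratio for each group relative to a chosen baseline. The data are made up, and the function names `quintile_groups` and `crude_odds_ratios` are hypothetical. The odds ratio is used here only as one possible measure of association for a binary outcome; the post itself does not prescribe a particular measure.

```python
from bisect import bisect_right
from statistics import quantiles


def quintile_groups(values):
    """Assign each value to a group 0..4 using cut points at the sample quintiles."""
    cuts = quantiles(values, n=5)  # four cut points dividing the data into five groups
    return [bisect_right(cuts, v) for v in values]


def crude_odds_ratios(groups, outcomes, baseline=0):
    """Crude odds ratio of each exposure group versus the baseline group."""
    cases, noncases = {}, {}
    for g, y in zip(groups, outcomes):
        cases[g] = cases.get(g, 0) + (1 if y else 0)
        noncases[g] = noncases.get(g, 0) + (0 if y else 1)
    a0, b0 = cases[baseline], noncases[baseline]
    ratios = {}
    for g in sorted(cases):
        if g == baseline:
            continue
        a, b = cases[g], noncases[g]
        # Cross-product form (a * b0) / (b * a0); None when the ratio is undefined.
        ratios[g] = (a * b0) / (b * a0) if b and a0 else None
    return ratios


# Made-up data: 20 subjects with exposure levels 1..20 and a binary outcome.
exposure = list(range(1, 21))
case_ids = {4, 8, 11, 12, 15, 16, 18, 19, 20}
outcome = [x in case_ids for x in exposure]

groups = quintile_groups(exposure)
ratios = crude_odds_ratios(groups, outcome, baseline=0)
```

With these made-up data each quintile holds four subjects, and the crude odds ratios relative to the lowest quintile are 1, 3, 3 and 9. The `baseline` argument can be pointed at a larger group when the lowest-exposure group contains very few subjects. A real analysis would also report confidence intervals, which this sketch omits.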
Posted on: Tue, 02 Sep 2014 08:07:50 +0000
