Background: Results from randomized trials indicate a 5.4% survival advantage associated with axillary dissection. To gain insight on survival outcomes when less than an axillary dissection is performed, we performed a retrospective analysis to determine survival outcome for node-negative and node-positive breast cancer patients when a variable number of nodes were excised. Methods: The data analyzed in this paper are from the Surveillance, Epidemiology, and End Results (SEER) database, from which 72,102 patients were selected whose breast cancer had been diagnosed in 1988 or later and who were aged 40–79 years at diagnosis, had a single primary lesion, and had 0 to 3 positive lymph nodes. Cases were separated into age groups (40 to 49 and 50 to 79 years), and node-negative cases were separated from those with one to three positive nodes. Results: This analysis indicates that even when all regional lymph nodes are pathologically negative, the number of nodes removed is associated with survival. In the group of breast cancer patients who had one to three pathologically positive nodes, as with the node-negative group, the higher the number of nodes removed, the greater the survival. The hazard rate for death in the node-negative group was roughly 5% less for each additional five nodes removed. For the node-positive group, the hazard rate for death was between 8% and 9% less for each additional five nodes removed. Conclusions: This retrospective study supports the notion that removal of regional nodes, even when such nodes are interpreted as pathologically negative, is important for the long-term survival of breast cancer patients.
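To make the per-five-node figures concrete, the following sketch converts them into hazard ratios for other differences in node count. It assumes a log-linear (proportional hazards) relationship between nodes removed and hazard, which the abstract does not state explicitly; the function name and the input value 0.95 (a 5% reduction per five nodes) are illustrative only.

```python
def hazard_ratio(node_difference, ratio_per_five_nodes):
    """Hazard ratio for a given difference in nodes removed.

    Assumes (hypothetically) that the effect is multiplicative,
    so each block of five nodes scales the hazard by the same factor.
    """
    return ratio_per_five_nodes ** (node_difference / 5)


# Node-negative group: roughly 5% lower hazard per five additional nodes.
ratio_ten_nodes = hazard_ratio(10, 0.95)
print(round(ratio_ten_nodes, 4))  # 0.9025, i.e. about a 10% lower hazard
```

Under this assumption, a 10-node difference corresponds to a hazard ratio of 0.95 squared, not to a simple doubling of the 5% figure.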
Polymerase chain reaction-sequence-specific oligonucleotide probe typing methods have been applied to 1000 individuals from the Northern Ireland population to give human leukocyte antigen DRB1 (HLA-DRB1) allele assignment. HLA-DRB1 allele frequencies and four-locus haplotypes (A/B/C/DR) for this Caucasian population, based on HLA class I and class II allele assignment, are now presented. No significant deviations from Hardy-Weinberg proportions were observed. The HLA-C locus exhibited marginal evidence of selection (p _ 0.03, uncorrected one-sided test) in the direction of balancing selection; the HLA-A, -B, and -DRB1 allele frequency distributions were compatible with expectations under a neutral model (which does not mean that selection is not operating). Evidence for selection was seen on haplotypes HLA-A*010101-B*0801-DRB1*030101 and HLA-A*290201-B*440301-DRB1*070101 based on their patterns of linkage disequilibrium.
In this paper we discuss the statistical methods for population genetics used in the analyses of the 13th International Histocompatibility Workshop data. The discussion includes features that are unique to our implementation of these methods. References are provided to the original work and/or subsequent enhancements to the original work.
In this poster we present results from tests of the following hypotheses: that axial compression preload increases motion segment stiffness and hysteresis, and that it makes the load-displacement relationship more linear. These effects were tested in both intact motion segments and in isolated intervertebral discs.
Objective: This study investigated whether electromyographic signals recorded from the skin surface overlying the multifidus muscles could be used to quantify their activity. Design: Comparison of EMG signals recorded from electrodes on the back surface and from wire electrodes within four different slips of multifidus muscles of three human subjects performing isometric tasks that loaded the trunk from three different directions. Background: It has been suggested that suitably placed surface electrodes can be used to record activity in the deep multifidus muscles. Methods: We tested whether there was a stronger correlation and more consistent regression relationship between signals from electrodes overlying multifidus and longissimus muscles respectively than between signals from within multifidus and from the skin surface electrodes over multifidus. Results: The findings provided consistent evidence that the surface electrodes placed over multifidus muscles were more sensitive to the adjacent longissimus muscles than to the underlying multifidus muscles. The R² for surface versus intra-muscular comparisons was 0.64, while the average R² for surface-multifidus versus surface-longissimus comparisons was 0.80. Also, the magnitude of the regression coefficients was less variable between different tasks for the longissimus versus surface multifidus comparisons. Conclusions: Accurate measurement of multifidus muscle activity requires intra-muscular electrodes. Relevance: Electromyography is the accepted technique to document the level of muscular activation, but its specificity to particular muscles depends on correct electrode placement. For multifidus, intra-muscular electrodes are required.
Software to analyze multi-locus genotype data for entire populations is useful for estimating haplotype frequencies, deviation from Hardy-Weinberg equilibrium and patterns of linkage disequilibrium. These statistical results are important both to those interested in human genome variation and disease predisposition and to those interested in evolutionary genetics. As part of the 13th International Histocompatibility and Immunogenetics Working Group (IHWG), we have developed a software framework (PyPop). The primary novelty of this package is that it allows integration of statistics across large numbers of data sets by heavily utilizing the XML file format and the R statistical package to view graphical output, while retaining the ability to inter-operate with existing software. Although largely developed to address human population data, it can be used for population-based data for any organism. We tested our software on the data from the 13th IHWG, which involved data sets from at least 50 laboratories, each of up to 1000 individuals with 9 MHC loci (both class I and class II), and found that it scales well to large numbers of data sets.
Uniting mentoring with e-mail results in expanded opportunities for mentoring, making it possible to overcome the constraints of time limitations and distance to achieve successful mentoring relationships. With these opportunities, however, come many of the same challenges that have already been identified through the research on formal mentoring programs. This paper addresses one of these challenges by reporting on the impact of one model of training on the e-mentoring outcomes. A series of interactive, web-based case studies was developed as training modules for mentors and protégées participating in the MentorNet program. The target group for this research study was undergraduate students. Using a control group experimental design, we randomly assigned half the study group to a condition where interactive on-line training was required. The other half was assigned to a condition where it was optional. Those in the mandatory group exhibited improved outcomes. This study was focused on MentorNet (www.MentorNet.net), a large-scale electronic mentoring program that matches women studying engineering and related sciences with professionals in industry for year-long, structured mentoring relationships conducted via e-mail, in an effort to encourage their retention in the technical fields where women are severely underrepresented. We discuss implications for conducting e-mentoring programs.
The human leukocyte antigen (HLA) region on chromosome 6 is the major histocompatibility complex (MHC) for humans. Genes in this region control several different functions involved in the immune response, influence susceptibility to many diseases, and are important for matching donors and recipients for tissue and bone marrow transplantation. In this talk, I will discuss some characteristics of the HLA region and issues involved in the estimation of haplotype (multi-locus genotype) frequencies using the Expectation-Maximization (EM) algorithm. The accuracy of haplotype frequency estimates is of increasing interest with regard to association studies, candidate gene studies, and mapping of disease-causing genes for microsatellite, SNP, and protein-level variation. Marker-disease associations that are not detected at the single-locus level may nevertheless be detected in a multi-locus analysis using unknown or estimated haplotypes.
In this presentation, we examine mentoring as a movement for social change. We examine its impact on women students in engineering and science, with particular focus on women of color. Finally, we review best practices developed from years of industry experience supporting and evaluating mentoring programs.
MentorNet, the E-Mentoring Network for Women in Engineering and Science, is leveraging technology and drawing on the benefits of mentoring to address the underrepresentation of women in engineering, science, math, and technology fields. MentorNet is a multi-institutional, large-scale, structured electronic mentoring (e-mentoring) program that pairs women students in engineering, science, math, and technology fields with industry professionals and supports them through year-long e-mentoring relationships. This paper will report on the most salient benefits accrued by women students based on three years of evaluation results from the 1998-99, 1999-2000, and 2000-01 program years. During these three years, MentorNet matched, supported, and helped facilitate over 3,700 e-mentoring pairs. These 3,700 e-mentoring pairs represented women students from 70 universities and professionals, who volunteered as mentors, from over 700 corporations, professional societies, governmental agencies and laboratories. The collective program evaluations support the need for and efficacy of the program. For all three time periods, at least 80% of the students reported they would recommend MentorNet to other students.
Haplotype analyses are an important area in the study of the genetic components of human disease. Associations between markers and disease loci that are not evident with a single marker locus may be identified in multi-locus marker analyses using estimated haplotype frequencies (HF). Procedures that make use of the expectation-maximization (EM) algorithm to estimate HFs from unphased genotype data are in common use in genetic studies. The EM algorithm uses these unphased genotype frequencies along with the assumption of Hardy-Weinberg proportions (HWP) to converge on HF estimates. In this paper we assess the accuracy of EM estimates of HFs in type 1 diabetes patients where the true haplotypes are known, but the data are analyzed ignoring family information to allow for comparison between estimated and true frequencies. The data consist of six HLA loci with high levels of polymorphism and a range of departures from HWP and linkage equilibrium. While the overall accuracy of the EM estimates is good, there can be large over- and underestimates of particular HFs, even for common haplotypes, especially when the loci involved deviate significantly from HWP. Estimating HFs for three or more loci and then collapsing over loci so as to generate two-locus haplotypes can improve the accuracy of the estimation. The collapsing procedure is most beneficial when one of the loci in the two-locus haplotype of interest deviates significantly from HWP and the locus collapsed over is in linkage disequilibrium with the other loci.
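The way EM combines unphased genotype counts with the HWP assumption can be illustrated with a toy case. The sketch below is not the procedure used in the paper: it assumes just two loci with two alleles each (A/a and B/b), whereas HLA loci are highly polymorphic, and the function name and input layout are hypothetical. Only double heterozygotes are ambiguous in this setting, so the E-step splits them between the two possible phase resolutions in proportion to the current haplotype frequency estimates.

```python
def em_haplotype_freqs(counts, n_iter=200, tol=1e-10):
    """Estimate two-locus haplotype frequencies from unphased genotypes.

    counts[i][j] is the number of individuals carrying i copies of
    allele A (0, 1 or 2) and j copies of allele B (0, 1 or 2).
    Assumes Hardy-Weinberg proportions and two biallelic loci.
    """
    n = sum(sum(row) for row in counts)
    # Haplotype counts contributed by genotypes whose phase is unambiguous.
    base = {
        "AB": 2 * counts[2][2] + counts[2][1] + counts[1][2],
        "Ab": 2 * counts[2][0] + counts[2][1] + counts[1][0],
        "aB": 2 * counts[0][2] + counts[1][2] + counts[0][1],
        "ab": 2 * counts[0][0] + counts[1][0] + counts[0][1],
    }
    double_hets = counts[1][1]  # phase is either AB/ab or Ab/aB
    freqs = {h: 0.25 for h in base}
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote is AB/ab.
        cis = freqs["AB"] * freqs["ab"]
        trans = freqs["Ab"] * freqs["aB"]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: re-estimate frequencies from expected haplotype counts.
        new = {
            "AB": (base["AB"] + w * double_hets) / (2 * n),
            "Ab": (base["Ab"] + (1 - w) * double_hets) / (2 * n),
            "aB": (base["aB"] + (1 - w) * double_hets) / (2 * n),
            "ab": (base["ab"] + w * double_hets) / (2 * n),
        }
        change = max(abs(new[h] - freqs[h]) for h in new)
        freqs = new
        if change < tol:
            break
    return freqs


# Made-up genotype counts, for illustration only.
toy_counts = [[20, 5, 1], [6, 30, 8], [2, 7, 21]]
estimates = em_haplotype_freqs(toy_counts)
print({h: round(f, 3) for h, f in estimates.items()})
```

Because the E-step weights come directly from the HWP assumption, departures from HWP feed straight into the phase assignment, which is consistent with the abstract's observation that accuracy suffers when loci deviate from HWP.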
The IHWG Anthropology / Human Genetic Diversity Component provides high resolution genotype data generated with standardized typing reagents across diverse ethnic groups. The molecular level characterization of the alleles allows analyses to use both allele frequency and sequence information in population genetic analyses. In this report we present preliminary results on the following population genetic analyses of HLA polymorphism: (1) conformity to Hardy-Weinberg expectations; (2) haplotype distribution among populations and patterns of linkage disequilibrium; (3) tests for balancing selection; (4) worldwide patterns of genetic differentiation. In all cases we describe the analytical methods employed, present their assumptions and limitations, and discuss what conclusions can be drawn from population genetic analyses of HLA data.
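As a simple illustration of item (1), the sketch below compares observed genotype counts with Hardy-Weinberg expectations using a chi-square statistic. It assumes a single locus with only two alleles and made-up counts; HLA loci have many alleles, and the tests actually used in the report may differ (for example, exact tests). The function name is hypothetical.

```python
def hardy_weinberg_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for deviation from Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb are observed counts of the three genotypes at a
    biallelic locus. Returns (allele frequency p, expected counts, chi2).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1 - p
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, expected, chi2


# Counts that match the expectations exactly give a statistic of zero.
p, expected, chi2 = hardy_weinberg_chi_square(25, 50, 25)
print(p, expected, chi2)
```

With one degree of freedom for a biallelic locus, larger values of the statistic indicate a stronger departure from Hardy-Weinberg proportions.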
In this paper, we report on electronic discussion lists (e-lists) sponsored by MentorNet, the National Electronic Industrial Mentoring Network for Women in Engineering and Science. Using the Internet, the MentorNet program connects students in engineering and science with mentors working in industry. These e-lists are a feature of MentorNet's larger electronic mentoring program and were sponsored to foster the establishment of community among women engineering and science students and men and women professionals in those fields. This research supports the hypothesis that electronic communications are a viable forum for developing community among engineering and science students and professionals and identifies factors influencing the emergence of electronic communities (e-communities). The e-lists that emerged into self-sustaining e-communities were focused on topic-based themes. The e-communities maintained three to four simultaneous threaded discussions and were sustained by professionals who served as facilitators by seeding the e-lists with discussion topics. The e-lists were sponsored to provide women students participating in MentorNet with access to groups of technical and scientific professionals. In addition to providing benefits to the students, the e-lists also provided the professionals with opportunities to engage in peer mentoring with other, mostly female, technical and scientific professionals. We discuss the implications of our findings for developing e-communities and for serving the needs of women in technical and scientific fields.
The concepts of statistical sampling and survey design are discussed. These concepts are especially relevant with the advent of the 2000 Census and the debate over its use of statistical sampling. In this paper, basic ideas from survey design are introduced using the 2000 Census as an example, in order to capitalize on the recent media attention. Then, these same concepts are applied to the National Health Interview Survey (NHIS). The NHIS, in particular the 1993 survey, was chosen because it can be accessed via the Internet and because it can be used to explore concepts related to survey design, such as sampling and stratification, and issues related to the development and execution of surveys. The 1993 NHIS included 52,467 males and 57,204 females. Through the use of statistical weights, these 109,671 individuals represent a population of 254,281,227, which is an approximation of the 1993 noninstitutionalized U.S. population. Data for the 1993 NHIS can be accessed through the National Center for Health Statistics web site and simple analyses can be performed over the web to demonstrate the use of sampling weights. In addition, subsets of the data can be downloaded and analyzed using statistical software packages.
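The role of sampling weights can be sketched in a few lines. The sample and population totals below are the figures quoted above; the four-person data set, its weights, and the function name are made up for illustration and are not NHIS data.

```python
def weighted_proportion(values, weights):
    """Weighted estimate of a proportion (or mean) from survey data."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Figures quoted in the abstract for the 1993 NHIS.
sample_size = 52_467 + 57_204
population = 254_281_227
mean_weight = population / sample_size  # people represented per respondent
print(sample_size, round(mean_weight, 1))

# Hypothetical respondents: 1 = has some condition, 0 = does not.
has_condition = [1, 0, 0, 1]
weights = [1500.0, 3000.0, 2500.0, 1000.0]  # made-up sampling weights

unweighted = sum(has_condition) / len(has_condition)
weighted = weighted_proportion(has_condition, weights)
print(unweighted, weighted)
```

In this toy sample the two respondents with the condition carry small weights, so the weighted estimate is lower than the unweighted one; ignoring weights would overstate the prevalence.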
In this article, I present an experiment that can be conducted in a calculus class to investigate the difference quotient and the derivative, using mathematical modeling with student-collected data. I also discuss an extension to the experiment through which students can discover the meaning of parameter estimation in a mathematical model. KEY WORDS: Mathematical Modeling, Data Analysis, Difference Quotient
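A minimal sketch of the kind of calculation such an experiment might involve: difference quotients computed from measured (time, value) pairs as estimates of the derivative. The data below are made up (they follow y = t²) and the function name is hypothetical; the article's actual experiment is not described in this abstract.

```python
def difference_quotients(t, y):
    """Forward difference quotients (y[i+1] - y[i]) / (t[i+1] - t[i])."""
    return [(y[i + 1] - y[i]) / (t[i + 1] - t[i]) for i in range(len(t) - 1)]


# Made-up measurements following y = t**2, whose derivative is 2t.
times = [0, 1, 2, 3]
values = [0, 1, 4, 9]
print(difference_quotients(times, values))  # [1.0, 3.0, 5.0]
```

Each quotient is the slope of the secant line over one measurement interval; with more closely spaced measurements, the quotients approach the derivative 2t at each point.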
A method for detecting linkage between a genetic marker and a quantitative trait in sibship data is presented that models the dependence structure within families. A computationally efficient algorithm is given for the application of generalized least squares (GLS) to the regression procedure developed by Haseman and Elston (1972). The null distribution of the test statistic based on the GLS estimator of the Haseman-Elston regression coefficient is studied. The distribution is significantly skewed for studies with a small number of families when only a baseline correlation between sib pairs that share a sibling is incorporated. However, the observed significance levels for the test are not far from nominal levels when this correlation is estimated using an intraclass correlation coefficient.
A common practice among researchers performing linkage studies is the use of equal allele frequencies as input when reporting p-values from computer linkage programs such as S.A.G.E. SIBPAL. Our results, using 5,000 sets from a uniform prior distribution of allele frequencies, showed that such input may be problematic. Further, we found that the S.A.G.E. SIBPAL test for proportion of alleles shared identical by descent among concordantly affected sib pairs showed a greater percentage of significant p-values with decreasing parental genotype information (Table III), while the S.A.G.E. SIBPAL Haseman-Elston test produced significant p-values comparatively less frequently (Table IV).
Techniques that test for linkage between a marker and a trait locus based on the regression methods proposed by Haseman and Elston (1972) involve testing a null hypothesis of no linkage by examination of the regression coefficient. Modified Haseman-Elston methods accomplish this using ordinary least squares (OLS), weighted least squares (WLS) in which weights are reciprocals of estimated variances, and generalized estimating equations (GEE). Methods implementing the WLS and GEE currently use a diagonal covariance matrix, thus incorrectly treating the squared trait differences of two sib-pairs within a family as uncorrelated. Correctly specifying the correlations between sib-pairs in a family yields the best linear unbiased estimator of the regression coefficient (Scheffe, 1959). This estimator will be referred to as the generalized least squares (GLS) estimator. We determined the null variance of the GLS estimator and the null variance of the WLS/OLS estimator. The correct null variance of the WLS/OLS estimate of the Haseman-Elston regression coefficient may be either larger or smaller than the variance of the WLS/OLS estimate calculated assuming that the squared sib-pair differences are uncorrelated. For a fully informative marker locus the gain in efficiency using GLS rather than WLS/OLS under the null hypothesis is approximately 10% in a multifamily study with three siblings per family and 25% for families with four siblings each.
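For orientation, the basic Haseman-Elston regression that these methods build on can be sketched as follows: the squared trait difference of each sib pair is regressed on the proportion of marker alleles the pair shares identical by descent (IBD), and a negative slope is taken as evidence of linkage. The sketch uses ordinary least squares only; it does not implement the GLS estimator discussed above, and the data and function names are made up for illustration.

```python
def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def haseman_elston_ols(pairs):
    """pairs: iterable of (trait_sib1, trait_sib2, ibd_proportion)."""
    ibd = [p[2] for p in pairs]
    squared_diff = [(p[0] - p[1]) ** 2 for p in pairs]
    return ols_slope_intercept(ibd, squared_diff)


# Hypothetical sib pairs: (trait of sib 1, trait of sib 2, IBD proportion).
toy_pairs = [
    (10, 14, 0.0), (12, 9, 0.5), (11, 12, 1.0),
    (8, 12, 0.0), (10, 12, 0.5), (9, 9, 1.0),
]
slope, intercept = haseman_elston_ols(toy_pairs)
print(slope, intercept)  # the slope is negative for these toy data
```

This OLS fit treats every sib pair as independent. When a family contributes more than one pair, their squared differences are correlated, which is the situation the GLS estimator is designed to handle.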
Production or transformation of forms of energy inevitably affects the environment, for example through discharges of pollutants to the air or water and changes in land use. The costs of these impacts are known as external costs or externalities. This report reviews the estimation in the literature of external costs with respect to health effects arising from air pollution, and suggests specific, easily implemented improvements to the model used by Rowe et al. (1994). Here we examine daily mortality studies that focused on particulate matter. In addition we consider studies with health end points other than mortality. This research was sponsored by the Electric Power Research Institute, Palo Alto, CA.
Production or transformation of forms of energy inevitably entails impacts on the environment, including discharges of pollutants to air or water and changes in land use; the costs of these impacts are known as external costs or externalities. This report discusses the problem of estimating external costs arising from ozone air pollution, with respect to health effects resulting in premature mortality. Such estimates have been based on epidemiological studies in which regression analysis is used to estimate a relationship between ozone concentrations and daily mortality counts. The regression coefficient or slope of this relationship is then used as a dose-response function to predict the effects of some future facility, as one element of an externalities model. This research was sponsored by the Electric Power Research Institute, Palo Alto, CA.
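The use of a regression slope as a dose-response function can be illustrated with a toy calculation. The sketch assumes a simple linear relationship between ozone concentration and daily deaths, with made-up numbers; published studies often use other model forms (for example, log-linear models with adjustment for weather and season), and all names here are hypothetical.

```python
def fit_slope(concentrations, daily_deaths):
    """Least-squares slope of daily deaths on pollutant concentration."""
    n = len(concentrations)
    mean_c = sum(concentrations) / n
    mean_d = sum(daily_deaths) / n
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    sxy = sum((c - mean_c) * (d - mean_d)
              for c, d in zip(concentrations, daily_deaths))
    return sxy / sxx


def predicted_excess_deaths(slope, concentration_increase, days=1):
    """Apply the slope as a linear dose-response function."""
    return slope * concentration_increase * days


# Made-up data: ozone concentration (arbitrary units) and daily deaths.
ozone = [20, 40, 60, 80]
deaths = [100, 101, 102, 103]

slope = fit_slope(ozone, deaths)
# Hypothetical facility that raises ozone by 10 units on every day.
print(slope, predicted_excess_deaths(slope, 10))
```

The predicted effect is only as reliable as the estimated slope, so any bias or uncertainty in the regression carries directly into the externality estimate.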