Cambridge: Cambridge University Press, 2016. — 1429 p.
Engaging and accessible to students from a wide variety of mathematical backgrounds, Statistics Using Stata combines the teaching of statistical concepts with the acquisition of the popular Stata software package. It closely aligns Stata commands with numerous examples based on real data, enabling students to develop a deep understanding of statistics in a way that reflects statistical practice. Capitalizing on the fact that Stata offers both a menu-driven 'point-and-click' interface and a command syntax, the text guides students effectively from the comfortable 'point-and-click' environment to the beginnings of statistical programming. Its comprehensive coverage of essential topics gives instructors flexibility in curriculum planning and provides students with more advanced material to prepare them for future work. Online resources - including complete solutions to exercises, PowerPoint slides, and Stata syntax (do-files) for each chapter - allow students to review independently and to adapt the code to solve new problems, reinforcing their programming skills.
Chapter One Introduction
The Role of the Computer in Data Analysis
Statistics: Descriptive and Inferential
Variables and Constants
The Measurement of Variables
Discrete and Continuous Variables
Setting a Context with Real Data
Exercises
Chapter Two Examining Univariate Distributions
Counting the Occurrence of Data Values
When Variables Are Measured at the Nominal Level
Frequency and Percent Distribution Tables
Bar Charts
Pie Charts
When Variables Are Measured at the Ordinal, Interval, or Ratio Level
Frequency and Percent Distribution Tables
Stem-and-Leaf Displays
Histograms
Line Graphs
Describing the Shape of a Distribution
Accumulating Data
Cumulative Percent Distributions
Ogive Curves
Percentile Ranks
Percentiles
Five-Number Summaries and Boxplots
Modifying the Appearance of Graphs
Summary of Graphical Selection
Summary of Stata Commands in Chapter
Commands for Frequency and Percent Distribution Tables
Bar and Pie Graphs
Stem-and-Leaf Displays
Histograms
Line Graphs
Percentiles
Boxplots
Exercises
Chapter Three Measures of Location, Spread, and Skewness
Characterizing the Location of a Distribution
The Mode
The Median
The Arithmetic Mean
Interpreting the Mean of a Dichotomous Variable
The Weighted Mean
Comparing the Mode, Median, and Mean
Characterizing the Spread of a Distribution
The Range and Interquartile Range
The Variance
The Standard Deviation
Characterizing the Skewness of a Distribution
Selecting Measures of Location and Spread
Applying What We Have Learned
Summary of Stata Commands in Chapter
The Stata Command
Stata TIPS
Exercises
Chapter Four Reexpressing Variables
Linear and Nonlinear Transformations
Linear Transformations: Addition, Subtraction, Multiplication, and Division
The Effect on the Shape of a Distribution
The Effect on Summary Statistics of a Distribution
Common Linear Transformations
Standard Scores
z-Scores
Using z-Scores to Detect Outliers
Using z-Scores to Compare Scores in Different Distributions
Relating z-Scores to Percentile Ranks
Nonlinear Transformations: Square Roots and Logarithms
Nonlinear Transformations: Ranking Variables
Other Transformations: Recoding and Combining Variables
Recoding Variables
Combining Variables
Data Management Fundamentals – the Do-File
Summary of Stata Commands in Chapter
Exercises
Chapter Five Exploring Relationships between Two Variables
When Both Variables Are at Least Interval-Leveled
Scatterplots
The Pearson Product Moment Correlation Coefficient
Judging the Strength of the Linear Relationship
The Correlation Scale Itself Is Ordinal
Correlation Does Not Imply Causation
The Effect of Linear Transformations
Restriction of Range
The Reliability of the Data
When at Least One Variable Is Ordinal and the Other Is at Least Ordinal: The Spearman Rank Correlation Coefficient
When at Least One Variable Is Dichotomous: Other Special Cases of the Pearson Correlation Coefficient
The Point Biserial Correlation Coefficient: The Case of One at Least Interval and One Dichotomous Variable
The Phi Coefficient: The Case of Two Dichotomous Variables
Other Visual Displays of Bivariate Relationships
Summary of Stata Commands in Chapter
Exercises
Chapter Six Simple Linear Regression
The “Best-Fitting” Linear Equation
The Accuracy of Prediction Using the Linear Regression Model
The Standardized Regression Equation
R as a Measure of the Overall Fit of the Linear Regression Model
Simple Linear Regression When the Independent Variable Is Dichotomous
Using r and R as Measures of Effect Size
Emphasizing the Importance of the Scatterplot
Summary of Stata Commands in Chapter
Exercises
Chapter Seven Probability Fundamentals
The Discrete Case
The Complement Rule of Probability
The Additive Rules of Probability
First Additive Rule of Probability
Second Additive Rule of Probability
The Multiplicative Rule of Probability
The Relationship between Independence and Mutual Exclusivity
Conditional Probability
The Law of Large Numbers
Exercises
Chapter Eight Theoretical Probability Models
The Binomial Probability Model and Distribution
The Applicability of the Binomial Probability Model
The Normal Probability Model and Distribution
Summary of Chapter Stata Commands
Exercises
Chapter Nine The Role of Sampling in Inferential Statistics
Samples and Populations
Random Samples
Obtaining a Simple Random Sample
Sampling with and without Replacement
Sampling Distributions
Describing the Sampling Distribution of Means Empirically
Describing the Sampling Distribution of Means Theoretically: The Central Limit Theorem
Central Limit Theorem (CLT)
Estimators and Bias
Summary of Chapter Stata Commands
Exercises
Chapter Ten Inferences Involving the Mean of a Single Population When σ Is Known
Estimating the Population Mean, µ, When the Population Standard Deviation, σ, Is Known
Interval Estimation
Relating the Length of a Confidence Interval, the Level of Confidence, and the Sample Size
Hypothesis Testing
The Relationship between Hypothesis Testing and Interval Estimation
Effect Size
Type II Error and the Concept of Power
Increasing the Level of Significance, α
Increasing the Effect Size, δ
Decreasing the Standard Error of the Mean, σx̄
Closing Remarks
Summary of Chapter Stata Commands
Exercises
Chapter Eleven Inferences Involving the Mean When σ Is Not Known: One- and Two-Sample Designs
Single Sample Designs When the Parameter of Interest Is the Mean and σ Is Not Known
The t Distribution
Degrees of Freedom for the One Sample t-Test
Violating the Assumption of a Normally Distributed Parent Population in the One Sample t-Test
Confidence Intervals for the One Sample t-Test
Hypothesis Tests: The One Sample t-Test
Effect Size for the One Sample t-Test
Two Sample Designs When the Parameter of Interest Is µ, and σ Is Not Known
Independent (or Unrelated) and Dependent (or Related) Samples
Independent Samples t-Test and Confidence Interval
The Assumptions of the Independent Samples t-Test
Effect Size for the Independent Samples t-Test
Paired Samples t-test and Confidence Interval
The Assumptions of the Paired Samples t-Test
Effect Size for the Paired Samples t-Test
The Bootstrap
Summary of Chapter Stata Commands
Commands Involving the t-Distribution
One Sample t-Test Commands
Independent Samples t-Test Commands
Paired Samples t-Test Commands
Exercises
Chapter Twelve Research Design: Introduction and Overview
Questions and Their Link to Descriptive, Relational, and Causal Research Studies
The Need for a Good Measure of Our Construct, Weight
The Descriptive Study
From Descriptive to Relational Studies
From Relational to Causal Studies
The Gold Standard of Causal Studies: The True Experiment and Random Assignment
Comparing Two Kidney Stone Treatments Using a Non-randomized Controlled Study
Including Blocking in a Research Design
Underscoring the Importance of Having a True Control Group Using Randomization
Analytic Methods for Bolstering Claims of Causality from Observational Data (Optional Reading)
Quasi-Experimental Designs
Threats to the Internal Validity of a Quasi-Experimental Design
Threats to the External Validity of a Quasi-Experimental Design
Threats to the Validity of a Study: Some Clarifications and Caveats
Threats to the Validity of a Study: Some Examples
Exercises
Chapter Thirteen One-Way Analysis of Variance
The Disadvantage of Multiple t-Tests
The One-Way Analysis of Variance
A Graphical Illustration of the Role of Variance in Tests on Means
ANOVA as an Extension of the Independent Samples t-Test
Developing an Index of Separation for the Analysis of Variance
Carrying Out the ANOVA Computation
The Between Group Variance (MSB)
The Within Group Variance (MSW)
The Assumptions of the One-way ANOVA
Testing the Equality of Population Means: The F-Ratio
How to Read the Tables and Use Stata Functions for the F-Distribution
ANOVA Summary Table
Measuring the Effect Size
Post-hoc Multiple Comparison Tests
The Bonferroni Adjustment: Testing Planned Comparisons
The Bonferroni Tests on Multiple Measures
Summary of Stata Commands in Chapter
Exercises
Chapter Fourteen Two-Way Analysis of Variance
The Two-Factor Design
The Concept of Interaction
The Hypotheses That Are Tested by a Two-Way Analysis of Variance
Assumptions of the Two-Way Analysis of Variance
Balanced versus Unbalanced Factorial Designs
Partitioning the Total Sum of Squares
Using the F-Ratio to Test the Effects in Two-Way ANOVA
Carrying Out the Two-Way ANOVA Computation by Hand
Decomposing Score Deviations about the Grand Mean
Modeling Each Score as a Sum of Component Parts
Explaining the Interaction as a Joint (or Multiplicative) Effect
Measuring Effect Size
Fixed versus Random Factors
Post-hoc Multiple Comparison Tests
Summary of Steps to Be Taken in a Two-Way ANOVA Procedure
Summary of Stata Commands in Chapter
Exercises
Chapter Fifteen Correlation and Simple Regression as Inferential Techniques
The Bivariate Normal Distribution
Testing Whether the Population Pearson Product Moment Correlation Equals Zero
Using a Confidence Interval to Estimate the Size of the Population Correlation Coefficient, ρ
Revisiting Simple Linear Regression for Prediction
Estimating the Population Standard Error of Prediction, σY|X
Testing the b-Weight for Statistical Significance
Explaining Simple Regression Using an Analysis of Variance Framework
Measuring the Fit of the Overall Regression Equation: Using R and R²
Relating R² to σY|X
Testing R² for Statistical Significance
Estimating the True Population R²: The Adjusted R²
Exploring the Goodness of Fit of the Regression Equation: Using Regression Diagnostics
Residual Plots: Evaluating the Assumptions Underlying Regression
Detecting Influential Observations: Discrepancy and Leverage
Using Stata to Obtain Leverage
Using Stata to Obtain Discrepancy
Using Stata to Obtain Influence
Using Diagnostics to Evaluate the Ice Cream Sales Example
Using the Prediction Model to Predict Ice Cream Sales
Simple Regression When the Predictor Is Dichotomous
Summary of Stata Commands in Chapter
Exercises
Chapter Sixteen An Introduction to Multiple Regression
The Basic Equation with Two Predictors
Equations for b, β, and R² When the Predictors Are Not Correlated
Equations for b, β, and R² When the Predictors Are Correlated
Summarizing and Expanding on Some Important Principles of Multiple Regression
Testing the b-Weights for Statistical Significance
Assessing the Relative Importance of the Independent Variables in the Equation
Measuring the Drop in R² Directly: An Alternative to the Squared Semipartial Correlation
Evaluating the Statistical Significance of the Change in R²
The b-Weight as a Partial Slope in Multiple Regression
Multiple Regression When One of the Two Independent Variables Is Dichotomous
The Concept of Interaction between Two Variables That Are at Least Interval-Leveled
Testing the Statistical Significance of an Interaction Using Stata
Centering First-Order Effects to Achieve Meaningful Interpretations of b-Weights
Understanding the Nature of a Statistically Significant Two-Way Interaction
Interaction When One of the Independent Variables Is Dichotomous and the Other Is Continuous
Summary of Stata Commands in Chapter
Exercises
Chapter Seventeen Nonparametric Methods
Parametric versus Nonparametric Methods
Nonparametric Methods When the Dependent Variable Is at the Nominal Level
The Chi-Square Distribution (χ²)
The Chi-Square Goodness-of-Fit Test
The Chi-Square Test of Independence
Assumptions of the Chi-Square Test of Independence
Fisher’s Exact Test
Calculating the Fisher Exact Test by Hand Using the Hypergeometric Distribution
Nonparametric Methods When the Dependent Variable Is Ordinal-Leveled
Wilcoxon Sign Test
The Mann-Whitney U Test
The Kruskal-Wallis Analysis of Variance
Summary of Stata Commands in Chapter
Exercises
Appendices
Appendix A Data Set Descriptions
Anscombe
Basketball
Blood
Brainsz
Currency
Exercise, Food Intake, and Weight Loss
Framingham
Hamburg
Ice Cream
Impeach
Learndis
Mandex
Marijuana
Nels
States
Stepping
Temp
Wages
Appendix B Stata do Files and Data Sets in Stata Format
Appendix C Statistical Tables
Appendix D References
Appendix E Solutions