class:title-slide-custom <style> /* colors: #EEB422, #8B0000, #191970, #00a8cc */ /* define the new color palette here! */ a, a > code { color: #8B0000; text-decoration: none; } .title-slide h2::after, .mline h1::after { content: ''; display: block; border: none; background-color: #8B0000; color: #8B0000; height: 2px; } .remark-slide-content { background-color: #FFFFFF; border-top: 80px solid #8B0000; font-size: 20px; font-weight: 300; line-height: 1.5; <!-- padding: 1em 2em 1em 2em --> background-image: url(css/UNL.svg); background-position: 2% 98%; background-size: 10%; border-bottom: 0; } .inverse { background-color: #8B0000; <!-- border-top: 20px solid #696969; --> <!-- background-image: none; --> <!-- background-position: 50% 75%; --> <!-- background-size: 150px; --> } .remark-slide-content > h1 { font-family: 'Roboto'; font-weight: 300; font-size: 45px; margin-top: -95px; margin-left: -00px; color: #FFFFFF; } .title-slide { background-color: #FFFFFF; <!-- border-left: 80px solid #8B0000; --> background-image: url(css/UNL.svg); background-position: 98% 98%; <!-- background-attachment: fixed, fixed; --> background-size: 20%; border-bottom: 0; border: 10px solid #8B0000; <!-- background: transparent; --> } .title-slide > h1 { color: #111111; font-size: 32px; text-shadow: none; font-weight: 500; text-align: left; margin-left: 15px; padding-top: 80px; } .title-slide > h2 { margin-top: -25px; padding-bottom: -20px; color: #111111; text-shadow: none; font-weight: 100; font-size: 28px; text-align: left; margin-left: 15px; } .title-slide > h3 { color: #111111; text-shadow: none; font-weight: 100; font-size: 28px; text-align: left; margin-left: 15px; margin-bottom: -20px; } body { font-family: 'Roboto'; font-weight: 300; } .remark-slide-number { font-size: 13pt; font-family: 'Roboto'; color: #272822; opacity: 1; } .inverse .remark-slide-number { font-size: 13pt; font-family: 'Roboto'; color: #FAFAFA; opacity: 1; } .title-slide-custom .remark-slide-number { display: 
none; } .title-slide-custom h3::after, .mline h1::after { content: ''; display: block; border: none; background-color: #8B0000; color: #8B0000; height: 2px; } .title-slide-custom { background-color: #FFFFFF; <!-- border-left: 80px solid #8B0000; --> background-image: url(css/UNL.svg); background-position: 98% 98%; <!-- background-attachment: fixed, fixed; --> background-size: 20%; border-bottom: 0; border: 10px solid #8B0000; <!-- background: transparent; --> } .title-slide-custom > h1 { color: #111111; font-size: 40px; text-shadow: none; font-weight: 500; text-align: left; margin-left: 15px; padding-top: 80px; padding-bottom: 10px; } .title-slide-custom > h2 { margin-top: -25px; padding-bottom: 30px; color: #111111; text-shadow: none; font-weight: 100; font-size: 32px; text-align: left; margin-left: 15px; } .title-slide-custom > h3 { margin-top: -25px; padding-bottom: -25px; color: #111111; text-shadow: none; font-weight: 100; font-size: 32px; text-align: left; margin-left: 15px; } .title-slide-custom > h4 { color: #111111; text-shadow: none; font-weight: 100; font-size: 28px; text-align: left; margin-left: 15px; margin-bottom: -30px; padding-bottom: -25px; } .title-slide-custom > h5 { color: #111111; text-shadow: none; font-weight: 100; font-size: 24px; text-align: left; margin-left: 15px; margin-bottom: -40px; } <!-- img { --> <!-- max-width: 50%; --> <!-- } --> </style> <br><br> # Can 'You Draw It'? Eye Fitting Straight Lines in the Modern Era ## ISU Graphics Group ### October 14, 2021 #### Emily Robinson #### Department of Statistics, University of Nebraska - Lincoln <!-- #####
[emily.robinson@huskers.unl.edu](emily.robinson@huskers.unl.edu) --> <!-- #####
[www.emilyarobinson.com](https://www.emilyarobinson.com/) --> <!-- #####
[earobinson95](https://github.com/earobinson95) --> ??? Thank you, everyone, for coming! I am a PhD candidate in the Department of Statistics at the University of Nebraska - Lincoln. I will be presenting on human perception of statistical charts, giving an overview of current graphical testing methods, and then introducing the research I am currently conducting in graphical testing. --- class:primary # Testing Statistical Graphics Evaluate design choices and understand cognitive biases through the use of **visual tests**. Could ask participants to: 📊 identify differences in graphs. 📖 read information off of a chart accurately. 🌎 use data to make correct real-world decisions. ✏️ predict the next few observations. ??? One way we can evaluate these design choices is through the use of graphical tests. We could ask participants to: - identify differences in graphs. - read information off of a chart accurately. - use data to make correct real-world decisions. - predict the next few observations. All of these types of tests require different levels of use and manipulation of the information presented in the chart. --- class:primary # Lineup Protocol .pull-left[ Introduced in Buja, Cook, Hofmann, et al. (2009). Embed a *target plot* (actual data) in a set of *null plots* (data generated under the null distribution). ].pull-right[ <!-- Trigger the Modal --> <img id='imglineupprotocol' src='images/lineup-protocol.png' alt=' ' width='100%'> <!-- The Modal --> <div id='modallineupprotocol' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodallineupprotocol'> <!-- Modal Caption (Image Text) --> <div id='captionlineupprotocol' class='modal-caption'></div> </div> ] ??? Efforts in the field of graphics have developed graphical testing tools and methods such as the lineup protocol to provide a framework for inferential testing. When inspecting a plot, how do we know if what we are seeing is actually there? 
One way of answering this question is to embed the true data plot (called the target plot) into a set of plots of randomly permuted data (called null plots). This is what we call a lineup. This is similar to the law-enforcement procedure of lining up a suspect among a set of innocents to check if a victim can identify the suspect as the perpetrator of the crime. Here, visual evaluation of the lineup is conducted by a person. If the viewers detect the target plot, we can conclude the plots are distinguishable. The lineup protocol is one such example of the development of tools designed for statistical graphical testing. The advancement of graphing software provides the tools necessary to develop new methods of testing graphics. --- class:primary # Linear Regression The principle of simple linear regression is to find the line (i.e., determine its equation) which passes as close as possible to the observations, that is, the set of points. <img src="index_files/figure-html/linear-regression-1.png" width="70%" style="display: block; margin: auto;" /> ??? Linear regression is a statistical approach that allows us to assess the linear relationship between two quantitative variables. --- class:primary # Linear Regression The principle of simple linear regression is to **find the line** (i.e., determine its equation) which passes as close as possible to the observations, that is, the set of points. <img src="index_files/figure-html/linear-regression2-1.png" width="70%" style="display: block; margin: auto;" /> ??? The principle of simple linear regression is to **find the line** (i.e., determine its equation) which passes as close as possible to the observations, that is, the set of points. --- class:primary # Let's see your drawing skills! 
.pull-left[ .center[ <img src="images/eyefitting_example.gif" width="100%"/> ] ].pull-right[ .center[ <img src="images/can-you-draw-it-mobile-qrcode.png" width="88%"/> <font size="6"> **SCAN ME** OR VISIT [bit.ly/3BF56Zj](https://bit.ly/3BF56Zj) </font> ] <!-- <a rel='nofollow' href='https://www.qr-code-generator.com' border='0' style='cursor:default'><img src='https://chart.googleapis.com/chart?cht=qr&chl=https%3A%2F%2Femily-robinson.shinyapps.io%2Fcan-you-draw-it-mobile%2F&chs=180x180&choe=UTF-8&chld=L|2' alt=''></a> --> ] --- class:primary # Did it look something like this? .center[ <img src="images/can-you-draw-it-example.png" width="60%"/> ] --- class:primary # Linear Regression The principle of simple linear regression is to **find the line** (i.e., determine its equation) **which passes as close as possible to the observations**, that is, the set of points. .center[ <img src="images/pca-plot.jpg" width="80%"/> ] -- **Big Idea:** How do statistical regression results compare to intuitive, visually fitted results? ??? We are going to focus on two regression lines: one determined by ordinary least squares (OLS) regression and one based on the principal axis. The figure illustrates the difference between an OLS regression line, which minimizes the vertical distance of points from the line, and a regression line based on the principal axis (Principal Component), which minimizes the Euclidean (orthogonal) distance of points from the line. Fitting the principal axis is an example of what we refer to as “ensemble perception”, in which the visual system computes averages of various features in parallel across the items in a set (in this case, over the x and y-axes). **Big Idea:** How do statistical regression results compare to intuitive, visually fitted results? --- class:primary # Eye Fitting Straight Lines ## Mosteller, Siegel, Trapido, et al. (1981) .pull-left[ + **Big Idea:** Students fitted lines by eye to four sets of points. 
+ **Method:** 8.5 x 11 inch transparency with a straight line etched across the middle. + **Sample:** 153 graduate students and post docs in Introductory Biostatistics. + **Experimental Design:** Latin square. + **Findings:** Students tended to fit the slope of the first principal component. ].pull-right[ <img src="images/eyefitting-straight-lines-plots.png" width="95%"/> ] ??? I want to introduce a study conducted in 1981 called Eye Fitting Straight Lines by Mosteller et al. In this study: + Students fitted lines by eye to four sets of points. + The tool was an 8.5 x 11 inch transparency with a straight line etched across the middle. + The sample was 153 graduate students and post docs in Introductory Biostatistics. + The experimental design was a Latin square. + Students tended to fit the slope of the first principal component or major axis (the line that minimizes the sum of squares of perpendicular rather than vertical distances). --- class:primary # 'You Draw It' Feature ## (New York Times, 2015) .pull-left[ <img src="images/nyt-caraccidents-frame4.png" width="100%"/> .center[ (Katz, 2017) ] ].pull-right[ Readers are asked to input their own assumptions about various metrics and compare how these assumptions relate to reality. + [Family Income affects college chances](https://www.nytimes.com/interactive/2015/05/28/upshot/you-draw-it-how-family-income-affects-childrens-college-chances.html) (Aisch, Cox, and Quealy, 2015) + [Just How Bad Is the Drug Overdose Epidemic?](https://www.nytimes.com/interactive/2017/04/14/upshot/drug-overdose-epidemic-you-draw-it.html) (Katz, 2017) + [What Got Better or Worse During Obama’s Presidency](https://www.nytimes.com/interactive/2017/01/15/us/politics/you-draw-obama-legacy.html?_r=0) (Buchanan, Park, and Pearce, 2017) ] ??? In 2015, the New York Times developed a 'You Draw It' feature where readers are asked to input their own assumptions about various metrics and compare how these assumptions relate to reality. 
The New York Times team utilizes **Data Driven Documents (D3)**, which allows readers to predict these metrics by drawing a line on their computer screen with their mouse. --- class:primary # Research Objectives 1. Validate ‘You Draw It’ as a method for graphical testing, comparing results to the less technological method utilized in Mosteller et al. (1981). 2. Extend the study with formal statistical analysis methods in order to better understand the perception of linear regression. ??? The two objectives of my current research are to: 1. Validate ‘You Draw It’ as a method for graphical testing, comparing results to the less technological method utilized in Mosteller et al. (1981). 2. Extend the study with formal statistical analysis methods in order to better understand the perception of linear regression. --- class:primary # 'You Draw It' Task Study Participant Prompt: *Use your mouse to fill in the trend in the yellow box region.* .center[ <img src="images/eyefitting_example.gif" width="60%"/> ] ??? Here we see an example of a "You Draw It" task plot used in the study. Participants are prompted to "Use your mouse to fill in the trend in the yellow box region. The yellow box region moves along as the participant draws their trend-line until the yellow region disappears." Task plots were created using Data Driven Documents (D3), a JavaScript-based graphing framework that facilitates user interaction. We then integrate this into RShiny using the r2d3 package. --- class:primary # Data Generation .pull-left[ `\(N = 30\)` points `\((x_i, y_i), i = 1,...N\)` were generated for `\(x_i \in [x_{min}, x_{max}]\)`. Data were simulated based on a linear model with additive errors: `\begin{equation} y_i = \beta_0 + \beta_1 x_i + e_i \end{equation}` where `\(e_i \sim N(0, \sigma^2).\)` Parameters `\(\beta_0\)` and `\(\beta_1\)` were selected to reflect the four data sets used in Mosteller, Siegel, Trapido, et al. (1981). 
].pull-right[ <!-- Trigger the Modal --> <img id='imgeyefittingexamplesimplot' src='images/eyefitting-example-simplot.png' alt=' ' width='100%'> <!-- The Modal --> <div id='modaleyefittingexamplesimplot' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodaleyefittingexamplesimplot'> <!-- Modal Caption (Image Text) --> <div id='captioneyefittingexamplesimplot' class='modal-caption'></div> </div> ] ??? Data were generated following a linear model with additive errors. Model equation parameters, `\(\beta_0\)` and `\(\beta_1\)`, were selected to reflect the four data sets (F, N, S, and V) used in Mosteller et al. (1981). + **S:** positive slope; small variance; `\(x \in [0, 20]\)`. + **F:** positive slope; large variance; `\(x \in [0, 20]\)`. + **V:** steep positive slope; small variance; `\(x \in [4, 16]\)`. + **N:** negative slope; large variance; `\(x \in [0, 20]\)`. --- class:primary # Study Design + Participants recruited through Twitter, Reddit, and direct email in May 2021. + A total of 35 individuals completed 119 unique 'You Draw It' task plots. + Data sets were generated randomly, independently for each participant at the start of the experiment. + Participants shown 2 practice plots followed by 4 task plots randomly assigned for each individual in a completely randomized design. + Experiment conducted and distributed through an RShiny application found [**here**](https://emily-robinson.shinyapps.io/you-draw-it-pilot-app/). ??? Participants were recruited through Twitter, Reddit, and direct email in May 2021. The experiment was conducted and distributed through an RShiny application. Participants were first shown 2 practice plots followed by the 4 'You Draw It' task plots randomly assigned for each individual in a completely randomized design. 
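As a rough illustration of the data generation step (a linear model with additive normal errors, simulated fresh for each participant), here is a minimal sketch in Python. The study itself was run in R and D3; the function name `simulate_linear`, the evenly spaced x-values, and the parameter values below are all assumptions made for illustration, not the values used in the experiment.

```python
import random


def simulate_linear(beta0, beta1, sigma, x_min, x_max, n=30, seed=None):
    """Simulate n points from y_i = beta0 + beta1 * x_i + e_i with e_i ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    # Evenly spaced x-values over [x_min, x_max]; this spacing is an assumption,
    # since the slides only state the range of x.
    xs = [x_min + i * (x_max - x_min) / (n - 1) for i in range(n)]
    # Additive normal errors on the response.
    ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


# Hypothetical parameter choice (positive slope, large variance, x in [0, 20]).
xs, ys = simulate_linear(beta0=1.0, beta1=0.5, sigma=2.0, x_min=0.0, x_max=20.0, seed=1)
```

Passing a different `seed` for each participant would mimic generating an independent data set at the start of each session.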
--- class:primary # Model Data .pull-left[ For each participant, the final data set used for analysis contains: + `\(x_{ijk}\)`, `\(y_{ijk,drawn}\)`, `\(\hat y_{ijk,OLS}\)`, `\(\hat y_{ijk,PCA}\)` for + parameter choice `\(i = 1,2,3,4\)`, + participant `\(j = 1,...N_{participant}\)`, and + `\(x_{ijk}\)` value corresponding to increment `\(k = 1, ...,4 x_{max} + 1\)`. **Vertical residuals** between the drawn and fitted values were calculated as: + `\(e_{ijk,OLS} = y_{ijk,drawn} - \hat y_{ijk,OLS}\)` + `\(e_{ijk,PCA} = y_{ijk,drawn} - \hat y_{ijk,PCA}\)`. ].pull-right[ <!-- Trigger the Modal --> <img id='imgeyefittingtrialplot' src='images/eyefitting-trial-plot.png' alt=' ' width='100%'> <!-- The Modal --> <div id='modaleyefittingtrialplot' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodaleyefittingtrialplot'> <!-- Modal Caption (Image Text) --> <div id='captioneyefittingtrialplot' class='modal-caption'></div> </div> ] ??? We compare the participant drawn line to two regression lines: one determined by ordinary least squares regression and one based on the principal axis. The figure illustrates the difference between an OLS regression line, which minimizes the vertical distance of points from the line, and a regression line based on the principal axis (Principal Component), which minimizes the Euclidean (orthogonal) distance of points from the line. Here we see an example of the feedback data from one 'You Draw It' plot. For 0.25 increments across the domain, we have the participant drawn values, the fitted values from the ordinary least squares regression, and the fitted values from the regression based on the principal axis. We are mainly interested in the deviation of the participant drawn line from the fitted regression lines. So while it seems counter-intuitive, the residual actually becomes our response in this case. 
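To make the two reference lines and the vertical residuals concrete, the following Python sketch computes an OLS line, a principal axis line, and the drawn-minus-fitted residuals on a 0.25 grid. It is a sketch under stated assumptions: the principal axis is taken from the unscaled sample covariance of x and y (the slides do not say whether the variables were scaled), and the helper names `ols_fit`, `principal_axis_fit`, `make_grid`, and `vertical_residuals` are hypothetical.

```python
import math


def _moments(xs, ys):
    """Return means and centered sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return mx, my, sxx, syy, sxy


def ols_fit(xs, ys):
    """Intercept and slope minimizing the sum of squared vertical distances."""
    mx, my, sxx, _, sxy = _moments(xs, ys)
    slope = sxy / sxx
    return my - slope * mx, slope


def principal_axis_fit(xs, ys):
    """Intercept and slope of the first principal axis (orthogonal distances).

    Assumption: computed from the unscaled covariance of x and y.
    """
    mx, my, sxx, syy, sxy = _moments(xs, ys)
    if sxy == 0:
        raise ValueError("principal axis slope is undefined when x and y are uncorrelated")
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return my - slope * mx, slope


def make_grid(x_max):
    """x-values in 0.25 increments from 0 to x_max, giving 4 * x_max + 1 values."""
    return [k * 0.25 for k in range(int(4 * x_max) + 1)]


def vertical_residuals(grid, drawn, intercept, slope):
    """Residuals e = y_drawn - y_fitted at each grid value."""
    return [y - (intercept + slope * x) for x, y in zip(grid, drawn)]


# Small made-up data set: the two fits differ once there is scatter.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 2.0, 1.0, 3.0, 4.0]
ols_intercept, ols_slope = ols_fit(xs, ys)
pca_intercept, pca_slope = principal_axis_fit(xs, ys)
```

In this sketch the principal axis slope is at least as steep as the OLS slope whenever the points are not perfectly collinear, which is the gap the residual comparison is designed to detect.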
--- class:primary # Linear Trend Constraint The **Linear Mixed Model** equation for the residuals of each fit (OLS and PCA) is given by: `\begin{equation} e_{ijk,fit} = \left[\gamma_0 + \alpha_i\right] + \left[\gamma_{1} x_{ijk} + \gamma_{2i} x_{ijk}\right] + p_{j} + \epsilon_{ijk} \end{equation}` where + `\(e_{ijk,fit}\)` is the residual between the drawn and fitted y-values for the `\(i^{th}\)` parameter choice, `\(j^{th}\)` participant, and `\(k^{th}\)` increment of x-value corresponding to either the OLS or PCA fit + `\(\gamma_0\)` is the overall intercept + `\(\alpha_i\)` is the effect of the `\(i^{th}\)` parameter choice (F, S, V, N) on the intercept + `\(\gamma_1\)` is the overall slope for `\(x\)` + `\(\gamma_{2i}\)` is the effect of the parameter choice on the slope + `\(x_{ijk}\)` is the x-value for the `\(i^{th}\)` parameter choice, `\(j^{th}\)` participant, and `\(k^{th}\)` increment + `\(p_{j} \sim N(0, \sigma^2_{participant})\)` is the random error due to the `\(j^{th}\)` participant's characteristics + `\(\epsilon_{ijk} \sim N(0, \sigma^2)\)` is the residual error. ??? Using the `lmer` function in the lme4 package, a linear mixed model (LMM) is fit separately to the OLS and PCA residuals, constraining the fit to a linear trend. --- class:primary # Linear Trend Constraint .center[ <img src="images/eyefitting-lmer-plot.png" width="85%"/> ] ??? Results indicate the estimated trends of PCA residuals (orange) appear to align closer to the y = 0 horizontal (dashed) line than the OLS residuals (blue). In particular, this trend is more prominent in parameter choices with large variances (F and N). These results are consistent with those found in Mosteller et al. (1981), indicating participants fit a trend-line closer to the estimated regression line based on the first principal axis than to the estimated OLS regression line. 
--- class:primary # Smoothing Spline Trend The **Generalized Additive Mixed Model** equation for the residuals of each fit (OLS and PCA) is given by: `\begin{equation} e_{ijk,fit} = \alpha_i + s_{i}(x_{ijk}) + p_{j} + s_{j}(x_{ijk}) \end{equation}` where + `\(e_{ijk,fit}\)` is the residual between the drawn and fitted y-values for the `\(i^{th}\)` parameter choice, `\(j^{th}\)` participant, and `\(k^{th}\)` increment of x-value corresponding to either the OLS or PCA fit + `\(\alpha_i\)` is the intercept for the parameter choice `\(i\)` + `\(s_{i}\)` is the smoothing spline for the `\(i^{th}\)` parameter choice + `\(x_{ijk}\)` is the x-value for the `\(i^{th}\)` parameter choice, `\(j^{th}\)` participant, and `\(k^{th}\)` increment + `\(p_{j} \sim N(0, \sigma^2_{participant})\)` is the error due to participant variation + `\(s_{j}\)` is the random smoothing spline for each participant. ??? Eliminating the linear trend constraint, the `bam` function in the mgcv package is used to fit a generalized additive mixed model (GAMM) separately to the OLS and PCA residuals to allow for estimation of smoothing splines. --- class:primary # Smoothing Spline Trend .center[ <img src="images/eyefitting-gamm-plot.png" width="85%"/> ] ??? The results of the GAMM align with those under the linear trend constraint, providing support that for scatter-plots with more noise (F and N), estimated trends of PCA residuals (orange) appear to align closer to the y = 0 horizontal (dashed) line than the OLS residuals (blue). By fitting smoothing splines, we can determine whether participants naturally fit a straight trend-line to the set of points or whether they deviate throughout the domain, providing us with further insight into the curvature humans perceive in a set of points. --- class:primary # Conclusion **Research Objectives:** 1. Validate ‘You Draw It’ as a method for graphical testing, comparing results to the less technological method utilized in Mosteller et al. (1981). 2. 
Extend the study found in Mosteller et al. (1981) with formal statistical analysis methods for understanding the perception of linear regression. **Results:** + Estimated drawn trend-lines followed the regression line based on the principal axis more closely than the OLS regression line. + Most prominent in data simulated with large variances. + Humans perform “ensemble perception” in a statistical graphic setting. **The reproducibility of these results serves as validation of the 'You Draw It' tool and method.** ??? 1. Validate ‘You Draw It’ as a method for graphical testing, comparing results to the less technological method utilized in Mosteller et al. (1981). 2. Extend the study found in Mosteller et al. (1981) with formal statistical analysis methods for understanding the perception of linear regression. **Results:** + Estimated drawn trend-lines followed the principal axis more closely than the OLS regression line. + Most prominent in data simulated with large variances. + Humans perform “ensemble perception” in a statistical graphic setting, as participants minimized the distance from their drawn line over both the x and y axes simultaneously. This study reinforces the differences between intuitive visual model fitting and statistical model fitting, providing information about human perception as it relates to the use of statistical graphics. **The reproducibility of these results serves as validation of the 'You Draw It' tool and method.** --- class:primary # Future Work .center[ <img src="images/loading.gif" width="50%"/> ] ✏️ Implement the 'You Draw It' method in non-linear settings. 📈 Evaluate human ability to extrapolate data from trends.
Use the tool to understand beliefs about real data such as climate change trends.
Develop an R package designed for easy implementation of ‘You Draw It’ task plots. <br> .right-col[ Gif Source: [photobucket.com](http://s280.photobucket.com/user/ariffisariff/media/animated-loading.gif.html) ] ??? ✏️ Implement the 'You Draw It' method in non-linear settings. 📈 Evaluate human ability to extrapolate data from trends.
Use the tool to understand beliefs about real data such as climate change trends.
Develop an R package designed for easy implementation of ‘You Draw It’ task plots. --- class:primary # Future Work .center[ <img src="images/loading.gif" width="50%"/> ] ✏️ **Implement the 'You Draw It' method in non-linear settings.** 📈 **Evaluate human ability to extrapolate data from trends.**
Use the tool to understand beliefs about real data such as climate change trends.
Develop an R package designed for easy implementation of ‘You Draw It’ task plots. <br> .right-col[ Gif Source: [photobucket.com](http://s280.photobucket.com/user/ariffisariff/media/animated-loading.gif.html) ] ??? ✏️ **Implement the 'You Draw It' method in non-linear settings.** 📈 **Evaluate human ability to extrapolate data from trends.**
Use the tool to understand beliefs about real data such as climate change trends.
Develop an R package designed for easy implementation of ‘You Draw It’ task plots. --- class:primary # Logarithmic Scales .center[ <!-- Trigger the Modal --> <img id='imglogscaleexample' src='images/log-scale-example.jpg' alt=' ' width='60%'> <!-- The Modal --> <div id='modallogscaleexample' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodallogscaleexample'> <!-- Modal Caption (Image Text) --> <div id='captionlogscaleexample' class='modal-caption'></div> </div> ] Our perception is **logarithmic at first**, but transitions to a **linear scale later** in development. .center[ <img src="images/log-numberline.png" width="70%"/> ] ??? One problem we face is that data spanning several orders of magnitude, when shown on its original scale, compresses the smaller magnitudes into relatively little area. We can address this problem through the use of a log scale transformation; however, this alters the contextual appearance of the data. In fact, past research has found that our perception is **logarithmic at first**, but transitions to a **linear scale later** in development. For example, a kindergartner asked to place numbers one through ten along a number line would place three close to the middle, following the logarithmic perspective. We have all experienced this when making a poster where we misjudged the space needed and ended up compressing the last few letters onto the poster board. Therefore, if we perceive logarithmically by default, a log scale is a natural (and presumably low effort) way to display information and should be easy to read, understand, and use. --- class:primary # Research Objectives **Big Idea:** Are there benefits to displaying exponentially increasing data on a log scale rather than a linear scale? 1. 
[Perception of Exponential Growth](https://shiny.srvanderplas.com/log-study/) 📈 📈 📈 - Tests an individual's ability to perceptually differentiate exponentially increasing data with differing rates of change on both the linear and log scale. 2. [**Prediction of Exponential Trends**](https://shiny.srvanderplas.com/you-draw-it/) ✏️ - **Tests an individual's ability to make predictions for exponentially increasing data.** 3. Estimation by Numerical Translation 📏 - Tests an individual's ability to translate a graph of exponentially increasing data into real value quantities. ??? One way to evaluate design choices is through the use of graphical tests. We could ask participants to identify differences in graphs, read information off of a chart accurately, use data to make correct real-world decisions, or predict the next few observations. All of these types of tests require different levels of use and manipulation of the information presented in the chart. The main goal of this research is to use graphical tests to determine if there are benefits to displaying exponentially increasing data on a log scale rather than a linear scale. We have developed three graphical tests which address the perception of exponential growth, prediction of exponential growth, and translation from graphical to numerical estimation. --- class:primary # Data Simulation **Point data:** `\(N = 30\)` points `\((x_i, y_i), i = 1,...N\)` were generated for `\(x_i \in [x_{min}, x_{max}]\)`. Data were simulated based on a one parameter exponential model with multiplicative errors: `\begin{equation} y_i = e^{\beta x_i + e_i} \end{equation}` for + growth rate `\(\beta\)` + `\(e_i \sim N(0, \sigma^2)\)` generated by rejection sampling. **Line data:** `\(m = 1,...,4x_{max} + 1\)` fitted values in 0.25 increments across the domain, `\((x_m, \hat y_{m,NLS})\)` A nonlinear least squares regression is then fit to the simulated points: `\begin{equation} \hat y_{m,NLS} = e^{\hat \beta_{NLS} x_m} \end{equation}` ??? 
+ `\(e_i \sim N(0, \sigma^2)\)` are generated by rejection sampling in order to guarantee the points shown align with that of the fitted line displayed in the initial plot frame. Outputs a list of point data and line data both indicating the parameter identification, x-value, and corresponding simulated or fitted y value. --- class:primary # Treatment Design .pull-left[ 2 x 2 x 2 factorial: + **growth rate:** low and high. + **points truncated:** `\(50\%\)` and `\(75\%\)` of the domain. + **scale:** log and linear. Consistent aesthetic design choices: + y-axis extended `\(50\%\)` below and `\(200\%\)` above the simulated data range. + participants begin drawing at `\(50\%\)` of the domain. ].pull-right[ .center[ <!-- Trigger the Modal --> <img id='imglow10linear' src='images/low-10-linear.png' alt='Low Growth Rate, 50% Truncation, Linear Scale' width='30%'> <!-- Trigger the Modal --> <img id='imglow10log' src='images/low-10-log.png' alt='Low Growth Rate, 50% Truncation, Log Scale' width='30%'> <!-- The Modal --> <div id='modallow10linear' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodallow10linear'> <!-- Modal Caption (Image Text) --> <div id='captionlow10linear' class='modal-caption'></div> </div> <!-- The Modal --> <div id='modallow10log' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodallow10log'> <!-- Modal Caption (Image Text) --> <div id='captionlow10log' class='modal-caption'></div> </div> <!-- Trigger the Modal --> <img id='imglow15linear' src='images/low-15-linear.png' alt='Low Growth Rate, 75% Truncation, Linear Scale' width='30%'> <!-- Trigger the Modal --> <img id='imglow15log' src='images/low-15-log.png' alt='Low Growth Rate, 75% Truncation, Log Scale' width='30%'> <!-- The Modal --> <div id='modallow15linear' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodallow15linear'> <!-- Modal Caption (Image Text) --> <div 
id='captionlow15linear' class='modal-caption'></div> </div> <!-- The Modal --> <div id='modallow15log' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodallow15log'> <!-- Modal Caption (Image Text) --> <div id='captionlow15log' class='modal-caption'></div> </div> <!-- Trigger the Modal --> <img id='imghigh10linear' src='images/high-10-linear.png' alt='High Growth Rate, 50% Truncation, Linear Scale' width='30%'> <!-- Trigger the Modal --> <img id='imghigh10log' src='images/high-10-log.png' alt='High Growth Rate, 50% Truncation, Log Scale' width='30%'> <!-- The Modal --> <div id='modalhigh10linear' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodalhigh10linear'> <!-- Modal Caption (Image Text) --> <div id='captionhigh10linear' class='modal-caption'></div> </div> <!-- The Modal --> <div id='modalhigh10log' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodalhigh10log'> <!-- Modal Caption (Image Text) --> <div id='captionhigh10log' class='modal-caption'></div> </div> <!-- Trigger the Modal --> <img id='imghigh15linear' src='images/high-15-linear.png' alt='High Growth Rate, 75% Truncation, Linear Scale' width='30%'> <!-- Trigger the Modal --> <img id='imghigh15log' src='images/high-15-log.png' alt='High Growth Rate, 75% Truncation, Log Scale' width='30%'> <!-- The Modal --> <div id='modalhigh15linear' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodalhigh15linear'> <!-- Modal Caption (Image Text) --> <div id='captionhigh15linear' class='modal-caption'></div> </div> <!-- The Modal --> <div id='modalhigh15log' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodalhigh15log'> <!-- Modal Caption (Image Text) --> <div id='captionhigh15log' class='modal-caption'></div> </div> ]] --- class:primary # Feedback Data .pull-left[ For each participant, the final data set used for analysis contains: 
+ `\(x_{ijklm}\)`, `\(y_{ijklm,drawn}\)`, and `\(\hat y_{ijklm,NLS}\)` for: + growth rate `\(i = 1,2\)`, + point truncation `\(j = 1,2\)`, + scale `\(k = 1,2\)`, + participant `\(l = 1,...N_{participant}\)`, and + `\(x_{ijklm}\)` value `\(m = 1, ...,4 x_{max} + 1\)`. Vertical residuals between the drawn and fitted values were calculated as: + `\(e_{ijklm,NLS} = y_{ijklm,drawn} - \hat y_{ijklm,NLS}\)`. ].pull-right[ <!-- Trigger the Modal --> <img id='imgexpspaghettiplot' src='images/exp-spaghetti-plot.png' alt=' ' width='100%'> <!-- The Modal --> <div id='modalexpspaghettiplot' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodalexpspaghettiplot'> <!-- Modal Caption (Image Text) --> <div id='captionexpspaghettiplot' class='modal-caption'></div> </div> ] --- class:primary # Generalized Additive Mixed Model The GAMM equation for residuals is given by: `\begin{equation} e_{ijklm,NLS} = \tau_{ijk} + s_{ijk}(x_{ijklm}) + p_{l} + s_{l}(x_{ijklm}) \end{equation}` where + `\(e_{ijklm,NLS}\)` is the residual between the drawn y-value and fitted y-value for the `\(l^{th}\)` participant, `\(m^{th}\)` increment, and `\(ijk^{th}\)` treatment combination + `\(\tau_{ijk}\)` is the intercept for the `\(i^{th}\)` growth rate, `\(j^{th}\)` point truncation, and `\(k^{th}\)` scale treatment combination + `\(s_{ijk}\)` is the smoothing spline for the `\(ijk^{th}\)` treatment combination + `\(x_{ijklm}\)` is the x-value for the `\(l^{th}\)` participant, `\(m^{th}\)` increment, and `\(ijk^{th}\)` treatment combination + `\(p_{l} \sim N(0, \sigma^2_{participant})\)` is the error due to the `\(l^{th}\)` participant's characteristics + `\(s_{l}\)` is the random smoothing spline for the `\(l^{th}\)` participant. ??? To allow for flexibility, the `bam` function in the mgcv package is used to fit a GAMM that estimates trends in the vertical residuals of the participant drawn line relative to the NLS fitted values. 
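The NLS fitted values that serve as the baseline for these residuals come from a one parameter exponential model. As a hedged illustration only, the Python sketch below simulates points from `y = exp(beta * x + e)` and estimates the growth rate by Gauss-Newton iterations on the least squares criterion. The rejection sampling step is omitted because the slides do not state its acceptance rule, and the function names, starting value, and parameter values are assumptions rather than the study's implementation (which was in R).

```python
import math
import random


def simulate_exponential(beta, sigma, x_min, x_max, n=30, seed=None):
    """Simulate n points from y_i = exp(beta * x_i + e_i) with e_i ~ N(0, sigma^2).

    Note: the rejection sampling used in the study is not reproduced here.
    """
    rng = random.Random(seed)
    xs = [x_min + i * (x_max - x_min) / (n - 1) for i in range(n)]
    ys = [math.exp(beta * x + rng.gauss(0.0, sigma)) for x in xs]
    return xs, ys


def fit_nls_growth_rate(xs, ys, max_iter=50, tol=1e-10):
    """Estimate beta in y = exp(beta * x) by minimizing sum (y - exp(beta * x))^2."""
    # Starting value: least squares through the origin on the log scale.
    b = sum(x * math.log(y) for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    for _ in range(max_iter):
        fitted = [math.exp(b * x) for x in xs]
        # Gauss-Newton step: J_i = x_i * exp(b * x_i), r_i = y_i - exp(b * x_i).
        num = sum((y - f) * x * f for x, y, f in zip(xs, ys, fitted))
        den = sum((x * f) ** 2 for x, f in zip(xs, fitted))
        step = num / den
        b += step
        if abs(step) < tol:
            break
    return b


# Hypothetical growth rate and error variance.
xs, ys = simulate_exponential(beta=0.1, sigma=0.1, x_min=0.0, x_max=20.0, seed=1)
beta_hat = fit_nls_growth_rate(xs, ys)
# Fitted line in 0.25 increments across the domain.
grid = [k * 0.25 for k in range(int(4 * 20.0) + 1)]
fitted_line = [math.exp(beta_hat * x) for x in grid]
```

Subtracting `fitted_line` from a participant's drawn values at the same grid points would give the vertical residuals that the GAMM takes as its response.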
--- class:primary # GAMM Residual Trend Results .center[ <img src="images/exp-gamm-plot.png" width="80%"/> ] ??? Predictions made on the **linear scale** (blue) deviate from the `\(y=0\)` horizontal (dashed) line `\(\implies\)` **underestimation** of exponential growth. Predictions made on the **log scale** (orange) closely follow the `\(y=0\)` horizontal (dashed) line `\(\implies\)` **more accurate** than trends predicted on the linear scale. This difference is more prominent at high exponential growth rates. Underestimation begins after the aid of points is removed. ??? Indicated by the discrepancy in results for treatments with points truncated at `\(50\%\)` compared to `\(75\%\)` of the domain. --- class:primary # Conclusion **Goal:** Test an individual's ability to make predictions for exponentially increasing data. **Results:** + Predictions made on the log scale were more accurate than those made on the linear scale. + This was most strongly supported for high exponential growth rates. + Points shown along the trend improve predictions. **The results of this study suggest that there are cognitive advantages to log scales when making predictions of exponential trends.** --- class:primary # References <font size="2"> <p><cite>Aisch, G., N. Cohn, A. Cox, et al. (2016). <em>Live Presidential Forecast</em>. URL: <a href="https://www.nytimes.com/elections/2016/forecast/president">https://www.nytimes.com/elections/2016/forecast/president</a>.</cite></p> <p><cite><a id='bib-aisch_cox_quealy_2015'></a><a href="#cite-aisch_cox_quealy_2015">Aisch, G., A. Cox, and K. Quealy</a> (2015). <em>You Draw It: How Family Income Predicts Children's College Chances</em>. 
URL: <a href="https://www.nytimes.com/interactive/2015/05/28/upshot/you-draw-it-how-family-income-affects-childrens-college-chances.html">https://www.nytimes.com/interactive/2015/05/28/upshot/you-draw-it-how-family-income-affects-childrens-college-chances.html</a>.</cite></p> <p><cite><a id='bib-buchanan_park_pearce_2017'></a><a href="#cite-buchanan_park_pearce_2017">Buchanan, L., H. Park, and A. Pearce</a> (2017). <em>You Draw It: What Got Better or Worse During Obama's Presidency</em>. URL: <a href="https://www.nytimes.com/interactive/2017/01/15/us/politics/you-draw-obama-legacy.html">https://www.nytimes.com/interactive/2017/01/15/us/politics/you-draw-obama-legacy.html</a>.</cite></p> <p><cite><a id='bib-buja2009statistical'></a><a href="#cite-buja2009statistical">Buja, A., D. Cook, H. Hofmann, et al.</a> (2009). “Statistical inference for exploratory data analysis and model diagnostics”. In: <em>Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences</em> 367.1906, pp. 4361–4383.</cite></p> <p><cite>Carpenter, P. A. and P. Shah (1998). “A model of the perceptual and conceptual processes in graph comprehension.” In: <em>Journal of Experimental Psychology: Applied</em> 4.2, p. 75.</cite></p> <p><cite>Chandar, N., D. Collier, and P. Miranti (2012). “Graph standardization and management accounting at AT&T during the 1920s”. In: <em>Accounting History</em> 17.1, pp. 35–62.</cite></p> <p><cite>Chong, S. C. and A. Treisman (2003). “Representation of statistical properties”. In: <em>Vision research</em> 43.4, pp. 393–404.</cite></p> <p><cite>— (2005). “Statistical processing: Computing the average size in perceptual groups”. In: <em>Vision research</em> 45.7, pp. 891–900.</cite></p> <p><cite>Ciccione, L. and S. Dehaene (2021). “Can humans perform mental regression on a graph? Accuracy and bias in the perception of scatterplots”. In: <em>Cognitive Psychology</em> 128, p. 101406.</cite></p> <p><cite>Cleveland, W. S. and R. 
McGill (1984). “Graphical perception: Theory, experimentation, and application to the development of graphical methods”. In: <em>Journal of the American statistical association</em> 79.387, pp. 531–554.</cite></p> </font> --- class:primary # References <font size="2"> <p><cite>Cleveland, W. S. and R. McGill (1985). “Graphical perception and graphical methods for analyzing scientific data”. In: <em>Science</em> 229.4716, pp. 828–833.</cite></p> <p><cite>Finney, D. (1951). “Subjective judgment in statistical analysis: An experimental study”. In: <em>Journal of the Royal Statistical Society: Series B (Methodological)</em> 13.2, pp. 284–297.</cite></p> <p><cite>Gouretski, V. and K. P. Koltermann (2007). “How much is the ocean really warming?” In: <em>Geophysical Research Letters</em> 34.1.</cite></p> <p><cite>Green, T. M. and B. Fisher (2009). “The personal equation of complex individual cognition during visual interface interaction”. In: <em>Workshop on Human-Computer Interaction and Visualization</em>. Springer. , pp. 38–57.</cite></p> <p><cite>Harms, H. (1991). “August Friedrich Wilhelm Crome (1753-1833) Autor begehrter Wirtschaftskarten”. In: <em>Cartographica Helvetica</em> 3, pp. 33–38.</cite></p> <p><cite>Hofmann, H., L. Follett, M. Majumder, et al. (2012). “Graphical tests for power comparison of competing designs”. In: <em>IEEE Transactions on Visualization and Computer Graphics</em> 18.12, pp. 2441–2448.</cite></p> <p><cite><a id='bib-katz_2017'></a><a href="#cite-katz_2017">Katz, J.</a> (2017). <em>You Draw It: Just How Bad Is the Drug Overdose Epidemic?</em> URL: <a href="https://www.nytimes.com/interactive/2017/04/14/upshot/drug-overdose-epidemic-you-draw-it.html">https://www.nytimes.com/interactive/2017/04/14/upshot/drug-overdose-epidemic-you-draw-it.html</a>.</cite></p> <p><cite>Lewandowsky, S. and I. Spence (1989). “The perception of statistical graphs”. In: <em>Sociological Methods & Research</em> 18.2-3, pp. 
200–242.</cite></p> <p><cite><a id='bib-mosteller1981eye'></a><a href="#cite-mosteller1981eye">Mosteller, F., A. F. Siegel, E. Trapido, et al.</a> (1981). “Eye fitting straight lines”. In: <em>The American Statistician</em> 35.3, pp. 150–152.</cite></p> <p><cite>Playfair, W. (1801). “The statistical breviary; shewing, on a principle entirely new, the resources of every state and kingdom in Europe, Wallis, Londres”. In: <em>Press, Chicago</em>.</cite></p> </font> --- class:primary # References <font size="2"> <p><cite>Spence, I. (1990). “Visual psychophysics of simple graphical elements.” In: <em>Journal of Experimental Psychology: Human Perception and Performance</em> 16.4, p. 683.</cite></p> <p><cite>Unwin, A. (2020). “Why is data visualization important? what is important in data visualization?” In: <em>Harvard Data Science Review</em> 2.1.</cite></p> <p><cite>Van Opstal, F., F. P. de Lange, and S. Dehaene (2011). “Rapid parallel semantic processing of numbers without awareness”. In: <em>Cognition</em> 120.1, pp. 136–147.</cite></p> <p><cite>Vanderplas, S., D. Cook, and H. Hofmann (2020). “Testing Statistical Charts: What makes a good graph?” In: <em>Annual Review of Statistics and Its Application</em> 7, pp. 61–88.</cite></p> <p><cite>VanderPlas, S. and H. Hofmann (2015). “Spatial reasoning and data displays”. In: <em>IEEE Transactions on Visualization and Computer Graphics</em> 22.1, pp. 459–468.</cite></p> <p><cite>— (2017). “Clusters beat trend!? testing feature hierarchy in statistical graphics”. In: <em>Journal of Computational and Graphical Statistics</em> 26.2, pp. 231–242.</cite></p> <p><cite>Walker, F. A. (2013). <em>Statistical atlas of the United States based on the results of the ninth census 1870 with contributions from many eminent men of science and several departments of the government</em>.</cite></p> <p><cite>Wickham, H. (2011). “ggplot2”. In: <em>Wiley Interdisciplinary Reviews: Computational Statistics</em> 3.2, pp. 
180–185.</cite></p> <p><cite>Wilkinson, L. (2013). <em>The grammar of graphics</em>. Springer Science & Business Media.</cite></p> <p><cite>Yates, J. (1985). “Graphs as a managerial tool: A case study of Du Pont's use of graphs in the early twentieth century”. In: <em>The Journal of Business Communication (1973)</em> 22.1, pp. 5–33.</cite></p> </font> --- class:inverse <br> <br> <br> <br> <br> <br> .center[ # Thank you! <br> <br> <!--
**emily.robinson@huskers.unl.edu** --> <!--
**www.emilyarobinson.com** -->
**earobinson95** ]