class: center, middle, inverse, title-slide # HUMAN PERCEPTION OF EXPONENTIALLY INCREASING DATA DISPLAYED ON A LOG SCALE EVALUATED THROUGH EXPERIMENTAL GRAPHICS TASKS ## Department of Statistics, University of Nebraska - Lincoln ### Emily Anna Robinson ### July 22, 2021 --- <style> /* colors: #EEB422, #8B0000, #191970, #00a8cc */ /* define the new color palette here! */ a, a > code { color: #8B0000; text-decoration: none; } .title-slide h2::after, .mline h1::after { content: ''; display: block; border: none; background-color: #8B0000; color: #8B0000; height: 2px; } .remark-slide-content { background-color: #FFFFFF; border-top: 80px solid #8B0000; font-size: 20px; font-weight: 300; line-height: 1.5; <!-- padding: 1em 2em 1em 2em --> background-image: url(css/UNL.svg); background-position: 2% 98%; background-size: 10%; border-bottom: 0; } .inverse { background-color: #8B0000; <!-- border-top: 20px solid #696969; --> <!-- background-image: none; --> <!-- background-position: 50% 75%; --> <!-- background-size: 150px; --> } .remark-slide-content > h1 { font-family: 'Roboto'; font-weight: 300; font-size: 45px; margin-top: -95px; margin-left: -00px; color: #FFFFFF; } .title-slide { background-color: #FFFFFF; <!-- border-left: 80px solid #8B0000; --> background-image: url(css/UNL.svg); background-position: 98% 98%; <!-- background-attachment: fixed, fixed; --> background-size: 20%; border-bottom: 0; border: 10px solid #8B0000; <!-- background: transparent; --> } .title-slide > h1 { color: #111111; font-size: 32px; text-shadow: none; font-weight: 500; text-align: left; margin-left: 15px; padding-top: 80px; } .title-slide > h2 { margin-top: -25px; padding-bottom: -20px; color: #111111; text-shadow: none; font-weight: 100; font-size: 28px; text-align: left; margin-left: 15px; } .title-slide > h3 { color: #111111; text-shadow: none; font-weight: 100; font-size: 28px; text-align: left; margin-left: 15px; margin-bottom: -20px; } body { font-family: 'Roboto'; font-weight: 300; } 
.remark-slide-number { font-size: 13pt; font-family: 'Roboto'; color: #272822; opacity: 1; } .inverse .remark-slide-number { font-size: 13pt; font-family: 'Roboto'; color: #FAFAFA; opacity: 1; } <!-- img { --> <!-- max-width: 50%; --> <!-- } --> </style> # Outline 1. Related Literature + Introduction to Graphics + Perception and Psychophysics + Testing Graphics + Logarithmic Scales and Mapping + Underestimation of Exponential Growth 2. Research Objectives 3. Prediction with You Draw It + Eye Fitting Straight Lines in the Modern Era + Prediction of Exponential Trends 4. Future Work 5. Questions and Discussion ??? Thank you, everyone for coming! Today, I will be presenting my research on the human perception of exponentially increasing data displayed on a log scale evaluated through experimental graphics tasks as part of my Ph.D. prelims. This work has been conducted under the supervision of Dr. Susan VanderPlas and Dr. Reka Howard. First, I will provide some background on graphics and logarithmic scales and give an overview of my research objectives. This presentation will mainly focus on the second graphical task study, Prediction with you draw it and then I will share future work to be done. --- class:inverse <br> <br> <br> <br> <br> <br> <br> <br> .center[ # Related Literature ] --- class:primary # Introduction to Graphics Data visualization is defined as the art of drawing **graphical charts** in order to display data (Unwin, 2020). **What are graphics useful for?** (Lewandowsky and Spence, 1989) + Data cleaning. + Exploring data structure. + Communicating information. **Who uses graphics?** + Governments (Harms, 1991; Playfair, 1801; Walker, 2013). + Companies (Chandar, Collier, and Miranti, 2012; Yates, 1985). + News sources and mass media (Aisch, Cohn, Cox, et al., 2016). + Scientific publications (Gouretski and Koltermann, 2007). ??? To get started, we are first going to lay the foundation of graphics. 
Data visualization has become a central tool in modern data science and statistics. Unwin (2020) defines data visualization as the art of drawing graphical charts in order to display data. Graphics are useful for data cleaning, exploring data structure, and communicating information. While working at the consulting desk, one of the first things I do when I receive a client's data is to plot the raw data points they have given me. This often leads to the detection of typos made when recording data and guides discussion with the client in order to clarify their research questions. After an analysis has been conducted, I then use charts and graphs to display the results and find this is more effective at helping them with interpretation. In the 18th and 19th centuries, governments began using graphics to understand population and economic interests. In the 20th century, we saw companies using graphics to understand the inner workings of their business and support their business decisions. We often see news sources displaying graphics of weather forecasts such as hurricane trajectories. Today, we see graphics everywhere from scientific journals to mass media in newspapers, TV, and the internet. Despite the popularity of graphics, we are too accepting of them as default without asking critical questions about the graphics we create or view (Unwin, 2020). + **How effective is this graph at communicating useful information?** (Vanderplas, Cook, & Hofmann, 2020) + An effective graphic accurately shows the data through the appropriate chart selection, axes and scales, and aesthetic design choices in order to successfully communicate the intended result. Higher quality of technology has influenced the creation, replication, and complexity of graphics. 
We now have infinitely many design choices: + variables displayed, type of graphic, size of graphic, aspect ratio, colors, symbols, scales, limits, ordering of categorical variables There is a need for an established set of concepts and terminology that chart creators can build their graphics from so they can actively choose which of many possible graphics to draw in order to ensure their charts are effective at communicating the intended result. --- class:primary # Grammar of Graphics .pull-left[ **Big Idea:** Graphics are built from the ground up (Wilkinson, 2012). Graphics are viewed as a mapping + **from variables** in the data set + **to visual attributes** on the page/screen. ].pull-right[ <img src="images/graphic-flowchart.png" width="85%"/> (Vanderplas, Cook, and Hofmann, 2020) ] ??? Graphics are built from the ground up by specifying exactly how to create a particular graph from a given data set. Graphics are viewed as a mapping + **from variables** in the data set (or statistics computed from the data) + **to visual attributes** such as the axes, colors, shapes, or facets on the page/screen The figure illustrates the process of creating a graphic from a data set through the use of variable mapping, data transformations, coordinate systems, and aesthetic features (Vanderplas, Cook, & Hofmann, 2020). Software, such as Hadley Wickham’s ggplot2, aims to implement the framework of creating charts and graphics as the grammar of graphics recommends. --- class:primary # Perceptual Process .pull-left[ <img src="images/perceptual-process-goldsein-pg5.png" width="100%"/> .center[(Goldstein and Brockmole, 2016) ] ].pull-right[ **Sensation:** simple processes that occur right at the beginning of a sensory system (Carlson, 2010). **Perception:** higher-order mechanisms identified with more complex processes (Myers and DeWall, 2021). + Preattentive stage + Direct attention ] ??? 
+ In order to develop guiding principles for generating graphics effective in communication, we must first understand the basic mechanics of the human perceptual system and the biases we are vulnerable to. + The **perceptual process** is a sequence of steps used to describe how a stimulus in the environment leads to our perception of the stimulus and action in response to the stimulus. 1. Stimulus in the environment -> light is reflected and focused back into the viewer’s eyes. 2. The light is reflected and transformed, and an image is formed on the viewer’s retina. 3. Then the visual receptors transform the light energy into electrical energy through a process called transduction. 4. Signals are transmitted through the retina to the brain, 5. where perception (what do you see?) and recognition (what is it called?) occur. 6. The viewer may then take some sort of motor action; for example, the viewer might move closer to the object. The perceptual process is not direct and instead takes on more of a cyclic nature where a person may go through many iterations of stimuli, perception, recognition, and action before the final image is identified and understood (M. A. Peterson, 1994). **Sensation:** simple processes that occur right at the beginning of a sensory system. **Perception:** higher-order mechanisms identified with more complex processes + **preattentive stage**: observe color, shape, size, and other basic information about the stimuli being perceived. + **direct attention** is required for additional processing to allow us to draw connections between components that assist in our interpretation of the stimuli. When viewing a chart or graph, most insights we gain are due to the cognitive processes that occur after attention is focused on specific aspects of the graph. The relationship between physiology and perception can provide us information about how graphics may be understood and interpreted. 
Through experimentation, the physiological response (automatic reaction) is related to the behavioral response (perception, recognition, and action). --- class:primary # Logarithmic Perception **Weber’s law** states we do not notice absolute changes in stimuli, but instead that we notice the relative change (Fechner, 1860). Numerically: `\begin{equation*} \frac{\Delta S}{S} = K \end{equation*}` + `\(\Delta S\)` represents the difference threshold + `\(S\)` represents the initial stimulus + `\(K\)` is called Weber’s contrast, which remains constant as the magnitude of `\(S\)` changes. ??? Ernst Weber, an early psychophysics researcher, discovered the relationship between the difference threshold (smallest detectable difference between two sensory stimuli) and the magnitude of a stimulus. Weber’s law states we do not notice absolute changes in stimuli, but instead that we notice the relative change. --- class:primary # Logarithmic Perception **Weber-Fechner law** states that the perceived intensity is a logarithmic function of the stimulus intensity when observed above a minimal threshold of perception (Fechner, 1860). Derived from Weber’s law: `\begin{equation} P = K \ln \frac{S}{S_0} \end{equation}` + `\(P\)` represents the perceived stimulus + `\(K\)` represents Weber’s contrast + `\(S\)` represents the initial stimulus intensity + `\(S_0\)` represents the minimal threshold of perception. ??? Gustav Fechner, a founder of psychophysics, extended Weber’s law by discovering that the perceived intensity is a logarithmic function of the stimulus intensity when observed above a minimal threshold of perception. --- class:primary # Testing Graphics Evaluate design choices through the use of graphical tests. Could ask participants to: - identify differences in graphs. - read information off of a chart accurately. - use data to make correct real-world decisions. - predict the next few observations. ??? 
Evaluate design choices through the use of graphical tests. Could ask participants to: - identify differences in graphs. - read information off of a chart accurately. - use data to make correct real-world decisions. - predict the next few observations. All of these types of tests require different levels of use and manipulation of the information presented in the chart. --- class:primary # Motivation .pull-left[ Data visualizations played an important role during the **COVID-19 pandemic** (Rost, 2020; Romano, Sotis, Dominioni, et al., 2020; Van Bavel, Baicker, Boggio, et al., 2020). Dashboards displayed: + case counts. + transmission rates. + outbreak regions. ].pull-right[ <!-- Trigger the Modal --> <img id='img91divoccasesjuly2021' src='images/91divoc-cases-july2021.png' alt='(Fagen-Ulmschneider, 2020)' width='80%'> <!-- The Modal --> <div id='modal91divoccasesjuly2021' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodal91divoccasesjuly2021'> <!-- Modal Caption (Image Text) --> <div id='caption91divoccasesjuly2021' class='modal-caption'></div> </div> <!-- Trigger the Modal --> <img id='imgcovid19summer2020riskmap' src='images/covid19-summer2020-risk-map.png' alt='(Global Epidemics, 2021)' width='80%'> <!-- The Modal --> <div id='modalcovid19summer2020riskmap' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodalcovid19summer2020riskmap'> <!-- Modal Caption (Image Text) --> <div id='captioncovid19summer2020riskmap' class='modal-caption'></div> </div> ] ??? Data visualizations played an important role during the COVID-19 pandemic in displaying case counts, transmission rates, and outbreak regions. + Mass media routinely showed charts to share information with the public about the progression of the pandemic. + Graphics helped guide decision makers to implement policies such as shut-downs or mandated mask wearing. 
+ Facilitated communication with the public to increase compliance. One of the many dashboards was called 91-DIVOC (COVID-19 backwards!). It gives the viewer choices of what to show: case count, mortality, hospitalizations, standardized to population, geographic regions, scales (log/linear). Other dashboards showed outbreak regions in the form of maps. --- class:primary # Logarithmic Scales .pull-left[ <!-- Trigger the Modal --> <img id='imglogscaleexample' src='images/log-scale-example.png' alt=' ' width='100%'> <!-- The Modal --> <div id='modallogscaleexample' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodallogscaleexample'> <!-- Modal Caption (Image Text) --> <div id='captionlogscaleexample' class='modal-caption'></div> </div> ].pull-right[ <!-- Trigger the Modal --> <img id='imglogscalecomic' src='images/log-scale-comic.png' alt='(Munroe, 2005)' width='100%'> <!-- The Modal --> <div id='modallogscalecomic' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodallogscalecomic'> <!-- Modal Caption (Image Text) --> <div id='captionlogscalecomic' class='modal-caption'></div> </div> ] ??? + **Problem:** Data which spans several orders of magnitude shown on its original scale compresses the smaller magnitudes into relatively little area. + **Solution:** Use of a log scale transformation; this alters the contextual appearance of the data. The usefulness of the log scale in science is illustrated here, showing the challenge of displaying the fuel energy density of uranium alongside other sources of fuel due to differences in magnitude of density. --- class:primary # Logarithmic Mapping Our perception is **logarithmic at first**, but transitions to a **linear scale later** in development (Dehaene, Izard, Spelke, et al., 2008; Siegler and Braithwaite, 2017; Varshney and Sun, 2013). 
.center[ <img src="images/log-numberline.png" width="100%"/> ] **Assumption:** If we perceive logarithmically by default, it is a natural way to display information and should be easy to read and understand/use. ??? When we first learn to count, we begin counting by ones, then by tens, and advancing to hundreds, following the base10 order of magnitude system. Our perception and mapping of numbers to a number line is **logarithmic at first**, but transitions to a **linear scale later** in development, with formal mathematics education. + For example: A kindergartner asked to place numbers one through ten along a number line would place three close to the middle, following the logarithmic perspective. Assuming there is a direct relationship between perceptual and cognitive processes, it is reasonable to assume numerical representations should also be displayed on a nonlinear, compressed number scale. Therefore, if we perceive logarithmically by default, it is a natural (and presumably low effort) way to display information and should be easy to read and understand/use. --- class:primary # Benefits and Pitfalls of Log Scales .pull-left[ **Benefits** were seen in spring 2020, during the early stages of the COVID-19 pandemic. .center[ <!-- Trigger the Modal --> <img id='imgcovid19FT03' src='images/covid19-FT-03.23.2020-log.png' alt='(Burn-Murdoch, Nevitt, Tilford, et al., 2020)' width='100%'> <!-- The Modal --> <div id='modalcovid19FT03' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodalcovid19FT03'> <!-- Modal Caption (Image Text) --> <div id='captioncovid19FT03' class='modal-caption'></div> </div> ] ].pull-right[ **Pitfalls** were exposed as the pandemic evolved, and the case counts were no longer spreading exponentially. 
.center[ <!-- Trigger the Modal --> <img id='imgcovid19FTlinear' src='images/covid19-FT-linear.png' alt='(Burn-Murdoch, Nevitt, Tilford, et al., 2020)' width='80%'> <!-- The Modal --> <div id='modalcovid19FTlinear' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodalcovid19FTlinear'> <!-- Modal Caption (Image Text) --> <div id='captioncovid19FTlinear' class='modal-caption'></div> </div> <!-- Trigger the Modal --> <img id='imgcovid19FTlog' src='images/covid19-FT-log.png' alt='(Burn-Murdoch, Nevitt, Tilford, et al., 2020)' width='80%'> <!-- The Modal --> <div id='modalcovid19FTlog' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodalcovid19FTlog'> <!-- Modal Caption (Image Text) --> <div id='captioncovid19FTlog' class='modal-caption'></div> </div> ] ] ??? **Benefits** were seen in spring 2020, during the early stages of the COVID-19 pandemic. + There were large magnitude discrepancies in case counts at a given time point between different geographic regions. + Log scale transformations were useful for showing case count curves for areas with few cases and areas with many cases within one chart. **Pitfalls** were exposed as the pandemic evolved, and the case counts were no longer spreading exponentially. + Graphs with linear scales seemed more effective at spotting early increases in case counts that signaled more localized outbreaks. + The linear scale appears to evoke a stronger reaction from the public than the log scale. --- class:primary # Research Objectives **Big Idea:** Are there benefits to displaying exponentially increasing data on a log scale rather than a linear scale? 1. [Perception through Lineups](https://shiny.srvanderplas.com/log-study/) 📈 📈 📈 - Tests an individual's ability to perceptually differentiate exponentially increasing data with differing rates of change on both the linear and log scale. 2. 
[Prediction with You Draw It](https://shiny.srvanderplas.com/you-draw-it/) ✏️ - Tests an individual's ability to make predictions for exponentially increasing data. 3. Estimation by Numerical Translation 📏 - Tests an individual's ability to translate a graph of exponentially increasing data into real value quantities. --- class:primary # Research Objectives **Big Idea:** Are there benefits to displaying exponentially increasing data on a log scale rather than a linear scale? 1. [Perception through Lineups](https://shiny.srvanderplas.com/log-study/) 📈 📈 📈 - Tests an individual's ability to perceptually differentiate exponentially increasing data with differing rates of change on both the linear and log scale. 2. [**Prediction with You Draw It**](https://shiny.srvanderplas.com/you-draw-it/) ✏️ - **Tests an individual's ability to make predictions for exponentially increasing data.** 3. Estimation by Numerical Translation 📏 - Tests an individual's ability to translate a graph of exponentially increasing data into real value quantities. --- class:primary # Stages of Exponential Growth .center[ <img src="images/exponential-stages-comic.jpg" width="85%"/> (Von Bergmann, 2021) ] ??? Exponential growth is often misjudged: + **early stage** appears to have a small growth rate. + **middle stage** appears to be growing, but not at an astounding rate, appearing more quadratic. + **late stage** is when the exponential growth becomes quite apparent. This misinterpretation can lead to decisions made under an inaccurate understanding, causing future consequences. --- class:primary # Underestimation of Exponential Growth Estimation and prediction of **exponential growth is underestimated** when presented both numerically and graphically (Jones, 1979; Mackinnon and Wearing, 1991; Wagenaar and Sagaria, 1975). **Can log transforming the data help?** + Maybe, but are there consequences? 
+ Most readers are not mathematically sophisticated enough to intuitively understand logarithmic math and translate that back into real-world effects. ??? Estimation and prediction of **exponential growth is underestimated** when presented both numerically and graphically. + Numerical estimation is more accurate than graphical estimation for exponential curves. + No improvement in estimation found when participants had contextual knowledge or experience with exponential growth. + Instruction on exponential growth reduced the underestimation. + Estimation was improved by providing immediate feedback to participants about the accuracy of their current predictions. **Can log transforming the data help?** + Maybe, but are there consequences? + Most readers are not mathematically sophisticated enough to intuitively understand logarithmic math and translate that back into real-world effects. --- class:inverse <br> <br> <br> <br> <br> <br> <br> <br> .center[ # Prediction with 'You Draw It' ] --- class:primary # Subjective Judgment in Statistical Analysis ## (Finney, 1951) .pull-left[ + **Big Idea:** Determine the effect of stopping iterative calculations after one iteration based on starting values of the pair of *parallel* probit regression lines judged by eye. + **Method:** Sent out by mail, asked to "rule two lines." + **Sample:** 21 scientists + **Findings:** One cycle of iterations was sufficient. ].pull-right[ <img src="images/subjective-judgement-plot.png" width="75%"/> ] ??? Judge by eye the positions for a pair of *parallel* probit regression lines in a biological assay. **relative potency** the ratio between the dose required from the test preparation and the standard preparation. + **Big Idea:** Determine the effect of stopping iterative maximum likelihood calculations after one iteration in the estimation of parameters connected with dose-response relationships. + **Method:** Sent out by mail, asked to "rule two lines." 
+ **Sample:** 21 scientists + **Findings:** One cycle of iterations for calculating the **relative potency** was sufficient based on the starting values provided by eye from the participants. <!-- --- --> <!-- class:primary --> <!-- # Scientistis have always had sass! --> <!-- ## (Finney, 1951) --> <!-- <font size="6"> --> <!-- .small[ --> <!-- + "No one in their right senses could draw 2 parallel lines through the points in question, but I've done my best to comply with your request as I understand it. **Certainly I did not use my intelligence**". (No. 2.) --> <!-- + "Where fact and theory are so at variance one guess is as good as another-so here goes. The job might have been easier if S.E. of individual points was known!" (No. 9.) --> <!-- + "I should-say that the points were not on two parallel straight lines, and that was that!" (No. 11.) --> <!-- + "The line through the x points is of course easy to draw. But I had to overcome considerable intellectual resistance before I could bring myself to draw a parallel line for the 0 points". (No. 14.) --> <!-- + "**What fun! But I don't believe a word of it.** The only thing is either (a) to do the experiment again or (b) revise the theory". (No. 18.) --> <!-- + **"If, as an experimental scientist, an experiment of mine produced a set of data such as you provided I should at once do the experiment again!"** (No. -20-a chemist.) --> <!-- + "What I would really like to draw are 2 lines which are far from parallel!" (No. 21.) --> <!-- ] --> <!-- </font> --> --- class:primary # Eye Fitting Straight Lines ## (Mosteller, Siegel, Trapido, et al., 1981) .pull-left[ + **Big Idea:** Students fitted lines by eye to four sets of points. + **Method:** 8.5 x 11 inch transparency with a straight line etched across the middle. + **Sample:** 153 graduate students and post docs in Introductory Biostatistics. + **Experimental Design:** Latin square. + **Findings:** Students tended to fit the slope of the first principal component. 
].pull-right[ <img src="images/eyefitting-straight-lines-plots.png" width="95%"/> ] ??? + Students fitted lines by eye to four sets of points. + 8.5 x 11 inch transparency with a straight line etched across the middle. + 153 graduate students and post docs in Introductory Biostatistics. + Latin square. + Students tended to fit the slope of the first principal component or major axis (the line that minimizes the sum of squares of perpendicular rather than vertical distances). --- class:primary # You Draw It Feature ## (New York Times, 2015) .pull-left[ <img src="images/nyt-caraccidents-frame4.png" width="100%"/> .center[ (Katz, 2017) ] ].pull-right[ Readers are asked to input their own assumptions about various metrics and compare how these assumptions relate to reality. + [Family Income affects college chances](https://www.nytimes.com/interactive/2015/05/28/upshot/you-draw-it-how-family-income-affects-childrens-college-chances.html) (Aisch, Cox, and Quealy, 2015) + [Just How Bad Is the Drug Overdose Epidemic?](https://www.nytimes.com/interactive/2017/04/14/upshot/drug-overdose-epidemic-you-draw-it.html) (Katz, 2017) + [What Got Better or Worse During Obama’s Presidency](https://www.nytimes.com/interactive/2017/01/15/us/politics/you-draw-obama-legacy.html?_r=0) (Buchanan, Park, and Pearce, 2017) ] ??? Readers are asked to input their own assumptions about various metrics and compare how these assumptions relate to reality. The New York Times team utilizes **Data Driven Documents (D3)** that allows readers to predict these metrics through the use of drawing a line on their computer screen with their mouse. --- class:primary # Background of D3 **Who?** [Mike Bostock](https://observablehq.com/@mbostock) created D3 during his time working on graphics at the New York Times. 
**What?** Open-source JavaScript-based graphing framework + D3 = "Data Driven Documents" + `D3` is to JavaScript as `ggplot2` is to R + Framework for binding objects and layers to plotting area + Framework for movement and user interaction **When?** D3 v1.0 was released in 2011; D3.js recently celebrated its 10th anniversary! **Where?** The internet! **Why?** Advantages of using D3 include animation and allowing for movement and user interaction. **How?** `r2d3`! ??? + Used by major news and research organizations such as the New York Times, FiveThirtyEight, Washington Post, and the Pew Research Center to create and customize graphics. + `D3` is to JavaScript as `ggplot2` is to R + Advantages include animation and allowing for movement and user interaction. --- class:primary # Relationship between D3 and R .pull-left[ The `r2d3` package (Luraschi and Allaire, 2018) in R provides an efficient integration of D3 visuals and R by displaying them in familiar formats: + RMarkdown with HTML output + Shiny applications (amazing!) ].pull-right[ .center[ <img src="images/r2d3-hex.png" width="40%"/> ] ] `r2d3` makes it easy to do your data processing in R, then apply D3.js code to visualize that data! -- .pull-left[ **How?** + Converts data in R to JSON that can be interpreted by JavaScript + Sources D3 code library + Creates plot container (svg) + Renders plot using source code ].right-plot[
```r
r2d3(data = data, script = "d3-source-code.js", d3_version = "5")
```
] ??? A challenge of working with D3 is the environment necessary to display the graphics and images. The `r2d3` package in R provides an efficient integration of D3 visuals and R by displaying them in familiar formats: + RMarkdown with HTML output + Shiny applications (amazing!) `r2d3` makes it easy to do your data processing in R, then apply D3.js code to visualize that data! 
The example R code illustrates the structure of the r2d3 function which includes specification of a data frame in R (converted to a JSON file), the D3.js source code file, and the D3 version that accompanies the source code. A default SVG (scalable vector graphic) container for layering elements is then generated by the r2d3 function which renders the plot using the source code. --- class:primary # Getting Started with D3 `D3.js` is to JavaScript as `ggplot2` is to R .pull-left[
1. [Codecademy: Introduction to JavaScript](https://www.codecademy.com/learn/introduction-to-javascript)
2. Understand [SVG](http://tutorials.jenkov.com/svg/g-element.html) elements: inspect elements in your web browser!
3. Amelia Wattenberger's [Full Stack D3 and Data Visualization Book](https://www.newline.co/fullstack-d3)
4. Build a basic graphic using [r2d3](https://rstudio.github.io/r2d3/articles/introduction.html)
5. Modify `D3.js` code until it does what you want! ] .pull-right[ **Additional Resources**
+ [How to learn D3 with no coding experience](https://www.heshameissa.com/blog/learn-d3)
+ Amelia Wattenberger on [Twitter](https://twitter.com/Wattenberger) ] --- class:primary # You Draw It Experimental Task Study Participant Prompt: *Use your mouse to fill in the trend in the yellow box region.* .pull-left[ <img src="images/eyefitting_example.gif" width="100%"/> ].pull-right[ <img src="images/exponential_example.gif" width="100%"/> ] ??? Here we see an example of a you draw it interactive plot as seen by participants during the study. Participants are prompted to: "Use your mouse to fill in the trend in the yellow box region". The yellow box region moves along as the participant draws their trend-line until the yellow region disappears. --- class:primary # Study Design Two sub-studies: 1. **Eye Fitting Straight Lines in the Modern Era** + Validate 'you draw it' as a tool for measuring predictions of trends fitted by eye and a method for testing graphics. + Replicated experiment and results found in Mosteller et al. (1981). 2. **Prediction of Exponentially Increasing Trends** + Test an individual's ability to make predictions for exponentially increasing data on both the log and linear scales. A total of 39 individuals completed 256 unique you draw it task plots. --- class:inverse <br> <br> <br> <br> <br> <br> <br> <br> .center[ # Eye Fitting Straight Lines in the Modern Era ] --- class:primary # Data Simulation **Point data:** `\(N = 30\)` points `\((x_i, y_i), i = 1, ..., N\)` were generated for `\(x_i \in [x_{min}, x_{max}]\)`. Data were simulated based on a linear model with additive errors: `\begin{equation} y_i = \beta_0 + \beta_1 x_i + e_i \end{equation}` where `\(e_i \sim N(0, \sigma^2).\)` **Line data:** `\(k = 1, ..., 4x_{max} + 1\)` fitted values in 0.25 increments across the domain, `\((x_k, \hat y_{k,OLS})\)` An ordinary least squares regression is then fit to the simulated points: `\begin{equation} \hat y_{k,OLS} = \hat\beta_{0,OLS} + \hat\beta_{1,OLS} x_k \end{equation}` ??? 
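The simulation described on this slide can be sketched in R as follows. This is a minimal illustration rather than the study's actual code: the helper name `simulate_linear`, the default parameter values (`beta0`, `beta1`, `sigma`, `x_min`, `x_max`), the seed, and the choice to place the `x` values on an evenly spaced grid are all assumptions made for the example.

```r
# Hypothetical helper: simulate N points from y = beta0 + beta1 * x + e,
# with e ~ N(0, sigma^2), then compute OLS fitted values on a 0.25 grid.
# All default parameter values are illustrative assumptions.
simulate_linear <- function(N = 30, beta0 = 0, beta1 = 1, sigma = 2,
                            x_min = 0, x_max = 20, by = 0.25) {
  # Point data: N evenly spaced x-values (an assumption) with additive normal errors
  x <- seq(x_min, x_max, length.out = N)
  y <- beta0 + beta1 * x + rnorm(N, mean = 0, sd = sigma)
  point_data <- data.frame(x = x, y = y)

  # Fit an ordinary least squares regression to the simulated points
  fit <- lm(y ~ x, data = point_data)

  # Line data: fitted values in 0.25 increments across the domain
  line_data <- data.frame(x = seq(x_min, x_max, by = by))
  line_data$y_ols <- unname(predict(fit, newdata = line_data))

  list(point_data = point_data, line_data = line_data, fit = fit)
}

set.seed(1)  # arbitrary seed so the sketch is reproducible
sim <- simulate_linear()
nrow(sim$point_data)  # number of simulated points
nrow(sim$line_data)   # number of grid values for the fitted line
```

With `x_min = 0` and `x_max = 20`, the grid has `4 * x_max + 1` values, matching the index range for `k` on the slide.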
--- class:primary # Treatment Design .pull-left[ Replicated the four data sets from Mosteller et al. (1981). Consistent aesthetic design choices: + aspect ratio set to one. + y-range extended `\(10\%\)` beyond the range of the simulated data points. ].pull-right[ <!-- Trigger the Modal --> <img id='imgeyefittingexamplesimplot' src='images/eyefitting-example-simplot.png' alt=' ' width='100%'> <!-- The Modal --> <div id='modaleyefittingexamplesimplot' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodaleyefittingexamplesimplot'> <!-- Modal Caption (Image Text) --> <div id='captioneyefittingexamplesimplot' class='modal-caption'></div> </div> ] ??? + **S:** positive slope; small variance; `\(x \in [0, 20]\)`. + **F:** positive slope; large variance; `\(x \in [0, 20]\)`. + **V:** steep positive slope; small variance; `\(x \in [4, 16]\)`. + **N:** negative slope; large variance; `\(x \in [0, 20]\)`. --- class:primary # Slope of First Principal Component .pull-left[ Fitted points, `\((x_k, \hat y_{k,PCA})\)` are calculated by: `\begin{equation} \hat y_{k,PCA} = \hat\beta_{0,PCA} + \hat\beta_{1,PCA} x_k \end{equation}` where + `\(\hat\beta_{1,PCA} = \frac{\text{PC1 Rotation in y-axis}}{\text{PC1 Rotation in x-axis}}\)` + `\(\hat\beta_{0,PCA}\)` calculated by the point-slope equation of a line using the mean of the simulated points, `\((\bar x_i, \bar y_i)\)`. ].pull-right[ <!-- Trigger the Modal --> <img id='imgpcaplot' src='images/pca-plot.png' alt=' ' width='100%'> <!-- The Modal --> <div id='modalpcaplot' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodalpcaplot'> <!-- Modal Caption (Image Text) --> <div id='captionpcaplot' class='modal-caption'></div> </div> ] ??? + In addition to the fitted values calculated by the OLS regression, we also obtain fitted values from the slope of the first principal component regression line. 
+ This shows how the PCA regression line minimizes the orthogonal distance to the points (the smallest distance, both vertical and horizontal), while OLS minimizes the vertical distance only. + The rotation of the coordinate axes is obtained from `princomp`. --- class:primary # Model Data .pull-left[ For each participant, the final data set used for analysis contains: + `\(x_{ijk}\)`, `\(y_{ijk,drawn}\)`, `\(\hat y_{ijk,OLS}\)`, `\(\hat y_{ijk,PCA}\)` for + parameter choice `\(i = 1,2,3,4\)`, + participant `\(j = 1, \ldots, N_{participant}\)`, + `\(x_{ijk}\)` value corresponding to increment `\(k = 1, \ldots, 4 x_{max} + 1\)`. Vertical residuals between the drawn and fitted values were calculated as: + `\(e_{ijk,OLS} = y_{ijk,drawn} - \hat y_{ijk,OLS}\)` + `\(e_{ijk,PCA} = y_{ijk,drawn} - \hat y_{ijk,PCA}\)`. ].pull-right[ <!-- Trigger the Modal --> <img id='imgeyefittingtrialplot' src='images/eyefitting-trial-plot.png' alt=' ' width='100%'> <!-- The Modal --> <div id='modaleyefittingtrialplot' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodaleyefittingtrialplot'> <!-- Modal Caption (Image Text) --> <div id='captioneyefittingtrialplot' class='modal-caption'></div> </div> ] --- class:primary # Linear Mixed Model The LMM equation for the residuals from each fit (OLS and PCA) is given by: `\begin{equation} e_{ijk,fit} = \left[\gamma_0 + \alpha_i\right] + \left[\gamma_{1} x_{ijk} + \gamma_{2i} x_{ijk}\right] + p_{j} + \epsilon_{ijk} \end{equation}` where + `\(e_{ijk,fit}\)` is the residual between the drawn and fitted y-values for the `\(i^{th}\)` parameter choice, `\(j^{th}\)` participant, and `\(k^{th}\)` increment of x-value corresponding to either the OLS or PCA fit + `\(\gamma_0\)` is the overall intercept + `\(\alpha_i\)` is the effect of the `\(i^{th}\)` parameter choice (F, S, V, N) on the intercept + `\(\gamma_1\)` is the overall slope for `\(x\)` + `\(\gamma_{2i}\)` is the effect of the parameter choice on the slope + `\(x_{ijk}\)` is the x-value for the `\(i^{th}\)` parameter choice,
`\(j^{th}\)` participant, and `\(k^{th}\)` increment + `\(p_{j} \sim N(0, \sigma^2_{participant})\)` is the random error due to the `\(j^{th}\)` participant's characteristics + `\(\epsilon_{ijk} \sim N(0, \sigma^2)\)` is the residual error. ??? Using the `lmer` function in the lme4 package, a linear mixed model (LMM) is fit separately to the OLS and PCA residuals, constraining the fit to a linear trend. --- class:primary # LMM Residuals Trend Results .center[ <img src="images/eyefitting-lmer-plot.png" width="85%"/> ] --- class:primary # Generalized Additive Mixed Model The GAMM equation for the residuals from each fit (OLS and PCA) is given by: `\begin{equation} e_{ijk,fit} = \alpha_i + s_{i}(x_{ijk}) + p_{j} + s_{j}(x_{ijk}) \end{equation}` where + `\(e_{ijk,fit}\)` is the residual between the drawn and fitted y-values for the `\(i^{th}\)` parameter choice, `\(j^{th}\)` participant, and `\(k^{th}\)` increment of x-value corresponding to either the OLS or PCA fit + `\(\alpha_i\)` is the intercept for the parameter choice `\(i\)` + `\(s_{i}\)` is the smoothing spline for the `\(i^{th}\)` parameter choice + `\(x_{ijk}\)` is the x-value for the `\(i^{th}\)` parameter choice, `\(j^{th}\)` participant, and `\(k^{th}\)` increment + `\(p_{j} \sim N(0, \sigma^2_{participant})\)` is the error due to participant variation + `\(s_{j}\)` is the random smoothing spline for each participant. ??? Eliminating the linear trend constraint, the `bam` function in the mgcv package is used to fit a generalized additive mixed model (GAMM) separately to the OLS and PCA residuals to allow for estimation of smoothing splines. --- class:primary # GAMM Residuals Trend Results .center[ <img src="images/eyefitting-gamm-plot.png" width="85%"/> ] ??? + Estimated trends from the PCA residuals appear to align more closely with the `\(y=0\)` horizontal (dashed) line than those from the OLS residuals. + More prominent in parameter choices with large variances (F and N). + Consistent with the results found in Mosteller et al. (1981).
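The two reference lines behind these residuals can be sketched as follows. This is a rough illustration rather than the analysis code, which used R's `princomp`. It assumes that the "PC1 rotation" is the leading eigenvector of the 2 x 2 covariance matrix of `\((x, y)\)`; the function names and the small data set are hypothetical.

```python
import math


def _moments(xs, ys):
    """Means and (co)variances of the points, using divisor n."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs) / n
    syy = sum((y - y_bar) ** 2 for y in ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    return x_bar, y_bar, sxx, syy, sxy


def ols_line(xs, ys):
    """(intercept, slope) minimizing vertical distances."""
    x_bar, y_bar, sxx, _, sxy = _moments(xs, ys)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def pca_line(xs, ys):
    """(intercept, slope) of the first principal component through the mean."""
    x_bar, y_bar, sxx, syy, sxy = _moments(xs, ys)
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; PC1 slope is 0 or undefined")
    # Largest eigenvalue of [[sxx, sxy], [sxy, syy]].
    lam = 0.5 * ((sxx + syy) + math.sqrt((sxx - syy) ** 2 + 4 * sxy ** 2))
    # Its eigenvector is (sxy, lam - sxx), so slope = y-rotation / x-rotation.
    slope = (lam - sxx) / sxy
    return y_bar - slope * x_bar, slope


def vertical_residuals(x_grid, y_drawn, intercept, slope):
    """e_k = y_drawn_k - y_hat_k for one drawn line and one fitted line."""
    return [y - (intercept + slope * x) for x, y in zip(x_grid, y_drawn)]


# Hypothetical points and a hypothetical participant-drawn line.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 1.9, 1.2, 3.8, 3.0]
drawn = [0.2, 1.1, 2.0, 2.9, 3.8]
e_ols = vertical_residuals(xs, drawn, *ols_line(xs, ys))
e_pca = vertical_residuals(xs, drawn, *pca_line(xs, ys))
```

For positively correlated data, the first principal component line is never shallower than the OLS line. This is why the distinction matters most in the large-variance parameter choices.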
--- class:primary # Conclusion **Goal:** Establish you draw it as a tool for testing graphics. **Results:** + Estimated drawn trend-lines followed the PCA regression line more closely than the OLS regression line. + Most prominent in parameter choices with large variances. + Consistent with the results found in the previous study. **The reproducibility of these results serves as evidence of the reliability of the you draw it method.** ??? **Goal:** Establish you draw it as a tool for testing graphics. + Replicate results found in Eye Fitting Straight Lines (Mosteller et al., 1981). **Results:** + Estimated trend-lines drawn by participants followed the regression line based on the slope of the first principal component more closely than the ordinary least squares regression line. + Most prominent in parameter choices with large variances. + Consistent with the results found in the previous study, indicating that participants fit a trend line closer to the estimated regression line with the slope of the first principal component than to the estimated OLS regression line. **The reproducibility of these results serves as evidence of the reliability of the you draw it method.** --- class:inverse <br> <br> <br> <br> <br> <br> <br> <br> .center[ # Prediction of Exponential Trends ] --- class:primary # Data Simulation **Point data:** `\(N = 30\)` points `\((x_i, y_i), i = 1, \ldots, N\)` were generated for `\(x_i \in [x_{min}, x_{max}]\)`. Data were simulated based on a one-parameter exponential model with multiplicative errors: `\begin{equation} y_i = e^{\beta x_i + e_i} \end{equation}` for + growth rate `\(\beta\)` + `\(e_i \sim N(0, \sigma^2)\)` generated by rejection sampling. **Line data:** `\(m = 1, \ldots, 4x_{max} + 1\)` fitted values in 0.25 increments across the domain, `\((x_m, \hat y_{m,NLS})\)`. A nonlinear least squares regression is then fit to the simulated points: `\begin{equation} \hat y_{m,NLS} = e^{\hat \beta_{NLS} x_m} \end{equation}` ???
+ `\(e_i \sim N(0, \sigma^2)\)` are generated by rejection sampling in order to guarantee the points shown align with that of the fitted line displayed in the initial plot frame. Outputs a list of point data and line data both indicating the parameter identification, x-value, and corresponding simulated or fitted y value. --- class:primary # Treatment Design .pull-left[ 2 x 2 x 2 factorial: + **growth rate:** low and high. + **points truncated:** `\(50\%\)` and `\(75\%\)` of the domain. + **scale:** log and linear. Consistent aesthetic design choices: + aspect ratio of one. + y-axis extended `\(50\%\)` below and `\(200\%\)` above the simulated data range. + participants begin drawing at `\(50\%\)` of the domain. ].pull-right[ .center[ <!-- Trigger the Modal --> <img id='imglow10linear' src='images/low-10-linear.png' alt='Low Growth Rate, 50% Truncation, Linear Scale' width='30%'> <!-- Trigger the Modal --> <img id='imglow10log' src='images/low-10-log.png' alt='Low Growth Rate, 50% Truncation, Log Scale' width='30%'> <!-- The Modal --> <div id='modallow10linear' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodallow10linear'> <!-- Modal Caption (Image Text) --> <div id='captionlow10linear' class='modal-caption'></div> </div> <!-- The Modal --> <div id='modallow10log' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodallow10log'> <!-- Modal Caption (Image Text) --> <div id='captionlow10log' class='modal-caption'></div> </div> <!-- Trigger the Modal --> <img id='imglow15linear' src='images/low-15-linear.png' alt='Low Growth Rate, 75% Truncation, Linear Scale' width='30%'> <!-- Trigger the Modal --> <img id='imglow15log' src='images/low-15-log.png' alt='Low Growth Rate, 75% Truncation, Log Scale' width='30%'> <!-- The Modal --> <div id='modallow15linear' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodallow15linear'> <!-- Modal Caption (Image Text) --> <div 
id='captionlow15linear' class='modal-caption'></div> </div> <!-- The Modal --> <div id='modallow15log' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodallow15log'> <!-- Modal Caption (Image Text) --> <div id='captionlow15log' class='modal-caption'></div> </div> <!-- Trigger the Modal --> <img id='imghigh10linear' src='images/high-10-linear.png' alt='High Growth Rate, 50% Truncation, Linear Scale' width='30%'> <!-- Trigger the Modal --> <img id='imghigh10log' src='images/high-10-log.png' alt='High Growth Rate, 50% Truncation, Log Scale' width='30%'> <!-- The Modal --> <div id='modalhigh10linear' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodalhigh10linear'> <!-- Modal Caption (Image Text) --> <div id='captionhigh10linear' class='modal-caption'></div> </div> <!-- The Modal --> <div id='modalhigh10log' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodalhigh10log'> <!-- Modal Caption (Image Text) --> <div id='captionhigh10log' class='modal-caption'></div> </div> <!-- Trigger the Modal --> <img id='imghigh15linear' src='images/high-15-linear.png' alt='High Growth Rate, 75% Truncation, Linear Scale' width='30%'> <!-- Trigger the Modal --> <img id='imghigh15log' src='images/high-15-log.png' alt='High Growth Rate, 75% Truncation, Log Scale' width='30%'> <!-- The Modal --> <div id='modalhigh15linear' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodalhigh15linear'> <!-- Modal Caption (Image Text) --> <div id='captionhigh15linear' class='modal-caption'></div> </div> <!-- The Modal --> <div id='modalhigh15log' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodalhigh15log'> <!-- Modal Caption (Image Text) --> <div id='captionhigh15log' class='modal-caption'></div> </div> ]] --- class:primary # Feedback Data .pull-left[ For each participant, the final data set used for analysis contains: 
+ `\(x_{ijklm}\)`, `\(y_{ijklm,drawn}\)`, and `\(\hat y_{ijklm,NLS}\)` for: + growth rate `\(i = 1,2\)`, + point truncation `\(j = 1,2\)`, + scale `\(k = 1,2\)`, + participant `\(l = 1, \ldots, N_{participant}\)`, and + `\(x_{ijklm}\)` value `\(m = 1, \ldots, 4 x_{max} + 1\)`. Vertical residuals between the drawn and fitted values were calculated as: + `\(e_{ijklm,NLS} = y_{ijklm,drawn} - \hat y_{ijklm,NLS}\)`. ].pull-right[ <!-- Trigger the Modal --> <img id='imgexpspaghettiplot' src='images/exp-spaghetti-plot.png' alt=' ' width='100%'> <!-- The Modal --> <div id='modalexpspaghettiplot' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodalexpspaghettiplot'> <!-- Modal Caption (Image Text) --> <div id='captionexpspaghettiplot' class='modal-caption'></div> </div> ] --- class:primary # Generalized Additive Mixed Model The GAMM equation for the residuals is given by: `\begin{equation} e_{ijklm,NLS} = \tau_{ijk} + s_{ijk}(x_{ijklm}) + p_{l} + s_{l}(x_{ijklm}) \end{equation}` where + `\(e_{ijklm,NLS}\)` is the residual between the drawn y-value and fitted y-value for the `\(l^{th}\)` participant, `\(m^{th}\)` increment, and `\(ijk^{th}\)` treatment combination + `\(\tau_{ijk}\)` is the intercept for the `\(i^{th}\)` growth rate, `\(j^{th}\)` point truncation, and `\(k^{th}\)` scale treatment combination + `\(s_{ijk}\)` is the smoothing spline for the `\(ijk^{th}\)` treatment combination + `\(x_{ijklm}\)` is the x-value for the `\(l^{th}\)` participant, `\(m^{th}\)` increment, and `\(ijk^{th}\)` treatment combination + `\(p_{l} \sim N(0, \sigma^2_{participant})\)` is the error due to the `\(l^{th}\)` participant's characteristics + `\(s_{l}\)` is the random smoothing spline for the `\(l^{th}\)` participant. ??? Allowing for flexibility, the `bam` function in the mgcv package is used to fit a GAMM to estimate trends of vertical residuals from the participant drawn line in relation to the NLS fitted values.
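The residuals entering this model can be sketched as follows. This is a minimal illustration under several assumptions, not the study's code. The points are placed on an even grid. The rejection-sampling step is omitted because its acceptance rule is not stated here. The one-parameter NLS fit uses a plain Gauss-Newton iteration started from a log-scale estimate. All names and parameter values are hypothetical.

```python
import math
import random


def simulate_exponential(beta, sigma, x_min, x_max, n=30, seed=1):
    """Simulate y_i = exp(beta * x_i + e_i) with e_i ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    # Assumption: evenly spaced x_i; no rejection sampling of the errors.
    xs = [x_min + (x_max - x_min) * i / (n - 1) for i in range(n)]
    ys = [math.exp(beta * x + rng.gauss(0.0, sigma)) for x in xs]
    return xs, ys


def fit_nls(xs, ys, n_iter=50, tol=1e-12):
    """Gauss-Newton estimate of beta in y = exp(beta * x)."""
    # Starting value: least squares through the origin on the log scale.
    beta = sum(x * math.log(y) for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    for _ in range(n_iter):
        fitted = [math.exp(beta * x) for x in xs]
        jac = [x * f for x, f in zip(xs, fitted)]  # d fitted / d beta
        step = (sum(j * (y - f) for j, y, f in zip(jac, ys, fitted))
                / sum(j * j for j in jac))
        beta += step
        if abs(step) < tol:
            break
    return beta


def nls_residuals(x_grid, y_drawn, beta_hat):
    """e_m = y_drawn_m - exp(beta_hat * x_m)."""
    return [y - math.exp(beta_hat * x) for x, y in zip(x_grid, y_drawn)]


# Hypothetical growth rate and error scale.
xs, ys = simulate_exponential(beta=0.2, sigma=0.05, x_min=0.0, x_max=20.0)
beta_hat = fit_nls(xs, ys)

# Hypothetical drawn line over the second half of the domain (0.25 increments)
# that falls 10% below the fitted curve, i.e. underestimates the growth.
grid = [10.0 + 0.25 * m for m in range(41)]
drawn = [0.9 * math.exp(beta_hat * x) for x in grid]
resid = nls_residuals(grid, drawn, beta_hat)
```

In this sketch a drawn line that underestimates the trend produces negative residuals, which grow in magnitude as `\(x\)` increases.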
--- class:primary # GAMM Residual Trend Results .center[ <img src="images/exp-gamm-plot.png" width="80%"/> ] ??? Predictions made on the **linear scale** (blue) deviate from the `\(y=0\)` horizontal (dashed) line `\(\implies\)` **underestimation** of exponential growth. Predictions made on the **log scale** (orange) closely follow the `\(y=0\)` horizontal (dashed) line `\(\implies\)` **more accurate** than trends predicted on the linear scale. More prominent in high exponential growth rates. Underestimation begins after the aid of points is removed. ??? Indicated by the discrepancy in results for treatments with points truncated at `\(50\%\)` compared to `\(75\%\)` of the domain. --- class:primary # Conclusion **Goal:** Test an individual's ability to make predictions for exponentially increasing data. **Results:** + Predictions made on the log scale were more accurate than those made on the linear scale. + Strongly supported for high exponential growth rates. + Points shown along the trend improve predictions. **The results of this study suggest that there are cognitive advantages to log scales when making predictions of exponential trends.** ??? Further investigation is necessary to determine the implications of using log scales when translating exponential graphs to numerical values. --- class:inverse <br> <br> <br> <br> <br> <br> <br> <br> .center[ # Future Work ] --- class:primary # Research Objectives **Big Idea:** Are there benefits to displaying exponentially increasing data on a log scale rather than a linear scale? 1. [Perception through Lineups](https://shiny.srvanderplas.com/log-study/) 📈 📈 📈 - Test an individual's ability to perceptually differentiate exponentially increasing data with differing rates of change on both the linear and log scale. 2. [Prediction with You Draw It](https://shiny.srvanderplas.com/you-draw-it/) ✏️ - Tests an individual's ability to make predictions for exponentially increasing data.
+ Eye Fitting Straight Lines in the Modern Era + Prediction of Exponentially Increasing Trends 3. **Estimation by Numerical Translation** 📏 - **Tests an individual's ability to translate a graph of exponentially increasing data into real value quantities.** --- class:primary # Phrasing of Estimation Questions Spence (1990) presents four example questions for comparing the sizes of individual graphical elements: + **How much greater** was the rainfall in September than May? + `\(\implies\)` *estimate numeric change* + Is the price of oil in constant dollars **increasing or decreasing** from year to year? + `\(\implies\)` *determine increasing or decreasing* + **Do more** people subscribe **to** Time **than** Newsweek? + `\(\implies\)` *determine yes or no* + **Did** the ABC Corporation pay the largest dividends last year, **or** did XYZ? + `\(\implies\)` *determine ABC or XYZ* --- class:primary # Phrasing of Estimation Questions .pull-left[ Amer (2005) presents a cost-volume-profit graph with two crossing lines and asks participants to **estimate** three values: ].pull-right[ .center[ <!-- Trigger the Modal --> <img id='imgamerpoggendorffillusion' src='images/amer-poggendorff-illusion.png' alt='(Amer, 2005)' width='80%'> <!-- The Modal --> <div id='modalamerpoggendorffillusion' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodalamerpoggendorffillusion'> <!-- Modal Caption (Image Text) --> <div id='captionamerpoggendorffillusion' class='modal-caption'></div> </div> ] ] + the **amount** of total revenues/total costs on the ordinate **corresponding to the endpoint** of the total-revenue line plotted on the graph. + `\(\implies\)` *estimate numeric value for a given x-value*. + the **amount** of costs/revenues on the ordinate at the **break-even point**, the point where the two lines cross. + `\(\implies\)` *estimate numeric value based on a visual cue (lines crossing)*. ???
--- class:primary # Phrasing of Estimation Questions Dunn (1988) presents two maps indicating the murder rate of each US state and asks participants to: + **write down their estimate** of the murder rate as accurately as possible beside the 24 named states. + `\(\implies\)` *judge color shade or component part to estimate numeric value.* .center[ <!-- Trigger the Modal --> <img id='imgframedmurderratemap' src='images/framed-murder-rate-map.png' alt='(Dunn, 1988)' width='90%'> <!-- The Modal --> <div id='modalframedmurderratemap' class='modal'> <!-- Modal Content (The Image) --> <img class='modal-content' id='imgmodalframedmurderratemap'> <!-- Modal Caption (Image Text) --> <div id='captionframedmurderratemap' class='modal-caption'></div> </div> ] ??? Dunn (1988) assessed the relative accuracy with which quantitative information is extracted from both types of charts. Participants were: + informed that the experiment was designed to test the ability of individuals to "read" or "decode" statistical maps. + shown two maps, an unclassed choropleth map and a framed rectangle chart, indicating the murder rate of each US state. + asked to **write down their estimate** of the murder rate as accurately as possible beside the 24 named states. + `\(\implies\)` *judge color shade or component part to estimate numeric value.* --- class:primary # Prolific Data Collection .pull-left[ Future data collection: + Develop and complete the pilot study for estimation task. + Make any necessary adjustments to the experimental tasks based on pilot studies. + Collect final data via [Prolific](https://app.prolific.co/studies). ].pull-right[ .center[ <img src="images/prolific-log.png" width="50%"/> [(Prolific studies)](https://app.prolific.co/studies) ] ] --- class:primary # References <font size="2"> <p><cite><a id='bib-Aisch_NYTimes_presidential_forecast'></a><a href="#cite-Aisch_NYTimes_presidential_forecast">Aisch, G., N. Cohn, A. Cox, et al.</a> (2016). 
<em>Live Presidential Forecast</em>. URL: <a href="https://www.nytimes.com/elections/2016/forecast/president">https://www.nytimes.com/elections/2016/forecast/president</a>.</cite></p> <p><cite><a id='bib-aisch_cox_quealy_2015'></a><a href="#cite-aisch_cox_quealy_2015">Aisch, G., A. Cox, and K. Quealy</a> (2015). <em>You Draw It: How Family Income Predicts Children's College Chances</em>. URL: <a href="https://www.nytimes.com/interactive/2015/05/28/upshot/you-draw-it-how-family-income-affects-childrens-college-chances.html">https://www.nytimes.com/interactive/2015/05/28/upshot/you-draw-it-how-family-income-affects-childrens-college-chances.html</a>.</cite></p> <p><cite><a id='bib-amer2005bias'></a><a href="#cite-amer2005bias">Amer, T.</a> (2005). “Bias due to visual illusion in the graphical presentation of accounting information”. In: <em>Journal of Information Systems</em> 19.1, pp. 1–18.</cite></p> <p><cite><a id='bib-best_perception_2007'></a><a href="#cite-best_perception_2007">Best, L. A., L. D. Smith, and D. A. Stubbs</a> (2007). “Perception of Linear and Nonlinear Trends: Using Slope and Curvature Information to Make Trend Discriminations”. In: <em>Perceptual and Motor Skills</em> 104.3. Publisher: SAGE Publications Inc, pp. 707–721. ISSN: 0031-5125. DOI: <a href="https://doi.org/10.2466/pms.104.3.707-721">10.2466/pms.104.3.707-721</a>. URL: <a href="https://doi.org/10.2466/pms.104.3.707-721">https://doi.org/10.2466/pms.104.3.707-721</a> (visited on Jul. 06, 2020).</cite></p> <p><cite><a id='bib-buchanan_park_pearce_2017'></a><a href="#cite-buchanan_park_pearce_2017">Buchanan, L., H. Park, and A. Pearce</a> (2017). <em>You Draw It: What Got Better or Worse During Obama's Presidency</em>. 
URL: <a href="https://www.nytimes.com/interactive/2017/01/15/us/politics/you-draw-obama-legacy.html">https://www.nytimes.com/interactive/2017/01/15/us/politics/you-draw-obama-legacy.html</a>.</cite></p> <p><cite><a id='bib-burnmurdoch_2020'></a><a href="#cite-burnmurdoch_2020">Burn-Murdoch, J., C. Nevitt, C. Tilford, et al.</a> (2020). <em>Coronavirus tracked: has the epidemic peaked near you?</em> URL: <a href="https://ig.ft.com/coronavirus-chart/?areas=eur">https://ig.ft.com/coronavirus-chart/?areas=eur</a>.</cite></p> <p><cite><a id='bib-carlson2010psychology'></a><a href="#cite-carlson2010psychology">Carlson, N. R.</a> (2010). <em>Psychology: The science of behaviour</em>. Pearson Education.</cite></p> <p><cite><a id='bib-chandar2012graph'></a><a href="#cite-chandar2012graph">Chandar, N., D. Collier, and P. Miranti</a> (2012). “Graph standardization and management accounting at AT&T during the 1920s”. In: <em>Accounting History</em> 17.1, pp. 35–62.</cite></p> <p><cite><a id='bib-dehaene2008log'></a><a href="#cite-dehaene2008log">Dehaene, S., V. Izard, E. Spelke, et al.</a> (2008). “Log or linear? Distinct intuitions of the number scale in Western and Amazonian indigene cultures”. In: <em>science</em> 320.5880, pp. 1217–1220.</cite></p> <p><cite><a id='bib-bavel_using_2020'></a><a href="#cite-bavel_using_2020">Van Bavel, Jay J, Baicker, Katherine, Boggio, Paulo S, et al.</a> (2020). “Using social and behavioural science to support COVID-19 pandemic response”. In: <em>Nature human behaviour</em> 4.5, pp. 460–471.</cite></p> </font> --- class:primary # References <font size="2"> <p><cite><a id='bib-dunn1988framed'></a><a href="#cite-dunn1988framed">Dunn, R.</a> (1988). “Framed rectangle charts or statistical maps with shading: An experiment in graphical perception”. In: <em>The American Statistician</em> 42.2, pp. 123–129.</cite></p> <p><cite><a id='bib-fagen-ulmschneider_2020'></a><a href="#cite-fagen-ulmschneider_2020">Fagen-Ulmschneider, W.</a> (2020). 
<em>91-DIVOC</em>. URL: <a href="https://91-divoc.com/pages/covid-visualization/">https://91-divoc.com/pages/covid-visualization/</a>.</cite></p> <p><cite><a id='bib-fechner1860elemente'></a><a href="#cite-fechner1860elemente">Fechner, G. T.</a> (1860). <em>Elemente der psychophysik</em>. Vol. 2. Breitkopf u. Härtel.</cite></p> <p><cite><a id='bib-finney_subjective_1951'></a><a href="#cite-finney_subjective_1951">Finney, D. J.</a> (1951). “Subjective Judgment in Statistical Analysis: An Experimental Study”. In: <em>Journal of the Royal Statistical Society. Series B (Methodological)</em> 13.2. Publisher: [Royal Statistical Society, Wiley], pp. 284–297. ISSN: 0035-9246. URL: <a href="https://www.jstor.org/stable/2984070">https://www.jstor.org/stable/2984070</a> (visited on Mar. 31, 2021).</cite></p> <p><cite><a id='bib-global_epidemics_2021'></a><a href="#cite-global_epidemics_2021">Global Epidemics</a> (2021). <em>Risk Levels</em>. URL: <a href="https://globalepidemics.org/key-metrics-for-covid-suppression/">https://globalepidemics.org/key-metrics-for-covid-suppression/</a>.</cite></p> <p><cite><a id='bib-goldstein2016sensation'></a><a href="#cite-goldstein2016sensation">Goldstein, E. B. and J. Brockmole</a> (2016). <em>Sensation and perception</em>. Cengage Learning.</cite></p> <p><cite><a id='bib-gouretski2007much'></a><a href="#cite-gouretski2007much">Gouretski, V. and K. P. Koltermann</a> (2007). “How much is the ocean really warming?” In: <em>Geophysical Research Letters</em> 34.1.</cite></p> <p><cite><a id='bib-harms1991august'></a><a href="#cite-harms1991august">Harms, H.</a> (1991). “August Friedrich Wilhelm Crome (1753-1833) Autor begehrter Wirtschaftskarten”. In: <em>Cartographica Helvetica</em> 3, pp. 33–38.</cite></p> <p><cite><a id='bib-jones_generalized_1979'></a><a href="#cite-jones_generalized_1979">Jones, G. V.</a> (1979). “A generalized polynomial model for perception of exponential series”. En. In: <em>Perception & Psychophysics</em> 25.3, pp. 
232–234. ISSN: 0031-5117, 1532-5962. DOI: <a href="https://doi.org/10.3758/BF03202992">10.3758/BF03202992</a>. URL: <a href="http://link.springer.com/10.3758/BF03202992">http://link.springer.com/10.3758/BF03202992</a> (visited on May. 19, 2020).</cite></p> <p><cite><a id='bib-katz_2017'></a><a href="#cite-katz_2017">Katz, J.</a> (2017). <em>You Draw It: Just How Bad Is the Drug Overdose Epidemic?</em> URL: <a href="https://www.nytimes.com/interactive/2017/04/14/upshot/drug-overdose-epidemic-you-draw-it.html">https://www.nytimes.com/interactive/2017/04/14/upshot/drug-overdose-epidemic-you-draw-it.html</a>.</cite></p> <p><cite><a id='bib-lewandowsky_perception_1989'></a><a href="#cite-lewandowsky_perception_1989">Lewandowsky, S. and I. Spence</a> (1989). “The Perception of Statistical Graphs”. En. In: <em>Sociological Methods & Research</em> 18.2-3, pp. 200–242. ISSN: 0049-1241, 1552-8294. DOI: <a href="https://doi.org/10.1177/0049124189018002002">10.1177/0049124189018002002</a>. URL: <a href="http://journals.sagepub.com/doi/10.1177/0049124189018002002">http://journals.sagepub.com/doi/10.1177/0049124189018002002</a> (visited on May. 29, 2020).</cite></p> </font> --- class:primary # References <font size="2"> <p><cite><a id='bib-luraschi_r2d3'></a><a href="#cite-luraschi_r2d3">Luraschi, J. and J. Allaire</a> (2018). <em>r2d3: Interface to 'D3' Visualizations</em>. R package version 0.2.3. URL: <a href="https://CRAN.R-project.org/package=r2d3">https://CRAN.R-project.org/package=r2d3</a>.</cite></p> <p><cite><a id='bib-mackinnon_feedback_1991'></a><a href="#cite-mackinnon_feedback_1991">Mackinnon, A. J. and A. J. Wearing</a> (1991). “Feedback and the forecasting of exponential change”. En. In: <em>Acta Psychologica</em> 76.2, pp. 177–191. ISSN: 00016918. DOI: <a href="https://doi.org/10.1016/0001-6918(91)90045-2">10.1016/0001-6918(91)90045-2</a>. 
URL: <a href="https://linkinghub.elsevier.com/retrieve/pii/0001691891900452">https://linkinghub.elsevier.com/retrieve/pii/0001691891900452</a> (visited on May. 19, 2020).</cite></p> <p><cite><a id='bib-mosteller_eye_1981'></a><a href="#cite-mosteller_eye_1981">Mosteller, F., A. F. Siegel, E. Trapido, et al.</a> (1981). “Eye fitting straight lines”. In: <em>The American Statistician</em> 35.3, pp. 150–152.</cite></p> <p><cite><a id='bib-munroe_2005'></a><a href="#cite-munroe_2005">Munroe, R.</a> (2005). <em>Log Scale</em>. URL: <a href="https://xkcd.com/1162/">https://xkcd.com/1162/</a>.</cite></p> <p><cite><a id='bib-myers_dewall_2021'></a><a href="#cite-myers_dewall_2021">Myers, D. G. and C. N. DeWall</a> (2021). <em>Psychology</em>. Worth Publishers.</cite></p> <p><cite><a id='bib-playfair1801statistical'></a><a href="#cite-playfair1801statistical">Playfair, W.</a> (1801). “The statistical breviary; shewing, on a principle entirely new, the resources of every state and kingdom in Europe, Wallis, Londres”. In: <em>Press, Chicago</em>.</cite></p> <p><cite><a id='bib-romano_scale_2020'></a><a href="#cite-romano_scale_2020">Romano, Alessandro, Sotis, Chiara, Dominioni, Goran, et al.</a> (2020). <em>The Scale of COVID-19 Graphs Affects Understanding, Attitudes, and Policy Preferences</em>. En. SSRN Scholarly Paper ID 3588511. Rochester, NY: Social Science Research Network. DOI: <a href="https://doi.org/10.2139/ssrn.3588511">10.2139/ssrn.3588511</a>. URL: <a href="https://papers.ssrn.com/abstract=3588511">https://papers.ssrn.com/abstract=3588511</a> (visited on Nov. 30, 2020).</cite></p> <p><cite><a id='bib-rost_2020'></a><a href="#cite-rost_2020">Rost, L. C.</a> (2020). <em>You've informed the public with visualizations about the coronavirus. Thank you.</em> URL: <a href="https://blog.datawrapper.de/coronavirus-data-visualization-effect-datawrapper/">https://blog.datawrapper.de/coronavirus-data-visualization-effect-datawrapper/</a>.</cite></p> <p><cite>Shah, P. and P. 
A. Carpenter (1995). “Conceptual limitations in comprehending line graphs.” In: <em>Journal of Experimental Psychology: General</em> 124.1, p. 43.</cite></p> <p><cite><a id='bib-siegler_numerical_2017'></a><a href="#cite-siegler_numerical_2017">Siegler, R. S. and D. W. Braithwaite</a> (2017). “Numerical Development”. En. In: <em>Annual Review of Psychology</em> 68.1, pp. 187–213. ISSN: 0066-4308, 1545-2085. DOI: <a href="https://doi.org/10.1146/annurev-psych-010416-044101">10.1146/annurev-psych-010416-044101</a>. URL: <a href="http://www.annualreviews.org/doi/10.1146/annurev-psych-010416-044101">http://www.annualreviews.org/doi/10.1146/annurev-psych-010416-044101</a> (visited on May. 19, 2020).</cite></p> </font> --- class:primary # References <font size="2"> <p><cite>Silver, N. (2020). <em>2020 Election Forecast</em>. URL: <a href="https://projects.fivethirtyeight.com/2020-election-forecast/">https://projects.fivethirtyeight.com/2020-election-forecast/</a>.</cite></p> <p><cite><a id='bib-spence_visual_1990'></a><a href="#cite-spence_visual_1990">Spence, I.</a> (1990). “Visual psychophysics of simple graphical elements.” En. In: <em>Journal of Experimental Psychology: Human Perception and Performance</em> 16.4, pp. 683–692. ISSN: 1939-1277, 0096-1523. DOI: <a href="https://doi.org/10.1037/0096-1523.16.4.683">10.1037/0096-1523.16.4.683</a>. URL: <a href="http://doi.apa.org/getdoi.cfm?doi=10.1037/0096-1523.16.4.683">http://doi.apa.org/getdoi.cfm?doi=10.1037/0096-1523.16.4.683</a> (visited on May. 29, 2020).</cite></p> <p><cite><a id='bib-unwin_why_2020'></a><a href="#cite-unwin_why_2020">Unwin, A.</a> (2020). “Why is Data Visualization Important? What is Important in Data Visualization?” En. In: <em>Harvard Data Science Review</em>. DOI: <a href="https://doi.org/10.1162/99608f92.8ae4d525">10.1162/99608f92.8ae4d525</a>. URL: <a href="https://hdsr.mitpress.mit.edu/pub/zok97i7p">https://hdsr.mitpress.mit.edu/pub/zok97i7p</a> (visited on Apr. 
27, 2020).</cite></p> <p><cite><a id='bib-vanderplas2020testing'></a><a href="#cite-vanderplas2020testing">Vanderplas, S., D. Cook, and H. Hofmann</a> (2020). “Testing Statistical Charts: What makes a good graph?” In: <em>Annual Review of Statistics and Its Application</em> 7, pp. 61–88.</cite></p> <p><cite><a id='bib-varshney_why_2013'></a><a href="#cite-varshney_why_2013">Varshney, L. R. and J. Z. Sun</a> (2013). “Why do we perceive logarithmically?” En. In: <em>Significance</em> 10.1, pp. 28–31. ISSN: 17409705. DOI: <a href="https://doi.org/10.1111/j.1740-9713.2013.00636.x">10.1111/j.1740-9713.2013.00636.x</a>. URL: <a href="http://doi.wiley.com/10.1111/j.1740-9713.2013.00636.x">http://doi.wiley.com/10.1111/j.1740-9713.2013.00636.x</a> (visited on May. 07, 2020).</cite></p> <p><cite><a id='bib-vonbergmann_2021'></a><a href="#cite-vonbergmann_2021">Von Bergmann, J.</a> (2021). <em>xkcd_exponential: Public Health vs Scientists</em>. URL: <a href="https://github.com/mountainMath/xkcd_exponential">https://github.com/mountainMath/xkcd_exponential</a>.</cite></p> <p><cite><a id='bib-wagenaar_misperception_1975'></a><a href="#cite-wagenaar_misperception_1975">Wagenaar, W. A. and S. D. Sagaria</a> (1975). “Misperception of exponential growth”. En. In: <em>Perception & Psychophysics</em> 18.6, pp. 416–422. ISSN: 0031-5117, 1532-5962. DOI: <a href="https://doi.org/10.3758/BF03204114">10.3758/BF03204114</a>. URL: <a href="http://link.springer.com/10.3758/BF03204114">http://link.springer.com/10.3758/BF03204114</a> (visited on Jul. 07, 2020).</cite></p> <p><cite><a id='bib-walker2013statistical'></a><a href="#cite-walker2013statistical">Walker, F. A.</a> (2013). <em>Statistical atlas of the United States based on the results of the ninth census 1870 with contributions from many eminent men of science and several departments of the government</em>.</cite></p> <p><cite><a id='bib-wilkinson2012grammar'></a><a href="#cite-wilkinson2012grammar">Wilkinson, L.</a> (2012). 
“The grammar of graphics”. In: <em>Handbook of computational statistics</em>. Springer, pp. 375–414.</cite></p> <p><cite><a id='bib-yates1985graphs'></a><a href="#cite-yates1985graphs">Yates, J.</a> (1985). “Graphs as a managerial tool: A case study of Du Pont's use of graphs in the early twentieth century”. In: <em>The Journal of Business Communication (1973)</em> 22.1, pp. 5–33.</cite></p> </font> --- class:inverse <br> <br> <br> <br> <br> <br> <br> <br> .center[ # Questions and Discussion ] --- class:primary # Research Objectives **Big Idea:** Are there benefits to displaying exponentially increasing data on a log scale rather than a linear scale? 1. [**Perception through Lineups**](https://shiny.srvanderplas.com/log-study/) 📈 📈 📈 - **Test an individual's ability to perceptually differentiate exponentially increasing data with differing rates of change on both the linear and log scale.** 2. [Prediction with You Draw It](https://shiny.srvanderplas.com/you-draw-it/) ✏️ - Tests an individual's ability to make predictions for exponentially increasing data. 3. Estimation by Numerical Translation 📏 - Tests an individual's ability to translate a graph of exponentially increasing data into real value quantities.
---
class:inverse
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
.center[
# Perception through lineups
]

---
class:primary
# Lineup Experimental Task

Study Participant Prompt: *Which plot is most different?*

.center[
<!-- Trigger the Modal -->
<img id='imglinearlineupexample' src='images/linear-lineup-example.png' alt=' ' width='45%'>
<!-- Trigger the Modal -->
<img id='imgloglineupexample' src='images/log-lineup-example.png' alt=' ' width='45%'>
<!-- The Modal -->
<div id='modallinearlineupexample' class='modal'>
<!-- Modal Content (The Image) -->
<img class='modal-content' id='imgmodallinearlineupexample'>
<!-- Modal Caption (Image Text) -->
<div id='captionlinearlineupexample' class='modal-caption'></div>
</div>
<!-- The Modal -->
<div id='modalloglineupexample' class='modal'>
<!-- Modal Content (The Image) -->
<img class='modal-content' id='imgmodalloglineupexample'>
<!-- Modal Caption (Image Text) -->
<div id='captionloglineupexample' class='modal-caption'></div>
</div>
]

---
class:primary
# Study Design

**Curvature:**

+ High Curvature
+ Medium Curvature
+ Low Curvature

**Treatment Design:** Target Panel gets model A and Null Panels get model B

`\(3 \cdot 2 = 6\)` curvature combinations `\(\times 2\)` lineup data sets per combination `\(=\)` **12 test data sets** `\(\times 2\)` scales (log & linear) `\(=\)` **24 different lineup plots**

**Experimental Design:** 13 lineup plots per participant

`\(6\)` test parameter combinations per participant `\(\times 2\)` scales `\(= 12\)` test lineups

`\(1\)` Rorschach parameter combination per participant

---
class:primary
# Generalized Linear Mixed Model

Define `\(Y_{ijkl}\)` to be the event that participant `\(l\)` correctly identifies the target plot for data set `\(k\)` with curvature `\(j\)` plotted on scale `\(i\)`.

`$$\text{logit }P(Y_{ijkl}) = \eta + \delta_i + \gamma_j + \delta \gamma_{ij} + s_l + d_k$$`

where

- `\(\eta\)` is the baseline average log odds of selecting the target plot.
- `\(\delta_i\)` is the effect of the log/linear scale.
- `\(\gamma_j\)` is the effect of the curvature combination.
- `\(\delta\gamma_{ij}\)` is the two-way interaction effect of the scale and curvature.
- `\(s_l \sim N(0,\sigma^2_{\text{participant}})\)`, random effect for participant characteristics.
- `\(d_k \sim N(0,\sigma^2_{\text{data}})\)`, random effect for data-specific characteristics.

We assume that random effects for data set and participant are independent.

???

Each lineup plot evaluated was assigned a value based on the participant response (correct = 1, not correct = 0). The binary response was analyzed using a generalized linear mixed model with a binomial distribution and a logit link function.

---
class:primary
# Results

.center[
<img src="images/lineup-results.png" width="100%"/>
]

???

+ The choice of scale has no impact if curvature differences are large.
+ Presenting data on the log scale makes us more sensitive to slight changes in curvature.
+ An exception occurs when identifying a plot with more curvature than the surrounding plots, indicating that it is more difficult to say something has less curvature, but easy to say that something has more curvature (Best, Smith, and Stubbs, 2007).

---
class:primary
# Research Objectives

**Big Idea:** Are there benefits to displaying exponentially increasing data on a log scale rather than a linear scale?

1. [Perception through Lineups](https://shiny.srvanderplas.com/log-study/) 📈 📈 📈
    - Tests an individual's ability to perceptually differentiate exponentially increasing data with differing rates of change on both the linear and log scale.
2. [**Prediction with You Draw It**](https://shiny.srvanderplas.com/you-draw-it/) ✏️
    - **Tests an individual's ability to make predictions for exponentially increasing data.**
3. Estimation by Numerical Translation 📏
    - Tests an individual's ability to translate a graph of exponentially increasing data into real value quantities.

---
class:primary
# Sum of Squares Analysis

Define `\(SS_{ijk}\)` as the sum of squares for parameter choice `\(i = 1,2,3,4\)`, fit `\(j=1,2\)`, and participant `\(k = 1,...,N_{participant}\)`. The LMM equation is given by:

`\begin{equation} \log\left(SS_{ijk}\right) = \alpha_i + \beta_j + \alpha\beta_{ij} + p_{k} + \epsilon_{ijk} \end{equation}`

+ `\(\alpha_i\)` denotes the effect of the `\(i^{th}\)` parameter choice
+ `\(\beta_j\)` denotes the effect of the `\(j^{th}\)` fit
+ `\(\alpha\beta_{ij}\)` denotes the interaction between the `\(i^{th}\)` parameter choice and `\(j^{th}\)` fit
+ `\(p_{k} \sim N(0, \sigma^2_{participant})\)` is the random effect due to the `\(k^{th}\)` participant's characteristics
+ `\(\epsilon_{ijk} \sim N(0, \sigma^2)\)` is the residual error.

???

Sums of squares of the vertical differences between the drawn and fitted values were calculated and compared between the OLS and PCA fits. The comparison used a linear mixed model (LMM) on the log-transformed sums of squares, fit with the `lmer` function in the `lme4` package in R.

---
class:primary
# Sum of Squares Results

<br>
<br>
.center[
<img src="images/eyefitting-ss-plot.jpg" width="80%"/>
]

???

+ No significant effect of fit for any parameter choices.
+ Indication of the trend previously shown in the residual LMM and GAMM; not enough power to detect.
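
The response used in the sum of squares LMM can be sketched in a few lines. This is a minimal illustration only, assuming the drawn and fitted values are recorded at the same x positions. The function name `sum_of_squares` and all numbers are hypothetical and are not study data. The model itself was fit in R with `lmer`, which this Python sketch does not reproduce.

```python
import math


def sum_of_squares(drawn, fitted):
    """Sum of squared vertical differences between drawn and fitted y-values."""
    if len(drawn) != len(fitted):
        raise ValueError("drawn and fitted must have the same length")
    return sum((d - f) ** 2 for d, f in zip(drawn, fitted))


# Hypothetical y-values at shared x positions (illustrative, not study data).
drawn = [1.0, 2.5, 3.0, 4.5]
ols_fit = [1.0, 2.0, 3.0, 4.0]
pca_fit = [0.5, 2.0, 3.5, 5.0]

ss_ols = sum_of_squares(drawn, ols_fit)
ss_pca = sum_of_squares(drawn, pca_fit)

# The LMM response is the log of each sum of squares.
log_ss = {"OLS": math.log(ss_ols), "PCA": math.log(ss_pca)}
```

Each participant contributes one such log sum of squares per parameter choice and fit, which is what the random participant effect accounts for.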