index.html (8828B)
<!DOCTYPE html>
<html lang="en">
<head>
<link rel="stylesheet" href="/style.css" type="text/css">
<meta charset="utf-8">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" type="text/css" href="/style.css">
<link rel="icon" href="data:image/svg+xml,<svg xmlns=%22http://www.w3.org/2000/svg%22 viewBox=%220 0 100 100%22><text y=%22.9em%22 font-size=%2290%22>🏕️</text></svg>">
<title></title>
</head>
<body>
<div id="page-wrapper">
<div id="header" role="banner">
<header class="banner">
<div id="banner-text">
<span class="banner-title"><a href="/">beauhilton</a></span>
</div>
</header>
<nav>
<a href="/about">about</a>
<a href="/now">now</a>
<a href="/thanks">thanks</a>
<a class="nav-active" href="/posts">posts</a>
<a href="https://notes.beauhilton.com">notes</a>
<a href="https://talks.beauhilton.com">talks</a>
<a href="https://git.beauhilton.com">git</a>
<a href="/contact">contact</a>
<a href="/atom.xml">rss</a>
</nav>
</div>
<main>
<h1>
R Markdown is my spirit animal
</h1>
<p>
<time id="post-date">2019-10-20</time>
</p>
<h2>
update 2022-12-02
</h2>
<p>
The R Markdown folks have a new project called <a href="https://quarto.org">Quarto</a>, which looks like it does all the things
R Markdown does, plus more.
</p>
<h2>
original post
</h2>
<p>
In a [previous post]({% post_url 2019-06-10-python-write-my-paper %})
I talked about how easy it is, if you’re already doing your own stats
in a research project, to have a Python script output paragraphs with
all the stats written out and kept up to date, ready to add
to your paper.
</p>
<p>
The main problem with the approach I outlined was how to get those
nicely updated paragraphs into the document you are sharing with
colleagues.
</p>
<p>
Medicine, in particular, seems wed to Microsoft Word documents for
manuscripts. Word does not have a great way to include text from
arbitrary files, forcing the physician-scientist to manually copy and
paste those beautifully automated paragraphs. As I struggled with this,
I thought (here cue Raymond Hettinger), “There must be a better
way.”
</p>
<p id="post-excerpt">
Turns out that a better way does exist, and it is R Markdown.
</p>
<p>
Though I was at first resistant to learning about R Markdown, mostly
because I am proficient in Python and thought the opportunity cost for
learning R at this point would be too high, as soon as I saw it demoed I
changed my tune. Here’s why.
</p>
<h2>
Writing text
</h2>
<ul>
<li>
R Markdown is mostly markdown.
<ul>
<li>
Markdown is by far the easiest way to write plaintext documents,
especially if you want to apply formatting later on without worrying
about the specifics while you’re writing (e.g. <code>#</code> just
specifies a header - you can decide how you want the headers to look
later, and that styling will automatically be applied).
</li>
<li>
Plaintext is beautiful. It costs nearly nothing in terms of raw
storage, and is easy to keep within a version control system. Markdown
plaintext is human-readable whether or not the styling has been applied.
Your ideas will never be hidden in a proprietary format that requires
special software to read.
</li>
<li>
I had been transitioning to writing in Markdown anyway, so +1 for R
Markdown.
</li>
</ul>
</li>
<li>
R Markdown is also a little LaTeX.
<ul>
<li>
LaTeX is <a href="https://tex.stackexchange.com/questions/1319/showcase-of-beautiful-typography-done-in-tex-friends">gorgeous</a>
and wonderful, the most flexible and expressive of all the typesetting
tools (though not as fast as our old friend Groff…).
It also has a
steeper learning curve than Markdown, and is not so pretty on the screen
in its raw form. R Markdown lets you do the bulk of your work in simple
Markdown, then seamlessly invoke LaTeX when you need something a little
fancier.
</li>
</ul>
</li>
<li>
R Markdown is also a little HTML.
<ul>
<li>
HTML is also expressive, and can be gorgeous and wonderful. It is a
pain to write. As with LaTeX, you can simply drop in some HTML where you
need it, and R Markdown will deal with it as necessary.
</li>
</ul>
</li>
<li>
R Markdown is academic-friendly.
<ul>
<li>
Citations and formatting guidelines for different journals are the
tedious banes of any academic’s existence. R Markdown has robust support
for adding in citations that will be properly formatted in any desired
style, just by changing a tag at the top of the document. Got a
rejection from Journal 1 and want to submit to Journal 2, which has a
completely different set of citation styles and manuscript formatting?
NBD.
</li>
</ul>
</li>
</ul>
<h2>
Writing code
</h2>
<p>
R Markdown, as the name implies, can also run R code. Any analysis
you can dream of in R can be included in your document, and you can
choose whether you want to show the code and its output, the output
alone, or the code alone. People will think you went through all the
work of making that figure, editing it in PowerPoint, screenshotting it
to a .png, then dropping that .png file into your manuscript, but the
truth is… you scripted all of that, so the manuscript itself made the
.png and included it where it needed to go.
</p>
<p>
R Markdown is by no means restricted to R code. This is the killer
app that won me over.
Simply by specifying that a given code block is
Python, and installing a little tool (<code>reticulate</code>) that
allows R to interface with Python, I can run arbitrary Python code
within the document and capture the output however I want. That results
paragraph? Sure. Fancy images of predictions from my machine learning
model? But of course.
</p>
<p>
If you don’t want to use any R code ever, that’s fine. R Markdown
doesn’t mind. Use SAS, MATLAB (via Octave), heck, even bash scripts -
the range of language support is fantastic.
</p>
<h2>
Working with friends
</h2>
<p>
R Markdown can be compiled to pretty much any format you can dream
of. My current setup simultaneously puts out an HTML document (that can
be opened in any web browser), a PDF (because I love PDFs), and (AND!) a
.docx Word file, all beautifully formatted, on demand, whenever I hit my
keyboard shortcut. I can preview the PDF or HTML as I write, have a
.docx to send to my PI, and life is good.
</p>
<p>
Also, because you can write in any supported programming language, you can
easily collaborate with researchers who are comfortable in different
paradigms. You can pass data back and forth between your chosen
languages (for me, R and Python), either directly or by saving
intermediate data to a format that both languages can read.
</p>
<h2>
Automating tasks
</h2>
<p>
Many analyses and their manuscripts, especially if they use similar
techniques (e.g. survival modeling), are rather formulaic. Many
researchers have scripts they keep around and tweak for new analyses
revolving around the same basic subject matter or approach. With R
Markdown, your entire manuscript becomes a runnable program, further
automating the boring parts of getting research out into the open.
</p>
<p>
One of the <a href="https://www.youtube.com/watch?v=MIlzQpXlJNk">first
introductions</a> I had to R Markdown shared the remarkable idea of
setting the file to run on a regular basis, generating a report based on
any updated data, and then sending this report to all the interested
parties automatically. While much academic work could not be so fully
automated, parts of it certainly can be.
</p>
<p>
Perhaps your team is building a database for outcomes in a given
disease, and has specified the analysis in great detail beforehand. One
of my mentors gives the advice that in any project proposal you should
go as far as to mock up the results section, including all figures, so
you make sure you are collecting the right data. If this were done in an
R Markdown document rather than a simple Word document, large parts of
the template manuscript could become the real manuscript as the
database fills out over time. Then when it’s done, look over the data,
make additions and subtractions as needed, write the discussion
sections, and send it in.
</p>
</main>
<div id="footnotes"></div>
<footer></footer>
</div>
</body>
</html>