site

files for beauhilton.com
git clone https://git.beauhilton.com/site.git

index.html (8828B)


      1 <!DOCTYPE html>
      2 <html lang="en">
      3  <head>
      4   <link rel="stylesheet" href="/style.css" type="text/css">
      5   <meta charset="utf-8">
      6   <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      7   <meta name="viewport" content="width=device-width, initial-scale=1.0">
      8   <link rel="stylesheet" type="text/css" href="/style.css">
      9   <link rel="icon" href="data:image/svg+xml,<svg xmlns=%22http://www.w3.org/2000/svg%22 viewBox=%220 0 100 100%22><text y=%22.9em%22 font-size=%2290%22>🏕️</text></svg>">
     10   <title>R Markdown is my spirit animal</title>
     11  </head>
     12  <body>
     13   <div id="page-wrapper">
     14    <div id="header" role="banner">
     15     <header class="banner">
     16      <div id="banner-text">
     17       <span class="banner-title"><a href="/">beauhilton</a></span>
     18      </div>
     19     </header>
     20     <nav>
     21      <a href="/about">about</a>
     22 <a href="/now">now</a>
     23 <a href="/thanks">thanks</a>
     24 <a class="nav-active" href="/posts">posts</a>
     25 <a href="https://notes.beauhilton.com">notes</a>
     26 <a href="https://talks.beauhilton.com">talks</a>
     27 <a href="https://git.beauhilton.com">git</a>
     28 <a href="/contact">contact</a>
     29 <a href="/atom.xml">rss</a>
     30     </nav>
     31    </div>
     32    <main>
     33     <h1>
     34      R Markdown is my spirit animal
     35     </h1>
     36     <p>
     37      <time id="post-date">2019-10-20</time>
     38     </p>
     39     <h2>
     40      update 2022-12-02
     41     </h2>
     42     <p>
     43      The R Markdown folks have a new project called <a href="https://quarto.org">Quarto</a>. It looks like it does everything
     44 R Markdown does, plus more.
     45     </p>
     46     <h2>
     47      original post
     48     </h2>
     49     <p>
     50      In a [previous post]({% post_url 2019-06-10-python-write-my-paper %})
     51 I talked about how easy it is, if you’re already doing your own stats
     52 anyway in some research project, to have a Python script output
     53 paragraphs with all the stats written out and updated for you to add
     54 into your paper.
     55     </p>
     56     <p>
     57      The main problem with the approach I outlined was how to get those
     58 nicely updated paragraphs into the document you are sharing with
     59 colleagues.
     60     </p>
     61     <p>
     62      Medicine, in particular, seems wed to Microsoft Word documents for
     63 manuscripts. Word does not have a great way to include text from
     64 arbitrary files, forcing the physician-scientist to manually copy and
     65 paste those beautifully automated paragraphs. As I struggled with this,
     66 I thought (here cue Raymond Hettinger), “There must be a better
     67 way.”
     68     </p>
     69     <p id="post-excerpt">
     70      Turns out that a better way does exist, and it is R Markdown.
     71     </p>
     72     <p>
     73      Though I was at first resistant to learning about R Markdown, mostly
     74 because I am proficient in Python and thought the opportunity cost for
     75 learning R at this point would be too high, as soon as I saw it demoed I
     76 changed my tune. Here’s why.
     77     </p>
     78     <h2>
     79      Writing text
     80     </h2>
     81     <ul>
     82      <li>
     83       R Markdown is mostly markdown.
     84       <ul>
     85        <li>
     86         Markdown is by far the easiest way to write plaintext documents,
     87 especially if you want to apply formatting later on without worrying
     88 about the specifics while you’re writing (e.g. <code>#</code> just
     89 specifies a header - you can decide how you want the headers to look
     90 later, and that styling will automatically be applied).
     91        </li>
     92        <li>
     93         Plaintext is beautiful. It costs nearly nothing in terms of raw
     94 storage, and is easy to keep within a version control system. Markdown
     95 plaintext is human-readable whether or not the styling has been applied.
     96 Your ideas will never be hidden in a proprietary format that requires
     97 special software to read.
     98        </li>
     99        <li>
    100         I had been transitioning to writing in Markdown anyway, so +1 for R
    101 Markdown.
    102        </li>
    103       </ul>
    104      </li>
    105      <li>
    106       R Markdown is also a little LaTeX.
    107       <ul>
    108        <li>
    109         LaTeX is <a href="https://tex.stackexchange.com/questions/1319/showcase-of-beautiful-typography-done-in-tex-friends">gorgeous</a>
    110 and wonderful, the most flexible and expressive of all the typesetting
    111 tools (though not as fast as our old friend Groff…). It also has a
    112 steeper learning curve than Markdown, and is not so pretty on the screen
    113 in its raw form. R Markdown lets you do the bulk of your work in simple
    114 Markdown, then seamlessly invoke LaTeX when you need something a little
    115 fancier.
    116        </li>
    117       </ul>
    118      </li>
    119      <li>
    120       R Markdown is also a little HTML.
    121       <ul>
    122        <li>
    123         HTML is also expressive, and can be gorgeous and wonderful. It is a
    124 pain to write. As with LaTeX, you can simply drop in some HTML where you
    125 need it, and R Markdown will deal with it as necessary.
    126        </li>
    127       </ul>
    128      </li>
    129      <li>
    130       R Markdown is academic-friendly.
    131       <ul>
    132        <li>
    133         Citations and formatting guidelines for different journals are the
    134 tedious banes of any academic’s existence. R Markdown has robust support
    135 for adding in citations that will be properly formatted in any desired
    136 style, just by changing a tag at the top of the document. Got a
    137 rejection from Journal 1 and want to submit to Journal 2, which has a
    138 completely different set of citation styles and manuscript formatting?
    139 NBD.
    140        </li>
    141       </ul>
    142      </li>
    143     </ul>
    144     <h2>
    145      Writing code
    146     </h2>
    147     <p>
    148      R Markdown, as the name implies, can also run R code. Any analysis
    149 you can dream of in R can be included in your document, and you can
    150 choose whether you want to show the code and its output, the output
    151 alone, or the code alone. People will think you went through all the
    152 work of making that figure, editing it in PowerPoint, screenshotting it
    153 to a .png, then dropping that .png file into your manuscript, but the
    154 truth is… you scripted all of that, so the manuscript itself made the
    155 .png and included it where it needed to go.
    156     </p>
    157     <p>
    158      R Markdown is by no means restricted to R code. This is the killer
    159 app that won me over. Simply by specifying that a given code block is
    160 Python, and installing a little tool (<code>reticulate</code>) that
    161 allows R to interface with Python, I can run arbitrary Python code
    162 within the document and capture the output however I want. That results
    163 paragraph? Sure. Fancy images of predictions from my machine learning
    164 model? But of course.
    165     </p>
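    <p>
     As a rough illustration of the "results paragraph" idea, here is a minimal sketch of the kind of Python that could live in a chunk marked as Python. The function name <code>results_paragraph</code> and the sample numbers are hypothetical, made up for this example, and the code uses only the standard library.
    </p>

```python
from statistics import mean, stdev


def results_paragraph(group_name, values):
    """Return one sentence summarizing a list of numeric measurements."""
    return (
        f"The {group_name} group (n = {len(values)}) had a mean of "
        f"{mean(values):.1f} (SD {stdev(values):.1f})."
    )


# Hypothetical data, for illustration only.
ages = [61, 58, 70, 65, 66]
print(results_paragraph("treatment", ages))
```

    <p>
     Because the sentence is rebuilt from the data every time the document is compiled, the numbers in the text cannot drift out of sync with the analysis.
    </p>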
    166     <p>
    167      If you don’t want to use any R code ever, that’s fine. R Markdown
    168 doesn’t mind. Use SAS, MATLAB (via Octave), heck, even bash scripts -
    169 the range of language support is fantastic.
    170     </p>
    171     <h2>
    172      Working with friends
    173     </h2>
    174     <p>
    175      R Markdown can be compiled to pretty much any format you can dream
    176 of. My current setup simultaneously puts out an HTML document (that can
    177 be opened in any web browser), a PDF (because I love PDFs), and (AND!) a
    178 .docx Word file, all beautifully formatted, on demand, whenever I hit my
    179 keyboard shortcut. I can preview the PDF or HTML as I write, have a
    180 .docx to send to my PI, and life is good.
    181     </p>
    182     <p>
    183      Also, because you can write in many programming languages, you can
    184 easily collaborate with researchers who are comfortable in different
    185 paradigms. You can pass data back and forth between your chosen
    186 languages (for me, R and Python), either directly or by saving
    187 intermediate data to a format that both languages can read.
    188     </p>
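    <p>
     The "intermediate format" route can be sketched from the Python side like this, assuming CSV as the shared format. The helper names, the column names, and the file name <code>survival.csv</code> are all hypothetical; on the R side, something like <code>read.csv</code> could pick up the same file.
    </p>

```python
import csv
import os
import tempfile


def save_rows(path, fieldnames, rows):
    """Write a list of dicts to a CSV file that any language can read."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def load_rows(path):
    """Read the CSV back as a list of dicts (all values come back as strings)."""
    with open(path, newline="") as f:
        return [dict(row) for row in csv.DictReader(f)]


# Hypothetical data, for illustration only.
rows = [
    {"patient_id": "1", "months": "12.5"},
    {"patient_id": "2", "months": "30.0"},
]
path = os.path.join(tempfile.mkdtemp(), "survival.csv")
save_rows(path, ["patient_id", "months"], rows)
round_trip = load_rows(path)
```

    <p>
     CSV loses type information (everything is read back as text), so each language has to convert columns itself. That trade-off is often acceptable for small tables passed between collaborators.
    </p>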
    189     <h2>
    190      Automating tasks
    191     </h2>
    192     <p>
    193      Many analyses and their manuscripts, especially if they use similar
    194 techniques (e.g. survival modeling), are rather formulaic. Many
    195 researchers have scripts they keep around and tweak for new analyses
    196 revolving around the same basic subject matter or approach. With R
    197 Markdown, your entire manuscript becomes a runnable program, further
    198 automating the boring parts of getting research out into the open.
    199     </p>
    200     <p>
    201      One of the <a href="https://www.youtube.com/watch?v=MIlzQpXlJNk">first
    202 introductions</a> I had to R Markdown shared the remarkable idea of
    203 setting the file to run on a regular basis, generating a report based on
    204 any updated data, and then sending this report to all the interested
    205 parties automatically. While much academic work could not be so fully
    206 automated, parts of it certainly can be.
    207     </p>
    208     <p>
    209      Perhaps your team is building a database for outcomes in a given
    210 disease, and has specified the analysis in great detail beforehand. One
    211 of my mentors gives the advice that in any project proposal you should
    212 go as far as to mock up the results section, including all figures, so
    213 you make sure you are collecting the right data. If this were done in an
    214 R Markdown document rather than a simple Word document, you could have
    215 large parts of the template manuscript become the real manuscript as the
    216 database fleshes out over time. Then when it’s done, look over the data,
    217 make additions and subtractions as needed, write the discussion
    218 sections, and send it in.
    219     </p>
    220    </main>
    221    <div id="footnotes"></div>
    222    <footer></footer>
    223   </div>
    224  </body>
    225 </html>