RStudio internship week 2
I think I can make it through the summer
By Daniel Chen
June 18, 2019
The main topics and events of last week were:
- Much git.
- Metaprogramming and non-standard evaluation (NSE) in R
- Four 1-hour workshops by Allison Hill on the
summer-of-blogdown
- moving things over from jekyll will take some time
So many of the random things I’ve tinkered with in the past have come front and center. As an educator, I know seeing these things again make learning and understanding them easier. You build on your previous knowledge to solidify, fix, and fill in gaps in your mental model. The process repeats until you get an understanding about a topic.
For me, I’m getting a better foundation to how NSE works and how it all plays together within the Tidyverse.
Git
I got some things merged!
The
pull-request that was broken and merged on my first day finally was
fixed and merged.
I also got to work with the
lintr package and
merged it into grader
too.
This week was probably the first time I’ve used git amend
in a long time (if ever?).
I’ve typically always just made the commit, and run git rebase -i
to squash and/or amend my commits.
I can see why common operations like making changes to the previous commit would have a shortcut.
I typically don’t use these features because it’s another thing to teach, and understanding git rebase -i
is more general than git commit --amend
.
What --amend
allows you to do is replace the previous git commit with another one.
You can fix the commit message, or add/fix file you missed.
Theses are all ways to make the commit history cleaner.
The recipe looks like this:
git add <file>
git commit --amend
<Fix/modify the commit message>
git push -f origin master
The last line does a -f
force push
, because the commit is actually different from the one before you --amend
ed.
R package versions
There’s a convention about version numbering adding a .9000
+ after the patch (e.g., v0.1.4.9000) to show a
a development version number.
You can couple this with the DESCRIPTION
file by forcing a particular version to make sure you and the team
all have access to the same development features.
I came across this in grader
that has a Imports
for learnr (>= 0.9.2.9001)
.
grader
progress
We’re probably
going to change the name of the package to
gradethis
because the package gradeR
(note capital R) was submitted to CRAN right as I started.
Here’s what I learned about the library I’m working on this week:
learnr
set’s thechecker
function- In the knitr chunk
exercise.checker
specifies thechecker
functiontutorial_options(exercise.timelimit = 60, exercise.checker = grade_learnr)
- In this example,
grade_learnr
is the main entrypoint fromlearnr
tograder
and my work starts with this function.
- In the knitr chunk
checker
is called on line 129 inexercise.R
inlearnr
- the
checker
function (i.e.,grade_learnr
) returns a value depending on what is passed into it- if missing: returns a
list()
withmessage
,correct
,type
, andlocation
keys - if error:
graded
object (namedchecked_result
of classgrader_graded
) withcorrect
andmessage
- or evaluated code
- if missing: returns a
- the
There’s a bunch of stuff within the exercise.R
function in learnr
that captures information from shiny,
sets up the knitr
environment, and inserts the output and results into the correct place in the DOM.
That’ll be a separate writeup when I leave the grader
world.
For the next week or so the goal is to update the
check_result
API,
which got me down the rabbit hole of non-standard evaluation in R (I’ll talk about it in a separate set of posts).
Non-standard Evaluation (NSE)
I gave
a talk about writing functions in R which touched on NSE but it was pretty superficial.
Since NSE is so crucial to grader
I’ll write a series of posts about this topic and eventually turn it into a talk.
In the meantime, here are the materials (in no particular order) I’ll be reading to:
- https://dplyr.tidyverse.org/articles/programming.html
- https://tidyeval.tidyverse.org/
- https://ggplot2.tidyverse.org/dev/articles/ggplot2-in-packages.html#using-aes-and-vars-in-a-package-function
- https://adv-r.hadley.nz/
Misc
Other random things I’ve discovered this week
dplyr::count
vs dplyr::tally
From the docs:
tally() is a convenient wrapper for summarise that will either call n() or sum(n) depending on whether you’re tallying for the first time, or re-tallying. count() is similar but calls group_by() before and ungroup() after. If the data is already grouped, count() adds an additional group that is removed afterwards.
dplyr::count
can count observations with 0 counts (useful for group_by operations) with the .drop
argument
pryr::standarise_call
Manipulating the function call is black magic NSE voodoo.
This is the stuff that is happening within grader
that gets student code, solution code, learnr
and grader
arguments
that are all passed into grade_learnr
.
# code below was copy/pasted from the console
my_add <- function(x, y) {
x + y
}
# pass in part of an expression
call <- pryr::standardise_call(quote(my_add(x = 3)))
# on the fly add more parameters!
call$y <- 10
# evaluate the thing
eval(call)
## [1] 13
This all uses the global environment,
but grader
will be doing this type of thing with separate environments for each exercise that will be checked.
Also, this is all how match.call
works in base R.
checkmate
package
There is a
package called checkmate
that is unit testing (e.g., testthat
) on steroids.
It allows you to more specific type and argument checking in R.
I haven’t work with the package personally yet, but it does seem to be like
type hints in Python
and allows more specific checks into what objects in R contain.
Credentials
One of the coolest things about being an intern at RStudio is being on the slack channel!
I try to keep my questions reserved but one of the things that have always bothered me was how store and access credentials for R.
Putting in API keys in .Renviron
are common practice, but I piggy-backed on another intern’s question by asking about
storing passwords more securely than in a plain text file.
I’ve used the
rstudioapi
,
secret
, and
getPass
libraries
before,
but as
Raymond Hettinger always says: There must be a better way.
The resource I was given was to look at how database credentials are stored: https://db.rstudio.com/best-practices/managing-credentials/
Summer of blog down
Lastly, the great Allison Hill hosted a series of blogdown workshops for people who were interested, summer-of-blogdown. It was a total of 4 days and we covered the basics of blogdown, how to pick (the academic) themes, deploying it on netlify, and best ways to maintain the site.
I didn’t realize how amazingly flexible the academic theme was until this workshop. I’ll be sure to move my own website over to blogdown + academic one of these days.
I’m currently trying to find out how to save urls in a common location so they can be maintained in one place and be used in links throughout the site.
The ongoing search for how to use variables in an md
document (tl;dr: you can’t, but you might still be able to do what I want):
https://discourse.gohugo.io/t/variables-in-markdown/7113/12.
It almost seemed that
site variables
were going to be the way to go, but that ended up in a dead end.
What I’m most excited about is the ability to write posts in Rmd
and jupyter notebooks
for R and Python posts.
Things I’ve learned the 4 days:
- Each folder in content is a “section” and each “section” has a “page”
/content/home/
contain widgets- Learn x in y minutes website for TOML
- From Greg Wilson: Put a
LICENSE
andCITATION
+ orcid on your website. Make the librarians happy.