New Website a la Blogdown!

I finally got everything moved over to blogdown with the Hugo Academic theme. Thanks so much to Allison Hill, who ran the summer-of-blogdown tutorial for us RStudio interns. The transition was pretty seamless, mainly because I didn’t really have that much content to move over. The biggest change was that I had to comment out the categories tag in my YAML post headers because it was causing the site to not build.

By Daniel Chen

July 23, 2019

RStudio internship week 2

The main topics and events of last week were: much git; metaprogramming and non-standard evaluation (NSE) in R; and four 1-hour workshops by Allison Hill on the summer-of-blogdown (moving things over from Jekyll will take some time). So many of the random things I’ve tinkered with in the past have come front and center. As an educator, I know that seeing these things again makes learning and understanding them easier.

By Daniel Chen

June 18, 2019

And we’re off! RStudio internship week 1, complete.

I’m still pinching myself about being one of the RStudio interns this year. It’s an unbelievable opportunity and I’ve been half panicked and fighting imposter syndrome since the announcement was made in March. My meeting with Greg Wilson on Friday (2019-06-07) went something like this. Greg Wilson: “How’s the internship going?” Me: “I’m panicked, but really excited.” Greg Wilson: “Good. That’s how interns should feel.” I’m working on the grader package (with Garrett Grolemund and Barret Schloerke) which aims to check code against a solution.

By Daniel Chen

June 10, 2019

rstatsnyc as told by @brookLYNevery1, @dataandme, and autographs

EDIT: I’ve added the notes from @dataandme and linked to people’s Twitter accounts and slides (if I found them). This is probably going to be an ongoing process… Another year and another talk at the NYC R Conference. As always, the conference was filled with excellent speakers (I’m biased here because I was one of them…), food, and people. Brooke Watson (@brookLYNevery1) did a fantastic job illustrating and summarizing all of the talks.

By Daniel Chen

April 22, 2018

Analysis-Based Project Templates

One of the most annoying things you hear people say when they are working with some common code base is “It works on my machine…”. Conversely, one of the more satisfying things is running a script that you are not actively working on and having it run without problems. [Project Templates]({% post_url 2017-05-30-project_templates %}) are one way to address this problem. The [original post]({% post_url 2017-05-30-project_templates %}) about project templates mainly talks about the folder structure, but not so much about the rationale behind why things are the way they are.

By Daniel Chen

January 23, 2018

From VMs to LXC Containers to Docker Containers

Since I joined SDAL, the lab has undergone a few infrastructure-related changes, mainly in how applications are run on the servers. From what I remember, we started with VirtualBox virtual machines, then moved to LXC Linux containers, and we are now rebuilding our entire infrastructure using Docker containers. How we got here: the whole point of using these virtualization and container technologies is that we did not want to install anything directly on the server.

By Daniel Chen

July 7, 2017

Project Templates

Project templates provide some standardized way to organize files. Our lab uses a template that is based on the Noble 2009 paper, “A Quick Guide to Organizing Computational Biology Projects”. I’ve created a simple shell script that automatically generates this folder structure here, and there’s an rr-init project by the Reproducible Science Curriculum folks. The structure we have in our lab looks like this:

    project
    |
    |- data             # raw and primary data, are not changed once created
    |  |
    |  |- project_data  # subfolder that links to an encrypted data storage container
    |  |  |
    |  |  |- original   # raw data, will not be altered
    |  |  |- working    # intermediate datasets from src code
    |  +  +- final      # datasets used in analysis
    |
    |- src/             # any programmatic code
    |  |- user1         # user1 assigned to the project
    |  +- user2         # user2 assigned to the project
    |
    |- output           # all output and results from workflows and analyses
    |  |- figures/      # graphs, likely designated for manuscript figures
    |  |- pictures/     # diagrams, images, and other non-graph graphics
    |  +- analysis/     # generated reports for (e.
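As a rough illustration of how a generator for the layout listed above could work, here is a minimal sketch in Python. It is not the shell script mentioned in the post. The names `TEMPLATE_DIRS` and `create_project` are hypothetical, `project_data` is created as a plain folder rather than a link to encrypted storage, and nothing beyond the folders visible in the listing is assumed.

```python
from pathlib import Path
import tempfile

# Folders copied from the listing above. `user1` and `user2` are the
# placeholder names used there; nothing past `analysis/` is guessed.
# Assumption: `project_data` is a plain folder here, not a link to an
# encrypted storage container.
TEMPLATE_DIRS = [
    "data/project_data/original",
    "data/project_data/working",
    "data/project_data/final",
    "src/user1",
    "src/user2",
    "output/figures",
    "output/pictures",
    "output/analysis",
]


def create_project(root):
    """Create the template folders under `root` and return their paths."""
    root = Path(root)
    created = []
    for rel in TEMPLATE_DIRS:
        path = root / rel
        # parents=True builds intermediate folders; exist_ok=True makes
        # rerunning the function on an existing project harmless.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        for path in create_project(Path(tmp) / "project"):
            print(path.relative_to(tmp))
```

Keeping the folder list as data makes it easy to adapt the template per lab without touching the creation logic.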

By Daniel Chen

May 30, 2017

Changes in Higher Education

As a PhD student who already has a Master’s degree, it’s safe to say that I’ve been in school for a long time. One of the things in higher education that I started to dislike over the years is the way professors assess students in the classroom. As a graduate student, more specifically, a PhD student, a lot of my time should be spent on research. However, when classes are structured the same way as they were in my undergraduate days, with memorization as the primary form of assessment, the point of taking classes as a graduate student seems to be missed.

By Daniel Chen

May 1, 2017

Preparing for the Summer

As the semester comes to an end, preparing for my lab’s Data Science for the Public Good Program begins. I’ve started a GitHub group to dump the various components we will be using during the summer. Aaron Schroeder and I will be the main trainers for the students this summer. It’s our job to teach the students the basic tools needed to be functional in the lab. We put our initial syllabus in the workshop repository.

By Daniel Chen

April 24, 2017

NYC R Conference

Just got back from the 3rd annual NYC R Conference this past weekend. I have been honored to be one of the few speakers for the 3rd year in a row. This year’s talk, “So You Want to be a Data Scientist”, gave a whirlwind tour of the tools and skills needed to be a Data Scientist. I conveyed all this information in 56 slides and did it in 20 minutes.

By Daniel Chen

April 24, 2017