Analysis-Based Project Templates

One of the most annoying things you hear when people are working with a shared code base is “It works on my machine…”. Conversely, one of the more satisfying things is running a script that you are not actively working on and having it run without problems. [Project Templates]({% post_url 2017-05-30-project_templates %}) are one way to address this problem. The [original post]({% post_url 2017-05-30-project_templates %}) about project templates mainly covers the folder structure, but says little about the rationale behind why things are the way they are.

From VMs to LXC Containers to Docker Containers

Since I joined SDAL, the lab has undergone a few infrastructure-related changes, mainly in how applications are run on the servers. From what I remember, we started with VirtualBox virtual machines, then moved to LXC Linux containers, and we are now rebuilding our entire infrastructure using Docker containers.

How we got here

The whole point of using these virtualization and container technologies is that we did not want to install anything directly on the server.

Project Templates

Project templates provide a standardized way to organize files. Our lab uses a template that is based on the Noble 2009 paper, “A Quick Guide to Organizing Computational Biology Projects”. I’ve created a simple shell script that automatically generates this folder structure here, and there’s an rr-init project by the Reproducible Science Curriculum folks. The structure we have in our lab looks like this:

    project
    |
    |- data             # raw and primary data, are not changed once created
    |  |
    |  |- project_data  # subfolder that links to an encrypted data storage container
    |  |  |
    |  |  |- original   # raw data, will not be altered
    |  |  |- working    # intermediate datasets from src code
    |  +  +- final      # datasets used in analysis
    |
    |- src/             # any programmatic code
    |  |- user1         # user1 assigned to the project
    |  +- user2         # user2 assigned to the project
    |
    |- output           # all output and results from workflows and analyses
    |  |- figures/      # graphs, likely designated for manuscript figures
    |  |- pictures/     # diagrams, images, and other non-graph graphics
    |  +- analysis/     # generated reports for (e.