Posts

There has been some interesting activity around getting R to send a notification when a long-running job completes. The most notable entries I have seen in this category are RPushBullet for web notifications and pingr for audio notifications. Although RPushBullet looks really cool (and Dirk does great work), I wondered if there was a way to do this using a free service that I already had access to, namely Twitter.


The University of Kentucky (UK) recently partnered with the discovery portal KNODE to help others discover potential collaborators at UK. KNODE looks like a large corporate venture that is probably costing the university (and other places that use it) a large amount of money. I wonder if the university's money would be better spent on encouraging submission of preprints, a GitHub Enterprise/Education package, and teaching researchers and faculty how to use social media like Twitter.


TL;DR: In bioinformatics research we need to show validated results (if doing classification or discovery of new things) or show biological relevance. If you do neither of those things in a paper or presentation, then I’m not going to believe your method is worth anything.
Seminar Without Results
I attended a seminar yesterday (I’m not going to comment on who gave the seminar or what it was about, so please don’t ask) where the presenter had a distinct lack of any useful results.


The publication on my Bioconductor package categoryCompare is finally out in the Bioinformatics and Computational Biology section of Frontiers in Genetics. This has been a long time coming, and I wanted to give some background on the inspiration and development of the method and software.
TL;DR: The software package has been in development in one form or another since 2010 and was released to Bioconductor in summer 2012. The publication has bounced around and been revised since spring 2013, and it is finally available to you.


I have seen at least one claim (and there are probably others) that Python’s docstrings are so great, and that it would be nice to have a similar system in R, especially since, depending on your development environment, tab completion becomes available for your new functions. The claim that R lacks such a system is false, however. If you set up your R development environment properly, you can have these features available in R.


I retweeted this a few days ago:
1. Open MATLAB for first time in a few years after using #rstats.
2. Site license doesn’t work right.
3. F*** MATLAB, I’ll try to do it in R
As I have started the process of installing MATLAB on my own machine, because I want to translate a published MATLAB package into R, I am reminded of how painful the process can be.


If you just want the hook scripts, check this gist. If you want to know some of the motivation behind writing them, and about the internals, then read on.
Package Version Incrementing
A good practice to get into is incrementing the patch-level version number (the third component, i.e. going from 0.0.1 to 0.0.2) after each git commit when developing packages (this is recommended by the Bioconductor devs as well). This makes it very easy to know what changes are in your currently installed version, and whether you remembered to actually install the most recent version for testing.


As part of the instructor training of Software-Carpentry, we were asked to write a blog post about two things: (1) a story about a time you were motivated/demotivated to learn, and (2) a story that will help motivate our learners, drawn from personal experience. Here are mine. I thought others who read my ramblings might find them useful. Note that these are cross-posted on the Software-Carpentry teaching blog.
Being Motivated to Learn Calculus
As part of my undergraduate degree, I was required to take higher level mathematics, either calculus or linear algebra.


The announcements are out: PubMed is introducing a commenting system, PubMed Commons, theoretically providing a single location for true post-publication peer review. This is a really good idea, as NCBI is likely to be around for a lot longer than a given publisher, and all NIH-funded research is already required to be deposited into PubMed. There are some detractors, and they may have some valid points (link). However, I had not heard of the alternative, PubPeer.


TL;DR: I think data scientists should choose to learn open languages such as R and Python because they are open in the sense that anyone can obtain, use, and modify them for free. This has led to large, robust groups of users, making it more likely that packages exist that you can use and that others can easily build on your own work.
Why the debate?
This was sparked by a comment on Twitter suggesting that data scientists and analysts need to be polyglots, that they should know more than one programming language or analysis framework (the full conversation of tweets can be found here).
