Blog Posts


TL;DR If you use the docopt package to create command line R executables that take options, there is something to know about numeric command line options: they should be converted with as.double before you use them in your script. Setup Let's set up a new docopt string that includes both string and numeric arguments. " Usage: test_numeric.R [--string=<string_value>] [--numeric=<numeric_value>] test_numeric.R (-h | --help) test_numeric.R Description: Testing how values are passed using docopt. Options: --string=<string_value> A string value [default: Hi!

CONTINUE READING

TL;DR Use a short bash script to do deployment from your own computer directly to your *.github.io domain. Why? So Yihui recommends using Netlify, or even Travis-CI, in the Blogdown book. I wasn’t willing to set up a custom domain yet, and some of my posts involve a lot of personally created packages, etc., whose installation I don’t want to debug on Travis. So, I wanted a simple script I could call on my laptop that would copy the /public directory to the repo for my github.
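As a rough illustration of the copy step described above, here is a minimal sketch. It is not the script from the post: the function name `copy_site` and both directory arguments are hypothetical, and it assumes the site has already been built into a `public` directory and that a local clone of the *.github.io repository exists.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the author's script: replace the contents of a
# local clone of the *.github.io repo with a freshly built site.
set -euo pipefail

copy_site() {
  local public_dir="$1"  # assumed: built site directory, e.g. ./public
  local repo_dir="$2"    # assumed: local clone of the *.github.io repo

  # Remove the previous site files, but keep the .git directory intact.
  find "$repo_dir" -mindepth 1 -maxdepth 1 ! -name '.git' -exec rm -rf {} +

  # Copy the new build, including any dotfiles, into the clone.
  cp -R "$public_dir"/. "$repo_dir"/
}
```

After a call such as `copy_site ./public ../username.github.io` (both paths are placeholders), the usual `git add`, `git commit`, and `git push` inside the clone would publish the site. Deleting the old files first means pages removed from the blog do not linger in the repository.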

CONTINUE READING

If you are a newcomer to my weblog, you may notice that posts that are R tutorials generally include the output of Sys.time() at the end. If you look closely at that time and the Posted on date, you may notice that in some posts they disagree. This is because I decided to move all of my old blog posts from blogspot to here, and keep the original posted dates.

CONTINUE READING

Manual Linking? Using blogdown for generating websites and blog-posts from Rmarkdown files with lots of inserted code and figures seems pretty awesome, but sometimes you want to include a figure manually, either because you want to generate something manually and convert it (say for going from SVG of lots of points to hi-res PNG), or because it is a figure from something else (like this figure from wikipedia). Where to?

CONTINUE READING

TL;DR With the recent charges of sexual harassment against some high-profile individuals, and so many women coming forward with #metoo (and the understanding that this is really something almost all women have faced), I realized that my younger self was #partoftheproblem. I think many other men are part of the problem, even though they might not think so. I didn’t think I was part of the problem either. I hope that other men might read this and critically evaluate if they are #partoftheproblem.

CONTINUE READING

TL;DR Other researchers directly criticized a recent publication of ours in a “research article”. Although they raised valid points, they outright lied about the availability of our results. In addition, they did not provide access to their own results. We have published new work supporting our original results, and a direct rebuttal of their critique in a perspective article. The peer reviewers of their “research article” must have been asleep at the wheel to allow the major point, lack of access to our results, to stand.

CONTINUE READING

TL;DR NIH recently introduced a reproducibility initiative, which extends to including the “Authentication of Key Resources” page in grant applications from Jan 25, 2016. It seems to be intended for grants involving biological reagents, but we included it in our recent R03 grant developing new data analysis methods. We believe that this type of thing should become common for all grants, not just those that use biological/chemical resources. NIH and Reproducibility There have been a lot of things published recently about the reproducibility crisis in science (see refs).

CONTINUE READING

TL;DR Partial least squares (PLS) discriminant analysis (DA) can ridiculously overfit even on completely random data. The quality of the PLS-DA model can be assessed using cross-validation, but cross-validation is not performed in many metabolomics publications. Random forest, in contrast, because of the forest of decision tree learners, and the out-of-bag (OOB) samples used for testing each tree, automatically provides an indication of the quality of the model. Why?

CONTINUE READING

TL;DR Currently available methods to discover metal geometries make too many assumptions. We were able to discover novel zinc coordination geometries using a less-biased method that makes fewer assumptions. These novel geometries seem to also have specific functionality. This work was recently published under an #openaccess license in Proteins Journal: Yao, S., Flight, R. M., Rouchka, E. C. and Moseley, H. N. B. (2015), A less-biased analysis of metalloproteins reveals novel zinc coordination geometries.

CONTINUE READING

TL;DR This 2014 PNAS paper by S. Lin et al. (Lin et al., PNAS, 2014) that compares transcription of tissues between species has a flawed experimental design, where species is almost perfectly confounded with the machine / lane on which the sequencing was done. Y. Gilad and O. Mizrahi-Man have published a manuscript describing the confounding and the results of removing it. This was possible because the original authors supplied the information about which publicly available files were used in the original analysis.

CONTINUE READING
