Install necessary tools
You will need to install the following tools on your computer in order to run this example:
- R Plug-in for Jenkins
- Image Gallery Plug-in for Jenkins
We won’t go into detail here about how to install each tool. You will need Java to run Jenkins, R to run the example, and the taxize R package. The R Plug-in allows Jenkins to run R scripts, and the Image Gallery Plug-in creates a simple gallery from the image created by the build in Jenkins.
Setting up the Jenkins job
Once you have the plug-ins configured and Jenkins running, create a FreeStyle job and give it any name you want.
Now click “Add build step” and select “Execute R script”. That will add a textarea where you can write your R script. Let’s copy and paste the taxize example from rOpenSci, with one modification to create a PNG with the tree.
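The exact script isn’t reproduced in this excerpt, but as a rough sketch it looks like the rOpenSci taxize example with a `png()`/`dev.off()` pair wrapped around the plot, so the build leaves an image in the workspace for the Image Gallery Plug-in to pick up. The species list and the output file name below are illustrative assumptions, not taken from the original post:

```r
# Sketch of the rOpenSci taxize example, adapted to write a PNG.
# Species list and file name are illustrative assumptions.
library(taxize)

species <- c("Panthera tigris", "Ursus americanus", "Accipiter striatus")

# Fetch the full taxonomic classification for each species from NCBI
out <- classification(species, db = "ncbi")

# Build a taxonomic tree from the classifications
tree <- class2tree(out)

# Write the plot to a PNG in the job workspace so the
# Image Gallery Plug-in can display it after the build
png("taxize_tree.png")
plot(tree)
dev.off()
```

Paste something along these lines into the “Execute R script” textarea; anything the script writes to the workspace is what the Image Gallery Plug-in will collect.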
I have just returned from Universidade de São Paulo, where I attended the [2º High Performance Computing Workshop](http://2whpc.lcca.usp.br/). Unfortunately I couldn’t go in the morning, so I missed the first half of the event. From what I read in the schedule, in the morning USP and Rice University explained the current status of their cooperation agreement, which mainly concerns USP’s use of Rice’s IBM Blue Gene/P.
My main reason to attend this event was to watch Professor Dr. João Carlos Setubal’s talk on HPC and Bioinformatics. His talk was great, but I’ll add a full report here, with all the talks that I watched (you can skip to his talk if you prefer).
GPU computing talk
The first talk was by Dr. Denis Tanigushi, who gave a great talk about GPU computing. He started with some historical background, which was coincidentally similar to a recent Hacker News thread about shaders. He then presented the problem, GPU architecture, and its applications in HPC. What was very interesting was that he combined several GPUs with MPI, too; I didn’t know that was possible.
I learned about warps, and when and how to use a GPU. Here is some of the software that appeared in Tanigushi’s talk: MATLAB, NVIDIA libraries, Ansys, and Gromacs. Oh, he also mentioned the C++ Thrust library and functors.
The bioinformatics talk
The next part of the event was very interesting too. It was a series of four lightning talks, the first being the HPC and bioinformatics talk that I wanted to see. Prof. Setubal gave an excellent talk about his work in genomics and transcriptomics. He also mentioned that his work is done in collaboration with other groups and is project driven. Ah, and that it is also big-data driven.
He used BLAST, mpiBLAST, SOAPdenovo and ABySS for his analysis, and found that ABySS didn’t work well on the Blue Gene. On the other hand, using mpiBLAST he was able to reduce the time to BLAST his dataset from 2 months to only 3 days.
He concluded his talk by saying that most of the tools he uses are made by other groups, but not always made to run in parallel, and that his group almost always ends up building a pipeline to run everything. From what I could understand when asking him later, his group doesn’t use any kind of pipeline tool: no Galaxy, Taverna, Mobyle, nor Jenkins (cough cough).
The rest of the event
DNADigest held a free Hackday this past Saturday, April 5th, in London. Luckily, they also live-streamed the event over the Internet, and it was definitely worth waking up at 5 AM (UTC-3) to watch it!
DNADigest is a not-for-profit organization that is working on solutions for secure data sharing. This problem involves several fields (metadata, infrastructure, data encryption, genomics, …) and is obviously a very complicated one, so props to DNADigest for working on this.
They made sure the video for the Hackday was always working properly (big thanks, Suraj!), but the audio wasn’t so good. From what I could tell, the atmosphere was really nice and made the whole event very productive. I hope to be near London so I can participate in a future DNADigest Hackday.
The activities were coordinated by the DNADigest team and while I couldn’t listen to the audio very well, the Twitter #dnahd hashtag was being constantly updated, and everybody was hacking together using Hackpad.
My interest wasn’t exactly in data encryption, privacy, or data sharing, but rather in metadata, since this is probably one of the items from the BioUno roadmap that we’ll tackle next. The result was amazing: even though I couldn’t attend, after the event I had a list of tools, standards, and papers to read about metadata.
Uno-Choice has been released and is our first user-contributed plug-in. This plug-in was proposed by Ioannis Moutsatsos, and has been described as a “proposal (…) for selecting one or multiple parameters. Attempting to fill the gaps left by current plugin options”.
Version 0.1 is able to add dynamic parameters to your Jenkins jobs, and you can create multi-selects from Groovy scripts. How cool is that? If you miss anything in the Jenkins UI for your research in bioinformatics, please feel free to submit your issues and we will try to help ya.
Have a suggestion for a new plug-in? Send it in and we will help you code and release it.
In the past few days, two more plug-ins were released to our update center: the CLUMPP plug-in and the Structure Harvester plug-in. Before that, the first plug-ins released were the Structure and PBS plug-ins.
Structure Harvester is a Python utility with a web interface, useful for extracting relevant data from Structure output. CLUMPP is usually used after Structure and Structure Harvester; it permutes the clusters output by independent runs so that they match up as closely as possible.
Both plug-ins use Builders, similar to the wrappers in Galaxy, and let you add build steps to call the tools.
Read more about the plug-ins:
Stay tuned for more plug-ins!