
How Github can be friendlier with academia


Github is an amazing service to host and share any kind of code repository. I’m a big fan of github, ’cause I’m an avid user of git, and to be honest, if you are not, just have a look at how to get started in 15 mins.

With the rising movement of open science and reproducibility, the need for scientists to share their code and results is getting higher and higher.

For example, figshare is doing a great job of giving people the ability to share their own results quickly online, providing useful metrics and feedback to the uploader.

As far as we have advanced in reproducibility and code sharing, some disciplines are still at the cowboy stage, where the code is not shared and it’s very difficult to reproduce a figure published in a paper, or at least get the code that comes with it.

One way to get over this could be to give scientists an easy way to play with their own code, in the safety of a private repository.

It would be great if the repo could be set up as open from day 1, but we know this is not the case for some projects. Therefore the ability to have a private repo would encourage scientists to put their code under a VCS (git) and gain confidence with the system. I know from experience, and from friends who tried, that there is no going back once you get the hang of it.

Now, let’s try to propose one way github could be friendlier with academia. Right now github has an entry only for students, teachers or organized groups, sitting at http://github.com/edu. This unfortunately does not cover the requirements of the academic world at all, therefore I’ll take my chance and propose a very special role, which should cover most of the requirements of a single researcher.

The Researcher account should give a scientist the possibility to get comfortable with the system.

I think the account should give the ability to create 5 time-based private repos, and the possibility to create one organization with at least 1 time-based private repo.

Let me try to explain the rationale behind these numbers. First, 5 repos are more than enough to get people started. If they need more, they can always go to a paid plan, which is only fair. Second, the ability to create one organization with 1 private repo is a good idea because of collaborations. Scientists usually collaborate in big consortia, and the collaboration is project focused instead of people focused, therefore having an organization makes it easier to share control. Last but not least, the url will be more community friendly, emphasizing the project itself.

The time-based in front of the private is to make clear that these repositories will automatically be open sourced after 5 years. This is to encourage open sourcing the repo when the paper gets written, and to help the sharing of the code. As soon as one of the repos gets open sourced, the researcher regains one time-based private repo in the total count. Of course, if the researcher does not want to open source the repo and wants to keep it private indefinitely, she can enroll in one of the paid github plans.

In conclusion, to get more academic friendly:

  • github should separate the education and academic streams
  • github should create a new type of account, the researcher
  • the researcher should be entitled to new time-based repos
  • these are going to be automatically open sourced after 5 years

Github is, at the moment, the best website to share code and collaborate. I hope they can take the lead and become the best way to get academics into VCS and code sharing. They have just appointed a new educational liaison, who maybe can help bring this issue up.

Creating a new twitter bootstrap theme for jekyll

Yoda On OpenSource

Yoda is wise. And green, as well (ok, maybe not relevant but he is green!)

Now that I’ve finished my Ph.D. at the EBI, it was time to set up a personal page where people could easily find my contact info and have a quick way to reach me. The decision was to use the good old github pages with a cname for hosting, and I’ve written how I’ve done it in this post.

Although a quick html page was a good compromise, I felt it was a bit too short and quick, not giving an informative enough picture. Moreover, I wanted the ability to create several pages to describe projects and other stuff which maybe will come up. Being already on github pages, I’ve decided to use jekyll. Jekyll is a text processor, which converts markup into HTML and can create a blog if some conventions are followed. I love text processors, ’cause it means I can write stuff using an editor, focusing on the content. Then, at a later stage, magic happens and the content also looks good and very well formatted. Some other examples of this process are LaTeX, which is amazing for writing scientific publications, and Sphinx, which is awesome for documentation (especially of Python programs). The easier the markup language, the easier it will conquer the world. For example, Markdown is awesome ’cause it feels like writing with decent defaults (or at least, defaults that resonate with me). Ok, stop wandering around and let’s get back on track.

Getting started with Jekyll is quite complicated, because jekyll does not come with any preloaded site or anything, therefore you have to create everything. However, jekyllbootstrap comes to the rescue. Jekyllbootstrap, created by Jade Dominguez, is a series of preloaded templates and clever addons for jekyll, including themes and external services to handle comments, which makes it possible to decrease the time to get started to close to zero!

Jekyllbootstrap ships with the classic (yeah, it’s a classic nowadays) twitter bootstrap, which is a pretty cool frontend helper. Twitter bootstrap version 2.0 is a major improvement over the 1.4 version, as responsive behaviour has been added to the frontend framework. Responsive behaviour is the ability to perform well on any kind of device, using some clever resizing tricks: the web page changes format and font to adapt to an android or iphone screen, a tablet, a laptop screen or a massive desktop monitor. All this comes for free, just by using bootstrap, so it’s very handy. You know, it’s 2012 and mobile should be treated as a first-class web citizen.

I was already thinking of bringing the 1.4 theme to 2.0, when I found that Geoffrey Dagley had already taken care of it, creating a new repo for it.

So I just installed it and I had the theme set up. All was looking good, until I found out that there was a problem with the tagline. The tagline was not computed from the metadata, but was left there as a placeholder. Therefore, being a good opensource citizen, I’ve forked the repo, fixed the problem, and opened a Pull Request to send the fix back to the original.

Then, given that Thomas Park created bootswatch, I’ve picked cyborg, one of the available themes, which uses the same twitter bootstrap markup but has different colors and fonts, and I’ve created a new theme for jekyll, in its own repo.

So after all this I’ve set up my new website in a few days, sent a pull request to fix a problem in one of the themes, and created a new theme based on bootstrap and bootswatch.

The convenience of jekyll is amazing, ’cause I can create a new file using the nice rake shorthand:

rake post title="a decent title for a new post"

which sets up the file for me and I have only to open it up in gedit and write it!
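To give a rough idea of what such a task does behind the scenes, here is a minimal shell sketch. It is not the actual Jekyll Bootstrap rake task: it only builds a date-prefixed filename under _posts (the naming convention Jekyll expects for posts) and writes a minimal front matter. The slug rule and the front matter fields are my assumptions for illustration, and everything happens in a scratch directory.

```shell
#!/bin/sh
set -e
# Work in a scratch directory so nothing real gets touched
cd "$(mktemp -d)"

# Hypothetical stand-in for `rake post title="..."`
title="a decent title for a new post"

# Lowercase the title and turn every run of non-alphanumerics into one dash
slug=$(printf '%s' "$title" | tr '[:upper:]' '[:lower:]' \
  | sed -e 's/[^a-z0-9][^a-z0-9]*/-/g' -e 's/^-//' -e 's/-$//')

# Jekyll picks up posts named _posts/YYYY-MM-DD-slug.md
post="_posts/$(date +%Y-%m-%d)-$slug.md"
mkdir -p _posts

# Minimal YAML front matter (assumed fields: layout and title)
printf '%s\n' '---' 'layout: post' "title: \"$title\"" '---' > "$post"

echo "$post"
```

After that, the only thing left is to open the new file in an editor and write.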

What does it look like? Check it out!

Michele's web new graphic

P.S.: If Gedit doesn’t recognise Markdown, it’s due to some crazy mime-type problem. Check out this tweet for help:
[tweet https://twitter.com/mattions/status/209684943981379586]

How to push to two different git repositories in one go

Branching illustration

Branching is good:
http://git-scm.com/

With the new release of Neuronvisio (0.8.3) we have improved the documentation, given the software a new home (http://neuronvisio.org) and created a new fork under the NeuralEnsemble org.

I think for Python and Neuroscience it would be good to have a website similar to http://pinaxproject.com/ecosystem, to give visibility to the different projects and avoid re-inventing the wheel; however, for now, using the same space in the NeuralEnsemble org is a good start. I didn’t want to move or transfer my repository there directly, but I wanted to have a mirror of my repo https://github.com/mattions/neuronvisio in that space, without having to manually update it. I’ve looked at how to open a mirror fork on github, but to no avail. So I came up with a possible solution, using the ability of git to push to different repositories.

My solution was to create a new remote, called all, in the local git config (.git/config in your repo) with the following format:

[remote "all"]
url = git@github.com:mattions/neuronvisio.git
url = git@github.com:NeuralEnsemble/neuronvisio.git

This way I can push to both repos with a single command:

git push all

Both the repos will be updated in one go. Neat.
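If you prefer not to edit .git/config by hand, the same section can be written from the command line. This is a small sketch using the two URLs from above; it runs in a scratch repository so it can be tried safely, and in practice you would run only the two `git remote` lines inside your own clone.

```shell
#!/bin/sh
set -e
# Scratch repository, just to try the commands without touching a real clone
cd "$(mktemp -d)"
git init -q

# Create the "all" remote with the first URL...
git remote add all git@github.com:mattions/neuronvisio.git
# ...then append the second URL to the same remote
git remote set-url --add all git@github.com:NeuralEnsemble/neuronvisio.git

# Print every URL stored for the remote, one per line
git config --get-all remote.all.url
```

Neither command contacts the servers; they only write the [remote "all"] section into the local config.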

Tools for a computational scientist

So, how do you keep track of your work?

If you are in a wet lab, you usually end up using a lab book, where all the experiments are recorded. You can replicate the experiment, and do something new. It’s a pretty cool system, although I think it’s not great for computational scientists. In computational science there is the same problem of recording what is going on, and what happened before. On top of that there is also the problem of sharing the program with other people to address reproducibility. Therefore the problem can be broken down into two different sub-problems:

  • record the changes happening in a computational project, in particular to the code used to run the project
  • record the results of different executions and link them to a certain state of the code.

A classic approach is “do nothing”. The code sits on your hard drive somewhere. Maybe it is organized in folders, with descriptive file names. However there is no history attached, you have no idea what’s going on, or which is the latest version. As you guessed, this is not a cool position, ’cause you spend time thinking about how to track your work instead of doing your work, and you have the feeling that you don’t know what’s going on. This is bad.

Fortunately, this can be solved 🙂

This is one of the problems which can be solved using a Version Control System, which was invented exactly to track changes in text files (and more).

I find it very useful to work with Git, which is an amazing Distributed Version Control System (DVCS). The most important benefit you get from using a version control system, and git in particular, is that you can be braver. This is because Git makes it very easy to create a branch and test something new as you go.

Branches

Branches in Git are quick and cheap! Easy to experiment!

Did you ever find yourself in a situation where you wanted to try something new, which could break a lot of different things in your repository, but you didn’t want to mess with your current code?

Well, Git gives you the ability to create a branch very cheaply, to test your new crazy idea and see if it works, in an environment completely isolated from the code that is sitting on your master branch. This means you can try new things, which tends to be quite important in science, because we don’t usually know where we are going, and trying more than one solution opens up a lot of different possibilities.
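To make this concrete, here is a small sketch of the branch workflow in a throwaway repository. The file name, the branch name crazy-idea and the commit identity are made up for the example.

```shell
#!/bin/sh
set -e
# Throwaway repository with one committed "working" version
cd "$(mktemp -d)"
git init -q
git config user.name "Example Scientist"      # placeholder identity
git config user.email "scientist@example.com"
echo "stable code" > model.txt
git add model.txt
git commit -q -m "Working version"

# Remember the default branch name (master or main, depending on the setup)
main_branch=$(git symbolic-ref --short HEAD)

# Create the experimental branch and break things there
git checkout -q -b crazy-idea
echo "experimental change" >> model.txt
git commit -q -am "Try the crazy idea"

# Back on the default branch the experiment is nowhere to be seen
git checkout -q "$main_branch"
cat model.txt    # prints: stable code

# Both lines of work are still recorded in the history
git log --oneline --all
```

If the idea works out, the branch can be merged back; if it doesn’t, it can simply be left alone or deleted.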
The other good thing is you have a log of whatever happened, and you can go back to the version that was working and restart from there. For example, this is the commit log from neuronvisio.

I ran a hands-on crash course at the EBI about Git (the repo). The course was very well received and people started to understand the power of using fast tools to free some mental space.

Another big plus for Git is the ability to host your project on github, which makes collaboration super-easy. These are the contributors for Neuronvisio, for example.

Using a version control system is a good idea, and integrating it with Sumatra is also a very good idea. Sumatra automatically tracks all the parameters and versions of the programs used. I’ll talk about it in a later post, for now have a look at the slides:
Sumatra and git [slideshare id=3802681&w=425&h=355&sc=no]

Integrating the different leads

Blooming

Spring knocking!

To put all the stuff that I have on the net in a consistent place, and give people one address to go to look up my stuff, I’ve decided to get a new personal domain, michelemattioni.me. I’ve moved this blog to a new address, blog.michelemattioni.me. On top of that, I’ve changed the name of the blog to Trains of Thoughts. After 6 years of activity, I guess it was time.

On the technical side, if you are interested, the blog is still hosted on wordpress.com, and you can get them to map the old address to any domain or subdomain for $13/year. I’ve considered moving the whole blog to a self-hosted setup, but I’ve decided it was too time-consuming, so I went for this solution.

To register my domains and deal with the DNS, I’m using dnsimple.com (my referral). It’s a nice DNS provider, which simplifies a lot of the DNS voodoo that you need to perform when setting up new stuff.

The landing page is hosted using github pages, which is a very neat way to keep the site under git and update it with just a push. I plan to use bootstrap to handle the graphics and to add some content to the page.

For the time being, this is the old version (current version):

old version of michelemattioni.me

First version of michelemattioni.me

What to expect from Ideatransform


With Ideatransform kicking off in less than 5 hours, I want to write down what I expect from the meeting:

  • I expect a lot of fun. Enjoying the weekend is one of my goals
  • I expect to meet a lot of interesting people, among developers, designers, doers and mentors
  • I also would like to pitch the SustainableSouk idea, build a team and create a first MVP, in the classic LeanStartup way.

Although looking for a Co-founder is always a tricky business, and going solo is a possibility, I would like to build this project in a super open and easy way.

The excitement is high, let’s see how it rolls!

Kitegen comes back to fly


I’m very happy to read (link in Italian) that Kitegen has started the automatic testing of take off and landing at the Sommariva site.

I’ll just report the main points of the post which I think are interesting, without going into a full English translation:

  • first of all, they have managed to perform a take off with only 1.5 m/s wind speed at the ground, which is remarkable given how low this wind speed is. Considering that the wind speed in Europe is around 3 m/s, this is a great achievement.
  • the next step they are planning is to test how long the kite can fly without interruption, with the goal of 5000 hours of flight per year. H/T Stefano

All in all, it seems the control software is getting ready for the field, and I can’t wait to see the development during spring and summer!

Goodies: Link to the video!

Making playlist with Android

Droid Music

If you would like to make a playlist based on the content of a folder on your droid, just open a terminal, go to the folder where the music is and run

$ ls | sort | grep mp3 > "$(basename "$(pwd)")".m3u

and magically you have the playlist, which looks like this:

04 - Pennywise - Fuck authority.ofn.mp3
07 - A Perfect Government.mp3
07 - Pearl Jam - Do The Evolution.mp3
13 - Goldfinger - 99 Red Balloons .mp3
Nirvana - 03 - Come As You Are.mp3
Oasis - Live Forever.mp3
Offspring - Smash - 05 - Genocide.mp3
System Of A Down - Aeralis.mp3
System Of A Down - Spiders.mp3
03 Whatever Happened to My Rock 'N Roll.wma

The last one is a wma, and if you don’t have the time or the will to convert it to mp3, it can be added with:

$ ls | sort | grep wma >> "$(basename "$(pwd)")".m3u
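The two steps can also be merged into one. The following is a sketch under a few assumptions: the file names are made up, the files are empty placeholders in a scratch folder, and the pattern only matches names that really end in .mp3 or .wma. Unlike the two commands above, the wma files end up sorted together with the mp3 ones instead of being appended at the end.

```shell
#!/bin/sh
set -e
# Scratch folder with a few empty files standing in for real music
cd "$(mktemp -d)"
touch "Oasis - Live Forever.mp3" "Nirvana - 03 - Come As You Are.mp3" \
      "03 Whatever Happened.wma" "cover.jpg"

# Name the playlist after the current folder, as in the commands above
playlist="$(basename "$(pwd)").m3u"

# Keep only names ending in .mp3 or .wma (case-insensitive);
# ls already lists the entries in sorted order
ls | grep -i -E '\.(mp3|wma)$' > "$playlist"

cat "$playlist"
```

Anchoring the pattern to the end of the name also avoids picking up unrelated files that merely contain "mp3" somewhere in their name.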

Tip found here