List intersection in python: let’s do it quickly


So you have two lists and you want to make the intersection of it.
So you think about it for 3 seconds and than you write something like:

a = [1,2,3,4,"B"]
b = [2, "B"]
c = []
for e in a:
    if e in b:

This works, it seems very idiomatic and you’re done with it.
The problem this is extremely slow.

In other words writing a loop in python is a bad idea.
Why is slow? Because Loops in python are slow. Extremely slow.

If you have numbers only, I suggests to check out Numpy and even with string you can check pandas dataframe.

However if you have a mixture of object like above, you can just stick with python datastructure and use sets. If you do not have duplicates you’re out of luck…

With sets it will look like:

a = [1,2,3,4,"B"]
b = [2, "B"]
sa = set(a)
sb = set(b)
c = sa.intersection(sb)

For yours and my convenience, I’ve written a little gist to time it and plot it.

Let’s see the results: (timings in seconds)

list_timing set_timing
100 0.000370 0.000011
1000 0.008075 0.000082
10000 0.477722 0.001216
100000 49.045367 0.016954


So with 10000 elements, with a list takes ~ 0.48 seconds, and with a set 0.0012 seconds, with a 100000 elements a list takes 49 seconds, and the set operation 0.017.

Two Take Home Messages:

  1. If you are writing a for loop, you’re doing it wrong
  2. If you have to intersect or unify list, transform them to sets and use the built-in function.


Ipython notebook ans some statistical distributions

Bernoulli distribution

It was quite a bit that I wanted to have a go to play with the ipython notebook, but I wanted to do it with something that was quite interesting and useful.

The IPython Notebook is a web-based interactive computational environment where you can combine code execution, text, mathematics, plots and rich media into a single document

- from the docs

This means that you can document your process and or exploration using Markdown, which is than beautiful rendered in html, and have also python code executed, with the graphs that are going to be embedded and will stay in the document.

I think it’s a very valuable tool, in particular when you are doing exploratory work, because the process of discovery can be documented and written down, and it a great way to write interactive tutorials.

For example, in this notebook, I’ve plotted the Probability Density Function of several statistical distributions, to have an idea how they are shaped, and which one to pick as base when creating new bayesian model.

You can see how it looks like on nbviewer

exponantial distribution

Google Cloud free trial coming to end


I received an e-mail yesterday that the Google Cloud free trial period is coming to an end.

This means that from the 1st of June onwards, every instance needs to be paid, starting with the smallest D0.

Loquacius was running on google app engine, and it was a test to see how the new Cloud Sql was behaving with a classic Django website. Given the fact this was just a test, I’ve decided to switch it off.

I’ve downloaded the fixtures of the blog (just three entries to test the blog) and switched the database off, disabling the billing and deleting the D0 instance.

The code is still available on github but unfortunately the blog engine will not be run live anymore from google app engine.

You can still do it on your machine, or have a look how it was on this blog post.

Ggplot2 graph style with matplotlib

Gg2plot is an amazing library to plot and it’s available for R to create stunning graphs. GGplot2 takes a different approach from the classic library, and instead of offering a classic line/points approach permits to combine these elements (example), which is a similar root took by D3js. If you are using the scientific python stack (matplotlib, numpy, scipy, ipython) you have the very good matplotlib to plot and have all your graph app.

For example a bunch of sin and cosine generated by the following code:

look like this:


Instead if we set up a ggplot2 style, the graph looks like this:


You may prefer one or the other. Anyway if you like the last one, just download this matplotlibrc and save it as ~/.matplotlib/matplotlibrc, and all your graph will have that style as default.

The matplotlibrc has been inspired by this post, I’ve just updated with the latest matplotlibrc from matplotlib 1.2.1 version.

Have fun!

Edit: Bonus plot, code in the gist.


What do I do when my Pull Request does not merge automatically in master?

Github makes very easy to collaborate with people, however sometime it’s a bit complicated to understand how to use Pull Request, and in particular how to make sure that the feature branch can be merged in master in a Fast Forward way

So let’s se how we can go from this(Or the famous “can’t be merged automatically”)

Pull Request cannot be merged


to this: (Or yeah, this looks good)



Why this happens

The problem is that both in master and in your branch some files have been changed, and their going in different directions. The content of the file in master is different from the one in your feature_branch, and git does not know which one to pick, or how to integrate them.

To solve this, you need to

  1. Get the latest upstream/master
  2. Switch to your master
  3. merge the latest master in your master (Never develop in master, always develop in a feature branch)
  4. switch to your feature branch
  5. merge master in your feature branch
  6. solve all the conflicts: this is where you decide how to integrate the conflicting files, and this can be done only by you because you know what you did, you can figure out what happened in master, and pick the best way to integrate them.
  7. commit all the changes, after all the conflicts are solved
  8. push your feature branch to your origin: the Pull Request will automatically update

Talk is cheap, show me the commands (cit. adapted)

If you didn’t already add the upstream to your repo, have a read to this

1. Fetching upstream
git fetch upstream
2. Go to your master. Never develop here.
git checkout master
3. Bringing your master up to speed with latest upstream master
git merge upstream/master
4. Go to the branch you are developing
git checkout my_feature_branch
5. It will not be fast forward
git merge master
6. Solve the conflicts. get a decent 3 views visual diff editor. I like Meld
git mergetool
7. Commit all the changes. Write an intelligible commit message
git commit -m "Decent commit message"
8. This will push the branch up on your repo.
git push origin my_feature_branch

Hope it helps.


Say hello to Loqu4cius


Loqu4cius is a lightweight blog engine based on Django (not this Django), that runs on google app engine and it uses as backend CloudSQL, which is, as google put it, MySQL on the cloud.

A bit of history

Google appengine has the ability to run scalable app. So far it was possible to use django on it, given the fact Python was one of the two supported languages, however the back end was big table, which is not compatible with the classic RDBMS used by django.

This made impossible to use span relation and over, so the only usable bit of django were the templates, the URLs router but not the model…

Django-nonrel to the rescue.

A project called django-nonrel came to the rescue, and it created a compability layer between the NoSQL backend and the classic django ORM. Most of the span relationship were working, however some of the join, like the many2many were not available.

Fast forward to our time

Fast forward to today, google made it possible to have a classic RDBMS available, with the possibility to use all the ORM goodies, included django third app that can speed up and reuse the development.

So now Google-cloud to the rescue.

To check it out, I’ve came up with Loqu4cius.

It features a tag cloud that makes it be 2.0, is based on Twitter Bootstrap, and I’ve styled with some colors and the fonts (directly from google font), a search bar and the ability to enter rich text using ckeditor. The comments are integrated using disqus, that is the way to go right now.

The code is on GitHub with a quick readme, for any question the comments are here :).

Some thoughts about the development

Google appengine comes with some limitation, but with the possibility to add third parties libraries it is possible to re-use a lot of the django apps already available. (Let’s agree on terminology: app –> a single application that does one thing, for example it manages the tags, project –> a collection of all the apps and related files that runs the entire site.)

My strategy is to create a virtualenv and than copy all the necessary modules into the lib folder. This gives me the ability to install a package with

pip install package_name

and all the dependencies very easily. After that it’s a matter or using the apps and make it work pretty nice.

CSS writing

I like to use less to write CSS, but I don’t want to have a client compilation of the less file, and I want only to serve CSS in production, therefore I use two helper to get the job done.

First I use a python script that finds all the less file and compiles them into css, calling the lessc compiler.

However I don’t want everytime that I write a new bit of the less file, to call the script myself, so I use watchdog to call the script everytime the less file gets saved.

It would be nice to have a tool that can launch both the development server and this script in one go, and it actually doable. It’s called honcho and it accepts a classic Procfile.

For example for loqu4cius this is the Procfile.Dev

web: ./
less: watchmedo shell-command --patterns="*.less" --command='./scripts/ "${watch_src_path}" ' static/less/

launching it with honcho -f start makes sure to launch the development server, and to recompile and move the file to the collectstatic folder as required in one go, so you can focus on just developing.

Last but not least, I’ve created a quick release script, called, which:

  • increases the app.yaml version of the site
  • performs the syncdb in production
  • uploads the site using,
  • commit the modifies app.yaml to the repo
  • tags the repo with the version number

so you can always now which commit refers to which version on googleappengine.

To figure out how to set up the enviroment in a way to have a streamlined development took me a bunch of days, and I’m eager to know other solutions to the same problems!

How to use Neuronvisio to select particular sections of a NEURON based multi-compartment model

Displaying and visualizing selected sections in multi-compartmental models in 3D it’s quite an hard task, however the info and the clarity that is possible to achieve is worth the effort.

Let’s say you have your multi-compartiment model ready in NEURON and you are interested to show in 3D selected sections of the model with arbitrary colors (for example, to show where certain stimuli are applied), like I did in my Medium Spiny Neuron model to show which spines get stimulated:

spiny MSN with stimulated with different trains in selected spines

spiny MSN with stimulated with different trains in selected spines

Neuronvisio offers a nice API, from the visio module, accessible as controls.visio.select_sections(secs_list, scalar_value), which makes this operation easy.

For example, let’s take the pyramidal neuron model that comes as an example with Neuronvisio.

Let’s say that we want to select the soma, the iseg and the first section of the myelin, and we want to give them arbitrary colors.

To achieve that we can easily run:

controls.visio.select_sections([“soma”, “iseg”, “myelin[0]”], [1, 0.5, 0.2]), obtaining this picture:

Selecting and coloring special sections with Neuronvisio

Selecting and coloring special sections with Neuronvisio

Quite handy and pretty fast, don’t you think?