
Profiling a Python app

If you have to profile an application, in Python for example, it's worth reading this blog post, in which I found very useful information.

The profiling there is used to compare PyTables, a Python implementation of HDF5, with pickle, the classic choice you run into when you are dealing with saving big files to the hard drive.

The best tool so far seems to be the massif profiler, which comes with the Valgrind suite. Here is how it works.

This will run the script through Valgrind:

valgrind --tool=massif python test_scal.py

This produces a file named “massif.out.?????” (the ????? is the process ID), which is plain text but not in a very readable format. To get a more human-readable report, use ms_print:

ms_print massif.out.????? > profile.txt
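
Not from the original post, but the two commands can also be chained from Python. Here is a minimal sketch, assuming valgrind and ms_print are on the PATH; the fixed output name massif.out.test is just a placeholder I chose so the PID-based default does not have to be guessed:

[sourcecode language="python"]
import subprocess

# Run the script under the massif heap profiler, forcing a known output name.
subprocess.check_call([
    "valgrind", "--tool=massif", "--massif-out-file=massif.out.test",
    "python", "test_scal.py",
])

# Convert the raw massif dump into a human-readable report.
with open("profile.txt", "w") as report:
    subprocess.check_call(["ms_print", "massif.out.test"], stdout=report)
[/sourcecode]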

So I've run some tests to check the scalability of HDF5.

[sourcecode language="python"]
import tables
import numpy as np

h5file = tables.openFile('test4.h5', mode='w', title="Test Array")
array_len = 10000000
arrays = np.arange(1)

for x in arrays:
    x_a = np.zeros(array_len, dtype=float)
    h5file.createArray(h5file.root, "test" + str(x), x_a)

h5file.close()
[/sourcecode]
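
For the runs below with one, two, four and fifty arrays, the only thing that changes is the number of arrays written. As a rough sketch (mine, not the original script), the count can be read from the command line while keeping the same PyTables calls as above:

[sourcecode language="python"]
import sys

import numpy as np
import tables

# Number of arrays to write, e.g. "python test_scal.py 50"
n_arrays = int(sys.argv[1]) if len(sys.argv) > 1 else 1
array_len = 10000000

h5file = tables.openFile('test_scal.h5', mode='w', title="Test Array")
for x in range(n_arrays):
    x_a = np.zeros(array_len, dtype=float)
    h5file.createArray(h5file.root, "test" + str(x), x_a)
h5file.close()
[/sourcecode]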

This is the memory used for one array:

[Figure: Profiling one NumPy array]

This is for two arrays:

[Figure: Profiling two NumPy arrays]

Four arrays:

[Figure: Profiling four NumPy arrays]

And this is for fifty:

[Figure: Profiling fifty NumPy arrays]

As soon as you enter the loop, memory efficiency is preserved in a really nice way.

Summing up:

  • one array ~ 87 MB
  • two arrays ~ 163 MB
  • four arrays ~ 163 MB
  • fifty arrays ~ 163 MB

So the problem is not in PyTables; it lies somewhere else.

Crowd power on steroids

This is not news, but I think there is a big shift happening in society.

I guess the idea of community sharing is becoming bigger. The open source movement was a precursor in this regard, I think, showing that loose collaboration is able to get things done.

This will grow with the creation of more peer-to-peer networks that have a strong focus and an objective. Collaborative consumption will be the next stop.

Another trend worth noticing is the increase of collective intelligence. Making data available to people is opening up a lot of possibilities, from open government to open data to open knowledge.

I still remember the Time cover with “You”. Next year I bet it will be “We”.

Happy New Year,

I guess it will be a full one.

Fixing a bug: the power of open source

 

If you used GNOME 2.30 with two screens, cloning the smaller one onto the bigger one, you were annoyed by this bug. At least I was really annoyed. Nothing too bad, just that the image was not drawn correctly. In my case it was drawn twice at different resolutions, one on top of the other. However, this was a regression from GNOME 2.28, where the problem was not present. The bug had been open for a long time and was also reported upstream on the GNOME side.

I did some research on it and nailed down where the problematic code was; however, I was unable to propose a solution because C is not my cup of tea.

Yesterday Florent proposed a fix, which was tested by Thomas, who was kind enough to give instructions on how to recompile the package, make a deb out of it and install it. So I just gave it a try. Finally, I have the desktop fixed!

The patch is already on the GNOME Bugzilla, and everyone will get the fix back as it is supposed to be.

The thing that I want to underline is the anarchic collaboration:

  • somebody opened the bug,
  • other people reported it and the bug was confirmed,
  • somebody else found which part of the code was of interest,
  • somebody proposed the solution,
  • other people tested it on different systems.

This is the beauty of open source. Just give a hand if you can and enjoy it.

Stand your ground

The story, also picked up by the Guardian here, is rather interesting. In a few words, a student @Darwin College, University of Cambridge, published on his personal website an MPhil thesis about how to construct a device which demonstrates a flaw in the credit card system, one which makes it possible to make a transaction with a stolen card using any PIN.

The bankers asked to take this information down. Now think about it for a moment. Instead of fixing it, they asked to take it down.

I can foresee your objection: they should give the banks time to act and then disclose the flaw. They actually did, because the problem was reported in 2009 (yes, last year), as stated in this letter.

In the letter they also write why they will not take it down:

you seem to think that we might censor a student’s thesis, which is lawful and already in the public domain, simply because a powerful interest finds it inconvenient. This shows a deep misconception of what universities are and how we work. Cambridge is the University of Erasmus, of Newton, and of Darwin; censoring writings that offend the powerful is offensive to our deepest values.

This is the right way to go. Full disclosure. Fix the problem, don't hide it. It was also the position expressed @ the Moka Olografix (an Italian security camp which I went to ages ago).

Hat tip to Ross Anderson and Omar Choudary.

A Trenitalia experience

On Saturday I was very brave and decided to take the train all the way from Ancona to Turin and back, using the amazing trains run by Trenitalia. Wind Operations Worldwide was having its annual meeting. Good news was announced for the KiteGen project, especially that work on the first industrial prototype is drawing to a close.

The meeting was starting at 13:30 in Turin, so I decided to catch the train from Falconara Marittima @7:50. This train was supposed to arrive @Bologna at 10:37, and then I had to catch the super-fast “Alta Velocità” high-speed train at 10:53.

A sad 10 appeared under the delay column on the screen. This delay was really strange, because the line was free, other trains from Ancona were arriving on time, and the train was supposed to start from a very close station, just 10 km away.

In the end the train showed up with at least 40 minutes of delay. When I asked on the train what the cause of the delay was, the answer was a laconic “there was a problem with the safety check on the locomotive engine, so it had to be detached and reattached. That took a bit of time.” When I pointed out that I was at risk of losing the connection to Turin, the ticket collector said he was unable to do anything.

I arrived in Bologna after the other train had departed; there were no high-speed trains to Turin anymore, and the only thing left was to change the ticket for a high-speed train to Milan (which arrived delayed as well) and then take a slow train from Milan to Turin, which took 1 hour and 55 minutes.

I arrived in Turin at 15:27 and was able to reach the meeting only around 16:00.

For the return I was taking the “Intercity Notte” at 21:05 from Turin, which was scheduled to arrive in Ancona at 2:59. This train is one of the long-distance trains which connect the country from north to south. It turns out that our carriage, number 6, had a problem with the heating. The indoor temperature was very close to the outdoor one, which was 0 °C. We were freezing. Badly.

In my compartment there was one child, and there were other children on the same carriage. People were travelling even further than me, all the way south to Lecce, where the train was supposed to arrive around 10 o’clock in the morning.

The ticket collector said he was unable to do anything for us. The train was completely full, carriage 11 was experiencing the same problem, and there was no possibility of changing seats. The only thing they were able to do was give us some blankets at Bologna station, after a 4-hour trip. As you can understand, some light blankets didn't really change the situation. Moreover, there were not enough blankets for all the passengers.

Finally I arrived at Ancona station around 3 o'clock, completely frozen.

Was it bad luck? Was it just a set of circumstances, impossible to predict? I don't think so. The problems the other passengers and I encountered were not related to heavy snow, bad weather conditions or any other kind of exceptional situation. Last year it took me 26 hours to get back, with everything completely blocked due to heavy snow. That was something you have to accept and deal with. This, I think, was a different case.

My guess is that these problems were completely avoidable with regular maintenance, which has been cut back badly. This is not an isolated case. Speaking with regular passengers of the long-distance trains, they said: “you freeze in winter and you sweat in summer. No way out.”

I've filled in the form to ask for reimbursement. At least I want my money back for a non-existent service, the high-speed train to Turin, and compensation for the freezing conditions I had to deal with on the return trip.

Too much phosphorus

It seems the big paper about arsenic, which I talked about briefly in the previous post, lacks a lot of precision, and that the impurities present in the medium leave room for the possibility that the bacteria are actually not using arsenate, but still phosphate.

We were not really convinced during the lunch discussion, and it seems we are not the only ones.

More info here. These criticisms are especially interesting: 3 µM of phosphate is quite a lot in a P-/As+ medium.

I guess there is room to have another look at this bacterium.

The other way

I'm still amazed by the substitution of phosphorus with arsenic, as found out and reported in Science here.

So life can have a completely different way of working… It's also true that arsenic sits just under phosphorus in the periodic table, so it shares a lot of chemical properties, but still, this is the first time we have an example of something living without phosphorus. Some questions right off the bat:

  • What's going on with adenosine triphosphate, ATP (the main energy molecule in all cells)?
  • Is that thing using an “ATA” (adenosine triarsenate)?
  • Is that stable? Does it even exist?
  • Which are the pathways? Is the whole kinase/phosphatase system just working with ATA? I guess we need to change the phosphatase names if that is the case…
  • What kind of evolution did it go through?
  • Where is it on the phylogenetic tree?
  • Is it on the phylogenetic tree at all?

Time to read the paper.

Best LaTeX editor: Gedit on steroids

Looking for a really powerful editor in GNOME is one of my constant quests.

However, it seems the quest is now close to an end, and it needs just a small push to achieve perfection.
Right now I'm using Gedit with the LaTeX plugin. It works amazingly well and does its job. The spellchecker is available and everything works properly.
I'm using Evince to check the compiled PDF instead of the third panel, but that is a matter of preference.

However, the best tool out there to enjoy creative writing is definitely Scrivener. The main feature of Scrivener is its ability to completely shield the user from managing files and names, giving the possibility to focus on:

  • the status of one's work (todo, draft, revision…)
  • the ability to shuffle and reorder the pieces of work as best fits
  • the main idea of using scrivenings, autonomous chunks of text which can be combined in an easy way

Writing books or big documents, from scientific papers to a PhD thesis, is a major effort, which constantly needs the overall vision but also attention to detail. Some parts will be ready before others; some pieces evolve differently than others. Chunked text is the best way to go, and I think it will really make users' lives easier while battling with writing massive documents.

According to me, the main problems with Scrivener are three:

  • it is available only for Mac and Windows, although there is an unsupported version for Linux
  • the way references are managed is not OK for scientific papers (BibTeX does a perfect job)
  • it is not easy to control the result of the compiled files.

This is the impression I got when I used it for a really short period of testing, although the user interface is amazing.

Therefore we should create our own Scrivener, where the three points stated above are addressed.

Gedit is a very good candidate to evolve, thanks to its plugins, towards a similar user experience. Using the LaTeX plugin it is already possible to hold and manage complicated text and notation and get top-notch quality results.

What we are missing is the management of files à la Scrivener, where each file is just a chunk of text which can be a subsection, a section, or even a chapter. Each file should be indexed, and the metadata of each file should be tracked, like the revision status and the part status. The best would be to have a project manager which holds all the files, opens them and makes them always available to the user, plus a third panel where the status and the type of each file are tracked properly.

Any chunk of text would be written in LaTeX and could be combined according to the order in the outline, using the \input command, as in the sketch below.
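
As a rough illustration (my sketch, with hypothetical chunk file names), the master document that such a project manager would generate could simply stitch the chunks together in outline order:

[sourcecode language="latex"]
% main.tex - generated from the outline order; chunk file names are hypothetical
% and assume the corresponding .tex files exist under chunks/
\documentclass{book}
\begin{document}

\chapter{Introduction}
\input{chunks/intro_motivation}   % status: draft
\input{chunks/intro_outline}      % status: revision

\chapter{Methods}
\input{chunks/methods_profiling}  % status: todo

\end{document}
[/sourcecode]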

Unfortunately I'm not too familiar with Gedit from the programming point of view; however, if there is anyone who thinks this is a good idea and wants to give it a try, I'm happy to be a beta tester and give a hand.

If you are interested, leave a comment or send me an email. There is also a (hopefully) starting discussion on the Gedit ML.

Git tutorial: pushing a branch

This is a very good and clear tutorial on how to push a local branch to a remote with git.

In a nutshell:

1. Create the remote branch

git push origin origin:refs/heads/new_feature_name

2. Update the branch list
git fetch origin

3. Double-check that it is really there
git branch -r

4. Track the remote branch on a new local one
git checkout --track -b new_feature_name origin/new_feature_name

5. Classic pull. All branches will be pulled now
git pull
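
As a side note, on newer git versions (1.7.0 and later, if I remember correctly) the remote branch can be created and set up for tracking from an existing local branch in a single step:

git push -u origin new_feature_name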