I have decided to finally make a serious go at trying to learn the Finnish language properly. The eventual aim is to be able to read (sing?) the poem Kalevala in its original Finnish.
Finnish has three more letters than the English alphabet: ä and ö are used in Finnish words, while å is hardly ever used. It only appears in Swedish loan words or names, such as the Åland Islands in the Finnish Archipelago Sea, which are perhaps not the best example as they are called Ahvenanmaa by Finnish speakers.
My laptop has an English keyboard, so I need a way of typing these extra letters in. Going through the graphical menu 'Accessories' and then to 'Character Map' is a bit of a chore and slows down typing and the spontaneity of it.
So there are keybindings to type in foreign (Unicode) letters. They are outlined in an Ubuntu wiki page called GtkComposeTable. That page explains two approaches. The first way is to type the Unicode values.
To do this you type Ctrl+Shift+U, then, while keeping hold of Ctrl+Shift, type the following code for each letter:
ä  e4
ö  f6
å  e5
Ä  c4
Ö  d6
Å  c5
Typing in random codes is as unintuitive and distracting as using the character map. So there is a second approach. This I have not yet got my head around.
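Before moving on to that second approach, it is easy to convince yourself that the codes above are nothing more than hexadecimal Unicode code points by feeding them to Python. This is only a sketch for checking the table, written for Python 3; the names CODES and letter_for are made up for the example and have nothing to do with GTK itself.

```python
# Hex codes from the table above, as typed after Ctrl+Shift+U.
CODES = ["e4", "f6", "e5", "c4", "d6", "c5"]


def letter_for(code):
    """Return the character whose Unicode code point is the given hex string."""
    return chr(int(code, 16))


if __name__ == "__main__":
    for code in CODES:
        # Print each code next to the letter it produces.
        print(code, letter_for(code))
```

Going the other way, `hex(ord("ä"))` gives the code for any letter that is missing from the table.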
So there is a key called diaeresis (¨), which on the British keyboard is got by using Alt Gr and the left square bracket, so: AltGr+[
Now there is a key called 'compose', which by default is the right control key (RightCtrl). So we should be able to get an 'ä' using some combination of the following keys:
RightCtrl a AltGr+[
By bashing these keys randomly, I managed to occasionally get an ä, the same with ö also. However, I don't really get it.
Something I might explore is whether I can bind some easier and more rational key combination to ä and ö, e.g. AltGr-a and AltGr-o.
This is as far as I have got so far. If anyone has any tips or has a better idea, please let me know.
Python West Midlands
Yesterday was the Python West Midlands Technical Meeting. The night before I had just arrived back in the UK, so I was a bit bleary eyed.
The group started about two years ago. After I learned Python I wanted to meet with other Python users in the area so we could share knowledge and have fun. So eventually I started a mailing list and website and put out a request to some local Linux group mailing lists for others to join me.
Before I knew it I had a dozen people on the mailing list, then two dozen, then three. We then managed to have meetings in real life. We started off with evening pub meetings as the main thing, but later this turned into technical meetings one Saturday a month.
PyCon UK
One of the first people to respond was a guy called John, who quickly became my co-conspirator and co-leader of the group. He already had an idea to run a national UK Python day. Now we had a group of people, we could turn that idea into reality. I was thinking of the name "Python Saturday".
There had been previous Python events in the UK such as a Python track at the ACCU conference, so John estimated that there would be 80 people who would come to such a day. I was more sceptical, I estimated 60 people, 20 from Python West Midlands, 20 from London Python and 20 from everywhere else in the UK. We were both very wrong, this was clearly an idea whose time had come, we ended up with 200 delegates!
When the call for talks went out, and John had emailed all his Python friends, we had more talks than we could reasonably fit into a day, so we expanded the event to two days, then we expanded that to two rooms at once, and lastly we expanded to four rooms of talks on the Saturday and three rooms of talks and one room of 'open space' on the Sunday. This was PyCon UK 2007, and it was a surprise success.
The Bennett incident
There were of course some mess-ups. The Friday night social was a bit of a disaster: the place we had chosen was completely unsuitable for such numbers, and the chef had a row with his manager and stormed out just before we arrived, meaning there was no food! We were stuck with a terrible dilemma: people would be arriving (hungry) at the pub all night from around the UK and beyond, so we couldn't change the venue, yet food was needed for many dozens of hungry geeks.
We tried to fix it by getting everyone to contribute a fiver and then buying food from a nearby Thai takeaway, this was a bad idea because firstly we didn't have any serving implements, so could not separate food into x portions, secondly it became a huge scrum, meaning the food was really hard to distribute. In the end the PyCon UK Treasurer saved the day by authorising us to buy a second round of food using money from the conference budget, so we raided John's wallet and we made sure the takeaway packed the food into separate portions. Despite such things, the treasurer still managed to break even, absolute genius, I have no idea how he got so much out of such a small budget. Hopefully, the delegates managed to get on and chat with each other, while the organisers pulled their hair out.
Although the Bennett incident was caused by a disappearing chef, the root of the problem was that a social meeting expected to have a small group had a very large group instead. Not knowing how many delegates you are going to get is the main problem of conference organisation. Plans laid for 100 people fall apart with 200. Plans for 200 fall apart if you only get 100. After doing this a few times now, I think the answer is to pick the number you want before you start, and then declare the conference full when you hit it.
PyCon 2008
So last year was the practice, this year we would like to think we know what we are doing! We also opened bookings very early with a large early-bird discount, this not only enabled us to get a good idea of numbers, but also to put some money in the bank to pay the bills and deposits that we needed to pay for. Yes this year we have a bank account! Last year we ran the whole conference using John's credit card. We have also managed to get an impressive list of sponsors, so the conference finances are secure, we have a firm foundation to build on.
We have far more talks and events than last year, and I am pretty confident that we will meet or beat last year's attendance. However, due to the sponsors and more organisational experience, we can comfortably grow a little and still not have to declare the conference full.
Python West Midlands August Meeting
Anyhow back to the last Python West Midlands meeting. This time we did a bit of organising for PyCon UK, thankfully this year we have some friends elsewhere in the country to help us, but much of the practical work is still done by Python West Midlands.
One onerous task that we had been putting off was typing in the feedback forms from last year. As vice-chair of the organising committee I had a lot of other things to do first. However, some other members of the group gritted their teeth and typed out all the returned forms, and before I knew it they had typed my share also. So, lucky bastard that I am, I escaped. Cheers to everyone who did the hard work while I sat on my arse sending emails.
We had 108 forms returned, and they contain lots of interesting comments. Of course whenever you ask people for their views, some are a bit unrealistic, some are mad and some have absolutely fantastic ideas. It is the latter that hopefully will make the typing worthwhile.
You can read the full results here.
As well as questions about what they thought of the conference, there were some interesting statistics about the 108 Python programmers who replied to the questionnaires:
Firstly, we asked them how much experience they have with the Python programming language:
Secondly, we asked about what kind of programmer they are.
In retrospect, it would have been nice to have "I use Python in Open Source projects" as one of the options.
Lastly, we wondered what Operating Systems the programmers worked on:
As you might expect, Linux is very popular among this sample of programmers. I wonder if that is the same in the wider Python world, or whether people with the type of personality to get involved, meet with their peers and contribute, are the same personality type as people who are willing to use an open source operating system?
I wonder how our delegates compare to the Python world generally?
So Great Britain did really well in the Olympics this year, coming fourth in both total number of medals, and in the number of gold, silver and bronze.
Country        Gold  Silver  Bronze  Total
Great Britain    19      13      15     47
So well done to all the Olympic athletes for taking part, from whatever country.
Britain is of course part of the European Union. Alexander Stubb, the Finnish Foreign Minister who is in the news a lot at the moment after taking charge in Georgia, once said that "the EU will always be more than an international organisation, but less than a state." (Source - PDF)
So the people running the young Europeans website, had the fab idea of adding all the EU's medals up.
Country  Gold  Silver  Bronze  Total
EU         87     101      92    280
China      51      21      28    100
USA        36      38      36    110
Russia     23      21      28     72
Only a couple of medals away from getting more than the next three regions combined!
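The "couple of medals" claim is easy to check from the figures above. A small sketch for Python 3, using only the numbers in the table; the names MEDALS, totals and gap_to_rest are invented for the example.

```python
# (gold, silver, bronze) per region, copied from the table above.
MEDALS = {
    "EU": (87, 101, 92),
    "China": (51, 21, 28),
    "USA": (36, 38, 36),
    "Russia": (23, 21, 28),
}


def totals(medals):
    """Add up gold, silver and bronze for each region."""
    return {region: sum(counts) for region, counts in medals.items()}


def gap_to_rest(medals, leader="EU"):
    """Medals by which everyone else combined is ahead of the leader."""
    medal_totals = totals(medals)
    rest = sum(total for region, total in medal_totals.items() if region != leader)
    return rest - medal_totals[leader]


if __name__ == "__main__":
    print(totals(MEDALS))
    print(gap_to_rest(MEDALS))
```

The other three regions add up to 282 against the EU's 280, so the gap is two medals.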
Maybe you disagree with Stubb and think the EU is not any different than other international organisations. In that case we would compare the EU to America's and Russia's international organisations.
Country  Gold  Silver  Bronze  Total
EU         87     101      92    280
NAFTA      41      47      43    131
CIS        31      35      56    122
Still win!
The pop singer Lily Allen wrote a piece on her blog saying that she had finished her anticipated second album, called Stuck On The Naughty Step, but her record company had not yet released it, perhaps because the people supposed to be doing that had been laid off.
For this post, it does not matter whether you like Lily Allen or not, but in the last paragraph I linked to Lily Allen's blog post. Look, I did it again! Not hard, was it?
Well, three of the four major national broadsheet papers in Britain quoted from the above blog post: the Times, Guardian and Telegraph. All three of them failed to link to their source. The Metro, the paper they put on buses and trains, did not do any better.
Only the tax funded BBC managed to link to Allen's site. However, the link is across in the right column, and does not link directly to the blog post.
I know that, by the standards of newspapers, the hyperlink is cutting-edge technology, having been invented only in 1965 by Project Xanadu and first used on the World Wide Web in 1991 by Sir Tim Berners-Lee.
The journalists presumably had the blog they were cutting and pasting from in front of them while writing. Would it have been that much more effort to cut and paste the URL into the post?
On most browsers, you can copy the URL with these three shortcuts: Ctrl+L, Ctrl+A, Ctrl+C
Unless you are on the Mac, then you want: Cmd+L, Cmd+A, Cmd+C
Hyperlinks are what turn text into hypertext. There is a clue in every link: http://, where the HTTP stands for Hyper-Text Transfer Protocol. It is not PTDP, the "Plain Text Deadend Protocol"; it is not NWHTHTCNLP, the "Now We Have Them Here They Can Never Leave Protocol".
Linking to what you are talking about is how the Web works, like taking your trolley back to the trolley rack is how supermarkets work. Throwing your trolley into a ditch or leaving it in the middle of the road ruins the ecosystem. It is also rather inconvenient for the other shoppers who find there are no trolleys anymore.
Likewise, withholding the sources to keep your readers in the dark is disrespecting the ecology of the Web. Again the key is in the name "Web", interconnected sites, it is the World Wide Web, not 'My Little Cul-de-sac'.
XML databases have not really broken through yet in a big way, primarily because SQL has proved more resilient than expected in storing a wide range of data.
SQL has limits
For example, object-relational mappers such as SQLAlchemy, the one inside Django and ActiveRecord in Ruby, have become very popular recently, as they allow you to serialise in-memory objects and store them in a database. Very little SQL experience is required and you can create very elaborate relational storage.
However, when you have data that is not relational at all, converting it to rows and columns can be either lossy or a lot of work, and often you end up unable to round-trip, i.e. you can get the data in, but you cannot get it out again in the original format.
The most common approach is not to even try, i.e. often people will not store XML in a database, but run a series of XML files through some XSLT stylesheets and then store the compiled HTML in a database. This removes XML's advantage as a processable exchange format. In order to get the data in a different way, the human has to go back to the beginning of the pipeline and rerun the compilation and database import. If you want to allow other computers to make flexible requests over the Internet, well then you are out of luck.
Meanwhile, a lot of the XML extensions to SQL databases are still relational underneath, for example, with MS SQL, you can give it XML, which it then splits up into rows and columns; you can then also ask for the data back in XML, which it then retrieves from the rows and columns and outputs in XML. You have not actually gained very much however, as your data is still stored in rows and columns, you have to output the data before you can process or query the XML. This can prove very long-winded with highly complex hierarchical data, it can sometimes prove faster to just dump the XML to disk yourself.
Even worse is storing a whole XML document within a single SQL field in a single row. Again, this is using a database instead of a hard disk.
Elliotte Harold's article, Managing XML data: Native XML databases, is a few years old, so the software examples he mentions are probably a little dated. However, the discussion of why one might want an XML database is quite good. This interview with Jonathan Robie is also worth browsing.
Berkeley DB
Berkeley DB, more commonly BDB or just DB, is the classic embedded database. Started in 1991, it is used more or less everywhere. Despite some recent competition from SQLite, it is still by far the leading embedded database. It is bundled inside everything: Linux, BSD, OS X, OpenOffice, Python, Apache, Sendmail, Postfix, Subversion, GNOME and no doubt millions of other things.
DB, like much of the software that powers the world, was started at the University of California, Berkeley. It was then spun off into Sleepycat Software, which was bought by Oracle a few years ago. Note that DB has nothing to do with Oracle's eponymous relational database product. They are at completely opposite ends of the spectrum: Oracle database is for large corporate data storage, while DB is a fast and light embedded database.
DB is also non-relational, i.e. there is no SQL interface, there are key-value pairs stored in byte arrays and that is more or less it. If you want SQL or network interfaces, then you stick that on top, for example DB is one of the back-end options for MySQL.
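To give a feel for what "key-value pairs stored in byte arrays" means in practice, here is a minimal sketch. Note that it does not use Berkeley DB itself: it uses `dbm.dumb` from the Python 3 standard library purely as a stand-in, because that offers a similar dictionary-like, bytes-in and bytes-out interface. The function name, file name and keys are all invented for the example.

```python
import dbm.dumb
import os
import tempfile


def demo():
    """Store and fetch a couple of byte-string pairs in a throwaway database."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "example")
        # "c" opens the database for reading and writing, creating it if needed.
        with dbm.dumb.open(path, "c") as db:
            db[b"title"] = b"Kalevala"
            db[b"language"] = b"Finnish"
        # Reopen the file to show that the pairs really were written to disk.
        with dbm.dumb.open(path, "r") as db:
            return db[b"title"], sorted(db.keys())


if __name__ == "__main__":
    print(demo())
```

There are no tables, columns or queries here, just opaque keys and opaque values; anything richer has to be layered on top.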
Berkeley DB XML
The folks at Sleepycat/Oracle have been working on Berkeley DB XML for a couple of years, but it is not so well known as its older brother. DB XML is a layer on top of DB that forms a fully XML-native database with "XQuery-based access to documents stored in containers and indexed based on their content". (Source)
There are two good things about it. Firstly, it is an XML database written in C and C++, not in Java. Secondly, it is a proper XML database, not putting XML into SQL columns.
The other notable XML database is the Java-based eXist, which seems to be better known, primarily among Java XML developers. eXist is more of a service, rather than something light to embed into applications. It is also rumoured to be significantly slower than DB XML, but I have never used eXist so I can't show any benchmarks for that.
There are also a load of proprietary and half-finished XML databases that we do not care about, including the abandoned Java XML database called dbXML, which has no relation to the Berkeley one but confuses everyone because its SourceForge page is often the first search engine result.
Install Berkeley DB XML - Windows
Go over to the Oracle/Berkeley DB XML homepage and grab the installer. I have not used it so I have no idea what happens, hopefully it is straightforward.
Install Berkeley DB XML - Posix platforms
DB XML works on all major posix compatible systems, including Linux, OS X, BSD, Solaris and so on. Installing it varies according to distribution. If your distribution has packaged it, then lucky you, you just install it through that. Of the two distributions I usually use as examples on this site, Gentoo Linux and Ubuntu Linux, the former has a DB XML package, the latter currently does not.
Install Berkeley DB XML - Gentoo
sudo emerge dbxml
The package is new, so if you are running stable, then the emerge command will moan and tell you that the package and some of the dependencies are not yet in stable. You will need to add these to /etc/portage/package.unmask, see Using Masked Packages in the Gentoo Handbook for more details.
Install Berkeley DB XML - Ubuntu
Hopefully, in the near future, you will be able to go:
sudo apt-get install dbxml python-dbxml
But we are not there yet. When we are there, the rest of this post will be irrelevant. Therefore this following part will date badly, I will try to remember to update it as events unfold, but if you are reading this paragraph in 2009 or later, it means I have forgotten and it might be worth checking other sources.
What follows shows how important packagers are. Almost any complex package will need to be optimised for that distribution, and normally all the work is done for us behind the scenes. The pointy-heads in the Mozilla corporation that didn't want Firefox to be patched downstream in the distributions are not living in the real world. But that is another story. Let's get going.
Dependencies on Ubuntu
First, we need to get the dependencies. We need Berkeley DB, xqilla and libxerces, the latter two are not all yet in the default Ubuntu repositories, but my mate txwikinger has packaged them for us in his private archive. You need to add these lines to the bottom of your /etc/apt/sources.list with a comment so you remember why you put the lines there:
# Deps for DBXML
deb http://ppa.launchpad.net/txwikinger/ubuntu hardy main
deb-src http://ppa.launchpad.net/txwikinger/ubuntu hardy main
Then run: sudo apt-get update
sudo apt-get install libdb4.6++ libxerces28 xqilla
sudo apt-get install libxerces28-dev libxqilla-dev libdb4.6++-dev
Working directory
We need to have a working directory, in my examples we will assume ~/Sandbox, but use where you like.
mkdir ~/Sandbox/
cd ~/Sandbox/
Go over to the Oracle/Berkeley DB XML homepage and grab the tarball and extract it into your ~/Sandbox directory, or you can just use the commands:
cd ~/Sandbox
wget http://download.oracle.com/berkeley-db/dbxml-2.4.13.tar.gz
tar -xvvzf dbxml-2.4.13.tar.gz
Compile DB XML
The tarball from Oracle contains a lot of dependencies which we have already installed. So ignore the 'buildall' instructions, as they compile all the dependencies which takes all day. You then have half a dozen or so packages that are in your main system folders but are not managed by Apt, this is not really the done thing.
Now because we are doing it the proper Unix/Ubuntu way rather than the Oracle big-tarball-of-mud approach, we want to use the shared library of xqilla that we installed earlier. There is a slight problem in the configure file which stops it from working. Open the following file in your text editor:
~/Sandbox/dbxml-2.4.13/dbxml/dist/configure
In line 4396, change .la to .so
So the line:
elif test `ls "$with_xqilla"/libxqilla*.la 2>/dev/null | wc -l` -gt 0 ; then
Becomes:
elif test `ls "$with_xqilla"/libxqilla*.so 2>/dev/null | wc -l` -gt 0 ; then
Now we can build DB XML. Change into the build directory first, then run configure from there:
cd ~/Sandbox/dbxml-2.4.13/dbxml/build_unix
../dist/configure --with-berkeleydb=/usr/lib/ --with-xqilla=/usr/lib/ --with-xerces=/usr/
If that works without error, you can type:
make
sudo make install
Now if that works without error, you can try to run dbxml:
cd /usr/local/BerkeleyDBXML.2.4/bin
./dbxml
Type help to see the list of commands. Press Ctrl+D to quit when you have had enough.
Python Bindings on Ubuntu
So far so good, now we need the Python bindings:
cd ~/Sandbox/dbxml-2.4.13/dbxml/src/python/
Now we have to do a little more patching to let the Python bindings know where dbxml is installed.
Open setup.py in your text editor. We are going to add three lines; in each case make sure the indentation lines up with the line above it.
After line 18, add the following line:
db_xml_home = '/usr/local/BerkeleyDBXML.2.4'
After what is now line 65, add the following line:
INCLUDES.append(os.path.join(db_xml_home, "include"))
After what is now line 69, add the following line:
os.path.join(db_xml_home, "lib"),
Now in dbxml-2.4.13, there is another little bug we need to deal with first.
Open the following file in your text editor:
~/Sandbox/dbxml-2.4.13/dbxml/dist/swig/dbxml_python.i
Go to line 533, and change 'Vaue' to 'Value'.
So change:
class XmlInvalidVaue(XmlException):
To:
class XmlInvalidValue(XmlException):
Now this file was used to automatically generate the Python bindings, so the generated file needs to be fixed too. The dbxml developers need to regenerate them, but until they do, we can just fix it ourselves. Open the following file in your text editor:
~/Sandbox/dbxml-2.4.13/dbxml/src/python/dbxml.py
Go to line 121, as before change:
class XmlInvalidVaue(XmlException):
To:
class XmlInvalidValue(XmlException):
Now we can finally install the bindings:
python setup.py build
sudo python setup.py install
Now let's test it, which involves, yes you guessed it, more patches:
cd ~/Sandbox/dbxml-2.4.13/dbxml/examples/python/
Now edit examples.py using your text editor. Remove the 3 from line 9. So from:
from bsddb3.db import *
To:
from bsddb.db import *
Now you can go:
python examples.py 7
Which should give you:
Running example 7. book1 = <book><title>Knowledge Discovery in Databases.</title></book>
If you have that then everything should be in working order. This has been a very long post so I will break here and come back to DB XML in Python another day.
I was working on something in one of my little Django sites and wondered how you make a recurring monthly event in Python. What I mean by recurring event is "every fourth Saturday" or "every first and second Wednesday" and so on.
I did not want to make a dependency on some huge calendar server module like Calcore or Twisted's caldav. All I wanted was a function that accepts "every fourth Saturday" and returns me an actual date that I can use for scheduling things.
A quick google didn't come up with anything, so I decided to do it myself. Here are my first and second attempts. The first attempt just works it out mathematically, the second attempt uses a module from the Python standard library.
"""Helper for recurring date."""
from datetime import date

DAYS = [
    'Monday',
    'Tuesday',
    'Wednesday',
    'Thursday',
    'Friday',
    'Saturday',
    'Sunday',
]


def eventdate(year, month, target_day, target_ordinal):
    """Convert a human event date to a real date.

    For example, for 'the third Thursday of the month'
    the target_ordinal is 3 and the target_day is 'Thursday'.
    Returns None if the month has no such day.
    """
    day = DAYS.index(target_day.title())
    match = 0
    for i in range(1, 32):
        try:
            if date(year, month, i).weekday() == day:
                match += 1
                if match == target_ordinal:
                    return date(year, month, i)
        except ValueError:
            # Ran past the end of the month without finding a match.
            return None


def main():
    """Example when called directly."""
    today = date.today()
    if today.month == 12:
        year = today.year + 1
        month = 1
    else:
        year = today.year
        month = today.month + 1
    # A single formatted string prints the same on Python 2 and Python 3.
    print("Next Month's Linux group is %s" % eventdate(year, month, 'Thursday', 3))
    print("Next Month's Python group is %s" % eventdate(year, month, 'Saturday', 4))


# start the ball rolling
if __name__ == "__main__":
    main()
That worked fine, but like all good Python programmers, I want to be as efficient (/lazy) as possible. Surely the standard library can do this for me? Well, I found that the calendar module will return a matrix of dates organised by week and day. This works as follows:
"""Event helpers."""
import calendar
from datetime import date


def eventdate(year, month, target_day, target_ordinal):
    """Convert a human event date to a real date.

    For example, for 'the third Thursday of the month'
    the target_ordinal is 3 and the target_day is 'Thursday'.
    Returns None if the month has no such day.
    """
    day = getattr(calendar, target_day.upper())
    cal = calendar.Calendar()
    # The first and last weeks are padded with dates from the neighbouring
    # months, so keep only the dates that really fall in this month.
    matches = [week[day] for week in cal.monthdatescalendar(year, month)
               if week[day].month == month]
    if target_ordinal > len(matches):
        return None
    return matches[target_ordinal - 1]


def main():
    """Example when called directly."""
    today = date.today()
    if today.month == 12:
        year = today.year + 1
        month = 1
    else:
        year = today.year
        month = today.month + 1
    # A single formatted string prints the same on Python 2 and Python 3.
    print("Next Month's Linux group is %s" % eventdate(year, month, 'Thursday', 3))
    print("Next Month's Python group is %s" % eventdate(year, month, 'Saturday', 4))


# start the ball rolling
if __name__ == "__main__":
    main()
This seems to work identically to the above but in fewer lines of code. I still get the feeling I am trying too hard and I am missing something obvious, but maybe I am just being too much of a perfectionist (as always).
If anyone knows or can work out a more efficient method, please do let me know.
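One possible answer, offered only as a sketch for Python 3 rather than a definitive solution: `calendar.monthrange` reports which weekday the month starts on and how many days it has, so the date can be worked out with a little arithmetic instead of a loop or a matrix. The function name nth_weekday is made up for the example; the arguments follow the eventdate functions above.

```python
import calendar
from datetime import date


def nth_weekday(year, month, target_day, target_ordinal):
    """Return the date of e.g. the 3rd Thursday, or None if there is none."""
    weekday = getattr(calendar, target_day.upper())
    # monthrange gives (weekday of the 1st, number of days in the month).
    first_weekday, days_in_month = calendar.monthrange(year, month)
    # Day of the first matching weekday, then jump ahead whole weeks.
    day = 1 + (weekday - first_weekday) % 7 + 7 * (target_ordinal - 1)
    if day > days_in_month:
        return None
    return date(year, month, day)


if __name__ == "__main__":
    print(nth_weekday(2008, 9, 'Thursday', 3))
    print(nth_weekday(2008, 9, 'Saturday', 4))
```

A month that does not start on a Monday, such as August 2008, is a useful test case, because that is where counting week rows and counting occurrences give different answers.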
I am sitting now in the beautiful city of Tampere, the heart of the Finnish lake district. This city offered to host GUADEC but was refused. Nuts, but if you read my recent post Exploring Technical Conference Demand and Supply, you will understand why.
In 2007, I went to GUADEC, the GNOME User and Developer European Conference hosted in Birmingham, it was an absolutely fabulous conference. It had the leading GNOME developers, great talks, great venue, great entertainment. The organisation was so good in fact that we completely copied much of the organisation for two other conferences that I have been involved in.
There was however one massive flaw: although the whole thing went on over seven days, the main conference talks were during the working week. There were a few hundred developers who had flown in from Europe, the USA, Latin America and so on. Apart from me there were three other people from Birmingham in attendance (if I don't include the nearby city of Wolverhampton then it was one). They might as well have held GUADEC on the moon.
Simply speaking, there were no users. This was a massive lost opportunity. They could have had the main talks over a weekend, dozens or hundreds of local people could have attended and learned more about GNOME, and the seeds of future developers could have been sown. How many potential new GNOME hackers live in Birmingham, Britain's second biggest city? A lot more than 4.
For GUADEC 2009, they had the choice of whether to have GUADEC in Tampere, Finland or Coruña, Gran Canaria. So the developers, no doubt on expense accounts, chose Gran Canaria. I would vote for a week in the sun too. However, how many potential new GNOME hackers live in Tampere, Finland's second biggest city, home to two universities and a host of IT companies? How many potential new GNOME developers live in the tourist destination Coruña, population 13,575?
To fly from England to Tampere, costs £40 on Ryanair from Stansted Airport. The cheapest I can find to fly from England to Gran Canaria in July 2009, the height of the tourist season, costs £230 if I book today, which I cannot because the GUADEC organisers have not announced the date. It will cost £300-400 if I book after new year.
If the GNOME conference is a corporate expenses fueled jolly for full-time developers, then please be honest about it. Don't call it a user conference or a community conference, it is not. GUADEC should be called GDEC or GNOME Developer Summit or similar. Someone else can then set up a community conference aimed at everyone.
Long time readers may remember that I am one of the organisers behind the conference of the United Kingdom Python community, PyCon UK, this year held over the weekend of September 12-14th. I am also giving a talk.
The abstracts of currently accepted talks, tutorials and BOFs and the timetable for the tutorials day have been published today. Not on that list are the keynotes, expected to be from Mark Shuttleworth and Ted Leung.
It is nice to see Django well represented again, with two out of the three main Django developers giving talks, (as well as a Pylons talk, proving that Django is not the only Python toolkit in town).
Also good to see some talks on PyPy, I have wanted to get into that for a while, so September might be my chance to spend some time looking at it properly.
The 'official' conference hotel is the Etap, primarily because it is one of the cheapest hotels in Birmingham (which is a cheap city for England) and because they charge per room not per person. Each room has a single and a double bed, so three students can pack in to a room, paying very little each. People often arrange room shares on the mailing list.
If you are richer, then you can have a whole room in the Etap, or you can go to the Copthorne (next to the conference venue), the Holiday Inn (where the Saturday conference dinner is planned), or the Novotel (a pretty walk of 5-10 mins away). (A longer list is on the conference site). The early bird rate is still open (but not for too much longer).
If you go, do say hello, I'm wearing a crew shirt and a badge with 'Zeth' on it!
65 is not normally considered a notable number, but we can celebrate it here in this post. At least here in Europe, 65 is the traditional age for retirement. Even more important is that 65 is the atomic number of terbium, a metal used in making solid-state Flash drives.
In 1965, the film Mary Poppins romped home at the Academy Awards winning five Oscars.
Normally reserved for kings, Winston Churchill's state funeral in 1965 was the largest that Britain has ever seen before or since; meanwhile in America, a different king, Martin Luther King, led the pivotal Selma to Montgomery marches, and Mariner 4 took the first ever photos of Mars.
Most influential technology sites in the UK
These historical facts were a very tenuous build up to the rather more insignificant fact that Wikio, a blog search engine that has been bought by Yahoo, has ranked this site as the 65th in their list of top 100 most influential technology sites in the UK.
I clicked on the link "How are these rankings compiled?", which gave the following information:
The position of a blog in the Wikio ranking depends on the number and weight of the incoming links from other blogs. These links are dynamic, which means that they are backlinks or links found within articles.
Blogrolls are not taken into account and Wikio only counts links from the last 120 days. We thus hope to provide a classification more representative of trends in the blogosphere.
So from this it probably means they are sucking in everyone's RSS feeds and then parsing them for links; well that is how I would create such a site. Scraping a whole site, i.e. like Technorati does, would make it very hard to distinguish what is a 'blogroll' and what is a post.
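To make that guess concrete, here is a rough sketch of the feed-parsing half in Python 3, using only the standard library. It is purely my assumption about how such a ranking could be built, not a description of what Wikio actually does. The sample feed, the LinkCollector class and the count_linked_sites function are all invented for illustration, and the "weight" of each linking blog is left out entirely.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse

# A tiny made-up RSS feed; a real crawler would download thousands of these.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example blog</title>
<item>
<title>First post</title>
<description>&lt;p&gt;See &lt;a href="http://example.org/a"&gt;this&lt;/a&gt;
and &lt;a href="http://example.com/b"&gt;that&lt;/a&gt;.&lt;/p&gt;</description>
</item>
<item>
<title>Second post</title>
<description>&lt;a href="http://example.org/c"&gt;more&lt;/a&gt;</description>
</item>
</channel></rss>
"""


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a fragment of HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def count_linked_sites(feed_xml):
    """Count how often each host is linked to from the posts in one feed."""
    counts = Counter()
    root = ET.fromstring(feed_xml)
    # Only the items (posts) are in the feed, so blogrolls never show up.
    for item in root.iter("item"):
        collector = LinkCollector()
        collector.feed(item.findtext("description", default=""))
        for link in collector.links:
            counts[urlparse(link).netloc] += 1
    return counts


if __name__ == "__main__":
    print(count_linked_sites(SAMPLE_FEED))
```

Because a feed only contains posts, the blogroll problem disappears without any extra work, which is presumably the attraction of doing it this way.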
Who is the king of the Midlands?
Now technology is a broad topic, and the UK is a wide area. If we zoom in to sites about free/open source and sites based in the UK's Midlands, then it seems this site is the 2nd most influential open source site in the Midlands, behind an acquaintance of mine, and fellow Midlands resident, Jono Bacon, the Ubuntu community manager.
So the question is readers, can we climb the 21 places to become the most influential open-source site in the Midlands? Can Birmingham triumph against the Black Country?
Only time will tell, but if you are in a position to link here and help pimp this site up the table, please do. Also, if you have linked to me and I have not linked back, either in a post or in my recommended links section then it means I do not know about your site, so leave a comment telling us about it!
A few months ago, we looked at Linus Torvalds in his own words, which was surprisingly popular (for a filler ;). So following the same approach (i.e. too busy to write something original today ;), what are the top-ten best mailing list posts in the history of free/open source software?
This is pretty difficult to say of course, so here are the ten coolest posts that spring to mind. If you can think of a better one, please do paste a link in the comments.
Okay that is my pick, what have I missed? Please post your suggestions in the comments below.
Apologies for the anoraking, but hopefully my regular readers are used to it by now. I have been involved in organising several different technical conferences in the UK, so this is an interesting subject for me.
Dynamic Programming Language conferences are great
I have heard of five conferences in the second half of 2008 connected to Open Source dynamic languages within reach of where I live:
There are probably more that I have not heard of. There are four with very similar prices, and one that is more expensive:
Earliest possible price

                            PyCon UK  EuroPython  TCL TK  YAPC (Perl)  RailsConf
Conference Only             £60       £79         £79     £96          £549
Conference + Tutorials Day  £95       N/A         N/A     £191         £627

Latest possible price (on the door)

                            PyCon UK  EuroPython  TCL TK  YAPC (Perl)  RailsConf
Conference Only             £100      £159        £79     £96          £746
Conference + Tutorials Day  £155      N/A         N/A     £191         £824

Community and corporate conferences
While the type of content is more or less comparable across these conferences, there are of course many differences in the presentation. A community-organised conference emerges through the work of volunteers, mailing lists and a fair amount of gaffer tape. A conference organised by a large company has secretaries, marketing officers, paid web designers, professionally printed schedules and so on.
In corporately organised conferences, there is a clear line between those on stage and delegates in the pews. The organisers are working for the company, and the speakers will get in free and possibly have their transport and hotel bills covered. This probably accounts for the RailsConf fee.
Community conferences tend to treat everyone the same, and a larger proportion of delegates will be pitching in and contributing something. When 50% to 100% of the delegates are conference organisers or speakers, they of course have to cover their own costs.
Delegates are not all the same
I was thinking about these issues while at Europython. Organising a conference for several hundred people is difficult. One of the difficulties is that potential delegates fall into separate groups with different preferences. Let's split them up into four groups:
Attendance
On the whole, for group 4, it does not really matter whether the conference is on weekdays or at the weekend.
Group 1 see the conference as work, so they prefer it to be on weekdays; that way they can miss days at the office and carry on with their normal social activities at the weekend. They would still come at the weekend, since they have to meet peers and keep up with the state of the art, but they will moan heavily about missing football or salsa class or whatever.
People in group 2 find it hard to get their work to let them have time off in the week to attend a weekday conference. A few of them might be willing to use some of their holiday because the knowledge and contacts gained from the conference may help them in their personal career development. That is pretty difficult for people with families. Economically speaking, holding the conference on weekdays raises the cost of attending for this group by whatever benefit they could have had from two more days of holiday.
On the whole, Group 3 just can't come on the weekdays. They can't get time off, and it does not benefit their careers.
Cost
On the whole, price is not a problem for group 1 because work is covering expenses. This group is not price sensitive, or in economic terms, their demand is price-inelastic. This group is more sensitive to hassle: they are happy to pay a little more if someone else deals with all the mundane organisational matters, food, accommodation etc.
On the other hand, group 4 are not receiving a wage, so are very sensitive to price (their demand is price-elastic). If the conference is too expensive then they cannot come.
Group 2 and 3 are generally not receiving expenses but are receiving a wage. They can afford more than group 4 but will not want to pay for non-essentials because it is their own money they are spending.
Conference planning
So when planning a conference, the choices you make about when you hold it (weekend/weekday) and how much it costs determine the number of delegates you will get. All other factors (e.g. advertising/publicity) being equal, it approximates to:
number of delegates = group1 + ((group2 + group3) / weekday penalty) + (group4 / cost penalty)
You get group 1 no matter what. If you hold the conference on a weekday then you only get some proportion of groups 2 and 3, and likewise, the more you charge, the less of group 4 you get.
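To play with the numbers, the approximation can be written as a small Python function. This is only a sketch: it reads the formula as applying the weekday penalty to groups 2 and 3 together, and the group sizes and penalty values below are invented for illustration rather than taken from any real conference.

```python
def estimate_delegates(group1, group2, group3, group4,
                       weekday_penalty=1.0, cost_penalty=1.0):
    """Rough attendance estimate from the approximation above.

    A penalty of 1.0 means no loss; 2.0 means only half of that group
    turns up. The penalty values are illustrative guesses, not measured.
    """
    if weekday_penalty < 1 or cost_penalty < 1:
        raise ValueError("penalties must be at least 1.0")
    return (group1
            + (group2 + group3) / weekday_penalty
            + group4 / cost_penalty)


# Hypothetical pool of potential delegates in each group.
pool = dict(group1=50, group2=80, group3=60, group4=40)

# A cheap weekend conference: nobody is put off.
weekend_cheap = estimate_delegates(**pool)

# An expensive weekday conference: groups 2 and 3 are quartered,
# group 4 is halved, and group 1 comes regardless.
weekday_pricey = estimate_delegates(weekday_penalty=4.0,
                                    cost_penalty=2.0, **pool)
```

With these made-up numbers the weekday, expensive option loses more than half of the potential audience while keeping all of group 1.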
So you get the maximum number of delegates by holding your conference at a weekend and charging as little as possible. Of course, whether you want the maximum number of delegates is another matter. Charging nothing at all might mean you have the maximum number of delegates, all having a rather uncomfortable experience.
If you only care about group 1, then hold the conference all week in a sunny beach resort with expensive food and entertainment. This, however, can become sterile over time, as professional programmers of one language/toolkit all have the same experience and ideas. Groups 2, 3, and 4 can bring in unique and off-the-wall ideas and application domains.
Conclusion
I am not trying to argue here for any particular set-up; what works for one technical community may not work for another.
However, it is often the case that all the conference organisers will be from the same delegate type. So I think it is worth taking time to think about how the decisions you make enable or disable certain types of delegates from coming to your conference.
Social networking is fickle
Over a year ago, I talked about how I joined Facebook and Mugshot, then as part of my ten crazy New Year predictions, I argued that social networking will eventually become a protocol. Social networking is fickle, as people move on to the next pub.
After Easter, I joined Twitter and I wrote a post about Scripting Twitter with Python, and how I tried to integrate Twitter into my GNOME desktop.
One of the problems I had with Twitter was that the API was heavily rate limited, so I would often get suspended from the API when trying to experiment with spidering the social network (e.g. give me all the friends of friends who have mentioned fishing).
A new Identi
The new kid on the block is Identi.ca. Like Twitter, on Identi.ca, you post your latest status update of up to 140 characters, and you can subscribe to other people's updates.
So here is my page on Identi.ca:
Identi.ca is brand new, so not as many people are using it yet as Twitter. As you can see, currently I only have three 'friends' on Identi.ca. If you have a go, do become my virtual friend!
A few differences from Twitter are that Identi.ca's source code is published online (under the name Laconica) and you can use OpenID to login if you want. Also the API is not rate limited and Identi.ca does not currently have the rate problems that Twitter has (Twitter is offline with capacity problems all the time these days).
Identi.ca plan to break the API at some point, but promise that:
'The documented write API below will remain available until at least September 30, 2008'.
I wrote my own quick tool for Identi.ca. I was quite impressed by how far I got with it in just an hour; my code output in Python is increasing nicely ;-)
It is a command line tool for getting and sending your updates to Identi.ca.
It is less advanced than my Twitter module because the API will change. It doesn't cache anything for example.
To use my tool, you need feedparser on your system. On Ubuntu:
sudo apt-get install python-feedparser
On Gentoo:
sudo emerge feedparser
If you are on another platform (e.g. Mac or Windows) and you have Python's setuptools installed you can go:
easy_install feedparser
Now you need to get my tool. You can download it from my code page, and save it as identi.py. If you have made a script directory then just throw the file in and then call the program with identi.py. Otherwise you can run it with the python command:
python identi.py
This will download the updates from all the friends that you are following. If you have a lot of updates, then you can limit it. The following command will download the latest 10 updates:
python identi.py -n 10
For complete options, run with -h:
python identi.py -h
So we have downloaded messages. Next is uploading a message; for that just write it at the command line:
python identi.py Just saw a great post at http://commandline.org.uk
The Bash shell does not allow unmatched quotes. My program does not care but if you have unmatched quotes then your message won't even get to my program. So if you want to use unmatched quotes then surround the whole thing with single or double quote marks, for example:
python identi.py "Just saw a great post at Zeth's blog http://commandline.org.uk"
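If you want to see the quoting problem without posting anything, Python's standard shlex module splits a string using shell-like rules. It is only an approximation of what Bash does, not Bash itself, and it has nothing to do with how identi.py works internally, but it shows the same two cases:

```python
import shlex

# An apostrophe with no partner: an interactive shell would sit waiting
# for the closing quote, and shlex reports the same problem as an error.
try:
    shlex.split("python identi.py Just saw a great post at Zeth's blog")
    unmatched_ok = True
except ValueError:
    unmatched_ok = False

# Wrapping the whole message in double quotes makes it a single
# argument, and the apostrophe inside is treated as a literal character.
args = shlex.split(
    'python identi.py "Just saw a great post at Zeth\'s blog"'
)
message = args[-1]
```

Here `unmatched_ok` ends up False, while `message` holds the whole sentence, apostrophe included.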
Your username and password are asked for when needed. If this gets on your nerves, you can edit the top of identi.py and provide them.
If you know Python you could also use it as a Python binding:
import identi
myid = identi.IdentiCA(username = "zeth", password = "something")
myid.login()
messages = myid.get_messages()
new_message = 'Posting direct from my Python Shell'
myid.put_message(new_message)
That is really about all it does, if you need something more then you may be better off reading the API documentation.
This is my (not very) regular series about what I have read on the web since last time.
Jürgen has written a post asking whether in the age of mobile phones, the need for a wrist watch is diminished?
Are smartphones a complete waste of time? Bug looks into the pros and cons. K thinks the iPhone is a big con, I have to agree. However, Garrick loves his iPhone.
Justin has a cat fight over OS X 10.5 (Leopard) playing up. For my sins, I have had to use OS X a bit in my new job, and I actually found Leopard less annoying than Tiger, mainly because in each version, OS X becomes less like NextSTEP and more like Linux.
Brock tries out XMLStarlet, the command line toolset for XML processing. Daniel looks at Logical Volume Manager (LVM) on Ubuntu and Gentoo. Paul has started to set up a backup server.
Andrew W dug up a nice graphical guide to the system crontab file. I personally am very happy at whoever invented the /etc/cron.hourly and /etc/cron.daily folders which are good enough for me most of the time.
Mez reminds us of the virtues of compressed air. Danux has started a new site called Amarus, there is not much there at the moment, but we wish him well.
Andy L talks about an issue I have been thinking about before, namely, if the current world wide web gets taken over by narrow-minded corporate interests, shall we start our own World Wide Web? I have a slightly different suggestion: let's re-invade the forerunner to the WWW, gopher.
Recently, at a conference that shall remain nameless, some cynical but funny person made a joke about the great BDFL. He did an impression of a Guido van Rossum doll with a pull-string in his back; when the string was pulled, the Guido doll would say half a dozen phrases about Python 3000 (and nothing else). Interestingly, Craig Balding managed to interview Guido on a different subject, Google App Engine Security, and true to the joke, Guido says almost nothing.
Django NewFormsAdmin
If you do use Django, then you will want to know that the Django NewFormsAdmin branch has been committed to SVN. Therefore, if you are running Django from the SVN version, then don't SVN up until you have changed your code.
Basically Admin functions are now not part of the models.py file but instead are in a separate new file called admin.py. So cut and paste your admin classes from models.py to admin.py as explained in this guide. This is the last major API change before Django becomes 1.0 in September.
This will presumably keep Christian Joergensen happy, as he recently had a moan about Django's release schedule, i.e. Django has not made packaged releases that often. I personally disagree with Joergensen. For this type of software, releases are somewhat arbitrary and over-rated marketing tools.
For open source software, the mainline trunk should always be in a releasable state. With distributed development (i.e. when branching is cheap and easy) then there is no need for an old fashioned cycle of plan-develop-freeze-test-release-plan-develop-freeze... The trunk should be constantly tested.
The author admits that web frameworks move faster than some other types of software:
"This is a very long time, when you're in the market of web frameworks."
So Django is not a GUI WYSIWYG web site creating program. You can't just casually pick it up and make a website, you have to put time into it. To get the most out of Django, you have to read a huge pile of (mostly well written) documentation. Even for a seasoned Python programmer who knows other MVC frameworks, it will take an evening or so.
After this initial investment, if you decide to make your web applications using Django, then you are already committing yourself to keep up with the developments and improvements in the framework, i.e. keeping up to date with what the Django developers are doing. Therefore, tracking SVN is not unreasonable if you already know what changes are coming. Almost everyone paying even scant attention to Django would have known about the impending NewFormsAdmin; the documentation page about it that I linked to above was first published on the 14th January 2007.
I do accept however, that Django does seem more suited for teams maintaining the same websites over time, e.g. in-house programmers or contractors on long-term service agreements; rather than one-off, develop and leave type development. However, the former probably does produce better web sites.
Yesterday, I got passed the June edition of Linux Magazine. Carsten Schnober writes an article commenting on an article on Roy Fielding's home page. The article is otherwise fine but it includes this statement from Schnober:
"In the past, talented programmers would collaborate on developing software in their free time, often producing results that put their commercial competitors to shame, but this age seems to be passing."
Source: Carsten Schnober, Projects on the move, "Linux Magazine", Issue 91, June 2008, Page 94
It is true that many of the early 1990s free software/open source trail-blazers have grown old and/or rich and that their software projects are becoming part of the corporate mainstream.
However, the number of people working on Free Software/Open Source software now is at least one order of magnitude larger than it was in the 1990s. These new people are also from far more diverse backgrounds and at least one order of magnitude smarter too. A lot of the important stuff happening is not necessarily happening in America, and is not necessarily happening in the English language. More importantly, it is far more specialised.
We already have a lot of the obvious big things, we have C compilers (e.g. GCC, first release 1987), we have kernels (e.g. Linux kernel, first release 1991), we have graphical toolkits (e.g. GTK2, first release 2002), we have HTTP Servers (e.g. Apache, first released 1995), SQL databases (e.g. MySQL, first released 1995) and Virtual Learning Environments (e.g. Moodle, released 2001).
Today's free/open source community is now not just playing catch-up but is going in its own directions, to places that proprietary software has not. Which of these explorations will be successful I have no idea. But recent successes of the free/open source world include famous things such as package management (the fact that you can automatically download 20,000 stable malware-free software packages at the click of a mouse), modern dynamic languages and web frameworks, through to things like XBMC, which allows you to recycle your old Xbox into a fantastic media centre, and OpenStreetMap which will soon have the best non-governmental map data in the UK.
Even more important are all the tiny projects in the long tail, the application that allows your wife/girlfriend to automatically sync a shopping list into your phone, or the application that allows you to export all emails in your gmail that include certain keywords into a file, e.g. an automatically generated list of everyone who has responded to your party invitation. These small projects that provide one incremental improvement are the majority of free/open source projects. Such projects don't have marketing departments or PR managers.
Fielding's original article was about how tech journalism lags behind an event by at least a week; in other cases it can be years. Journalism covers the free/open source community poorly because it is covering it from a distance.
New ideas emerge in branches in version control systems, in IRC channels, development conferences and on mailing lists. By the time a technology gets into corporate press releases and corporate conferences, it is years old and the brightest minds in the free/open source community have long moved on. Many tech publications do little in the way of investigative journalism, they just re-post, re-write and re-hash whatever comes into their RSS feeds.
Outside of the mainstream tech press, you have people who self-identify as free software journalists, embedded journalists, if you will. Hopefully, they are getting paid, but they are very much on the fringes of the journalistic scene.
In the last episode of Lugradio, they discussed what they called tech "pundits". By this I would understand people who make their living from writing about the broad picture, people such as John Dvorak (I hate initialising middle letters).
In a church shared lunch, there will be various offerings. Some dishes are a work of beauty and worship that some lady has slaved over. Some are perfectly fine fillers that help to bulk out the lunch, e.g. salad, potatoes or rice. Others are off-the-shelf products that were hastily bought on the way to church by a single man. The occasional dish is worth avoiding entirely and will be subtly moved behind something else in a larger container.
Reading a pundit like John Dvorak is like surviving a church shared lunch. Some articles are a revelation, some are interesting enough as far as they go, and some are obvious howlers. The main thing is to have a good time while you are there.
I have had to use the TCL programming language recently. I don't know it well yet, and I have found the quickest way at the moment is to prototype in Python and then edit it into TCL code. This way I know the logic is sound, and therefore logic errors are not mixed in with syntax errors.
In the following example, I had a sequential list of numbers in TCL (which were unique ids of XML elements), and for a given number I had to find the nearest numbers on either side.
"""Nearest Neighbours in a list of numbers."""

def nearestneighbours(numlist, number):
    """For a given number, find the nearest lower and higher numbers in
    a given (ordered) list of numbers."""
    # Start from sentinels that any real number in the list will beat.
    left = float('-inf')
    right = float('inf')
    for i in numlist:
        if i < number and i > left:
            left = i
        if i > number and i < right:
            right = i
    return (left, right)

def main():
    """Demo when called directly."""
    mylist = [58163, 62140, 66139, 70280, 74371,
              78525, 82426, 86584, 90650, 94749]
    number = 67000
    lower, higher = nearestneighbours(mylist, number)
    print("Lower: %s" % lower)
    print("Higher: %s" % higher)

if __name__ == "__main__":
    main()

We have the function working as we want to, so now we can try to rewrite the code into TCL:
# Nearest Neighbours in a list of numbers.
proc nearestneighbours {numlist number} {
    # For a given number, find the nearest lower and higher numbers in
    # a given (ordered) list of numbers.
    set left 0
    set right 1000000000
    foreach i $numlist {
        if {$i < $number} {
            if {$i > $left} {set left $i}
        } elseif {$i > $number} {
            if {$i < $right} {set right $i}
        }
    } ;# end foreach
    set nearest [list $left $right]
    return $nearest
} ;# end proc nearestneighbours

proc main {} {
    # Demo when called directly.
    # The backslash continues the list command onto the next line.
    set mylist [list "58163" "62140" "66139" "70280" "74371" \
                     "78525" "82426" "86584" "90650" "94749"]
    set number 67000
    set highlow [nearestneighbours $mylist $number]
    puts "Lower: [lindex $highlow 0]"
    puts "Higher: [lindex $highlow 1]"
} ;# end proc main

main

This works great.
However, I wrote the above Python code in a verbose way because I was sure I could replicate it in TCL. In a Python program, I can just use the Python list's sort method to find the neighbours.

def nearestneighbours(numlist, number):
    """For a given number, find the nearest lower and higher numbers in
    a given (ordered) list of numbers."""
    numlist.append(number)
    numlist.sort()
    return (numlist[numlist.index(number) - 1],
            numlist[numlist.index(number) + 1])

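For a number that falls between the ends of the list, this gives the same result as the much more long-winded version at the start of this post, although note that it also appends the number to the list that was passed in. How does one do this in TCL? Well, rewriting the Python gives us: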

proc nearestneighbours {numlist number} {
    # For a given number, find the nearest higher and lower numbers in
    # a given (ordered) list of numbers.
    lappend numlist $number
    set numlist [lsort -integer $numlist]
    set position [lsearch $numlist $number]
    # The backslash continues the list command onto the next line.
    return [list [lindex $numlist [expr {$position - 1}]] \
                 [lindex $numlist [expr {$position + 1}]]]
} ;# end proc nearestneighbours

This seems to work fine too. Which is the preferred TCL way, I'm not sure.
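Back on the Python side, since the list is already ordered, another option is the standard library's bisect module, which finds the position by binary search and leaves the list untouched. This is just a sketch of an alternative, not what I used in the TCL port; returning None when there is no neighbour on one side is my own choice for the edge cases.

```python
from bisect import bisect_left, bisect_right


def nearestneighbours(numlist, number):
    """Return the nearest lower and higher numbers in a sorted list.

    Either side is None when there is no such neighbour. The list is
    assumed to be sorted already and is not modified.
    """
    lower_index = bisect_left(numlist, number) - 1
    higher_index = bisect_right(numlist, number)
    lower = numlist[lower_index] if lower_index >= 0 else None
    higher = numlist[higher_index] if higher_index < len(numlist) else None
    return (lower, higher)


mylist = [58163, 62140, 66139, 70280, 74371,
          78525, 82426, 86584, 90650, 94749]
lower, higher = nearestneighbours(mylist, 67000)
```

Using both `bisect_left` and `bisect_right` means a number that is already in the list still gets its strict neighbours on either side.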
This post is part of a series where I try to make outlandish predictions for 2008. Read the introduction for more details.
By the time you read this, over a week will have passed and a week is a long time in politics. Maybe something will happen during the Iowa and New Hampshire primaries to shake things up a bit.
9. Social networking will become a protocol
We each meet more new people each year than my great-grandparents would have known in their entire lifetime. Not only that, but on the whole, you will meet a completely different set of new people than your best friends will, so the traditional shared memory of small communities does not help you. Human evolution has not yet caught up with this situation and the result is social amnesia - we cannot immediately recall all the details about all the people we have known. One way of coping is to keep really good diaries, something sadly I have hitherto failed to achieve. Another recent help is social networking sites.
I don't need Facebook to talk to people that I see in my daily life, I also don't need Facebook to interact with geeky friends, everything Facebook can do, something more efficient and less time-consuming can do. Facebook is, however, marginally useful to interact with non-geeks that I went to University with.
The current unique selling point of a service like Facebook or Myspace is that it bundles several functions together in a way that can be easily used by smart but technically dis-interested users and everything is nicely abstracted away behind consistent menus.
I, and perhaps other people, have been saying this for a while, but I still believe that social networking will eventually become a protocol. When you remove all the glitz from social networking sites, at the core are a few very basic functions:
For the sake of this article, the last three are thoroughly uninteresting, email and the web solve these problems far more efficiently than any walled garden social networking service can do.
The first two are the ones that interest us here; it is the 'friends list' that is the selling point of a site like Facebook. The problem with the 'friends list' is that it is site-specific. When the next cool social networking site comes out, you have to rebuild your list from scratch. So one could end up with half-a-dozen different overlapping but not identical friends-lists on different networks. Having to remember which of the six lists a specific half-forgotten person is on removes the convenience that is the main foundation of a social-networking site.
So there is a tipping point, where the cost of entry and exit from these social networking sites outweighs their utility. The exact point will vary across individuals, according to how much of a real offline life they have.
The way to square the circle is to make your friends list more independent of the web application that you currently have in front of you. The answer is not Google's OpenSocial, it is not Microsoft's Passport, or any vendor specific service. It will be a simple protocol that allows you to sync your friends-list between different web applications, instant messaging applications, email accounts, photo sharing sites and so on. A protocol that the majority of its users won't even realise they are using.
I don't think the answer is OpenID, but the answer is something like it or something on top of it. The reason that OpenID is good is that anyone can implement OpenID. So normal people can use their account at web service A, for web services B, C, D, E and F; with minimal form-filling in the process. On the other side, any website can accept OpenID, from an established headline site down to a blog, no permission or licence is required from anyone.
The problem is that, as far as I can tell, OpenIDs do not necessarily map to a way of contacting people. If an OpenFriendList protocol was based on OpenID, then a web application, when given an arbitrary list of OpenIDs, would need to be able to send a message to an OpenID and expect it to go somewhere. The somewhere can change depending on the end user, they could have the messages go into their Facebook inbox, into their email, or onto their mobile phone. The main thing is that you can shoot text at an OpenID and expect it to be delivered.
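To make the idea a little more concrete, here is a toy sketch in Python. It is purely hypothetical: there is no OpenFriendList protocol, the `FriendListRouter` class and the example.org addresses are invented, and plain lists stand in for an inbox and a phone. All it shows is the shape of the thing, namely that the sender only needs an OpenID and the recipient decides where the text lands.

```python
class FriendListRouter:
    """Toy registry mapping an OpenID URL to a delivery function.

    Purely hypothetical: no such OpenFriendList protocol exists, and
    real delivery would go over the network rather than into lists.
    """

    def __init__(self):
        self._routes = {}

    def register(self, openid, deliver):
        """The owner of `openid` chooses where their messages end up."""
        self._routes[openid] = deliver

    def send(self, openid, text):
        """Shoot text at an OpenID; return True if it had a destination."""
        deliver = self._routes.get(openid)
        if deliver is None:
            return False
        deliver(text)
        return True


# Stand-ins for an email inbox and a mobile phone.
email_inbox, phone_messages = [], []

router = FriendListRouter()
router.register("http://alice.example.org/", email_inbox.append)
router.register("http://bob.example.org/", phone_messages.append)

# A friends list is then nothing more than a list of OpenIDs.
friends = ["http://alice.example.org/", "http://bob.example.org/",
           "http://carol.example.org/"]
delivered = [router.send(friend, "Party on Friday?") for friend in friends]
```

The hard part, of course, is everything this sketch leaves out: discovery, trust and spam.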
Anyhow, I expect someone smarter than me will solve this problem in 2008 and no doubt become a millionaire in the process.
10. Mitt Romney becomes the president of the United States
You can't be a real blogger unless you make a guess about the US Election, even if you have no political knowledge or know nothing about America.
Both main US parties are far more right-wing than any party you will find in Europe (give or take a few Nazis), so there is not that much to divide them on policy. None of the candidates will propose to nationalise medicine or have the tax rates and redistribution of income that all Western European countries have. So in the end, it comes down to personality.
The problem is at the moment there are fifteen declared candidates and a few in the wings, so you might as well pick a name out of the hat at this point. So if I really have to choose a winner, I'll go for Mitt Romney.
Mitt Romney ticks all the boxes that American voters seem to want. He is white, male, well spoken and looks mature but not haggard and elderly like many of the other candidates. He has five kids and a good looking wife who can create recipes and train horses. Romney was an extremely successful (but unexciting) businessman, then he managed to fly in and rescue the 2002 Olympic Winter Games, then he had a successful single term as Governor of Massachusetts and went out on a high.
So on paper he is what Americans would want, but in reality, he is just a bit too dull. He is the cookie-cutter presidential stereotype, which could make him look a bit distant to the average American, and being a practicing Mormon won't help to make those outside Utah feel like he is "one of us". He also has no experience to hold up on foreign affairs, but considering that whatever position you adopt, Iraq is political kryptonite, maybe that is no bad thing.
Due to his chronic dullness, what Romney can't really do is provide charismatic inspiration, so he risks being sidelined by some of the more nutty candidates who will get more of the airtime. The Democratic candidates have managed to generate more interest at the moment, but that is partly because some of their candidates are not dull grey-haired white men in suits.
In the last post, I went through the most popular Firefox extensions and talked about whether they were good ideas or not. However, it seems that not a lot of people think about another side to this, i.e. what are your Firefox extensions licenced under?
It turns out that a lot of the extensions available through Firefox are not free/open source software at all.
One example is the StumbleUpon Extension. StumbleUpon is a web service that allows you to share links with other users. Sometimes readers have shared this site and my number of visitors has gone up (cheers for that). StumbleUpon is commonly used through a toolbar provided as an extension for Firefox or Internet Explorer (and a comment in the last post reminded me about it).
This made me think, what is the licence of this Firefox extension? If you go to the StumbleUpon homepage, there is no software licence or terms at all. If you click the "Download now - Free" button, you go through to the download page: still no licence or terms. I unzipped the extension, looking for a software licence: nothing. This made me very suspicious. When people are proud of their licence, they put it right in front of you, so what are they hiding?
Eventually, after a bit of digging and Googling, I found their Toolbar License and guess what? Yes, you guessed it, it is proprietary software. So if you want to run free software/open source, then get it off your system now!
Their licence only gives you:
"a non-transferable ... non-sublicensable ... license to reproduce (solely to install and execute) the Toolbar on one of your computers, in executable object code format only, for your personal, non-commercial use only,"
Of course, the "Toolbar" is released as a Firefox extension, in plain-text Javascript and XUL, not in object code format. There is not really object code at all in Javascript, object code is a C term. But the lawyer writing the boilerplate probably didn't know or care about the difference. Anyhow, the licence continues:
"You may not modify, make derivative works of, copy, reproduce, publish, or reverse engineer the Toolbar"
This is in complete opposition to free software/open source, where all users have four freedoms:
Don't sell out your freedoms so cheaply! If you want the most free software computer possible, look up the licenses of your extensions.
For example, here are five popular extensions that are free software/open source:
Please do audit your own, and let us know what you find. Knowing which extensions are free and which are not free would be really helpful.
I recently looked at the forthcoming Epiphany browser based on WebKit. However, some people told me that Firefox has so many extensions that it would not be possible for a new browser to compete, even among the target audience of GNOME users. Is this true?
I am not a C hacker and don't want to be at this stage, so I can't really help with the heavy lifting in finishing the new Epiphany. However, the previous Gecko-based version allowed you to write extensions in Python, so if that is true in the new version, I could write an extension or two.
The old gecko version of Epiphany had various extensions, and a dozen or so of the best were bundled in the Epiphany-Extensions package.
Firefox extensions
It is early days because, as far as I know, the new Epiphany extension API is not written yet. However, we can do a little research about Firefox extensions, and see which ones are worth replicating on Epiphany. I myself have FireGPG (allows you to use GPG with webmail), Flashblock (blocks Flash movies unless whitelisted) and FireBug (see below).
There are 2353 add-ons and themes in the Firefox add-on database. Several are abandoned, in that they have not been updated to work with modern versions of Firefox. The bottom 1000 have had very little impact. For example, the "Et Lolcat" extension translates English to 'lolcat'; it has only been downloaded 26 times ever. I doubt the lack of a lolcat extension is going to prevent anyone from using Epiphany.
As you might expect, outside the big hitters, the popularity of extensions tails off pretty fast. The top few add-ons have been downloaded hundreds of thousands of times, the 100th add-on has been downloaded 10,000 times, and the 1000th add-on has hardly ever been downloaded by anyone.
So let's ignore all the themes, as Epiphany is themed according to your desktop theme; let's also ignore all the abandoned extensions and the extensions which have never really been downloaded by anyone. So we can say there are fewer than 500 extensions that are actually relevant for our purposes. This is still a massive number. I cannot think of another piece of software that has 500 active extensions.
In the rest of this post, I look through the list of the top 100 downloaded add-ons. This list of course is dynamic, so will change according to when you view it. So where I have included a number, it is the position in the top 100 when I looked at it. Do not worry, I don't talk about 100 add-ons; a lot of the top 100 add-ons are themes and dictionaries, which I have ignored.
The top three
Video DownloadHelper (1) - This allows people to rip videos out of sites like YouTube, as do UnPlug (37) and a million others. This could easily be replicated by Epiphany, but maybe a better approach would be a "save-as" button in Gnash? Likewise, Flashblock would not be required if Gnash had an option for "only play when the user agrees to".
Adblock Plus (2) provides advert blocking, as does Adblock and Adblock Filterset.G Updater (38). In the old Epiphany, there already was a decent adblock. This can and no doubt will be easily replicated by an Epiphany extension.
NoScript (3) provides blocking and white-listing of JavaScript. This could be easily replicated by an Epiphany extension. Epiphany already gives you the ability to turn JavaScript on and off globally; the extension just needs to give the ability to control this behaviour per site.
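As a rough illustration of the per-site control described above, the core decision is little more than looking up the page's host in a whitelist. This sketch deliberately uses no Epiphany API, since none exists yet; the function name javascript_allowed and the choice to let subdomains inherit a whitelisted site are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def javascript_allowed(url, whitelist, globally_enabled=False):
    """Decide whether scripts may run on the page at `url` (hypothetical helper)."""
    if globally_enabled:
        # The existing global switch wins: scripts are allowed everywhere.
        return True
    host = urlsplit(url).hostname or ''
    # Allow an exact match or any subdomain of a whitelisted site.
    return any(host == site or host.endswith('.' + site) for site in whitelist)
```

A real extension would also need somewhere to store the whitelist and a way to edit it, which is where the browser's extension API would come in.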
Not all extensions are priorities
IE Tab (7) allows Windows users of Firefox to open non-standard webpages in IE. This is not available on Firefox for Linux so is irrelevant. People should not write IE only webpages.
Next we have the replacements for Firefox's rubbish download dialog: DownThemAll (4), Download Statusbar (6), PDF Download (10), Fast Video Download (15), ScrapBook (28). Hopefully Epiphany's download dialog will be good enough out of the gate. So these are not a priority.
Foxmarks (9) and Speed Dial (29) are replacements for Firefox's annoying bookmarks dialog. Epiphany's bookmark manager is better, so these extensions are not a high priority.
Greasemonkey (5) is a higher level extension tool. It basically makes it easier to write extensions for Firefox, especially per-site extensions. If Epiphany's extensions are easy to write, this will not be needed.
The Fasterfox (17) extension allows you to prefetch pages, as well as make concurrent connections, i.e. download the same page ten times at the same time. I am undecided whether this extension is a good idea for the web. I wouldn't want people using it on my sites.
A web browser is not a desktop environment or package manager
Quite a few of the extensions use Firefox as a convenient way to make and distribute an application, which is not surprising as Windows does not have a package manager. These extensions may have no connection, or only a tangential one, to the fact that Firefox is a web browser. On Linux, many of these would work just as well or better as a separate application; indeed, many equivalent applications already exist and are probably better.
FireFTP (18) is an FTP client; GNOME has GFTP, which is perfectly fine. FoxyTunes (27) is a media player frontend; Linux has billions of media players. Forecastfox (12) tells you the weather; the GNOME desktop already tells you the weather, and we can even look out of a window. Likewise, FoxClocks (30) tells you the time, which the GNOME desktop does by default. After 40, we have RSS readers such as the "Feed Sidebar" and "Sage", as well as the IRC client ChatZilla. GNOME has lots of RSS readers, e.g. Straw and Liferea, and Linux has lots of IRC clients. The best way to use IRC is to use a client that can run 24/7 on the server, such as Irssi.
ScribeFire is a Firefox extension that provides a text editor for blogging. There is GNOME-blog available through all the package managers, but I prefer to use a real text editor. FoxSaver is an extension to provide a screensaver and photoviewer, GNOME has the Eye of GNOME image viewer and its own screensaver. ReminderFox (35) provides reminders, as GNOME already does.
PicLens (8) provides desktop effects for Firefox on Windows. It is not available for Linux, but Compiz with Epiphany does a better job. The same applies to "Tab Effect" (21) and FireGestures (24).
The Firebug (13) extension is a fantastic toolkit for web designers that turns your browser into a complete Dreamweaver clone. This would perhaps be better as a WebKit-based application; the same goes for "Web Developer" (20).
"Better Gmail 2" (14) provides extra options for Gmail, turning Gmail into a rich desktop application. The whole point of web-based email is that you can access it from any computer anywhere without special software. If you want to use installed software, then GNOME has Evolution, which is richer than any web application.
I also skimmed through the 100 to 200 most popular add-ons, and it was more of the same. I hate to be a snob, but it seems that the most downloaded extensions are not necessarily the best ones!
Conclusion
There are many hundreds of Firefox extensions. Some of them are absolutely fantastic; however, many are repetitive, and many replicate things that already exist on a GNOME-based system by default or are quickly available in the package manager. A large number of the extensions are old and have not been ported to modern versions, and some of them are just bad ideas.
This survey has convinced me that it is quality not quantity that matters, that with just 20 well chosen extensions, Epiphany could offer the features that 80% of GNOME users want, with 50 well chosen extensions, it could offer the features that 95% of GNOME users want. I am talking about extensions that actually have something to do with web browsing, not turning Firefox into a jukebox, or into a calendar, into a Compiz replacement, or into an operating system of its own.
ssh.py provides three common SSH operations, get, put and execute. It is a high-level abstraction upon Paramiko.
I wrote it yesterday for my own needs, so it is still very much in the beta stage. Any improvements or comments gratefully accepted.
In short, it works as follows:
import ssh
s = ssh.Connection('example.com')
s.put('hello.txt')
s.get('goodbye.txt')
s.execute('du -h --max-depth=0')
s.close()
That is it. In the rest of this post, I walk through this line by line.
Installation
First, we need to install paramiko, if you don't have it already.
On Gentoo Linux:
emerge paramiko
On Ubuntu/Debian and so on:
apt-get install python-paramiko
If you want to use Python's easy_install then:
easy_install paramiko
Secondly, you need the ssh.py module: grab it from my code-page and save it as ssh.py.
Connecting to a remote server
To play with the script interactively, you need to start Python:
python
Now, import the ssh module:
import ssh
Next we need to initiate the connection. If your username is the same on both systems, and you have set up ssh-keys, then all you need to do is:
s = ssh.Connection('example.com')
Connection supports the following options:
host         The hostname of the remote machine.
username     Your username at the remote machine.
private_key  Your private key file.
password     Your password at the remote machine.
port         The SSH port of the remote machine.
The host is essential of course. Port defaults to 22. The username defaults to the username you are currently using on the local machine.
You need to use one of the authentication methods, a private key or a password. If you don't specify anything, then ssh.Connection will attempt to use a private_key at ~/.ssh/id_rsa or ~/.ssh/id_dsa.
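The defaults described above could be resolved along the following lines. This is only a sketch and not the actual ssh.py source: the function name resolve_defaults and the dictionary it returns are assumptions made for illustration.

```python
import getpass
import os


def resolve_defaults(host, username=None, private_key=None,
                     password=None, port=22):
    """Fill in the defaults described in the text (hypothetical helper)."""
    if username is None:
        # Fall back to the name of the user running the script.
        username = getpass.getuser()
    if password is None and private_key is None:
        # No credentials given: look for a key in the usual places.
        for candidate in ('~/.ssh/id_rsa', '~/.ssh/id_dsa'):
            path = os.path.expanduser(candidate)
            if os.path.exists(path):
                private_key = path
                break
    return {'host': host, 'username': username, 'private_key': private_key,
            'password': password, 'port': port}
```

Calling resolve_defaults('example.com') on a machine with a key in ~/.ssh would return your local username, port 22 and the path of the first key found.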
So to specify a username and password, you can do it like this:
s = ssh.Connection(host = 'example.com', username = 'warrior', password = 'lennalenna')
Of course, Python also allows you to pass arguments by position, so the last example can be written as:
s = ssh.Connection('example.com', 'warrior', password = 'lennalenna')
Operations
Once you have set up the connection, there are three methods you can use. Firstly, to send a file from the local machine, you can use put:
s.put('hello.txt')
The above example copies a file called hello.txt from the current local working directory to the remote server. We can also be more explicit if we want:
s.put('/home/warrior/hello.txt', '/home/zombie/textfiles/report.txt')
So the above example copies /home/warrior/hello.txt on the local machine to /home/zombie/textfiles/report.txt on the remote server.
The second operation works in a similar way but in reverse:
s.get('hello.txt')
get takes the file from the remote server to the local machine. Again, we can be more explicit if we want:
s.get('/var/log/strange.log', '/home/warrior/serverlog.txt')
The above example copies the strange.log from the server and saves it as serverlog.txt.
The last operation is execute, which executes a command on the remote server:
s.execute('ls -l')
This returns the output as a Python list.
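How the list is built is not shown here. One plausible approach, and this is an assumption rather than the actual ssh.py code, is to read the command's standard output line by line. Paramiko exposes a remote command's output as a file-like object (for example, the stdout returned by SSHClient.exec_command), so the helper below accepts any file-like object. The name output_lines is made up for this sketch, and whether the real execute strips the trailing newlines is not something I can confirm.

```python
import io


def output_lines(stream):
    """Return the lines of a file-like object as a list, newlines removed."""
    return [line.rstrip('\n') for line in stream]


# Stand-in for the stdout of a remote 'ls -l'; a real stream would come from Paramiko.
fake_stdout = io.StringIO('total 4\n-rw-r--r-- 1 warrior users 12 hello.txt\n')
listing = output_lines(fake_stdout)
```

Getting a list back means the result can be looped over or filtered directly, for example to pick out the lines that mention a particular file.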
Closing the connection
You can do as many operations as you like while the connection is open, but when you are finished, you need to close the connection between the local and remote machines. You do this with the close method:
s.close()
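If an operation raises an exception, the close call can be skipped by accident. Assuming only that the connection object has a close method, as described above, the standard library's contextlib.closing will call it for you. The DemoConnection class below is a stand-in so that the sketch runs without a server; with the real module you would pass ssh.Connection('example.com') to closing instead.

```python
from contextlib import closing


class DemoConnection:
    """Stand-in for ssh.Connection; it only records whether close() ran."""

    def __init__(self):
        self.closed = False

    def execute(self, command):
        # Pretend the command failed on the remote side.
        raise RuntimeError('command failed: ' + command)

    def close(self):
        self.closed = True


connection = DemoConnection()
try:
    with closing(connection) as s:
        s.execute('ls -l')
except RuntimeError:
    pass  # close() has already been called by this point
```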
There we go, that is all I needed to do with SSH. Please do let me know using the comments below if you have any problems using it.
If you import my module in your program and later find that you need more power or flexibility, you should be able to swap it out for the full paramiko with a minimum of fuss.
In your scripts or applications, you might need to copy a file from one server to another. One way to do this is to use SFTP, the secure file transfer program, which uses an encrypted SSH (Secure Shell) transport, which in turn runs over TCP/IP.
One of the Python implementations of SSH is called Paramiko (available in package managers as paramiko or python-paramiko).
Paramiko is extremely comprehensive so you can get as complicated as you like, but for me, I just want to be able to copy files from a known remotepath to a known localpath and back again.
In this post I explain how to do this using Paramiko directly; in the next post, I look at another approach.
So we start by importing the module, and specifying the log file:
import paramiko
paramiko.util.log_to_file('/tmp/paramiko.log')
We open an SSH transport:
host = "example.com"
port = 22
transport = paramiko.Transport((host, port))
Next we want to authenticate. We can do this with a password:
password = "example101"
username = "warrior"
transport.connect(username = username, password = password)
Another way is to use an SSH key:
import os
privatekeyfile = os.path.expanduser('~/.ssh/id_rsa')
mykey = paramiko.RSAKey.from_private_key_file(privatekeyfile)
username = 'warrior'
transport.connect(username = username, pkey = mykey)
Now we can start the SFTP client:
sftp = paramiko.SFTPClient.from_transport(transport)
Now let's pull a file across from the remote to the local system:
filepath = '/home/zeth/lenna.jpg'
localpath = '/home/zeth/lenna.jpg'
sftp.get(filepath, localpath)
Now let's go the other way:
filepath = '/home/zeth/lenna.jpg'
localpath = '/home/zeth/lenna.jpg'
# put takes the local path first, then the remote path
sftp.put(localpath, filepath)
Lastly, we need to close the SFTP connection and the transport:
sftp.close()
transport.close()
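Put together, the steps above might be wrapped in a single function like the one below. It is a sketch under a few assumptions: the name fetch_file and the two optional factory arguments are hypothetical, added only so that the function can be exercised with stand-in objects instead of a real server. When they are left out, the function uses paramiko.Transport and paramiko.SFTPClient.from_transport exactly as in the walkthrough, with try/finally making sure both connections are closed even if the transfer fails.

```python
def fetch_file(host, port, username, password, remotepath, localpath,
               transport_cls=None, sftp_from_transport=None):
    """Copy one remote file to the local machine over SFTP (hypothetical helper)."""
    if transport_cls is None or sftp_from_transport is None:
        # Imported lazily so the sketch can be tried out with stand-in classes.
        import paramiko
        transport_cls = transport_cls or paramiko.Transport
        sftp_from_transport = (sftp_from_transport
                               or paramiko.SFTPClient.from_transport)
    transport = transport_cls((host, port))
    try:
        transport.connect(username=username, password=password)
        sftp = sftp_from_transport(transport)
        try:
            sftp.get(remotepath, localpath)
        finally:
            sftp.close()
    finally:
        # Runs even if authentication or the transfer raised an exception.
        transport.close()
```

With Paramiko installed, fetch_file('example.com', 22, 'warrior', 'example101', '/home/zeth/lenna.jpg', '/home/zeth/lenna.jpg') would perform the same download as the step-by-step version.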
In my humble opinion, one should not have to write so many lines or care about the SSH protocol just to send a file from a to b. In the next-post, I will share my own higher level API that runs on top of Paramiko.
I have a friend and fellow member of the Python West Midlands group. Whenever someone mentions Django, he asks the person "but is it stable?". This has been repeated so much that it has become a local in-joke. However, let's take the question seriously.
To explore this further, we need to ask what "stable" means. That is, can we replace the word "stable" with something else to provide some more meaningful questions:
Let's take these one at a time.
Traffic loads
Django's frequently-asked-questions says:
Is Django stable?
Yes. World Online has been using Django for more than three years. Sites built on Django have weathered traffic spikes of over one million hits an hour and a number of Slashdottings. Yes, it's quite stable.
The first sentence is a testimony, useful but not a direct answer. In the second sentence, 'stable' is used as in 'strong table', i.e. Django can handle a heavy load (i.e. traffic rather than physical objects).
It goes on to explain that Django has a "shared-nothing" approach, i.e. you can throw more servers directly at whatever bottleneck you have. If the database is the bottleneck, then you can add more hardware to the databases; if it is images that are the problem, you can add more hardware to the media servers, and so on.
Is Django maintained?
The next question is whether Django is actively maintained. One simplistic measure is to look at the bug database and see what is going on. In what follows I use "ticket" in the broadest sense, i.e. not just a confirmed code error, but also enhancement requests, invalid bugs and so on.
At time of writing, Django has 1092 open tickets, of which 311 are new and unreviewed. I would guess that half of these are valid problems, and half are not.
This means the other 781 open tickets have been reviewed by someone at least once. Some have been triaged and are waiting to be worked on, some are already being worked on, and some will have been closed wrongly and reopened.
In the last three years, 6047 Django tickets have been closed. We can break these down further:
Number  Closed as
3348    "fixed"
867     "invalid"
809     "duplicates"
797     "wontfix"
226     "worksforme"
A ticket being fixed is very useful, but a ticket being found to be not a problem or not Django's problem is still useful information, marginally useful perhaps, but still useful if you have that issue.
So Django is three years old, which is pretty new. In that time, they have closed about 85% of tickets, while 10.9% of tickets are open but have been read, and 4.4% are open and have never been read.
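These shares can be recomputed from the counts quoted above. The following is a short check that uses only those counts; the variable names and the share function are just for this sketch.

```python
closed = 6047          # tickets closed in the last three years
open_total = 1092      # tickets currently open
open_unreviewed = 311  # open tickets that are new and unreviewed

open_reviewed = open_total - open_unreviewed
all_tickets = closed + open_total


def share(part, whole):
    """Return `part` as a percentage of `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


print(all_tickets, share(closed, all_tickets),
      share(open_reviewed, all_tickets), share(open_unreviewed, all_tickets))
```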
This seems competitive with similar open-source web frameworks:
Framework   Years  Tickets  % closed
Django      3      7139     82
Pylons      2      444      90
Symphony    3      3612     82
Turbogears  3      1840     85
Zope2       7      2352     81
Zope3       6      886      76
The 'years' column represents how far the ticket tracker data goes back for, not necessarily the age of the project. There may also be sampling errors caused by differences in the ticket tracker software, or a project might have had a clear-out of closed tickets.
On the whole, it is all impressive stuff; all six open-source projects seem to be on top of things. One nice thing about Pylons is that, at the time of writing, it appears that all of the open tickets have been reviewed to some extent. Some other projects such as Django would benefit from a few more triagers to review new tickets.
API
This is perhaps what my friend meant, i.e. if one makes a web application using Django in 2008, how much pain will it be for it to still work in 2013?
There is of course a balance to be struck between backwards compatibility on the one hand, and keeping the framework modern enough to meet the needs of today's web applications on the other. It is a difficult balance.