
First Steps in Python

This is mainly chapters 1 to 5 of Hello World.

The first examples

Chapter 1 of Hello World explains Python shells, which are interactive environments for writing and running Python programs. Hello World uses IDLE, which is the simplest Python shell, but we recommend Spyder, which is better suited to scientific work. The choice of Python shell makes no difference to the program code.

The number-guessing game nicely illustrates the character of the language. Already in the first two lines

import random
secret = random.randint(1, 100)

we have a function that will be useful later. This function cannot be called directly, since it is not a built-in function. Instead, it is part of the random library, which needs to be imported before the function can be called. Almost all functions in the random library depend on the basic function random.random(), which generates a random number drawn from a uniform distribution on the half-open interval [0, 1): the value 0 can occur, but 1 cannot.
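As a small illustration of the two calls mentioned above, the following sketch draws a few values from each. The variable names are invented for this example, and the call to random.seed is optional; it is included only so that repeated runs give the same numbers. The print calls are written with parentheses so that the script runs under either Python standard.

```python
import random

random.seed(1)  # optional: makes the run repeatable

# randint includes both end points, so 1 and 100 are possible values.
secrets = [random.randint(1, 100) for i in range(5)]

# random.random() returns a float u with 0.0 <= u < 1.0.
uniforms = [random.random() for i in range(5)]

print(secrets)
print(uniforms)
```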

Try to think up a good strategy for guessing the secret number. Will it always lead to the answer within six tries?

Variables

Chapter 2 of Hello World introduces the concept of variables. The analogy with pictures of labels and rings is a good one, and worth thinking over a little before going on.

If x and y are numbers, the fragment

x = x + y
y = x - y
x = x - y

would be nonsense in ordinary mathematics, but it is correct Python. Can you work out what it does?

Arithmetic

Chapter 3 of Hello World gets us started on interesting operations.

Depending on your version of Python, 3/2 may give you 1.5 (as in Python 3) or it may discard the remainder and give the integer 1 (as in Python 2). The latter is the old Python standard, and is deprecated. If your installation has the old standard, you can change to the new standard by putting

from __future__ import division

at the top of each program. If you want integer division, 3//2 will provide it.
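If you are unsure which standard your installation follows, the expressions in the sketch below should give the same results under both. The variable names are chosen only for this illustration. Note that // rounds down, which matters for negative numbers.

```python
# These expressions give the same result under the old and new standards.
true_quotient = 3 / 2.0       # a float operand forces true division
floor_quotient = 3 // 2       # integer (floor) division
negative_floor = -3 // 2      # rounds towards minus infinity, not towards zero
quotient, remainder = divmod(17, 5)   # quotient and remainder in one step

print(true_quotient)            # 1.5
print(floor_quotient)           # 1
print(negative_floor)           # -2
print((quotient, remainder))    # (3, 2)
```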

Data types

Chapter 4 of Hello World introduces the notion of a data type, and explains about int, float and str.

In addition to using str(x) to convert a number into a string, there is another method, known as formatting numbers. Hello World covers formatting numbers later, in Chapter 21, but we can see it now through some examples. Try the following.

x = 22/3
x
str(x)
'a number: %i' % x
'another number: %f' % x
'and yet another: %e' % x

As you can see, % acts as an operator that inserts a number into a string. You can specify the (minimum) number of characters for writing an integer:

'%3i' % x

If the number has fewer digits than the specified width, it will be padded with spaces on the left. Alternatively, you can choose padding with zeros.

'%03i' % x

In similar fashion, you can specify the number of digits to the right of the decimal point of a float:

'%25.20f' % x
'%29.20f' % x

The two numbers in the format are the total number of characters (including the decimal point and a minus sign, if any), and the number of digits after the decimal point. Printing more digits does not imply more accuracy! (Floats are good to about 16 significant decimal digits at most.)
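The format codes discussed above can be collected into one short script, with the resulting strings noted in the comments. This is only a sketch: the variable names are made up, and x is written as 22 / 3.0 so that it is a float whichever division standard is in use.

```python
x = 22 / 3.0   # written with 3.0 so that x is a float in any Python version

as_int = '%i' % x            # '7' (the fractional part is dropped)
padded = '%3i' % x           # '  7'
zero_padded = '%03i' % x     # '007'
three_places = '%8.3f' % x   # '   7.333'
scientific = '%e' % x        # '7.333333e+00'
wide = '%25.20f' % x         # 25 characters, but only about 16 digits are meaningful

# repr() shows the quotes, so the padding spaces are visible.
for text in (as_int, padded, zero_padded, three_places, scientific, wide):
    print(repr(text))
```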

Here's another example of a strange-looking but perfectly valid Python statement.

x = 2 > 3

Using the interpreter, find the value and type of x. What other values could this type of variable have?

It is possible to change the case of letters as follows:

string1='Hello'
string2=string1.upper()
print string2
string3=string2.lower()
print string3

Input

Chapter 5 of Hello World introduces interactive input, as well as the urllib library for reading web pages directly.

By the way, reading a disk file into Python works like reading a web page, but simpler: we just use open instead of urllib.urlopen and no import is needed. We will come back to file input later.
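As a preview, here is a minimal sketch of reading a disk file with open. So that it can be run as it stands, it first writes a small file into a temporary folder; the file name example.txt and its contents are invented for this illustration, and the tempfile and os imports are needed only for that setup, not for open itself.

```python
import os
import tempfile

# Create a small file so that the example is self-contained.
folder = tempfile.mkdtemp()
filename = os.path.join(folder, 'example.txt')   # hypothetical file name
outfile = open(filename, 'w')
outfile.write('first line\nsecond line\n')
outfile.close()

# Reading: open the file, read the whole contents, then close it.
infile = open(filename)
text = infile.read()
infile.close()

lines = text.splitlines()
print(lines)   # ['first line', 'second line']

# Tidy up the temporary file and folder.
os.remove(filename)
os.rmdir(folder)
```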

Fetching genomes

Input from the web is not limited to simple pages. Here is an example to fetch a protein sequence from UniProt and save it in a text file.

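from Bio import ExPASy, SeqIO
sid = raw_input('Sequence id? ')
try:
    # Download the raw Swiss-Prot record for this id
    handle = ExPASy.get_sprot_raw(sid)
    # Parse the single record in the download
    seq = SeqIO.read(handle,'swiss')
    # Save it in GenBank format, in a file named after the id
    SeqIO.write(seq, sid+'.genbank','genbank')
    print 'Sequence length',len(seq)
except Exception:
    # Any failure (unknown id, no network connection, and so on) ends up here
    print 'Sequence not found'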

By replacing genbank with fasta (in both the file name and the format argument) you can change the output to FASTA format. The sequence data are exactly the same, of course, but the FASTA format lacks some of the metadata that the GenBank format has, such as the name of the person who sequenced the gene.

This example also illustrates Python's try...except construction, which is useful for handling errors.
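To see try...except on its own, without Biopython or a network connection, consider the following sketch. It tries to convert some strings to floats and reports the ones that fail. The list of strings is invented for this illustration. Unlike the example above, it catches only ValueError, the specific error that float raises for unsuitable text; catching a specific error is usually the safer habit, since other, unexpected problems are then not silently hidden.

```python
texts = ['3.5', 'abc', '42']   # made-up input strings
numbers = []

for text in texts:
    try:
        # float() raises ValueError if the text is not a number
        numbers.append(float(text))
    except ValueError:
        print('Not a number: %s' % text)

print(numbers)   # [3.5, 42.0]
```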

The sequence id F8RBX8 stands for the protein sequence of an important gene in a well-known organism. Fetch the data and save it as a FASTA file. Open the file using a text editor and read off the gene name and organism.