| Network analysis in cell biology: a new tool in bioinformatics |
|---|
yet to be written
The examples and assignments in this booklet are all given in a piece of software called “the igraph library”. igraph is a software library: a collection of routines, procedures, methods doing network analysis. In fact igraph is most used together with the GNU R software environment. GNU R was originally designed for statistical analysis, but it is also a flexible and modern programming language and has several extension packages. igraph is such an extension package.
R and igraph were designed for real world projects and research, so it was only a secondary goal to have a user-friendly system with a not-too-steep learning curve. Ie. igraph is very easy to use for beginners, it requires to type in commands and often to write short R programs. As this booklet is for beginners, we don't want to do it this way. Instead a short program is developed specifically for our needs and it will serve as a (more) handy interface to igraph. This interface is called “tkigraph” and contains only a (very small) subset of what igraph and R can really do. Anyone seriously interested in doing network analysis should consider learning R and (the real) igraph package.
tkigraph was designed to be easy to start playing with. It can be started by a simple R command line, and uses only the standard R functions included in every R distribution. It is portable and should work equally well on Microsoft Windows, Apple OS X and Linux systems. To start tkigraph start R itself first and type this in into the R command line, usually denoted by the “>” character:
> source("http://geza.kzoo.edu/bionet.R")
(Do not type the “>”.) If the igraph package is not installed on your system, tkigraph downloads and installs it to a temporary directory and this might take some time every time to start tkigraph.
If everything went on smoothly a new window appears on the screen. It looks like this:

(Note that there might be some slight differences depending on your operating system and R version. This particular window is from a Linux based system.)
Most space of the tkigraph window is occupied by the graph list area, this shows the networks currently loaded into igraph. There are various ways to add a new network to igraph, we will see most of them shortly. The top of the window is occupied by the menu bar, the various network operations can be accessed through its menu entries. Some entries in the menu bar work like command buttons, eg. “Delete” and “Quit” but most of them will “roll down” and show several options to choose from. Let us create a simple graph to show how it works. Choose the “Create” menu entry and then “Ring” from the list. This options creates a “ring network” with the specified number of vertices. Give the number of vertices in the appearing dialog box and click on the “OK” button. The ring graph is created and added to the list of graphs. Your igraph window looks like this now:

For each network in the graph list the following information is displayed: in the column marked with “#” the “id” of the network is given, some igraph functions will use this id to refer to graph. Then the name of the graph is shown, for our ring graph this was set to “Regular ring”, you can edit it if you don't like the name or you have several ring graphs and want to distinguish between them. The column “|V|” is the number of vertices in the graph and “|E|” is the number of edges. The “Dir.” column shows whether the graph is directed or not, our ring is undirected. At the begining of each row right before the graph id, there is a checkbox, which can be checked/unchecked by clicking on it once. The networks with their checkbox checked are called selected graphs. Most network operations work on the selected graphs, some of them allow only one graph to be selected at a time, others work with many selected graphs as well.
If you don't need a network in the graph list any more, you can delete it with the “Delete” command button in the menu bar. Of course only the selected network(s) will be deleted. Deleting a graph keepss the graph ids of the remaining graphs. So if you have three graphs with ids 1, 2, and 3 and delete graph #2 then the remaining two graphs will still have ids #1 and #3.
tkigraph can perform operations on the networks in the graph list. The graphs in the graph list can be saved into a file and then load it later. This way you don't have to start with creating the graphs again if the some of them are needed again. For saving the graphs you choose “Save” from the “File” menu. This always saves all graphs, not just the selected ones. For the opposite operation click on “Load” in the “File” menu. Saving and loading a set of graphs preserves the order of the graphs in the graph list but not neccessarily their graph ids.
Sometimes it is needed to create a small graph with the desired edges and vertices. For this choose “By hand” from the “Create” menu and a very simple spreadsheet-like interface is provided for typing in the graph. Be sure to deselect all graphs before choosing “By hand”. You can use numeric vertex ids or symbolic vertex names to give the edges of the graph, one edge in each line. If you want to type in a friendship network you can do something like this:

Press “Quit” to create the network but first you need to select whether you want to create a directed network. Now you can plot the new network by choosing “Non-interactive” from the “Draw” menu. See section “Plotting graphs” for more about this.
An already added graph can also edited “by hand” the same way as you've added the last graph. For this select only the graph you want to edit and choose “By hand” from the “Create” menu.
A ring graph is a very simple regular structure: each vertex is connected to two other vertices in a closed circle. For small graphs with not too many edges it is often very useful to inspect them visually. Let us see how a ring looks like. Click on the checkbox of the ring in the graph list to make it selected and then choose “Non-interactive” from the “Draw” menu. A parameter windows appers where you can select various options for the visualization of the graph. The default values are fine for our current purposes so just press the “OK” button to see the graph:

To close a graphic window, click on the button in its upper right corner.
It is very important to select the place of the vertices for visualization, otherwise there isn't too much to see on graph. igraph has various so-called layout algorithms to calculate the vertex positions and every network in the network list has a default layout. By default the vertices of a ring are arranged on a circle. Other layout algorithms can be selected from the parameter menu of the “Non-interactive” draw option:

You may select the “Random” option, this places the vertices on a square randomly and does not have too much practical use. You will get a different layout every time you draw the graph, but it will always a mess like this:

Interactive graph plotting is also possible, the only difference is that you choose “Interactive” from the “Draw” menu. On an interactive plot the vertices can be moved by dragging the mouse and color, size and other parameters of the vertices and edges can be changed. This is the interactive plot our ring:

The interactive plot window also has a menu bar to set various parameters of the plot. The “Close” menu entry closes the plot, “Layout” lets you update the layout algorithm of the plot, “View” can be used to update the plot if you resized the plot window and “Export” can save the plot in Encapsulated PostScript (EPS) format. The “Select” menu is interesting, it lets you select a set of vertices or edges. To demonstrate this first create a smaller graph, say a star with 10 vertices. This will be network #2 in the graph list, assuming you didn't create other graphs since the first ring. Plot this small star interactively and choose “IDs” for “Labels” from the parameter window:

Note that the default size of the vertices is bigger for this small graph. Click on “OK” to see the plot:

Let us assume that we want to set the color of the center vertex to red, and the color of all other vertices to green. There are two ways to select a set of vertices:
You press the “Ctrl” key on the keyboard and while keeping it pressed click on the vertices to select one after the other.
First you deselect all vertices by choosing “Deselect everything” from the “Select” menu. Then you choose “Select some vertices” from the same menu and give the ids of the vertices to select separated by commas. You can also give intervals here, so to select all vertices except the center you can type “1:9”.
When all desired vertices are selected, right click on a selected vertex and choose “Vertex color”. Choose the desired color in the appearing dialog box. If you did everything right you see something like this:

It is a difficult problem to calculate the “right” coordinates of the vertices for drawing a graph and usually there is no really good solution if the graph is dense, if it contains many edges. For smaller and less dense networks the various algorithms implemented in igraph can give satisfying results. Let us give some quick tips about when to use which algorithm.
Layout algorithms
| Default | For every network there is a default layout, this is assigned when the network is created and depends on the number of vertices and other parameters. For example for rings the “circle” layout is the default, for trees a tree-layout called Reingold-Tilford is used. It is a good idea to try the default layout first. |
| The force-based Kamada-Kawai layout |
TODO This algorithms gives the best results for small connected graphs. It does not work well if the graph is not connected, in this case try the Fruchterman-Reingold layout. |
| The force-based Fruchterman-Reingold layout |
This layout is based on a physical model of the graph: first a unit of mass it added to each randomly placed vertex, gravity makes the vertices repel each other while the edges act as springs and pull adjacent vertices back together. The whole system is gradually cooled down, so vertices move less and less until they freeze. This layout works well for graphs up to 1000 vertices, and it is not a big problem if the graph is not connected. |
| The Reingold-Tilford layout for trees | This is a simple layout for trees or tree-like graphs, the vertices are drawn as branches from a common root. |
| The circle layout | The vertices are simply placed equidistantly on a circle. |
| The random layout | The vertices are placed randomly on a square, this layout does not really make any sense. |
| << Introduction | Protein-protein interaction networks >> |