One of the strengths of LaTeX is that it handles special symbols quite well and is well suited to mathematical notation in general. However, sometimes you need to add a special symbol to the main body of your document, and it doesn’t make sense to write the whole passage in math mode. Two examples of this are when you want to include a less-than sign or a Greek letter in your text. Firstly, you need to know the macro for the symbol you want. In the case of the less-than sign, it is simply <. The macro for the alpha character is \alpha. To insert one of these symbols in running text, simply surround it with $ signs, which switches to math mode for just that symbol. For example:
The t-test results were highly significant (p$<$0.0001), indicating that the ER$\alpha$ gene was responsible for tumor growth.
May 5th, 2010 | Posted in LaTeX | No Comments
Let’s say you are creating a barplot in R. Sometimes you want to label the bars below the X axis. But sometimes you might also want to put a label above the bars. This is pretty easy to do but a little hidden. Below is a working example based on randomly generated data that illustrates how to do this. You can play around with the configuration parameters as much as you want, but this gives you the general idea.
x = rbind(abs(rnorm(5)))
barX <- barplot(x, ylim=c(0, max(x)+0.5), names.arg=c(1,2,3,4,5))
text(x=barX, y=x+0.07, label=c("a", "b", "c", "d", "e"))
April 27th, 2010 | Posted in R | No Comments
I finally made the leap to doing most of my research on computers that run on Linux. The distribution I’m using is Ubuntu. So far I’ve found it quite good. One thing I do frequently is open a Terminal (command) window in a specific directory. However, when you open the Terminal, it by default opens in your home directory. You can create a shortcut to open it in a specific directory pretty easily:
gnome-terminal --working-directory=/home/user/my/custom/path
I add a little shortcut on the top panel by right-clicking and saying Add to Panel… and then creating a Custom Application Launcher and then pasting a command such as the above in the command box. I’m using Ubuntu 9.10, so it could be different in other versions.
April 24th, 2010 | Posted in Linux | 2 Comments
I was trying to install some packages in R recently that depended on the XML package. I used the nifty install.packages() command, but I was getting an error:
“cannot find xml2-config”
This error was preventing these other packages from being installed properly. It turns out that I needed to install a library called libxml2-dev. In my case, I am running Ubuntu Linux 9.10, so all I had to do was go to System -> Administration -> Synaptic Package Manager and install libxml2-dev.
April 22nd, 2010 | Posted in Linux, R | 6 Comments
I’m using the R statistical package and the ROCR package within that. Both of these are free and very flexible. Sometimes with that flexibility comes ambiguity as to what you should do to accomplish a relatively simple task. I am doing a data-mining (machine-learning) project in which I predict a cancer patient’s prognosis. The algorithms I use can give a probability of what either outcome (let’s say “good” or “bad”) will be. I can use those probabilities to create an ROC curve. If you don’t know what this is, Wikipedia has a nice tutorial on it.
Below is the code I used to create this.
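library(ROCR)
rocCurve = function(probabilityPredictions, actualClasses)
{
  # Pair each predicted probability with its actual class
  pred = prediction(probabilityPredictions, actualClasses)
  # True positive rate (y axis) against false positive rate (x axis)
  perf = performance(pred, 'tpr', 'fpr')
  plot(perf, lwd=2)
  # Dotted diagonal: what random guessing would look like
  abline(0, 1, lwd=2, lty=3)
}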
April 9th, 2010 | Posted in R, Statistics | No Comments
In the R statistical package, you have various ways of representing and packaging data. The simplest is a single variable. More complex representations include vectors, lists, matrices, and data frames. An even more complex representation is S4 classes, which are intended to support object-oriented programming in R.
I was using an R package that returned an S4 class, but I wasn’t sure how to access the properties of the class. A $ sign didn’t work, like it does for lists and data frames. After some searching, I found out that the @ sign is used for this.
More specifically, I am using the ROCR package and wanted to get the area under the curve. Below is how I did that for some randomly generated data.
require(ROCR, quietly=T)
auc = function(numericPredictions, actualClasses)
{
pred = prediction(numericPredictions, actualClasses)
perf = performance(pred, 'auc')
return(perf@y.values[[1]][1])
}
actualClasses = c(rep("a", 500), rep("b", 500))
numericPredictions = rbinom(length(actualClasses), 1, 0.5)
print(auc(numericPredictions, actualClasses))
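If you are curious what the 'auc' number represents, here is a minimal sketch in Python 3. It is not how ROCR computes the value; it just spells out one common interpretation of the area under the ROC curve: the chance that a randomly chosen example of the positive class gets a higher score than a randomly chosen example of the other class, with ties counted as half. The function name, parameter names, and sample data are made up for illustration.

```python
def auc(scores, labels, positive):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties count as half a win. This compares every pair, so it is only
    meant for small inputs.
    """
    pos = [s for s, label in zip(scores, labels) if label == positive]
    neg = [s for s, label in zip(scores, labels) if label != positive]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical scores: class "b" is treated as the positive class.
    labels = ["a", "a", "b", "b"]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(auc(scores, labels, "b"))
```

Predictions that carry no information about the class, like the randomly generated ones above, should land near 0.5 under this interpretation.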
April 9th, 2010 | Posted in R, Statistics | 1 Comment
When you’re creating a document with LaTeX, you sometimes want to restart the figure numbering. In my case, it’s because I have an appendix (supplementary methods) section in a paper I’m writing. I want to be able to reference the figures in this section starting at 1 rather than continuing from the number of the last figure I added. Thanks to a bit of googling, I came across the following solution.
Right after the appendix section declaration, add the following lines (supports tables also):
\setcounter{figure}{0}
\setcounter{table}{0}
Then you just reference the figures as you normally would, but you of course would want to describe them adequately. For example, I reference my supplementary figures like this:
Please refer to Supplementary Figure \ref{fig:myfigureid}.
This seemed to work really well for me. There might be some intricacies that I’m missing, but hopefully it will get you on the right path. Please let me know if you know of any other important considerations in doing this.
April 6th, 2010 | Posted in LaTeX | No Comments
In this post, I referred to an article that explains how to invoke methods dynamically (when you have the name of the method as a string object). Today I ran into a problem where I needed to do this, but the methods were not contained within a class. I just had them declared in my Python module at the root level. I believe these are called “functions” in Python.
def function1(parameter1):
    print(parameter1)
To invoke this dynamically, I would need to do the following:
import sys
result = getattr(sys.modules[__name__], "function1")("abc")
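Here sys.modules is a dictionary that maps the name of every module that has been loaded to its module object, and __name__ is the name of the current module. So sys.modules[__name__] is the current module itself, and getattr looks up the function on it by name.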
Note: Thanks to this page for help on this problem.
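Here is a minimal Python 3 sketch of two related ways to do the same kind of lookup, in case they are useful. The first uses globals(), which returns the current module's namespace as a dictionary. The second uses an explicit dictionary, which restricts the names that can be called to the ones you list. The names call_by_name and DISPATCH are made up for this example, and function1 returns its text here instead of printing it so the result is easy to check.

```python
def function1(parameter1):
    return "function1 got " + parameter1


def call_by_name(function_name, *args):
    """Look up a module-level function by its name and call it."""
    func = globals().get(function_name)
    if not callable(func):
        raise ValueError("no function named %r in this module" % function_name)
    return func(*args)


# An explicit table of the functions that may be called by name.
DISPATCH = {
    "function1": function1,
}


if __name__ == "__main__":
    print(call_by_name("function1", "abc"))
    print(DISPATCH["function1"]("abc"))
```

The explicit dictionary is a little more typing, but it is probably the safer choice when the name comes from user input, because only the listed functions can be reached.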
March 25th, 2010 | Posted in Python, Reflection | 2 Comments
I’m working on a paper using the LaTeX typesetting system, and I’m loving it. However, I ran into a little challenge. I had a wide image that I wanted to put on its own page. By default LaTeX tries to be intelligent (and usually does a pretty good job of it) at placing graphics, using the floating approach. But in my document, it kept floating my graphic to a place I didn’t want it. Sometimes you just want to tell it exactly where to put the image and not allow it to float.
After a little searching, here’s how I went about it. Firstly, I made sure I had the following packages imported. If you haven’t downloaded them already, you’ll need to do that. In my case, I had a big enough hard drive that I had downloaded all packages when I installed LaTeX.
\usepackage{graphicx}
\usepackage{rotating}
\usepackage{float}
\usepackage{rotfloat}
Then in my document, I added a sideways figure like this:
\begin{sidewaysfigure}[H]
\includegraphics[width=8 in]{Figure.pdf}
\end{sidewaysfigure}
You can do something similar with a regular–not sideways–figure and with tables to keep things from floating.
\begin{figure}[H]
\includegraphics[width=8 in]{Figure.pdf}
\end{figure}
I won’t go into detail about how this all works. But you can find documentation for the different packages on the Web or in your LaTeX installation. Thanks to the developers who have created these helper packages.
March 3rd, 2010 | Posted in LaTeX | No Comments
If you want to compile Java files at the command line, you use the javac command.
This will create a .class file for every .java file that could be compiled.
But let’s say you are referencing one of many external Java libraries that are packaged as .jar files. You probably don’t want to just stick the .jar files in the same directory as the .java files. That would be sort of messy. In my case, I’m putting those in a sub-directory called lib. But the Java compiler doesn’t always know where to look for those .jar files. Below is one simple way to tell it.
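javac -Djava.ext.dirs=../Java/lib ../Java/*.java
(On Java 9 and later, where the extension mechanism was removed, use a classpath wildcard instead: javac -cp "../Java/lib/*" ../Java/*.java)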
February 5th, 2010 | Posted in Java | 1 Comment
Sometimes you just need some extra vertical space in a LaTeX document. There are a couple of simple ways to approach this. The first is the vspace command:
\vspace{5 mm}
This approach gives you a lot of flexibility because you can specify exactly how large you want the vertical space to be.
However, sometimes you simply want a blank line that is the same height as the other lines in your document. An easy way to do this is to make the space exactly one baseline skip tall:
\vspace{\baselineskip}
Enjoy!
January 13th, 2010 | Posted in LaTeX, Tip | No Comments
I installed Cygwin, which allows you to run Linux-based applications on a Windows machine. But what I wanted to do was connect (using ssh) to a Linux box and run an application that has a GUI from there. In this case, I wanted to be able to connect to a server and run firefox on the server. This would enable me to download some files to the server, etc.
When I ssh’d into the machine:
ssh me@server.edu
And then tried to start firefox:
firefox &
I got the following error:
Error: no display specified
In searching around, the solution I found was that you have to enable X11 Forwarding. A simple way to do this is to specify the -X parameter when you ssh into the server:
ssh -X me@server.edu
Then when you try to open firefox, it should work:
firefox &
December 28th, 2009 | Posted in Linux | No Comments
In R, you can choose from a variety of symbols when you plot points on a graph. The default is a hollow circle, but you can change it with the pch parameter when you create a plot. You can find information in the help files (?pch), but they only describe the symbols and don’t illustrate what they look like. I wanted to see them for my project, so I put together a little script to draw them. Below is the code for symbols 1–25.
par(mar=c(2, 0, 0, 0) + 0.1)
pchRange = 1:25
for (i in pchRange)
{
if (i==1)
plot(rep(i, 10), 1:10, pch=i, xlim=c(min(pchRange), max(pchRange)), xlab=0, ylab=0, yaxt="n" )
if (i>1)
points(rep(i, 10), 1:10, pch=i)
}
December 4th, 2009 | Posted in R | 1 Comment
Let’s say you did a search for files matching a certain pattern in a directory using Python:
import glob
filePaths = glob.glob("C:\\Temp\\*.txt")
print(filePaths)
This will print the full paths of the files with a .txt extension in the C:\Temp directory, for example C:\\Temp\\test.txt. (The backslashes appear doubled because printing a list shows the escaped form of each string.)
But if you wanted to get just the file name, how would you go about that? It took me a little while to find an answer, and the method is not super obvious, so I’ll post it here.
import glob, os
filePaths = glob.glob("C:\\Temp\\*.txt")
for filePath in filePaths:
    print(os.path.basename(filePath))
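For anyone on Python 3, here is a small sketch of the same idea wrapped in functions, plus a variant using the pathlib module from the standard library, where each path object has a name attribute holding just the file name. The function names and the default "*.txt" pattern are my own choices for this example.

```python
import glob
import os
from pathlib import Path


def file_names(pattern):
    """Return just the file names (no directories) matching a glob pattern."""
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


def file_names_pathlib(directory, pattern="*.txt"):
    """Same result using pathlib: Path.name is the final path component."""
    return sorted(path.name for path in Path(directory).glob(pattern))


if __name__ == "__main__":
    # Lists the .txt files in the current directory, if there are any.
    print(file_names("*.txt"))
    print(file_names_pathlib("."))
```

Both functions sort the names, because glob does not promise any particular order.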
December 3rd, 2009 | Posted in Python | 1 Comment
In this post, I explained how to find quartiles in a list of numbers. As a slight add-on to that functionality, you can easily get the interquartile range. This basically means that you are finding the difference between the first and third quartiles.
So I use the code to compute the quartiles and then simply subtract the first quartile from the third.
public static double InterQuartileRange(ArrayList<Double> values) throws Exception
{
    // Quartiles returns {first quartile, median, third quartile}
    double[] quartiles = Quartiles(values);
    return quartiles[2] - quartiles[0];
}
October 16th, 2009 | Posted in Java, Math | No Comments
For a brief overview of what quartiles are, you might read the Wikipedia page on this topic.
Basically, what it means is that if you were to break a sorted list of numbers into four even parts, these would be the three values that separate them. It gets a little complicated when your list doesn’t break into even chunks, but I won’t get into the details of that here.
I threw together some code for getting the quartiles, which is below. I’ve also included some helper methods that I created to help with it. Also, note that I reused the Median method described here.
public static double[] Quartiles(ArrayList<Double> values) throws Exception
{
    if (values.size() < 3)
        throw new Exception("This method is not designed to handle lists with fewer than 3 elements.");

    double median = Median(values);

    // Values equal to the median are included in both halves
    ArrayList<Double> lowerHalf = GetValuesLessThan(values, median, true);
    ArrayList<Double> upperHalf = GetValuesGreaterThan(values, median, true);

    return new double[] {Median(lowerHalf), median, Median(upperHalf)};
}

public static ArrayList<Double> GetValuesGreaterThan(ArrayList<Double> values, double limit, boolean orEqualTo)
{
    ArrayList<Double> modValues = new ArrayList<Double>();

    for (double value : values)
        if (value > limit || (value == limit && orEqualTo))
            modValues.add(value);

    return modValues;
}

public static ArrayList<Double> GetValuesLessThan(ArrayList<Double> values, double limit, boolean orEqualTo)
{
    ArrayList<Double> modValues = new ArrayList<Double>();

    for (double value : values)
        if (value < limit || (value == limit && orEqualTo))
            modValues.add(value);

    return modValues;
}
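To make the approach easier to try out, here is a rough Python 3 sketch of the same logic: take the median, split the values into those at or below it and those at or above it, and take the median of each half. It uses the median function from Python's statistics module in place of my Median method, and the function names are just for illustration. Note that this is only one of several conventions for computing quartiles, so other tools may give slightly different numbers.

```python
from statistics import median


def quartiles(values):
    """Return (first quartile, median, third quartile).

    Values equal to the median are included in both halves.
    """
    if len(values) < 3:
        raise ValueError("need at least 3 values")

    mid = median(values)
    lower_half = [v for v in values if v <= mid]
    upper_half = [v for v in values if v >= mid]
    return (median(lower_half), mid, median(upper_half))


def interquartile_range(values):
    """Difference between the third and first quartiles."""
    q1, _, q3 = quartiles(values)
    return q3 - q1


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    print(quartiles(data))            # (2, 3, 4)
    print(interquartile_range(data))  # 2
```

For 1, 2, 3, 4, 5 the median is 3, the lower half is 1, 2, 3 and the upper half is 3, 4, 5, which gives quartiles of 2, 3, and 4.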
October 16th, 2009 | Posted in Java, Math | 1 Comment
This is a pretty simple tip, but I still thought I would share it for anyone who is interested. To find out whether an integer is odd, you can use the remainder operator (%). This operator tells you the remainder after dividing the number by some other number. If you divide any integer by 2, you get no remainder if it is even and a nonzero remainder if it is odd. In Java the remainder takes the sign of the number being divided, so an odd negative number gives -1 rather than 1. That is why the check below compares against 0 instead of 1.
public static boolean IsOdd(int number)
{
    return number % 2 != 0;
}
October 16th, 2009 | Posted in Java, Math | No Comments
I came across an interesting summary of research that was performed at Microsoft regarding the effectiveness of commonly used software development practices, including test-driven development, code coverage, small teams, etc. It was interesting that the researchers tried to quantify how well these practices work, rather than rely on what supposed “experts” say. See the summary, along with links to research papers, here.
October 8th, 2009 | Posted in Tip | No Comments
Cygwin is a software tool that allows you to run Linux programs in Windows. I have had good success in using it to run commands locally and to connect to other servers via ssh, scp, etc. I just got a new computer and installed the latest version of Cygwin, but when I tried to ssh to another server, it told me the ssh command could not be found. This was weird, because it had worked just fine on my previous computer with the base install (I think).
After looking long and hard for a solution to this, I found someone suggesting that it simply wasn’t installed. So then I installed basically every program/library that Cygwin has to offer (since disk space is not a concern right now), and that seems to have solved the problem. The URL above tells you how to do this. If you want to be more selective about which libraries to install, that is fine. I just didn’t want to figure that out at the time, and I thought I’d see what other programs Cygwin has to offer.
September 29th, 2009 | Posted in Linux | No Comments
This bit of advice is not new to me nor to the software development community. But recently I had one of those experiences where I put together a quick solution (to keep my code as simple as possible), and later I ran into a performance problem. And it was because I was violating this principle.
I am generating a very large string (based on values that are read from a text file). I had instantiated a string and was appending to the end of it, like so:
String output = "";
for (String line : fileReader)
output += line;
And it was going v….e….r….y…s….l….o….w. After figuring out where the slowness was occurring, I used the StringBuilder instead. And it started going fast! Here’s the general idea of the change to the code.
StringBuilder output = new StringBuilder();
for (String line : fileReader)
output.append(line);
The reason the latter is so much faster is that a StringBuilder appends into a single internal buffer that grows as needed. A String is immutable, so each time you append to it, Java has to create a new String object and copy everything accumulated so far into it.
Just something to keep in mind.
September 18th, 2009 | Posted in Java, Performance | No Comments