Passing Command-Line Arguments to Bash Scripts

I’m sure this one will be obvious to many readers, but it wasn’t to me, so I’m going to share it. A bash script is a file that you can run at the command line in Linux/Unix environments to automate something. In my case, I have a Java program that I need to run over and over again, so I created a bash script to make this easier rather than having to type the same long command repeatedly. These scripts are quite flexible and allow you do even programming constructs like functions, for loops, and if statements, etc.

Let’s say you have a simple bash script. The file name is test. And the contents of the file are the following:

#!/bin/bash
 
echo 'Hello World!'

You would invoke this at the command line (from the directory where it resides) by entering:

./test

Now let’s say you want to be able to pass an argument to it. You could modify the script to this.

#!/bin/bash
 
echo 'Hello World!'
echo 'The argument is:'
echo $1

You would invoke this at the command line by entering:

./test whatup

This would print out:

Hello World!
The argument is:
whatup

As far as I know, the command-line arguments can only be referenced by their index (in this case, it was the number 1 because it was the first argument). But you could get more creative if you wanted to.

This article has more detail on how to create bash scripts:

http://www.ibm.com/developerworks/library/l-bash.html

How to Do a String Join in R

Let’s say you have a list of values in a vector. And you want to be able to convert that into a delimited string. In some languages (such as Python), you can do this easily with the join method. But how would you do this in R?

This can be done with the paste function. The following code example shows how you would convert a simple vector to a comma-separated string.

x = c(1,2,3)
paste(x, collapse=",")

How to List All Methods in a Python Object

Suppose you have an object in Python that you retrieved from a third-party library, but you don’t have access to the source code or to very good documentation. Believe me, it happens (and did to me today). You can use a simple built-in method in Python to find out which methods are exposed by a given object: dir().

x = ... #get from somewhere
print dir(x)

Invoke External Application in Java

I recently had a scenario where part of what I wanted to do was in Java, and the other part was in an application that was written in Python. Rather than than rewrite my entire code base in one language or the other, I wanted to find a (quick and dirty) way to invoke the Python application from Java. One way to do this would be to use some type of service-oriented architecture (e.g. via an HTTP/XML Web service), but I didn’t want to get into that complexity for this simple task. Another way would be to use some type of interface like Jython.

Instead I just invoked the other program as if from the command line and retrieved the output that was stored in text files. This type of approach will probably only work in limited scenarios, but it can come in handy for integrating different programming languages.

This article gives a nice code sample (that has worked for me) that illustrates how to do this. If you’re like me, you’ll probably simplify the code, but at least it shows how to do it.

“Last Index Of” for Python Strings

In Java and C#, there is a method that enables you to find the last occurrence of a given sequence of string characters within a string. This method is sometimes called “lastIndexOf.” A method by this name does not exist in Python. But there is an easy way to do it if you know what you are looking for.

The method is rfind, which could be translated into “find from the right side.” See below.

x = "abc,def,ghi"
print x.rfind(",") // 7

Simple Way to Compute Median in Java

There is no way that I know of to find the median of a list of numbers in the Java framework. The median is the middle value. If there is an even number of values, the median is the middle of these two numbers. Below is a method, along with a supporting method and some tests for computing the median in Java.

public static double Median(ArrayList values)
{
    Collections.sort(values);
 
    if (values.size() % 2 == 1)
	return values.get((values.size()+1)/2-1);
    else
    {
	double lower = values.get(values.size()/2-1);
	double upper = values.get(values.size()/2);
 
	return (lower + upper) / 2.0;
    }	
}

public static ArrayList CreateDoubleList(double ... values)
{
    ArrayList results = new ArrayList();
 
    for (double d : values)
	results.add(d);
    return results;
}
 
System.out.println(2.5==MathUtility.Median(Lists.CreateDoubleList(0,1,2,3,4,5)));
System.out.println(2.0==MathUtility.Median(Lists.CreateDoubleList(0,1,2,3,4)));
System.out.println(2.0==MathUtility.Median(Lists.CreateDoubleList(3,1,2)));
System.out.println(3.0==MathUtility.Median(Lists.CreateDoubleList(3,2,3)));
System.out.println(1.234==MathUtility.Median(Lists.CreateDoubleList(1.234, 3.678, -2.467)));
System.out.println(1.345==MathUtility.Median(Lists.CreateDoubleList(1.234, 3.678, 1.456, -2.467)));

Perform Log Base 2 Transformation in Java

Java has functionality built into it to transform a number using the natural logarithm. This can be done using the java.util.Math.log() method. However, to my knowledge there is no way to do this for base-2 logarithms.

Please don’t let me get started on how silly this is!! :)

To do a base-2 log transformation:

public static double Log2(double number)
{
    return Math.log(number)/Math.log(2);
}

Modifying Text Size of Axis Labels in R

When you create a plot in R, you can easily modify the text size of the labels on the axes using the cex.lab property. This stands for “character expansion of labels.” For this value, you specify a relative size (compared to the default) that you want the text to be. The following code shows how you might do this:

barplot(c(48.3, 63.3, 66.7), ylim=c(0, 70), col=1:3, ylab="Classification Accuracy %", names=c("Majority Vote", "Clinical Only", "Clinical + Mutations"), cex.names=1.3, cex.lab=1.3)

Creating a Reverse Sequence of Numbers in Python

Let’s say you want to create a sequence of numbers in reverse order (for example, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1). I tried to do this with the range function, but I couldn’t find a way to do it. But Python has a built-in function called reversed that allows you to do this. An easy way to implement this is loop over it as below:

print [x for x in reversed(range(1, 11, 1))]

How to Increase the Priority of a Process in Linux

In Linux, you have the ability to determine which processes are running at a given time on the server. This post explains how to do this using the top command.

Let’s say you have a process that is not getting as much CPU time as you would like, maybe because it is getting beat out by another process. Well, one of the columns displayed by the top command is NI. This stands for NICE and is an integer value between -20 and 19. Lower values mean that the process is “less nice” to other processes, while higher values mean the process is “nice.” These values determine how CPU cycles are allocated.

To increase the priority of a given process, get the PID and use the renice command to change it.

For example, let’s say you have a process with a PID of 1000 and want to make it very unnice. You would do the following at the command line:

renice -20 1000

Which Linux Operating System Version Are You Running?

Linux comes in many flavors. If you did not perform the installation (or forgot), you may need to find out which flavor is being run on a given box. In Linux, it is pretty easy to do this. Simply type the following command at the command line, and it will give detail about it:

cat /proc/version

Find the Difference Between Two Dates in Java

Sometimes when you’re writing a program, you want to be able determine the difference in time between two dates. This might be because you want to see how long it took something to run, or you might be processing data and need to find a time span between two events. In searching around the Internet, I found some forum posts that explained how to do this to some extent, but below is an expanded solution that can help you find the difference in years, months, days, hours, minutes, and seconds.

Please note that the differences in months and years are only approximate. On average a year in the Gregorian calendar lasts 365.2425 days (because of leap years, etc.), so it just divides the number of days by that number to get the difference in days; and that number is divided by 12 to get the difference in months. In many cases, this approximate measure should be adequate, but a more precise answer is possible with more advanced coding. Please let me know if you develop this yourself or if you would like me to post such a solution.

public class Dates
{
    public static double DifferenceInMonths(Date date1, Date date2)
    {
	return DifferenceInYears(date1, date2) * 12;
    }
 
    public static double DifferenceInYears(Date date1, Date date2)
    {
	double days = DifferenceInDays(date1, date2);
	return  days / 365.2425;
    }
 
    public static double DifferenceInDays(Date date1, Date date2)
    {
	return DifferenceInHours(date1, date2) / 24.0;
    }
 
    public static double DifferenceInHours(Date date1, Date date2)
    {
	return DifferenceInMinutes(date1, date2) / 60.0;
    }
 
    public static double DifferenceInMinutes(Date date1, Date date2)
    {
	return DifferenceInSeconds(date1, date2) / 60.0;
    }
 
    public static double DifferenceInSeconds(Date date1, Date date2)
    {
	return DifferenceInMilliseconds(date1, date2) / 1000.0;
    }
 
    private static double DifferenceInMilliseconds(Date date1, Date date2)
    {
	return Math.abs(GetTimeInMilliseconds(date1) - GetTimeInMilliseconds(date2));
    }
 
    private static long GetTimeInMilliseconds(Date date)
    {
	Calendar cal = Calendar.getInstance();
	cal.setTime(date);
	return cal.getTimeInMillis() + cal.getTimeZone().getOffset(cal.getTimeInMillis());
    }
}

This is the code I used to test it:

public static boolean Test(double expected, double actual)
{
    boolean result = MathUtility.Round(expected, 4) == MathUtility.Round(actual, 4);
 
    if (!result)
	System.out.println("Expected: " + expected + ", Actual: " + actual);
 
    return result;
}
 
System.out.println(Test(1, Dates.DifferenceInSeconds(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 1))));
System.out.println(Test(1+60, Dates.DifferenceInSeconds(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 1, 1))));
System.out.println(Test(1+60+(60*60), Dates.DifferenceInSeconds(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.JANUARY, 1, 1, 1, 1))));
System.out.println(Test(1+60+(60*60)+(24*60*60), Dates.DifferenceInSeconds(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.JANUARY, 2, 1, 1, 1))));
System.out.println(Test(1+60+(60*60)+(31*24*60*60), Dates.DifferenceInSeconds(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.FEBRUARY, 1, 1, 1, 1))));
System.out.println(Test(1+60+(60*60)+(31*24*60*60)+(366*24*60*60), Dates.DifferenceInSeconds(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2009, Calendar.FEBRUARY, 1, 1, 1, 1))));
 
System.out.println(Test(1.0/60, Dates.DifferenceInMinutes(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 1))));
System.out.println(Test((1.0/60)+1, Dates.DifferenceInMinutes(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 1, 1))));
System.out.println(Test((1.0/60)+1+60, Dates.DifferenceInMinutes(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.JANUARY, 1, 1, 1, 1))));
System.out.println(Test((1.0/60)+1+60+(24*60), Dates.DifferenceInMinutes(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.JANUARY, 2, 1, 1, 1))));
System.out.println(Test((1.0/60)+1+60+(24*60*31), Dates.DifferenceInMinutes(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.FEBRUARY, 1, 1, 1, 1))));
System.out.println(Test((1.0/60)+1+60+(24*60*31)+(24*60*366), Dates.DifferenceInMinutes(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2009, Calendar.FEBRUARY, 1, 1, 1, 1))));
 
System.out.println(Test(1.0/(60*60), Dates.DifferenceInHours(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 1))));
System.out.println(Test(1.0/(60*60)+1.0/60, Dates.DifferenceInHours(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 1, 1))));
System.out.println(Test(1.0/(60*60)+1.0/60+1, Dates.DifferenceInHours(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.JANUARY, 1, 1, 1, 1))));
System.out.println(Test(1.0/(60*60)+1.0/60+1+24, Dates.DifferenceInHours(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.JANUARY, 2, 1, 1, 1))));
System.out.println(Test(1.0/(60*60)+1.0/60+1+(24*31), Dates.DifferenceInHours(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.FEBRUARY, 1, 1, 1, 1))));
System.out.println(Test(1.0/(60*60)+1.0/60+1+(24*31)+(24*366), Dates.DifferenceInHours(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2009, Calendar.FEBRUARY, 1, 1, 1, 1))));
 
System.out.println(Test(1.0, Dates.DifferenceInDays(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.JANUARY, 2, 0, 0, 0))));
System.out.println(Test(31.0, Dates.DifferenceInDays(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.FEBRUARY, 1, 0, 0, 0))));
System.out.println(Test(28.0, Dates.DifferenceInDays(Dates.CreateDate(2007, Calendar.MARCH, 1, 0, 0, 0), Dates.CreateDate(2007, Calendar.FEBRUARY, 1, 0, 0, 0))));
System.out.println(Test(29.0, Dates.DifferenceInDays(Dates.CreateDate(2008, Calendar.MARCH, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.FEBRUARY, 1, 0, 0, 0))));
System.out.println(Test(365.0, Dates.DifferenceInDays(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2007, Calendar.JANUARY, 1, 0, 0, 0))));
System.out.println(Test(366.0, Dates.DifferenceInDays(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2009, Calendar.JANUARY, 1, 0, 0, 0))));
 
System.out.println(Test(0.4928, Dates.DifferenceInMonths(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.JANUARY, 16, 0, 0, 0))));
System.out.println(Test(1.0185, Dates.DifferenceInMonths(Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.FEBRUARY, 1, 0, 0, 0))));
System.out.println(Test(0.9993, Dates.DifferenceInYears(Dates.CreateDate(2007, Calendar.JANUARY, 1, 0, 0, 0), Dates.CreateDate(2008, Calendar.JANUARY, 1, 0, 0, 0))));

Notes: See also this post about rounding numbers and this post about creating date objects. Please also note that sometimes Java does something weird with rounding, which is why the Test method rounds the results to four digits.

How to Round Numbers (double, float) in Java to Number of Decimal Places

<rant>For some strange reason, Java doesn’t have the ability to round numbers to a given number of decimals. This seems like such an obvious thing to include in any programming language.</rant> Anyway, you can round to the nearest integer, so you have to a little workaround to round to a given number of decimal places. The method below will do this.

public class MathUtility
{
    public static double Round(double number, int decimalPlaces)
    {
	double modifier = Math.pow(10.0, decimalPlaces);
	return Math.round(number * modifier) / modifier;
    }
}

To test this method, I developed the following simple method.

public static boolean Test(double expected, double actual)
{
    boolean result = expected == actual;
 
    if (!result)
	System.out.println("Expected: " + expected + ", Actual: " + actual);
 
    return result;
}

And then I tested the rounding method using the following code (the tests all passed):

System.out.println(Test(1, MathUtility.Round(1.12345678, 0)));
System.out.println(Test(1.1, MathUtility.Round(1.12345678, 1)));
System.out.println(Test(1.12, MathUtility.Round(1.12345678, 2)));
System.out.println(Test(1.123, MathUtility.Round(1.12345678, 3)));
System.out.println(Test(1.1235, MathUtility.Round(1.12345678, 4)));
System.out.println(Test(1.12346, MathUtility.Round(1.12345678, 5)));
System.out.println(Test(1.123457, MathUtility.Round(1.12345678, 6)));
System.out.println(Test(1.1234568, MathUtility.Round(1.12345678, 7)));
System.out.println(Test(1.12345678, MathUtility.Round(1.12345678, 8)));

Find the Mean/Average of a Number List in Python

To my knowledge, there is no built-in function in Python to find the mean of a list of numbers. You can use statistics packages to do this, such as statpy, but if you just want a lightweight solution to do the trick you can use the function below. Note that on the first line I am converting the numbers to floats because if you were to pass it a list of integers, it would not compute the mean properly.

def mean(numberList):
    floatNums = [float(x) for x in numberList]
    return sum(floatNums) / len(numberList)

Find the Class of a Python Object

Sometimes you are working with an object in Python, and because Python is not a strongly typed language, you don’t always know which class it is an implementation of. A quick way to find out is to use the type method, which can be invoked on any object and gives you a String representation of the class.

obj = "abc"
type(obj)

This would output “<type ’str’>.”

How to Override toString() in Python

In popular object-oriented languages such as Java and C#, they have a built-in method to each class that allows you to obtain a String representation of that class. They call these methods toString() or ToString(), respectively. You can use the default implementation, which basically gives you the name of the class (not usually very helpful). Or you can override the default by implementing this method in the class you are working with.

Python has a similar thing, except the name is pretty different. The method you must override is called __repr__.

Let’s say you have a simple class with a single method.

class TestClass:
    def Method1(self):
        return "method 1"

If you wanted to print the (default) String representation, you would do this:

print TestClass()

This gives you “<__main__.TestClass instance at 0×00A86580>.”

Then you can customize the String representation like this:

class TestClass:
    def Method1(self):
        return "method 1"
 
    def __repr__(self):
        return "here's the custom String representation"

Then when you do this:

print TestClass()

You get this: “here’s the custom String representation”

This functionality can come in very handy if you need to output the contents of objects to a file or the screen and you want to encapsulate the logic within the class itself rather than some other class.

Convert Results to Comma-delimited List in Oracle

Let’s say you are running a query and that there is a one-to-many relationship between one of the columns in one table and a column in another table. You could always retrieve the data from the database and process it using a regular programming language, but sometimes you want to do it all on the database side. Let’s say you wanted to get all the values corresponding to a key in the first table, convert them to a comma-delimited string on the fly, and return the results that way.

Turns out there are a few ways to do this, but none of them are ideal (that I could find). But I had to search all over the Web to find an article that explains it, and this is the best one I could find. It shows various approaches to doing it, including pros and cons to each of them.

See http://www.oracle-base.com/articles/10g/StringAggregationTechniques.php.

Python Yield Example

In Python, there is a bit of functionality that makes it convenient at times to process data while it is being processed inside another method. For example, let’s say you are searching for files within a directory that match a certain pattern, but you don’t want to wait the full time for the files to be found before you started processing them. You can use what’s called a generator which basically sends out the results as they are found within that method.

Or let’s say you were retrieving results from a database or data file and wanted to process them one at a time rather than storing them all in a list (which can be very memory intensive). Generators can come in very handy for this. In Python, a generator is created using the yield keyword.

Here’s the basic structure of how you use it:

def yieldExample():
    for x in getResult():
        yield x

Then to invoke this method, you would do it this way:

for y in yieldExample():
    print y

This post shows an example of how you would use yield and explains more detail.

Search for Files in Directories Recursively in Python

Let’s say you want to search for files in a directory tree of files and directories. And suppose you want to search for all .zip files. You can do this from within Python by looping through the folders and trying to find files that match the *.zip wildcard pattern.

Here is some code I copied from this page that illustrates how to do it:

import os, fnmatch
 
def locate(pattern, root=os.curdir):
    for path, dirs, files in os.walk(os.path.abspath(root)):
        for filename in fnmatch.filter(files, pattern):
            yield os.path.join(path, filename)

Note that this function uses the yield keyword, which turns this method into a generator. I wasn’t sure off hand how to access the results of this method, but it turns out you can iterate over it (for example, using a for loop).

for x in locate("*.zip", "C:\\Temp"):
    print x

If you didn’t want to use yield, you could do it this way:

import os, fnmatch
 
def locate(pattern, root=os.curdir):
    matches = []
 
    for path, dirs, files in os.walk(os.path.abspath(root)):
        for filename in fnmatch.filter(files, pattern):
            matches.append(os.path.join(path, filename))
 
    return matches

This would return a list.

See also this post that explains how to do this in a single folder (not recursively).

How to Perform HTTP GET in C#

Suppose you have a Web resource that you want to download programmatically (as opposed to doing it manually from a Web browser) from C#. The first thing you need to do is send an HTTP request to the server with the URL (e.g. http://www.google.com). The server will then process the request and send an HTTP response back, which needs to be processed. The following code can handle these things. Plus you have to handle it a little differently if you’re downloading a String/text document or a binary document (such as a PDF file or image). Note: It’s not necessary to use a CookieContainer, but it’s in there if you need it.

using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;
 
namespace Piccolo.Common
{
	public class HttpHelper
	{
		private string _baseUrl;
		private CookieContainer _cookieContainer = new CookieContainer();
 
		public HttpHelper() : this("") { }
 
		public HttpHelper(string baseUrl)
		{
			_baseUrl = baseUrl;
		}
 
		public string HttpStringGet(string relativeUrl)
		{
			HttpWebRequest req = (HttpWebRequest)WebRequest.Create(_baseUrl + relativeUrl);
			req.CookieContainer = _cookieContainer;
 
			return ReadBasicResponse(req.GetResponse());
		}
 
		public byte[] HttpBinaryGet(string relativeUrl)
		{
			HttpWebRequest req = (HttpWebRequest)WebRequest.Create(_baseUrl + relativeUrl);
			req.CookieContainer = _cookieContainer;
 
			byte[] result = null;
			byte[] buffer = new byte[4096];
 
			using (WebResponse resp = req.GetResponse())
			using (Stream responseStream = resp.GetResponseStream())
			using (MemoryStream memoryStream = new MemoryStream())
			{
				int count = 0;
				do
				{
					count = responseStream.Read(buffer, 0, buffer.Length);
					memoryStream.Write(buffer, 0, count);
 
				} while (count != 0);
 
				result = memoryStream.ToArray();
			}
 
			return result;
		}
 
		private string ReadBasicResponse(WebResponse response)
		{
			using (WebResponse resp = response)
			using (StreamReader sr = new StreamReader(resp.GetResponseStream()))
				return sr.ReadToEnd().Trim();
		}
	}
}

Here’s how you would call this to download an HTML document.

HttpHelper http = new HttpHelper();
string html = http.HttpStringGet("http://www.google.com/");

Here’s how you would call this to download a binary document.

HttpHelper http = new HttpHelper();
byte[] bytes = http.HttpBinaryGet("http://www.google.com/intl/en_ALL/images/logo.gif");