Coding Notes

Ideas and thoughts from Seth Long

Quicksort / Insertion Sort Combo

My blog is now located at http://www.WhoIsSethLong.com

The Quicksort is one of the fastest sorting algorithms for sorting large lists of data.  The Insertion sort is a fast sorting algorithms for sorting very small lists that are already somewhat sorted.  We combine these two algorithms to come up with a very simple and effective method for sorting large lists.

We start our sort by using a Quicksort.  The Quicksort on average runs in O(n log(n)) time but can be as slow as O(n2) if we try to sort a list that is already sorted and use the left most element as our pivot.  If you are unfamiliar with the quicksort here is a Wikipedia article describing the algorithm: http://en.wikipedia.org/wiki/Quicksort

As the Quicksort works, it breaks down our list into smaller and smaller lists.  Once the lists become small enough that an Insertion sort becomes more efficient then the Quicksort we will switch to the Insertion sort to finish off the job.  Think of it as grinding away with the Quicksort then polishing it off with the Insertion sort once most of the work has been done.

Here is a code sample of a combined Quicksort and Insertion sort algorithm in C#.

class Quicksort
{
    public void Sort(int[] list, int start, int end)
    {
        if (start < end)
        {
            // This is where we switch to Insertion Sort!
            if ((end – start) < 9)
            {
                this.InsertionSort(list, start, end + 1);
            }
            else
            {
                int part = this.partition(list, start, end);
                this.Sort(list, start, part – 1);
                this.Sort(list, part + 1, end);
            }
        }
    }
    public void InsertionSort(int[] list, int start, int end)
    {
        for (int x = start + 1; x < end; x++)
        {
            int val = list[x];
            int j = x – 1;
            while (j >= 0 && val < list[j])
            {
                list[j + 1] = list[j];
                j–;
            }
            list[j + 1] = val;
        }
    }
    private int partition(int[] list, int leftIndex, int rightIndex)
    {
        int left = leftIndex;
        int right = rightIndex;
        int pivot = list[leftIndex];
        while (left < right)
        {
            if (list[left] < pivot)
            {
                left++;
                continue;
            }
            if (list[right] > pivot)
            {
                right–;
                continue;
            }
            int tmp = list[left];
            list[left] = list[right];
            list[right] = tmp;
            left++;
        }
        return left;
    }
}
Advertisements

January 28, 2008 Posted by | Algorithms | 3 Comments

Understanding a SQL Junction Table

My blog is now located at http://www.WhoIsSethLong.com

I was recently asked about a sql junction table and thought I’d share my thoughts about them with the few people who read this.  First off, I am NOT a DBA.  So, if something is not correct or accurate please feel free to correct me.

Junction tables are used when dealing with many-to-many relationships in a SQL database.  If you’re wondering what exactly a many-to-many relationship is, let me try to briefly explain.  Suppose we are working at a school and have a table full of student names and another table full of classrooms.  Each of the students can belong to multiple classrooms or none at all.  Likewise, each classroom can have multiple students or none at all.  This is an example of a many-to-many relationship.

A junction table will allow us to create the many-to-many relationship and most importantly, let us keep from adding duplicate entries as you’ll soon see.

To start, lets create a student table and a classroom table.

CREATE TABLE Students
(
    StudentID int IDENTITY(1,1) PRIMARY KEY,
    StudentName nchar(50) NOT NULL
)

CREATE TABLE Classrooms
(
    ClassroomID int IDENTITY(1,1) PRIMARY KEY,
    RoomNumber int NOT NULL
)

Now that we have our two tables created we need to create the junction table that will link them together.  The junction table is created by using the primary key from the Classrooms and Students tables.

CREATE TABLE StudentClassroom
(
    StudentID int NOT NULL,
    ClassroomID int NOT NULL,
    CONSTRAINT PK_StudentClassroom PRIMARY KEY
    (
        StudentID,
        ClassroomID
    ),
    FOREIGN KEY (StudentID) REFERENCES Students (StudentID),
    FOREIGN KEY (ClassroomID) REFERENCES Classrooms (ClassroomID)
)

We have now created a table with columns for the StudentID and the ClassroomID.  This table also uses a combination of these two columns as the primary key.  This means that each student-classroom pair is unique.  Each student can belong to many classrooms, each classroom can belong to many students but each pair can only occur once.

You should also note that the columns in the junction table are setup as foreign keys to the Students and Classrooms tables.  This is important as it keeps us from adding students to a classroom that doesn’t exist or deleting a classroom from the database if there are still students belonging to it.

To see what students belong to what classrooms we can use the junction table and the following query:

SELECT StudentName, RoomNumber
FROM StudentClassroom
JOIN Students ON Students.StudentID = StudentClassroom.StudentID
JOIN Classrooms ON Classrooms.ClassroomID = StudentClassroom.ClassroomID

So, that’s a junction table in a nut shell.

– Seth Long

January 4, 2008 Posted by | SQL | 26 Comments

Binary Search

My blog is now located at http://www.WhoIsSethLong.com

A binary search is used to find if a value exists or what it’s position is in a sorted list.  You may have noticed that I put “sorted” in italics.  That’s because in order for the binary search to work, it must have a sorted list.

To better explain how the binary search works, lets walk through a simple example.  We will start off with a sorted list of numbers.

3, 5, 6, 9, 11, 12, 14

Next, we’ll need a number to search for.  For this example we’ll choose the number 12.

The binary search starts by selecting the middle position of the array and getting it’s value.  In our example we selected 9.

3, 5, 6, 9, 11, 12, 14

The algorithm then must decide if the number we are searching for is higher or lower then 9.  Since we are searching for 12, the number is higher then 9.  We must search to the right of 9 since we know that if there is a 12 in the list it must be to the right of 9.  We have now reduced the number of items to compare to in half with just 1 comparison! Pretty cool!  This leaves us with the follow numbers to search through.

11, 12, 14

We again select the middle position and compare it to the number we are searching for.  In this case the middle position returns a 12 which happens to be the number we are searching for.  Using the binary search we found our number in an array of 7 items using only 2 comparisons!

The key things to remember about a binary search are that it must have a sort list and it reduces the number of items it has to search through every time it does a comparison.

Coding a Binary Search

Coding a binary search is fairly straight forward.  We take the middle position of the array and compare it to our number.  If our number is higher then the middle position we then use the top half of the list.  If our number is lower we use the lower half of the list.  We then repeat the process till we find the number or run out of numbers to compare.

Here’s an example of a binary search using C#.

public int binarySearch(int[] array, int lower, int upper, int target)
{

    // set -1 to the location in case we don't find it
    int location = -1;    // keep looping till we run out of positions or we find it

    while (lower  array[mid])  // the mid position is less then our target
        {
            lower = mid + 1;
        }
        else  // the mid position is greater then our target
        {
            upper = mid - 1;
        }
    }

return location;

}

One thing you may have noticed is that you could have written the algorithm using recursion instead of using an iterative approach. So, here is another example of the same algorithm using a recursive method rather then an iterative one.

public int binarySearch(int[] array, int lower, int upper, int target)
{
    // get the mid position
    int mid = (lower + upper) / 2;    // does not exist

    if (lower >= upper)
    {
        return -1;
    }

    if (array[mid] == target) // mid position equals the target
    {
        return mid;
    }
    else if (target < array[mid]) // target is less then mid
    {
        return binarySearch(array, lower, mid - 1, target);
    }
    else // target is greater then mid
    {
        return binarySearch(array, mid + 1, upper, target);
    }
}

Performance

One of the greatest things about a binary search is the speed at which it can search through large lists.  For example, if we had a list with 1,000,000 items in it, a binary search could find the item we are searching for in under 20 comparisons!  Of course there is one drawback to the binary search, the list must be sorted.

Since every comparison the binary search performs cuts the number of items we have to search through in half, the binary search runs in logarithmic time or O(logN).

Well, that’s all for now.

– Seth Long

January 3, 2008 Posted by | Algorithms | Leave a comment

Fisher-Yates Shuffle Algorithm

My blog is now located at http://www.WhoIsSethLong.com

I went to an interview recently and was asked to arrange an array of integers from 1 to 52 in random order.  As soon as they said 52 I thought about a deck of cards, then instantly thought about the Fisher-Yates Shuffle Algorithm.

The Fisher-Yates Shuffle Algorithm is a simple algorithm that runs in linear time and is used to shuffle an array in random order.  The algorithm loops through each item in the array, generates a random number between 0 and the array length, then assigns the array item to the randomly generated array position.  This is a simple algorithm and is probably easiest understood with an example. For this example I’ll use C#.

public void FisherYates(int[] ary)
{
    Random rnd = new Random();

    for (int i = ary.Length - 1; i > 0; i--)
    {
        int r = rnd.Next(0, i);

        int tmp = ary[i];
        ary[i] = ary[r];
        ary[r] = tmp;
    }
}

There are other ways to sort the array besides the Fisher- Yates method.  If this had been a deck of cards, another method would be to assign a randomly generated 64-bit number to each card then sort by their randomly assigned number.

– Seth Long

January 3, 2008 Posted by | Algorithms | 3 Comments

Nonrepeated Character Search

My blog is now located at http://www.WhoIsSethLong.com

This code challenge was inspired by a book I recently read.  The book is called Programming Interviews Exposed: Secrets To Landing Your Next Job.  It’s a great book full of those fun technical questions you get asked during programming job interviews.

For this challenge, you will need to create a function/class/algorithm that takes a string and finds the first non-repeated character.  For example, in the word reaper the first non-repeated character is ‘a’.  The character ‘p’ is also not repeated but we are looking for the first one.

There are multiple ways to accomplish this but I’m going to only post 1 here.  To start, I first created a hash table with all the chars in the string and incremented each char as additional matches are found.  So for example, in the string reaper I will start with the char ‘r’ and add that to the hashtable and give it a value of 1.  Next I will add ‘e’ to the hash table and give it a value of 1. The char ‘p’ will get a value of 1 but the next char, ‘e’, already exists in the hashtable so we will increment it by 1 and it’s value becomes 2.  This is done for all the chars in the string.

Once this is done, we then read back over the string, comparing each char in the string to the hashtable till we find the first char that has a value of 1.  I know, it sounds a little confusing but if you read of the code sample it will make a lot more sense.

Here is a code sample in C#:

public char NonRepeatedChar(string s)
{
    Hashtable ht = new Hashtable();
    // loop through each char in string
    // and add to hashtable
    for (int x = 0; x < s.Length; x++)
    {
        char c = s[x];
        if (ht.ContainsKey(c))
        {
            // already exists in hashtable so
            // give it a value of 2.
            ht[c] = 2;
        }
        else
        {
            // does not already exist so give
            // an initial value of 1
            ht.Add(c, 1);
        }
    }
    // loop through the string again and
    // find the FIRST match in the hashtable
    // that has a value of 1 and return it.
    for (int x = 0; x < s.Length; x++)
    {
        char c = s[x];
        // check if the char’s value is 1
        if ((int)ht[c] == 1)
        {
            return c;
        }
    }
    // no match found
    return ;
}

Well, that’s all for now!

 (I just noticed the return value for the above code isn’t showing up. It’s just a null char.)

– Seth Long

January 3, 2008 Posted by | Code Challenges | Leave a comment

A Simple Bubble Sort

My blog is now located at http://www.WhoIsSethLong.com

Well, this is my first time writing online so I thought I’d start off with something simple and a bubble sort was the first thing that came to mind.  If you’re not familiar with what a bubble sort is, let me give you a quick rundown.

A bubble sort algorithm is a very simple way to sort a collection but also not a very efficient one.  The algorithm works by comparing 2 items in the collection and swapping them if they are not in the right position.  For example, we are going to sort the following array of numbers from smallest to largest.

6, 3, 8, 2, 5

The algorithm would start by comparing the first two numbers and moving the larger number to the right.

6, 3, 8, 2, 5   ->  3, 6, 8, 2, 5

The algorithm would then move over 1 place and compare the numbers there.

3, 6, 8, 2, 5

Since the larger number is already on the right, the algorithm doesn’t need to make any changes and again moves over to the right 1 position.

3, 6, 8, 2, 5

Now we have the larger number on the left so it will need to be swapped with the smaller number of the right.

3, 6, 8, 2, 5  ->  3, 6, 2, 8, 5

The larger number has been swapped to the right so the algorithm will now move over again and check the last two numbers.

3, 6, 2, 8, 5

Again, the larger number is on the left so it will need to be swapped.  This gives us:

3, 6, 2, 8, 5  ->  3, 6, 2, 5, 8

After the first pass through, we now have the largest number (8) in its proper position.  So now we will start at the beginning of the array again, work our way towards the end and swap numbers as needed.

3, 6, 2, 5, 8
3, 6, 2, 5, 8  ->  3, 2, 6, 5, 8
3, 2, 6, 5, 8  ->  3, 2, 5, 6, 8

Now, you can see that we don’t compare the last number this time.  That’s because we know the last number is already in it’s proper place.  Each run through the array we compare one less number on the right.  The algorithm gets its name because the largest number will bubble up to the last position after each pass.  Our third run through the array would look like this:

3, 2, 5, 6, 8  ->  2, 3, 5, 6, 8
2, 3, 5, 6, 8

Fourth run:

2, 3, 5, 6, 8

That’s definitely a lot of work just to sort 5 numbers.  There are other algorithms out there that are much more efficient then the bubble sort.

Coding the Bubble Sort Algorithm

Now that we hopefully understand how the bubble sort algorithm works, lets look at the code needed to implement it.  All we really need is 2 loops and an if statement to create a bubble sort algorithm.  The first and outer loop will start from the end of the array and work its way to the beginning.  The second and inner loop will start from the beginning of the array and work its way to the value of the outer loop.  Inside the inner loop is an if statement that compares the 2 numbers and swaps them if necessary.   Here is the code using C# syntax:

// the array to sort
int[] ary = new int[] { 6, 3, 8, 2, 5 };

// outer loop
for (int outer = ary.Length - 1; outer > 0; outer--)
{
    // inner loop
    for (int inner = 0; inner  ary[inner + 1])
    {
        // swap them
        int tmp = ary[inner];
        ary[inner] = ary[inner + 1];
        ary[inner + 1] = tmp;
    }
}

There are other variations of this algorithm but this is probably the most common way of coding it.

Efficiency

As I’ve mentioned before, the bubble sort algorithm is one of the worst ways to sort a collection.  To understand why, we need to look at the worst-case complexity of the algorithm.  Since the algorithm uses 2 loops to sort the array, there are a LOT of comparisons and swaps being made.  With an array of n items, the algorithm will make n – 1 comparisons on the first pass.  The next pass will require n – 2 comparisons, etc…  This gives us n(n – 1) / 2 or O(n2).

Well, that’s all for now.  I’ll be posting some more sorting algorithms soon so keep checking back!

– Seth Long

January 3, 2008 Posted by | Algorithms | Leave a comment