Sunday 30 March 2014

Week 11: Sorting and Efficiency

Sorting algorithms and their efficiency can be either really challenging or easy depending on how in-depth your study is. There are three time complexity cases to focus on: best, average, and worst case. Usually, the best case is ignored because it doesn't tell us much about the algorithm; it doesn't prepare the programmer for the worst. The average case, if done properly, could potentially be the most useful, because programmers should expect anything. However, finding the average case can be time-consuming and overly complex because there is an unlimited number of assumptions to take into consideration. Thus, most people save themselves the trouble and focus on the worst case, which is usually described with Big-Oh notation. The worst case is used to assess the asymptotic behaviour as the input size n approaches infinity. This is really useful because no significant assumptions are required and it captures the worst of the worst, making comparisons between different algorithms really straightforward.

The main sorts that were introduced were quick, merge, and radix sort. I have learned that whenever something is sliced or partitioned in a sorting algorithm, it will often result in some kind of O(lg n) factor. The best way to explain why is the mathematics behind it: whatever number you substitute for 'n', the algorithm 'divides and conquers' so that not every single element has to be compared against every other element. This makes sorting algorithms like quick sort and merge sort (in the average case) really efficient, because their growth rate near infinity is very small compared to, for instance, O(n^2).
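To make the 'divide and conquer' idea concrete, here is a rough merge sort I sketched myself (not the lecture code); the repeated halving is exactly where the lg n comes from:

    def merge_sort(lst):
        """Sort lst by splitting it in half, sorting each half, and merging."""
        if len(lst) <= 1:          # base case: nothing left to split
            return lst
        mid = len(lst) // 2        # halving is what gives the lg n depth
        left = merge_sort(lst[:mid])
        right = merge_sort(lst[mid:])
        # merge the two sorted halves: O(n) work per level of splitting
        merged = []
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged

    print(merge_sort([5, 2, 9, 1, 7]))   # -> [1, 2, 5, 7, 9]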

Although we do not have to create a sort from scratch, we can always take one and tweak it for better efficiency. For instance, the built-in sort in Python is called 'timsort', and it combines a variety of sorts to create one that is very well-rounded. One useful technique is called 'memoization', and it takes a top-down/lazy approach to recursion. Without going into heavy detail, memoization's purpose is to 'remember' what has already been computed so that redundant function calls can be skipped, trading space for time (which, with today's hardware, is very worthwhile). For instance, a program that calculates the nth Fibonacci number using recursion can be made a lot more efficient if memoization is applied.
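Here is a toy example of what I mean (my own sketch, not anything we were given): a plain recursive Fibonacci next to a memoized one that remembers results in a dictionary.

    def fib(n):
        """Naive recursion: recomputes the same values over and over."""
        if n < 2:
            return n
        return fib(n - 1) + fib(n - 2)

    def fib_memo(n, memo=None):
        """Memoized version: each value is computed once and then looked up."""
        if memo is None:
            memo = {}
        if n < 2:
            return n
        if n not in memo:
            memo[n] = fib_memo(n - 1, memo) + fib_memo(n - 2, memo)
        return memo[n]

    print(fib_memo(35))   # instant, while fib(35) already takes a noticeable pause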

Saturday 22 March 2014

Week 10: More Big-Oh and Better Sorts

This week, we focused on more Big-Oh and a bunch of new sorts, including quick, merge, count, and Python's own hybrid timsort. The sort that stands out the most to me is quick sort. When I was younger, everybody said quick sort was their favourite sort, but that was pretty much all I knew back then. Now it is amazing to be able to analyse WHY it is everybody's favourite. The biggest difference between quick sort and the other sorts is that it chooses a pivot and partitions around it. Generally, it performs better than quadratic sorts (e.g. insertion sort with O(n^2)) because each partition step shrinks the problem instead of repeatedly shifting elements around, which keeps the average case at O(n lg n). Merge sort is pretty awesome too, as it splits up the elements and then joins them back together in pairs, quadruples, and so on. It seems like a pretty slow process, but surprisingly it results in O(n lg n).
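To see the pivot-and-partition idea in code, here is a simplified quick sort I wrote for myself (it builds new lists instead of partitioning in place, so it is a sketch rather than the optimized version):

    def quick_sort(lst):
        """Pick a pivot, partition around it, and recurse on the two sides."""
        if len(lst) <= 1:
            return lst
        pivot = lst[0]                                   # simplest choice of pivot
        smaller = [x for x in lst[1:] if x < pivot]      # everything left of the pivot
        larger = [x for x in lst[1:] if x >= pivot]      # everything right of the pivot
        return quick_sort(smaller) + [pivot] + quick_sort(larger)

    print(quick_sort([8, 3, 5, 1, 9, 2]))   # -> [1, 2, 3, 5, 8, 9]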

Professor Heap showed us that we can compare different sorts by timing each execution. I found it absolutely amazing because nothing beats seeing the results with your own eyes. The cool thing is that this is not limited to sorting algorithms but works for any program.
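Something along these lines (my own rough take on the idea, just using the time module and Python's built-in sorts as stand-ins):

    import time
    import random

    def time_sort(sort_func, lst):
        """Return how many seconds sort_func takes on a copy of lst."""
        copy = list(lst)
        start = time.time()
        sort_func(copy)
        return time.time() - start

    data = [random.randint(0, 10000) for _ in range(5000)]
    print("built-in sorted:", time_sort(sorted, data))
    print("list.sort:      ", time_sort(lambda lst: lst.sort(), data))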

Saturday 15 March 2014

Week 9: Performance and Big-Oh

In this week's lectures, we focused on the performance of algorithms and the famous (or notorious) Big-Oh notation. Being able to see the efficiency difference between the algorithms we deal with daily really helps bring my thoughts together. I have been programming for quite a long time now, and I had never given real thought to what goes on behind a program and what makes one more efficient than another. In terms of Big-Oh, O(1), O(n), O(2^n) and such are very straightforward, but I find O(lg n) to be one of the most interesting behaviours. I never understood the definition of a logarithm in math class until now. Take binary search as an example: its worst-case performance is O(lg n) because it keeps dividing the list in half until the target is either found or ruled out. If there are 8 elements in the list, it takes about lg 8 = 3 halvings to find an element that exists.
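Here is a rough binary search I sketched to convince myself of the lg n behaviour (my own version, with a step counter added just for illustration):

    def binary_search(lst, target):
        """Return an index of target in sorted lst, or -1 if it is not there."""
        low, high = 0, len(lst) - 1
        steps = 0
        while low <= high:
            steps += 1
            mid = (low + high) // 2        # cut the remaining range in half each time
            if lst[mid] == target:
                print("found in", steps, "steps")
                return mid
            elif lst[mid] < target:
                low = mid + 1
            else:
                high = mid - 1
        return -1

    binary_search([1, 2, 3, 4, 5, 6, 7, 8], 7)   # prints "found in 3 steps"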

Being able to see such a diverse set of algorithm behaviours certainly raises my awareness whenever I am programming. I am now much more conscious of my code and try to keep it as clean as possible without losing too much to overhead.

The most challenging part this week was not so much the algorithm behaviour, but having to understand the different kinds of sorts, because I am not too familiar with them. For instance, I had heard of insertion and bubble sort before, but I had never tried to program them myself, so it took me a bit of time to figure out where the code was heading.
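To force myself to finally write one, here is my own attempt at insertion sort (so it may differ from the version shown in lecture):

    def insertion_sort(lst):
        """Sort lst in place by growing a sorted prefix one element at a time."""
        for i in range(1, len(lst)):
            current = lst[i]
            j = i - 1
            # shift larger elements to the right until current fits
            while j >= 0 and lst[j] > current:
                lst[j + 1] = lst[j]
                j -= 1
            lst[j + 1] = current

    nums = [4, 1, 3, 2]
    insertion_sort(nums)
    print(nums)   # -> [1, 2, 3, 4]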

Sunday 9 March 2014

Week 8: Linked Lists

We focused on a data structure called the linked list this week. It is similar to a queue or a stack, except each object inside the data structure holds a reference to the next object. However, it is comparatively superior in terms of insertion efficiency, because adding a new element only means dealing with one reference (as opposed to shifting items back and forth as in an array-based list).

The lab was very challenging this week because we had to write code to find the length of the linked list and to insert a new element at a desired position without raising any errors. I realized I overthought many parts of the code, and it definitely consumed a lot of time (I tried to use recursion everywhere I could). Sometimes it is better to think from a simpler perspective, and the result can turn out a lot better.
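Here is roughly the shape the lab code took in my head (a simplified sketch; the class and function names are my own, not necessarily the lab's):

    class Node:
        """One link in the chain: a value plus a reference to the next node."""
        def __init__(self, value, nxt=None):
            self.value = value
            self.nxt = nxt

    def length(node):
        """Count nodes iteratively instead of recursing on every problem."""
        count = 0
        while node is not None:
            count += 1
            node = node.nxt
        return count

    def insert(head, index, value):
        """Insert value at position index and return the (possibly new) head."""
        if index == 0:
            return Node(value, head)
        prev = head
        for _ in range(index - 1):
            prev = prev.nxt               # walk to the node just before the spot
        prev.nxt = Node(value, prev.nxt)  # rewire a single reference
        return head

    head = Node(1, Node(2, Node(4)))
    head = insert(head, 2, 3)
    print(length(head))                   # -> 4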

Sunday 2 March 2014

Week 7: Recursion

Recursion is the technique of staying clean and simple by dividing a big problem into similar sub-problems and solving each one as it comes. The basic idea behind recursion is to solve those sub-problems, since they are more easily solved, and then put the answers together as the final solution. One thing I have learned is that anything you can do with recursion can also be done using iterative loops. Obviously, which is simpler comes down to the specific context. For instance, it is possible to use plain loops to solve the Tower of Hanoi, but it is so much clearer to use recursion, because the Tower of Hanoi splits into smaller yet similarly structured problems. Many would ask: when should you use recursion? Personally, I think one should use recursion when the problem presents itself as a composition of similar smaller problems. The key to choosing between iterative loops and recursion is whichever one is easier to use in the specific context; there is no need to be fancy and use recursion for every single problem. One thing to note about recursion is that it can introduce a significant amount of overhead, which slows down the program's overall efficiency. The best way to find out whether loops or recursion is more efficient is to write both versions of the code and compare them.
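For reference, here is the classic three-peg recursion as I would sketch it myself:

    def hanoi(n, source, spare, target):
        """Move n discs from source to target, printing each move."""
        if n == 0:
            return
        hanoi(n - 1, source, target, spare)   # move the smaller pile out of the way
        print("move disc", n, "from", source, "to", target)
        hanoi(n - 1, spare, source, target)   # move the smaller pile onto the big disc

    hanoi(3, "A", "B", "C")   # 2**3 - 1 = 7 moves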

Saturday 15 February 2014

Week 6: Binary Trees

This week, we learned about trees, which are completely new to me. However, I didn't find the concept difficult at all; rather, it was pretty straightforward. I find trees exceptionally cool because, unlike the linear data structures we have learned so far in this course, trees work very differently. That also means there are several different ways to traverse the data structure, which makes it very interesting and intriguing.
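For example, the different traversal orders all come down to where the 'visit' happens relative to the recursive calls (a minimal sketch of my own, not the class code):

    class Tree:
        """A binary tree node with a value and optional left/right children."""
        def __init__(self, value, left=None, right=None):
            self.value = value
            self.left = left
            self.right = right

    def preorder(t):
        """Visit the root before either subtree."""
        if t is None:
            return []
        return [t.value] + preorder(t.left) + preorder(t.right)

    def inorder(t):
        """Visit the left subtree, then the root, then the right subtree."""
        if t is None:
            return []
        return inorder(t.left) + [t.value] + inorder(t.right)

    t = Tree(2, Tree(1), Tree(3))
    print(preorder(t))   # -> [2, 1, 3]
    print(inorder(t))    # -> [1, 2, 3]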

As I stated earlier in this blog, I am not very strong at visualizing recursion off the top of my head. Fortunately, I remembered to draw out the recursion when I was tracing the implementation of trees, and it definitely helped me understand it a lot better. This week's material wasn't too bad compared to previous weeks.

Looking back at all the concepts we have learned, I just realized we have focused primarily on data structures. I guess knowing these must be very important for upper-year computer science courses or for future job placements (probably both).

Sunday 9 February 2014

Week 5: Namespace, Scopes, Unit Testing, Tower of Hanoi

This week, we focused on namespaces, scopes, unit testing, and the notorious "Tower of Hanoi" algorithm. All in all, I did not have a problem with namespaces, scopes, or unit testing, as they are mostly very straight to the point. However, the recursion behind the Tower of Hanoi proved highly challenging. Tracing the first few steps is relatively easy, but afterwards the numbers just explode all over the place. With the confirmation of the first few traces, I have to use what Professor Heap likes to call "intuition" to believe that the recursion works. In terms of recursion, being able to believe it works really helps me think ahead in any kind of programming question.

With that being said, the majority of the time spent this week was dedicated to finding a solution to the Tower of Ann Hoy that uses four stools. Indeed, it was devastatingly hard, because the recursion proved so abstract inside my brain. Luckily, the handout made it easier by reminding us to split the algorithm into three parts, where the first and last parts use the same four-stool concept and the middle part is a straight reuse of the three-stool code. I realized that when it comes to recursion problems, my thinking works best when I use pencil and paper; the human brain is simply not a stack, so we can only remember so much before we forget the rest. I traced the smallest number of cheeses I could think of and found a very common trait between the four-stool and three-stool algorithms. Although my current code is not optimal, I am really proud that I toughed it out and found a solution that works.
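If I boil my approach down, it looks roughly like this (a simplification of my own solution; the split point i is just a guess rather than the optimal choice, and it only counts moves instead of printing them):

    def move_three(n, source, spare, target):
        """Standard three-stool recursion; returns the number of moves."""
        if n == 0:
            return 0
        moves = move_three(n - 1, source, target, spare)
        moves += 1                                  # move the biggest cheese
        moves += move_three(n - 1, spare, source, target)
        return moves

    def move_four(n, source, spare1, spare2, target):
        """Four-stool version: park i cheeses, solve the rest with three stools, unpark."""
        if n == 0:
            return 0
        if n == 1:
            return 1
        i = n // 2                                  # a guessed split, not the optimal one
        moves = move_four(i, source, spare2, target, spare1)     # first part: 4 stools
        moves += move_three(n - i, source, spare2, target)       # middle: plain 3 stools
        moves += move_four(i, spare1, source, spare2, target)    # last part: 4 stools again
        return moves

    print(move_four(5, "A", "B", "C", "D"))   # -> 13 moves for 5 cheeses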