Project Euler Problem 3

For project Euler problem 3 we work on a few basic principles that we will use for several upcoming problems (yay reusable code!), factorization and determining whether a number is prime. Here is the problem:

The prime factors of 13195 are 5, 7, 13 and 29.

What is the largest prime factor of the number 600851475143 ?

One way one could complete this problem would be to find all factors of the given number, and loop through checking if they are prime, returning only the largest one. To start out, let’s find all factors of a number.

import math
def findAllFactors(number):
    factors = list()
    for i in xrange(1, long(math.sqrt(number))):
        if number%i == 0:
    return factors

Here, we are simply looping through the numbers between 1 and the square root of the number, looking for numbers that divide evenly into our original number. Numbers that divide evenly are factors, so we add them to a list which we return afterward. Alternately, we could create a generator to return all factors:

import math
def findAllFactors(number):
    for i in xrange(1, long(math.sqrt(number))):
        if number%i == 0:
            yield i 

Next, we loop through each factor, checking to see if it is prime using the following code:

def isPrime(n):
    n = abs(long(n))
    if n < 2:
        return False
    if n == 2: 
        return True
    if not n & 1: 
        return False
    for x in range(3, long(n**0.5)+1, 2):
        if n % x == 0:
            return False
    return True

If we have a prime factor, let’s toss it in a list for now:

if __name__ == "__main__":
    factors = findAllFactors(600851475143L)
    primeFactors = list()
    for f in factors:
        if isPrime(f):

or if you are using the generator:

if __name__ == "__main__":
    primeFactors = list()
    for f in findAllFactors(600851475143L):
        if isPrime(f):

Now we simply grab the largest number in the list by sorting it and grabbbing the last item!

    print "The Correct answer is: %d" % (primeFactors[-1])

Learn to Program with Python pt. 1

Programming can be a fun hobby and a well paying career for those who learn to do it well, as well as a great way to keep stimulating your brain to keep you sharp and alert. Learning to program on your own can be difficult for some people however. Many programming tutorials have one of the following issues:

  • Talking above your head: Many programming tutorials assume a great deal of knowledge beforehand, and are therefore not suitable for brand-new programmers
  • Being half-heartedly written: Many tutorials start out well, but as the tutorial drags on the author loses interest, and the writing and content suffer dramatically
  • Generally sucking: That pretty much sums it up

I have decided to take my own crack at a programming tutorial, and I will attempt to avoid the previously mentioned pitfalls. There are a few details I would like to cover before we get started. The first is my choice of language. I have chosen to use Python for this tutorial for several reasons, the first and foremost being that I really enjoy programming in Python. I also think Python is great about getting out of your way and letting you program without having to micro-manage details such as Data Types (most of the time) and memory allocation (don’t worry if you don’t know what this means). The next detail I would like to cover is the timeline for this tutorial. I plan on writing this tutorial as multiple blog posts over the next several months, with many unrelated posts in between. If it has been a while, feel free to drop me a comment nagging me for the next issue.

Enough introduction, let’s get started with Programming and Python. To get started, we need to make sure we have Python installed on our system. If you are running Linux, chances are Python is pre-installed. For Windows and Mac users, I recommend ActivePython. ActivePython will install Python to your system, as well as a very basic Python editor (although any text editor will do). Open up your editor of choice and type the following:

print "Goodbye Cruel World"

Now save this as . This is your first program. Kinda lame isn’t it? Don’t worry, you’ll be writing slightly less lame programs in no time. In fact, the only reason I had you do this is so I could teach you to run your program once you have written it. If you are using PythonWin editor (the editor that came with ActivePython) press Ctrl+R and a box will pop up asking you which file you want to run, and it should have selected. Press OK, and it will run. If you have chosen to use a different editor (or even if you just like the command line), open up a command prompt and navigate to the folder where you saved your script and type:
and you will see “Goodbye Cruel World” print to your terminal.

In an attempt to make each section as short as possible, I will be stopping here for now and picking up again in a few days with variables, operators, and data types.

My first “real” Python Program pt. 2

In the first post about my first real Python program, I talked about how I parsed out stories based upon a tag-based markup system, prompted the user for values for all of the tags, and then returned the story with all user input in place. Today I will be talking about how I created a system to allow users to write their own stories and have them automatically detected and parsed, as well as allowing users to save their created stories.

Science, it works!

After my latest post there was a comment discussion about how to test the efficiency of one method of performing an action versus another. I looked at code profiling for Python, but everything seemed to be too complicated for what I wanted to do. I decided to use my own basic method for comparing the speed of each implementation. The blocks of code in question were:

def collatz(start):
    print start,
    if start > 1:
        if start % 2 == 0:
            start = start / 2
            start = start * 3 + 1


def collatz(start):
    while start != 1:
        yield start
        if start % 2 == 0:
            start = start / 2
            start = start * 3 + 1
    yield 1

Each block allowed the user to print the numbers in the Collatz Sequence from a given starting point. To test the efficiency of these blocks of code, I decided to stage a scientific(ish) experiment. I figured the best test would be to have each method loop through every number between 0 and 20,000 and print the Collatz Sequence to the screen for each number, recording the time it took to complete the task.

A few notes about the above method:

  • Since one of the methods already prints to the screen, the initial test also had to print to the screen with the second method, which would require another logic block in the testing method. Since this is how one would have to access the numbers in the real world, I considered this acceptable
  • In order to avoid statistical anomalies in my testing, I performed the test 15 times on each block of code. Normally I would expect more tests in order to provide statistically significant results, but each block of code was taking between 5 and 8 minutes to execute. Given that this was “for-fun” testing, I didn’t feel like wasting that much of my life waiting
  • Computers are funny, especially if they are running Windows Vista. I did my best to minimize interference from the OS, but there are certainly a few outliers that suggest that I was not entirely successful. Measures I took to prevent interference included disabling my wireless card (my network often “hiccups”, leading to slowdowns as the OS tries to reconnect), disabling all superfluous programs and services, and running the test over night so that there would not be any interference from me opening files and programs
  • The time recording was done by saving the system time to a variable before the operation started, and again as soon as it finished, and comparing the two and writing the difference to a file. This meant no human interaction at all

I ran the initial test with each method (just to make sure my test suite was working correctly), and was surprised by the results: the generator (my code) was slower by a full 2 minutes than the recursive function. I t was then that I noticed that there was a difference in how each was printed to a console window when executed: the generator printing each number on it’s own line, and the recursive function printing across the screen and wrapping. In order to make sure that the difference was not being caused by the print function, I changed my code so it would print like the other code. I ran the code again, and found the two pieces of code performed as follows:

Total time spent: 0:06:38.639000
Total time spent: 0:06:49.947000
Total time spent: 0:06:50.136000
Total time spent: 0:06:50.752000
Total time spent: 0:06:47.260000
Total time spent: 0:06:37.820000
Total time spent: 0:06:33.018000
Total time spent: 0:06:46.054000
Total time spent: 0:06:42.338000
Total time spent: 0:06:37.777000
Total time spent: 0:06:37.989000
Total time spent: 0:06:40.777000
Total time spent: 0:06:37.321000
Total time spent: 0:06:39.519000
Total time spent: 0:06:47.838000

Total time spent: 0:06:27.871000
Total time spent: 0:06:36.712000
Total time spent: 0:06:50.259000
Total time spent: 0:06:50.623000
Total time spent: 0:06:52.055000
Total time spent: 0:06:49.951000
Total time spent: 0:06:38.672000
Total time spent: 0:06:46.564000
Total time spent: 0:06:47.910000
Total time spent: 0:06:40.730000
Total time spent: 0:06:50.151000
Total time spent: 0:06:39.437000
Total time spent: 0:06:40.821000
Total time spent: 0:06:40.915000
Total time spent: 0:06:24.477000

As you can see, the two pieces of code perform virtually neck-and-neck in this capacity. I had to wonder though, if the print statement could make a difference of two minutes based upon whether or not each number had it’s own line what kind of difference would be made by not printing the numbers at all. The results are as follows:

Total time spent: 0:00:09.738000
Total time spent: 0:00:09.724000
Total time spent: 0:00:09.350000
Total time spent: 0:00:09.368000
Total time spent: 0:00:09.731000
Total time spent: 0:00:09.623000
Total time spent: 0:00:09.491000
Total time spent: 0:00:09.563000
Total time spent: 0:00:09.660000
Total time spent: 0:00:09.625000
Total time spent: 0:00:09.725000
Total time spent: 0:00:09.657000
Total time spent: 0:00:09.648000
Total time spent: 0:00:09.299000
Total time spent: 0:00:09.640000

Total time spent: 0:00:10.263000
Total time spent: 0:00:10.479000
Total time spent: 0:00:09.982000
Total time spent: 0:00:09.893000
Total time spent: 0:00:09.951000
Total time spent: 0:00:10.188000
Total time spent: 0:00:10.161000
Total time spent: 0:00:09.953000
Total time spent: 0:00:10.155000
Total time spent: 0:00:10.136000
Total time spent: 0:00:09.848000
Total time spent: 0:00:10.168000
Total time spent: 0:00:10.123000
Total time spent: 0:00:10.112000
Total time spent: 0:00:10.080000

Here, we are able to see not only the performance hit caused by printing the numbers (from 6 minutes to 10 seconds!), but that there is an average difference of about 1 second between the two pieces of code, with the generator pulling ever so slightly ahead.

Collatz Conjecture in Python

After a brief comment thread on a pretty cool blog I decided to post a little bit more about generators. We were discussing the viability of using a generator for finding the sequence of numbers following the Collatz Conjecture for any number. For those unfamiliar with the Collatz Conjecture, it is as follows from Wikipedia:

Consider the following operation on an arbitrary positive integer:
If the number is even, divide it by two.
If the number is odd, triple it and add one.
In modular arithmetic notation, define the function f as follows:

Now, form a sequence by performing this operation repeatedly, beginning with any positive integer, and taking the result at each step as the input at the next. [… It will eventually end in 1 …]

Or more aptly put by

The original code in question was well written, but relied upon recursion to perform it’s function. I proposed that perhaps a generator would be a better way to test the conjecture (which by the way has neither been proven or disproved). Here is the code I came up with for the generator:

def collatz(start):
    while start != 1:
        yield start
        if start > 1:
            if start % 2 == 0:
                start = start / 2
                start = start * 3 + 1
    yield 1

if __name__ == "__main__":
    for i in collatz(10):
        print i

My first “real” Python Program pt. 1

Whenever I start learning a new programming language, I like to start off with a project to get used to the syntax of the language without having to deal with finding holes in my logic. Typically, this program has been a simple “Mad-Libs” style program in which I prompt the user for some words and insert them into a (sometimes) hilarious story. For the most part in the past, these have been statically written stories in which I hard code variable names into a string and write custom prompts for each variable. When starting out in Python, I decided this would not be enough of a challenge. Somewhere between PHP and C# in my language path I hit a point where programming started to make sense on a deeper level. Adding new object types to my repertoire was becoming an everyday occurrence, I began branching out from what I had been taught in programming classes, and creating and dealing with objects dynamically in loops without ever knowing their names became the norm, and no longer blew my mind.

Python “yield” and Project Euler problem 2

When working on Project Euler problem 2, I stumbled on a unique feature of Python: the yield keyword. The yield keyword allows for the creation of generators. Generators are defined much like functions, but have the added bonus of being able to return multiple values over the course of their execution. For example, lets say I wanted to make a function that counted from 0 to infinity, and kept it’s state between uses. The following generator would do just that:

def counter():
     a = 0
     while True:
          yield a
          a += 1

Simple isn’t it? Iterating through it is simple too. Let’s say we want to get the first 10 numbers, and then the next 20, with a break in between:

if __name__ == "__main__":
     x = counter()
     for i in x:
          if i < 10:
               print i
     print "Broken out of counter"
     for i in x:
          if i < 30:
               print i

The output would be:

Broken out of counter

Now to make this relevant. Project Euler problem 2 is as follows:

Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be:

1, 2, 3, 5, 8, 13, 21, 34, 55, 89, …

Find the sum of all the even-valued terms in the sequence which do not exceed four million.

So what we need to do is find a way to take the starting values of 0 and 1, and as each iteration passes, set a to the value of b, and b to the value of a+b, returning a each time. Sounds like a job for generators!

def fibonacci(max):
     a, b = 0, 1
     while a < max:
          yield a
          a, b = b, a+b

Next, we loop through each of the returned values, checking if it is even. If so, add it to a running total.

def sumEvenFibonacciNumbers(max):
     number = 0
     for n in fibonacci(max):
          if n%2 == 0:
               number += n
     return number

if __name__ == "__main__":
     print sumEvenFibonacciNumbers(4000000)

There you have it, using generators Project Euler problem 2 was a breeze!

Reading RSS feeds with Magpie RSS

Last semester I was taking a client-side programming class and ended up attempting to design a Twitter-esque web app using PHP and AJAX. The goal of the website was to create a twitter like interface for a news aggregation website. While my team and I were all competent programmers, there was one problem that we were all having a hard time solving at first: How to add realistic looking articles to our database while on a serious time crunch. After a little bit of debate and quite a bit of head-against-wall banging, we came upon the idea of pulling RSS feeds from major news organizations and parsing them out into byte-sized pieces (typically just the headline). We decided to use 3 major news organizations for simplicity’s sake since each RSS feed was just a little bit different, and we ended up going with NPR, CNN, and MSNBC World News.

Getting Started with Project Euler

If you haven’t already, go ahead and sign up at Project Euler. Once you have logged in, head on over to the Problems page. I recommend sorting the problems by ascending difficulty, as this will allow you to tackle the problems in a (more or less) gentle slope, at least at first. Let’s take a look at the first problem:

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.

Find the sum of all the multiples of 3 or 5 below 1000.

So where do we get started? The key to this problem is the modulo operator. For those who are unfamiliar with the modulo operator (notated in many languages as % ), it divides two numbers and returns the remainder. So, if a number divides evenly by 3 or 5, we want to add it to a running total. So, if we were to look at this problem in language-agnostic pseudo-code we would get:

for each number between 0 and 1000
     if number/3 evenly, or number/5 evenly
          add the number to a running total
print the total to the user

With that, you should be able to complete this problem fairly quickly. If you would like to see this problem implemented in a few programming languages, hit the Read link below:

Project Euler and Learning Python

A few months ago I decided it was time to start learning Python. I had heard great things about python,

The Joy of Python

It's the Python

and I couldn’t in good conscience let an opportunity pass to learn a new language. I grabbed a copy of “Dive Into Python” from the internet (see the link on my sidebar) and got started. I spent a few days writing simple apps going along with the book, but got bored fairly quickly. As someone who has conquered several programming languages at this point, it is difficult to find your mind stimulated by programs designed for new programmers. After a few Google searches, I stumbled upon Project Euler. For those who are not familiar with Project Euler, it is a collection of language-agnostic programming/math problems that are designed to help programmers keep (or develop) their edge. Each problem is designed in such a way that an efficient algorithm should provide the correct answer in 60 seconds or less in any language. I decided to give it a go, and have enjoyed myself greatly in the process. I plan on attempting to complete each challenge in a variety of languages, in an attempt to compare and contrast each language. How many of you have tried Project Euler, and what has your experience been?