How to pick a Technical interview question

Filed Under teknolojia by | Comments Off on How to pick a Technical interview question

September marks the end of Summer Intern season and the beginning of next year’s recruiting season. This is usually the time when recruiters would be sending schedules for on-campus visits. Its also a time when interviewers brush up on their skills and we train new interviewers. One of the questions I regularly get from new interviewers is “how do I pick the questions to ask?”.

My take on this is that the interview is an information transfer process. There is some information (how well the candidate would do on the job) that the candidate holds and the interaction is meant to discover this. There is a limited transfer window – as the interviewer you only get 1hr worth of data. You can influence how much of that data is signal (candidate could work well on the team) vs. noise (candidate was nervous or intimidated).

Preamble

The first step to maximizing signal transfer is to make sure that the candidate is comfortable. Interviews are stressful, and the candidate will be more nervous than they would be in their day-to-day job. Get them settled and as close to their “resting state” as possible. This is where you ask ice-breaker questions about what areas they find interesting, or projects that they have worked on in the past.

Main

The bulk of time in a technical interview is spent on the problem-solving session. It may involve asking the candidate to solve a coding challenge, a design challenge or a series of behavioral questions. I prefer to make this a collaborative process so that its clear to the candidate that they can ask questions and pitch alternatives, just as they would working with someone on the team.

The choice of question here is very important. For most engineers, I find the signals we want to extract are:

  1. Do they have a systematic problem-solving approach?
    1. Is the candidate actually able to describe a solution to the problem?
    2. Once they have an understanding in their mind, are they able to translate this into code?
    3. How do they react if they get stuck at some point?
    4. Do they think of how their code would be tested? Do they account for edge cases?
  2. How well do they understand the underlying mechanisms of computing?
    1. Do they have an understanding of computing fundamentals (data structures and algorithms, complexity theory)?
    2. Do they understand the trade-offs (e.g. time vs. space)?
    3. For more senior candidates, do they understand how the underlying runtime works?

The best way I’ve found to get these signals is to ask a problem which, based on the candidate’s profile, they can describe a solution in words within less than 5 minutes. That way we reserve the rest of the time on the actual coding, testing etc. For an intern candidate, you may ask “Given a tree, write a function to tell whether its balanced”. This is solvable on paper pretty easily, as long as they know the definition of “tree” and “balanced”. Translating that into code will exercise multiple skills.

You also want a question that is can be solved in multiple ways. The key is to avoid questions that have a “gotcha” moment where the candidate has to realize the ONE way of solving this. If they do not reach that realization in a few minutes, a significant portion of the interview is spent trying to guide them to that point.

For senior candidates, the ideal question has natural extension points. That way, if the candidate is able to solve the basic version of the problem, you can dig into the extension points.

Once you’ve identified a good question, you should re-use it for multiple candidates. This will help you calibrate how hard the problem is for candidates at various levels. It also helps you understand what areas candidates are likely to get stuck.

Kenya’s Chief Justice selection

Filed Under siasa by | Comments Off on Kenya’s Chief Justice selection

Kenya is going through the process of selecting the next chief justice. Candidates for the post are being interviewed by a panel of prominent lawyers in a public hearing. This is a new selection process – previously the chief justice was appointed by the president and did not need to be accountable to anyone else. You can tell that the lawyers are enjoying digging into the (often ill-prepared) judges. Ahmednasir, the former law society of Kenya chairman had this question for Lady Justice Ang’awa:

Your judgments consist of one line or one paragraph rulings, skeleton in nature and lacks depth. It is evident you have a problem writing in prose. Do you think writing in poetry will help or capture the essence of justice?

Reading the coverage of the hearings is quite entertaining and also illuminating. The interviewers are bringing up particular cases that the judges handled and putting them to task for decisions handed suspiciously in favor of connected politicians or businessmen. The best part of this is that the judges never thought they would have to explain their decisions and some are still indignant when asked to.

Going back to Ang’awa – how does a judge of the High Court make a career of issuing one-line rulings? Sure, it looks good for her statistics that she clears many cases. At this level, though, the goal of a judge is not simply to adjudicate conflicts – they interpret the law, provide guidance to lower courts  and establish precedent which will guide future courts. How is one to infer a legal doctrine from one-line rulings? My advice for the justice is to branch out. Try some verses as Ahmednasir suggested. Maybe combine the love for brevity with poetic skills and write some haiku.

Two numbers

Filed Under hisabati by | Comments Off on Two numbers

Here is an interesting puzzle that was posted to one of the discussion groups at work.

A teacher tells two students that he is thinking of two natural numbers greater than 1.  He tells the first student the product of the two numbers and the second one their sum.  The students then have the following dialog:

 

     First: I do not know their sum.

     Second: I knew that.  Their sum is less than 14.

     First: I knew that, and now I know the numbers.

     Second: So do I.

 

What were the numbers and how did they figure them out?

 

Solution

Let’s say that the two numbers that the teacher is thinking of are a and b. Lets say that the product a*b = n and the sum a+b = m.

When the first student gets the product n, it is fair to assume that he tries to factor n to find out the two numbers. There may be multiple ways of factoring n into two integers greater than 1 so we can assume he finds all such pairs.

When the second student gets the sum m, he tries to find all integer pairs {i,j} such that i+j = m.

Both students have their lists of candidate pairs before they start their conversation.  We can now step through their statements.

 

First student says: ” I do not know their sum”

Remember that the first student has the product n and has determined all pairs {i,j} such that i*j=n. If he cannot find the sum, this implies that there are multiple ways of factoring n into two integers. Otherwise, if there was only one way of factoring n into two, he would have a single pair {i,j} and would be able to determine the sum.

If n cannot be factored  uniquely into two numbers, then the target numbers a and b cannot both be prime (maybe one or zero of them is prime). Otherwise, if a and b were both prime, then there would only be one way of factoring n into two.

Second student says: “I knew that the first student could not know the sum”

Recall that second student has the sum m and has enumerated all pairs {i,j} such that i+j = m. Since he has these pairs, he can calculate all the possible products of i*j and this will help him know what kind of info the first student has.

When the first student says that he cannot uniquely determine the sum, it confirms what second knew. In second student’s list of pairs {i,j} , there is no pair such that, given the product, you can uniquely identify the sum.

The only way that second student could have known for sure that the first student could not determine the sum is if on the list of pairs that second student has, there is no pair {i,j} such that both i and j are prime.

In other words,  m is an integer that cannot be created by adding two primes.

 

Second student says: The sum is less than 14

This tells us that 1 < m < 14. From the previous section , we also know that m cannot be created by adding two primes. This tells us that m = 11.

The derivation is as follows:

m is in the set {2,3,4,5,6,7,8,9,10,11,12,13}

2 is out because the only {i,j} such that i+j = 2 is {1,1} and we know that both candidate numbers are greater than 1.

3 is out because the only {i,j} s.t. i+j=3 is {1,2}

4 is out because it can be created by 2+2 where 2 is prime

5 is out. It can be created by 2+3

6 is out. It can be created by 3+3

7 is out. It can be created by 2+5

8 is out. It can be created by 3+5

9 is out. It can be created by 2+7

10 is out. It can be created by 3+7

12 is out. It can be created by 5+7

13 is out. It can be created by 2+11

 

First student says: “I knew that the sum m < 14”

Remember that the first student has the product n and has determined all pairs {i,j} such that i*j=n.  Since he has the pairs, he has also summed them and he knows for sure that for all {i,j}, i+j <14.

In other words, n is an integer that can be factored into two in multiple ways but however you do it, the two numbers you end up with always add up to less than 14.

This is the final piece. The final derivation is as follows:

We know that  i and j are both numbers in {2,3,4,5,6,7,8,9,10,11,12,13} (i.e. neither i nor j can be greater than 14)

We can enumerate all pairs {i,j} such that i+j < 14

{2|2,3,4,5,6,7,8,9,10,11}

{3|3,4,5,6,7,8,9,10}

{4|4,5,6,7,8,9}

{5|5,6,7,8}

{6|6,7}

Where the notation {x|y,z} is shorthand for the pairs “the pairs where the smaller number is x and the larger is either y or z” (i.e. {x,y} and {x,z})

We can then eliminate all pairs where i and j are both prime. This leaves.

{2|4,6,8,9,10}

{3|4,6,8,9,10}

{4|4,5,6,7,8,9}

{5|6,8}

{6|6,7}

 

Consider the products of pairs i*j

 

4

5

6

7

8

9

10

2

8

 

12

 

16

18

20

3

12

 

18

 

24

27

30

4

16

20

24

28

32

36

 

5

 

 

30

 

40

 

 

6

 

 

36

42

 

 

 

Eliminate all products that can be created by multiplying numbers that add up to >=14. This eliminates 24, 28, 30, 32, 36, 40 and 42 because :

36 = 18*2, and 18+2>14

30 = 15*2, and 15+2 > 14

24 = 12*2, and 12+2 = 14

28 = 14*2

42=21*2

32=16*2

40=20*2

 

This gives us the grid:

 

4

5

6

8

9

10

2

8

 

12

16

18

20

3

12

 

18

 

27

 

4

16

20

 

 

 

 

Which gives the pairs

{2|4,6,8,9,10}

{3|4,6,9}

{4|4,5}

 

We already know that i+j = 11. Looking at our pairs, we determine that the answer is {2,9}

Problem 92 from Project Euler asks us to find the number of non-Happy numbers under 10 million:

A number chain is created by continuously adding the square of the digits in a number to form a new number until it has been seen before.

For example,

44 -> 32 -> 13 -> 10 -> 1 -> 1
85 -> 89 -> 145 -> 42 -> 20 -> 4 -> 16 -> 37 -> 58 -> 89

Therefore any chain that arrives at 1 or 89 will become stuck in an endless loop. What is most amazing is that EVERY starting number will eventually arrive at 1 or 89.

How many starting numbers below ten million will arrive at 89?

The numbers that have a sequence ending in 1 are called the ‘Happy’ numbers (making the rest non-Happy).

As with most Project Euler problems, this could be solved by brute force. Following the definition of the sequences, we can run through all numbers from 1 to 10 million. For each number, split it into its constituent digits, square each digit and sum them all. If the sum of squares of digits is 89, we’ve found a match. Otherwise, test the new number for a match (split digits, square etc). The problem with this approach is that you will have to evaluate at lest 10 million numbers. In a naive implementation, you would actually end up looking at some numbers multiple times. For example, in the series starting at 85 above, you would end up looking at 145 and you may look at it again when you consider the series starting at 145. Clearly, this approach does not scale.

The first optimization we can make is ensure we evaluate a number only once. To do this, we can store a lookup table with all the numbers that have a series ending in 89. If we evaluate a series that ends in 89, we know that any numbers in the series ends in 89 (i.e. 85,145,20 etc from above).

The second optimization  comes when we notice that the largest possible sum of the squares of digits in a number for our test is (92)x7 = 567 (from the number 9999999). This reduces the number of sequence evaluations we have to make.  We can construct our lookup table showing numbers between 0 and 567 that end in 89. Then for each number in our test range from 0 to 10 million, we simply calculate the sum of squares of digits and check our lookup table to see if the value would terminate in 89.

The third optimization is based on the realization that the sum of squares of digits function does not change if you reorder the digits. That is, the sum of squares for 123 is the same as for 213, 312, 321 etc. So if the sequence starting at 123 ends at 89, we know that 312, 213 etc will end at 89 as well. Formally, this means that the sum-of-squares-of-digits function partitions the set of integers into Equivalence classes such that a~b if a and b contain the same digits. Using this optimization,we only need to check sequences for 11,440 numbers – almost 1000 times smaller than the 10 million checks that the brute force approach would have us make. However, to use this approach, we need to determine the size of each equivalence class. That way, if we determine that that a particular number ends in 89, then we can update our count of matching numbers by adding the full size of the numbers equivalence class.

We turn to combinatorics to determine the size of each class. For any one number, its equivalence class is constructed by creating permutations of the digits. The size of the class is therefore given by the number of unique permutations of digits in a sequence. We need to account for repetitions of indistinguishable objects: that is, given the sequence 100335, we need to account for that the the 3s and 0s can be interchanged in permutations without resulting in different integers. The formula and reasoning for such a count is explained at http://www.andrews.edu/~calkins/math/webtexts/prod02.htm under the title “Permutations with repeated elements”. The eventual formula is;

nPr1 r2rk = n! /( r1!r2!…rk!)

where we are arranging n elements where the first element occurs r1 times, the second r2 times …

We still need to figure out how to make sure we pick one (and only one) element from each equivalence class for our sequence calculations. Each class is composed of numbers that contain the same digits in some order. We can represent each class by creating a string with the non-decreasing sequence of digits that define the class. Generating all strings of of length 7 with non-decreasing sequence of digits will give us exactly one representative from each equivalence class.

With these optimizations, we solve the problem in 0.5 seconds using python. We can also count all the non-Happy numbers under 10^100 in under a minute. We can further reduce our computations  by noticing that there are much fewer happy numbers than unhappy numbers – there are 20 happy numbers under 100. Since all numbers are either Happy or non-Happy, we can determine the number of non-Happy numbers by counting happy numbers and subtracting from the count of numbers in range.

Nerdy t-shirts

Filed Under sayansi, tarakirishi by | Comments Off on Nerdy t-shirts

Working at a software company, I’ve gotten to see a fair share of nerdy t-shirts. There are the those celebrating various video games, software launches etc. Then there are the programming references (“Wanna grab a byte?”).

Probably nobody does programming humor better than XKCD – their “my code is compiling” and “sudo make me a sandwich” shirts are pretty common. This week, I saw two shirts that are in contention for the title of “Most Awesome T-shirt Ever”.

The first had a decidedly complicated looking equation on the front, giving the value of Vs. The caption read “Airspeed velocity of an unladen swallow“. Priceless. I was tempted to ask if the swallow was African or European.

The second had a big bold “i2” on the front with the caption “keeping It real“. That made my day!

Presence Africaine

Filed Under filosofia by | Comments Off on Presence Africaine

The Surreptitious Speech: Présence Africaine and the Politics of Otherness ...I recently finished reading “The Surreptitious Speech: Présence Africaine and the Politics of Otherness “ , a compilation of essays edited by Valentine Mudimbe. The collection celebrates 40 years of the journal Presence Africaine.

Mudimbe writes a very engaging summary at the end of the compendium, in which he lays the tone of the compilation and gives his opinion on Presence.

In the span of forty years, the journal and publishing house Presence Africaine succeeded in organizing a new literary and intellectual space for “a surreptitious speech.” This space is not the other side of what we may call the Western space. In fact, it belongs to that space, though it is true that from the beginning Presence defined itself on the margin of this center it challenged.

There was a political reason for this. André Gide made it explicit in the first issue of the journal: why should Presence speak according to the expectations of a culture that was violating what Presence wanted to promote?

The idea of constructing space struck me as a very algebraic operation (we construct spaces in algebra all the time) and that got me excited. Theoretical spaces are a powerful tool for reasoning in Mathematics and in many sciences. How would this tool be useful in such a “concrete” field of study as African Studies?

Mudimbe makes it clear that the space created by Presence is a theoretical space, much like the mathematical one.

A space is always a construct. It is a theoretical articulation that claims to render and represent operations or, put simply, the reality of a place, that is, a primary experience. A space is, to say the least, a second-order plane reflecting upon a first-order practice of life and human experience.

In algebra, we use spaces to help us model some aspect of reality that is the subject of study. For instance, we may model the forces acting on an object as vectors in a 3 dimensional plane. In doing so we ignore some of the complexities of the real world and escape into the clean world of 3 dimensional vector spaces.

A notable difference between intellectual movement theoretical space and the mathematical modeling space is that the intellectual space influences the reality it models. It seems to be an active space where the mathematical space is passive (the fact that we model some scenario as vectors acting in a 3 dimensional space does not actually mean that it happens that way). In a Schrödinger-esque manner, the fact that an intellectual movement space studies a certain society changes the reality of that society.

This second-degree organization, by its very being, considerably alters and transforms the primary logic in which it claims to root itself. To that extent, its narratives as well as its postulations invent “what is really out there’ in the field of everyday place. Methods of faithfully expressing the place, at least in social and human sciences, undergo regular transformations in order to reflect better the reality of an experience and its complexity.

The idea of using spaces as a metaphor comes together wonderfully in the next paragraph. Mudimbe makes explicit the contending cultural theoretical spaces and shows how herculean Presence’s task was.

Indeed, after reading the contributions to this volume, one could deduce that, until the founding of Presence Africaine, African cultures and their designations were submitted to a European space that actualized them as figures of its own past, precisely as anterior to the rupture that radically separated prehistory from history. The memory of the European space would appear thus, diachronically and synchronically, as the paradigm of human experience and, at the same time, as that which historically has muted all other human differences by reducing them to the project of an evolutionary becoming. In this perspective, Presence Africaine could appear to signify the unthinkable: an otherness spatializing itself from a nowhere that could not but be a utopian project. In effect, its surreptitious voice faces Western culture in the name of an absolute alterity; yet this very alterity seems to spring from the Western space.

Microsoft does a poor job of advertising new products, especially those in beta. Often this leads to the impression that Microsoft does not innovate. Here are some new products I’ve been playing with/using that are pretty cool.

  1. PowerPoint Plex (pptPlex) from Office Labs (http://www.officelabs.com/projects/pptPlex/Pages/default.aspx).If you’ve ever wished your slideshow could be something more than a linear series of slides, this is for you. You can can arrange your slides on a canvas, zoom in and out of slides and set your path for navigating through the slides. This means that you can have long code samples in your presentations and zoom into the sections you need as you need them. I can’t imagine going back to linear slides.
    An earlier version of Plex was used in this demo of Touch Wall.
  2. Script# (http://projects.nikhilk.net/ScriptSharp)This is the result of an independent project by Nikhil Kothari, an software architect in the .Net developer platform. Using the script# compiler, you can compile C# code into JavaScript (it does with C# what Google Web Toolkit does for Java). Here’s a blurb from Nikhil’s site.

    Script# brings productivity to Ajax and JavaScript development. Script# is a free tool that enables developers to author C# source code and subsequently compile it into regular script that works across all modern browsers, and in doing so, leverage the productivity and power of existing .NET tools as well as the Visual Studio IDE. Script# empowers you with a development methodology and approach that brings software engineering, long term maintainability and scalable development approaches for your Ajax applications, components and frameworks.

    With Script#, you can design your AJAX applications just as you would .Net applications – complete with object inheritance, namespaces, interfaces, event receivers, private methods etc. The JavaScript produced is browser independent, meaning that you get insulated from all the little browser quirks.

    You also get a lot of code re-use, just as with regular OOP. Script# provides its own base libraries for Ajax development, mirroring the libraries provided in the .Net platform. Alternatively, you can use base libraries from ASP.Net AJAX, or third party platforms like ExtJS(via Ext#). Script# produces a DLL with the JavaScript at compilation so your DLLs can be used in future projects as reference assemblies.

  3. Photosynth(http://photosynth.net/)This product from Microsoft Live Labs takes  set of photos and constructs a 3D model. The effects are pretty astounding.
  4. Songsmith(http://research.microsoft.com/en-us/um/redmond/projects/songsmith/index.html)This falls neatly into the toy category. This product from Microsoft Research allows you to hum or sing a tune and it generates an accompaniment. ArsTechnica had a really good article on this. There are some YouTube videos providing songsmith re-arrangements of classic tunes. My favorite is the remake of Roxanne by the Police.

Summing numbers that cannot be written as a sum of two abundant numbers

Filed Under hisabati, project euler by | Comments Off on Summing numbers that cannot be written as a sum of two abundant numbers

Problem 23 from Project Euler asks for the sum of numbers that cannot be written as the sum of two abundant numbers:

A perfect number is a number for which the sum of its proper divisors is exactly equal to the number. For example, the sum of the proper divisors of 28 would be 1 + 2 + 4 + 7 + 14 = 28, which means that 28 is a perfect number.

A number whose proper divisors are less than the number is called deficient and a number whose proper divisors exceed the number is called abundant.

As 12 is the smallest abundant number, 1 + 2 + 3 + 4 + 6 = 16, the smallest number that can be written as the sum of two abundant numbers is 24. By mathematical analysis, it can be shown that all integers greater than 28123 can be written as the sum of two abundant numbers. However, this upper limit cannot be reduced any further by analysis even though it is known that the greatest number that cannot be expressed as the sum of two abundant numbers is less than this limit.

Find the sum of all the positive integers which cannot be written as the sum of two abundant numbers.

When I read the problem, the requirements seemed to be pretty straight-forward. We need to find all the abundant numbers in the range so that we can determine all the numbers that are the sum of two abundant numbers. My first(naive) approach to this was terribly inefficient – it ran for hours and didn’t give the right answer. We cannot get around the need to determine the abundant numbers in range. However, we can reduce the search space when determining the numbers that are expressed as sums of two abundant numbers. To do this, we take advantage of some known properties:

  1. The upper bound for numbers that cannot be expressed as the sum of two numbers is actually 20161 not 28123 (as shown here).
  2. All even numbers greater than 48 can be written as the sum of two abundant numbers (also shown in the article here).

With these facts in mind, we are ready for the implementation. The first routine determines the abundant numbers in range.

  1: def abundant_nums_in_range(n):
  2:     abundant=[]
  3:     min_abundant_not_multiple_of_2_or_3 = 5391411025
  4:     for i in range(1,n):
  5:         if(i < min_abundant_not_multiple_of_2_or_3):
  6:             if(i % 2 != 0) and (i%3 != 0):
  7:                 continue
  8:         divisors = proper_divisors(i)
  9:         divisors_sum = 0
 10:         for d in divisors:
 11:             divisors_sum += d
 12:         if(divisors_sum > i):
 13:             abundant.append(i)
 14:     return abundant
 15: 

The routine that determines proper divisors of a number tests divisors from 1 to the square-root of the number and extracts all divisors in pairs.

  1: def proper_divisors(n):
  2:     divisors =[]
  3:     maxValue = int(sqrt(n))+1
  4:     for i in range(1,maxValue):
  5:         if(n % i == 0):
  6:             if(i not in divisors):
  7:                 divisors.append(i)
  8:             quotient = n/i
  9:             if(quotient == n): continue
 10:             if(quotient not in divisors):
 11:                 divisors.append(quotient)
 12:     return divisors
 13: 

The main routine for the solution is the method that determines all numbers in range that can be expressed as sum of two numbers. Since we know that all even numbers greater than 48 are in our target set, we only need to determine odd numbers in range. The only way to get an odd number by summing two numbers is to sum an even number with an odd.

  1: def abundant_pair_sums_in_range(n):
  2:     abundant_nums = abundant_nums_in_range(n)
  3:     even_abundants = []
  4:     odd_abundants = []
  5:     for a in abundant_nums:
  6:         if(a %2 == 0):
  7:             even_abundants.append(a)
  8:         else:
  9:             odd_abundants.append(a)
 10:     abundant_sum_nums = []
 11:     num_even_abundants = len(even_abundants)
 12:     num_odd_abundants = len(odd_abundants)
 13:     curr_num = 0
 14:     abundant_nums_filter = [0 for x in range(0,n)]
 15:     # all even numbers >48 are the sum of two abundants
 16:     for a in range(0,n):
 17:         if(a % 2 ==0):
 18:             if(a > 48):
 19:                 abundant_nums_filter[a] = 1
 20:             elif a in [24,30,32,36,38,40,42,44,48]:
 21:                 abundant_nums_filter[a] = 1
 22:     for i in range(0, num_even_abundants):
 23:         for j in range(0, num_odd_abundants):
 24:             curr_num = even_abundants[i] + odd_abundants[j]
 25:             if(curr_num < n):
 26:                 abundant_nums_filter[curr_num] = 1
 27:     for a in range(1,n):
 28:         if abundant_nums_filter[a] == 1 :
 29:             abundant_sum_nums.append(a)
 30:     return abundant_sum_nums
 31: 

The last routine ties all operations together.

  1: def get_non_abundant_sums(n):
  2:     abundant_sums = abundant_pair_sums_in_range(n)
  3:     sum_all = sum([x for x in range(0,n)])
  4:     sum_abundants = sum(abundant_sums)
  5:     return (sum_all - sum_abundants)
  6: 

Problem 26 of Project Euler asks us to find the length of cycles in unit fractions:

A unit fraction contains 1 in the numerator. The decimal representation of the unit fractions with denominators 2 to 10 are given:

1/2 = 0.5
1/3 = 0.(3)
1/4 = 0.25
1/5 = 0.2
1/6 = 0.1(6)
1/7 = 0.(142857)
1/8 = 0.125
1/9 = 0.(1)
1/10 = 0.1

Where 0.1(6) means 0.166666…, and has a 1-digit recurring cycle. It can be seen that 1/7 has a 6-digit recurring cycle.

Find the value of d < 1000 for which 1/d contains the longest recurring cycle in its decimal fraction part.

The decimal representations of the fractions and therefore cycles can be discovered using long division. A python implementation of this would be:

  1: def unit_fraction_decimal_rep(n):
  2:     numerator, denominator = 1, n
  3:     fraction = []
  4:     remainders = {}
  5:     while True:
  6:         numerator *= 10
  7:         r = numerator % denominator
  8:         q = (numerator - r)/denominator
  9:         if(r == 0):
 10:             fraction.append(q)
 11:             break
 12:         if(r in remainders.values()):
 13:             foundCycle = False
 14:             for key,value in remainders.items():
 15:                 if(r == value) and (q == int(fraction[key])):
 16:                     # mark the cycle with parenthesis
 17:                     fraction.insert(key,"(")
 18:                     fraction.append(")")
 19:                     foundCycle = True
 20:                     break
 21:             if foundCycle:
 22:                 break
 23:         remainders[len(fraction)] = r
 24:         fraction.append(str(q))
 25:         numerator = r
 26:     return "0."+"".join(fraction)

To solve the problem posed, we could generate fractional representations of all integers from 1000 down and check for the longest cycles. However, looking at the algorithm above we can figure out how long a cycle will be without actually having to calculate it.

The terms in the decimal representation are determined by the remainder in each iteration of the loop starting on line 5 above. If one of the remainders is 0, then the fraction terminates. On the other hand, if we see a remainder that we have seen previously, then we have gotten to the end of a cycle. The maximum length of a cycle is n-1 since there are only n-1 possibilities for the remainder.

In each iteration of the loop on line 5, we multiply the numerator by 10. This establishes a relationship between 10 and n. We use a little algebra to explore the relationship.

  1. If the fraction terminates, then at some point we get a remainder of 0 meaning that n evenly divides a power of 10. Since 10 has only two divisors, 2 and 5, n evenly divides a power of 10 if and only if n = 2ax5b where a,b are non-negative integers (a or b may be 0 in which case n is either only divisible by 2 or by 5).
  2. The fraction will recur if n does not evenly divide any power of 10.
    • If n is has no factors in common with 10 (abbreviated as n is coprime with 10), the length of the recurring cycle is equal to the order of 10 in the group Zn (the multiplicative group of integers modulo n). If 10 has order d then 10d mod n = 1 and 1 becomes the first remainder repeated (since we start with numerator 1).
    • If n is not coprime with 10, there is no d for which 10d mod n = 1. In such cases n= 2ax5bxm where m is coprime with 10. So 1/n = 1/(2ax5b) x 1/m . The component 1/(2ax5b) will terminate so the cycle that results in 1/n is contributed by 1/m and the length of the cycle is equal to the order of 10 in Zm.

Applying this, we come up with the algorithm below:

  1: def recurring_cycle_length(n):
  2:     cycleLen = 0
  3:     #  remove powers of 2 and 5 in n
  4:     while (n % 2 == 0):
  5:         n /= 2
  6:     while (n % 5 == 0):
  7:         n /= 5
  8:     if n > 1:
  9:         a = 10 % n
 10:         cycleLen = 1
 11:         while a != 1:
 12:             a *= 10
 13:             a %= n
 14:             cycleLen += 1
 15:     return cycleLen

The longest cycle will be generated when 10 has a high order in Zn or Zm. So when searching for the longest cycle we start with the maximum value of n in the range and move down. If n is prime, the order of 10 in the Zn is either n-1 or a divisor of n-1. If we find a max value for the period which is equal to n-1, we have found the longest cycle.

  1: def max_len_recurring_cycle_in_range(n):
  2:     # iterate from the max num down
  3:     maxCycleLen = 0
  4:     maxCycleLenGenerator = n
  5:     for i in range(n, 1, -1):
  6:         currLen = recurring_cycle_length(i)
  7:         if(currLen > maxCycleLen):
  8:             maxCycleLen = currLen
  9:             maxCycleLenGenerator = i
 10:         if(currLen == i-1):
 11:             break
 12:     return maxCycleLenGenerator, maxCycleLen

Problem 24 from Project Euler asks us to find the nth lexicographic permutation of a sequence of digits:

A permutation is an ordered arrangement of objects. For example, 3124 is one possible permutation of the digits 1, 2, 3 and 4. If all of the permutations are listed numerically or alphabetically, we call it lexicographic order. The lexicographic permutations of 0, 1 and 2 are:

012   021   102   120   201   210

What is the millionth lexicographic permutation of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9?

My first instinct was to write a routine that would enumerate permutations of a sequence of digits in lexicographic order. This would involve two functions: the first would take a permutation and return the next lexicographic permutation of the digits. The next function would call the first 999999 times, starting with the first sequence (0123456789) and passing intermediate values as it went along. The main function is below.

  1: def next_lexicographic_permutation(currPermutation):
  2:     digits = [int(x) for x in str(currPermutation)]
  3:     maxIndex = len(digits)-1
  4:     prevValue = digits[maxIndex]
  5:     currValue = digits[maxIndex]
  6:     for i in range(maxIndex, -1,-1):
  7:         currValue = digits[i]
  8:         if (i == 0) and (currValue == max(digits)):
  9:             return currPermutation
 10:         if(currValue < prevValue):
 11:             prefix = digits[:i]
 12:             suffix = digits[i:]
 13:             nextStart = min([x for x in suffix if x > currValue])
 14:             suffix.remove(nextStart)
 15:             suffix.sort()
 16:             nextPermutation = "".join([str(x) for x in prefix]) +str(nextStart)+ "".join([str(x) for x in suffix])
 17:             return nextPermutation
 18:         else:
 19:             prevValue = currValue
 20:     return currPermutation
 21: 

While this approach worked, the operation took 30 seconds. After looking at the problem for a while longer and consulting my muse, I realized that we could directly determine the nth permutation. It is an established fact that given a set of m elements, the number of permutations possible is given by m!( read as “m factorial”, defined here). So for our set of 10 digits, there are 10! possible permutations. Of these permutations, 1/10 of them have 0 as the first digit, 1/10 have 1 as the first etc. This comes to 9! = 362,880 permutations starting with any one digit. Since sequences are presented in lexicographic order, we know that the millionth permutation has to start with 2 ( the sequences starting with 0 end at 9! and those starting with1 end at 2*9!).

The millionth permutation of 10 digits will be the 274240th permutation from the set of permutations starting with 2 (i.e. 1000,000 – 2*9!, accounting for the permutations that started with 0 and 1). The problem of determining the 274240th permutation from the set of permutations starting with 2 is equivalent to the problem we’ve just solved for 10 digits, except that this time we are determining the  274240th permutation of the digits 013456789. To solve it,we apply the same algorithm as we did previously giving a rather neat recursive solution to the general problem.

  1: def nth_lexicographic_permutation(initialPermutation, n):
  2:     currPermutation=str(initialPermutation)
  3:     if len(currPermutation) == 1: return initialPermutation
  4:     residue = n
  5:     # number of permutations starting with any one digit
  6:     numSuffixPermutations = factorial(len(currPermutation) - 1)
  7:     outputDigitIndex = 0
  8:     if(numSuffixPermutations < residue):
  9:         outputDigitIndex = residue / numSuffixPermutations
 10:         if(residue % numSuffixPermutations == 0):
 11:             outputDigitIndex -= 1
 12:         residue -= (outputDigitIndex * numSuffixPermutations)
 13:     indexDigit = currPermutation[outputDigitIndex]
 14:     currPermutation = currPermutation.replace(indexDigit,'')
 15:     return indexDigit + nth_lexicographic_permutation(currPermutation, residue)
 16: 

This second solution runs in less than a second!

Next Page →