nᵗʰ ugly number

algorithm math primes factors hamming-numbers

29,842

Solution 1

A simple fast solution in Java. Uses approach described by Anon..
Here TreeSet is just a container capable of returning smallest element in it. (No duplicates stored.)

    int n = 20;
    SortedSet<Long> next = new TreeSet<Long>();
    next.add((long) 1);

    long cur = 0;
    for (int i = 0; i < n; ++i) {
        cur = next.first();
        System.out.println("number " + (i + 1) + ":   " + cur);

        next.add(cur * 2);
        next.add(cur * 3);
        next.add(cur * 5);
        next.remove(cur);
    }

Since 1000th ugly number is 51200000, storing them in bool[] isn't really an option.

edit
As a recreation from work (debugging stupid Hibernate), here's completely linear solution. Thanks to marcog for idea!

    int n = 1000;

    int last2 = 0;
    int last3 = 0;
    int last5 = 0;

    long[] result = new long[n];
    result[0] = 1;
    for (int i = 1; i < n; ++i) {
        long prev = result[i - 1];

        while (result[last2] * 2 <= prev) {
            ++last2;
        }
        while (result[last3] * 3 <= prev) {
            ++last3;
        }
        while (result[last5] * 5 <= prev) {
            ++last5;
        }

        long candidate1 = result[last2] * 2;
        long candidate2 = result[last3] * 3;
        long candidate3 = result[last5] * 5;

        result[i] = Math.min(candidate1, Math.min(candidate2, candidate3));
    }

    System.out.println(result[n - 1]);

The idea is that to calculate a[i], we can use a[j]*2 for some j < i. But we also need to make sure that 1) a[j]*2 > a[i - 1] and 2) j is smallest possible.
Then, a[i] = min(a[j]*2, a[k]*3, a[t]*5).

Solution 2

I am working on finding nth ugly number. Note that these numbers are extremely sparsely distributed as n gets large.

I wrote a trivial program that computes if a given number is ugly or not.

This looks like the wrong approach for the problem you're trying to solve - it's a bit of a shlemiel algorithm.

Are you familiar with the Sieve of Eratosthenes algorithm for finding primes? Something similar (exploiting the knowledge that every ugly number is 2, 3 or 5 times another ugly number) would probably work better for solving this.

With the comparison to the Sieve I don't mean "keep an array of bools and eliminate possibilities as you go up". I am more referring to the general method of generating solutions based on previous results. Where the Sieve gets a number and then removes all multiples of it from the candidate set, a good algorithm for this problem would start with an empty set and then add the correct multiples of each ugly number to that.

Solution 3

My answer refers to the correct answer given by Nikita Rybak. So that one could see a transition from the idea of the first approach to that of the second.

from collections import deque
def hamming():
    h=1;next2,next3,next5=deque([]),deque([]),deque([])
    while True:
        yield h
        next2.append(2*h)
        next3.append(3*h)
        next5.append(5*h)
        h=min(next2[0],next3[0],next5[0])
        if h == next2[0]: next2.popleft()
        if h == next3[0]: next3.popleft()
        if h == next5[0]: next5.popleft()

What's changed from Nikita Rybak's 1st approach is that, instead of adding next candidates into single data structure, i.e. Tree set, one can add each of them separately into 3 FIFO lists. This way, each list will be kept sorted all the time, and the next least candidate must always be at the head of one ore more of these lists.

If we eliminate the use of the three lists above, we arrive at the second implementation in Nikita Rybak' answer. This is done by evaluating those candidates (to be contained in three lists) only when needed, so that there is no need to store them.

Simply put:

In the first approach, we put every new candidate into single data structure, and that's bad because too many things get mixed up unwisely. This poor strategy inevitably entails O(log(tree size)) time complexity every time we make a query to the structure. By putting them into separate queues, however, you will see that each query takes only O(1) and that's why the overall performance reduces to O(n)!!! This is because each of the three lists is already sorted, by itself.

Solution 4

I believe you can solve this problem in sub-linear time, probably O(n^{2/3}).

To give you the idea, if you simplify the problem to allow factors of just 2 and 3, you can achieve O(n^{1/2}) time starting by searching for the smallest power of two that is at least as large as the nth ugly number, and then generating a list of O(n^{1/2}) candidates. This code should give you an idea how to do it. It relies on the fact that the nth number containing only powers of 2 and 3 has a prime factorization whose sum of exponents is O(n^{1/2}).

def foo(n):
  p2 = 1  # current power of 2
  p3 = 1  # current power of 3
  e3 = 0  # exponent of current power of 3
  t = 1   # number less than or equal to the current power of 2
  while t < n:
    p2 *= 2
    if p3 * 3 < p2:
      p3 *= 3
      e3 += 1
    t += 1 + e3
  candidates = [p2]
  c = p2
  for i in range(e3):
    c /= 2
    c *= 3
    if c > p2: c /= 2
    candidates.append(c)
  return sorted(candidates)[n - (t - len(candidates))]

The same idea should work for three allowed factors, but the code gets more complex. The sum of the powers of the factorization drops to O(n^{1/3}), but you need to consider more candidates, O(n^{2/3}) to be more precise.

Solution 5

A lot of good answers here, but I was having trouble understanding those, specifically how any of these answers, including the accepted one, maintained the axiom 2 in Dijkstra's original paper:

Axiom 2. If x is in the sequence, so is 2 * x, 3 * x, and 5 * x.

After some whiteboarding, it became clear that the axiom 2 is not an invariant at each iteration of the algorithm, but actually the goal of the algorithm itself. At each iteration, we try to restore the condition in axiom 2. If last is the last value in the result sequence S, axiom 2 can simply be rephrased as:

For some x in S, the next value in S is the minimum of 2x, 3x, and 5x, that is greater than last. Let's call this axiom 2'.

Thus, if we can find x, we can compute the minimum of 2x, 3x, and 5x in constant time, and add it to S.

But how do we find x? One approach is, we don't; instead, whenever we add a new element e to S, we compute 2e, 3e, and 5e, and add them to a minimum priority queue. Since this operations guarantees e is in S, simply extracting the top element of the PQ satisfies axiom 2'.

This approach works, but the problem is that we generate a bunch of numbers we may not end up using. See this answer for an example; if the user wants the 5th element in S (5), the PQ at that moment holds 6 6 8 9 10 10 12 15 15 20 25. Can we not waste this space?

Turns out, we can do better. Instead of storing all these numbers, we simply maintain three counters for each of the multiples, namely, 2i, 3j, and 5k. These are candidates for the next number in S. When we pick one of them, we increment only the corresponding counter, and not the other two. By doing so, we are not eagerly generating all the multiples, thus solving the space problem with the first approach.

Let's see a dry run for n = 8, i.e. the number 9. We start with 1, as stated by axiom 1 in Dijkstra's paper.

+---------+---+---+---+----+----+----+-------------------+
| #       | i | j | k | 2i | 3j | 5k | S                 |
+---------+---+---+---+----+----+----+-------------------+
| initial | 1 | 1 | 1 | 2  | 3  | 5  | {1}               |
+---------+---+---+---+----+----+----+-------------------+
| 1       | 1 | 1 | 1 | 2  | 3  | 5  | {1,2}             |
+---------+---+---+---+----+----+----+-------------------+
| 2       | 2 | 1 | 1 | 4  | 3  | 5  | {1,2,3}           |
+---------+---+---+---+----+----+----+-------------------+
| 3       | 2 | 2 | 1 | 4  | 6  | 5  | {1,2,3,4}         |
+---------+---+---+---+----+----+----+-------------------+
| 4       | 3 | 2 | 1 | 6  | 6  | 5  | {1,2,3,4,5}       |
+---------+---+---+---+----+----+----+-------------------+
| 5       | 3 | 2 | 2 | 6  | 6  | 10 | {1,2,3,4,5,6}     |
+---------+---+---+---+----+----+----+-------------------+
| 6       | 4 | 2 | 2 | 8  | 6  | 10 | {1,2,3,4,5,6}     |
+---------+---+---+---+----+----+----+-------------------+
| 7       | 4 | 3 | 2 | 8  | 9  | 10 | {1,2,3,4,5,6,8}   |
+---------+---+---+---+----+----+----+-------------------+
| 8       | 5 | 3 | 2 | 10 | 9  | 10 | {1,2,3,4,5,6,8,9} |
+---------+---+---+---+----+----+----+-------------------+

Notice that S didn't grow at iteration 6, because the minimum candidate 6 had already been added previously. To avoid this problem of having to remember all of the previous elements, we amend our algorithm to increment all the counters whenever the corresponding multiples are equal to the minimum candidate. That brings us to the following Scala implementation.

def hamming(n: Int): Seq[BigInt] = {
  @tailrec
  def next(x: Int, factor: Int, xs: IndexedSeq[BigInt]): Int = {
    val leq = factor * xs(x) <= xs.last
    if (leq) next(x + 1, factor, xs)
    else x
  }

  @tailrec
  def loop(i: Int, j: Int, k: Int, xs: IndexedSeq[BigInt]): IndexedSeq[BigInt] = {
    if (xs.size < n) {
      val a = next(i, 2, xs)
      val b = next(j, 3, xs)
      val c = next(k, 5, xs)
      val m = Seq(2 * xs(a), 3 * xs(b), 5 * xs(c)).min

      val x = a + (if (2 * xs(a) == m) 1 else 0)
      val y = b + (if (3 * xs(b) == m) 1 else 0)
      val z = c + (if (5 * xs(c) == m) 1 else 0)

      loop(x, y, z, xs :+ m)
    } else xs
  }

  loop(0, 0, 0, IndexedSeq(BigInt(1)))
}

View more solutions

29,842

Author by

Anil Katti

Updated on July 09, 2022

Comments

Anil Katti almost 2 years
Numbers whose only prime factors are 2, 3, or 5 are called ugly numbers.

Example:
```
1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, ... 
```
1 can be considered as 2^0.

I am working on finding n^th ugly number. Note that these numbers are extremely sparsely distributed as n gets large.

I wrote a trivial program that computes if a given number is ugly or not. For n > 500 - it became super slow. I tried using memoization - observation: ugly_number * 2, ugly_number * 3, ugly_number * 5 are all ugly. Even with that it is slow. I tried using some properties of log - since that will reduce this problem from multiplication to addition - but, not much luck yet. Thought of sharing this with you all. Any interesting ideas?

Using a concept similar to Sieve of Eratosthenes (thanks Anon)
```
    for (int i(2), uglyCount(0); ; i++) {
        if (i % 2 == 0)
            continue;
        if (i % 3 == 0)
            continue;
        if (i % 5 == 0)
            continue;
        uglyCount++;
        if (uglyCount == n - 1)
            break;
    }
```
i is the n^th ugly number.

Even this is pretty slow. I am trying to find the 1500^th ugly number.
- SLaks over 13 years
  
  Why are these numbers called ugly numbers?
- SLaks over 13 years
  
  Are you trying to find the nth ugly number or trying to determine whether a number is ugly?
- ruslik over 13 years
  
  In problems with integer arithmetics, avoid using floating point.
- Ben Jackson over 13 years
  
  I bet the people on math would be better suited to answering this.
- Khaled Alshaya over 13 years
  
  +1 Interesting question :) These are called Hamming Numbers: en.wikipedia.org/wiki/Regular_number#Algorithms
- President James K. Polk over 13 years
  
  I think the problem is equivalent to iterating over the exponents (x1, x2, x3) in 2x1 * 3x2 * 5**x3 in such a way so that the products come out in numerical order.
- orangepips over 13 years
  
  Python solution using generators that's very fast: code.activestate.com/recipes/576961
- jonderry over 13 years
  
  I think you can get a sub-linear solution if you don't need the whole sequence (see my answer below).
- starblue over 13 years
  
  online-judge.uva.es/p/v1/136.html
- Will Ness about 9 years
  
  the working O(n^(2/3)) solution is at stackoverflow.com/a/10160054/849891. see also rosettacode.org/wiki/….
- cregox almost 9 years
  
  I've just forked (so not my solution) a codewars version in C sharp that's very small, simple and O(n).
- boateng over 5 years
  
  You don’t have to start from 1, you can go from last one you’ve found out..
moinudin over 13 years

+1 This solves the problem of finding the nth number fast. You should also add that going through the multiples of 2,3,5 in parallel will remove the need for a bool array.
Anil Katti over 13 years

I was familiar with Sieve of Eratosthenes.. First I started thinking about generating a sorted list of all the ugly number - which was not quite clean. Then I ventures into the trivial solution (which was damn slow obviously). Sieve of Eratosthenes should help me solve the problem in O(U(n)) where U(n) is the nth ugly number.
Nikita Rybak over 13 years

@Anil You don't have to store elements in array, you can use any other type of container, like heap. This can give you O(n*logn) easily. There's also an approach described by marcog: it'll give O(n), but it's a bit trickier.
Anon. over 13 years

@Anil: When I made the comparison to the Sieve, I didn't really mean "keep an array of bools and eliminate possibilities as you go up" - I was more referring to the general method of generating solutions based on previous results. Where the Sieve gets a result and the removes all multiples of it from the candidate set, a good algorithm for this problem would start with an empty set and then add the correct multiples of each ugly number to that.
Nikita Rybak almost 13 years

@vardhan If you don't understand something, ask. Don't just 'fix' things.
Jim Balter about 11 years

@vardhan "The 2nd solution is not completely linear -- The 3 while loops inside the for loops cannot be described as constant time." -- Um, utterly wrong. Each lasti ranges from 0 to at most n, once total, so they're O(n) total. Put another way, per iteration of the for loop, the average number of iterations of each of the 3 inner loops is <= 1, which is indeed constant time.
Will Ness about 10 years

yes, the n^{2/3} is correct, though I didn't follow your arguments here. This is done by enumerating the i,j,k triples to not reach above an estimated value of n-th member of the sequence (since ln2, ln3, ln5 are known). Code and links in this answer.
interjay over 9 years

There are 17 ugly numbers between 2 and 30, not 22. And adding 30 will not make another one. For example, 3 is ugly but 33 isn't.
guidothekp over 9 years

Oops. I should have read the question more carefully. The problem that needs to be solved should be for numbers of the form 2^a*3^b*5^c. What I solved was for numbers that are multiples of 2, 3, and 5 and these include primes such as 7, 11, etc.
Kakira almost 9 years

Is the while loop necessary though? Won't prev be one of the three last? Like the top solution here: stackoverflow.com/questions/5505894/…
gnasher729 over 8 years

It's a shame that the only fast solution has so few votes. It will easily find the one millionth ugly number around 10^253 by my estimate.
Teepeemm over 8 years

Please format your answer properly. Python is meaningless if you don't.
gnasher729 over 8 years

That's wrong. Ugly numbers include for example 60 = 2^2 * 3^1 * 5^1 which is not on any of the lists.
Zhan over 8 years

no, i think the function covers the ugly number 60. try the the function: nthuglynumber(26) in python. it will return 60.
Will Ness over 8 years

finding n first members of Hamming sequence is an O(n) time calculation. n log n is totally unnecessary. the accepted answer's second version (under "edit") is O(n). (it is also what Dijkstra wrote, down to the whiles -- ifs are enough really, but he wrote that using while leaves no room for doubt, correctness-wise).
Will Ness over 8 years

@gnasher729 1000000-th Hamming number is 5.19312780448E+83, actually.
Will Ness over 8 years

@Kakira if is enough ; but no, sometimes two or even all three have to be advanced at once ; in the linked solution the two ifs are sequential, not exclusive; I think Dijkstra himself wrote this algo down with the whiles, so as not to leave room for any doubt correctness-wise, I think his reasoning was.
Will Ness over 8 years

@gnasher729 no, 60 is on all three lists: 60 = 30 * 2 = 10 * 3 = 12 * 5.
derek over 6 years

The explanation is wrong. Suppose we add "7*2", "7*3", "7*5" to the 3 lists.
Will Ness over 5 years

@derek the explanation might be incomplete but the code is actually right. too bad I can't upvote several times to compensate for the wrong downvotes. I asked for such capability on meta, and was voted down into oblivion, too.
Will Ness over 5 years

@derek the explanation was incomplete but the code is actually right. I've updated the explanation and the example, for clarity.
Will Ness over 5 years

this code is exponential in the (qubic root of the) number k of ugly numbers it produces: n ~ exp (k ^ (1/3)). Dijkstra's algo is linear in k. It is shown in several answers here, e.g. this.
Will Ness over 5 years

what is the value of Iterator.from(6).drop(1).next()? isn't it 7? if so, it would mean this code is wrong. as a test, what is the 1000th hamming number produced by this code, please? is it 51200000?
Will Ness over 5 years

this answer makes absolutely no sense to me at all. you "can simply build a list of ugly numbers"?? the question is how?
Will Ness over 5 years

this code is wrong. it produces e.g. 14=7*2, 21 = 7*3, 22 = 11*2...
Will Ness over 5 years

works for 100, 10000 (I verified that the results are correct -- the value returned is at the index n in the sequence, zero-based), but fails for 1000 with "list index out of range" error. ideone.com/6hnIxg
Abhijit Sarkar over 5 years

@WillNess fixed, thanks for finding the bug. I didn't try to generate the 1000 number, but I tested up to 15. Also, if I was going to use this code for generating a large sequence, I'd probably use a mutable sequence, and also try not to repeat BigInt multiplications.
Alex Peter almost 5 years

The internal while is making this a bad reading, it looks like we can move last2 last3 or last5 for more than 1, which we cannot. :( It is confused if last2 is pointer or it is a power of 2 on first reading. :( Really no reason for that. We do not loop for more than 1.
Will Ness over 4 years

@AlexPeter correct, but needs knowledge / mathematical reasoning external to code. as per Dijkstra (and the guy is pretty famous), with the whiles the code's correctness is self-evident, redundancy notwithstanding.
chqrlie about 4 years

The first fragment will fail to find ugly numbers greater than 1845281250000000000, which is the 11389-th ugly number because cur * 5 will overflow the range of type long.
Will Ness about 4 years

are you aware of this? the linked answer's code calculates 1 billionth H.N. in 0.02s and 1 trillionth in about 2s on Ideone.
chqrlie about 4 years

@WillNess: Amazing contribution! but Haskell is so alien to non-afficionados. Do your published timings include the computation of the exact values, and the conversion to base 10?
Will Ness about 4 years

the code calculates (2,3,5) exponents triples; exact values is a matter of simple BIGNUM arithmetic. it shows also its decimal approximation, e.g. 1B --> (1334,335,404) --> "6.216075755562335E+843". there's nothing especially haskelly about the algorithm.
Will Ness about 4 years

I mean, the triples are exact, of course. exponentiation and printing (in decimal) is already provided by Haskell, so I haven't bothered reimplementing it. the interpreter responds to 2^1334*3^335*5^404 printing the result without delay (it says 0.02s after printing). It is easy to add this to the code on Ideone, I just didn't want to clutter up the output.
Will Ness about 4 years

I've added the full exact number printout to the Ideone entry; the run time didn't change for the 1Bth number. for the 1Tth though the time grew by almost a second on top the previous 2 seconds.
chqrlie about 4 years

@WillNess: The algorithm is simple and elegant, and Louis Klauder's top band idea is brilliant. Computing with base 2 logs saves about 10%. Just one remark on the algorithm: you sort the band at the end, but you just need to find the m-th entry in the sorted array, which can be computed faster with a custom partitioning function. Does Haskell provide such a primitive? The sorting time is probably negligible compared to the enumeration loops...
Will Ness about 4 years

Louis Klauder's code in that (vanished?) DDJ blog thread was using an arbitrary top logvalue; then it would find the top index of the enumerated band that it chanced upon (this I don't exactly remember; maybe it was just printing few big HN's). My contribution was to add the estimation of that top logvalue and figuring out the empirical corrections (and band indexing). then, switching to logbase 2 and realizing that only one top i index needs to be used for each (j,k) pair if the width is < 1 (base 2).
Will Ness about 4 years

re: finding the m-th entry, yes, but the band sorting time is indeed negligible AFAIR, complexity-wise for sure. re: Haskell's quick-partition, I don't know. the built-in sort is mergesort, so with it, laziness doesn't help and counting n-1 elements in the prefix to get to the nth is actually sorting the whole prefix AFAICS. taking just the head element is O(n), due to laziness.
Will Ness about 4 years

er, Haskell's *quickselect... searching for "quickselect Haskell" does produce some hits, e.g. on RosettaCode, but that one's O(n^2) in worst case. there's also a proper package on hackage but it's for vectors (though lists can be converted to and from, of course).