Computer Association of SIUE - Forums

CAOS Forums => Questions and Answers => Topic started by: Tony on 2007-11-19T15:07:50-06:00 (Monday)

Title: Hard Question!!!
Post by: Tony on 2007-11-19T15:07:50-06:00 (Monday)
I hope I get this right, but Justin Camerer had a great question the other day that I thought would be perfect to post on here and see if anyone can get it.  So, here it goes.

You have a list of N numbers.  Every number in the list is duplicated exactly once, except one of the numbers which only shows up once.  For example, 5, 3, 4, 5, 3, where 4 is the number that only shows up once.  So, the challenge is to find the number that only shows up once, but there are rules.  Remember, you do NOT know the size of the list.

You have to be able to do it in N time complexity, and N space complexity.

N space complexity means at any given time you cannot have space allocated that is not being used.  I am not sure if that is what it really means, but for this problem that is what I mean.  So, this means you cannot go through the whole array once to get the size and then create an array of that size or anything like that.  If you are looking at 2 numbers then your data structure can only be of size 2. 

I hope I got everything right here, if I did not maybe Justin C. can add to it, but I am pretty sure that is everything.  If nobody gets it by the end of break, and people want it, I can give a hint that might help.

Tony
Title: Re: Hard Question!!!
Post by: William Grim on 2007-11-19T15:23:43-06:00 (Monday)
You can do it in O(1) space complexity using XOR.
Title: Re: Hard Question!!!
Post by: Tony on 2007-11-20T10:30:10-06:00 (Tuesday)
Pimp!  That is right.  Did you figure that out, or did you know this problem already?
Title: Re: Hard Question!!!
Post by: William Grim on 2007-11-20T12:35:31-06:00 (Tuesday)
I figured it out last October.
Title: Re: Hard Question!!!
Post by: William Grim on 2007-11-20T18:16:57-06:00 (Tuesday)
Given a set of consecutive numbers [1,N] in no particular order, how do you find all numbers missing from the set?  You can't use a hash map or any other data structure outside of the array you're given, and it must be solved in O(N) time.

Assume that there is enough space in the array to hold all numbers should they all be there.  Also assume that 0 represents a hole.
Title: Re: Hard Question!!!
Post by: Tony on 2007-11-21T01:15:25-06:00 (Wednesday)
So, the numbers are not sorted in the array first right? 

Also, a 0 means a hole?  What do you mean?  There just isn't a value for that possition in the array?

Title: Re: Hard Question!!!
Post by: Tony on 2007-11-21T02:17:52-06:00 (Wednesday)
Not sure if this answers the question you are asking, but if you have a set of numbers and there is only one number missing, you can just add all the numbers together.  At the same time keep track of what the largest number is that you reached.  Once you are done you have a variable that holds the value of all of them added together.

That part is O(n).  Then you can just start subtracting each number from the total, starting with 1 to N, where N is the largest number you encountered in the array.  This would mean this is O(2N + 1) but that is still O(N).  So, in the end you end up with the negative of the number you skipped.  So, multiply that by -1 and you have the value you skipped. 

I have no clue how you would do it if there are multiple values missing.

Tony
Title: Re: Hard Question!!!
Post by: Kit on 2007-11-21T13:39:14-06:00 (Wednesday)
I don't understand when you say it can be done "using XOR".  :mello:
Title: Re: Hard Question!!!
Post by: Tony on 2007-11-21T14:02:12-06:00 (Wednesday)
Kit, take this data set....

[5, 4, 3, 4, 5, 3, 2, 1, 1]

Think about what XOR does.  If you XOR 101 by itself, what does that give you?  Then think about doing that to the data as a whole.

I don't want to just give it to you, encase other people are still trying to figure it out, but that should help a lot.

Tony
Title: Re: Hard Question!!!
Post by: Jerry on 2007-11-21T21:23:45-06:00 (Wednesday)
Are you referring to a bitwise XOR operation?

Title: Re: Hard Question!!!
Post by: William Grim on 2007-11-22T01:04:12-06:00 (Thursday)
Quote from: Jerry on 2007-11-21T21:23:45-06:00 (Wednesday)
Are you referring to a bitwise XOR operation?

Yeah, I was referring to a bitwise XOR operation.
Title: Re: Hard Question!!!
Post by: William Grim on 2007-11-22T01:11:23-06:00 (Thursday)
Quote from: Tony on 2007-11-21T02:17:52-06:00 (Wednesday)
Not sure if this answers the question you are asking, but if you have a set of numbers and there is only one number missing, you can just add all the numbers together.  At the same time keep track of what the largest number is that you reached.  Once you are done you have a variable that holds the value of all of them added together.

That part is O(n).  Then you can just start subtracting each number from the total, starting with 1 to N, where N is the largest number you encountered in the array.  This would mean this is O(2N + 1) but that is still O(N).  So, in the end you end up with the negative of the number you skipped.  So, multiply that by -1 and you have the value you skipped.

Sorry, I don't follow you.  What you said in your answer actually results in you always finding 0 as the missing number.  You first add all the numbers and then subtract the same numbers.  Though, I think you're close to one of the possible solutions people use for finding a single missing number, but it's not one I prefer to use.

Quote from: Tony on 2007-11-21T02:17:52-06:00 (Wednesday)

I have no clue how you would do it if there are multiple values missing.

How would you solve the problem if you were allowed to have a bitmap that was N bits long?  That is the first step in finding the solution to the problem I posed.
Title: Re: Hard Question!!!
Post by: William Grim on 2007-11-22T01:14:16-06:00 (Thursday)
Quote from: Tony on 2007-11-21T01:15:25-06:00 (Wednesday)
So, the numbers are not sorted in the array first right? 


Values in the array are unsorted, but all values are taken from a consecutive set of integers [1,N].

Quote from: Tony on 2007-11-21T01:15:25-06:00 (Wednesday)
Also, a 0 means a hole?  What do you mean?  There just isn't a value for that possition in the array?

Correct.  There is no value for positions in the array where there is a 0.  In essence, they represent the missing number, but you don't know where they belong.
Title: Re: Hard Question!!!
Post by: Tony on 2007-11-27T10:57:23-06:00 (Tuesday)
Quote from: William Grim on 2007-11-22T01:11:23-06:00 (Thursday)
Sorry, I don't follow you.  What you said in your answer actually results in you always finding 0 as the missing number. 

Well, I think I miss understood what you were asking.  What I was saying is that if the numbers are consecutive, then you just keep track of the largest value you encountered.  Also, you keep a running total of all the numbers.  Then just subtract ALL the numbers up to the largest value.  So, if you have numbers 1-40 and 35 was missing, you would still subtract 35, which in the end, it would give you a negative 35.
Title: Re: Hard Question!!!
Post by: William Grim on 2007-11-27T12:54:37-06:00 (Tuesday)
Quote from: Tony on 2007-11-27T10:57:23-06:00 (Tuesday)
Well, I think I miss understood what you were asking.  What I was saying is that if the numbers are consecutive, then you just keep track of the largest value you encountered.  Also, you keep a running total of all the numbers.  Then just subtract ALL the numbers up to the largest value.  So, if you have numbers 1-40 and 35 was missing, you would still subtract 35, which in the end, it would give you a negative 35.

Yeah, I realized later what you had said, and you're right about that.  However, try solving the problem using a bitmap N bits long, and you will be close to the answer for the problem I posed.  I'll likely post some code with the answer soon if it's not answered by someone first.
Title: Re: Hard Question!!!
Post by: William Grim on 2007-12-04T02:46:54-06:00 (Tuesday)
Answer to the previous question I posted under this topic:


#include <iostream>
#include <vector>
#include <cstdlib>

typedef std::vector<int> Integers;

const int MASK = 0x80000000;

void usage(const char *name_)
{
  std::cerr << "Usage: " << name_ << " <length>\n";
}

void printVector(const Integers &numbers_)
{
  using namespace std;
  for (unsigned i = 0; i < numbers_.size(); ++i) {
    cout << numbers_[i];
    if (i < numbers_.size()-1) {
      cout << ' ';
    }
  }
  cout << endl;
}

void printMissing(const Integers &numbers_) {
  using namespace std;
  for (unsigned i = 1; i < numbers_.size(); ++i) {
    if (!(numbers_[i] & MASK)) {
      cout << i;
      if (i < numbers_.size()-1) {
cout << ' ';
      }
    }
  }
  cout << '\n';
}

void findMissing(Integers &numbers_)
{
  for (unsigned i = 1; i < numbers_.size(); ++i) {
    int x = numbers_[i];
    if (x & MASK && x ^ MASK) {
      numbers_[x ^ MASK] |= MASK;
    }
    else if (x > 0) {
      numbers_[x] |= MASK;
    }
  }
}

/// @return Random number in range [min_, max_].
long boundedRandom(long min_, long max_)
{
  return random() % (1+max_-min_) + min_;
}

int main(int argc_, const char * const * argv_)
{
  using namespace std;

  if (argc_ < 2) {
    usage(argv_[0]);
    return -1;
  }

  srandomdev();

  unsigned length = atoi(argv_[1]);
  Integers numbers(length);

  for (unsigned i = 1; i < numbers.size(); ++i) {
    numbers[i] = (random() % 2) ? 0 : i;
  }

  for (unsigned i = 1; i < numbers.size()-1; ++i) {
    swap(numbers[i], numbers[boundedRandom(i+1, numbers.size()-1)]);
  }

  printVector(numbers);

  findMissing(numbers);
 
  printVector(numbers);
  printMissing(numbers);
 
  return 0;
}


Basically, to get around the problem of using separate data structures, you reuse the data structure given to you, which is an array.  You treat the upper-most bit in the numbers as the bit for your bitmap.  I'll let you read the code to learn the real magic, but if there are any questions, let me know.

P.S. Yes, I know my code is hard-wired for 32-bit numbers.  It's a demo, and so, I don't care.
Title: Re: Hard Question!!!
Post by: Jerry on 2007-12-04T15:13:26-06:00 (Tuesday)
Just a quick glance it appears to be O(n), where n = the size of the original vector.

Can you prove it is O(1)?

Title: Re: Hard Question!!!
Post by: William Grim on 2007-12-04T20:32:36-06:00 (Tuesday)
Quote from: Jerry on 2007-12-04T15:13:26-06:00 (Tuesday)
Just a quick glance it appears to be O(n), where n = the size of the original vector.

Can you prove it is O(1)?



This problem requires O(N) space complexity.  The idea behind the problem is if you have terabytes or more of data, you had better not create a separate bitmap for the data.  Instead, reuse the resources you have if possible.

For the bitwise-XOR solution that solves a different problem, it is intrinsically O(1) space complexity.  The data structure used to solve the problem need not be O(N) space complexity.