[Solution]Problem 6: Find the smallest positive number missing from an unsorted array [Apple]
Hard
Problem
This was asked by Apple.
You are given an unsorted array with both positive and negative elements. You have to find the smallest positive number missing from the array in O(n) time using constant extra space. You can modify the original array.
Examples
Input: {2, 3, 7, 6, 8, -1, -10, 15}
Output: 1
Input: { 2, 3, -7, 6, 8, 1, -10, 15 }
Output: 4
Input: {1, 1, 0, -1, -2}
Output: 2
Step 1: Easy Solution
The easiest solution we can uses sets. We build a set containing the positive numbers in our array, and then just loop through later. We check off all the values present in the set with a loop till we hit the missing value. It would look like this:
def missing(arr):
posNums=set()
for i in arr:
posNums.add(i)
for i in range(1,len(arr)+1): #we only care about positive numbers
if( i not in posNums):
return i
return len(arr)+1
This takes advantage of two things: 1) Sets have constant lookup. Since they don’t take duplicate elements/key-value pairs, they will be more memory efficient than hash-maps. 2) At minimum, the missing the value for an array of length l will have value l+1 missing. Think about why this is true.
The solution has O(n) time and O(n) space. How can we improve this? First, it’s important to see which avenues we can optimize along. Since we have to traverse an array, it’s unlikely we can improve the O(n) time complexity. Therefore we will be better off spending time optimizing space. Remember you have limited time in an interview. So you want to get maximum bang for your buck.
Step 2: Simplifying the Problem
Something I always tell my students about hard problems is, “Hard Problems are often easy/medium questions with additional constraints”. So if we have a problem, one way to solve them is to build a solution to the easier version. This might give us some hints on how to approach a harder problem.
An easier version of this problem might be with an array containing only positive elements. We could use the index of the array to mark the presence of the value in the array. This way we don’t have to care about the space. We traverse through the array, and for every value, change that value at that index to -1 (if it in bounds). Then traversing the modified array a second time means we just have to return the first index with a positive value. This is a pretty straightforward code. Write it yourself to practice. Share it in the comments below (or with me in DMs) for code review.
Step 3: Building up from the Simpler Problem.
Now that we have a solution, the next step is simple: turn our simple solution into a more complex solution. Do so, we can take our code, and continuously simplify conditions till we can link to our already solution. You will see this technique applied in proofs for theoretical Computer Science. It’s similar to the technique called reduction. To those of you who are still students, make sure to learn this technique very well.
So how can we reduce this our problem to a simpler solution? What if we could somehow shift all the negative values to the left? Then we just take the values starting from starting positive values and check from there. Thus we can break down our problem into two aspects.
Segregate positive numbers from others i.e., move all non-positive numbers to left side. In the following code, the segregate() function does this part.
Ignore non-positive elements and consider only the part of the array which contains all positive elements. Traverse the array containing all positive numbers. To mark the presence of an element x, change the sign of value at index x to the negative. Traverse the array again and print the first index which has positive value. In the following code, the findMissingPositive() function does this part. Note that in findMissingPositive, we have subtracted 1 from the values as indexes start from 0 in C.
Our segregate function is pretty straightforward. Note that we return the index after the last negative value (post-swapping) so that we know where to start.
def segregate(arr,size):
j = 0
for i in range(size):
if (arr[i] <= 0):
arr[i], arr[j] = arr[j], arr[i]
j += 1 # increment count of non-positive integers
return j
The next tricky part is shifting the elements. There’s two things we want to keep track off, our index value is not out of bounds and repeating index values. We handle them both using the conditions:
if (abs(arr[i]) - 1 < size and arr[abs(arr[i]) - 1] > 0):
arr[abs(arr[i]) - 1] = -arr[abs(arr[i]) - 1]
The first condition handles the bounds (we only go when the pointed index is in bounds). Think about why we constructed the second question the way we did. Best explanation gets a free week of the premium newsletter+2 mock interviews with me.
Putting everything together:
def segregate(arr, size):
j = 0
for i in range(size):
if (arr[i] <= 0):
arr[i], arr[j] = arr[j], arr[i]
j += 1 # increment count of non-positive integers
return j
''' Find the smallest positive missing number
in an array that contains all positive integers '''
def findMissingPositive(arr, size):
# Mark arr[i] as visited by
# making arr[arr[i] - 1] negative.
# Note that 1 is subtracted
# because index start
# from 0 and positive numbers start from 1
for i in range(size):
if (abs(arr[i]) - 1 < size and arr[abs(arr[i]) - 1] > 0):
arr[abs(arr[i]) - 1] = -arr[abs(arr[i]) - 1]
# Return the first index value at which is positive
for i in range(size):
if (arr[i] > 0):
# 1 is added because indexes start from 0
return i + 1
return size + 1
''' Find the smallest positive missing
number in an array that contains
both positive and negative integers '''
def findMissing(arr):
size=len(arr)
# First separate positive and negative numbers
shift = segregate(arr, size)
# Shift the array and call findMissingPositive for
# positive part
return findMissingPositive(arr[shift:], size - shift)
The code takes O(n) time and Constant Space.
Closing +Note about the Difficulty Level
Hard Questions can be very tricky because they need you to do very specific things to solve them. Often there is very little to take away except for how to do the specific problem. This question needs you to how to use the index and value at index to work out the solution.
Overall you should be able to solve this question in 40 minutes. Since most interviews are 45 minutes, this gives you some time for discussions and conversations.
——————————————————————————————————————.
Bonuses/Promotion (Get Free Stuff)
To share interesting problems/solutions with me, reach out to me. Different social media of mine also have other content from me. Good problems and/or solutions receive a free shoutout + 2 months of the paid newsletter:
Reach out to me on Instagram: https://www.instagram.com/iseethings404/
Message me on Twitter: https://twitter.com/Machine01776819
My LinkedIn: https://www.linkedin.com/in/devansh-devansh-516004168/