[Solution]Problem 58:Kth Largest Element in a Stream [Amazon]
Recursion, Binary Trees, Data Structures, Object Oriented Programming (OOP), Class Design
To learn more about the newsletter, check our detailed About Page + FAQs
To help me understand you better, please fill out this anonymous, 2-min survey. If you liked this post, make sure you hit the heart icon in this email.
Recommend this publication to Substack over here
Take the next step by subscribing here
Let’s get groovy,
This problem can be found as problem 703. Kth Largest Element in a Stream
Problem
Design a class to find the kth
largest element in a stream. Note that it is the kth
largest element in the sorted order, not the kth
distinct element.
Implement KthLargest
class:
KthLargest(int k, int[] nums)
Initializes the object with the integerk
and the stream of integersnums
.int add(int val)
Appends the integerval
to the stream and returns the element representing thekth
largest element in the stream.
Example 1:
Input
["KthLargest", "add", "add", "add", "add", "add"]
[[3, [4, 5, 8, 2]], [3], [5], [10], [9], [4]]
Output
[null, 4, 5, 5, 8, 8]
Explanation
KthLargest kthLargest = new KthLargest(3, [4, 5, 8, 2]);
kthLargest.add(3); // return 4
kthLargest.add(5); // return 5
kthLargest.add(10); // return 5
kthLargest.add(9); // return 8
kthLargest.add(4); // return 8
Constraints:
1 <= k <= 104
0 <= nums.length <= 104
-104 <= nums[i] <= 104
-104 <= val <= 104
At most
104
calls will be made toadd
.It is guaranteed that there will be at least
k
elements in the array when you search for thekth
element.
You can test your solution here
Step 1: Reading the question
As ever, we start off by actually reading the question. This might seem obvious to the regulars here, but you’d be shocked how many geniuses skip this step. Always read the question very carefully, especially when you have to design classes/data structures as you do for this question.
Reading the question, some things really stood out to me. The constraints given to us actually simplify the problem quite a bit. We know the following things-
When we start searching, we will know that k elements will be present.
We don’t have a remove function that will be called at any stage.
These 2 reduce two potentially annoying edge cases to deal with. We also don’t need to worry about repetitions, which can further simplify the whole process. These will be super helpful to note when coming up with the final solution. Even if they don’t directly contribute, pointing these out to your interviewer is a great way to show your attention to detail and ability to think ahead. Both are very desirable traits. If you remember my post on what makes great software engineers (based on research from Microsoft), you will remember that both these traits scored very highly.
Getting back to the question, what can we do for now? Our observations don’t really tell us what we can do next. This is where we go back to my tried and true system that has helped literally 100s of people ace their interviews. We have read the problem and looked at the example/constraints. In your interviews, I would recommend going with a few more toy examples, just to make sure you understand the requirements (and gain brownie points with the interviewer). For the sake of conciseness, we will skip this and go straight to the next step.
Step 2: Brute Force
Remember, going for the brute-force solution will help you a lot. It’s low-hanging fruit that you can go for. It’s infinitely better than sitting there in silence and trying to figure things out (even though I think you look very cute when you’re thinking hard). And doing brute force can give you insight into the main problem. And you get to show off your methodological problem-solving skills. I know I say this a lot, but it’s important. Do the easy things when they present themselves.
For this problem, our brute force is relatively straightforward. Sort the nums when we get them. For every new value added, we can just insert it into the relevant location. Then for the k values, we just return the relevant list index.
Neat. Simple. Makes sense. But also slow.
For an original nums list of length p and j additions, you would do p*Log (p) + [p+ p+1 + p+2 … p+j] operations for sorting(see if you can derive why and simplify this. Good practice for yourself). Depending on your language and implementation, the list index might have additional overhead as well. Not the game you want to play my loves.
So how can we make this more efficient? Notice, that our current solution is kinda dumb. We’re not taking advantage of any of the special properties of this question. In our question, the top-k values can change very frequently. We want something that will help us track these biggest k values, and find the minimum one amongst these quickly. Our solution should also be able to handle the insertion quickly. Is there something we know that has these properties? Let’s dip into our tool belt, and pull something out.
Step 3: Enter the Min Heap
Solutions online (including some very expensive products/services) hand-wave away how they picked a heap (and what properties of a heap make it useful for situations like this). We’re not going to do that here. I’m here to help you get woke.
Here are 2 things you can look for in a problem that will make heaps a viable solution-
Do we have a lot of insertions/deletions
Do we care about the min/max values
If these two hold true, you should look into heaps. Heaps have speedy insertion and deletion (log N). Depending on your heap, the time to look up the min/max value is O(1). This makes it perfect for our tasks.
The number of operations required depends only on the number of levels the new element must rise to satisfy the heap property. Thus, the insertion operation has a worst-case time complexity of O(log n). For a random heap, and for repeated insertions, the insertion operation has an average-case complexity of O(1).
With a heap, our solution becomes clear. We use a min heap to heapify the top-k values we are given. The insertion will be done quickly and we can the root will be the kth largest value. With this out of the way, let’s get to coding.
Step 4: Coding It Up
Once we work out the solution, coding it up is not too difficult. Notice that we trim the heap to make sure that the size of the heap never exceeds k. This can be easy to overlook during your interviews (especially if you are whiteboarding). Don’t overlook at this. Make sure you take things nice and slow and not miss anything.
class KthLargest:
def __init__(self, k: int, nums: List[int]):
self.heap = []
self.k = k
for i in nums:
heappush(self.heap, i)
while len(self.heap) > k:
heappop(self.heap)
def add(self, val: int) -> int:
heappush(self.heap, val)
if len(self.heap) > self.k:
heappop(self.heap)
return self.heap[0]
Creating the heap takes O(N) time. We have to trim our heap N-k times (to get to the length of k). Each deletion takes log N time, making the total time required to trim the heap O( (N-k) * log N ).
Time Complexity- O(N + (N-k)* log N). Worst case, K is small and we have N*Log N.
Space Complexity- O(N). We have to create a heap.
Loved the solution? Hate it? Want to talk to me about your crypto portfolio? Get my predictions on the UCL or MMA PPVs? Want an in-depth analysis of the best chocolate milk brands? Reach out to me by replying to this email, in the comments, or using the links below.
Stay Woke. I’ll see you living the dream.
Go kill all,
Devansh <3
Reach out to me on:
Instagram: https://www.instagram.com/iseethings404/
Message me on Twitter: https://twitter.com/Machine01776819
My LinkedIn: https://www.linkedin.com/in/devansh-devansh-516004168/
My content:
Read my articles: https://rb.gy/zn1aiu
My YouTube: https://rb.gy/88iwdd
Get a free stock on Robinhood. No risk to you, so not using the link is losing free money: https://join.robinhood.com/fnud75