Problem
This problem was asked by Nest.
Create a basic sentence checker that takes in a stream of characters and determines whether they form valid sentences. If a sentence is valid, the program should print it out.
We can consider a sentence valid if it conforms to the following rules:
The sentence must start with a capital letter, followed by a lowercase letter or a space.
All other characters must be lowercase letters, separators (
,
,;
,:
) or terminal marks (.
,?
,!
,‽
).There must be a single space between each word.
The sentence must end with a terminal mark immediately following a word.
Step 1: Understand the problem
One of the biggest mistakes that people make in this kind of problem is to rush in. The premise of this problem seems simple enough, so they jump in without considering the nuance. What is the nuance here?
Consider the input. It’s a stream. What does this mean for your problem:
You will only get your input one step at a time. You can’t get 3 chars at once as you would with a string
Similarly, new input will overwrite the old one. Since we want to track the sentence seen so far, we want to remember this.
The more advanced people might have caught this immediately. However, this kind of framing is helpful to you as well. Remember, communication is one of the major aspects you’re evaluated in during your interviews. By explicitly calling out this observation you will show off both your communication and technical knowledge. For interviews, follow the mantra: If you see something, say something.
Step 2: Identity methodology
We have to identify how to approach a problem. Here is everything we know so far:
We get our input one character at a time
Our sentence becomes valid or invalid based on how the new character interacts with the input we have seen so far.
There are multiple possible ways for the rules to interact with each other. For example, we might have a space or a lowercase after an uppercase letter.
Depending on the current character and state, a different set of rules will apply.
This should help you notice something. We need some way to track what kind of input we had last. With that, we can determine what state given our new input. Is there a system we can use to effectively capture this constant state changing? Turns out there is. And using that will help us solve this problem easily.
Enter the Finite State Machine
Finite State Machines are one of the most important concepts in Computer Science. You will see them applied everywhere from Software Development, Systems Design, to even Artificial Intelligence.
FSMs are a very concise way to represent systems where our system is always changing states based on input. FSMs follow certain rules, and transition between states based on these rules. Different transition rules are applicable depending on our state. Ringing any bells? This is almost exactly what we want in our solution for this problem.
Framing our problem as FSM
So how can we frame our problem as a finite state machine? We need the following things for an FSM:
Accepted Alphabet: The set of valid inputs. In our case that would be the entire keyboard. Any invalid press will simply declare it as an invalid sentence and clear everything
A set of states: We need to have a list of acceptable states. In our case, we can use the rules to build our state list. To represent which state we are in, we can simply use an integer.
A final accepting state: In our case, this will be the end of the sentence (terminals). We will print the sentence and reset everything.
Initial State: The state our machine starts in before the first input
State Transition Rules: This tells us how we transition from one state to another. In our case, we already have the rules given in the problem. We will simply use them as our transition rules.
Now that we understand the basics of Finite State Machines and how our solution can be solved using them, let’s hash out the technical details. Firstly, we need to figure out how we will represent the states. Based on the rules, we can do the following:
0: Begin
1: Uppercase
2: Lowercase
3: Space
4: Separator
Since we will always be looking up whether our input is in of the categories, we also want to use a set to store all the members of a category. This will ensure constant time lookup.
Final Code
Now that we have done the legwork, the final code is not too difficult. Simply put, it will look like this:
import string
UPPER = set(string.ascii_uppercase) #sets ensure O(1) lookup/insertion
LOWER = set(string.ascii_lowercase)
SPACE = " "
SEP = {',', ';', ':'}
TERM = {'.', '?', '!', '‽'}
def check_sentences(stream):
state = 0
sentence = ""
while stream:
char = stream.__next__()
sentence+=char
if char in UPPER and state == 0:
state = 1
elif char in LOWER and state in {1, 2, 3}:
state = 2
elif char == SPACE and state in {1, 2, 4}:
state = 3
elif char in SEP and state == 2:
state = 4
elif char in TERM and state == 2:
print(sentence)
state = 0
sentence = ""
else: # we reset everything without printing
sentence = ""
state=0
You can see that once we identified the problem as a finite state machine problem, it was relatively straightforward to solve. This solution doesn’t really have a time complexity/space since we are dealing with streams.
However, if we were to take a stream of length n, then our worst case would be O(n) space used. This would happen when we have a valid sentence of n characters (terminal only occurs at the nth character). Time would always be O(n) since we go for n chars.
If you found this solution helpful, make sure you comment and like on Substack. The engagement really helps the newsletter grow, which helps me put more time into refining it.
Bonuses/Promotion (Get Free Stuff)
To learn how to interview better DM me. Let’s set up mock interviews and go over different interview techniques. In addition to focused questions, and a detailed personalized plan, we will go over the right questions to ask during your interviews.
For most of my students, mock interviews have been very helpful. Enough mock interview practice is the key between getting that job offer and rejection. If you can get three people to become paying subscribers to this newsletter, you win a free mock interview. This offer has no upper limit, so the more subs you can get, the more mock interviews you win.
To share interesting problems/solutions 2with me, reach out to me. Different social media of mine also have other content from me. Good problems and/or solutions receive a free shoutout + 2 months of the paid newsletter:
Reach out to me on:
Instagram: https://www.instagram.com/iseethings404/
Message me on Twitter: https://twitter.com/Machine01776819
My LinkedIn: https://www.linkedin.com/in/devansh-devansh-516004168/
My content:
Read my articles: https://rb.gy/zn1aiu
My YouTube: https://rb.gy/88iwdd
Get a free stock on Robinhood. No risk to you, so not using the link is losing free money: https://join.robinhood.com/fnud75
awesome