Happy Sunday my friends,
My Database Peeps will really like this one.
Database sharding is one of the most important ideas when you design large-scale, multi-server systems. Sharding allows you to split a database across multiple computers. Why is this useful?
Having multiple smaller datasets allows for quicker searches (since we can filter out tables that aren’t needed).
Many shards are split on important characteristics. Imagine we were running a website and wanted to know about the visitors. We might choose to split our datasets by the platform they used to visit this website (Android, iOS, MacOS, Windows, etc.). This allows us to run better data analysis and makes adding new rows to our datasets easier.
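To make the platform example concrete, here is a minimal sketch in Python. It is a toy, in-memory stand-in, not how any real database does it: the class name `PlatformShardedVisits`, the row fields, and the catch-all "Other" shard are all assumptions made up for illustration. It shows the two benefits mentioned above: a new row touches exactly one shard, and a query filtered on the shard key can skip every other shard.

```python
from collections import defaultdict

# Hypothetical shard keys: one shard per visitor platform, as in the example above.
PLATFORMS = ("Android", "iOS", "MacOS", "Windows")


class PlatformShardedVisits:
    """Toy in-memory stand-in for a visits table sharded by platform."""

    def __init__(self):
        # Each shard is a plain list here; in a real system it would be
        # a separate database or server.
        self.shards = defaultdict(list)

    def shard_for(self, platform):
        # Unknown platforms go to a catch-all shard (an assumption of this sketch).
        return platform if platform in PLATFORMS else "Other"

    def insert(self, visit):
        # A new row is appended to exactly one shard.
        self.shards[self.shard_for(visit["platform"])].append(visit)

    def visits_for(self, platform):
        # A query filtered on the shard key reads one shard and skips the rest.
        return list(self.shards.get(self.shard_for(platform), []))

    def count_all(self):
        # A query without the shard key has to visit every shard.
        return sum(len(rows) for rows in self.shards.values())


store = PlatformShardedVisits()
store.insert({"user": "a", "platform": "Android"})
store.insert({"user": "b", "platform": "iOS"})
store.insert({"user": "c", "platform": "Android"})
store.insert({"user": "d", "platform": "Linux"})

android_users = [v["user"] for v in store.visits_for("Android")]
print(android_users)      # ['a', 'c']
print(store.count_all())  # 4
```

Notice the trade-off in `count_all`: any question that is not about a single platform has to fan out to every shard, which is why picking the shard key carefully matters so much.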
As you can imagine, large-scale social media platforms rely heavily on sharding. It allows for efficient operations and a much more streamlined user experience.
Quora is a social media platform that allows users to submit questions based on topics or answer questions posted by others. Think of it like a general Stack Overflow. As a social media platform, they generate a lot of data (answers, user activity, upvotes, user interests, etc.). Thus, it is important for them to shard their datasets for optimal performance. Keep in mind that the dataset they are dealing with is not static. When designing its data procedures, Quora has to make sure they can handle a constantly shifting user demographic and user behavior.
In the Quora engineering team's own words: "We use MySQL to store critical data such as questions, answers, upvotes, and comments. Data size stored in MySQL is of the order of tens of TB without counting replicas. Our queries per second are in the order of hundreds of thousands. As an aside, we also store a lot of other data in HBase. If MySQL is slow or unresponsive, the Quora site is severely impacted."
The engineers at Quora were nice enough to write a very detailed post talking about their database sharding procedures. The post is titled MySQL Sharding at Quora. Make sure you give it a read. For those of you who will specialize in database design, give it a thorough reading. For everyone else, use it to learn about some of the challenges that you will come across when approaching these questions, and how you can handle these challenges.
In the comments below, share which topics you want to focus on next. I’d be interested in learning about them and will cover them. To learn more about the newsletter, check our detailed About Page + FAQs.
If you liked this post, make sure you fill out this survey. It’s anonymous and will take 2 minutes of your time. It will help me understand you better, allowing for better content.
https://forms.gle/XfTXSjnC8W2wR9qT9
Happy Prep. I’ll see you at your dream job.
Engineering Blogs by Companies are the best,
Devansh <3
To get the most out of System Design Sundays, make sure you’re checking in on the other days as well. Leverage all the techniques I have discovered through my successful tutoring to succeed in your interviews, and save your time and energy by joining the premium subscribers.
Reach out to me on:
Instagram: https://www.instagram.com/iseethings404/
Message me on Twitter: https://twitter.com/Machine01776819
My LinkedIn: https://www.linkedin.com/in/devansh-devansh-516004168/
My content:
Read my articles: https://rb.gy/zn1aiu
My YouTube: https://rb.gy/88iwdd
Get a free stock on Robinhood. No risk to you, so not using the link is losing free money: https://join.robinhood.com/fnud75