Today I went through the basic knowledge about sys design. Actually, it includes a well-structured skill-tree. The majority of those technological terms is quite familiar for me. During my daily reading, I have already heard or used some aspects such as replication, master-slaver, load balance and federation. But I didn’t have time to review and combine my working experience with the best practice.
I have been using MongoDB for few months and I need to quick pick up the concept of SQL DB. There are some incorrect concepts regarding NoSQL design. But I still learned a lot from it
Tomorrow I will go through all topics and terms again and some methodologies about system design best practice. But we need to start with a handy and quick try. Then I can figure out my weakness and what I need to learn tomorrow.
Let’s begin and have some fun!
Step 1: Outline Use Cases and constraints
First, we need to have a clear understanding of what kind of situations and scenarios we need to handle. That’s why we need to abstract the use cases.
Previously, I worked on the Revel-Xero integration project.
Here are some scoped use cases:
- User register and connect with Revel and Xero account
- Service extracts sales records from Revel account
- Updates Daily
- Categorizes sales orders by Product, Product Class, and Establishment
- Analysis monthly spending by category
- Service generate sales invoice and push to Xero account
- Pushes Daily
- Allow users to manually set the account mapping and pushing schedule
- Sends notifications when approaching or fail to send
- Service has high availability
Now we have three use cases. The real scenario is much more complex than this one. It also includes sales, purchase orders, payroll and item sync. The invoice also has multiple formats and the sales, payment and tax mapping should be flexible enough. But it has a similar workflow. Let’s focus on the current situation
Constraints and assumptions
(Question: what’s the best practice of calculating the constraints and assumptions? Need research)
- Usually once the account setup and begin to work, user only come back on a monthly basis
- There is no need for real-time update. Revel is not a strong consistency system so we need to delay 1-2 days and then sync the data
- Revel only have around 1000 customers in AU. But our target is the entire Asia market. So let’s assume 10 thousand users
- Usually, one user will only have 1 establishment. So 10k establishment
- Each establishment usually will have around 1000 sales orders per day. 10 million transactions per day
- 300 million transactions per month
- one user has one revel account and one xero account. so 10k revel account and 10k xero account
- 20k read request per month
- 100:1 write to read ratio
- write-heavy, user make transactions daily but few visit the site daily.
In case we forget: 1 English letter = 1 byte, 2^8 = 1 byte, 1 Chinese letter = 2 bytes
- Size per transaction:
- user_id: 8 bytes
- created: 8 bytes
- product: 32 bytes
- product_class: 10 bytes
- establishment: 12 bytes
- amount: 5 bytes
- Total: ~ 75 bytes
- 20 GB of new transaction content per month, 240 GB per year
- 720 GB of new transaction in 3 years
- 116 transaction per second on average
- 0.017 read request per second on average
Handy conversion guide:
- 2.5 million seconds per month
- 1 request per second = 2.5 million requests per month
- 40 requests per second = 100 million requests per month
- 400 requests per second = 1 billion requests per month
Step 2: Create a high-level design
Outline a high-level design with all important components.
(To be continued… It’s 12 am now so I will try to finish it tomorrow!)