Designing a highly scalable e-commerce site for dummies.


This content originally appeared on Level Up Coding - Medium and was authored by Darío Rodríguez

Photo by Tran Mau Tri Tam ✪ in Unsplash

This is the third article of a series on how to create and develop high-scalability backend applications. You can check the first article about some basic concepts and the second one about how to apply high-scalability patterns in your applications.

In this article, we focus on how to scale a real-world backend application using the knowledge from the previous articles.

We will pay special attention to the architectural changes needed to achieve high scalability in our application.

Disclaimer: This article is an introduction to the topic, so I’m going to skip many details such as configuration files and deployment environment settings. For the same reason, some architectural changes, such as a heavy refactor toward extensive use of microservices, won’t be proposed or explained. I want to keep the explanation easy to follow for less experienced developers.

Let’s move on and get some information about our project!

Project definition and initial system architecture

First of all, we need to define the project we are dealing with and some initial architecture.

The website we are going to work with is a regular e-commerce site, something like Amazon, a retail website or a Shopify store. Basically, think of any e-commerce site that also communicates with its users, for example by sending an email when a purchase is made.

The main use cases for users are the following:

  • see offers
  • list products
  • see product details
  • checkout
  • receive an e-mail confirming the purchase

Our project is a monolith: all features are encapsulated, but they live in the same code base. It would be easy to separate each feature into its own application, but so far that has not been a technical requirement because the technical debt is not very high.

Likewise, every feature uses a single shared MySQL database to store all the information about products, orders, discounts, campaigns, etc.

In brief, we are looking at a successful MVP application that is going to keep growing over time. It works, but it is close to the point where technical debt starts to get in the way of developing new features and maintaining existing ones.

So… suddenly one day our Product Manager told us that because the company is growing and sales are going well, the marketing team is preparing a huge campaign to increase sales for Black Friday.

The technical team was really happy about the good news, but soon they all realized that the application was not prepared for it. A lot of work had to be done to get good response times and to handle the huge amount of traffic expected.

Does it sound familiar? Have you faced a similar situation?

Analyzing and improving

After that change in the product roadmap, everyone on the technical team started to think about how to improve the application to cope with the huge Black Friday campaign.

First of all, they needed to spot current performance problems and new ones that could arise once the number of requests started to increase. To do so, they set up an observability dashboard with Kibana and Elasticsearch to collect detailed performance metrics and APM information.

After that, they started to measure the performance of current customer requests, page load times, slow database queries, etc. With all this information they spotted some issues that could seriously affect the Black Friday campaign.
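The kind of measurement described above can be sketched in a few lines. This is only a minimal illustration, assuming an in-memory store instead of the Kibana and Elasticsearch stack the team used; the names `timed`, `timings` and `percentile` are hypothetical and not part of any APM library.

```python
import math
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory store: operation name -> list of durations in seconds.
# A real setup would ship these numbers to an observability backend instead.
timings = defaultdict(list)


def timed(name):
    """Record how long each call to the wrapped function takes."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even when the call raises, so failures are measured too.
                timings[name].append(time.perf_counter() - start)
        return wrapper
    return decorator


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]
```

Decorating a handler with `@timed("list_products")` and later calling `percentile(timings["list_products"], 95)` gives a rough view of the slowest requests, which is usually more telling than the average during peak hours.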

The main issues found were:

  • sending the purchase confirmation e-mail sometimes takes too long
  • offers page performance drops sharply at peak hours
  • product list and product details page performance decreases during peaks, and many slow queries related to these features were spotted
  • write times for checkout information peak during the busiest hours

As you can see, they identified four different issues that had to be resolved before launching the new campaign.

Each issue can be solved with a different high-scalability pattern, so in the next sections we are going to change the initial architecture of the e-commerce website to improve it.

Scaling horizontally

The first thing the technical team did was to add more servers so the application could handle more requests in parallel. Yes, they applied horizontal scaling to increase performance.

The changes to the system were minimal. All servers execute the same code and are updated together by the CI/CD tool. A load balancer was put in place to distribute requests across the pool of servers that execute the code.
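To make the load balancer's job concrete, here is a minimal sketch of one common distribution strategy, round robin. The article does not say which strategy or product the team used, so the class name and server names below are purely illustrative.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers from the pool one after another, then start over."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the pool forever in its original order.
        self._servers = cycle(list(servers))

    def next_server(self):
        """Return the server that should handle the next request."""
        return next(self._servers)
```

With a pool of `["app-1", "app-2", "app-3"]`, four consecutive calls to `next_server()` return `app-1`, `app-2`, `app-3` and then `app-1` again. Real load balancers usually add health checks and may weight servers by capacity.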

One important change was to move the database to a primary-secondary architecture. The primary server remains common to all servers in the pool, and every server in the pool has one secondary copy of the database for read-only purposes. Some code changes were needed so that repositories use the primary or the secondary server as appropriate, but the changes were minimal because the database engine stayed the same.
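A repository adapted to this setup might look roughly like the sketch below. It assumes a hypothetical `OrderRepository` and uses a `RecordingConnection` stand-in instead of a real MySQL driver, so the SQL strings are only recorded, never executed.

```python
class RecordingConnection:
    """Stand-in for a real database connection; it only records queries."""

    def __init__(self, name):
        self.name = name
        self.queries = []

    def execute(self, sql, params=()):
        self.queries.append((sql, params))
        return []


class OrderRepository:
    """Hypothetical repository: writes go to the primary, reads to a secondary."""

    def __init__(self, primary, secondary):
        self._primary = primary
        self._secondary = secondary

    def save_order(self, order_id, total):
        # Writes must always hit the primary so the secondaries can replicate them.
        self._primary.execute(
            "INSERT INTO orders (id, total) VALUES (%s, %s)", (order_id, total)
        )

    def find_order(self, order_id, fresh=False):
        # Secondaries may lag behind the primary; callers that cannot
        # tolerate stale data can ask for a read from the primary instead.
        conn = self._primary if fresh else self._secondary
        return conn.execute(
            "SELECT id, total FROM orders WHERE id = %s", (order_id,)
        )
```

The `fresh` flag is an assumption added for illustration: because replication is usually asynchronous, a read issued right after a write may not see it on a secondary, and some flows (such as showing an order right after checkout) may prefer the primary.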

These changes improved the performance during peak hours but didn’t solve all problems, so the team kept improving the system.

Using background processing

The second problem the team solved was that sending the purchase confirmation e-mail was sometimes really slow. This problem affects one specific feature in one particular part of the system, so they explored a different solution.

The “sending email” feature was sometimes really slow due to problems connecting to the external email server. Because it’s not a critical feature, the execution of this code can be deferred and the email sent later on, so the team solved the issue by applying a background processing pattern.

The team refactored this feature into a microservice that runs all of its business logic. The microservice and the monolith communicate through an event bus: the main code executes the purchase process and publishes an event to the bus, and the email is sent later on.

An event bus is a mechanism for sending messages between microservices or different parts of the system, so it also makes it easier to defer more actions from other parts of the system in the future.
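The pattern can be sketched inside a single process with a queue and a worker thread. This is only an approximation under stated assumptions: a real setup would use a message broker between separate services, and `event_bus`, `sent_emails`, `complete_purchase` and `start_email_worker` are hypothetical names.

```python
import queue
import threading

event_bus = queue.Queue()  # stand-in for a real message broker
sent_emails = []           # stand-in for the external email server


def complete_purchase(order_id, customer_email):
    """Finish the purchase and defer the email by publishing an event."""
    # ... the order would be persisted here ...
    event_bus.put(
        {"type": "purchase_completed", "order_id": order_id, "email": customer_email}
    )
    return "ok"  # the customer gets a response without waiting for the email


def email_worker():
    """Consume events in the background; a None event stops the worker."""
    while True:
        event = event_bus.get()
        try:
            if event is None:
                break
            if event["type"] == "purchase_completed":
                sent_emails.append((event["email"], event["order_id"]))
        finally:
            event_bus.task_done()


def start_email_worker():
    worker = threading.Thread(target=email_worker, daemon=True)
    worker.start()
    return worker
```

The key design point is that `complete_purchase` returns as soon as the event is queued, so a slow email server no longer slows down checkout. A production version would also need retries for failed sends.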

Now the application is better prepared, but there are other things that can be done to get it ready for all the requests of the Black Friday campaign.

Caching data

The next improvement was focused on the performance of offers, product lists and product detail pages. Previously the team improved performance for the whole system by enabling horizontal scaling but some specific improvements in these pages were needed.

To improve these pages specifically, the team chose to use caches to get lower response times and to allow these pages to handle more queries.

The team chose to use two cache instances with different logic to store and fill their data.

The first one stores product information and uses a lazy approach to fill in the data: a result is stored in the cache when the business logic fetches it from the main database. The drawback of this approach is that performance is not as good when the data is not yet in the cache. On the other hand, this cache needs less maintenance because it fills itself.
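A lazy cache of this kind (often called cache-aside) can be sketched as follows. The `LazyProductCache` class, the `load_from_db` callback and the expiry time are assumptions made for illustration; the article does not describe the team's cache technology or expiry policy.

```python
import time


class LazyProductCache:
    """Fill the cache only when a product is requested and not cached yet."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db   # hypothetical function that queries the database
        self._ttl = ttl_seconds     # assumed expiry so stale entries do not live forever
        self._clock = clock
        self._entries = {}          # product_id -> (expires_at, value)

    def get(self, product_id):
        now = self._clock()
        entry = self._entries.get(product_id)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no database query
        # Cache miss: this is the slow path mentioned in the text.
        value = self._load(product_id)
        self._entries[product_id] = (now + self._ttl, value)
        return value

    def invalidate(self, product_id):
        """Drop an entry, for example after the product is updated."""
        self._entries.pop(product_id, None)
```

The first `get` for a product pays the cost of the database query; later calls within the expiry window are served from memory.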

The second cache stores offer information and uses an eager approach to fill in the data. In this case, the team created a process that builds the offers’ data from information in the main database and stores it in the cache. This approach gives the best performance because the system always retrieves the information from the cache. However, it has a drawback: the system can only use data stored in the cache, so data that exists in the main database but not in the cache cannot be retrieved.
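The eager approach can be sketched like this. `OffersCache` and the `fetch_all_offers` callback are hypothetical names; how often the refresh process runs is not stated in the article and is left to the caller here.

```python
class OffersCache:
    """Serve offers only from a snapshot that a separate process refreshes."""

    def __init__(self, fetch_all_offers):
        self._fetch_all = fetch_all_offers  # hypothetical query against the main database
        self._offers = {}

    def refresh(self):
        """Rebuild the whole snapshot; meant to be run by a scheduled job."""
        # Build the new snapshot first, then swap it in with one assignment,
        # so readers never see a half-filled cache.
        snapshot = {offer["id"]: offer for offer in self._fetch_all()}
        self._offers = snapshot
        return len(snapshot)

    def get(self, offer_id):
        # Reads never fall back to the database: an offer that was added
        # after the last refresh is simply not visible yet.
        return self._offers.get(offer_id)
```

This makes the trade-off from the text visible in code: `get` is always fast, but a new offer stays invisible until the next `refresh`.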

Sharding the main database

The last problem the team had to fix to get a robust system was the write performance of the main database during peak hours.

One of the best ways to improve write performance is to shard the database layer. It requires a lot of work because it often involves changing the database engine, which means many changes and a lot of refactoring in the code.
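The core idea of sharding, splitting the data across several nodes by a key so that writes are spread out, can be sketched as below. This is a toy model with dictionaries standing in for database nodes; the shard key (`customer_id`) and all names are assumptions, and a sharded database engine normally does this routing for you.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index in a stable way."""
    # A cryptographic hash is used only because it is stable across runs,
    # unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedOrderStore:
    """Toy store: each shard is a dict standing in for a database node."""

    def __init__(self, shard_count):
        self._shards = [{} for _ in range(shard_count)]

    def _shard(self, customer_id):
        return self._shards[shard_for(customer_id, len(self._shards))]

    def save(self, customer_id, order):
        # Each write touches exactly one shard, so write load is spread out.
        self._shard(customer_id).setdefault(customer_id, []).append(order)

    def orders_for(self, customer_id):
        # Reads by shard key also go to a single shard.
        return self._shard(customer_id).get(customer_id, [])

    def shard_sizes(self):
        return [len(shard) for shard in self._shards]
```

Note that simple modulo hashing moves most keys when the number of shards changes, which is one reason real systems use more elaborate schemes to rebalance data.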

In this case, the team chose to move the main database from MySQL to a NoSQL database based on MongoDB. As mentioned above, this required many code changes to adapt the data repository classes to the new database engine.

It was really challenging and required a lot of effort, but it allowed the system to handle thousands of write requests and made the system more robust.

Conclusions

When you are about to improve your application’s performance, your first step should be a deep analysis to gather metrics on the performance of the different parts of your system.

The next step is to create a strategy to solve all issues detected in the initial analysis. The most common ways to solve performance issues are horizontal scaling, implementing caches and background processing.

To improve the performance of your whole system you should mix and match scalability patterns to increase both performance and the number of requests handled. In this example, you can see that scaling an application is an iterative and incremental process.

My recommendation is to treat scaling and performance issues like any other bug. You don’t need to do big refactors as one-off projects; instead, make regular small improvements as part of maintaining your system.


Darío Rodríguez | Sciencx (2022-11-01T03:02:24+00:00) Designing a highly scalable e-commerce site for dummies. Retrieved from https://www.scien.cx/2022/11/01/designing-a-highly-scalable-e-commerce-site-for-dummies/
