Introduction To System Design

Why Learn System Design

In any development process, whether for software or any other technology, design is the most important stage. Without a design phase, you cannot move on to implementation or testing. The same holds for systems: system design is a vital step in development, and because it captures the software's business logic, it also provides the backbone for handling exceptional scenarios.

Objectives of System Design

  • Practicality: The system should be designed with the real-world use case and target audience in mind, ensuring it is relevant and feasible.
    Example: A simple and cost-effective CRM for a small business rather than a complex enterprise-level solution.

  • Accuracy: The system must accurately meet both functional and non-functional requirements, ensuring it performs as expected.
    Example: A login system that ensures users can authenticate correctly and securely.

  • Completeness: The system should fulfil all user requirements, covering both primary and secondary functionalities.
    Example: An e-commerce platform must allow not only shopping but also managing returns and reviews.

  • Efficiency: The system must optimize the use of resources (CPU, memory, bandwidth) to ensure low costs and high performance.
    Example: Using caching to reduce database load and improve response times on a web app.

  • Reliability: The system should be reliable, with minimal downtime and the ability to recover from failures without losing data or functionality.
    Example: A banking system ensuring all transactions are safe even during outages.

  • Optimization: Time and space optimization should be applied to ensure the system components perform efficiently under given constraints.
    Example: Optimizing database queries to reduce response time and memory usage.

  • Scalability (Flexibility): The system must be scalable to handle growth in traffic, data, or users and flexible enough to adapt to changing needs.
    Example: A video streaming service scaling up to handle millions of concurrent viewers during a live event.

Introduction to System Design

What is System Design?

System design is the process of defining the architecture, components, and data flows of a system to meet specific business and technical requirements. It encompasses decisions about how a system will function under different loads, handle failures, and deliver the required level of performance.

Key aspects of system design

  • Structuring the system’s architecture.

  • Defining interactions between different components or services.

  • Identifying how data will flow through the system.

  • Ensuring the system can handle expected and unexpected load (scalability).

  • Balancing between performance, reliability, and cost.

Goals of System Design

  • Scalability: Ability to handle increased load or data volume without performance degradation.

  • Reliability: The system should function as expected under normal and stressful conditions.

  • Modularity and Maintainability: The system should be broken down into modular components that can be maintained, tested, and upgraded independently.

  • Efficiency: System design ensures optimal use of resources (like CPU, memory, bandwidth) while maintaining good performance, low latency, and high throughput.

  • Cost-Effectiveness: Designing a system that meets performance and reliability needs within a budget constraint.

  • Availability: Ensuring the system is up and running without long downtimes.

Types of System Design

System design can be classified into two broad categories:

  • High-Level Design (HLD)

  • Low-Level Design (LLD)

High-Level Design (HLD)

  • HLD focuses on the overall structure or architecture of a system. It is like creating a blueprint of a house before building it.

  • The goal is to map out the system's major components, how they interact, and where they will be located (i.e., what services, databases, or APIs the system will have).

  • It focuses on how the system works as a whole.

  • HLD documents are written for stakeholders, architects, and senior engineers.

Key Elements of HLD

  • Architecture: How the system is divided into different components (e.g., web servers, databases, APIs).

  • Data Flow: How data moves between different parts of the system (e.g., requests from users to the server and back).

  • Technology Stack: Identifying which tech (e.g., Node.js, Cassandra, GCP) will be used in the system.

  • Interaction Between Services: How different services communicate, using APIs.

  • Scaling Strategy: How the system will scale when more users or data are added.

Example: Imagine designing a basic e-commerce website like Amazon. HLD would cover:

  • Web servers to handle user requests.

  • A database to store product information and user data.

  • Payment gateways to process transactions.

  • An API to handle communication between the front-end and back-end.

  • Load balancers to distribute traffic across multiple servers.

Low-Level Design (LLD)

  • LLD focuses on the specific implementation details of individual components. It’s like deciding how each room in a house will be designed after you’ve created the house blueprint.

  • The goal is to detail how each part of the system will work, which algorithms or functions will be used, and how data will be stored or processed.

  • It focuses on the detailed design of individual components and modules.

  • It is intended for developers and engineers who will implement the system.

Key Elements of LLD

  • Class Diagrams: Show how different classes or modules interact.

  • Data Structures: How data is organised and accessed (e.g., arrays, hashmaps).

  • Algorithms: Specify logic for handling tasks (e.g., searching, sorting, or caching).

  • API Design: Details how individual APIs function, including inputs, outputs, and error handling.

  • Database Schema: Defines the tables and relationships in the database.

Example: Returning to the e-commerce website, LLD would cover:

  • Designing the database schema with tables like Users, Products, and Orders.

  • Writing the logic to add items to a shopping cart or process payments.

  • Defining how APIs handle product search requests.

  • Choosing the best data structures for storing product information.
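
To make these LLD decisions a little more concrete, here is a minimal sketch using Python's standard-library `sqlite3` module. The column names, the `add_to_cart` helper, and the choice of a dictionary for the cart are all assumptions made for illustration, not a prescribed design.

```python
import sqlite3

# Hypothetical schema: table names follow the example above,
# but every column is an assumption made for this sketch.
SCHEMA = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    price_cents INTEGER NOT NULL CHECK (price_cents >= 0)
);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    product_id INTEGER NOT NULL REFERENCES products(id),
    quantity INTEGER NOT NULL CHECK (quantity > 0)
);
"""


def create_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_to_cart(cart: dict, product_id: int, quantity: int = 1) -> dict:
    """Add a product to an in-memory cart mapping product_id -> quantity."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    cart[product_id] = cart.get(product_id, 0) + quantity
    return cart


def cart_total_cents(conn: sqlite3.Connection, cart: dict) -> int:
    """Price the cart by looking up each product in the products table."""
    total = 0
    for product_id, quantity in cart.items():
        row = conn.execute(
            "SELECT price_cents FROM products WHERE id = ?", (product_id,)
        ).fetchone()
        if row is None:
            raise KeyError(f"unknown product {product_id}")
        total += row[0] * quantity
    return total
```

A dictionary keyed by product ID keeps "add one more of the same item" a constant-time update, which is one example of the data-structure choices LLD is meant to settle.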

Monolithic vs Microservices Architecture

Monolithic Architecture

Monolithic Architecture is a traditional model where all components of an application are integrated into a single, unified unit. Think of it like a large, single building where all rooms (components) are connected.

Characteristics

  • Single Codebase: All functionalities (frontend, backend, database logic) are housed within one codebase.

  • Tightly Coupled: All parts of the application are closely linked. A change in one area often requires updates across the entire application.

  • Single Deployment: The entire application is deployed as a single package. If you make changes to one part, you need to redeploy the whole application.

  • Easier to Develop: For smaller applications, development and testing can be simpler because everything is in one place.

Figure: Monolithic architecture (image credit: Atlassian)

Advantages:

  • Simplicity: Easier to develop, test, and deploy for small applications.

  • Performance: Communication between components is faster since they’re part of the same process (no network overhead).

  • Lower Initial Overhead: Requires less operational complexity initially.

Disadvantages:

  • Scalability Challenges: As the application grows, it becomes more challenging to scale parts of the application independently.

  • Slower Deployment: Changes require redeploying the entire application, which can slow down the deployment process.

  • Harder to Maintain: A large codebase can become unwieldy, making it difficult for new developers to understand and contribute.

Microservices Architecture

Microservices Architecture is an approach where an application is broken down into smaller, independent services that can be developed, deployed, and scaled independently. Think of it as a collection of small buildings (services) that can operate on their own but still work together to form a community (the entire application).

Characteristics

  • Independent Services: Each service focuses on a specific functionality (e.g., user authentication, product catalogue, order processing).

  • Loosely Coupled: Services interact with each other through APIs (Application Programming Interfaces). Changes to one service generally don’t affect others.

  • Multiple Deployments: Each service can be developed, deployed, and scaled independently. You can update one service without redeploying the whole system.

  • Technology Diversity: Different services can be built using different technologies or languages best suited to their needs.

Figure: Microservices architecture (image credit: Atlassian)

Advantages

  • Scalability: You can scale individual services based on demand.

  • Faster Deployments: Each service can be deployed independently, speeding up the development and release cycle.

  • Easier Maintenance: Smaller codebases for each service are easier to manage and understand, especially for larger teams.

Disadvantages

  • Increased Complexity: Managing multiple services can lead to complexities in deployment, monitoring, and inter-service communication.

  • Network Latency: Communication between services involves network calls, which can introduce latency compared to a monolithic structure.

  • Data Management Challenges: Each service might have its own database, which can complicate data consistency and transactions across services.

Monolithic vs Microservices Architecture (Comparison Table)

| Feature | Monolithic Architecture | Microservices Architecture |
| --- | --- | --- |
| Structure | Single unified codebase (like one big building) | Multiple independent services (like a neighbourhood of buildings) |
| Deployment | One deployment package for the entire application | Independent deployment for each service |
| Scalability | Difficult to scale parts of the application independently | Easy to scale individual services based on demand |
| Development | Simpler for small apps but can become complex as they grow | More complex initially but easier for large, evolving systems |
| Communication | Direct calls within the application (fast) | API calls between services (may have latency) |
| Technology Stack | Typically uses a single technology stack | Can use different technologies for different services |
| Maintenance | Harder to maintain as the codebase grows | Easier to maintain smaller, more focused services |
| Example | Imagine a food delivery app that has everything (user authentication, restaurant listings, order processing, and payment) in one big codebase. If you need to update the payment process, you have to deploy the entire application, which can be risky if other parts are not ready. | In the same food delivery app, user authentication, restaurant listings, order processing, and payment would each be a separate microservice. This way, if you want to change how payments are handled, you can update the payment service alone without redeploying the entire application. |
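
The difference in the Communication row can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the "network" is simulated by serializing messages to JSON and back within one process rather than making a real HTTP or RPC call.

```python
import json


def charge(order_id: str, amount_cents: int) -> dict:
    """Payment logic used by both sketches (hypothetical rules)."""
    if amount_cents <= 0:
        return {"order_id": order_id, "status": "rejected"}
    return {"order_id": order_id, "status": "paid"}


def monolith_checkout(order_id: str, amount_cents: int) -> str:
    """Monolith: checkout and payment share a process, so this is a direct call."""
    return charge(order_id, amount_cents)["status"]


def payment_service_handle(request_body: str) -> str:
    """Microservice side: accepts and returns serialized messages only."""
    request = json.loads(request_body)
    result = charge(request["order_id"], request["amount_cents"])
    return json.dumps(result)


def microservice_checkout(order_id: str, amount_cents: int,
                          send=payment_service_handle) -> str:
    """Order service side: talks to payments only through its message API.

    `send` stands in for a network call; in this sketch it is an
    in-process function, so no real latency is involved.
    """
    body = json.dumps({"order_id": order_id, "amount_cents": amount_cents})
    response = json.loads(send(body))
    return response["status"]
```

Both paths give the same answer. What changes is the boundary: the microservice version depends only on the message format, so the payment service could be redeployed or rewritten without touching the caller.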

Why Scalability and Performance are Critical

Scalability

Scalability is the ability of a system to handle increased load, whether that be more users, more data, or more requests, without sacrificing performance or requiring significant redesign. It ensures that your application can grow smoothly alongside your business needs.

There are two types of scalability:

  • Vertical Scaling: Increasing the capacity of a single machine (e.g., adding more RAM or CPUs).
    Vertical scaling has limits (you can only upgrade hardware so much), and it becomes expensive at the high end.

  • Horizontal Scaling: Adding more machines to handle more traffic (e.g., more servers to distribute the load).
    Horizontal scaling is more cost-effective for large systems but introduces complexities in load distribution and data consistency.
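
For horizontal scaling, a rough first estimate of how many servers are needed is simple arithmetic. The sketch below assumes you already know the peak request rate and the rate one server can sustain; the `headroom` parameter (spare capacity kept in reserve) and all figures are illustrative assumptions, not measurements.

```python
import math


def servers_needed(peak_rps: float, per_server_rps: float,
                   headroom: float = 0.3) -> int:
    """Estimate the server count for a peak load.

    `headroom` is the fraction of each server's capacity left unused
    as a safety margin (an assumption of this sketch).
    """
    if per_server_rps <= 0:
        raise ValueError("per_server_rps must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_rps = per_server_rps * (1 - headroom)
    return math.ceil(peak_rps / usable_rps)
```

For example, a peak of 10,000 requests per second on servers that each sustain 500, with 30% headroom, works out to 29 servers. Real capacity planning also has to account for uneven load distribution and shared bottlenecks such as the database.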

Why is Scalability Critical?

  • Handling a Growing User Base

  • Supporting Data Growth

  • Improving System Reliability

  • Cost Efficiency

  • Ensuring Future-Proofing

Performance

Performance refers to how quickly and efficiently a system responds to requests and processes data. Key metrics include latency, throughput, and response time.

Performance Metrics:

  • Latency: The time it takes for a request to travel from the client to the server and back.

  • Throughput: The number of requests the system can handle in a given period.

  • Response Time: How quickly the system can respond to a request (including processing time).

Example: An e-commerce system during a flash sale must be able to scale horizontally to handle spikes in traffic without slowing down response times.
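
These metrics can be measured with a few lines of Python. The sketch below is a simplified, single-threaded harness: `handler` is a hypothetical stand-in for whatever processes a request, and the timings are those seen by the caller, so they correspond most closely to response time.

```python
import math
import statistics
import time


def measure(handler, requests):
    """Call `handler` once per request.

    Returns (latencies in seconds, throughput in requests per second).
    """
    latencies = []
    start = time.perf_counter()
    for request in requests:
        t0 = time.perf_counter()
        handler(request)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    throughput = len(latencies) / elapsed if elapsed > 0 else float("inf")
    return latencies, throughput


def summarize(latencies, percent: int = 95):
    """Mean, median, and nearest-rank percentile of the latencies."""
    ordered = sorted(latencies)
    rank = math.ceil(percent * len(ordered) / 100)  # nearest-rank method
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "percentile": ordered[max(rank, 1) - 1],
    }
```

Percentiles are usually more informative than the mean, because a small number of slow requests can go unnoticed in an average while still affecting many users.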

Why is Performance Critical?

  • User Experience

  • Handling Peak Traffic Efficiently

  • Cost Efficiency

  • Competitive Advantage

  • Handling Complex Data Processing

Key Concepts in System Design

CAP Theorem

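A principle of distributed systems stating that a system can guarantee at most two of three properties at once: Consistency, Availability, and Partition Tolerance. Because network partitions cannot be ruled out in practice, the real choice is between consistency and availability while a partition lasts.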

  • Consistency: All users see the same data at the same time.

  • Availability: Every request to a working node receives a response, even if that response may not reflect the latest write.

  • Partition Tolerance: The system continues to work even if parts of it can’t communicate.

Example: In a global database, you might have to choose between consistency (everyone sees the latest data) and availability (every request still gets a response) when there’s a network issue.
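
The trade-off can be seen in a toy model. The sketch below is not a real database: it simulates two replicas in one process, and the `TwoNodeStore` class and its "CP" and "AP" modes are hypothetical names chosen for illustration.

```python
class Replica:
    """One copy of the data."""

    def __init__(self, name: str):
        self.name = name
        self.data = {}


class TwoNodeStore:
    """Toy two-replica store; `mode` is "CP" or "AP"."""

    def __init__(self, mode: str):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False  # True while the replicas cannot talk

    def write(self, replica: Replica, key, value) -> None:
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse the write rather than diverge.
                raise RuntimeError("unavailable during partition")
            # Availability first: accept locally; replicas now disagree.
            replica.data[key] = value
            return
        # Healthy network: update both replicas together.
        replica.data[key] = value
        other.data[key] = value

    def read(self, replica: Replica, key):
        return replica.data.get(key)
```

In "CP" mode a write during the partition fails, but both replicas keep returning the same value. In "AP" mode the write succeeds, and the two replicas return different values until they are reconciled.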

Latency and Throughput

Latency: The time it takes for a request to travel from a user to the system and back.

  • Example: Clicking "buy" on an e-commerce website takes 2 seconds to complete—this is the latency.

Throughput: The number of requests the system can handle in a given period.

  • Example: A payment gateway can process 1000 transactions per second—this is throughput.
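
Latency and throughput are linked by Little's Law: the average number of requests in flight equals throughput multiplied by average latency. The sketch below applies it to the two figures above, assuming purely for illustration that they describe the same system, which the examples do not state.

```python
def in_flight_requests(throughput_per_s: float, latency_s: float) -> float:
    """Little's Law: average concurrency = throughput x average latency."""
    return throughput_per_s * latency_s


def max_throughput(concurrency: float, latency_s: float) -> float:
    """Rearranged: throughput achievable with a fixed number of in-flight requests."""
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    return concurrency / latency_s
```

At 1000 transactions per second and 2 seconds per request, about 2000 requests are in progress at any moment. This is why lowering latency raises the throughput a fixed pool of workers or connections can deliver.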

Load Balancing

Distributing incoming requests across multiple servers to ensure no single server gets overwhelmed.
Example: If an e-commerce website gets millions of visitors during a sale, a load balancer sends requests to multiple servers to keep the system running smoothly.
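
One of the simplest distribution strategies is round robin, where each new request goes to the next server in turn. The sketch below is a minimal illustration with hypothetical server names. Production load balancers also track server health and current load, which this does not.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers):
        self._servers = list(servers)
        if not self._servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(self._servers)

    def pick(self) -> str:
        """Return the server that should receive the next request."""
        return next(self._cycle)


# Usage: five requests spread over three hypothetical servers.
balancer = RoundRobinBalancer(["server-1", "server-2", "server-3"])
assignments = [balancer.pick() for _ in range(5)]
```

Round robin spreads requests evenly when they cost roughly the same. When request costs vary widely, strategies such as least connections tend to balance load better.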

Caching

Storing frequently accessed data in fast temporary storage (a cache) to reduce the time it takes to fetch it from the main database.
Example: When a user visits a website, their browser might cache images, so the next time they visit, the images load faster.
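
On the server side, a common arrangement is to check the cache first and fall back to the database only on a miss. The sketch below is a minimal in-memory version with a time-to-live (TTL) so stale entries eventually expire. The `TTLCache` class and the `loader` callback are hypothetical names, and the injectable `clock` exists only to make the behaviour easy to test.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value, or call `loader(key)` and cache the result."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no trip to the slow source
        value = loader(key)  # miss or expired: fetch from the source
        self._entries[key] = (now + self._ttl, value)
        return value
```

The TTL is the main design choice here: a longer value saves more database reads but lets users see outdated data for longer.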

Redundancy and Fault Tolerance

Redundancy: Having backup components in case one part of the system fails.
Example: If one server goes down, another one takes over.

Fault Tolerance: The system’s ability to keep working even when parts fail.
Example: On cloud platforms like AWS, a system deployed across multiple servers can be configured so that when one server fails, traffic shifts automatically to another, often without users noticing.
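
Redundancy only helps if something routes around the failure. The sketch below shows one simple way to do that on the client side: try each replica in order and return the first successful response. The replicas are modelled as plain Python callables that raise `ConnectionError` when down, which is an assumption of this sketch rather than how any particular cloud service behaves.

```python
def call_with_failover(replicas, request):
    """Try each replica in order; return the first successful response.

    Raises RuntimeError if every replica fails.
    """
    last_error = None
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            last_error = exc  # remember the failure and try the next replica
    raise RuntimeError(f"all {len(replicas)} replicas failed") from last_error
```

From the caller's point of view, a single failed server is invisible as long as one replica is still up, which is the behaviour fault tolerance aims for.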

Types of Systems

OLTP (Online Transaction Processing)

OLTP systems are designed to handle a large number of short, fast, real-time transactions that involve reading and writing data to a database. They are optimized for transactional workloads where the speed and reliability of operations (like updating data) are critical. The system needs to be highly responsive, with a focus on quick query processing and maintaining data consistency.

Example: Banking systems, where users need to make transactions like withdrawals or deposits instantly.

OLAP (Online Analytical Processing)

OLAP systems are designed to support complex queries for analyzing large volumes of historical data. Instead of focusing on real-time transaction processing, OLAP systems prioritize data analysis, reporting, and extracting insights from existing data, often aggregating and summarizing information for decision-making. These systems are optimized for read-heavy workloads where queries are run to perform in-depth analysis of datasets, often in the context of business intelligence.

Example: Data warehouses are used for business intelligence purposes, where queries run to generate insights or reports.

Key Differences Between OLTP and OLAP

| Feature | OLTP | OLAP |
| --- | --- | --- |
| Purpose | Real-time transaction processing | Data analysis and reporting |
| Data Operations | Frequent read/write operations | Read-heavy operations with complex queries |
| Data Type | Current, up-to-date data | Historical, aggregated data |
| Query Complexity | Simple, short queries | Complex queries (aggregations, joins) |
| Database Structure | Highly normalized for efficiency | De-normalized or multidimensional for analysis |
| Response Time | Very fast, low-latency | Typically slower, but optimized for complex queries |
| Example Systems | Banking systems, e-commerce websites | Data warehouses, BI systems |
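
The two query styles can be contrasted with Python's standard-library `sqlite3` module. SQLite is used here only because it needs no setup; real OLAP workloads typically run on a data warehouse. The table layout and function names are assumptions made for this sketch.

```python
import sqlite3


def open_bank() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE transfers (id INTEGER PRIMARY KEY, from_id INTEGER, "
        "to_id INTEGER, amount INTEGER, day TEXT)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn, from_id: int, to_id: int, amount: int, day: str) -> None:
    """OLTP style: a short transaction touching a few rows, all or nothing."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE accounts SET balance = balance - ? "
            "WHERE id = ? AND balance >= ?",
            (amount, from_id, amount),
        )
        if cur.rowcount != 1:
            raise ValueError("insufficient funds or unknown account")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, to_id),
        )
        conn.execute(
            "INSERT INTO transfers (from_id, to_id, amount, day) "
            "VALUES (?, ?, ?, ?)",
            (from_id, to_id, amount, day),
        )


def daily_volume(conn):
    """OLAP style: a read-only aggregate over all historical transfers."""
    return conn.execute(
        "SELECT day, COUNT(*), SUM(amount) FROM transfers "
        "GROUP BY day ORDER BY day"
    ).fetchall()
```

`transfer` reads and writes a handful of rows and must finish quickly and atomically, while `daily_volume` scans the whole history to produce a summary, which mirrors the Data Operations and Query Complexity rows above.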

Conclusion

In conclusion, understanding system design is crucial for anyone looking to build scalable, efficient, and reliable software systems. Whether you're developing a small application or architecting a complex, distributed system, designing the system's structure, data flow, and performance characteristics lays the groundwork for its success. The principles of system design not only help in creating solutions that meet immediate business needs but also ensure flexibility and scalability for future growth. By learning and applying these concepts, you'll be better equipped to solve real-world challenges, optimize resource usage, and deliver high-performance systems that can grow with your users and data demands.