ReZero's Utopia.

Site Architecture

2020/07/18

Summary

The whole process of evolution

The initial tiny site

  1. The application server needs a more powerful CPU to handle complex business logic.

  2. The database server needs faster disks and more memory to speed up disk retrieval and data caching.

  3. The file server likewise needs larger disks.

Note: About 80% of the business traffic hits 20% of the data.

Example: when you search on Baidu, you usually look only at the first few result pages.

The site is growing

  1. Separate database reads from writes.

  2. CDN & reverse proxy

Both the CDN and the reverse proxy work by caching.
The CDN serves users from the network provider's node closest to them, while the reverse proxy sits in the site's central data center.

  3. Distributed file systems and distributed databases.
  4. As the business gets more complex, the demands on data retrieval rise, so consider NoSQL or non-database query technology such as a search engine.
  5. Business splitting

  6. Distributed services
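
Read/write separation can be pictured as a small router in front of the database servers. The following is a minimal sketch, not a production design: the server names and the rule "a statement starting with SELECT is a read" are assumptions made for illustration.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and spread reads over the replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replica is configured.
        self._readers = itertools.cycle(replicas or [primary])

    def route(self, sql):
        # Hypothetical rule: only plain SELECT statements count as reads.
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._readers) if is_read else self.primary


router = ReadWriteRouter("primary-db", ["replica-1", "replica-2"])
targets = [router.route(q) for q in (
    "SELECT * FROM users",
    "UPDATE users SET name = 'a'",
    "select 1",
)]
print(targets)  # ['replica-1', 'primary-db', 'replica-2']
```

A real setup also has to deal with replication lag, which this sketch ignores.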

Value

  1. A large site grows step by step; it is not rebuilt or created from scratch.

  2. The real driving force is business development.

Business shapes the technology, just as a career shapes a person.

Common misunderstandings

  1. Blindly pursuing the solutions used by large sites.

  2. Pursuing technology for technology's sake (instead of for the business).

  3. Technology is sometimes not the real issue (e.g. 12306).

Architecture Pattern

Pattern

  1. Layering (horizontal): application, service, data.

    • Advantage: as long as the interfaces stay stable, each layer can concentrate on its own work.

    • Disadvantage: layer boundaries and interfaces have to be planned carefully.

  2. Segmentation (vertical): divide by function and business.

  3. Distribution: both of the patterns above serve this goal.

    • distributed applications and services

    • distributed static resources

    • distributed data and storage

    • distributed computing

  4. Server clustering

Several servers deploy the same application and serve requests behind a load balancer.

  5. Cache

    • CDN

    • Reverse proxy

    • Local cache

    • Distributed cache

    • Preconditions:

      • cached data is valid only for a limited period

      • access is unevenly distributed, so the hot data is what should go into the cache

  6. Asynchronous processing

    • improve system availability

    • improve the site's response speed

    • smooth out peaks of concurrent access

  7. Redundancy

    • cold backup

    • hot backup

  8. Automation

    • code management

    • testing

    • security checks

    • deployment

    • monitoring

    • alerting

    • failover

    • failure recovery

    • degradation

    • resource allocation

Sina apply example

  1. Initial: LAMP

    • Base layer: supporting services such as database, storage, cache and search.

    • Middle layer: platform services and application services.

    • Top layer: the API, third-party services and Sina's own business.

  2. MPSS: today's solutions tend to virtualize the physical machine instead. That way services can even use the same port, which MPSS cannot do.

  3. Multi-level cache

Core elements of architecture

What is architecture?

The highest level of planning: decisions that are hard to change.

Keep these in balance:

  • Performance

    Browser: caching, page compression, fewer cookies in transfer, sensible page layout.

    Server: CDN, local and distributed cache, asynchronous message queue

    Code: multithreading, memory management

    Database: indexes, caching, SQL optimization, NoSQL

  • Availability

    Application servers should not store session information.

    Storage servers should be backed up in real time.

    Measure: check whether the whole system still works when some servers go down.

  • Scalability

    Application cluster: add servers behind the load balancer.

    Cache cluster: cache routing algorithm

    Database cluster: routing and partitioning

  • Extensibility

    Event-driven architecture: message queue

    Distributed services: split the business, reuse services, and call them through a distributed service framework

  • Security

Architecture

Instant response

Web Site Performance Test

  1. Different views of site performance

    • User view: speed

      • Mostly front end: optimize HTML and CSS, CDN, reverse proxy, caching strategy

    • Developer view:

      • Use caching to speed up data access; distributed processing

      • Improve read and write capacity with clusters

      • Speed up responses with asynchronous messages

    • Operations view:

      • Network operator bandwidth

      • Server hardware configuration

      • Data center network architecture

  2. Performance test metrics

    • Response time: measure how long an operation takes

    • Concurrency: test with multiple threads

    • Throughput: TPS, QPS, HPS

    • Performance counters: system load (top command)

      top - 18:38:43 up 5 days,  8:33,  1 user,  load average: 0.00, 0.01, 0.05
      Tasks: 74 total, 1 running, 73 sleeping, 0 stopped, 0 zombie
      %Cpu(s): 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
      KiB Mem : 1883860 total, 106724 free, 456832 used, 1320304 buff/cache
      KiB Swap: 0 total, 0 free, 0 used. 1115392 avail Mem

      PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
      1 root 20 0 43076 3528 2408 S 0.0 0.2 0:06.31 systemd
  3. Performance test methods

    • Performance test

    • Load test

    • Stress test

    • Stability test

    • Performance test report example
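
The metrics above (response time, concurrency, throughput) can be measured with a very small harness. This is only a sketch under assumptions: the workload is a hypothetical `time.sleep` standing in for a real request, and the request count and thread count are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(func):
    """Run func once and return its response time in seconds."""
    start = time.perf_counter()
    func()
    return time.perf_counter() - start


def load_test(func, requests=20, concurrency=4):
    """Issue `requests` calls from `concurrency` threads and summarize them."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        times = list(pool.map(lambda _: timed_call(func), range(requests)))
    elapsed = time.perf_counter() - started
    return {
        "requests": requests,
        "avg_response_s": sum(times) / len(times),
        "max_response_s": max(times),
        "throughput_tps": requests / elapsed,
    }


# Hypothetical workload standing in for a real HTTP request.
report = load_test(lambda: time.sleep(0.01))
print(report)
```

Raising `concurrency` step by step while watching the average response time and throughput is the basic idea behind load and stress tests.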

Front end optimise

  1. Browser

    • Reduce HTTP requests (merge images, CSS and JS)

    • Use the browser cache (headers: Cache-Control, Expires; update icons one file at a time rather than all at once; when a file changes, change the name it is referenced by instead of overwriting it, e.g. point the HTML at a new JS file name)

    • Turn on gzip compression

    • Put CSS in the head and JS at the end of the body

    • Reduce cookie transfer (for example for static resources)

  2. CDN & Reverse Proxy

    • Faster loading, fewer requests, security; load balancing, caching
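
One common way to "change the referenced name instead of the file" is to put a content hash into the file name. The sketch below assumes that approach; the path `static/app.js` and the 8-character hash length are illustrative choices, not something the notes prescribe.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_name(path, content):
    """Return the file name with a short content hash before the extension."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old = fingerprinted_name("static/app.js", b"console.log('v1');")
new = fingerprinted_name("static/app.js", b"console.log('v2');")
print(old, new)
```

Because the name changes whenever the content changes, the file itself can be cached for a long time and the HTML only needs to reference the new name.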

Application server optimise

  1. Distributed cache
  2. Using the cache sensibly

    • Frequently updated data (not worth caching)

    • Access without hot spots (not worth caching)

    • Data inconsistency and dirty reads

    • Cache availability

    • Cache warm-up (e.g. metadata)

    • Cache penetration (keys that do not exist should also be cached, with a null value)

  3. Distributed cache architecture

    • JBoss Cache and Memcached

      • Libevent

      • Memory management: slab, chunk, LRU

  4. Code optimization

    • Multithreaded code

      • Threads to start = [task execution time / (task execution time - IO wait time)] * number of CPU cores

      • Stateless objects

      • Local objects

      • Lock concurrent access to shared resources

  5. Resource reuse

    • Singleton

    • Object pool

  6. Data structures & algorithms

    • Hash: Time33

    • Garbage collection

      • Stack: function arguments, local variables

      • Heap: object creation and release, garbage collection
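
The thread-count formula listed under code optimization can be checked with a few lines of arithmetic. This is a small sketch; the function name and the sample numbers (a 100 ms task with 80 ms of IO wait on 4 cores) are made up for illustration.

```python
import math


def threads_to_start(task_time, io_wait_time, cpu_cores):
    """Threads = [task time / (task time - IO wait time)] * CPU cores."""
    cpu_time = task_time - io_wait_time
    if cpu_time <= 0:
        raise ValueError("IO wait time must be shorter than the task time")
    return math.ceil(task_time / cpu_time * cpu_cores)


# A task taking 100 ms, 80 ms of which is IO wait, on a 4-core machine.
print(threads_to_start(100, 80, 4))  # 20
```

With no IO wait at all the result equals the number of cores, which matches the intuition that CPU-bound work gains nothing from extra threads.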

Storage performance optimise

  • SSD (B+ potential)

  • B+ tree vs. LSM tree

  • RAID vs. HDFS

    • RAID
    • HDFS: NameNode & DataNode (MapReduce)

No danger of anything going wrong

Measurement and assessment

Cluster session

  • Session replication (small clusters)

  • Session binding (…)

  • Keeping the session in a cookie

  • Session server (the best method)
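
Keeping the session in a cookie means the server must be able to detect tampering. A minimal sketch, assuming an HMAC signature is used for that purpose; `SECRET_KEY` and the cookie layout `payload.signature` are hypothetical choices.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"change-me"  # hypothetical server-side secret


def encode_session(data):
    """Serialize the session and append an HMAC so tampering is detectable."""
    payload = base64.urlsafe_b64encode(json.dumps(data, sort_keys=True).encode())
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{signature}"


def decode_session(cookie):
    """Return the session dict, or None when the signature does not match."""
    payload, _, signature = cookie.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


cookie = encode_session({"user_id": 42})
print(decode_session(cookie))        # {'user_id': 42}
print(decode_session("x" + cookie))  # None
```

Any application server holding the key can read the session, so no server needs to store it. The cost is the cookie size limit and the extra bytes sent with every request.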

High availability data

  • CAP (usually choose A and P and relax C)

  • Backup
    cold (abandoned), hot (master-slave)

  • Failover

    • Detection: heartbeat (keep-alive), reports of failed access

    • Access transfer: recompute the route to find a working server

    • Data recovery: restore the number of replicas

Monitor

  1. Data collection

    • User behavior collection

      • Server log collection

      • Client browser log collection via JS (tool: log analysis with Storm)

    • Server performance monitoring

      • Load, memory, disk IO, network IO (tool: Ganglia)
    • Data reports

  2. Monitoring management

    • System alerts

    • Failover

    • Automatic degradation

Scalable architecture

Architecture design

  1. Separate different functions onto different physical servers

  2. Scale a single function with a cluster

Application server cluster design (load balancing)

  1. HTTP redirect (302; works poorly for SEO)
  2. DNS
  3. Reverse proxy
  4. IP
  5. Data link layer (direct routing) [Linux tool: LVS]
  6. Algorithms

    • Round robin

    • Weighted round robin

    • Random

    • Least connections

    • Source hashing
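
The five scheduling algorithms listed above are each only a few lines long. The sketch below is illustrative: the server names, weights and connection counts are invented, and a real load balancer would track connections itself.

```python
import hashlib
import itertools
import random


def round_robin(servers):
    """Yield servers in turn, forever."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Yield each server as many times per cycle as its weight."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def pick_random(servers):
    """Pick any server with equal probability."""
    return random.choice(servers)


def least_connections(connections):
    """Pick the server with the fewest active connections."""
    return min(connections, key=connections.get)


def source_hash(client_ip, servers):
    """Map the same client IP to the same server every time."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


servers = ["app-1", "app-2", "app-3"]  # hypothetical server names
rr = round_robin(servers)
print([next(rr) for _ in range(4)])   # ['app-1', 'app-2', 'app-3', 'app-1']
wrr = weighted_round_robin({"app-1": 2, "app-2": 1})
print([next(wrr) for _ in range(3)])  # ['app-1', 'app-1', 'app-2']
print(least_connections({"app-1": 12, "app-2": 3, "app-3": 7}))  # app-2
print(source_hash("10.0.0.8", servers) == source_hash("10.0.0.8", servers))  # True
```

Source hashing keeps a client on one server, which is one way to get the session binding mentioned under cluster sessions.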

Distributed Cache Cluster Design

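  1. Memcached access model
  2. Memcached scaling challenge

    • When: the distributed cache cluster needs to be expanded

    • Requirement on the routing design: after new servers are added, as much of the already cached data as possible should still be reachable

  3. Distributed cache hash algorithm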
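
A common answer to the cache scaling challenge is consistent hashing, where adding a server moves only a fraction of the keys. The following is a minimal sketch assuming that technique; the node names, the 100 virtual nodes per server and the use of SHA-256 are illustrative choices.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class ConsistentHashRing:
    def __init__(self, nodes=(), virtual_nodes=100):
        self.virtual_nodes = virtual_nodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each server appears many times on the ring to even out the load.
        for i in range(self.virtual_nodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def get(self, key):
        """Return the first node clockwise from the key's position."""
        index = bisect.bisect(self._ring, (_hash(key), ""))
        return self._ring[index % len(self._ring)][1]


keys = [f"user:{i}" for i in range(1000)]
ring = ConsistentHashRing(["cache-1", "cache-2", "cache-3"])
before = {k: ring.get(k) for k in keys}
ring.add("cache-4")
moved = sum(1 for k in keys if ring.get(k) != before[k])
print(f"{moved} of {len(keys)} keys moved after adding a node")
```

With simple modulo hashing, going from three to four servers would remap roughly three quarters of the keys; here only the keys taken over by the new server move.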

Data Storage Server Cluster Design

  1. Relational database scaling (Cobar, Greenplum)

  2. NoSQL database (Apache HBase)

Site Extensibility

Building an extensible architecture

Decoupling modules

Distributed Message Queue

  1. Event-driven architecture

  2. Distributed message queue

    • ESB, SOA
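
The decoupling that a message queue provides can be shown inside a single process. This sketch uses Python's in-memory `queue.Queue` as a stand-in for a distributed message queue; the message names and the `None` shutdown signal are assumptions for the example.

```python
import queue
import threading

orders = queue.Queue()
processed = []


def consumer():
    """Handle messages until the None sentinel arrives."""
    while True:
        message = orders.get()
        if message is None:
            break
        processed.append(f"handled {message}")


worker = threading.Thread(target=consumer)
worker.start()

# The producer returns as soon as the message is queued; it does not wait for the handler.
for order_id in (1, 2, 3):
    orders.put(f"order-{order_id}")

orders.put(None)  # hypothetical shutdown signal
worker.join()
print(processed)  # ['handled order-1', 'handled order-2', 'handled order-3']
```

The producer knows only the queue, not the consumer, so new consumers can be added without touching the producing module.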

Reuse Platform

  • Problems

    • Compiling and deploying is difficult

    • Code patch management is difficult

    • Database connections get exhausted

    • Adding new business is difficult

  • Splitting

    • Vertical: separate applications

    • Horizontal: distributed services

  1. Web Service & enterprise services

    1. Server [WSDL] -> Service Broker [UDDI] <- SOAP [Client]

    2. Disadvantages

      • Bloated registration and discovery

      • Inefficient XML serialization

      • Costly HTTP connections

      • Complex deployment and maintenance

  2. Requirements and features of distributed services

    • Load balancing, failover, efficient remote communication

    • Heterogeneous systems, minimal intrusion into applications

    • Version control, real-time monitoring

  3. Distributed service framework design

    • Dubbo (NIO communication)

Extensible data structure

  • ColumnFamily

Open platform theory

Security Architecture

  1. XSS

    • Filter and escape special characters

    • HttpOnly

  2. Injection

    • Open source, error echo, blind injection; defenses: filtering and escaping, parameter binding (also OS injection)
  3. CSRF

    • Form token, verification code, Referer check
  4. Others

    • Error codes, HTML comments, file upload, path traversal
  5. Web application firewall

    • ModSecurity
  6. Web security scanner
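
Two of the defenses above, escaping for XSS and parameter binding for SQL injection, fit in a short example. This is a sketch using Python's standard `html` and `sqlite3` modules; the table, the user and the attack strings are invented for illustration.

```python
import html
import sqlite3

# XSS: escape user input before placing it in HTML.
comment = "<script>alert('x')</script>"
safe_comment = html.escape(comment)
print(safe_comment)

# SQL injection: bind arguments instead of building the SQL string by hand.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("alice", "secret"))

malicious = "' OR '1'='1"
rows = conn.execute(
    "SELECT name FROM users WHERE name = ? AND password = ?",
    ("alice", malicious),
).fetchall()
print(rows)  # []
```

Because the malicious string is passed as a bound value, it is compared as a literal password and never parsed as SQL.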

Info encryption and secret key management

  1. One-way hash encryption

    • MD5, SHA
  2. Symmetric encryption

    • DES, RC
  3. Asymmetric encryption

    • HTTPS, RSA
  4. Secret key management
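
A typical use of a one-way hash is storing passwords. The sketch below assumes a salted, iterated SHA-256 (PBKDF2 from Python's `hashlib`) rather than a bare MD5 or SHA digest; the iteration count and salt length are illustrative values.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None):
    """Return (salt, digest); the digest cannot be turned back into the password."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest


def verify_password(password, salt, digest):
    """Hash the candidate with the stored salt and compare in constant time."""
    return hmac.compare_digest(hash_password(password, salt)[1], digest)


salt, digest = hash_password("s3cret")
print(verify_password("s3cret", salt, digest))  # True
print(verify_password("guess", salt, digest))   # False
```

The salt makes identical passwords hash differently, which defeats precomputed lookup tables.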

Info filter & anti-spam

  1. Text matching

    • Trie (base array: storage, check array: state)

    • Multilevel hash matching

  2. Classification

    • Bayes (advanced) -> TAN -> ARCS
  3. Blacklist

    • Bloom filter
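
A Bloom filter stores a blacklist in a fixed-size bit array at the price of occasional false positives. This is a minimal sketch; the bit-array size, the number of hash functions and the sample addresses are arbitrary assumptions.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=8192, hash_count=4):
        self.size = size_bits
        self.hash_count = hash_count
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive several bit positions by salting one hash function.
        for i in range(self.hash_count):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


blacklist = BloomFilter()
blacklist.add("spam@example.com")
print("spam@example.com" in blacklist)    # True
print("friend@example.com" in blacklist)  # almost certainly False; false positives are possible
```

An address that was added is always found, so the filter never lets a blacklisted sender through; it can only block an innocent one by mistake.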

E-commerce Risk Control

  1. Risks

    • Account, seller, buyer, transaction
  2. Risk control

    • Rule engine

    • Statistical model

Example

TaoBao

Evolution

  1. LAMP

  2. 2004, eBay: PHP -> Java, MySQL -> Oracle, MVC: WebX, ORM: iBatis, management: antx, server: WebLogic

Note: Taobao chose free software at the beginning and commercial software when the site was growing fast. Both were the right decisions.

  3. Abandoned EJB and adopted Spring; JBoss (and later Jetty) instead of WebLogic.

From this point on Taobao began to make real progress; much of the technology that became its foundation dates from this period.

Wiki

The whole wiki

  • GeoDNS, LVS, Squid, Lighttpd, PHP, Memcached, Lucene, MySQL

Wiki performance optimise strategy

  1. Front end

  2. Server

    • APC

    • ImageMagick

    • TeX

    • Replace the strtr function

  3. Back end

Doris

Solutions

  1. Fault classification

    • Transient, temporary, permanent
  2. Access in the normal state

  3. High-availability solution for transient faults

  4. High-availability solution for temporary faults
  5. High-availability solution for permanent faults

The machines cannot tell a temporary fault from a permanent one.

So a person has to make that call.

Online shopping spike

Challenges

  1. Impact on the existing business

  2. Load on the application and the database

  3. Bandwidth

  4. Direct access to the order URL

Strategy

  1. Deploy independently

  2. Make pages static (fewer requests)

  3. Rent bandwidth (CDN)

  4. Dynamically generate a random order-page URL
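
The random order-page URL keeps users from ordering before the sale opens by guessing the address. The sketch below is one hypothetical way to do it: the URL layout `/order/<token>`, the in-memory token set and the start time are all assumptions.

```python
import secrets
import time

SALE_START = time.time() + 3600  # hypothetical start: one hour from now
_valid_tokens = set()


def order_page_url(now=None):
    """Return a one-off order URL once the sale has started, else None."""
    now = time.time() if now is None else now
    if now < SALE_START:
        return None
    token = secrets.token_urlsafe(16)
    _valid_tokens.add(token)
    return f"/order/{token}"  # hypothetical URL layout


def can_order(url):
    """Accept only URLs whose token was handed out by order_page_url."""
    return url.rsplit("/", 1)[-1] in _valid_tokens


print(order_page_url())  # None, the sale has not started
url = order_page_url(now=SALE_START + 1)
print(can_order(url), can_order("/order/guess"))  # True False
```

In a cluster the tokens would have to live in shared storage such as a distributed cache rather than in one process.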

Architecture Design

  1. Spike button control
  2. Spike process & architecture design

Fault case analysis

Faults caused by logging

  • Log output level: global debug

  • Lessons:

    • Configure your own logs and third-party logs separately

    • Set the log level to at least warn, and check that the logging calls in the code use a level that matches their real severity

    • Turn off useless third-party logs (most of them are error logs)
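
The logging lessons above translate directly into configuration. This sketch uses Python's standard `logging` module; the logger names `myapp` and `noisy_vendor` are hypothetical.

```python
import logging

# Application logs: WARNING and above only.
logging.basicConfig(format="%(name)s %(levelname)s %(message)s")
app_log = logging.getLogger("myapp")  # hypothetical application logger
app_log.setLevel(logging.WARNING)

# A noisy third-party library gets its own, stricter setting.
vendor_log = logging.getLogger("noisy_vendor")  # hypothetical library name
vendor_log.setLevel(logging.CRITICAL)

app_log.debug("not written: below WARNING")
app_log.warning("written")
vendor_log.error("not written: the vendor logger only lets CRITICAL through")
```

Giving each logger its own level avoids the "global debug" setting that caused the fault.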

Faults caused by highly concurrent database access

  • Lessons:

    • The home page should not access the database

    • The home page is best served as a static page

Faults caused by lock contention under high concurrency

Faults caused by the cache

Faults caused by applications starting out of order

  • Apache, JBoss

  • Start JBoss first, then request it with curl; once that succeeds, start Apache

Faults caused by large-file I/O monopolizing the disk

  • Small files should have their own storage instead of sharing the distributed large-file storage system.

Abuse of the production environment

  • Access to the production environment should be regulated (DBA)

Non-standard process

  • Diff the code before you push it

  • Strengthen code review

Code habits

  • Check for null when you are not sure about the state of an input object

  • Null object pattern
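
The null object pattern replaces "return nothing and hope the caller checks" with an object that behaves harmlessly. A minimal sketch; the `User` and `NullUser` classes and the in-memory store are invented for the example.

```python
class User:
    def __init__(self, name):
        self.name = name

    def display_name(self):
        return self.name


class NullUser:
    """Stands in for a missing user so callers need no None checks."""

    def display_name(self):
        return "guest"


USERS = {1: User("alice")}  # hypothetical in-memory user store


def find_user(user_id):
    # Return a NullUser instead of None when the lookup fails.
    return USERS.get(user_id, NullUser())


print(find_user(1).display_name())   # alice
print(find_user(99).display_name())  # guest
```

Callers can chain `find_user(...).display_name()` without a null check, so a forgotten check can no longer crash the page.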

Architects

The art of leadership

Focus on people, not products

Discover people's strengths

Share the blueprint

Learn to compromise

Engage and develop others

Career guide

Find problems

Raise problems and seek support

Site architects

Effect

  • Designer, firefighter, evangelist, geek

Result

  • Sherpa, Spartan, VIP

Duty

  • Product, basic services, infrastructure

Attention level

  • Functional, non-functional (performance and others), team organization, product future, product operations

Reputation

  • Best, good, normal, bad, worst

Non-mainstream

  • Normal, literary, 1+1

Appendix

Front End

  • Browser optimization

    • Caching, fewer HTTP requests, page compression
  • CDN

  • Static resources should be stored on their own server cluster

  • Images (not logos and the like, but user uploads such as avatars)

    • Own servers & subdomain
  • Reverse proxy

  • DNS

Application layer architecture

  • Development framework

  • Page Rendering

  • Load balancing

  • Session management

  • Static rendering of dynamic pages

  • Business split

  • Virtual server

Service layer architecture

  • Distributed message

  • Distributed service

  • Distributed cache

  • Distributed configuration

Storage architecture

  • Distributed files

  • Relational database

  • NoSQL database

  • Data synchronization

Back-end architecture

  • Search engine

  • Data warehouse

  • Recommendation system

Data collection and monitor

  • Browser data collection

  • Server business data collection

  • Server performance collection

  • System monitor

  • System alert

Security Architecture

  • Web Attack

  • Data protection

Data Center architecture

  • Computer room

  • Cabinet

  • Server architecture

Appendix B

  1. The early Web: static HTML only

  2. CGI brought dynamic page content

    • Process: the server passes the request to a CGI program, which does the computing and generates the HTML.

    • CGI programs use Perl; with Java, the web container calls a servlet.

    • PHP (ASP, JSP) eased the coupling between business code and page code that CGI caused

    • MVC (combines CGI and the web server)

Done
