Scaling with Rails (Fri Nov 17, lecture 32)

Homework due for today


  1. PR Meetings: Now that you have all had your second PR meeting, please write a progress report on your project. It should include: the feedback you got from the PR, the action items from the meeting, your summary of where your project stands, and an indication of how your team is dividing up the work. Team Deliverable: your report, as a PDF
  2. Interesting, optional reading: Why our classes eventually reach 50 columns and hundreds of methods

Discussion

  • Notes from reading the “Self Knowledge” reflection
    • Everyone agrees this course is a lot of work.
    • Several people reported that they really overworked themselves, to the detriment of their other courses or their mental health
    • Most people are very positive about going into software engineering
    • Many people said they preferred working on the [X] part more than the [Y] part
    • Some people noticed that their time management could be better
    • Frustration mentioned with late changes to homework and deadlines
    • Frustration mentioned with TDD or writing tests

Thought experiment

  • Real world example: cafeteria flow chart
  • Optimization: the search for bottlenecks.
  • What’s a bottleneck? Refer back to the cafeteria example.
  • Moving target:
    • When you eliminate/improve one bottleneck, it just reveals the next one.
    • You make starting the dashboard activity faster… so that now you can notice that drawing the map overlay is slow.
  • Important: Measurement
  • Worst sin: optimizing early. Why?

Performance

  • Performance is what a user experiences as “slow” or “fast”
  • Response time to an operation initiated by the user
  • Perception!
    • Can you ‘fool’ the user into thinking the app is faster than it is?
    • Feedback: spinners etc
    • Anticipation: start doing work before user requests it
  • Different from (but intertwined with) scaling

Scaling

  • “How many X per minute can you do?”
    • (e.g. user log-ins, page refreshes, notifications, …)
  • How many (users, sessions, videos, pictures, etc) does the site need to support
  • Different from response time: “How long does it take to accomplish Y?” Related but different
  • Scaling has to do with the load on the servers
  • Big challenge: how fast or slow will the site or app grow?
  • Architectural techniques apply equally
    • scaling up vs. scaling out
    • caching
    • load balancing
    • database partitioning and sharding
    • asynchronous processing

Patterns of scaling problems and solutions

  • “Clients” = web browsers accessing the site, mobile apps accessing the site, etc.
  • Load on the servers. Some scenarios, one or more of:
    • Too many clients asking the server to do operation O
    • Individual clients asking the server to do operation P too often
    • Operation Q is time consuming for the server to satisfy
  • Solutions can be
    • Add an identical server to handle operations O, P or Q
    • Send operation O to one server and operation P to another server
    • Why are so many clients asking for O? Can we reduce the number?
    • What’s the reason why a client would ask for operation P so often? Can we reduce that?
    • Is there a way to make operation Q faster to satisfy?

Tuning for Scalability

Base configuration

YAGNI Basic Law of Performance and Scale: Never do extra work for scaling until measurement reveals there's a problem and where the problem is.
  • We start with the simplest possible setup. If you had stand-alone servers (which we don’t), you would run the database and the web server on the same box.
  • In our case we are deploying to a cloud service, where you get not a full box but a virtual slice, e.g. from Heroku. There, our base configuration is a single “web worker” to run Sinatra and a single “database” to run Postgres.
  • With that base, measure performance. If you have real load, you can measure it directly. In our case we have to create an artificial load by sending traffic to the server. We want to see how many simultaneous sessions we can support.
  • As you do scalability testing, keep an eye out for errors that are happening. A server will appear to respond super fast to a request if all it is doing is returning an error code!
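A minimal sketch of generating that artificial load in plain Ruby (the target URL and session count are made up; a real test would more likely use a dedicated tool such as Apache Bench or JMeter). It fires concurrent requests and tallies status codes, so the trap above — a server that looks “fast” because it is only returning errors — shows up in the results:

```ruby
require "net/http"
require "uri"

# Fire `sessions` concurrent GETs at `url` and tally the status codes.
# The `request:` hook is an injectable stand-in for the real HTTP call,
# so the sketch can be exercised without a live server.
def measure_load(url, sessions: 10, request: nil)
  uri = URI(url)
  request ||= ->(u) { Net::HTTP.get_response(u).code }
  codes = Hash.new(0)
  mutex = Mutex.new
  started = Time.now
  threads = Array.new(sessions) do
    Thread.new do
      code = request.call(uri)
      mutex.synchronize { codes[code] += 1 }
    end
  end
  threads.each(&:join)
  { elapsed: Time.now - started, status_counts: codes }
end
```

If the `status_counts` come back full of 500s, a short `elapsed` time means nothing.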

Stage One Scalability Tuning

  • Analyze and think about whether the servers are powerful enough. Is it your old laptop running a database server, or a fairly new computer with no other load? If you are using a cloud service, like BlueMix, Heroku or Digital Ocean, what kind of virtual capacity or limits does it have? You might need to simply up the capacity.
  • Examine your database access and queries. Can you determine that pages that don’t have a database call are really fast and the ones that do are slower? Can you see which pages slow down the most?
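One way to make that comparison concrete is to time the two kinds of pages with Ruby’s built-in Benchmark module. The page handlers below are hypothetical stand-ins (the “database” is simulated with a `sleep`), but the shape of the measurement is the point:

```ruby
require "benchmark"

# Hypothetical page with no database work.
def render_static_page
  "<h1>About</h1>"
end

# Hypothetical page whose database round trip is simulated by a sleep.
def render_listing_page
  sleep 0.05
  "<ul>...</ul>"
end

static_time = Benchmark.realtime { render_static_page }
db_time     = Benchmark.realtime { render_listing_page }
puts format("static: %.4fs, with db: %.4fs", static_time, db_time)
```

If the database-backed page dominates, your tuning effort belongs in the next section; if both are slow, suspect the app server or the box itself.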

If queries are the problem

  • Make sure you clearly understand the separation between the web and the database server.
  • Consider whether you should put the database on its own server
  • Consider whether any tables need indexes
  • Consider whether you are going back to the database more than a few times for each page displayed.
    • Consider whether you are hitting the database once for each record displayed (so called N+1 problem).
    • If you are, look at making your queries ‘greedy’, meaning that they bring back more data in a single call to the server
  • Consider whether you are issuing the exact same query over and over again.
  • Consider whether an intermediate query result that is expensive is being requested again. If so, caching those results is a strategy.
  • Look at the metrics on your database server; that is simple to do.
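The N+1 problem above can be sketched in plain Ruby. The model and data here are made up, and each call to `query` stands in for one round trip to the database; in ActiveRecord the “greedy” version corresponds to eager loading with `includes(:author)`:

```ruby
# Each call to `query` simulates one round trip to the database.
$query_count = 0
POSTS   = { 1 => { title: "A", author_id: 1 }, 2 => { title: "B", author_id: 2 } }
AUTHORS = { 1 => "Ada", 2 => "Grace" }

def query(result)
  $query_count += 1
  result
end

# N+1: one query for the posts, then one more per post for its author.
posts = query(POSTS.values)
posts.each { |p| query(AUTHORS[p[:author_id]]) }
n_plus_one = $query_count   # 3 round trips for just 2 posts

# Greedy (eager) version: two queries total, no matter how many posts.
$query_count = 0
posts = query(POSTS.values)
authors = query(AUTHORS.slice(*posts.map { |p| p[:author_id] }))
eager = $query_count        # 2 round trips
```

With N records displayed, the first pattern costs N+1 round trips and the second stays at 2 — which is why the problem only becomes obvious once the table grows.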

If the app/web server seems to be the problem

  • Consider what app server you are using
  • Consider whether you are fully/over utilizing resources
    • Make sure that the resources (memory, cpu) are being used but not pinned to maximum.
    • You don’t want to hit your caps on resources, otherwise the app will start thrashing.
  • Consider adding more “web workers” or concurrent threads.
    • You have to be careful that your architecture allows that. Is your design ready to run concurrently?
    • Is there any way that one request can corrupt another one running at the same time?
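With a common Ruby app server such as Puma, the worker/thread knobs look like the config fragment below. The counts are illustrative placeholders, not recommendations — measure before copying numbers:

```ruby
# config/puma.rb — illustrative values only.
# `workers` forks separate processes; `threads` sets concurrency per worker.
workers Integer(ENV.fetch("WEB_CONCURRENCY", 2))
threads_count = Integer(ENV.fetch("RAILS_MAX_THREADS", 5))
threads threads_count, threads_count   # min, max threads per worker
preload_app! # load the app once, then fork, sharing memory copy-on-write
```

More threads per worker only helps if request handlers don’t share mutable state — which is exactly the “can one request corrupt another?” question above.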

If you still have scaling issues

  • Consider adding another discrete server
    • Once you see that one box no longer cuts it, add another server; you will need a load balancer for this.
    • Adding another server also increases your redundancy. Depending on how valuable the app is and how badly it needs to stay up, two app servers is a good idea, as well as two database servers: one primary and one follower on standby. AWS RDS makes that last part easy.
  • Consider caching services
    • Are there intermediate results that can be cached to avoid hitting the database at all?
  • Consider partitioning the databases
    • One replica for update transactions and multiple replicas for read transactions
    • Partitioning vertically, by moving certain columns of tables to a separate db server.
    • Partitioning horizontally (sharding) by moving certain sets of rows to separate db servers. (For example, profile records for half the users on one server and the other half on another server.)
  • Consider breaking into services
    • Are there separable major bits of functionality that you can carve off into fully independent services, with their own app and db servers?
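The sharding idea can be sketched in a few lines of Ruby (the shard names are hypothetical): route each user’s rows to a shard chosen by a stable hash of the user id, so the same user always lands on the same database server.

```ruby
require "zlib"

# Hypothetical shard names; in practice these would map to connection configs.
SHARDS = %w[db_shard_a db_shard_b].freeze

# Stable hash of the user id picks the shard, so a given user's
# profile records always live on the same database server.
def shard_for(user_id)
  SHARDS[Zlib.crc32(user_id.to_s) % SHARDS.size]
end
```

The hard part is not the routing function but everything it implies: cross-shard queries, rebalancing when you add a shard, and keeping any per-shard follower in sync.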

Look at next class