Finagle

Under The Hood

@vkostyukov

$ whoami

Finagle: Status

Finagle: @Twitter

The Big Picture

Your Server as a Function

trait Service[-Req, +Rep] extends (Req => Future[Rep])
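As a quick illustration (a minimal sketch, not from the deck; the names echo, server, and client are made up), a service is just a function from a request to a future of a response, and a client is the same abstraction viewed from the other side:

import com.twitter.finagle.{Http, Service}
import com.twitter.finagle.http.{Request, Response}
import com.twitter.util.Future

// A minimal service that echoes the request body back.
val echo: Service[Request, Response] = new Service[Request, Response] {
  def apply(req: Request): Future[Response] = {
    val rep = Response()
    rep.contentString = req.contentString
    Future.value(rep)
  }
}

// Servers consume services; clients produce them.
val server = Http.server.serve(":8080", echo)
val client: Service[Request, Response] = Http.client.newService("localhost:8080")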

Configuration

Configuring Servers and Clients

  • Configuration is code (not CLI flags, not config files)
  • Conventional API (builder) pattern:
    <Protocol>.client.with* and <Protocol>.server.with*
  • Separate expert-level API: .configured, .transformed, .filtered
  • 100% Java-friendly
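For example (a sketch; assumes the Label param from com.twitter.finagle.param), the same setting can be expressed through either level of the API:

import com.twitter.finagle.Http
import com.twitter.finagle.param.Label

// Conventional builder-style API.
val a = Http.client.withLabel("example-client")

// Equivalent expert-level API: pass the Stack.Param directly.
val b = Http.client.configured(Label("example-client"))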

Servers

What does the server do?

  • Concurrency Limit
  • Request Timeout
  • Metrics and Tracing

Server: Concurrency Limit

import com.twitter.finagle.Http

val server = Http.server
  .withAdmissionControl.concurrencyLimit(
    maxConcurrentRequests = 10,
    maxWaiters = 0
  )

Caps the number of requests this server handles concurrently (default: disabled).

Server and Client: Request Timeout

import com.twitter.conversions.time._
import com.twitter.finagle.Http

val server = Http.server
  .withRequestTimeout(42.seconds)

Time out requests that aren't handled within the given time
(default: disabled).

Clients

What does the client do?

  • Retries
  • Naming / Service Discovery
  • Timeouts and Expirations
  • Load Balancing
  • Rate Limiting
  • Connection Pooling
  • Circuit Breaking
  • Failure Detection
  • Metrics and Tracing
  • Interrupts
  • Context Propagation
  • ...

Client: Response Classification

  • finagle-core knows nothing about the protocol/application it's used by
  • We need a way to teach circuit breakers and load balancers how to identify failure responses
  • Response classifiers are a protocol-agnostic tool that solves this

Client: Response Classification

import com.twitter.finagle.{Http, http}
import com.twitter.finagle.service._
import com.twitter.util._

val classifier: ResponseClassifier = {
  case ReqRep(_, Return(r: http.Response)) if r.statusCode == 503 =>
    ResponseClass.NonRetryableFailure
}

val client = Http.client
  .withResponseClassifier(classifier)

Treat 503 (Service Unavailable) as a non-retryable failure.

Client: Retries

  • Lives at the top of the stack and retries failures from the underlying modules (timeouts, failure detectors, etc.)
  • Only retries when it's safe to do so (e.g., exceptions that occurred before the bytes were written to the wire, and protocol-level NACKs)
  • and only when the RetryBudget (a leaky token bucket) allows it
  • Backs off between retries for a given amount of time

Client: Retry Budget

import com.twitter.finagle.Http
import com.twitter.conversions.time._
import com.twitter.finagle.service.RetryBudget

val budget = RetryBudget(
  ttl = 10.seconds,
  minRetriesPerSec = 5,
  percentCanRetry = 0.1
)

val client = Http.client.withRetryBudget(budget)

Allow retries for 10% of total requests, on top of a minimum of 5 retries per second.

Client: Retry Backoff

  • Built on top of Stream[Duration]
  • A variety of predefined policies (even jittered):
    • Backoff.linear and Backoff.exponential
    • Backoff.exponentialJittered
    • Backoff.equalJittered
    • Backoff.decorrelatedJittered

Client: Retry Backoff

import com.twitter.finagle.Http
import com.twitter.conversions.time._
import com.twitter.finagle.service.Backoff

val client = Http.client
  .withRetryBackoff(Backoff.exponentialJittered(2.seconds, 32.seconds))

Back off for rnd(2s), rnd(4s), rnd(8s), ..., rnd(32s), ..., rnd(32s)

Client: Timeouts

import com.twitter.conversions.time._
import com.twitter.finagle.Http

val client = Http.client
  .withTransport.connectTimeout(1.second) // TCP connect
  .withSession.acquisitionTimeout(42.seconds)
  .withSession.maxLifeTime(20.seconds) // connection max life time
  .withSession.maxIdleTime(10.seconds) // connection max idle time

Client: Load Balancing

  • A client treats its server set as a replica set
  • Load balancing involves a load distributor and a load factor
  • Currently available
    • Load distributors: heap, p2c, aperture, round robin
    • Load factors: least loaded, peak EWMA
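For instance (a sketch; Balancers.p2c is the factory for the power-of-two-choices distributor), a distributor can be picked explicitly:

import com.twitter.finagle.Http
import com.twitter.finagle.loadbalancer.Balancers

// Use the P2C (power of two choices) distributor with the
// default least-loaded load factor.
val client = Http.client.withLoadBalancer(Balancers.p2c())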

EWMA (Exponentially Weighted Moving Average)

  • Keeps track of RTT latency, weighted by the load of each replica
  • Involves both RPC latency and queue depth
  • Sensitive to peaks
  • Protects against latency spikes (e.g., JVM full GC and warmup)
When the timeout is 1 second:
  • Round Robin SR (success rate) is 95%
  • Least Loaded SR is 99%
  • EWMA SR is 99.9%

Chart is from @stevej's (Buoyant, Inc) awesome blog post.
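To use the peak-EWMA load factor described above (a sketch; assumes Balancers.p2cPeakEwma and its decayTime parameter), combine it with the P2C distributor:

import com.twitter.conversions.time._
import com.twitter.finagle.Http
import com.twitter.finagle.loadbalancer.Balancers

// P2C distributor with the peak-EWMA load factor; decayTime
// controls how quickly the moving average forgets old latencies.
val client = Http.client
  .withLoadBalancer(Balancers.p2cPeakEwma(decayTime = 10.seconds))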

Client: Load Balancing via Aperture

  • Employs a simple feedback controller based on the client’s load
  • Fewer connections from clients
  • Fewer connections to servers
  • Balances over fewer, and thus warmer, services

Client: Load Balancing via Aperture

import com.twitter.conversions.time._
import com.twitter.finagle.Http
import com.twitter.finagle.loadbalancer.Balancers

val balancer = Balancers.aperture(
  lowLoad = 1.0, highLoad = 2.0, // the load band adjusting an aperture
  minAperture = 10 // min aperture size
)

val client = Http.client.withLoadBalancer(balancer)

Client: Circuit Breaking

  • Placed under the LB to exclude particular nodes from its replica set
  • Currently implemented (the first two can be disabled per client, as sketched below):
    • FailFast - a session-driven circuit breaker
    • FailureAccrual - a request-driven circuit breaker
    • ThresholdFailureDetector - a ping-based failure detector (Mux only)
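Both circuit breakers can be switched off when they get in the way (a sketch; assumes the SessionQualifier helpers):

import com.twitter.finagle.Http

// Disable FailFast and FailureAccrual for this client, e.g. when
// talking to a single host where ejecting it would leave no replicas.
val client = Http.client
  .withSessionQualifier.noFailFast
  .withSessionQualifier.noFailureAccrual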

Client: Failure Accrual

  • Our most advanced circuit breaker
  • Pluggable policies based on
    • Required success rate
    • Number of consecutive failures
  • By default
    • Marks a replica dead after 5 consecutive failures
    • Backs off and retries using equal jitter

Client: Circuit Breaking

import com.twitter.conversions.time._
import com.twitter.finagle.Http
import com.twitter.finagle.service.Backoff
import com.twitter.finagle.service.FailureAccrualFactory.Param
import com.twitter.finagle.service.exp.FailureAccrualPolicy

val twitter = Http.client
  .configured(Param(() => FailureAccrualPolicy.successRate(
    requiredSuccessRate = 0.95,
    window = 100,
    markDeadFor = Backoff.const(10.seconds)
  )))

Mark a replica dead if its success rate drops below 95% over the most
recent 100 requests, and then back off for 10 seconds.

What else?

https://github.com/twitter/finagle