A minimal caching proxy to GitHub's REST & GraphQL APIs

Overview

github-proxy

CircleCI Maintainability Test Coverage

A caching forward proxy to GitHub's REST and GraphQL APIs. GitHub-Proxy is a thin, highly extensible, highly configurable python framework based on Werkzeug. It comes with out-of-the-box support for Flask and Redis, but can be extended to integrate with other application frameworks, databases, and monitoring tools.

Features:

  • Caching of GitHub responses based on conditional requests.
  • Improved and granular monitoring of client usage and rate-limit consumption.
  • Provides a central and extensible pool of GitHub credentials (either GitHub Apps or user PATs) enabling rotation of rate-limited tokens. Negates the need of managing a GitHub App or bot user account per client.
  • Coarse-grained and highly configurable authorization of clients based on API resource scopes.
  • 100% compatible with GitHub's REST and GraphQL interfaces as well as the Enterprise API.

Install

Core framework:

$ pip install github-proxy

With support for Redis as a cache backend:

$ pip install github-proxy[redis]

With Flask support:

$ pip install github-proxy[flask]

Docker image: (Coming Soon)

$ docker pull babylonhealth/github-proxy

Usage

Flask integration (see example):

from github_proxy import blueprint

app = Flask(__name__)

app.register_blueprint(blueprint)
app.register_blueprint(
    blueprint, name="github_enterprise_proxy", url_prefix="/api/v3"
)  # enterprise server

if __name__ == "__main__":
    app.run()

Core framework (only needed for non-Flask applications):

# Explicitly construct a proxy instance
from github_proxy import Proxy, Config, CacheBackend, TelemetryCollector

config = Config()
proxy = Proxy(
    github_api_url=config.github_api_url,
    github_token_config=config,
    cache=CacheBackend.factory(config),
    rate_limited={},
    clients=config.clients,
    tel_collector=TelemetryCollector.from_type(config.tel_collector_type),
)

# Or inject an instance loaded from the environment
from github_proxy import inject_proxy
from werkzeug import Request, Response

@inject_proxy
def request_handler(proxy: Proxy, request: Request) -> Response:
    return proxy.request(request.path, request, "foo-client")

Registered clients (see client registry file) can then integrate with the proxy using their proxy client token. A proxy client token may be used as a regular GitHub PAT in SAML SSO Authentication:

$ curl -H "Authorization: token ${CLIENT_TOKEN}" http://localhost:5000/zen
Keep it logically awesome.

Architecture

The need for such a solution stemmed from Babylon's reliance on GitOps as an operational and change release framework. This led to a high (and at times abusive) usage of the GitHub API through a limited number of GitHub bot users. Frequent rate-limiting and lack of observability in terms of which client/workflow/team is abusing the API resulted in suboptimal developer experience.

High usage clients of the GitHub API are usually CI/CD pipelines and automated tests. These workflows are traditionally implemented as a collection of job processes executing independently to each other. This setup does not allow hot resources (and their Etags) to be shared across different workflows, or even jobs of the same workflow.

The GitHub-Proxy provides a centralised store of Etags that can be shared and re-used amongst its client base, letting workflows take full advantage of conditional requests which do not count against the rate limit.

graph LR
  subgraph clients
  A[CI job foo]
  C[CD job bar]
  D[Synthetic test baz]
  end
  A -- Auth via `Authorization: token REDACTED` HTTP header --> B[GitHub Proxy]
  C --> B
  D --> B
  B -- Auth via GitHub App or user PAT --> E[GitHub]
  B -- Read/Write responses of GitHub --> F[(Cache)]
  B -- Reporting of usage --> G[(Telemetry Collector)]

Sequence diagrams:

Configuring the proxy

By default, the proxy loads its configuration using the github_proxy.Config class from the following 2 sources:

  1. Environment variables
  2. Client registry file

Environment variables

Variable Description Default
GITHUB_API_URL Base url of the GitHub API server. https://api.github.com
CACHE_TTL The TTL (in seconds) of the cache that stores GitHub responses. 3600
CACHE_BACKEND_URL URI of the cache backend that stores GitHub responses. The scheme of the URI infers the cache backend type. inmemory://
GITHUB_CREDS_CACHE_MAXSIZE The max size of the inmemory cache used for storing rate limited GitHub credentials. 256
GITHUB_CREDS_CACHE_TTL_PADDING The TTL padding (in minutes) of the inmemory cache used for storing rate limited GitHub credentials. This padding accounts for potential clock drift between the proxy and the GitHub servers. 10
TELEMETRY_COLLECTOR_TYPE The type of telemetry collector to be used. noop
CLIENT_REGISTRY_FILE_PATH (Required) Path to the client registry file. See here for more. n/a
GITHUB_PAT_* Variable pattern to specify GitHub user PATs that the proxy can use when integrating with the GitHub API. Example variable name: GITHUB_PAT_FOO. n/a
GITHUB_APP_*_ID Variable pattern to specify GitHub App IDs that the proxy can use when integrating with the GitHub API. Example variable name: GITHUB_APP_BAR_ID. n/a
GITHUB_APP_*_INSTALLATION_ID Variable pattern to specify the GitHub App installation IDs that correspond to each of the GitHub App IDs. Example variable name: GITHUB_APP_BAR_INSTALLATION_ID. n/a
GITHUB_APP_*_PEM Variable pattern to specify the GitHub App private keys that correspond to each of the GitHub App IDs. Example variable name: GITHUB_APP_BAR_PEM. n/a

Client registry file

This file specifies the set of clients that are authorized to integrate with the proxy:

---
version: 1
clients:
  - name: test
    token: H+hYxlecgRq7yfmhq2COlJk7tpSwDmdsp8thdPsnbnQ=
  - name: read_only
    token: oed4+Uo4s4mgwstjSAY/N+HSOsGwfbX91QxqSOjsVlU=
    scopes:
    - method: GET
      path: .*
...

The tokens included in this file are the authorization tokens that clients need to pass to the proxy (instead of GitHub tokens). The name of each client should be unique and is to be used for telemetry purposes. The scopes define the resources and methods of the REST API that each of the clients is authorized to access (default to full access).

Tokens within this file must be treated as secrets. Since secrets cannot be commited to VCS, the registry file can also be provided as a Jinja2 template, enabling the injection of secrets at runtime through env variables:

version: 1
clients:
  - name: test
    token: {{ env.TOKEN_TEST }}

Extending the proxy

Adding a new type of cache backend:

from github_proxy import CacheBackend, CacheBackendConfig

class PostgresCacheBackend(CacheBackend, scheme="postgres"):
    # Implement the __init__, _get, _set, and _make_key methods
    # of the CacheBackend interface
    pass

Adding a new type of telemetry collector:

from github_proxy import TelemetryCollector

class JaegerTelemetryCollector(TelemetryCollector, type_="jaeger"):
    # Implement the collect_gh_response_metrics and collect_proxy_request_metrics 
    # methods of the TelemetryCollector interface
    pass

Once imported, the above extensions can be selected using the respective CACHE_BACKEND_URL and TELEMETRY_COLLECTOR_TYPE env variables.

Relevant references

  1. Google's magic GitHub proxy: Proxy that enables IAM for GitHub API tokens.
  2. Sourcegraph's GitHub proxy: Provides enhanced observability.
Issues
  • [CLOUD-5072] Porting the codebase to a new repo

    [CLOUD-5072] Porting the codebase to a new repo

    • Spike application
    • Spike
    • Add some todos
    • Remove injectors
    • Deployable service
    • Last modified headers (#3)
    • Introducing HTTP token auth (#4)
    • Enterprise server support (#5)
    • Log incoming cache headers (#6)
    • Using a lazy pool of GitHub credentials (#7)
    • Proxy should not fail if cache is down (#8)
    • Support TLS redis (#10)
    • Telemetry (#13)
    • Change custom prom metric names (#14)
    • Change metric labels (#15)
    • Integration tests + refactoring (#17)
    • Media type of a resource should be part of the cache key (#18)
    • Add healthcheck (#19)
    • Better separation between cache misses and uncacheable (#20)
    • Noop renaming of credentials to tokens (#21)
    • Query params to be used when indexing cached responses (#23)
    • Re-using TCP connections (#26)
    • Filter the whole set of hop-by-hop headers (#28)
    • Filter host and content headers (#29)
    • Authorize clients based on scopes (#30)
    • Improve docs (#31)
    • Remove promnight dependency from the github_proxy package
    • abstractmethods
    • Remove hard dependencies on redis and flask
    • Using a lazy pool of GitHub credentials (#7)
    • Telemetry (#13)
    • Integration tests + refactoring (#17)
    • Media type of a resource should be part of the cache key (#18)
    • Add healthcheck (#19)
    • Better separation between cache misses and uncacheable (#20)
    • Noop renaming of credentials to tokens (#21)
    • Query params to be used when indexing cached responses (#23)
    • Re-using TCP connections (#26)
    • Authorize clients based on scopes (#30)
    • Porting the codebase over to a new repo
    opened by dedoussis 0
Releases(v0.4.5)
  • v0.4.5(May 9, 2022)

    What's Changed

    • [CLOUD-5072] Add cryptography dependency by @dedoussis in https://github.com/babylonhealth/github-proxy/pull/7

    Full Changelog: https://github.com/babylonhealth/github-proxy/compare/v0.4.4...v0.4.5

    Source code(tar.gz)
    Source code(zip)
  • v0.4.4(May 9, 2022)

    What's Changed

    • [CLOUD-5072] Retry poetry by @dedoussis in https://github.com/babylonhealth/github-proxy/pull/6

    Full Changelog: https://github.com/babylonhealth/github-proxy/compare/v0.4.3...v0.4.4

    Source code(tar.gz)
    Source code(zip)
  • v0.4.3(May 9, 2022)

    What's Changed

    • [CLOUD-5072] Retry poetry by @dedoussis in https://github.com/babylonhealth/github-proxy/pull/5

    Full Changelog: https://github.com/babylonhealth/github-proxy/compare/v0.4.2...v0.4.3

    Source code(tar.gz)
    Source code(zip)
  • v0.4.2(May 9, 2022)

    What's Changed

    • [CLOUD-5072] Retry poetry by @dedoussis in https://github.com/babylonhealth/github-proxy/pull/4

    Full Changelog: https://github.com/babylonhealth/github-proxy/compare/v0.4.0...v0.4.2

    Source code(tar.gz)
    Source code(zip)
  • v0.4.1(May 9, 2022)

    What's Changed

    • [CLOUD-5072] Fix poetry dynamic versioning by @dedoussis in https://github.com/babylonhealth/github-proxy/pull/3

    Full Changelog: https://github.com/babylonhealth/github-proxy/compare/v0.3.0...v0.4.1

    Source code(tar.gz)
    Source code(zip)
  • v0.4.0(May 9, 2022)

    What's Changed

    • [CLOUD-5072] Fix poetry dynamic versioning by @dedoussis in https://github.com/babylonhealth/github-proxy/pull/3

    Full Changelog: https://github.com/babylonhealth/github-proxy/compare/v0.3.0...v0.4.0

    Source code(tar.gz)
    Source code(zip)
  • v0.3.0(May 9, 2022)

    What's Changed

    • [CLOUD-5072] Fix circleci tag filtering by @dedoussis in https://github.com/babylonhealth/github-proxy/pull/2

    Full Changelog: https://github.com/babylonhealth/github-proxy/compare/v0.2.0...v0.3.0

    Source code(tar.gz)
    Source code(zip)
  • v0.2.0(May 9, 2022)

    What's Changed

    • [CLOUD-5072] Porting the codebase to a new repo by @dedoussis in https://github.com/babylonhealth/github-proxy/pull/1

    New Contributors

    • @dedoussis made their first contribution in https://github.com/babylonhealth/github-proxy/pull/1

    Full Changelog: https://github.com/babylonhealth/github-proxy/commits/v0.2.0

    Source code(tar.gz)
    Source code(zip)
Owner
Babylon Health
Putting an accessible and affordable health service in the hands of every person on earth.
Babylon Health
Proxy-Bot - Python proxy bot for telegram

Proxy-Bot ?? Proxy bot between the main chat and a newcomer, allows all particip

Anton Shumakov 3 Apr 1, 2022
Just a python library to make reddit post caching easier

Reddist Just a python library to make reddit post caching easier. Caching Options In Memory Caching Redis Caching Pickle Caching Usage Installation: D

Samrid Pandit 3 Jan 16, 2022
Proxy server that records responses for UI testing (and other things)

Welcome to playback-proxy ?? A proxy tool that records communication (requests, websockets) between client and server. This recording can later be use

Yurii 41 Apr 1, 2022
The fastest nuker on discord, Proxy support and more

About okuru nuker is a nuker for discord written in python, It uses methods such as threading and requests to ban faster and perform at higher speeds.

null 60 Jun 10, 2022
A little proxy tool based on Tencent Cloud Function Service.

SCFProxy 一个基于腾讯云函数服务的免费代理池。 安装 python3 -m venv .venv source .venv/bin/activate pip3 install -r requirements.txt 项目配置 函数配置 开通腾讯云函数服务 在 函数服务 > 新建 中使用自定义

Mio 600 Jun 29, 2022
a script to bulk check usernames on multiple site. includes proxy & threading support.

linked-bulk-checker bulk checks username availability on multiple sites info people have been selling these so i just made one to release dm my discor

krul 9 Sep 20, 2021
[Multithreading] [Proxy - auto & infile]

Discord-Token-Generator-AutoCheck [Multithreading] [Proxy - auto & infile] How to install? pip install -r requirements.txt run generator.py with pytho

Chakeaw__ 3 Oct 17, 2021
Discord Webhook Proxy for Roblox payloads.

RoProxy A Discord webhook proxy passthrough for roblox. Setup Your port and endpoint are in the config.json, make sure both app.py and config.json are

PythonSerious 2 Nov 5, 2021
The EscapePod Python SDK for Cyb3rVector's EscapePod Extension Proxy

EscapePod Extension SDK for Python by cyb3rdog This is the EscapePod Python SDK for Cyb3rVector's EscapePod Extension Proxy. With this SDK, you can: m

cyb3rdog 3 Mar 7, 2022
Automate coin farming for dankmemer. Unlimited accounts at once. Uses a proxy

dankmemer-farm Simple script to farm Dankmemer coins with multiple accounts at once. Requires: Proxies, Discord Tokens Disclaimer I don't take respons

Scobra 12 Feb 28, 2022
Pancakeswap Sniper BOT - TORNADO CASH Proxy (MAC WINDOWS ANDROID LINUX) A fully decentralized protocol for private transactions

TORNADO CASH Proxy Pancakeswap Sniper BOT 2022-V1 (MAC WINDOWS ANDROID LINUX) ⭐️ A fully decentralized protocol for private transactions ⭐️ AUTO DOWNL

Crypto Trader 1 Jan 5, 2022
TORNADO CASH Proxy Pancakeswap Sniper BOT 2022-V1 (MAC WINDOWS ANDROID LINUX)

TORNADO CASH Pancakeswap Sniper BOT 2022-V1 (MAC WINDOWS ANDROID LINUX) ⭐️ A ful

Crypto Trader 1 Jan 6, 2022
Simple Webhook Spammer with Optional Proxy Support

?? �Simple Webhook Spammer with Optional Proxy Support:- [+] git clone https://g

Terminal1337 11 May 20, 2022
a public repository helping ML/DL engineers and DS to beautify the notebook with minimal coding.

ml-helper-functions a public repository helping ML/DL engineers and DS to beautify the notebook with minimal coding.

Jesal Patel 4 Jun 24, 2021
Minimal telegram voice chat music bot, in pyrogram.

VCBOT Fully working VC (user)Bot, based on py-tgcalls and py-tgcalls-wrapper with minimal features. Deploying To heroku: Local machine/VPS: git clone

Aditya 35 Jun 11, 2022
Send Informative, Concise Slack Notifications With Minimal Effort

slack-templates Send Informative, Concise Slack Notifications With Minimal Effort slack-templates Slack Integration Available Templates Usage Report t

null 8 Jun 19, 2022
Minimal API for the COVID Booking System of the Offices at the UniPD Math Dep

Simple and easy to use python BOT for the COVID registration booking system of the math department @ unipd (torre archimede). This API creates an interface with the official website, with more useful functionalities.

Guglielmo Camporese 4 Dec 24, 2021
A free, minimal, lightweight, cross-platform, easily expandable Twitch IRC/API bot.

parky's twitch bot A free, minimal, lightweight, cross-platform, easily expandable Twitch IRC/API bot. Features ?? Connect to Twitch IRC chat! ?? Conn

Andreas Schneider 9 Apr 11, 2022
Lamblayer: a minimal deployment tool for AWS Lambda layers

lamblayer lamblayer is a minimal deployment tool for AWS Lambda layers. lamblayer does, Create a Layers of built pip-installable python packages. Crea

Yusuke Takahashi 2 Jan 31, 2022