codon

A high-performance, zero-overhead, extensible Python compiler using LLVM

compiler gpu-programming high-performance llvm parallel-programming python


Ibrahim Numanagić bef36e016a Fix call tests [wip]		2023-12-09 18:13:07 -08:00
.github	v0.16 (#335 )	2023-04-12 18:13:54 -04:00
bench	Spelling (#276 )	2023-03-20 19:13:39 -04:00
cmake	v0.16 (#335 )	2023-04-12 18:13:54 -04:00
codon	Fix call tests [wip]	2023-12-09 18:13:07 -08:00
docs	Merge simplify & typecheck [wip]	2023-05-28 19:08:32 -07:00
extra/python	v0.16 (#335 )	2023-04-12 18:13:54 -04:00
jupyter	Better Jupyter support & Polymorphism improvements (#363 )	2023-05-10 09:28:25 -04:00
scripts	Better Jupyter support & Polymorphism improvements (#363 )	2023-05-10 09:28:25 -04:00
stdlib	Fix call tests [wip]	2023-12-09 18:13:07 -08:00
test	Fix call tests [wip]	2023-12-09 18:13:07 -08:00
.clang-format	Initial commit	2021-09-27 14:02:44 -04:00
.clang-tidy	Merge simplify & typecheck [wip]	2023-06-25 00:17:52 +02:00
.gitattributes	Update .gitattributes	2021-10-03 11:18:57 -04:00
.gitignore	Fix GCMapAllocator not being used (#369 )	2023-05-05 08:36:43 -04:00
CMakeLists.txt	Merge simplify & typecheck: new name parser [wip]	2023-07-15 23:55:07 +02:00
CODEOWNERS	Dynamic Polymorphism (#58 )	2022-12-04 19:45:21 -05:00
CONTRIBUTING.md	Dynamic Polymorphism (#58 )	2022-12-04 19:45:21 -05:00
LICENSE	Dynamic Polymorphism (#58 )	2022-12-04 19:45:21 -05:00
README.md	Merge simplify & typecheck [wip]	2023-05-28 19:08:32 -07:00
book.json	Update docs (#28 )	2022-07-26 16:08:42 -04:00

README.md

Codon

Docs ｜ FAQ ｜ Blog ｜ Forum ｜ Chat ｜ Benchmarks

What is Codon?

Codon is a high-performance compiler that compiles Python and Python-like code to native machine code with minimal overhead.

Typical speedups over Python are on the order of 10-100x or more, on a single thread. Codon's performance is typically on par with (and sometimes better than) that of C/C++. Unlike Python, Codon supports native multithreading, which can lead to speedups many times higher still.

Goals

Think of Codon as Python reimagined for static compilation, built completely from scratch. The goals of Codon are:

- Complete support of Python's syntax (extensions are allowed).
- Semantics as close to Python's as possible (not 100% identical, but close enough for most common use cases).
- Top-notch performance and easy implementation of compile-time optimizations for new domains (something that libraries in any mainstream language cannot do).
- Seamless interoperability with CPython, C/C++ and/or other languages.

The performance is achieved through the following design choices:

- Static ahead-of-time type-checking with minimal reliance on type annotations.
- Static instantiation of types and functions.
- Compile-time expressions, statements and branches.
- Lightweight object representation (as close to C as possible).
- Compile-time elision of any metadata that will not be needed.
- Aggressive compile-time optimizations whenever possible.
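
As a rough illustration of the first two design choices, here is a small annotation-free function called with three different element types. The name `fold_add` is made up for this sketch. Our reading of "static instantiation" is that a compiler like Codon would infer the argument type at each call site and generate a separate specialized version of the function for it; treat that as an assumption about the general technique, not a description of Codon's internals. The snippet itself is plain Python and runs unchanged under CPython.

```python
# Hypothetical example: one generic, annotation-free function.
def fold_add(xs):
    # Start from the first element so the accumulator has the element type.
    acc = xs[0]
    for x in xs[1:]:
        acc = acc + x
    return acc

# Three call sites with three different (but internally uniform) element types.
# A statically instantiating compiler would be expected to produce one
# specialization per element type: int, float and str.
ints = fold_add([1, 2, 3])
floats = fold_add([1.5, 2.5])
strs = fold_add(["co", "don"])

print(ints, floats, strs)
```

Note that each list holds a single element type; this keeps every call resolvable to one concrete instantiation.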

Codon stems from the scientific computing environment: its precursor was the Seq project, a DSL for bioinformatics where every saved CPU cycle counts.

For more information, please consult Differences with Python.

Where are you at right now?

Long answer: please see the Roadmap for details.

Why?

Python is arguably the world's programming language: it is widely taught and used, and is popular among non-CS communities. It provides clean (and analyzable) syntax, simple semantics, and unmatched library coverage. However, its Achilles' heel was (and still is) performance: typical pure Python code is often orders of magnitude slower than its C/C++/Rust counterpart.

Most of the performance hit comes from the extreme flexibility of its semantics, as well as from legacy considerations. However, this flexibility is often not needed and is typically not used (or even known) in many contexts. Thus, we can often get rid of it to achieve large performance gains. When and how? That's where Codon kicks in! We aim to combine modern compiler techniques with Python syntax and semantics to get the best of both worlds whenever possible.

So, TL;DR:

- We want Python's syntax, semantics and ease of use (we don't want you to learn yet another language)
- We want to be as close to bare metal as possible (speeeeed!)
- We want the compiler to help us optimize and detect as many bugs as possible ahead of time

While there are many amazing attempts to improve Python's performance (e.g., PyPy, newer CPython releases, Numba, Mojo, to name a few), nearly all of them are limited by legacy constraints, limited scope, or a commitment to absolute 100% semantic compatibility with (C)Python. Codon took a different approach: we started with a small compiler that targeted a limited subset of Python, and will keep expanding it until the gap is small enough not to matter.

Install

Pre-built binaries for Linux (x86_64) and macOS (x86_64 and arm64) are available alongside each release. Download and install with:

/bin/bash -c "$(curl -fsSL https://exaloop.io/install.sh)"

Or you can build from source.

Examples

Codon is a Python-compatible language, and many Python programs will work with few if any modifications:

def fib(n):
    a, b = 0, 1
    while a < n:
        print(a, end=' ')
        a, b = b, a+b
    print()
fib(1000)

The codon compiler has a number of options and modes:

# compile and run the program
codon run fib.py
# 0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987

# compile and run the program with optimizations enabled
codon run -release fib.py
# 0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987

# compile to executable with optimizations enabled
codon build -release -exe fib.py
./fib
# 0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987

# compile to LLVM IR file with optimizations enabled
codon build -release -llvm fib.py
# outputs file fib.ll

See the docs for more options and examples.
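
To compare the modes above on your own machine, one option is a script that times its own workload and runs unchanged under both `python` and `codon run`. The sketch below is only a starting point: `fib_count` and `reps` are names invented here, the workload is deliberately tiny, and we assume `time.time()` behaves the same way in Codon's standard library as in CPython. An optimizing compiler may also simplify a loop this regular, so the numbers should be read with care.

```python
import time

# Hypothetical helper: count the Fibonacci numbers below n
# (the same loop as fib above, without the printing).
def fib_count(n):
    a, b = 0, 1
    count = 0
    while a < n:
        count += 1
        a, b = b, a + b
    return count

reps = 10000  # arbitrary repetition count, chosen only to get a measurable time

start = time.time()
total = 0
for _ in range(reps):
    total += fib_count(1000)
elapsed = time.time() - start

# total is printed so the loop result is actually used
print(total, elapsed)
```

Saved as, say, `bench_fib.py` (a placeholder filename), the same file can be passed to `python`, `codon run` and `codon run -release` to compare the reported times.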

This prime counting example showcases Codon's OpenMP support, enabled with the addition of one line. The @par annotation tells the compiler to parallelize the following for-loop, in this case using a dynamic schedule, chunk size of 100, and 16 threads.

from sys import argv

def is_prime(n):
    factors = 0
    for i in range(2, n):
        if n % i == 0:
            factors += 1
    return factors == 0

limit = int(argv[1])
total = 0

@par(schedule='dynamic', chunk_size=100, num_threads=16)
for i in range(2, limit):
    if is_prime(i):
        total += 1

print(total)
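
Since `total` is updated from inside a parallel loop, it can be useful to check the result against a serial run. The following sketch is the same computation with the `@par` line dropped and wrapped in a function (`count_primes_serial` is a name introduced here, not part of Codon). It is plain Python, so it can serve as a reference under either CPython or Codon.

```python
def is_prime(n):
    # Same trial division as in the parallel example: count divisors in [2, n).
    factors = 0
    for i in range(2, n):
        if n % i == 0:
            factors += 1
    return factors == 0

def count_primes_serial(limit):
    # Serial reference: identical loop body, no @par annotation.
    total = 0
    for i in range(2, limit):
        if is_prime(i):
            total += 1
    return total

print(count_primes_serial(100))  # 25 (there are 25 primes below 100)
```

For the same `limit`, the parallel program should print the same count as this reference.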

Codon supports writing and executing GPU kernels. Here's an example that computes the Mandelbrot set:

import gpu

MAX    = 1000  # maximum Mandelbrot iterations
N      = 4096  # width and height of image
pixels = [0 for _ in range(N * N)]

def scale(x, a, b):
    return a + (x/N)*(b - a)

@gpu.kernel
def mandelbrot(pixels):
    idx = (gpu.block.x * gpu.block.dim.x) + gpu.thread.x
    i, j = divmod(idx, N)
    c = complex(scale(j, -2.00, 0.47), scale(i, -1.12, 1.12))
    z = 0j
    iteration = 0

    while abs(z) <= 2 and iteration < MAX:
        z = z**2 + c
        iteration += 1

    pixels[idx] = int(255 * iteration/MAX)

mandelbrot(pixels, grid=(N*N)//1024, block=1024)

GPU programming can also be done using the @par syntax with @par(gpu=True).
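
In the kernel above, each GPU thread handles one pixel: with N = 4096 there are N*N = 16,777,216 pixels, launched as (N*N)//1024 = 16,384 blocks of 1,024 threads, and `idx` is the flat pixel index. A CPU reference makes it easier to validate kernel output on a small image. The sketch below reuses the per-pixel logic in plain Python; `mandelbrot_pixel` and `render` are names made up for this example, and `n` and `max_iter` are passed as parameters instead of the globals `N` and `MAX`.

```python
def scale(x, a, b, n):
    # Map pixel coordinate x in [0, n) to the interval [a, b).
    return a + (x / n) * (b - a)

def mandelbrot_pixel(idx, n, max_iter):
    # Same computation the kernel performs for one flat index.
    i, j = divmod(idx, n)
    c = complex(scale(j, -2.00, 0.47, n), scale(i, -1.12, 1.12, n))
    z = 0j
    iteration = 0
    while abs(z) <= 2 and iteration < max_iter:
        z = z**2 + c
        iteration += 1
    return int(255 * iteration / max_iter)

def render(n, max_iter):
    # Serial loop over every flat index; the GPU version runs these in parallel.
    return [mandelbrot_pixel(idx, n, max_iter) for idx in range(n * n)]

small = render(4, 50)  # tiny 4x4 image, 50 iterations, for quick checks
print(small)
```

On this 4x4 grid, index 0 maps to a point that escapes after one iteration, while index 9 lies on the real axis inside the set and reaches the iteration cap.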

What isn't Codon?

While Codon supports nearly all of Python's syntax, it is not a drop-in replacement, and large codebases might require modifications to be run through the Codon compiler. For example, some of Python's modules are not yet implemented within Codon, and a few of Python's dynamic features are disallowed. The Codon compiler produces detailed error messages to help identify and resolve any incompatibilities.
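
To make "dynamic features" concrete, here is a plain-Python sketch of the kind of code that relies on run-time flexibility. All of it is valid under CPython. Whether a particular Codon version rejects each line is an assumption on our part (based on the general constraints of static typing and fixed-width integers), so check the Differences with Python page before relying on it.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)

# Adding a field to an existing object at run time.
p.z = 3

# Monkey-patching: attaching a new method to a class after its definition.
Point.norm1 = lambda self: abs(self.x) + abs(self.y)

# A heterogeneous list: elements of unrelated types in one collection.
mixed = [1, "two", 3.0]

# Arbitrary-precision integers: this value does not fit in 64 bits.
big = 2 ** 100

print(p.z, p.norm1(), mixed, big)
```

Code written in this style is the most likely to need changes before it compiles; code that keeps types stable usually ports with little effort.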

Codon can be used within larger Python codebases via the @codon.jit decorator. Plain Python functions and libraries can also be called from within Codon via Python interoperability.
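
A minimal sketch of the decorator route, assuming Codon's Python module is installed and importable as `codon`. The fallback branch is our own addition so that the file still runs as ordinary Python when the module is missing; `count_primes` is a made-up example function, written as a single self-contained function so that it does not depend on any other Python-side helpers.

```python
try:
    # Assumption: the Codon JIT package is installed in this Python environment.
    from codon import jit
except ImportError:
    # Fallback (not part of Codon): leave the function as plain Python.
    def jit(fn):
        return fn

@jit
def count_primes(limit):
    # Trial-division prime count, kept in one function body.
    total = 0
    for n in range(2, limit):
        is_prime = True
        for d in range(2, n):
            if n % d == 0:
                is_prime = False
                break
        if is_prime:
            total += 1
    return total

print(count_primes(100))  # 25
```

The rest of the program stays regular Python; only the decorated function is handed to Codon.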

Documentation

Please see docs.exaloop.io for in-depth documentation.