/r/Compilers
This subreddit is all about the theory and development of compilers.
Do all variables just live on the stack, or do they also live in registers? Or do some live in registers while others live on the stack at the same time (not the same variables in both)? I don't really know how I would tackle this or what the usual thing to do is.
Hello everyone,
After I landed my first diff related to InstCombine in LLVM, I found that using the pattern matcher functions to detect patterns was interesting, and I thought maybe I could make this kind of matching work outside LLVM using SQL queries, so I built LLQL.
define i32 @function(i32 %a, i32 %b) {
%sub = sub i32 %a, %b
%mull = mul i32 %a, %b
%add = add i32 %sub, %mull
ret i32 %add
}
For example, in a function like this, suppose you want to search for an add instruction whose LHS is a sub instruction and whose RHS is a mul instruction. You can search using LLQL like this:
SELECT instruction FROM instructions WHERE m_inst(instruction, m_add(m_sub(), m_mul()))
Or, for example, you can query how many times this pattern occurs in each function:
SELECT function_name, count() FROM instructions WHERE m_inst(instruction, m_add(m_sub(), m_mul())) GROUP BY function_name
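For comparison, this is roughly what the same match looks like inside LLVM itself with PatternMatch.h (a sketch only; it assumes you already have an Instruction &I, e.g. in an InstCombine-style visitor):

// Sketch of the equivalent check inside LLVM, using llvm/IR/PatternMatch.h.
#include "llvm/IR/Instruction.h"
#include "llvm/IR/PatternMatch.h"

static bool isAddOfSubAndMul(llvm::Instruction &I) {
    using namespace llvm::PatternMatch;
    // add (sub %x, %y), (mul %z, %w) -- the operands themselves can be anything.
    return match(&I, m_Add(m_Sub(m_Value(), m_Value()),
                           m_Mul(m_Value(), m_Value())));
}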
Github: https://github.com/AmrDeveloper/LLQL
Currently LLQL supports Ret, Br, arithmetic, ICMP, and FCMP instructions, and matchers for types and nested types.
Looking forward to feedback.
Are there any resources for their exercise questions? I have searched but only found the ones previously done in my coursework.
Plus, if a symbol derives a null production, how is that handled on the stack?
Hello,
Aspiring compiler engineer here. I am nearing the end of my degree and will have the option to choose between two courses next semester (I can only choose one because they are offered at the same time). I have taken compilers courses and done research in codegen, but never taken a PL course. I only know the basics of things like EBNF and automata from the frontend part of compilers.
I have to decide between PL theory and distributed systems. Even though I would like to be a compiler engineer, I'd rather take distributed systems because it seems like a huge topic with huge importance in industry. I also have a lot of interest in systems and loved OS as a topic. PL theory also seems extremely theoretical, and I tend not to like overly theoretical topics. Nonetheless, if this knowledge is extremely important for industry compiler engineers, then I will end up taking PL theory. Here is the list of topics covered in the PL theory course:
• defining semantics of typed languages
• progress and preservation theorems
• functional languages with recursion over inductively defined data types
• formal proof systems on “pencil-and-paper” and in Coq
• the Curry-Howard correspondence
• concurrency and parallelism
• substructural types for permission and ownership
Please let me know which course you think would be a better fit, and any other opinions you have. Thanks!
Any course / tutorial? I'm new to compilers and I'd like to learn MLIR.
I looked on Coursera / Udemy and didn't find anything.
I'm looking for the big picture: to learn the architecture and how things work. The focus is not on development, at least not in the beginning.
From what I understand, modern compilers:
Target or generate code that addresses logical (architectural) registers, which are dynamically renamed or mapped to physical/hardware registers at runtime; and there are more physical registers than logical registers.
Spill registers to stack memory when the program requires more registers than are available; spilling is basically a write or store-to-memory instruction.
It seems to me that a compiler spilling logical registers solely based on the number of logical registers is very suboptimal -- unless the CPU can ignore spill instructions when a sufficient number of physical registers are available. Or do CPU-specific compilation flags such as gcc's `-march=native` help reduce spilling somehow? Or perhaps I don't understand enough about what physical registers are for.
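To make the second point concrete, here is the kind of situation I have in mind, as a small C++ sketch (assuming x86-64, which exposes 16 general-purpose architectural registers): more values are live at once than there are architectural registers, so the compiler has to keep some of them in stack slots.

#include <cstdint>

// With 20 accumulators live across the whole loop, an x86-64 compiler cannot
// keep them all in the 16 general-purpose architectural registers, so (unless
// it vectorizes the loop) some of them get spilled: stored to the stack and
// reloaded inside the loop.
int64_t pressure(const int64_t* a) {
    int64_t s0 = 0, s1 = 0, s2 = 0, s3 = 0, s4 = 0, s5 = 0, s6 = 0, s7 = 0,
            s8 = 0, s9 = 0, s10 = 0, s11 = 0, s12 = 0, s13 = 0, s14 = 0,
            s15 = 0, s16 = 0, s17 = 0, s18 = 0, s19 = 0;
    for (int i = 0; i < 1000; i += 20) {
        s0 += a[i];       s1 += a[i + 1];   s2 += a[i + 2];   s3 += a[i + 3];
        s4 += a[i + 4];   s5 += a[i + 5];   s6 += a[i + 6];   s7 += a[i + 7];
        s8 += a[i + 8];   s9 += a[i + 9];   s10 += a[i + 10]; s11 += a[i + 11];
        s12 += a[i + 12]; s13 += a[i + 13]; s14 += a[i + 14]; s15 += a[i + 15];
        s16 += a[i + 16]; s17 += a[i + 17]; s18 += a[i + 18]; s19 += a[i + 19];
    }
    return s0 + s1 + s2 + s3 + s4 + s5 + s6 + s7 + s8 + s9 + s10 + s11 +
           s12 + s13 + s14 + s15 + s16 + s17 + s18 + s19;
}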
Of course, people have reservations against using either. However, my question is only about which of the two is better.
Would the answer be different if,
Hello! What are some techniques used to eliminate/remove recursion (i.e., turn a recursive definition into a loop)?
I know of tail recursion optimization and tail recursion modulo cons. However, is that really all? E.g. I can manually convert a recursive function into a tail-recursive one with an accumulator, but are there really no techniques that automate it at least to some extent?
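For concreteness, this is the kind of manual accumulator rewrite I mean (a small C++ sketch):

#include <cstdint>

// Plainly recursive: the multiplication happens after the recursive call
// returns, so this is not a tail call and each call needs its own stack frame.
uint64_t fact_rec(uint64_t n) {
    return n <= 1 ? 1 : n * fact_rec(n - 1);
}

// Same function with an accumulator: the recursive call is now in tail
// position, so tail-call elimination can turn it into a loop.
uint64_t fact_acc(uint64_t n, uint64_t acc = 1) {
    return n <= 1 ? acc : fact_acc(n - 1, acc * n);
}

// The loop the compiler would effectively produce from fact_acc.
uint64_t fact_loop(uint64_t n) {
    uint64_t acc = 1;
    while (n > 1) { acc *= n; --n; }
    return acc;
}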
If there are such methods, could you please point me to the relevant papers and compilers that actually implement them?
I am trying to get an exe by making an .obj file with NASM and then linking it with 'link' from Windows, but I am getting some linker errors in the process.
Before I go into debugging my assembly code and linker options, I just want to check: is this the correct way to do it?
I have written a noddy whole-program compiler for a minimal ML dialect of my own invention to Arm64 asm.
In my compiler a "function" is an expression tree in ANF but with if expressions (note: no ϕ nodes). Whenever one of these "functions" dirties a callee-saved register (including the link register), those dirtied registers are pushed onto the stack upon entry and popped off before a return or tail call at every return point.
I've noticed that this can be unnecessarily inefficient in some cases. For example, a single tail-recursive function that uses lots of local variables can dirty a callee-saved register, at which point it ends up pushing and popping on every tail call (≡ iteration) when that could have been done once "outside" the recursive function (≡ loop).
This begs the question: when must callee-saved registers be pushed and popped?
I think one answer is that functions that might be non-tail called must preserve callee-saved registers. Stepping back and thinking of the graph of entry and exit points in a program, one entry point that is non-tail called will push some registers so every exit point it might return from must pop those same registers. Does that sound right?
This post is about eliminating the optionalCallExpression node during TAC (three-address code) generation. My scheme mostly does the following:
Given:
a.b?.()
Output:
let fin$res = undefined
let [callee$res, CONTEXT] = ...callee // callee$res is ID or ID.X
let callee$cond1 = callee$res !== undefined
let callee$cond2 = callee$res !== null
let callee$condfin = callee$cond1 && callee$cond2
if (callee$condfin) {
fin$res = ID(...args) | ID.X(...args) | ID.call(CONTEXT,...args)
}
The scheme failed when the 'CONTEXT' couldn't be stored in a temporary. For instance:
...
class Foo extends Base {
method() {
super.method?.();
}
}
...
Here the context 'super' cannot be stored inside a temp, which caused the transformation to fail. After some experimentation I found out that I could just pass the 'this' reference directly, like this, and the test262 tests passed:
class Foo extends Base {
method() {
let OPTE_RESULT$3 = undefined;
let OPTCE_CALLEE$4 = super.method;
let OPTCE_C1$5 = OPTCE_CALLEE$4 !== undefined;
let OPTCE_NULL$6 = null;
let OPTCE_C2$7 = OPTCE_CALLEE$4 !== OPTCE_NULL$6;
let OPTE_CFIN$8 = OPTCE_C1$5 && OPTCE_C2$7;
if (OPTE_CFIN$8) {
OPTE_RESULT$3 = OPTCE_CALLEE$4.call(this); // <- 'this' here works fine!??
}
}
}
What context does 'super' provide to the call? Is my transformation semantics-preserving, or might something provided by 'super' be missing here?
In the following example, trying to access the field 'boo' fails, but it works for 'method'. Why does this happen?
class Base {
boo = 111
method() {
console.log(`Base method: ${this.boo}`)
}
}
class Foo extends Base {
boo = 100
method() {
console.log(`Foo method: ${this.boo}`)
}
callParent() {
console.log("Super.foo", super.foo) // This is undefined? why?
super.method() // This finds the binding and passes 'this'?
}
}
const foo = new Foo();
foo.callParent();
// OUTPUT:
// Super.boo undefined
// Base method: 100
Any insights would be greatly appreciated, thanks...
Hello,
Basically what the title says :) I'm mainly interested in the inference optimization side of ML models and how to approach this field from the ground up. A simple Google search yielded https://mlc.ai/, which seems like a good resource to begin with?
I'd appreciate any hints when it comes to lectures to watch, frameworks to look into, or smaller side projects to pick up in this area.
Thank you all already!
Edit: I'm also always a fan of getting my hands dirty on a real-world open-source project - usually a great way to learn for me :)
At the moment I am using C to implement the compiler for my language. But since I am still at the very beginning (lexer is almost finished…) and I don't really want to continue programming in C because it is as ergonomic as a brick, I am looking for sensible alternatives.
I chose C because I am aiming for C as the first target language and want to rewrite the compiler in my own language later, but also ultimately to deepen my C knowledge at the same time. However, I am already a bit tired of C and think I have dealt with this language enough to be able to convert code of my language into C.
I would also like to keep the option open of perhaps offering LLVM as another backend in the (very distant) future. In that case it would of course make sense to program the compiler in C++ from the start. And newer features like smart pointers are certainly very helpful, and more memory-safe and ergonomic than pure C. But then I read that the C++ API of LLVM is not particularly stable, and that the Rust compiler, for example, uses the C interface???
And since I've already mentioned Rust: Rust now also offers GCC as a backend, implemented using libgccjit, for which there is also an official crate/package that I could actually use myself? Rust would offer the advantage of much better tooling and more functional language features like ADTs and pattern matching, which is definitely a game changer for compiler implementation.
In addition, LLVM is a mammoth thing that is forked and patched by basically all the languages that use it for their backends: Rust, Swift, Zig, Odin... and that is an absolute no-go for me. I would prefer to have a pre-installed library with a stable API, and maybe libgccjit is a good choice? Since libgccjit's C API is first class, wouldn't it be the better choice for a self-hosting compiler anyway (especially since GCC supports more platforms than LLVM)?
So, what do you think? Should I use Rust, also with regard to libgccjit?
In my desperation I had already considered simply writing the compiler in Kotlin, or even using TypeScript (with Bun as a nice all-in-one tool), because I don't feel like tinkering with C/C++, (c)make, and other needed stuff like "bear", etc. ... you just have high expectations of a language's tooling these days...
First of all, sorry if this is not related to the sub. I really don't know where else to ask. My problem is about a custom programming language, and it has a compiler, so... I'll pretend it's related.
So here's my problem. If anyone who implemented completions with LSP can help I would be happy:
> Manually triggering suggestions with Ctrl+Space shows it, but only if the cursor isn't surrounded by the brackets. Strange behaviour
Speaking from a demand perspective, which skill set is typically needed by more companies? Of course, the two disciplines are relatively niche and most companies don't need either. Regardless, I am curious to know!
Has anyone ever made a compiler using LEX and YACC specification files for the C++ language? I really need guidance; I'm open to telling you about the scope if you're interested. The process also involves various other phases of compilation, but I'm stuck on lexical analysis and syntax analysis itself.
I work on compilers for the EVM, a stack-based ISA, meaning you can't just apply the classic register allocation algorithms but instead need to worry about how to efficiently order and schedule stack elements via DUPs, SWAPs & POPs.
I’ve found the literature to be very limited on this subject. There’s this paper “Treegraph-based Instruction Scheduling for Stack-based Virtual Machines” but it comes with a lot of limitations and is generally quite suboptimal, relying on a different cost model for its target ISA (TinyVM) as well as not using instructions like SWAPs at all.
Any clues would be appreciated!
I am currently working on a transpiler from a limited subset of Python to C. I have set myself the goal of not using any libraries, and everything has been going well so far. But when trying to implement the last feature in my parser, left-sided indexing (e.g. x[0] = 10), it occurred to me that to correctly update the type of the new element in the list, I need to evaluate the index it is going to be placed at. This is a big problem, because in some circumstances it would lead to me evaluating basically the whole program, which I do not want.
My question is: Is there a way around it?
Edit: I forgot to add that I want to transpile Python without type annotations.
Sorry for posting in succession.
A bit of background about me: I'm a C++ developer and graduate student who has worked on distributed systems. Broadly, I have worked on data-intensive and compute-intensive software and found the algorithmic challenges behind delivering data to be interesting.
To a similar end, I would love to work on an optimizing compiler. A problem here seems to be that to work on one for a specific domain (AI, graphics, ML, microarchitecture), I would need knowledge of said domain.
To give an analogy up my alley: I worked on a query optimizer. You wouldn't focus on optimizing queries with massive CTEs unless there was a proven need for them. If there was, then you would go about understanding the need - why is it needed? Is this the best option? How will this affect other queries we want to support? Etc. I imagine similar domain-specific demands exist for compilers.
To what extent should an interest in compiler optimization motivate study in a related field? Do you have any other advice, or specific projects for me to contribute to?
A bit of background about me: I'm a C++ developer who has worked on distributed systems. I worked on an in-memory database and found the optimization-related components of query execution, like the SQL-level optimizer pass and row-level kernel compilation, to be very interesting. I'm currently a grad student and I've taken many compiler-related courses, but I have shied away from actual meaningful work on a compiler and instead stuck to app-layer development.
Some of my reservations have been:
Compiler work seems to lack the "intuition" component of (insight + intuition) that writing greenfield applications seems to have. This makes sense - most of the low-hanging fruit will be taken - but doesn't this also imply that non-doctorates wouldn't have a ton of novel work?
On a similar note, microarchitecture-focused roles in general seem to place a very large emphasis on "just knowing how things are". This also makes sense - engineering is a practice of building on reasonable decisions - but working very strenuous systems-focused jobs in the past made me feel like I was there just to memorize things and not to contribute interesting ideas.
Compiler work seems to be limited to a couple of companies (Apple, Intel, AMD, NVIDIA, Modular), and hence salary progression and career growth seem more limited. I've grown out of this mindset, but I'm curious as to whether it's true.
Compiler theory is interesting for sure, but the end customer is just as important in my eyes. The nature of what the compiler is being used for (deadlines? scale? cross-functional work?) drives the nature of development. It seems to me that to work on GPU compilers, AI compilers, etc., you would need knowledge of the application domain to do well, and I have very little of that.
I'm considering pivoting away from my current career path and working on compilers. I think that I have a moderately strong foundation through grad-level compiler coursework and think I should pretty much just get to coding. I also would like to think that I would be reasonably successful at teaching myself compiler frameworks because I managed to teach myself advanced C++ (template metaprogramming, asyncs, etc) and microarchitecture by reading books.
However, I would love to hear your thoughts on the above reservations.
For example, take this C code:
if ((int32_arr[i] & 4) != 0)
...
else
...
If you know assembly, you could compile this by hand into something like this (assuming i is in rsi, and int32_arr is in rdi):
test dword ptr [rdi + rsi * 4], 4
jz .else
...
jmp .after
.else:
...
.after:
Indeed, Clang actually generates the line test byte ptr [rdi + 4*rsi], 4 in a function that does something like this, which isn't far off.
However, a naive code generator for a C compiler might generate something like this:
imul rsi, 4
add rdi, rsi
mov eax, [rdi]
and eax, 4
cmp eax, 0
je .else
...
jmp .after
.else:
...
.after:
To generate the first piece of assembly code, you have to make a special case for loading from arrays of primitive types to use an indexed addressing mode. You also have to make a special case so that comparison expressions like (_ & x) != 0 are compiled as test _, x, since you don't actually care about the result except to compare it against zero. Finally, you have to make a special case to merge the testing of the indexed load into one instruction, so it doesn't unnecessarily load the result of [rdi + rsi * 4] into a register before testing against it.
There are a lot of contrived examples where you have to make a special case for a particular kind of expression so that it can be fused into one instruction. For example, given this code as context:
struct x {
int64_t n;
int64_t a[50];
};
struct x *p = ...;
You might expect this:
p->a[i] += 5;
To produce this one line of x86-64 assembly:
add qword ptr [rdi + rsi * 8 + 8], 5
But again, a naive code generation algorithm probably wouldn't produce this.
What I'm asking is, how do big professional compilers like Clang and GCC manage to produce good assembly like what I wrote from arbitrarily complex expressions? It seems like you would need to manually identify these patterns in an AST and then output these special cases, only falling back to naive code generation when none are found. However, I also know that code is typically not generated straight from the C AST in Clang and GCC, but from internal representations after optimization passes and probably register allocation.
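To illustrate what I mean by identifying a pattern and emitting a special case, here is a toy C++ sketch of the tree-tiling idea (all node names are invented; this is not how Clang or GCC are actually structured): try the biggest instruction "tile" that covers the subtree first, and fall back to naive per-node emission when it doesn't match.

#include <cstdio>
#include <memory>
#include <string>

// Toy expression IR: just enough to represent ((load (base + (idx << 2))) & K) != 0.
enum class Op { Reg, Const, Shl, Add, Load, And, Ne };

struct Node {
    Op op;
    std::string reg;                    // used when op == Op::Reg
    long imm = 0;                       // used when op == Op::Const
    std::shared_ptr<Node> lhs, rhs;     // children (rhs unused for Load)
};

using P = std::shared_ptr<Node>;

static P reg(std::string r) { auto n = std::make_shared<Node>(); n->op = Op::Reg; n->reg = std::move(r); return n; }
static P cst(long v) { auto n = std::make_shared<Node>(); n->op = Op::Const; n->imm = v; return n; }
static P mk(Op op, P a, P b = nullptr) { auto n = std::make_shared<Node>(); n->op = op; n->lhs = std::move(a); n->rhs = std::move(b); return n; }

// The "big tile": ((load (reg + (reg << c))) & K) != 0  ==>  test dword ptr [base + index*(1<<c)], K
// Emits a single instruction and returns true if the whole subtree matches.
static bool tryTestMemImm(const P& n) {
    if (n->op != Op::Ne || n->rhs->op != Op::Const || n->rhs->imm != 0) return false;
    const P& band = n->lhs;
    if (band->op != Op::And || band->rhs->op != Op::Const) return false;
    const P& load = band->lhs;
    if (load->op != Op::Load) return false;
    const P& addr = load->lhs;
    if (addr->op != Op::Add || addr->lhs->op != Op::Reg) return false;
    const P& shl = addr->rhs;
    if (shl->op != Op::Shl || shl->lhs->op != Op::Reg || shl->rhs->op != Op::Const) return false;
    std::printf("test dword ptr [%s + %s * %ld], %ld\n",
                addr->lhs->reg.c_str(), shl->lhs->reg.c_str(),
                1L << shl->rhs->imm, band->rhs->imm);
    return true;
}

// Fallback: naive per-node emission (left as a stub here).
static void emitNaive(const P& n) { std::printf("; one instruction per node for op %d\n", static_cast<int>(n->op)); }

int main() {
    // (int32_arr[i] & 4) != 0, with int32_arr in rdi and i in rsi.
    P cond = mk(Op::Ne,
                mk(Op::And,
                   mk(Op::Load, mk(Op::Add, reg("rdi"), mk(Op::Shl, reg("rsi"), cst(2)))),
                   cst(4)),
                cst(0));
    if (!tryTestMemImm(cond))   // biggest tile first...
        emitNaive(cond);        // ...fall back to naive codegen otherwise
}

As far as I understand, real backends generate matchers like this from pattern descriptions rather than hand-writing each case, but I'd like to know how that actually plays out in practice.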
Hello,
I am going through the Crafting Interpreters book, and the code written there is not runnable due to dependencies that are written later in the book.
How should I approach this? I like to run the code I am writing; should I just paste the entire thing beforehand and then backtrack? Sorry for the weird question.
Each of these runtimes has its own peculiarities, but would it be possible to make a ranking of the target runtimes, starting from the one that can support the most features and paradigms?
In my opinion:
If you had to choose a target runtime for a programming language that supports as many features and paradigms as possible, which one would you choose?
The choice is not limited to previous runtimes only.
Hi, I wanted to learn about compilers, and more specifically about LLVM and MLIR. I have a background in computer architecture but have not had any exposure to compilers. What is the general opinion of this subreddit on this course? Other course recommendations are also welcome. TIA
course link - https://www.cs.cornell.edu/courses/cs6120/2020fa/syllabus/
Apologies if this is a silly or half-baked question.
I've been building a compiler for my own toy programming language. The language has "move semantics", somewhat like in Rust, where aggregate types (structs, enums, arrays, tuples - any type that contains other types) move when they're assigned or passed as arguments, so they always have exactly one owner.
I have not yet implemented any form of automatic memory management for my language (and maybe I never will), so the language mostly works like C with a slightly more modern type system. However, I've been wondering if it's worth having move semantics like this in a language where you're forced to do manual memory management anyway (as opposed to just having the compiler automatically copy aggregate values like in C or Go). I figure the main point of move semantics is single ownership, and that's mostly just useful because it lets the compiler figure out where to generate code to free memory when values go out of scope. So, if your compiler isn't doing any cleanup or memory management for you, is there really any point in move semantics?
The only benefit I can think of is safety. The fact that the compiler doesn't automatically copy aggregate types means that the programmer is free to define copy semantics for their type as they wish, and they can be enforced by the compiler. For example:
struct Mutex { /* ... */ }
struct Locked[T] {
lock: Mutex
value: T
}
fn main() {
let value = Locked[int]{
lock: Mutex{ /* ... */ }
value: 1234
}
let moved_value = value
let illegal = value // error: cannot use moved value `value`
}
In the above example, the programmer is prevented from accidentally copying a struct containing a lock. If they want to copy it, they'd have to call some kind of copy method declared explicitly (and safely) on the Locked type.
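For comparison, a rough C++ analogue of this (a sketch, names invented) would be a type whose copy constructor is deleted and which exposes an explicit clone method instead:

#include <utility>

struct Mutex { /* ... */ };

// Move-only wrapper: accidental copies are a compile error, deliberate copies go through clone().
template <typename T>
class Locked {
public:
    explicit Locked(T value) : value_(std::move(value)) {}

    Locked(const Locked&) = delete;              // no implicit copies
    Locked& operator=(const Locked&) = delete;

    Locked(Locked&&) noexcept = default;         // transferring ownership is fine
    Locked& operator=(Locked&&) noexcept = default;

    Locked clone() const { return Locked(value_); }  // explicit, opt-in copy

private:
    Mutex lock_;
    T value_;
};

int main() {
    Locked<int> value(1234);
    Locked<int> moved_value = std::move(value);   // ok: ownership moves
    // Locked<int> illegal = moved_value;         // error: use of deleted copy constructor
    Locked<int> copy = moved_value.clone();       // explicit copy is allowed
    (void)copy;
}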
Are there any other benefits of move semantics for a language with absolutely no automatic memory management?
I've heard people say that moves sometimes prevent unnecessary copies, but I can't actually think of a concrete example of this. Even in languages that copy automatically, a lot of copies are optimized out anyway, so I'm not sure move semantics really offer any performance benefit here.
A quick animation about how compilers usually work. Let me know if you'd want to see some animated deeper dives on compiler internals (though the middle-end is the only part I can speak to in great detail). My channel has historically been about array programming, but I would love to do more compiler-specific stuff. Let me know what you think!