/r/Compilers

This subreddit is all about the theory and development of compilers.

18,611 Subscribers

1

Tokenization of String & Identifier

I am currently trying to build a lexer for JSON and am struggling to differentiate tokens based on identifiers and values since they are both strings. For example,

{
  "ident": "value"
}

(IDENT, "ident"), (ASSIGN, ":"), (STRING, "value")

I am very new to parsers and lexical analysis so any help would be much appreciated!
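One way to get the token stream shown in the post, sketched in Python under stated assumptions: the lexer treats every quoted string as a plain STRING, and a small follow-up pass relabels a STRING as IDENT when the next token is a colon. The names `TOKEN_SPEC`, `lex` and `label_keys` are hypothetical, and the string pattern is simplified (it does not validate escape sequences).

```python
import re

# Token patterns for a small JSON subset (hypothetical names, simplified string rule).
TOKEN_SPEC = [
    ("STRING", r'"(?:[^"\\]|\\.)*"'),
    ("NUMBER", r"-?\d+(?:\.\d+)?"),
    ("ASSIGN", r":"),
    ("COMMA", r","),
    ("LBRACE", r"\{"),
    ("RBRACE", r"\}"),
    ("LBRACKET", r"\["),
    ("RBRACKET", r"\]"),
    ("KEYWORD", r"true|false|null"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(text):
    """Yield (kind, lexeme) pairs; every quoted string is just a STRING here."""
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            yield match.lastgroup, match.group()
        pos = match.end()


def label_keys(tokens):
    """Relabel a STRING as IDENT when the token right after it is a colon."""
    tokens = list(tokens)
    for i, (kind, lexeme) in enumerate(tokens):
        is_key = kind == "STRING" and i + 1 < len(tokens) and tokens[i + 1][0] == "ASSIGN"
        yield ("IDENT" if is_key else kind, lexeme)


for token in label_keys(lex('{\n  "ident": "value"\n}')):
    print(token)
```

The relabeling step could just as well live in the parser, which already knows whether it is reading an object key or a value; the separate pass is only there to reproduce the IDENT/ASSIGN/STRING stream from the post.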

2 Comments
2024/05/12
01:59 UTC

12

Good collection of papers on various compiler optimizations

Can someone point me to a good collection (GitHub awesome series or something equivalent) of papers that describe how various compiler optimizations are done? I am trying to refresh my knowledge and brush up on my skills. Thanks!

3 Comments
2024/05/11
16:43 UTC

40

Why do people dislike the dragon book?

Just curious as to why people dislike the dragon book.

People complain that it’s full of parsing theory but that’s only three chapters.

The second edition is a great book I think. I skipped the parsing sections and went straight into the intermediate language chapter. There’s also a run-time environment chapter, and it has good chapters on optimization.

CMU uses the second edition for its graduate course on optimizing compilers. Or at least the syllabus for one year listed the dragon book as a required text.

What do you all think about the dragon book second edition?

What I liked about it is that it taught me that compilers are basically mathematical systems. I’m not an expert but optimization and register allocation consist of graph algorithms. The frontend generates a tree.

It also taught me that some problems in compilers are computationally intractable. I think it’s a great book.

So yeah just curious about what you think about it?

Thanks :)

26 Comments
2024/05/11
03:21 UTC

26

compiler engineer future

Hi,

I am a Computer Engineering student specializing in embedded & real time systems. However, I have found a deep passion for compilers through a project my school offers based on Andrew Appel’s Modern Compiler Implementation in ML book. I’ve done the entire project and, on top of writing my own compiler frontend and backend, which support an object-oriented language and target three assembly languages (MIPS, IA32, ARM), I have used LLVM IR. I’ve also implemented (though not alone) an SSA pass that runs after canonicalization in my own backend (not LLVM).

I would like to pursue a career in compilers, but I don’t really know what to do to stand out, nor what that might look like. I doubt my skills a lot, and fear nothing will make me stand out in a pool of motivated applicants. I could delve into LLVM more since it is important, but I am not sure it is the most interesting thing for me to do. Do you have any advice?

24 Comments
2024/05/11
01:32 UTC

9

Open source language for automata

Hey,

I am (re)releasing a project called Frame that I've been working on to create a language and transpiler to easily create state machines/automata in Python and also generate UML documentation to boot.

Very interested in connecting with people who might find this interesting. If that is you, please take a look at the Overview and the Getting Started articles.

I also am the mod at r/statemachines so there are a lot of resources there as well on the topic.

Thanks!

Mark

2 Comments
2024/05/10
23:04 UTC

0

Memoization of Mutable Objects

Dear redditors,

I've been supervising a student on a project about memoizing mutable objects. We have prepared a short draft of the work here, and would appreciate feedback. Specifically, we're looking for references to related work. If you're aware of any prior efforts to memoize mutable values, please reach out to us. Additionally, any suggestions for improving the work would be highly valued. Thank you!

P.S.: also posted in r/ProgrammingLanguages

0 Comments
2024/05/10
17:33 UTC

16

Where do I lower my ANF intermediate representation?

Hi everyone, long-time lurker, first-time poster. Context: I am self-taught, so my main sources of learning are half-published lecture notes on the Internet.

I am currently working on a little ML-like language called Boréal, whose purpose is to practice compilation techniques (and make me feel like a demiurge).

Currently my compilation pipeline ends at an ANF representation which I translate to Lua. However, I also want to practice my ASM-fu, and wish to generate native code. I came to understand that I would need to lower my ANF representation to an IR that lets me build control flow graphs (the name RTL popped up, it's used by GCC if I'm not mistaken?), and then do register allocation & instruction selection.

Are there any resources that you could recommend, that target eagerly-evaluated, immutable functional languages? If not, what are some keywords that I can look up on my favourite search engine?

Cheers.

8 Comments
2024/05/10
11:56 UTC

9

Interview for AI Compiler position at Modular

Hi all,

I have my first-round interview for an AI Compiler position at Modular. What can I expect in these interviews?

My background has been mainly in LLVM and reading research papers for Deep Learning Compilers. Any references to materials will help.

7 Comments
2024/05/09
18:05 UTC

15

Getting a job in compilers

Sorry if this isn’t the right place to post this. I thought of posting to r/cscareerquestions but felt I would get better answers here.

I took a compilers class in my final semester and found the contents really interesting, especially when it got around to IR and optimization passes. The class involved one big project that was divided into the different phases of compilation (lexer, parsing, AST creation, semantic analysis, etc.)

I want to try getting a job that involves compilers but I’m not sure where to start. I have already made a toy compiler for a small subset of C (arithmetic operations, for loops, if-statements and functions), but I’m sure there’s more I could do to stand out when applying for a compiler engineer role. I was just wondering where to start and how to develop my skills further.

7 Comments
2024/05/09
04:53 UTC

5

Stuck at parsing

I'm working on a hobby project, a basic compiler, but I'm not sure what approach to take to tree construction during the parsing phase. I got the lexing phase working, with each token containing a uint32 value, a std::string (C++), and an enum token_type. Example:

Raw text input: VAR_U8 foo = 1 + 2 * 3.14;

Tokens created after lexing: [VAR_U8"undefined", STRING"foo", EQUALS, I32=1, SUM, I32=2, MULT, FLOAT=3.14] [EOF]

Can anyone link resources on how exactly to build the trees after this?

Edit: for some reason reddit mobile isn't allowing me to manually add line breaks in my post...
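For the expression part of the example (`1 + 2 * 3.14`), one common technique is precedence climbing. The sketch below is in Python rather than C++ for brevity, and it is only an illustration under assumptions: tokens are modelled as `(kind, value)` tuples reusing the token names from the post, and the `PRECEDENCE` table and `Parser` class are hypothetical names, not part of the poster's project.

```python
# Binding power per operator token; higher binds tighter (hypothetical table).
PRECEDENCE = {"SUM": 1, "MULT": 2}


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def parse_primary(self):
        # A primary is a single literal in this sketch.
        kind, value = self.advance()
        if kind in ("I32", "FLOAT"):
            return (kind, value)
        raise SyntaxError(f"expected a literal, got {kind}")

    def parse_expression(self, min_prec=1):
        left = self.parse_primary()
        while True:
            kind, _ = self.peek()
            prec = PRECEDENCE.get(kind)
            # Stop at EOF, at a non-operator, or at an operator that binds too loosely.
            if prec is None or prec < min_prec:
                return left
            self.advance()
            # Passing prec + 1 makes the operator left-associative.
            right = self.parse_expression(prec + 1)
            left = (kind, left, right)


tokens = [("I32", 1), ("SUM", None), ("I32", 2), ("MULT", None),
          ("FLOAT", 3.14), ("EOF", None)]
print(Parser(tokens).parse_expression())
```

Each tree node is a tuple of the operator kind and its two children, so the multiplication ends up nested under the addition. A declaration such as `VAR_U8 foo = ...` would need its own statement-level rule wrapped around `parse_expression`.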

7 Comments
2024/05/08
16:41 UTC

1

Problem with saving result from Bison parser

  • Hi there! I wrote a parser with Bison and want to save the result of the calculation in an intermediate variable like this:
  • struct func_node result; // func_node is a custom data structure...
  • The variable is declared in the file that contains the main() function. In the parser file I added the following declaration:

%parse-param {func_node *myResult}

  • And the call:
  • int res_parse = yyparse(&result);
  • But after execution the variable is still empty, even though I know that data is stored while yyparse runs. Has anybody run into the same problem?
2 Comments
2024/05/08
16:19 UTC

8

Tinygrad/TVM

Hello friends,

I'm sure there are a lot of people in this sub that know a lot about ML/tensor graph compilers. So I'm trying to get thoughts on tinygrad from you guys.

In the last week I've been really looking into Tinygrad. If I understood it correctly, the core idea is to build the graph using only the most fundamental operations, like basic arithmetic and reduces, and to build a really good optimizer that can fuse these operations, optimize memory access patterns, etc.

For example they do a matrix multiply only by reshaping/transposing/broadcasting the matrices and then performing an element wise multiplication followed by a sum.

Do you think this approach could work? If they make this work, being able to compile models that run as fast as hand-written code using this approach would be amazing.

Also, maybe someone can elaborate on how it generally differs from TVM. Just from looking at the TVM docs, the approach looks somewhat similar.
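To make the matrix multiply description above concrete, here is a minimal sketch in plain Python. It is not tinygrad code and uses none of its API; `matmul_via_broadcast` is a hypothetical helper that only mimics the described sequence: broadcast both operands to a common shape, multiply element-wise, then sum over one axis.

```python
def matmul_via_broadcast(a, b):
    """Multiply an (m x k) by a (k x n) matrix using only broadcast, multiply, sum."""
    m, k, n = len(a), len(b), len(b[0])
    # "Broadcast" both operands to a common (m, n, k) shape:
    # a[i][p] is repeated along n, the transposed b[p][j] is repeated along m.
    a_wide = [[[a[i][p] for p in range(k)] for j in range(n)] for i in range(m)]
    b_wide = [[[b[p][j] for p in range(k)] for j in range(n)] for i in range(m)]
    # Element-wise multiply, then reduce (sum) over the last axis.
    product = [[[a_wide[i][j][p] * b_wide[i][j][p] for p in range(k)]
                for j in range(n)] for i in range(m)]
    return [[sum(product[i][j]) for j in range(n)] for i in range(m)]


print(matmul_via_broadcast([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Written this way the intermediate `product` has m * n * k elements, which is why the approach depends on an optimizer that fuses the multiply and the reduce instead of materializing that tensor.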

5 Comments
2024/05/08
13:25 UTC

16

Are there any C compilers with no IRs?

Just kinda curious if anyone’s made a compiler for (at least a subset of) C which simply parses and emits assembly with no IRs in between, as a challenge
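As a rough illustration of what "parse and emit with no IR" can look like, here is a sketch in Python of a single-pass recursive-descent translator that appends assembly text while it parses, without ever building a tree. It is an assumption-laden toy, not a description of any existing compiler: `compile_expr` is a hypothetical name, the input is limited to integer expressions with `+`, `-`, `*` and parentheses, the output is Intel-syntax x86-64 text using the hardware stack for temporaries, and there is no error handling.

```python
import re


def compile_expr(source):
    """Parse an arithmetic expression and emit stack-style x86-64 text directly."""
    tokens = re.findall(r"\d+|[-+*()]", source)
    pos = 0
    out = []

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def factor():
        token = advance()
        if token == "(":
            expr()
            advance()  # skip the closing ")"
        else:
            out.append(f"    push {int(token)}")

    def binary(next_level, ops):
        # Code for each operand is emitted as soon as it has been parsed.
        next_level()
        while peek() in ops:
            op = advance()
            next_level()
            out.extend(["    pop rbx", "    pop rax",
                        f"    {ops[op]} rax, rbx", "    push rax"])

    def term():
        binary(factor, {"*": "imul"})

    def expr():
        binary(term, {"+": "add", "-": "sub"})

    expr()
    out.append("    pop rax")  # final result ends up in rax
    return "\n".join(out)


print(compile_expr("1 + 2 * 3"))
```

The generated code is far from optimal since every intermediate value goes through the stack, which is the usual trade-off when no intermediate representation is available to optimize.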

30 Comments
2024/05/08
12:53 UTC

10

GPU accelerated compilers.

Hello everyone!

Are there any general purpose language compilers with GPU offloading to compile modules in parallel? What do you think about this approach?

23 Comments
2024/05/08
11:04 UTC

11

Add static types to a dynamically typed language

I am following Crafting Interpreters (CI) to learn how compilers/interpreters work.

CI builds a dynamically typed language. How would one go about implementing a statically typed language on top of this existing implementation (that is, in a bootstrapped version)?

That is, I want to write a lox_v2.0 language that supports static typing, in the lox_v1.0 language built by the book.

6 Comments
2024/05/08
07:37 UTC

18

Optimizing graph based render pipelines

No, I didn't get lost, I'm intentionally asking this question here, and I apologize in advance for the lengthy post :-)

For context, since I assume not everyone here is familiar with graphics programming, rendering CGI is usually done in several passes, where each pass reads some input and produces an output. For example a somewhat simple render pipeline could be composed of

  • a ShadingPass that takes in geometry and draws that to a texture
  • an OutlinePass that applies an edge detection kernel to the output of the shading pass and produces a greyscale edge map
  • a general PostProcessPass that applies color correction to the shading pass output
  • a CompositionPass that takes in the post process output and the edge map and slaps them together. This is our final image

Now graphically this looks somewhat like this:

                -> PostProcessPass -
              /                      \
ShadingPass -                          -> CompositionPass -> Screen
              \                      /
                -->  OutlinePass  --

I'm trying to write a system that allows users to define passes and compose them in any way they like. I want to design it in a way that passes have "value semantics" in the sense that they don't modify their inputs, and their output is available to all subsequent passes.

But this is not great for memory consumption: these textures can easily be several hundred megabytes in size, so I don't want to be unnecessarily wasteful. So I would like to reuse textures that are no longer needed by subsequent passes. Basically, preserve semantics while applying optimizations where they don't change the observable behaviour of the pipeline.

Here scheduling comes into play. Any topsort order is sufficient for correctness, for example

ShadingPass -> PostProcessPass -> OutlinePass -> CompositionPass

But if I scheduled in this order:

ShadingPass -> OutlinePass -> PostProcessPass -> CompositionPass

then the output texture of the ShadingPass could be allocated to the output of the PostProcessPass and also to the output of the CompositionPass.

Also, this example is somewhat artificial, but an additional requirement is that not all textures are alike: they can have different pixel formats and other properties. For this example to be meaningful, assume all textures have the same pixel format except for the output of the OutlinePass. That's why we can't allocate the ShadingPass output to the OutlinePass.

To me this problem seems very similar to register allocation and instruction scheduling, which is why I'm asking this here.

The graph I showed above is what a user would specify in a node editor. As far as I can tell, this is analogous to a data dependency graph, the textures correspond to registers and the passes to instructions.

Now my question is, what algorithm can I use for scheduling to minimize texture/register usage? I feel like once I have a good schedule I can compute live sets for each point in the pipeline and use graph coloring for texture allocation. But finding a good schedule seems very hard to me.

Thank you very much for reading and any suggestions!
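The comparison between the two schedules above can be sketched in Python as follows. Everything here is an assumption made for illustration: the `PASSES` table encodes the example graph with made-up format names (`"rgba"`, `"grey"`), a pass is allowed to write into a texture it reads in the same step (as the post's reuse example implies), allocation is a simple greedy scan rather than graph coloring, and the schedule search is brute force over all topological orders, which is only workable for small graphs.

```python
from itertools import permutations

# Example graph from the post: pass -> (input passes, output format).
PASSES = {
    "ShadingPass": ([], "rgba"),
    "PostProcessPass": (["ShadingPass"], "rgba"),
    "OutlinePass": (["ShadingPass"], "grey"),
    "CompositionPass": (["PostProcessPass", "OutlinePass"], "rgba"),
}


def is_valid(order, passes):
    """True if every pass runs after all of its inputs (a topological order)."""
    seen = set()
    for name in order:
        if any(dep not in seen for dep in passes[name][0]):
            return False
        seen.add(name)
    return True


def allocate(order, passes):
    """Greedy texture assignment for one schedule; returns {pass: texture id}."""
    position = {name: step for step, name in enumerate(order)}
    # Last step at which each pass's output is still read.
    last_use = dict(position)
    for name in order:
        for dep in passes[name][0]:
            last_use[dep] = max(last_use[dep], position[name])
    formats, holder, assignment = [], [], {}
    for step, name in enumerate(order):
        fmt = passes[name][1]
        for tex in range(len(formats)):
            # Reuse a texture of the same format whose contents are dead by now.
            if formats[tex] == fmt and last_use[holder[tex]] <= step:
                break
        else:
            tex = len(formats)
            formats.append(fmt)
            holder.append(name)
        holder[tex] = name
        assignment[name] = tex
    return assignment


def best_schedule(passes):
    """Try every topological order and keep one needing the fewest textures."""
    valid = (order for order in permutations(passes) if is_valid(order, passes))
    return min(valid, key=lambda order: len(set(allocate(order, passes).values())))


print(best_schedule(PASSES))
```

On this graph the order that runs the OutlinePass before the PostProcessPass needs two textures under these assumptions, while the other order needs three. A realistic version would weigh textures by size instead of counting them and would replace the exhaustive search with a heuristic.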

5 Comments
2024/05/06
18:53 UTC

0

Which is the most used algorithm for JSON parsers?

Hi everyone, this is my first post here. I think JSON parsers may not be directly related to compilers, but you guys know a lot about parser stuff. Which is the preferred algorithm for a JSON parser?

4 Comments
2024/05/05
14:51 UTC

21

Translating out of SSA

I'm writing a toy Lisp and use an SSA representation. The translation into (T)SSA and CSSA was straightforward, since I found a lot of online course material describing the process, but I have a problem with implementing a non-naive translation out of SSA.

I have found two frequently referenced articles describing the process: Translating Out of Static Single Assignment Form and Revisiting Out-of-SSA. I decided to go with the Out-of-SSA method, and there was no problem implementing the algorithms for interference checking since pseudo code is provided.

But in section IV:C Virtualization of the φ-nodes the process of φ-elimination is only described at a high level and I can't figure out how to do the implementation.

Can anyone point me to some material describing this process in more depth, preferably with examples and/or pseudo code?

If not, is there a simpler algorithm that provides reasonable results? Perhaps method I/II of Translating Out of SSA and then the described CSSA-based coalescing?

18 Comments
2024/05/05
08:17 UTC

3

How to make a disassembler?

I want to make a disassembler for Linux as well as Windows. What are some open source examples? Can anyone mention some, please?

9 Comments
2024/05/04
11:20 UTC

4

Interprocedural type checking

Can anyone suggest articles on interprocedural type checking? I'm currently writing a type checker for a tiny language and I'm running into a problem. Since in my language functions are not pure and have side effects, it is difficult to check what types a variable might actually have (the language has a gradual type system, so you can't just check the annotations of all variables). However, so far I have not been able to find a good article that explains whether it is even possible to do this and what approaches exist that at least partially solve the problem.

4 Comments
2024/05/02
22:23 UTC

25

What is everyone working on this month?

33 Comments
2024/05/02
15:04 UTC

1

C++ Coding Round for Waymo AI Compiler position

I have my first coding round for an AI Compiler position at Waymo. Any specifics on the kind of coding problems that will be asked in the C++ coding round?

3 Comments
2024/05/01
05:06 UTC

37

How can I find a compiler developer job?

I am a new graduate software engineer working at a tech company. However, I have always dreamed of building compilers as my job. What do I need to learn or do to apply for a compiler engineer job? I have some experience with LLVM, and I am good at writing code in cpp, Scala, Java and a couple of other languages.

11 Comments
2024/04/30
18:02 UTC
