/r/REMath


Computer code is a complex logical artifact and high-dimensional dynamical system whose extension, maintenance, comprehension, and verification are difficult but can be made easier by creating tools that aid humans in these domains. This subreddit is for people who want to advance the field of machine language processing by developing ideas that help humans understand computer code of all types and sizes.

Those wanting to learn about formal aspects of reverse engineering should start here and those wishing to study implementations can start here and here

--- Courses ---

Program Analysis by Rolf Rolles

Advanced Tool Development with SMT Solvers by Sean Heelan

Advanced 0Day Discovery using SMT Solvers by Edgar Barbosa

Computer code is a complex logical artifact, high-dimensional transition system, and data source whose synthesis, extension, maintenance, execution abstraction, visualization, comprehension, and verification are difficult but can be made easier by creating tools that aid humans in these domains.

Statistical Reverse Engineering, Machine Language Processing, and Program Analysis are fields of computer science devoted to creating tools and theories for the understanding of programs with inspiration from the fields of formal methods, reverse engineering, pure mathematics, natural language processing, human–computer interaction, bioinformatics, and machine learning.

We are an interdisciplinary community concerned with the discovery and understanding of computational systems beyond what is available on the surface.

Join us on Slack here or here

Join us on IRC: #r_netsec on freenode

Other places of interest:


6,609 Subscribers

6

Mathematical preliminaries for the program analysis reading list? (A reading list for the reading list?)

Greetings friends,

I am a fledgling reverse engineer and I have taken a liking to more theoretical areas of computer science, although I have little background in it. I am doing my master's in computer science and my program is heavily applied. I discovered this subreddit and the reading list linked on the sidebar, and noticed that the mathematical component of the reading list might be slightly advanced for someone with my mathematical background.

So, to spark some discussion, I thought I'd ask you all what you think the mathematical preliminaries for the reading list are. It's been a while since I've done my undergraduate mathematics, so I think it would do me some good to brush up on some areas before diving into this reading list.

Thank you for your consideration.

2 Comments
2024/04/17
00:46 UTC

2

Study or skip calculus?

I am studying the prerequisite math for program analysis. I have completed trig and precalc. Can I jump to discrete math, proofs, linear algebra, etc., or should I study calculus 1, 2, and 3 before proceeding further? Will studying calculus be of any use in the program analysis domain?

6 Comments
2024/03/03
05:45 UTC

0

YouTube: Mathematician proves mathematics ends in meaninglessness/contradiction

1 Comment
2024/01/17
20:58 UTC

1

Question regarding a problem I'm trying to solve

Hi, I have a question.

Whenever it runs, a program creates a binary file that contains some information about what happened during its runtime and some diagnostic information. Part of the diagnostic information will be exactly the same for every run from the same machine/configuration; some will be different. The information can be at different positions in the file. (As an example, it could be after 100 bytes or 10000 bytes.)

My goal is to be able to identify precisely if the file came from the same machine as another file.

It's raw data that contains no information about its own structure. It does contain some strings, but they aren't unique enough to be 100% sure it's from the same machine. It does have a general order, but the data is arranged in a space-efficient way. (As an example, if there's an 8-bit value stored, then a bool, then a 16-bit value, it would be using exactly 25 bits.)
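To make the 25-bit example concrete, here is a minimal sketch of that kind of back-to-back bit packing. The function names `pack_fields` and `unpack_fields`, the most-significant-bit-first ordering, and the sample values are assumptions for illustration only; nothing here is known about the actual program's format.

```python
def pack_fields(fields):
    """Pack (value, bit_width) pairs back to back, first field in the highest bits.

    Returns the packed integer and the total number of bits used.
    """
    bits = 0
    total = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        bits = (bits << width) | value
        total += width
    return bits, total


def unpack_fields(bits, total, widths):
    """Recover the field values, given the same widths in the same order."""
    values = []
    remaining = total
    for width in widths:
        remaining -= width
        values.append((bits >> remaining) & ((1 << width) - 1))
    return values


# Hypothetical sample: an 8-bit value, a bool, and a 16-bit value.
packed, nbits = pack_fields([(0xAB, 8), (1, 1), (0x1234, 16)])
fields = unpack_fields(packed, nbits, [8, 1, 16])
```

With this layout nothing after the first field is byte-aligned, so a value can straddle byte boundaries in the file.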

I know some techniques to tackle this problem by looking at how the application writes the data, but I was wondering if I could use the underlying structure of the data to solve this problem. If I run the program 3 times from the same machine some bit sequences will always be the same.
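One rough way to look for sequences that are "always the same" is to intersect fixed-length windows across several runs. This is only a sketch under strong assumptions: the helper name `shared_windows` and the window width are made up for illustration, and it compares byte-aligned windows, so an invariant field that shifts by a non-multiple of 8 bits between runs would be missed.

```python
def shared_windows(runs, width=8):
    """Return the set of byte windows of length `width` that occur in every run."""
    common = None
    for data in runs:
        windows = {data[i:i + width] for i in range(len(data) - width + 1)}
        common = windows if common is None else common & windows
    return common or set()


# Hypothetical files from three runs; the shared token sits at a different offset in each.
runs = [b"xxMACHINE-IDyy", b"zMACHINE-IDqqq", b"MACHINE-ID"]
candidates = shared_windows(runs, width=10)
```

Windows that also show up in files from other machines would then have to be filtered out before the remainder could serve as a fingerprint.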

For simplicity let's assume the data is less than 10 Megabytes in size.

What would be a good way to approach this?

0 Comments
2023/10/19
18:27 UTC

18

what math do reverse engineers use?

I'm a beginner hacker and I just started learning networking stuff. I like to understand how machines actually do things: how do computers "compress" files? Or how does encryption work? I wanted to ask you, what mathematical notions should I learn to actually get into reverse engineering?

PS: I would really appreciate it if you could also tell me what I should learn after getting through the CCNA course. Once I understand the basics of networking that a CCNA course can teach me, what should I learn next? Thanks in advance

6 Comments
2021/01/26
19:35 UTC

4

what tools are commonly used?

So I am wondering what tools are commonly used to see what's behind a program you use often, when you want to learn how it functions and such?

Any tips or tricks you want to share with a newbie like me? 😊

2 Comments
2020/07/26
22:00 UTC

6

What makes a program flow different from a program path?

Program flow and program path are distinguished in the program analysis literature. However, I have failed to find a formal definition of either. Can anyone please point me to some authoritative literature, e.g. research papers or popular books? An example would also be highly appreciated.

1 Comment
2020/01/19
08:38 UTC

12

Limitations of abstract interpretation/static analysis paper?

Is there a good SoK paper that details the modern problems/limitations of static analysis?

1 Comment
2019/06/11
17:35 UTC

7

What type of content are you looking for in this sub?

I was wondering if you could all give me a good idea of what you are all interested in. I've recently been learning about formal methods and the mathematical foundations behind SAT/SMT solving. I've been following the Mobius Strip RE reading list and am currently reading The Calculus of Computation. Has anyone here read that? What are your backgrounds?

3 Comments
2019/01/31
19:20 UTC

11

Active reading list?

Hey everyone, my name is TotallyNotCarson and I'm a student. I'm pretty interested in program analysis and reverse engineering, and the other RE sub seems to be more applied. This sub also seems pretty dead, so I was wondering if we could have the mods make a monthly/weekly discussion post where we can talk about things we read or ask questions. I'm currently enrolled in an automated reasoning course and would love to share what I'm learning with all of you.

edit: Can't edit my post title, but it should probably be "Suggestion: monthly/weekly discussion thread"

0 Comments
2019/01/30
02:10 UTC

12

Program analysis reading list study group, anyone?

I'm going through the recommended texts and would like to create a discord group (or matrix room!) to talk about it. Would anyone be interested?

My background is in programming/informatics, so the math is mostly new to me. It could be nice to join forces, help each other out.

13 Comments
2018/05/30
11:48 UTC

11

Books on program analysis?

Does anyone have a good book that covers various types of program analysis? I've read papers on symbolic execution and dataflow/taint analysis, but I'm looking for something more textbook-like. I googled a bit and only found a few resources that seem pretty old. Thanks for any and all suggestions!

15 Comments
2018/05/27
21:50 UTC

11

NeuroSAT: Learning a SAT Solver from Single-Bit Supervision by Daniel Selsam

0 Comments
2018/04/07
00:00 UTC

4

Value-Set analysis explanation

Hi,

I'm working on a decompiler, and I'm getting to the part where I want to discover types. I want to use value-set analysis (explained in "Analyzing Memory Accesses in x86 Executables" - https://pdfs.semanticscholar.org/2f7b/486069be08da1ef1dd86f4ed838a51153f8e.pdf) for it, but I can't make heads or tails of how I'm supposed to apply it.

Can anybody shed light on how VSA is supposed to work (or point to some resource) in a way that a simple programmer like me can understand? :D

Thanks in advance.

6 Comments
2018/03/27
22:53 UTC

7

Questions about Formal Verification in application to Kernel Security

I'm inquiring about formal methods, in particular as applied to kernel security. My questions are as follows:

  • What significant developments/advances have been made?

  • Could formal methods be used to show that a system is insecure rather than secure, and if so, would that be trivial to implement?

  • What are some fundamental papers one should read in the area?

  • Has a system like seL4, despite being proven secure, still failed because of the assumptions made?

  • What's a good reading list for bridging into the area?

  • What are the significant difficulties of implementing formal methods?

  • Who are the key researchers in the area?

  • Which architectures have formal methods for OS kernels been applied to?

1 Comment
2018/03/05
01:20 UTC

6

ML algorithm to model/classify/map a software program's internal structure? • r/MLQuestions

4 Comments
2017/11/10
02:07 UTC

4

Live Stream of Coding of Formal Analysis Framework in Rust

0 Comments
2017/06/26
23:48 UTC

4

Question about the Mathematics Side of RE

I asked this question over at r/ReverseEngineering as well but this may be a better place for it.

I'm beginning the book list on the formal side of reverse engineering from Mobius Strip Reverse Engineering. I have a strong, graduate-level background in math, but I am newer to the formal aspects of computer science.

When I'm reading these textbooks what should I be thinking about from the applied side of reverse engineering? The best example of what I'm looking for is if you're studying physics and you start reading a real analysis book you should be thinking about how the function behaviors you're studying relate to the physical systems you are studying. The function itself, assuming some nice properties, combined with operators on that function tell a great deal of information about a physical system.

So as I'm reading The Calculus of Computation should I be thinking about how the C programming language behaves? Does that statement even make sense?

7 Comments
2016/12/01
21:00 UTC
