/r/REMath

Computer code is a complex logical artifact and high-dimensional dynamical system whose extension, maintenance, comprehension, and verification are difficult but can be made easier by creating tools that aid humans in these domains. This subreddit is for people who want to advance the field of machine language processing by developing ideas that help humans understand computer code of all types and sizes.

**Those wanting to learn about formal aspects of reverse engineering should start here and those wishing to study implementations can start here and here**

**--- Courses ---**

Program Analysis by Rolf Rolles

Advanced Tool Development with SMT Solvers by Sean Heelan

Advanced 0Day Discovery using SMT Solvers by Edgar Barbosa

Computer code is a complex logical artifact, high-dimensional transition system, and data source whose synthesis, extension, maintenance, execution abstraction, visualization, comprehension, and verification are difficult but can be made easier by creating tools that aid humans in these domains.

Statistical Reverse Engineering, Machine Language Processing, and Program Analysis are fields of computer science devoted to creating tools and theories for the understanding of programs with inspiration from the fields of formal methods, reverse engineering, pure mathematics, natural language processing, human–computer interaction, bioinformatics, and machine learning.

We are an interdisciplinary community concerned with the discovery and understanding of computational systems beyond what is available on the surface.

Join us on IRC: #r_netsec on freenode


1

Hi, I have a question.

Whenever a program runs, it creates a binary file that contains some information about what happened during its runtime and some diagnostic information. Part of the diagnostic information will be exactly the same for every run from the same machine/configuration; some will be different. The information can sit at different positions in the file. (As an example, it could start after 100 bytes or after 10000 bytes.)

My goal is to be able to identify precisely if the file came from the same machine as another file.

It's raw data that doesn't contain any information about its own structure. It does contain some strings, but they aren't unique enough to be 100% sure it's from the same machine. It does have a general order, but the data is packed in a space-efficient way. (As an example, if an 8-bit value is stored, then a bool, then a 16-bit value, they would take up exactly 25 bits.)

I know some techniques to tackle this problem by looking at how the application writes the data, but I was wondering if I could use the underlying structure of the data to solve this problem. If I run the program 3 times from the same machine some bit sequences will always be the same.

For simplicity let's assume the data is less than 10 Megabytes in size.

What would be a good way to approach this?
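One possible starting point, sketched here under assumptions the post does not confirm: treat every file as a bag of fixed-size byte windows, keep the windows that occur in all known runs from one machine regardless of position, and discard any that also occur in runs from other machines. The function names and the window size `k` are made up for illustration. The sketch also assumes the stable data is byte-aligned; because the post says fields are bit-packed, the same bits may land at a different bit offset from run to run, in which case the comparison would have to be repeated on bit-shifted copies of the data.

```python
from typing import Iterable, Set


def shingles(data: bytes, k: int = 8) -> Set[bytes]:
    """Return the set of all k-byte windows of data (byte-aligned, any offset)."""
    return {data[i:i + k] for i in range(len(data) - k + 1)}


def machine_fingerprint(same_machine: Iterable[bytes],
                        other_machines: Iterable[bytes] = (),
                        k: int = 8) -> Set[bytes]:
    """Windows found in every run from one machine and in no run from other machines."""
    runs = [shingles(d, k) for d in same_machine]
    if not runs:
        return set()
    common = set.intersection(*runs)
    for d in other_machines:
        # Drop windows that are constant across machines (format headers, common strings).
        common -= shingles(d, k)
    return common


def match_score(candidate: bytes, fingerprint: Set[bytes], k: int = 8) -> float:
    """Fraction of the fingerprint windows that also occur in the candidate file."""
    if not fingerprint:
        return 0.0
    return len(fingerprint & shingles(candidate, k)) / len(fingerprint)
```

A score near 1.0 suggests the candidate comes from the fingerprinted machine, and a score near 0.0 suggests it does not. For files approaching 10 megabytes the window sets get large, so storing hashes of the windows, or using a suffix array to find common substrings, may be more practical than storing the windows themselves.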

0 Comments

2023/10/19

18:27 UTC


16

I'm a beginner hacker and I just started learning networking stuff. I like to understand how machines actually do things: how do computers "compress" files? Or how does encryption work? I wanted to ask you: what mathematical notions should I learn to actually get into reverse engineering?

PS: I would really appreciate it if you could also tell me what I should learn after getting through the CCNA course. Once I understand the networking basics that a CCNA course can teach me, what should I learn next? Thanks in advance.

6 Comments

2021/01/26

19:35 UTC


4

So I am wondering what tools are commonly used to see what's behind a program you use often, when you want to learn how it functions and such?

Any tips or tricks you want to share with a newbie like me? 😊

2 Comments

2020/07/26

22:00 UTC


8

Program flow and program path are distinguished in the program analysis literature. However, I have failed to find a formal definition of either. Can anyone please point me to some authoritative literature, e.g. research papers or well-known books? Also, an example would be highly appreciated.
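A toy illustration of one common reading of the two terms, which not every paper necessarily shares: control flow is the "may be followed by" relation between program points, usually drawn as a control-flow graph (CFG), while a path is one particular sequence of nodes through that graph. The graph and the `paths` helper below are hypothetical and cover only a single if/else.

```python
from typing import Dict, List, Tuple

# Hypothetical CFG for:  if (x > 0) y = 1; else y = 2; return y;
# The edges are the control flow: which node may execute right after which.
CFG: Dict[str, List[str]] = {
    "entry": ["cond"],
    "cond": ["then", "else"],
    "then": ["ret"],
    "else": ["ret"],
    "ret": [],
}


def paths(cfg: Dict[str, List[str]], node: str, goal: str,
          prefix: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """Enumerate the loop-free paths from node to goal as tuples of node names."""
    prefix = prefix + (node,)
    if node == goal:
        return [prefix]
    found: List[Tuple[str, ...]] = []
    for succ in cfg[node]:
        if succ not in prefix:  # never revisit a node, so cycles cannot recurse forever
            found.extend(paths(cfg, succ, goal, prefix))
    return found


for p in paths(CFG, "entry", "ret"):
    print(" -> ".join(p))
```

The graph here has one flow relation but two paths. A flow-sensitive analysis computes one fact per node and merges at `ret` (so `y` is 1 or 2), whereas a path-sensitive analysis keeps a separate fact for each path (`y` is 1 on one, 2 on the other).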

1 Comment

2020/01/19

08:38 UTC


12

Is there a good SoK paper that details the modern problems and limitations of static analysis?

1 Comment

2019/06/11

17:35 UTC


5

I was wondering if you could all give me a good idea of what you are all interested in. I've recently been learning about formal methods and the mathematical foundations behind SAT/SMT solving. I've been following the Mobius Strip RE reading list and am currently reading The Calculus of Computation. Has anyone here read that? What are your backgrounds?

3 Comments

2019/01/31

19:20 UTC


10

Hey everyone, my name is TotallyNotCarson and I'm a student. I'm pretty interested in program analysis and reverse engineering, and the other RE sub seems to be more applied. This sub also seems pretty dead, so I was wondering if we could have the mods make a monthly/weekly discussion post where we can talk about things we've read or ask questions. I'm currently enrolled in an automated reasoning course and would love to share what I'm learning with all of you.

Edit: Can't edit my post title, but it should probably be "Suggestion: monthly/weekly discussion thread"

0 Comments

2019/01/30

02:10 UTC


12

I'm going through the recommended texts and would like to create a Discord group (or Matrix room!) to talk about them. Would anyone be interested?

My background is in programming/informatics, so the math is mostly new to me. It could be nice to join forces and help each other out.

12 Comments

2018/05/30

11:48 UTC


10

Does anyone have a good book that covers various types of program analysis? I've read papers on symbolic execution and dataflow/taint analysis, but I'm looking for something more textbook-like. I googled a bit and only found a few resources that seem pretty old. Thanks for any and all suggestions!

15 Comments

2018/05/27

21:50 UTC



4

Hi,

I'm working on a decompiler, and I'm getting to the part where I want to discover types. I want to use value-set analysis (explained in "Analyzing Memory Accesses in x86 Executables" - https://pdfs.semanticscholar.org/2f7b/486069be08da1ef1dd86f4ed838a51153f8e.pdf) for it, but I can't make heads or tails of how I'm supposed to apply this.

Can anybody shed light on how VSA is supposed to work (or point me to some resource) in a way that a simple programmer like me can understand? :D

Thanks in advance.
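A much-simplified sketch of the kind of abstract value VSA tracks, offered as an illustration rather than the paper's actual algorithm: instead of one concrete number, each register or memory cell holds a "strided interval", a compact over-approximation of every value it might take. The class below is hypothetical, ignores machine-word overflow and memory regions, and the linked paper's own representation may differ in detail.

```python
from dataclasses import dataclass
from math import gcd
from typing import List


@dataclass(frozen=True)
class StridedInterval:
    """The set {lb, lb + stride, ..., ub}; stride 0 means the single value lb."""
    stride: int
    lb: int
    ub: int

    @staticmethod
    def const(value: int) -> "StridedInterval":
        return StridedInterval(0, value, value)

    def join(self, other: "StridedInterval") -> "StridedInterval":
        """Over-approximate the union, e.g. where two control-flow paths merge."""
        stride = gcd(gcd(self.stride, other.stride), abs(self.lb - other.lb))
        return StridedInterval(stride, min(self.lb, other.lb), max(self.ub, other.ub))

    def add(self, other: "StridedInterval") -> "StridedInterval":
        """Over-approximate every sum a + b (no wrap-around modelled)."""
        return StridedInterval(gcd(self.stride, other.stride),
                               self.lb + other.lb, self.ub + other.ub)

    def values(self) -> List[int]:
        """Concrete values described by this interval (for inspection only)."""
        if self.stride == 0:
            return [self.lb]
        return list(range(self.lb, self.ub + 1, self.stride))


# A loop index that takes the byte offsets 0, 8 and 16 on successive iterations:
offsets = StridedInterval.const(0).join(StridedInterval.const(8)).join(StridedInterval.const(16))
# Addresses touched by base + offset, with a made-up base address of 0x1000:
addrs = StridedInterval.const(0x1000).add(offsets)
print(offsets)         # StridedInterval(stride=8, lb=0, ub=16)
print(addrs.values())  # [4096, 4104, 4112]
```

For type discovery, the interesting part is the shape of the result: a set of accessed addresses with stride 8 and three elements hints at an array of three 8-byte elements, which is the sort of evidence a decompiler could turn into a type guess.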

6 Comments

2018/03/27

22:53 UTC


6

I'm inquiring about formal methods, in particular in the area of kernel security. My questions are as follows:

What significant developments/advances have been made?

Could formal methods be used to show that a system is insecure rather than secure, and if so, would it be trivial to implement?

What fundamental papers should one read in the area?

Has a system like seL4, despite being proven secure, still failed because of the assumptions made?

What's a good reading list for bridging into the area?

What are the significant difficulties of implementing formal methods?

Who are the key researchers in the area?

To which architectures have formal methods for OS kernels been applied?

1 Comment

2018/03/05

01:20 UTC




4

I asked this question over at r/ReverseEngineering as well but this may be a better place for it.

I'm beginning the book list on the formal side of reverse engineering from Mobius Strip Reverse Engineering. I have a strong background in math, at graduate level, but am newer to the formal aspects of computer science.

When I'm reading these textbooks what should I be thinking about from the applied side of reverse engineering? The best example of what I'm looking for is if you're studying physics and you start reading a real analysis book you should be thinking about how the function behaviors you're studying relate to the physical systems you are studying. The function itself, assuming some nice properties, combined with operators on that function tell a great deal of information about a physical system.

So as I'm reading The Calculus of Computation should I be thinking about how the C programming language behaves? Does that statement even make sense?

7 Comments

2016/12/01

21:00 UTC
