/r/AskProgramming
Ask questions about programming.
All questions related to programming welcome. Wonder how things work? Have a bug you can't figure out? Just read about something on the news and need more insights? We got you covered! (probably)
You can find out more about our (preliminary) rules in this wiki article. If you have any suggestions please feel free to contact the mod team.
Have a nice time, and remember to always be excellent to each other :)
I understand that there is no official standard for naming variables and constants in JavaScript and PHP. However, is it common to do the following for both languages...
This seems to make sense for readability and seems to be a common convention for both of these languages.
Currently I have the code below:
from collections import defaultdict
import os
import re

# Parameters
X_percent = 0.93              # Desired coverage percentage (e.g., 93%)
lambda_penalty = 500          # Adjusted penalty
max_attack_time = 3600        # Maximum allowed execution time per attack in seconds
max_total_time = 20 * 3600    # Maximum total execution time in seconds (20 hours)
min_new_plains = 10           # Minimum number of new plains an attack must cover
max_plain_weight = 5          # Maximum weight assigned to any plain
set_a_directory = '/root/setA/'   # Directory containing Set A files
file_b_path = '/root/h.txt'       # Path to File B

# Step 1: Read File B and assign unique IDs to plains
plain_to_id = {}
id_to_plain = {}
with open(file_b_path, 'r') as f:
    for line in f:
        plain = line.strip()
        if plain not in plain_to_id:
            plain_id = len(plain_to_id)
            plain_to_id[plain] = plain_id
            id_to_plain[plain_id] = plain
total_plains = len(plain_to_id)
print(f"Total number of plains in File B: {total_plains}")

# Step 2: Process Set A files and build data structures
attack_info = []
plain_to_attacks = defaultdict(set)  # Maps plain ID to set of attack indices

# Regular expression to extract time from file name
time_pattern = re.compile(r'^(\d+)_')

# Iterate over each file in Set A
for file_name in os.listdir(set_a_directory):
    file_path = os.path.join(set_a_directory, file_name)
    # Extract execution time from file name
    time_match = time_pattern.match(file_name)
    if not time_match:
        continue  # Skip files that don't match the pattern
    execution_time = int(time_match.group(1))
    if execution_time > max_attack_time:
        continue  # Exclude attacks over the maximum allowed time
    with open(file_path, 'r') as f:
        lines = f.readlines()
    if not lines:
        continue  # Skip empty files
    attack_command = lines[0].strip()
    plains_covered = set()
    for line in lines[1:]:
        parts = line.strip().split(':')
        if not parts:
            continue
        plain = parts[0].strip()
        if plain in plain_to_id:
            plain_id = plain_to_id[plain]
            plains_covered.add(plain_id)
            plain_to_attacks[plain_id].add(len(attack_info))  # Index of this attack
    attack_info.append({
        'command': attack_command,
        'time': execution_time,
        'plains': plains_covered,
        'index': len(attack_info)
    })

num_attacks = len(attack_info)
print(f"Total number of attacks in Set A after filtering: {num_attacks}")

# Step 3: Compute the number of attacks covering each plain (f_p)
plain_cover_count = {}
for plain_id in plain_to_id.values():
    plain_cover_count[plain_id] = len(plain_to_attacks[plain_id])

# Step 4: Assign weights to plains with a maximum weight
plain_weights = {}
for plain_id, f_p in plain_cover_count.items():
    if f_p > 0:
        plain_weights[plain_id] = min(1.0 / f_p, max_plain_weight)
    else:
        plain_weights[plain_id] = max_plain_weight

# Step 5: Implement the weighted greedy algorithm with adjusted efficiency
total_plains_needed = int(total_plains * X_percent)
print(f"Number of plains needed for {X_percent*100}% coverage: {total_plains_needed}")

covered_plains = set()
selected_attacks = []
remaining_attacks = set(range(num_attacks))
total_execution_time = 0

while len(covered_plains) < total_plains_needed and remaining_attacks:
    best_efficiency = -1
    best_attack = None
    for i in remaining_attacks:
        attack = attack_info[i]
        new_plains = attack['plains'] - covered_plains
        if len(new_plains) < min_new_plains:
            continue  # Skip attacks that cover too few new plains
        # Calculate W_i (sum of weights of new plains)
        W_i = sum(plain_weights[p] for p in new_plains)
        # Adjusted efficiency E_i
        efficiency = (W_i * len(new_plains)) / (attack['time'] + lambda_penalty)
        if efficiency > best_efficiency:
            best_efficiency = efficiency
            best_attack = i
    if best_attack is None:
        print("No further attacks can improve coverage.")
        break
    # Check if adding this attack exceeds the maximum total execution time
    if total_execution_time + attack_info[best_attack]['time'] > max_total_time:
        print("Reached maximum total execution time limit.")
        break
    # Select the attack with the highest adjusted efficiency
    selected_attacks.append(best_attack)
    covered_plains.update(attack_info[best_attack]['plains'])
    remaining_attacks.remove(best_attack)
    total_execution_time += attack_info[best_attack]['time']
    # Optional: print progress
    coverage_percentage = (len(covered_plains) / total_plains) * 100
    print(f"Selected attack {best_attack}: Coverage {coverage_percentage:.2f}%, "
          f"Total Time: {total_execution_time / 3600:.2f} hours")

# Step 6: Output the results
print("\nSelected Attacks:")
for idx in selected_attacks:
    attack = attack_info[idx]
    print(f"Attack Index: {idx}")
    print(f"Command: {attack['command']}")
    print(f"Time: {attack['time']} seconds")
    print(f"Plains Covered: {len(attack['plains'])}")
    print("-----------------------------")

final_coverage = (len(covered_plains) / total_plains) * 100
print(f"Final Coverage: {final_coverage:.2f}%")
print(f"Total Execution Time: {total_execution_time / 3600:.2f} hours")
print(f"Total Attacks Selected: {len(selected_attacks)}")
I have a directory of files; we'll call these the Set A files. Each file has a runtime as the file name, followed by _Randstring.txt. The first line in each file is a command/attack, and the rest of the lines are 2 columns separated by ":" that are produced by that command/attack (the result of another program); the left side is what we can call plains, the right side can be ignored.
I have another, separate file we can call File B. It has 121k lines of plains. My goal with this Python program is to find command/attack chains whose files' plains match a certain percentage of the plains in File B in the shortest time possible. The plains in the Set A files have a lot of overlap, and I have 3600 files with a total of 250 million lines, so time/computation is an issue, hence the greedy algorithm.
Currently my Python code isn't producing the best results: it's choosing too many attack chains and too high a total runtime (the ints from the file names). Is there any way I can make this better so the attack chain it chooses is more "optimal"? I don't mind waiting a few hours, and I have a 192-core, 256 GB PC. Thanks
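One standard improvement for exactly this kind of weighted set-cover greedy is lazy evaluation (the CELF trick): marginal gains can only shrink as coverage grows, so you can keep attacks in a max-heap keyed by their last-computed efficiency and re-score only the top entry, instead of rescanning all 3600 attacks on every pick. A toy sketch of the idea (simplified efficiency = new plains / time; `lazy_greedy` and the toy data are illustrative, not your variables):

```python
import heapq

def lazy_greedy(attacks, target):
    """attacks: list of (time, plains_set). Pick attacks until `target`
    plains are covered, re-scoring marginal gains lazily (CELF)."""
    covered = set()
    # heap entries: (-efficiency, attack_index, round_when_scored)
    heap = [(-len(p) / t, i, -1) for i, (t, p) in enumerate(attacks)]
    heapq.heapify(heap)
    chosen, rounds = [], 0
    while heap and len(covered) < target:
        neg_eff, i, scored = heapq.heappop(heap)
        t, p = attacks[i]
        if scored == rounds:          # score is still fresh: take it
            covered |= p
            chosen.append(i)
            rounds += 1
        else:                         # stale score: recompute and push back
            gain = len(p - covered)
            if gain:
                heapq.heappush(heap, (-gain / t, i, rounds))
    return chosen, covered

# toy data: (time, set of plain IDs)
attacks = [(10, {0, 1, 2, 3}), (1, {0, 1}), (5, {4, 5})]
chosen, covered = lazy_greedy(attacks, target=6)
```

The same trick drops straight into your weighted-efficiency formula. With 3600 attacks the per-pick rescans are usually the bottleneck, so this mostly buys speed, which in turn lets you afford quality tricks like a few restarts with randomized tie-breaking, keeping whichever run yields the fewest chains or lowest total time.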
According to the article below, the US government wants us to stop using C/C++.
What do you think?
Are they still worth learning though for a solid foundation in systems programming?
https://www.theregister.com/2024/11/08/the_us_government_wants_developers/
Is there a way, in Python, to cohesively extract an image from a PDF? Sometimes a PDF has an image that is actually formed from many cutouts, and these cutouts get extracted individually instead of as one whole image, which is my problem.
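Assuming you can get each cutout's bounding box from your PDF library (PyMuPDF's `page.get_image_rects` is one way, though treat that name as something to verify), one approach is to merge touching or overlapping boxes into regions, then render or crop each region as a single image. A library-agnostic sketch of just the merging step, with rectangles as (x0, y0, x1, y1) tuples:

```python
def overlaps(a, b, tol=1.0):
    # True if rects a and b touch or overlap (within `tol` points)
    return not (a[2] + tol < b[0] or b[2] + tol < a[0] or
                a[3] + tol < b[1] or b[3] + tol < a[1])

def union(a, b):
    # smallest rect containing both a and b
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))

def merge_cutouts(rects, tol=1.0):
    """Greedily merge touching rectangles until nothing more merges."""
    rects = list(rects)
    merged = True
    while merged:
        merged = False
        out = []
        while rects:
            r = rects.pop()
            for i, o in enumerate(out):
                if overlaps(r, o, tol):
                    out[i] = union(r, o)
                    merged = True
                    break
            else:
                out.append(r)
        rects = out
    return rects
```

Each merged region can then be rendered in one shot (in PyMuPDF, roughly `page.get_pixmap(clip=region)`), which sidesteps reassembling the cutout tiles yourself.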
So in the old days a variable declaration would put the type before the name, as in the C family:
int num = 29;
But recently I've noticed a trend among modern programming languages to put the type after the name, as in Zig:
var num : i32 = 29;
This also appears in Swift, Rust, Odin, Jai, Go, TypeScript, and Kotlin, to name a few.
This is a bit baffling to me because the older syntax style seems to be clearly better:
The old syntax is less verbose: the new style requires you to type "var" or "let", which isn't necessary in the old syntax.
The new style encourages the use of "auto". The languages in the new camp let you write var num = GetCalc(); and the type will be deduced. There is nothing wrong with type deduction per se, but in this example it clearly makes the code less clear: I now have to dive into GetCalc() to see what type num is. It's always better to be explicit in your code; this was one of the main motivations behind TypeScript. The old style encourages an explicit type, but allows auto if it's necessary.
The old style is more readable because variable declaration and assignment are ordered the same way. Suppose you have a long type name and declare a variable: MyVeryLongClassNameForMyProgram value = kDefaultValue;, then later do value = kSpecialValue;. It's easy to see that value is kDefaultValue to start with, but then gets assigned kSpecialValue. Using the new style it's var value : MyVeryLongClassNameForMyProgram = kDefaultValue; then value = kSpecialValue;. The declaration is less readable because the key thing, the variable name, is buried in the middle of the expression.
I will grant that TypeScript makes sense, since it's based on JavaScript, so they didn't have a choice. But am I the only one annoyed by this trend in new programming languages? It's mostly a small issue, but it has never made sense to me.
I want to code on my phone when I have time, so which languages are most suitable for a mobile version of VS Code or a terminal, for making some code changes, etc.?
Should I go with an interpreted or a compiled language?
This is part of the text btw:
ê„â âCX ë âKX ë á ”åâ0å á3ÿ/áTOâ â8X ë â@X ë á ”åâ0å á3ÿ/á ”åå á1ÿ/á Uã8 ‘1ÿ/DÐâð€½è÷r br Buffer is NULL -ép@-éxŸå†ßMâ P ã ãàâØY ë(å Pã ÞV ë @ áÜV ë åŠ/â0å áå3ÿ/á PãG P ã(å å0 ‘å ã2ÿ/áÍåÿ ã (å å0 ‘å ã2ÿ/áÍå(å ã å0‘åâ3ÿ/á @°á+
ÌŸå€/ ãà â®Y ëŒâ åµV ë ` á³V ë åHå á1ÿ/á á ââ—W ë âçW ëâ €à,å á§W ë ã å âºOâ¿W ë âÛW ë âÙW ë á ”åâ0å á3ÿ/á ”åå á1ÿ/á Uã( ‘1ÿ/†ßâp ½èðäzq åp Trace: tick=%10d -éðA-éxŸåŠßMâ ãà„âpY ëyV ë @ á åHå á1ÿ/á P á ”å‰/â0å á¸å3ÿ/á$å å0 ‘å ã2ÿ/áÍåÿ ã $å å0 ‘å ã2ÿ/áÍå$å ã å0‘å„â3ÿ/á @°á/
áú ãÇY ëp á áá ãÃY ë` á< ãÀY ë€ á á< ã¼Y ë ` ᜠŸå á¸Y ë á â0 áˆâpå €å*W ë‘â å âxW ëâ €à@å â8W ë ã å âoW ë á ”åâ0å á3ÿ/á ”åå á1ÿ/á$å å‘å1ÿ/áŠßâð½èðäíq €î6 %02d:%02d:%02d:%03d-- ðM-ép áHŸåŠßMâ € á ãà„âY ëV ë @ áV ë å‰/â0å áTå3ÿ/á$å å0 ‘å ã2ÿ/áÍåÿ ã $å å0 ‘å ã2ÿ/áÍå$å ã å0‘å„â3ÿ/á °á'
¦ Wã@â° á¦p ƒ` ã Wã Ú Wá` ± P ã ê Øç„â
There IS plain text included: it's a smart watch game, and parts of the other files are Chinese.
This is the smart watch:
So I am a BSc (Hons.) maths grad. I have always had an interest in programming and was good at Python and SQL. Now I am learning the MERN stack and trying for internships, but I'm hardly getting any response, while most of my peers and seniors from college are learning data analysis or working as DA interns. This makes me question my choice: coming from a maths background, should I try DA instead of development? I have heard on the internet that people with a few YOE as data analysts earn as much as developers (20-30 LPA). Which should I go for to get employed ASAP (within 5 to 6 months) and learn on the job?
I saw an Instagram reel showcasing a code editor with animation and sound effects in an anime or fighting-game style (like Dragon Ball fights). Each letter I typed triggered effects and sounds, and even the cursor movements had these effects, making it feel like a battle scene. The caption mentioned something like 'code editor like Dragon Ball fight,' with a GitHub download link, but I lost the video. Does anyone know where I can find this editor or something similar to this?
Here's a clip: https://www.reddit.com/user/Introscopia/comments/1gnb6wv/3d_rotation_hell/
Basically, I'm spinning the dice. Whichever is the forward-most face is what you rolled, right? But I then want to snap that face to be exactly facing forward. This code seems to work about half the time.
D is the dice struct. T is its transform, expressed as a basis triplet.
// grab the resulting face normal in world-space
D->snap_normal = xform_v3d( D->T, D->mesh->tris[v].normal );
v3d_normalize( &(D->snap_normal) );
...
// dice has stopped rolling
if( v3d_dot( D->rotvel, D->rotvel ) < 0.00001 ){
    D->state = 2;
    static const vec3d FWD = (vec3d){ 0, 0, -1 };
    D->snap_axis = v3d_cross( D->snap_normal, FWD );
    v3d_normalize( &(D->snap_axis) );
    double sign = 1; //( v3d_dot(D->snap_axis, D->snap_normal) > 0 )? 1.0 : -1.0;
    D->snap_total_angle = sign * acos( v3d_dot(D->snap_normal, FWD) );
    D->snap_T_origin = D->T;
    D->snap_timer = 0;
}
...
if( D->state == 2 ){ // SNAPPING
    float t = smooth_step( D->snap_timer / 32.0 );
    double step = D->snap_total_angle * t;
    D->T = D->snap_T_origin;
    rodrigues_rotate_Transform( &(D->T), D->snap_axis, step );
    D->snap_timer += 1;
    if ( D->snap_timer >= 32 ){
        D->state = 0;
        D->T = D->snap_T_origin;
        rodrigues_rotate_Transform( &(D->T), D->snap_axis, D->snap_total_angle );
    }
}
When I left off, I was thinking that maybe the trick would be to flip the sign of the angle, because I noticed that setting it negative yielded basically the same results. But I can't figure out what the test is for deciding the direction of rotation... something something right-hand rule?
Any insight is appreciated. Oh and is this the best sub to be asking this? I feel like it's almost more of a math question, but I feel like a math sub won't like to see a big wall of code...
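For what it's worth, here is a quick numerical sanity check of the convention (in Python, independent of the C code above): rotating a vector a about axis = cross(a, b) by +acos(dot(a, b)) lands a exactly on b under the right-hand rule, so no sign flip should be needed unless rodrigues_rotate_Transform or v3d_cross uses the opposite convention. Also worth guarding: when snap_normal is already (anti)parallel to FWD, the cross product is near zero and normalizing it yields garbage, which could explain intermittent failures.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def rodrigues(v, k, theta):
    # rotate v around unit axis k by theta (right-hand rule)
    c, s = math.cos(theta), math.sin(theta)
    kxv = cross(k, v)
    kdv = dot(k, v)
    return tuple(v[i]*c + kxv[i]*s + k[i]*kdv*(1 - c) for i in range(3))

FWD = (0.0, 0.0, -1.0)
n = normalize((0.3, -0.5, 0.81))                    # some face normal
axis = normalize(cross(n, FWD))
angle = math.acos(max(-1.0, min(1.0, dot(n, FWD))))  # clamp for safety
snapped = rodrigues(n, axis, angle)                  # should equal FWD
```

If this invariant holds in your C code too (same cross-product handedness, same Rodrigues sign), the remaining failures are most likely the degenerate near-parallel case rather than a direction problem.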
So... I was wondering if I can sell my Instagram bot script, which automatically makes memes and uploads them as reels to multiple accounts. Where can I sell it, or how can I find buyers?
Can anyone explain where I am missing the logic for finding the pivot in a sorted-then-rotated array in the function below?
static int pivot(int[] arr){
    int start = 0, end = arr.length - 1;
    while (start < end){
        int mid = start + (end - start) / 2;
        if (arr[mid] <= arr[start]) {
            end = mid - 1;
        } else {
            start = mid;
        }
    }
    return start; // return end or return mid?
}
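Not a drop-in fix, but for comparison: one common problem with comparing arr[mid] to arr[start] is that when the (sub)array is not rotated at all, arr[mid] > arr[start] keeps pushing start to the right, away from the answer. The usual formulation compares mid against end instead, with moves (start = mid + 1 / end = mid) that always shrink the range. Sketched here in Python, finding the index of the smallest element, i.e. the rotation point (adjust if your "pivot" means the largest element):

```python
def rotation_index(arr):
    """Index of the smallest element of a sorted-then-rotated array
    (assumes no duplicate values)."""
    start, end = 0, len(arr) - 1
    while start < end:
        mid = start + (end - start) // 2
        if arr[mid] > arr[end]:     # smallest element is strictly right of mid
            start = mid + 1
        else:                       # mid could itself be the smallest
            end = mid
    return start
```

Because start = mid + 1 and end = mid both strictly shrink the interval, the loop always terminates, and comparing against arr[end] works whether or not the array is actually rotated.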
Suppose a database architecture that uses surrogate keys, so each entity has a unique ID generated by a database sequence, regardless of the business uniqueness of specific field(s). Suppose your code works with a persistence framework and its associated objects, but the business logic manipulates another set of objects, so there is a mapping between the two models every time you fetch or persist objects.
Would you include the persistence ID (the PK) in the business models? For the purpose of facilitating fetching other nested (or parent) objects, for example; otherwise you'd have to join tables every time your business logic needs to fetch related objects.
So the title is pretty much self-explanatory. I've had some experience with programming before (mainly Python and Java), but not really with frontend/backend/web development. I know the math and logic behind code (some of it) and have a fair understanding of algorithms and even automata. Is it really possible for me to become a software engineer/developer? My brother (younger than me) studied computer engineering (didn't finish), but he got interested in programming outside school and now works as a software developer; he has inspired and encouraged me to learn more coding skills and apply for jobs. Any recommendations? It seems fun to me and I think I can make a career out of it. Also, I'll admit, there's good money in it, and that's kind of another reason I want to become a developer.
I want to stress test some "black box" functions that I have Python access to. I have a set of inputs and outputs that are considered successes; any other input combination that produces those same outputs would be considered a bug.
I am having difficulty coming up with a way of making a test suite that doesn't require a custom-programmed approach for each function. Ideally I would use wxPython to select the data, then set the pass criteria and have it auto-generate the fuzz.
Alternatively, I was thinking of just having every input as an input to the fuzzer and simply returning all results that cause a selected output state.
Is there already a system that can do this or a testing library that I should be looking at?
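Absent a ready-made tool, the generic core is small enough to sketch. Below is a minimal random fuzz harness under the assumption that each input can be described by a simple integer range spec; `fuzz` and `input_specs` are made-up names for this sketch, not from any library. For real work, look at hypothesis (property-based testing, with strategies playing the role of the specs) and atheris (coverage-guided fuzzing for Python).

```python
import random

def fuzz(func, input_specs, target_outputs, n_random=1000, seed=0):
    """Black-box fuzz `func`: try random input combinations and collect
    every argument tuple whose output lands in `target_outputs`.

    input_specs: one (low, high) integer range per argument. This is an
    assumption for the sketch; real specs could be enums, floats, strings...
    """
    rng = random.Random(seed)
    hits = []
    for _ in range(n_random):
        args = tuple(rng.randint(lo, hi) for lo, hi in input_specs)
        if func(*args) in target_outputs:
            hits.append(args)
    return hits
```

Because the harness only needs a callable plus a per-argument spec, a wxPython front end could collect the specs and the target output states and hand them straight to this function, with no per-function custom code.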
Hi, I am new to API development. I'd like to know about version management in an API; I mean, what are the best practices?
Let's say my current version is V1 and I have a class, CheckListController (which has 15+ methods).
In the next version, V2, I just update an existing method and add a new method.
So, should I copy the whole code from V1/CheckListController to V2/CheckListV2Controller, or extend the existing controller?
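For the narrow case described (V2 changes one method and adds one), extending is the usual answer; copying the whole controller trades duplication for isolation, since a copied V2 can never be broken by a later V1 change. A framework-agnostic sketch of the inheritance approach, with class and method names made up for illustration:

```python
class ChecklistControllerV1:
    """V1: the original controller (stand-in for your 15+ methods)."""

    def list_items(self):
        return ["item-a", "item-b"]

    def count_items(self):
        return len(self.list_items())


class ChecklistControllerV2(ChecklistControllerV1):
    """V2 overrides only the method that changed and adds the new one;
    every other method is inherited unchanged from V1."""

    def list_items(self):               # updated behavior in V2
        return ["item-a", "item-b", "item-c"]

    def summary(self):                  # brand-new method in V2
        return {"count": self.count_items()}
```

Routing-wise, most frameworks let you map /v1/... to the V1 class and /v2/... to the V2 class. Many teams inherit while the versions are close and switch to a full copy once they diverge enough that shared code becomes a liability.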
https://github.com/Dijji/XstReader
I read in this repo that the DOJ forced Microsoft to release the .PST file specification in 2010. Without that, the world might never have seen under the hood of an Outlook .PST file.
But since they had to release it, we have a ~198-page specification of the file format.
The format doesn't use, nor was it built to use, a more open serializable format like JSON; instead, the spec is a massive instruction manual for what appears to be parsing raw bytes in a C-style language, with pages and pages of domain-specific cryptic abbreviations like dwPropertyId and dwCRC. Here's a snippet:
2.4.8.1 Search Update Descriptor (SUD); The SUD represents a single unit of change that can have an effect on any of the search objects. When a change is made to the contents of a PST (add, modification, removal, and so on), the modifier is responsible to create a SUD that describes the change and queue it into the Search Management Queue (SMQ).
It is a large mix of domain-specific lingo and lingo related to design choices of implementation. Like an SMQ only exists in the context of the design of the PST format.
There must have been years of work, committee meetings, conference calls, developer hours spent deciding how to design this C lang structure. Like you could retire from Microsoft having spent two decades perfecting this and justifying your massive salary to implement the PST format.
My question is: I just don't see the future of programming being invested in designing these massive C-structure specifications. Maybe Microsoft indeed has a huge pool of legacy C devs employed who are just waiting to punch out for the last time. Maybe there are active hirings. And I'm sure there are passionate people looking to convert this kind of stuff to Rust.
Am I crazy or is this a fair understanding of the state of programming in 2024?
Edit: Here's the specification: https://msopenspecs.azureedge.net/files/MS-PST/%5bMS-PST%5d.pdf
I am trying to create a page in the PostScript language that prints random character strings.
I cannot find a method in any of the PS books that converts numbers to their character codes.
for example 26 5 3 10 >> "ZECJ"
I have searched the official Red, Blue, and Green books for this without success.
Does anyone here know where I ought to be looking?
Is there a simple code segment I have overlooked?
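If the goal is just mapping 1→A through 26→Z, that's character code 64 + n in ASCII. PostScript has no direct chr operator; the usual idiom, as I recall it (verify against the Red Book's string operators), is to build a one-character string with put, e.g. `1 string dup 0 90 put` leaves the string (Z) on the stack. The mapping itself, sketched in Python for clarity:

```python
def codes_to_string(codes):
    """Map 1..26 to 'A'..'Z': the ASCII code of the letter is 64 + n."""
    return "".join(chr(64 + n) for n in codes)

print(codes_to_string([26, 5, 3, 10]))  # ZECJ
```

In PostScript terms, the 64 + n arithmetic is just `64 add`, followed by the string-put idiom above inside a loop that assembles the output string.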
Note: my background is engineering, not software, and I am supporting a software tool. My company has thousands of data sheets, with a current data collection process of handwriting the data and then typing it into our database. Instead, we would like to digitize data collection so that data is recorded and sent off in the same process using an interactive GUI.
Before I joined the company, a contracting engineer started a project for this but was let go before it was finished. I can't find any source code from the project, but someone has a screenshot of what the GUI tool looked like: an interactive Word document formatted identically to one of our data sheets, including editable text and checkboxes over the data tables, or blanks where data would be written in.
I am currently trying to use docx libraries in Python to extract all the text and formatting, then output it to the GUI. Am I heading toward a proper solution? Any advice on how to go about reverse engineering such a tool would be helpful.
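Since .docx is just a zip archive of XML, you can prototype the extraction with only the standard library before committing to a full library (python-docx is the usual choice and exposes paragraphs and tables directly). A minimal sketch of pulling paragraph text straight out of the package - a starting point, not a full solution, since tables, checkboxes, and form fields live in other parts of the WordprocessingML:

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def extract_docx_text(path_or_file):
    """Return the text of each paragraph in a .docx file."""
    with zipfile.ZipFile(path_or_file) as z:
        xml = z.read("word/document.xml")
    root = ET.fromstring(xml)
    paragraphs = []
    for p in root.iter(W + "p"):                    # each w:p is a paragraph
        runs = [t.text or "" for t in p.iter(W + "t")]  # w:t holds the text
        paragraphs.append("".join(runs))
    return paragraphs
```

This also doubles as a way to inspect the raw XML of your data-sheet template, which is useful for reverse engineering where the checkboxes and fill-in blanks actually live.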
I will try to keep this short. I am currently studying computer science, or something very similar, in Germany. And I can't take it anymore. It is way more difficult than I imagined. I had Java basics in my first semester, and it was actually fun; I liked it. But right now I have Kotlin/Android Studio and Python at the same time, and it is extremely frustrating. I don't understand it anymore. I can't imagine how people get good at this. My teacher gives us the next exercises to do, and for days afterward the only thing I do is read through every piece of documentation about the language that I can find. I want to program, not read 10 books a day 🥲
Hi y'all, I'm actually a data scientist, so programming is not my background. Sorry if this is a dumb question - cut me some slack 😳
Anyway - how would I write a unit test for a function that's supposed to return an output of 50 dictionaries (based on some condition where I pick out the top 50 scores, where "score" is one of the keys in the dictionary and its value is an integer)?
So example of what a dictionary looks like
{ "field1": "string value", "field2": "string value", "score": 97 }
This function is supposed to take in about 100k records (one record = one dictionary) and spit out the top 50 scores.
I don’t have a mock database to work with.
How do I write a unit test for this kind of task? My understanding is that you hardcode some inputs, like edge cases, and then tell the test what the output should be in each case (something with assert; I'll have to look at the docs again), but how do I write a test for a function that returns an output of the top 50 records?
This is in Python (if that matters at all)
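No mock database is needed: the unit test can generate its input in memory. And since hardcoding the 50 expected output dictionaries would be painful, you can assert properties of the result instead (right length, and no excluded record outscores an included one). A sketch assuming records are plain dicts as in your example; the function and field names here are illustrative:

```python
import heapq
import random

def top_50(records):
    """Return the 50 records with the highest 'score'."""
    return heapq.nlargest(50, records, key=lambda r: r["score"])

def test_top_50_with_generated_records():
    # No database needed: fabricate records in memory with a fixed seed
    rng = random.Random(42)
    records = [{"field1": f"name-{i}", "score": rng.randint(0, 1000)}
               for i in range(1000)]
    result = top_50(records)
    assert len(result) == 50
    # property check: every excluded score is <= the lowest included score
    cutoff = min(r["score"] for r in result)
    excluded = [r for r in records if r not in result]
    assert all(r["score"] <= cutoff for r in excluded)
```

Alongside the property test you'd add a couple of tiny hardcoded cases (fewer than 50 records, ties at the cutoff, an empty list) where the exact expected output is easy to write down.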
Hey guys, does anyone know if there is an affordable API for getting the price history of an Amazon product? It doesn't need super high rate limits.
So far the ones I found are Keepa and an Amazon historical price API on RapidAPI. Both look kinda dubious; Keepa's docs are not even accessible. Wondering if there are other alternatives out there?
Long-time Vim user here... thinking about taking the plunge into Neovim. For those who’ve made the switch, is it genuinely worth the hype? Are the features, plugins, and (dare I say) quality of life improvements enough to lure me over, or should I stay loyal to good ol' Vim?
I have a bachelor's in human-computer interaction and worked 2 years as a UX designer and tester. But I want to learn to create websites (especially with WordPress) and try to learn full-stack web development. Which skills do you think are needed in 2024? I know HTML, CSS, and JavaScript are the fundamentals, but that doesn't help me with the backend and other jobs. I live in Germany, if that's important to know.
Thank you in advance for your help.
The question in the title might be vague, so I will immediately explain my dilemma. I have some C++ knowledge and a computer science degree, but I chose not to continue in this domain. Now I have plenty of time, and I'm interested in blockchain, so I want to learn Rust/Go (I've already started learning Rust). I've seen this debated on a local forum, where most of the programmers (all from my country) said that Rust is shit because it's almost impossible to find a job in our country, and that it's better to learn Java or Python. I don't have financial problems, and I have enough time and motivation, and I think they have a narrow view of this subject, because I could choose to work remotely; that's a main point of being a programmer. As far as I can see, there are still plenty of Rust job opportunities, and I believe the language will not die soon. Any opinions and suggestions? Should I consider learning something else first? Thanks.
I want to start a weekly game night in our Python learning center, but I can't find any games that are easy to join and set up. I am looking for something that can run in a browser, lets the admin invite more people, and specifically allows adding Python tasks. Any recommendations would be appreciated.
Hey everyone,
I’m working on a project to create a sandbox that can run files in a contained environment and monitor behaviors like file modifications, network calls, and memory access. The idea is to capture these behaviors and save them in a "blueprint" data structure for later analysis.
Here’s what I’m trying to achieve:
Run files safely within a sandbox to keep the host system secure.
Track file, network, and memory behaviors.
Save the observed behaviors in a compact data structure, acting as a fingerprint for each process.
Main challenges:
- How to ensure containment so the executable doesn’t affect the host system.
- How to structure the blueprint data in a way that’s both detailed and efficient.
- Choosing between Go or C++ for a low-level, efficient approach.
If you’ve worked on something similar or have any resources or tips, I’d love to hear from you! Thanks!
Hi all, this is NOT a request to write any code for me or solve this problem - I'm just trying to understand how I'm supposed to go about completing this take-home assessment, since I am not familiar with writing formal tests for my code. Also, this is all in Python, as many of you probably guessed given the "data science" in the title.
This might be a very dumb question, but I was given this code assessment for a data science role, and it seems like they're focusing more on code organization and unit testing (which hasn't been the primary focus of my career). The assignment came without any mock/seed data or fake records - just the instructions for what the code/functions should do and what the output looks like, with a focus on the unit tests, TDD structure, etc.
Anyway, they say these functions would take an input of about 100k records inside a JSON file: just an array of 100k dictionaries, where each dictionary is a record for one person, with about 3 key-value pairs. Below is what the JSON file would look like; I added one person's record, but supposedly the full data set has 100k records, one per person:
[
  {"first name": "Jack", "last name": "Smith", "career": [{"company": "Microsoft", "dates": {..}}, {"company": "Apple", "dates": {..}}]},
  {another person}, {another person},
  .....99k more records in the array
]
So the instructions state not to use a database or persistence engine - does that mean I shouldn't create a mock dataset of records to test my code on?
It says to use pytest and a testing package, etc.
Anyhow, one of the first tasks says to write a function that takes this JSON file as input and spits out pairs of people who worked at the same place during the same dates. I've seen unit tests before and have a general idea of how to write them for simple functions that take, say, one integer as input, but how does testing work when the input is a giant file of 100k records? Writing a test with that input when I don't have an actual file of 100k records doesn't make sense to me, but again, I'm not really a coder, so I don't know how this could work. I've seen some blogs about MagicMock or parametrize, something like that, but I still have no idea how those would create a mock input of 100k records.
Am I super stupid or unknowledgeable, or how would a unit test work here? I'm just looking for a general explanation of how a test would work under the hood, creating all these records to test on and spitting out some outcome. Would I be writing some script to tell the test how to create this JSON object and all the dictionaries inside of it (each dictionary = one record = one person)?
EDIT-TO-ADD:
One of the tasks is to write a function that spits out the top 50 pairs of records who worked together the longest (with overlapping dates at the same company). Wouldn't the input for the unit test have to be at least 50+ records, since they want at least that many in the output? Am I just confusing myself?
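To address the core confusion: the test itself (or a fixture/helper it calls) fabricates the input. You write a small generator that builds however many synthetic records you need (50+, 1000, 100k, whatever the case requires), dumps them to a JSON file, and feeds that file to the function under test. A sketch, where the field names are guesses at the assignment's schema, not the real spec:

```python
import json
import random
import tempfile

def make_fake_records(n, seed=0):
    """Synthesize n person records shaped like the assignment's input.
    Field names are guesses at the schema, not the real spec."""
    rng = random.Random(seed)
    companies = ["Microsoft", "Apple", "Acme"]
    return [
        {
            "first_name": f"fn{i}",
            "last_name": f"ln{i}",
            "career": [
                {"company": rng.choice(companies),
                 "dates": {"start": 2000 + rng.randint(0, 5),
                           "end": 2006 + rng.randint(0, 5)}}
            ],
        }
        for i in range(n)
    ]

def test_loader_round_trip():
    # the test builds its own JSON file: no database, no real data needed
    records = make_fake_records(1000)
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        json.dump(records, f)
        path = f.name
    with open(path) as f:
        loaded = json.load(f)
    assert len(loaded) == 1000
    assert loaded[0]["career"][0]["company"] in {"Microsoft", "Apple", "Acme"}
```

In pytest you'd typically write the file into the built-in tmp_path fixture instead of tempfile, and you'd also hand-craft a few tiny fixed inputs (e.g. three people with known overlapping dates) where you can state the exact expected pairs, alongside larger generated inputs checked by properties.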
When I interview at companies I always ask them about their balance/priorities between delivering features vs addressing technical debt.
I get a range of answers.
And when I've joined these companies, it becomes clear, pretty quickly, that almost no priority is given to addressing tech-debt. Even when they claim otherwise during the interviews.
I even confronted one of my managers about this, and his response was basically, "You're always welcome to address tech debt when the team has met all our sprint commitments." I responded with something like, "We have a policy of ambitious sprint goals, so we're expected to NOT finish everything we commit to each sprint." I forget what he said, but it was basically a smug "Yep!"