/r/algorithms

Photograph via snooOG

Computer Science for Computer Scientists

Computer Science for Computer Scientists

Smokey says: boycott all products and services from eco-unfriendly businesses to fight climate change! [see more tips]

Note: this subreddit is not for homework advice. Requests for assistance with coursework may be removed.

Other subreddits you may like:

Does this sidebar need an addition or correction? Tell me here

/r/algorithms

106,053 Subscribers

0

Time Complexity Analysis of LU Decomposition Variants

I understand that the time complexity of LU decomposition is typically 2/3 * n^3. I have a couple of questions regarding LU decomposition with and without pivoting:

  1. Is it true that the time complexity for LU decomposition with pivoting is the same as without pivoting, assuming we skip the pivot search and row reordering steps?

  2. If we use another algorithm that sequentially performs LU decomposition with pivoting and without pivoting, what would the overall time complexity be? Would it still be 2/3 * n^3 for each, or would it sum up to 4/3 * n^3?

Looking for some clarification on these points. Thanks in advance!

0 Comments
2024/07/11
13:34 UTC

0

NLP: What kind of model should I be looking for?

I have a tree of questions that are going to be asked to a client and a tree of answers the client may answer attached to it. I want to use NLP to convert what the client said to one of the pre-written simple answers on my tree. I've been looking and trying different models like Sentence Tranformers and BERT but they haven't been very accurate with my examples.

The pre-written answers are very simplistic. Say, for example, a question is "what's your favorite primary color?" and the answers are red, yellow, and blue. The user should be able to say something like "oh that's hard to answer, I guess I'll go with blue" and the model should have a high score for blue. This is a basic example so assume the pre-written answer isn't always word for word in the user answer.

The best solution may just be pre processing the answer to be shorter but I'm not sure if theres an easier work around. Let me know if theres a good model I can use that will give me a high score for my situation.

0 Comments
2024/07/11
19:22 UTC

5

Matrix permutation counter. How come it doesn't work for large matrices?

I tried this coding problem the other day. You are given a matrix (An array of int arrays). And you are told to count the different number of permutations you can put that matrix in just by switching the various values that exist within it.

[
  [1, 2, 3],
  [3, 5, 8]
]

So classic permutation. But there is one factor that makes this more difficult. The matrix can contain duplicate values. In the case above, that value is 3. So it's possible that you can create a certain new permutation however because there are duplicate values, the literal values have already been in that same configuration in a previous iteration. And you cannot count that as a valid permutation.

So this was my solution. You first turn your matrix into a 1-dimensional array of integers. Just go through all the values and put them in a single 1D array.

Then you go through that 1D array and keep track of any value that appears more than once in the array. Put these values in another array called dups (duplicates). The point of this array is this: When you are iterating through the vals array, you need to know if your current value repeats in another part of the array.

Now write a recursive function that starts at level 0, and goes through each val. For each val, it will explore all other vals (except for the currently marked val). And it will keep repeating this and go deeper.

During each recursive call, it will check to see if it has encountered a value that is known to repeat itself. If it does encounter it, it wil create a note of it, so that if it encounters that value again, it knows it has to skip over it.

My solution works for most cases, however for really large matrices, it takes an extremely long time. When I tried it online, it timed out and the test failed.

If you try it out with this parameter in place of vals ( [0, 4, 0, 5, 4, 3, 3, 4, 0, 4, 4, 2, 5, 1, 0, 2, 3, 1, 0, 2] ), it doesn't work.

How come? How can my solution be improved?

let vals = getSingularArr(matrix);
let dups = getRepeatedInts(vals);

let index = 0; let count = 0; let marked = [];

populate(vals, index, marked, count, dups);

function populate(vals, index, marked, count, dups){

    //Base case
    if(index >= vals.length){
        count++
        return count;
    }

    //create a logbook for the current callstack
    var logBook = [];

    for(let x=0; x<vals.length; x++){

        //literal vals
        if(logBook.includes(vals[x])==false){

            //indexes only
            if(marked.includes(x)==false){

                //literal vals
                if(dups.includes(vals[x])==true){
                    logBook.push(vals[x]);
                }

                marked.push(x);
                count = populate(vals,index+1, marked, count, dups);
                marked.pop();
            }
        }
    }

    //remove the logbook when exiting out of the current callstack
    delete logBook;

    return count;

}
1 Comment
2024/07/10
17:09 UTC

4

Efficient algorithm for Hostel Room Allocation

I am creating a web app for allocation of hostel rooms to the students. I am done with the part of UI and basic backend of admin and students.

Now, I want an algorithm which can allocate rooms to all the students based on their preferences.

For simplicity, assume a student can give their preferences to at max 3 rooms, and all rooms capacity is 1. These variables are to be changed based on requirements of different hostels and blocks.

Note: Most students should get their preferences, and remaining should be alloted rooms randomly.

Can anyone please help me with this?

23 Comments
2024/07/10
18:31 UTC

1

Efficient Algorithm for Privatized search engine.

Hey guys, I am creating my own personal search engine, it operates via a CLI and then allows me to open websites in a browser.

I have a fairly large dataset of websites, and was wondering if there is an algorithm already that I can use to find keywords within the website that I am typing in.

For example, if I typed into my CLI `search recipe for brownie`

It would return like 10 different links to brownie recipes by checking keywords within the website.

5 Comments
2024/07/10
01:45 UTC

0

Challenges (DSA but not only) to practice Go

For the past 5+ years I am maintaining this project where I post challenges to be solved in Go. Each challenge has a test and a benchmark, so we can compare different submissions.

Feel free to browse around and see solutions or submit a new challenge / solve one.

https://github.com/plutov/practice-go

0 Comments
2024/07/09
09:53 UTC

1

Recursive vs Blocked Gaussian Elimination: Performance and Memory Impact on GPUs

Hi all,

I've been exploring Gaussian Elimination algorithms, particularly focusing on the recursive and blocked implementations. I'm interested in understanding how these methods compare in terms of performance and memory usage, especially in a GPU environment.

Here's a high-level overview of the two approaches:

Recursive Gaussian Elimination:

function recursive_factorize(matrix, size):
    if the size is as small as the threshold:
        factorize_small(matrix)
    else:
        split the matrix into left and right halves
        recursive_factorize(left_half)
        update_right_half(left_half, right_half)
        recursive_factorize(right_half)

Blocked Gaussian Elimination:

function blocked_factorize(matrix, size, block_size):
    for each block in the matrix:
        factorize_block(current_block)
        update_below(current_block)
        update_right(current_block)
        update_rest(current_block)

I'm looking for references, insights, and empirical data that could shed light on the following:

  1. How do you describe the concept of Recursive vs Blocked algorithm?
  2. How do the recursive and blocked Gaussian Elimination methods differ in performance when implemented on GPUs?
  3. What is the impact of each approach on memory usage and bandwidth?

Your expertise and experience with these algorithms, especially in a GPU context, would be highly appreciated!

0 Comments
2024/07/08
10:12 UTC

2

(osmnx) Find fastest order to visit nodes in a graph - traveling salesperson

Hello everyone, I am trying to build a fast approach to the traveling salesperson problem. Using networkx's or dwave-networkx's tsp function takes too much CPU and ram. Here's the context:

  • I have a graph (G) built from real-world data, which means there's a lot of nodes in the graph to process (that's why the built-in functions take so much time and resources)
  • I have an origin node and another set of nodes which need to be visited
  • The final destination is the origin node (it's a round trip)

Given that, I need to find the optimal order to visit those nodes in order to minimize the travel distance. The result should be something like an array with the nodes in order of visitation. Thus, no big computational tasks should be needed.

I have already tried implementing a greedy algorithm that didn't work and I asked crapGPT but got nonsensical answers... I would love any suggestions on algorithms I should try.

3 Comments
2024/07/07
21:44 UTC

0

What is the fastest way to get tree's structure from list of node!!!

Hi all!
I'm finding data structure to store tree's nodes which is the fastest for getting tree's structure!
If we just store node with value and it's children, we need make recursive
Is there anyway to store tree and don't need to make recurrency when get tree's structure??
Thanks all!

2 Comments
2024/07/07
11:19 UTC

0

I made a Computational String Art Generation Algorithm series.

Hello!

I had some free time before I started my new job last month and in that I started dabbling with this thing called computational string art. I thought I'd give it a try and succeeded in coming up with a method to make it. I thought I'd share my findings with everyone and so made a couple YouTube videos demonstrating that. Check it out, you may like it.

0 Comments
2024/07/07
10:43 UTC

0

Here is 3SAT algorithm I'm working on

Basically I used to filter 1 sat trough 2 sat into 3sat and it split into layers. It's possible to run multiple layers of 3Sat calculations while monitoring model performance I will leave a pastebin if you want to impliment the logic. https://pastebin.com/TL7YHkfD

0 Comments
2024/07/07
01:20 UTC

6

Microsoft OA - Topological Sort( I guess)

A company is planning N projects, numbered from 0 to N-1. Completing the K-th project will bring value V[K] to the company. For some projects there may be additional requirements - the L-th requirement states that before starting project B[L], project A[L] should be completed. There are M such requirements.The company has enough resources for at most two projects to be completed. If two projects are chosen, they will be completed one by one (in sequential order) and the total value they bring to the company is the sum of their individual values. What is the highest value that a valid choice of projects can bring to the company?

Write a function:
int solution(vector<int>& V, vector<int>& A, vector<int>& B){ }

that, given array V of N integers and two arrays A and B of M integers each, returns the maximum value that the company may gain by completing at most two possible projects.

Examples:

  1. Given V = [-3, 5, 7, 2, 3], A = [3, 1] and B = [2, 4], the function should return 9. This can be achieved by completing project 3 (with value 2) first and then project 2 (with value 7).
  2. Given V = [1, 1, 5], A = [0, 1] and B = [2, 2], the function should return 2.
  3. Given V = [5, 6, 6, 7, -10] and A = [0, 0, 0, 1, 2, 3] and B = [1, 2, 3, 3, 1, 2], the function should return 5. The project that are possible to be completed are 0 and 4. As project 4 would bring negative value to the company, it is optimal to do only project 0. The structure of dependencies of projects 1, 2 and 3 form a cycle, so none of them can be completed in a valid choice of projects.

Write an efficient algorithm for the following assumptions:

  • N is an integer within the range [1..100,000];
  • M is an integer within the range [0..100,000];
  • each element of array V is an integer within the range [-1,000,000,000..1,000,000,000];
  • each element of arrays A and B is an integer within the range [0..N-1];
  • a project may not require itself to be completed (A[K] != B[K]);
  • projects’ requirements do not repeat.

What i don't understand how Example 3 returns 5? If project 0(value 5 in V) were to be completed that means that i will have to complete firstly project 0 in A, that is prerequisite for B[0] = 1 which has other dependencies -cycle dependencies which means it cannot be completed. Also to complete project 4 with value -10 in V, i don't even know how to access that index in V through A and B -as i can see it only values 5,6,6,7 can be reached from A and B-inside of those arrays i have numbers from [0,3] that will access 5,6,6,7 but not the -10 because it's on index 4 and i don't have that number in A and B

4 Comments
2024/07/07
07:26 UTC

1

How to get straighter and shorter lines in A* algorithm?

I implemented A* algorithm in Rust (using bevy to draw the result) and it does properly find a path to the destination, but it keeps creating weird edge cases, where the path either is following the shortest possible route, but takes suboptimal turns (= looks ugly), or just outright isn't the shortest possible path to the destination.

Is there a way to fix that?

I can provide code, but it is long and possibly hard to read. I'd like to know, maybe there are some generic solutions to this.

Please bare with me, as I am new to algorithms and don't know all of the language ^^"

Upd: Visualisations and code in the comments

12 Comments
2024/07/06
09:00 UTC

0

Youtube algorithms

hey quick question,

I run 2 youtube accounts on the same email. Does it affect the algorithm?
Or are the accounts seen as seperated? Thanks for any answers

1 Comment
2024/07/05
23:32 UTC

1

Link Inference for Social Network Graph

Hello! Recently I have been noodling with Social Network Graphs and I was wondering how to go about filling in missing relationships in a graph. For simplicity, assume we are working in the English Language and only allow the following family relationships. The relationships in bold are primary relationships and distinct from higher order relationships.

  • Parent (Mother, Father)
  • Child (Son, Daughter)
  • Sibling (Brother, Sister)
  • Grandparent (Grandmother, Grandfather)
  • Grandchild (Grandson, Granddaughter)
  • Aunt, Uncle
  • Niece, Nephew
  • Cousin
  • Spouse (Husband, Wife)
  • In-law (Mother-in-law, Father-in-law, Sister-in-law, Brother-in-law, Daughter-in-law, Son-in-law)
  • Stepparent (Stepmother, Stepfather)
  • Stepchild (Stepson, Stepdaughter)
  • Half-sibling (Half-brother, Half-sister)

Now, let's say I have a graph. I would like to fill all missing relationships given the available information. For concreteness, consider the following situation:

V = {Man, Woman, Boy, Girl}

E = {{Man, Woman, Wife}, {Man, Boy, Son}, {Boy, Girl, Sister}}

G = (V, E)

I would like a function fillMissingRelationships(G) that adds the following edges:

{{Man, Girl, Daughter}, {Woman, Man, Husband}, {Woman, Boy, Son}, {Woman, Girl, Daughter}, {Girl, Man, Father}, {Girl, Woman, Mother}, {Girl, Boy, Brother}, {Boy, Man, Father}, {Boy, Woman, Mother}}

I imagine this gets much more cumbersome and difficult for large graphs. At this point, I have tried reasoning about the properties of my graph. I think these two points are helpful:

  1. We should only concern ourselves with primary relationships because other relationships are too vague. For instance, knowing two people are cousins does not offer much information.
  2. Relationships are at most 3-rd order. This is to say, if a named relationship between two people exists in this graph, it is at most 3 hops away via primary relationships.

I would like to come up with a reasonably clean and efficient algorithm for this task. My instinct is to start by omitting all non-primary relationships. Then, we use an All Pairs Shortest Path algorithm (e.g., Floyd–Warshall) to find the shortest paths between all people in the graph. From there, we use a lookup table mapping an ordered pair or triplet of primary relationships to one of the named relationships listed above. I am not sure if this would work though. And, even if it did work, I sense it would be a fair bit of work to actually implement this in practice.

With all that said, I am wondering if there are any other approaches to this problem, or, better yet, existing code that does this exact task.

0 Comments
2024/07/04
22:59 UTC

8

Better than O((n ** 2 - n) / 2)?

I wrote a program that simulates essentially air hockey pucks. To find collisions I have to compare the distance from every puck to every other puck. So every iteration with three pucks only needs to do three comparisons, but with 1,000 it takes 499,555. Is there a better way?

19 Comments
2024/07/05
16:06 UTC

2

Generating subpartitions of a rectangle into rectangles

I am working in python. I need an algorithm to generate a random subpartition of a w by h rectangle into n subrectangles. The subrectangles must have integer coordinates

16 Comments
2024/07/05
14:04 UTC

1

Looking for a way to efficiently cover a path

Hi internet,

I've been trying to solve this problem for a while. The goal is to completely cover a non-linear path (has loops and turns) with circles of fixed radius, and the center of the circles must be on the path as well. My current method either results in a lot of overlap between these circles or seemingly random gaps between them. I read about the greedy algorithm, but not too sure if that would work the best.

Any help would be appreciated, thanks!

2 Comments
2024/07/04
19:07 UTC

1

Trying to reduce a directed graph while maintaining structural properties

I have a large (~17k nodes) directed graph. I want to reduce it such that i maintain the overall structural properties. This is an rpc call graph so it has hot spots and relay nodes. Graph coarsening/reduction algorithms seems to work largely on undirected graphs only. Is there any directed algorithm to solve this? Do let me know if should provide any more information

0 Comments
2024/07/03
09:44 UTC

1

Can you calculate Fibonacci numbers using recursion in O(N) without memo

So my professor told me that any loop can be converted into recursion with the same runtime complexity. He also said that Fibo can be done without using any memo in O(N). I'm confused because i can't find any solution for it in O(N) without memo

2 Comments
2024/07/03
06:10 UTC

0

Need help creating an algorithm/code.

Hello, people of the internet so l'm Interning for this financial company, and so far they have me deleting a bunch of "households" on "MassMutual-> Advisor360"; that don't have any social security linked. The problem is there are a lot of households in their database(practice360)is their anyway for an algorithm could resolve my issue that could do it automatically for me?

4 Comments
2024/07/03
19:25 UTC

5

How to tell if it is impossible to construct a red black tree

assume

a tree containing {1,2,3,4,5,6,7,8,9,10,11,12,13} where

all even numbers are in a black node and all odd numbers in a red node.

Is there any way to prove such red black tree can't exist?

5 Comments
2024/07/03
17:43 UTC

3

Optimal substructure of the activity-selection problem

Hope this kind of post is allowed here. If not, I apologize.

I’m trying to understand a way of using dynamic programming to solve the activity selection problem. I understand that greedy algorithms work far better for this but this is a learning exercise.

The problem consists of choosing from a set of activities S={a1, a2,…, an}, where each element have a start and a finish time, the maximum number of elements where these activities “don’t overlap”. It starts with a list of activities sorted according to the finishing times.

The text I’m using gives a proof of optimal substructure for subproblems defined as Sij - the set of elements of S that begin after activity ai finishes and end before sj begins. Defining the optimal solution on Sij as Aij, the dynamic programming solution is, max (i<k<j) {|Sik| + |Skj| +1} if Sij is not empty and 0 otherwise. I read the proof and it makes sense but I’m confused as to how that helps finding an optimal solution to the an original problem. Example:

example

If i try to write, say, a function that takes the activities sorted by finishing times with parameters i and j, the formulation given before would exclude some of the activities in the beginning and the end, since a1 finishes and 4 and the first activity that begins after a1 is a4. This would exclude a few elements that belong to the maximal set of the solution to the original problem.

Am I getting something wrong? I this just a formulation that’s useful for analysis but clumsy for implementation?

Any help is much appreciated.

8 Comments
2024/07/02
23:39 UTC

2

How to Optimize Memory Usage for Python Library Implementing Discrete Fred Fréchet Algorithm?

Hello everyone,

I'm using a Python library that implements the discrete Fred Fréchet algorithm (Fred-Frechet) to measure similarity between 2 curves (specifically, I'm interested in the maximal distance between 2 curves). My curves are defined in CSV files, with each curve containing approximately 50,000 points. When I run the algorithm, it consumes about 9GB of RAM.

I have also tried using Dynamic Time Warping (DTW) for these curves, but it resulted in even higher memory usage.

Does anyone have ideas on how to optimize the memory usage for these large curves? Any suggestions or alternative approaches would be greatly appreciated.

Thank you!

1 Comment
2024/07/02
10:41 UTC

28

Judge my Python algorithm for finding the first 1,000,000 prime numbers

As the title says, I created a script in Python to find prime numbers, write them to a text file, and track the time it takes. Is this an adequate algo, or am I doing something nooby to increase the time complexity massively?

All feedback is appreciated.

import time
import math

def is_prime(n):
    if n < 2:
        return False
    for i in range(2, math.isqrt(int(n)) + 1):
        if n % i == 0:
            print(n, "is not prime")
            return False
    print(n, "is prime")
    return True

def write_primes(n):
    with open("primes.txt", "w") as file:
        start = time.time()
        file.write("2\n")
        count = 1
        number = 3
        while count < n:
            if is_prime(number):
                file.write(str(number) + "\n")
                count += 1
            number += 2
        end = time.time()
        file.write("Time taken: " + str(end - start) + " seconds")
        file.close()

write_primes(1000000) 
17 Comments
2024/07/01
18:24 UTC

14

why time complexity != total time taken?

i want example which explain this statement

11 Comments
2024/06/29
17:26 UTC

0

Resources for Design and Analysis of Algorithms

I'm about to start my 3rd semester and one of the course is algorithm design and analysis. I haven't prepare for it and my uni doesn't exactly give any other resources to study either. Do you have some good recommendations e.g. lecture vids, books, articles, courses,pdfs etc. to help?

The curriculum structure looks like this:

Design and Analysis of Algorithms

  • Asymptotic notations and their significance
  • RAM model of computation
  • Complexity analysis: worst case and average case
  • Algorithmic paradigms: divide and conquer, recursion, greedy, etc.

Data Structures and Algorithms

  • Searching:
    • Binary search trees, balanced binary search trees, AVL trees
    • Red-black trees, B-trees, skip lists
    • Hashing, Priority queues, heaps, Interval trees, tries
  • Order statistics
  • Sorting:
    • Comparison-based sorting: quick sort, heap sort, merge sort
    • Decision tree model and lower bound on comparison sorting
    • Sorting in linear time: radix sort, bucket sort, counting sort
  • String matching

Graph Algorithms

  • BFS, DFS
  • Connected components, topological sort
  • Minimum spanning trees
  • Shortest paths: single source and all pairs
  • Models of computation: RAM model and its logarithmic cost
  • Algorithmic paradigms: divide and conquer, recursion, dynamic programming, greedy, branch and bound
  • Advanced data structures: Fibonacci heap, union-find, splay trees
  • Amortized complexity analysis

Randomized Algorithms

  • Introduction before NP-completeness
  • Application areas:
    • Geometric algorithms: convex hulls, nearest neighbor, Voronoi diagrams
    • Algebraic and number theoretic algorithms: FFT, primality testing, etc.

Advanced Topics

  • Graph algorithms: network flows, matching
  • Optimization techniques: linear programming
  • NP-completeness:
    • Reducibility between problems
    • Discussion of different NP-complete problems
    • Examples: satisfiability, clique, vertex cover, independent set, Hamiltonian cycle, TSP, knapsack, set cover, bin packing
  • Backtracking, branch and bound
  • Approximation algorithms: Constant ratio approximation algorithms
4 Comments
2024/06/28
14:17 UTC

0

I'm looking for a grid navigation algorithm

Hello everyone, this is my first post.

I'm looking for the name of an algorithm that matches the following behavior:

I have an object moving freely on a grid (by freely, I mean the object can move in floating points, not just integers). The object is moving X distance towards a direction, let's say (x1.0, y0.5) (so up and to the right).

If the object "hits" a wall on the y-axis, it continues on the x-axis, and vice versa. If the object "hits" a wall on both the x and y axes, it stops.

some sort of route planning algorithm

I believe that the algorithm should receive a starting point, direction, and grid2D, and return the final position.

Here's a clearer illustration (I don't know how to upload an image with this post so link):
https://ibb.co/9nLsVCZ

Does this algorithm have a name? PS: I only need the names of possible algorithms. :)

7 Comments
2024/06/28
14:51 UTC

0

Reconciling the sums of two lists of numbers

Posted to /askmath as well.

A solution would be great, but I don’t mind doing the research if I knew where to look (and how mathematicians would describe this).

This may sound trivial because it involves only addition and subtraction, but I swear people in business spend a ridiculous amount of time on it.

I have a list (technicality a “bag”, I think) of numbers A, and another list of numbers B. These are all currency (integers if you like).

The sums of list A and B are not the same, and I need to find out exactly why. The purpose is to find system or process errors, whether in production or during testing. For example, list A is missing an entry of 764,289.60 that appears in list B, and at the same time list B is missing both entries of 27.99 from list A.

The lists might not be the same level of granularity, so list B could be a single entry of 234.56 while list A has a number that of entries ranging from -20.00 to +40.00

An ideal solution would also be able to “bucket” numbers into groups (e.g. June versus July expenses) and find solutions that identify June expenses mistakenly entered as July.

The solution that involves the fewest changes (like moving puzzle pieces) is probably the best. The number of entries in the lists will be low (maybe a few hundred, usually fewer) although the totals will run into millions or a few billion.

Having typed this much, I’m probably looking for an algorithm as a starting position.

Anyone have ideas? Thanks in advance!

0 Comments
2024/06/28
02:00 UTC

1

Analytically calculating variable freq sine wave?

My friend was asking about calculating a variable frequency sine wave, I assume analytically (i.e. retrieve the wave amp at any given time) but using something like:

sin(time * freq)

...will cause weird distortions as the freq varies as a function of time. The only thing I could think to do was iterate over time, like this:

f = f + sin(time * freq)

...which I imagine will still have the same issue if freq isn't continuous as the phase will jump around.

How does one calculate a variable frequency sine wave without the phase jumping around?

2 Comments
2024/06/27
21:19 UTC

Back To Top