/r/algorithms

Computer Science for Computer Scientists

Other subreddits you may like:

- /r/compsci
- /r/programming
- /r/coding
- /r/tinycode
- /r/programmingchallenges
- /r/dailyprogrammer
- /r/ProgrammingLanguages
- /r/compilers
- /r/kernel
- /r/osdev
- /r/systems
- /r/softwarearchitecture
- /r/softwareengineering
- /r/softwaredevelopment
- /r/datastructures
- /r/learnprogramming
- /r/AskComputerScience
- /r/csbooks
- /r/math
- /r/logic
- /r/crypto
- /r/cryptography
- /r/complexsystems
- /r/MachineLearning
- /r/datamining
- /r/quant
- /r/sysor
- /r/CScareerquestions


2

Hello everyone,

So I want to build a rent-a-car app, with a focus on the auto-scheduling part. To explain what I mean by auto-scheduling, let me give some background first.

- Say I have 10 cars, for example:
- 5 Audi Q5s of the same year/model = 1 type
- 3 Mercedes of the same class = 1 type
- 2 Toyotas (a Corolla and a RAV4) = 2 types

- So basically I have 4 types/categories of cars in total. I don't care whether an Audi Q5 is black or white; I treat them as having the same value.
- If a client agrees to get an Audi Q5, I take "any" of them: in practice, the one the auto-scheduler returns, according to the rules I'll explain now.

Now, how would prioritisation work, i.e. how do we differentiate between multiple cars of the same category? First I need the input: a from/to date range and a car category (e.g. Audi Q5). Rules:

- I take all previous rentals of all the Audi Q5s, plus the new one, and reorder them in the best possible way, packing cars from most booked to least booked. In other words, I want to fill up one car, then another.
- Why? Because I want to leave the other Audi Q5s with more free space for longer bookings.

- If there is no possible way to fit them all, then possibly remove one or more of the previous rentals and fit the new one if it is more valuable by price (each rental also has a price). Example: at the same daily price, two 3-day rentals may be less valuable than one 7-day rental, so I could swap the new one in for the two old ones.

I would repeat these steps (1-2) every time a new rental is added.
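A rough sketch of the packing rule in step 1 (the `Car` class and field names are just made up for illustration): among the cars that are free for the requested range, pick the one that is already most booked:

```python
from dataclasses import dataclass, field

@dataclass
class Car:
    bookings: list = field(default_factory=list)  # (start_day, end_day) pairs

    def free_for(self, start, end):
        # Free if the new range overlaps no existing booking.
        return all(end <= s or start >= e for s, e in self.bookings)

    def booked_days(self):
        return sum(e - s for s, e in self.bookings)

def assign(cars, start, end):
    # Best-fit: among cars free for [start, end), take the most-booked one,
    # leaving the emptier cars available for longer bookings.
    candidates = [c for c in cars if c.free_for(start, end)]
    if not candidates:
        return None  # step 2 (swapping out less valuable rentals) would go here
    best = max(candidates, key=Car.booked_days)
    best.bookings.append((start, end))
    return best
```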

To me this looks like a DP problem. I would like some guidance on whether this is the right way to approach it, because it seems to offer the best precision/time-cost ratio. I considered heuristics or a genetic algorithm; they would probably get the job done faster, but I think I might not get the best outcome.

If you came this far, thanks at least for reading. If you could confirm my thinking, offer suggestions, or share a paper with a better idea, that would be great. I just want to get to the best approach possible that is also time efficient.

6 Comments

2023/09/28

11:09 UTC

11:09 UTC

0

I made a previous post about this, https://www.reddit.com/r/algorithms/comments/14e7vfo/algorithm_to_remove_temperature_influence_on/

I'm going to try again since I didn't explain myself well.

I have a system which measures liquid volume. In this test the liquid volume is fixed, and hence the measurement should be constant. But the measurement has a temperature dependence, and so the value varies over temperature. Both the liquid and the measuring device have a temperature dependence.

In addition, the system also measures a fixed reference channel which physically cannot change, but also shows some correlation to temperature. The reference channel uses the same measuring device, so any temperature dependence in the measuring device will be compensated for, by using this reference channel.

For ease, I’ve put the whole system in our kitchen fridge and gathered data over many hours (mostly over night and weekends when the fridge stays shut).

Here you can see the fridge temperature (measured by a digital thermometer) and the reference channel:

https://i.imgur.com/HciOUvP.png

Considering that on the cooling cycle of the fridge, the air temperature changes quite rapidly and the thermal mass of the measured items result in some thermal lag, I think that’s a reasonable degree of correlation. As a first pass, I’m going to assume 100% correlation.

Here is the measurement channel and temperature:

https://i.imgur.com/QFskX5X.png

I’m wondering how to develop a formula to compensate the liquid signal so that a fixed volume of liquid, over a wide temperature range, gives a near-constant result (once the system reaches temperature). Something that “looks” like this: nearly a flat line (I’ve faked it by widening the y-axis range):

https://i.imgur.com/MfSqG7k.png

Even reducing the variation in the liquid signal would be good.

I would imagine this algorithm would use:

- the realtime reference measurement
- constants (calibration measurements of the temperature, reference channel, and signal channel taken at the same time)
- and probably the realtime temperature as well.

Initially I thought I could compensate without needing to measure the temperature, by using the reference channel, but on reflection, the reference channel can compensate the temperature effect on the measurement device, but not the temperature effect on the liquid.
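As a sketch of the kind of compensation I have in mind (function and variable names are made up): fit the calibration run with ordinary least squares against temperature, then subtract the fitted trend, re-referenced to a chosen temperature. The same idea could use the reference channel as the regressor instead:

```python
def fit_line(xs, ys):
    # Ordinary least squares for y = a*x + b.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return a, my - a * mx

def compensate(signal, temp, slope, ref_temp):
    # Remove the fitted temperature trend, re-referenced to ref_temp.
    return signal - slope * (temp - ref_temp)
```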

Thanks for reading.

0 Comments

2023/09/27

11:08 UTC

11:08 UTC

0

I have a knapsack-style problem: finding the maximum points with the available money, but you can only choose points from odd-even combinations (e.g. 1 4, or 2 3 6, etc.). How can I implement DP for this problem? I've tried backtracking and it's very slow.

0 Comments

2023/09/27

01:53 UTC

01:53 UTC

1

Hi, I was rejected from a job offer because I could not get the optimal solution for this problem:

Given a list of numbers, each number represents the number of days before an alien gives birth. An adult alien gives birth once every 7 days indefinitely. A baby alien reaches adulthood after two days. Compute the final number of aliens given an initial list of aliens and a number of days.

Here is a small example:

For the list [3, 2], and 5 days, this is how the list should evolve:

[3, 2]

[2, 1]

[1, 0]

[0, 6, 8]

[6, 5, 7, 8]

I got a naive solution by simulating each day for each alien and pushing baby aliens to the end of the resulting list, then counting the total elements of the resulting list. The problem is that this takes an exponential amount of time and space. The interviewers said the optimal solution is O(n), for n the number of elements in the list, but I simply can’t get a solution with that complexity. I thought of mathematically computing how many babies each alien will have, but I would also need to do this for each baby recursively, and that does not seem O(n) at all either. Do you have any clue?

Edit: reformatted example
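One standard trick for this kind of population problem (a common approach, not necessarily the interviewers' intended one) is to count how many aliens share each timer value: nine counters cover timers 0 through 8, so each simulated day is O(1) and the whole thing is O(n + days):

```python
def alien_count(timers, days):
    # counts[t] = number of aliens whose "days until birth" timer equals t.
    counts = [0] * 9
    for t in timers:
        counts[t] += 1
    for _ in range(days):
        births = counts[0]
        counts = counts[1:] + [0]  # everyone's timer drops by one day
        counts[6] += births        # parents reset: next birth in 7 days
        counts[8] += births        # babies: 2 days to adulthood + 7-day cycle
    return sum(counts)
```

On the post's example, `alien_count([3, 2], 5)` reproduces the four aliens in the final list `[6, 5, 7, 8]`.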

0 Comments

2023/09/27

18:07 UTC

18:07 UTC

0

Can anyone please give me a subtle hint on this curious puzzle: Given a string consisting of only parentheses, e.g. `(()(()))()`

we want to tell how many different subsequences with proper (valid) arrangement of parentheses are possible. E.g. `(), (()), ((())), ()(), ...`

etc.

Note, it is about subsequences rather than substrings. For example `(()()())`

contains subsequence `((()))`

despite no such substring.

I started to code this as a typical DP problem, but suddenly realized that a subsequence, say `()`, can be obtained in multiple ways, and we want to disregard duplicates. It doesn't look like I can simply put all results into a hash table, since the limit (e.g. 300 parentheses) means the answer can be a very large number.

(This problem was shown to me as "similar but simpler" to another one; regretfully, this simpler one doesn't enlighten me yet, so ideally I'd like another hint rather than the solution explained.)

4 Comments

2023/09/27

17:16 UTC

17:16 UTC

1

Hi, I have been trying to make a robot face that will smoothly track objects or people as they move around the room. I have a test setup of a camera on a pan-tilt rig driven by two servos (HWonder high torque 25kg/cm servos). The controller is a raspberry pi 4 and computer vision processing is done on a tensor processing unit.

The main issue I have been having is that if I want faster movement, the servos overshoot the subject and there is a lot of thrashing back and forth before it settles. If I slow down the movement it works OK, but it is still not very smooth.

I have tried three approaches. All of them start by calculating the difference between the center of the object and the center of the camera frame; I call this `error_x` or `error_y`. Using this value I compute the new angle the servo must move to in order to close the distance, and this keeps happening in a loop with the image processing. These are `new_angle_tilt` and `new_angle_pan`.

- I used a proportional fraction of the error to get an angle change, so the larger the distance to the center of the object, the larger the change in the servo's angle towards it, i.e.

`new_angle_pan = kit.servo[1].angle + error_x*.05`

This works well when I use a very small factor (i.e. `.05`), but that causes it to move very slowly and not smoothly. If I increase the factor, the camera overshoots the center too much and thrashes all over.

- I tried using a fixed incremental movement, for instance moving 2.5 degrees each cycle until the center is reached. Here `error_x` is used just to determine the direction.

`new_angle_pan = kit.servo[1].angle + sign(error_x)*2.5`

This works better than using a fraction of `error_x`, but it is super slow and doesn't look like realistic head movement.

- I tried creating a PID controller and trying various values for the constants. This didn't produce a good result at all, and I am not sure how to fine-tune it or whether it's even a good idea; I just know my 3D printer uses this method to avoid overshooting its temperature targets.

`pid_pan = PIDController(kp=0.05, ki=0.01, kd=0.01, max_output=5, min_output=-5)`

`new_angle_pan = kit.servo[1].angle + pid_pan.compute(0, error_x)`
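For reference, a minimal PID of the shape used above might look like this (the class and parameter names mirror my snippet; the implementation itself is just a sketch):

```python
class PIDController:
    def __init__(self, kp, ki, kd, max_output, min_output):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.max_output, self.min_output = max_output, min_output
        self.integral = 0.0
        self.prev_error = None

    def compute(self, setpoint, measurement):
        # Classic PID: proportional on the error, integral of past errors,
        # derivative of the error change, with output clamping.
        error = setpoint - measurement
        self.integral += error
        derivative = 0.0 if self.prev_error is None else error - self.prev_error
        self.prev_error = error
        out = self.kp * error + self.ki * self.integral + self.kd * derivative
        return max(self.min_output, min(self.max_output, out))
```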

I can share my full code if people need it, but I think I am looking more for a conceptual answer than for someone to troubleshoot my code as I know it works when it's moving slowly.

Another user suggested doing a `y=mx+b` calibration; I am going to look into this.

At this point I am wondering if I need to implement some type of reinforcement learning to teach the system how to move correctly, though that seems like a big time investment.

Or I give up on having the camera on the pan-tilt rig and instead keep the camera fixed, mapping distances across the screen to angles on the servo using something like `remap` in Python. Or maybe there is something I haven't thought of yet and this group can save me a ton of headaches lol

2 Comments

2023/09/26

01:36 UTC

01:36 UTC

1

```
for (int i = 0; i < N; ++i) {
    for (int j = 0; j < N; j += 1) {
        if (j >= 1000001) {
            i += ceil(N / 10);
            break;
        }
    }
}
```

When I try to analyze this algorithm, I think the time complexity should be O(N^2), because in the worst case the condition `j >= 1000001` is never reached and we loop through the whole inner loop. But my TA says it is O(N), which I don't quite understand. Could someone please explain the correct reasoning and help me out?

5 Comments

2023/09/25

16:42 UTC

16:42 UTC

1

I was wondering if anyone knows of an algorithm that produces Voronoi-type diagrams. I am currently using Fortune's sweep algorithm in my project for generating terrains. However, due to the size of my terrains (~1 million seed points) it takes quite a long time. It doesn't have to be exact, just something that looks similar and is pretty fast. Thanks!
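One fast approximate option (a standard technique, not from the post) is the jump flooding algorithm, which rasterizes an approximate Voronoi diagram on a grid by propagating nearest-seed candidates at halving offsets. A small CPU sketch:

```python
def jump_flood(width, height, seeds):
    # nearest[y][x] = index into seeds of the (approximately) closest seed.
    def dist2(i, x, y):
        sx, sy = seeds[i]
        return (sx - x) ** 2 + (sy - y) ** 2

    nearest = [[None] * width for _ in range(height)]
    for i, (sx, sy) in enumerate(seeds):
        nearest[sy][sx] = i
    step = 1 << max(width, height).bit_length()
    while step >= 1:
        new = [row[:] for row in nearest]
        for y in range(height):
            for x in range(width):
                # Inspect the 9 cells at offset +-step; adopt a closer seed.
                for dy in (-step, 0, step):
                    for dx in (-step, 0, step):
                        nx, ny = x + dx, y + dy
                        if not (0 <= nx < width and 0 <= ny < height):
                            continue
                        cand = nearest[ny][nx]
                        if cand is None:
                            continue
                        best = new[y][x]
                        if best is None or dist2(cand, x, y) < dist2(best, x, y):
                            new[y][x] = cand
        nearest = new
        step //= 2
    return nearest
```

Each pass is O(width * height) and there are O(log max(width, height)) passes, with only rare misclassified cells, which is usually tolerable for terrain generation; on a GPU this is the classic real-time Voronoi technique.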

4 Comments

2023/09/24

22:39 UTC

22:39 UTC

1

Hello everyone

I very much doubt I picked the right subreddit for my question/request, but let's see where this goes; please redirect me to a more appropriate sub if needed.

One part of my thesis is dedicated to "Route Planning Systems" (as in vehicle navigation on public roads), more specifically how a route from A to B is constructed from a technical standpoint and how these kinds of systems can be optimised (idk... caching between commonly queried nodes?), but I struggle to do any proper research on this topic.

Road navigation sounds like a typical example of the shortest-path problem. Information on algorithms that solve this type of problem is widely available, yet technical analysis documents and research papers (no, blogs don't count) on the topic are not.

Are there any specific keywords I should focus on when searching for material? Can somebody point me in the right direction?

Thanks in advance

1 Comment

2023/09/24

20:03 UTC

20:03 UTC

5

Can someone explain to me intuitively the difference between Counterfactual Regret Minimization and Monte Carlo tree search (or IS-MCTS for imperfect-information games)?

Also, when is it best to apply which algorithm, and are both suited for real-time decision making?

0 Comments

2023/09/23

10:56 UTC

10:56 UTC

0

I read online that to solve max flow for a single-source/single-sink undirected graph, one must turn it into a directed graph by replacing every undirected edge with capacity c between nodes u, v with two directed edges (u,v) and (v,u), both with the same capacity c, and that once solved, the flow on one of these two edges must be zero. Is there a proof with an explanation of why the two problems are equivalent?

5 Comments

2023/09/23

04:44 UTC

04:44 UTC

0

Say you have 3 columns of data (x, y, z), and each repeats 3 times in a row, so:

x y z x1 y1 z1 x2 y2 z2

I need to compare everything in a row to check which values are higher or lower: x compared with x1 and x2, then x1 compared with x2, and the same for y and z.

There will be more than 3 repetitions as well, so I need some algorithm that iterates through every pairing in a row.
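A sketch of iterating every pairing in a row (assuming the row is stored as a name-to-value mapping; the names are made up):

```python
from itertools import combinations

def compare_row(row):
    # row: mapping of column name -> value.
    # Yields one comparison per unordered pair of columns.
    results = {}
    for (a, va), (b, vb) in combinations(row.items(), 2):
        results[(a, b)] = ">" if va > vb else "<" if va < vb else "=="
    return results
```

For k columns this produces k*(k-1)/2 comparisons, one per pair, regardless of how many repetitions the row has.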

2 Comments

2023/09/23

03:53 UTC

03:53 UTC

1

I make “5” my pivot but by the time I get to the end, 6 is the last number in the array and 5 has nowhere to go to partition it.

[5, 6, 7, 2, 3, 4, 1]

(Swap 6 with 2)

[5, 2, 7, 6, 3, 4, 1]

(Swap 6 with 3)

[5, 2, 7, 3, 6, 4, 1]

(Swap 6 with 4)

[5, 2, 7, 3, 4, 6, 1]

(Swap 6 with 1)

[5, 2, 7, 3, 4, 1, 6]

And now five can’t go anywhere to partition the array.
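For comparison, here is a common way to handle a first-element pivot in Lomuto's partition scheme (the helper name `lomuto_partition` is made up): park the pivot at the end first, partition, then swap it into its final slot:

```python
def lomuto_partition(a, lo, hi):
    # Lomuto partitioning with a[lo] as the pivot.
    a[lo], a[hi] = a[hi], a[lo]  # move pivot out of the way, to the end
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]  # pivot lands at index i; left side < pivot
    return i

a = [5, 6, 7, 2, 3, 4, 1]
p = lomuto_partition(a, 0, len(a) - 1)
# 5 ends up at index 4, with [1, 2, 3, 4] to its left and [6, 7] to its right
```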

3 Comments

2023/09/23

03:02 UTC

03:02 UTC

0

Context of the problem: I have the following entities: stations, programs, project managers, days, and time slots. The goal is to decide which program, with which project manager, to assign on which day and at which time slot for each station.

Some notes:

Each program has its own duration. Say Program A lasts 50 minutes, Program B is 30 minutes, etc.

Each project manager has qualifications on what program he/she can handle. This constraint is a hard constraint.

Time slots run from 6AM to 6PM. This means that if Program A, which lasts 30 minutes, is assigned to start at 6:00AM, it will end at 6:30AM. Only one program can run at a time in each station, so there should be no overlap of programs in terms of day and time per station.

There is a forecasting model that takes as input the studio, program, project manager, day, and time set and can give the expected revenue. We want to maximize this while satisfying the project manager qualification constraint, as well as the no schedule overlap constraint.

It is alright to have blank time slots. Example: Program A is from 6:00AM to 6:30AM. Program B can start at 6:40AM. That means 6:30 AM to 6:40 AM has no schedule.

What was tried so far:

Google OR-Tools - We tried representing a solution as a 5-dimensional matrix with hour granularity. The dimensions are station, program, project manager, day, and time. If matrix[station][program][project manager][day][time] = 1, then that set is assigned; otherwise not. The main issue encountered is the time slots, as they are not necessarily on a per-hour basis. We tried 5-minute time slots, but constructing the constraints that adhere to each program's duration proved difficult.

Pyomo - We tried Pyomo with the same matrix representation as above (5-minute time slot intervals), but encountered the same difficulty expressing program durations as constraints. I can't seem to express a condition inside the constraint declaration of the form: if this matrix entry is 1, then apply this operation.

Genetic algorithm - We tried a GA too, and it seems to be the only viable option we currently have, although the response time is really long. The solution representation is an array of days, and each day has an array of schedules. Each schedule is a combination of station, program, project manager, and start time.

We are just not sure if a GA is the only option we have, or whether there are other algorithms or Python tools that could help with a problem of this complexity, where the solution space seems to be very large. Any advice or thoughts would be useful. Thanks!

2 Comments

2023/09/22

12:58 UTC

12:58 UTC

0

I came across this problem on the web the other day.

Find the *number* of subarrays of an array that contain an odd number of odd integers.

Eg:

-----------------------------------------------------------------------

Array: [2, 2, 1, 2, 2, 2, 1, 2, 2, 1]

Some of the valid subarrays: [2, 2, 1], [2, 2, 1, 2], [1, 2, 2, 2, 1, 2, 2, 1].........

You get the point.

---------------------------------------------------------------------------

I know how to iteratively do this.

*Iterative approach:*

- Keep a "left marker" to indicate where the subarray starts and "right marker" to indicate where the subarray ends.
- Make the left marker iterate over every index. For each index the left-marker is in, iterate the right marker from the left-marker to the end of the main array.
- Keep a counter that keeps count of every odd value that the right-marker passes. If the count is an odd value, take note of the current position of the left-marker and right-marker and just push that into another array called (SubArrayRanges).
- When the right-marker finishes looping, reset the counter and move the left-marker to the next position to repeat all the above steps.

What's a more efficient way to approach this problem? I'm guessing it's got something to do with keeping note of the indexes of where the odd values are and breaking those down into smaller ranges and then dealing with them individually?
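For reference, the standard O(n) approach (a common prefix-parity trick, not spelled out in the original problem) counts prefixes by the parity of their number of odd elements: a subarray ending at the current index is valid exactly when its two bounding prefixes differ in parity:

```python
def count_odd_parity_subarrays(arr):
    # parity_count[p] = number of prefixes seen so far whose count of
    # odd elements has parity p; the empty prefix has parity 0.
    parity_count = [1, 0]
    parity, total = 0, 0
    for v in arr:
        parity ^= v & 1
        total += parity_count[parity ^ 1]  # prefixes of opposite parity
        parity_count[parity] += 1
    return total
```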

1 Comment

2023/09/22

17:30 UTC

17:30 UTC

0

I would like to modify the "classic" reddit story ranking algorithm described here (I don't think it's used anymore, but never mind):

https://medium.com/hacking-and-gonzo/how-reddit-ranking-algorithms-work-ef111e33d0d9

New requirements to introduce:

- users belong to groups (each user could belong only to one group)
- up/down votes from users belonging to group with more members should have more impact/higher weight

Any ideas?
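As a sketch of one possible modification (the weighting function is purely an assumption, not from the linked article): keep the classic "hot" formula but feed it vote sums weighted by the voter's group size:

```python
import math
from datetime import datetime, timezone

REDDIT_EPOCH = datetime(2005, 12, 8, 7, 46, 43, tzinfo=timezone.utc)

def vote_weight(group_size, total_users):
    # Hypothetical: a vote counts more when the voter's group is larger,
    # capped so one giant group cannot dominate entirely.
    return 1.0 + min(group_size / total_users, 0.5)

def hot(weighted_ups, weighted_downs, date):
    # The classic reddit "hot" ranking, with raw counts replaced by
    # group-weighted sums.
    s = weighted_ups - weighted_downs
    order = math.log10(max(abs(s), 1))
    sign = 1 if s > 0 else -1 if s < 0 else 0
    seconds = (date - REDDIT_EPOCH).total_seconds()
    return round(sign * order + seconds / 45000, 7)
```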

1 Comment

2023/09/22

19:52 UTC

19:52 UTC

1

I was asked this one today, and have spent the last few hours trying to think of a way to model/program a solution. Anyone know of a good way to think about this?

Problem: There are 24 volunteers and 8 tasks, with 3 volunteers assigned to each task. Over the next 3 weeks, the volunteers switch tasks each week. Volunteers cannot be assigned to the same task more than once, and no two volunteers can share a task more than once.

4 Comments

2023/09/21

22:36 UTC

22:36 UTC

9

Say you're building a wiki/document product like Notion/GDocs/Coda. Notion and Coda seem to split the document into blocks (new block starts when you press return).

If we want to search across title and document body, how hard of a problem is this? What makes it hard? For starters, let's assume the blocks of text are stored in postgres.

The search in all these products sucks, especially searching within a document. For example, I have a document in Notion that contains the words "will work on" in the body. When I search "will work on", I see the document in the results because it matched "will", but it wasn't able to find the full phrase, since it isn't highlighted in the results. I understand there's a limit to the number of multi-word combinations you can index per document for reasons of scale, performance, and resource constraints.

I'd still like to understand why these startups with tens or hundreds of millions in funding can't do better with respect to search. How do solutions like Algolia, Meilisearch, or Typesense fare in this regard?

Should a young startup that's building a knowledge base/wiki platform just accept that search will suck?

3 Comments

2023/09/21

01:13 UTC

01:13 UTC

0

I’ve found myself in a situation today where this is quite literally what I need. Implementing such a thing is fairly trivial, but I’m shocked I can’t find an implementation or a library in any language that does it (I’m working in Go, so I know I’ll need to write it myself; not looking for a lib or anything).

My thought is that maybe I’m calling it by the wrong name or something? Anyone familiar with a similar DS?

8 Comments

2023/09/21

00:47 UTC

00:47 UTC

2

I am curious to see if my understanding of Depth First Search is correct. In my university classes we were taught that DFS prioritizes going as deep as possible into a tree or graph before returning to visit unvisited paths/edges. So doesn't that mean that, in a graph with some level of interconnectivity, we could technically get multiple "correct" search trees from DFS, and that which one we use is an arbitrary decision?

Example Graph:

Example I had in mind

I'm not talking about a specific electronic case here, btw. I know that how you write the code might change this, but assume we are doing this by hand. Hopefully this makes sense.
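One quick way to see this: the same graph with two different adjacency orders yields two different, equally valid DFS trees. A tiny sketch:

```python
def dfs_tree(graph, start):
    # Records, for each vertex, the tree edge (parent) used to first reach it.
    parent = {start: None}
    def visit(u):
        for v in graph[u]:
            if v not in parent:
                parent[v] = u
                visit(v)
    visit(start)
    return parent

# Same triangle a-b, a-c, b-c; only the adjacency order differs:
g1 = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
g2 = {"a": ["c", "b"], "b": ["a", "c"], "c": ["a", "b"]}
```

Running `dfs_tree` on `g1` reaches c through b, while on `g2` it reaches b through c; both results are legitimate DFS trees, and the choice between them is exactly the arbitrary neighbour-ordering decision.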

5 Comments

2023/09/20

21:55 UTC

21:55 UTC

3

"Incremental backup" is a well-known concept. Whenever we commit, only the increments will be recorded to save all the versions and save disk space.

While increments in source codes (plain texts) are very easy to identify, increments in binary files (like a painting, when we add one additional stroke to it) are harder. It is hard in general to identify what exactly is changing in the document and encode that in the most efficient way.

Is there a nice way to precisely define and handle "increments" theoretically?

6 Comments

2023/09/19

04:12 UTC

04:12 UTC

0

I am trying to design an algorithm. For now, I am trying to give myself a large overview.

Assume I have K perfect slave machines (for example, K can equal 8) and one master machine. Each slave has a perfect HDD. I want to save the string "ABCDEFGHIJKLMNOP". My wishes are:

- I want to save only a fragment of the string to each HDD belonging to a unique slave.
- I want to make sure that an adversary can recover the full string if and only if he can access at least a fraction P of the slaves (for example, P can be 50%).
- However, an authorized user can recover the complete string even if a few slaves are offline or inaccessible. Let's call the number of offline slaves n, and assume that n < PK. In this example, n can be 2, which is less than 50% of the 8 slaves.

Note that I am differentiating between compromised and inaccessible. A perfect adversary compromises a slave without rendering it inaccessible. However, a slave can become inaccessible with or without the involvement of an adversary.

How should I go about designing an algorithm that can do it?

My attempt at a solution was to experiment with Reed-Solomon encoding, but I couldn't find a way that allows both the fault tolerance and significant difficulty for an attacker.

As an answer, I would like detailed pseudocode with a step-by-step explanation. I have not set any limits on K or P because I don't know whether that is necessary. However, if a fully general answer isn't possible, you can set suitable limits on these and develop a special solution.

Thank you.
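Shamir's secret sharing (a standard technique, swapped in here instead of plain Reed-Solomon) gives exactly this threshold property: any `threshold` shares reconstruct the secret, fewer reveal nothing, and up to K - threshold slaves can be offline. A minimal sketch over a prime field (the string would first be encoded as an integer):

```python
import random

P = 2**127 - 1  # a Mersenne prime; all arithmetic is mod P

def make_shares(secret, threshold, n):
    # Random polynomial of degree threshold-1 with constant term = secret;
    # share i is the point (i, poly(i)).
    coeffs = [secret] + [random.randrange(P) for _ in range(threshold - 1)]
    def poly(x):
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % P
        return acc
    return [(i, poly(i)) for i in range(1, n + 1)]

def recover(shares):
    # Lagrange interpolation at x = 0 recovers the constant term.
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret
```

With K = 8 and threshold = 4, any 4 slaves suffice to recover, so up to 4 can be offline, while an adversary holding 3 shares learns nothing about the secret.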

4 Comments

2023/09/19

01:43 UTC

01:43 UTC

3

Context: I’m brushing back up on my algorithms after taking a year off. I’ve also been playing a bit of D4 lately, so the inspiration for this problem stems from the paragon board. I’ve also realized I’m far too rusty for this to be my first problem back 😅

Given a graph containing a starting city, a number of cities with their respective populations, and the miles between cities, as well as an integer i corresponding to the resources available to pave i miles of road, generate the subgraph such that the greatest number of people can be reached by paved road from your starting city. Tl;dr: generate the connected subgraph containing the starting node (I guess it’d end up being a tree) with maximum total node weight and total edge weight less than or equal to i.

Seems like this is going to be one of those things that at face value is easy but then is np-hard. But again, hella rusty, so figured I’d ask the folks here 😅

11 Comments

2023/09/18

16:13 UTC

16:13 UTC

0

I'm fighting an O(n!) problem.

Things I tried.

Tried an evolutionary approach instead. Fail: it always runs into useless local minima. (And I have no idea how to avoid that.)

Improved the code. Yay, brought the inner loop from n^2 to n*(log n)^2. Only, for my n = 10-20, there were no actual improvements (more array operations needed, or whatever the reason).

The next step would have been actual parallel code, but alas, I'm no longer a student and can only parallelize by running my second computer, both of them being rather sluggish anyway.

Which brought me to an idea. If I would find 1000 gullible twits^W generous benefactors to run my code in parallel I could go about 3 numbers up with my n and maybe finally see the pattern. (A few subpatterns I already utilized, e.g. if the first number of the permutation is i, the second will be i+1 or i-1 for the optimum.) And with 1000000, I surely will crack it.

Which brings me to my question. There are numerous projects like Seti@Home, which parallelize tasks for anyone to join. Now, finding alien life is probably more interesting than analyzing the worst runtime of the Splay algorithm :-), but is there any chance I can kick off my own project as a poor nobody?

4 Comments

2023/09/18

12:39 UTC

12:39 UTC

0

I was working on a computer science problem that requires me to calculate and keep track of the average of an array of values. The first approach is what came to my mind first. However, the autograder for the problem did not accept the first approach, no matter how many little things I tried. The second approach, however, seems to work. I just can't figure out why.

The main difference I can see is that one keeps track of only the average, and the other keeps track of the cumulative sum. I was reading online that keeping track of the cumulative sum and dividing by the new number of elements could be a better approach because the double will be more accurate. But would the difference be significant?

I don't know how big the test cases were, so maybe that's why the first one did not work. But keeping a double average, and multiplying the number of old elements just makes sense - at least mathematically. So, why does this not work when programming?

Everything else in the code is the same, from reading in data to printing. So, I think it should be this part that makes the difference. Could the inaccuracy due to not keeping track of the running sum and just using an average be the issue?

Thanks!

Approach 1 (Doesn't work):

```
double average = 0;
int numWealthy = 0;
for (int j = index; j >= 0; --j) {
    average = ((average * numWealthy) + people[j]) / (numWealthy + 1);
    if (average >= wealthLimit) {
        ++numWealthy;
    } else {
        break;
    }
}
```

Approach 2 (Works):

```
double cumulativeWealth = 0;
int potentialNumWealthy = 0;
for (int j = index; j >= 0; --j) {
    double newCumulativeWealth = cumulativeWealth + people[j];
    int newPotentialNumWealthy = potentialNumWealthy + 1;
    double newAverage = newCumulativeWealth / newPotentialNumWealthy;
    if (newAverage >= wealthLimit) {
        cumulativeWealth = newCumulativeWealth;
        potentialNumWealthy = newPotentialNumWealthy;
    } else {
        break;
    }
}
```

2 Comments

2023/09/18

00:08 UTC

00:08 UTC


0

Hey guys.

I'm confused on how the calculation of low link values during a dfs works.

According to Geeks for Geeks and several other sources, the algorithm for calculating the low link value of a node u is as follows:

Let v be a child of u.

- If node v is not visited yet, then after the DFS of v completes, low[u] is updated to the minimum of low[u] and low[v].
- If child v is already visited, then low[u] is updated to the minimum of low[u] and disc[v].

The algorithm makes sense to me, but I watched a video about it that gets a different result for this graph. According to the video, when doing the DFS in the marked order of visits, the low link value of every vertex will be zero.

But, for me, following the algorithm exposed above I obtain the values 0, 0, 0, 2, 1, 3, 4, respectively.

What is wrong? The video? My understanding?

Note that I am **not** interested in the stack version used by the Tarjan's Algorithm to find SCCs. I am just confused about the default computation for low link values.

0 Comments

2023/09/16

00:06 UTC

00:06 UTC

0

This feels very much NP-hard-like, but nonetheless I'll try asking anyway.

Given M marbles, C buckets, each bucket having capacity K, where M = C * K, find some distribution of the marbles that maximizes the number of colours that are shared. Colour A and colour B are considered shared if they appear together in the same bucket.

The marbles and their distribution may be seen before you start distributing them. That is to say, they may be added to the buckets in any order you desire.

An example:

Bucket1 : A C

Bucket2 : A C

Bucket3: B D

Bucket4: B D

This is M = 8, C = 4, K = 2

Obviously this is not optimal since C can be swapped with D to give a better distribution of colours.

Sometimes there might be multiple optimal solutions, and I'd prefer the optimum that "has a better spread", such that each pair of colours is close to the expected mean number of colours each colour should connect to. But that might be harder to prove NP-hardness for, so I'm happy to drop it for now (and give more details in the comments if anyone requests them).

Can we prove this is NP-hard? No greedy algorithm I've written ever seems to reach the optimum (the best I've gotten is trying each available swap and repeating until the score can't improve).

8 Comments

2023/09/16

00:05 UTC

00:05 UTC

0

```
public static objectReturn MergeandCountSplitInv(int[] C, int[] D) {
    int i = 0;
    int j = 0;
    int splitInvs = 0;
    int n = C.length + D.length;
    int[] B = new int[n];
    for (int k = 0; k < n; k++) {  // was k < n - 1: the last slot was never filled
        // Guard against running off the end of either half.
        if (j >= D.length || (i < C.length && C[i] < D[j])) {
            B[k] = C[i];
            i = i + 1;
        } else {
            B[k] = D[j];
            j = j + 1;
            // Each element still remaining in C forms a split inversion with D[j].
            splitInvs = splitInvs + (C.length - i);
        }
    }
    return new objectReturn(B, splitInvs);
}
```

2 Comments

2023/09/15

23:12 UTC

23:12 UTC

1

Hello guys,

I was solving a problem: given Big Oh and Theta bounds for the worst-case and best-case runtimes, is it possible to state O(_) over all inputs?

I couldn't wrap my head around what 'worst case and best case' means here. I thought Big Oh and Theta apply to all cases of the algorithm in general.

3 Comments

2023/09/15

19:19 UTC

19:19 UTC