/r/probabilitytheory
For everything probability theory related! Whether you're a high schooler or a stochastic wizard, if you want to learn something about this subject you're most welcome here.
This subreddit is meant for probability theory-related topics. Regardless of your level of education or familiarity with probability theory, if you would like to learn more or just talk about this fascinating subject, you are welcome.
Please keep topics of discussion related to probability theory. Interdisciplinary topics are fine, but if the focus is a different area, the post may be removed as off-topic. Titles should be a short summary; use the text box for the content of your question or topic.
Remain civil/polite in your conversations. If you think somebody is mistaken, focus on the content, not the user. Treat them as you might treat a student. Insults or snide remarks may result in removed comments or locked comment chains.
Post Flair: Please use one of the following flairs for your post. Links will filter to that flair.
Filter out homework posts: Anti-homework
Regarding homework: When asking a homework question, please be sure to: (1) Clearly state the problem; (2) Describe/show what you have tried so far; (3) Describe where you are getting stuck or confused.
To see what probability theory is, see the probability theory wiki. To understand the difference between statistics and probability theory, see this discussion.
Unfortunately, there is no universal LaTeX plugin for Reddit, so we have to make do with normal fonts.
Hey there, I thought this would be a simple problem, but it turns out it's way more complex than I thought. Does someone know how to solve it, or have any suggestions?
I have four bags, each containing four balls. The first bag has one blue ball and three red balls. The second bag has two blue balls and two red balls. The third bag has one blue ball and three red balls. The fourth bag has three blue balls and one red ball. Each time I take a ball out of a bag, I do NOT put it back (drawing without replacement). I want to remove all the blue balls from the bags. How many draws do I need to have an 80% chance of having removed all the blue balls? Please show the calculations.
Thanks in advance.
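The post doesn't specify how a bag is chosen on each draw. Under the simplest reading, where each draw takes a uniformly random ball from everything still in play, the four bags behave as a single urn with 7 blue and 9 red balls, and the answer has a closed form. A minimal sketch under that assumption:

```python
from math import comb

# Assumption (not stated in the post): each draw takes a uniformly random
# ball from all remaining balls, so the four bags act as one urn with
# 7 blue and 9 red out of 16.
BLUE, TOTAL = 7, 16

def p_all_blue_drawn(n):
    # P(the first n draws contain all 7 blue balls) = C(9, n-7) / C(16, n)
    if n < BLUE:
        return 0.0
    return comb(TOTAL - BLUE, n - BLUE) / comb(TOTAL, n)

for n in range(BLUE, TOTAL + 1):
    print(n, round(p_all_blue_drawn(n), 4))
```

Under this model, even 15 draws only give 9/16 = 56.25%, so all 16 draws are needed to clear 80%. A strategy that chooses bags deliberately (e.g. emptying the blue-heavy bag first) changes the numbers, so the intended drawing rule matters.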
In this problem, I don't understand the distinction between (a) and (b). Are they different? If so, how?
Can someone help?
tl;dr: I would love to confirm my simulation algorithm for a card game by mathematically calculating the win probability. Unfortunately, the game is really complex; are you up for the challenge? I also provide the results of my simulation at the end.
Hi guys,
I am currently writing an app that counts cards for a card game. I think it is internationally known as fck u, or in Germany as Busfahrer. As a programmer, I wrote a simulation for winning the game, but I have no idea whether my results are right/realistic, because there is no way I can play enough games to reach statistical significance. So the obvious approach would be to calculate the chance of winning. Sadly, I seem to suck at probability theory, so if you want a challenge, be my guest. I will also share my simulation results further down.
Rules:
Because there are probably many different sets of rules, here are mine:
Stages in detail:
Probability Calculation:
I am well aware of how to calculate the individual probabilities for a full deck and specific cards. It gets tricky once you track the current stage and the already-drawn cards. As far as I can see, there are three possible ways to make decisions:
1. Always picking the best option with no knowledge of the cards drawn in previous stages and no long-term card counting (playing blind).
2. Choosing based on the cards of previous stages, e.g. knowing the first card when predicting higher/lower (a normal smart player who doesn't count cards).
3. Choosing based on perfect knowledge: knowing all cards that have been drawn, the ones remaining in the deck, and the ones from previous stages (that would be my app).
What I want to know:
I am interested in knowing the probability of winning the game before running out of cards. An additional thing would be knowing the probability of winning with a certain number of cards left, but this is not a must-have.
the chance y of winning after exactly x draws
the chance y of winning within the first x draws
My simulations:
Basically, I run the game for 10,000,000 decks and record the cards remaining in case of a win, or the fact that it was a loss. I can run my simulation for any remaining-card combination, but to keep it simple, just assume a complete starting deck. My result is that you have an 84% chance of winning before you run out of cards. Note that this assumes perfect decision making with knowledge of all drawn cards. I have no idea if that is even near the real number, because even one < instead of a > in my code could fuck up the numbers. I also added 2 graphs that show when my algorithm wins (above).
For choices without card counting I get a 67% chance of winning, and for trivial/blind choices (always red, higher, between, hearts, new) I get 31%.
Let me know if you want to know anything else or need other data analysis.
Thank you so much for your help. I would love to see how something like this can be calculated <3
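One thing that can be said without auditing the game logic: with 10,000,000 simulated decks, the sampling noise on an 84% estimate is tiny. The standard error is sqrt(0.84 * 0.16 / 10,000,000) ≈ 0.00012, so a 95% confidence interval is roughly 84% ± 0.02 percentage points. In other words, if the simulation disagrees with an exact calculation by more than a hair, the discrepancy is a modeling or coding issue, not random noise.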
Here is a question that is beyond my mathematical competence to answer. Can anyone out there answer it for me?
Suppose you have a deck of 93 cards. Suppose further that three of those 93 cards are special cards. You shuffle the deck many times to randomize the cards.
Within the shuffled deck, what is the probability that at least one special card is located within four cards of another special card? (Put alternatively: what is the probability that the deck contains at least one run of four adjacent cards holding at least two special cards?)
(That's an obscure question, to be sure. If you're curious why I'm asking, this question arises from the game of Flip 7. That game has a deck of 93 cards. One type of special card in that game is the "Flip 3" card. There are three of these cards in the deck. If you draw a Flip 3 card on your turn, then you give this card to another player or to yourself. Whoever receives the Flip 3 card must then draw three cards. I'm trying to estimate the likelihood of "chained" Flip 3 cards occurring. That is, I'm trying to estimate the odds of the following case: after drawing a Flip 3 card, you draw a second Flip 3 card as part of the trio of drawn-cards that the first Flip 3 card triggers.)
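It helps to separate two slightly different readings: "within four cards" (positions differing by at most 4) and "some window of four adjacent cards holds two specials" (positions differing by at most 3). Both have a clean closed form: the number of ways to place k special cards among n positions with every pair more than d apart is C(n - (k-1)d, k). A sketch:

```python
from math import comb

# P(some pair of the k special cards sits within distance d in a shuffled
# n-card deck) = 1 - C(n - (k-1)*d, k) / C(n, k).
def p_some_pair_within(n, k, d):
    return 1 - comb(n - (k - 1) * d, k) / comb(n, k)

print(p_some_pair_within(93, 3, 4))  # gap <= 4: ~0.239
print(p_some_pair_within(93, 3, 3))  # two specials in a 4-card window: ~0.183
```

So roughly a 20% chance per shuffled deck under either reading.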
So imagine there is a random chance of rolling blue (1/10 chance) or red (9/10 chance). What is the probability that you roll blue before red? Assume that every roll has the same odds.
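Assuming each roll independently shows blue with probability 1/10 and red with probability 9/10 (and possibly neither, if other outcomes exist), the standard race argument gives

P(blue before red) = P(blue) / (P(blue) + P(red)) = (1/10) / (1/10 + 9/10) = 1/10.

If every roll is either blue or red, this is immediate: the very first roll decides the race.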
So I'm trying to figure out the probability that the maximum of the coordinates of an n-dimensional centroid is less than some number, and what happens as the dimension tends to infinity. The vertices are uniformly distributed on [0,1].
For the 3D case: we are calculating P(max(C) <= N) where C = ((x1+x2+x3+x4)/4, (y1+y2+y3+y4)/4, (z1+z2+z3+z4)/4) are the coordinates for the centroid:
Since z = (x1+x2+x3+x4)/4 ~ U(0,1), our problem is equivalent to calculating the probability that the maximum of 3 uniform variables is less than N, since 3 coordinates define the centroid in 3 dimensions. This should be the probability that the cube root of one of the variables is less than N, which results in N^3 as shown below:
P(max(C) <= N) = P(z^(1/3) <= N) = N^3
I believe this is correct.
How would you evaluate the limit of P(max(C^n) <= N) as n tends to infinity for the n-dimensional centroid? If the exponent of N grows in the n-dimensional case, and N is between 0 and 1, the probability for the maximum of the centroid would converge to 0...? How does this make sense? If we include more coordinates, we would expect this probability of the maximum to approach 1, wouldn't we?
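A quick numpy sketch can sanity-check all of this, under the setup as described (n+1 vertices uniform in [0,1]^n, centroid = coordinate-wise mean):

```python
import numpy as np

rng = np.random.default_rng(0)

# Estimate P(max coordinate of the centroid <= N) for an n-dimensional
# simplex with n+1 vertices drawn uniformly from [0,1]^n.
def p_max_centroid_leq(n, N, trials=10_000):
    acc = np.zeros((trials, n))
    for _ in range(n + 1):            # add one uniform vertex at a time
        acc += rng.random((trials, n))
    centroids = acc / (n + 1)
    return float((centroids.max(axis=1) <= N).mean())

for n in (3, 10, 50):
    print(n, p_max_centroid_leq(n, 0.6))
```

One caveat the simulation will expose: the mean of n+1 independent uniforms is not itself U(0,1); it follows a scaled Irwin-Hall distribution that concentrates around 1/2 as n grows. Because every coordinate concentrates at 1/2, P(max(C) <= N) actually tends to 1 for N > 1/2 and to 0 for N < 1/2, which dissolves the apparent paradox.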
Hey there!
A couple of friends and I are trying to build a calculator for an event in a game, but we're having trouble with a specific scenario, and I'm hoping someone smart in here has an answer.
Scenario simplified here:
Every time we click a button, there is a 5% chance of being given a cookie, but every 10th pull, we are guaranteed to be given a cookie no matter what.
Now, I've arrived at an "average" probability of being given a cookie over n attempts of 14.5%, but my friends doubted it, and now I'm also not sure. It would be awesome if someone could explain how to actually do this.
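If the guarantee is a fixed cycle (pulls 1-9 at 5%, pull 10 automatic, then the count restarts), the long-run average number of cookies per pull is

(9 * 0.05 + 1) / 10 = 1.45 / 10 = 0.145,

so the 14.5% figure checks out under that model. If instead the counter resets whenever a cookie drops (a true pity timer), the renewal rate is one cookie per E[pulls per cookie] = (1 - 0.95^10) / 0.05 ≈ 8.03 pulls, i.e. about 12.5%, so it's worth checking which rule the game actually uses.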
I was trying to calculate the probability of this situation:
Let's say I'm at a poker table with 4 other players. I'm the first one to act, and I would like to know how often at least one of the other players will call my bet and play with me.
Let's assume that each player only plays 20% of their hands (so 20% of the time they will call me).
Would the formula for this probability be 1 - (0.8^4)? So a total of about 60% of the time? Is that correct?
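Assuming the four opponents act independently, the formula is right; the exact value is 1 - 0.8^4 = 1 - 0.4096 = 0.5904, so about 59% of the time (a touch under the rounded 60%). In live play, opponents' decisions correlate with position and bet size, so the independent 20% calls are the idealized part.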
Hello!
First of all: I'm not really looking for a straight-up answer, more of a nudge in the right direction.
The question: a couple of days ago I got a denary dice set (from d1 to d10). Now I want to make some sort of graph of the probability distribution of the total. I've made a simulation in Sheets, as well as a NORMDIST based on the E(x) and stdev(x) of the set. The problem is that neither seems to perfectly represent reality, since there is always a (very tiny) modeled chance of rolling below 10 (ten ones) or above 55 (1+2+...+10).
In short: how do I account for the physically impossible outcomes, since using mean +/- 3*stdev covers about 99.95%, without having to "brute force" everything one by one?
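The clean fix is to drop the normal approximation entirely and compute the exact distribution of the sum, which for independent dice is just a convolution, folding in one die at a time. A sketch:

```python
from collections import defaultdict

# Exact distribution of the sum of a d1, d2, ..., d10 by convolution.
dist = {0: 1.0}
for sides in range(1, 11):
    new = defaultdict(float)
    for total, p in dist.items():
        for face in range(1, sides + 1):
            new[total + face] += p / sides
    dist = dict(new)

print(min(dist), max(dist))   # 10 55: nothing outside the possible range
print(sum(dist.values()))     # 1.0 up to float rounding
```

The exact distribution puts probability exactly 0 outside 10..55, so there is nothing to "account for": the stray tail mass was purely an artifact of NORMDIST.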
For a game where there are five cards: one of the cards is a king and the other four are aces. If you pick the king you win, and if you pick an ace you lose. The cards are shuffled and laid out in spots one through five. You pick a spot (one through five) and draw that card.
Obviously the odds of winning this game are 1/5 (20%). However, if you play the game multiple times, does the picking strategy matter?
I think intuitively, if you pick the same spot every time (i.e. always picking spot 3), it's purely down to the random shuffling, and therefore the odds of winning are still 1/5 (20%). I was told, however, that if you pick a "random" spot every time (i.e. just going with your gut), the odds are no longer 1/5 (20%).
This feels incorrect; it seems like the odds should be 1/5 no matter what my picking strategy is. That said, it also feels like the picking pattern could introduce more variance, but I'm not sure.
However, I don't know the math behind it. This is what intuitively feels correct to me, but it isn't based on the actual probability. I'm hoping someone can explain the math/stats behind it.
This is my first post here, so let me know if I did anything wrong or need to include more information. (I feel like the title is bad/inaccurate, so if someone has a more accurate way to phrase the question, let me know.)
Also, for what it's worth this is related to the new Pokemon TCG Pocket wonder picks.
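The intuition is right: the shuffle is uniform and independent of how the spot is picked, so any strategy that doesn't use information about the cards wins exactly 1/5, with the same variance in the long run. A quick sketch for convincing doubters:

```python
import random

# Fixed spot vs. a fresh random spot each game; 5 spots, one winning king.
def win_rate(strategy, games=200_000):
    wins = 0
    for _ in range(games):
        king = random.randrange(5)                    # king's spot after shuffle
        pick = 2 if strategy == "fixed" else random.randrange(5)
        wins += (pick == king)
    return wins / games

print(win_rate("fixed"))    # ~0.20
print(win_rate("random"))   # ~0.20
```

The "random picks change the odds" claim would only hold if the shuffle were biased or the picker had information about it.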
First off, I understand that this can also be modeled as a dice game (changing from a d10 to a d12 to a d14), but the real-life context of the question I'm trying to solve is based around placements/rankings.
Let's say that me and 9 other people are playing a marble race game. Assume that it is a fair game and all marbles have an equal chance at winning. My marble does very poorly. We repeat the race a few more times. My marble does poorly every time.
Two other people see our game and want to join, so they add their marbles to the race. Still, my marble places poorly. Again, two more people join in the game, and my marble continues to disappoint.
See here for a hypothetical table of results:
| My marble placed... | ...out of N marbles |
|---|---|
| 9 | 10 |
| 9 | 10 |
| 8 | 12 |
| 9 | 12 |
| 12 | 12 |
| 7 | 12 |
| 7 | 14 |
| 13 | 14 |
How can I calculate the probability that my marble performed as poorly as it did, while accounting for the fact that the odds of victory grew slimmer as marbles were added to the race?
Ultimately, the question I would like to solve is - What is the probability that a marble performs as badly as my marble did over the course of these 8 races?
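One reasonable way to formalize "performed as poorly as it did" is to choose a badness statistic, for example the sum of normalized placements (place divided by field size), and then ask how often 8 fair races score at least that badly. The statistic is a modeling choice, not canonical; a Monte Carlo sketch:

```python
import random

# Races as (placement, field size) from the table above.
races = [(9, 10), (9, 10), (8, 12), (9, 12), (12, 12), (7, 12), (7, 14), (13, 14)]
observed = sum(place / n for place, n in races)

trials, at_least_as_bad = 200_000, 0
for _ in range(trials):
    stat = sum(random.randint(1, n) / n for _, n in races)
    at_least_as_bad += (stat >= observed)

print(at_least_as_bad / trials)   # p-value-style estimate under fair races
```

Normalizing by field size is what accounts for the growing number of marbles: placing 13th of 14 counts as slightly worse than 9th of 10.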
I have 7 marbles. 3 of them are red, 2 are blue, and 2 are red-blue (they count as either colour, but only once). I draw 3 marbles. No replacement, order doesn't matter.
What is the probability of me drawing a collection of marbles that covers all colours?
Success states:
It doesn't matter if the computation is logically complicated; I just need to understand the main principles. I have programming skills, so once I get some sort of logic down I can do basically whatever. I don't want to do a Monte Carlo simulation; I'd rather stick to pure probability theory.
The real application is in a card game (Magic: the Gathering) where I can have cards that fall into multiple categories. My goal is to draw a collection of cards that covers all colours. There are 5 colours - white, blue, black, red and green. There are cards that are a single colour, two colours, or three colours, etc... The limitation is that if I draw a white-blue-red card it should only count towards one of those colours at a time, not all of its colours.
A simulation would be easier, but since I'm making an online tool, I think iterating many times either (A) will produce inaccurate results or (B) is computationally more intensive than a straightforward calculation.
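Exact enumeration avoids simulation entirely: represent each marble as the set of colours it can count for, enumerate all C(7,3) = 35 draws, and for each draw check whether distinct marbles can be assigned to cover every colour. That check is a small matching problem; brute force is fine at this scale and extends to MTG's five colours for reasonable hand sizes. A sketch:

```python
from itertools import combinations, permutations

# Marbles as sets of the colours they can count for (each marble used once).
marbles = [{"R"}, {"R"}, {"R"}, {"B"}, {"B"}, {"R", "B"}, {"R", "B"}]
colours = ["B", "R"]

def covers(draw):
    # Brute-force matching: try every assignment of distinct marbles
    # to the colours; fine for small hands and few colours.
    return any(
        all(c in m for c, m in zip(colours, perm))
        for perm in permutations(draw, len(colours))
    )

hands = list(combinations(marbles, 3))
print(sum(covers(h) for h in hands) / len(hands))   # 34/35 ~ 0.9714
```

Here the only failing draw is the three mono-red marbles, giving 34/35. For five colours and larger hands, swapping the brute-force check for a proper bipartite matching algorithm keeps the tool fast.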
I want to prove that probability-zero elements are impossible in a finite sample space.
Proof:
In a finite sample space S = {a, b}, the equally likely case gives P(a) = P(b) = 1/2. In the non-equally-likely case, {a} and {b} occupy different "proportions" of the sample space. Now split the sample space into parts such that {a} contains j of those "splits" and {b} contains k of them, in such a way that all j+k splits are again equally likely. Solving, we get P(a) = j/(j+k), and if j = 0, it implies that {a} gets zero "splits", i.e. it is impossible! Meaning it will never happen!
So here's the conundrum in the simplest of terms. In a 4-way rock-paper-scissors match, three of us draw paper and one draws scissors, so he wins that match. In the next, 3-way match, two of us get scissors and one gets rock, and he wins. In the final one-on-one match, he gets scissors, I draw rock, and I win the match. What are the odds of this happening?
(I tried to do the math myself and I got 0.009602%, and I don't think I'm right.)
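Assuming everyone throws uniformly at random and independently, and requiring a specific winner only in the final round:

Round 1: exactly one scissors and three papers among 4 players: C(4,1) * (1/3)^4 = 4/81.
Round 2: exactly one rock and two scissors among 3 players: C(3,1) * (1/3)^3 = 1/9.
Round 3: he throws scissors and you throw rock: (1/3)^2 = 1/9.

Multiplying: (4/81) * (1/9) * (1/9) = 4/6561 ≈ 0.061%. If instead the winner of each round has to be one particular person, drop the C(4,1) and C(3,1) factors, which gives 1/19683 ≈ 0.0051%. Which number is "right" depends on exactly which outcomes count as the event.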
SHORT VERSION:
In a simulation where virtual gamblers bet on the outcome of a d3 die that yields 2, 3, or 7, it seems that betting on 3 (instead of the expected value, 4) minimizes losses. Gamblers lose an amount equal to their error.
Results: https://imgur.com/a/gFsgeBZ
LONGER: I realize I still struggle with what expected value is. I know that it's not actually the value to expect (e.g. a d6 die will never yield 3.5) and is more like an average (mean) of all outcomes.
But I was sure EV is what you bet on, especially when many trials are available.
I simulated many bets on a d3 die that yields either 2, 3, or 7. The EV of that die is 4. Gamblers guess the die roll and lose an amount of money equal to their error. For example:
guessed=4 actual=2 loses 2
guessed=4 actual=3 loses 1
guessed=4 actual=7 loses 3
Betting on 3 (not the EV of 4!) seems to minimize losses, which is a surprise to me. Why isn't the EV the optimal bet?
Even stripping the probability view away, shouldn't the mean (which I considered the value fairest in distance to the input values) be the obvious spot to aim for when minimizing error?
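This is a known fact rather than a bug: the mean minimizes expected squared error, while expected absolute error, which is exactly the loss in this game, is minimized by the median. For outcomes {2, 3, 7} the median is 3. A tiny check of every guess:

```python
# Expected loss E|X - g| for each guess g, die outcomes {2, 3, 7}.
outcomes = [2, 3, 7]
for g in range(2, 8):
    loss = sum(abs(x - g) for x in outcomes) / len(outcomes)
    print(g, round(loss, 3))
# g=3 gives 5/3 ~ 1.667, beating g=4 (the EV), which gives 2.0
```

If the gamblers instead lost the squared error, betting the EV of 4 would be optimal; "what should I bet?" always depends on the loss function.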
In a finite sample space, if A is a proper subset of the sample space (A ⊂ S, where S is the sample space), can the probability of A be equal to 1?
Hi, I am a noob and want to ask: is the frequency interpretation of probability justified and proved by the theorem known as the Law of Large Numbers?
Hello everyone,
I am working on a dashboard and would like to develop a key figure that indicates the risk that a machine will not be delivered on time. The individual risks of the required materials are already known. Now I am faced with the challenge of aggregating these material risks at the machine level in a meaningful way. Furthermore, the number of materials for each product can differ.
What I think is important here is that:
Also important: the risks are independent from each other, since this is more or less a prototype to get a feel for the numbers.
Do any of you have experience with aggregating risks in such cases, or ideas on how best to implement this mathematically? Recommendations for books covering this kind of situation are welcome.
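For what it's worth, one common aggregation under the independence assumption is to say the machine is late if at least one of its materials is late, so the machine-level risk is 1 minus the product of the on-time probabilities. A minimal sketch (function name and example numbers are mine):

```python
# Machine is late if at least one required material is late,
# assuming independent material risks.
def machine_risk(material_risks):
    p_all_on_time = 1.0
    for p_late in material_risks:
        p_all_on_time *= (1.0 - p_late)
    return 1.0 - p_all_on_time

print(machine_risk([0.05, 0.10, 0.02]))   # ~0.162
```

This naturally handles a different number of materials per machine, but note it treats every material as equally blocking; weighting by schedule slack or criticality would need a richer model.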
How would I find the probability of drawing 3 or fewer of the same card type in a 7-card hand using the hypergeometric formula? Population size is 100, sample size is 7, population of successes is 40.
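One reading of the question: what is the chance that at most 3 of the 7 drawn cards come from the 40 "success" cards? Summing the hypergeometric terms for k = 0..3:

```python
from math import comb

# P(X <= 3), X hypergeometric: population 100, 40 successes, sample of 7.
N, K, n = 100, 40, 7
p = sum(comb(K, k) * comb(N - K, n - k) for k in range(4)) / comb(N, n)
print(p)   # ~0.716
```

If "3 of the same cards" instead means three copies of one specific card, the same formula applies with K set to that card's count.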
Hello! I'm having a discussion with my gf over a probability question, but we can't reach a conclusion, so I hope someone here can help.
Let's say you have a 0.6% chance to succeed at an action, and you perform that action 74 times. What is the total % chance of the action succeeding within those 74 tries?
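Assuming the tries are independent, the standard route is to compute the chance of 74 straight failures and subtract from 1:

1 - (1 - 0.006)^74 = 1 - 0.994^74 ≈ 1 - 0.640 = 0.360,

so roughly a 36% chance of at least one success. (The tempting shortcut of 74 * 0.6% ≈ 44.4% overcounts, because it double-counts the cases with more than one success.)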
Need YouTuber suggestions for studying probability as a master's student.
Seeking Advice on Theoretical vs. Applied Probability Courses for Future Use in ML & AI
Dear Probability Community,
I’m a 25-year-old Computer Science student with a deep passion for math—or, more accurately, for truly understanding problems through a mathematical lens. Before diving into the world of Machine Learning, Deep Learning, and AI, I wanted to build a strong mathematical foundation. So, while on exchange, I enrolled in a Probability Theory course that’s heavily inspired by Rick Durrett’s Probability: Theory and Examples and approaches probability from a measure-theoretic perspective.
While the course is fascinating, it’s also incredibly challenging. I haven’t studied pure math in about two years, and this course is proving to be a tough re-entry. The theoretical focus is intense, with learning objectives including:
On the other hand, I recently found out that my home university’s Probability course uses Probability by Jim Pitman, which takes a more applied approach. This course emphasizes:
Pitman’s approach seems much more accessible and focused on applied techniques. It’s almost entirely without measure theory, which feels like a relief compared to Durrett’s heavily theoretical course. So, here’s my question:
Given that my long-term goal is to apply probability in areas like stochastic simulation, ML, and AI, does it make sense to push through the theoretical content in Durrett’s book, hoping it will make applied probability easier down the line? Or should I transition to the more applied approach in Pitman’s book to focus on techniques that may be more directly useful?
Thank you for any insights you can offer—I’m grateful for any advice or personal experiences you may have!
Like the title says: the issue I'm seeing is someone missing a hit in a trading card box, where the hit rate is usually around 1 in 3-4 cases, and they are on roughly case 21 without hitting even one. At what point is it fishy that he hasn't gotten the pull? Or at what point is it statistically impossible to miss that much?
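Reading "1 in 3-4 cases" as a per-case hit probability between 1/4 and 1/3, and assuming cases are independent, the chance of 21 straight cases with no hit is

(1 - 1/3)^21 ≈ 0.0002 (about 1 in 5,000) and (1 - 1/4)^21 ≈ 0.0024 (about 1 in 420).

Never literally impossible, but fishy well before case 21: at those odds, explanations like a wrong assumed hit rate or non-random case allocation start looking more likely than a genuine streak.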
So I have a variation on the previous thread here. Suppose I'm referring factory workers for interviews, and the company will hire any given one with probability P (all independent). Human Resources over there keeps track of how many get hired from the last N ones I refer, and I get a bonus if at least X of those previous N interviewees get hired. How many interviews should we expect to occur before I get that bonus?
e.g., suppose P=40%, bonus paid if 50 of the last 100 get hired. The binomial distribution can tell me the odds of that being the case for any given new group of 100 interviews - it's a straightforward calculation (odds X>=50 here is 2.71%). But here, we're preserving knowledge, a buffer of the last 100 interviewees, and keeping a running count of how many were hired. So while that last-100 ratio will average 40 (P*N), and will go up and down over time in a predictable distribution, at some point it will reach my bonus threshold of X. So, how many interviews should we expect to occur before that threshold is cleared?
I've been thinking about each incremental interview as essentially representing a new group of 100 (so our first test is against interviews 1-100, but the next consideration is against interviews 2-101, then 3-102, etc), except each set of 100 trials isn't independent - it is 99/100ths the same as the previous one. So I'm not sure how to properly account for the "trailing history" aspect of the scenario here. Any advice?
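Exact analysis is awkward precisely because of that trailing-history correlation; this is a scan-statistic (sliding-window) hitting-time problem, and closed forms are rare. A Monte Carlo sketch is a reasonable first step:

```python
import random
from collections import deque

# Simulate Bernoulli(P) hires, track the last N outcomes, and report how
# many interviews pass before the rolling count first reaches X.
def interviews_until_bonus(P=0.4, N=100, X=50, cap=10_000_000):
    window, hits = deque(), 0
    for i in range(1, cap + 1):
        hired = random.random() < P
        window.append(hired)
        hits += hired
        if len(window) > N:
            hits -= window.popleft()
        if len(window) == N and hits >= X:
            return i
    return None   # threshold never reached within the cap

runs = [interviews_until_bonus() for _ in range(200)]
runs = [r for r in runs if r is not None]
print(sum(runs) / len(runs))   # rough mean waiting time
```

Note that the 2.71% per-window figure can't simply be inverted into a geometric waiting time, because overlapping windows are strongly correlated; the simulation sidesteps that.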
OK, imagine you're playing roulette, or rolling dice, or whatever you want. If you have a 30% chance of hitting the winning outcome, then to break even (let's say you're playing for money) you need to hit the winning outcome X times over Y rounds. Each round is an independent event. For simplicity, let's assume that you need to hit 50 times over 200 rounds.
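The post seems to trail off before the actual question, but the break-even chance for these numbers is a plain binomial tail. A sketch, assuming the quantity wanted is P(at least 50 wins in 200 independent rounds at 30%):

```python
from math import comb

n, p, x = 200, 0.3, 50
prob = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))
print(prob)   # ~0.95: the mean is 60 wins, so 50+ happens most of the time
```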
I was playing around with making up a new card game while doing some worldbuilding, and it dawned on me that I'm not sure how to figure out the order of winning hands. I ran 20 rounds of test hands and realized I'd need a lot more data, and then remembered that math, and people who are good at it, exist. Also, if there's a better sub for this, please let me know!
The game uses a deck of 40 cards, each suit has an ace through 10.
The dealer deals 2 cards to each of 4 players, then flips a card face up in the middle. The players have a chance to swap out one card before hands are shown and a winner declared.
The win conditions in order are:
Triple - both cards in the player's hand match the center card
Pair - the player holds two of the same cards
Match - one card in the player's hand matches the center card
10 in All - The player's hand plus the center card totals 10 (so like they hold a 2 & 5 and the center card is a 3)
Totals 10 - the two cards in the player's hand add to 10 (so like 8 & 2 or 7 & 3)
Holds 10 - the player has a 10 in their hand
Does that hold up as the order from rarest combination to least rare? And does this game already exist, meaning I spent an hour dealing cards to myself for nothing? lol
Thank you so much for any light you can shed!
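The deck is small enough to settle this exactly: C(40,2) = 780 possible hands times 38 possible center cards is under 30,000 cases. A sketch that counts each win condition, assuming aces count as 1, ignoring the swap step, and applying the conditions in the listed priority to a single player's dealt hand:

```python
from itertools import combinations

# 40-card deck: ranks 1 (ace) through 10 in four suits.
deck = [(rank, suit) for rank in range(1, 11) for suit in range(4)]
counts, total = {}, 0

for hand in combinations(deck, 2):
    (r1, _), (r2, _) = hand
    for center in deck:
        if center in hand:
            continue
        rc = center[0]
        total += 1
        if r1 == rc and r2 == rc:      cond = "triple"
        elif r1 == r2:                 cond = "pair"
        elif rc in (r1, r2):           cond = "match"
        elif r1 + r2 + rc == 10:       cond = "10 in all"
        elif r1 + r2 == 10:            cond = "totals 10"
        elif 10 in (r1, r2):           cond = "holds 10"
        else:                          cond = "nothing"
        counts[cond] = counts.get(cond, 0) + 1

for cond, c in sorted(counts.items(), key=lambda kv: kv[1]):
    print(f"{cond}: {c / total:.4%}")
```

Printing in ascending frequency shows whether the listed order really runs rarest to most common. The swap step and the other three players' hands shift the in-game numbers, but the raw-deal ranking is the natural starting point.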
My friend has seen Justin Tucker miss 4 of the 9 field goal attempts he's watched in person. Justin Tucker has only missed 47 of 455 attempts in his career. What is the probability of someone seeing that many misses in so few attempts?
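Treating each kick as independent with the career miss rate (a simplification: attempts differ in distance and conditions, and the friend's 9 attempts weren't randomly sampled), the chance of seeing 4 or more misses in 9 attempts is a binomial tail:

```python
from math import comb

# P(4+ misses in 9 attempts) with per-kick miss probability 47/455.
p = 47 / 455
prob = sum(comb(9, k) * p**k * (1 - p) ** (9 - k) for k in range(4, 10))
print(prob)   # ~0.009, i.e. roughly a 1% chance
```

Rare, but with many fans each watching their own handful of games, a few are bound to witness a streak like this.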
In the book "How to Get Lucky" by Allen D. Allen, he cites a modification of the Monty Hall problem, but it doesn't make sense:
"The Horse Race
Here’s an example of how that can happen for the extended form of magic explained in this chapter. It uses a large group of gamblers who make a definitive, unambiguous decision as a whole by creating a runaway favorite for a horse race. Suppose as the horses head for the finish line, the favorite is in the lead, but another horse has broken out and is close to overtaking the favorite. The horse ridden by the jockey in blue is the favorite and is in the lead. The horse ridden by the jockey in orange has advanced to the “place” position (second position). At this instant in time, the horse in second position has the greatest chance of winning the race. The more horses there are in the race, the more the horse ridden by the jockey in orange is likely to win, as shown in Table I, above. In other words, because the horse ridden by the jockey in blue is the runaway favorite, that horse is the original bet, like the card that first gets the coin. Because the horses not shown have no chance of winning if the two shown are close enough to the finish line, the other horses are like the cards that have been turned over in Figure 6. (Of course, the two leading horses have to be far enough from the finish line for the horse in second place to have time to overtake the favorite, at least a few seconds.) Therefore, betting on the horse ridden by the jockey in orange at this point in time is like moving the coin. But there is a cautionary note here. The number of horses deemed to be in the race for the purpose of determining the number of choices (the first column on the left in Table I) must only include horses that could possibly win the race before the gate opens. Often, this is all the horses in the field, which is why the word “horse race” is usually deemed synonymous for the phrase “anything can happen.” On the other hand, experienced handicappers might consider some of the horses in a race incapable of winning. Unfortunately, you can’t place a bet at the window once the race has begun, much less seconds before the race has finished. But if you were at the track with a buddy who hasn’t read this book, then maybe he’d take your quick bet that the favorite isn’t going to win even though that colt or filly is in the lead only seconds before the race ends.
TLDR: He says that if you bet on a horse before the start of a race with 100 horses, it has a 1/100 chance; but when, close to the finish, we see this horse alongside one other, that other horse has a 99/100 chance, because the remaining horses are at the back (they are out of the race), while your chosen horse still has a 1/100 chance.
My understanding: he is wrong, and both horses are at 50/50. He misunderstood the Monty Hall problem, because there the showman is never going to open your door (meaning that if you bet on a horse, it will always be among the final 2), which does not apply to the horse race, because here your horse can fall out of contention.
Please help me, am I wrong???
I don't know how to get the exact probability. I posted this to r/statistics and it got deleted.
I was playing cards with my GF and her family. We were playing Liar, a game where all the cards are dealt to the players. There were 4 players. My hand was exactly one of each rank (random suits): 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K, A.
What are the odds of that happening?
I plugged the question into ChatGPT for help, but I think it's not smart enough yet.
To calculate the odds of being dealt exactly one of each rank (from Ace to King) when four players are dealt cards from a standard 52-card deck, we need to consider how the cards are distributed.
Total Cards and Distribution: There are 52 cards, which consist of 13 ranks (Ace, 2, ..., 10, Jack, Queen, King). Each player receives 13 cards.
Exact Distribution Condition: We want to find the probability that each player receives exactly one card of each rank.
Total Ways to Distribute Cards: The total ways to distribute 52 cards to four players (13 cards each) can be calculated using the multinomial coefficient:
52! / (13!)^4
Favorable Outcomes: To find the number of favorable distributions where each player gets exactly one of each rank, we can consider the ranks first. Each rank has 4 different suits. For each of the 13 ranks, we need to choose which suit goes to which player.
For each rank, there are (4!) (24) ways to assign the 4 suits to the 4 players. Therefore, for 13 ranks, the number of favorable distributions is:
(4!)^13 = 24^13
Calculating the Probability: Now, we can find the probability (P) of the favorable outcome:
P = (4!)^13 / (52! / (13!)^4) = 24^13 * (13!)^4 / 52!
Calculating Odds: The odds are typically expressed as the ratio of favorable outcomes to unfavorable outcomes:
Odds = P / (1 - P)
Given the complexity of directly computing these factorials, the specific numerical odds can be challenging to compute without a calculator or computational software. However, the steps above lay out the framework for determining the odds of being dealt exactly one of each card rank when four players are dealt from a full deck.
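Note that the framework above answers a different question: the chance that all four players simultaneously end up with one card of each rank. For the question as asked, one specific player's 13 cards containing each rank exactly once, the count is simpler: pick a suit for each of the 13 ranks and divide by the number of possible 13-card hands:

```python
from math import comb

# P(a given 13-card hand has exactly one card of each rank).
p = 4 ** 13 / comb(52, 13)
print(p, 1 / p)   # ~0.000106, about 1 in 9,460 deals
```

So a hand like that shows up roughly once in nine and a half thousand deals: genuinely rare, but not astronomical.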
So recently in a probabilistic systems analysis class we took a test, and the question worth the most points was weird. I got 0.95, and so did many others, but after the class the teacher went through the test and gave the answer as 0.91. I can't find anything wrong with either solution. When I asked the teacher, he said I must have not taken something into account (he was giving off a "figure it out yourself" vibe). So my problem is that I have no idea whether my solution is wrong, because it is so simple.
The problem:
One of two suspects (A, B) committed the crime. Before the blood evidence, the chances of each being guilty were equal (50/50). At the crime scene, the criminal's blood was found; its type occurs in only 10% of the population. Suspect A's blood is a match, and suspect B's is unknown. From this information, find the probability that A is the criminal.
Teacher's solution:
Say A means A is guilty, B means B is guilty, and C means that A's blood was a match
P(A∣C): the probability that A is the culprit given that there is a blood match.
P(C∣A): The probability of a blood match given that A is the culprit. = 1
P(A∣C)= P(C∣A)⋅P(A) / ( P(C∣A)⋅P(A)+P(C∣B)⋅P(B) ) = 1 * 0.5 / (1 * 0.5 + 0.1 * 0.5) = 0.90909...
I do not see anything wrong with this and it seems to be correct.
My solution:
Say A means A is guilty, and B means B's blood was a match
P(A|B̄): the probability of A being the criminal given that B's blood does not match = 1
P(A|B) = P(Ā|B): the probability of A being (or not being) the criminal given that B's blood does match = 0.5
P(B): the probability of B's blood matching = 0.1
P(A): the probability of A being the criminal
P(A) = P(A|B̄)⋅P(B̄) + P(A|B)⋅P(B) = 1 ⋅ 0.9 + 0.5 ⋅ 0.1 = 0.95
If B's blood does not match, A is guilty by default; that happens 90% of the time. If B's blood does match, we are back to square one and the chances are 50/50. This is so simple I can't see how it could be wrong.
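One way to arbitrate between 0.95 and 0.91 is to simulate the model exactly as stated and condition only on what was actually observed, namely that A's blood matches (B was never tested). A minimal sketch:

```python
import random

# Model: the guilty suspect is A or B with probability 1/2 each; the guilty
# suspect's blood always matches the scene; an innocent suspect's blood
# matches with probability 0.1. Condition on the observation: A matches.
matches = guilty_and_matches = 0
for _ in range(1_000_000):
    a_is_guilty = random.random() < 0.5
    a_match = True if a_is_guilty else (random.random() < 0.1)
    if a_match:
        matches += 1
        guilty_and_matches += a_is_guilty

print(guilty_and_matches / matches)   # ~0.909, the teacher's 10/11
```

The simulation lands on 10/11. The subtle issue in the 0.95 route is P(B) = 0.1: under the model, B's blood matches with certainty whenever B is guilty, so the unconditional match probability is 0.5 * 1 + 0.5 * 0.1 = 0.55, not 0.1, and in any case B's match status was never observed, so it isn't available to condition on.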