/r/probabilitytheory
For everything probability theory related! Whether you're in high school or a stochastic wizard: if you want to learn something about this subject, you're most welcome here.
Please keep topics of discussion related to probability theory. Interdisciplinary topics are fine, but if the focus is a different area, the post may be removed as off-topic. Titles should be a short summary; use the text box for the content of your question or topic.
Remain civil/polite in your conversations. If you think somebody is mistaken, focus on the content, not the user. Treat them as you might treat a student. Insults or snide remarks may result in removed comments or locked comment chains.
Post Flair Please use one of the following flairs for your post. Links will filter to that flair.
Filter out homework posts: Anti-homework
Regarding homework: When asking a homework question, please be sure to: (1) Clearly state the problem; (2) Describe/show what you have tried so far; (3) Describe where you are getting stuck or confused.
Some related subreddits:
To see what probability theory is, see the probability theory wiki. To understand the difference between statistics and probability theory, see this discussion.
Unfortunately, there is no universal LaTeX plugin for Reddit, so we have to make do with normal fonts.
Seeking Advice on Theoretical vs. Applied Probability Courses for Future Use in ML & AI
Dear Probability Community,
I’m a 25-year-old Computer Science student with a deep passion for math—or, more accurately, for truly understanding problems through a mathematical lens. Before diving into the world of Machine Learning, Deep Learning, and AI, I wanted to build a strong mathematical foundation. So, while on exchange, I enrolled in a Probability Theory course that’s heavily inspired by Rick Durrett’s Probability: Theory and Examples and approaches probability from a measure-theoretic perspective.
While the course is fascinating, it’s also incredibly challenging. I haven’t studied pure math in about two years, and this course is proving to be a tough re-entry. The theoretical focus is intense, with learning objectives including:
On the other hand, I recently found out that my home university’s Probability course uses Probability by Jim Pitman, which takes a more applied approach. This course emphasizes:
Pitman’s approach seems much more accessible and focused on applied techniques. It’s almost entirely without measure theory, which feels like a relief compared to Durrett’s heavily theoretical course. So, here’s my question:
Given that my long-term goal is to apply probability in areas like stochastic simulation, ML, and AI, does it make sense to push through the theoretical content in Durrett’s book, hoping it will make applied probability easier down the line? Or should I transition to the more applied approach in Pitman’s book to focus on techniques that may be more directly useful?
Thank you for any insights you can offer—I’m grateful for any advice or personal experiences you may have!
Like the title says: the issue I am seeing is someone missing a hit in a trading card box where the hit rate is usually around 1 in 3-4 cases, and they are on roughly case 21 without hitting even one. At what point is it fishy that he hasn't gotten the pull? Or at what point is it statistically implausible to miss that much?
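For what it's worth, if the cases are independent the chance of a dry streak like that is easy to compute directly. A minimal sketch, assuming (hypothetically) a hit rate of exactly 1 in 3.5 cases:

```python
# Chance of opening 21 straight cases with no hit, assuming a hypothetical
# hit rate of one per 3.5 cases (p = 1/3.5) and independent cases.
p_hit = 1 / 3.5
p_dry_streak = (1 - p_hit) ** 21
print(f"P(0 hits in 21 cases) = {p_dry_streak:.6f}")  # ~0.00085, roughly 1 in 1,200
```

It's never impossible, just increasingly unlikely; once a streak is a 1-in-1,000-plus event, it's reasonable to start questioning the assumed hit rate rather than the luck.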
So I have a variation on the previous thread here. Suppose I'm referring factory workers for interviews, and the company will hire any given one with probability P (all independent). Human Resources over there keeps track of how many of the last N referrals get hired, and I get a bonus if at least X of those previous N interviewees get hired. How many interviews should we expect to occur before I get that bonus?
e.g., suppose P=40%, bonus paid if 50 of the last 100 get hired. The binomial distribution can tell me the odds of that being the case for any given new group of 100 interviews - it's a straightforward calculation (odds X>=50 here is 2.71%). But here, we're preserving knowledge, a buffer of the last 100 interviewees, and keeping a running count of how many were hired. So while that last-100 ratio will average 40 (P*N), and will go up and down over time in a predictable distribution, at some point it will reach my bonus threshold of X. So, how many interviews should we expect to occur before that threshold is cleared?
I've been thinking about each incremental interview as essentially representing a new group of 100 (so our first test is against interviews 1-100, but the next consideration is against interviews 2-101, then 3-102, etc), except each set of 100 trials isn't independent - it is 99/100ths the same as the previous one. So I'm not sure how to properly account for the "trailing history" aspect of the scenario here. Any advice?
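One way to get a concrete number without untangling the dependence analytically is simulation. This is a sketch under the stated assumptions (independent Bernoulli(P) hires, bonus when at least X of the last N interviewees were hired); the function name and defaults are mine:

```python
import random
from collections import deque

def interviews_until_bonus(p=0.4, window=100, threshold=50, rng=None):
    """Simulate interviews until at least `threshold` of the last `window` are hires."""
    rng = rng or random.Random()
    recent = deque(maxlen=window)
    hires_in_window = 0
    n = 0
    while True:
        n += 1
        if len(recent) == window:
            hires_in_window -= recent[0]     # outcome about to fall out of the window
        hit = 1 if rng.random() < p else 0
        recent.append(hit)
        hires_in_window += hit
        if len(recent) == window and hires_in_window >= threshold:
            return n

rng = random.Random(1)
runs = [interviews_until_bonus(rng=rng) for _ in range(500)]
print("mean interviews until bonus ~", sum(runs) / len(runs))
```

Averaging many runs estimates the expected waiting time. The exact answer could be framed as a hitting time of a Markov chain on the window's state, but the state space (the full pattern of the last 100 outcomes) is huge, which is why simulation is the pragmatic route here.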
ok imagine you're playing roulette, or rolling a dice, or whatever you want. If you have 30% odds of hitting the winning outcome, and to break even [lets say you're playing for money] you need to hit the winning outcome X amount of times over Y rounds. Each round is an independent event as well. for simplicity, let's assume that you need to hit 50 times over 200 rounds.
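For numbers like these, the break-even chance is just a binomial tail. A small sketch (the helper name is mine):

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# At least 50 wins in 200 independent rounds at 30% each.
print(binom_tail(200, 50, 0.3))  # ~0.95
```

Since the expected number of wins is 200 × 0.3 = 60, needing only 50 puts you comfortably below the mean, which is why the probability comes out high.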
I was playing around with making up a new card game while doing some worldbuilding and it dawned on me that I'm not sure how to figure out the order of winning hands. I ran 20 rounds of test hands and realized I'd need a lot more data, and then remembered math and people who are good at it exist. Also if there's a better sub for this, please let me know!
The game uses a deck of 40 cards, each suit has an ace through 10.
The dealer deals 2 cards to each of 4 players, then flips a card face up in the middle. The players have a chance to swap out one card before hands are shown and a winner declared.
The win conditions in order are:
Triple - both cards in the player's hand match the center card
Pair - the player holds two of the same cards
Match - one card in the player's hand matches the center card
10 in All - The player's hand plus the center card totals 10 (so like they hold a 2 & 5 and the center card is a 3)
Totals 10 - the two cards in the player's hand add to 10 (so like 8 & 2 or 7 & 3)
Holds 10 - the player has a 10 in their hand
Does that hold up to being the order of rarest combination to least rare? And also does this game already exist and I spent an hour dealing cards to myself for nothing lol?
Thank you so much for any light you can shed!
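A quick Monte Carlo sketch can estimate how often each condition shows up for a single player before the swap. Function names are mine; suits are ignored since none of the conditions depend on them, and overlapping conditions are counted independently rather than by the priority order:

```python
import random

def estimate_hand_frequencies(trials=200_000, seed=0):
    """Estimate how often each win condition occurs for one player's
    two cards plus the face-up center card (before any swap)."""
    rng = random.Random(seed)
    deck = [r for r in range(1, 11) for _ in range(4)]   # 40 cards; suits ignored
    counts = dict.fromkeys(
        ["triple", "pair", "match", "10_in_all", "totals_10", "holds_10"], 0)
    for _ in range(trials):
        a, b, center = rng.sample(deck, 3)
        if a == center and b == center:
            counts["triple"] += 1
        if a == b:
            counts["pair"] += 1
        if a == center or b == center:
            counts["match"] += 1
        if a + b + center == 10:
            counts["10_in_all"] += 1
        if a + b == 10:
            counts["totals_10"] += 1
        if a == 10 or b == 10:
            counts["holds_10"] += 1
    return {k: v / trials for k, v in counts.items()}

for name, p in sorted(estimate_hand_frequencies().items(), key=lambda kv: kv[1]):
    print(f"{name:10s} {p:.4f}")
```

Running this is a quick way to check whether "10 in All" really sits where you placed it relative to Pair and Match. The swap mechanic and four competing players would shift the effective numbers further, so treat these single-hand frequencies as a starting point rather than the final ranking.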
My friend has seen Justin Tucker miss 4 of the 9 field goal attempts he's seen in person. Justin Tucker has only missed 47 of 455 attempts in his career. What is the probability of someone seeing that many misses in so few attempts?
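A hedged back-of-envelope, treating each attempt as independent with his career miss rate (which ignores kick distance, weather, and when in his career the attempts happened, so it is only a rough model):

```python
from math import comb

p = 47 / 455   # career miss rate, ~10.3%
n, k = 9, 4    # 9 attempts witnessed, 4 misses
p_at_least_4 = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))
print(f"P(>= 4 misses in 9 attempts) = {p_at_least_4:.4f}")  # ~0.009, roughly 1 in 100
```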
In the book "How to get Lucky" By Allen D. Allen, he cites a modification of the Monty Hall problem, but it doesn't make sense :
"The Horse Race
Here’s an example of how that can happen for the extended form of magic explained in this chapter. It uses a large group of gamblers who make a definitive, unambiguous decision as a whole by creating a runaway favorite for a horse race. Suppose as the horses head for the finish line, the favorite is in the lead, but another horse has broken out and is close to overtaking the favorite. A horse race as a candidate for the use of real magic. The horse in the lead is the favorite. The horses are nearing the finish line. The horse ridden by the jockey in blue is the favorite and is in the lead. The horse ridden by the jockey in orange has advanced to the “place” position (second position). At this instant in time, the horse in second position has the greatest chance of winning the race. The more horses there are in the race, the more the horse ridden by the jockey in orange is likely to win, as shown in Table I, above. In other words, because the horse ridden by the jockey in blue is the runaway favorite, that horse is the original bet, like the card that first gets the coin. Because the horses not shown have no chance of winning if the two shown are close enough to the finish line, the other horses are like the cards that have been turned over in Figure 6. (Of course, the two leading horses have to be far enough from the finish line for the horse in second place to have time to overtake the favorite, at least a few seconds.) Therefore, betting on the horse ridden by the jockey in orange at this point in time is like moving the coin. But there is a cautionary note here. The number of horses deemed to be in the race for the purpose of determining the number of choices (the first column on the left in Table I) must only include horses that could possibly win the race before the gate opens. 
Often, this is all the horses in the field, which is why the word “horse race” is usually deemed synonymous for the phrase “anything can happen.” On the other hand, experienced handicappers might consider some of the horses in a race incapable of winning. Unfortunately, you can’t place a bet at the window once the race has begun, much less seconds before the race has finished. But if you were at the track with a buddy who hasn’t read this book, then maybe he’d take your quick bet that the favorite isn’t going to win even though that colt or filly is in the lead only seconds before the race ends."
TLDR: He says that if you bet on a horse before the start of a race with 100 horses, it has a 1/100 chance; but when, close to the finish, we see this horse alongside one other, that other horse has a 99/100 chance, because the remaining horses are at the back (out of the race), while your chosen horse still has only a 1/100 chance.
My understanding: He is wrong; both horses have a 50/50 chance. He misunderstood the Monty Hall problem: there, the host is never going to open your door (meaning that if you bet on a horse now, it will always be among the final 2), which does not apply to the horse race, because here your horse can fall out of contention.
Please help me, am I wrong???
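One way to see the difference is to simulate both situations: a host who knowingly eliminates losers (Monty Hall) versus losers being eliminated at random, as with horses simply falling behind. A sketch with hypothetical numbers (100 doors, 98 eliminated):

```python
import random

def monty_hall(informed_host, trials=100_000, doors=100, seed=0):
    """Win rate when you always switch to the single remaining door.
    informed_host=True:  the host knowingly opens doors-2 losing doors (Monty Hall).
    informed_host=False: doors-2 doors are eliminated at random; runs where the
    prize gets eliminated are discarded (horses falling behind by chance)."""
    rng = random.Random(seed)
    wins = valid = 0
    for _ in range(trials):
        prize = rng.randrange(doors)
        pick = rng.randrange(doors)
        if informed_host:
            # Host leaves your pick plus one other door, never removing the prize.
            other = prize if prize != pick else rng.choice(
                [d for d in range(doors) if d != pick])
            valid += 1
            wins += (other == prize)
        else:
            # Random elimination keeps your pick and one uniformly chosen other door.
            other = rng.choice([d for d in range(doors) if d != pick])
            if prize in (pick, other):       # the prize survived by luck
                valid += 1
                wins += (other == prize)
    return wins / valid

print("switch, informed host:  ", monty_hall(True))    # ~0.99
print("switch, random survival:", monty_hall(False))   # ~0.50
```

The informed host transfers probability to the remaining door; random survival does not, which matches your intuition that the book's analogy breaks down.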
I don't know how to get the exact probability. I posted this to r/statistics and it got deleted.
I was playing cards with my GF and her family. We were playing Liar, a game where all the cards were dealt to the players. There were 4 players. My hand was exactly one of each card (random suits). 2,3,4,5,6,7,8,9,J,Q,K,A
What's the odds of that happening?
I plugged the question into Chat GPT to help but I think it's not smart enough yet.
To calculate the odds of being dealt exactly one of each rank (from Ace to King) when four players are dealt cards from a standard 52-card deck, we need to consider how the cards are distributed.
Total Cards and Distribution: There are 52 cards, which consist of 13 ranks (Ace, 2, ..., 10, Jack, Queen, King). Each player receives 13 cards.
Exact Distribution Condition: We want to find the probability that each player receives exactly one card of each rank.
Total Ways to Distribute Cards: The total ways to distribute 52 cards to four players (13 cards each) can be calculated using the multinomial coefficient:
52! / (13!)^4
Favorable Outcomes: To find the number of favorable distributions where each player gets exactly one of each rank, we can consider the ranks first. Each rank has 4 different suits. For each of the 13 ranks, we need to choose which suit goes to which player.
For each rank, there are 4! = 24 ways to assign the 4 suits to the 4 players. Therefore, across the 13 ranks, the number of favorable distributions is:
(4!)^13 = 24^13
Calculating the Probability: Now, we can find the probability (P) of the favorable outcome:
P = (4!)^13 / (52! / (13!)^4) = 24^13 * (13!)^4 / 52!
Calculating Odds: The odds are typically expressed as the ratio of favorable outcomes to unfavorable outcomes:
Odds = P / (1 - P)
Given the complexity of directly computing these factorials, the specific numerical odds can be challenging to compute without a calculator or computational software. However, the steps above lay out the framework for determining the odds of being dealt exactly one of each card rank when four players are dealt from a full deck.
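For what it's worth, the framework above counts the event that all four players simultaneously get one of each rank; if the question is about one specific player's hand (as in the story), the count is simpler: 4^13 rank-complete hands out of C(52, 13) possible hands. A sketch of both numbers:

```python
from math import comb, factorial

# One specific 13-card hand containing exactly one of each rank:
p_one_hand = 4**13 / comb(52, 13)
print(f"one hand:  {p_one_hand:.3e}  (about 1 in {1 / p_one_hand:,.0f})")

# All four players each getting one of each rank (the count sketched above):
p_all_four = 24**13 * factorial(13)**4 / factorial(52)
print(f"all four:  {p_all_four:.3e}")
```

So a single rank-complete hand is roughly a 1-in-9,500 event: rare, but very much the kind of thing that happens at real card tables.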
So recently in a probabilistic systems analysis class we took a test, and the question worth the most points was weird. I got 0.95, as did many others, but after the class the teacher went through the test and gave the answer of 0.91. I can't find anything wrong with either solution. When I asked the teacher, he said I must not have taken something into account (he was giving off a "figure it out yourself" vibe). So my problem is that I have no idea whether my solution is wrong, because it is so simple.
The problem:
1 of 2 suspects (A, B) admitted to their crimes. Before admitting, the chances of them being found innocent was equal (50, 50). On the crime site the blood of the criminal was found. The blood type is only found in 10% of the population. Suspect A was a match and suspect B is unknown. From this information find the chance of A being the criminal.
Teachers solution:
Say A means A is guilty, B means B is guilty, and C means that A's blood was a match
P(A∣C): the probability that A is the culprit given that there is a blood match.
P(C∣A): The probability of a blood match given that A is the culprit. = 1
P(A∣C)= P(C∣A)⋅P(A) / ( P(C∣A)⋅P(A)+P(C∣B)⋅P(B) ) = 1 * 0.5 / (1 * 0.5 + 0.1 * 0.5) = 0.90909...
I do not see anything wrong with this and it seems to be correct.
My solution:
Say A means A is guilty, B means B's blood was a match
P(A∣B^): The probability of A being the criminal given that B's blood does not match. = 1
P(A|B) = P(A^|B): The probability of A (not) being the criminal given that B's blood does match. = 0.5
P(B) = The probability of B's blood matching. = 0.1
P(A) = the probability of A being the criminal
p(A) = P(A∣B^)⋅P(B^) + P(A∣B)⋅P(B) = 1 * 0.9 + 0.5 * 0.1 = 0.95
If B's blood does not match A is guilty by default. It happens 90% of the time. If B's blood does match we are back to square one and the chances are 50, 50. This is so simple I can't see any way it could be wrong.
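When two clean-looking arguments disagree, a simulation can arbitrate. This is a sketch of the scenario as stated (guilt 50/50 a priori; the criminal's blood is at the scene, so the guilty suspect always matches; an innocent suspect matches at the 10% population rate), conditioning on the one thing that was observed, namely that A matched:

```python
import random

def estimate(trials=400_000, seed=0):
    """Estimate P(A guilty | A's blood matches) by Monte Carlo."""
    rng = random.Random(seed)
    a_matches = a_guilty_and_matches = 0
    for _ in range(trials):
        a_is_guilty = rng.random() < 0.5
        # The criminal's blood is at the scene, so the criminal always matches;
        # an innocent suspect matches only at the 10% population rate.
        a_match = True if a_is_guilty else rng.random() < 0.1
        if a_match:
            a_matches += 1
            a_guilty_and_matches += a_is_guilty
    return a_guilty_and_matches / a_matches

print(estimate())
```

The simulation conditions only on the observed event (A matched), which is the safest way to keep the two solutions honest: any probability used for B's match should be consistent with that same conditioning.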
Suppose you have a standard deck of 52 playing cards. What is the probability of making a full house if you get to draw 7 of those cards (without replacement)? How much do your odds improve if you get to draw an 8th card?
Can this problem be approached by hand or would someone need to write a computer program to run a simulation to solve it? Thanks!
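It can be done by hand with inclusion-exclusion over rank patterns, but a Monte Carlo answer is quick to sanity-check against. A sketch (counting any draw that contains three of one rank plus a pair of another; the function name is mine):

```python
import random
from collections import Counter

def full_house_prob(cards_drawn, trials=200_000, seed=0):
    """Estimate P(the drawn cards contain a full house): at least one rank
    with 3+ cards and a different rank with 2+ cards."""
    rng = random.Random(seed)
    deck = [r for r in range(13) for _ in range(4)]   # 52 cards, suits irrelevant
    hits = 0
    for _ in range(trials):
        counts = sorted(Counter(rng.sample(deck, cards_drawn)).values(), reverse=True)
        if counts[0] >= 3 and len(counts) > 1 and counts[1] >= 2:
            hits += 1
    return hits / trials

print("7 cards:", full_house_prob(7))
print("8 cards:", full_house_prob(8))
```

For exact values you would enumerate the rank-multiplicity patterns of 7 (or 8) cards and sum the multinomial counts; the simulation is mainly useful for checking that arithmetic.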
Assume I am given a red button which, when pressed, destroys the world.
If I am confident that I would never press the button, then does the probability of that outcome exist?
Assume that there is no probability of accidentally pressing the button, and that my mental state will not change.
Basically my question can be simplified as: If there are two outcomes but one outcome is impossible, does the probability exist?
My knowledge on the subject is very limited, I was just curious to know the objective answer.
I have a stochastic processes exam coming up in a week. I feel mostly fine about the mechanics of solving problems (using PDFs, conditional probability etc.) but what I struggle with is defining my random variables to start. I have a hard time reading a question and converting it into a probability model. Do you have any recommendations for videos or textbooks to practice this?
Assume there are n + m balls of which n are red and m are blue. Arrange the balls in a row randomly. What is the probability of getting a particular sequence of colors?
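Since every arrangement of the n + m balls is equally likely and balls of the same color are interchangeable, a particular color sequence has probability n!·m!/(n+m)! = 1/C(n+m, n). A small brute-force check of that formula:

```python
from itertools import permutations
from math import comb, factorial

def sequence_prob(n, m):
    """Probability of one particular color sequence with n red and m blue balls."""
    return 1 / comb(n + m, n)

# Brute-force check for n=2, m=3: treat the balls as labelled and count the
# permutations whose color pattern is exactly R R B B B.
balls = ["R1", "R2", "B1", "B2", "B3"]
target = ("R", "R", "B", "B", "B")
hits = sum(1 for p in permutations(balls) if tuple(x[0] for x in p) == target)
print(hits / factorial(5), sequence_prob(2, 3))  # both 0.1
```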
They certainly look log-normal to me, but how would I test to be sure just based on these PDFs, also is it possible this is some other distribution like a gamma distribution? If someone can give me testing tips in Excel or Python I would appreciate it, so far I tried to sum the PDFs into CDFs in Excel and then test the log values for normality but either I'm doing something wrong or these are not log-normal
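In Python the standard route is scipy.stats (e.g. lognorm.fit plus kstest, or a probplot of the log-values), but the core trick needs only the standard library: if log(x) looks normal, x is plausibly log-normal. A stdlib-only sketch using sample skewness as a rough normality check, with synthetic data standing in for yours:

```python
import math
import random

def skewness(xs):
    """Sample skewness; near 0 for normally distributed data."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return sum(((x - mean) / sd) ** 3 for x in xs) / n

rng = random.Random(0)
data = [rng.lognormvariate(0.0, 0.5) for _ in range(20_000)]  # stand-in sample

print("skewness of raw data:", round(skewness(data), 2))   # clearly positive
print("skewness of log data:",
      round(skewness([math.log(x) for x in data]), 2))     # near 0 if log-normal
```

A gamma with a large shape parameter can look very similar; comparing the log-likelihoods of a fitted log-normal versus a fitted gamma (scipy's fit methods) is the usual way to separate them. Also note that if you only have the PDFs rather than raw samples, it's cleaner to fit the candidate PDF curves directly than to rebuild CDFs in Excel.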
Consider the following idealized Field Sobriety Metrics: There are three examinations. Each consists of eight tests. A failure of two tests indicates a failure of the examination. Experimentally it has been established that a subject will fail an examination if and only if he or she has a blood alcohol concentration of 0.1% or greater, 65% of the time. That is to say (I think): There is a 65% probability that any individual test is accurate in this sense.
Given this as fact, what is the reliability of all three tests put together? To be more specific, consider three questions: what is the probability of a subject failing exactly one, two, or three of three examinations if and only if he or she has a BAC of 0.1%?
This is not a fully accurate representation of the field sobriety metrics in use today, just to be clear. This is not a homework question.
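Under that reading (each examination independently "correct" with probability 0.65 for an impaired subject), the counts are plain binomial. A sketch:

```python
from math import comb

p = 0.65   # assumed chance one examination correctly flags a subject at/over 0.10% BAC
for k in (1, 2, 3):
    print(f"P(fails exactly {k} of 3) = {comb(3, k) * p**k * (1 - p)**(3 - k):.4f}")
print(f"P(fails at least 2 of 3)  = {3 * p**2 * (1 - p) + p**3:.4f}")
```

By this model an impaired subject fails at least two of the three examinations about 72% of the time, so the battery is meaningfully more reliable than any single examination.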
Working on this problem from the "50 challenging problems is prob and stats..", I understand why the right answer is right, but don't understand why mine is wrong. My initial approach was to consider three cases:
Instead of thinking about the number of ways, as the textbook does, I just thought of it in terms of the probability of each event: on any given die, I have a 5/6 chance of it not being the number I guessed and a 1/6 chance of it being the number I guessed. So shouldn't zero matching dice show up with probability (5/6)^3? And similarly, one matching die would be 5^2/6^3 (2 different and 1 the same as what I guessed)? And then 5/6^3 and 1/6^3 for the others. Then I would weight all of these relative to the initial stake, so I'd end up with something like (-x)(5/6)^3 + (x)*5^2/6^3 + (2x)*5/6^3 + (3x)*1/6^3?
(Actual answer is ~ .079)
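One way to check a computation like this is to enumerate all 6^3 = 216 equally likely outcomes, which automatically includes the arrangement counts (for instance, exactly one match occurs 3 × 25 = 75 ways, not 25). A sketch, assuming the usual chuck-a-luck payout of k stakes for k matches and a lost stake on zero matches:

```python
from itertools import product
from fractions import Fraction

guess = 1   # by symmetry, which face you guess doesn't matter
ev = Fraction(0)
for roll in product(range(1, 7), repeat=3):
    k = roll.count(guess)
    payoff = k if k > 0 else -1          # win k stakes, or lose the stake
    ev += Fraction(payoff, 6**3)

print(ev, float(ev))   # -17/216, about -0.0787
```

That matches the book's ≈ .079 expected loss per unit staked.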
A friend of mine and I have been arguing over a probability question for a long time, and I would like some opinion of people more educated than us. We both live in the south, and if there is one thing southerners like, it is sweet tea. The question is as follows: throughout all of history, is it probable that there were 2 instances in which the same number of sugar grains was added to a pitcher for sweet tea? He argues that because there are too many variables, such as different cups of sugar per recipe, people who eyeball the measurements, and differences in grain size, it has never happened. I argue that when taking into account the sheer number of instances where sweet tea has been made, including for restaurants and home consumption, and the mere fact that most people DO measure sugar, it has definitely happened. I know there is probably a formula including average grains per cup and such, but what do yall think?
After listening to a discussion about life and how lucky we are to even exist, I wondered what the exact probability of our existence was. The following was quite shocking so I thought I'd share it with you.
Here are the odds of you even existing: the probability of your existence is 1 in 10^2,685,000, i.e. a 1 followed by almost 2.7 million zeros. Your existence has required the unbroken stretch of survival and reproduction of all your ancestors, reaching back 4 billion years to single-celled organisms. It requires your parents meeting and reproducing to create your singular set of genes (the odds of that alone are 1 in 400 quadrillion). That probability is the same as if you handed out 2 million dice, each die with one trillion sides… then rolled those 2 million dice and had them all land on 439,505,270,846. https://www.sciencealert.com/what-is-the-likelihood-that-you-exist
A hobby of mine involves rolling dice and it got me thinking about certain probabilities: specifically, is there a way to generalize the probability of a specific numerical ordering of T distinct, n-sided dice? For example, let's say I had a collection of red, orange, yellow, green, blue, indigo, and violet dice. Each die has 30 sides (i.e. numbers 1 to 30) and each value has a 1/30 chance of being rolled (i.e. the dice are fair). Also, each die has a "bonus" to its roll: red +6, orange +5, ..., violet +0. What's the probability that, if you arranged the results from highest to lowest, the order is roygbiv? Let's also assume that the color ordering in the rainbow breaks ties (i.e. if red and orange tied, red comes before orange).
I'm trying to come up with a closed form analytic solution for an arbitrary number of dice and an arbitrary number of sides. The two dice case is straightforward. But I can't wrap my head around a generalized case.
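I don't know a tidy closed form for the general case either (it amounts to an order-statistics sum over shifted discrete uniforms), but a Monte Carlo sketch makes it easy to check any candidate formula. Constants follow the example: 30 sides, bonuses +6 down to +0:

```python
import random

COLORS = ["red", "orange", "yellow", "green", "blue", "indigo", "violet"]
BONUS = dict(zip(COLORS, range(6, -1, -1)))   # red +6, orange +5, ..., violet +0

def prob_roygbiv(trials=200_000, sides=30, seed=0):
    """Estimate P(the totals, sorted high to low, come out in rainbow order),
    with ties broken in favour of the earlier rainbow color."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Key (-total, rainbow position): sorting ascending puts the highest
        # total first, with ties going to the earlier color.
        keys = [(-(rng.randint(1, sides) + BONUS[c]), i)
                for i, c in enumerate(COLORS)]
        hits += (sorted(keys) == keys)   # already in rainbow order?
    return hits / trials

print(prob_roygbiv())
```

For an exact answer with small T and n, you can also sum over all n^T outcomes directly and then look for a pattern across cases.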
Assuming you have to take a written exam, having a sheet of paper available, what is the probability that a pen writing ink randomly on the sheet will find the right combination of where to place the ink and find the solution to the exam (assuming it is unique )? It's a totally unnecessary problem but I was wondering if it was a possible thing to determine given the large number of factors to take into consideration.
Just for fun, I was wondering what the probability of my boyfriend and I meeting are. Here are the variables that make it interesting.
He (M) and I (M) met online playing Valorant while I was in GA for a once in a lifetime training event for a few months. We played one game together for 8 minutes. We were on GA servers, which is strange because if I wasn’t there I’d never be on GA servers, and he shouldn’t have been because he lives in PA, much closer to VA servers. After the one game, we ended up becoming friends and finding out that we lived 30 minutes away from each other in PA.
With all these variables, plus the fact that I hadn’t played the game in months, and he stopped playing the game right after (both incalculable probably), I was just curious if someone knew what the math would be for the chances of us meeting under those circumstances, both liking boys, being around the same age, being compatible, living so close together and then actually dating. Thank you in advance just for reading!
We have X is uniformly distributed from 0 to 1.
Y = 2X if 0 < X < 0.5
Y = 2X - 1 if 0.5 < X < 1
Given that X is between 0 and 0.5, what is P(Y < 1/2)?
Was asked this question in the interview for quant role. Please provide an approach and answer. Thanks
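The branch condition makes this direct: given 0 < X < 0.5 we're on the Y = 2X branch, and Y < 1/2 iff X < 1/4, which is half of the conditioning interval, so the answer is 1/2. A quick Monte Carlo check:

```python
import random

rng = random.Random(0)
hits = total = 0
for _ in range(200_000):
    x = rng.random()
    if 0 < x < 0.5:          # condition on X in (0, 0.5)
        total += 1
        y = 2 * x            # Y = 2X on this branch
        hits += (y < 0.5)
print(hits / total)          # ~0.5
```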
So I have been stuck on this idea for long. I want to estimate any probability of real life events. But when it comes to probability theory , I find that even if I try to calculate it using formulas I still end up with nothing.
For example, I wanted to calculate the probability that your partner, who you married, is cheating on you. This is the "general" probability your partner is cheating. Psychology Today cited a study saying that 4% of partners eventually cheat. So this is the probability I want to estimate.
Looking on the internet I find that low self esteem is a cause for cheating. They cite that 77% of people who cheated said they have low self esteem. (I understood that using probability you can calculate the probability of an effect using the probability of a cause, but I dont understand it well).
So we get from a study that p(low self esteem | cheating) = 0.77
Then , p(low self esteem) = 0.85 (for any person, again from a study).
Now let's apply Bayes Theorem (which is used to update beliefs as I understand, but here we dont update anything it's just basic conditional probability).
I need p(cheating).
p(cheating) = p(cheating | low self esteem) * p(low self esteem) / p(low self esteem | cheating)
, and we put in the numbers and we get
p(cheating) = (0.85/0.77) * p(cheating | low self esteem)
Now did I discover something new from this calculation? I didn't get p(cheating) , it is dependent on p(cheating | low self esteem). Now calculating that is even harder.
What is probability theory useful for? I still can't calculate this stuff. How would you even do that with probability theory? How can I get an estimate close to 4% without guessing p(cheating | low self esteem)? I don't want anything subjective; I want it to be as close to 4% as possible (think back-of-envelope calculations or Fermi estimation, but better, using probability theory).
Probability theory is weak , it's just ~6 formulas, what can I even do with it??? Look here.
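For what it's worth, the algebra in the thread is self-consistent but circular, which a few lines make plain (the numbers are the cited ones, plus the 4% figure itself):

```python
p_c = 0.04           # cited base rate of cheating (the target figure)
p_l_given_c = 0.77   # cited p(low self esteem | cheating)
p_l = 0.85           # cited p(low self esteem)

# Bayes one way: the only new quantity those inputs actually determine.
p_c_given_l = p_l_given_c * p_c / p_l
print(f"p(cheating | low self esteem) = {p_c_given_l:.4f}")

# And back again: the rearrangement in the post just returns the input.
p_c_recovered = p_c_given_l * p_l / p_l_given_c
print(f"recovered p(cheating) = {p_c_recovered:.4f}")
```

Bayes' theorem converts one conditional probability into another; it can't conjure p(cheating) out of p(low self esteem | cheating) and p(low self esteem) alone. You always need one more independent number, which is why base rates come from surveys rather than from the formulas.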
Hey guys, I am a total noob when it comes to probability theory, but I saw it can help you in some games. I just wanted to know if the same is true for Chess or Go?
Let's say there are 5 cards, 1 Ace and 4 Kings. The cards are shuffled and placed face down, next to each other from left to right. My objective is to select the Ace. As far as I know I have a 1 in 5 chance of selecting the Ace?
Now let's say there are successive rounds where the above is simply repeated over and over.
To maximise my probability of selecting as many Aces as possible, is it in my best interest to:
A) always select the facedown card in position X (where X can be position 1-5)
B) always select a card at random. For argument's sake, let's say we use a random number generator, because from what I understand humans are biased and bad at randomising
C) use some sort of algorithm to determine which card (position 1-5) to select or not select
Thanks!
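Since each round is a fresh uniform shuffle, every strategy that picks one card wins with probability exactly 1/5; no algorithm can do better without extra information about the shuffle. A quick simulation comparing (A) and (B):

```python
import random

def play(strategy, rounds=100_000, seed=0):
    """Fraction of rounds where the chosen position holds the Ace
    (5 cards, 1 Ace, freshly shuffled each round)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(rounds):
        ace_pos = rng.randrange(5)   # a fresh shuffle puts the Ace anywhere uniformly
        wins += (strategy(rng) == ace_pos)
    return wins / rounds

print("A) always position 1: ", play(lambda rng: 0))
print("B) uniform random pick:", play(lambda rng: rng.randrange(5)))
# Both come out ~0.20: with independent shuffles, every strategy wins 1/5 of the time.
```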
Hi everybody, I'm taking a class in measure theoretic probability and I started reading Billingsley's "Probability and Measure". I really like the approach of the book, but I've noticed that it deals mostly with R as the codomain of the measurable functions, even when the result is more general. I was wondering if there's any book with the same rigor, deeply inspired by a measure theoretic approach, which in your opinion is better than Billingsley's for studying theorems in their full generality. Thank you for any answer.
If this question doesn't belong here, PLEASE let me know and I will delete it. Not sure where else to post it.
Ran into a new "shake of the day" variant at a bar I visited over the weekend. It starts with a very large cup and in it are (12) standard size dice, (1) large red die and (1) large green die. Large being maybe 1" x 1".
For your first flop, you roll all (14) dice. Whatever the red die ends up being is the number you're shooting for and whatever the green die ends up being is how many rolls you get to get all (12) of the smaller dice to show what's on the red one. Obviously, the red die doesn't really matter because whatever shows is totally random and you want the green die to be a six. Also, after the first flop, if any of the small dice match the red die, they stay out of the cup and count as one (or some) of the twelve.
There were seven of us in the group and we each played 3 times, and none of us was able to get the 12 small dice to match the red die. (The best we did was needing a three of a kind on the final flop.)
SO, the question is...........
What is the probability of getting 12 dice to show the same number when you get 6 shakes to do it when you can pull the matching numbers after each shake?
And really, if you count the first shake with all (14) dice and a few of those match the red die, a person would get seven shakes.
Just curious as I am stumped as to what the odds might be.
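Because matching dice are locked in after each shake, each small die independently just needs to show the target at least once in its k available shakes, which gives a closed form rather than needing simulation: P = (1 - (5/6)^k)^12. A sketch for the two cases you describe:

```python
def p_success(shakes):
    """P(all 12 small dice end up showing the target), when matching dice
    are locked in and the rest are re-rolled, with `shakes` total rolls."""
    return (1 - (5 / 6) ** shakes) ** 12

print(f"6 shakes: {p_success(6):.5f}")   # ~0.0075, about 1 in 133
print(f"7 shakes: {p_success(7):.5f}")   # ~0.0197, about 1 in 51 (counting the first flop)
```

Averaging (1 - (5/6)^g)^12 over the green die's value g = 1..6 would give the unconditional chance for a single play; with 21 plays in your group, seeing no winner is not surprising at all.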