Open Side Menu Go to the Top
Register
Science Thread (now with 100% less religion) Science Thread (now with 100% less religion)

04-21-2021 , 01:37 PM
Not sure we have any Londoners here but

Science Thread (now with 100% less religion) Quote
04-24-2021 , 09:17 AM
ECD, this is one for you. Disclaimer: although I know the answer, I did not solve this when I tried, and I can't remember how the solution is proved.

You are interviewing 100 candidates for a job. You are measuring them on some quantifiable, normally distributed metric, like, words per minute, or height, or whatever. You can either discard a candidate, or hire him or her on the spot. Once you discard a candidate, you can't go back to that candidate, they are removed from the game. If you choose to hire a candidate, the game ends. What is the strategy to have the best odds of hiring the "best" candidate?
Science Thread (now with 100% less religion) Quote
04-24-2021 , 11:17 AM
I know the answer so I'll let others try.

A small hint that doesn't really need spoiler tags, figure out what you would do if there were only 3 applicants, then see how you would modify that as you add more people applying.
Science Thread (now with 100% less religion) Quote
04-24-2021 , 12:49 PM
d2 are we assuming these candidates have any knowledge of others in the 'line' of candidates or 'zero' knowledge?
Science Thread (now with 100% less religion) Quote
04-24-2021 , 01:28 PM
Quote:
Originally Posted by d2_e4
ECD, this is one for you. Disclaimer: although I know the answer, I did not solve this when I tried, and I can't remember how the solution is proved.

You are interviewing 100 candidates for a job. You are measuring them on some quantifiable, normally distributed metric, like, words per minute, or height, or whatever. You can either discard a candidate, or hire him or her on the spot. Once you discard a candidate, you can't go back to that candidate, they are removed from the game. If you choose to hire a candidate, the game ends. What is the strategy to have the best odds of hiring the "best" candidate?

The neat thing about this problem is that your chances of success barely depends on the number of candidates you are interviewing.

To do it perfectly takes calculus and involves the number e. But to get close only involves fractions.

Obviously the strategy is of the form "reject the first x candidates and pick the first one in the second group who surpasses all in the first group."

Notice that you can't win if the best is in the first group and that you must win if it is in the second group while the second best is in the first group. If the third best is the highest rated of the first group you are even money.
Science Thread (now with 100% less religion) Quote
04-24-2021 , 02:01 PM
Quote:
Originally Posted by Cuepee
d2 are we assuming these candidates have any knowledge of others in the 'line' of candidates or 'zero' knowledge?
It doesn't really matter, they can't affect the outcome. As I said, you might as well be measuring height or whatever. The puzzle is basically shorthand for "you have a normally distributed variable and x items, what is your optimal strategy for picking the 'best' item if you are only allowed to examine an item and then discard it, or choose it".
Science Thread (now with 100% less religion) Quote
04-24-2021 , 02:02 PM
Quote:
Originally Posted by David Sklansky
The neat thing about this problem is that your chances of success barely depends on the number of candidates you are interviewing.

To do it perfectly takes calculus and involves the number e. But to get close only involves fractions.

Obviously the strategy is of the form "reject the first x candidates and pick the first one in the second group who surpasses all in the first group."

Notice that you can't win if the best is in the first group and that you must win if it is in the second group while the second best is in the first group. If the third best is the highest rated of the first group you are even money.
David, do you know if the answer you and I are thinking of works for any distributions other than normal? Would it work for a uniform distribution?
Science Thread (now with 100% less religion) Quote
04-24-2021 , 03:45 PM
i thought the secretary problem was a very well known thing? remember studying it in high school stats
Science Thread (now with 100% less religion) Quote
04-24-2021 , 04:47 PM
So is David's answer correct?

which would mean you are less than 50/50 to get the right answer as the person would have an equal chance of being in either half, and then there is still a chance that despite him being 'taller' than everyone in A group there is still a chance someone in B group is taller, even if not a great chance???

and i understand the question was just asking the best way to get the 'best odds'.
Science Thread (now with 100% less religion) Quote
04-24-2021 , 05:11 PM
Quote:
Originally Posted by Cuepee
So is David's answer correct?

which would mean you are less than 50/50 to get the right answer as the person would have an equal chance of being in either half, and then there is still a chance that despite him being 'taller' than everyone in A group there is still a chance someone in B group is taller, even if not a great chance???

and i understand the question was just asking the best way to get the 'best odds'.
David gave a hint, not an answer, and it is correct, yes. The question is asking the best strategy, that's right. IIRC, working out the probability itself is quite hard but it does end up less than 50%. It's also not dependent on n (the number of candidates), as David notes.

There is a wikipedia article for this problem:

https://en.wikipedia.org/wiki/Secretary_problem
Science Thread (now with 100% less religion) Quote
04-24-2021 , 07:48 PM
The probability of getting the best candidate is dependent on the number of candidates, it's just the answer converges quickly. Not a huge difference between getting the best out of 10 or the best out of 100,1000 etc.
Science Thread (now with 100% less religion) Quote
04-24-2021 , 07:59 PM
Ya it becomes intuitively clear, after reading David's post that the 50% +1 strategy is the best. That is where that 1 is the first person past the 50% who is 'taller' (if we use that as X).

You basically divide the two sections into a coin flip. So the X has as much chance to be in the second group as the first and as soon as X appears in the second group his odds of being the tallest are pretty high.

(did not read wiki before I wrote this but I guess I should)
Science Thread (now with 100% less religion) Quote
04-24-2021 , 08:18 PM
Quote:
Originally Posted by Cuepee
Ya it becomes intuitively clear, after reading David's post that the 50% +1 strategy is the best.
.....
(did not read wiki before I wrote this but I guess I should)
Yep, lol. It's not.
Science Thread (now with 100% less religion) Quote
04-24-2021 , 08:33 PM
Quote:
Originally Posted by ecriture d'adulte
The probability of getting the best candidate is dependent on the number of candidates, it's just the answer converges quickly. Not a huge difference between getting the best out of 10 or the best out of 100,1000 etc.
Yes, sorry, that was a loose statement on my part.
Science Thread (now with 100% less religion) Quote
04-24-2021 , 08:54 PM
Quote:
Originally Posted by d2_e4
Yep, lol. It's not.
Confused by the "it's not" part.

Did I get it wrong?

(which is highly probably, I admit)
Science Thread (now with 100% less religion) Quote
04-24-2021 , 08:59 PM
Quote:
Originally Posted by Cuepee
Confused by the "it's not" part.

Did I get it wrong?

(which is highly probably, I admit)
Yes, splitting into equal groups is not the optimal solution.
Science Thread (now with 100% less religion) Quote
04-24-2021 , 09:29 PM
Dammit. You will make me read the wiki, won't you?

The 50+1 strategy is positive EV though, so better than nothing, so I will cling to that.
Science Thread (now with 100% less religion) Quote
04-25-2021 , 11:24 AM
Quote:
Originally Posted by d2_e4
David, do you know if the answer you and I are thinking of works for any distributions other than normal? Would it work for a uniform distribution?
I haven't thought about much it but I think the solution depends on having only the information as to the rating of the highest person in the first group. If you know more, things might change. For instance suppose the plan was to let 37 out of 100 pass you by, you are noting the numerical performance of each one of them, and the 37th person's performance was way better than the others. Would you not abandon your original plan and instead pick him(or her)?
Science Thread (now with 100% less religion) Quote
04-25-2021 , 12:14 PM
The proof for the 1/e chances is based on only being able to rank candidates, B<A<C etc. If you allow more information, like one candidate is just so many standard deviations better, it ends up becoming less of a math problem with a straight solution and more of a guess on what the distribution of applicants actually looks like.
Science Thread (now with 100% less religion) Quote
04-25-2021 , 07:51 PM
Quote:
Originally Posted by ecriture d'adulte
The proof for the 1/e chances is based on only being able to rank candidates, B<A<C etc. If you allow more information, like one candidate is just so many standard deviations better, it ends up becoming less of a math problem with a straight solution and more of a guess on what the distribution of applicants actually looks like.
So suppose you knew that the 100 people measured came from a near infinite population that you knew was normally distributed. But you didn't know the mean or standard deviation. You then start measuring those 100 one at a time. Every time you come to a person who surpasses all the other measurements you consider stopping and predicting that he or she is the highest measurement. Based on that specific measurement and the specific measurements that came before him. What are the algorithms that:

A Calculates the chance that this particular person is indeed the possessor of the highest measurement.

B. Tells you whether this should be your pick (because the solution to question A yields a higher probability than waiting for one or more higher measurements yet.)

I'm thinking that this question has not been thought of before. If not please name it after us when you broach it to your Princeton Phd buddies.
Science Thread (now with 100% less religion) Quote
04-25-2021 , 08:14 PM
Quote:
Originally Posted by David Sklansky
I'm thinking that this question has not been thought of before. If not please name it after us when you broach it to your Princeton Phd buddies.
I can already see future generations of 2+2 posters discussing the "DS_E4" problem from high school stats, and wondering why on earth it's called that.
Science Thread (now with 100% less religion) Quote
04-25-2021 , 10:58 PM
Quote:
Originally Posted by David Sklansky
So suppose you knew that the 100 people measured came from a near infinite population that you knew was normally distributed. But you didn't know the mean or standard deviation. You then start measuring those 100 one at a time. Every time you come to a person who surpasses all the other measurements you consider stopping and predicting that he or she is the highest measurement. Based on that specific measurement and the specific measurements that came before him. What are the algorithms that:

A Calculates the chance that this particular person is indeed the possessor of the highest measurement.

B. Tells you whether this should be your pick (because the solution to question A yields a higher probability than waiting for one or more higher measurements yet.)

I'm thinking that this question has not been thought of before. If not please name it after us when you broach it to your Princeton Phd buddies.
There are entire yellow books written on Optimal Stopping, so good chance its been written about.
Science Thread (now with 100% less religion) Quote
04-26-2021 , 11:22 AM
Actually after thinking about it I realize that the technique to answer this question is relatively simple. If you knew the mean and standard deviation of the underlying population, question A is elementary and question B is doable. So given the data you have up to a certain point (where your last measurement surpasses all others) you take a weighted average of your answers to A and B where you are weighting all the different means and standard deviation combinations according to that data.

In other words if your first four numbers were 1134, 872, 1272 and 2033. There would be a certain probability for various means and various standard deviations (of the underlying population) that would spit out that data and thus give various answers for questions A and B. (eg mean of 1385 SD of 422 or mean of 1811 SD of 678, etc) The final solution would be their weighted averages.
Science Thread (now with 100% less religion) Quote
04-26-2021 , 11:44 AM
Yeah, the only non-elementary part is fitting the existing data with a normal distribution, but that's pretty easy if you can use a computer. It's basically the same answer as if you knew the true mean and sd, but with more uncertainty as you're guessing it based on the sample taken to that point.
Science Thread (now with 100% less religion) Quote
05-06-2021 , 10:18 PM
A friend just gave me this, I can't solve it, and moreover, I have no idea how it could even be solved:

prove that for any set of 10 unique numbers [1,100] there will be 2 subsets with the same sum

I'm not even sure how to reduce it to a manageable problem where I can formulate a hypothesis. 10 seems like root 100, but that is a red herring I think. I just have no idea how to formulate this problem with [1,10] or [1,1000]. Intuitively, it feels like the number of unique numbers required for the proposition to hold will scale with ln(n) but I really have no idea.
Science Thread (now with 100% less religion) Quote

      
m