### Driving with mathematicians

“How many gas stations do you think we’ll pass on the way home?”

“You mean on the side of the road, or at exits?”

“Whichever - places where we’d consider buying gas.”

“I don’t know, six, eight…why?”

“Well, if we pass the first 1/e gas stations and then stop at the first place that offers cheaper gas, then we’ll maximize the chance that we’ll end up buying the cheapest gas. I mean, assuming that gas prices are random and don’t follow any trend with regards to location, which obviously isn’t the case.”

“Oh. Ha! But that’s not the algorithm we want - that one maximizes the probability of getting the cheapest gas; it doesn’t minimize the expected price of the gas we buy, which is what we want.”

“Oh, yeah, you’re right: if we apply that first algorithm, then in the likely event that we don’t get the very cheapest gas, there’s still a reasonable chance that we’ll end up with pretty expensive gas.”

“Right.”

“So what’s the strategy for minimizing the expected price of gas?”

“I don’t know.”

So, is there a probabilist in the house?

I know the answer! I know! I know!

Government prices controls on gasoline!

Whadda ya mean that’s not the right answer? Aren’t enviromentalism and social justice issues the most important parts of mathematics?

I think you need to know the distribution of gas prices to do this one. Are we assuming that the distribution is normal but that we don’t know the mean and std dev? That’ll at least make the problem solvable, maybe…

If you know the number of gas stations you will pass, you can purchase 1/n * (fuel needed) gallons from each station. That seems like it should get you to the average overall price. I’m uncertain whether there is a strategy where you can have a better expected value for cost in every case without knowing the distribution. I suspect there are strategies that would do better for specific classes of distributions (to avoid the ‘problem’ distributions for a given strategy).

Ideally, I’d like a strategy that doesn’t depend on distribution (though obviously we could get better (if not strictly better…) strategies for known distributions). There’s a strategy for the maximize-the-chance-of-getting-the-cheapest-gas that doesn’t take distribution into account; I’d like one of those. But I could live with one that assumes normally-distributed prices; known mean and standard deviation I’m a little less willing to concede, what with the price of gas fluctuating as it does.

The 1/n approach guarantees average-priced gas; but I’m wondering if we can do better than an expected price of the average, even at higher risk.

The other question that should be addressed is how much gas you talking about? Do you want to be full near the end, or is it a “it’s time to fill up the gas tank, but any station between here and my destination will work to fill it up” type of situation?

One approach that might be better than average-priced gas on average is to purchase more gas from cheaper stations. That is, you buy gas from the first station you see. Each time you pass a cheaper station, you buy gas according to its rank. If a station is the most expensive so far, buy little or none. If the station is the cheapest so far, buy more. I’m not sure what the optimal weight would be, and in situations where the price tendend to increase along the route, this approach would work poorly (though over all possible rankings, it might work better on average).

Any station will do. The objective was to finish with at least half a tank of gas, and filling up at any point would achieve that. So that’s not a constraint.

And ideally I’d like a strategy that involves filling up at just one place, like the similar problem does. Still, though, the idea of buying gas according to the rank of the gas station is a neat one; the question is, how best to estimate ranks based on only the informaion we have so far (ie, the first n of a total of m stations)?

but I’m wondering if we can do better than an expected price of the average, even at higher risk.ideally I’d like a strategy that involves filling up at just one placehow about calculating the average of the first m stations, and then filling up from the first one (in the remaining n-m stations) that has a lower than the estimated average? (I’m thinking m=n/2 would be a resonable balance between risk and price)

Since we are dealing with expected values, the actual prices will make a difference. I’m pretty sure I would be possible to come up with a strategy that worked for one set of prices but not some other set, even if the ranks stayed the same. However, in practice the prices of gas should be pretty constrained by competition. It is unlikely that there will be any exceptionally large or small outliers in price.

You could take the approach of purchasing less than a tank at the first station after 1/e where the gas is cheaper. Reset the count and do it again. Continue this approach until the last gas station where you top off the tank. I’m not sure how to weight this approach either. I don’t have time to run simulations right now, or I would give it a try.

At least in my neck of the woods, you can approximate price by brand. Any given Arco will cost less than any given Chevron station within a certain distance, for instance. If the stations you pass are also randomly distributed by brand, you could achieve a cheaper price simply by only purchasing from the cheapest brand.

Don’t forget that the earlier you fill up, the more gas you’ll burn carrying the extra gas in your tank. I’m reminded of this by all that F1 I used to watch. Wouldn’t F1 be so much more interesting if it combined pit stop strategies of not wanting to drive with full tanks and some sort of randomly fluctuating time penalty dependent on when you stopped (equivalent to price in our problem)? Then again, we probably wouldn’t need/get to work on this problem because it probably would have already been solved.

Like any good physicist, I can only explicitly solve the continuous limit, whereas I only the discrete case is well-defined.

In the continuous n\to\infty case, of course, the problem is easy. The first m stations are enough to exactly determine the distribution of all the rest (probability=1 that the distribution of the first m is the same as distribution of all n). Then in the remaining n-m, I can wait until I see a station with very low prices. (Of course, there will with probability=1 be stations with arbitrarily low values in any stretch of road with finite (positive) measure. So, when I fill up my tank, I really should not convolve with the delta function but with some physical function approximating the delta function except with positive width. The problem is that it’s natural to suppose that if there’s a physical lower limit on particle size for filling, there probably is one for sampling. And unless, for instance, I’m measuring with extremely high-energy photons compared to the (electrons, say) of filling my tank, then I’m reduced to the previous problem. But, yeah, if measuring is finer than stopping, then I can, for the purposes of this problem, set the measurement wavelength to zero (=infinitesimal), and the gas-station wavelength to positive finite (=normalized to one), and the road length n\to\infty, and we’re interested in the n-dependence.)

This comment contributes nothing to the actual discussion.

well i’m not a probabilist, but this does seem like an interesting problem. if you have to work out the strategy beforehand with no information about prices, it seems to me you’ll get a different answer than if you’re allowed to check prices along the way, since then you are gaining information about the distribution, so you can guess at the probability that any one station you pass might have close-to-the-cheapest gas.

you probably also want a variable in your formula that increases the probability of stopping as time goes by, since the undesired consequences of running out are so…well…undesirable.

side note about driving with mathematicians: when i’m driving and my wife is asleep and i’m trying to stay awake…i factor the numbers on mileage signs. not quite as geeky as your conversation, but very practical nonetheless.

It’s an interesting problem, until you get too distracted by trying to solve it that you forget to fill up and run out of gas.

If a mathematician’s time is worth $50 an hour, and the difference in price between the most expensive gas and the least expensive gas is 20 cents, and the gas tank holds 10 gallons of gas, then the mathematician stands to lose a maximum of $2.00 by buying the most expensive gas. Since $2.00 represents 2.4 minutes’ worth of a mathematician’s valuable time, investing any more than 2.4 minutes in finding the correct solution will not save the mathematician any money.

My solution: stop at any gas station and fill up. Notice the prices as you continue driving, and see how far off you were. Next time you are at the store, buy yourself a number of candy bars whose value is equivalent to the difference in price between the most expensive fill-up and your actual fill-up. Eat and enjoy.

If you’re stopping at every station anyways, you can do better than average (if there’s any difference in prices) with a fairly simple rule.

At the first station, buy M/n of a tank and note the total amount you paid for it. At every station after that, buy the same dollar value. You’ll be buying more at lower prices than at higher prices, so your average cost will be lower (or at least no higher) than what it would be if you bought the same amount at every station.

The hard part is choosing M so that you neither run out nor overfill your tank. You’d probably need at least a good guess at the upper and lower bounds on prices, and possibly a guess about where in the likely distribution the first station’s prices are.

I’m not sure whether this is the optimal solution for mgoff’s idea of basing how much you buy on the rank, but it does probably have the benefit of being a lot easier to remember than most of them.

Oh, Wacky, I know, I know; I’ve ranted before (to friends and family if not on this blog) about how ridiculous it is that folks are so concerned when the price of gas goes up half a cent a litre that they’re willing to spend an hour driving across the city and back to fill up. I should add that it was neither my money, nor my driving companion’s, that was funding the gasoline for the trip. My interest in the question is purely academic.

And I

wishmy time were worth $50/h - even before taxes.Here’s a different spin on the problem:

faculty.gvsu.edu/aboufade/gasgame.htm

This problem is a lot harder than it looks. First of all, I’d assume that the prices are uniform within a range [a,b], where a and b are unknown to the driver. With n gas stations, the expected best price is a+(b-a)/(n+1) (nontrivial to prove), so we need to compare the success of our method to that.

Say we look at the first k gas stations to find the best price, then stop at the first gas station that beats that observed best price. If we simply assume that the results for the measurement period are typical, then the best price we’ll observe is a+(b-a)/k. The probability on any given successive station of beating this price is 1/k. If we ever do beat this observed minimum, the price we expect to pay is a+(b-a)/2k. However, if we never beat it, we’ll just use the last gas station, and the expected price will be ((a+(b-a)/k)+b)/2 = (a+b)/2+(b-a)/2k. The probability of never beating the observed minimum is y = (1-(1/k))^(n-k), so the expected price we do pay is a+(b-a)/2k+(b-a)y/2, so this isn’t a bad approximation: it’s an (yk+1)/2k approximation of the best price. It seems like the best k to pick is not n/e but rather about .18n; for example, if n=10 and k=2, you pay about .16[a-b] more than optimal (about 3 cents per gallon for a 20-cent price range).

However, we can’t simply assume that the result in the measurement period will be typical, so this needs a more sophisticated analysis. If we have a 1-station observation period, we can get a precise analysis. 1/n of the time, the 1st station will be the lowest price, so we will pay a random non-lowest price, that is, an expected a+(b-a)(n+1)/2n. The rest of the time, the expected price we pay is halfway between a and the price of the first station, which is a random non-lowest price. Thus, the expected price we pay in this case is a+(b-a)(n+1)/4n. Combining these together, we get an

expected price of a+(b-a)(n+1)^2/4n^2, which isn’t bad for low n; for instance, if n=5, this strategy gives an expected price of a+(b-a)(.36), compared to the expected optimal price of a+(b-a)(.1666); a cost of (b-a)(.1933), or about 3.9 cents in our 20-cent spread scenario.

I once stopped at a small independent gas station and asked how they set their prices. He handed me a pair of binoculars and pointed to a big-chain gas station up the road. So I’m guessing that the distance between gas stations may be a relevant variable in this eqution.

Even better than the people who drive around for an hour looking for the cheapest gas are the folks who drive the 10 minutes back to the grocery store because you overcharged them 10c on their laundry detergent. Gas, once it is in your tank, becomes free.

Real-world issue intruding: usually if I’m on a highway and get off to get gas, there will be 2-3 stations there to choose from. Of course, generally those stations will provide the same price for the gas, or perhaps be different by 1 - 2 cents per gallon (unless there is something that makes it really annoying to get to one of the stations, like having to make an unprotected left turn to get in, or do some odd maneuvering to get back on the highway.)

I would think that the distribution of gas prices along a particular route may be dependent on the volume of traffic exiting a particular exit, and the number of gas stations clustered at that exit. I would assume a high-traffic area would have more gas stations, thus more competition, and thus lower prices. If there’s only one station for 20 miles, I’m not expecting it to have the lowest prices around. So, for example, there are areas on I-95 where there’s no exit for a very long time, and at that one exit after the drought, there’s only one gas station (there’s no major town nearby)… gas isn’t cheaper in that instance, I can tell you what.

Oh, and for a short time, my time was worth $50/hr before taxes (at least, that’s what I charged tutees and clients). Now, my corporate rate is lower than that ($40/hr approx), but that’s not including all my benefits or bonus, or the fact that I’m being paid for a steady # of hours every week, and I even get paid for some days I don’t work. I’m not even a full-fledged actuary yet. So yeah, I’d go with buying candy, as Wacky Hermit suggests.

Oh, I completely forgot the obvious answer: Indian reserves. When you can save

allthe taxes ($0.40/litre), even driving around is actually worth it. (Oh, and with a 6 cents/litre transit tax in the Vancouver area, you’ll certainly have to take that into account when deciding where to stop.)MS, If you know assume nothing of the distribution of prices, I am not sure that your problem is well defined. In order to calculate the expected value of what we will pay with a certain strategy, we first need to define a probabity measure on the set of all possible distributions of prices (or on some sigma-algebra on that set, if you insist). Anyway, that requires making some assumption about the distribution of prices. Am I wrong?

Mango Juice - thanks, and I’m going to fiddle with the nontrivial proof when I have some time. Mind giving me a hint - does it require any big guns, or just some cleverness?

I think you’re right, oxeador. (Though my background isn’t in probability - mind you, I did take a graduate level course on the subject. Damned if I can remember the difference between almost certain events and nearly certain events and suchlike, however.) I’d like to make as few assumptions as possible, though it’s pretty standard to assume normal distributions if necessary. Mango Juice’s assumption of uniformity might be easier, though. Could we come up with a result if we only knew the mean..?

Maybe you can get at expected best prices via order statistics?

earl

Checking gas prices at highway exits can be highly fuel inefficient. While you may find a stopping strategy that minimizes expected fuel price paid, it won’t minimize total fuel/time/utility cost.

There might be a strategy that dominates all stopping rules — for instance, looking up the best price on the internet before you leave, and stopping at the cheapest you can find online, as long as you see nothing cheaper on the way there.

The nontrivial proof: it’s not so hard, really. But when you’re talking about a uniform continuous distribution, it’s a little hard to solve a problem like “what’s the expected value of the lowest of two random numbers from 0 to 1?”

So I realized I could solve the problem if the only numbers we could generate were 0 or 1. Then, I could solve the problem if the only numbers were 0, .5, or 1, et cetera. I take a limit, and I get this complicated-ish expression that works out to the limit as n -> infinity of (2n^2+4n)/6n^2 = 1/3.

I did it by hand for selecting 3 points, also, and then I realized that I could just say I was selecting k points. The trouble then is that I had to work with the sum of i^k, but since it’s in a limit you can just make it an integral and ignore lower terms, and it all works out.

There might be an easier way though.

I do think uniformity is a more realistic assumption in this case. There is a hard minimum for gas prices: namely, the price it cost THEM to buy it, and with the exception of a few odd-ball gas stations I’ve encountered, I’ve very rarely seen prices differ by more than about 10-15%; they know they’re in competition.

I see a lot of ad hoc strategies proposed, but has anyone tried to calculate the actually optimal strategy? I think if we assume a known probability distribution for prices, we can do that by working backwards. E.g. at the last station, if we haven’t refueled yet, we have to stop, so our decision at the second-to-last station should be made based on what we expect to have to pay at the last station, and so on.

If we don’t know the distribution of prices for sure, we can probably do a more complicated Bayesian analysis, using the observed prices to do inference about the distribution of prices as we go along, and using that information to operate our decision strategy.

Since I don’t have enough to do (hah!) I’ve written up a brief description of the first, elementary, situation:

http://www.math.uchicago.edu/~shulman/gas.pdf

I’ve played around a bit with the more realistic Bayesian version, but exact calculations are a real pain; probably one could program a computer to do it, though.

IANAPOAS (I am not a probabilist or statistician), but the following might be helpful:

This question similar to the classical “optimal stopping” problem, discussed for example in “Recognizing the Maximum of a Sequence, John P. Gilbert, Frederick Mosteller, Journal of the American Statistical Association, Vol. 61, No. 313. (Mar., 1966), pp. 35-73.”.

More specifically, it more or less corresponds (up to a sign change) to the specific case where the payoff is proportional to the number chosen (section 5). Gilbert and Mosteller clearly argue that the optimal strategy for this type of problem will heavily depend on the distribution, more specifically on the shape of its far right-hand tail (usually difficult to estimate in practice). They also give a few examples of optimal strategies when the distribution of prices is fully known as well as the total number of gas stations: the optimal strategy is then to compare at each step i the current price with a predefined threshold T(i), and stop there if it is lower than T(i).

The actual values of the thresholds T(i) are easy to compute recursively (though there seems to be no easy analytical formula for in most cases): for example, in the case of 4 gas stations and prices uniform on (0,1), you have that T(4)=0 (since when you arrive at the last station, your only possibility is to stop there, and a zero threshold simply forces that), and your expected price at that stage E(4) will be 1/2. Looking now at T(3), we see that we should stop there if and only if the price is lower than the expected price at the next stop, which is E(4)=1/2, so that implies T(3)=1/2. The expected price E(3) at this point is now 1/2 with probability 1/2 (if we do not stop), and uniform on (0,1/2) with probability 1/2 (if we stop), which sums to 3/8. Looking at T(2), we have that T(2)=E(3) for the same reasons as above, and E(2)=(1-E(3))*E(3)+E(3)*E(3)/2 = E(3)-E(3)^2/2=39/128. Finally, we can conclude that T(n)=E(n+1) and E(n)=E(n+1)-E(n+1)^2/2 for all n>=1, and E(1)=8463/32768=0.258, which is appoximately half of the average price.

You take a plane! Wait…that’s not right. lol

fanfan,

I’ve heard it as the secretary problem. In hiring a secretary out of a known field of n and grading them say 1-100 where 100 is an instant hire. Interview the first n/2 (assuming none of them score 100), after which the first one more qualified than any of the first n/2 you hire on the spot. So if you know how many gas stations you’ll encounter, pass the first n/2 of them then stop at the very next one that has a lower price than the first half. This breaks down however because it implies that you know what the bottom extreme price is. But as you mentioned, if you set a threshold you’re still good.

dM,

If I remember correctly, the optimal solution to the secretary problem (certainly different from the problem MS suggests) is to skip the first n/e candidates (not n/2) and then hire the first one who is better than all of them. Your probability of success, as n approaches infinity, is 1/e.

Yes, oxeador is correct. It’s a pretty classic probability problem, and there are similar problems like this where 1/e unexpectedly appears.

It’s also known as the dowry problem or the proposal problem (and it was done as an actuarial hiring problem in one of the actuarial mags recently.)

My favourite problem where 1/e unexpectedly appears is “Screaming Toes”. Usually, it is difficult to find a group of people willing to both play it and solve it (except at Mathcamp, of course).