Erroneous entries in the mileage database
#31
Re: Erroneous entries in the mileage database
I'll put in my two cents on the "3 tanks or less = garbage" idea.
I put in my first three tanks when I first joined the site. I then started reading more about hypermiling and trying fuel efficient techniques and got gradually better at it, but I knew that I am far too competitive and that knowing that my tank results were going to be entered would probably make me start fudging, and since I really did want the data, I came to the decision not to enter my numbers. I still do keep track of FE and have a drawer full of pump receipts with notes jotted on them (not all of them- I'm sure some are lost and once in a while someone else fills it up) for the last two years, and though I've never gone through it, I have a very good mental map of where I'm at in terms of FE and what my car has done in different situations. Maybe one day I will organize them and enter some more data, but probably not- I don't have temperature data or trip lengths.
In the meantime, my first three tanks are still in the database, and they are perfectly valid data. At the time, in those conditions, that's what my car got. The passage of time and my failure to enter my other receipts should have no impact whatsoever on the validity of this data, and it would be very poor scientific practice to throw it out for those reasons, admitting an entirely new and incalculable bias to the measurements based on the FE achievable by people who are 'regular posters' to the database as opposed to those who aren't so regular. Sure, those three tanks are a bit lower than my lifetime average FE, which is 49.9 mpg, because I've gotten a lot better at it now, dealer break-in, whatever. The point is, data is data, three tanks or six. You shouldn't throw it out.
#33
Re: Erroneous entries in the mileage database
I put in my first three tanks when I first joined the site. I then started reading more about hypermiling and trying fuel efficient techniques and got gradually better at it, but I knew that I am far too competitive and that knowing that my tank results were going to be entered would probably make me start fudging, and since I really did want the data, I came to the decision not to enter my numbers.
I wouldn't say that the data from your car in the database is "bad", but it is definitely incomplete. Sometimes, incomplete data can be very misleading.
On a related note, I posted the following in a thread some time last month:
The HCH II database has 13 entries with less than 100 miles (including two entries of 1 mile, one of 6 miles and one of 9 miles). There are 123 entries with less than 1,000 miles. This is 22% of the total. Of course, people have to start somewhere, but only 28 of the 123 entries below 1,000 miles are active. Why are the low miles driven and inactive entries not deleted? Do they serve any purpose? What is the point of leaving in these entries of a 33.6 mpg over 1 mile, 19.9 mpg over 6 miles, etc.?
Just as hypermilers are not recognized until 3,000 miles, there should be some kind of limits used before a vehicle is considered in the average of a given class of vehicles in the database.
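To make the idea concrete, here is a minimal Python sketch of a mileage cutoff applied before averaging a vehicle class. It is only an illustration: the `Vehicle` fields, the 1,000-mile value of `MIN_MILES`, and the two high-mileage cars are assumptions made up for the example, not how the site's database actually works. Only the 1-mile and 6-mile entries come from the figures quoted above.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    # Hypothetical record layout; the real database fields are unknown.
    owner: str
    total_miles: float
    total_gallons: float

    @property
    def mpg(self) -> float:
        return self.total_miles / self.total_gallons


MIN_MILES = 1000.0  # assumed cutoff, chosen only for illustration


def class_average(vehicles, min_miles=MIN_MILES):
    """Plain mean of per-vehicle mpg, counting only vehicles with enough miles."""
    eligible = [v for v in vehicles if v.total_miles >= min_miles]
    if not eligible:
        return None  # nothing qualifies, so there is no average to report
    return sum(v.mpg for v in eligible) / len(eligible)


fleet = [
    Vehicle("one_mile_entry", 1.0, 1.0 / 33.6),   # 33.6 mpg over 1 mile
    Vehicle("six_mile_entry", 6.0, 6.0 / 19.9),   # 19.9 mpg over 6 miles
    Vehicle("made_up_car_1", 5000.0, 100.0),      # invented: 50 mpg
    Vehicle("made_up_car_2", 3000.0, 75.0),       # invented: 40 mpg
]

print(class_average(fleet))               # with the cutoff
print(class_average(fleet, min_miles=0))  # every entry counted equally
```

With these made-up numbers the class average is 45 mpg with the cutoff and about 35.9 mpg without it, so two entries totalling 7 miles pull the average down by roughly 9 mpg.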
#34
Re: Erroneous entries in the mileage database
I would think that short-range tanks would be a better data point to "drop" from the fleet averages, more so than old/unmaintained entries. Anyone can get phenomenal mileage (or extremely poor mileage) over a short distance.
Another alternative calculation, that would take this into account automatically, is to weight the mileage entries by the number of miles driven on that tank.
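A rough Python sketch of that weighting, for illustration only. The tank figures and function names are invented, and the "pooled" figure (total miles over total gallons) is included just for comparison; it is not something proposed above.

```python
def simple_mean_mpg(tanks):
    """Every tank counts equally, no matter how short it was."""
    return sum(miles / gallons for miles, gallons in tanks) / len(tanks)


def distance_weighted_mpg(tanks):
    """Each tank's mpg is weighted by the miles driven on that tank."""
    total_miles = sum(miles for miles, _ in tanks)
    weighted_sum = sum(miles * (miles / gallons) for miles, gallons in tanks)
    return weighted_sum / total_miles


def pooled_mpg(tanks):
    """Total miles divided by total gallons, shown only for comparison."""
    return sum(miles for miles, _ in tanks) / sum(gallons for _, gallons in tanks)


# (miles, gallons) pairs; both tanks are made up for the example.
tanks = [
    (500.0, 10.0),  # a full tank at 50 mpg
    (6.0, 0.3),     # a 6-mile entry at about 20 mpg
]

print(round(simple_mean_mpg(tanks), 1))
print(round(distance_weighted_mpg(tanks), 1))
print(round(pooled_mpg(tanks), 1))
```

In this example the unweighted mean lands near 35 mpg, while the distance-weighted figure stays around 49.6 mpg because the 6-mile tank carries almost no weight.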
#35
Re: Erroneous entries in the mileage database
I agree. When I mentioned "low miles driven and inactive entries", I meant entries that met both of those criteria.
#36
Re: Erroneous entries in the mileage database
Mr. Kite- it's not *quite* true that data is data. Those trip meters round like crazy. Moving an entire tank four tenths of an mpg based on when you hit the start button to zero the reading, and when you pull over to fill up again makes a non-negligible difference. Some people here calculate by hand, compare, and use various techniques to 'correct' their tripmeter readings. Others top off or compete to get the longest 'tank' numbers in ways that many people would 'count' as 'fudging.' I've seen more explanations for coming up with different results here than I can remember, and there is a lot of competition; I just didn't want to go there, with any of that.
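As a back-of-the-envelope illustration of that four-tenths figure, here is a small Python sketch. The 500-mile tank, the 10-gallon fill and the 4-mile reading error are assumed numbers picked to make the arithmetic easy, not measurements from any car.

```python
def mpg_spread(miles, gallons, miles_error):
    """Lowest and highest mpg if the distance reading is off by +/- miles_error."""
    low = (miles - miles_error) / gallons
    high = (miles + miles_error) / gallons
    return low, high


# Assumed example: a 500-mile tank, a 10-gallon fill, a reading off by 4 miles.
low, high = mpg_spread(500.0, 10.0, 4.0)
print(f"{low:.1f} to {high:.1f} mpg")
```

Under those assumptions the same tank could be logged anywhere from 49.6 to 50.4 mpg, which is four tenths either side of 50.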
Look, I don't really need to explain again, you can still not understand me, but I've made my choice and I'm fine with it; I don't particularly feel that I'm 'throwing away' data because I don't enter my numbers, and I'm sure I'm not the only regular poster who doesn't add to the database. It can be darn useless sometimes, with averages for the Insight posted at 25 mpg or whatever, giving a bad impression and looking ridiculous. I'm all for the database; I just wish it worked a bit better. It's all so self-selecting and on the honor system anyway that it isn't remotely scientific, and that's fine, it fulfills its function, and it's great that we're having a debate about how to improve it. If you're concerned about not counting data from the break-in period then maybe you need something other than eliminating people who don't file tank data regularly; maybe you should think about removing the first thousand miles or four hundred miles from any given car; we do input the car's overall mileage, after all. Don't use a proxy for what you really mean.
#37
Re: Erroneous entries in the mileage database
Mr. Kite- it's not *quite* true that data is data. Those trip meters round like crazy. Moving an entire tank four tenths of an mpg based on when you hit the start button to zero the reading, and when you pull over to fill up again makes a non-negligible difference. Some people here calculate by hand, compare, and use various techniques to 'correct' their tripmeter readings. Others top off or compete to get the longest 'tank' numbers in ways that many people would 'count' as 'fudging.' I've seen more explanations for coming up with different results here than I can remember, and there is a lot of competition; I just didn't want to go there, with any of that.
I don't know if you consider any of these things "fudging", but if I were interested in "fudging" and being at the top of the database, I would be. At least for me, the competition is more of a personal thing. Aside from just flat out entering false information, I could have easily omitted ugly data. For example, I could have left out my winter tank in the HCH II that was sub 40 mpg, or I could have left out all of my interstate road trip in my HiHy that gave me fuel economy in the upper 20s. However, doing any of these things would not give me any personal satisfaction. I mean, I know the truth and that is what I consider important.
Look, I don't really need to explain again, you can still not understand me, but I've made my choice and I'm fine with it; I don't particularly feel that I'm 'throwing away' data because I don't enter my numbers, and I'm sure I'm not the only regular poster who doesn't add to the database. It can be darn useless sometimes, with averages for the Insight posted at 25 mpg or whatever, giving a bad impression and looking ridiculous. I'm all for the database; I just could wish it worked a bit better.
BTW, I didn't ask you to explain anything to me, but I do understand better. It's not perfect so you choose not to participate. That's OK. It's not that big of a deal to me and I'm not trying to force anybody.
If you're concerned about not counting data from the break-in period then maybe you need something other than eliminating people who don't file tank data regularly; maybe you should think about removing the first thousand miles or four hundred miles from any given car; we do input the car's overall mileage, after all.
I never tried to draw a line and define what is good and what is bad, but I did give some examples that I clearly think are bad. Do you think there is any value whatsoever in some of the examples I gave? In particular, is there any value in the inactive entries from the HCH II database I referenced (33.6 mpg over 1 mile, 19.9 mpg over 6 miles)?
This is absurd. Why don't you say what you really mean? Did I offend you? Did you take my post personally? That was not my intention.
I am very sincere when I say that I think a 33.6 mpg over 1 mile entry and a 19.9 mpg over 6 miles entry are of absolutely no use. There is no hidden meaning.
#38
Re: Erroneous entries in the mileage database
Wasn't trying to be confrontational, and not at all offended- when I said it seemed like a proxy, I meant it literally- it seemed as if you were talking about one idea (i.e., low number of data entries) when you meant another (i.e., low quality of information).
Only, I thought the break in thing was the low quality issue that concerned you, but instead it appears to be something a bit more nebulous- having a sample size large enough to give data that is 'representative of that person's driving.' An admirable goal, to be sure, and it would certainly provide people perusing the database with the best possible subjective information if they're looking at data car by car... but it's very difficult for me to judge how practical that goal is.
Also, some people want aggregate data, not a map of individual drivers' results. I suppose in a way, then, even very small data samples are valid, if they were really generated at random. Of course, they aren't. So a cutoff for small distances could be reasonable- it tells you nothing that the car averaged 19.9 mpg over 6 miles or whatever- I could sit with my motor running, use a whole tank, and go nowhere, but what does that demonstrate? Would it really be 0 mpg?
#39
Re: Erroneous entries in the mileage database
The first thing I'm going to do is put some limits on what people can submit. There will be limits of min/max for gallons per tank, MPG, and distance. I have the code for this about halfway done. Some maintenance is being done on the machine I'm using to develop the code, so I won't be able to finish it today, but I can move it to the live site next week.
I had returned from a 1200 mile trip, had the total gas volume at hand and I was unable to enter it as usual. Fortunately, I enter my mileage at another database and I was able to enter the actual MPG for just 1000 miles. Not very accurate I must say.
I feel that limiting the traveled distance without taking into account a reasonable gas volume could make a tank submission less accurate.
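One way this could look, as a hedged Python sketch: instead of a hard cap on distance, the check looks at distance and gallons together through the implied MPG. The function name and the `MIN_MPG` / `MAX_MPG` values are assumptions made up for illustration; this is not the code running on the site.

```python
MIN_MPG = 5.0    # assumed lower bound, for illustration only
MAX_MPG = 150.0  # assumed upper bound, for illustration only


def check_submission(miles, gallons):
    """Return a list of problems; an empty list means the entry looks plausible."""
    problems = []
    if miles <= 0:
        problems.append("distance must be positive")
    if gallons <= 0:
        problems.append("gallons must be positive")
    if not problems:
        # No hard distance cap: a long trip passes as long as the fuel
        # volume entered with it gives a believable MPG.
        mpg = miles / gallons
        if not MIN_MPG <= mpg <= MAX_MPG:
            problems.append(
                f"implied {mpg:.1f} mpg is outside {MIN_MPG}-{MAX_MPG} mpg"
            )
    return problems


print(check_submission(1200, 24))  # long trip with a matching fuel volume
print(check_submission(1200, 5))   # same distance with an implausible volume
```

With these assumed bounds, a 1200-mile entry with 24 gallons is accepted, while 1200 miles on 5 gallons is flagged because the implied MPG is not believable.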
Please let me know if I did not explain it well.
Cheers;
MSantos