Quote:
Originally Posted by RustyBrooks
I mean I guess question one is how the obfuscation is done. Can a human decode it? Or is it some opaque thing like "rustybrooks => 1 and suzzer99 => 2" and so forth?
If it's reversible you can try to reverse it. If it isn't but you know the method, and the method is independent of other data you don't have, then you can obfuscate the other source too. Otherwise, you're probably boned.
Sounds to me more like you have some records that probably match, but w/o the ID it's a messy process. (I thought it's just the ID that's obfuscated?)
At the statistical consulting firm, we would get datasets of car accident reports from insurance companies that we had to match to actual police reports and other sources of data. We'd run a VIN match first - but that missed a lot of records because of transcription errors: 0 recorded as O, 1 as L, etc.
So I just started chipping away by making those replacements, or by matching on 16 out of the 17 VIN characters, etc. Then you'd still have to eyeball all the data from each side to make sure you had a match. Messy stuff.
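
For what it's worth, that chipping-away process could be sketched roughly like this in Python. This is not the code we actually ran, just an illustration: the function names, the substitution table, and the "exact" / "review" / "no match" labels are all made up for the example. One caveat: I, O and Q aren't valid VIN characters, so folding them into digits is fairly safe, but L is valid, so folding L into 1 is lossy and can only be used to flag candidates, not to confirm them.

```python
# Hypothetical sketch: fuzzy VIN matching with look-alike folding.
# The substitution table is an assumption based on the errors described
# above (0 recorded as O, 1 recorded as L), plus I and Q, which never
# appear in a valid VIN.
CONFUSABLE = str.maketrans({"O": "0", "Q": "0", "I": "1", "L": "1"})

VIN_LENGTH = 17


def clean_vin(raw: str) -> str:
    """Uppercase and drop whitespace and hyphens; no character folding."""
    return raw.strip().upper().replace(" ", "").replace("-", "")


def vin_key(raw: str) -> str:
    """Comparison key: cleaned VIN with look-alike characters folded."""
    return clean_vin(raw).translate(CONFUSABLE)


def mismatches(a: str, b: str) -> int:
    """Count positions where two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def match_vin(a: str, b: str, max_mismatches: int = 1) -> str:
    """Classify a pair of VINs as 'exact', 'review', or 'no match'.

    'review' means a plausible match that a human still has to eyeball
    against the rest of the record.
    """
    ca, cb = clean_vin(a), clean_vin(b)
    if len(ca) != VIN_LENGTH or len(cb) != VIN_LENGTH:
        return "no match"
    if ca == cb:
        return "exact"
    # Equal after folding, or off by at most one character (16 of 17).
    if mismatches(vin_key(a), vin_key(b)) <= max_mismatches:
        return "review"
    return "no match"


if __name__ == "__main__":
    print(match_vin("1HGCM82633A004352", "1HGCM82633AOO4352"))
```

Anything that comes back as "review" is still only a candidate, so you'd compare the other fields (date, location, vehicle description) by hand before calling it a match.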