“A.I.” address location

The problem was how to find the correct address in the contents of thousands of letters, written by many different senders to recipients in many different countries.

For example, here are two letters. Since you are (presumably) a human, it should be fairly easy to make an educated guess as to which part of each letter is the address.

But if you are a computer, that’s a completely different story. For example, the letter from Helen Keller has two addresses. So which one is the destination? One could argue that the address in Nova Scotia is incomplete, but it is actually the complete and correct address.

The letter from Hyundai also has an address, but it is on a single line. The locations are spelled in English here, but they could just as well have been in Korean, Chinese, German, etc. That means a database of all cities and streets in all languages, including all misspellings, would be necessary if you wanted to do a database match. Perhaps doable if you have Google’s R&D pockets, but not for the common man.

I solved this problem by writing a fairly simple set of rules for the text interpretation software (coded in-house), which assigns points to each sentence or paragraph. They all start with 0 points. A short sentence gets more points; a long one gets fewer. If the neighbouring sentences above and below are also short, it gets bonus points. If it has bonus points but its downstairs neighbour does not, then it gets extra bonus points. Etc. etc. (not going to spill the beans here). In the end all I needed was a set of 9 rules. That was enough to accurately find the correct address in 98% of all letters.
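
To give a feel for how such point rules can work, here is a minimal Python sketch of only the few rules mentioned above. It is not the actual in-house software: the threshold, the point values, the function names (is_short, score_lines, find_address_block) and the final step of growing the best line into a block of short lines are all assumptions made up for this illustration.

```python
# Hypothetical threshold and weights; the real rule values are not published.
SHORT_MAX_CHARS = 40   # a line at or below this length counts as "short"
SHORT_POINTS = 2       # points gained by a short line
LONG_PENALTY = 2       # points lost by a long line
NEIGHBOUR_BONUS = 1    # bonus when the lines above and below are also short
EDGE_BONUS = 2         # extra bonus when the line below earned no bonus


def is_short(line):
    """A non-empty line of at most SHORT_MAX_CHARS characters."""
    stripped = line.strip()
    return 0 < len(stripped) <= SHORT_MAX_CHARS


def score_lines(lines):
    """Return one score per line, applying the rules described in the text."""
    n = len(lines)
    short = [is_short(line) for line in lines]
    scores = [0] * n  # every line starts with 0 points

    # Short lines gain points, long lines lose points, blank lines stay at 0.
    for i, line in enumerate(lines):
        if short[i]:
            scores[i] += SHORT_POINTS
        elif line.strip():
            scores[i] -= LONG_PENALTY

    # Bonus when both the line above and the line below are also short.
    has_bonus = [False] * n
    for i in range(1, n - 1):
        if short[i] and short[i - 1] and short[i + 1]:
            has_bonus[i] = True
            scores[i] += NEIGHBOUR_BONUS

    # Extra bonus when a line has a bonus but its downstairs neighbour does not.
    for i in range(n - 1):
        if has_bonus[i] and not has_bonus[i + 1]:
            scores[i] += EDGE_BONUS

    return scores


def find_address_block(lines):
    """Assumed final step: grow the best-scoring line into a run of short lines."""
    scores = score_lines(lines)
    if not scores:
        return []
    best = max(range(len(lines)), key=scores.__getitem__)
    if not is_short(lines[best]):
        return []  # nothing in the text looks like an address line
    start = best
    while start > 0 and is_short(lines[start - 1]):
        start -= 1
    end = best
    while end < len(lines) - 1 and is_short(lines[end + 1]):
        end += 1
    return lines[start:end + 1]
```

Calling find_address_block with a letter split into lines returns the run of short lines around the highest-scoring line. Note that nothing here needs a database of cities or streets: the rules only look at the shape of the text, which is why the language of the address does not matter.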