How our algorithm-based approach works
How we do it
At Triposo we make travel guides using algorithms. We crawl data from the web, apply some clever algorithms and fully automatically generate high-quality travel guides that cover the entire world. Since this is the core of what we do, we obviously can't give away all the ingredients of the secret sauce, but this page gives you a rough overview.
We start with data. We send our crawlers to fetch gigabytes of travel-related content from all kinds of sources, but we prefer open content that comes with few restrictions. For this we use data from World66, Wikitravel, Wikipedia, OpenStreetMap, TouristEye, Dmoz, Chefmoz and Flickr. Where we find incorrect data, we try to contribute corrections back, of course.
Once we have all the data, it is time to parse. From each source we try to extract information about places (villages, cities and countries) and about points of interest (restaurants, museums, shops, trees, etc.). To extract all that information automatically, we use a set of YAML files that define how the content is organized and that identify the patterns in the data.
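To make the idea of pattern-driven parsing concrete, here is a minimal sketch. It is not our actual rule format: the field names, the regular expressions and the sample text are all made up for illustration, and the rules are written as the plain dictionary that a YAML file of this kind might load into.

```python
import re

# Hypothetical extraction rules, shown as the dictionary a YAML rule file
# might load into. Each field maps to a regex with a named group "value".
RULES = {
    "name": r"^==\s*(?P<value>.+?)\s*==$",
    "address": r"^Address:\s*(?P<value>.+)$",
    "phone": r"^Phone:\s*(?P<value>[+\d][\d\s-]+)$",
}


def extract(text, rules):
    """Apply every rule to every line and keep the first match per field."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        for field, pattern in rules.items():
            match = re.match(pattern, line)
            if match and field not in record:
                record[field] = match.group("value").strip()
    return record


# Made-up sample article text for a point of interest.
sample = """== Cafe Central ==
Address: Herrengasse 14, Vienna
Phone: +43 1 5333763
"""

poi = extract(sample, RULES)
```

Keeping the patterns in data rather than in code means a new source can be supported by writing a new rule file instead of a new parser.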
After we've parsed everything, we end up with two big buckets of information, one about the places and one about the points of interest. Usually the same place or point of interest will occur multiple times in those buckets since we take information from so many sources, so the next step is to match it all up. After the matching phase we end up with exactly one record for each place or POI, containing all the information from every source we've used.
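One simple way to picture the matching step is shown below. This is only a sketch under stated assumptions, not the matching we actually run: it assumes every record carries a name, a latitude and longitude, and a source label, and it treats two records as the same POI when their normalized names are equal and they lie within a hypothetical 150 m of each other.

```python
import unicodedata
from math import asin, cos, radians, sin, sqrt


def normalize(name):
    """Lowercase, strip accents and drop everything except letters and digits."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return "".join(c for c in no_accents.lower() if c.isalnum())


def distance_m(a, b):
    """Great-circle (haversine) distance in meters between two records."""
    lat1, lon1, lat2, lon2 = map(radians, (a["lat"], a["lon"], b["lat"], b["lon"]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(h))


def same_poi(a, b, max_m=150):
    # The 150 m threshold is an assumption chosen for this example.
    return normalize(a["name"]) == normalize(b["name"]) and distance_m(a, b) <= max_m


def merge_all(records):
    """Fold duplicate records together, keeping the first value seen per field."""
    merged = []
    for rec in records:
        for existing in merged:
            if same_poi(existing, rec):
                for key, value in rec.items():
                    existing.setdefault(key, value)
                existing["sources"].append(rec["source"])
                break
        else:
            merged.append({**rec, "sources": [rec["source"]]})
    return merged


# Made-up records: two describe the same cafe, one is a namesake elsewhere.
records = [
    {"source": "wikitravel", "name": "Café Central", "lat": 48.2104, "lon": 16.3654,
     "description": "Classic coffee house."},
    {"source": "osm", "name": "Cafe Central", "lat": 48.2105, "lon": 16.3655,
     "phone": "+43 1 5333763"},
    {"source": "osm", "name": "Cafe Central", "lat": 40.4168, "lon": -3.7038},
]

pois = merge_all(records)
```

Here the first two records collapse into one POI that has both a description and a phone number, while the namesake far away stays separate. Comparing every record against every merged record is quadratic, so a real pipeline would need some form of spatial bucketing first.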
Then we start to determine what's relevant for travelers.
Photos taken by travelers are a great help in this phase. Using the EXIF data you can tell where they were taken, at what time and on which day. If you combine this information with other data you can find out lots of interesting characteristics. This is the sort of thing we started out with when we wrote our very first blog posts.
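As a small illustration of what photo timestamps can reveal, the sketch below counts photos per hour of day and per weekday. It assumes the timestamps have already been read out of the photos and assigned to one POI, which is the hard part and is not shown here; the function name and sample values are made up. The only fixed detail is the EXIF date format, "YYYY:MM:DD HH:MM:SS".

```python
from collections import Counter
from datetime import datetime

# EXIF stores capture times as "YYYY:MM:DD HH:MM:SS".
EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"


def visit_profile(timestamps):
    """Count photos per hour of day and per weekday (Monday is 0)."""
    hours = Counter()
    weekdays = Counter()
    for raw in timestamps:
        taken = datetime.strptime(raw, EXIF_FORMAT)
        hours[taken.hour] += 1
        weekdays[taken.weekday()] += 1
    return hours, weekdays


# Made-up capture times for photos taken near a single POI.
photos = [
    "2011:07:16 22:15:03",
    "2011:07:16 22:48:10",
    "2011:07:17 13:05:44",
]

hours, weekdays = visit_profile(photos)
peak_hour = hours.most_common(1)[0][0]
```

A POI whose photos cluster late on weekend evenings is probably a nightlife spot, whereas one photographed mostly around midday is more likely a sight.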
The density of travel-related words in articles describing a point of interest is another valuable source of information. We have a number of scripts that analyze the content and give us a very good indication of what an article is about, which snippets best describe it, how much people like it and how relevant it is for travelers.
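The simplest form of such a density measure is the share of words in a text that belong to a travel vocabulary. The sketch below shows that basic idea only; the word list is a tiny made-up example, and our real scripts are not reproduced here.

```python
import re

# A tiny, made-up travel vocabulary; a real list would be far larger.
TRAVEL_WORDS = frozenset({
    "hotel", "museum", "beach", "restaurant",
    "tour", "visit", "ticket", "sightseeing",
})


def travel_density(text, vocabulary=TRAVEL_WORDS):
    """Return the fraction of words in text that are in the vocabulary."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in vocabulary)
    return hits / len(words)


travel_score = travel_density("Visit the museum and the beach")
other_score = travel_density("The committee approved the annual budget")
```

In this example half of the words in the first sentence are travel words, and none in the second, so the first text would rank as far more relevant to travelers.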
Once our algorithms have determined which points of interest are most important, we start creating guides for countries and cities. We combine the descriptions from open sources with maps made using data from the OpenStreetMap project and there you go: a complete travel guide comes out.
What we end up with is a complete overview of the things to see and do, places to eat and to go out, all ordered by relevance from a traveler's perspective.