When I travel, I always want to get the most out of the city I'm visiting. One way is to talk to local people and get advice about which spots you shouldn't miss. But I wish I could have the point of view of past visitors... Which places did they enjoy the most? Where is the best spot to watch the sunset? Where can you get the best selfies? I also want to have a look at some beaches before choosing one, or get an idea of what some areas look like... The idea here is to ask Instagram about the nearby popular spots.
After looking into the Instagram API and playing around with it, I came up with the following script.
import os
import json
from collections import Counter

import pandas as pd
from instagram.client import InstagramAPI

INSTAGRAM_ACCESS_TOKEN = ''
INSTAGRAM_CLIENT_ID = ''
INSTAGRAM_CLIENT_SECRET = ''

api = InstagramAPI(access_token=INSTAGRAM_ACCESS_TOKEN,
                   client_id=INSTAGRAM_CLIENT_ID,
                   client_secret=INSTAGRAM_CLIENT_SECRET)


def getNbLikes(listMedia):
    # Average number of likes per picture for one location
    likes = 0
    count = 0
    for media in listMedia:
        likes = likes + media.like_count
        count = count + 1
    if count > 0:
        return likes/count
    else:
        return 0


def getTags(listMedia):
    # Count how often each tag appears on the pictures of one location
    tags = []
    for media in listMedia:
        for mediaTag in media.tags:
            tags.append(mediaTag.name)
    return Counter(tags)


def getMedia(locationId):
    # Recent pictures posted at this location
    medias = api.location_recent_media(location_id=locationId)
    return medias


bestLocations = []
# Centre of the search grid
latD = 48.858844
lonD = 2.294351

# Walk a 20 x 20 grid around the centre, in steps of 0.001 degrees
for x in range(-10, 10):
    for z in range(-10, 10):
        print(x, z)
        locations = api.location_search(lat=latD + x*0.001, lng=lonD + z*0.001)
        for location in locations:
            likes = 0
            # Skip locations we have already seen (matched by name)
            if not any(d['name'] == location.name for d in bestLocations):
                images = getMedia(location.id)
                likes = getNbLikes(images)
                tags = getTags(images)
                if len(images) > 0:
                    bestLocations.append(dict(name=location.name,
                                              latitude=location.point.latitude,
                                              longitude=location.point.longitude,
                                              likes=likes,
                                              tags=tags,
                                              id=location.id,
                                              nbrImages=len(images)))

finalData = pd.DataFrame.from_dict(bestLocations)
finalData.to_csv('instadata.csv', sep='\t', encoding='utf-8')
We first query for the locations around the coordinates of the place we want to know more about. Then, for each location, we query its recent photos and compute the average number of likes per photo, the tags used and the number of pictures.
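To get a feel for the area the two nested loops cover, here is a small standalone sketch of the same grid. The helper names `grid_points` and `step_in_metres` are mine, not part of the script, and the figure of about 111 km per degree of latitude is only a rough approximation.

```python
import math


def grid_points(center_lat, center_lng, half_width=10, step_deg=0.001):
    """Yield (lat, lng) pairs on a square grid around the centre.

    Mirrors the nested range(-10, 10) loops of the script: offsets run
    from -half_width to half_width - 1 in both directions.
    """
    for x in range(-half_width, half_width):
        for z in range(-half_width, half_width):
            yield (center_lat + x * step_deg, center_lng + z * step_deg)


def step_in_metres(lat_deg, step_deg=0.001):
    """Approximate size of one grid step (north-south, east-west) in metres."""
    metres_per_degree = 111320.0  # rough length of one degree of latitude
    north_south = step_deg * metres_per_degree
    # Degrees of longitude shrink with the cosine of the latitude
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


points = list(grid_points(48.858844, 2.294351))
print(len(points))  # 400 grid points, so 400 location searches

ns, ew = step_in_metres(48.858844)
print(round(ns), round(ew))
```

Each grid point costs one location search plus one media query per new location, which is why the script takes a while to run.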
Note: You need to replace the access token, client ID and client secret with the values you get from Instagram.
After running the previous script for some time, we get a nice dataset that we can analyze with pandas.
> import pandas as pd
> df = pd.read_csv("instadata.csv", sep='\t')
> df.head(10)
This is the head of the DataFrame displayed in IPython Notebook.
Now let's see which spots have the most likes per picture and which ones have the most pictures:
gr = df.groupby('name').sum()
After dropping the useless columns, we can plot the result:
my_plot = gr.head(30).sort_values(by='likes', ascending=False).plot(kind='bar', figsize=[15,5])
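The step that drops the columns is not shown in the post. Here is a minimal sketch of what it could look like, assuming the column names written by the script and assuming that `id`, `latitude`, `longitude` and `tags` are the ones we don't need for the chart. The sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical rows, shaped like the CSV written by the script
df = pd.DataFrame({
    "name": ["Spot A", "Spot A", "Spot B", "Spot C"],
    "id": [1, 2, 3, 4],
    "latitude": [48.8580, 48.8581, 48.8600, 48.8620],
    "longitude": [2.2940, 2.2941, 2.2960, 2.2980],
    "likes": [120.0, 80.0, 45.0, 300.0],
    "nbrImages": [20, 10, 15, 5],
    "tags": ["a", "b", "c", "d"],
})

# Keep only the columns that make sense to add up per name
useful = df.drop(columns=["id", "latitude", "longitude", "tags"])

# One row per name, then rank by likes
gr = useful.groupby("name").sum()
ranked = gr.sort_values(by="likes", ascending=False)
print(ranked)
```

Note that sorting before taking the first 30 rows gives the 30 best spots, while `head(30)` followed by a sort only orders the first 30 names.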
This script could be improved with some text mining on the names, to combine similar results. (You can see that we have multiple results for the Eiffel Tower.)
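As a first, very rough pass at combining similar names, one could normalize them before grouping. This is only a sketch: the helpers `normalize_name` and `group_similar` and the sample names are hypothetical, and it only catches differences in case, accents and punctuation.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_name(name):
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = re.sub(r"[^a-z0-9]+", " ", no_accents.lower())
    return cleaned.strip()


def group_similar(names):
    """Map each normalized name to the original spellings that share it."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize_name(name)].append(name)
    return dict(groups)


# Hypothetical location names, for illustration only
sample = ["Tour Eiffel", "tour eiffel", "Tour Eiffel!", "Trocadéro", "Trocadero"]
groups = group_similar(sample)
for key, originals in groups.items():
    print(key, originals)
```

Translations such as "Eiffel Tower" and "Tour Eiffel" would still end up in separate groups, so real text mining (or grouping by coordinates) would be needed on top of this.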
The next step is to visualize the nearby pictures.
I put together a little AngularJS application where we can select a location and see a list of pictures for it.
I'll put the code online when I have more time.
Please let me know in the comments if you have any ideas for improvement!