This blog has moved! This post and other mistakes are now at https://mango.pdf.zone
Look I'm not really sure why but I think I made a thing that makes graphs of when people are online on Facebook. It sounds kinda creepy and uh it is. Read along so you, too, can be the NSA.
Little green dots
You know those green dots on the sidebar on Facebook that tell you who's online? How do they get there? Also there are times next to people who are offline. What are those about?
I was wondering the same things, and so one day I decided to 360 noscope hack Facebook by right clicking and selecting "Inspect Element".
IāM IN
We did it team. Anyway alright uhhhh let's just uh snoop around here reallllll sneaky like
If you reload the page you'll see approximately fifty-bajillion network requests go off as Facebook desperately tries to load all the junk that it needs to display facebook.com.
You might be wondering at this point why I decided to look for interesting things in this mess instead of, I dunno, getting out more, getting a cat, that sorta thing. Anyway hey look a heading
Finding the good stuff
What's this "pull" thing?
THAT looks like some #datascience right there. This is the kind of 100% legit secret undocumented "API" that we came here for. Let's do some reverse-engineering.
It looks like a mapping of Facebook user ids to... their online status? But there's more than one value? "webStatus" and "fbAppStatus" are both there. What's more, it tells you what the person is doing on each of the different kinds of statuses.
For example:
"messengerStatus": "invisible" means they're not online on the Facebook Messenger app.
"webStatus": "idle" means their web browser is logged in to Facebook, and has the page open, but they aren't doing anything on the site like moving their mouse or talking to anyone.
Since we have both of these at the same time, we can tell that this person is likely not using their phone, and that they were using facebook.com recently, but not right now.
That's already a little creepy that we can tell that about people. But can we do more with this?
You might also notice that there is a value called "la" that is a big integer that starts with "14". If you I dunno, didn't have a lot of friends in high school, you might recognise that as a UNIX time stamp - the time in seconds since midnight UTC, January 1, 1970.
Computer Scientists thought this would be a good time to start measuring the time from because the first app was born at midnight, January 1, 1970. The app was a custom emoji pack for an ancient model of phone that would one day evolve to become the first Blackberry.
If you're wondering why the response starts with "for (;;);", it's to, among other things, encourage developers to use a quality JSON decoder, instead of like, y'know, eval().
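Here's a minimal sketch of what "use a quality JSON decoder" could look like in Python. The function name and the sample body are made up for illustration; the only things taken from the real response are the "for (;;);" prefix and the fact that a sequence number called "seq" shows up in it.

```python
import json

# The guard that the response starts with, as described above.
GUARD_PREFIX = "for (;;);"


def parse_guarded_json(body: str):
    """Strip the leading "for (;;);" guard if it is there, then decode JSON."""
    text = body.lstrip()
    if text.startswith(GUARD_PREFIX):
        text = text[len(GUARD_PREFIX):]
    return json.loads(text)


# Made-up sample body, not a real Facebook response.
sample = 'for (;;);{"seq": 7}'
print(parse_guarded_json(sample))
```

The point of doing it this way is that nothing in the response is ever executed: the prefix is just sliced off as text and the rest goes through json.loads.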
Anyway that "la" thing stands for "last active", and tells you the last time the person was active on Facebook, down to the second. Do you see where I'm going with this?
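If you want to turn one of those "la" integers into something a human can read, a sketch like this should do it. The function name is hypothetical and the example value is one I picked by hand, not one from a real response.

```python
from datetime import datetime, timezone


def last_active(la: int) -> datetime:
    """Convert a UNIX timestamp (seconds since 1970-01-01 UTC) to an aware datetime."""
    return datetime.fromtimestamp(la, tz=timezone.utc)


# Hand-picked example value that starts with "14", like the ones described above.
print(last_active(1455148800).isoformat())
```

Passing tz=timezone.utc keeps the result unambiguous. Without it you get a naive datetime in whatever timezone the machine happens to be in.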
Roleplaying as the NSA
So far we have a whole bunch of things which look like this
A person
A time
Whether they're online or offline or idle
Which devices they're online/offline/idle on
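As a rough sketch, each of those "things" could be stored as a small record like the one below. The class, the function, and the flat shape of the per-user entry are all assumptions for illustration; the real nesting of the /pull response isn't spelled out here. Only the status key names come from the response itself.

```python
from dataclasses import dataclass
from typing import Dict, List

# Status keys seen in the response; the flat per-user dict is an assumed shape.
CLIENT_KEYS = ("status", "webStatus", "messengerStatus", "fbAppStatus", "otherStatus")


@dataclass(frozen=True)
class Observation:
    user_id: str  # a person
    seen_at: int  # a time (UNIX seconds when we logged it)
    client: str   # which device/client the status is for
    state: str    # e.g. "active", "idle", "invisible", "offline"


def to_observations(user_id: str, entry: Dict[str, object], seen_at: int) -> List[Observation]:
    """Flatten one user's entry into one Observation per client key present."""
    return [
        Observation(user_id, seen_at, key, str(entry[key]))
        for key in CLIENT_KEYS
        if key in entry
    ]


# Made-up entry for a made-up user id.
entry = {"webStatus": "idle", "messengerStatus": "invisible", "la": 1455148800}
for obs in to_observations("1234", entry, seen_at=1455148900):
    print(obs)
```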
This doesn't seem that interesting at first, since you already know who is online by looking at the sidebar. But what if there was someone always watching the little green dots?
Using the power of computers, you can just write a Python program to listen to what the /pull requests are saying all the time ever, and write it down.
Here's a screenshot of all the log files I've got:
And here's what an individual log file looks like (the first 10 lines):
Those blurred out things are Facebook user ids. If you think these screenshots look a little bit creepy then YEAH I KNOW RIGHT.
Tell me about your program then you massive nerd
It runs 24/7, and it's constantly logging online/offline activity data from those /pull URLs using my Facebook cookie.
Writing it was mostly about saying "jeez, all these parameters look complicated" and then blindly copy/pasting them anyway.
Protip, you can right click on any network request in Chrome's Developer Tools and click "Copy as cURL". This is amazing and lets you re-run a request from the terminal, as well as give you all the headers and cookies used to run that request in a nice copy-pasteable format.
The first step was to just run that request verbatim in a terminal with curl.
I was expecting it to not work because it looks like it has some sequence numbers in it oh boy BUT it turned out to just take a really long time. I later found out this was because the /pull endpoint is using HTTP Long Polling, which turns out to be like a streaming HTTP GET request.
The only other important parameter to worry about is "seq", which I'm guessing is the sequence number of the response from Facebook. Just add 1 to the sequence number that the response from /pull gives for the next request and you're good to go.
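The "add 1 to seq" loop can be sketched like this. To keep it runnable without a Facebook cookie, the actual HTTP request is hidden behind a fetch function that you pass in; that function, the names, and the fake responses below are all hypothetical, and this is not the code from the GitHub repo.

```python
from typing import Callable, Iterator, Optional


def poll(fetch: Callable[[int], dict],
         start_seq: int = 0,
         max_polls: Optional[int] = None) -> Iterator[dict]:
    """Call fetch(seq) repeatedly, advancing seq to (response seq + 1) each time.

    fetch stands in for one long-polling request to /pull and is assumed
    to return the decoded response as a dict.
    """
    seq = start_seq
    polls = 0
    while max_polls is None or polls < max_polls:
        response = fetch(seq)
        polls += 1
        yield response
        if "seq" in response:
            # Next request uses the sequence number from this response, plus one.
            seq = response["seq"] + 1


# Fake fetch for demonstration: pretends the server's seq jumped ahead by 4.
def fake_fetch(seq: int) -> dict:
    return {"seq": seq + 4}


for response in poll(fake_fetch, start_seq=0, max_polls=3):
    print(response)
```

With a real fetch, each call would block for a while (that's the long polling), so the loop naturally paces itself.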
If you're worrying about remembering all this, chill out I got yo' back, my 100% Terms of Service Compliant implementation of this is available here on GitHub. Standard disclaimers of "I'm so sorry I wrote parts of this in like 30 minutes" apply.
One caveat of the data-collection program that I've noticed is that it has false negatives. That is, sometimes it won't give you a "this person is online" data point, even though they really are online. I guess that gives plausible deniability of... being offline?
You should probably get out more
[worried laughter]
So that's the hard part done, right?
Let me paint you a word-picture. It's 11pm, I'm listening to the soundtrack to The Social Network (ironically? meta-ironically? I don't even know), I have six terminals tiled across two screens as well as fifty thousand browser tabs open and I'm up to my third graphing library.
Making graphs is really hard.
I used matplotlib, but I realised this wasn't my thesis and I wouldn't be embedding this ugly graph as a pdf into a LaTeX document that takes 3 passes of pdflatex to render because there's been a terrible but extremely localised accident where only humanity's LaTeX to pdf converters have been irreversibly sent back in time to the 80s.
I used bokeh, which claims to be a "matplotlib-killer", and it was okay until a friend told me "it isn't the 90s anymore, you don't generate graphs server-side. Also your graphs are ugly and you should feel ugly you utter fraud".
This friend recommended nvd3.js, presumably because you're not making real graphs in 2016 unless your graphing library is <something>.js and requires at LEAST one other <something else>.js as a dependency. Everyone looks at you like "what, you DON'T already use <something else>.js? Jeez say goodbye to your Hacker News karma. Just apt-get install npm && npm install bower && bower install-" NO STOP IT THIS ISN'T WHAT TIM BERNERS-LEE WANTED.
I think it took about three times as much time to graph the data as it took to write the code to download it. And the graphs aren't even good! I gave up on perfecting the graphs so I could just hurry up and write this questionable blog post already. Just think of me resolving pip3 dependencies when you see the ugly graphs.
AND ANOTHER THING when it's midnight and your x-axis formatting function doesn't convert UNIX times into JavaScript date objects properly because there's no timezone information and I dunno JavaScript was written by some guy in two weeks (yeah I ain't afraid to call it out what of it) and your binary-search based conversion of sparse timeseries data into uniformly dense timeseries data is causing so many data points to be graphed that it's slowly crashing Chrome and you're watching helplessly as your RAM goes up and Chrome won't close the tab and it just doesn't seem right that 2016, the year of the Linux Desktop has brought us this situation I mean I thought if you had enough <something>.js libraries this stuff was meant to just scale right up so tha-
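For the curious, here's one plausible reading of "binary-search based conversion of sparse timeseries data into uniformly dense timeseries data", sketched in Python. It assumes each logged status holds until the next one is logged, and the function name, parameters, and "offline" default are all made up for this example. It is not the code that actually crashed Chrome.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def densify(times: Sequence[int],
            states: Sequence[str],
            start: int,
            end: int,
            step: int,
            default: str = "offline") -> List[Tuple[int, str]]:
    """Sample a sparse, sorted status log at a uniform interval.

    times must be sorted ascending; states[i] is assumed to hold from
    times[i] until the next entry. Sample points before the first entry
    get the default state.
    """
    dense = []
    for t in range(start, end, step):
        # Binary search for the last logged entry at or before t.
        i = bisect_right(times, t) - 1
        dense.append((t, states[i] if i >= 0 else default))
    return dense


# Two made-up log entries, sampled every 10 seconds.
print(densify([10, 25], ["active", "idle"], start=0, end=40, step=10))
```

The number of points you end up graphing is (end - start) / step, so a small step over a few days of data adds up fast. A bigger step is the cheap way to keep the browser alive.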
Quit stalling with graphing libraries and show me the graphs
Fine but you're missing out on top-quality graphing-related banter.
The graphs in this section are all of the online/offline activity of some of my Facebook friends. They consented to it being on this blog post on the condition that it's anonymous.
Person 1
Here's someone's graph. The x-axis is time, and the y-axis is how online the user is. Possible states for someone's status are "offline", "invisible", "idle", and "active". Each coloured line is a different kind of client. It's called a client because I don't know I'm an Information Visualisation Professional and I get to make up words like that. Here are explanations for what each of the "coloured lines" means
status - Not sure what this is. Some kind of client-agnostic status? It doesn't line up exactly with the activity of the other clients though
webStatus - Chat activity on facebook.com
messengerStatus - Status on the Messenger mobile app
fbAppStatus - Status on the Facebook mobile app
otherStatus - Presumably shows when people are online on other apps that can access the API that causes them to be considered "online". OAuth? Random "apps" like Farmville? No idea
Here's the same graph, with some clumsy drawings on it showing when I think this person is awake/asleep.
You can see the amount of rest they're getting each day - it's the width of the "asleep" bit.
You can also see that they were probably asleep from 3am to 10am on February 11, and BOY does it feel creepy writing this.
Of course, this isn't perfect, since they might be awake and not using Facebook (I know). Having spoken to a few people who were graphed, it's been a fairly accurate measure of awake/asleep time, as well as "how much do you browse Facebook at work" time ;)
Do you look at Facebook shortly after you wake up? Shortly before you sleep? If so, these graphs are a fairly accurate way to measure when you were asleep, and anyone you're friends with on Facebook can do it.
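I drew the "asleep" bits on by hand, but if you wanted a computer to guess them, one simple heuristic is "the longest stretch with no activity is probably sleep". Here's a sketch under that assumption. The function name and the sample timestamps are made up, and it will happily mistake a long hike for a nap.

```python
from typing import Iterable, Optional, Tuple


def longest_gap(active_times: Iterable[int]) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the longest gap between consecutive activity times.

    active_times are UNIX timestamps at which someone was seen active.
    Returns None if there are fewer than two timestamps.
    """
    ts = sorted(active_times)
    if len(ts) < 2:
        return None
    # Compare each timestamp with the one after it and keep the widest pair.
    return max(zip(ts, ts[1:]), key=lambda pair: pair[1] - pair[0])


# Made-up activity: two short bursts with a long quiet stretch in between.
gap = longest_gap([100, 200, 5000, 5100])
print(gap, "gap length in seconds:", gap[1] - gap[0])
```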
Person 2
I showed this person their graph and asked them some questions.
"Did you go to sleep around 11:10pm last night?"
They said yes.
"Did you wake up around 8:32? That's a weird time. Was your alarm set for 8:30?"
They said yes.
The person isn't online as frequently as the previous examples
The person isn't using the Messenger app nearly as much
You can see that their webStatus was "online" on and off from midnight til around 2am, and then again at 10:21am. I'm not sure if this spiky pattern means that they really were online, then offline, then online again, or if it's just a quirk of the dodgy undocumented "API" I'm using, or even if it's just a problem with my code.
Similarly, I'm not sure why there are these weird spikes every three minutes (+- ~1 minute) sometimes.
Also, why does "otherStatus" go to offline precisely when "webStatus" goes to online? So many questions! Let me know if you know the answers to any of these things (@Facebook employee friends ;) ;) ;))
Anyway, I hope I've convinced you that this is real creepy. I don't really want to be able to have the power to do this.
Your dumb graph screenshots are too small. Give me a live graph to play with
You got it, boss. Click here. Or anywhere, really. This whole sentence is a link.
What else can you do with this data?
You can aggregate. Finding the average wake up time/sleep time/time spent on Facebook each day and then looking for outliers sure sounds like a way to find interesting things about your Facebook friends.
You can write a thing to email you every morning with the names and sleep times of everyone who's had less than 6 hours of sleep.
You could even try and guess when your friends are talking to each other, by looking for times when only a few people are active, although I suspect this would be hard.
I'm sure you can come up with something else, too.
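As a sketch of the aggregating and the "less than 6 hours" idea: assuming you've already boiled the logs down to a list of nightly sleep durations per person (that input format, the names, and the numbers below are all made up), the report part is short. Sending the email is left out.

```python
from statistics import mean
from typing import Dict, List, Tuple


def short_sleepers(nightly_hours: Dict[str, List[float]],
                   threshold: float = 6.0) -> List[Tuple[str, float, float]]:
    """List (name, last night's hours, average hours) for everyone under threshold.

    nightly_hours maps a name to estimated hours of sleep per night,
    oldest first; the last entry is treated as last night.
    """
    report = []
    for name, nights in sorted(nightly_hours.items()):
        if nights and nights[-1] < threshold:
            report.append((name, nights[-1], mean(nights)))
    return report


# Made-up estimates for two anonymous people.
estimates = {"Person 1": [7.5, 8.0], "Person 2": [7.0, 5.0]}
for name, last_night, average in short_sleepers(estimates):
    print(f"{name}: {last_night}h last night, {average}h on average")
```

Comparing last night against each person's own average is also a cheap way to spot outliers.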
Why can you do this? Can't Facebook stop this from happening?
That's a good question, thanks for asking.
It makes sense for Facebook to be able to do this, since they can tell when everyone is online anyway. But why can your Facebook friends do this to you?
I don't know all the details of how facebook.com uses all the data that's sent via the /pull endpoint, but it's kinda creepy that I can see my friends' status on every device? I guess they could just give me "web" or "mobile" or "offline", rather than the full list of statuses for every client, but even that doesn't solve the problem.
I also see the value in seeing "last active 4h ago" and "last active 1m ago" for Messenger contacts but... I dunno, here I am making these creepy graphs.
Anyway, I just open-sourced my dodgy graph making thing so now everyone can do this. And who knows how many people have been doing it already?
I'm probably oversimplifying it, though. The smart people at Facebook who write this stuff have probably thought of all of this and found that this way was best.
Can I stop you from doing this to me?
Kinda. Coincidentally, because my script is always running, collecting data, I show up as "online" all the time. If you were also running a script like this, it would partially prevent what I'm doing from working on you, since you always show up as "online", no matter what you're really doing. Activity from the Messenger app will still show up separately, though.
tl;dr
Facebook sends your computer a bunch of interesting information when you're on facebook.com.
You can collect that information over time and use it to keep track of when people are on Facebook, and which devices theyāre using.
You can make a pretty good guess as to what time people are going to sleep and waking up
It's creepy, but I don't see a way for Facebook to stop allowing this while still making their chat app good.
So how does this make money again?
Oh, no no no. I just uh don't get out much.
If you want to talk to me about this blog post then I dunno tweet at me I guess. You can also stalk me on GitHub if you want.
For the latest in dumb novelty websites, please direct your browser to https://mango.pdf.zone