Privacy, Security, and Censorship

Frontiers of Computational Journalism week 11 - Privacy and Security
okay privacy and security so this is the
digital security for journalism lecture
this is going to be taught at a bit of a
higher level than I normally teach this
thing I do journalism training seminars
for various places we're also going
to talk a little more about the broader
question of privacy and user data
obviously this has become a very big
field you know to the point where there
are now congressional inquiries about
consumer data privacy not something you
saw before last year but really the
emphasis for us today is on trying to
keep journalists safe so this is how
we're going to do this we're gonna start
with some basics and just simple stuff
and as the keepers of the security
knowledge you will probably find
yourself responsible for security
practice at some point simply because no
one else will have studied how to do
this so I'm gonna try to give you some
tips for that and then we're gonna go
through sort of various aspects of this
including the legal issues around it or
some of them the highest-level framework
I'm going to teach you today is called
threat modeling has anyone heard of this
so there's no one-size-fits-all security
you have to plan security for the
particular context of the particular
story that you're working on and we're
going to talk about how to do that and
then we're going to do some case studies
so as you may know the WikiLeaks cables
the set of 250k diplomatic cables were
never meant to be fully released
publicly the news organizations who
were working on them were vetting them
and redacting them and releasing them a
few at a time
but there was a monumental series of
security mistakes which caused the
entire archive to be leaked and we're
going to talk about that as a case study
of how even experts get this stuff wrong
and unfortunately we have lots of other
case studies of security failures in
journalism at this point so the basics
so this is what I hope everybody in your
news organization is doing if you don't
know who the security person is in your
news organization it's going to be you
unfortunately there's starting to be a
little more literacy about this stuff
but basically if you can get proper
passwords and proper logins and you can
teach people about phishing you're 90%
of the way there
almost all actual attacks on journalists
and their sources are through phishing so
if you can cut out that vector you're
doing pretty well and then there's you
know basic device encryption that's
getting a lot easier basically
everything is encrypted by default these
days and then there's the you know I
forgot to turn off access problem
passwords I'm sure you've gotten this
speech from a number of places but every
time there's a major password breach we
see something like this this has gotten
a little better most sites don't let you
use password as a password anymore they
don't let you use 123456 anymore
but you can see you know everyone thinks
they're being clever by using shadow and
dragon as a password but no I also
really enjoy how the LinkedIn passwords
are raunchier than the Gawker passwords
I think that
says something very interesting
about our society I'm not quite sure
what though anyway as the thing goes
don't be that guy right so don't
do that the ever brilliant xkcd had a
nice one on passwords a few years ago
yeah you can do all this like replacing
you know letters with numbers and stuff
but passphrases are actually easier to
remember and more secure just in terms
of the number of bits of information
involved but you know honestly don't use
a dictionary word or a variation on a
dictionary word is really the major
piece of advice here why is that what is
the vulnerability of dictionary words
yeah yeah there's not that many of them
there's maybe a few hundred thousand
when you start to look at variations and
do this like digit substitution thing
but let's say less than a million but
why does that matter
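To put "less than a million" in terms of bits, here is a small sketch. It compares one dictionary word (or variation) against a passphrase of four words picked uniformly at random. The 2048-word list size is an assumption taken from the xkcd comic's arithmetic, and the numbers only hold if the words really are chosen at random.

```python
import math

def entropy_bits(choices: int, picks: int = 1) -> float:
    """Bits of entropy for `picks` independent, uniform choices among `choices` options."""
    return picks * math.log2(choices)

DICTIONARY_VARIANTS = 1_000_000  # the lecture's upper bound on words plus variations
WORDLIST_SIZE = 2048             # assumed word list size (11 bits per word)
WORDS_IN_PHRASE = 4              # assumed passphrase length

print(f"dictionary word or variant: {entropy_bits(DICTIONARY_VARIANTS):.1f} bits")
print(f"four random words: {entropy_bits(WORDLIST_SIZE, WORDS_IN_PHRASE):.1f} bits")
# dictionary word or variant: 19.9 bits
# four random words: 44.0 bits
```

Each extra bit doubles the number of guesses an attacker needs, so the gap between roughly 20 bits and 44 bits is a factor of about sixteen million.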
okay all right so you don't want
something to go through every word now
you can't go through every dictionary
word trying to log into Facebook or you
can't go through every six digit
passcode trying to log into your phone I
don't know if you've ever tried it but
eventually it starts slowing you down
and eventually it only lets you enter
one passcode every 10 minutes 10 minutes
or 15 minutes something like that
so most sites don't let you do millions
of attempts but there's still a problem
anyone know what it is let's look at
this again how are users' passwords
stored when someone broke into
LinkedIn did they actually find a file
that had this in it no
so how are passwords stored anyone know
hashed right okay so and this is
very old right this has been security
practice since the 1970s so if you go to
your UNIX system and generally these
days you won't have read access to this
but you might if you're root on the
system or it's not properly secured
there's a file called /etc/passwd and
what that file looks like is it's a
series of user names and then a hash
okay and what that hash is is we compute
we take some hash function and we hash
the password and the hash function is a
one-way function you've all seen hash
functions you've had some cryptography
somewhere we're gonna yeah this is going
to be the cryptography heavy course in
this class of course the point of a hash
is that it's very fast to compute this
and very slow to invert it this is why
they're also sometimes called a one-way
function and so then when someone's
trying to log in you hash the password
they type in and see if it matches the
hash in the file so good enough and the
point of this is that even if you can
get this file you can't figure out what
the password is without a lot of
computer power the point of a hash
function is that you can't do better
than random guessing yeah
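The store-the-hash, compare-the-hash login check described above can be sketched in a few lines. This is an illustration under assumptions, not how any particular system does it: the user name and password are made up, and plain unsalted SHA-256 is used only because it keeps the idea visible. A real system would use a salted, deliberately slow function.

```python
import hashlib
import hmac

def hash_password(password: str) -> str:
    """One-way hash of a password (unsalted SHA-256, for illustration only)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Stand-in for the password file: user names mapped to hashes, never to passwords.
password_file = {"alice": hash_password("correct horse battery staple")}

def check_login(username: str, attempt: str) -> bool:
    """Hash what the user typed and compare it with the stored hash."""
    stored = password_file.get(username)
    if stored is None:
        return False
    # compare_digest does the comparison in constant time
    return hmac.compare_digest(stored, hash_password(attempt))
```

Someone who steals `password_file` gets only the hashes, so they still have to guess inputs and hash each guess.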
that's one vulnerability the other
vulnerability is something called a
rainbow table and what a rainbow table
is is what you do is you take a file of
English words right you get a just a
list of dictionary words and you hash
every single one of them and so now you
have a let's say million entry table
which is just a lookup from the hash to
the original word and that's what this
is that's how they got this right so you
can hash a million words on your GPU in
you know an hour or two and you get a
multi gigabyte file and then you're done
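A toy version of that precomputed table looks like this. Strictly speaking, what is described here is a plain hash-to-word lookup table. A true rainbow table stores the same information more compactly using chains of hashes. The sketch does the simple version, with a five-word list and made-up user names standing in for a real dictionary and a real stolen file.

```python
import hashlib

def sha256_hex(word: str) -> str:
    return hashlib.sha256(word.encode("utf-8")).hexdigest()

# Tiny stand-in for a dictionary of common passwords.
wordlist = ["password", "123456", "shadow", "dragon", "letmein"]

# Precompute once: hash -> original word.
lookup = {sha256_hex(word): word for word in wordlist}

# Stand-in for a stolen file of unsalted hashes.
stolen = {
    "user1": sha256_hex("dragon"),
    "user2": sha256_hex("shadow"),
    "user3": sha256_hex("kq7#vT2m!xR9"),  # not in the word list
}

def crack(stolen_hashes: dict, table: dict) -> dict:
    """Return the users whose hash appears in the precomputed table."""
    return {user: table[h] for user, h in stolen_hashes.items() if h in table}

print(crack(stolen, lookup))
# {'user1': 'dragon', 'user2': 'shadow'}
```

The table is built once and reused against every stolen file that uses the same unsalted hash function, which is what makes dictionary passwords so cheap to recover.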
if you can get this file you can invert
all the passwords now there are a bunch
of ways to prevent this one is that
modern systems also use a salt which is
they pick some constant and the constant
is system specific and what that does is
it means that you have to generate a
rainbow table for that system which
first of all slows you down means you
can't just download a rainbow table and
also you have to know the salt constant
all right so it makes things a little
harder the other thing you can do is you
can make this hash function super slow
so sometimes you see like multiple salts
and multiple steps if you can make that
password hash calculation take
half a second you're never really going
to notice it when you're logging in I
mean what's half a second but if you're
trying to compute a million of these
suddenly it takes weeks instead of days
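Both defenses, the salt and the deliberately slow hash, can be sketched with Python's standard library. Two assumptions to flag: the lecture describes the salt as a system-specific constant, while this sketch generates a random salt per password, which is the more common practice today, and the iteration count is an arbitrary illustrative number rather than a recommendation.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor; higher means slower for everyone, attackers included

def hash_password(password: str, salt: bytes = None):
    """Return (salt, digest) using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random salt for each stored password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

The salt is stored next to the digest. It does not need to be secret to do its job, which is to make a table precomputed for one salt useless against any other.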
so there are various things you can try
to do to defend against rainbow tables
but the best thing you can do is not use
dictionary words which is why most
systems don't allow it these days
because then you can't invert these all
right which means that even if someone
breaks into your system and steals your
password file you're still not going to
be able to figure it out except that you
can do the frequency trick you can
probably
take the most common one and then guess
just start with a list of the most
common passwords that people use when
they have to put an uppercase letter or
a symbol and then a digit in it anyway
point is proper passwords probably
more important than that actually is
two-factor authentication this used to
be quite rare but it's pretty common now
as you know Columbia switched to
two-factor authentication this year
thank goodness as a journalist I'm gonna
say this many times as a journalist you
are a target okay that means you need to
be protecting yourself and one of the
very best ways to protect yourself is to
turn on two-factor authentication on all
your accounts including of course your
personal accounts right even if
there's nothing work-related in your
personal accounts if someone wants to
harass you getting into your email is a
great way so good passwords good
two-factor turn on two-factor you can do
this on all your social media accounts
basically everything with a password
these days supports two-factor
authentication
there is no excuse for not using it
there's no excuse for your colleagues
not using it you're running a
risk
the other problem you're going to have
is if you use the same password for
multiple sites then if one of them is
compromised they're all compromised I
know for a fact that one of my
passwords has been compromised because I'm
still getting notifications from you
know random sites I logged into five
years ago right just whatever site and
you know someone logged in is this you
at least use different passwords for
your email and other things because your
email is used to reset every other
password so if you're only gonna have
two passwords use one for email and
another for everything else but really
you should be using a password manager
you know 1Password or LastPass or
whatever it is that you want to use
because you want a different password on
every site so that when not if but when
one site is compromised you don't
compromise all the other sites all right
so that's login hygiene any
questions all right
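As a rough picture of what a password manager does when it gives you a different password for every site, here is a minimal sketch. The site names and the length of 20 are assumptions for illustration, and a real manager also encrypts and syncs the vault, which this does not attempt.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits + string.punctuation

def generate_password(length: int = 20) -> str:
    """Random password drawn from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

# Hypothetical sites: each one gets its own unrelated password.
sites = ["email", "bank", "social"]
vault = {site: generate_password() for site in sites}
```

Because the passwords are independent, a breach at one site reveals nothing about the others.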
phishing this is as I mentioned really
important thing number two how many of
you have heard of phishing show of
hands okay so most of you so phishing
cannot be protected against by technical
means because it is not a technical
attack it is a confidence game it's a
trick okay the reason phishing works is
because it tricks someone into
doing something they shouldn't
for example typing in their password
into a fake login page how can you tell
if a login page is real or fake
yeah look at the URL right the browser
designers have gone through an enormous
amount of trouble to make it hard to
fake a login page now I have a bunch of
examples of journalists being hacked by
phishing journalists and their sources
so I think we discussed this tweet
earlier this is a famous tweet which
briefly crashed the stock market this was
probably the what did they call
themselves it was the Syrian Electronic
Army anyway Syrian government
affiliated hackers they attacked a
bunch of news organizations around 2013
they defaced the front page of The New
York Times for a few hours they didn't
do any serious damage but they could have
potentially and this is we believe how
they got in so notice here that the
email came from a colleague it
appeared to come from someone else at AP
to someone at AP and how they did that
was they first got into one account so
this is a classic trick right you
compromise any account in the
organization so you send out this
message to all 3,000 employees of the AP
you get one of them who falls for it and
now you have access to their
account now who knows what their job is
right maybe there's nothing interesting
in their email but now you can send an
email from their account to a colleague
which makes it more likely that the
person receiving the email will fall for
it so you can go step by step through
increasing levels of trust this way
there are several ways this message is
suspicious how can you what about this
email could be suspicious
very general yeah and look at the
subject line news it's the dumbest
subject ever right there's nothing it
says nothing about what it actually is
right please read the following article
it's very important it's um yeah look so
there's another tell here as well which
is that if you hover over this URL
it doesn't go to the Washington Post all
right so you all know that a link text
and the link URL are two different parts
in the HTML so you can make a link that
looks like you can you just set the text
to one thing and set the link to another
thing all right so these
should be red flags right it should be
if it's very generic maybe you should
think a little more be a little more
suspicious and the easiest way to check
your suspicions is to see where the link
goes to you may remember the hacks of
the DNC in the 2016 election we believe
this is how they got into the DNC via
John Podesta and the interesting thing
about this sad case is that the security
person actually noticed it let me
actually go to the original reference
for this oh because this story is pretty
pretty fascinating
all right so somebody faked being Google
right it said please reset your password
and it had a change password link that
went to bitly so that should be a red
flag for you right because that should be
google.com and then Podesta was like
hmm this doesn't look right
and they sent it to the IT team and we
think there was a one word problem here
right this is a legitimate email as
opposed to an illegitimate email so this
is good advice definitely needs to
change the password immediately but he
clicked on the bitly link instead of
this one so there was some confusion and
I don't know if it's the IT person or if
they had meant to type illegitimate
but there was some confusion the IT
person correctly recognized that there
was a problem Podesta correctly
recognized that there was a problem but
then um the chain of communication back
to Podesta didn't ensure to tell him not
to click on the link and so he clicked
on the link and that is why we know John
Podesta's risotto recipe I don't know
if you saw that that was in the leaks so
it's a sad story because it almost
worked right more recently last year
Citizen Lab which is a research group at
the University of Toronto has been
tracking the use of a piece of phone
spyware called Pegasus Pegasus is a
pretty frightening piece of software it
is a commercial product sold by a
company called NSO security I think or
NSO group it's an Israeli company and
what they do is they sell you a kit to
infect phones how it works is you send a
text message to somebody with a link and
this works for both iPhone and Android and if you
click the link it eventually installs
the Pegasus
spyware this is a very sophisticated
piece of software it's very expensive to
do what they've been doing NSO group
yeah and yeah this is a slide which we
believe is from the Pegasus
documentation it was from a bunch of
emails from Minnesota
so they got hacked and this was in the
hacked file so this you know probably
authentic but we do know from analysis
of the spyware that it tries to drop
that it can do things like turn on the
microphone and camera of the phone so if
you click on this link then your phone
is spying on you and reporting
back so Citizen Lab has done this
very sophisticated analysis of how this
stuff works basically how you do this
type of analysis is you look for the
command and control servers when you
click that link that link has to go
somewhere so that's at least one server
that in this work is called the
front end server that
applies the exploit and downloads the
spyware and then it also needs
to report back that's called the backend
server
it appears that generally those are two
different machines so it has to download
the software from somewhere and it has
to send the exfiltrated data or the
audio stream to somewhere else so you
can look for these servers because they
have they have a particular signature on
the internet when you connect to them
this was used in a bunch of places the
reason we know about this spyware is
because enough journalists were aware of
this problem to send the suspicious link
so in particular
yeah it's the Mexican cases this is
super sad where is it Reckless IV yeah
part six right oh these are even more
geez anyway a bunch of so there
is a Mexican news organization that is
reporting on narco cartels oh my god
there's even more of them here
too here we go and interestingly across
a bunch of people in one case what they
did is there was a journalist who was
killed he was the publisher of a news
organization and then two days after his
death some of his colleagues were
targeted with Pegasus spyware so the
message that they received actually said
we have information on who killed your
colleague so this is not just phishing
it's targeted phishing what's sometimes
called spear phishing you craft a message
that's designed to be interesting to the
intended target
fortunately the targets in this case
forwarded the link to security
organizations who eventually forwarded
it to Citizen Lab and Citizen Lab was
able to do the testing to confirm that
it was the Pegasus spyware and and it's
great right because if you get one of
these links if you go to that link it
tries to download the spyware
so you can get a copy of the spyware
all right it really helps the
investigators fortunately things do seem
to be improving there have been lots of
trainings in this part of the world
to teach journalists to forward
suspicious links to organizations who
can help them interestingly some of this
stuff a lot of it doesn't seem to be
about journalism or corruption at all
some of it was targeting scientists
who are researching the health effects
of soda so the speculation is that
actually this was yeah they were looking
at this soda tax right the speculation
is that this was commercially motivated
the soft drink companies in Mexico
didn't want a tax on their product no
it's out of Israel they are very
sophisticated so they have multiple
iPhone exploits which when Citizen Lab
discovered them they forwarded to Apple
and Apple fixed them those are known as
zero days does anyone know why they're
called that so when you discover a
vulnerability that's day zero and then
you count the days until it's fixed
right because that's a window where
computers are vulnerable to an attack
now of course if someone else found it
first maybe that window is a lot longer
but anyway zero day has become a term
meaning an undisclosed exploit because
if it was disclosed then Apple and
Google would fix it so the
original version of Pegasus they found
had three zero day exploits for iPhones
and iPhones are actually quite secure
they're one of the most secure
computing platforms Apple's done a
really nice job of this so three iPhone
zero day exploits that can download
spyware by clicking on a link that's
actually very expensive it's expensive
not just in the sense of
difficult to find those
vulnerabilities but expensive in the
sense that if you have one you can sell
that on the black market for hundreds of
thousands of dollars and to some extent
that's good news because it means it
limits this type of attack because it is
expensive to do the black market for
these exploits also means that you don't
have to have a you know incredibly good
hacking staff to execute these attacks
against journalists you just have to
have enough money all right I don't know
what NSO Group charges but I bet it's
less than 100k to get a
license to their technology but
again if you know what phishing is you
can avoid this I'm not going to go
through all the stories there are a
lot of bad stories around this type
of stuff and the purpose of researching
this stuff and publishing this stuff is
to try to pressure NSO Group or the
Israeli government to stop selling to
regimes that abuse human rights right
this sort of pressure has worked
successfully with American and European
companies in the past so far after a
couple years of effort it doesn't seem
to have deterred NSO but we keep trying
here's another example of spear phishing
this seems to be calming down a
little bit but for a while the
Assad regime was very active in the
cyber area so again what you do is you
craft a message that
will appeal to your enemies so
this was a message designed to appeal to
anti-regime activists and it was a
pretty standard spear phishing you click
the link it was a fake Facebook login
page they got your Facebook login
credentials after that
they do social network analysis right
you can look at friends and trace all of
this this appears all over the world
here is a Chinese example sent to Hong
Kong news organizations one of my
Chinese colleagues very helpfully
translated this but you know it was a
message about the umbrella movement we
all know what that is Hong Kong
democracy movement circa 2014 so sent
to news organizations and universities
friends of mine have been targeted this
way but basically the solution is always
the same which is look at the URL most
browsers have by default on or you can
turn on a preview of what URL it's going
to go to so one of the things you can do
is just get into the habit of hovering
over the link for a second and reading
the URL it's easy to turn on if it's not
turned on by default and also to get
familiar with what should be a
suspicious email right so this is one
type of suspicious email it's very
generic here's another type of
suspicious email any email which is
trying to get you to do a password reset
is suspicious it could be real but could
also be fake and here's another type of
suspicious email an email which purports
to support your cause whatever that may
be right I have vital information about
this this thing that you care about but
I'm not going to tell you what the
information is in the email these are
all things that should be suspicious to
you and so then you can take a moment
and read the URL before you click
especially read the URL before you type
in a password you should get into the
habit of before you hit enter on the
password read the URL
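The habit of reading the URL can be partly mechanized. Since the link text and the link target are separate parts of the HTML, a short script can flag links whose visible text names one host while the href points at another. This is a minimal sketch using only the standard library. The sample email and its host names are made up, and the check is deliberately narrow.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkCollector(HTMLParser):
    """Collect (visible text, href) pairs for every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None

def host_of(text: str):
    """Lowercase host name if `text` looks like a URL or bare domain, else None."""
    candidate = text if "://" in text else "//" + text
    try:
        host = urlparse(candidate).hostname
    except ValueError:
        return None
    return host.lower() if host and "." in host else None

def suspicious_links(html: str):
    """Links whose visible text shows one host but whose target is another."""
    parser = LinkCollector()
    parser.feed(html)
    flagged = []
    for text, href in parser.links:
        shown, actual = host_of(text), host_of(href)
        if shown and actual and shown != actual:
            flagged.append((text, href))
    return flagged

SAMPLE_EMAIL = (
    '<p>Please read the following article, it is very important: '
    '<a href="http://login.example.test/wp">www.washingtonpost.com/article</a></p>'
)
print(suspicious_links(SAMPLE_EMAIL))
# [('www.washingtonpost.com/article', 'http://login.example.test/wp')]
```

This only catches the case where the link text is itself a URL. It says nothing about link shorteners, lookalike domains, or links whose text is just "click here", so it supplements reading the URL yourself rather than replacing it.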
okay that's phishing any thoughts or
questions
all right
next up is the physical security of
devices right so we've talked a lot
about logins and communications and by
the way phishing can come through every
medium not just email right it could
come through Facebook for a while there
was a phishing scam on Twitter that
spread through direct
messages and you just got a direct
message from a friend that says have you
seen this photograph of you and a link
and the link went to a fake
Twitter login page and if you gave your
password then they used that password to
actually spread all through the Twitter
network but you can also have situations
where the physical device falls into
your adversary's hands right so a laptop
in this case you know this is
heartbreaking everybody that he had
talked to had to run because of this
footage that was on this laptop which
might have been possible to avoid if the
laptop was encrypted
so think about storage right laptops
phones USB sticks memory cards external
hard drives anywhere you've got a copy
of something fortunately we now live in
an age where all devices have operating
system-level
encryption you need to turn that on it's
on by default on Android now it didn't
used to be but it is now you can
turn it on when you do a Mac install you
can turn it on when you do an OS
install you really should be running
with that on there again it's one of
these things where there's just no
excuse not to do it there's no
inconvenience to you
okay so from basic security practice
we're gonna take a little arc away from
that and talk about the state of
surveillance we have reached a point in
history where there is mass monitoring
of the citizens of most countries by
both state and private interests that
has some very shocking implications I
think it's gonna be a long time before
we figure out what that really means for
civilization but before we talk about
unauthorized large-scale collection of
information everybody should take an
hour to use an incognito browser and see
what they can find out about themselves
through public information and what you
may find is that your privacy settings
are not what you thought they were on
Facebook or maybe you had different
expectations about what those privacy
settings would do so this is something I
recommend every journalist do is just
look yourself up background yourself
imagine that you are researching someone
else and apply it to yourself and see
what you can find out publicly because
if you expose things publicly then you
know nothing else really matters
we have long had a record of every phone
call made this indeed was the initial
revelation from the Snowden documents
was that all of the domestic phone
records were being sent to the NSA in
this country this does mean that there
is a record anytime you call a source or
send a text message and those records
are potentially subpoenable in this
country we have a certain amount of
legal process about obtaining those
records in other countries not so much
the government can just ask for them
there are several cases where this has
gotten a journalist or source into
trouble so basically what they found is
that this source who was the source for
a story about a classified operation in
Yemen had called the AP reporter so they
actually looked at the source's
phone calls they also subpoenaed all of
the phone records of the AP right which
caused a big stink at the time and led
to a bunch of Department of Justice
rules there's this long document that
the Obama Department of Justice issued
that said that you shouldn't subpoena
journalists' phone records except in
extreme cases and only with the approval
of the Attorney General who knows what
the current Justice Department's
position on this is so this is the
weakness of a norm versus a law right
one of the questions is is there a
gentleman's agreement protecting
journalists' phone records or is there
actually a law which states what they
have to do to get the records there have
been several failed attempts to pass
so-called shield laws in this country
there are states and countries where
shield laws exist
the laws basically say something like
you know a journalist can't be compelled
by law to reveal their source and we've
had cases where journalists you know
refused to give up their sources and
eventually have to be compelled by law
to do it but this game is changing right
if you want to get privately leaked
material or to keep an anonymous source
this is much harder than it used to be
because there are records there are
records held by the phone company
there are records held by whatever
platform you used to have the
conversation so just looking only at the
phone company I want to show you a piece
that was done in 2012 which
unfortunately in the way of things
seems to have undergone some bit rot so
this map now says for development
purposes only but what this is is this
was a politician who requested from the
phone company all of the records on
where they've been and how the phone
company knows where you've been is
because your phone has to connect to at
least one cell tower and it can only
connect from at most a few kilometers
away and also each tower only covers a
particular direction so what you're
looking at here is which towers the
phone was in range of and the reception
direction of this tower so you can see
him traveling around campaigning and
then it's hooked into there it's all
this is all correlated with their
Twitter account so we have their phone
records and you have their Twitter
messages let's see here the Twitter
stuff appears here you can get a pretty
good sense of what that person was doing
in any given moment so all of us
are carrying tracking devices and even
if there's no spyware on the phone even
if you're not using any apps that
transmit your location which by the way
if you're using Google Maps obviously it
transmits your location right even if
you turn on all the privacy settings for
Google if you look up if you open the
app it has to load the data near
where you are right so Google knows
where you are every time you open the
app but I mean even if you're not doing
anything like that just from the fact of
the phone connecting to the network your
location is being revealed in fact you
get a distance from the tower too because
the transmission protocols the 4G
protocols have to compensate for the
round-trip light time from the phone to
the tower
so they actually shift the phase of the
signal to synchronize with the tower so
you can actually
triangulate the location if you have
multiple towers because you have a
distance from each one this is used by
emergency services by the way it's
actually a feature of the network to be
able to identify someone's location
within a few tens of meters and you use
this for you know finding people in
earthquake rubble and that sort of thing
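To make the geometry concrete, here is a rough sketch of going from round-trip time to distance, and from three tower distances to a position. Locating by distances is usually called trilateration rather than triangulation, which uses angles. Everything here is idealized by assumption: flat 2D coordinates in meters, exact distances, made-up tower positions. Real networks work with quantized, noisy timing measurements.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def distance_from_round_trip(seconds: float) -> float:
    """The signal covers the phone-to-tower distance twice in one round trip."""
    return SPEED_OF_LIGHT * seconds / 2

def trilaterate(towers, distances):
    """Position (x, y) from three tower positions and the distance to each.

    Subtracting the first circle equation from the other two leaves two
    linear equations in x and y, solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = towers
    r1, r2, r3 = distances
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-9:
        raise ValueError("towers are collinear, position is ambiguous")
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

# Hypothetical towers and a phone at (1000, 1000), all in meters.
towers = [(0.0, 0.0), (3000.0, 0.0), (0.0, 4000.0)]
phone = (1000.0, 1000.0)
distances = [math.hypot(phone[0] - tx, phone[1] - ty) for tx, ty in towers]
estimate = trilaterate(towers, distances)
```

With exact distances the estimate lands back on the phone's position. With real measurements each distance carries error, so the circles do not meet at a single point and a best-fit position is used instead.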
so it's actually required by law for the
phone company to be able to figure out
where you are so add to phone company
usage all of the apps that you use which
might send location data all of the
websites you go to which track your IP
address there is data around
which tracks you pretty closely and then
there's all the data you voluntarily
give up right every time you press a
submit button on the Internet
this was from a wonderful report from
the Federal Trade Commission it's a
little old right now but this diagram is
still the best that I've found talking
about the information ecosystem and this
is just the inset and what they've done
here
they've listed concentric rings
unfortunately cut off the legend here
but the colors are different types of
information so oh yeah here you go so
medical information is pink here and you
can see how that will spread to
healthcare analytics companies and
various sorts of affiliates who does the
billing for your hospital they have a
copy of that data if you read very
carefully the privacy policies it will
say that they can share it with people
who process the data and that is health
data which is some of the most regulated
data in the US it's really the only data
around which we have strong privacy law
the privacy law is called HIPAA and
there's all kinds of rules
about sharing health information but for
you know what you buy or what you
post on social media there's much less
regulation and it's really this big gray
area and then of course the social media
platforms may be revealing information
about you through their API so for a few
years if you logged in anywhere with
Facebook that site got not just your
personal information but your
friends list which is huge right contact
information is some of the most
revealing information and also the likes
of your friends and the likes of
your friends that was the information
that Cambridge Analytica used to build
their micro targeting models now I
should be clear the question of whether
or not political micro targeting
actually works is a different question
that is a much more complicated subject
my general sense of that is that the
answer is kinda sorta but not really I'm
gonna actually put a link in the slack
this is an excellent summary of
an excellent sort of skeptical account
of the idea of using big data to
influence populations clearly it works
to some extent but not you can't just
control people's minds because you know
what they liked on Facebook so let me
post that in the class slack
but anyway we have to assume that more
and more of this data is going to be
available and then there's just
straight-up government mass surveillance
so there are various vendors that build
hardware devices whose only purpose is
to monitor a network what I love about
this particular sone unit speed is just
one of them what I love about this
particular piece of advertising copy
is where regulatory data retention and
lawful interception does not suffice for
provision for accuracy so I think what
they're saying is when you have to break
the law to get the information you want
I'm not quite sure how they're spinning
that but when I read that I mean
if lawful interception isn't
enough does that leave unlawful
interception anyway the hubris
of these companies is stunning to me be
that as it may we long suspected and
since 2013 have become aware that there
is massive government surveillance of
Internet use so now this is old this is
from I think it might have been like
2009 or 2010 this internal NSA slide
deck this was among the documents
revealed by Snowden and he revealed this
network called XKeyscore I have no idea
if XKeyscore still exists or still goes
by that name but certainly something
like this exists and what this is is
a network of IP traffic
recording stations run by the US and
allies the so-called Five Eyes so that's
US UK Australia Canada New Zealand right
there's a major intelligence cooperation
agreement that has existed
since World War two Commonwealth
countries basically to help each other
monitor electronic communications and
now what that means is installing these
network recording boxes or network
monitoring boxes at ISPs the world over
so you know domestically you put this at
the you know AT&T's San Francisco office
in fact there was a court case about
that before Snowden where an AT&T employee
knew of the government coming in and
installing these boxes and filed suit on
the grounds that this was warrantless
surveillance so we'll get to the Fourth
Amendment issues of this in a minute
this can also be recording all of the
cell traffic from the roof of the
embassy it can be putting a monitoring
device where an undersea cable lands
where it first comes ashore you know
anywhere you can get a good access to a
big fat Network pipe you can put a
monitoring device and it's actually
quite clever so at the time they stored
three days full take for this network
because the UK has weaker intelligence
surveillance laws basically American
intelligence is not allowed to monitor
American citizens but UK intelligence is
so they had at the time of the Snowden
documents a 30-day rolling buffer so
imagine that right you just store about
30 days of all Internet traffic or
at least all Internet traffic that could
be communications right all of your
messages and yeah it's encrypted but you
have all the metadata and also not all
of it's encrypted especially at that time
so you have a full take system and then
you have a what they call a selector
system where it also
stores for longer any time you match an
IP address login name instant messaging
handle email address or match a keyword
right so strong selection by selector
soft selection by content so it's this
very sophisticated system and it's
federated so rather than trying to send
all of this information back to the NSA
headquarters in Maryland what you do is
you store it all locally so you don't
have to send because otherwise you'd
have to mirror all this traffic in real
time right you store it locally and then
you can do a federated search so you
send out one search query to all of
these stations around the world and it
all comes back so that's the state of
the Five Eyes system so here you can
see the Five Eyes countries I am
sure that China has an equivalent system
definitely domestically but I
would be shocked if they didn't have an
international monitoring system of a
similar type and of course many other
regimes so we imagine that most of the
Middle Eastern countries are running
similar things domestically so that is
the environment in which the journalist
is operating in where every
communication with every source or
colleague is recorded somewhere so then
it becomes a question of who can get it
and under what conditions oh yeah here's
more xkeyscore stuff right so basically
what it records it organizes things by
session which is basically just a
sequence of like one TCP connection and
indexes it by this metadata right so
even if it doesn't have the content of
an instant message so even if you're
using WhatsApp for example Facebook
doesn't know what's in the
messages because they're end-to-end
encrypted but Facebook knows who you
talked to and the Five Eyes countries may not
know who you talk to because they only
see that you connected to the Facebook
server but they know that you are
sending a message at that time all right
so you can kind of reconstruct this
stuff and then if you have all of the
traffic going in and out of Facebook
because you're monitoring the network
you can do correlations where you can
say ah whenever this person sends a
message that person receives a message
so I know these two people are talking
and I know these are the messages they
send you can do all of this crazy traffic
analysis stuff if you have
people's locations you can look at who
travels together so that is another
NSA program which is or was called
CO-TRAVELER there's a whole bunch of
articles on doing machine learning
on this data they wanted to do
terrorist detection
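The correlation idea described above (whenever this person sends a message, that person receives one) can be sketched in a few lines. This is only a toy illustration, not how any real system works: the names, timestamps, and the two-second window are all made up, and the observer is assumed to see nothing but who sent or received an encrypted message and when.

```python
from collections import Counter
from itertools import product


def correlate(send_events, receive_events, window=2.0):
    """Count how often a receive follows a send within `window` seconds.

    Each event is a (person, timestamp) pair. The observer never sees
    message content, only the timing metadata.
    """
    counts = Counter()
    for (sender, t_send), (receiver, t_recv) in product(send_events, receive_events):
        if sender != receiver and 0 <= t_recv - t_send <= window:
            counts[(sender, receiver)] += 1
    return counts


# Hypothetical observations taken at the edge of a messaging service.
sends = [("alice", 10.0), ("alice", 55.0), ("alice", 120.0), ("carol", 55.2)]
receives = [("bob", 10.8), ("bob", 55.9), ("bob", 121.1), ("dave", 300.0)]

pairs = correlate(sends, receives)
likely_pair, hits = pairs.most_common(1)[0]
print(likely_pair, hits)
```

One coincidence (carol and bob) shows up once, but the repeated match between alice and bob stands out, which is why more observations make this kind of analysis more reliable.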
right so this is the like AI is
directing drone strikes sort of
sensationalism this is a much more again
skeptical article about it which talks
about yeah it's nice we have slides on
the classifier accuracy this article
talks about you know how they are
actually using this data and what the
system actually looks like
and so forth
this is a really excellent article on
ML for terrorism detection that you can
take a look at the short version is that
while it is certainly worrisome the idea
of using classifiers to direct drone
strikes it seems unlikely that there's
any sort of automated link between the
classifiers and the strikes right
it's not clear how they're using
the output of those classifiers and
there's people intelligence
analysts in the middle of that so it's
not quite so clear but it is worrisome
because one of the first things that
came out of this system that is doing
machine learning was they identified a
journalist who is reporting on terrorism
there you go bureau chief for Al Jazeera in
Islamabad so I mean I guess the
algorithm worked right you did find
someone who talks to terrorists I mean
that's their job but it worked
so anyway there is a whole thicket of
issues here which are very complicated
and we're not gonna get super into them
except again to say that you know you
are being watched
and then there's network censorship so
we've talked about surveillance now
censorship is you just can't reach
things
there are various ongoing research projects such
as the OpenNet Initiative that
monitor what's reachable from where
obviously China is the pioneer in
this the so-called Great Firewall
blocks a lot of sites and it's a
political tool right they don't want
citizens seeing certain types of media
from the rest of the world and it's also
used to punish news organizations you
know at various points the New York
Times has been banned it's still currently
banned what am I thinking the English
one okay and then for a while Reuters
was blocked after they did was it
Reuters or Bloomberg I think it was
Bloomberg Bloomberg did a big series on
the wealth of Chinese
politicians they were blocked for a
while of course to really understand
this the part of the context you need to
understand is that the censorship is not
just external it's also internal so
there's very strong rules about what
people are allowed to post on Chinese
social networking sites and there's also
automated filters there's a changing
list of keywords that you can't use on
WeChat so interestingly actually the
keyword filters only apply when you have
more than two people in the chat so what
they're interested in is suppressing
group chat about certain types of
conversations your message simply won't
appear if it contains certain words
there's no public list of those words and they
change frequently depending on the
political situation it's all terribly
murky
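As a rough sketch of the behavior just described (keyword filtering that only kicks in for group chats), here is a toy filter. It is not WeChat's actual implementation, which is not public: the keyword set is a hypothetical placeholder and the more-than-two-participants rule is simply taken from the description above.

```python
# Hypothetical placeholder terms; the real list is unpublished and changes often.
BLOCKED_KEYWORDS = {"examplebannedterm", "anotherbannedterm"}


def is_delivered(message: str, participant_count: int) -> bool:
    """Return True if the message would appear in the chat.

    One-to-one chats (two participants) are not filtered in this sketch;
    group chats silently drop any message containing a blocked keyword.
    """
    if participant_count <= 2:
        return True
    text = message.lower()
    return not any(keyword in text for keyword in BLOCKED_KEYWORDS)


print(is_delivered("did you see ExampleBannedTerm today", participant_count=2))
print(is_delivered("did you see ExampleBannedTerm today", participant_count=5))
```

The sender gets no error in either case, which matches the point that the message simply does not appear.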
or metaphors right like people use so
you get a lot of wordplay to try to get
around the filters yeah it is a really
complicated and interesting topic the
the Chinese information control regime
is the most sophisticated that humanity
has ever produced as a scholar I'm just
awed like I'm so impressed as a
journalist I'm really disturbed because
you know it goes against my American
values of freedom of speech right I
think you should be able to say these
things of course every country is now
grappling with what you should be
allowed to say online right and I
mean that's something that's very
important to understand about the
Chinese regime as problematic as it is
it also serves legitimate functions
you know a lot of what they would
suppress we would recognize as
commercial scams disinformation false
news this type of stuff right so it
turns out that every country is having
to deal with the question of what types
of information should be allowed to
spread through the network we do see a
variety of approaches which is not to
say that suppressing political speech
is a legitimate thing to do in an open
society but anyway moving on from that
briefly we should talk about SSL SSL
you know gives you a little lock icon in
the browser it is a fascinating protocol
it has a key vulnerability which is that
the only way that you know that you are
connecting to the site that you think
you are is because it has a certificate
that is signed by a third-party
certificate authority there are hundreds
of third-party certificate authorities
out there and the browser
companies put their certificates in
the browser right so that's how all of
this works is that Google ships chrome
so that chrome has the Verisign public
key built into the browser so that when
you go to a site whose SSL certificate
is signed by Verisign the browser is
able to recognize that signature which
works great until a government takes
over a certificate authority and so the
key thing to understand about SSL from a
security perspective is that the problem
that it solves is not really encryption
yes there is encryption and encryption
is important but encryption is the easy
part the hard part is identity okay you
can always send encrypted messages to
someone else you can make a
connection to someone you can do
public key exchange you can send
encrypted data I mean public key
cryptography is kind of magic but it
doesn't actually solve the problem you
want it to solve and the reason is this
so think about how the internet operates
right person a wants to send something
to person B and you know we sort of
imagine that we get this connection like
this because that's what a TCP
connection is it gives us the illusion
of having a private connection but of
course the connection is not private it
goes through any number of intermediate
network nodes right this is how the
internet works that's why it's called
the Internet right it is a network of
networks inter yeah right so it goes
through all these places and so now you
connect to B and
you do public key exchange with B you
don't know if you've actually exchanged
the key with B or you've exchanged
the key with M the middleman maybe
you did a public key exchange with M
and they did their own public key
exchange with B and so now they are
reading all of your decrypted traffic
and forwarding it on invisibly okay that is
the problem that SSL solves the problem
that SSL solves is when you say hey let's
do a public key exchange is to know that you
are talking directly to B and the way
you do that is you verify B's
certificate B says you know I am
foo.com and this certificate says
foo.com and you're like yep
you have the foo.com certificate and I
trust Verisign right this is the
certificate authority here the
certificate authority puts their stamp
of approval on this certificate and you
are trusting that the certificate
authority is only going to issue one
certificate to foo.com and that once
foo.com has that certificate no one
else can get a certificate if M tries to
get a certificate and says I'm foo.com
the certificate authority says no you're
not I'm not going to give you a
certificate that's how HTTPS works so
this all works great until you have a
compromised certificate authority that
can issue fake certificates for the man
in the middle the reason I'm showing you
all of this is because it actually
happens
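To make the middleman problem concrete, here is a toy sketch. The lecture just says public key exchange; this example swaps in Diffie-Hellman as one concrete instance, with deliberately tiny and insecure parameters, an XOR stand-in for real encryption, and a plain hash fingerprint standing in for what a CA-signed certificate would vouch for. All of those choices are assumptions for illustration only.

```python
import hashlib
import secrets

P = 2**61 - 1   # a known Mersenne prime; toy-sized, not secure
G = 3           # assumed base, chosen only for illustration


def make_keypair():
    private = secrets.randbelow(P - 3) + 2      # value in [2, P - 2]
    return private, pow(G, private, P)


def derive_key(my_private, their_public):
    shared = pow(their_public, my_private, P)
    return hashlib.sha256(str(shared).encode()).digest()


def toy_cipher(key, data):
    """XOR with a repeating key: its own inverse, and NOT real encryption."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def fingerprint(public):
    return hashlib.sha256(str(public).encode()).hexdigest()


a_priv, a_pub = make_keypair()   # person A
b_priv, b_pub = make_keypair()   # person B
m_priv, m_pub = make_keypair()   # M, sitting on the network path

# M replaces the public key in both directions, so each side
# unknowingly completes a key exchange with M instead of the other.
key_a = derive_key(a_priv, m_pub)      # A believes this is shared with B
key_b = derive_key(b_priv, m_pub)      # B believes this is shared with A
key_m_a = derive_key(m_priv, a_pub)    # M's key for the A side
key_m_b = derive_key(m_priv, b_pub)    # M's key for the B side

message = b"meet the source at noon"
on_wire_from_a = toy_cipher(key_a, message)
intercepted = toy_cipher(key_m_a, on_wire_from_a)    # M decrypts and reads
on_wire_to_b = toy_cipher(key_m_b, intercepted)      # M re-encrypts for B
received = toy_cipher(key_b, on_wire_to_b)

# What a certificate adds: a trusted statement of B's real public key.
certified_fingerprint = fingerprint(b_pub)
mitm_detected = fingerprint(m_pub) != certified_fingerprint
print(intercepted, received, mitm_detected)
```

Everything is encrypted on the wire, B receives the right message, and M still reads it. Only the identity check at the end catches the swap, and that check is exactly what a compromised certificate authority defeats.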
so in particular this happened and when
was this 2012 I think Iran was man in
the middling all connections to Facebook
so how they actually pulled this off
there's a variety of ways you can do
this
you can do it with a phishing attack
where a link redirects you to a
fake Facebook login page and then you
go through that or you can just look for
you know if you control the network you
control the connections out of that
country you look for people trying to
connect to Facebook you connect them to your
own server instead and if you can
convince the browser that you have a
valid certificate for facebook.com then
you'll get the little lock it'll look
like you have a secure connection but
you're actually being man-in-the-middled
the man in the middle attack will come
up again it explains a number of
features of how secure communications
programs work yeah
yeah I mean it's you can't really the
problem is you have to control it
internationally and you're looking for
two different things right
there's two different ways that a CA can
be compromised one is just the state
owns it right so I don't think there's
an Iranian CA but you know there's
probably a CA in the Middle East
somewhere that's friendly to Iran and
will issue a certificate I mean you have
to assume that in fact we know for a
fact that all of the major intelligence
agencies can generate
certificates for most sites right so now
to man in the middle you still have
to physically be on the network path
right that's the complicated part you
have to not just have the fake cert you
have to have the traffic going through
you and there's various ways to
accomplish that so it's not super easy
so one is you can just have a friendly
CA the other is you could get the CA
hacked and this has happened too you
break into the company that issues the
certificates and you steal their keys
and now you can issue your own
certificate so both of these things have
happened
and it's a hard problem to secure
the global network and this is the
the lesson of cryptography over and over
the hard part is not the encryption the
hard part is key management that's the
tricky bit okay so from from there we're
gonna pivot a little bit and talk about
the law so all of this data is being
collected right there's not a lot we can
do about that we'll talk a little bit
about some mitigation strategies but
basically everything you're doing is
being recorded somewhere by someone so
then the question is who gets it who can
access that information and this is
where law becomes very interesting I'm
gonna talk only about the American
situation America has relatively
strong laws around this type of access
well strong in some ways for personal
data collected by companies the EU is
stronger of course but we're just going
to talk about the American landscape so
there's good news and there's bad news
the good news is there is law in the US
case law specifically protecting
journalists work product so that's a
legal term which means all of your notes
all of your files all of your raw
footage interview transcripts all of it
okay the bad news is that's only if
you're storing it in other words it
prevents someone it prevents the police
from coming into your newsroom and
seizing your laptop or asking to copy
your files it doesn't prevent them from
going to Amazon and asking to copy your
files
there's old case law here called the
third party doctrine and basically what
that says is if someone else is storing
the data if you voluntarily gave data to
someone else you do not have an
expectation of privacy under the Fourth
Amendment so this is a problem it is a
major legal loophole from the modern era
what companies will do if they want to
put sensitive stuff in the cloud is they
will sign a service agreement with
Amazon or Google or whoever and
basically that's a contract that says
you know Google can't tell law
enforcement to go away they have to
comply with the law but they can promise
two things they can promise that you
will know if someone comes asking for
your data and they can promise that they
will put their lawyer in front of it
first right so they say okay we'll tell
you if we have a law enforcement request
for your data and our lawyer will review
the request to make sure that it
complies with the legal standards before
we do anything so that gives you a
chance to put your lawyer in front of
the process
this is the actual third party doctrine
this is the money quote from
this 1979 Supreme Court case and so what
this case was about was when someone
dialed a number on a phone
whether law
enforcement needs a warrant to get the
number they dialed from the phone
company that's what this case was
about it was a case involving a
criminal investigation and in 1979 the
court said no you don't need a warrant
to get the number from the
phone company because if you give
information to a third party you can't
expect it to be private anymore so in
1979 this kind of made sense this
makes less sense now and in fact the
world is changing just this year there
was a court case at the Supreme Court
again called Carpenter versus United
States which said finally this had
happened at the various lower
courts but it finally set at the Supreme
Court level the precedent that you can't
just ask the phone company to turn over
somebody's location now in this case the
police were
asking for a week's worth of
location data so what had happened is
they had gone and gotten this person's
location for the last seven days to see
if they could place them at the scene of
various robberies and this was entered
as evidence in the criminal conviction
of this person the case went to the
Supreme Court and the the appeal said
the government can't get his location
without a warrant they didn't get a
warrant they can't use this information
in the trial and the Supreme Court
agreed now the ruling was pretty narrow
we don't know if the Supreme Court will
allow you to get location data without a
warrant for less than seven days right
so if you only ask for one day or
something but I think that this is
definitely going to make police
departments pause and
it seems very likely that they will want
to get a warrant before they get phone
location data just out of prudence right
they don't want the case to fail because
they got the information without that
warrant in general in the last 10 years
we're slowly getting a series of rulings
protecting our digital privacy so for
example if you are arrested they
can't just grab all the data on your
phone they need a warrant to do that now
of course if they arrested you they
probably have probable cause to search
your phone but they can't just
arbitrarily search people's phones I don't
think the Email Privacy Act ever passed
but this is another legal loophole the
law governing whether you need a warrant
to read someone's email is from 1986 and
basically it it says that you know if
you've stored data longer than 180 days
it's abandoned right 1986 things are a
little different and so you can get
abandoned data without a warrant anyway
this is how things are evolving it's
evolving rapidly that is that is the
sort of a brief overview of whether the
American government can get the data
that you and your source generate there
is also the question of private actors
and this is probably much more
troublesome in most cases oh yeah
we're starting to see actually what law
enforcement requests to phone companies look like
and in 2011 it was 1.3 million so we
can only assume it's more oh and the
major platforms now do transparency
reports so how often did they respond to
data requests and after Snowden they
added this column
which says how many people they gave up
data on
before they just had the number of user
data requests but nobody knew if the NSA
could send one request that says give me
all your user information just give
me all your user information for 2018
just go but assuming that these
companies are telling the truth
we now know that there does not
appear to be in this country mass
ingestion of user data from the social
networking platforms now that may not be
true in China I would be very surprised
if there wasn't a lot more information
sharing in other countries but that
seems to be the case in this country and
here's the Facebook version right same
thing and I mean you know we're looking
at all this from the point of view of
journalists and sources and trying to
protect our work and so we imagine that
we are the good guys in this and we you
know want to assert our rights to
privacy and to our sources privacy which
is all true but I mean it's important to
remember that most of these requests are
criminal investigations right you
know do you want to make it impossible
to look into someone's Facebook account
if there's a missing child right most of
this stuff is what you might call
routine law enforcement requests and how
this works is they actually have a
lawyer look at every single request and
decide whether they're going to give up
the data and you can see here that they
say yes about 80% of the time okay
so who has a thought about all of that
it says someday to produce something
courage to know exactly yeah right so
right they might give some
information but not all of it yeah good
question
generally they'll say no if the request
is not legal that's I mean that's the
only way they can say no even so there
were attempts by the NSA to go in the
back door so one of the things
that came out of the Snowden leaks is
that there was NSA monitoring of so
Google leased a data connection between
one of their UK and US data
centers so right this is just their
private it's not the Internet right it's
a private network to move data around
for backups and replication and so forth
and that connection was not encrypted
because it was a private connection and
that was being monitored so it is always
possible and indeed there is a history
of intelligence agencies trying to go in
through the backdoor as opposed to going
through the front door of legal requests
apparently there was swearing at Google
when those stories came out and at this
point Google and I would imagine most
major cloud service providers now
encrypt the data that gets sent over
their leased lines their private
networks
okay so I've talked about all of the
basics of digital security for
journalism I've talked about the state
of mass surveillance what actually is
being collected which is more or less
everything is being collected by someone
and I've talked about some of the
American law about who gets access to
this data all right so this is all sort
of setting the stage this is background
what I want to teach you now because you
are advanced students is I want to teach
you how to do security planning digital
security planning for a particular story
in context and the method that I'm going
to teach you is a classic computer
security method it's called threat
modeling and here's the idea you want to
try to imagine what are the bad things
that are going to happen and how do we
protect against them so usually what
you're trying to do is keep secrets in
some way now backing up slightly the
higher-level goals are to protect
sources protect colleagues protect your
operations protect the reputation of
yourself and your news organization
normally that involves keeping things
secret so you have to figure out what
you want to keep secret and then also
who are you keeping it secret from you
know we've talked a lot about keeping it
secret from state actors but the the
state is not always the threat often the
subject of your story is the threat or
for example it could be drug cartels
that's a very real threat with very real
consequences and then you have to ask
how can they get this information you
want to keep secret and I think people
with computer science training tend to
imagine it's all about hacking and
indeed there's some hacking that goes on
but it could be easier to
you file a lawsuit or it could be easier
to just get someone in your news
organization drunk or it could be easier
to steal a laptop okay so you
got to think from their point of view
not from the like leet haxor type of
threat and then it's important to
consider the consequences as well and
the reason I say that is you have to
understand the risks that you are taking
and the sources have to understand the
risks that they are taking journalism
ethics demands informed consent for
risks that the sources you are talking
to are taking right you are asking
somebody to put something on the line
for you obviously you try your best to
protect them you can't necessarily
guarantee that so first of all there's a
question of what's private obviously
you're probably going to think of your
emails and text messages but you may not
think of your address book now we've
studied social network analysis prior we
know that just the network topology
without any of the data of who's saying
what to whom or what they're doing is
extremely valuable information so let me
ask you this how many copies of your
address book are there and where are
they and who has them so let's try it so
where is your address book
Sonne all right so that's the obvious
one okay
where else yep Google has all of your
contacts they have their own like
address application everyone you email
some of the people you call people
you have messages with people you have
video conferences with what else yep
it's probably synced to your phone yep
huh
iCloud yep so that's Apple Apple has a
copy of it maybe within Gmail yeah we've
got Google already yeah okay so rather
than listing all of these we're just
gonna say snapchat Instagram etc
whatsapp WeChat yeah
all my contacts right so apps oh my god
every app that you've ever said access
my contacts right so you know some of
those are little startups they're not
really thinking about privacy right so
this is this is the issue with granting
people permission to see your address
book is you don't know how well they're
gonna protect it right if I want your
address book I don't have to hack you I
just have to hack one of the shitty
little apps that you've installed right
and so you have to think about the
security of every single one of these
places the other big one that no one's
mentioned yet is backups do you have a
backup of your drive somewhere do you
have a backup of your phone where is it
is it encrypted all right did we miss
anything I bet there's a big one we're
not thinking of probably good for now so
you can do this exercise for any of your
data and you should write where where is
this sensitive stuff stored you probably
have ten copies of it you know how
classified documents were handled in the
paper era is you had every copy of a
classified document had a number you
know copy three of seven and there was a
log book in the central files that
listed who had every single copy and if
you wanted to take that copy to your
office and read it you had to sign it
out so if you want to track sensitive
data you have to do the same thing you
have to keep track of where that data is
and that has become much harder because
we have all these cloud services this is
also the moment where we introduce the
difference between privacy versus
anonymity so what's the difference
between an encrypted message and an
anonymous message
okay right exactly
so anonymity is privacy of identity so
we'll get to that that's quite hard to
do the next question you have to ask
after the what data question is the who
wants to know question in security
language we call this the adversary
right and I think people are now sort of
getting used to the idea that it's not
always the government who's the problem
for a long time this is the only thing
that security people thought about and
you know indeed we've seen cases where
it is a problem right so this was an AP
source busted by law enforcement getting
phone records right so it happens but in
many cases it is not law enforcement
that is the issue so who else could it
be if it's not law enforcement what are
some examples of threats adversaries
your enemies okay so who are
journalism's enemies I said that
journalists are targets who are
journalists targeted by sorry what
terrorist groups yeah depending where
you're operating yeah they don't like
you potentially I mean most of the time
these guys play relatively clean but not
always yeah
there could be particular groups that
don't don't like what you're doing right
so the subject of your story or interest
groups around this story you're saying a
lot of mean things about banks right
we're now seeing sort of organized
political harassment doxxing and so forth
all right remember they don't have to
ruin your story they just have to make
your life hard although they might ruin
the story
could be a lot of people and then the
question you have to ask is okay so
what can they do to find this
stuff out or harass you or slow you down
or you know threaten you and these are
sort of the categories that I've divided
it into we've talked about technical
problems in a bunch of ways we've talked
about legal problems we haven't talked
about social very much notice I put
phishing under social and that's because
phishing is not a technical exploit okay
it's a social exploit it relies on
tricking somebody any time you're
relying on fooling someone or exploiting
trust there is likely not a technical
solution to the problem right that is a
human problem not a technical problem
we've talked about physical briefly you
know if someone can get into your server
room and connect a keyboard to your
server
I mean game over right you can do
anything you want at that point so you
got to lock that door
you know when the New York Times was
reporting on the Snowden material they
kept it on an air-gapped machine in a
locked room in the New York Times
building with guards outside that's the
only way to do it right and that is the
most sensitive material that journalists
have ever worked with
not only this government but most
governments wanted a copy of that right
like if you know if you're an enemy of
the United States or even if you're just
a foreign intelligence agency you want
those documents and believe me most
of them were never released right
the journalists working on
that stuff made a lot of redactions and
a lot of very careful judgment calls
about what was released publicly
of course the you know people debate
hotly whether they made the right calls
some people think they released too much
some people say it's too little but the
point is they had to keep it secret and
so they didn't use any cloud storage
and they had it physically secured and
they had it on premises and doing it on
premises gives them additional legal
protection because it's hard to have
police busting down the door of a news
organization there's laws against that
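The threat modeling questions so far (what do you want to keep secret, where are all the copies, who wants it, how could they get it, and what happens if they do) can be written down as a small data structure. This is only one possible sketch, not a standard format: the class names, fields, and the example entries are all made up for illustration, and the channel categories are the ones used in this lecture.

```python
from dataclasses import dataclass, field

# Categories used in this lecture for how an adversary might get at a secret.
CHANNELS = ("technical", "legal", "social", "physical", "operational")


@dataclass
class Copy:
    """One place where a piece of sensitive data lives."""
    location: str
    held_by: str
    encrypted: bool


@dataclass
class Threat:
    adversary: str      # who wants to know
    channel: str        # one of CHANNELS
    method: str         # how they could get it
    consequence: str    # what happens if they succeed

    def __post_init__(self):
        if self.channel not in CHANNELS:
            raise ValueError(f"unknown channel: {self.channel!r}")


@dataclass
class Asset:
    name: str
    copies: list = field(default_factory=list)
    threats: list = field(default_factory=list)

    def unencrypted_copies(self):
        return [c for c in self.copies if not c.encrypted]

    def channels_not_considered(self):
        covered = {t.channel for t in self.threats}
        return [ch for ch in CHANNELS if ch not in covered]


# Hypothetical worked example based on the address book exercise.
address_book = Asset(
    name="address book",
    copies=[
        Copy("phone", "me", encrypted=True),
        Copy("cloud contacts sync", "cloud provider", encrypted=False),
        Copy("phone backup", "cloud provider", encrypted=False),
        Copy("third-party app", "small startup", encrypted=False),
    ],
    threats=[
        Threat("subject of the story", "legal",
               "legal demand to the cloud provider", "source is identified"),
        Threat("subject of the story", "technical",
               "breach of the third-party app", "source is identified"),
    ],
)

print([c.location for c in address_book.unencrypted_copies()])
print(address_book.channels_not_considered())
```

The second print is the useful one in practice: it lists the categories nobody has thought about yet for this asset, which is usually where the easy attack is.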
we haven't talked about operational
security much operational security has
to do with the day to day process the
habits of security so if you're
communicating over a secure channel with
an anonymous source
you can't ever use a non-anonymous
channel with them I want to show you an
amazing document that was written by
Allen Dulles whose brother is the Dulles
that Dulles Airport is named after
he was the CIA director for a while and
it's called the 73 rules of spycraft
and it's this crazy very like
cloak-and-dagger World War Two era thing
because he worked in intelligence
during World War Two
all of these rules of spycraft and
it tells you how to use the phone
and the post and a lot of it doesn't
look relevant any more but a lot of it
really does but look at number two it
consists in carrying out daily tasks
with painstaking remembrance of the tiny
things that security demands security is
habits okay it's not installing the
right software it's not you know having
a security meeting it's not hiring a
security consultant just once it's
learning the habits that you need to
stay secure
right you got to get your encryption
right you got to use the right
communication methods every single time
it's really annoying security isn't free
you have to remember to do things you
have to use awkward software
anyway security is about habit and Allen
Dulles is great and I'm gonna paste this
in the Slack channel just because it's
so much fun to read
yeah booze is naturally dangerous sex
and business don't mix there's really
good stuff here and then there's like
really basic spycraft if you have a
rendezvous first make sure you are not
followed I mean I don't know it's all
very old-fashioned but actually a lot of
it is still very good advice turn the
blackout to good use well that's not
happening when we're not in world wars
anyway so that's operational security
operational security is about habits
it's about keeping your mouth shut if
you're working on a sensitive story
don't tell people okay just I you know
it should go without saying but it
doesn't somehow yes sure you can have
bad habits I mean in theory but I mean
what I will say is that having a
security plan is much better than not
having a security plan yeah so I do know
one reporter who in order to hide who
their source in a government department
was they made a habit of calling lots
and lots of people about stupid things
in that department so now their
phone records show they talked to 20
people and not one person so think
about it if they're trying to
investigate who the source was the first
thing they have to figure out is the
list of suspects you want to make that
list of suspects big ok here's an
example of a legal threat so this was a
New York Times reporter named James
Risen who you know
Risen and the Times actually fought
back for many years to try to prevent
being compelled to disclose his source I
can't remember how this actually turned
out whether or not he did go to court I
can't remember what happened but you
know you got to remember this stuff
right if your adversary can file
a lawsuit that can compel you to
disclose you have different types of
problems than if they can't so let's go
through some scenarios briefly these
actually used to be an assignment you
would pick one and write out a security
plan so this is a sort of classic one
which is you have sensitive material in
a war zone that you're trying to get out
of the country so thoughts how do you
how do you do this how do you get your
pictures home
so encryption great start so encrypt
what all right so first of all you
should be storing them encrypted
otherwise you are this guy we saw at the
beginning oh I went right past it
alright this is you know a real right so
don't be that guy so okay so encrypt the
material on your laptop well you got to
put it on the wire at some point because
you got to get it home
although so that actually mostly depends
on the time frame so if you're working
for a wire service and you're doing
breaking news you're not gonna probably
not gonna have time to mail it home or
fly it home you really want to transmit
it home but yeah walking across the border
with the data can be better than
sending it but certainly if you're gonna
send it you should encrypt it on the
wire right so you can sort of see these
are the general ideas in
practice there's another part of this
which this setup doesn't really talk
about which is you really have to
protect your address book
your travel itinerary your location
I mean if the phone records showed that
you visited someone's house that's
really bad
so actually meeting if location is a
problem meeting sources out somewhere is
a lot better than going to their houses
or their work you have to think about
these sorts of things all right let's
try another one this is a corporate
example you're trying to get some inside
material on insider trading so how do I
get this material to you
what would you advise them to do you're
talking to them and you're like okay
it's like okay I've got documents that's
gonna blow this story open I want to get
them to you Oh interesting I don't know
that part discovered it crazy so I don't
know yeah so you could tell them to mail
it to someone else okay
yeah so paper it
has some problems but paper can be a
very good secure medium because you
can't do it remotely there's no
possibility of hacking and everybody
understands the security properties of
paper everybody understands under what
circumstances paper is secure and not
secure the disadvantage of paper is that
you can't encrypt it so physical
security is harder with paper whereas
it's easier with an encrypted card but
sure they could get it out on paper you
definitely want to tell the source not
to email it from their work account I
mean I know this sounds stupid but
people do this you might want to
consider whether or not downloading the
data is going to be logged the safest
thing for them to do if they can manage
it is probably for them to bring the
material home whether in paper or
digital form and then
potentially it depends well so right so
your account is okay their account is
the problem right or connections
between the two are the problem so for
example if they take the material home
and email it to you from their home
computer using their home account again
think about the threats right the
company is going to file a lawsuit
they're going to do an investigation
probably you know certainly the company
computer and the company network is
monitored they may be able to compel the
employee to turn over their personal
laptop they may be able to file a
lawsuit which compels them to turn over
their personal laptop all right so you
know it's a combination of legal
technical and social threats right
what's going to happen when this person
is interrogated by corporate security is
there recording on the company network
can they file a lawsuit to compel the
disclosure of information and so anyway
all of this boils down to surprisingly
one of the best ways to get leaked material
across is to meet someone in person and
hand it over that still works really
well if you can manage to do that and do
that privately that is a pretty secure
thing to do and that's been the case for
40 years right we're still meeting
people in parking lots and handing over
envelopes if you're going to do it
online you should use SecureDrop we'll
cover that in a minute here's another
one here's where you're reporting on the
police all right now of course this is
journalism eventually they're going to
know your story but you still want to
protect your sources and you also
don't want them to know before it's
published so let's go through the
threat modeling list so what are you
trying to keep private
right so identity of the sources yeah
yeah what they're giving you what else
yeah the details of the information
that they're giving you and also the
fact that there is a story is something
that you want to keep secret okay so who
are you trying to keep it secret from
yeah the police commissioner
probably other people in the department
yeah
probably them too yeah what can they do
to find out these secret things
how could they find out who you're
talking to what they told you or even
that you are working on a story I don't
know do they what what are you thinking
so right so if you are the subject of a
criminal investigation then they can get
a warrant to get the data depending
where you are in the world maybe they
would right not all countries would
require a warrant for the police to get
that information so it depends what else
can they do to find this out yeah
just observing so you know have the
conversations in private they can also
interview your sources right if they
suspect they can start asking
around inside the department so that's
something you have to consider and then
finally the last threat modeling
question so what happens if they find
out what happens if they manage to
figure out what you're doing and what
you know
yeah so maybe the story has to go before
it's ready and maybe there is no story
what else can happen destroy it what oh
you're saying you should destroy the
material if they find out you're saying
they can yeah so that's the end of the
story yeah if they're trying to subpoena
the material you can't destroy it
otherwise you have additional
penalties what
mmm-hmm yeah they could they could just
sort of win the PR war what what happens
to the people you're talking to what
about that sure but what can happen to
them what is the risk to them yeah they
could lose their jobs for example yeah
you know or potentially end up in jail
depending on the circumstances in the
context and then there's this one this
one is a little too real all right this
is a very complicated security situation
where there are intense both physical
threats and digital threats which brings
up a really important point I've been
describing digital security for
journalism but you can't
really split off the digital from the
rest of it from the legal and the
physical and the psychological security
all of this has to be in one plan
all right so to protect against the
digital threats in this environment you
also have to protect against the
physical threats so and these connect in
a bunch of ways for example the purpose
of installing spyware through a phishing
link may be to figure out where you are
because then they can track your
location through your phone for a
physical attack or there could be a
physical raid on an office to seize your
materials to figure out what's happening
so all of this stuff interconnects and
if you're doing digital security you
have to talk to the people who are doing
the physical security and coordinate and
I mentioned also psychological security
a lot of the people working with
material coming out of war zones even if
they're not in the country eventually
develop PTSD right there's only so
many beheadings you can watch before you
develop psychological problems so that
is a security issue as well
we have here at the school on the fourth
floor the Dart Center for Journalism and
Trauma so we actually have a bunch of
specialists experts in reporting in
conflict zones reporting under threat
dealing with trauma inflicted on
journalists it's a whole subject in
itself that our technical security plan
has to fit into okay
the last part of this class today is
what I call the recipes so recipes are
some advice on solving specific problems
there is no one-size-fits-all security
advice I can only give you some hints on
trying to do particular things because
that's this is normally what people want
when they ask you right so a
reporter will come to you and rather
than saying let's make a security plan
for this story they'll say what app
should I use for anonymous communication
what's the answer by the way to what app
should I use for anonymous communication
apps don't give you anonymous
communication that's the answer now
having said that I'm gonna give you a
recipe for doing this but bear in mind
it's not the app that gives you the
anonymity it's the habits that give you
the anonymity okay so let's talk about
secure communication the first thing you
should be aware of is that any data
which exists for a long time is subject
to popping up and damaging you later
it may be leaked it may appear in court
anything that you type into the company
slack you should imagine being read
aloud in a court case so this was a
transcript of an internal Campfire chat
which is like Slack that Gawker had
around the publication of the Hulk Hogan
sex tape
some of you may know that this was the
court case which destroyed Gawker they
were ordered to pay more than a hundred
million dollars in damages for privacy
violation they
went bankrupt that's the end of them do
you want that phrase read aloud in court
that's what you should be thinking
whenever you're typing into Slack chats
last forever
one way to mitigate this is to have a
data retention policy it's really
important to have a standard policy so
you can say oh we delete all chat logs
older than a year because if you start
deleting things when there's a subpoena
that can get you into legal trouble so
the way to solve this problem of chat
logs and other types of data lasting
forever is just to have a regular
deletion policy if you use iMessage
there is actually now a setting to
automatically delete all messages older
than a year I
recommend you turn that on you know how
often do you actually look at messages
older than a year right have you ever
looked at messages older than a year
this will save you later because by the
time somebody is asking for that message
it's too late you can't delete it later
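A regular deletion policy like that can be sketched in a few lines of Python. This is only an illustration under assumptions: the one-year window, the `sent_at` field, and the `apply_retention` name are made up for the example and do not describe any particular chat system.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

# Assumed policy for this sketch: keep one year of chat logs.
RETENTION = timedelta(days=365)


def apply_retention(messages: List[Dict], now: datetime) -> List[Dict]:
    """Return only the messages still inside the retention window."""
    cutoff = now - RETENTION
    return [m for m in messages if m["sent_at"] >= cutoff]


# Hypothetical log entries, one older than a year and one recent.
now = datetime(2018, 11, 30, tzinfo=timezone.utc)
logs = [
    {"sent_at": datetime(2017, 1, 15, tzinfo=timezone.utc), "text": "old"},
    {"sent_at": datetime(2018, 10, 1, tzinfo=timezone.utc), "text": "recent"},
]
kept = apply_retention(logs, now)
print([m["text"] for m in kept])
```

The point of the design is that the cutoff depends only on the clock. It runs the same way every day, whether or not anyone has asked for the data.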
text messaging so SMS actual like
standard text message is unencrypted
over-the-air it's like the least secure
thing you could imagine it's broadcast
unencrypted there was a beautiful piece
of art in Madison Square Park a few
years ago which just was a little box
with an antenna and a printer and it just
printed out every text message that it
received it was kind of beautiful just
reading them but anyway not secure so
you got to use some sort of IP messaging
system which also means by the way that
iMessage will default to text if it
can't send through the network you
really need to turn that off so
otherwise one of the most secure ways
becomes one of the least secure ways
most messaging apps are not encrypted
and the messages can be read by the
parent company which means it can be
subpoenaed the exception is Signal and
now WhatsApp also iMessage to the best
of my knowledge the data that is in
iCloud cannot be read by Apple
so as evidence of that you
will recall perhaps the case where the
FBI wanted to get into an iPhone used
by the San Bernardino shooter and they
couldn't and Apple wouldn't help them so
it's actually designed so that Apple
can't break it so to understand whether
a messaging system is secure against
subpoenas I use what I call the mud
puddle test the mud puddle test is
you're walking down the street and you
drop your phone or your laptop into a
muddy puddle of water and you call up
Apple or Google or Facebook or whatever
and you say oh man my phone's destroyed
help me get my messages back and if they
say yes we will help you do that it's
not secure okay now in the case of
iCloud the data is stored on Apple
servers but the key is on the local
device that's how they manage it Apple
changed this policy after a bunch of
iCloud hacks you may remember I think it was
2014 a bunch of nude celebrity pictures
going around the internet that was done
by logging into iCloud accounts that is
no longer possible all the iCloud
storage is now encrypted it is encrypted
with a key that is on the device there
is a secure area of the device and I
think now in the laptops as well and
that key is mixed with your passcode key
so the password unlocks the access to
the secure key which unlocks the access
to your iCloud storage which is why
iMessage is reasonably secure I'm not sure
if Apple has the metadata or not but
they certainly
don't have the text oh yeah SMS you can buy
this device and read SMS or you can I'm
sure you could do it with a $30
software-defined radio as well this is
what I just said about iMessage you want
to turn this little send as SMS thing
off otherwise one of the most secure
ways to talk to people becomes one of
the least secure whatsapp a couple years
ago implemented the signal protocol
whatsapp is the most secure widely used
messaging platform so it's probably a
win if you can get your sources to talk
to you through what's up
now the metadata is still available to
Facebook which means it's still
available to anybody who can hack into
Facebook and anybody who can subpoena
Facebook so you have to consider whether
your adversary can do either of
those two things but assuming that they
can't and you know Facebook spends a lot
of money on security they're as these
things go reasonably secure which means
that whatsapp is a pretty good answer to
secure communication and it's available
in most countries most people already
have it installed this situation has
improved dramatically because of this
choice on Facebook's part which is also
causing them all sorts of headaches
because people are using it to spread
rumors and misinformation and they can't
intervene because they don't know what
the messages are I mean that's the
balance Signal is the gold standard
in part because of its encryption and
this thing called a ratchet actually
this is no longer true it's now just
called the Signal protocol it used to be
called the Axolotl ratchet it's called a
ratchet anyone know why anyone know how
signal security works and this is
actually relevant to threat modeling
email as well so let's talk about PGP
which I do not recommend by the way
here's an email it is encrypted with
some key call it key A and you create that key
and you send 100 emails and every one of
those emails is encrypted with the same
key so this is how PGP works the
difficulty here is that if anyone ever
gets this key they can read everything
you've ever sent what signal does and
whatsapp as well is a little different
it uses a key mechanism called a ratchet
and a ratchet is a mechanical device
that can only turn one direction right
it can only turn forward I'm gonna get
this wrong let me think about this for a
second
what happens is that each message is
encrypted with a different key right so
that's the first thing is that you
actually encrypt it with a key that is
derived from your personal key and some
information from the last message so each
message actually stores a hash of the
previous message and this does two
things first it means that you can't
fake a message later right so if you
have a sequence of messages you can't
like insert one in there and forge the
message and say oh they sent this
message too even if you get the person's
key right so even if they steal your
Signal private key
they can't fake evidence later right
that's one thing that this this hash
does this by the way is the chain in
blockchain
right blockchain is exactly the same
idea you have a hash list pointing
back the other thing that it does is
this
this hash is one of the inputs to the
key for this message which means that
you actually have a different key for
every message which is not the same as your
private key which means that if someone
manages to get into your phone and dump
Signal's data storage and get your
private key they still can't read the
messages you've already sent right
because they depend on a key that was
generated just for that message so this
property of Signal is also called
forward secrecy which
means that even if you get the private
key later you can't read the previous
messages this is not a property that PGP
has which is another problem with PGP
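To make the one-way ratchet idea concrete, here is a minimal Python sketch of a key chain where every message gets its own key. It is a simplified illustration and not Signal's actual protocol, which also mixes fresh Diffie-Hellman exchanges into the chain. The function names and the labels `b"message"` and `b"chain"` are assumptions made up for this example.

```python
import hashlib
import hmac
from typing import List, Tuple


def ratchet_step(chain_key: bytes) -> Tuple[bytes, bytes]:
    """Advance the chain one step and return (next_chain_key, message_key).

    Both outputs come from a one-way function (HMAC-SHA256) of the current
    chain key, so holding a later chain key does not let you compute an
    earlier one.
    """
    message_key = hmac.new(chain_key, b"message", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"chain", hashlib.sha256).digest()
    return next_chain_key, message_key


def message_keys(initial_chain_key: bytes, count: int) -> List[bytes]:
    """Derive one key per message by turning the ratchet `count` times."""
    keys = []
    chain_key = initial_chain_key
    for _ in range(count):
        chain_key, message_key = ratchet_step(chain_key)
        keys.append(message_key)
    return keys


# Hypothetical starting secret; a real system would agree on this securely.
keys = message_keys(b"\x00" * 32, 3)
print(len(set(keys)))  # every message key is distinct
```

Contrast this with the PGP picture above, where all 100 emails share key A. Here, if each side deletes old chain keys and message keys after use, someone who seizes the phone later only finds the current chain key, which cannot be turned backward.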
the other really nice thing about signal
is that they store very little
information and they've always said this
we know it's true because this is the
documentation that signal provided in
response to a subpoena so the only thing
they know is the account name the
creation date and the last connection
date that's it they don't store any
information about who talks to who so
that's how they solve this problem of
the subpoena Signal has to cooperate with law
enforcement just like everybody else
they solve the problem by just not
storing anything so signal is ideal and
I've had people working in intelligence
tell me that that's the only thing they
trust yeah
are there laws that require you to keep
logs so I noticed a lot of the VPN
providers that do this so there is a big
legal fight so there's a there are laws
that require phone companies to keep
logs and also provide the ability to
wiretap basically I mean wiretaps are
not done with wires anymore they're done
by programming the switch to record the
call there are ongoing legal fights
about whether you can have a system that
doesn't store data and has no encryption
backdoor for the moment it's legal but Signal is
also relatively a small player right so
for example the Apple vs. FBI case was
about this principle and Apple fought on
the principle that says no it should be
possible for our users to have actually
secure communication that law
enforcement cannot get into and of
course law enforcement sees it
differently there is a long history of
fights around access to cryptography
none of you are old enough to remember
the Crypto Wars of the 90s there was a
proposal called the Clipper Chip which
was you know what this internet thing
needs cryptography everybody needs
encrypted communication let's give
everybody a standard chip which does
secure communication except that there's
a backdoor we'll make an
institution which will do key escrow
which will keep the keys to all these
chips and if law enforcement needs to
get into your phone or your email then
they can get a warrant and get the key
and unlock the encryption the
problem I mean aside from the
issue of whether it should be legal to
use encryption that law enforcement
cannot break which is a worldwide fight
that's going on in law and policy the
problem with key escrow is that
the stored keys become an immensely
valuable target and if you look at the
history of hacks I mean
it seems to be more or less impossible
to secure a large volume of extremely
valuable data imagine if that data was
the encryption key to everything else so
anyway this is a fight that keeps
happening for the moment it is possible
to use unbreakable private
communications but again that's just the
phone right
anonymity does not come from
encryption anonymity comes from very
very careful habits moving on to email
don't use email for secure stuff if you
can avoid it that's the bad news the
good news is if Google is not a threat
then Gmail is pretty good so Google puts
an enormous amount of money into
security they have the best security
track record of any of the major tech
platforms I don't think they've ever had
a major security breach they put a lot
of money and effort into this and if
both ends of the conversation are using
Gmail then of course gmail knows what
you're talking about but if your
adversary can't get information from
Gmail then you're probably good so if
your adversary is the American
government then okay they can subpoena
Gmail if your adversary is the Chinese
government then Google is probably not
going to give them information which is
why Google's Gmail is blocked in China
also corporate adversaries are unlikely
to be able to subpoena Google PGP is a
disaster cryptographically it's okay
although the
lack of forward secrecy is not great
the problem is the usability it's very
hard to use properly and it also doesn't
protect your metadata and also there was a
big security hole found in it earlier
this year so I would say avoid it if you
can this is the issue with email is that
again there's all these intermediaries
right so if someone's sending you an
email from their work computer so for
example it goes to the company Network
and then nowadays most of the
connections like if you're sending an
email from Google to Yahoo at some point
it's got to be connected there's got to
be connection between a Google server
and a Yahoo server it used to be that
those connections weren't necessarily
encrypted now they mostly are but if
it goes to a smaller ISP it might not be
basically you have no guarantee that any
part of that transit is going to be
encrypted or secure email is the worst
except it's the only truly federated
communication medium we have it's the
only communication platform that isn't
run by a single company anyway that's
the world we live in this is how you
should think of an email you're sending
a postcard everybody can read it phones
as we saw earlier they just leave data
like crazy in most cases only state
actors can get at that data in many
cases they need a warrant to do so but
as we've seen sources have been busted
by this stuff so I don't know I I kind
of think that VoIP is probably a better
call but you know Skype is not it's
going to respond to a subpoena Google is
going to respond to a subpoena but you
can do calls through signal and whatsapp
okay that's communications we're now
going to talk about storing data as I
mentioned earlier the key question you
have to ask is how many copies and where
are they if you don't know that you
cannot securely store data you also have
to think about erasing files
just putting something in the trash does
not remove it from the disk you can
still recover it with forensic programs
emptying the trash does not remove it
from the disk either it just removes it
from the filesystem so there is
a secure erase feature built into macOS
now I think there's also one in Windows
if not there's various utilities if you
really want to be sure destroy the media
just smash your USB Drive with a hammer
then you have to think about the
physical security you know if you have
an unencrypted memory card in your desk
drawer that's probably terrible also
physical access right
if you leave your computer to go to
lunch and the screensaver has a five
minute delay or you don't have the
screen lock then someone can sit down
and install spyware while you're not
sitting there you have to ask
questions like how easy is it for
someone to just walk into your office
and sit down at your computer all it
takes is guts right
probably nobody is gonna stop them oh
you know I'm Jonathan's friend he asked
me to get a file and that would probably
work all right finally I'm going to talk
about anonymous sources anonymity and
privacy are really different things
the problem is linkability so we all
have many online identities the
issue with anonymous communication is
not so much that the person has no
identity of course they have the
identity they have the identity of the
person you know they're using a
particular account to talk to you the
problem is making sure that you can't
link that identity to say their email
identity or their legal name so the way
to think about
anonymity is the linkability
of different accounts
this is how you should think about any
communication you send where the
metadata is available right so it's not
a postcard so now you can't get inside
but everybody can see the address and by
the way everything that you send through
the US Postal Service
they scan and save the address that's
how they trace mail bombs and so forth
so the post office knows who is sending
what this is the difficulty with setting
up anonymous communication this is a
lovely comment by Barton
Gellman who was involved in a lot of the
Snowden reporting and you know it's a
lot of work to set up anonymous
communication and you don't want to make
sources feel like they're doing
something wrong that's the challenge
here in order to not have metadata
linking you to the to the source every
contact with the source has to be
through an anonymous channel and that's
very hard to do that means every contact
including the first contact and so you
can have this conversation but it's a
tough sell this is one of the major
challenges in journalism security is
what I call the first contact problem
the only answer that I've ever found to the
first contact problem is you just need
to use a secure communication channel from
the start so this is easier than it used
to be you can now say oh I use
WhatsApp for everything the only problem
with whatsapp is that I think it exposes
your phone number so then they could
just text you but hopefully they just
use whatsapp for everything and so
there's the conversation you're gonna
have yeah let me add you on whatsapp I
use WhatsApp for all my communications
with everybody just to protect
everybody's security just standard
one of the problems you have here is
that you don't know which sources are
going to become sensitive sources later
so the only thing you can do is treat
every source as a sensitive source
anonymous browsing when you go to a
website they know your IP address
Tor obfuscates the IP address by
routing it through a bunch of computers
it uses three computers so that
the computer in the middle or at the
beginning doesn't know who you're
talking to and the computer at the end
doesn't know who's talking
to them you've all heard of
Tor I think Tor does work but it's
important to understand what it works for
in particular it doesn't encrypt traffic
after it leaves the Tor network
so if you're not using HTTPS then
someone here in the network can see what
you're doing and if you log in using
your gmail account over tor then this
computer knows it's you right so the
only thing tor does is hide your IP
address it's a useful part of the
solution but it's not the whole solution
the easiest way to use tor is to just
install the tor browser you can also
make other network connections like I've
done scrapes through tor by setting it
up as a proxy on your local machine
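A proxy setup like that might look roughly like the sketch below. It assumes a Tor client is already running locally on its default SOCKS port 9050 and that the third-party `requests` library is installed with SOCKS support; the helper names `tor_proxies` and `fetch_via_tor` are made up for this example.

```python
def tor_proxies(host: str = "127.0.0.1", port: int = 9050) -> dict:
    """Build a proxies mapping that points at a local Tor SOCKS port.

    The "socks5h" scheme asks the proxy to resolve hostnames too, so DNS
    lookups also travel through Tor instead of leaking locally.
    """
    url = f"socks5h://{host}:{port}"
    return {"http": url, "https": url}


def fetch_via_tor(url: str) -> str:
    """Fetch a page through the local Tor client (assumed to be running)."""
    # Imported here because it is a third-party dependency:
    #   pip install "requests[socks]"
    import requests

    response = requests.get(url, proxies=tor_proxies(), timeout=60)
    response.raise_for_status()
    return response.text


print(tor_proxies())
```

This only hides the IP address of the scraper. Anything the scraper sends that identifies it, such as login cookies, still identifies it.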
there is now we were all sort of waiting
for this to happen there is now one
documented case
where the subject of a story figured out
that a story was being written about
them by looking at the IPs of the
people connecting to their web site so
here it is I'm sure it's happened in
other cases but we know about this one
because it came up in a basically unrelated
court case New York Times had been on
the company's website twelve times that
day or that week I just have a feeling
it might be this reporter snooping
around trying to build a story so this
is the problem that tor will solve for
you
the last thing I want to talk about
today is securely handling leaks this is
in the recipes category again so we're
gonna assume we're in a context where
you're trying to prevent your adversary
from knowing who gave you the data
obviously you can't use a corporate
network if they do it from a personal
laptop there's going to be various types
of identification you know they do it
from their email account obviously it's
from their email account the best thing
you can do is still meet them and hand
something over beyond that the safest
thing you can do is the model that was
pioneered by WikiLeaks gotta give credit
where credit is due which is just don't
know the identity of the source now
there is a trust problem here obviously
that only works if you have an
independent way to verify the material
right if it's material that you know is
true because of who is giving it to you
then you then you have to know the
identity of the person giving it to you
but if you don't have to know that's the
safest thing you can do because then
there's no way you can be forced to
reveal their identity SecureDrop is now
the gold standard for electronic
submission to journalists who's heard of
SecureDrop yeah okay less than all of
you SecureDrop is a piece of open
source software that runs a secure
drop box so a physical drop box is a place
where you leave material and someone
comes and picks it up later and you don't
ever have to meet and know each other's
identity so what this is is a
place where you can upload files and it
forces you to connect through tor so
there's no way to connect to it that
reveals your IP address and then it also
basically you can log in and create a
mailbox so you can you can leave
messages for each other so it's not just
someone throws you something and you
don't know who it is you can talk to
them by leaving a message in when they
log in later they can get the message
and maybe you can set up
you know some other method of
communication through this most major
news organizations now have a secure
drop server this is become the gold
standard in electronically leaking
material now you still have to use some
thought, because if you're the only
person on your network using Tor, they
can figure out it was you. In fact, this
is a real case, I don't think I have the
slide, but there was a case where a student
didn't want to go to an exam, so they
sent a bomb threat through Tor and shut
down the campus. But campus IT was able
to figure out that there was only one
person connecting to the Tor
network at the time that the bomb threat
was sent. So Tor does
one thing, right: it hides your IP address.
That's it. It doesn't even
hide the fact that you're using Tor. So
you have to be careful still, but
SecureDrop is about as good as technical
solutions are going to get. Yeah, because
there was only one person using Tor, so
they figured out which computer he
was using. He was using a computer on
campus, and then they knew who was logged
into that computer. Yeah.
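The reasoning campus IT used can be sketched in a few lines. This is an illustration with made-up records, not a reconstruction of the actual investigation. It relies on one real fact: the addresses of Tor relays are public, so a network operator can tell which of its users are connecting to Tor, even though it cannot see what they do there.

```python
from datetime import datetime, timedelta

# Hypothetical campus network records: (logged-in user, destination, time).
# "tor-relay" means the destination address is on the public list of relays.
CONNECTIONS = [
    ("student_a", "tor-relay", datetime(2020, 1, 1, 8, 25)),
    ("student_b", "web", datetime(2020, 1, 1, 8, 27)),
    ("student_c", "tor-relay", datetime(2020, 1, 1, 14, 0)),
    ("student_d", "web", datetime(2020, 1, 1, 8, 31)),
]


def tor_users_near(connections, event_time, window=timedelta(minutes=30)):
    """Return the users who connected to Tor within `window` of the event."""
    return {
        user
        for user, destination, when in connections
        if destination == "tor-relay" and abs(when - event_time) <= window
    }


threat_sent_at = datetime(2020, 1, 1, 8, 30)
suspects = tor_users_near(CONNECTIONS, threat_sent_at)
print(suspects)
```

With only one Tor user on the network in that window, the set of suspects has exactly one member. The more people around you who use Tor, the less this kind of filtering reveals.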
So the MAC address is visible in
the network traffic from your computer
to their access point, right? When it
starts getting routed through firewalls
and gateways, there's no more MAC address
of yours, because then you see the MAC
address of the machine that is forwarding
it, right? So the MAC address is the
physical Ethernet address that,
let's say, your router sees. It's
exposed there, but generally the MAC
address will not survive transit through
the internet. Okay, but you can pick it up
on Wi-Fi. So
that's just getting the documents. And
then once you have the documents, you
have to worry about whether the
documents will reveal the source. Here's
one of the famous journalism security
fails. Phones and many cameras record the
GPS coordinates in metadata on the image.
If you publish that image, people know where
the phone is. So here's an interview with
John McAfee, who was more or less on the
run from Belize, and then they knew exactly
where he was. One of the famous
journalism security failures. This was
Vice, by the way. So don't do that.
There are programs that will strip
metadata from images and other documents.
So here's file metadata from a Word
document.
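As a rough idea of what a metadata-stripping program does for one file type, here is a minimal sketch for JPEG images using only the Python standard library. In a JPEG, EXIF data (including GPS coordinates and any embedded thumbnail) lives in an APP1 segment near the start of the file. The function name is made up, and this is an illustration under simplifying assumptions, not a substitute for a dedicated, well-tested tool: it only understands JPEG, and dropping every APPn segment also discards things like color profiles.

```python
import struct


def strip_jpeg_metadata(data: bytes) -> bytes:
    """Return JPEG bytes with APP1..APP15 and comment segments removed."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing start-of-image marker)")
    out = bytearray(b"\xff\xd8")
    i = 2
    while i + 4 <= len(data):
        if data[i] != 0xFF:
            raise ValueError("unexpected byte where a marker should start")
        marker = data[i + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            i += 1
            continue
        if marker == 0xDA:  # start of scan: the rest is compressed image data
            out += data[i:]
            break
        # Segment length is big-endian and counts its own two bytes.
        (length,) = struct.unpack(">H", data[i + 2 : i + 4])
        segment = data[i : i + 2 + length]
        is_app_metadata = 0xE1 <= marker <= 0xEF  # APP1 (EXIF/XMP) .. APP15
        is_comment = marker == 0xFE
        if not (is_app_metadata or is_comment):
            out += segment
        i += 2 + length
    return bytes(out)


# Example use (hypothetical file names):
#   with open("original.jpg", "rb") as f:
#       cleaned = strip_jpeg_metadata(f.read())
#   with open("cleaned.jpg", "wb") as f:
#       f.write(cleaned)
```

Even after a pass like this, check the result with a metadata viewer before publishing, since other formats (Word, PDF) keep their metadata in entirely different places.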
PDFs have this too. Basically, don't ever
directly republish material that was leaked to
you, and don't ever republish the
original files. There are tools that will
remove metadata, but one of the safest
things you can do is load it up and take
a screenshot on your own computer. Then
it will have the metadata of your
computer and not the source. And the
reason I say use a screenshot is because
it's truly what you see is what you get.
When you take a screenshot, you
are literally not getting anything other
than what's on the screen, and that's the
idea: you want to completely strip all of
the information. Unfortunately,
in the last few years everybody learned
about microdots.
So this is the sad tale of Reality
Winner, who was working for the
government and forwarded to The
Intercept a classified document showing
that Russians had gotten into the
election websites, not the actual vote
counting machines but the websites, of 18
states. The Intercept did a story about that.
The Intercept published the document. So
here's a little more about that. Yeah, so
here's what it is. Basically
all color laser printers print these little
dots, you can barely see them here, these
little yellow dots. These are called
microdots.
Yeah, here we go.
And they were in the document that they
published, and what it gives you is
a printer serial number and a date.
So if you zoom in here and you invert it,
they become blue. You have to increase
the contrast a little bit, and you end up
with this. And then the EFF website has
a little dot decoder. I mean, it's an open
standard, right, you don't have to go to
the EFF to do it.
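Why inverting turns the yellow dots blue, and why the contrast boost is needed, can be shown with a little pixel arithmetic. This is a small sketch with made-up scan values: yellow on white paper differs from the paper only in the blue channel, and only slightly.

```python
def invert(pixel):
    """Invert an (R, G, B) pixel with 8-bit channels."""
    r, g, b = pixel
    return (255 - r, 255 - g, 255 - b)


def stretch_contrast(value, low, high):
    """Linearly rescale `value` from [low, high] to [0, 255], clamping."""
    if high <= low:
        raise ValueError("high must be greater than low")
    scaled = (value - low) * 255 / (high - low)
    return max(0, min(255, round(scaled)))


# Hypothetical values from a scanned page.
PAPER = (255, 255, 255)            # white background
FAINT_YELLOW_DOT = (255, 255, 200)  # barely visible against white

inverted_paper = invert(PAPER)            # black
inverted_dot = invert(FAINT_YELLOW_DOT)   # very dark blue

# Stretch the blue channel so the faint dot becomes fully bright.
boosted_dot_blue = stretch_contrast(inverted_dot[2], 0, inverted_dot[2])
print(inverted_paper, inverted_dot, boosted_dot_blue)
```

After both steps the page is black and the dots are bright blue, which is much easier to see and to read off as a grid.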
Look at this, there's all sorts of
information about which printers do it
or not. Yep.
So basically assume every printer does it.
I can't find it right now, but anyway there
are little utilities that will take that
microdot image and give you the date
and time. And we believe this played
a key role in the identification of
Reality Winner, because all prints
are logged. Basically every access to
classified material is logged. So, I mean,
the fails just stack up here, right? The
source had previously emailed The
Intercept, so problem number one:
there was metadata already showing a
contact between them. And we know a lot
about how Reality was found because we
can read the court documents that the
FBI filed. So they knew there was a
previous contact. She accessed it from
work and printed it, so: printer
microdots. And then she sent it to The
Intercept. The Intercept then, to
verify the document, sent it to someone
else in the intelligence services to say,
is this real? They sent them the original
document. That's how the FBI
first got a copy of it. And then
they published the original document,
which had the printer microdots and also
creases, because Reality Winner had
folded it up to take it out. So a major
failure on the source's part, a major
failure on the journalists' part, and
Reality Winner is now in jail. Okay, so I
think that's all we're going to have time
for today.
We will end on that cautionary tale.

A class on the basics of digital security, and methods to deal with specific journalistic situations — anonymous sources, handling leaks, border crossings

Contributor: Jonathan Stray