Sunday 14 September 2014

Google's Safe Browsing service is killing your privacy

  Google Safe Browsing is a service through which Google provides lists of URLs (addresses) of websites that contain malware or phishing content.

  These lists of suspicious sites are continuously updated using Google's web crawlers, programs that scour the web to index sites for Google's search engine.

  Lists from the Google Safe Browsing service are used by browsers such as Google Chrome, Mozilla Firefox and Apple Safari to check the web pages users are trying to access against potential threats.

  The service issues alerts when users are about to open websites or content Google has classified as malicious. The warnings are displayed as 'visual messages' along with specific details relating to the malicious content concerned.

  The service is also designed to block the downloading of files infected with malware and, once a user's computer has been infected, it can issue instructions on how to detect and remove the malware.

  Members of the public can also access the lists of unsafe sites via a public API for the service. [An API, or application programming interface, is a set of functions or routines that a program can call to accomplish a specific task, such as reading a particular list of websites.]

  In addition, Google uses its Safe Browsing service to send internet service providers e-mail alerts regarding threats hosted on their networks.

  More than one billion internet users are currently using the Safe Browsing service, either directly or indirectly and, according to Google, it is issuing three million warnings a week.

  The service is acknowledged as being highly efficient in protecting users from malware and phishing attacks.

Privacy concerns with Google's Safe Browsing service

  When using Google's public API (the Safe Browsing Lookup API) to check out a suspicious webpage, members of the public who are concerned about their privacy need to be cautious. The URLs (addresses) to be looked up are not hashed (converted into a one-way digest) before they are sent, so the Google server knows which URLs have been looked up using this API. This makes tracking your online activities ultra-easy.
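
  To make the concern concrete, here is a minimal sketch of what a lookup request with an unhashed URL looks like from the server's side. The endpoint address, the `url` parameter name and the function names are all assumptions made up for illustration; they are not Google's actual API. The point is only that percent-encoding is not hashing: whoever receives the request can read the address straight back out.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint, for illustration only; not Google's real address.
LOOKUP_ENDPOINT = "https://lookup.example/check"


def build_lookup_request(page_url: str) -> str:
    """Build a lookup request that carries the page URL as a query parameter."""
    # urlencode percent-encodes the URL, but that is reversible, not a hash.
    return LOOKUP_ENDPOINT + "?" + urlencode({"url": page_url})


def url_seen_by_server(request_url: str) -> str:
    """Recover the page URL exactly as the receiving server could."""
    query = urlsplit(request_url).query
    return parse_qs(query)["url"][0]


# A made-up address the user wants to check.
visited = "http://clinic.example/appointments?patient=42"
request = build_lookup_request(visited)

print(request)                      # the request as sent over the wire
print(url_seen_by_server(request))  # the address the server can read
```

  The second line printed is the original address, character for character, which is why a service that receives unhashed URLs can log exactly which pages were checked.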

  The Firefox and Safari browsers, however, use a second version of the API, Safe Browsing API v2, to exchange data with the server. This uses hashed URLs, so the Google server never knows the actual URLs queried by the user.
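
  A rough sketch of the hashed approach follows. It assumes the browser hashes each address with SHA-256 and compares a short prefix of the digest (4 bytes here) against a list it has already downloaded; the function names, the example addresses and the sample list are hypothetical, and real clients also canonicalize the URL first, which is left out.

```python
import hashlib


def hash_prefix(expression: str, prefix_len: int = 4) -> bytes:
    """Return the first prefix_len bytes of the SHA-256 digest of an address."""
    digest = hashlib.sha256(expression.encode("utf-8")).digest()
    return digest[:prefix_len]


def needs_server_check(expression: str, local_prefixes: set) -> bool:
    """True if the address's hash prefix appears in the locally stored list."""
    return hash_prefix(expression) in local_prefixes


# Hypothetical local list holding one known-bad entry (illustrative only).
local_prefixes = {hash_prefix("malware.example/bad.html")}

# Only a prefix match would lead the browser to contact the server, and
# what it sends then is the short prefix, not the address itself.
print(needs_server_check("malware.example/bad.html", local_prefixes))
print(hash_prefix("malware.example/bad.html").hex())
```

  Because a hash cannot simply be reversed, the server does not see the address in readable form. A short prefix is still a piece of information about the page, though, so this reduces what is disclosed rather than eliminating it.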

 However, the Safe Browsing API also stores a cookie on the user's computer which the NSA (US National Security Agency) uses to identify individual computers. This is a mandatory requirement that many users feel is acceptable as it helps them feel safe.

  In addition, Google stores another cookie on the user's computer that can be used to identify the IP addresses the user visits, ie it can be used to track him or her.

  Google's excuse is that the tracking cookie logs this data in order to prevent DDoS (distributed denial-of-service) attacks. That may be so.

  The API in the user's browser (eg, Chrome) will 'phone home' every few hours to check for updates to its list of malicious sites. At the same time it sends a payload that includes the machine's ID and the user's ID.

Should you turn off Google's Safe Browsing service?

  Even if you trust Google not to use your information without your permission or for some nefarious purpose, there is a potential risk that it can be picked up by a malicious third party while it is being sent across the internet from your browser to Google.

  The only way to prevent this is to disable the Safe Browsing feature in your browser, which is on by default.

 This is a real bummer as you would be turning off a great service.

 But that's what you've got to do if you don't want to be tracked.

  When making up your mind about whether to turn off Safe Browsing or not, you should bear in mind that even if the information that is tracked is not hacked, it is available to be accessed under a court warrant or at the request of the US NSA.

 The good news from Google is that it retains the data for only two weeks and then deletes it.

 Not so, say some researchers, who believe that after two weeks the data is anonymized, ie names and other identifying features are removed, and stored in aggregate form.

 If this is true then having just the user's IP address, the cookie and timestamp would be enough information to decloak someone for something they may have done years before.


