
Long before AMP, Google began prefixing search result URLs with "google.tld?url=" and appending its own parameters such as "sa=", "ved=", etc.

Unless I am mistaken, this parasitic cruft only serves Google, not end users.

Below is a quick and dirty program to filter out the above. Replace .com with .cctld as needed.

Requirements: cc, lex

Usage:

   curl -o 1.htm 'https://www.google.com/search?q=xyz'
   yyg < 1.htm > 2.htm
   your-ad-supported-web-browser 2.htm

To compile this I use something like

   flex -Crfa -8 -i g.l;
   cc -Wall -pipe lex.yy.c -static -o yyg;

Save the text below as file g.l (with the leading indentation removed, since lex rules must start in the first column), then compile as above.

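   %%
   [^\12\40-\176]    { /* drop every byte except newline and printable ASCII */ }
   \/url[?]q=        { /* drop the /url?q= redirect prefix */ }
   "http://www.google.com/gwt\/x?hl=en&amp;u="    { /* drop the gwt proxy prefix */ }
   "&amp;"[^\"]*     { /* drop &amp; and everything up to the closing quote */ }
   %%
   int main(void){ yylex(); return 0; }
   int yywrap(void){ return 1; }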
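
For anyone who would rather see the idea in plain C, here is a rough sketch of what the prefix rule and the "&amp;" rule do to a single href value. It is only an illustration, not the program above: clean_href is a made-up helper name, it handles only a leading "/url?q=", and it does not percent-decode the target.

```c
#include <stddef.h>
#include <string.h>

/*
 * Clean one href value roughly the way the lex rules do:
 *   1. drop a leading "/url?q=" redirect prefix, if present;
 *   2. cut everything from the first "&amp;" onward (sa=, ved=, ...).
 * out has capacity outsz and is always NUL-terminated.
 * clean_href is a hypothetical name, not part of the original program.
 */
void clean_href(const char *href, char *out, size_t outsz)
{
    static const char prefix[] = "/url?q=";
    const size_t plen = sizeof prefix - 1;
    const char *amp;
    size_t len;

    if (outsz == 0)
        return;
    if (strncmp(href, prefix, plen) == 0)
        href += plen;

    amp = strstr(href, "&amp;");
    len = amp ? (size_t)(amp - href) : strlen(href);
    if (len >= outsz)
        len = outsz - 1;          /* truncate rather than overflow */

    memcpy(out, href, len);
    out[len] = '\0';
}
```

Passing "/url?q=https://example.org/page&amp;sa=U&amp;ved=abc" should leave "https://example.org/page" in out, and a link without the prefix or the parameters passes through unchanged.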

As for AMP, I read that it needs to use iframes (and JavaScript). Yikes. We can easily write a program to strip out iframe targets as well as links to JavaScript.
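
A minimal sketch of such a filter in plain C, under some loud assumptions: tag names are lowercase, attribute values are double-quoted, "links to JavaScript" means the src of a script tag, and strip_src is a hypothetical name. A real page would need more care (single quotes, uppercase tags, inline scripts).

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the text at p opens an <iframe or <script tag. */
static int is_target_tag(const char *p)
{
    return strncmp(p, "<iframe", 7) == 0 || strncmp(p, "<script", 7) == 0;
}

/*
 * Copy html into out (capacity outsz, always NUL-terminated), dropping any
 * src="..." attribute found inside an <iframe> or <script> opening tag.
 * Assumes lowercase tag names and double-quoted attribute values.
 */
void strip_src(const char *html, char *out, size_t outsz)
{
    size_t n = 0;
    int in_tag = 0;              /* inside an <iframe ...> or <script ...> tag */

    if (outsz == 0)
        return;
    while (*html != '\0' && n + 1 < outsz) {
        if (*html == '<')
            in_tag = is_target_tag(html);
        else if (*html == '>')
            in_tag = 0;

        if (in_tag && strncmp(html, " src=\"", 6) == 0) {
            const char *end = strchr(html + 6, '"');
            if (end != NULL) {   /* skip the whole attribute */
                html = end + 1;
                continue;
            }
        }
        out[n++] = *html++;
    }
    out[n] = '\0';
}
```

Feeding it `<iframe src="https://ads.example/x" width="1"></iframe>` should yield `<iframe width="1"></iframe>`, while other tags such as `<img src="a.png">` pass through untouched.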

amphtml does look great in a text-only browser that does not load iframes automatically.



It's really annoying trying to copy and paste URLs from Google results. It also seems largely unnecessary: can't they detect clicks using JavaScript? I have noticed they have started doing this with links sent through Google Hangouts messages as well. I do remember a time when they weren't doing this, and it was very refreshing because everyone else was.



