The Robservatory

Robservations on everything…

 

Remove tracking data from copied URLs

A while back, my friend James and I were discussing the amount of tracking cruft in many URLs. In my case, I subscribe to a ton of email newsletters, and I noticed that those URLs are just laden with tracking information—and most go through a URL processor, so you don't really see those tracking details until you've clicked the link, at which point it's too late to avoid any tracking.

I wanted a way to clean up these URLs such that the least-possible tracking information was sent to a server—and in particular, to prevent any browser cookie creation. In addition, if I want to share a link with friends, I don't want to send them a crufty tracker-laden link—I wanted a nice clean shareable URL.

Note: I wrote all of this before I knew about Jeff Johnson's Link Unshortener, which does all of this (and more) in a "real" app. If you'd like the easy solution, Jeff's app is the way to go. Mine is definitely a do-it-yourself concoction that's not for the faint of heart.

tl;dr version: Install this macro group (v8.1) in Keyboard Maestro to remove tracking details from copied URLs in a set of defined apps.

Latest Update (Dec 5 2021): I made a fundamental change in when copied links are opened in the browser: they now only open if you hold the Option key down while copying the link. I also made some changes that should make future updates easier for those who customize this macro. Please see the release notes section for more details about these changes, as well as notes on prior updates.

See the update notes section for details on how to update the macro if you've already installed a prior version.

As an example, here's a URL from a recent Monoprice email newsletter. (Note: I changed a few of the characters so these URLs will not work; they're for demonstration purposes only.) Any tracking data is completely disguised in the source URL; all you see is a jumble of letters and numbers at the end of the URL:

http://enews.emails.monoprice.com/q/LLMxJBmP1xo0XEpHXEE5PmLbqHbNxpVsuheZcOJcm9iZ0Bnc3ltZnN3ZWIuY29tw4gZyWuu1KtfpSL58gax3zGue2fhqQ

When loaded in the browser, all of that resolves to a completely different URL:

https://www.monoprice.com/product?p_id=11297&trk_msg=8OLJJ72Q87O4PCIC9HC71NL7VS&trk_contact=T3KPNYKTNJTD6NU53HNJCDD7T4&trk_sid=FGHVB9JP6L3MVLERLES7DLEFEG&trk_link=9HBDB7KBCAEKT949P8P8JQG5J4&cl=res&utm_source=email&utm_medium=email&utm_term=View+product+recommended+for+you&utm_campaign=210902_thursday

Fairly obviously, everything after p_id=11297 is simply tracking information, so the actual URL is just:

https://www.monoprice.com/product?p_id=11297
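To make the idea concrete, here's a minimal Python sketch of that cleanup. It is not the Keyboard Maestro macro itself, just an illustration. The function name strip_tracking and the lists of tracking prefixes and names are my own assumptions, drawn only from the example URL above; treating cl as tracking is a guess based on it sitting among the other tracking parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: parameter prefixes and names treated as tracking, taken only
# from the Monoprice example URL. Other sites will need other rules.
TRACKING_PREFIXES = ("trk_", "utm_")
TRACKING_NAMES = {"cl"}


def strip_tracking(url: str) -> str:
    """Return url with tracking query parameters (and any fragment) removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not name.startswith(TRACKING_PREFIXES) and name not in TRACKING_NAMES
    ]
    # Rebuild the URL from the surviving parameters; the fragment is dropped.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

Fed the long Monoprice URL above, this keeps only the p_id parameter. A filter like this is only as good as its list of tracking names, which is one reason some hosts need their own rules.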

Ideally, that's the URL I'd like to load and/or share with friends, not the tracking-laden version. One weekend, I decided to do something about it, and sat down with Keyboard Maestro to work on a solution, thinking it wouldn't be overly complicated...

Months later, I think it's finally reached a point where it's shareable, and the 24 macros in my URL Decrufter group show that it was, indeed, overly complicated. But it's done now, and working quite well.[1]

[1] To extract the destination URL from a disguised source URL, the source URL has to go to the server, which means tracking information is sent. However, my macro uses curl to do that, so it's not happening within a browser where cookies could then be set.
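For the curious, here's a rough Python sketch of how a curl call can resolve a disguised link to its final destination. The exact curl options used in the macro aren't shown in this post, so the flags below are an assumption: they are standard curl options that follow redirects and print the last URL reached. The function names are hypothetical.

```python
import subprocess


def build_curl_command(url: str, timeout_s: int = 10) -> list[str]:
    """Build a curl command that follows redirects and prints the final URL."""
    return [
        "curl",
        "--silent",                          # no progress meter
        "--location",                        # follow redirects
        "--max-time", str(timeout_s),        # give up after timeout_s seconds
        "--output", "/dev/null",             # discard the page body
        "--write-out", "%{url_effective}",   # print the last URL fetched
        url,
    ]


def resolve_final_url(url: str) -> str:
    """Run curl and return the destination URL (needs curl and a network)."""
    result = subprocess.run(
        build_curl_command(url), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

No cookie jar option (-b or -c) is passed, so curl won't store cookies from these requests. The server still sees the request, as noted above, and a link that redirects through HTML rather than HTTP headers won't resolve this way.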

Here's how it looks in action…

The features and limitations of my macro are as follows…

  • Obviously, you'll have to have Keyboard Maestro to use the macro.
  • The macro only works on copied links—clicked links aren't modified. Why not? Because capturing mouse clicks and analyzing the clicked item would require "real" programming, instead of a relatively straightforward macro that acts on the clipboard contents. I actually prefer it this way, because I can use a simple click if I don't mind delivering the tracking information, or a right-click and Copy Link when I don't want to share.
  • The macro only works in apps that I define. If the app isn't part of the defined set, then no processing occurs. This is easily changed by modifying the listed apps in the URL Decrufter macro group.
  • The macro only works on a defined list of hosts—it doesn't attempt to capture and interpret every link I copy. This was a design decision, as I didn't want the overhead of processing every copied link, and I have a limited number of known domains that send me tracking-laden URLs.
  • It can take anywhere from a few tenths of a second to a couple of seconds to process a copied URL. Nearly all of this time is spent in the curl step, which resolves the disguised source URL into the actual destination URL, which is then stripped of its tracking details.

First things first, here's the macro group (v8.1). It's saved in a disabled state, so it won't be active on load. I suggest you start by opening the _The Decrufter macro, and looking at the comment at the top of the macro. It explains basic usage, how to add domains, and how to (generally speaking) create custom filters for domains that don't match the most common form (where all tracking info follows a "?" at the end of the real URL).

Because of the complexity of the macro, I'm not going to go through it here, as I usually do with simpler macros. But here's the basic process flow:

  1. When you copy something to the clipboard in a monitored app, the macro activates.
  2. If the item on the clipboard is an image, non-URL text, or a URL that's already been filtered, the macro quits.
  3. The URL on the clipboard is compared to a list of hosts (the part between www and com). If there's a match, the macro will run. Otherwise, it quits.
  4. A curl command is used to resolve the destination URL from the copied URL, except for a subset of hosts that don't require curl.
  5. The destination URL is run through a filter (the generic one, or one specific to that domain) to decruft it.
  6. The cleaned URL is pasted back to the clipboard, and opened in the browser in the background (if you hold the Option key down while choosing Copy Link).
  7. Both the crufty and cleaned URLs are added to a history file, variables are erased, and the macro quits.
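The flow above can be sketched in Python as follows. This is an illustration under my own assumptions, not a translation of the macro: every name here (decruft, matching_host, generic_filter, the shape of the history list) is hypothetical, the resolver is passed in so that curl can be swapped for a stub, and step 6 (clipboard and browser handling) is left out.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple
from urllib.parse import urlsplit

History = List[Tuple[str, str]]  # (crufty URL, clean URL) pairs


def generic_filter(url: str) -> str:
    """Generic case: treat everything after the first '?' as tracking."""
    return url.split("?", 1)[0]


def matching_host(url: str, hosts: Sequence[str]) -> Optional[str]:
    """Return the first known host that appears as a label in the hostname."""
    labels = (urlsplit(url).hostname or "").lower().split(".")
    return next((h for h in hosts if h in labels), None)


def decruft(
    clipboard: str,
    hosts: Sequence[str],
    resolve: Callable[[str], str],
    filters: Dict[str, Callable[[str], str]],
    history: History,
    no_curl_hosts: Sequence[str] = (),
) -> Optional[str]:
    """Return the cleaned URL, or None when the macro would simply quit."""
    # Step 2: ignore non-URL text and URLs that were already cleaned.
    if not clipboard.startswith(("http://", "https://")):
        return None
    if any(clipboard == clean for _, clean in history):
        return None
    # Step 3: only act on known hosts.
    host = matching_host(clipboard, hosts)
    if host is None:
        return None
    # Step 4: resolve the destination unless this host doesn't need it.
    destination = clipboard if host in no_curl_hosts else resolve(clipboard)
    # Step 5: host-specific filter if one exists, otherwise the generic one.
    clean = filters.get(host, generic_filter)(destination)
    # Step 7: remember both versions.
    history.append((clipboard, clean))
    return clean
```

Note that the generic filter in this sketch would also strip a meaningful parameter such as p_id from the Monoprice example, which is the kind of case where a host-specific filter is needed.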

In developing this macro, I learned a lot about Keyboard Maestro's powers (always more than I think they are) and regular expressions (so much power and so much complexity). It's probably of interest to very few people, but now that it's working, I really like the functionality and take it for granted that I can right-click and copy a link to have it open a clean version in my browser.

Note: This macro will run fine as is, but if you want to customize it, you should be comfortable working with Keyboard Maestro's macros. (If you're going to write your own URL filters for the macro, you'll need a good understanding of regular expressions, too.) If you have questions, or want some hosts added to the macro, feel free to ask me here and I'll see what I can do.

 

How to update the macro

Note: Please make sure you read the help included in the main macro (_The Decrufter). It explains how to change which apps activate The Decrufter, and other aspects of working with the macro.

If you haven't customized the macro at all, the easiest way to update is to just download and install the new one, test it by deactivating the old one and activating the new one, then delete the old one once you're confident the new one works.

If you have customized the macro, then it's going to take a bit more work. I'll try to note in the release notes section which file(s) changed in each update. If you see a note that indicates which file(s) have changed, you can import the full macro, then just compare the modified file with your version to port over any changes. But because this is a hobby and not my job, I'm pretty bad about keeping track of individual file changes within the macro, so these notes typically won't be there (there aren't any as of yet!).

One-time process as of Dec 5 2021 update:

I made a basic change in the a01, a02, and a03 macros: They should now be much easier to customize, after you do some one-time work with this update.

a01 and a02 macros: There's now a separate text box for the hosts you add. If you've already customized this list, compare the list in the Dec 5 updated macro with your list, and move your custom entries to the new step (and enable that step). Going forward, you'll just have to copy that step to the newly-updated macro in a01 and a02, and your custom host lists will be ready to go.

a03 macro: Similar to a01 and a02, there's a new box where you can add hosts that require custom URL processing. This isn't a text box, though, it's a case statement. You'll need to add each host as a new entry to the case statement, as seen in the first case statement in this macro (for the "factory provided" exception processing). Keep in mind that you'll also need to create a "dn-" macro for each such host, containing the custom regex you're using to extract the URL.

For the Dec 5 update, you'll want to remove any such rules you added from the first case statement, and put them into the second case statement. Then, for future updates, just copy your modified case statement into the newly-updated macro.

For any other modifications you make outside of the a01, a02, and a03 macros, the best thing to do is note those customizations in a file. Then you can compare your modified versions with the updated macro to see if I changed anything else. (This should be a very rare occurrence, at least in terms of user customization, as most of the rest of the macro is just processing steps.)

I understand this is a bit of a pain, but I'm not aware of any better way to update a macro that's been customized.

 

Release Notes

Jan 9 2022: Release notes can now be viewed in the macro in 8.1, so I won't be updating them here any longer.

Dec 5 2021: This update features the biggest fundamental change in the macro's behavior since the first version: Copied links are no longer opened unless the Option key is held down when you select Copy Link in the contextual menu.

I made this change because of a situation that I didn't think would happen much, but actually happens a lot in my use of the macro and Keyboard Maestro. One of Keyboard Maestro's actions lets you insert text by pasting, and I use this feature in many of my macros—it's much quicker than the alternative command, Insert Text by Typing.

But Insert Text by Pasting works by adding the text to be pasted to the clipboard, pasting it, then deleting that entry from the clipboard. And the decrufting macro is triggered by a clipboard change, so here's what was happening:

  1. I decruft and open a URL from my email. The clipboard now contains the clean URL as the most recent item.
  2. I execute a macro that uses Insert Text by Pasting. The clipboard first gains text, which isn't usually a URL, so the decrufter ignores it.
  3. When the macro deletes the clipboard entry, however, the clipboard contents have changed, which triggers the decrufter. And what's at the top of the clipboard? A cleaned URL from when it first ran.

The macro is now intelligent about this, and knows it's a cleaned URL, so it doesn't reprocess it. But it does open all URLs that are in its list of handled domains by default, so it opens a browser window to the cleaned URL. This could be very jarring, especially when I would use an Insert Text by Pasting macro while on our customer support ticket web site, for instance. I'd be working on a reply to a customer, use a macro to insert some common text, and wind up looking at some other web page. Yikes!

As a result, I modified the s04 - Copy and open final URL macro so that it will only open links when the Option key is held down. That macro now contains a comment (colored yellow for easy visibility) that explains three different ways to handle copied URLs, and the (simple) changes needed to implement each method. Links can always be opened (old behavior, with the above issue), opened only when a modifier key is down (the current behavior), or never opened, just copied to the clipboard.
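The three behaviors boil down to a small decision, sketched here in Python. The names OpenMode and should_open are hypothetical; in the macro itself, the choice is made by editing the s04 macro as its comment describes.

```python
from enum import Enum


class OpenMode(Enum):
    ALWAYS = "always"                # old behavior
    WITH_MODIFIER = "with_modifier"  # current behavior (Option key)
    NEVER = "never"                  # only copy to the clipboard


def should_open(mode: OpenMode, modifier_down: bool) -> bool:
    """Decide whether a cleaned URL should also be opened in the browser."""
    if mode is OpenMode.ALWAYS:
        return True
    if mode is OpenMode.WITH_MODIFIER:
        return modifier_down
    return False
```

With the modifier-based mode, a clipboard change caused by another macro doesn't open anything, because nobody is holding the Option key at that moment.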

But don't copy just that macro—I also made small changes in a number of other files, and most importantly, basic changes in the a01, a02, and a03 macros: After some one-time work with this update, these changes will make it much easier to keep customized versions of this macro current when future updates are released. See the update notes section for more details on these changes.


Nov 29 2021: There was a bit of a logic bomb in the handling of already-decrufted URLs (or completely clean URLs) that has been resolved. The macro now behaves properly if you copy a clean URL for a domain that would normally be decrufted, or if you copy a crufty URL that has a cleaned version available in your history.


Nov 17 2021: I completely rewrote the history routines—now both the original crufty URL and clean URL are saved. If you copy either the crufty URL or clean URL again, the macro won't re-run curl, it will just look up the clean version and open that. I also added code for flyingmag.com, which now uses an HTML redirect (ugh!), requiring another step to clean.
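As an illustration of that lookup, here's a minimal Python sketch of a history that maps crufty URLs to clean ones and answers for either form. The class name, the in-memory storage, and the default limit of 50 entries are assumptions made for the example; the macro keeps its history in its own file and variables.

```python
from collections import OrderedDict
from typing import Optional


class UrlHistory:
    """Bounded history of (crufty URL -> clean URL) pairs, oldest dropped first."""

    def __init__(self, max_entries: int = 50) -> None:
        self.max_entries = max_entries
        self._pairs: "OrderedDict[str, str]" = OrderedDict()

    def add(self, crufty: str, clean: str) -> None:
        self._pairs[crufty] = clean
        self._pairs.move_to_end(crufty)  # mark as most recent
        while len(self._pairs) > self.max_entries:
            self._pairs.popitem(last=False)  # drop the oldest entry

    def lookup(self, url: str) -> Optional[str]:
        """Return the clean URL for a known crufty or clean URL, else None."""
        if url in self._pairs:
            return self._pairs[url]
        if url in self._pairs.values():
            return url
        return None
```

A hit in lookup means the curl step can be skipped entirely, which is where nearly all of the processing time goes.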


Nov 1 2021: Added t.co links, as seen on twitter.com (and probably other Twitter clients) to the decrufter.


Oct 27 2021: Initial release.

13 Comments

  1. Anyway to avoid popup for youtube which has a ? mark as part of the basic URL. Not sure what a URL with a referral looks like. Or maybe it's there and I'm missing it

    1. I do have one for (1), but it's basically too simple to post, so here's how to make it. It's a single if-then macro:

      If condition: The Decrufter is enabled
      Then condition: Disable macro The Decrufter
      Else condition: Enable macro The Decrufter

      Basically toggles it from whatever state it's in to the other state. (Mine also displays a notification so I know which state it's presently in.)

      For (2), do you mean the list of URLs it's decrufted? By default, it only keeps the last 50, and you can see them at any time by opening KM's prefs, selecting the Variables tab, and browsing the d_p_trimmedURLs variable. (Better: Copy-paste it to a larger text editor window.)

      Someday, I may do more with the history data, but not sure when.

      -rob.


The Robservatory © 2022