I know this sounds like the most boring blog title ever, but I’m sure many analysts have had this problem before and wanted to find the answer to this annoying question. So here goes…
Don’t you hate it when you have to analyse a list of URLs but the data you have received has dynamic URLs for the same page which you just want to treat as one page. Simple, just used the Excel remove duplicates function right? Wrong. The strings are unique so Excel won’t trim them down for you.
Lets say you want to remove all of the characters after the ‘?’ there is actually a way you can do this in Excel which makes use of the ‘LEFT’ function. Below is the code that you copy into the cell next to the cell you want to shorten the string. What this function does is search through the string for the ‘?’ character and the deletes everything past this point.
=LEFT(A1,FIND(“?”,A1,1)-1)
Now you have a beautiful list of URL’s ,many of which will now be the same. You can now use the trusty remove duplicates function and you now have your shortened list of URLs that you need to analyse.