
Osint Me Tricky Thursday #8 – URL manipulation


This week’s focus will be on reviving a somewhat forgotten and neglected section of the blog – the Osint Me Tricky Thursday.

And without further ado, I want to get right into it, sharing some tips and tricks on how to use URL manipulation for OSINT.

1. Understanding the basics of URLs

Even if sometimes we are unable to recall the exact meaning of the acronym URL (Uniform Resource Locator), we all know what it is and what it does – it’s a human-readable link in your browser that allows you to access online resources residing on a specific IP address.

A URL can be general in nature, pointing to the landing page of a website (for example osintme.com), or it can point to a more specific resource by including the path to a file on that website, such as a PDF form or a text file, for instance:

https://www.osintme.com/wp-content/uploads/2021/05/CySA-002-Notes.pdf
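To make the parts of that URL explicit, here is a minimal sketch using Python's standard urllib.parse module. The variable names are only illustrative; the URL is the one shown above.

```python
from urllib.parse import urlsplit

url = "https://www.osintme.com/wp-content/uploads/2021/05/CySA-002-Notes.pdf"
parts = urlsplit(url)

print(parts.scheme)    # protocol used to fetch the resource
print(parts.hostname)  # host name that DNS maps to an IP address
print(parts.path)      # path to the specific file on that host
```

Most of the tricks in this post come down to changing one of these parts while leaving the others alone.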

A thing to remember – many domains have hidden URLs that regular users rarely access or search for. Some of those URLs are part of the deep web, meaning that they are not indexed by search engines – but they can sometimes be reached by discovering and visiting the exact URL.

2. Subdomain enumeration

A domain name as it appears in a URL is made up of the following components:

  1. Top level domain – whatever follows the last dot in the host name. Common top level domain examples are: .com, .org, .gov, .net, .uk, .ie…
  2. Second level domain – the label directly before the top level domain. So, the second level domain of this blog is osintme and the top level domain is .com.
  3. Subdomain – whatever is positioned before the second level domain. It can be anything really, for example: aws.amazon.com – the aws part is the subdomain here.
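The three components listed above can be pulled apart with a few lines of Python. This is a deliberately naive sketch: the function name split_hostname is made up for this post, and it assumes a single-label top level domain. Names under multi-part suffixes such as .co.uk need a lookup against the Public Suffix List, which this sketch does not do.

```python
def split_hostname(hostname: str) -> dict:
    """Naively split a host name into subdomain, second level and top level parts."""
    labels = hostname.lower().rstrip(".").split(".")
    if len(labels) < 2:
        raise ValueError("expected at least a second level and a top level domain")
    return {
        "subdomain": ".".join(labels[:-2]),  # empty string when there is none
        "second_level": labels[-2],
        "top_level": labels[-1],
    }


print(split_hostname("aws.amazon.com"))
```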

Subdomain enumeration is used to identify and expose subdomains that are infrequently used or that are not meant to be accessed by regular users. This can be done manually by simply adding a common word before the second level domain:

  • blog.example.com
  • news.example.com
  • mail.example.com
  • store.example.com

However, manual subdomain enumeration does not scale: it is laborious and time consuming, and it won’t work at all if the subdomains have uncommon names.
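The manual guessing described above can be scripted before reaching for a dedicated tool. The sketch below is an assumption-heavy illustration, not how any particular tool works: the wordlist and the function names are invented for this example, and a name counts as "found" simply if the system resolver returns an address for it. Domains with wildcard DNS will make every guess look valid.

```python
import socket

# Illustrative wordlist taken from the examples above; real lists are far longer.
COMMON_WORDS = ["blog", "news", "mail", "store"]


def candidate_hosts(domain, words=COMMON_WORDS):
    """Build candidate subdomains by prefixing each word to the domain."""
    return [f"{word}.{domain}" for word in words]


def resolves(hostname):
    """Return True if the system resolver finds an address for hostname."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def find_subdomains(domain, words=COMMON_WORDS):
    """Return the candidates that currently resolve (performs DNS lookups)."""
    return [host for host in candidate_hosts(domain, words) if resolves(host)]
```

Calling find_subdomains("example.com") performs one lookup per word, so it is only practical for short lists.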

You can automate subdomain enumeration using a tool like Sublist3r…

https://github.com/aboul3la/Sublist3r

… or theHarvester:

https://github.com/laramies/theHarvester

Alternatively, you can use web tools such as Spyse’s Subdomain Finder:

https://spyse.com/tools/subdomain-finder

3. Connecting directly through an IP address

The URL bar is not limited to working only with human-readable input like domain names.

DNS (Domain Name System) lets you reach a website through a human friendly name such as www.google.com, but this is not the only way to connect to it. Every resolvable domain will have what’s known as an A record (Address Record), which points to an IPv4 address.
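To see which address a name points to, a short sketch with Python's socket module is enough. Note the assumptions: the function name a_records is made up here, and the lookup goes through the operating system's resolver (which also consults the local hosts file), so it approximates an A record query rather than performing one directly.

```python
import socket


def a_records(hostname):
    """Return the sorted, de-duplicated IPv4 addresses that hostname resolves to."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})
```

Calling a_records("www.google.com") returns a list of IPv4 addresses; the exact values vary by location and over time.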

I can illustrate this with an example of a Hack The Box machine called Delivery that was accessible under this URL (it won’t resolve now, unless an HTB user activates it – these machines only stay online for 24h):

http://helpdesk.delivery.htb

Part of the task of compromising this machine involved connecting to it through its IP address, which can be further enhanced by adding a port number:

http://10.129.229.49:22

http://10.129.229.49:80

Sometimes you can glean extra information by connecting to an online resource on different ports, provided those ports are open.

NOTE: This will not always work – it depends on details of a security configuration. Also, services like Cloudflare or hosting providers are likely to block this method of connection.
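Whether a given port accepts connections at all can be checked with a plain TCP connection attempt. This is a minimal sketch with a made-up function name, port_open, and it should only be pointed at hosts you are allowed to test, such as a lab machine like the one above.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable or timed out
        return False
```

For example, port_open("10.129.229.49", 80) would report whether the web port on the machine above is reachable, assuming that machine is currently running.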

4. See anything with a number? Enumerate!

This technique can be used wherever you find a URL that implies the existence of sequentially ordered resources. For instance, take LinkedIn groups:

https://www.linkedin.com/groups/113/

At the end of that URL there is a number that you can manipulate: increase or decrease the numeric value step by step and look through the resources that come up.

This can work particularly well for photo galleries, file directories, usernames and so on.
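Generating the neighbouring URLs by hand gets tedious, so here is a small sketch that does it. The function name neighbouring_urls and the span parameter are invented for this example; it changes only the last number in the URL and does not preserve leading zeros.

```python
import re


def neighbouring_urls(url, span=2):
    """Return URLs whose last number is within span of the original value."""
    matches = list(re.finditer(r"\d+", url))
    if not matches:
        raise ValueError("no number found in URL")
    last = matches[-1]
    current = int(last.group())
    results = []
    for value in range(max(current - span, 0), current + span + 1):
        if value != current:
            results.append(url[:last.start()] + str(value) + url[last.end():])
    return results


for candidate in neighbouring_urls("https://www.linkedin.com/groups/113/"):
    print(candidate)
```

The sketch only builds the URLs; whether each one leads anywhere still has to be checked by visiting it.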

5. Increase image resolution

Sometimes you might come across a link containing an image file in a lower resolution. Chances are that there is a higher resolution image out there too, but you just don’t know the exact URL for it.

In some cases the higher resolution image can be viewed by manipulating the size from the URL. Take a look at the image below:

https://ucarecdn.com//985d4f2c-973a-4ae6-a2b1-f992683da70b/-/resize/200x/

Now – try changing the 200x part of that URL to 2200x by adding a 2 in front…

The effectiveness of this method will depend on each website and each URL. Different services will have different parameters, located in different parts of the URL, like for example this image of pizza:

https://cdn.shopify.com/s/files/1/1405/0664/products/4791207-9790062099-Pizza1_250x250_crop_center@2x.progressive.jpg?v=1469649640

Notice the 250x250 value (pixels in this case) and try changing it to something else.

For instance:

https://cdn.shopify.com/s/files/1/1405/0664/products/4791207-9790062099-Pizza1_1250x1250_crop_center@2x.progressive.jpg?v=1469649640

Other services might allow you to change the size parameters by substituting the word “small” with “large” in the URL, and so on.
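The size swap shown with the pizza image can be expressed as a one-line substitution. This sketch assumes the size appears as a WIDTHxHEIGHT pair somewhere in the URL and replaces only the first such pair; the function name resize_in_url is made up, and other services will need a different pattern.

```python
import re


def resize_in_url(url, width, height):
    """Replace the first WIDTHxHEIGHT pair in url with the requested size."""
    return re.sub(r"\d+x\d+", f"{width}x{height}", url, count=1)


small = (
    "https://cdn.shopify.com/s/files/1/1405/0664/products/"
    "4791207-9790062099-Pizza1_250x250_crop_center@2x.progressive.jpg?v=1469649640"
)
print(resize_in_url(small, 1250, 1250))
```

Whether the server actually returns a larger image for the new URL depends entirely on the service.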

6. Add something at the end of a URL

Many websites contain files that are not being indexed by search engines – for example, robots.txt.

As per this Google explainer, a robots.txt file is used to manage crawler traffic to a website, and usually to keep a file off Google, depending on the file type.

This file will not usually reveal any sensitive data by itself, but it might point a user to resources currently in development – or to whatever the website’s owner does not want the wider public to see.

You can try this method by visiting random websites and adding this value – /robots.txt – (as well as trying some other options) to the URL.

Example:

https://www.rte.ie/robots.txt
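Once you have a robots.txt file in front of you, the interesting lines are usually the Disallow entries. The sketch below builds the robots.txt address for any page on a site and extracts those entries from the file's text. The function names and the sample content are made up for illustration; the sample is not the real file from any website.

```python
from urllib.parse import urljoin


def robots_url(page_url):
    """Return the robots.txt address for the site that page_url belongs to."""
    return urljoin(page_url, "/robots.txt")


def disallowed_paths(robots_text):
    """Collect the non-empty Disallow values from the text of a robots.txt file."""
    paths = []
    for line in robots_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical file content, used only to demonstrate the parsing.
sample = """\
User-agent: *
Disallow: /staging/
Disallow: /admin/   # hypothetical path
Allow: /
"""

print(robots_url("https://www.rte.ie/news/"))
print(disallowed_paths(sample))
```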

7. Unshorten a URL

Link shortening services are legitimately used to condense very long and messy URLs into short and sweet, legible and human friendly links. Sadly, these resources are also often used by scammers and cyber criminals as a method of obscuring a URL that might otherwise appear suspicious to any prospective victim.

Luckily, there are several resources and tricks available to unshorten those links.

If it’s a Bitly shortened URL (you can tell by the Bitly name in the shortened URL), then you can unshorten it by simply adding a + sign at the end of it:

https://bit.ly/3F3vlKO

https://bitly.com/3F3vlKO+

This works for some other URL shortening services. For others, you can use one of the online resources that help disentangle shortened links.
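Both approaches can also be sketched in Python. The first function simply applies the + trick described above. The second follows the redirects itself and reports where they end; be aware that it does contact the destination server, so it is not a safe way to inspect a link you suspect is malicious. Both function names are made up for this example.

```python
import urllib.request


def bitly_preview_url(short_url):
    """Append '+' to a Bitly link, which asks Bitly for its preview page."""
    return short_url.rstrip("/") + "+"


def final_destination(url, timeout=10):
    """Follow HTTP redirects and return the URL they end at.

    This makes real requests, including to the destination site.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.url


print(bitly_preview_url("https://bit.ly/3F3vlKO"))
```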

8. Web parameter tampering [not strictly OSINT!!!]

The last part of these URL related tips and tricks is very much a grey area, on the border between OSINT, pentesting and exploiting vulnerabilities. The legitimate use case for these methods is pentesting of web applications.

OWASP classes these actions as web parameter tampering and identifies a number of attacks that can be perpetrated from a URL level against poorly written and not adequately secured web applications.

As per the OWASP page:

An attacker can tamper with URL parameters directly. For example, consider a web application that permits a user to select their profile from a combo box and debit the account:

http://www.attackbank.com/default.asp?profile=741&debit=1000

In this case, an attacker could tamper with the URL, using other values for profile and debit:

http://www.attackbank.com/default.asp?profile=852&debit=2000

Other parameters can be changed including attribute parameters. In the following example, it’s possible to tamper with the status variable and delete a page from the server:

http://www.attackbank.com/savepage.asp?nr=147&status=read

Modifying the status variable to delete the page:

http://www.attackbank.com/savepage.asp?nr=147&status=del
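For authorised testing, the kind of rewrite shown in the OWASP examples can be scripted rather than typed by hand. This is a minimal sketch with a made-up helper name, with_params; it only builds the modified URL and sends nothing. It also assumes each parameter appears once, since repeated keys would be collapsed.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_params(url, **changes):
    """Return url with the given query parameters replaced or added."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params.update(changes)
    return urlunsplit(parts._replace(query=urlencode(params)))


original = "http://www.attackbank.com/default.asp?profile=741&debit=1000"
print(with_params(original, profile="852", debit="2000"))
```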
