Today we take a step back towards the most basic OSINT resource – the Google search engine.
Google products are often labelled as privacy-unfriendly, equipped with overt and covert built-in features that track users’ online activities and physical movements.
That said, the Google search engine remains the best and most effective out there, so no OSINT practitioner can afford to disregard it.
Anybody can google – but the results will vary drastically.
With speed, accuracy and efficiency in mind, the objective is to refine, narrow down, isolate and prioritize your search results by using a correct combination of sources (websites) to query.
This is the true power of Google custom search engines (CSEs).
So let’s take a look at how they work and how to build them.
To create a custom search you will need a Google account.
Next, simply follow the Programmable Search Engine link and log in to your account.
Select New search engine and pick an appropriate name.
When setting up the engine, you can add the websites that will be searched against your query and also filter by language.
Here you can decide whether to search whole websites (for example, the whole of Reddit), just selected parts (say, individual Reddit threads), or specific subdomains of the main site, including ones you might wish to omit from your search.
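To make the three scopes concrete, here is a minimal sketch that builds the kind of URL patterns you would paste into the sites box. The `site_pattern` helper is my own illustration, not part of any Google tool, and the `*` wildcard syntax is an assumption based on how the control panel has accepted entries for me, so check the patterns against the panel before relying on them.

```python
def site_pattern(host, path="", include_subdomains=False):
    """Build a URL pattern for a CSE 'sites to search' entry.

    Hypothetical helper: the wildcard syntax is an assumption,
    verify it in the Programmable Search Engine control panel.
    """
    host = host.strip().lower()
    if include_subdomains:
        # Drop a leading "www." so that "*." can stand for any subdomain.
        if host.startswith("www."):
            host = host[len("www."):]
        host = "*." + host
    path = path.strip("/")
    if path:
        # Only pages below this path are searched.
        return f"{host}/{path}/*"
    # The whole site is searched.
    return f"{host}/*"


patterns = [
    site_pattern("www.reddit.com"),                        # whole site
    site_pattern("www.reddit.com", "r/OSINT"),             # one part of the site
    site_pattern("example.com", include_subdomains=True),  # every subdomain
]

for pattern in patterns:
    print(pattern)
```

Keeping the patterns in a small script like this makes it easier to rebuild or share an engine later, since the list of sources is the part that does the real work.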
If you are unsure about how domain addressing works, check out this post, which explains the structure of web domain addresses.
After you have created your custom search engine, you can modify it and change the parameters using the “Edit search engine” tab.
You can also choose to embed a CSE you created on a website (or link it up via traditional URL pasting / shortening methods outside of this panel).
One helpful option is “Refinements” – available under the “Search features” section, after you’ve chosen to edit the search engine.
This will allow you to limit search results to a specific website per refinement, and to display the results in separate tabs, one tab per refinement.
Refining search results by file formats and file extensions will allow you to build effective custom search engines for PDF documents, Excel spreadsheets, video files and whatever else you want to focus on.
By applying the method described above, you can filter your search by limiting results to files with a specific file extension.
This will require specifying the “Optional word(s)” value in a way that Google understands as filtering by file extension, for instance:
ext:pdf
ext:jpeg
ext:ppt
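If you maintain several file-type refinements, a short sketch like the one below can generate the “Optional word(s)” values for you. The refinement labels and the `optional_words` helper are hypothetical, and joining several extensions with `OR` is my assumption about how Google combines them, so test each value in the panel.

```python
def optional_words(extensions):
    """Turn a list of file extensions into an 'Optional word(s)' value.

    Hypothetical helper: combining terms with OR is an assumption,
    not something the CSE panel documents in this post.
    """
    terms = []
    for ext in extensions:
        cleaned = ext.strip().lower().lstrip(".")
        if cleaned:
            terms.append(f"ext:{cleaned}")
    return " OR ".join(terms)


# Example refinement labels mapped to the extensions they should match.
refinements = {
    "PDF": ["pdf"],
    "Images": [".jpeg", "png"],
    "Slides": ["ppt", "PPTX"],
}

values = {label: optional_words(exts) for label, exts in refinements.items()}

for label, value in values.items():
    print(f"{label}: {value}")
```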
The main advantages of building custom search engines with Google are accurate sources and a tightly limited set of results.
The trade-off is that your results will be limited to 10 pages, with each page displaying only 10 results – so you get a maximum of 100 hits per query.
This means you really have to define your queries well and avoid broad searches – for which you can always use the general Google search engine.
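The arithmetic behind that cap is easy to sketch. The snippet below lists the first result index of every page and, purely as an illustration, builds request URLs for Google's Custom Search JSON API, which this post does not cover. The endpoint and the `key`, `cx`, `q`, `num` and `start` parameters are taken from my reading of that API, while `YOUR_API_KEY` and `YOUR_ENGINE_ID` are placeholders, so treat the whole thing as an assumption to verify. No request is actually sent.

```python
from urllib.parse import urlencode

RESULTS_PER_PAGE = 10
MAX_PAGES = 10

# Assumed endpoint of the Custom Search JSON API (not part of this post).
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def page_offsets(per_page=RESULTS_PER_PAGE, max_pages=MAX_PAGES):
    """Return the 1-based index of the first result on each page."""
    return [1 + page * per_page for page in range(max_pages)]


def page_url(query, start, api_key="YOUR_API_KEY", engine_id="YOUR_ENGINE_ID"):
    """Build (but do not send) a request URL for one page of results."""
    params = {
        "key": api_key,   # placeholder credential
        "cx": engine_id,  # placeholder engine ID
        "q": query,
        "num": RESULTS_PER_PAGE,
        "start": start,
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


offsets = page_offsets()
max_hits = RESULTS_PER_PAGE * MAX_PAGES

print(offsets)
print(max_hits)
print(page_url("leaked credentials", offsets[-1]))
```

Ten pages of ten results leaves no room for paging past a sloppy query, which is why the tight source list and the file-type filters matter so much.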
So that’s it in a nutshell; there are some more granular options within the CSE interface that you can explore and tweak to make the results display better or be more relevant to your OSINT angle.
Or, if you are feeling lazy, you can use some of my own custom search engines that I share below…
Social media sites
Forums & chats
Reddit – global search of the entire Reddit platform.
Telegram – channels and content on Telegram. I recommend using the desktop app to navigate results.
Bitcoin Forums – 10+ digital currency discussion forums (including some subreddits).
People search
People Search Websites – a bunch of websites that gather personal data, search by name & surname.
Dating websites – most people don’t use real details on dating sites, so search for pseudonyms.
Corporate & business
Companies & organisations – corporate and business related info; focused on the English language.
Phone numbers
Truecaller – search for phone numbers & names on Truecaller (restricted, not great for EU-based users).
Files & content
GitHub – searches for code and open source software on GitHub (also works on usernames!).
Documents – searches for document files online, filters by extension type.
Photos & images – as above, but for graphical files in a variety of file formats.
Slideshare – looks for slide decks and presentations on Slideshare.
Google Drive – searches through publicly available content on people’s Google Drives.
Most wanted & sanctioned lists
FBI most wanted – fugitives, criminals, terrorists, etc. wanted by the FBI.
Interpol most wanted – as above, but compiled by Interpol.
Europol most wanted – an EU focused list for people on the run.
OFAC sanctioned – persons and organisations under US sanctions – from terror groups to rogue states.
Feel free to use my searches and tell me what works / doesn’t work!
Let me know if you would like to see any other custom search engines – reach out on Twitter.