There is a tremendous amount of information available online that most people never see.

I’m not talking about the dark web or the deep web. There’s a lot of useful stuff on the surface web that can still be accessed via search. You’ve just got to search better.

I’ve spent a fair bit of time over the past few years searching the Internet for specific and unique information on startups and high-growth businesses. Over time, businesses of any size that continue to exist start to leave a trail of digital detritus that can inform anything from journalism to competitive intelligence work.

Natural activities performed by a company as part of day-to-day operations can create far more detailed information than what is published on its website. Directors and employees will give talks that get filmed and hosted and their slides will be uploaded to event websites or LinkedIn’s SlideShare. Companies will win government tenders and details will become available due to disclosure requirements. Freedom of Information requests will be answered and responses will become available via company websites or sites like WhatDoTheyKnow.com.

Homepage of WhatDoTheyKnow

Two things that have helped me out most when searching for business information are knowing some key operators for advanced searches and being able to search for pages across time. Let me lay these out in a little more detail.

Searching deeper – Google search operators

The single best way to search better is to know some Google search operators. These are commands that can be used to search for specific types and formats of information or to search in more specific places. Here are my top four and how I use them.

1. site:

This operator restricts the search results to a specific domain. For example, site:ft.com would only return results from ft.com and its subdomains, such as www.ft.com. It is easily my favourite operator because it can save you so much time and turn up high-quality results.

Say that you wanted to know how a particular company was responding to the coronavirus crisis. You could search site:companydomain.com coronavirus COVID-19 in Google and just see relevant references from the company’s website. This saves you from having to search through the company’s news, blog and update pages. In some instances, it can even return information that is not accessible via the site’s normal navigation, such as pages to which there are no internal links and uploaded documents like pdfs.

The Google operator site: in action

2. “keyword”

Putting search terms or keywords inside quotes, “like this”, forces Google to try to find an exact match of that word or phrase. This is very useful when searching for business names that are also common words, like “Apple Inc”, or in any context where searching without the quotes returns generic results. “Apple Inc” tree planting is a better search than Apple inc tree planting if you want to learn about which trees are being planted on Apple’s new campus.

3. -keyword

Adding a minus symbol (-) to the start of a keyword forces Google to return results that do not have that word or phrase in them. This can be particularly powerful when combined with other operators. Say you wanted to read about Apple but not information that the company itself has produced. You could search for apple inc -site:apple.com. This search returns results about Apple the corporation but none from its own website.

I like to use this operator to prune results as I go along, which can be a great way to build specific searches. For example, if you are searching for a new tech company with the fictional name Tractor and you’re not sure what it does, you might get lots of results related to agriculture and machinery. After an initial unsuccessful search for tractor tech company, you may change the search to tractor technology company -agriculture -machinery -industrial to filter out the most irrelevant results.
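The pruning loop described above can be sketched in a few lines of Python. This is only an illustration: the helper name prune is my own invention, and all it does is assemble the query string that you would then paste into Google.

```python
def prune(query: str, *unwanted: str) -> str:
    """Return the query with a -term exclusion appended for each unwanted term."""
    exclusions = []
    for term in unwanted:
        term = term.strip()
        if not term:
            continue  # ignore empty strings rather than emitting a bare "-"
        # A multi-word phrase is quoted so the minus applies to the whole phrase.
        exclusions.append(f'-"{term}"' if " " in term else f"-{term}")
    return " ".join([query.strip(), *exclusions])


# Tighten the search one round at a time, as noisy results turn up.
query = "tractor technology company"
query = prune(query, "agriculture", "machinery")
query = prune(query, "industrial")
print(query)
```

Keeping each round of exclusions as a separate call mirrors how the search tends to evolve in practice: run it, spot the noise, exclude it, run it again.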

4. filetype:

This operator makes Google return only files of the type you specify. It requires that you know some file type extensions such as pdf, jpg, docx and xls. But mainly pdf. This is because pretty much all secrets worth knowing get saved in pdfs and uploaded to the Internet.

So imagine that you’d heard about the fictional acquisition of Tractor Ltd by Apple and you’re wondering whether any more details are available other than what is in the press release. You could run a search for “Tractor Ltd” AND “Apple” filetype:pdf. Google would seek to return any pdf documents that contain the exact phrases in the quotes, and sometimes these can be pretty juicy. You may find company accounts, the accounts of businesses invested in Tractor Ltd or updates issued to shareholders by Tractor Ltd, to name a few examples I have seen.

These are only four Google operators out of more than 40 but these do most of the heavy lifting for me. Using several operators together can drastically improve the results by focusing the search area and increasing the specificity of what is searched for. Google operators can be so powerful that hackers can use them to identify vulnerabilities in web applications—this practice is known as Google hacking or Google dorking.
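To make the idea of combining operators concrete, here is a minimal Python sketch that assembles the four operators above into one query string and a Google search link. The function names build_query and search_url and their parameters are hypothetical, chosen for illustration. The only thing assumed about Google is that its search page accepts the query in the q parameter.

```python
from urllib.parse import urlencode


def build_query(terms="", *, exact=(), site=None, exclude_sites=(),
                exclude=(), filetype=None):
    """Combine plain terms with the site:, "keyword", -keyword and filetype: operators."""
    parts = []
    if terms:
        parts.append(terms)
    parts += [f'"{phrase}"' for phrase in exact]          # exact-match phrases
    if site:
        parts.append(f"site:{site}")                      # restrict to one domain
    parts += [f"-site:{domain}" for domain in exclude_sites]
    parts += [f"-{word}" for word in exclude]             # drop unwanted keywords
    if filetype:
        parts.append(f"filetype:{filetype}")              # restrict to one file type
    return " ".join(parts)


def search_url(query):
    """Return a Google search link with the query percent-encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# The fictional Tractor Ltd acquisition search, limited to pdfs.
q = build_query(exact=["Tractor Ltd", "Apple"], filetype="pdf")
print(q)
print(search_url(q))

# Coverage of Apple that was not published by Apple itself.
print(build_query("apple inc", exclude_sites=["apple.com"]))
```

Building the string in one place makes it easy to keep a record of which combinations you have already tried during a long research session.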

Searching for things that don’t exist – Google caches and the Wayback Machine

I’ve heard people say that once something is on the Internet it’s on there forever; however, when it comes to using the Internet for researching businesses, this doesn’t always seem to be the case. Businesses fold and websites are erased. News articles are deleted, or sites are mothballed. The information available on the web is part of a living, writhing technomass that is always changing and will be different based on when and where you access it.

If you’re looking for something that seems to have vanished, there are a few options. Sometimes you may find a page that is being indexed by Google but doesn’t seem to be loading or no longer displays the information suggested by the summary. In the first instance, check to see whether Google has cached the page. When Google returns a list of results, there is a small down arrow at the end of the header—click this to see whether a cached version of the page is available.

Homepage of the Internet Archive’s Wayback Machine

Your other option comes courtesy of the Internet Archive which is a US-based not-for-profit that has archived 424bn webpages over the last 20 years. Using the Wayback Machine—the Internet Archive’s tool for searching its web archive—you can check to see whether the webpage you are searching for has been saved at some point in the past. This is a brilliant resource and can be useful for understanding how a business has changed its strategy over time; just look at how the content of its website has changed. If you make regular use of the Wayback Machine, consider donating to the Internet Archive.
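If you check a lot of pages, the lookup can be scripted. The Internet Archive offers a public availability endpoint at archive.org/wayback/available that returns the closest saved snapshot for a URL as JSON. The sketch below is a minimal example under that assumption. The function names are my own, and the sample payload is an illustrative stand-in shaped like the service's documented response, not real output for any company.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the request URL; timestamp is an optional YYYYMMDD-style string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return API + "?" + urlencode(params)


def closest_snapshot(payload):
    """Return the archived copy's URL from a decoded response, or None if absent."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(page_url, timestamp=None, timeout=10):
    """Query the live service (needs network access) and return the snapshot URL."""
    with urlopen(availability_url(page_url, timestamp), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))


# Illustrative payload only, so the parsing can be tried offline.
sample = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20130919044612/http://example.com/",
            "timestamp": "20130919044612",
            "status": "200",
        }
    }
}
print(availability_url("example.com", "20200101"))
print(closest_snapshot(sample))
```

Calling lookup("example.com", "20200101") would ask the live service for the snapshot nearest to 1 January 2020. Repeating the call with different timestamps is one way to step through how a company's site has changed.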

Searching better

Using Google operators well and knowing that it is possible to search across time have helped improve the quality of the information I find on the web. Given how the web changes, these strategies may make less sense in the future. But at the moment these work well for me, and I hope they work for you.