Search engines like Google and Bing cover the part of the Internet that’s referred to as the “Surface Web” or “Visible Web,” which is estimated to be only about 4% of what is available on the Internet. The rest is what’s referred to as the “Deep Web.” (The “Dark Web,” which is often confused with it, is only a small, deliberately hidden part of the Deep Web.)
What is the remaining 96% and why do you care?
If you exclude TOR (a popular closed network for those seeking anonymity), paid cloud-based databases like LexisNexis, web-based mail servers, illegal content, and criminal networks, the “deep web” consists of:
- Pages that require a login/password, like Facebook or local court sites, which are still searchable and not necessarily private.
- Websites that ask not to be indexed like company “extranets” or websites designed for a limited audience.
- Old, static webpages or webpages that other pages no longer link to.
- Small or obscure websites of limited size or viewership.
These types of “Deep Web” websites are easy to find and can be enormously fruitful in your research. Here are my top five tips, culled from a career as a private investigator. You’ll never look at Google the same way again:
1. Use Google to find the databases that have the information you want, rather than for your searches.
If you aren’t already harnessing the explosion of public records available through web access, you should be. In many counties in the U.S., you can access the actual court filings on the web, either for free or for a nominal fee.
blackbookonline.info is a favorite site to quickly find court websites across the country.
Alternatively, try searching by the name of the court rather than by the person or company you are doing research on. For example: “san francisco superior court” or “maricopa county superior court.” The same is true for all county-level public records like real estate ownership, planning department violations, recorded documents, etc.
2. Use social media databases for purposes beyond expanding your network.
Search engines miss most of the information you can access publicly through social media. Using the search function within websites like Facebook, LinkedIn, Tumblr, etc. can be fruitful for tasks like finding hard-to-find witnesses or getting better contact information.
Try searching within social media with the information you do have. Doing these searches does not violate the site’s usage agreement, and the results will be limited to what the person has allowed to be viewed in their privacy settings. For example, try searching for people on Facebook, LinkedIn, Twitter, and other social media by email address or phone number.
Another great use of social media is to find a certain type of person: former employees, former colleagues, or additional plaintiffs who would make good witnesses in your case.
You may be surprised to hear that most of these search functions allow broad, natural-language searches. In Facebook, try using search terms that describe the people you are looking for, like:
“People who used to work at Acme Metals” or
“People who work at Acme and live in Boise, Idaho”
In Tumblr, Instagram, and Twitter, you can search by hashtags (#), so try the hashtags that the people you seek would use.
3. Use the web to get better contact information.
Need to contact a witness to verify information on a website or find someone who was quoted in an old news article? Don’t stop at Google! There are other great resources available to you:
Every website’s domain name has to be registered with a contact person, under rules set by ICANN, the non-profit charged with managing domain names. It’s true you can use a third-party service to register privately, but most don’t. That contact information is held in a database called a “WhoIs” directory. You can search the WhoIs directory on sites like whois.icann.org or betterwhois.com.
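If you would rather run the lookup yourself than go through a website, here is a minimal sketch in Python. WhoIs is a plain-text service on TCP port 43, so one connection is enough. The function names whois_query and parse_fields are my own, and the default server (whois.iana.org, which mostly answers with a referral to the registry that holds the full record) is an assumption you may need to change. Replies differ from registry to registry, and some redact the contact details, so treat the parser as a starting point only.

```python
import socket


def whois_query(domain: str, server: str = "whois.iana.org", timeout: float = 10.0) -> str:
    """Send one WhoIs query (plain text over TCP port 43) and return the raw reply."""
    with socket.create_connection((server, 43), timeout=timeout) as conn:
        # The protocol is simply: send the name, end the line, read until the server hangs up.
        conn.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_fields(raw: str) -> dict[str, list[str]]:
    """Collect 'key: value' lines from a reply, skipping blanks and '%' comment lines."""
    fields: dict[str, list[str]] = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("%") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        if value:
            # A key such as 'contact' can appear more than once, so keep every value.
            fields.setdefault(key.strip().lower(), []).append(value)
    return fields
```

To try it, call parse_fields(whois_query("example.com")) and look for a "refer" entry; if there is one, repeat the query with that server name as the second argument to get the fuller record.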
Need to find someone in another country? Do you really want to call an international operator? (Do they still exist?) Most countries have some form of phone directory. infobel.com is a great compilation of almost every country’s white-pages directory. Infobel will take you directly to the relevant website so you can run the search there in its native form.
4. Use advanced Google to access old or outdated webpages.
Google has advanced search terminology that you can use to access pages that might not come up in your initial search. Here are just a couple of examples:
If you want to find sales materials by a certain company, tell Google you want to search (a) just the company’s website and (b) just certain types of files like PowerPoint presentations or PDF files. It would look like this:
site:sony.com filetype:ppt | filetype:pptx | filetype:pdf
If you want to search for a person’s resume that is no longer linked to the homepage, your search would look like this:
site:murphybrown.com resume | CV
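If you run searches like these often, a few lines of Python can assemble them for you. This is only a sketch: build_query and search_url are names I made up, and the only operators it relies on are the site:, filetype:, and | (OR) operators used in the examples above.

```python
from urllib.parse import urlencode


def build_query(site: str, filetypes: tuple[str, ...] = (), terms: tuple[str, ...] = ()) -> str:
    """Combine a site: restriction with OR-ed filetype: filters and OR-ed keywords."""
    parts = [f"site:{site}"]
    if filetypes:
        # 'filetype:ppt | filetype:pdf' matches either kind of file.
        parts.append(" | ".join(f"filetype:{ext}" for ext in filetypes))
    if terms:
        # 'resume | CV' matches pages containing either word.
        parts.append(" | ".join(terms))
    return " ".join(parts)


def search_url(query: str) -> str:
    """Turn a query string into a Google search URL you can paste into a browser."""
    return "https://www.google.com/search?" + urlencode({"q": query})
```

For example, build_query("murphybrown.com", terms=("resume", "CV")) gives back the resume search shown above, and passing that string to search_url produces a link you can save in your case notes.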
And for websites that are completely gone, don’t miss the amazing Internet Archive, which has been archiving websites since 1996. Go to the Wayback Machine and search by website address to see previous versions of it: https://archive.org
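The Internet Archive also offers a small "availability" lookup that reports the archived copy closest to a date you choose, which is handy when you have a long list of addresses to check. Here is a hedged sketch in Python: availability_url and closest_snapshot are my own names, and the reply layout the parser expects (an "archived_snapshots" entry containing a "closest" snapshot) is my understanding of that service, so check it against a real reply before relying on it.

```python
import json
from urllib.parse import urlencode


def availability_url(site: str, timestamp: str = "") -> str:
    """Build a request URL for the Wayback Machine availability lookup."""
    params = {"url": site}
    if timestamp:
        # Date to aim for, written as YYYYMMDD (for example 19970101).
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(response_text: str):
    """Return (timestamp, url) for the closest archived copy in a reply, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]
```

Open the address from availability_url("example.com", "19970101") in a browser, or fetch it with urllib.request.urlopen, and hand the text of the reply to closest_snapshot to get a direct link to the archived page.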
5. Don’t forget to search video, books, and audio!
Outside of Google, there are easy ways to access information that lives in books, documentary films, and other resources. For example, amazon.com lets you run searches “Inside the Book.” This is a fantastic resource for learning about past events, finding witnesses to a scandal, and even finding experts.
docuseek.com and shoppbs.org are both great resources for finding documentaries and audio shows on your topic. Don’t miss the previous work of documentarians.
These are just a few tips on ways you can harness the “Deep Web” through web research. As a successful litigator, you probably have favorites of your own. Drop a comment here on ones you recommend.