Open-source intelligence (OSINT) is an investigative process to find, collect, and use publicly available information. OSINT is now used for a wide variety of purposes, including cybersecurity (by both attackers and defenders), civil and criminal cases, corporate investigations, employee background checks, due diligence for transactions, marketing studies, competitive intelligence, and more.
OSINT and Cyber Investigations
OSINT can find specific information about targets such as an individual, company, event, or location. It is also a powerful tool in comprehensive cyber investigations. An effective cyber investigation should be planned in advance, including the objective(s), scope, and resources to be devoted. The starting point is understanding what you know and what you want to learn. Checklists, flowcharts, and templates are often helpful. Most investigations should be iterative so that adjustments can be made as they progress. Detailed records should be maintained so that steps can be traced and repeated. Data that is collected should be appropriately preserved. Depending on the purpose(s) of an investigation, the qualifications of the person(s) conducting it are an important threshold consideration. A qualified professional may be required.
If the results of an investigation may be used in court or an administrative proceeding, authentication and admissibility are critical. Detailed documentation is important, and an expert’s opinion and testimony may be required. Reliability of source(s) and information is important for all kinds of investigations. It’s essential for investigations for legal purposes.
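One way to keep records that can be traced and repeated is to log each collected item with its source, a collection timestamp, and a cryptographic hash. The following is a minimal Python sketch of that idea, not a forensic standard. The function names, the record fields, and the JSON-lines log format are all assumptions made for illustration; whether such a log satisfies authentication requirements in a given proceeding is a question for counsel and a qualified expert.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_evidence_record(content: bytes, source_url: str, note: str = "") -> dict:
    """Build one log entry describing a captured item (hypothetical record layout)."""
    return {
        "source_url": source_url,
        "collected_at_utc": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(content).hexdigest(),
        "size_bytes": len(content),
        "note": note,
    }


def append_to_log(record: dict, log_path: str) -> None:
    """Append the record as one JSON line; earlier entries are never rewritten."""
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(record, sort_keys=True) + "\n")


def verify(content: bytes, record: dict) -> bool:
    """Check that preserved content still matches the digest recorded at collection."""
    return hashlib.sha256(content).hexdigest() == record["sha256"]
```

In use, the investigator would save the captured page or file, call make_evidence_record on its bytes, and append the result to the log. Running verify later on the preserved copy shows whether it has changed since collection.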
Security of the investigator is critical in cyber investigations, particularly when unknown websites or services are visited or the Dark Web is included. (The Dark Web is a part of the World Wide Web that cannot be indexed by search engines and is not accessible with standard browsers; it is designed for privacy and anonymity.) The investigator should use either a dedicated computer that is securely configured or a virtual machine (VM), an isolated software environment that behaves like a separate computer.
Another consideration is whether the investigation will be “stealth” or whether it does not matter that the subject or others may learn about it. OSINT analysts use the term “sock puppet” to describe the persona that is created for stealth investigations. If the investigation is to be confidential, the investigator(s) can use tools such as VMs, anonymizing virtual private networks (VPNs), or a secure, anonymous browser.
Legal compliance is also important in investigations. Federal and state laws and regulations may apply. There also may be applicable contracts and terms of service. For example, unauthorized access to an information system or website or a violation of a website’s or system’s terms of service can be a criminal violation of the federal Computer Fraud and Abuse Act, 18 U.S.C. §1030, and parallel state laws. There is also a risk of civil liability for violation of a law, breach of contract, or invasion of privacy.
For attorneys and those working for them, compliance with ethics rules is also important. Investigations may involve potential misrepresentation, failure to disclose, and improper contact with represented parties, courts, or jurors. Attorneys should consult the ethics rules and opinions in the relevant jurisdiction(s).
The U.S. Office of the Director of National Intelligence has defined OSINT as “intelligence produced from publicly available information that is collected, exploited, and disseminated in a timely manner to an appropriate audience for the purpose of addressing a specific intelligence requirement” (U.S. Off. of the Dir. of Nat’l Intel., U.S. National Intelligence: An Overview 2011 (2011)). The process in the United States goes back to at least World War II, when it was used to monitor sources such as newspapers, magazines, and public radio broadcasts. It has grown exponentially in the Internet age.
According to Secjuice, a security nonprofit,
OSINT includes all publicly accessible sources of information. This information can be found either online or offline:
1. The Internet, which includes the following and more: forums, blogs, social networking sites, video-sharing sites like YouTube.com, wikis, Whois records of registered domain names, metadata and digital files, dark web resources, geolocation data, IP addresses, people search engines, and anything that can be found online.
2. Traditional mass media (e.g., television, radio, newspapers, books, magazines).
3. Specialized journals, academic publications, dissertations, conference proceedings, company profiles, annual reports, company news, employee profiles, and résumés.
4. Photos and videos including metadata.
5. Geospatial information (e.g., maps and commercial imagery products).
Nihad Hassan, An Introduction to Open Source Intelligence (OSINT) Gathering, Secjuice (Aug. 12, 2018).
Additional information sources include government databases (federal, state, and local) and proprietary databases of all kinds, such as people finders. Some definitions of OSINT include only sources that are free, but there are many rich sources that require a fee per use or a subscription.
Traditional search engines, such as Google and Bing, can be used for basic OSINT. The more that a user knows about them and the use of advanced search techniques with them, the more complete the search results can be. They are likely to become more powerful as artificial intelligence is built into them. However, it has been estimated that more than 99 percent of Internet data is not available through traditional search engines. It’s important to consider the use of alternative tools.
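As an illustration of advanced search techniques, the Python sketch below assembles a query from operators that major search engines such as Google and Bing document: an exact phrase in quotation marks, site: to limit results to one domain, filetype: to limit results to one file format, and a leading hyphen to exclude a term. The function names, the example name and domain, and the default search URL are assumptions chosen for the example.

```python
from urllib.parse import quote_plus


def build_query(phrase: str, site: str = "", filetype: str = "", exclude: tuple = ()) -> str:
    """Combine an exact phrase with optional site:, filetype:, and exclusion operators."""
    parts = [f'"{phrase}"']          # quotation marks request an exact-phrase match
    if site:
        parts.append(f"site:{site}")          # restrict results to one domain
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict results to one file format
    parts.extend(f"-{term}" for term in exclude)  # leading hyphen excludes a term
    return " ".join(parts)


def to_search_url(query: str, base: str = "https://www.google.com/search?q=") -> str:
    """URL-encode the query so it can be pasted into a browser address bar."""
    return base + quote_plus(query)


if __name__ == "__main__":
    query = build_query("Jane Doe", site="example.com", filetype="pdf", exclude=("obituary",))
    print(query)
    print(to_search_url(query))
```

Building queries this way also makes them easy to record in the investigation log, so that a search can be repeated later with exactly the same terms.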
For a number of years, the CLE provider Internet for Lawyers has offered information on Internet investigations and research for attorneys (and others), including the book Cybersleuth’s Guide to the Internet: Conducting Effective Free Investigative & Legal Research on the Web and webinars based on it. The book explains:
Much of the information that was once only available to professional researchers from expensive, fee-based sources is now available for free on the Internet if you know how to find it. There’s more to conducting a comprehensive search for information on the Internet than just relying on the results returned by search engines, though.
Carole Levitt & Mark E. Rosch, Cybersleuth’s Guide to the Internet: Conducting Effective Free Investigative & Legal Research on the Web (14th ed., rev. 2019).
Many of the techniques and resources discussed in this book are included in OSINT. The book is out of print but is available in some law libraries and state bar book programs. Webinars and CLE programs based on its contents are still offered.
There are also alternative search tools that can be used for OSINT. In their article “15 Top Open-Source Intelligence Tools,” CSO Online (Aug. 15, 2023), authors Josh Fruhlinger, Ax Sharma, and John Breeden II describe 15 commonly used OSINT tools. Examples include:
- Maltego shows relationships among people, companies, domains, and publicly accessible information on the Internet and plots it all out in easy-to-read charts and graphs.
- Recon-ng, which is written in Python, automates time-consuming investigation activities, such as cutting and pasting, and many of the most popular kinds of data harvesting.
- theHarvester captures public information that exists outside of an organization’s network; it can find incidental information on internal networks but mostly draws from outward-facing content. It would be effective as a reconnaissance step prior to penetration testing or similar exercises. It uses common search engines such as Bing and Google, as well as lesser-known ones such as dogpile, DNSdumpster, and the Exalead metadata engine. In general, theHarvester gathers emails, names, subdomains, IPs, and URLs.
- Shodan is a dedicated search engine used to provide access to databases and to find information about network devices and Internet of Things (IoT) devices, such as cameras, which often are not searchable.
- Metagoofil extracts metadata from publicly available documents, including .pdf, .docx, .pptx, .xlsx, and many others.
- searchcode is a highly specialized search engine that looks for useful intelligence inside source code.
- SpiderFoot is a reconnaissance tool that automatically queries more than 100 public data sources (OSINT) to gather intelligence on IP addresses, domain names, email addresses, names, and more. It collects data to build an understanding of all the selected target entities and how they relate to each other, and it also includes data visualization.
- Babel X is a multilingual search tool for the public Internet, including blogs, social media, message boards, and news sites.
While the list was prepared for cybersecurity investigations, the tools in it are regularly used for broader OSINT purposes. Some of these tools are free, some offer free and paid versions, and some offer only paid versions, which can be very expensive.
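To make the idea of document metadata more concrete, the sketch below reads a few properties from a single Office Open XML file (.docx, .xlsx, or .pptx) using only the Python standard library. It is not how Metagoofil or any other tool on the list is implemented; it only illustrates the kind of information such tools surface. These files are ZIP archives that normally store author and date properties in docProps/core.xml. The function name and the choice of fields are assumptions for the example.

```python
import xml.etree.ElementTree as ET
import zipfile

# XML namespaces used by the core-properties part of an Office Open XML package.
NAMESPACES = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Output label -> element to look for (a small illustrative selection).
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(document) -> dict:
    """Return selected core properties from a .docx, .xlsx, or .pptx file.

    `document` may be a file path or a binary file-like object.
    Raises KeyError if the package has no docProps/core.xml part.
    """
    with zipfile.ZipFile(document) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    properties = {}
    for label, tag in FIELDS.items():
        element = root.find(tag, NAMESPACES)
        if element is not None and element.text:
            properties[label] = element.text
    return properties
```

Calling read_core_properties on a downloaded document returns a dictionary with whichever of these fields the file contains. Metadata can be edited or stripped, so it should be treated as a lead to corroborate rather than as proof of authorship.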
An OSINT analysis starts with a subject, such as an individual, company, event, or location, and uses manual search, automated tools, or both to find additional information, sometimes very comprehensive information. For example, starting with an individual’s name, OSINT can find a person’s residence (often with value, mortgages, and street-view, interior, and satellite images), age, education, occupation (including employer, position, location, email, phone number, and time of employment), phone number, email address, social media accounts, detailed information about relatives, membership in organizations, hobbies, travel, and more. It’s always interesting to try an OSINT search of your own name.
Consider the example of an OSINT investigation by a private individual in the aftermath of the U.S. Capitol riot on January 6, 2021. A hacker downloaded data from the Parler social media app and used it to plot the locations of Parler users inside the U.S. Capitol during the attack. The resulting map showed a bird’s-eye view of Parler users swarming the Capitol grounds (Dell Cameron & Dhruv Mehrotra, Parler Users Breached Deep Inside U.S. Capitol Building, GPS Data Shows, Gizmodo (Jan. 12, 2021)). In another example, “Faces of the Riot used open source software to detect, extract, and deduplicate every face from the 827 videos taken from the insurrection on January 6” (Andy Greenberg, This Site Published Every Face from Parler’s Capitol Riot Videos, Wired (Jan. 20, 2021)).
In yet another example, a study by the New York Times reported on the use of cell phone location technology to aggregate “more than 50 billion location pings from the phones of more than 12 million Americans as they moved through several major cities . . .” (Stuart A. Thompson & Charlie Warzel, Twelve Million Phones, One Dataset, Zero Privacy, N.Y. Times (Dec. 19, 2019)). This data is for sale by location aggregators for marketing and other purposes.
OSINT is a powerful tool for attorneys and law firms of all sizes in almost all practice areas, whether it’s used by attorneys, paralegals, staff, or a retained research professional. There are many potential uses, such as information on parties, opposing counsel, witnesses, judges, and jurors in litigation; due diligence for transactions; law firm marketing (both for targets and the law firm’s image); pre-employment screening; and more.
OSINT Information Resources
There is a wealth of available information resources on OSINT. This section discusses some leading ones.
Michael Bazzell’s OSINT Techniques: Resources for Uncovering Online Information (10th ed. 2023) is a comprehensive reference on OSINT that has been regularly updated. It provides detailed information on tools, techniques, and sources of information. It covers the following information sources: search engines, social networks, online communities, email addresses, usernames, people search engines, telephone numbers, online maps and aerial photos, documents, images, videos, broadcast streams, domain names, IP addresses, government records, virtual currencies, application programming interfaces (APIs), and advanced Linux tools. The author’s background is in law enforcement and training law enforcement personnel. He now focuses on speaking, writing, and teaching on OSINT and the privacy concerns that go with it. He maintains a website, IntelTechniques, that has a wealth of information.
The OSINT Framework is an online, interactive catalog of free OSINT tools and resources. It has a list of 32 information elements that are the starting point for a search (e.g., username, email address, domain name, IP address, image/video/document, and social networks). Clicking on People Search Engines shows General People Search and Registries. Selecting General People Search displays more than 40 search tools.
The OSINT Curious Project is an organization that provides great resources for OSINT for a broad range of uses. It includes a blog, tips, videos, podcasts, a YouTube channel, and a good 25-minute video introduction to OSINT (created in January 2020). In February 2023, the Project announced that it had closed, but it will continue to make its existing content available.
The SANS Institute, a highly regarded cybersecurity training and certification group, includes OSINT in its offerings and conducts an annual OSINT Summit. It maintains the SANS Cyber Security Blog and a page of “‘Must Have’ Free Open-Source Intelligence (OSINT) Resources.” SANS also offers two week-long courses: “SANS SEC497: Practical Open-Source Intelligence (OSINT)” and “SANS SEC587: Advanced Open-Source Intelligence (OSINT) Gathering and Analysis.” SANS focuses on intelligence for cybersecurity, but its resources can also be helpful in other areas.
The DEFCON underground hacking conference has presented OSINT talks for several years. Videos of some of them are available free online and can be found by an Internet search for “DEFCON & OSINT.” The author’s introduction to OSINT was through DEFCON.
Trace Labs has published a specialized OSINT VM (virtual machine) that brings together what its members have found to be the most effective OSINT tools and customized scripts for crowdsourced searches for missing persons.
Columbia Journalism Review has published Michael Edison Hayden’s “A Guide to Open Source Intelligence” (June 7, 2019), which provides a good overview for reporters as well as others.
There is a wealth of information to help you get started and advance with OSINT, much of it free. Depending on your learning style, the videos may be a good start.
OSINT is a powerful investigative process to find, collect, and use a wealth of publicly available information. It is a valuable tool for attorneys in most, if not all, fields of practice. Attorneys should understand OSINT, the kinds of information it can discover, where to find accurate information on its use, and where to get professional resources. It is important to use OSINT appropriately, in compliance with applicable legal and ethical requirements.