If you ask any researcher which online outlets they use to find relevant journal articles, there’s a good chance that Google Scholar will be at the top of their list. The 2018 “How Readers Discover Content in Scholarly Publications” report found that researchers rated academic search engines as “the most important discovery resource when searching for journal articles,” and Google Scholar is among the most widely used free academic search engines available. The 2015 “101 Innovations in Scholarly Communication” survey also found that 92% of academics surveyed used Google Scholar.
With so many researchers using Google Scholar, it’s a search engine that all journal publishers should prioritize. Google Scholar stands apart as one of the most accessible and sophisticated academic search engines available. Inclusion in Google Scholar can help expand the accessibility, reach, and, consequently, the impact of the articles you publish.
Despite the seemingly magical ability of Google to answer any search query with endless results, it’s important for publishers to know that the search engine can only index content its crawlers are able to find (more on crawlers below!). Google Scholar also has specific inclusion criteria. If you want all of your journal articles to be added to Google Scholar, you must take steps to ensure that they can be found by the search engine and that Google Scholar recognizes your journal website as a legitimate source.
In this blog post, we give an overview of how Google Scholar works, the benefits of Google Scholar indexing, and what you need to know to have your journal articles added to Google Scholar. Let’s get started!
Since you’re reading this blog post, you likely know about Google Scholar as an academic search tool. But you may not be entirely sure of how Google Scholar processes content or how it compares to Google’s general search engine. Before we get into the specific benefits of Google Scholar and its inclusion requirements, let’s first take a look at what Google Scholar is exactly and how it works.
Like Google, Google Scholar is a crawler-based search engine. Crawler-based search engines are able to index machine-readable metadata or full-text files automatically using “web crawlers,” also known as “spiders” or “bots,” which are automated internet programs that systematically “crawl” websites to identify and ingest new content.
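To make the idea of "crawling" concrete, here is a minimal sketch of the link-discovery step a crawler performs on a single page. It is an illustration only, not a description of how Google Scholar's own crawler is built. The `LinkCollector` class, the `discover_links` function, and the example URLs are all hypothetical names chosen for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from the <a href="..."> tags on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def discover_links(page_url, html):
    """Return the links a crawler could follow from this page."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    return collector.links


# Hypothetical issue page from a hypothetical journal site.
sample_html = """
<ul>
  <li><a href="/articles/101">Article one</a></li>
  <li><a href="/articles/102">Article two</a></li>
</ul>
"""
found = discover_links("https://www.examplejournal.com/issues/1", sample_html)
print(found)
```

A real crawler would repeat this step for every link it finds. That is why an article that no crawlable page links to is unlikely to be discovered.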
Google Scholar has access to all of the crawlable scholarly content published on the web, with the ability to index entire publisher and journal websites as well as the ability to use the citations in the articles it has indexed to find other related content. Google Scholar includes content across academic disciplines, from all countries, and in all languages. Recent research, including Michael Gusenbauer’s article “Google Scholar to overshadow them all? Comparing the sizes of 12 academic search engines and bibliographic databases,” has found that Google Scholar is the world’s largest academic search engine, containing over 380 million records.
A common misconception about Google Scholar is that it indexes all of the content it has access to regardless of the content type or quality. This is not the case. Rather, as explained in “Academic Search Engine Optimization (ASEO): Optimizing Scholarly Literature for Google Scholar & Co.,” Google Scholar is an “invitation based search engine.” This means that “only articles from trusted sources and articles that are ‘invited’ (cited) by articles already indexed are included in the database.” On its website Google Scholar states, “we work with publishers of scholarly information to index peer-reviewed papers, theses, preprints, abstracts, and technical reports from all disciplines of research and make them searchable on Google and Google Scholar.”
In order for your journals to be considered for inclusion in Google Scholar, the content on your website must first meet two basic criteria:
- Consist primarily of journal articles (e.g. original research articles, technical reports)
- Make freely available either the full-text or the complete author-written abstract for all articles (without requiring human or search engine robot readers to log into your site, install specific software, accept any disclaimers, etc.)
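Login walls are not the only thing that can keep robot readers out. A `robots.txt` rule can also block them. The sketch below uses Python's standard `urllib.robotparser` module to test whether a set of rules would let a crawler fetch an article page. The rules, the URLs, and the `Googlebot` user-agent string are assumptions made for illustration, so check your own site's file and Google's documentation for the values that apply to you.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for a hypothetical journal site.
robots_txt = """\
User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
# Parse the rules from a string instead of fetching them over the network.
parser.parse(robots_txt.splitlines())

# "Googlebot" is an assumed user-agent name used here for illustration.
article_allowed = parser.can_fetch(
    "Googlebot", "https://www.examplejournal.com/articles/101"
)
admin_allowed = parser.can_fetch(
    "Googlebot", "https://www.examplejournal.com/admin/settings"
)

print("Article page crawlable:", article_allowed)
print("Admin page crawlable:", admin_allowed)
```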
From there your journal website and articles will have to meet certain technical specifications, which we outline below. Before we get into that, let’s first take a look at some of the specific benefits Google Scholar offers journals and how to tell if your articles are being included in the search engine.
We’ve talked about the broad research benefits of Google Scholar, but you may be wondering — what are the specific benefits of Google Scholar indexing for the journals I publish? Google Scholar indexing can greatly expand the reach of your journal articles and improve the chances of your articles being read, shared, and cited online. A primary benefit of Google Scholar is that, unlike other databases, its search functionality focuses on individual articles, not entire journals. So having your articles indexed in Google Scholar can help more scholars discover the journals you publish when those articles show up in keyword and key phrase searches.
Getting your journal articles indexed in Google Scholar will:
- Increase the reach of your individual journal articles because more scholars will be likely to find them
- Give scholars an easy way to gauge how relevant your articles are to their research based on the article title and search snippet you provide
- Help resurface old articles from the journals you publish — Google Scholar takes citations into account and shows more frequently cited works earlier in search results
For open access journals, the importance of Google Scholar indexing is even greater. If you want your content to be accessible, making it freely available isn’t enough. You have to be sure that anyone can find your journal articles on the web, not only scholars who have access to subscription-based academic abstracting and indexing databases or who already know about your journals (i.e., scholars who know to search for your specific journal website). Google Scholar makes it possible for anyone to freely search for and find relevant scholarly content on the web from anywhere in the world.
As noted, Google Scholar doesn’t just index all of the content it can access on the web. Rather, it seeks to index content from what it deems to be “trusted” publication websites. If other articles from trusted websites have cited a journal article, Google Scholar will know to index it, but any content that is not published on a “trusted” website and that has not been cited by an article already included in Google Scholar will not be indexed right away.
For Google Scholar to deem a journal website trustworthy, the website must follow all of Google Scholar’s technical guidelines. Journal publishers should also contact Google Scholar to request inclusion in the index. If you’re not sure whether your journals are being indexed by Google Scholar, you can quickly check by searching your journal website domain (e.g. www.examplejournal.com) in scholar.google.com.
If you find that one or more of the journals you publish are not yet being indexed by Google Scholar, you’ll need to take some steps to get them added to the search engine.
Google Scholar has thorough Inclusion Guidelines for Webmasters that detail how to get your articles added to the index.
Some steps you may need to take include:
- Checking your HTML or PDF file formats to make sure the text is searchable
- Configuring your website to export bibliographic data in HTML meta tags
- Publishing all articles on separate webpages (i.e. each article should have its own URL)
- Making sure that your journal websites are available to both users and crawlers at all times
- Making sure you have a browse interface that can be crawled by Google’s robots
- Placing each article and each abstract in a separate HTML or PDF file (Google Scholar will not index multiple articles in the same PDF)
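As one illustration of the bibliographic meta tag step in the list above, the sketch below renders a set of `citation_*` tags for a single article page. The tag names follow the ones described in Google Scholar's Inclusion Guidelines for Webmasters, but the `citation_meta_tags` function, the dictionary layout, and the sample article are hypothetical. Treat this as a starting point and confirm the required tags and date format against the current guidelines.

```python
from html import escape


def citation_meta_tags(article):
    """Render Google Scholar-style bibliographic <meta> tags for one article."""
    tags = [("citation_title", article["title"])]
    # One citation_author tag per author, in byline order.
    tags += [("citation_author", name) for name in article["authors"]]
    tags += [
        ("citation_publication_date", article["date"]),
        ("citation_journal_title", article["journal"]),
        ("citation_pdf_url", article["pdf_url"]),
    ]
    # Escape values so characters like & and " cannot break the HTML.
    return "\n".join(
        f'<meta name="{name}" content="{escape(value, quote=True)}">'
        for name, value in tags
    )


# Hypothetical article record, for illustration only.
sample_article = {
    "title": "Soil Carbon & Land Use",
    "authors": ["Ada Example", "Ben Sample"],
    "date": "2019/08/20",
    "journal": "Example Journal of Ecology",
    "pdf_url": "https://www.examplejournal.com/articles/101.pdf",
}
tags_html = citation_meta_tags(sample_article)
print(tags_html)
```

The output belongs in the `<head>` of that article's own page, which is also why each article needs its own URL.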
Google Scholar’s indexing guidelines can get pretty technical. If your journal or journals are currently hosted on a standalone website that you had custom-built or that you’re hosting via an outside provider like WordPress, you’ll need to either work with available internal IT resources to make any necessary updates or hire a web developer.
If you don’t want to deal with the technical aspects of getting your journal articles indexed in Google Scholar, you may want to consider moving your journal to a website hosted on a journal publishing platform that can take care of Google Scholar indexing for you. For example, Scholastica is already recognized as a trusted site by Google Scholar, so all journals that publish via Scholastica journal websites are automatically indexed with no extra work on the part of the editors. Some journal databases, such as JSTOR or Project MUSE, are also indexed by Google Scholar. So if you publish via a Google Scholar indexed aggregator or database, or if you regularly upload articles to one, you may also be able to have articles added to Google Scholar through it. You’ll want to check with any journal hosting platform or aggregator to make sure that they support indexing in Google Scholar.
However you decide to go about getting your journal articles indexed by Google Scholar, now’s the time to start! Google Scholar indexing is sure to expand the accessibility and reach of the articles you publish.
This post was originally published on February 4, 2016 and updated on August 20, 2019.