VOSviewer for Environmental Science: A Comprehensive Guide to Bibliometric Analysis and Research Trends

Aria West, Nov 28, 2025

This article provides a complete guide for researchers and professionals on using VOSviewer for bibliometric analysis in environmental science.

Abstract

This article provides a complete guide for researchers and professionals on using VOSviewer for bibliometric analysis in environmental science. It covers foundational principles, from defining bibliometric networks to exploring environmental research landscapes. The guide details practical methodologies for constructing and visualizing networks based on citation, co-authorship, and keyword co-occurrence, using real-world examples from fields like resilient cities and microplastic pollution. It also addresses common troubleshooting and optimization techniques for data handling from sources like Scopus and Web of Science, and validates findings through comparative analysis with other tools. Ultimately, this resource empowers scientists to leverage VOSviewer for uncovering research trends, collaboration patterns, and emerging topics in environmental studies, enhancing the quality and impact of their literature reviews and research planning.

Understanding VOSviewer and Its Role in Environmental Science Landscapes

What is VOSviewer? Defining Bibliometric Network Visualization

VOSviewer is a specialized software tool for constructing, visualizing, and exploring bibliometric networks. Developed by Nees Jan van Eck and Ludo Waltman at the Centre for Science and Technology Studies (CWTS) of Leiden University, its name stands for "Visualization of Similarities" [1] [2]. It is designed to create maps based on network data where the distance between items reflects their relatedness, providing a powerful means to analyze the structure of scientific literature [3].
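
The idea that distance reflects relatedness can be written compactly. The following is a sketch of the VOS mapping objective as described by van Eck and Waltman; the symbols are introduced here for illustration only: n items are placed at positions x_i, and s_ij denotes a normalized similarity between items i and j (for example, derived from co-occurrence counts).

```latex
\min_{x_1,\dots,x_n} \; V(x_1,\dots,x_n) = \sum_{i<j} s_{ij}\,\lVert x_i - x_j \rVert^{2}
\quad \text{subject to} \quad
\frac{2}{n(n-1)} \sum_{i<j} \lVert x_i - x_j \rVert = 1
```

Pairs with a large s_ij are penalized heavily for being far apart, so they end up close together on the map. The constraint fixes the average distance between items, which rules out the trivial layout in which every item sits at the same point.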

While its core function is analyzing bibliometric data from sources like Web of Science, Scopus, and PubMed, VOSviewer is also a capable tool for text analysis, enabling the creation of co-occurrence maps from any body of text, such as academic abstracts or interview transcripts [1] [2].

The Role of VOSviewer in Environmental Science Research

In environmental science, VOSviewer helps researchers map the complex landscape of scientific knowledge. For instance, a bibliometric analysis of microplastic (MP) pollution research can use VOSviewer to visualize international collaborations and identify core research themes, such as distribution, toxic effects, and analytical methods [4]. The software can process thousands of publications to reveal trends and patterns that might otherwise remain hidden in large datasets.

Table 1: Quantitative Analysis of Microplastics Research via VOSviewer

| Aspect of Analysis | Key Finding | Data Source & Scope |
| --- | --- | --- |
| Publication Volume | Explosive growth from 2004; 3,548 publications in 2022 alone (30.12% of total analyzed). | Web of Science (2004-2023); 11,777 English-language publications [4]. |
| Global Research Participation | 147 countries participated, with China, the United States, the UK, Australia, and Canada being the most prolific. | Web of Science [4]. |
| Research Hotspots (Clusters) | 1. Distribution and sources of MPs; 2. Exposure and toxic effects; 3. Research methods for MPs; 4. Adsorption of MPs with other pollutants. | Keyword co-occurrence analysis performed with VOSviewer [4]. |

Experimental Protocols for Bibliometric Analysis

This section provides detailed methodologies for conducting a bibliometric analysis using VOSviewer, framed within the context of environmental science research.

Protocol 1: Creating a Co-authorship Network

A co-authorship network map reveals collaborative relationships between researchers, institutions, or countries [3] [1].

Workflow Overview

1. Data Export → 2. Data Import → 3. Map Type Selection → 4. Threshold Setting → 5. Network Construction → 6. Visualization

Step-by-Step Instructions

  • Data Export: Collect a comprehensive dataset of scientific publications. This is typically done by performing a topic search (e.g., "microplastics AND toxicity") on a bibliographic database like Web of Science (WoS) or Scopus. Export the full record and cited references in a compatible format (e.g., plain text from WoS or RIS) [2] [4].
  • Data Import: Launch VOSviewer and select "Create" → "Create a map based on bibliographic data" → "Read data from bibliographic database files." Load the exported file(s) [1].
  • Map Type Selection: In the wizard, choose "Co-authorship" and then select the unit of analysis, such as "Countries" [1].
  • Threshold Setting: Set a minimum number of documents per country to be included in the map. This helps focus on the most significant actors. For example, setting a threshold of "5" will exclude countries with fewer than 5 publications in your dataset [1].
  • Network Construction: The software will display a table of countries meeting the threshold. You can choose to display only the largest connected component or all items. Click "Finish" to build the network [1].
  • Visualization and Interpretation: The map will be generated. Larger nodes indicate a higher number of publications from that country. Thicker lines between nodes represent stronger co-authorship links. The colors represent different clusters of densely collaborating countries [5] [4].
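
The counting behind a country co-authorship map can be sketched in a few lines of Python. This is a minimal illustration on hypothetical records, not VOSviewer's internal implementation: the record list, the function names, and the simple "one shared publication adds one to the link" rule are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: each publication lists the countries of its authors.
records = [
    {"China", "United States"},
    {"China", "United States", "Australia"},
    {"China"},
    {"United Kingdom", "Australia"},
    {"China", "Australia"},
]

def coauthorship_network(records, min_documents=1):
    """Return (documents per retained country, link strength per country pair)."""
    doc_counts = Counter(country for rec in records for country in rec)
    # Threshold step: keep only countries with enough documents.
    kept = {c for c, n in doc_counts.items() if n >= min_documents}
    links = Counter()
    for rec in records:
        # Each publication adds 1 to every pair of retained countries on it.
        for pair in combinations(sorted(rec & kept), 2):
            links[pair] += 1
    return {c: doc_counts[c] for c in kept}, dict(links)

def total_link_strength(links):
    """Sum the strengths of all links attached to each country."""
    totals = Counter()
    for (a, b), weight in links.items():
        totals[a] += weight
        totals[b] += weight
    return dict(totals)

nodes, links = coauthorship_network(records, min_documents=2)
strength = total_link_strength(links)
```

In this toy dataset the United Kingdom falls below the threshold of 2 documents and disappears from the map, which mirrors what the threshold step does in the wizard. Node size would follow `nodes`, and line thickness would follow `links`.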

Protocol 2: Conducting a Term Co-occurrence Analysis

This analysis extracts key terms from publication titles and abstracts to identify predominant research topics and conceptual themes within a field [3] [2].

Workflow Overview

1. Load Data → 2. Select Text Fields → 3. Choose Counting Method → 4. Set Frequency Threshold → 5. Select Relevant Terms → 6. Generate Term Map

Step-by-Step Instructions

  • Load Data: Start the creation wizard and load your bibliographic data file (e.g., from WoS or Scopus) [1].
  • Select Text Fields: Choose to extract terms from "Title and Abstract fields" for a comprehensive analysis [1].
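  • Choose Counting Method: Select "Full counting" to count every occurrence of a term within a document [1]. The alternative, "Binary counting", records only whether a term is present in a document, regardless of how often it appears there.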
  • Set Frequency Threshold: Define the minimum number of times a term must appear to be included. A higher threshold (e.g., 10) will include only the most prominent terms [1].
  • Select Relevant Terms: VOSviewer will suggest the most "relevant" terms based on an algorithm that prioritizes terms associated with specific document groups. You can adjust the number of terms to select (e.g., 60% of the most relevant) [1].
  • Generate Term Map: The software creates the map where closely related terms (those often appearing together) are positioned near each other, forming thematic clusters [3] [2].
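
The difference between the two counting methods, and the effect of the frequency threshold, can be illustrated with a short Python sketch. The documents and function names below are hypothetical, and term extraction is assumed to have already happened; the sketch only shows how the counts diverge.

```python
from collections import Counter

# Hypothetical documents, already reduced to lists of extracted terms.
documents = [
    ["microplastic", "sediment", "microplastic"],
    ["microplastic", "toxicity"],
    ["sediment", "toxicity", "toxicity", "toxicity"],
]

def count_terms(documents, binary=False):
    """Count term occurrences; with binary=True a term counts once per document."""
    counts = Counter()
    for terms in documents:
        counts.update(set(terms) if binary else terms)
    return counts

def apply_threshold(counts, minimum):
    """Keep only terms that occur at least `minimum` times."""
    return {term: n for term, n in counts.items() if n >= minimum}

full = count_terms(documents)
binary = count_terms(documents, binary=True)
prominent = apply_threshold(full, minimum=3)
```

Under full counting a term repeated many times in a single abstract can pass the threshold on its own, whereas binary counting requires it to appear across several documents.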

The Scientist's Toolkit: Research Reagent Solutions

Table 2: Essential Tools for a VOSviewer-Based Bibliometric Study

| Item / Software | Function in the Workflow |
| --- | --- |
| Web of Science / Scopus | Primary data sources. These subscription-based databases are the gold standard for exporting high-quality bibliographic records for analysis [3] [4]. |
| VOSviewer Software | The core analysis and visualization engine. It is a free, Java-based application that constructs and visualizes the bibliometric networks [2]. |
| Microsoft Excel | Data cleaning and manipulation tool. Used for initial data organization and, if necessary, for reformatting non-bibliometric data (e.g., interview transcripts) into a structure compatible with VOSviewer's text analysis function [2]. |
| Lexos | A web-based text cleaning and tokenization tool. Useful for preparing large bodies of unstructured text (e.g., social media transcripts) by dividing them into standardized chunks ("tokens") for analysis in VOSviewer [2]. |

Within the domain of bibliometric analysis, understanding the intellectual structure and collaborative dynamics of a scientific field is paramount. VOSviewer, a specialized software tool, enables this exploration through the construction and visualization of several core types of bibliometric networks [6] [7]. These networks—citation, bibliographic coupling, co-authorship, and co-occurrence—provide unique lenses to investigate relationships between scholarly publications, authors, journals, and keywords [8]. In environmental science research, where the field is inherently interdisciplinary and rapidly evolving, these analyses are invaluable for mapping research trends, identifying key players, uncovering emerging topics, and tracing the flow of ideas [9] [10]. This document provides detailed application notes and experimental protocols for employing these core network types within the VOSviewer environment, with a specific focus on applications in environmental science.

VOSviewer supports the creation of multiple network types based on data from major bibliographic databases like Web of Science, Scopus, Dimensions, and Lens [8]. Each network type defines a unique relationship between the units of analysis (e.g., publications, authors, journals, or terms).

Table 1: Core Bibliometric Network Types in VOSviewer

| Network Type | Relationship Defined | Units of Analysis | Primary Use Case in Environmental Science |
| --- | --- | --- | --- |
| Citation | A publication cites another publication. | Publications, Journals | Tracing the influence and foundational literature of a field (e.g., seminal papers on climate change) [7]. |
| Bibliographic Coupling | Two publications reference a common third publication. | Publications, Journals | Mapping current research fronts and identifying groups of papers working on similar contemporary issues [7]. |
| Co-authorship | Two researchers, institutions, or countries co-author a publication. | Researchers, Institutions, Countries | Revealing collaborative partnerships and social networks within and across environmental research communities [10] [7]. |
| Co-occurrence | Two terms (e.g., keywords) appear together in the same publication. | Keywords, Terms from Titles/Abstracts | Identifying research hotspots, conceptual themes, and the intellectual structure of a field (e.g., linking "restorative environment" and "mental health") [10] [7]. |

Citation Network: Publication A (2010) → Publication B (2015) [cites]
Bibliographic Coupling: Publication C (2023) → Publication E (2018) [cites]; Publication D (2023) → Publication E (2018) [cites]; Publication C (2023) → Publication D (2023) [coupling strength]
Co-authorship Network: Researcher 1 → Researcher 2 [co-author]; Researcher 1 → Researcher 3 [co-author]
Co-occurrence Network: Term X → Term Y [co-occurs]; Term Y → Term Z [co-occurs]

Diagram 1: Logical relationships defining the four core network types.
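
The contrast between a citation link and a bibliographic coupling link can be made concrete with a small Python sketch. The reference lists and function names are hypothetical; the point is only that citation links are directed pairs, while coupling strength is the number of references two publications share.

```python
from itertools import combinations

# Hypothetical reference lists: publication -> set of publications it cites.
references = {
    "C": {"E", "A"},
    "D": {"E", "A", "B"},
    "E": {"A"},
}

def citation_links(references):
    """Directed (citing, cited) pairs."""
    return sorted((src, dst) for src, refs in references.items() for dst in refs)

def coupling_strength(references):
    """Number of shared references for each pair of citing publications."""
    strengths = {}
    for p, q in combinations(sorted(references), 2):
        shared = len(references[p] & references[q])
        if shared:
            strengths[(p, q)] = shared
    return strengths

cites = citation_links(references)
coupling = coupling_strength(references)
```

Publications C and D never cite each other here, yet they are coupled with strength 2 because both cite A and E. This is why coupling can group recent papers that have not yet had time to accumulate citations.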

The Scientist's Toolkit: Essential Research Reagents

Conducting a robust bibliometric analysis requires a set of "research reagents"—specialized tools and data sources. The table below details the essential components for a VOSviewer-based study.

Table 2: Key Research Reagents for VOSviewer Bibliometric Analysis

| Tool / Resource | Type | Primary Function | Relevance to Environmental Science |
| --- | --- | --- | --- |
| VOSviewer Software | Analysis & Visualization Tool | Constructs and visualizes bibliometric networks; performs layout, clustering, and mapping [6] [7]. | Core platform for mapping the intellectual structure of environmental research domains [9] [10]. |
| Bibliometrix (R-package) | Complementary Analysis Tool | A comprehensive R-tool for science mapping analysis; allows greater customization and analysis from multiple data sources [7]. | Useful for performing complementary statistical analyses and processing data from diverse databases concurrently [9]. |
| Web of Science / Scopus | Bibliographic Database | Primary sources for exporting high-quality metadata of scientific publications [10] [7]. | Provides comprehensive coverage of high-impact environmental science journals for data extraction [9] [10]. |
| Thesaurus File | Data Cleaning Tool | A text file used to merge synonymous terms (e.g., "climate change" and "global warming") [7]. | Crucial for normalizing diverse environmental terminology to ensure accurate keyword co-occurrence maps [9]. |

Experimental Protocols for Network Construction

Protocol 1: Creating a Keyword Co-occurrence Network

Application: To identify conceptual themes, research hotspots, and the intellectual structure in environmental science (e.g., "socio-environmental disclosure" or "restorative environments") [9] [10].

  • Data Retrieval: Export a comprehensive set of publication records from your chosen database (e.g., Web of Science or Scopus) using a well-defined search query. Save the records in the appropriate format (e.g., plain text for WoS, CSV for Scopus).
  • Data Import in VOSviewer: Launch VOSviewer. Click "Create" → "Create a map based on bibliographic data" → "Read data from bibliographic database files". Select your exported file(s) [8].
  • Analysis Type Selection: Choose "Co-occurrence" and then "All keywords" as the unit of analysis. This will map the conceptual space based on terms that appear together.
  • Threshold Setting: Apply a minimum number of occurrences for a keyword to be included. This filter helps focus the map on the most significant terms, reducing clutter. A typical starting threshold is 5-10 occurrences [10].
  • Map Creation and Interpretation: VOSviewer will generate a network map where:
    • Nodes represent keywords. The size typically corresponds to the frequency of occurrence.
    • Links represent co-occurrence. Thicker links indicate stronger association.
    • Colors indicate clusters of closely related terms, revealing distinct research themes [9] [10]. For example, in socio-environmental disclosure, clusters might represent "sustainability assurance" and "organizational legitimacy" [9].

Start: Define Research Scope → 1. Data Retrieval (Export from WoS/Scopus) → 2. Data Import (Load into VOSviewer) → 3. Analysis Selection (Choose 'Co-occurrence') → 4. Threshold Setting (Min. keyword occurrences) → 5. Map Interpretation (Analyze clusters & links) → End: Thematic Analysis

Diagram 2: Workflow for creating a keyword co-occurrence network.
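
For readers who want to sanity-check a keyword map outside the software, the steps above can be approximated in Python. The sketch assumes keyword fields arrive as semicolon-separated strings, as they commonly do in CSV exports; the sample data and function names are hypothetical, and no normalization of link strengths is applied.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword fields, one semicolon-separated string per publication.
keyword_fields = [
    "Climate change; Adaptation; Resilience",
    "climate change; resilience",
    "Climate Change; adaptation",
    "Biodiversity; Resilience",
]

def parse_keywords(field):
    """Split one keyword field and normalize case and whitespace."""
    return {k.strip().lower() for k in field.split(";") if k.strip()}

def cooccurrence_map(keyword_fields, min_occurrences):
    """Return (occurrences per retained keyword, co-occurrence count per pair)."""
    docs = [parse_keywords(f) for f in keyword_fields]
    occurrences = Counter(k for doc in docs for k in doc)
    kept = {k for k, n in occurrences.items() if n >= min_occurrences}
    links = Counter()
    for doc in docs:
        # Count each pair of retained keywords once per publication.
        for pair in combinations(sorted(doc & kept), 2):
            links[pair] += 1
    return {k: occurrences[k] for k in kept}, dict(links)

nodes, links = cooccurrence_map(keyword_fields, min_occurrences=2)
```

Lowercasing before counting matters: without it, "Climate change" and "Climate Change" would be treated as two separate nodes, which is the same problem a thesaurus file addresses inside VOSviewer.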

Protocol 2: Creating a Co-authorship Network

Application: To reveal collaboration patterns among researchers, institutions, or countries within environmental science, such as international partnerships in climate change research [10].

  • Data Retrieval: Same as Protocol 1.
  • Data Import in VOSviewer: Same as Protocol 1.
  • Analysis Type Selection: Choose "Co-authorship" and select the desired unit of analysis: "Authors", "Organizations", or "Countries".
  • Threshold Setting: Apply a minimum number of documents or citations for an entity to be included. For a country-level analysis, a minimum of 5 documents is often effective [11].
  • Map Creation and Interpretation: The resulting map visualizes the collaborative landscape:
    • Nodes represent authors, institutions, or countries. Size may indicate document count.
    • Links indicate a co-authorship relationship. Link strength reflects the intensity of collaboration.
    • Colors often denote clusters of tightly-knit collaborative groups [11] [10]. For instance, a study on sustainable economic growth might show distinct clusters led by China, India, and Italy [11].

Protocol 3: Creating Citation and Bibliographic Coupling Networks

Application: Citation analysis traces historical influence and foundational knowledge (what has been influential). Bibliographic coupling maps current research fronts and groups of papers actively working on similar problems (what is happening now) [7].

  • Data Retrieval and Import: Same as previous protocols.
  • Analysis Type Selection:
    • For Citation Networks: Choose "Citation" → "Documents" to map influential works, or "Sources" to analyze journal influence. The "Cited references" and "Cited sources" units belong to the separate "Co-citation" analysis type, which maps items that are frequently cited together.
    • For Bibliographic Coupling: Choose "Bibliographic coupling" → "Documents" or "Sources" to group currently published papers that share references.
  • Threshold Setting: Set a minimum number of citations per document. For source-level maps, set a minimum number of documents and citations per source. The same kind of threshold applies to both citation and bibliographic coupling maps.
  • Map Creation and Interpretation:
    • In a citation network, large nodes represent highly cited, foundational publications. The map shows the flow of knowledge from these seminal works [7].
    • In a bibliographic coupling network, clusters of documents represent current research fronts or thematic specialties. This is highly useful for identifying emerging sub-fields in fast-moving areas like green innovation or CSR [9].

Visualization and Advanced Features

VOSviewer provides multiple visualization modes, each suited for a different type of analysis [8] [7]:

  • Network Visualization: Shows items as labeled circles with lines representing links. Colors indicate clusters. This is the primary view for exploring the structure and relationships within the network.
  • Overlay Visualization: Similar to the network view, but the color of a node represents a specific variable, such as the average publication year. This is ideal for tracking thematic evolution over time, for instance, showing how research on "CSR" has shifted towards "digital economy" and "blue economy" [11] [5].
  • Density Visualization: Provides a quick overview of the main areas in a network. Regions with many closely spaced items appear in warmer colors (e.g., yellow), helping to instantly identify the field's core and periphery [5].

VOSviewer also incorporates natural language processing techniques to extract relevant terms from titles and abstracts for building co-occurrence networks and allows for the use of a thesaurus file to consolidate synonymous terms, which is critical for clean and accurate maps [8] [7].
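
A thesaurus file is plain tab-separated text with a header row containing the columns "label" and "replace by". The Python sketch below generates one from a dictionary of synonym groups; the synonym lists and the function name are hypothetical, and the output should be checked against the VOSviewer manual for the version in use.

```python
import csv
import io

# Hypothetical synonym groups: preferred label -> variants to merge into it.
synonyms = {
    "climate change": ["global warming", "climatic change"],
    "microplastics": ["microplastic", "micro-plastics"],
}

def build_thesaurus(synonyms):
    """Return tab-separated text with a 'label' / 'replace by' header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["label", "replace by"])
    for preferred, variants in synonyms.items():
        for variant in variants:
            # Each row maps one variant onto the preferred label.
            writer.writerow([variant, preferred])
    return buffer.getvalue()

thesaurus_text = build_thesaurus(synonyms)
```

The resulting text can be written to a file (for example with `open("thesaurus.txt", "w", encoding="utf-8")`, where the file name is arbitrary) and supplied when the wizard asks for a thesaurus file.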

The field of environmental science is experiencing a significant transformation, driven by the increasing urgency of global sustainability challenges and the data-driven capabilities of bibliometric analysis. This growth is particularly visible in research addressing the United Nations' Sustainable Development Goals (SDGs), where the volume of literature has expanded dramatically. Bibliometrics provides a powerful, quantitative framework for mapping the evolution of scientific knowledge, identifying emerging trends, and understanding the collaborative networks that underpin environmental research. Originating from information and library science, bibliometric methods have evolved, combining quantitative statistics with network analysis and data visualization techniques to create insightful maps of scientific literature [12].

The application of these methods within environmental science allows researchers and policymakers to digest large, complex corpora of scientific publications. For instance, a comprehensive bibliometric analysis of Sustainable Inclusive Economic Growth (SIEG) within the framework of SDG 8 recorded a substantial increase in research output, with a notable surge after 2019 as global efforts toward the UN 2030 Agenda intensified [11]. This analysis, conducted using specialized software, helped identify the most productive countries, influential authors, and leading journals, demonstrating the practical value of bibliometrics in tracking a rapidly evolving research domain [11]. This protocol details the methods for applying bibliometric analysis, specifically using the VOSviewer software, to track and visualize this explosive growth in environmental science.

Materials

Research Reagent Solutions

Table 1: Essential Research Reagents for Bibliometric Analysis

| Item Name | Function/Brief Explanation |
| --- | --- |
| Bibliographic Database | A structured source of publication data (e.g., Scopus, Dimensions). Provides the raw metadata (authors, titles, citations, etc.) for analysis. |
| VOSviewer Software | A specialized tool for constructing and visualizing bibliometric networks. It performs the core analysis based on co-authorship, citation, co-citation, or keyword co-occurrence [12]. |
| Data Export File (.CSV) | A formatted file containing the exported records from the bibliographic database, structured for compatibility with VOSviewer [13]. |
| Analysis Threshold | A minimum frequency value (e.g., 5, 10, 20) applied to select the most relevant authors, keywords, or other units for analysis, ensuring a manageable and meaningful network [12]. |
| Color Schemes (e.g., Viridis) | Perceptually uniform color palettes used in VOSviewer for overlay and density visualizations. They improve interpretability over older schemes like the rainbow colormap [5]. |

Methods

Data Collection and Preparation Protocol

This section outlines the step-by-step procedure for gathering and preparing publication data for a bibliometric analysis of environmental science topics, such as trends related to SDG 8.

  • Define Research Scope: Clearly delineate the boundaries of the analysis. For example, a study might focus on "Sustainable Inclusive Economic Growth (SIEG) within the SDG 8 framework from 2015 to 2025" [11].
  • Select a Bibliographic Database: Access a comprehensive database such as Scopus or Dimensions. These platforms are preferred due to their extensive coverage of peer-reviewed literature.
  • Develop a Search String: Construct a structured search query using relevant keywords, Boolean operators, and filters. For instance: ( "sustainable inclusive economic growth" OR "SIEG" OR "SDG 8" ) AND PUBLICATION YEAR > 2014.
  • Apply Inclusion/Exclusion Criteria: Systematically screen results using defined criteria. The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) approach is often recommended for this process to ensure transparency [11].
  • Export Data for Bibliometric Mapping: From the database, select the export option formatted for bibliometric tools. In Dimensions, this is the specific "Export for Bibliometric Mapping" option, which generates a .CSV file compatible with VOSviewer [13]. Note the export limits based on your subscription level (e.g., 2,500 for the Free version) [13].
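
Search strings are easier to document and reuse when they are built programmatically. The sketch below composes a query in Scopus advanced-search syntax, where `TITLE-ABS-KEY` and `PUBYEAR` are Scopus field codes; other databases use different syntax, and the helper function itself is a hypothetical convenience, not part of any database's tooling.

```python
def build_query(phrases, start_year):
    """Compose a Scopus-style advanced-search string from exact phrases."""
    quoted = " OR ".join(f'"{p}"' for p in phrases)
    # PUBYEAR > N keeps records published after year N, so subtract 1
    # to include the start year itself.
    return f"TITLE-ABS-KEY({quoted}) AND PUBYEAR > {start_year - 1}"

query = build_query(
    ["sustainable inclusive economic growth", "SIEG", "SDG 8"],
    start_year=2015,
)
```

Recording the exact string alongside the export date makes the dataset reproducible, which also supports the PRISMA-style reporting described above.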

VOSviewer Analysis and Visualization Protocol

This protocol details the process of creating and interpreting bibliometric maps using VOSviewer software, from data import to figure generation.

  • Launch and Import Data:

    • Download and install VOSviewer from the official website (http://www.vosviewer.com/).
    • Open VOSviewer and select "Create".
    • Choose the data source; for data exported from Dimensions, select "Data from Dimensions" and open your previously saved .CSV file [13].
  • Select Analysis Type and Knowledge Unit:

    • Choose the type of map you wish to create based on your research question. The main options include [12]:
      • Collaboration analysis: Based on co-authorship relations between countries, institutions, or authors.
      • Topics analysis: Based on the co-occurrence of keywords or terms in titles and abstracts.
      • Citation-based analysis: Based on bibliographic coupling or co-citations.
    • Select the specific knowledge unit (e.g., "Authors", "Organizations", "All Keywords") for the analysis.
  • Set the Analysis Threshold:

    • To focus on the most significant elements, set a minimum frequency threshold [12]. For example, to analyze the most recurring keywords, you might set a threshold of 5, meaning only keywords that appear at least 5 times in the dataset are included.
    • The threshold is project-dependent; a lower threshold yields a larger, more complex network, while a higher threshold creates a smaller, more core-focused network [12].
  • Generate and Refine the Map:

    • After setting parameters, VOSviewer will calculate and display the network map.
    • Use the visualization modes to interpret the results:
      • Network Visualization: Shows items as nodes and relationships as links. The size of a node often represents its importance (e.g., publication count), and the color indicates its cluster group [5].
      • Overlay Visualization: Colors nodes based on a metric like average publication year, helping to identify temporal trends and emerging topics [5].
      • Density Visualization: Provides a quick overview of the main areas in a map, with colors indicating the density of items in a region [5].
  • Customize Visuals and Export:

    • Improve clarity by adjusting colors, fonts, and layout. VOSviewer versions 1.6.7 and newer use perceptually uniform color schemes such as viridis by default, which are less misleading than the older rainbow scheme [5].
    • For cluster-based network maps, the software uses a modified tab20 color scheme to clearly distinguish between different research fronts [5].
    • Export the final visualization as an image file for your report or publication.
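
The trade-off in the threshold step can be previewed before opening the wizard. This Python sketch uses hypothetical keyword counts to show how the number of retained items shrinks as the threshold rises; the values 5, 10, and 20 simply mirror the examples mentioned earlier.

```python
from collections import Counter

# Hypothetical keyword occurrence counts for a small dataset.
occurrences = Counter({
    "sustainable development": 42,
    "economic growth": 30,
    "climate change": 18,
    "digital economy": 9,
    "blue economy": 6,
    "financial inclusion": 4,
    "decent work": 2,
})

def items_meeting_threshold(occurrences, threshold):
    """Return the keywords that occur at least `threshold` times."""
    return sorted(k for k, n in occurrences.items() if n >= threshold)

# Number of keywords that would survive each candidate threshold.
sweep = {t: len(items_meeting_threshold(occurrences, t)) for t in (5, 10, 20)}
```

A sweep like this helps justify the chosen threshold in a methods section, since it shows what is excluded at each level rather than only the final setting.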

Workflow Diagram

The following diagram illustrates the logical workflow for a complete bibliometric analysis, from initial data collection to final interpretation.

Define Research Scope → Search Bibliographic Database (e.g., Scopus) → Export Data for Bibliometric Mapping → Import .CSV File into VOSviewer → Select Analysis Type (e.g., Co-occurrence) → Set Frequency Threshold → Generate and Customize Bibliometric Map → Interpret Results and Identify Trends → Report and Disseminate Findings

Results and Discussion

Expected Outcomes and Data Interpretation

Upon successful execution of the protocols, researchers can expect to generate several key quantitative and visual outputs that characterize the research landscape.

Table 2: Expected Core Outputs from a Bibliometric Analysis

| Output Metric | Description | Example from SIEG/SDG 8 Analysis [11] |
| --- | --- | --- |
| Productive Countries | Countries with the highest volume of publications on the topic. | China, India, and Italy emerged as the most productive. |
| Leading Journals | Journals publishing the most research in the domain. | Sustainability (Switzerland), published by MDPI, was the leading journal. |
| Influential Authors | Most cited researchers, indicating high-impact work. | Bekun FV, Onifade ST (Turkey), and Zhang X (China) were highly cited. |
| Collaboration Clusters | Groups of frequently collaborating countries/institutions. | Six co-authorship clusters were identified, with India leading one cluster (63 publications). |
| Dominant & Emerging Themes | Key topics and their evolution over time, identified via keyword analysis. | A shift from "financial inclusion & CSR" to "digital economy, blue economy, employment & entrepreneurship" was observed. |

Visualization of a Hypothetical Keyword Co-occurrence Network

The following diagram represents a hypothetical output of a keyword co-occurrence analysis generated in VOSviewer, simulating the structure and clusters one might find in a growing field like environmental science bibliometrics.

Clusters: Cluster 1: Core Concepts; Cluster 2: Social Equity; Cluster 3: Environmental Focus
Links: Sustainable Development → Sustainable Development Goals; Sustainable Development → Economic Growth; Sustainable Development → Inclusive Growth; Sustainable Development → Climate Change; Economic Growth → Blue Economy; Economic Growth → Digital Economy; Inclusive Growth → Employment; Inclusive Growth → Poverty Reduction

The analysis of a field like environmental science bibliometrics itself reveals dynamic patterns. The observed substantial increase in research output post-2019 in SIEG/SDG 8 research is a microcosm of the broader "explosive growth" in environmental science [11]. This surge is likely linked to intensified global efforts toward the 2030 Agenda for Sustainable Development.

Bibliometric mapping allows researchers to move beyond simple publication counts to understand the intellectual structure of a field. The identification of distinct thematic clusters (e.g., core concepts, social equity, environmental focus) and their interconnections, as visualized in the network diagram, helps pinpoint specialized research fronts and potential interdisciplinary bridges. Furthermore, the thematic evolution from traditional topics like financial inclusion toward cutting-edge areas like the digital and blue economy provides critical insights for scientists and funders seeking to anticipate future research directions [11].

The choice of visualization tools and settings directly impacts the clarity and accuracy of these insights. The adoption of perceptually uniform color schemes like viridis in VOSviewer, for example, prevents the misinterpretation of data that can occur with the previously used rainbow color scheme, ensuring that trends related to the average publication year or citation impact are correctly perceived [5].

Within the field of environmental science research, the ability to systematically map the intellectual landscape is crucial for identifying emerging trends, collaborative networks, and foundational knowledge. Bibliometric analysis has emerged as a powerful methodology for this purpose, originating from information and library science and combining quantitative methods with network analysis and data visualization [12]. The VOSviewer software, a tool whose name is short for "Visualization of Similarities," is specifically designed for creating, visualizing, and exploring bibliometric maps based on network data [14] [12]. Developed by van Eck and Waltman from Leiden University, it has become an indispensable instrument in the scientist's toolkit, enabling researchers to transform complex publication data into interpretable visual networks that reveal the underlying structure of scientific fields [12]. This application note details the protocols for employing VOSviewer to map research trends, identify key authors, and unveil thematic clusters, with specific examples framed within environmental science research.

Key Applications and Experimental Protocols

VOSviewer supports several core types of analysis, each designed to answer different research questions. The general workflow begins with data extraction from bibliographic databases like Scopus or Web of Science, followed by data preprocessing to ensure compatibility, and culminates in network construction and visualization within VOSviewer [11] [15]. The table below summarizes the key applications, their objectives, and the required data.

Table 1: Core Bibliometric Analysis Applications in VOSviewer

| Application | Primary Research Objective | Type of Data Required | Visualization Output |
| --- | --- | --- | --- |
| Co-authorship Analysis | Map collaboration patterns between individuals, institutions, or countries. | Author names, institutional affiliations, countries. | Network where nodes represent authors/institutions/countries; links represent joint publications. |
| Co-occurrence Analysis | Identify thematic clusters and conceptual structure of a field. | Keywords (author-generated or database-tagged) or terms from titles and abstracts. | Network where nodes represent keywords/terms; links represent frequency of co-occurrence. |
| Citation-Based Analysis | Identify influential works, authors, and journals, and map intellectual foundations. | Document citation data, author citation data, or journal citation data. | Network where nodes represent publications/authors/journals; links represent citation relationships. |
| Bibliographic Coupling | Map relationships between documents that cite the same references, revealing topical similarity. | Reference lists of citing documents. | Network where nodes are documents; link strength depends on number of shared references. |

Protocol 1: Co-occurrence Analysis

Objective: To identify and visualize the main thematic areas and their evolution within a research domain, such as sustainable inclusive economic growth or machine learning in environmental chemical research [11] [15].

Methodology:

  • Data Collection and Preparation:

    • Execute a structured search query in a bibliographic database (e.g., Scopus, Web of Science) relevant to the research field. For example, a search might focus on "Sustainable Inclusive Economic Growth" and "SDG 8" [11].
    • Define explicit inclusion and exclusion criteria and apply a workflow like the PRISMA approach to refine the dataset.
    • Export the complete bibliographic records of the final article set, including titles, abstracts, keywords, and references, in a format compatible with VOSviewer (e.g., RIS, CSV).
  • Network Creation and Threshold Setting:

    • Launch VOSviewer and select "Create" followed by "Create a map based on bibliographic data."
    • Choose the data source (e.g., the exported file) and select "Co-occurrence" as the analysis type. For the unit, select "All keywords" or "Author Keywords." A threshold is applied to select the minimum frequency of occurrence for a keyword to be included in the network [12]. This threshold (e.g., 5, 10, 15) is set to extract a core, meaningful knowledge network; a lower threshold results in a larger, more complex network [12].
    • VOSviewer will then display the number of keywords meeting the threshold. The user can adjust the threshold to achieve a network of manageable size and relevance.
  • Visualization and Interpretation:

    • VOSviewer generates a network map where nodes represent keywords and their size reflects the frequency of occurrence. The links between nodes represent the strength of co-occurrence [11].
    • Clusters: Items (keywords) are colored based on the cluster to which they belong, identified by a smart local moving algorithm [14]. Each cluster represents a distinct thematic area. For instance, an analysis of machine learning in environmental chemical research identified clusters centered on "ML model development," "water quality prediction," and "PFAS" [15].
    • Overlay Visualization: To map trends over time, use the overlay visualization. The color of a node can indicate the average publication year of the articles in which the keyword appears, revealing the evolution of topics from earlier (cooler colors like blue in the viridis scheme) to more recent (warmer colors like yellow) [5].
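
The threshold step in this protocol can be illustrated outside VOSviewer. The following is a minimal sketch, assuming each document has already been reduced to a list of keywords; the toy records and the function name `cooccurrence_network` are hypothetical, and the logic only approximates what VOSviewer does internally (it counts a keyword once per document and links every pair of retained keywords that appear together).

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records: each document is reduced to its keyword list.
documents = [
    ["microplastics", "water quality", "machine learning"],
    ["microplastics", "water quality"],
    ["machine learning", "water quality", "PFAS"],
    ["microplastics", "PFAS"],
]


def cooccurrence_network(docs, min_occurrences):
    """Return (nodes, links) for keywords that meet the occurrence threshold."""
    # Count each keyword at most once per document.
    occurrences = Counter(kw for doc in docs for kw in set(doc))
    kept = {kw for kw, n in occurrences.items() if n >= min_occurrences}

    # A link between two kept keywords is strengthened by one for every
    # document in which both appear.
    links = Counter()
    for doc in docs:
        for pair in combinations(sorted(set(doc) & kept), 2):
            links[pair] += 1

    nodes = {kw: occurrences[kw] for kw in kept}
    return nodes, dict(links)


nodes, links = cooccurrence_network(documents, min_occurrences=3)
print(nodes)
print(links)
```

Lowering `min_occurrences` to 2 keeps all four keywords in this toy set, which mirrors the trade-off described above: a lower threshold yields a larger and denser network.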

The following diagram illustrates the logical workflow for this co-occurrence analysis:

[Diagram: Protocol 1: Co-occurrence Analysis Workflow. Define Research Scope & Query → Data Collection from Scopus/WoS → Data Preprocessing & Export → VOSviewer: Set Keyword Threshold → Generate & Analyze Co-occurrence Map → Identify Thematic Clusters → Interpret Trends via Overlay Visualization]

Protocol 2: Identifying Key Authors and Collaboration Networks

Objective: To identify the most influential researchers and patterns of scientific collaboration within a specific field using co-authorship and citation analysis.

Methodology:

  • Data Collection: Follow a similar data collection and preparation procedure as in Protocol 1, ensuring that author names and affiliations are included in the exported data.

  • Network Creation:

    • In VOSviewer, select "Co-authorship" as the analysis type. The unit can be "Authors," "Organizations," or "Countries."
    • Set a minimum threshold for the number of documents or citations for an author/organization/country to be included. This ensures the analysis focuses on the most active contributors. The software can also be set to ignore publications with extreme numbers of authors ("hyperauthorship") to prevent them from distorting the map [5].
  • Visualization and Interpretation:

    • The resulting network map will display nodes (authors, institutions, or countries) connected by links indicating co-authorship. Larger nodes represent more prolific entities [11].
    • Influence Metrics: The total strength of an author's citation links or their number of publications can identify key contributors. For example, in a study on SDG 8, researchers like Bekun FV and Onifade ST were identified as highly cited authors [11].
    • Collaboration Clusters: Different colored clusters reveal distinct collaborative networks. For instance, a bibliometric analysis of SDG 8 identified six country-level collaboration clusters, with India being a leading collaborator [11].
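
To make the network-creation settings concrete, here is a minimal sketch of how co-authorship links could be counted while skipping hyperauthored papers and applying a minimum document threshold. The records, the `max_authors` value of 25, and the function name are illustrative assumptions, not VOSviewer defaults confirmed by the source.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records; names are placeholders, not real authors.
papers = [
    {"authors": ["Author A", "Author B"]},
    {"authors": ["Author A", "Author B", "Author C"]},
    {"authors": [f"Consortium member {i}" for i in range(40)]},
]


def coauthorship_links(records, max_authors=25, min_documents=1):
    """Count joint publications per author pair (full counting)."""
    # Drop papers with more authors than the cut-off ("hyperauthorship").
    kept_records = [r for r in records if len(r["authors"]) <= max_authors]

    # Number of retained documents per author.
    doc_counts = Counter(a for r in kept_records for a in set(r["authors"]))
    eligible = {a for a, n in doc_counts.items() if n >= min_documents}

    # One link unit per paper for every pair of eligible co-authors.
    links = Counter()
    for r in kept_records:
        for pair in combinations(sorted(set(r["authors"]) & eligible), 2):
            links[pair] += 1
    return dict(doc_counts), dict(links)


doc_counts, links = coauthorship_links(papers, max_authors=25, min_documents=2)
print(links)
```

In this toy example the 40-author paper is excluded entirely, and "Author C" falls below the two-document threshold, so only the link between the two most active authors remains.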

Table 2: Quantitative Data from a Bibliometric Analysis of SDG 8 Research (2015-2025)

| Metric | Findings | Implications |
| --- | --- | --- |
| Most Productive Countries | China, India, Italy | Indicates geographic centers of research productivity and potential policy focus. |
| Leading Journal | Sustainability (Switzerland) | Identifies the primary academic outlet for this research field. |
| Highly Cited Researchers | Bekun FV, Onifade ST (Turkey); Zhang X (China) | Highlights influential authors and thought leaders. |
| Number of Country Clusters | 6 distinct collaboration clusters | Reveals the structure of international research collaboration. |
| Thematic Evolution | Shift from financial inclusion/CSR (2014-2023) to digital economy, blue economy (2024-2025) | Tracks the progression of research foci over time, showing emerging trends. |

The Scientist's Toolkit: Research Reagent Solutions

In the context of bibliometric analysis with VOSviewer, "research reagents" refer to the essential software, data, and methodological components required to conduct a successful analysis.

Table 3: Essential Research Reagents for VOSviewer Analysis

| Reagent / Tool | Function / Description | Application Note |
| --- | --- | --- |
| Bibliographic Database | Source of raw publication data (e.g., Scopus, Web of Science). | Provides structured data including citations, abstracts, and author affiliations for export and analysis. |
| VOSviewer Software | Core tool for constructing, visualizing, and exploring bibliometric maps. | Used for all network types (co-authorship, co-occurrence, citation). The latest versions use perceptually uniform color schemes like viridis by default [5]. |
| Data Preprocessing Scripts | Code (e.g., in Python, R) for cleaning and standardizing raw data. | Ensures data consistency (e.g., standardizing author name variants) before import, improving map accuracy. |
| VOSviewer Color Schemes | Define the palette for overlay and density visualizations. | The viridis scheme is perceptually uniform, aiding accurate interpretation. Coolwarm is a diverging scheme useful for highlighting above/below average values [5]. |
| Clustering Algorithm | The smart local moving algorithm used to identify thematic clusters. | Automatically groups related items (e.g., authors, keywords) within the network, defining the map's structure [14]. |

Workflow Diagram for Comprehensive Analysis

A full bibliometric study often integrates multiple analysis types. The following diagram outlines the comprehensive end-to-end workflow, from data acquisition to final interpretation, connecting the various protocols and reagents.

[Diagram: Comprehensive VOSviewer Bibliometric Workflow. Phase 1, Data Preparation: Define Research Question → Search Bibliographic Database (Scopus/WoS) → Apply PRISMA & Export Data. Phase 2, Network Analysis & Visualization (three parallel branches fed by the exported data): Co-authorship Analysis (Key Authors & Collaboration); Co-occurrence Analysis (Thematic Clusters & Trends); Citation Analysis (Influential Works). Phase 3, Synthesis & Reporting: Synthesize Findings from All Analyses → Report Insights & Future Directions]

This application note provides a comprehensive technical overview of VOSviewer, a specialized software tool for constructing and visualizing bibliometric networks. Aimed at researchers, scientists, and drug development professionals, we document detailed protocols for leveraging VOSviewer's workflow within environmental science research contexts. The guidance covers data acquisition from major bibliographic databases, network construction techniques, visualization customization, and interpretation of analytical outputs. Emphasis is placed on practical methodology with structured data presentation and visual workflow documentation to facilitate immediate implementation by users conducting bibliometric analysis on scientific landscapes, particularly in tracking research trends on environmental degradation.

VOSviewer is a specialized software tool designed for constructing and visualizing bibliometric networks. These networks can represent various scholarly entities including journals, researchers, or individual publications, and are constructed based on citation analysis, bibliographic coupling, co-citation, or co-authorship relations. The software also provides text mining functionality to construct and visualize co-occurrence networks of important terms extracted from scientific literature [6]. Developed by the Centre for Science and Technology Studies (CWTS) in Leiden, VOSviewer has become an essential tool for researchers conducting systematic reviews and science mapping studies.

For environmental science researchers, VOSviewer offers powerful capabilities to map the rapidly evolving landscape of sustainability research. The tool enables the identification of key trends, influential authors, and emerging topics in fields such as environmental degradation, where publication growth exceeds 80% annually [16]. This application note provides a structured framework for utilizing VOSviewer within research contexts, with specific examples drawn from environmental bibliometrics.

VOSviewer Software Specifications and Data Compatibility

Current Software Version Information

VOSviewer undergoes regular updates to enhance functionality and data compatibility. The following table summarizes recent version improvements relevant to research applications:

Table 1: VOSviewer Version History and Feature Development

| Version | Release Date | Key Features and Improvements |
| --- | --- | --- |
| 1.6.20 | October 31, 2023 | Improved map creation from API data; enhanced support for Scopus' new export format [6] |
| 1.6.19 | January 23, 2023 | Improved OpenAlex data support; fixed Web of Science data processing issues [6] |
| 1.6.18 | January 24, 2022 | Added OpenAlex data source; Europe PMC full-text search; Semantic Scholar organization maps; Crossref ROR ID querying [6] |
| 1.6.17 | July 22, 2021 | Online map sharing via VOSviewer Online; Lens export support; JSON file format [6] |

Data Source Compatibility

VOSviewer supports multiple data sources, which is crucial for comprehensive bibliometric analysis. The software can process data from:

  • Bibliographic Databases: Direct export files from Scopus, Web of Science, Dimensions, and Lens [6]
  • Open Data Sources: OpenAlex, Europe PMC, Semantic Scholar, and Crossref [6]
  • API Integration: Direct querying of Microsoft Academic (discontinued but historically significant), Crossref, and other sources [6]

The flexibility in data sourcing enables researchers to construct analyses from multiple literature corpora, enhancing the robustness of bibliometric findings.

Experimental Protocol: Bibliometric Analysis of Environmental Degradation Research

Data Collection and Preparation Methodology

This protocol outlines the methodology for analyzing research determinants of carbon emissions and environmental degradation, adapting the approach used in a recent bibliometric study [16].

Step 1: Database Selection and Search Strategy

  • Select the Scopus database, accessed via Elsevier's platform
  • Define search keywords: "determinants OR factor", "carbon emission OR CO2", "environmental degradation"
  • Set timeframe: June 1993 to May 2024 (or adjust according to research needs)
  • Apply language filter: English only (covers 98.16% of relevant literature)
  • Document type: Restrict to research articles only
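
The search settings in Step 1 can be assembled into a single advanced-search string. The sketch below is illustrative only: it assumes the three keyword groups are combined with AND, approximates the June 1993 to May 2024 window at year granularity, and uses the Scopus field codes `TITLE-ABS-KEY`, `PUBYEAR`, `LANGUAGE`, and `DOCTYPE`, which should be checked against the current Scopus search documentation. The function name `build_scopus_query` is hypothetical.

```python
def build_scopus_query(term_groups, start_year, end_year,
                       language="english", doctype="ar"):
    """Join keyword groups and filters into one advanced-search string."""
    clauses = []
    for group in term_groups:
        # Terms inside a group are alternatives (OR).
        alternatives = " OR ".join(f'"{term}"' for term in group)
        clauses.append(f"TITLE-ABS-KEY({alternatives})")

    # Year filter is inclusive of start_year and end_year.
    clauses.append(f"PUBYEAR > {start_year - 1}")
    clauses.append(f"PUBYEAR < {end_year + 1}")
    clauses.append(f"LANGUAGE({language})")
    clauses.append(f"DOCTYPE({doctype})")  # "ar" is assumed to mean article
    return " AND ".join(clauses)


query = build_scopus_query(
    [["determinants", "factor"],
     ["carbon emission", "CO2"],
     ["environmental degradation"]],
    start_year=1993,
    end_year=2024,
)
print(query)
```

Recording the generated string alongside the export date makes the search reproducible when the dataset is refreshed.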

Step 2: Data Export

  • Execute search and refine results
  • Export all relevant records in compatible format (e.g., RIS, CSV)
  • Save data file for VOSviewer import

Step 3: Data Import into VOSviewer

  • Launch VOSviewer and select "Create" → "Create a map based on bibliographic data"
  • Choose "Read data from bibliographic database files"
  • Import the saved data file
  • Select appropriate parsing options for the database format

Network Construction Parameters

Step 4: Network Type Selection

  • Based on research questions, select from:
    • Co-authorship: Authors, organizations, countries
    • Citation: Journals, documents, authors
    • Bibliographic coupling: Documents, sources, authors
    • Co-citation: References, sources, authors
    • Term co-occurrence: All keywords, author keywords, titles/abstracts
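
Bibliographic coupling and co-citation are easy to confuse because both are derived from the same reference lists. The following minimal sketch, using hypothetical document and reference identifiers, shows the difference: coupling links two citing documents by their shared references, while co-citation links two cited references by the documents that cite both.

```python
from collections import Counter
from itertools import combinations

# Hypothetical reference lists: citing document -> set of cited references.
references = {
    "doc1": {"refA", "refB", "refC"},
    "doc2": {"refB", "refC", "refD"},
    "doc3": {"refA"},
}


def bibliographic_coupling(refs):
    """Link strength between two documents = number of shared references."""
    strength = {}
    for d1, d2 in combinations(sorted(refs), 2):
        shared = len(refs[d1] & refs[d2])
        if shared:
            strength[(d1, d2)] = shared
    return strength


def co_citation(refs):
    """Link strength between two references = documents citing both."""
    strength = Counter()
    for cited in refs.values():
        for pair in combinations(sorted(cited), 2):
            strength[pair] += 1
    return dict(strength)


coupling = bibliographic_coupling(references)
cocitation = co_citation(references)
print(coupling)
print(cocitation)
```

Both measures depend on complete reference lists, which is why the export steps elsewhere in this guide stress including cited references.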

Step 5: Counting Method and Thresholds

  • Select counting method (full counting or fractional counting)
  • Apply minimum threshold for item selection
  • For reference, the environmental degradation study cited here analyzed a sample of 1,365 documents [16]
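
The difference between the two counting methods can be shown for a single document. This is a minimal sketch based on our reading of the VOSviewer manual, where fractional counting gives each co-authorship link of a document with n authors a weight of 1/(n - 1); the function name `link_weights` is hypothetical and the behavior should be verified against the manual for the analysis type in use.

```python
from itertools import combinations


def link_weights(authors, counting="full"):
    """Weight of each co-authorship link contributed by one document."""
    unique = sorted(set(authors))
    n = len(unique)
    if n < 2:
        return {}  # a single-author paper creates no links
    if counting == "full":
        weight = 1.0
    elif counting == "fractional":
        # Assumed rule: each author's links from this paper sum to 1.
        weight = 1.0 / (n - 1)
    else:
        raise ValueError("counting must be 'full' or 'fractional'")
    return {pair: weight for pair in combinations(unique, 2)}


full = link_weights(["A", "B", "C"], counting="full")
fractional = link_weights(["A", "B", "C"], counting="fractional")
print(full)
print(fractional)
```

Under fractional counting, a paper with many authors contributes much weaker links per pair, which reduces the influence of large consortia on the map.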

Step 6: Network Visualization and Refinement

  • Allow VOSviewer to calculate the network
  • Apply visualization settings (layout, clustering, labels)
  • Refine map for clarity and interpretation

Visualization Workflows and Diagrammatic Representations

Core VOSviewer Analysis Workflow

The following flowchart represents the primary workflow for conducting bibliometric analysis in VOSviewer:

[Diagram: Define Research Objectives → Select Bibliographic Database → Execute Search Strategy → Export Records → Import Data into VOSviewer → Select Network Type → Set Analysis Thresholds → Generate & Refine Visualization → Interpret Results → Export Visualization]

Bibliometric Analysis Workflow in VOSviewer

Network Visualization Types and Applications

Table 2: VOSviewer Network Types and Research Applications in Environmental Science

| Network Type | Analysis Level | Research Application | Environmental Science Example |
| --- | --- | --- | --- |
| Co-authorship | Authors, Organizations, Countries | Identify collaboration patterns | Mapping international collaborations in climate change research [16] |
| Citation | Documents, Sources, Authors | Assess influence and impact | Identifying seminal papers on environmental Kuznets curve [16] |
| Bibliographic coupling | Documents, Sources, Authors | Group conceptually similar items | Clustering research on renewable energy and economic growth [16] |
| Co-citation | References, Sources, Authors | Map intellectual structure | Analyzing theoretical foundations of carbon emission studies [16] |
| Term co-occurrence | Keywords, Terms from text | Identify conceptual themes | Tracking evolution of research themes in environmental degradation [16] |

Color Scheme Selection for Scientific Visualization

VOSviewer offers multiple color schemes optimized for different analytical purposes. The software transitioned from the rainbow color scheme to more perceptually uniform alternatives in version 1.6.7 [5]. The following diagram illustrates the color scheme selection process:

[Diagram not reproduced: Color Scheme Selection Guide for VOSviewer Visualizations]

Research Reagent Solutions: Essential Materials for VOSviewer Analysis

Table 3: Essential Research Materials for Bibliometric Analysis with VOSviewer

| Research Reagent | Function in Analysis | Implementation Example |
| --- | --- | --- |
| Scopus Database | Primary bibliographic data source for comprehensive coverage | Exporting 1365 documents on environmental degradation determinants [16] |
| VOSviewer Software | Network construction and visualization tool | Creating co-occurrence maps of keywords in environmental research [6] [16] |
| OpenAlex Data | Open bibliographic data alternative | Creating maps based on open data sources following Microsoft Academic discontinuation [6] |
| VOSviewer Online | Web-based visualization sharing platform | Embedding interactive bibliometric maps in online research platforms [6] |
| CitNetExplorer | Complementary citation network analysis tool | Analyzing citation networks of publications on carbon emissions [6] |
| Thematic Analysis Framework | Qualitative interpretation of visual patterns | Identifying key research themes like economic growth and renewable energy in environmental degradation [16] |

Application Case Study: Environmental Degradation Research Mapping

Protocol Implementation and Results

Implementing the experimental protocol described above for environmental degradation research yielded significant bibliometric insights [16]:

Key Findings from Environmental Degradation Analysis:

  • Annual growth rate of publications exceeded 80%, indicating rapidly increasing research interest
  • Primary research themes included economic growth, renewable energy, and Environmental Kuznets Curve
  • Geographical distribution showed China, Pakistan, and Turkey as leading in research output
  • Methodological application of VOSviewer successfully identified key factors driving carbon emissions research

Visualization Parameters:

  • Network type: Term co-occurrence from titles and abstracts
  • Minimum term occurrence: 5 times
  • Counting method: Full counting
  • Visualization: Overlay visualization showing thematic evolution
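
The overlay visualization colors each term by a score, commonly the average publication year of the documents in which the term occurs. A minimal sketch of that score is given below, assuming hypothetical records with a `year` and a list of `terms`; it illustrates the idea rather than reproducing VOSviewer's internal computation.

```python
from collections import defaultdict

# Hypothetical records: publication year plus extracted terms.
records = [
    {"year": 2016, "terms": ["economic growth", "carbon emissions"]},
    {"year": 2020, "terms": ["renewable energy", "carbon emissions"]},
    {"year": 2024, "terms": ["renewable energy"]},
]


def average_publication_year(recs, min_occurrences=1):
    """Average year of the documents mentioning each term."""
    years = defaultdict(list)
    for rec in recs:
        for term in set(rec["terms"]):  # count a term once per document
            years[term].append(rec["year"])
    return {
        term: sum(values) / len(values)
        for term, values in years.items()
        if len(values) >= min_occurrences
    }


scores = average_publication_year(records)
print(scores)
```

Terms with a later average year would appear toward the recent end of the color scale, which is how an overlay map signals emerging themes.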

Advanced Technical Considerations

Color Accessibility in Scientific Visualizations: VOSviewer's transition from rainbow color schemes to perceptually uniform alternatives like viridis addresses several limitations [5]:

  • Perceptual ordering: Viridis provides intuitive progression from low to high values
  • Detail preservation: Avoids obscuring small data variations in certain color ranges
  • Accessibility: Improved interpretability for users with color vision deficiencies
  • Print compatibility: Maintains distinction when printed in grayscale

Data Source Migration: With the discontinuation of Microsoft Academic, VOSviewer has implemented support for alternative sources including OpenAlex, Europe PMC, and Semantic Scholar [6]. Researchers should verify current data source compatibility when designing bibliometric studies.

VOSviewer provides a robust methodological framework for conducting bibliometric analysis in environmental science research. The structured workflow encompassing data collection, network construction, visualization, and interpretation enables researchers to systematically map scientific landscapes and identify emerging trends. The application of these protocols to environmental degradation research demonstrates the practical utility of VOSviewer in tracking evolution of research themes, collaboration patterns, and knowledge structures. As bibliometric methodology continues to evolve, VOSviewer's ongoing development ensures compatibility with emerging data sources and visualization best practices, maintaining its position as an essential tool for research assessment and science mapping.

A Step-by-Step Workflow for Environmental Bibliometric Analysis

Bibliometric analysis has become an indispensable methodology in environmental science research, enabling researchers to map knowledge domains, identify emerging trends, and visualize collaborative networks. Within this context, VOSviewer has emerged as a powerful tool for constructing and visualizing bibliometric networks, supporting multiple analysis types including co-authorship, co-occurrence, citation, and bibliographic coupling. However, the quality and compatibility of input data fundamentally determine the success of these analyses. This protocol provides comprehensive guidance for exporting bibliographic data from three major sources—Scopus, Web of Science, and OpenAlex—with specific consideration for environmental science applications and VOSviewer compatibility.

The data acquisition process presents significant challenges, including platform-specific export limitations, format inconsistencies, and metadata completeness variations. Environmental science researchers particularly benefit from comprehensive data acquisition strategies due to the interdisciplinary nature of their field, which spans ecological systems, environmental chemistry, climate science, and sustainability studies. Proper data export and harmonization ensure that subsequent VOSviewer visualizations accurately represent the complex intellectual structure of environmental science research domains.

Table 1: Comparison of Export Capabilities from Major Bibliometric Data Sources

| Platform | Export Formats | Record Limits | VOSviewer Compatibility | Key Metadata Fields |
| --- | --- | --- | --- | --- |
| Scopus | CSV, RIS | 2,000 per export [17] | High (CSV recommended for most analysis types) [17] | Citation information, affiliations, abstracts, keywords, references [17] |
| Web of Science | Plain Text (.txt), RIS | 500-1,000 per export [17] | High (Plain text recommended for most analysis types) [17] | Full record, cited references, author, title, source, abstract [17] |
| OpenAlex | CSV, RIS, Text (.txt) | 100,000 per export [18] | Moderate (may require format adjustment) | Flattened work data, reference managers compatibility [18] |

Table 2: Analysis Type Compatibility with VOSviewer by Export Format

| Analysis Type | Scopus CSV | WoS Plain Text | OpenAlex CSV | RIS Formats |
| --- | --- | --- | --- | --- |
| Co-authorship | Supported [17] | Supported [17] | Limited | Limited [17] |
| Co-occurrence | Supported [17] | Supported [17] | Limited | Supported [17] |
| Citation | Supported [17] | Supported [17] | Limited | Limited |
| Co-citation | Supported [17] | Supported [17] | Limited | Limited |
| Bibliographic Coupling | Supported [17] | Supported [17] | Limited | Limited |

Experimental Protocols

Scopus Data Export Protocol

Scopus provides extensive coverage of environmental science literature, particularly strong in pollution research, ecological engineering, and environmental chemistry. The following protocol optimizes data extraction for VOSviewer analysis:

  • Search Strategy Formulation:

    • Access Scopus through institutional subscription
    • Develop comprehensive search queries using relevant environmental science terminology
    • Utilize field codes (TITLE-ABS-KEY, AFFILCOUNTRY) to refine results
    • Apply date ranges appropriate to research objectives
  • Export Configuration:

    • Select relevant records from search results (maximum 2,000 per export) [17]
    • Click "Export" button and select "CSV" format [17]
    • Configure export settings to include:
      • Citation information
      • Author details and affiliations
      • Abstracts and keywords
      • References (critical for citation-based analyses) [17]
    • Initiate export and download completed file
  • VOSviewer Preprocessing:

    • Open CSV in spreadsheet software to verify data integrity
    • Check for consistent field separation
    • Ensure author names and affiliations follow consistent formatting
    • Validate reference list completeness for citation analysis
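
The preprocessing checks above can be partly automated. The sketch below reports missing columns and empty cells in a Scopus-style CSV; the column names in `REQUIRED_COLUMNS` are assumptions modeled on a typical Scopus export and should be adjusted to the header of the actual file, and the in-memory sample stands in for a real export.

```python
import csv
import io

# Assumed column names; compare with the header of your own export.
REQUIRED_COLUMNS = ["Authors", "Title", "Year", "Affiliations",
                    "Author Keywords", "References"]


def check_scopus_csv(handle, required=REQUIRED_COLUMNS):
    """Report missing columns and count empty cells per required column."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing_columns = [c for c in required if c not in header]
    empty_counts = {c: 0 for c in required if c in header}
    rows = 0
    for row in reader:
        rows += 1
        for column in empty_counts:
            if not (row[column] or "").strip():
                empty_counts[column] += 1
    return {"rows": rows,
            "missing_columns": missing_columns,
            "empty_counts": empty_counts}


# Hypothetical two-record sample with no "References" column.
sample = io.StringIO(
    "Authors,Title,Year,Affiliations,Author Keywords\n"
    '"Doe J.; Roe K.",Example title,2024,"Univ X; Univ Y","microplastics; rivers"\n'
    '"Poe L.",Another title,2023,,\n'
)
report = check_scopus_csv(sample)
print(report)
```

For a real file, the handle would typically come from `open(path, newline="", encoding="utf-8-sig")`; the encoding is an assumption that handles a possible byte-order mark at the start of the export.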

[Diagram: Start Scopus Export → Formulate Search Strategy → Apply Environmental Science Filters → Select Records (≤2000) → Configure CSV Export → Include: Citations, Affiliations, Abstracts, Keywords, References → Download CSV File → Preprocess for VOSviewer → VOSviewer Ready Data]

Scopus Data Export Workflow: Sequential steps for exporting bibliographic data from Scopus and preparing it for VOSviewer analysis.

Web of Science Data Export Protocol

Web of Science offers robust coverage of environmental science journals, particularly in fundamental ecology, environmental biology, and conservation science. This protocol maximizes data utility while working within platform constraints:

  • Search Optimization:

    • Access Web of Science through institutional portal
    • Construct targeted search queries using Web of Science indexing terminology
    • Utilize "Advanced Search" for complex Boolean operations
    • Filter results by Web of Science categories relevant to environmental sciences
  • Export Execution:

    • Select records for export (limit: 500 for full analysis) [17]
    • Click "Export" and select "Plain Text" format [17]
    • Configure "Record Content" to include:
      • "Full Record" for complete metadata
      • "Cited References" for citation analysis [17]
    • Set "Records from 1 to [maximum allowed]" based on export limits
  • Data Validation:

    • Verify text file structure and encoding
    • Confirm presence of both bibliographic data and reference lists
    • Check for consistent field tagging throughout records
    • Ensure author addresses are included for institutional analysis
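
The validation steps above can be scripted with a small parser. This is a minimal sketch that assumes the usual layout of a Web of Science plain-text export: two-letter field tags at the start of a line, indented continuation lines, `ER` closing each record, and `EF` closing the file. The tags checked here (`AU`, `TI`, `C1`, `CR`) and the sample text are illustrative and should be confirmed against an actual export.

```python
def parse_wos_plaintext(text):
    """Split a tagged plain-text export into a list of {tag: [values]}."""
    records, current, tag = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank separator lines
        if line[0].isspace():
            # Continuation of the previous field.
            if tag is not None:
                current[tag].append(line.strip())
            continue
        tag, value = line[:2], line[3:].strip()
        if tag in ("FN", "VR", "EF"):
            tag = None  # file-level header and footer tags
        elif tag == "ER":
            records.append(current)  # end of record
            current, tag = {}, None
        else:
            current.setdefault(tag, []).append(value)
    return records


def missing_tags(record, required=("AU", "TI", "C1", "CR")):
    """List the required tags that a record lacks."""
    return [t for t in required if t not in record]


# Hypothetical two-record sample; the second lacks addresses and references.
sample = """FN Clarivate Analytics Web of Science
VR 1.0
PT J
AU Doe, J
   Roe, K
TI Example title on river
   microplastics
C1 [Doe, J] Univ X, City, Country.
CR Smith A, 2019, J EXAMPLE, V1, P1
   Lee B, 2020, J EXAMPLE, V2, P2
ER

PT J
AU Poe, L
TI Second example
ER

EF
"""
parsed = parse_wos_plaintext(sample)
problems = [missing_tags(r) for r in parsed]
print(problems)
```

Records that lack `CR` cannot contribute to citation-based maps, and records that lack `C1` cannot contribute to organization or country maps, so a non-empty list is a signal to re-export with the full record and cited references.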

[Diagram: Start WoS Export → Access Institutional Portal → Build Advanced Search Query → Apply Environmental Science Filters → Select Records (≤500) → Select Plain Text Format → Configure: Full Record + Cited References → Execute Export → Validate File Structure → Analysis-Ready Data]

WoS Data Export Workflow: Step-by-step process for exporting data from Web of Science in VOSviewer-compatible format.

OpenAlex Data Export Protocol

OpenAlex provides an open alternative to traditional bibliometric databases with significantly higher export limits, particularly valuable for comprehensive environmental science mapping studies:

  • Search Implementation:

    • Access the OpenAlex website, which is publicly available without a subscription
    • Construct search queries using relevant environmental science terminology
    • Utilize available filters to refine results by publication year, document type, or subject area
  • Export Configuration:

    • Identify Works result set for export
    • Click export button above results [18]
    • Select appropriate file format:
      • CSV for VOSviewer analysis (with Excel compatibility checked if needed) [18]
      • RIS for reference manager integration [18]
      • Text for Web of Science format compatibility [18]
    • For CSV exports targeting Excel users: check "Shorten column values for Excel compatibility" to avoid import errors; note that this option truncates long values [18]
  • Large-Scale Export Management:

    • Initiate export preparation (may require several minutes for large sets) [18]
    • Wait for system-generated download link
    • Download completed file (maximum 100,000 works) [18]
    • Verify file completeness and structure

[Diagram: Start OpenAlex Export → Perform Environmental Science Search → Identify Works Result Set → Select Export Format: CSV, RIS, or Text → For Excel: Check Compatibility (Truncates Long Values) → System Prepares File (Minutes for Large Sets) → Wait for Download Link → Download File (≤100,000 works) → Verify Data Integrity → Export Complete]

OpenAlex Data Export Workflow: Procedure for exporting large datasets from OpenAlex while maintaining data quality and compatibility.

Data Harmonization Protocol Using BibexPy

Bibliometric analyses in environmental science often require integrating data from multiple sources to overcome individual database limitations. BibexPy provides a Python-based solution for harmonizing datasets from Scopus and Web of Science, addressing common challenges in bibliometric data integration:

  • Environment Setup:

    • Install BibexPy package via pip
    • Verify Python dependencies (pandas, numpy, requests)
    • Secure API keys for Unpaywall and Semantic Scholar if metadata enhancement is required
  • Data Integration Process:

    • Load exported Scopus CSV and Web of Science plain text files
    • Execute BibexPy merging function to combine datasets
    • Perform DOI-based deduplication to remove duplicate records [19]
    • Implement missing metadata enhancement through integrated APIs [19]
  • VOSviewer Preparation:

    • Export harmonized dataset in analysis-ready formats
    • Validate file structure compatibility with VOSviewer and Biblioshiny [19]
    • Execute quality checks for data completeness and consistency
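
To show what DOI-based deduplication involves, here is a minimal standard-library sketch. It does not use or reproduce the BibexPy API; the record structure, field names, example DOIs, and function names are hypothetical, and the merge rule (keep the first record, fill its empty fields from later duplicates) is one possible design choice.

```python
def normalize_doi(doi):
    """Lowercase a DOI and strip common URL or 'doi:' prefixes."""
    if not doi:
        return None
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/",
                   "https://dx.doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi.strip() or None


def merge_and_deduplicate(*sources):
    """Merge record lists, collapsing records that share a DOI."""
    merged, by_doi = [], {}
    for source in sources:
        for record in source:
            key = normalize_doi(record.get("doi"))
            if key is None:
                # Without a DOI the record cannot be matched; keep it.
                merged.append(dict(record))
            elif key in by_doi:
                # Fill gaps in the first-seen record from the duplicate.
                existing = by_doi[key]
                for field, value in record.items():
                    if value and not existing.get(field):
                        existing[field] = value
            else:
                copy = dict(record)
                copy["doi"] = key
                by_doi[key] = copy
                merged.append(copy)
    return merged


# Hypothetical records from two sources with one overlapping paper.
scopus = [
    {"doi": "10.1000/ABC", "title": "Paper one", "abstract": ""},
    {"doi": None, "title": "No DOI paper"},
]
wos = [
    {"doi": "https://doi.org/10.1000/abc", "title": "Paper one", "abstract": "Text"},
    {"doi": "10.1000/xyz", "title": "Paper two"},
]
merged = merge_and_deduplicate(scopus, wos)
print(len(merged))
```

Records without a DOI pass through unmatched in this sketch, so a follow-up check on normalized titles may still be needed before importing the merged file into VOSviewer.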

[Diagram: Start Data Harmonization → Setup BibexPy Environment → Load Scopus CSV & WoS Text Files → Merge Datasets → DOI-Based Deduplication → Enhance Metadata via Unpaywall/Semantic Scholar APIs → Export Analysis-Ready Format → Validate VOSviewer Compatibility → Harmonized Dataset Ready]

Data Harmonization Workflow: Integration and enhancement process for multi-source bibliometric data using BibexPy.

The Scientist's Toolkit: Research Reagent Solutions

Table 3: Essential Tools for Bibliometric Data Acquisition and Processing

| Tool/Platform | Function | Environmental Science Application |
| --- | --- | --- |
| Scopus CSV Export | Exports comprehensive bibliographic data with references [17] | Enables co-citation analysis of environmental policy research and collaborative networks in climate science |
| Web of Science Plain Text Export | Provides full records with cited references in compatible format [17] | Supports historical analysis of ecological research trends and emerging topics in conservation biology |
| OpenAlex CSV Export | Offers open bibliometric data with high export limits [18] | Facilitates large-scale mapping of sustainability science and renewable energy research |
| BibexPy | Harmonizes datasets from multiple sources, performs deduplication, enriches metadata [19] | Enables comprehensive analysis of interdisciplinary environmental research spanning multiple domains |
| Reference Managers (Zotero/Endnote) | Imports and manages RIS format exports [18] | Organizes literature for systematic reviews in environmental health and risk assessment |
| Unpaywall API | Provides open access status and metadata enhancement [19] | Identifies publicly accessible environmental science literature for comprehensive analysis |
| Semantic Scholar API | Enhances metadata with citation context and research entities [19] | Enriches data for machine learning applications in environmental informatics |

Effective data acquisition from Scopus, Web of Science, and OpenAlex forms the critical foundation for robust bibliometric analysis in environmental science research using VOSviewer. Each platform offers distinct advantages and limitations in terms of export capabilities, record limits, and metadata completeness. Scopus provides the most versatile CSV exports for multiple analysis types but with intermediate record limits. Web of Science offers robust plain text exports compatible with VOSviewer but with more restrictive record constraints. OpenAlex presents an attractive open alternative with significantly higher export limits, though with potential format adjustment requirements for optimal VOSviewer compatibility.

For comprehensive environmental science mapping studies, researchers should consider a strategic approach combining data from multiple sources where possible, utilizing harmonization tools like BibexPy to address challenges of duplicate records, missing metadata, and inconsistent formats [19]. This integrated approach ensures that subsequent VOSviewer visualizations and analyses accurately capture the complex, interdisciplinary nature of environmental science research domains, from climate change studies to ecological conservation and environmental technology development.

VOSviewer is a powerful software tool for constructing and visualizing bibliometric networks, enabling researchers to reveal and explore the underlying structure of complex scientific fields such as environmental science. These networks can represent various relationships, including citations, co-authorships, bibliographic couplings, and co-occurrences of key terms extracted from scientific literature. For environmental scientists and drug development professionals, this capability provides a robust methodological framework for mapping the intellectual landscape of critical research areas—from climate change mitigation technologies to the environmental impact of pharmaceuticals. By transforming large, unstructured bibliographic datasets into interpretable visual maps, VOSviewer facilitates the identification of emerging trends, influential authors and institutions, and collaborative patterns within and across research domains [20] [21].

The construction of a scientifically valid and insightful network requires careful planning and execution across several stages: data collection and preparation, network parameter configuration, visualization, and interpretation. This guide provides a detailed, step-by-step protocol for building your first bibliometric network within the context of environmental science research. The methodology outlined here emphasizes practical application, ensuring that researchers can systematically create maps that accurately represent the structural and dynamic aspects of their field of investigation, thereby supporting both literature review processes and strategic research planning [1].

The Scientist's Toolkit: Essential Research Reagents and Materials

Successful network construction in VOSviewer requires specific "research reagents" – primarily datasets and software components. The table below details the essential materials and their functions in the network construction process.

Table 1: Essential Research Reagents for VOSviewer Network Construction

| Item Name | Type/Format | Primary Function in Network Construction |
| --- | --- | --- |
| Bibliographic Database File | .txt, .csv, .ris (from Web of Science, Scopus, PubMed) | Serves as the raw data input for constructing co-authorship, citation, or co-occurrence networks based on established scholarly records [1]. |
| Plain Text Corpus | .txt (structured with paragraphs as units) | Acts as the input for constructing term co-occurrence networks through text mining, where each paragraph is treated as a context window [1]. |
| VOSviewer Desktop Application | Executable software (Windows, macOS, Linux) | The primary environment for data import, network parameter configuration, layout calculation, and visualization of the constructed network [21]. |
| Thesaurus File | .txt (simple two-column format) | Used to merge synonymous terms or author name variants (e.g., "WHO" and "World Health Organization") to ensure conceptual consistency in the network [1]. |
| VOSviewer Online | Web-based platform | Enables the sharing and embedding of interactive network visualizations in web pages, facilitating collaboration and dissemination of findings [20]. |
| Pre-existing Network File | GML (.gml), Pajek (.net) | Allows for the import and visualization of networks constructed in other software tools (e.g., Gephi), leveraging VOSviewer's visualization capabilities [1]. |
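To make the thesaurus file in Table 1 concrete, the following minimal sketch writes a two-column, tab-separated file. It assumes the header names `label` and `replace by`, which is how VOSviewer thesaurus files are commonly laid out; the synonym pairs and the output file name are hypothetical.

```python
import csv
import io

# Hypothetical synonym pairs: variant -> preferred form.
SYNONYMS = {
    "world health organization": "who",
    "micro-plastics": "microplastics",
    "waste water": "wastewater",
}

def build_thesaurus(synonyms):
    """Return thesaurus text: a header row, then one tab-separated pair per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Header names are an assumption about the expected thesaurus layout.
    writer.writerow(["label", "replace by"])
    for variant, preferred in sorted(synonyms.items()):
        writer.writerow([variant, preferred])
    return buffer.getvalue()

thesaurus_text = build_thesaurus(SYNONYMS)

# Uncomment to save the file for loading in the VOSviewer wizard.
# with open("thesaurus_terms.txt", "w", encoding="utf-8") as handle:
#     handle.write(thesaurus_text)
print(thesaurus_text)
```

Keeping the pairs in a dictionary makes it easy to review and extend the merges before each new mapping run.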

Experimental Protocols for Network Construction

Protocol 1: Constructing a Co-authorship Network from Bibliographic Data

This protocol is designed to map collaborative relationships between researchers, institutions, or countries within a specific environmental science subfield.

Step-by-Step Methodology:

  • Data Acquisition and Preparation: Export a bibliographic dataset from a database like Scopus or Web of Science using a targeted query (e.g., "pharmaceutical pollution" AND "aquatic ecosystems"). Ensure the export format is compatible with VOSviewer (e.g., RIS or plain text) [1].
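  • Initiate Map Creation: In VOSviewer, select Create > Create a map based on bibliographic data, then choose the reading option that matches your export: Read data from reference manager files for RIS files, or Read data from bibliographic database files for Web of Science or Scopus exports. Click Next and browse to your downloaded file [1].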
  • Select Network Type: Choose the type of analysis. For co-authorship, select Co-authorship. Then, choose the unit of analysis—Authors, Organizations, or Countries—depending on your research question. Click Next [1].
  • Apply Counting Method: Select the Full counting method. This method attributes one full count to each author for a publication, which is generally recommended for a balanced analysis [1].
  • Set Thresholds: Define the minimum number of documents an author/organization/country must have to be included. This filters out less active entities. For an initial map, a threshold of 5 documents is a reasonable starting point [1].
  • Select Items for Analysis: The software will present a list of entities meeting the threshold. You can manually deselect items, but typically, you would proceed by selecting all and clicking Next [1].
  • Finalize Network: When prompted to restrict the analysis to the largest connected component, select No to visualize all network components, including smaller, potentially isolated collaborative groups. Click Finish to generate the network [1].
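The counting and threshold steps of Protocol 1 can be illustrated outside VOSviewer. The sketch below is not VOSviewer's implementation; it assumes a hypothetical list of author lists, applies a minimum document threshold, and uses full counting, so every shared publication adds 1 to a pair's link strength.

```python
from collections import Counter
from itertools import combinations

# Hypothetical author lists, one per publication.
PUBLICATIONS = [
    ["Garcia, M.", "Chen, L.", "Okafor, T."],
    ["Garcia, M.", "Chen, L."],
    ["Chen, L.", "Okafor, T."],
    ["Singh, R."],
]

def coauthorship_network(publications, min_documents=1):
    """Return (documents per kept author, link strength per author pair)."""
    doc_counts = Counter()
    for authors in publications:
        doc_counts.update(set(authors))
    # Threshold step: keep only authors with enough documents.
    kept = {a for a, n in doc_counts.items() if n >= min_documents}
    links = Counter()
    for authors in publications:
        present = sorted(set(authors) & kept)
        for pair in combinations(present, 2):
            links[pair] += 1  # full counting: each shared document adds 1
    return {a: doc_counts[a] for a in kept}, dict(links)

nodes, links = coauthorship_network(PUBLICATIONS, min_documents=2)
print(nodes)
print(links)
```

With a threshold of 2, the single-paper author drops out and the Chen-Garcia and Chen-Okafor links each have strength 2. Rerunning with different thresholds shows how quickly the node count changes.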

Protocol 2: Building a Term Co-occurrence Network via Text Mining

This protocol generates a conceptual map of a research field by analyzing the co-occurrence of keywords or terms within a corpus of text, such as article titles and abstracts.

Step-by-Step Methodology:

  • Data Preparation: Compile a plain text file (e.g., environmental_abstracts.txt) where each paragraph represents a distinct textual unit, typically the abstract of a single research article [1].
  • Launch Text Mining Function: In VOSviewer, select Create > Create based on text data > Read data from text corpus files [1].
  • Configure Data Source: Choose between loading a VOSviewer format file (your prepared text corpus) or directly extracting text from a PubMed/MEDLINE formatted file, which can automatically filter out copyright statements and other noise [1].
  • Define Fields and Counting: If using a bibliographic format, select the fields to mine (e.g., Title and Abstract fields). Choose the Full counting method and load a thesaurus file if you have prepared one. Click Next [1].
  • Set Frequency Threshold: Define the minimum number of times a term must appear in the corpus to be included. A threshold of 10 occurrences helps filter out rare terms while retaining significant concepts. Click Next [1].
  • Select Relevant Terms: VOSviewer will propose the most "relevant" terms based on an algorithm that identifies terms associated with specific contexts rather than all contexts equally. Adjust the number of terms to select (e.g., the top 500). Click Next and then Finish to construct the term co-occurrence network [1].
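The data preparation step of Protocol 2 is easy to get wrong when abstracts contain internal line breaks. The sketch below assumes the corpus file should hold one abstract per line; the sample abstracts are hypothetical, and the file name reuses the example from the protocol.

```python
import re

# Hypothetical abstracts; real exports often contain internal line breaks.
ABSTRACTS = [
    "Microplastics accumulate in\nfreshwater sediments.",
    "  Urban wetlands   buffer flood risk. ",
    "",
]

def to_corpus_lines(abstracts):
    """Collapse whitespace so each non-empty abstract occupies exactly one line."""
    lines = []
    for text in abstracts:
        flat = re.sub(r"\s+", " ", text).strip()
        if flat:
            lines.append(flat)
    return lines

corpus_lines = to_corpus_lines(ABSTRACTS)
corpus_text = "\n".join(corpus_lines) + "\n"

# Uncomment to write the corpus file used in the protocol.
# with open("environmental_abstracts.txt", "w", encoding="utf-8") as handle:
#     handle.write(corpus_text)
print(corpus_text)
```

Empty abstracts are dropped so that they do not create blank textual units in the corpus.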

Data Presentation and Quantitative Configuration

The quantitative parameters set during network construction profoundly impact the resulting map's scope and interpretability. The following table summarizes key thresholds and their typical values for environmental science applications.

Table 2: Key Quantitative Parameters for Network Construction in Environmental Science

| Parameter | Protocol 1: Co-authorship | Protocol 2: Term Co-occurrence | Impact on Final Network |
| --- | --- | --- | --- |
| Minimum Document Count | 5 | N/A | Determines the minimum productivity for an author/organization/country to be included, controlling node count [1]. |
| Minimum Term Frequency | N/A | 10 | Filters out infrequent terms, focusing the map on central, recurring concepts in the literature [1]. |
| Minimum Link Strength | 1 (or context-dependent) | 1 (or context-dependent) | Removes weak connections, simplifying the network to reveal the strongest collaborative or conceptual ties [22]. |
| Number of Relevant Terms | N/A | 500 | Limits the network to the most discriminative and meaningful terms, preventing visual clutter [1]. |

Beyond construction parameters, VOSviewer allows for extensive customization of the network visualization through its configuration object, which can be stored in a JSON file. This is critical for tailoring the map for publication or presentation, such as using color schemes optimized for accessibility or highlighting specific environmental science clusters [22] [5].

Table 3: Key Configuration Parameters in the VOSviewer JSON File for Visualization Control

| Config Parameter (within parameters object) | Data Type | Description and Common Values | Visualization Impact |
| --- | --- | --- | --- |
| scale | Float | Zoom level of the visualization (≥1). A higher value zooms in. | Controls the overall visible area of the network [22]. |
| item_size | Integer | Reference for node size (1=first option, 2=second, etc.). | Influences the relative prominence of nodes [22]. |
| item_color | Integer | Determines what property defines node color (1=cluster, 2=score, etc.). | Critical for highlighting clusters or temporal trends (overlays) [22]. |
| score_colors | String | Color scheme for score-based overlays (e.g., 'Viridis', 'Plasma', 'Coolwarm'). | 'Viridis' is the new perceptually uniform default, better than the old rainbow scheme [5]. |
| min_score / max_score | Float | Defines the range for the score color legend. | Focuses the color contrast on a specific data range, enhancing interpretability [22]. |
| attraction / repulsion | Integer | Force-directed layout parameters controlling node spacing. | Adjusts the layout density and clarity; higher repulsion spreads nodes apart [22]. |
| resolution | Float | Controls the cluster detection granularity (default=1.0). | A higher value typically results in a larger number of smaller, more specific clusters [22]. |
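As a rough illustration of how the parameters in Table 3 fit together, the sketch below assembles a config object and serializes it to JSON. The key names are taken from Table 3; the values, the `make_config` helper, and the choice to emit a config object on its own are illustrative assumptions and should be checked against the VOSviewer JSON documentation for your version.

```python
import json

def make_config(score_colors="Viridis", min_score=2015.0, max_score=2022.0):
    """Assemble a config object using the parameter names listed in Table 3."""
    if min_score >= max_score:
        raise ValueError("min_score must be smaller than max_score")
    return {
        "config": {
            "parameters": {
                "scale": 1.0,          # zoom level (>= 1)
                "item_size": 1,        # first size option
                "item_color": 2,       # color nodes by score (overlay)
                "score_colors": score_colors,
                "min_score": min_score,
                "max_score": max_score,
                "attraction": 2,       # illustrative integer values
                "repulsion": 0,
                "resolution": 1.0,     # clustering granularity
            }
        }
    }

config_json = json.dumps(make_config(), indent=2)
print(config_json)
```

Narrowing `min_score` and `max_score` to a recent period is one way to concentrate the color contrast of an overlay on recent publication years.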

Workflow and Logical Relationship Visualization

The entire process of building a network map in VOSviewer, from data collection to interpretation, follows a structured workflow. The diagram below illustrates the key decision points and procedural steps, highlighting the parallel paths for constructing co-authorship versus term co-occurrence networks.

[Figure: VOSviewer Network Construction Workflow. Define Research Question (e.g., Trends in Green Pharma) → Data Collection (Web of Science, Scopus, PubMed) → decision on the primary data type. Structured bibliographic records (.ris, .txt export) follow the bibliometric network path, where a network type is selected: Protocol 1 (co-authorship, for collaboration) or citation analysis (for impact/relatedness). Unstructured text (abstracts, full text) follows the text mining path: Protocol 2 (term co-occurrence). Both paths continue with Configure Parameters (thresholds, counting) → Network Calculation & Map Generation → Visualization & Interpretation → Generate Insights (trends, clusters, gaps).]

The network visualization in VOSviewer is governed by a structured data model, particularly when using the JSON file format. This model defines how items (nodes), links (edges), and their visual properties are stored and interrelated. The following diagram depicts the core structure of this data model, which is essential for advanced users who wish to customize or programmatically generate network files.

[Figure: VOSviewer JSON Data Model Structure. The JSON file holds three objects: network, config, and info. The network object contains arrays of items (id, label, description, x, y, cluster, weights, scores), links (source_id, target_id, strength), and clusters (cluster, label). The config object contains parameters (attraction, repulsion, scale, item_color, score_colors, and others), color_schemes (cluster_colors, score_colors), terminology, and other objects.]
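For readers who want to generate network files programmatically, the sketch below builds a small dictionary that mirrors the data model described above and serializes it to JSON. The field names follow the model as presented in this guide; the items, links, weight and score names, and the `make_network` helper are hypothetical, and the output has not been validated against VOSviewer itself.

```python
import json

# Hypothetical items: positions, cluster assignments, weights, and scores.
ITEMS = [
    {"id": 1, "label": "microplastics", "x": -0.4, "y": 0.1, "cluster": 1,
     "weights": {"Occurrences": 42}, "scores": {"Avg. pub. year": 2019.6}},
    {"id": 2, "label": "freshwater", "x": -0.2, "y": 0.3, "cluster": 1,
     "weights": {"Occurrences": 17}, "scores": {"Avg. pub. year": 2018.2}},
    {"id": 3, "label": "urban resilience", "x": 0.6, "y": -0.2, "cluster": 2,
     "weights": {"Occurrences": 25}, "scores": {"Avg. pub. year": 2020.1}},
]
LINKS = [
    {"source_id": 1, "target_id": 2, "strength": 9.0},
    {"source_id": 2, "target_id": 3, "strength": 2.0},
]

def make_network(items, links):
    """Build a network object and check that every link points at a known item."""
    ids = {item["id"] for item in items}
    for link in links:
        if link["source_id"] not in ids or link["target_id"] not in ids:
            raise ValueError(f"link refers to unknown item: {link}")
    clusters = sorted({item["cluster"] for item in items})
    return {
        "network": {
            "items": items,
            "links": links,
            "clusters": [{"cluster": c, "label": f"Cluster {c}"} for c in clusters],
        }
    }

network_doc = make_network(ITEMS, LINKS)
network_json = json.dumps(network_doc, indent=2)
print(network_json[:200])
```

Validating link endpoints before writing the file catches the most common mistake in hand-built networks: a link that references an item id that was filtered out earlier.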

Mastering the construction of bibliometric networks with VOSviewer provides environmental science researchers with a powerful analytical framework for navigating the expansive and complex body of scientific literature. By adhering to the detailed protocols for data preparation, network construction, and visualization configuration outlined in this guide, researchers can systematically generate evidence-based maps that reveal the structural dynamics of their field. The transition from raw data to an insightful visual map demystifies the research landscape, enabling the identification of knowledge gaps, emerging frontiers, and collaborative opportunities. As with any methodological tool, proficiency comes with practice. Researchers are encouraged to apply these protocols to their own domains, using the structured workflows and configuration options to build, refine, and interpret maps that advance their specific research objectives and contribute to the broader progress of environmental science.

VOSviewer is a powerful software tool for constructing and visualizing bibliometric networks, enabling researchers to discern complex patterns within large sets of academic literature. Within environmental science research, these visualizations help identify trending topics, collaborative networks, and emerging research frontiers. The software primarily generates three types of visualizations: network, overlay, and density views, each providing distinct analytical perspectives on bibliometric data. These maps can be constructed based on citation networks, bibliographic coupling, co-citation relationships, or co-authorship patterns, with additional text mining functionality to visualize co-occurrence networks of important terms extracted from scientific literature [21].

Comparative Analysis of Visualization Techniques

Table 1: Comparative Characteristics of VOSviewer Visualization Techniques

| Feature | Network View | Overlay View | Density View |
| --- | --- | --- | --- |
| Primary Function | Displays relationships and connections between items | Superimposes temporal or thematic information on network structure | Highlights area concentration and impact of research topics |
| Visual Elements | Nodes (circles) and links (lines) | Colored nodes over standard network layout | Color gradients from blue (low density) to red (high density) |
| Color Significance | Typically indicates cluster affiliation | Indicates specific properties (e.g., publication year, citation impact) | Indicates concentration and importance of topics |
| Interpretation Focus | Cluster identification, relationship strength | Temporal evolution, performance metrics | Research dominance, emerging hotspots |
| Environmental Science Application | Research theme identification, collaboration patterns | Tracking topic evolution, impact assessment | Identifying established vs. emerging research areas |

Table 2: Quantitative Parameters for VOSviewer Visualization Optimization

| Parameter | Network View | Overlay View | Density View |
| --- | --- | --- | --- |
| Node Size Scaling | Proportional to citation count or publication volume | Proportional to specific metric (e.g., recent citations) | Fixed size with density-based coloring |
| Cluster Resolution | 0.60-1.00 (moderate to high for clear separation) | 0.40-0.80 (lower to maintain base structure) | Not applicable |
| Minimum Cluster Size | 5-15 items for meaningful grouping | 5-15 items aligned with base network | Not applicable |
| Attraction Parameter | 2-3 (integer values) for optimal layout | 2-3 (matches base network) | Not applicable |
| Repulsion Parameter | 0-1 (integer values) for cluster separation | 0-1 (matches base network) | Not applicable |
| Label Size | Proportional to item importance or fixed for readability | Proportional to overlay metric | Minimal or no labels |
| Color Saturation | High for cluster distinction | Gradient based on time or performance metric | Continuous scale from blue to red |

Experimental Protocols for Visualization Generation

Protocol for Network Visualization Construction

Purpose: To create a network visualization mapping research themes in environmental science.

Materials and Reagents:

  • VOSviewer software (latest version) [21]
  • Bibliographic dataset (Scopus, Web of Science, or PubMed exports)
  • Computer system with minimum 4GB RAM and 500MB storage

Methodology:

  • Data Collection: Export bibliographic data from chosen database using environmental science keywords (e.g., "climate change mitigation," "biodiversity conservation," "renewable energy").
  • Data Preparation: Save data in RIS, EndNote, or CSV format ensuring inclusion of title, abstract, author, citation, and keyword fields.
  • Software Setup: Launch VOSviewer and select "Create" → "Create a map based on bibliographic data" → "Read data from bibliographic database files."
  • Mapping Type Selection: Choose "Co-occurrence" → "All keywords" with minimum occurrence threshold of 5-10 depending on dataset size.
  • Network Parameters: Set normalization method to "Association strength," clustering resolution to 0.80, and layout optimization to default VOSviewer settings.
  • Visualization Refinement: Adjust node size proportional to occurrence frequency, apply automatic labeling with maximum 50% of items labeled.
  • Interpretation: Identify clusters as research themes, analyze link strength between nodes as conceptual relationships, and note isolated nodes as niche research areas.
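The "Association strength" normalization named in the protocol can be illustrated with a small calculation. The sketch below uses one common formulation, the co-occurrence count of two keywords divided by the product of their occurrence counts; VOSviewer's internal computation may differ by a constant factor or in how totals are defined, and the keyword sets are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword sets, one per publication.
KEYWORD_SETS = [
    {"climate change", "adaptation", "resilience"},
    {"climate change", "adaptation"},
    {"climate change", "biodiversity"},
    {"biodiversity", "conservation"},
]

def association_strength(keyword_sets, min_occurrences=1):
    """Return {(kw_a, kw_b): c_ab / (o_a * o_b)} for keywords above the threshold."""
    occurrences = Counter()
    for keywords in keyword_sets:
        occurrences.update(keywords)
    kept = {k for k, n in occurrences.items() if n >= min_occurrences}
    cooccurrences = Counter()
    for keywords in keyword_sets:
        for pair in combinations(sorted(keywords & kept), 2):
            cooccurrences[pair] += 1
    return {
        pair: count / (occurrences[pair[0]] * occurrences[pair[1]])
        for pair, count in cooccurrences.items()
    }

strengths = association_strength(KEYWORD_SETS, min_occurrences=2)
for pair, value in sorted(strengths.items()):
    print(pair, round(value, 3))
```

The normalization corrects for the fact that frequent keywords co-occur with many others simply because they are frequent, so a high normalized value points to a specific association rather than to general popularity.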

Protocol for Overlay Visualization Construction

Purpose: To superimpose temporal evolution on bibliometric networks.

Materials and Reagents:

  • Existing network visualization from the network visualization protocol above
  • Complete bibliographic dataset with publication years
  • VOSviewer software with overlay functionality [21]

Methodology:

  • Base Network Preparation: Generate the network visualization by following steps 1-6 of the network visualization protocol above.
  • Time Data Import: Ensure publication year field is properly included in source data.
  • Overlay Activation: Select "Overlay" → "Citation" or "Publication Year" from visualization menu.
  • Color Scaling: Adjust color gradient to represent temporal progression (earlier = blue, later = yellow) or citation impact (low = blue, high = yellow).
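  • Time Slicing: To observe network evolution across specific periods (e.g., 5-year intervals), narrow the overlay color range (minimum and maximum scores) to the period of interest, or build a separate map for each period and compare them.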
  • Interpretation: Track color patterns to identify emerging topics (yellow clusters), declining research areas (blue clusters), and knowledge diffusion pathways.
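The score behind a temporal overlay is typically an average publication year per item. The sketch below computes that score for keywords from a hypothetical set of records, as a way to sanity-check which terms should appear as recent or older in the map; it is an independent illustration, not VOSviewer's own code.

```python
from collections import defaultdict

# Hypothetical records with a publication year and author keywords.
RECORDS = [
    {"year": 2012, "keywords": ["flood risk", "urban resilience"]},
    {"year": 2019, "keywords": ["urban resilience", "social equity"]},
    {"year": 2021, "keywords": ["social equity"]},
]

def average_publication_year(records):
    """Return the mean publication year of the records mentioning each keyword."""
    years = defaultdict(list)
    for record in records:
        for keyword in set(record["keywords"]):
            years[keyword].append(record["year"])
    return {keyword: sum(vals) / len(vals) for keyword, vals in years.items()}

avg_year = average_publication_year(RECORDS)
for keyword, year in sorted(avg_year.items(), key=lambda kv: kv[1]):
    print(f"{keyword}: {year:.1f}")
```

Here "social equity" averages 2020.0 and would sit at the recent end of the color scale, while "flood risk" averages 2012.0 and would sit at the older end.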

Protocol for Density Visualization Construction

Purpose: To identify research hotspots and concentration areas in environmental science.

Materials and Reagents:

  • Bibliographic dataset as used in the network visualization protocol above
  • VOSviewer software with density view functionality [21]

Methodology:

  • Base Visualization: Generate the co-occurrence network by following steps 1-5 of the network visualization protocol above.
  • View Switching: Select "Density" from visualization options instead of "Network" or "Overlay."
  • Color Calibration: Observe density gradients with blue indicating peripheral research areas and yellow/red indicating research hotspots.
  • Smoothing Adjustment: Adjust smoothing parameter to 10-15 for optimal balance between detail and clarity.
  • Focus Identification: Click on high-density (red/yellow) areas to identify specific terms constituting research hotspots.
  • Interpretation: Correlate density patterns with citation metrics to distinguish between established core research and emerging frontier areas.
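The idea behind a density view can be sketched as a kernel-weighted sum of item weights around each point of the map. The example below uses a Gaussian kernel scaled by the average distance between items; this is an assumption about the general approach, not VOSviewer's exact implementation, and the coordinates, weights, and function names are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical items as (x, y, weight), e.g., weight = keyword occurrences.
ITEMS = [(0.0, 0.0, 5.0), (0.1, 0.0, 3.0), (1.0, 1.0, 1.0)]

def mean_pairwise_distance(items):
    """Average Euclidean distance over all item pairs (needs at least two items)."""
    pairs = list(combinations(items, 2))
    return sum(math.dist(a[:2], b[:2]) for a, b in pairs) / len(pairs)

def item_density(point, items, kernel_width=1.0):
    """Sum of item weights, each damped by a Gaussian of its scaled distance."""
    scale = mean_pairwise_distance(items) * kernel_width
    return sum(
        weight * math.exp(-((math.dist(point, (x, y)) / scale) ** 2))
        for x, y, weight in items
    )

hot = item_density((0.05, 0.0), ITEMS)   # between the two heavy items
cold = item_density((1.0, 1.0), ITEMS)   # at the isolated light item
print(round(hot, 2), round(cold, 2))
```

A larger `kernel_width` spreads each item's influence further and produces smoother, less detailed hotspots, which is the trade-off the smoothing adjustment step controls.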

Visualization Workflows

[Figure: Bibliographic Data Collection → Data Preparation & Cleaning → Import Data into VOSviewer → Select Mapping Parameters → Generate Network Visualization, which branches into Apply Overlay Visualization (temporal analysis) and Apply Density Visualization (impact analysis); both branches lead to Interpret Results & Draw Conclusions.]

VOSviewer Visualization Selection Workflow

[Figure: three strands of interpretation. Node/Cluster Analysis: node size (citation impact), node color (cluster affiliation), cluster density (research concentration). Link/Connection Analysis: link strength (conceptual relationship), link direction (knowledge flow). Structural Pattern Analysis: centrality measures (research importance), isolation patterns (niche research).]

Network Visualization Interpretation Framework

[Figure: Base Network Visualization → Select Analytical Focus. Temporal Evolution Analysis → Apply Overlay Visualization with time/citation metric → Identify Emerging Topics (yellow/red areas). Research Impact Analysis → Apply Density Visualization with color intensity scale → Identify Research Hotspots (red areas).]

Overlay and Density Visualization Decision Flow

Research Reagent Solutions for Bibliometric Analysis

Table 3: Essential Research Reagents for VOSviewer Bibliometric Analysis

| Reagent/Material | Function | Specifications | Application Context |
| --- | --- | --- | --- |
| VOSviewer Software | Primary tool for constructing and visualizing bibliometric networks | Latest version; compatible with Windows, Mac, Linux; requires Java Runtime Environment | All visualization types: network, overlay, and density views [21] |
| Bibliographic Databases | Source of raw data for analysis | Scopus, Web of Science, or PubMed; RIS/CSV export format; inclusive of abstracts and citations | Data extraction for co-occurrence, citation, and co-authorship analysis |
| Normalization Algorithms | Standardize network measurements for fair comparison | Association strength, fractionalization, or clustering-based approaches | Network visualization to account for varying citation practices across fields |
| Clustering Techniques | Group related items into thematic clusters | Resolution parameter 0.60-1.00; minimum cluster size 5-15 items | Network view to identify research themes in environmental science |
| Color Gradients | Visual representation of temporal or impact metrics | Time-based: blue (older) to yellow (recent); Impact-based: blue (low) to red (high) | Overlay visualization to show evolution of research topics |
| Density Smoothing | Enhance visual interpretation of concentration areas | Kernel-based smoothing with adjustable bandwidth (10-20) | Density view to identify research hotspots without clutter |
| Layout Algorithms | Optimize spatial arrangement of network elements | VOS scaling, modularity-based, or force-directed approaches | All visualization types to minimize overlapping and improve readability |

The concept of urban resilience has evolved significantly from its ecological origins in the 1970s to become a critical framework for addressing modern urban challenges including climate change, natural disasters, and public health emergencies [23] [24]. This application note provides a structured protocol for conducting bibliometric analysis of resilient cities research using VOSviewer software, enabling researchers to quantitatively map the intellectual landscape and identify emerging trends in this rapidly growing field. The methodology outlined here facilitates objective assessment of research publications, authors, institutions, and conceptual themes within resilient cities literature, supporting evidence-based decision-making in urban policy and planning [23] [25].

Experimental Protocols and Analytical Framework

Data Acquisition and Preprocessing Protocol

Database Selection and Search Strategy

  • Primary Database: Web of Science (WoS) Core Collection, recognized as the most authoritative data source for studying scientific publications globally [24]
  • Search Query: TS = ("resilient cit*" OR "resilient communit*"), where the trailing wildcard captures both singular and plural forms [23] [25]
  • Time Span: 1995 to present (adjustable based on research objectives)
  • Document Types: Restrict to articles, conference proceedings, and reviews; exclude books, indices, and other non-primary sources [23]
  • Export Format: Plain text format with full record and cited references
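Before importing a plain text export, it can help to inspect it programmatically, for example to count records or check that publication years are present. The sketch below assumes the usual layout of Web of Science plain text exports (two-letter field tags, continuation lines indented by three spaces, `ER` closing each record); the sample text and the function name are hypothetical.

```python
# Hypothetical two-record export in the assumed plain text layout.
SAMPLE = "\n".join([
    "FN Clarivate Analytics Web of Science",
    "VR 1.0",
    "PT J",
    "AU Smith, J",
    "   Lee, K",
    "TI Resilient cities under",
    "   climate stress",
    "PY 2019",
    "ER",
    "",
    "PT J",
    "AU Okafor, T",
    "TI Community resilience metrics",
    "PY 2021",
    "ER",
    "",
    "EF",
])

def parse_wos_plaintext(text):
    """Return a list of records, each mapping a field tag to its list of lines."""
    records, current, tag = [], {}, None
    for raw in text.splitlines():
        if not raw.strip():
            continue
        head, value = raw[:2], raw[3:].strip()
        if head == "ER":                      # end of record
            records.append(current)
            current, tag = {}, None
        elif head in ("FN", "VR", "EF"):      # file header and footer
            continue
        elif head == "  ":                    # continuation of previous field
            if tag is not None:
                current[tag].append(value)
        else:                                 # new field tag
            tag = head
            current.setdefault(tag, []).append(value)
    return records

records = parse_wos_plaintext(SAMPLE)
print(len(records), [r["PY"][0] for r in records])
```

Multi-line fields such as titles are kept as lists of lines, so `" ".join(record["TI"])` restores the full title when needed.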

Data Screening and Cleaning

  • Remove duplicate records using VOSviewer's deduplication function
  • Verify consistency of author names and affiliations (e.g., "Univ" versus "University" abbreviations)
  • Standardize terminology variations (e.g., "resilient cities" vs. "urban resilience") through manual review
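Part of the screening above can be scripted before the data reach VOSviewer. The following sketch removes duplicates by DOI, falling back to a normalized title plus year, and expands the "Univ" abbreviation; the records, the DOI values, and the helper names are hypothetical, and a record exported once with a DOI and once without would still need manual review.

```python
import re

# Hypothetical records, including two duplicate pairs.
RECORDS = [
    {"title": "Urban Resilience: A Review", "year": 2018, "doi": "10.1000/ABC.1",
     "affiliation": "Delft Univ Technol"},
    {"title": "Urban resilience - a review", "year": 2018, "doi": "10.1000/abc.1",
     "affiliation": "Delft University of Technology"},
    {"title": "Flood Risk in Coastal Cities", "year": 2020, "doi": "",
     "affiliation": "Texas A&M Univ."},
    {"title": "Flood risk in coastal cities.", "year": 2020, "doi": None,
     "affiliation": "Texas A&M University"},
]

def normalize_title(title):
    """Lowercase and strip punctuation so trivial variants compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def expand_affiliation(name):
    """Expand the standalone abbreviation 'Univ' (with or without a period)."""
    return re.sub(r"\bUniv\b\.?", "University", name)

def deduplicate(records):
    """Keep the first record for each DOI, or for each title/year if no DOI."""
    seen, unique = set(), []
    for record in records:
        doi = (record.get("doi") or "").strip().lower()
        if doi:
            key = ("doi", doi)
        else:
            key = ("title", normalize_title(record["title"]), record.get("year"))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

unique_records = deduplicate(RECORDS)
print(len(unique_records), [expand_affiliation(r["affiliation"]) for r in unique_records])
```

Keeping the first occurrence is an arbitrary choice; when merging sources, it may be preferable to keep the record with the most complete metadata instead.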

Bibliometric Analysis and Visualization Protocol

Network Construction Parameters in VOSviewer

  • Analysis Type: Select based on research questions (co-authorship, citation, co-citation, or co-occurrence analysis)
  • Counting Method: Set to full counting for equal weight to all publications
  • Thresholds: Apply minimum publication or citation thresholds to focus on significant elements
  • Clustering Resolution: Default parameter of 1.0 typically provides optimal balance between specificity and generality

Visualization Optimization

  • Layout Algorithm: Use VOSviewer's weighted and automated labeling for initial mapping
  • Manual Adjustment: Refine node positioning to minimize overlapping and improve readability
  • Color Schema: Assign distinct colors to major clusters for immediate visual differentiation
  • Export Settings: Save high-resolution images (minimum 300 DPI) for publication quality

Data Presentation and Quantitative Analysis

Temporal Evolution of Resilient Cities Research

Table 1: Publication Trends in Resilient Cities/Communities Research (1995-2022)

| Time Period | Stage Classification | Annual Publications Range | Cumulative Publications | Key Influencing Factors |
| --- | --- | --- | --- | --- |
| 1995-2004 | No Attention Period | 0-2 publications annually | <10 total | Limited conceptual development; minimal practical application |
| 2005-2014 | Starting Period | Steady growth with minor fluctuations | ~100 total | UNISDR "Making Cities More Resilient" campaign (2010) |
| 2015-2022 | Rapid Growth Period | 213 publications by 2021 | 1148 total | Rockefeller Foundation 100 Resilient Cities (2013); climate change urgency; COVID-19 pandemic |

Analysis of publication trends reveals three distinct developmental phases in resilient cities research [23]. The field remained largely dormant until 2004, experienced gradual growth following international resilience campaigns, and entered a period of rapid expansion after 2014, largely driven by major global initiatives and escalating climate concerns [23] [24]. The acceleration phase corresponds with the launch of the Rockefeller Foundation's "100 Resilient Cities" program in 2013, which significantly stimulated research interest and output [24].
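The stage classification can be reproduced from a list of publication years taken from an export. The sketch below uses the period boundaries from the table above; the sample years are hypothetical and do not reproduce the published counts.

```python
from collections import Counter

# Period boundaries as given in the publication trends table.
STAGES = [
    (1995, 2004, "No Attention Period"),
    (2005, 2014, "Starting Period"),
    (2015, 2022, "Rapid Growth Period"),
]

def stage_for_year(year):
    """Return the stage name for a year, or None if it falls outside all stages."""
    for start, end, name in STAGES:
        if start <= year <= end:
            return name
    return None

def publications_by_stage(years):
    """Count publications per stage, ignoring years outside the study window."""
    return Counter(stage for stage in map(stage_for_year, years) if stage)

# Hypothetical publication years extracted from a bibliographic export.
SAMPLE_YEARS = [1998, 2007, 2011, 2016, 2019, 2021, 2021, 2024]
stage_counts = publications_by_stage(SAMPLE_YEARS)
print(dict(stage_counts))
```

Years outside the study window, such as 2024 in the sample, are excluded rather than assigned to the nearest stage.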

Table 2: Key Research Sources and Contributors in Resilient Cities Research

| Category | Top Elements | Quantitative Metrics | Significance/Focus |
| --- | --- | --- | --- |
| Core Journals | Sustainability | 73.2% of publications as articles [23] | Primary outlet for resilient cities research |
| Core Journals | International Journal of Disaster Risk Reduction | High citation frequency [23] | Focus on practical disaster mitigation strategies |
| Leading Authors | Serre | Most productive author [23] | Expertise in infrastructure resilience and risk management |
| Leading Authors | Shaw | High publication output [23] | Contributions to community and social resilience |
| Institutional Leaders | Colorado State University | Leading research institution [23] | Interdisciplinary resilience research center |
| Institutional Leaders | Delft University of Technology | Prominent European institution [23] | Water management and climate adaptation expertise |
| Institutional Leaders | Texas A&M University | Major contributor [23] | Community and regional resilience focus |
| Geographical Distribution | United States | Leading country [23] | Extensive research funding and institutional support |
| Geographical Distribution | Global North Countries | Majority of publications [23] [26] | Disproportionate research output compared to implementation needs |

Journal analysis indicates that resilient cities research is published across interdisciplinary platforms, with Sustainability and International Journal of Disaster Risk Reduction serving as primary venues [23]. The United States maintains dominance in research output, though the Global South represents critical areas for implementation and case studies [26].

Visualization of Analytical Workflows

Bibliometric Analysis Procedure

[Figure: Resilient Cities Bibliometric Analysis Workflow. Data Acquisition (Web of Science database; search query TS=('resilient cit*' OR 'resilient communit*'); plain text export) → Data Preprocessing (remove duplicate records; standardize terminology and affiliations; apply publication/citation thresholds) → Network Construction (select analysis type: co-authorship, citation, co-citation, or co-occurrence; set full counting method; apply clustering algorithm) → Visualization & Interpretation (optimize network layout; create thematic maps; identify research trends and gaps) → Knowledge Application (inform future research planning; support evidence-based policy development; identify potential collaborators).]

Knowledge Network Mapping Framework

[Figure: VOSviewer Knowledge Network Mapping Framework. VOSviewer analysis branches into four network types. Co-authorship networks: author collaboration patterns, institutional partnerships, international cooperation. Citation networks: influential publications, knowledge diffusion paths, intellectual base. Co-citation networks: emerging research fronts, thematic clusters, disciplinary structure. Term co-occurrence networks: conceptual structure, research hotspots and trends, topic evolution over time.]

The Scientist's Toolkit: Research Reagent Solutions

Table 3: Essential Research Tools and Data Sources for Resilient Cities Bibliometric Analysis

| Tool/Resource | Type/Function | Specific Application in Resilient Cities Research | Access Method |
| --- | --- | --- | --- |
| VOSviewer Software | Bibliometric visualization tool | Constructing and visualizing networks of journals, researchers, publications based on citation and co-authorship relations [6] | Free download from VOSviewer.com |
| Web of Science Core Collection | Scientific literature database | Primary data source for comprehensive publication records on resilient cities; enables precise bibliometric queries [23] [24] | Institutional subscription required |
| CiteSpace | Complementary visualization software | Analyzing citation networks and identifying emerging trends through burst detection [24] | Free Java application |
| 100 Resilient Cities Strategies | Policy document collection | Content analysis of urban resilience plans; evaluating equity and justice considerations in planning [26] | Publicly available from Rockefeller Foundation |
| City Resilience Program (CRP) Data | Geospatial urban data | Spatial analysis of resilience challenges; mapping population exposure to hazards [27] | World Bank/GFDRR partnership |
| ArcGIS StoryMaps | Geospatial storytelling platform | Creating interactive visualizations of urban resilience indicators and spatial relationships [27] | Esri platform subscription |

Application Notes and Interpretation Guidelines

Bibliometric analysis of resilient cities research reveals several dominant thematic clusters that have emerged over the past decade [23] [24]. The conceptual foundation has evolved through three distinct phases: engineering resilience (focusing on a return to a stable state), ecological resilience (emphasizing adaptation and multiple stable states), and evolutionary resilience (highlighting transformation and adaptive capacity) [24]. Current research hotspots identified through keyword analysis include:

  • Climate Adaptation: Dominant theme focusing on urban responses to climate change impacts and extreme weather events [28] [24]
  • Social Equity and Justice: Emerging emphasis on equitable distribution of resilience benefits across diverse communities [26]
  • Critical Infrastructure: Research on transportation, energy, and communication systems resilience [29]
  • Community Engagement: Participatory approaches to resilience building and capacity development [24]
  • Assessment Frameworks: Development of quantitative metrics and evaluation systems for urban resilience [23] [28]

Analytical Limitations and Methodological Considerations

Data Comprehensiveness

  • WoS coverage, while authoritative, may exclude relevant publications from regional databases or non-English sources [23]
  • Keyword-based search strategies may miss conceptually related works using alternative terminology
  • Time-lag in database indexing can affect recency of analysis

Interpretation Challenges

  • Co-authorship patterns may reflect social networks rather than intellectual connections
  • Citation counts favor older publications, potentially undervaluing recent significant contributions
  • Geographical and institutional biases in publication practices can skew network structures

This protocol provides a comprehensive framework for conducting bibliometric analysis of resilient cities research using VOSviewer. The systematic approach to data collection, processing, and visualization enables researchers to identify knowledge gaps, track conceptual evolution, and map collaborative networks within this interdisciplinary field. Application of these methods supports strategic research planning, evidence-based policy development, and identification of potential collaborators across institutions and geographical regions [23] [24]. As urban resilience challenges continue to evolve in complexity and scale, bibliometric analysis offers a valuable tool for navigating the expanding knowledge domain and directing future research toward areas of greatest need and potential impact.

Microplastics (MPs), plastic particles less than 5 mm in size, have emerged as a global environmental contaminant of significant concern, threatening aquatic and terrestrial ecosystems worldwide [4]. The rapid expansion of microplastic research over the past decade has created an extensive body of literature that requires systematic analysis to identify research trends, knowledge gaps, and emerging frontiers. This case study employs VOSviewer software for bibliometric analysis to map the intellectual landscape and research hotspots in microplastic pollution studies. By quantitatively analyzing publication patterns, collaborations, and keyword relationships, we provide researchers with a comprehensive overview of the field's structure and evolution, supporting strategic research planning and resource allocation in environmental science.

Bibliometric Methodology and Workflow

Data Retrieval Protocol

Bibliometric analysis requires systematic data collection to ensure comprehensive coverage of the research domain. The following protocol outlines the standardized approach for data retrieval:

  • Data Source: Web of Science Core Collection (WoSCC) database
  • Search Field: Topic (TS) encompassing title, abstract, and keywords
  • Time Span: 2013-2022 for temporal trend analysis [30]
  • Search Formula: TS=(microplastic OR microplastics OR "plastic debris" OR micro-plastic OR nanoplastics) AND TS=(marine OR sea OR ocean OR beach OR bay OR gulf OR estuary OR coastline OR shoreline) AND TS=(contamination OR pollution OR contaminate OR pollute OR stain OR filth OR contaminant OR foul) AND TS=(removal OR removal OR removed OR remove OR exenterate OR dispose OR expulsion OR erasing OR eliminate OR degradation OR degrade OR decomposition OR decompose OR degeneration OR hydrolysis OR degradable OR dissipation OR harness OR governance OR treatment OR control OR management OR government OR govern OR administration OR regulation) [30]
  • Refinement Filters: Language restricted to English; document types limited to "article" and "review"
  • Quality Control: Manual review of research directions and topics to remove irrelevant content
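
The search formula above is an AND of several OR-groups, each wrapped in a TS= topic field tag. A minimal sketch of assembling such a query from term lists is shown here; the function names `quote_if_phrase` and `build_ts_query` and the shortened term groups are illustrative assumptions, not part of the cited protocol [30].

```python
def quote_if_phrase(term: str) -> str:
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term and not term.startswith('"') else term


def build_ts_query(groups: list[list[str]]) -> str:
    """Join terms with OR inside each group and join the groups with AND."""
    clauses = []
    for terms in groups:
        joined = " OR ".join(quote_if_phrase(t) for t in terms)
        clauses.append(f"TS=({joined})")
    return " AND ".join(clauses)


# Shortened, illustrative term groups (not the full formula from [30]).
groups = [
    ["microplastic", "microplastics", "plastic debris"],
    ["marine", "ocean"],
    ["pollution", "contamination"],
]
query = build_ts_query(groups)
print(query)
```

Keeping the term groups in a list makes it easier to document the exact query in a methods section and to rerun it later with an updated time span.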

Data Analysis Framework

The analytical workflow employs specialized bibliometric software to process and visualize literature data:

  • Software Tools: VOSviewer and CiteSpace for visualization and temporal analysis [30]
  • Node Filtering: Minimum publication thresholds applied to authors (≥5 papers), organizations (≥5 papers), and countries (≥5 papers) to focus on significant contributors [30]
  • Analytical Techniques:
    • Co-authorship networks for collaboration mapping
    • Keyword co-occurrence analysis for research hotspot identification
    • Citation analysis for influential works and knowledge foundations
    • Cluster analysis for thematic grouping of research topics
  • Visualization Parameters: Network, overlay, and density visualizations with temporal coloring to show field evolution
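
The node filtering step can be reproduced outside VOSviewer to check how many authors, organizations, or countries survive a given threshold. The sketch below assumes records have already been parsed into dictionaries with list-valued fields; the field name `authors` and the sample data are hypothetical.

```python
from collections import Counter


def filter_by_min_documents(records, field, min_docs=5):
    """Return {entity: document count} for entities with at least `min_docs` records."""
    counts = Counter()
    for record in records:
        # Count each entity once per record, even if it is listed twice.
        counts.update(set(record.get(field, [])))
    return {name: n for name, n in counts.items() if n >= min_docs}


# Hypothetical records: Author A has 5 papers, Author B has 7, Author C has 2.
records = (
    [{"authors": ["Author A", "Author B"]} for _ in range(5)]
    + [{"authors": ["Author B", "Author C"]} for _ in range(2)]
)
kept = filter_by_min_documents(records, "authors", min_docs=5)
print(kept)
```

Running the same function with several thresholds gives a quick sense of how sensitive the resulting network size is to the cutoff.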

Workflow diagram: Data Retrieval (WoS Core Collection) → Data Filtering (English Articles/Reviews) → Bibliometric Analysis → Network Construction (Co-authorship, Co-citation) → Visualization (VOSviewer/CiteSpace) → Research Hotspots Identification and Trend Analysis (Temporal Evolution)

Figure 1: Bibliometric Analysis Workflow for Microplastic Research

Publication Growth and Geographic Distribution

Microplastic research has experienced exponential growth over the past decade, reflecting increasing global concern about plastic pollution:

Table 1: Annual Publication Trends in Microplastic Research (2013-2022)

Year | Cumulative Publications | Annual Publications | Key Driving Events
2013 | Baseline | Minimal | Growing scientific interest
2014 | - | Rapid increase | -
2015 | - | Continued growth | UN Sustainable Development Summit
2019 | - | >1,000 annual publications | Increased global awareness
2022 | 11,777 total [4] | 3,548 [4] | 30.12% of total analyzed publications

Geographic analysis reveals concentrated research efforts in specific countries, with 147 countries having participated in microplastic pollution research [4]. The most productive countries include:

  • China: Most active country with significant research output [30]
  • United States: Major contributor with extensive research networks [30] [4]
  • India: Key driver in microplastic research expansion [30]
  • Australia: Important research hub with strong international collaborations [30] [4]
  • United Kingdom, Canada: Significant contributors with robust research capacity [4]

These countries not only demonstrate high research productivity but also maintain extensive international collaboration networks, facilitating global knowledge exchange on microplastic pollution.

Research Hotspots and Conceptual Structure

Keyword co-occurrence analysis in VOSviewer reveals four primary research clusters in microplastic studies:

Table 2: Major Research Clusters in Microplastic Pollution Studies

Research Cluster | Key Focus Areas | Specific Topics
Distribution & Sources [4] | Spatial patterns, input pathways | Marine vs. freshwater distribution, land-based sources, atmospheric transport, wastewater treatment plant effluent
Toxicological Effects [4] | Organism impacts, ecosystem risks | Ingestion by marine organisms, biological toxicity, food web transfer, physiological and behavioral effects
Analytical Methods [4] | Detection, quantification | Sampling techniques, identification methods, standardization needs, size fractionation
Adsorption & Interactions [4] | Pollutant interactions, vector effects | Adsorption with other pollutants, chemical additive leaching, persistent organic pollutants, metals

The conceptual structure of microplastic research has evolved significantly over time, moving from an initial focus on traceability and hazard analysis to a broader examination of economic activities and synthetic fibers as major contributors to microplastic pollution [30]. Current research frontiers include microplastics in wastewater treatment plant effluent, human consumption impacts, synthetic textiles, and polymer degradation processes [30].

Experimental Protocols for Microplastic Research

Field Sampling Methodology

Comprehensive microplastic monitoring requires standardized field sampling protocols to ensure data comparability across studies:

Surface Water Sampling (Grab Sample Method)
  • Sample Volume: Collect 1L grab samples from top 25cm of surface water [31]
  • Spatial Interval: Sample every 80.5 river kilometers for systematic coverage [31]
  • Container Protocol: Use pre-cleaned glass containers with aluminum foil covers to prevent airborne contamination [31]
  • Quality Control: Limit exposure time during collection; wear 100% cotton clothing to reduce fiber contamination [31]
Net Sampling Protocol
  • Equipment: Plankton nets with 100μm mesh size [31]
  • Deployment: Surface trawling at consistent depth and duration
  • Processing: Concentrate samples immediately after collection and transfer to clean containers
  • Preservation: Refrigerate samples until laboratory analysis to prevent biological degradation

Laboratory Analysis Workflow

Microplastic identification and characterization require meticulous laboratory procedures to ensure accurate results:

Workflow diagram: Sample Filtration (0.45μm cellulose nitrate) → Visual Identification (Stereomicroscope) → Hot Needle Test (Polymer Verification) → Micro-Raman Spectroscopy (Polymer Identification) → Morphological Classification → Size Measurement (Microscope with graticule) → Data Quantification

Figure 2: Microplastic Laboratory Analysis Workflow

Sample Processing and Identification
  • Filtration: Filter water samples through 0.45μm cellulose nitrate membranes [31]
  • Visual Sorting: Examine filters under stereomicroscope; identify potential microplastics based on visual criteria (color, texture, shape) [31]
  • Polymer Verification: Apply hot needle test to suspected particles - melting behavior confirms synthetic polymer origin [31]
  • Polymer Identification: Select representative subset (approximately 16.7%) for micro-Raman spectroscopy to determine specific polymer composition [31]
  • Morphological Classification: Categorize particles as fragments, fibers, films, or beads with size classification (100-333μm, 334-1000μm, >1000μm) [31]
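
The size classes in the last step can be applied consistently with a small helper. This is a sketch under stated assumptions: lengths are in micrometres, values between 333 and 334 are assigned to the middle class, and the "below range" label for particles under 100 (the net mesh size) is hypothetical.

```python
from collections import Counter


def size_class(length_um: float) -> str:
    """Assign a particle length (micrometres) to one of the size classes listed above."""
    if length_um < 100:
        return "below range"  # assumed label for particles smaller than the mesh size
    if length_um <= 333:
        return "100-333 um"
    if length_um <= 1000:
        return "334-1000 um"
    return ">1000 um"


# Hypothetical particle lengths measured with a graticule.
lengths_um = [120, 333, 334, 980, 1500, 2500]
tally = Counter(size_class(x) for x in lengths_um)
print(tally)
```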

Quality Assurance and Control Measures

Contamination control is paramount in microplastic research because synthetic particles are ubiquitous in laboratory environments:

  • Equipment Cleaning: Thoroughly clean all glassware with Decon-90, rinse with tap water followed by filtered water (0.45μm) [31]
  • Processing Environment: Perform filtrations inside clean air cabinet; cover samples during processing [31]
  • Airborne Contamination Control: Wear white lab coats manufactured from 100% cotton during all procedures [31]
  • Blank Samples: Process method blanks alongside environmental samples to quantify and correct for background contamination
  • Cross-Validation: Implement multiple identification methods (visual, physical, spectroscopic) to minimize false positives/negatives

The Researcher's Toolkit: Essential Reagents and Materials

Table 3: Essential Research Reagents and Materials for Microplastic Studies

Item | Function/Application | Technical Specifications
Cellulose Nitrate Filters | Sample filtration | 0.45μm pore size [31]
Plankton Nets | Field sampling | 100μm mesh size [31]
Decon-90 | Equipment cleaning | Removes organic contaminants [31]
Glass Containers | Sample collection/storage | Pre-cleaned, foil-covered [31]
Micro-Raman Spectroscopy | Polymer identification | Molecular characterization [31]
Stereomicroscope | Visual identification | 10-40x magnification range [31]
Cotton Laboratory Apparel | Contamination control | 100% cotton material [31]

Research Applications and Implications

Bibliometric analysis of microplastic research provides valuable insights for scientific advancement and policy development:

Research Direction and Prioritization

The hotspot analysis reveals critical areas requiring further investigation, including:

  • Standardized Methodologies: Development of uniform sampling, extraction, and identification protocols across environmental matrices [4]
  • Freshwater Systems: Addressing the knowledge gap in riverine microplastic pollution compared to marine environments [31]
  • Toxicological Relevance: Bridging the concentration gap between laboratory studies (typically high doses) and environmental exposure levels (generally lower) [31]
  • Source Apportionment: Enhanced understanding of primary versus secondary microplastic sources and their relative contributions

Policy and Mitigation Implications

The research trends identified through bibliometric analysis support evidence-based decision making:

  • Pollution Prevention: Targeting major sources such as wastewater treatment plants, synthetic textiles, and improper plastic waste disposal
  • Regulatory Frameworks: Informing development of microplastic discharge standards and monitoring requirements
  • International Cooperation: Facilitating global collaboration on microplastic research and mitigation strategies, particularly given the transboundary nature of plastic pollution

This case study demonstrates the powerful application of VOSviewer bibliometric analysis for mapping research hotspots in microplastic pollution science. The exponential growth in publications, shifting research frontiers, and emerging geographic collaborations highlighted through this analysis provide a strategic roadmap for researchers, funding agencies, and policy makers. The standardized protocols and methodological frameworks presented offer practical guidance for conducting environmentally relevant microplastic research. As the field continues to evolve, ongoing bibliometric monitoring will be essential for identifying new research directions, facilitating international collaboration, and effectively addressing the global challenge of microplastic pollution through evidence-based scientific approaches.

Co-occurrence networks are powerful text mining tools that visually map relationships between entities within large collections of documents. In scientific literature analysis, these networks reveal knowledge structures and research trends by representing co-occurring keywords, authors, or concepts as interconnected nodes within a graphic visualization [32]. Within environmental science research, where interdisciplinary work is prevalent, this method helps researchers identify emerging topics, collaborative networks, and thematic clusters across vast bibliographic datasets.

The fundamental principle involves identifying paired entities that appear together within specified text units (e.g., article keywords, titles, or abstracts). Each entity becomes a network node, and each co-occurrence forms a connection link [33]. The number of co-occurrences determines the connection strength, visually representing the cumulative knowledge of a scientific domain [33]. When integrated with specialized software like VOSviewer, this approach enables environmental scientists to extract meaningful patterns from complex literature corpora, thereby informing research directions and gap analyses.
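
The counting principle described above can be sketched in a few lines. The example assumes each document has already been reduced to a list of keywords; the function name `cooccurrence_links` and the sample keywords are illustrative, and VOSviewer performs the equivalent step internally.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_links(documents):
    """Count how often each unordered keyword pair appears in the same document."""
    links = Counter()
    for keywords in documents:
        # Normalize case and drop duplicates so a pair counts once per document.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        for a, b in combinations(unique, 2):
            links[(a, b)] += 1
    return links


documents = [
    ["microplastics", "marine ecosystems", "trophic transfer"],
    ["Microplastics", "marine ecosystems"],
    ["biochar", "carbon sequestration"],
]
links = cooccurrence_links(documents)
for pair, weight in sorted(links.items()):
    print(pair, weight)
```

Each key of `links` is a network edge and its value is the link weight; the keywords themselves are the nodes.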

Theoretical Foundation and Key Concepts

Definition and Construction Principles

Co-occurrence networks belong to the broader category of semantic networks that graphically visualize potential relationships between entities represented within written material [32]. The network construction process follows three systematic steps: (1) identifying relevant keywords or terms from the text corpus, (2) calculating frequencies of individual terms and their paired co-occurrences, and (3) analyzing the resulting networks to identify central terms and thematic clusters [32]. The construction criteria can be adjusted for specificity—for instance, requiring co-occurrence within the same sentence rather than merely within the same document for more precise relationship mapping.

In environmental science contexts, the network structure reveals specialized subdomains through distinct clustering patterns. For example, terms like "carbon sequestration," "soil organic matter," and "biochar" might form one cluster, while "microplastic pollution," "marine ecosystems," and "trophic transfer" form another. The interconnection density between these clusters indicates their conceptual relatedness within the research landscape, providing insights into knowledge integration across environmental subdisciplines.

Network Analysis Metrics and Interpretation

Beyond visual inspection, co-occurrence networks employ quantitative metrics to extract meaningful insights. The table below summarizes key metrics used in network analysis:

Table 1: Key Metrics for Co-occurrence Network Analysis

Metric | Calculation | Interpretation in Research Context
Node Degree | Number of connections to other nodes | Indicates popularity or centrality of a concept within the research field
Betweenness Centrality | Number of shortest paths passing through a node | Identifies bridging concepts that connect different research themes
Link Weight | Frequency of co-occurrence between two terms | Reflects strength of association between concepts
Modularity | Ability of network to decompose into modules | Measures delineation of distinct research themes or subfields
Strength | Sum of weights of all links connected to a node | Combines node connectivity and association strength

These metrics enable environmental scientists to move beyond simple term frequency counts to understand the structural role of specific concepts within the research landscape. For instance, a term with high betweenness centrality might represent an integrative concept that connects traditionally separate subfields, such as "environmental justice" bridging toxicology and policy studies [33].
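
To make the node-level metrics concrete, the sketch below computes degree, strength, and unweighted betweenness centrality (using Brandes' algorithm) for a small, hypothetical co-occurrence network. The link weights are invented for illustration, and the betweenness values are unnormalized counts of shortest paths.

```python
from collections import defaultdict, deque


def build_adjacency(links):
    """Turn {(a, b): weight} into a symmetric adjacency dictionary."""
    adj = defaultdict(dict)
    for (a, b), w in links.items():
        adj[a][b] = w
        adj[b][a] = w
    return dict(adj)


def degree(adj):
    return {n: len(nb) for n, nb in adj.items()}


def strength(adj):
    return {n: sum(nb.values()) for n, nb in adj.items()}


def betweenness(adj):
    """Unweighted, unnormalized betweenness centrality (Brandes' algorithm)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)   # number of shortest paths from s
        dist = dict.fromkeys(adj, -1)   # BFS distance from s
        sigma[s], dist[s] = 1, 0
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected pair is visited from both endpoints, so halve the totals.
    return {n: b / 2 for n, b in bc.items()}


# Hypothetical chain: nanomaterials - toxicology - environmental justice - policy
links = {
    ("nanomaterials", "toxicology"): 4,
    ("toxicology", "environmental justice"): 3,
    ("environmental justice", "policy"): 2,
}
adj = build_adjacency(links)
print(degree(adj), strength(adj), betweenness(adj))
```

In this toy chain, "environmental justice" has only two links but lies on every shortest path between "policy" and the other terms, which is the bridging role that betweenness centrality is meant to capture.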

Software and Tools for Co-occurrence Network Analysis

VOSviewer for Bibliometric Analysis

VOSviewer (Visualization Of Similarities viewer) is a specialized software tool developed by researchers at Leiden University's CWTS for constructing and visualizing bibliometric networks [21]. Its design specifically addresses the needs of scientific mapping, offering functionality for creating co-authorship networks, term co-occurrence maps, and citation-based networks from major bibliographic databases [21]. For environmental scientists, VOSviewer provides accessible entry into complex network analysis without requiring advanced programming skills.

The software accepts multiple data formats, including direct imports from Web of Science, Scopus, and PubMed/Medline, streamlining the initial data processing stages [1]. Additionally, VOSviewer offers text mining functionality that can construct co-occurrence networks from important terms extracted from bodies of scientific literature, making it particularly valuable for emerging research domains where keyword standardization may still be evolving [21].

Complementary Analysis Tools

While VOSviewer offers user-friendly network visualization, other software tools provide complementary functionalities. CiteSpace enables temporal analysis of research trends, particularly valuable for tracking the evolution of environmental science concepts [34]. Network Workbench provides additional statistical analysis capabilities for larger datasets [33]. The integration of these tools creates a comprehensive analytical suite for sophisticated bibliometric research.

Table 2: Software Tools for Co-occurrence Network Analysis

Software | Primary Function | Environmental Science Application
VOSviewer | Network visualization and clustering | Identifying research themes and collaborative patterns
CiteSpace | Temporal trend analysis and burst detection | Tracking emerging concepts in environmental policy
Network Workbench | Large-scale network statistics | Analyzing cross-disciplinary connections
Gephi | Advanced network manipulation and visualization | Creating publication-ready network maps

Application Notes: Environmental Science Case Study

Research Context and Objectives

Environmental science research faces particular challenges in synthesizing knowledge across multiple disciplines and addressing rapidly evolving issues like climate change impacts and ecosystem management. Co-occurrence network analysis addresses these challenges by mapping the conceptual structure of environmental research domains. For this case study, we simulate a bibliometric analysis investigating interconnections between "nanomaterial environmental safety" and "regulatory science" research—a domain with significant policy implications where VOSviewer has demonstrated utility in previous studies [33].

The analysis aims to: (1) identify major research themes within nanomaterial environmental risk literature, (2) detect emerging topics and temporal trends, and (3) visualize collaborative networks among researchers and institutions. This approach facilitates evidence-based literature reviews by providing data-driven insights before undertaking more rigorous systematic reviews [33].

Data Collection and Preparation Protocol

Data Source Selection and Search Strategy

  • Access the Web of Science Core Collection database via institutional subscription
  • Develop comprehensive search query combining terminology from both nanomaterial and environmental safety domains:
    • TS=(("nano*" AND "risk assessment") OR ("nano*" AND "environment* safety") OR ("nano*" AND "ecotoxic*"))
  • Apply publication date filters depending on research objectives (e.g., 2000-2025 for comprehensive mapping)
  • Refine results by document type (articles, reviews) excluding meeting abstracts, editorials, and letters
  • Export full record and cited references in plain text format suitable for VOSviewer import
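
Before importing the export into VOSviewer, it can be useful to inspect the records directly. The sketch below parses a Web of Science style plain-text export, assuming the usual layout of two-character field tags (for example AU, TI, DE, PY), indented continuation lines, and ER/EF markers for end of record and end of file. The sample record is invented, and real exports should be checked against this assumption.

```python
def parse_wos_plaintext(text):
    """Parse tagged plain-text records into a list of {tag: [values]} dictionaries."""
    records, current, tag = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            continue
        prefix, value = line[:2], line[3:].strip()
        if prefix in ("FN", "VR"):    # file header lines, not part of any record
            continue
        if prefix == "ER":            # end of record
            records.append(current)
            current, tag = {}, None
        elif prefix == "EF":          # end of file
            break
        elif prefix.strip():          # a new two-character field tag
            tag = prefix
            current.setdefault(tag, []).append(value)
        elif tag is not None:         # continuation line (leading spaces)
            current[tag].append(value)
    return records


def author_keywords(record):
    """Split the DE (author keywords) field on semicolons and lowercase the terms."""
    joined = " ".join(record.get("DE", []))
    return [k.strip().lower() for k in joined.split(";") if k.strip()]


# Invented sample record for illustration only.
sample = """FN Example Export
VR 1.0
PT J
AU Doe, J
   Roe, R
TI An illustrative record
DE Nanoparticles; Risk assessment;
   Ecotoxicity
PY 2020
ER

EF
"""
records = parse_wos_plaintext(sample)
print(author_keywords(records[0]))
```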

Data Cleaning and Standardization

  • Import raw data into VOSviewer using the built-in Web of Science parser
  • Apply thesaurus file to consolidate variant terms (e.g., "NP" and "nanoparticle," "EHS" and "environmental health safety")
  • Implement minimum publication threshold of 5 documents per author to focus on active contributors
  • Resolve institutional name variations (e.g., "Univ" versus "University") through manual inspection
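
A thesaurus file for VOSviewer is a tab-delimited text file with a `label` column (the variant) and a `replace by` column (the preferred term). The sketch below generates such a file from a Python dictionary; the mapping shown and the function name `thesaurus_text` are illustrative assumptions.

```python
import csv
import io


def thesaurus_text(mapping):
    """Render {variant: preferred} as tab-delimited text with 'label' and 'replace by' columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["label", "replace by"])
    for variant, preferred in sorted(mapping.items()):
        # Lowercase both columns so the entries compare consistently.
        writer.writerow([variant.lower(), preferred.lower()])
    return buffer.getvalue()


# Hypothetical variant-to-preferred-term mapping.
mapping = {
    "NP": "nanoparticle",
    "nanoparticles": "nanoparticle",
    "EHS": "environmental health safety",
}
text = thesaurus_text(mapping)
print(text)
```

The resulting string can be saved as a plain `.txt` file and selected in the VOSviewer map creation wizard. Keeping the mapping in a script makes the consolidation decisions reviewable and repeatable.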

Experimental Protocol: Creating Co-occurrence Networks with VOSviewer

The following diagram illustrates the complete workflow for creating co-occurrence networks from scientific literature:

Workflow diagram: Data Collection from Bibliographic Databases → Data Cleaning and Standardization → Import into VOSviewer → Select Network Type → Choose Counting Method → Apply Frequency Thresholds → Network Visualization → Analysis and Interpretation

Step-by-Step Implementation

Step 1: Data Import and Network Type Selection

  • Launch VOSviewer and select "Create" to begin new analysis
  • Choose "Create a map based on bibliographic data" option
  • Select appropriate import format matching exported data (Web of Science, Scopus, or PubMed)
  • Navigate to exported data file and confirm loading
  • Select "Co-occurrence" as analysis type and "All keywords" as unit of analysis [1]

Step 2: Counting Method and Normalization

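  • Choose the counting method: for keyword co-occurrence maps built from bibliographic data, VOSviewer offers full counting (every co-occurrence link counts fully) and fractional counting (links from documents with many keywords are down-weighted); the choice between binary counting (presence/absence per document) and full counting (every occurrence within a document) applies when a term map is built from text data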
  • For environmental science topics with established terminology, full counting typically provides more nuanced results
  • Select association strength as normalization method for balanced cluster detection
  • Implement minimum keyword occurrence threshold (typically 5-10) to focus analysis on meaningful concepts [1]
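
The effect of association strength normalization can be illustrated with a small calculation. The sketch assumes the common definition in which the co-occurrence count c_ij is divided by the product of the two terms' total occurrences, s_i and s_j; VOSviewer's internal implementation may differ by a constant factor or in how the totals are defined, and the numbers below are invented.

```python
def association_strength(links, occurrences):
    """Return c_ij / (s_i * s_j) for every keyword pair in `links`."""
    return {
        (a, b): c / (occurrences[a] * occurrences[b])
        for (a, b), c in links.items()
    }


# Hypothetical occurrence and co-occurrence counts.
occurrences = {"microplastics": 40, "pollution": 100, "raman spectroscopy": 5}
links = {
    ("microplastics", "pollution"): 20,
    ("microplastics", "raman spectroscopy"): 4,
}
strengths = association_strength(links, occurrences)
for pair, value in strengths.items():
    print(pair, round(value, 4))
```

Although "microplastics" and "pollution" co-occur more often in absolute terms, the normalized value is higher for "raman spectroscopy" because four of its five occurrences involve "microplastics". This correction for generally frequent terms is the reason normalization is applied before layout and clustering.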

Step 3: Network Visualization and Customization

  • Allow VOSviewer to automatically generate initial network layout
  • Apply clustering resolution appropriate to research questions (higher values yield more clusters)
  • Customize visualization using:
    • Network view: Adjust node size proportional to occurrence frequency
    • Label size: Scale according to term importance
    • Cluster colors: Assign distinct colors to thematic groups
    • Density view: Highlight conceptual density regions
    • Overlay visualization: Map temporal trends using color gradients [21]

Step 4: Analysis and Interpretation

  • Identify major research themes by examining cluster composition
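  • Detect central concepts through node size (occurrences) and total link strength; betweenness centrality is not reported by VOSviewer and has to be computed by exporting the network to a tool such as Gephi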
  • Analyze cluster interconnections to understand knowledge integration
  • Export visualization for publication and data tables for further statistical analysis

Research Reagent Solutions

Table 3: Essential Research Reagents for Co-occurrence Network Analysis

Tool/Resource | Function in Analysis | Application Example
Web of Science Core Collection | Primary bibliographic data source | Comprehensive coverage of environmental science literature
VOSviewer Software | Network construction and visualization | Creating co-occurrence maps of sustainability research
Thesaurus File | Standardization of variant terms | Merging "climate change" and "global warming" references
CiteSpace | Temporal pattern analysis | Detecting emerging concepts in renewable energy research
Microsoft Excel | Data cleaning and preprocessing | Managing author institutional affiliations

Advanced Analytical Techniques

Temporal Trend Analysis

Beyond static network visualization, VOSviewer enables examination of research evolution through overlay visualizations that map scientific activity across time periods [34]. For environmental scientists, this functionality helps track conceptual shifts in response to policy developments or technological breakthroughs. The software color-codes nodes based on publication year, creating a chronological landscape of research focus.

Implementation involves:

  • Selecting the Overlay Visualization tab in the VOSviewer main window
  • Choosing the average publication year score ("Avg. pub. year") as the attribute used to color items
  • Interpreting color gradients from cool (older) to warm (recent) tones
  • Identifying newly emerging concepts through color clustering
  • Detecting declining research areas through temporal patterns
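
The value behind each node's overlay color is, in essence, the mean publication year of the documents in which the keyword occurs. A minimal sketch of that calculation follows, using invented records given as (year, keywords) pairs.

```python
from collections import defaultdict


def average_publication_year(records):
    """Return {keyword: mean publication year of the documents containing it}."""
    totals = defaultdict(lambda: [0, 0])  # keyword -> [sum of years, document count]
    for year, keywords in records:
        for kw in set(keywords):
            totals[kw][0] += year
            totals[kw][1] += 1
    return {kw: year_sum / n for kw, (year_sum, n) in totals.items()}


# Hypothetical records for illustration.
records = [
    (2015, ["marine debris", "microplastics"]),
    (2019, ["microplastics"]),
    (2022, ["microplastics", "synthetic textiles"]),
]
avg_year = average_publication_year(records)
for kw, year in sorted(avg_year.items(), key=lambda item: item[1]):
    print(f"{kw}: {year:.1f}")
```

Keywords with a late average year and few documents deserve a manual check, since a single recent paper can make a term look like an emerging topic.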

Cluster Validation and Interpretation

Network clusters require rigorous interpretation to ensure they represent meaningful research themes rather than algorithmic artifacts. The following validation protocol enhances analytical robustness:

Quantitative Validation

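  • Calculate silhouette scores for cluster quality assessment (values >0.5 indicate reasonable quality); VOSviewer does not report this score, so compute it in an external tool from the exported network and cluster assignments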
  • Perform bootstrap resampling to test cluster stability
  • Compare multiple resolution parameters to identify optimal clustering

Qualitative Validation

  • Conduct expert review of cluster composition with domain specialists
  • Compare cluster structure with existing literature reviews
  • Validate cluster labels through examination of high-weight terms

Technical Considerations and Methodological Caveats

Methodological Limitations and Solutions

While powerful, co-occurrence network analysis presents several methodological challenges that environmental scientists must address:

Terminology Issues

  • Synonymy: Different terms representing identical concepts (e.g., "global warming" and "climate change")
  • Polysemy: Identical terms with different meanings across subdisciplines (e.g., "resolution" in remote sensing vs. policy studies)
  • Solution: Develop comprehensive thesaurus files specific to environmental science domains and validate through expert consultation

Database Biases

  • Coverage gaps: Incomplete representation of non-English literature or interdisciplinary work
  • Indexing inconsistencies: Variable keyword assignment practices across journals and databases
  • Solution: Implement multi-database searches (Web of Science, Scopus) and manual validation of key domains
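
Combining exports from several databases introduces duplicate records, which inflate occurrence counts if left in place. The sketch below removes duplicates by normalized DOI; the record structure (dictionaries with `doi` and `title` keys) and the placeholder DOIs are assumptions for illustration, and records without a DOI would still need title-based or manual matching.

```python
def normalize_doi(doi):
    """Lowercase a DOI and strip common URL or 'doi:' prefixes."""
    doi = doi.strip().lower()
    prefixes = (
        "https://doi.org/", "http://doi.org/",
        "https://dx.doi.org/", "http://dx.doi.org/", "doi:",
    )
    for prefix in prefixes:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def merge_records(*sources):
    """Keep the first record seen for each DOI; records without a DOI are kept as-is."""
    seen, merged = set(), []
    for source in sources:
        for record in source:
            doi = normalize_doi(record.get("doi", ""))
            if doi and doi in seen:
                continue
            if doi:
                seen.add(doi)
            merged.append(record)
    return merged


# Placeholder records with made-up DOIs.
wos = [{"doi": "10.0000/EXAMPLE.1", "title": "A"}, {"doi": "", "title": "B"}]
scopus = [
    {"doi": "https://doi.org/10.0000/example.1", "title": "A (Scopus copy)"},
    {"doi": "10.0000/example.2", "title": "C"},
]
merged = merge_records(wos, scopus)
print([r["title"] for r in merged])
```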

Recent critical reviews caution against treating co-occurrence networks as direct representations of ecological relationships without proper validation [35]. Environmental scientists should supplement network findings with traditional review methods and expert consultation to avoid misinterpretation of algorithmic patterns.

Optimization for Environmental Science Applications

Environmental science research presents unique challenges including terminological diversity, cross-disciplinary integration, and rapidly evolving concepts. The following optimization strategies address these challenges:

Domain-Specific Adaptations

  • Develop environmental science-specific stopword lists to retain technically meaningful terms
  • Create specialized thesauri for subdomains (e.g., conservation biology, environmental engineering)
  • Implement multi-level analysis examining both broad themes and specialized subdomains

Validation Frameworks

  • Compare network results with existing systematic reviews in well-established domains
  • Conduct expert surveys to assess face validity of cluster interpretations
  • Employ multiple software tools to assess methodological consistency

Co-occurrence network analysis represents a powerful methodology for mapping the complex, interdisciplinary landscape of environmental science research. When implemented through VOSviewer with appropriate methodological rigor, it enables researchers to identify emerging topics, track conceptual evolution, and visualize knowledge structures across vast literature corpora. The protocols outlined provide a comprehensive framework for environmental scientists to harness these techniques, from data collection through advanced temporal analysis.

As environmental challenges grow increasingly complex, such data-driven literature analysis methods become essential tools for research planning, gap identification, and interdisciplinary collaboration facilitation. By integrating these network approaches with domain expertise, environmental scientists can more effectively navigate the expanding research landscape and identify productive pathways for addressing pressing ecological issues.

Overcoming Common Challenges and Enhancing Analysis Quality

In bibliometric analysis, particularly within environmental science research, the integrity of findings depends on the quality of the underlying keyword data. Synonyms and keyword variations introduce noise that can skew network maps, misrepresent conceptual relationships, and compromise the validity of conclusions derived from tools like VOSviewer [36]. Environmental science is especially prone to this issue: near-synonyms such as "ecosystem services," "ecological goods," and "ecological products" are often used interchangeably across studies and schools of thought, and broad topics such as "environmental degradation" are labeled with many different terms [36] [16]. This application note establishes a standardized protocol for cleaning and harmonizing keyword data so that subsequent bibliometric visualizations and analyses reflect the intellectual structure of the research landscape.

Background and Quantitative Context

The challenge of keyword variability is not merely anecdotal but is quantifiable in the literature. The following table summarizes key aspects of keyword dynamics identified in bibliometric studies:

Table 1: Quantitative Insights into Keyword Dynamics in Scientific Literature

Aspect | Quantitative Finding | Source Context
High-Frequency Environmental Keywords | "ecosystem services", "valuation", "biodiversity", "management", and "conservation" are high-frequency, high-centrality terms [36]. | Analysis of international research on ecological product value.
Primary Research Drivers | Economic growth, renewable energy, and the Environmental Kuznets Curve are dominant themes [16]. | Bibliometric analysis of environmental degradation research (1,365 papers).
Keyword Repetitiveness as a Specialization Metric | Proposed Sj index measures journal specialization as the average frequency of a keyword's appearance in a journal [37]. | Study of keyword occurrences across 88,583 articles in 50 journals.

These findings underscore that effective data cleaning must go beyond simple merging of obvious duplicates. It requires an understanding of the thematic context—recognizing that "carbon emission" and "CO2" are functionally identical in many environmental studies [16]—and an awareness of the level of conceptual granularity, where broad terms like "management" coexist with specific ones like "contingent valuation" [36].

Experimental Protocol: Keyword Cleaning and Harmonization

This protocol provides a step-by-step methodology for preprocessing a raw keyword dataset exported from databases like Web of Science or Scopus before import into VOSviewer.

Materials and Reagents

Table 2: Essential Research Reagent Solutions for Bibliometric Data Cleaning

Item Name | Function/Description
Raw Bibliometric Data | The initial dataset, typically in .txt or .csv format, containing author keywords, KeyWords Plus, titles, and abstracts.
Data Preprocessing Software | A tool for bulk text manipulation (e.g., Python with Pandas, R, OpenRefine, or even advanced Excel functions).
Taxonomy/Thesaurus File | A custom-built list defining groups of synonymous terms and their preferred standardized label (e.g., "CO2" -> "carbon emissions").
VOSviewer Software | The bibliometric analysis and visualization tool for which the data is being prepared [36] [16].

Procedure

Step 1: Data Acquisition and Initial Preprocessing

  • Export Data: Collect and export the required bibliometric data (e.g., "Full Record and Cited References" from Web of Science).
  • Keyword Field Extraction: Isolate the keyword fields (author keywords and KeyWords Plus) into a single list for processing.
  • Case Normalization: Convert all keywords to lowercase to eliminate case-sensitive duplicates (e.g., "Climate Change" and "climate change").
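  • Punctuation and Whitespace Normalization: Replace hyphens with spaces so that hyphenated and unhyphenated variants converge (e.g., "land-use" and "land use"), strip remaining punctuation (e.g., commas, periods), and collapse repeated whitespace.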
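
The preprocessing in Step 1 can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a prescribed implementation: the function names are hypothetical, the semicolon separator is an assumption about the export format, and the choice to turn hyphens and slashes into spaces (rather than deleting them) is a design decision made here so that "land-use" becomes "land use" instead of "landuse".

```python
import re
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT = str.maketrans("", "", string.punctuation)

def normalize_keyword(raw):
    """Lowercase a keyword, turn hyphens/slashes into spaces, strip other
    punctuation, and collapse repeated whitespace."""
    text = re.sub(r"[-/]", " ", raw.lower())
    text = text.translate(_PUNCT)
    return re.sub(r"\s+", " ", text).strip()

def split_keywords(field, sep=";"):
    """Split one exported keyword field (assumed to be separated by `sep`)
    into a list of normalized keywords, skipping empty entries."""
    cleaned = (normalize_keyword(part) for part in field.split(sep))
    return [kw for kw in cleaned if kw]

print(split_keywords("Climate Change; climate-change ; CO2"))
```

After this step "Climate Change" and "climate-change" are the same string, which is what allows the later taxonomy lookup to use exact matching.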

Step 2: Building a Custom Environmental Science Harmonization Taxonomy

  • Identify Variations: From the initial keyword list, manually identify clusters of synonyms and spelling variations. Common examples in environmental science include:
    • "ecological product", "ecosystem good", "environmental service"
    • "CO2", "carbon dioxide", "carbon emission"
    • "EKC", "Environmental Kuznets Curve"
    • "GHG", "greenhouse gas" [36] [16]
  • Define Preferred Label: For each cluster, select a single, academically prevalent term as the preferred label.
  • Document Mapping: Create a two-column taxonomy file where the first column lists all observed variations and the second column lists the corresponding preferred label.
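
A two-column taxonomy of this kind is easy to keep as a CSV file and load into a dictionary. The sketch below assumes hypothetical column names (`variation`, `preferred_label`) and uses an in-memory string in place of a real file. It also rejects any variation that has been assigned to two different preferred labels, a common slip when a taxonomy is edited by hand.

```python
import csv
import io

# Hypothetical taxonomy content; in practice this would be a file on disk.
TAXONOMY_CSV = """variation,preferred_label
co2,carbon emissions
carbon dioxide,carbon emissions
carbon emission,carbon emissions
ekc,environmental kuznets curve
ghg,greenhouse gas
"""

def load_taxonomy(handle):
    """Read the variation -> preferred label mapping and reject variations
    that are assigned to two different labels."""
    mapping = {}
    for row in csv.DictReader(handle):
        variation = row["variation"].strip().lower()
        label = row["preferred_label"].strip().lower()
        if mapping.get(variation, label) != label:
            raise ValueError(f"conflicting labels for {variation!r}")
        mapping[variation] = label
    return mapping

taxonomy = load_taxonomy(io.StringIO(TAXONOMY_CSV))
print(taxonomy["co2"])
```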

Step 3: Automated Term Harmonization

  • Script Execution: Using your data preprocessing software, run a script that iterates through every keyword in the dataset.
  • Pattern Matching: For each keyword, the script checks against the first column of the taxonomy file.
  • Replacement: If a match is found, the keyword is replaced with the standardized preferred label from the second column.
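
The lookup-and-replace logic of Step 3 can be sketched as follows. The function name and sample records are illustrative. The sketch uses exact matching on already-normalized keywords, which avoids accidental substring replacements, and it removes duplicates created by the mapping so that a harmonized keyword is counted only once per document.

```python
def harmonize_record(keywords, taxonomy):
    """Map each keyword to its preferred label and drop duplicates created
    by the mapping, keeping first-seen order."""
    seen = []
    for keyword in keywords:
        label = taxonomy.get(keyword, keyword)  # unchanged if not in taxonomy
        if label not in seen:
            seen.append(label)
    return seen

# Hypothetical taxonomy and per-document keyword lists.
taxonomy = {
    "co2": "carbon emissions",
    "carbon dioxide": "carbon emissions",
    "ekc": "environmental kuznets curve",
}
records = [
    ["co2", "ekc", "carbon dioxide"],
    ["renewable energy", "co2"],
]
cleaned = [harmonize_record(record, taxonomy) for record in records]
print(cleaned)
```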

Step 4: Validation and Iteration

  • Generate Frequency Report: Produce a post-cleaning frequency report of all keywords.
  • Spot-Check: Manually review the report to ensure harmonization was applied correctly and no new anomalies were introduced.
  • Iterate: Refine the taxonomy file and repeat Steps 3-4 as necessary. This is an iterative process.
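
For the validation step, comparing keyword frequencies before and after cleaning shows exactly which terms were merged. The sketch below is one possible form of such a report; the function names and the three sample records are made up for illustration.

```python
from collections import Counter

def frequency_report(records):
    """Count in how many records each keyword appears."""
    counts = Counter()
    for keywords in records:
        counts.update(set(keywords))
    return counts

def compare_reports(before, after):
    """List keywords whose counts changed, as (keyword, before, after)."""
    changed = []
    for keyword in sorted(set(before) | set(after)):
        if before[keyword] != after[keyword]:
            changed.append((keyword, before[keyword], after[keyword]))
    return changed

raw = [["co2", "ekc"], ["carbon emissions"], ["co2", "carbon emissions"]]
clean = [["carbon emissions", "ekc"], ["carbon emissions"], ["carbon emissions"]]

for row in compare_reports(frequency_report(raw), frequency_report(clean)):
    print(row)
```

Terms that drop to zero should all appear as variations in the taxonomy; any other unexpected change points to a mapping that is too aggressive.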

The following workflow diagram illustrates the logical sequence and decision points of this protocol:

Workflow: Acquire raw keyword data → Initial preprocessing (case normalization, punctuation removal) → Analyze initial list for synonym clusters → Build harmonization taxonomy file → Run automated term harmonization script → Validate output and generate frequency report → Results acceptable? If yes, export the cleaned data for VOSviewer; if no, refine the taxonomy and rerun the harmonization script.

Application in Environmental Science Research

Applying this protocol within environmental science requires domain-specific knowledge. The table below provides illustrative examples of synonym groups pertinent to this field, derived from bibliometric research.

Table 3: Exemplary Keyword Harmonization for Environmental Science Bibliometrics

Standardized Preferred Label | Common Synonyms and Variations to Map
Ecosystem Services | Ecological services, environmental services, ecosystem goods, ecological products [36].
Carbon Emissions | CO2, carbon dioxide, CO2 emissions, carbon emission [16].
Environmental Kuznets Curve | EKC, Kuznets curve [16].
Renewable Energy | Green energy, alternative energy, sustainable energy [16].
Valuation | Economic valuation, contingent valuation, ecosystem service valuation [36].
Biodiversity | Biological diversity, species richness [36].

Failure to implement this cleaning process can lead to misleading visualizations in VOSviewer. For instance, "CO2" and "carbon emissions" would appear as distinct, unconnected nodes in a co-occurrence network, artificially fragmenting the research domain and obscuring the true centrality of this topic [16]. Harmonizing these terms ensures that the resulting map accurately conveys the collective scholarly focus on this critical aspect of environmental degradation.
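
If the harmonization is to be applied inside VOSviewer rather than in a preprocessing script, synonym groups like those in Table 3 can be written out as a thesaurus file. The sketch below assumes the tab-delimited layout with a `label` and a `replace by` column described in the VOSviewer manual; verify this against the manual for the version in use. The dictionary holds only a subset of the table, and the output file name in the comment is hypothetical.

```python
# Preferred label -> variations, taken from a subset of the harmonization table.
SYNONYMS = {
    "ecosystem services": ["ecological services", "environmental services",
                           "ecosystem goods", "ecological products"],
    "carbon emissions": ["co2", "carbon dioxide", "co2 emissions",
                         "carbon emission"],
    "environmental kuznets curve": ["ekc", "kuznets curve"],
}

def build_thesaurus(synonyms):
    """Return thesaurus text: one header line, then one
    'variation<TAB>preferred label' line per synonym."""
    lines = ["label\treplace by"]
    for preferred, variants in synonyms.items():
        for variant in variants:
            lines.append(f"{variant}\t{preferred}")
    return "\n".join(lines) + "\n"

thesaurus_text = build_thesaurus(SYNONYMS)
print(thesaurus_text)
# To save (hypothetical file name):
# pathlib.Path("thesaurus_env.txt").write_text(thesaurus_text, encoding="utf-8")
```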

Rigorous data cleaning is the indispensable foundation of any reliable bibliometric analysis. The systematic protocol outlined here for handling synonyms and keyword variations empowers researchers to transform noisy, inconsistent raw data into a structured and valid dataset. By adopting these best practices, environmental scientists can leverage VOSviewer to generate more accurate, interpretable, and trustworthy maps of their research landscape, thereby providing a solid evidence base for scientific insight and policy decision-making.

In the field of environmental science research, bibliometric analysis using VOSviewer has become an indispensable methodology for mapping the intellectual landscape, identifying emerging trends, and understanding collaborative networks. The software enables researchers to construct and visualize various bibliometric networks, including co-authorship, co-citation, and keyword co-occurrence networks [20]. However, the creation of meaningful and interpretable maps requires careful consideration of threshold settings, which directly determine the balance between analytical detail and visual clarity. Proper threshold selection ensures that visualizations highlight the most significant patterns without becoming cluttered with irrelevant information, making it a critical skill for researchers working with complex environmental datasets.

This application note provides a comprehensive framework for selecting appropriate thresholds in VOSviewer, with specific considerations for environmental science research. We present detailed protocols, quantitative guidelines, and visualization strategies to help researchers optimize their bibliometric maps for maximum analytical value and communicative power.

Understanding VOSviewer Threshold Parameters

Core Threshold Concepts

Thresholds in VOSviewer function as filtration mechanisms that determine which elements (items, links, clusters) appear in the final visualization. These parameters are essential for managing visual complexity while maintaining analytical integrity. The software employs several types of thresholds that operate on different attributes of the bibliometric data, each serving a distinct purpose in the map refinement process.

The item weight threshold controls which items (authors, keywords, journals, countries) appear in the visualization based on quantitative metrics such as publication count, citation count, or total link strength [38]. Items falling below this threshold are excluded from the map, allowing researchers to focus on the most significant elements. The link strength threshold determines which connections between items are displayed, filtering out weaker associations that might contribute to visual noise without adding substantive analytical value [22]. Additionally, VOSviewer incorporates cluster resolution parameters that influence how items are grouped together, with higher resolution values typically resulting in more numerous, finer-grained clusters [22].
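
The combined effect of an item weight threshold and a link strength threshold can be illustrated on a toy co-occurrence network. This is a simplified model of the filtering idea, not VOSviewer's internal implementation; the function name, terms, and numbers are invented.

```python
def apply_thresholds(item_weights, links, min_weight, min_link_strength):
    """Keep items whose weight reaches min_weight, then keep links that
    reach min_link_strength and connect two kept items."""
    kept_items = {item for item, w in item_weights.items() if w >= min_weight}
    kept_links = {
        pair: strength
        for pair, strength in links.items()
        if strength >= min_link_strength
        and pair[0] in kept_items
        and pair[1] in kept_items
    }
    return kept_items, kept_links

# Hypothetical occurrence counts and co-occurrence strengths.
occurrences = {"climate change": 40, "adaptation": 12,
               "resilience": 9, "phenology": 3}
cooccurrence = {
    ("adaptation", "climate change"): 8,
    ("climate change", "resilience"): 2,
    ("climate change", "phenology"): 3,
}

items, links = apply_thresholds(occurrences, cooccurrence,
                                min_weight=5, min_link_strength=3)
print(sorted(items), links)
```

Note that "resilience" survives as an item but loses its only link, which is how isolated nodes can appear when the two thresholds are tuned independently.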

Quantitative Threshold Guidelines for Environmental Science

Based on analysis of published bibliometric studies in environmental science and related fields, the following table provides recommended threshold ranges for different types of analyses. These values serve as starting points that should be refined based on specific research questions and dataset characteristics.

Table 1: Recommended Threshold Ranges for Environmental Science Bibliometrics

Analysis Type | Dataset Size | Minimum Item Weight | Minimum Link Strength | Resolution Parameter
Co-authorship | 500-2,000 items | 2-5 documents | 1-2 co-authored papers | 1.0-1.5
Keyword Co-occurrence | 3,000-10,000 items | 5-10 occurrences | 3-5 co-occurrences | 1.2-1.8
Citation | 1,000-5,000 items | 5-15 citations | 2-4 citation links | 0.8-1.2
Country Collaboration | 50-100 countries | 1-2 collaborative papers | 1-2 collaborative links | 1.0-1.5

Experimental Protocols for Threshold Optimization

Protocol 1: Systematic Threshold Calibration for Keyword Co-occurrence Analysis

Purpose: To establish an optimized threshold setting for keyword co-occurrence analysis in environmental science research, balancing comprehensive coverage with visual interpretability.

Materials and Reagents:

  • VOSviewer Software: Version 1.6.10 or newer, installed from the official VOSviewer website [39].
  • Bibliographic Data: Extracted from Web of Science Core Collection using field-specific search queries.
  • Data Format: Plain text files containing publication records exported from WoS.
  • Computing Equipment: Standard computer with at least 8 GB RAM and 500 MB of free storage.

Method Details:

  • Data Extraction and Preparation

    • Design a comprehensive search query for your environmental science topic (e.g., "microplastic pollution" OR "environmental degradation" OR "climate change adaptation").
    • Execute the search in Web of Science Core Collection, limiting results to relevant document types (articles, reviews) and publication years (typically 5-10 year range) [39].
    • Export the full record and cited references in plain text format.
    • Note: For environmental science topics, include multiple relevant databases (Scopus, PubMed) if available to ensure comprehensive coverage [39].
  • Initial Threshold Setting

    • Import data into VOSviewer and select "Create a map based on text data."
    • Choose "Title and Abstract fields" as the data source.
    • Set the initial minimum number of occurrences of a term to 10, which typically filters out overly specific or rare terminology while retaining conceptually significant terms [38].
    • Select the binary counting method to avoid over-emphasis on terms that appear multiple times in the same document.
  • Iterative Refinement

    • Generate the initial map and identify the number of items meeting the threshold.
    • If the map contains >200 items, increase the minimum occurrence threshold by increments of 5 until the item count falls between 100-200 for optimal readability.
    • If the map contains <50 items, decrease the threshold by increments of 2 until a sufficiently detailed network emerges.
    • Apply additional filtering based on the relevance score. By default VOSviewer keeps the 60% most relevant terms; lowering this percentage removes more generic terms.
  • Validation and Documentation

    • Compare the resulting map with domain knowledge to ensure key concepts in environmental science are appropriately represented.
    • Document the final threshold values, number of items included, and the rationale for threshold selection.
    • Export the network data for further analysis or reporting.
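
The iterative refinement rule in this protocol (raise the threshold in steps of 5 when more than 200 items qualify, lower it in steps of 2 when fewer than 50 qualify) can be simulated outside VOSviewer to find a sensible starting value. The sketch below is an illustration under stated assumptions: the function names are hypothetical, term counts use binary counting (each term counted once per document), and the loop stops rather than oscillating if the step sizes overshoot the target range.

```python
from collections import Counter

def binary_counts(documents):
    """Count each term once per document (binary counting)."""
    counts = Counter()
    for terms in documents:
        counts.update(set(terms))
    return counts

def items_at(term_counts, threshold):
    """Number of terms whose count reaches the threshold."""
    return sum(1 for c in term_counts.values() if c >= threshold)

def calibrate_threshold(term_counts, start=10, upper=200, lower=50,
                        max_steps=100):
    """Raise the threshold by 5 while too many items qualify, or lower it
    by 2 while too few qualify. The direction never reverses."""
    threshold, direction = start, 0
    for _ in range(max_steps):
        n = items_at(term_counts, threshold)
        if n > upper and direction >= 0:
            threshold, direction = threshold + 5, 1
        elif n < lower and threshold > 1 and direction <= 0:
            threshold, direction = max(1, threshold - 2), -1
        else:
            break
    return threshold, items_at(term_counts, threshold)

# Synthetic corpus: term t1 occurs once, t2 twice, ..., t400 400 times.
synthetic = {f"t{i}": i for i in range(1, 401)}
print(calibrate_threshold(synthetic))
```

The returned pair is the final threshold and the number of items it retains, which are the two values the protocol asks to be documented.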

Protocol 2: Co-authorship Network Threshold Optimization

Purpose: To identify optimal thresholds for mapping collaboration networks in environmental science research communities.

Materials and Reagents:

  • VOSviewer Software: As in Protocol 1.
  • Citation Data: From WoS or Scopus, focusing on author and institutional information.
  • Data Cleaning Tools: Text preprocessing software for author name disambiguation.

Method Details:

  • Data Collection and Cleaning

    • Retrieve bibliographic data from WoS using an environmental science-focused search query.
    • Export full records including author names, affiliations, and citation information.
    • Preprocess author names to address homonyms and variant spellings, which is particularly important for cross-institutional environmental science collaborations.
  • Threshold Configuration

    • In VOSviewer, select "Create a map based on bibliographic data" and choose "Co-authorship" analysis type.
    • For author co-authorship analysis, set the minimum number of documents per author to 2 and the minimum number of citations to 5 as starting parameters.
    • For organization-level analysis, set the minimum document count to 3 to focus on established institutional collaborations.
    • For country-level analysis, set the minimum document count to 1, as international collaborations in environmental science may be less frequent but highly significant.
  • Network Refinement

    • Generate the initial co-authorship network.
    • Adjust the minimum number of documents per author upward if the network contains >150 authors, or downward if <50 authors are included.
    • Set the minimum link strength to 1 to display all collaborative relationships, or increase to 2 to focus on stronger collaborations.
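    • Select the fractional counting method so that publications with many authors do not dominate link strengths (this counting option is distinct from the "Fractionalization" normalization method).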
  • Interpretation and Analysis

    • Identify key researchers and institutions based on node size and total link strength.
    • Analyze cluster formation to reveal research communities within environmental science.
    • Compare collaboration patterns across different subdomains (e.g., environmental chemistry, conservation biology, climate science).
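
To see what fractional counting does to co-authorship link strengths, the sketch below compares it with full counting on three invented papers. It assumes the weighting described in the VOSviewer manual as we read it: a paper with n authors contributes 1/(n-1) to each of its co-authorship links, so heavily co-authored papers count for less per link. Function names and author names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

def coauthorship_links(papers, fractional=True):
    """Sum link strengths over papers. With fractional counting, a paper
    with n authors adds 1/(n-1) to each author pair; otherwise it adds 1."""
    links = defaultdict(float)
    for authors in papers:
        unique = sorted(set(authors))
        n = len(unique)
        if n < 2:
            continue  # single-author papers create no links
        weight = 1.0 / (n - 1) if fractional else 1.0
        for a, b in combinations(unique, 2):
            links[(a, b)] += weight
    return dict(links)

def authors_meeting_threshold(papers, min_docs=2):
    """Authors with at least min_docs papers."""
    counts = Counter(a for authors in papers for a in set(authors))
    return {a for a, c in counts.items() if c >= min_docs}

papers = [
    ["Li", "Smith"],
    ["Li", "Smith", "Garcia"],
    ["Garcia", "Li", "Smith", "Okafor", "Tan"],
]
print(coauthorship_links(papers, fractional=False)[("Li", "Smith")])
print(coauthorship_links(papers, fractional=True)[("Li", "Smith")])
print(sorted(authors_meeting_threshold(papers, min_docs=2)))
```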

The following diagram illustrates the threshold optimization workflow for bibliometric analysis in VOSviewer:

Workflow: Data import → Set initial thresholds → Generate initial map → Assess item count. If the item count exceeds 200, increase the threshold and regenerate the map; if it falls below the optimal range, decrease the threshold and regenerate the map; if it is within the optimal range (50-200 items), proceed to the final visualization.

Workflow for Threshold Optimization in VOSviewer

Successful bibliometric analysis in environmental science requires both specialized software and methodological knowledge. The following table details essential components of the bibliometric analysis toolkit, with particular emphasis on their application to environmental research questions.

Table 2: Research Reagent Solutions for VOSviewer Bibliometric Analysis

Tool/Resource | Function | Application in Environmental Science
VOSviewer Software | Network visualization and analysis | Creates interpretable maps of research trends, collaborations, and conceptual structure in environmental science [20].
Web of Science Core Collection | Primary data source | Provides comprehensive bibliographic data with consistent indexing, essential for tracking environmental research outputs [39].
CiteSpace | Complementary analysis tool | Identifies emerging trends and burst concepts in environmental science literature when used with VOSviewer [39].
Custom Thesauri | Term normalization | Standardizes variant environmental terminology (e.g., "global warming" vs. "climate change") for more accurate mapping.
JSON Configuration Files | VOSviewer parameter storage | Saves and shares optimal threshold settings for specific environmental science domains [22].

Advanced Threshold Strategies for Specific Environmental Science Applications

Temporal Analysis and Overlay Visualizations

Threshold Strategy for Temporal Analysis:

  • Set a lower minimum occurrence threshold (5-8) to capture emerging concepts that may not yet have high frequency.
  • Use the citation overlay feature with a minimum of 10 citations to identify influential works across different time periods.
  • Adjust the color scale endpoints manually to align with significant policy or scientific events in environmental science (e.g., Paris Agreement, IPCC reports).
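
The quantity behind a publication-year overlay can be approximated outside VOSviewer as the mean publication year of the documents in which each keyword occurs. The sketch below mirrors that idea only; it is not VOSviewer's implementation, and the function name and records are invented. The lower occurrence threshold of 5 follows the strategy above.

```python
from collections import defaultdict

def average_publication_year(records, min_occurrences=5):
    """Mean publication year per keyword, for keywords that occur in at
    least min_occurrences documents. records: (year, keywords) pairs."""
    years = defaultdict(list)
    for year, keywords in records:
        for keyword in set(keywords):
            years[keyword].append(year)
    return {
        keyword: sum(values) / len(values)
        for keyword, values in years.items()
        if len(values) >= min_occurrences
    }

# Hypothetical records: one older topic, one recent topic, one rare term.
records = (
    [(year, ["eutrophication"]) for year in range(2005, 2010)]
    + [(year, ["microplastics"]) for year in range(2018, 2023)]
    + [(2021, ["nanoplastics"]), (2022, ["nanoplastics"])]
)
print(average_publication_year(records))
```

Keywords with a high average year and a modest count are candidates for emerging concepts, which is why the occurrence threshold is kept low in temporal analyses.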

The relationship between threshold parameters and visualization characteristics follows predictable patterns that can guide decision-making:

High threshold settings → lower map detail, enhanced clarity, focused analysis. Low threshold settings → higher map detail, reduced clarity, exploratory analysis.

Threshold Impact on Map Characteristics

Handling Multidisciplinary Environmental Topics

Environmental science increasingly intersects with other disciplines, creating challenges for threshold selection due to terminological diversity. For multidisciplinary topics like "One Health" or "Planetary Boundaries," consider these specialized approaches:

  • Staged Thresholding:

    • First pass: Apply standard thresholds (minimum 10 occurrences) to identify core concepts.
    • Second pass: Reduce thresholds (minimum 5 occurrences) specifically for bridge terms that connect disciplinary clusters.
    • Third pass: Manually add critical interdisciplinary terms that fall below automatic thresholds.
  • Cluster-Based Threshold Adjustment:

    • Generate an initial map with moderate thresholds.
    • Identify distinct disciplinary clusters (e.g., ecological, sociological, technological).
    • Adjust thresholds separately for each cluster to ensure adequate representation of key concepts within each discipline.
  • Cross-Database Validation:

    • Compare results from WoS, Scopus, and PubMed to identify consistently significant terms across databases.
    • Use consistent threshold values across databases to enable valid comparison.
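
The staged thresholding idea can be sketched as a three-pass selection. The definition of a "bridge term" used here (a term below the core threshold that co-occurs with terms from at least two disciplinary clusters) is an assumption made for this illustration, as are the function name, the terms, and the counts.

```python
def staged_selection(occurrences, linked_clusters, core_min=10,
                     bridge_min=5, manual=()):
    """Pass 1: core terms at core_min or above.
    Pass 2: bridge terms between bridge_min and core_min that link to
            two or more clusters.
    Pass 3: manually listed terms that exist in the data but were missed."""
    core = {t for t, c in occurrences.items() if c >= core_min}
    bridge = {
        t for t, c in occurrences.items()
        if bridge_min <= c < core_min
        and len(linked_clusters.get(t, ())) >= 2
    }
    manual_added = {t for t in manual if t in occurrences} - core - bridge
    return core, bridge, manual_added

# Hypothetical term counts and the clusters each term co-occurs with.
occurrences = {"one health": 25, "zoonosis": 14, "land use": 11,
               "spillover": 7, "wet market": 6,
               "planetary health": 3, "biosecurity": 2}
linked_clusters = {
    "spillover": {"ecological", "medical"},
    "wet market": {"medical"},
    "planetary health": {"ecological", "sociological"},
}

core, bridge, added = staged_selection(
    occurrences, linked_clusters, manual=("planetary health", "ecohealth"))
print(sorted(core), sorted(bridge), sorted(added))
```

Keeping the three sets separate makes it straightforward to report which terms entered the map by rule and which by expert judgment.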

Threshold selection in VOSviewer represents both a technical and conceptual challenge that directly influences the analytical value of bibliometric visualizations in environmental science research. The protocols and guidelines presented in this application note provide a systematic approach to balancing map detail and clarity, enabling researchers to generate visualizations that are both comprehensive and interpretable. As environmental science continues to evolve as an interdisciplinary field, appropriate threshold management becomes increasingly important for identifying emerging research trends, collaboration patterns, and knowledge structures. By applying these evidence-based threshold strategies, researchers can enhance the rigor and communicative power of their bibliometric analyses, ultimately supporting more informed decisions in research planning and policy development.

Bibliometric analysis has become an indispensable methodology in environmental science research, enabling the systematic mapping of knowledge domains and emerging trends. VOSviewer has emerged as a dominant tool in this landscape, distinguished by its powerful network visualization capabilities and user-friendly interface for creating "distance-based maps" where the proximity between items accurately reflects their similarity [40]. Despite its widespread adoption across various environmental research domains, including environmental security management and ecological risk assessment, researchers must acknowledge and develop strategies to address two significant analytical limitations: the lack of stemming functionality in term analysis and the inability to perform native temporal analysis [40]. These constraints present particular challenges in environmental science, where terminology evolves rapidly and understanding temporal trends is crucial for tracking emerging pollutants, policy impacts, and technological developments.

Key Limitations in Environmental Science Research

The Stemming Challenge in Terminology Analysis

Stemming refers to the text processing technique that reduces words to their root form, allowing related terms to be analyzed as a single conceptual unit. VOSviewer's inability to perform stemming presents significant challenges in environmental science bibliometrics, where terminology frequently appears in variant forms.

Impact on Environmental Science Research:

  • Inconsistent clustering of related environmental concepts (e.g., "ecosystem service," "ecosystem services")
  • Fragmented analysis of pollution terminology (e.g., "microplastic," "microplastics")
  • Inflated keyword counts and distorted co-occurrence patterns
  • Reduced accuracy in identifying true conceptual relationships

Table 1: Common Environmental Science Terminology Affected by Lack of Stemming

Root Concept | Variant Forms | Research Impact
Ecosystem Service | service, services | Fragmented analysis of key sustainability concepts
Climate Change | changing, changed climate | Incomplete assessment of research themes
Risk Assessment | assessing, assessed risk | Disconnected risk management literature
Environmental Security | security, securities | Compromised mapping of safety research domains
Pollution | pollute, polluted, polluting | Underestimation of pollution research volume

Temporal Analysis Limitations

VOSviewer lacks native capabilities for analyzing how research domains evolve, a critical limitation for environmental science where understanding temporal patterns is essential for tracking emerging contaminants, policy impacts, and technological adoption. While the software provides overlay visualizations, it cannot perform sophisticated time-series analysis or detect emerging trends algorithmically [40].

Consequences for Environmental Research:

  • Inability to track the evolution of environmental concepts like "resilient cities" through distinct development periods [41]
  • Limited capacity to identify emerging contaminants or treatment technologies
  • Difficulty establishing causal relationships between environmental policies and research focus
  • Compromised prediction of future research directions in rapidly evolving fields

Experimental Protocols for Overcoming Limitations

Protocol 1: Preprocessing Text Data for Stemming

Objective: Implement a standardized text preprocessing workflow to compensate for VOSviewer's lack of stemming functionality.

Materials and Software:

  • Python 3.7+ with Natural Language Toolkit (NLTK) and Pandas libraries
  • Raw bibliographic data from Web of Science, Scopus, or Dimensions
  • VOSviewer software (version 1.6.19 or newer)

Methodology:

  • Data Extraction: Export bibliographic records (titles, abstracts, keywords) from your chosen database in CSV or RIS format
  • Term Extraction: Isolate key textual elements (author keywords, KeyWords Plus, title terms) into a separate spreadsheet
  • Stemming Implementation: Apply the Porter stemming algorithm to all terms using Python's NLTK library
  • Manual Harmonization: Review and standardize environmental science-specific terminology not properly handled by algorithmic stemming
  • Data Integration: Replace original terms with standardized forms in the bibliographic data file
  • VOSviewer Analysis: Import the preprocessed data into VOSviewer for co-occurrence analysis
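
The stemming and harmonization steps can be sketched as follows. The sketch uses NLTK's `PorterStemmer` when NLTK is installed and otherwise falls back to a crude plural-stripping function that is not the Porter algorithm. Stems are used only as grouping keys; each group is then labeled with its most frequent surface form so that map labels stay readable. The function names and term counts are illustrative.

```python
from collections import Counter, defaultdict

try:
    from nltk.stem import PorterStemmer
    _stem = PorterStemmer().stem
except ImportError:
    def _stem(word):
        """Crude fallback that only strips plural endings (not Porter)."""
        if word.endswith("ies") and len(word) > 4:
            return word[:-3] + "y"
        if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
            return word[:-1]
        return word

def stem_key(term):
    """Grouping key: the term with every word stemmed."""
    return " ".join(_stem(word) for word in term.lower().split())

def harmonize_by_stem(term_counts):
    """Map every term to the most frequent term sharing its stem key."""
    groups = defaultdict(list)
    for term in term_counts:
        groups[stem_key(term)].append(term)
    mapping = {}
    for variants in groups.values():
        label = max(variants, key=lambda t: (term_counts[t], t))
        for variant in variants:
            mapping[variant] = label
    return mapping

# Hypothetical keyword counts.
counts = Counter({"microplastics": 40, "microplastic": 12,
                  "ecosystem services": 30, "ecosystem service": 9,
                  "biodiversity": 20})
mapping = harmonize_by_stem(counts)
print(mapping)
```

Because algorithmic stemmers can also merge terms that should stay separate, the resulting mapping is best treated as a draft for the manual harmonization step rather than as a final answer.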

Workflow: Export raw bibliographic data → Extract key text fields (titles, abstracts, keywords) → Apply algorithmic stemming (Porter stemmer) → Manual terminology harmonization (environmental science specific) → Integrate standardized terms back into the dataset → Import preprocessed data into VOSviewer.

Protocol 2: Complementary Temporal Analysis Framework

Objective: Establish a reproducible methodology for integrating a temporal dimension into VOSviewer analyses using complementary bibliometric tools.

Materials and Software:

  • VOSviewer (version 1.6.19 or newer)
  • CiteSpace (version 6.2.R4 or newer) or Bibliometrix (version 4.1.0 or newer)
  • Web of Science or Scopus database access
  • Microsoft Excel or similar spreadsheet software

Methodology:

  • Data Collection: Conduct a comprehensive literature search using environmental science-specific search strings with defined time ranges
  • Split-Time Analysis: Divide dataset into meaningful time periods based on environmental policy milestones or technological developments
  • Comparative VOSviewer Mapping: Generate separate network maps for each time period using identical parameters
  • CiteSpace Integration: Perform burst detection and time-zone analysis in CiteSpace to identify emerging trends and pivotal publications
  • Synthesis: Correlate findings from both tools to construct a comprehensive timeline of conceptual evolution
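
The split-time step can be prepared with a short script that assigns each record to a period and counts keywords per period. In the sketch below the single boundary year 2016 (the first full year after the Paris Agreement was adopted in December 2015) is an illustrative choice, and the function names and records are hypothetical. Each period's records could then be written to a separate file and mapped in VOSviewer with identical parameters.

```python
from bisect import bisect_right
from collections import Counter

def assign_period(year, boundaries):
    """Index of the period a year falls into. `boundaries` holds the first
    year of every period after the first, in ascending order."""
    return bisect_right(boundaries, year)

def keyword_counts_by_period(records, boundaries):
    """One Counter of keyword document counts per period."""
    periods = [Counter() for _ in range(len(boundaries) + 1)]
    for year, keywords in records:
        periods[assign_period(year, boundaries)].update(set(keywords))
    return periods

# Hypothetical (year, keywords) records.
records = [
    (2012, ["adaptation"]),
    (2015, ["mitigation", "adaptation"]),
    (2016, ["net zero"]),
    (2020, ["net zero", "adaptation"]),
]
before, after = keyword_counts_by_period(records, boundaries=[2016])
emerging = set(after) - set(before)
print(before, after, emerging)
```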

Table 2: Temporal Analysis Workflow for Environmental Science Research

Research Phase | VOSviewer Application | Complementary Tool | Outcome Metrics
Data Preparation | Keyword co-occurrence per time period | Bibliometrix for conceptual structure map | Thematic evolution patterns
Trend Identification | Overlay visualization with publication year | CiteSpace for burst detection | Citation burst strength & duration
Network Evolution | Citation network analysis | SciMAT for thematic evolution | Thematic stability, emergence, disappearance
Research Front Analysis | Bibliographic coupling | CiteSpace for time-zone visualization | Research front progression

Workflow: Define environmental research question → Collect bibliographic data with a time-structured search → Split data into meaningful time periods → VOSviewer: generate comparative network maps per period → CiteSpace: perform burst detection and time-zone analysis → Integrate findings into a temporal evolution model.

Case Study: Environmental Security Management Research

A 2023 study on international environmental security management exemplifies the effective integration of VOSviewer with complementary tools to overcome its inherent limitations [41]. The research analyzed 7,596 Web of Science articles published between 1997 and 2021, involving 28,144 authors, and identified six main cluster labels.

Methodology Implementation

Stemming Compensation Approach: Researchers implemented manual terminology harmonization for environmental security concepts, grouping variants like "environmental safety" and "environmental security" through pre-processing before VOSviewer analysis. This enabled more accurate identification of research hotspots spanning personal health, society, agriculture, ecological environment, energy, and sustainable development.

Temporal Analysis Integration: While VOSviewer mapped the current research landscape and collaboration networks, CiteSpace provided critical temporal analysis through:

  • Time-zone visualization of keyword evolution
  • Burst detection identifying emerging concepts
  • Analysis of shifting international collaborations over time

The hybrid methodology revealed that the United States maintains a dominant position in this research field, with China showing increasing collaboration with the United States, Britain, Australia, and India [41].

Research Reagent Solutions

Table 3: Essential Analytical Tools for Comprehensive Bibliometric Analysis

Tool/Software | Primary Function | Application in Environmental Science | Access
VOSviewer | Network visualization and clustering | Mapping research domains and collaboration networks | Free download
CiteSpace | Temporal and burst analysis | Identifying emerging trends and research fronts | Free download
Bibliometrix | Comprehensive bibliometric analysis | Performance analysis and science mapping | R package
Python NLTK | Text preprocessing and stemming | Terminology standardization pre-VOSviewer | Open source
Google Sheets | Data preprocessing and harmonization | Manual terminology cleaning | Web-based

Integrated Workflow for Environmental Science Bibliometrics

Based on experimental protocols and case studies, we propose a comprehensive workflow that compensates for VOSviewer's limitations while leveraging its strengths in visualization.

Workflow: 1. Data collection and preprocessing (text standardization and stemming) → 2. Preliminary VOSviewer analysis (co-occurrence and collaboration networks) → 3. Complementary temporal analysis (using CiteSpace or Bibliometrix) → 4. Data integration and interpretation (correlating network and temporal findings) → 5. Visualization and communication (leveraging VOSviewer's strength in presentation).

Implementation Guidelines

For Terminology Management:

  • Develop environmental science-specific dictionaries for manual term harmonization
  • Combine algorithmic stemming with domain expertise for optimal results
  • Document all terminology decisions to ensure reproducibility

For Temporal Analysis:

  • Align time periods with significant environmental policy milestones (e.g., Paris Agreement, SDG implementation)
  • Use split-time analysis to map conceptual evolution before and after key events
  • Correlate research trends with real-world environmental developments
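
A split-time analysis of this kind can be prototyped before any map is drawn. The sketch below is illustrative only: the records are invented, the function name `split_time_counts` is hypothetical, and treating 2016 as the first "after" year is an assumption based on the Paris Agreement having been adopted in December 2015.

```python
from collections import Counter

# Hypothetical records: (publication year, author keywords). Not real data.
RECORDS = [
    (2012, ["carbon tax", "emissions trading"]),
    (2014, ["emissions trading", "adaptation"]),
    (2017, ["net zero", "adaptation"]),
    (2019, ["net zero", "carbon tax"]),
    (2021, ["net zero", "just transition"]),
]


def split_time_counts(records, milestone_year):
    """Count keyword occurrences before the milestone year vs. from it onward."""
    before, after = Counter(), Counter()
    for year, keywords in records:
        target = before if year < milestone_year else after
        target.update(set(keywords))  # count each keyword once per record
    return before, after


before, after = split_time_counts(RECORDS, milestone_year=2016)
emerging = sorted(k for k in after if k not in before)
print("Before:", dict(before))
print("After:", dict(after))
print("Only after the milestone:", emerging)
```

Each period's records can then be exported separately and mapped in VOSviewer to compare the two conceptual structures side by side.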

VOSviewer remains an invaluable tool for environmental science bibliometrics, particularly for its sophisticated visualization capabilities and user-friendly interface. However, researchers must acknowledge its limitations in stemming and temporal analysis, which matter in a field characterized by evolving terminology and a pressing need to understand temporal trends. The experimental protocols and integrated workflow presented here provide a robust methodology for compensating for these limitations while leveraging VOSviewer's strengths. By combining VOSviewer with other bibliometric tools and paying careful attention to text preprocessing, environmental scientists can produce more accurate, comprehensive, and temporally sensitive analyses of their research domains. This approach enables more effective mapping of the complex, evolving landscape of environmental science research, from ecosystem service-based risk assessment to emerging contaminants and sustainability transitions.

Optimizing Map Layouts and Clustering for Effective Presentation

Optimizing map layouts and clustering is central to applying VOSviewer in environmental science bibliometrics. Effective visual presentation transforms complex network data into interpretable knowledge landscapes, enabling researchers to identify key themes, track emerging trends, and communicate findings clearly to diverse audiences, including scientists and drug development professionals. VOSviewer is designed to construct and visualize bibliometric networks from scientific literature, including co-occurrence, co-authorship, and citation networks [20]. It also supports visual exploration of patterns in textual and bibliographic data, which makes it well suited to mapping scientific fields [20]. This section provides application notes and protocols for getting the most out of VOSviewer's layout and clustering algorithms, with examples from environmental science.

Network Typology and Data Acquisition in VOSviewer

Before optimizing a map, one must understand the type of network being analyzed. VOSviewer can construct several network types, each offering a different perspective on the scholarly landscape. The choice of network dictates the relationships that will be visualized and clustered.

Table: Common Bibliometric Network Types in VOSviewer

| Network Type | Defining Relationship | Research Insight Provided | Example from Environmental Science |
|---|---|---|---|
| Co-occurrence | Terms (e.g., keywords) appearing together in the same publication [42]. | The conceptual structure and main topics of a field. | Mapping the co-occurrence of terms like "microplastics," "bioaccumulation," and "ecotoxicology." |
| Co-authorship | Researchers or institutions collaborating on publications [42] [20]. | Collaborative networks and key partners. | Visualizing international collaboration on "carbon sequestration" research. |
| Citation | Documents or journals citing one another [42]. | The flow of knowledge and influence between entities. | Analyzing which foundational papers on "green synthesis" are most cited in recent drug development. |
| Bibliographic Coupling | Documents that share common references [42]. | Current research fronts working on similar problems. | Identifying groups of recent studies on "pharmaceutical pollution" that build on the same knowledge base. |
| Co-citation | Documents or journals being cited together by other documents [42]. | The intellectual foundations and seminal works of a field. | Revealing the core set of historical studies that underpin modern "environmental impact assessment." |

Experimental Protocol: Building a Co-occurrence Network from Scratch

This protocol details the creation of a term co-occurrence network, one of the most common approaches for mapping a research field's conceptual structure.

1. Data Source Identification and Export:

  • Primary Source: Utilize the Web of Science (WOS) Core Collection, a comprehensive database for bibliometric analysis [34].
  • Search Strategy: Develop a targeted Boolean search query. For an environmental science context, this might be: TS=("green synthesis" AND "drug development" AND "environmental impact").
  • Data Export: After executing the search, select all relevant publications. Export the full record and cited references from WOS in a plain text format.

2. Data Import and Network Construction in VOSviewer:

  • Open VOSviewer and select Create.
  • Choose Create a map based on bibliographic data and then Read data from bibliographic database files.
  • Under Type of analysis, select Co-occurrence and then All keywords (or Author keywords for a more focused map).
  • Set the minimum number of occurrences of a keyword.
  • VOSviewer will then present a list of terms meeting the threshold. This is a critical step for managing map complexity. Select all terms for a broad overview or a subset for a more specific analysis.

3. Map Initialization:

  • Click Finish to allow VOSviewer to calculate the network and generate an initial, unoptimized map.
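
To make the thresholding and counting in this protocol concrete, the sketch below reproduces the basic idea outside VOSviewer. It is a simplified illustration using full counting on invented keyword lists. The function name `cooccurrence_network` is hypothetical, and VOSviewer's own processing (including its normalization and optional fractional counting) is not replicated here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword lists, one per publication (not real data).
PUBLICATIONS = [
    ["microplastics", "bioaccumulation", "ecotoxicology"],
    ["microplastics", "ecotoxicology"],
    ["microplastics", "wastewater"],
    ["bioaccumulation", "ecotoxicology"],
]


def cooccurrence_network(publications, min_occurrences=2):
    """Return (occurrence counts, link strengths) for keywords meeting the threshold."""
    occurrences = Counter()
    for keywords in publications:
        occurrences.update(set(keywords))
    kept = {k for k, n in occurrences.items() if n >= min_occurrences}

    links = Counter()
    for keywords in publications:
        present = sorted(set(keywords) & kept)
        # Each unordered pair of kept keywords in one publication adds 1 to the link strength.
        links.update(combinations(present, 2))
    return {k: occurrences[k] for k in kept}, dict(links)


nodes, links = cooccurrence_network(PUBLICATIONS, min_occurrences=2)
print(nodes)
print(links)
```

Raising `min_occurrences` removes rare keywords (here, "wastewater") before any links are counted, which is why the threshold has such a strong effect on map complexity.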

Optimization of Map Layouts

The default map layout often requires refinement to improve its interpretability. VOSviewer positions items using the VOS ("visualization of similarities") mapping technique [20].

Layout Algorithms and Manual Adjustment

  • VOS Layout Technique: This is the core algorithm that positions items in the map. It minimizes the weighted sum of the squared Euclidean distances between all pairs of items, where the weights are the similarity strengths [20], subject to a constraint that keeps the average distance between items fixed so that the map cannot collapse into a single point. Items with higher similarity are therefore pulled closer together.
  • Manual Refinement: After the VOS algorithm generates a layout, you can manually adjust the position of individual nodes (items) to reduce clutter, resolve overlapping labels, or emphasize specific areas of the network. This is done by clicking and dragging nodes to new positions.
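
The layout objective described above can be evaluated directly for a candidate layout. The following sketch is illustrative rather than VOSviewer's implementation: it assumes the constraint takes the form "average pairwise distance equals 1", and the similarity values, coordinates, and function names are all hypothetical.

```python
from itertools import combinations
from math import dist


def normalize_layout(positions):
    """Rescale positions so the average pairwise distance equals 1."""
    pairs = list(combinations(positions, 2))
    mean_distance = sum(dist(positions[a], positions[b]) for a, b in pairs) / len(pairs)
    return {k: tuple(c / mean_distance for c in p) for k, p in positions.items()}


def vos_objective(positions, similarity):
    """Weighted sum of squared Euclidean distances over all item pairs."""
    total = 0.0
    for a, b in combinations(sorted(positions), 2):
        s = similarity.get((a, b), 0.0)
        total += s * dist(positions[a], positions[b]) ** 2
    return total


# Hypothetical similarities: A and B are strongly related, C is weakly related to both.
SIMILARITY = {("A", "B"): 5.0, ("A", "C"): 1.0, ("B", "C"): 1.0}

# Layout 1 places A next to B; layout 2 places A next to C instead.
good = normalize_layout({"A": (0.0, 0.0), "B": (0.2, 0.0), "C": (2.0, 0.0)})
bad = normalize_layout({"A": (0.0, 0.0), "B": (2.0, 0.0), "C": (0.2, 0.0)})

good_score = vos_objective(good, SIMILARITY)
bad_score = vos_objective(bad, SIMILARITY)
print(f"A near B: {good_score:.3f}   A near C: {bad_score:.3f}")
```

The layout that keeps the strongly related pair together scores lower, which is the behavior the optimization rewards. Manual repositioning of nodes changes this value, so it is best reserved for cosmetic adjustments.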

Protocol for Optimizing Visual Clarity and Interpretation

1. Cluster-Colored Layout:

  • Action: Use the Items -> Clusters -> Use colors for clusters option. This assigns a distinct, high-contrast color to each cluster, making it immediately visible which items belong together.
  • Rationale: This is the most effective way to visually separate the thematic groups identified by the clustering algorithm.

2. Label and Size Adjustment:

  • Action: In the Label menu, adjust the Size and Scale parameters. Increase the minimum label size to ensure readability of key items. Use the Max. number of labels setting to show labels only for the most important items, preventing a "hairball" appearance.
  • Rationale: Managing label density is crucial for a clean presentation. The size of the circle (node) can be scaled by a metric like citation count or occurrence frequency, providing an immediate visual cue to an item's importance.

3. Density Visualization:

  • Action: Switch to the Density view via the main toolbar. Adjust the Kernel width and Edge width in the Density menu to control the smoothness and prominence of the color layers.
  • Rationale: The density view provides a quick overview of the research landscape's core areas. Regions with many items appear in warmer colors (e.g., red, yellow), while sparse areas are cooler (e.g., blue, green). This is particularly useful for identifying the most densely researched topics at a glance [20].
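
The intuition behind the density view can be illustrated with a weighted Gaussian kernel. This is a sketch under stated assumptions, not VOSviewer's exact computation: the coordinates and weights are invented, `item_density` is a hypothetical helper, and `kernel_width` simply plays the role of a smoothing parameter.

```python
from math import dist, exp


def item_density(point, items, kernel_width=1.0):
    """Sum of weighted Gaussian kernels centred on each item.

    items: list of ((x, y), weight). A larger kernel_width gives a smoother map.
    """
    return sum(
        weight * exp(-((dist(point, xy) / kernel_width) ** 2))
        for xy, weight in items
    )


# Hypothetical layout: three heavy items close together, one light item far away.
ITEMS = [((0.0, 0.0), 10), ((0.1, 0.0), 8), ((0.0, 0.1), 6), ((2.0, 2.0), 1)]

core = item_density((0.05, 0.05), ITEMS, kernel_width=0.5)
edge = item_density((2.0, 2.0), ITEMS, kernel_width=0.5)
print(f"density near core: {core:.2f}, near isolated item: {edge:.2f}")
```

Points surrounded by many heavily weighted items receive high density values and would be drawn in the warmer colors, while an isolated item contributes only its own weight.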

The following diagram illustrates the logical workflow for optimizing a VOSviewer map, from data loading to final visualization.

Workflow: Load network data → Apply VOS layout algorithm → Apply cluster-colored layout → Adjust label size and density → Manual node positioning → Generate density visualization → Final optimized map

Advanced Clustering for Thematic Analysis

Clustering is the process of partitioning a network into distinct groups, or clusters, of closely related items. In VOSviewer, this is automated but can be guided by the user.

Cluster Resolution and Parameter Control

  • Resolution Parameter: VOSviewer allows you to control the level of detail in the clustering. A higher resolution parameter will typically result in a larger number of smaller, more specific clusters. A lower resolution will yield fewer, broader clusters.
  • Methodology: The clustering in VOSviewer is based on the same similarity matrix used for the layout. The software uses a smart local moving algorithm to efficiently find clusters that maximize a quality function.
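
The effect of the resolution parameter can be seen on a toy network. The sketch below uses the standard modularity function with a resolution term as a stand-in; VOSviewer's own quality function operates on normalized link strengths and is not reproduced here. The graph, the partitions, and the function name `quality` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy network: two triangles joined by a single edge.
EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]


def quality(edges, membership, resolution=1.0):
    """Modularity with a resolution parameter for an unweighted, undirected graph."""
    m = len(edges)
    internal = defaultdict(int)    # number of edges inside each cluster
    degree_sum = defaultdict(int)  # total degree of the nodes in each cluster
    for a, b in edges:
        degree_sum[membership[a]] += 1
        degree_sum[membership[b]] += 1
        if membership[a] == membership[b]:
            internal[membership[a]] += 1
    return sum(
        internal[c] / m - resolution * (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


two_clusters = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_cluster = {node: "A" for node in range(6)}

for gamma in (0.1, 1.0):
    print(
        f"resolution={gamma}: two clusters={quality(EDGES, two_clusters, gamma):.3f}, "
        f"one cluster={quality(EDGES, one_cluster, gamma):.3f}"
    )
```

At the low resolution the single broad cluster scores higher, while at resolution 1.0 the split into two triangles wins, mirroring the "fewer, broader" versus "more, smaller" behavior described above.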

Protocol for Overlay Visualization and Trend Analysis

Overlay visualization is a powerful feature for displaying the temporal evolution of research topics [20].

1. Data Preparation:

  • Ensure your bibliographic data includes publication year information.

2. Score Calculation:

  • In the Overlay menu, select the Score tab. Choose the Publication years attribute.
  • VOSviewer will calculate an average score for each item (e.g., the average publication year for the documents in which a keyword appears).

3. Visualization:

  • The map will be colored based on this average score. Typically, a color gradient is used, where one end (e.g., blue) represents older topics and the other end (e.g., yellow/red) represents more recent or emerging topics [20].
  • Interpretation: This allows for immediate identification of historical foundations (cooler colors) and research fronts (warmer colors) within the same map. For example, in a map of sustainable agriculture, "organic farming" might appear in blue, while "precision agriculture using AI" would appear in yellow.
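
The score behind such an overlay is a simple average, which can be checked by hand or with a short script. The example below is a sketch with invented records; `average_publication_year` is a hypothetical helper, not a VOSviewer function.

```python
from collections import defaultdict

# Hypothetical records: (publication year, keywords). Not real data.
RECORDS = [
    (2008, ["organic farming", "soil health"]),
    (2012, ["organic farming"]),
    (2021, ["precision agriculture", "soil health"]),
    (2023, ["precision agriculture"]),
]


def average_publication_year(records):
    """Mean publication year of the documents in which each keyword appears."""
    years = defaultdict(list)
    for year, keywords in records:
        for keyword in set(keywords):
            years[keyword].append(year)
    return {k: sum(v) / len(v) for k, v in years.items()}


scores = average_publication_year(RECORDS)
for keyword, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{keyword}: {score:.1f}")
```

Note that a keyword used steadily over many years (here, "soil health") lands in the middle of the color scale, so a mid-range color does not necessarily mean a topic peaked in the middle of the period.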

Table: Clustering Metrics and Their Impact on Map Interpretation

| Clustering Metric | Function | Impact on Thematic Analysis |
|---|---|---|
| Modularity | Measures the strength of division of a network into modules (clusters). | A high value indicates that the network has a strong community structure, validating the distinctness of the identified clusters. |
| Silhouette Score | Evaluates how similar an object is to its own cluster compared to other clusters. | A score close to 1 indicates items are well-matched to their own cluster and poorly-matched to neighboring clusters, confirming cluster cohesion and separation. |
| Number of Clusters | The total count of thematic groups identified by the algorithm. | A higher number requires more detailed interpretation but can reveal niche sub-fields. A lower number provides a high-level overview. |
| Average Publication Year (Overlay) | Colors items based on the temporal dimension of their activity [20]. | Directly reveals the evolution of the field, distinguishing between established core topics and emerging, trending research areas. |

The Scientist's Toolkit: Research Reagent Solutions

This section details the essential digital "reagents" and tools required for conducting a VOSviewer-based bibliometric analysis.

Table: Essential Tools for VOSviewer Bibliometric Analysis

| Tool / Resource | Function | Role in the Analytical Workflow |
|---|---|---|
| VOSviewer Software | The primary application for constructing, visualizing, and exploring bibliometric networks [20]. | The core engine for data processing, layout calculation, clustering, and visualization. |
| Web of Science (WOS) Core Collection | A leading scholarly citation database used as the data source for robust bibliometric studies [34]. | Provides the raw, structured bibliographic data (titles, authors, abstracts, citations, etc.) required for analysis. |
| CiteSpace Software | Another scientometric software tool often used in conjunction with VOSviewer for complementary analyses like burst detection [34]. | Used for analyzing emerging trends and pivotal points in the literature, enriching the interpretation of VOSviewer maps. |
| Microsoft Excel | A ubiquitous spreadsheet application. | Used for preliminary data cleaning, managing exported records, and creating basic charts (e.g., annual publication trends) [34]. |
| VOSviewer Online | The web-based version of VOSviewer [42] [20]. | Allows for easy sharing and interactive exploration of created maps with collaborators or a wider audience, enhancing research dissemination. |

The relationship between these tools in a standard bibliometric workflow is outlined below.

Workflow: Web of Science → (export data) → Excel → (import data) → VOSviewer; from VOSviewer, complementary analysis in CiteSpace and publishing and sharing through VOSviewer Online

VOSviewer is a freely available, Java-based software package specifically designed for constructing and visualizing bibliometric networks [43]. These networks can include relationships between a variety of scientific entities, such as journals, researchers, individual publications, and keywords, based on citation, co-citation, co-authorship, or co-occurrence data [44]. For researchers, scientists, and professionals in environmental science and drug development, bibliometric analysis has transitioned from a niche specialty to a fundamental preliminary research tool. It enables the mapping of vast scientific literatures to identify emerging trends, core publications, and collaborative networks. The software's user-friendly interface and its ability to create insightful, easy-to-interpret visualizations have contributed to its exponential growth in adoption, making it a critical component of the modern scientist's toolkit for literature discovery and research evaluation [43] [44].

Core Bibliometric Analysis Types and Applications

VOSviewer supports several types of analysis, each serving a distinct purpose in exploring the scholarly landscape. Understanding these types allows researchers to select the appropriate methodology for their specific research question.

Table 1: Core Bibliometric Analysis Types in VOSviewer

| Analysis Type | Description | Primary Application in Environmental Science |
|---|---|---|
| Term Co-occurrence [44] | Identifies key thematic areas and their interlinkages by analyzing how frequently terms (e.g., keywords) appear together in publications. | Mapping the conceptual structure of a field like "microplastic pollution" to identify connected sub-themes such as "toxicity," "wastewater treatment," and "marine ecosystems." |
| Co-citation Analysis [44] | Identifies influential journals, references, and authors. A co-citation link exists between two items that are both cited by the same document. | Finding the foundational papers and key theorists in "environmental impact assessment" by seeing which references are consistently cited together. |
| Bibliographic Coupling [44] | Identifies countries, institutions, or publications working on similar topics. A link exists between two items that both cite the same document. | Revealing which European and Chinese research institutes are leading parallel research streams in "solar cell technology" development. |

Co-occurrence analysis is particularly valuable as a preliminary tool. It generates a network in which node size reflects how often a term occurs (or, optionally, its total link strength), and link thickness indicates the strength of the connection between two terms. Frequently co-occurring terms form distinct clusters, with each cluster typically representing a specific thematic research area [44]. This allows researchers to quickly grasp the intellectual structure of a field. For example, an exploratory study used VOSviewer to analyze keywords from the most cited papers on "smart cities" and "sustainable cities," revealing that they occupy largely distinct citation spaces, thus challenging the assumption that "smart cities" are inherently "sustainable cities" [43].

Protocol: Conducting an Exploratory Bibliometric Study

This protocol outlines the steps for using VOSviewer to perform an exploratory analysis of a research domain, using the example of investigating the intersection between remote sensing and the Sustainable Development Goals (SDGs) [43].

Research Reagent Solutions

Table 2: Essential Materials and Software Tools

| Item Name | Function/Description | Source/Availability |
|---|---|---|
| Bibliographic Database | Source of publication and citation data. Provides metadata (titles, authors, abstracts, keywords, references) for analysis. | Scopus, Web of Science |
| VOSviewer Software | Java-based application for constructing, visualizing, and exploring bibliometric maps. | Freely available for download from https://www.vosviewer.com/ |
| Thesaurus File | A plain text file used to merge different variants of the same term (e.g., "color" and "colour," "SDG" and "Sustainable Development Goal"). | Created manually by the researcher based on knowledge of the field. |

Step-by-Step Methodology

  • Define Research Scope and Data Retrieval:

    • Formulate an exploratory hypothesis. Example: "The field of remote sensing is not centrally focused on the UN Sustainable Development Goals (SDGs)."
    • Select a data source (e.g., Scopus) and define search parameters. For a manageable, focused analysis, you may limit the data to a specific, high-impact journal (e.g., Remote Sensing) from its inception to the present [43].
    • Execute the search and download the complete records of the resulting publications. The required format is typically a RIS or CSV file containing the metadata.
  • Data Preparation and Thesaurus Creation:

    • Import the downloaded data file into VOSviewer.
    • Create a thesaurus file to ensure terminological consistency. This is critical for accurate term co-occurrence analysis. For the SDG example, the thesaurus would map terms like "SDG 6" and "Sustainable Development Goal 6" to a single unified term.
  • Perform Term Co-occurrence Analysis:

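    • In VOSviewer, select "Create" > "Create a map based on bibliographic data," then choose the option that matches your export: "Read data from bibliographic database files" for a Scopus or Web of Science export (such as a Scopus CSV file), or "Read data from reference manager files" for a RIS file.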
    • Upload your data file. For the analysis type, choose "Co-occurrence" and then "All keywords."
    • Set a minimum number of occurrences for a term (e.g., 5 or 10) to filter out insignificant terms and focus on the main themes.
    • VOSviewer will then calculate the network and present a list of terms meeting the threshold. Select all and proceed to generate the visualization.
  • Interpretation and Visualization:

    • The resulting map will display keywords as nodes, with clusters of closely related terms color-coded.
    • Analyze the map to see if keywords related to your topic of interest (e.g., "SDG," "sustainability") appear, and note their position, cluster affiliation, and link strength to other terms. Their presence in a small, peripheral cluster would support the hypothesis that the field is not highly focused on this topic [43].
    • Use VOSviewer's zoom and scroll functions to interact with the visualization and identify specific relationships.
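
The thesaurus file mentioned in the data preparation step can be generated from a small mapping rather than typed by hand. The sketch below assumes the two-column, tab-separated layout with a `label` and `replace by` header row described in the VOSviewer manual. The variant list is hypothetical and should be replaced with terms found in your own data.

```python
def build_thesaurus(mapping):
    """Return thesaurus text: a header row plus one tab-separated line per variant."""
    lines = ["label\treplace by"]
    for variant, preferred in sorted(mapping.items()):
        lines.append(f"{variant}\t{preferred}")
    return "\n".join(lines) + "\n"


# Hypothetical variants for the SDG example (illustrative only).
SDG_VARIANTS = {
    "sdg 6": "sustainable development goal 6",
    "sdgs": "sustainable development goals",
    "colour": "color",
}

text = build_thesaurus(SDG_VARIANTS)
print(text)

# To use it, save the text to a plain text file and select that file as the
# thesaurus during map creation, for example:
# with open("thesaurus_keywords.txt", "w", encoding="utf-8") as fh:
#     fh.write(text)
```

Keeping the mapping in a script also documents exactly which variants were merged, which helps when the analysis needs to be reproduced or extended.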

Workflow: Define research scope → Retrieve data from Scopus/WoS → Prepare thesaurus file → Import data into VOSviewer → Run co-occurrence analysis → Generate and explore network visualization → Interpret thematic clusters and links → Shape further research

VOSviewer Analysis Workflow

Advanced Application: Testing a Research Hypothesis

VOSviewer can be used not just for exploration but also as a preliminary tool to test a simple hypothesis using publication data as a surrogate for real-world phenomena [43]. The workflow below outlines this advanced application.

Workflow: Formulate hypothesis (e.g., smart city ≠ sustainable city) → Search for relevant publications → Extract and analyze keyword co-occurrences → Examine network map for cluster overlap → Distinct clusters support hypothesis → Proceed with deeper case study analysis

Hypothesis Testing with VOSviewer

Protocol:

  • Formulate the Hypothesis: Define a testable statement. The published example asked whether "smart cities" are inherently "sustainable cities" [43].
  • Data Collection: Search a bibliographic database (e.g., Web of Science) for the most cited papers on each concept (e.g., the 1000 most cited papers from 2015-2020 retrieved with the terms "smart" and "city," and with the terms "sustainable" and "city").
  • Create the Visualization: In VOSviewer, perform a keyword co-occurrence analysis on the combined dataset.
  • Analyze the Results: The resulting knowledge map acts as a surrogate for the actual intellectual alignment between the two concepts. In the published case, "smart cities" and "sustainable cities" appeared in largely separate citation spaces, with only a small, distinct cluster for "smart sustainable cities." This visual separation provided preliminary support for the hypothesis that the two concepts are not synonymous in research practice, thus validating the need for a more detailed follow-up study [43].
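
Visual separation can be complemented by a simple numerical check of how much two concepts share. The sketch below is one possible way to do this, not the method used in the cited study: it computes the Jaccard overlap between the sets of keywords linked to each concept. The keyword sets are invented for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0 = disjoint, 1 = identical)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


# Hypothetical keywords linked to each concept in a co-occurrence map.
SMART = {"iot", "big data", "ict", "governance", "energy efficiency"}
SUSTAINABLE = {"urban planning", "resilience", "governance", "energy efficiency", "green space"}

overlap = jaccard(SMART, SUSTAINABLE)
print(f"Shared neighbours: {sorted(SMART & SUSTAINABLE)}")
print(f"Jaccard overlap: {overlap:.2f}")
```

A low overlap is consistent with the two concepts occupying separate regions of the map, but, like the map itself, it is only preliminary evidence and does not replace the detailed follow-up study.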

VOSviewer serves as a powerful and accessible tool for researchers seeking to navigate and understand complex scientific landscapes. Its value lies in its ability to transform large volumes of bibliographic data into intuitive visual maps, facilitating exploratory research and preliminary hypothesis testing. For environmental scientists and drug development professionals, mastering VOSviewer enables a data-driven approach to literature review, gap analysis, and research planning. By following the detailed protocols and utilizing the structured workflows outlined in this article, researchers can systematically integrate bibliometric analysis into their research process, thereby shaping more informed and impactful scientific inquiries.

Ensuring Robust Results and Comparing Bibliometric Tools

Validating Bibliometric Findings with Domain Knowledge

Bibliometric analysis, facilitated by software like VOSviewer, provides powerful visualization of research landscapes through networks of citations, co-authorships, and term co-occurrences [6]. However, these computational outputs require rigorous validation through domain knowledge to ensure their scientific accuracy and practical relevance. Without proper validation, bibliometric findings risk representing statistical artifacts rather than genuine intellectual patterns. This protocol establishes comprehensive methodologies for validating VOSviewer-generated bibliometric maps within environmental science research, ensuring findings withstand scholarly scrutiny and provide meaningful insights for researchers, scientists, and drug development professionals.

The integration of quantitative bibliometric data with qualitative domain expertise creates a robust framework for interpreting complex research landscapes. This validation process is particularly crucial in environmental science, where research trends directly inform policy decisions and resource allocation. As demonstrated in a recent bibliometric analysis of environmental degradation research, which examined 1,365 publications, validation against domain knowledge helps ascertain whether frequently occurring terms like "economic growth" and "renewable energy" genuinely represent dominant research fronts rather than semantic artifacts [16].

Validation Framework and Principles

Core Validation Concepts

Table: Key Validation Concepts in Bibliometric Analysis

| Concept | Definition | Validation Approach |
|---|---|---|
| Semantic Validation | Ensuring cluster labels accurately represent underlying concepts | Expert evaluation of term consistency and contextual relevance |
| Structural Validation | Verifying network relationships reflect genuine intellectual connections | Comparison with established citation classics and review articles |
| Temporal Validation | Confirming observed trends align with historical developments | Longitudinal analysis against known scientific milestones |
| Methodological Validation | Assessing appropriateness of visualization parameters | Sensitivity analysis of clustering resolution and correlation thresholds |

Common Pitfalls in Bibliometric Interpretation

Bibliometric visualizations can mislead through several mechanisms. The rainbow color scheme previously used in VOSviewer, while visually appealing, could implicitly suggest non-existent categorical boundaries or obscure details in certain data ranges due to perceptual non-uniformity [5]. Version 1.6.7 replaced this with perceptually uniform color schemes like viridis, but interpretation challenges remain. Cluster boundaries may suggest discrete research areas when reality involves continuous intellectual gradients. Co-occurrence networks might reflect terminology preferences rather than conceptual relationships. Citation patterns can be influenced by disciplinary conventions rather than intellectual influence.

Data Presentation and Quantitative Validation Metrics

Validation Metrics for Bibliometric Clusters

Table: Quantitative Metrics for Cluster Validation

| Metric | Calculation Method | Validation Threshold | Interpretation Guide |
|---|---|---|---|
| Cluster Silhouette Score | Average distance between items in same cluster vs. other clusters | >0.5 indicates strong clustering | Scores <0.25 suggest weak or artificial groupings |
| Term Consistency Ratio | Ratio of domain-relevant to generic terms within cluster | >60% for validated clusters | High generic term ratio indicates potential false cluster |
| Temporal Coherence Index | Standard deviation of publication years within cluster | <3 years for emerging topics | High deviation may indicate thematically disparate items |
| Citation Density Balance | Ratio of internal to external citations | >1.0 for well-defined clusters | Low ratios suggest fragmented intellectual foundations |
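
Two of these metrics can be computed from exported map coordinates and publication years. The sketch below is a minimal, self-contained illustration with invented data. It assumes at least two clusters, uses Euclidean distance between map positions for the silhouette, and uses the population standard deviation for temporal coherence; neither choice is specified by the table, and the function names are hypothetical.

```python
from math import dist
from statistics import mean, pstdev


def silhouette_scores(points, labels):
    """Silhouette value per item from coordinates and cluster labels."""
    scores = []
    for i, (p, own) in enumerate(zip(points, labels)):
        by_cluster = {}
        for j, (q, label) in enumerate(zip(points, labels)):
            if i != j:
                by_cluster.setdefault(label, []).append(dist(p, q))
        if own not in by_cluster:  # singleton cluster: silhouette taken as 0
            scores.append(0.0)
            continue
        a = mean(by_cluster[own])  # mean distance to own cluster
        b = min(mean(d) for label, d in by_cluster.items() if label != own)
        scores.append((b - a) / max(a, b))
    return scores


def temporal_coherence(years):
    """Standard deviation of publication years within one cluster."""
    return pstdev(years)


# Hypothetical map coordinates and cluster labels (not real data).
POINTS = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (1.1, 1.0), (1.0, 1.1)]
LABELS = [1, 1, 1, 2, 2, 2]

avg_silhouette = mean(silhouette_scores(POINTS, LABELS))
print(f"average silhouette: {avg_silhouette:.2f}")
print(f"temporal coherence: {temporal_coherence([2019, 2020, 2021, 2022]):.2f} years")
```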

Environmental Science Application: Carbon Emissions Research

In the recent analysis of 1,365 environmental degradation publications, validation confirmed that "economic growth" represented a genuine research focus rather than a semantic artifact, evidenced by its central positioning across multiple visualization techniques and consistent co-occurrence with established theoretical frameworks like the Environmental Kuznets Curve [16]. The study demonstrated an annual publication growth rate exceeding 80%, with validated research fronts including renewable energy, urbanization drivers, and technological solutions – all confirming known domain priorities through quantitative measures.

Experimental Protocols for Validation

Protocol 1: Expert Validation of Cluster Labels

Purpose: To verify that VOSviewer-generated cluster labels accurately represent the intellectual content of publications within each cluster.

Materials: VOSviewer cluster visualization output, list of publications per cluster, domain expert panel (3-5 experts), standardized evaluation forms.

Methodology:

  • Export cluster composition from VOSviewer including publication titles, abstracts, and author-supplied keywords
  • Prepare expert evaluation forms with cluster labels and representative publications
  • Conduct blinded evaluation where experts assess label appropriateness without knowing VOSviewer-assigned labels
  • Calculate inter-rater reliability, using Cohen's kappa for a pair of raters or Fleiss' kappa for the full panel of 3-5 experts
  • Modify cluster labels based on expert consensus where discrepancy exceeds 30%

Validation Criteria: Cluster labels require revision when (1) more than 30% of publications are misclassified by expert judgment, (2) alternative labels are proposed by multiple experts independently, or (3) temporal inconsistencies exist where historical and contemporary works are improperly grouped.
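
Because the panel has three to five experts, a multi-rater agreement statistic is a natural companion to pairwise Cohen's kappa. The sketch below implements Fleiss' kappa from its textbook definition. The ratings table is invented and the two categories ("appropriate" and "inappropriate") are an assumption about how the evaluation form is coded.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a table of counts.

    ratings[i][j] = number of experts who assigned item i to category j;
    every row must sum to the same number of experts.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_categories = len(ratings[0])

    # Proportion of all assignments that fall in each category.
    p_j = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each item.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_bar = sum(p_i) / n_items
    p_expected = sum(p * p for p in p_j)
    return (p_bar - p_expected) / (1 - p_expected)


# Hypothetical ratings: 5 cluster labels, 4 experts, columns = [appropriate, inappropriate].
RATINGS = [[4, 0], [4, 0], [3, 1], [0, 4], [4, 0]]

kappa = fleiss_kappa(RATINGS)
print(f"Fleiss' kappa: {kappa:.3f}")
```

Values near 1 indicate strong agreement among the experts; low values suggest the evaluation criteria need clarifying before any cluster label is revised.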

Protocol 2: Temporal Trend Validation

Purpose: To ensure observed bibliometric trends align with documented historical developments in the research domain.

Materials: VOSviewer overlay visualization, historical timeline of domain milestones, reference literature reviews.

Methodology:

  • Generate a VOSviewer overlay visualization colored by the average publication year of each item
  • Identify apparent research trends through color gradient patterns
  • Compare with established historical milestones from literature reviews
  • Conduct citation analysis of seminal publications to verify influence patterns
  • Perform discontinuity analysis to identify potential artificial trends

Validation Criteria: Temporal trends are validated when (1) color gradients correlate with known scientific breakthroughs (>0.7 correlation), (2) seminal publications appear at expected timeline positions, and (3) trend discontinuities align with major policy changes or funding initiatives.

Protocol 3: Cross-Database Validation

Purpose: To verify that bibliometric patterns are consistent across different literature databases, reducing database-specific biases.

Materials: Parallel datasets from Scopus, Web of Science, and Dimensions; VOSviewer software; correlation analysis tools.

Methodology:

  • Conduct identical bibliometric searches across multiple databases
  • Apply consistent VOSviewer parameters to each dataset
  • Compare cluster structures using similarity indices
  • Analyze term co-occurrence patterns for consistency
  • Validate key author and journal influence across databases

Validation Criteria: Findings are considered database-independent when (1) cluster similarity indices exceed 0.75, (2) core author networks show >60% overlap, and (3) key term relationships remain stable across data sources.
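
The first two criteria can be checked with simple set comparisons. The protocol does not name a specific similarity index, so the sketch below assumes Jaccard similarity with best-match averaging for clusters and a share-of-smaller-set measure for author overlap. All cluster contents and author names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def cluster_similarity(clusters_a, clusters_b):
    """Average best-match Jaccard similarity of clusters in A against clusters in B."""
    best = [max(jaccard(ca, cb) for cb in clusters_b) for ca in clusters_a]
    return sum(best) / len(best)


def overlap_share(a, b):
    """Share of the smaller set that also appears in the other set."""
    a, b = set(a), set(b)
    return len(a & b) / min(len(a), len(b))


# Hypothetical keyword clusters from two databases (not real data).
SCOPUS_CLUSTERS = [{"microplastics", "toxicity", "marine"}, {"biochar", "soil"}]
WOS_CLUSTERS = [{"microplastics", "toxicity", "marine", "sediment"}, {"biochar", "soil"}]

# Hypothetical core author sets.
SCOPUS_AUTHORS = {"author a", "author b", "author c", "author d", "author e"}
WOS_AUTHORS = {"author a", "author b", "author c", "author d", "author f"}

similarity = cluster_similarity(SCOPUS_CLUSTERS, WOS_CLUSTERS)
author_overlap = overlap_share(SCOPUS_AUTHORS, WOS_AUTHORS)
print(f"cluster similarity: {similarity:.3f} (criterion: > 0.75)")
print(f"core author overlap: {author_overlap:.0%} (criterion: > 60%)")
```

Keywords should be harmonized across databases before this comparison, since differences in indexing vocabulary would otherwise lower the similarity for reasons unrelated to the research landscape.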

Visualization of Validation Workflows

Workflow: VOSviewer analysis complete → four parallel checks (structural validation: cluster coherence analysis; semantic validation: expert term assessment; temporal validation: historical trend alignment; cross-database validation: pattern consistency) → Integrate validation results → Quantitative metrics assessment → Validated bibliometric findings

Bibliometric Validation Workflow

Decision flow: VOSviewer cluster output → Internal validation (metrics calculation) → Silhouette score > 0.5? → Temporal coherence < 3 years? → External validation (domain expert review) → Term relevance > 60%? → Expert consensus > 70%? → Cluster validated. A "No" at any check means the cluster requires revision.

Cluster Validation Decision Protocol
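
The decision protocol can be written as a short function that applies the four checks in order. The thresholds are taken from the protocol. The function name, argument names, and the encoding of percentages as fractions are illustrative assumptions.

```python
def validate_cluster(silhouette, temporal_sd_years, term_relevance, expert_consensus):
    """Apply the four checks in order and return (validated, message)."""
    checks = [
        (silhouette > 0.5, "silhouette score not above 0.5"),
        (temporal_sd_years < 3, "temporal coherence not below 3 years"),
        (term_relevance > 0.60, "term relevance not above 60%"),
        (expert_consensus > 0.70, "expert consensus not above 70%"),
    ]
    for passed, reason in checks:
        if not passed:
            # The first failed check sends the cluster back for revision.
            return False, f"revise: {reason}"
    return True, "validated"


# Hypothetical metric values for two clusters.
print(validate_cluster(0.62, 1.8, 0.75, 0.80))
print(validate_cluster(0.62, 4.5, 0.75, 0.80))
```

Returning the reason for the first failed check makes it easier to record why a cluster was revised, in line with the documentation emphasis elsewhere in this guide.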

The Scientist's Toolkit: Research Reagent Solutions

Table: Essential Validation Tools for Bibliometric Research

Tool/Resource | Function | Application in Validation
VOSviewer Software | Constructing and visualizing bibliometric networks [6] | Primary tool for generating co-occurrence, citation, and co-authorship maps requiring validation
Perceptually Uniform Color Schemes | Viridis, magma, and other accessible palettes [5] | Ensure visualizations are interpretable by all users, including those with color vision deficiencies
Domain Expert Panels | Qualitative assessment of cluster integrity and label accuracy | Provide ground truth evaluation of bibliometric patterns against domain knowledge
Multiple Literature Databases | Scopus, Web of Science, Dimensions, OpenAlex [6] | Cross-database verification to eliminate platform-specific biases in literature coverage
Historical Milestone Databases | Curated timelines of scientific breakthroughs | Temporal validation of observed research trends and emergence patterns
Color Vision Deficiency Simulators | Tools like Coblis, Adobe Color [45] | Testing visualization accessibility for various color blindness types (deuteranopia, protanopia, tritanopia)
Quantitative Metric Suites | Silhouette scores, consistency ratios, coherence indices | Numerical validation of cluster quality and pattern robustness
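Of the quantitative metrics listed, the silhouette score is the easiest to compute by hand. The sketch below assumes that a pairwise distance matrix between items has been prepared separately (for example from map coordinates), since nothing above states that VOSviewer exports one; the function name and input layout are illustrative. It needs at least two clusters.

```python
def mean_silhouette(dist, labels):
    """Mean silhouette over all items.

    dist   -- symmetric matrix (list of lists) of pairwise distances
    labels -- cluster label of each item, in the same order as dist
    """
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:                      # singleton cluster: score taken as 0
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the item's own cluster
        a = sum(dist[i][j] for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Values close to 1 indicate compact, well-separated clusters; the 0.5 cut-off used in the decision protocol is a convention, not a property of the metric.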

Application in Environmental Science Research

The application of these validation protocols is particularly crucial in environmental science, where research trends directly influence policy and funding decisions. The bibliometric analysis of environmental degradation research exemplifies proper validation: apparent trends, such as the dominance of economic growth studies, were confirmed through multiple validation protocols [16]. The researchers cross-referenced VOSviewer outputs with domain knowledge, confirming that China, Pakistan, and Turkey genuinely led research output and that this pattern was not an artifact of database bias.

Environmental science bibliometrics must also validate emerging trends against known scientific developments. For instance, the rapid growth of renewable energy literature should align with technological advancements and policy initiatives. The validation protocols ensure that such patterns reflect genuine intellectual shifts rather than terminological changes or database artifacts. This is especially important when identifying research gaps, such as the under-exploration of AI and metaverse applications in environmental degradation research identified in the recent study [16].

Validating bibliometric findings with domain knowledge transforms VOSviewer visualizations from suggestive patterns into reliable research tools. Through structured protocols encompassing semantic, structural, temporal, and methodological validation, researchers can confidently interpret bibliometric maps and derive meaningful insights. The integration of quantitative metrics with qualitative expert assessment creates a robust framework for validation, particularly crucial in applied fields like environmental science where research trends inform real-world decisions. As bibliometric methodology evolves, continuing to strengthen these validation practices ensures the growing influence of bibliometrics remains grounded in scholarly rigor.

Bibliometric analysis has become an indispensable methodology for mapping the intellectual structure and evolutionary trends within scientific domains, particularly in environmentally focused research where understanding global collaboration and emerging topics is crucial [46]. The evolution of bibliometric analysis from its origins in library science to its current role as a cornerstone of research evaluation reflects the broader transformation of academic research in the digital age [46]. This expansion has been facilitated by specialized software tools that enable sophisticated data visualization and analysis, with VOSviewer, Bibliometrix, and CiteSpace emerging as three prominent solutions. Each tool offers unique capabilities for performance analysis and science mapping, the two key dimensions of bibliometric analysis: performance analysis focuses on productivity and impact, while science mapping focuses on the conceptual, intellectual, and social structure of research domains [46].

Within environmental science research, where interdisciplinary collaboration and rapidly evolving research fronts are commonplace, selecting the appropriate bibliometric tool is critical for generating meaningful insights. This application note provides a systematic comparison of these three software tools, evaluating their respective strengths and weaknesses specifically for applications in environmental research contexts such as climate change adaptation [47], environmental degradation [16], ecological impacts of energy systems [48], and ESG performance [49]. By presenting structured comparisons, experimental protocols, and practical workflows, this analysis equips researchers with the knowledge to select and implement the most appropriate tool for their specific research objectives.

Tool Origins and Technical Specifications

VOSviewer (Visualization of Similarities viewer) was developed by Nees Jan van Eck and Ludo Waltman at Leiden University's Centre for Science and Technology Studies [14]. Its technical foundations are detailed in numerous publications, with the first technical paper appearing in 2007 [14]. The software implements the VOS (Visualization of Similarities) mapping technique and a smart local moving algorithm for large-scale modularity-based community detection [14]. VOSviewer is designed specifically for constructing, visualizing, and exploring bibliometric maps based on network data from bibliographic databases.

CiteSpace, developed by Chaomei Chen, is a Java-based application specializing in visual exploratory analysis of emerging trends and transient patterns in scientific literature [46]. It employs algorithms for detecting burst terms and betweenness centrality to identify pivotal points in research networks. CiteSpace is particularly noted for its temporal analysis capabilities, enabling researchers to track the evolution of research fields over discrete time periods.

Bibliometrix is an open-source R package complemented by a web-based interface called Biblioshiny. It offers a comprehensive suite of tools for quantitative research in bibliometrics and scientometrics [46]. Unlike the other tools, Bibliometrix leverages the statistical capabilities of the R environment, allowing for advanced statistical analysis and customization through programming.

Table 1: Fundamental Characteristics of Bibliometric Software Tools

Characteristic | VOSviewer | Bibliometrix | CiteSpace
Primary Developer | Van Eck & Waltman (Leiden University) | Massimo Aria & Corrado Cuccurullo | Chaomei Chen
Initial Release | 2007-2009 [14] | 2017 | 2004
Programming Language | Java | R (Biblioshiny web interface) | Java
Software Type | Standalone desktop application | R package with web interface | Standalone desktop application
License Model | Freeware | Open-source (R package) | Freeware for non-commercial use
Data Integration | Supports multiple data formats | Extensive R integration | Java-based framework
System Requirements | Java Runtime Environment | R environment | Java Runtime Environment

Analytical Capabilities and Visualization Strengths

Each tool offers distinct analytical capabilities that determine its suitability for specific research questions in environmental science. VOSviewer excels in creating clear, intuitive visualizations of co-occurrence networks, with special optimization for keyword co-occurrence analysis and citation networks [14] [50]. Its visualization approach emphasizes the clarity of network maps through the VOS clustering technique, which is particularly valuable for identifying major research themes in environmental domains such as "economic growth, renewable energy, and the Environmental Kuznets Curve" [16]. The software's accessibility and responsive interface make it suitable for researchers without programming expertise [16].

Bibliometrix provides the most comprehensive statistical analysis capabilities among the three tools, leveraging the full power of the R environment [46]. It supports the entire bibliometric analysis workflow from data import to visualization, with particular strength in performance analysis including author productivity, source impact, and country-level contributions. This makes it valuable for environmental studies requiring detailed statistical assessment of research output, such as analyzing global contributions to climate change adaptation research [47].

CiteSpace specializes in temporal analysis of research frontiers and emerging trends. Its unique strength lies in detecting burst terms and visualizing the evolution of research fields through time-sliced networks [46]. This capability is particularly useful for tracking the development of fast-evolving environmental topics like "wind-PHS coupling and life-cycle assessment" in energy storage research [48]. CiteSpace also provides advanced metrics like betweenness centrality for identifying pivotal points in research networks.

Table 2: Comparative Analysis of Core Functionalities

Functionality | VOSviewer | Bibliometrix | CiteSpace
Network Types | Co-authorship, co-citation, co-occurrence, bibliographic coupling [14] | Comprehensive including coupling, co-citation, collaboration networks | Co-citation, collaboration, co-occurrence, thematic evolution
Visualization Features | Cluster-based maps, density views, overlay visualizations [14] | Various plot types including thematic maps, factorial analysis | Time-sliced networks, burst detection, betweenness centrality
Data Compatibility | Scopus, WoS, PubMed, RIS | Scopus, WoS, Dimensions, PubMed | WoS, Scopus, Dimensions, PubMed
Learning Curve | Gentle | Moderate to steep (depending on R proficiency) | Steep
Customization Options | Moderate through GUI | High through R programming | Moderate through GUI
Collaboration Analysis | Author, organization, country | Comprehensive collaboration networks | Author, institution, country
Thematic Evolution | Limited | Thematic evolution, factorial analysis | Specialized timeline and timezone views

Experimental Protocols for Environmental Science Research

Data Collection and Preprocessing Workflow

A standardized data collection protocol is essential for rigorous bibliometric analysis across all three tools. For environmental research applications, the following protocol ensures comprehensive data retrieval:

  • Database Selection: Identify primary data sources—typically Scopus and Web of Science (WoS)—based on coverage of environmental literature [16] [47]. Scopus often provides broader coverage of environmental journals, while WoS offers more selective indexing.

  • Search Query Development: Formulate targeted search strings using Boolean operators. For example:

    • Environmental degradation study: "determinants or factor", "carbon emission or CO2" and "environmental degradation" [16]
    • Climate adaptation study: "Climate Change" AND "Institutions" OR "Agriculture" [47]
  • Time Frame Specification: Define appropriate temporal boundaries based on research objectives. Many environmental studies cover decades to capture evolution of the field [16], while others may focus on specific periods of high activity (e.g., 2014-2024 for climate adaptation research [47]).

  • Data Export: Export full bibliographic records in the appropriate format for each tool:

    • VOSviewer: Plain text files from WoS or CSV from Scopus [14]
    • Bibliometrix: Compatible with multiple formats including BibTeX and CSV
    • CiteSpace: WoS plain text format preferred
  • Data Cleaning: Implement standardization procedures for author names, affiliations, and keywords using the preprocessing capabilities of each tool or external scripting.
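Search strings that mix AND and OR are easier to reproduce when each concept is grouped explicitly. The sketch below assembles such a string from lists of alternative terms. The helper names are hypothetical, the Scopus-style `TITLE-ABS-KEY` field code is used only as an example, and the grouping shown is one possible reading of the environmental degradation keywords, not the exact query used in [16].

```python
def or_group(terms):
    """Quote each phrase and join the alternatives with OR in parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(concept_groups, field="TITLE-ABS-KEY"):
    """AND together one OR-group per concept and wrap it in a field code."""
    body = " AND ".join(or_group(group) for group in concept_groups)
    return f"{field}({body})"


# Illustrative grouping of the keywords mentioned above.
query = build_query([
    ["determinants", "factor"],
    ["carbon emission", "CO2"],
    ["environmental degradation"],
])
print(query)
```

Keeping the concept groups as data rather than as a hand-typed string also makes it simpler to report the exact query in a methods section.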

Figure 1: Bibliometric Data Collection and Analysis Workflow. Define research scope -> select databases (Scopus/WoS) -> develop search query -> export records -> clean data -> import to tool -> perform analysis -> visualize results.

Tool-Specific Implementation Protocols

VOSviewer Implementation for Environmental Research

For analyzing environmental research networks in VOSviewer:

  • Data Import: Use "Create" function to build maps from bibliographic data, selecting the appropriate map type (co-authorship, co-occurrence, citation, or bibliographic coupling) [14].

  • Co-occurrence Analysis: For identifying research themes in environmental science:

    • Select "co-occurrence" analysis type with "author keywords" as the unit
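    • Choose the counting method: full counting (the default) or fractional counting. Binary counting is offered only for term maps built from title and abstract text, where it prevents a term repeated within one document from being over-weighted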
    • Set minimum number of occurrences threshold (typically 5-10 depending on dataset size) [16]
  • Network Visualization and Interpretation:

    • Apply clustering to identify distinct research themes
    • Use overlay visualization to track temporal trends (e.g., rising interest in "renewable energy" and "Environmental Kuznets Curve") [16]
    • Adjust layout parameters for optimal clarity of network structure
  • Citation Analysis: Employ citation-based networks to identify foundational papers and emerging highly-cited works in environmental research.
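The co-occurrence step above can be approximated outside the tool to check how a threshold will behave before building a map. This is a rough sketch, not VOSviewer's own implementation: it lowercases keywords, counts each keyword once per document, drops keywords below the minimum number of occurrences, and counts one link per document for every remaining keyword pair. The input layout (one list of author keywords per record) is an assumption.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(records, min_occurrences=5):
    """Return (keyword occurrences, link strengths) for keyword lists."""
    cleaned = [{k.strip().lower() for k in kws} for kws in records]

    # Number of documents in which each keyword appears.
    occurrences = Counter()
    for kws in cleaned:
        occurrences.update(kws)

    kept = {k for k, n in occurrences.items() if n >= min_occurrences}

    # Each document adds 1 to the link between every pair it contains.
    links = Counter()
    for kws in cleaned:
        links.update(combinations(sorted(kws & kept), 2))
    return occurrences, links
```

Sorting each pair keeps ("economic growth", "renewable energy") and its reverse from being counted as two different links.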

Bibliometrix Implementation for Comprehensive Performance Analysis

For environmental research evaluation using Bibliometrix:

  • Data Loading: Use the Biblioshiny web interface or R commands to import and convert bibliographic data.

  • Performance Analysis:

    • Conduct source analysis to identify core environmental journals
    • Perform author analysis to identify leading contributors and collaboration patterns
    • Execute country analysis to map global research contributions in environmental science [47]
  • Science Mapping:

    • Implement conceptual structure analysis through multiple correspondence analysis
    • Create thematic maps to identify motor, basic, emerging, and declining themes
    • Conduct collaboration network analysis to visualize international partnerships
  • Statistical Reporting: Generate comprehensive summary statistics describing the literature dataset.

CiteSpace Implementation for Temporal Pattern Detection

For analyzing evolution of environmental research fronts using CiteSpace:

  • Project Setup: Configure time slicing parameters to divide the dataset into sequential periods (typically 1-3 year slices).

  • Burst Detection: Apply Kleinberg's algorithm to identify suddenly popular topics (e.g., emerging environmental concepts like "ESG performance" or "pumped hydro storage") [49] [48].

  • Betweenness Centrality Calculation: Identify pivotal papers that connect different research clusters in environmental science.

  • Timeline and Timezone Visualization: Generate temporal views showing the emergence, evolution, and decline of research themes.
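Time slicing, the first step above, is simple to illustrate. The sketch below groups records into consecutive slices of a fixed length. It is not CiteSpace's code, and the record layout (dictionaries with a `year` field) is assumed for the example.

```python
from collections import defaultdict


def time_slices(records, start, end, slice_len=2):
    """Group records into consecutive slices of slice_len years.

    Records outside [start, end] are ignored; the last slice is
    truncated at `end` if the range does not divide evenly.
    """
    slices = defaultdict(list)
    for rec in records:
        year = rec["year"]
        if start <= year <= end:
            first = start + ((year - start) // slice_len) * slice_len
            last = min(first + slice_len - 1, end)
            slices[(first, last)].append(rec)
    return dict(sorted(slices.items()))
```

Counting how often a keyword appears in each slice gives a first impression of rising or fading topics before a formal burst detection is run.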

Application in Environmental Science Research

Case Study: Environmental Degradation Research

A recent bibliometric analysis of environmental degradation research exemplifies VOSviewer's application, analyzing 1365 papers to identify key trends and patterns [16]. The analysis revealed:

  • Major Research Themes: Economic growth, renewable energy, and Environmental Kuznets Curve as dominant themes
  • Geographical Focus: China, Pakistan, and Turkey as leading contributors to the literature
  • Methodological Approach: Co-occurrence analysis of keywords identified interconnected research clusters
  • Visualization Output: Network maps illustrated relationships between determinants of environmental degradation

This study demonstrated VOSviewer's strength in creating intuitive visualizations that "provide a strategic roadmap for future research" in environmental science [16].

Case Study: Climate Change Adaptation Governance

Research on institutional dynamics in climate change adaptation employed bibliometric analysis to identify research patterns and geographical distributions [47]. The study revealed:

  • Geographical Concentration: Research concentration in Western countries and parts of Africa, with significant gaps in South Asia
  • Collaboration Patterns: Limited cross-regional collaboration despite global nature of climate challenges
  • Methodological Approach: Combined bibliometric analysis with systematic review following PRISMA guidelines

Figure 2: Tool Selection Guide for Environmental Research Questions. Starting from an environmental research question: VOSviewer (clear network visualization and clustering) for theme identification in environmental degradation [16]; Bibliometrix (comprehensive statistical analysis) for collaboration analysis in climate adaptation [47]; CiteSpace (temporal patterns and emerging trends) for trend analysis in energy storage research [48].

Comparative Performance in Environmental Research Domains

Table 3: Tool Performance in Specific Environmental Research Applications

Research Application | VOSviewer | Bibliometrix | CiteSpace
Climate Change Adaptation | Effective for mapping research themes | Superior for analyzing geographical contributions and collaboration | Strong for tracking evolution of adaptation strategies
Environmental Degradation | Excellent for determinant identification [16] | Comprehensive for statistical trends | Effective for detecting emerging determinants
Energy Storage Research | Good for technology relationship mapping | Strong for publication output analysis | Superior for tracking technology evolution [48]
ESG Performance Studies | Effective for conceptual structure | Comprehensive for interdisciplinary analysis | Strong for identifying emerging ESG topics [49]
Pollution and Carbon Emissions | Optimal for co-occurrence network visualization | Excellent for temporal production analysis | Effective for identifying research fronts

Research Reagent Solutions: Essential Materials for Bibliometric Analysis

Table 4: Essential Research Reagents for Bibliometric Analysis in Environmental Science

Research Reagent | Function | Example Sources/Tools
Bibliographic Databases | Source data for analysis | Scopus, Web of Science, Dimensions
Data Extraction Tools | Export and format bibliographic data | Scopus Export, WOS Export Utilities
Reference Managers | Organize and preprocess references | Zotero, Mendeley, EndNote
Statistical Software | Complementary statistical analysis | R, Python, SPSS
Text Mining Tools | Enhance keyword processing | Natural Language Processing libraries
Visualization Platforms | Supplementary visualization | Gephi, Tableau, Microsoft Power BI

Integrated Workflow for Comprehensive Environmental Bibliometrics

For researchers requiring comprehensive analysis, an integrated workflow leveraging multiple tools provides the most robust approach:

  • Data Collection and Preparation: Use Bibliometrix for initial data importing and cleaning due to its flexible data handling capabilities.

  • Performance Analysis: Employ Bibliometrix for comprehensive productivity and impact assessment of countries, institutions, authors, and journals.

  • Science Mapping: Utilize VOSviewer for clear, interpretable network visualizations of research themes and intellectual structure.

  • Temporal Analysis: Implement CiteSpace for detecting emerging trends and visualizing the evolution of research fronts.

  • Results Integration: Synthesize findings from all tools to develop a complete picture of the research landscape.

This integrated approach compensates for the limitations of individual tools while leveraging their respective strengths, ultimately producing more rigorous and insightful bibliometric assessment of environmental research domains.

VOSviewer, Bibliometrix, and CiteSpace each offer unique value propositions for bibliometric analysis in environmental science research. VOSviewer excels in creating accessible, interpretable network visualizations with particular strength in co-occurrence analysis. Bibliometrix provides the most comprehensive statistical toolkit with seamless integration into the R ecosystem. CiteSpace offers unparalleled capabilities for temporal analysis and detection of emerging research fronts.

Tool selection should be guided by research objectives: VOSviewer for intuitive visualization and clustering analysis, Bibliometrix for comprehensive performance assessment and statistical analysis, and CiteSpace for investigating temporal patterns and emerging trends. For complex environmental research questions, an integrated approach leveraging the complementary strengths of all three tools often yields the most robust and actionable insights for researchers, policymakers, and environmental professionals seeking to understand the evolving landscape of sustainability science.

In the field of environmental science research, bibliometric analysis has become an indispensable technique for mapping the intellectual structure and evolution of scholarly fields. The reliability and comprehensiveness of such analyses are fundamentally dependent on the quality and scope of the underlying bibliographic data. Proprietary databases like Scopus and emerging open sources like OpenAlex each present unique advantages and limitations in coverage, particularly across different geographic and disciplinary domains [51]. Cross-verification using multiple data sources mitigates the inherent biases of any single database, ensuring a more robust and reproducible analysis. This protocol provides a detailed framework for integrating and validating data from Scopus, Dimensions, and OpenAlex specifically for bibliometric studies in environmental science using VOSviewer software.

The Scientist's Toolkit: Research Reagent Solutions

Table 1: Essential Materials and Software for Bibliometric Cross-Verification

Item Name | Function/Application | Specification Notes
Scopus API | Retrieving structured bibliographic data from Elsevier's Scopus database. | Requires institutional subscription; offers comprehensive coverage of peer-reviewed literature [52].
OpenAlex API | Accessing open scholarly metadata; a continuation of Microsoft Academic Graph [51]. | Freely available; permits reproducible bibliometrics without licensing barriers [52] [51].
VOSviewer | Creating and visualizing bibliometric maps based on network data. | Specialized functions for collaboration, topic, and citation analysis [12].
Bibliographic Data Parser | Cleaning and standardizing records from different sources (e.g., authors, institutions). | Custom scripts in Python, R, or SQL are often necessary for data harmonization [52].
Reference Matching Script | Identifying and linking duplicate publications across databases. | Crucial for creating a unified, non-redundant dataset for analysis.

A critical first step in cross-verification is understanding the core characteristics and comparative performance of the available databases. The following table summarizes key metrics based on recent large-scale studies.

Table 2: Data Source Characteristics for Bibliometric Analysis in Environmental Science

Characteristic | Scopus | OpenAlex | Dimensions
Provider & Business Model | Elsevier (Proprietary) | OurResearch (Open Access) | Digital Science (Proprietary)
Coverage (General) | Extensive, selective | Very extensive, inclusive | Extensive
Internal Reference Coverage | High | Comparable to Scopus and WoS on shared publications [51] | Information Missing
Metadata Completeness | High | Mixed (e.g., more ORCIDs, fewer abstracts) [51] | Information Missing
Primary Application | Disciplined, traditional bibliometrics | Reproducible, large-scale bibliometrics [51] | Information Missing
Key Strength | Well-curated metadata | Permissive licensing for open research | Information Missing
Key Limitation | Licensing cost and restrictions | Rapidly evolving and changing data [51] | Information Missing

Recent analyses indicate that when restricted to a cleaned dataset of recent publications shared across Scopus, Web of Science, and OpenAlex, OpenAlex demonstrates average source reference numbers and internal coverage rates that are comparable to both Web of Science and Scopus [51]. This makes it a viable open alternative for many bibliometric applications. However, its metadata coverage is mixed: it captures more ORCID identifiers but fewer abstracts than its proprietary counterparts [51].

Experimental Protocols

Protocol 1: Data Retrieval and Harmonization

This protocol outlines the steps for gathering data from multiple APIs and standardizing it for analysis.

I. Research Question Formulation

  • Define clear research questions to guide search strategy, data collection, and analysis. For example: "What is the thematic evolution of climate change adaptation research from 2015 to 2025?"

II. Search Strategy Development

  • Develop a comprehensive search string using keywords, Boolean operators, and field codes (e.g., TITLE-ABS-KEY).
  • Test and refine the search string for recall and precision.
  • Apply the identical search string across all data sources (Scopus, OpenAlex, Dimensions) to ensure comparability.

III. Data Retrieval via APIs

  • Use the respective official APIs to execute the search and retrieve data.
  • For Scopus and OpenAlex, follow structured code procedures as demonstrated in the literature [52].
  • Key data fields to retrieve for each publication include: title, authors, affiliations, year, source, abstract, keywords, citation count, and references.
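As an illustration of the retrieval step, the sketch below only builds a request URL for the OpenAlex works endpoint; it does not send it. The parameter names (`search`, `filter`, `per-page`, `cursor`, `mailto`) and the date filters reflect the public OpenAlex documentation as understood at the time of writing and should be checked against the current version before use. The function name is hypothetical, and Scopus and Dimensions require their own authenticated clients, which are not shown.

```python
from urllib.parse import urlencode

OPENALEX_WORKS = "https://api.openalex.org/works"


def openalex_url(search, year_from, year_to, cursor="*",
                 per_page=200, mailto=None):
    """Build (but do not send) a works query for one page of results.

    Assumed parameters: full-text style `search`, a date-range `filter`,
    cursor paging (start with "*"), and an optional contact address.
    """
    params = {
        "search": search,
        "filter": (f"from_publication_date:{year_from}-01-01,"
                   f"to_publication_date:{year_to}-12-31"),
        "per-page": per_page,
        "cursor": cursor,
    }
    if mailto:
        params["mailto"] = mailto
    return f"{OPENALEX_WORKS}?{urlencode(params)}"


url = openalex_url("climate change adaptation", 2015, 2025)
```

Building the URL separately from sending it makes the exact query easy to log alongside the retrieval date, which matters because open databases change over time [51].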

IV. Data Cleaning and Harmonization

  • Parse and clean author names and affiliations to address variations and inconsistencies.
  • Standardize document types (e.g., map "article" to "journal article").
  • Extract and standardize keywords, creating a unified list of terms.
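The standardization steps above can be sketched with two small lookup tables. Both mappings below are hypothetical examples and would need to be built for the dataset at hand; the only mapping taken from the text is "article" to "journal article".

```python
import re

# Hypothetical mappings; extend them for the dataset being harmonized.
DOC_TYPE_MAP = {
    "article": "journal article",
    "journal-article": "journal article",
}
KEYWORD_SYNONYMS = {
    "co2 emissions": "carbon emissions",
    "carbon dioxide emissions": "carbon emissions",
    "ekc": "environmental kuznets curve",
}


def standardize_doc_type(raw):
    """Lowercase, trim, and map a document type to its unified label."""
    key = raw.strip().lower()
    return DOC_TYPE_MAP.get(key, key)


def standardize_keyword(raw):
    """Lowercase, collapse whitespace, and merge known synonyms."""
    key = re.sub(r"\s+", " ", raw.strip().lower())
    return KEYWORD_SYNONYMS.get(key, key)
```

Unknown values pass through unchanged (only lowercased and trimmed), so the tables can be grown incrementally as new variants are spotted.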

V. Data Integration and De-duplication

  • Merge records from different sources using unique identifiers like DOI.
  • Identify and remove duplicate publications based on DOI, title, and author matching.
  • Resolve conflicts between metadata from different sources (e.g., by prioritizing the record from the most authoritative source for that field).
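A minimal sketch of the integration step follows, assuming each record is a dictionary with at least a `title` and optionally `doi` and `year`. Records are matched first on normalized DOI and then on a normalized title plus year; conflicts are resolved by letting earlier sources in the list win, which is a simplification of the per-field prioritization described above. All function and field names are illustrative.

```python
import re


def normalize_doi(doi):
    """Lowercase a DOI and strip common URL or 'doi:' prefixes."""
    if not doi:
        return None
    doi = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "",
                 doi.strip().lower())
    return doi or None


def title_key(title, year):
    """Punctuation- and case-insensitive title key, paired with the year."""
    return (re.sub(r"[^a-z0-9]+", " ", title.lower()).strip(), year)


def merge_records(sources):
    """Merge (source_name, records) pairs given in priority order."""
    merged, by_doi, by_title = [], {}, {}
    for source_name, records in sources:
        for rec in records:
            doi = normalize_doi(rec.get("doi"))
            tkey = title_key(rec["title"], rec.get("year"))
            target = by_doi.get(doi) if doi else None
            if target is None:
                target = by_title.get(tkey)
            if target is None:
                target = {"sources": []}
                merged.append(target)
            # Fill only fields that are still empty: earlier sources win.
            for field, value in rec.items():
                if field != "doi" and value and not target.get(field):
                    target[field] = value
            if doi:
                target.setdefault("doi", doi)
                by_doi.setdefault(doi, target)
            by_title.setdefault(tkey, target)
            target["sources"].append(source_name)
    return merged
```

Title matching is deliberately strict here; fuzzy matching would catch more duplicates but also risks merging distinct papers, so it is better added only after inspecting the unmatched records.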

VI. Final Dataset Creation

  • Export the unified, cleaned dataset in a format compatible with VOSviewer (e.g., CSV or RIS).

Data Retrieval and Harmonization Workflow (diagram): define research question -> develop search string -> retrieve data via APIs -> clean and harmonize metadata -> integrate and de-duplicate -> export to VOSviewer.

Protocol 2: Cross-Verification and Validation

This protocol describes methods to validate the coverage and consistency of the retrieved dataset.

I. Coverage Analysis

  • Calculate the total number of unique publications retrieved from each source and their overlaps.
  • Generate a Venn diagram to visualize the overlap and unique contributions of each database.

II. Benchmarking with a Gold Standard

  • Compile a known set of key publications and authors in the target research field from expert-nominated sources or seminal review articles.
  • Check the presence of these benchmark items in the merged dataset to calculate a recall rate for each source and the combined set.

III. Consistency Checks

  • Reference coverage analysis: For a sample of publications, compare the number of references provided by each data source [51].
  • Metadata consistency: For a random sample of records common to all sources, compare key metadata fields like author count, affiliation string, and citation counts to identify discrepancies.
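The coverage and benchmarking steps reduce to set arithmetic once every publication has a stable identifier. The sketch below assumes normalized DOIs (or similar keys) per source and a non-empty benchmark set; the function name and the structure of the report are illustrative.

```python
def coverage_report(ids_by_source, benchmark):
    """Per-source totals, records found in no other source, and recall
    against a benchmark set of known key publications."""
    report = {}
    for name, ids in ids_by_source.items():
        others = set().union(
            *(v for k, v in ids_by_source.items() if k != name)
        )
        report[name] = {
            "total": len(ids),
            "unique": len(ids - others),
            "recall": len(ids & benchmark) / len(benchmark),
        }
    combined = set().union(*ids_by_source.values())
    report["combined"] = {
        "total": len(combined),
        "recall": len(combined & benchmark) / len(benchmark),
    }
    return report
```

The per-source "total" and "unique" counts are the numbers needed to label a Venn diagram of database overlap.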

Protocol 3: VOSviewer Analysis with Cross-Verified Data

This protocol details the creation of bibliometric maps using the cross-verified dataset, incorporating best practices for threshold selection.

I. Data Preparation and Import

  • Load the final, integrated dataset into VOSviewer.
  • Select the type of analysis (e.g., co-authorship, co-occurrence, citation).

II. Threshold Setting

  • Bibliometric analysis uses thresholds to select the minimum frequency of knowledge units, which helps extract a core, interpretable knowledge network [12].
  • Set thresholds for authors, keywords, or other units. There is no universal standard; the threshold is problem-dependent. Common thresholds are 3, 5, 10, 15, 20, or 30 [12].
  • Guideline: A lower threshold creates a larger, more complex network, while a higher threshold yields a more focused, core network. Start with a moderate threshold (e.g., 10) and adjust based on the network's interpretability.
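Before committing to a threshold, it can help to see how many items would survive at each candidate value. The sketch below counts, for each of the common thresholds listed above, how many keywords occur in at least that many documents. The input layout (one keyword list per document) is an assumption.

```python
from collections import Counter


def threshold_sweep(keyword_lists, thresholds=(3, 5, 10, 15, 20, 30)):
    """Map each threshold to the number of keywords that meet it.

    A keyword is counted once per document, however often it is repeated.
    """
    counts = Counter(k for kws in keyword_lists for k in set(kws))
    return {t: sum(1 for n in counts.values() if n >= t)
            for t in thresholds}
```

A sharp drop between two adjacent thresholds is a useful signal: the lower value keeps a long tail of rare terms, while the higher one leaves only the core vocabulary.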

III. Map Creation and Interpretation

  • Create the network map. VOSviewer will visually cluster items frequently occurring together.
  • Interpret the resulting maps by analyzing clusters, the position of items, and their links. Compare the network structure derived from the cross-verified data with what might have been obtained from a single source.

VOSviewer Analysis with Thresholding (diagram): cross-verified dataset -> VOSviewer analysis -> apply frequency threshold -> generate network map -> interpret clusters and links.

The integration and cross-verification of data from Scopus, Dimensions, and OpenAlex establishes a rigorous foundation for bibliometric analysis in environmental science. This multi-source approach leverages the respective strengths of each database—be it the curated metadata of Scopus or the open and inclusive nature of OpenAlex—while mitigating their individual biases. By adhering to the detailed protocols for data retrieval, harmonization, and validation outlined above, researchers can utilize VOSviewer to generate more accurate, reliable, and comprehensive maps of scientific knowledge, thereby enhancing the integrity and reproducibility of their research outcomes.

In the rapidly evolving field of environmental science, where research directly informs critical policy and conservation decisions, accurately assessing scholarly impact has never been more important. Traditional citation counts, while valuable for measuring academic influence, provide an incomplete picture of a study's true reach and significance. This limitation is particularly pronounced in environmental science, where research often influences policy, public awareness, and industrial practices beyond academic circles. The integration of bibliometric analysis using visualization tools like VOSviewer represents a paradigm shift in research assessment, enabling scholars to identify emerging trends, map intellectual networks, and contextualize citation metrics within broader scientific landscapes. This methodological approach allows researchers to transition from simply counting citations to understanding the complex relationships between ideas, authors, and institutions that drive scientific progress in environmental domains.

The evolution of bibliometric methodology from basic citation counting to sophisticated network analysis reflects a growing recognition that scientific impact is multidimensional. Modern bibliometric tools like VOSviewer, developed by researchers at Leiden University's Centre for Science and Technology Studies [5], enable both performance analysis and science mapping, providing insights into the structural and dynamic aspects of scientific research. For environmental scientists, this means being able to track the development of key concepts such as "blue economy," "environmental degradation," or "sustainable financial inclusion" across time and geographic boundaries, identifying not just what is being cited, but how ideas cluster and evolve in response to global environmental challenges.

The Evolution of Research Impact Assessment

From Simple Metrics to Complex Networks

The assessment of research impact has undergone significant transformation since the early days of simple citation counting. Bibliometrics, whose French precursor term bibliométrie was introduced by Otlet in 1934 and whose English name was coined by Pritchard in 1969 [53], has evolved from basic publication counts to sophisticated analyses of scientific networks and knowledge structures. This evolution mirrors the increasing complexity of environmental research itself, which requires interdisciplinary approaches to address multifaceted challenges like climate change, biodiversity loss, and sustainable development.

Traditional citation analysis, while useful for measuring academic influence, suffers from several limitations in environmental science contexts. It often favors established topics over emerging ones, overlooks non-academic impact, and fails to capture the relational aspects of knowledge production. The integration of tools like VOSviewer has addressed these gaps by enabling:

  • Co-authorship analysis: Mapping collaboration networks between researchers, institutions, and countries
  • Co-citation analysis: Identifying intellectual foundations and thematic relationships
  • Keyword co-occurrence: Revealing conceptual structure and emerging topics
  • Bibliographic coupling: Connecting documents that reference similar prior work

These advanced techniques allow environmental scientists to visualize the intricate knowledge networks that underlie scientific progress, moving beyond simple quantitative measures to qualitative understanding of how research ideas connect and evolve.
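
To make the keyword co-occurrence idea concrete, the following is a minimal sketch of how occurrence and co-occurrence counts could be derived from author keywords. It is not VOSviewer's own code; the function name `cooccurrence_counts`, the `min_occurrences` parameter, and the toy records are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(records, min_occurrences=1):
    """Count keyword occurrences and pairwise co-occurrences.

    records: iterable of keyword lists, one list per publication.
    min_occurrences: keywords found in fewer publications are dropped.
    """
    # Normalise case and de-duplicate keywords within each publication.
    cleaned = [
        sorted({kw.strip().lower() for kw in rec if kw.strip()})
        for rec in records
    ]
    occurrences = Counter(kw for rec in cleaned for kw in rec)
    kept = {kw for kw, n in occurrences.items() if n >= min_occurrences}

    links = Counter()
    for rec in cleaned:
        retained = [kw for kw in rec if kw in kept]
        # Every unordered pair of retained keywords in one publication
        # counts as one co-occurrence.
        for a, b in combinations(retained, 2):
            links[(a, b)] += 1
    return {kw: occurrences[kw] for kw in kept}, dict(links)


# Hypothetical toy records (illustrative keywords, not real data).
records = [
    ["Microplastics", "Marine pollution", "Toxicity"],
    ["microplastics", "toxicity"],
    ["Marine pollution", "Microplastics"],
    ["Blue economy"],
]
occ, links = cooccurrence_counts(records, min_occurrences=2)
print(occ)
print(links)
```

In this toy example "blue economy" appears only once, so it falls below the threshold and is excluded from the network, which mirrors the effect of a minimum-occurrence setting.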

Complementary Impact Metrics

While traditional citations remain important, several complementary metrics have emerged to provide a more nuanced understanding of research impact:

Table: Emerging Research Impact Metrics in Environmental Science

| Metric Category | Specific Examples | Application in Environmental Science |
| --- | --- | --- |
| Citation Enhancements | Field-Weighted Citation Impact, Citation Percentiles | Contextualizes citation performance within specific subfields like climate science or conservation biology |
| Alternative Metrics | Altmetric Attention Score, Social Media Mentions | Captures policy uptake, public engagement, and media coverage of environmental research findings |
| Network Metrics | Betweenness Centrality, Modularity Class | Identifies bridging studies that connect different research communities or thematic clusters |
| Temporal Metrics | Burst Detection, Half-Life | Pinpoints rapidly emerging topics and sustainability of research influence over time |

The Altmetric Attention Score (AAS) has particular relevance for environmental science, where research often informs public policy and conservation practice. This metric quantifies attention across news outlets, social media, policy documents, and other non-academic sources, capturing impact that traditional citations might miss [54]. For example, a study on plastic pollution might receive modest citation counts but generate significant policy discussions and public awareness, reflected in its AAS.

VOSviewer in Bibliometric Analysis

Software Capabilities and Functionality

VOSviewer (Visualization of Similarities viewer) is a specialized software tool for constructing and visualizing bibliometric networks that has become increasingly prominent in environmental science research. Developed by van Eck and Waltman at Leiden University [12] [5], this Java-based application provides user-friendly functionality for analyzing bibliometric data through multiple visualization techniques:

  • Network visualization: Displays items as nodes and relationships as links, with node size indicating importance and colors representing clusters
  • Overlay visualization: Similar to network visualization but uses color to represent a specific variable such as publication year or citation impact
  • Density visualization: Shows the density of items at any point in the map, with colors indicating the concentration of research activity

The software supports several types of bibliometric analysis particularly relevant to environmental research, including co-authorship (between researchers, organizations, countries), co-citation (of references, authors, journals), and co-occurrence (of keywords, terms) [12]. Recent versions have introduced improved color schemes such as "viridis" and "tab20" to replace the problematic rainbow palette, enhancing perceptual uniformity and accessibility for color-blind users [5].

Integration with Bibliometric Workflows

VOSviewer operates within a broader ecosystem of bibliometric tools and typically follows data extraction from major databases like Scopus and Web of Science. The software reads database exports (for example, Scopus CSV files and Web of Science plain-text files) as well as reference-manager formats such as RIS, and can process thousands of records at once, making it suitable for comprehensive analyses of environmental research domains. Note that citation-based analyses need an export that includes cited references, which reference-manager files generally lack. Its compatibility with other tools like CiteSpace [54] and Bibliometrix [55] allows researchers to combine multiple analytical approaches, validating findings through methodological triangulation.

For environmental scientists, VOSviewer's ability to handle large datasets is particularly valuable given the interdisciplinary and collaborative nature of the field. Studies analyzing trends in sustainable development [11], environmental degradation [16], or climate change research typically involve thousands of publications across multiple subdisciplines, requiring robust software capable of mapping complex knowledge structures without sacrificing analytical nuance.

Experimental Protocols for Bibliometric Analysis

Data Collection and Preprocessing Protocol

Table: Data Collection Protocol for Environmental Science Bibliometrics

| Step | Procedure | Tools | Quality Control |
| --- | --- | --- | --- |
| Database Selection | Select Scopus, Web of Science, or both based on coverage needs | Scopus, WoS | Compare initial results to assess database-specific biases |
| Search Strategy | Develop comprehensive search strings using Boolean operators | Database interfaces | Validate search sensitivity and specificity with test sets |
| Time Frame | Define appropriate temporal range (e.g., 2000-2025) | - | Justify time period based on research questions |
| Export Parameters | Export full record and cited references | RIS, Plain text | Verify complete metadata extraction |
| Data Cleaning | Remove duplicates, standardize terms, complete metadata | OpenRefine, Excel, R | Implement systematic deduplication protocol |

Protocol Details:

  • Database Selection: Choose between Scopus and Web of Science based on disciplinary coverage. Scopus generally provides broader coverage of environmental journals, while WoS offers more consistent citation data. For comprehensive analyses, use both databases and merge results after deduplication [56] [55].

  • Search Strategy Development: Formulate structured search queries using title-abstract-keyword fields. For example, in sustainable energy research: TITLE-ABS-KEY ("renewable energy" AND "policy" AND "developing countries"). Test search sensitivity by verifying known key papers are included [16].

  • Time Frame Determination: Select appropriate time frames based on research questions. For emerging trends, recent 5-10 year periods may suffice; for evolutionary analysis, longer timeframes (20+ years) are necessary [11].

  • Data Export: Export complete bibliographic records including citations, references, abstracts, and keywords. For VOSviewer analysis, the "full record and cited references" export option is recommended [54].

  • Data Cleaning: Implement systematic cleaning procedures:

    • Remove duplicate records using DOI and title matching
    • Standardize author names and affiliations (e.g., "Univ." versus "University")
    • Harmonize keyword variants (e.g., "USA" and "United States")
    • Complete missing metadata through manual checking
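
The cleaning steps above can be sketched in a few lines of Python. This is an illustrative outline under stated assumptions, not a prescribed procedure: the record layout (dictionaries with `doi` and `title` keys), the helper names, and the `KEYWORD_VARIANTS` map are all hypothetical, and real datasets usually need additional fuzzy matching and manual checks.

```python
import re


def normalize_title(title):
    """Lower-case a title and collapse punctuation and whitespace for matching."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record for each DOI, falling back to the normalised title."""
    seen_dois, seen_titles, unique = set(), set(), []
    for rec in records:
        doi = (rec.get("doi") or "").strip().lower()
        title = normalize_title(rec.get("title") or "")
        if (doi and doi in seen_dois) or (title and title in seen_titles):
            continue  # already seen under the same DOI or the same title
        if doi:
            seen_dois.add(doi)
        if title:
            seen_titles.add(title)
        unique.append(rec)
    return unique


# Hypothetical variant map; extend it for the dataset at hand.
KEYWORD_VARIANTS = {"usa": "united states", "u.s.a.": "united states"}


def harmonize_keywords(keywords):
    """Map keyword variants to one canonical form, preserving first-seen order."""
    harmonized = []
    for kw in keywords:
        key = kw.strip().lower()
        canonical = KEYWORD_VARIANTS.get(key, key)
        if canonical not in harmonized:
            harmonized.append(canonical)
    return harmonized


# Hypothetical records with made-up DOIs.
records = [
    {"doi": "10.1000/example.1", "title": "Microplastics in Rivers"},
    {"doi": "10.1000/EXAMPLE.1", "title": "Microplastics in rivers."},
    {"doi": "", "title": "Microplastics in Rivers"},
    {"doi": "10.1000/example.2", "title": "Urban Resilience Planning"},
]
unique = deduplicate(records)
print(len(unique))
print(harmonize_keywords(["USA", "United States", "Policy"]))
```

Matching on DOI first and title second is a deliberate choice: DOIs are the more reliable identifier, while title matching catches records exported without one.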

VOSviewer Analysis Protocol

Network Construction Steps:

  • Data Import: Load the processed data into VOSviewer using the "Create" function based on bibliographic database files.

  • Analysis Type Selection: Choose appropriate analysis type:

    • Co-authorship: For collaboration patterns among researchers, institutions, or countries
    • Co-occurrence: For conceptual structure analysis using author keywords or terms from titles/abstracts
    • Citation: For document, author, or journal impact networks
    • Bibliographic coupling: For relatedness between documents based on shared references
  • Threshold Setting: Apply frequency thresholds to focus on most relevant items. For environmental science reviews, typical initial thresholds might be:

    • Author co-citation: Minimum 20 citations
    • Keyword co-occurrence: Minimum 5 occurrences
    • Journal analysis: Minimum 10 documents [12]
  • Mapping Parameters:

    • Normalization: Association strength for co-occurrence, fractionalization for co-authorship
    • Clustering: Default VOS clustering algorithm
    • Layout: Attraction = 2, Repulsion = 0
  • Visualization Refinement:

    • Adjust label size and scaling for readability
    • Apply appropriate color schemes (viridis for overlay, tab20 for clusters)
    • Manual repositioning of overlapping labels
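
As a rough illustration of the association strength normalization named above, the sketch below divides each co-occurrence count by the product of the two items' occurrence counts, s_ij = c_ij / (w_i * w_j). This follows the general form described by van Eck and Waltman; VOSviewer's internal scaling may differ by a constant factor, and the function name and sample counts are assumptions for illustration only.

```python
def association_strength(cooccurrence, occurrences):
    """Normalise co-occurrence counts: s_ij = c_ij / (w_i * w_j)."""
    return {
        (a, b): count / (occurrences[a] * occurrences[b])
        for (a, b), count in cooccurrence.items()
    }


# Hypothetical counts: both pairs co-occur 8 times, but "policy" is the
# more frequent keyword, so its link is weaker after normalisation.
occurrences = {"climate change": 40, "adaptation": 10, "policy": 20}
cooccurrence = {
    ("climate change", "adaptation"): 8,
    ("climate change", "policy"): 8,
}
strength = association_strength(cooccurrence, occurrences)
for pair, value in sorted(strength.items(), key=lambda kv: -kv[1]):
    print(pair, round(value, 4))
```

The point of the normalization is visible in the example: raw counts treat the two links as equal, whereas the normalized values correct for the fact that frequent items co-occur more often by chance.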

Figure: Bibliometric Analysis Workflow. Define Research Objectives → Select Databases (Scopus/WoS) → Develop Search Strategy → Export Records → Data Cleaning → Import to VOSviewer → Select Analysis Type → Set Thresholds → Generate Map → Interpret Results → Report Findings

Interpretation and Validation Protocol

  • Cluster Analysis: Identify major thematic clusters based on network modularity. Label clusters by examining central terms and highly cited documents within each group.

  • Temporal Analysis: Use overlay visualization to track concept evolution. Color code by average publication year to identify emerging (recent) versus established (older) topics.

  • Network Metrics Interpretation:

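    • Betweenness centrality: Identifies bridge concepts connecting different research areas (VOSviewer itself reports links and total link strength, so this measure is usually computed after exporting the network to a separate network-analysis tool)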
    • Citation density: Measures cluster cohesion and maturity
    • Average publication year: Indicates temporal development of concepts
  • Validation Procedures:

    • Compare results with different threshold settings to assess robustness
    • Conduct sensitivity analysis by excluding specific journals or time periods
    • Triangulate findings with complementary tools (CiteSpace, Bibliometrix)
    • Verify conceptual clusters with domain expert feedback
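
Betweenness centrality, listed among the network metrics above, can be computed on an exported network with any graph library. The sketch below is a self-contained version of Brandes' algorithm for an undirected, unweighted graph; the adjacency-dictionary input and the toy keyword network are assumptions for illustration, and the scores are left unnormalised.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Unnormalised betweenness for an undirected, unweighted graph (Brandes)."""
    centrality = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search recording shortest-path counts and predecessors.
        order = []
        preds = {v: [] for v in adjacency}
        sigma = {v: 0 for v in adjacency}
        dist = {v: -1 for v in adjacency}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:  # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each undirected path was counted once from each endpoint.
    return {v: c / 2 for v, c in centrality.items()}


# Hypothetical keyword network: "policy" is the only link between two groups.
network = {
    "microplastics": ["toxicity", "policy"],
    "toxicity": ["microplastics", "policy"],
    "policy": ["microplastics", "toxicity", "green finance"],
    "green finance": ["policy", "esg"],
    "esg": ["green finance"],
}
scores = betweenness_centrality(network)
for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(node, score)
```

In the toy network, "policy" scores highest because every shortest path between the pollution keywords and the finance keywords runs through it, which is the pattern a bridge concept would show.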

Applications in Environmental Science Research

Bibliometric analysis using VOSviewer has revealed significant insights into the evolution of sustainable development research. A comprehensive analysis of Sustainable Inclusive Economic Growth (SIEG) within the SDG 8 framework documented a substantial increase in research output post-2015, with a notable surge after 2019 as global efforts toward the UN 2030 Agenda intensified [11]. The analysis identified China, India, and Italy as the most productive countries, while "Sustainability (Switzerland)" ranked as the leading journal in this domain.

Thematic evolution analysis revealed a distinct shift from earlier focus areas like financial inclusion and corporate social responsibility (2014-2023) toward emerging topics like digital economy, blue economy, employment, and entrepreneurship (2024-2025) [11]. This temporal mapping provides valuable intelligence for researchers and policymakers seeking to align investigations with evolving priorities in sustainability science.

Mapping Environmental Degradation Research

In environmental degradation research, a VOSviewer analysis of 1,365 papers reported an annual publication growth rate exceeding 80%, reflecting intensified global focus on sustainability challenges [16]. The analysis identified economic growth as the most frequently studied factor connected to environmental degradation, with particular emphasis on themes like renewable energy and the Environmental Kuznets Curve.
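
For readers who want to compute a comparable figure on their own data, one common definition of annual growth is the compound annual growth rate between the first and last year of the dataset. The sketch below uses that definition as an assumption; the cited study may have calculated its rate differently, and the yearly counts shown are hypothetical.

```python
def annual_growth_rate(counts_by_year):
    """Compound annual growth rate (%) between the first and last year."""
    years = sorted(counts_by_year)
    first = counts_by_year[years[0]]
    last = counts_by_year[years[-1]]
    periods = years[-1] - years[0]
    if periods <= 0 or first <= 0:
        raise ValueError("need at least two years and a positive starting count")
    return ((last / first) ** (1 / periods) - 1) * 100


# Hypothetical yearly publication counts that double every year.
counts = {2019: 10, 2020: 20, 2021: 40, 2022: 80}
rate = annual_growth_rate(counts)
print(round(rate, 1))
```

Because the result depends only on the first and last counts, a very small starting year can inflate the rate, so it is worth reporting the underlying counts alongside it.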

Network visualization demonstrated how energy consumption, globalization, and urbanization drive carbon emissions research, with China, Pakistan, and Turkey leading research output. The bibliometric approach helped identify emerging research hotspots, including the role of advanced technologies like artificial intelligence and the Metaverse, as well as behavioral and psychological factors influencing environmental degradation [16].

Analyzing Sustainable Financial Inclusion

A bibliometric analysis of sustainable financial inclusion research revealed eight distinct thematic clusters, including digital finance, ESG integration, green finance, and financial literacy, demonstrating the multidimensional nature of this evolving field [56]. The analysis documented rapid growth since 2017, led by China, India, and the United States, while also revealing geographic imbalances and underrepresentation of Sub-Saharan Africa and Central Asia regions.

The VOSviewer mapping identified major barriers to sustainable financial inclusion, including financial illiteracy and uncoordinated regulations among institutions, providing actionable intelligence for policymakers seeking to align inclusive finance with Sustainable Development Goals [56].

The Scientist's Toolkit: Research Reagent Solutions

Table: Essential Bibliometric Tools for Environmental Research Analysis

| Tool Name | Primary Function | Application in Environmental Science | Access |
| --- | --- | --- | --- |
| VOSviewer | Network visualization and clustering | Mapping thematic evolution in environmental research domains | Free download |
| Bibliometrix (R-tool) | Comprehensive bibliometric analysis | Performance analysis of countries, institutions, authors | R package |
| CiteSpace | Burst detection and temporal analysis | Identifying emerging trends and paradigm shifts | Free download |
| SCImago Graphica | Geographic mapping of research output | Visualizing regional contributions to environmental research | Free download |
| Google Scholar | Broad literature search | Complementary coverage beyond subscription databases | Web access |

Implementation Notes:

The effective application of these tools requires thoughtful integration into the research workflow. VOSviewer excels at visualization and cluster identification, while Bibliometrix provides more robust performance analysis capabilities. For environmental scientists studying rapidly evolving fields like climate change adaptation or plastic pollution research, CiteSpace's burst detection functionality can identify suddenly popular topics that might represent research fronts.

Many research groups employ a sequential approach where Bibliometrix is used for initial data screening and performance analysis, followed by VOSviewer for network construction and visualization, with CiteSpace adding specialized temporal analysis for trend identification. This tool combination provides methodological triangulation, strengthening the validity of bibliometric findings.

Visualization and Interpretation

Network Visualization Principles

Effective interpretation of VOSviewer maps requires understanding key visualization principles:

  • Node size: Typically represents importance metrics like publication count, citation frequency, or occurrence count
  • Node color: Indicates cluster membership (network view) or temporal development (overlay view)
  • Node distance: Reflects relatedness between items, with closer nodes being more strongly associated
  • Label size: Usually corresponds to node importance metrics
  • Cluster formation: Groups of closely connected nodes representing thematic areas

The software offers multiple visualization modes suited to different analytical questions in environmental research. Density visualization helps identify knowledge concentrations, overlay visualization reveals temporal trends, and network visualization displays relational structures between research constituents.

Figure: Network Map Interpretation Guide. A VOSviewer network map is read through three cues: node size (impact, e.g., citation count), node color (cluster or timeline), and node distance (relatedness; closer = stronger link). Together these are used to identify thematic clusters → find bridge concepts (high betweenness centrality) → analyze temporal patterns (overlay visualization) → identify research gaps (sparse network areas).

Advanced Interpretation Techniques

Beyond basic network interpretation, several advanced techniques enhance the value of bibliometric analysis for environmental research:

  • Overlay Visualization for Trend Analysis: Using the color gradient from blue (older) to yellow (newer) in the viridis color scheme to identify emerging topics at the research frontier [5].

  • Burst Detection: Identifying concepts with sudden increases in frequency, which may indicate emerging research fronts or responses to environmental crises.

  • Geographical Mapping: Integrating bibliometric findings with geographic visualization to reveal regional specialization and international collaboration patterns in environmental research.

  • Multilevel Analysis: Conducting simultaneous analysis at multiple levels (authors, institutions, countries) to understand scale-dependent patterns in knowledge production.

For example, in analyzing climate change adaptation research, overlay visualization might reveal shifting emphasis from general vulnerability assessment to specific resilience strategies, while geographical mapping could identify leading regions and potential collaboration opportunities for knowledge transfer.
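
The quantity behind a time-based overlay is simply the average publication year of the documents in which each item occurs. A minimal sketch of that calculation follows; the record layout (`year` and `keywords` fields) and the sample records are assumptions for illustration.

```python
from collections import defaultdict


def average_publication_year(records):
    """Average publication year of the documents in which each keyword occurs."""
    years = defaultdict(list)
    for rec in records:
        # Count each keyword once per document.
        for kw in {k.strip().lower() for k in rec["keywords"]}:
            years[kw].append(rec["year"])
    return {kw: sum(ys) / len(ys) for kw, ys in years.items()}


# Hypothetical records echoing the climate adaptation example above.
records = [
    {"year": 2012, "keywords": ["Vulnerability assessment", "Climate change adaptation"]},
    {"year": 2014, "keywords": ["Vulnerability assessment"]},
    {"year": 2021, "keywords": ["Resilience strategies", "Climate change adaptation"]},
    {"year": 2023, "keywords": ["Resilience strategies"]},
]
avg = average_publication_year(records)
for keyword, year in sorted(avg.items(), key=lambda kv: kv[1]):
    print(keyword, year)
```

Keywords with a later average year would appear toward the yellow end of a viridis overlay, while a keyword used steadily across the whole period sits in the middle of the scale.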

The Role of Bibliometrics in Shaping Research Agendas and Funding Priorities

Bibliometric analysis has evolved into a crucial methodology for quantitatively assessing scholarly literature, enabling the systematic mapping of research landscapes across scientific domains. This approach utilizes quantitative data analysis and network visualization to identify emerging trends, intellectual structures, and collaborative patterns within scientific literature [57]. In environmental science, where research questions are complex and funding resources are competitive, bibliometrics provides evidence-based insights that help shape research agendas and strategically allocate funding resources. The integration of specialized software tools like VOSviewer has significantly enhanced our capacity to process and visualize large bibliometric datasets, revealing patterns that might otherwise remain obscured in conventional literature reviews [58]. This application note examines the methodological protocols and practical applications of bibliometric analysis in guiding research priorities within environmental science, with specific focus on VOSviewer implementation.

Application Notes: Bibliometrics in Environmental Research

Bibliometric analyses have revealed several pivotal research trends and shifts in environmental science. Studies utilizing VOSviewer have demonstrated a substantial rise in research focusing on ecological product valuation and ecosystem services, particularly following international policy frameworks such as the United Nations Sustainable Development Goals [36]. The analysis of publication trends has enabled researchers to track the evolution from initial conceptual exploration to global cooperation and policy application phases in ecosystem service-based ecological risk assessment (ESRA) [58].

Research in environmental degradation has shown an accelerating publication growth rate exceeding 80% annually, with particular emphasis on themes like economic growth, renewable energy, and the Environmental Kuznets Curve [16]. The analysis of 1,365 research papers in this domain revealed that economic growth remains the most extensively studied factor, with China, Pakistan, and Turkey emerging as leading contributors to the research output [16].

In the microplastics research domain, bibliometrics has uncovered an explosive growth in publications, particularly from 2014 to 2023, with research expanding from marine environments to terrestrial and atmospheric systems [4]. This analysis has helped identify four major research clusters: distribution and sources, toxic effects, analytical methods, and interactions with other pollutants [4].

Table 1: Key Research Trends in Environmental Science Identified Through Bibliometric Analysis

| Research Domain | Primary Trends Identified | Temporal Pattern | Leading Contributing Countries |
| --- | --- | --- | --- |
| Ecological Product Value | Ecosystem services valuation, Policy frameworks | Two-phase growth: starting/exploring (1993-2010) and rapid development (2011-2023) | China, United States, European nations [36] |
| Environmental Degradation | Economic growth, Renewable energy, Environmental Kuznets Curve | Annual growth >80%, particularly accelerated since 2015 | China, Pakistan, Turkey [16] |
| Microplastic Pollution | Distribution pathways, Toxicological effects, Analytical methods | Explosive growth since 2014, 3,548 publications in 2022 alone | China, USA, UK, Australia, Canada [4] |
| Ecosystem Service Risk Assessment | Landscape ERA, Aquatic ecosystems, Ecosystem health | Four-stage evolution: initial development (1994-2005) to global cooperation | Not specified [58] |

Influence on Research Agendas

Bibliometric analysis directly shapes research agendas by identifying knowledge gaps and emerging frontiers in scientific literature. The visual mapping of keyword co-occurrence and evolution over time allows researchers to detect shifting priorities and underexplored areas requiring investigation [36]. For instance, in ecological product value research, bibliometric analysis has highlighted the need for more comprehensive value development, improved value realization pathways, and refined accounting methodologies [36].

The analysis of collaboration networks has revealed substantial international cooperation patterns, with countries like China, the United States, and the United Kingdom forming central hubs in microplastics research networks [4]. These insights help funding agencies promote strategic international partnerships and allocate resources to regions where research capacity building is most needed.

Citation analysis has further enabled the identification of seminal works and conceptual foundations within environmental research domains. For example, in process safety and environmental protection, bibliometric mapping revealed influential publications and research trends that have shaped the field's development over three decades [57]. This helps new researchers quickly grasp the intellectual structure of the field and identify foundational knowledge.

Impact on Funding Priorities

Funding agencies increasingly utilize bibliometric analysis to inform strategic prioritization and resource allocation. The quantitative assessment of publication outputs, citation impacts, and collaboration networks provides objective criteria for evaluating research productivity and impact [57]. Bibliometric indicators have become valuable tools for assessing the return on investment in research funding and identifying promising areas for future investment.

The analysis of research fronts and emerging topics allows funding agencies to support cutting-edge investigations in areas such as the role of advanced technologies like artificial intelligence and the Metaverse in environmental science, as well as behavioral and psychological factors influencing environmental degradation [16]. These analyses help anticipate future research directions rather than merely responding to past trends.

Bibliometric mapping has also supported the identification of interdisciplinary opportunities where environmental science converges with other domains. This enables funding agencies to promote cross-disciplinary initiatives that address complex environmental challenges through integrated approaches [58].

Experimental Protocols

Data Collection and Preprocessing Protocol

Objective: To systematically collect and preprocess bibliographic data from authoritative databases for analysis in VOSviewer.

Materials and Reagents:

  • Computer with internet access
  • Subscription access to Web of Science Core Collection and/or Scopus
  • Data storage system (local or cloud-based)
  • VOSviewer software (latest version)

Procedure:

  • Database Selection: Access Web of Science Core Collection via institutional subscription. Alternative: Scopus database for complementary coverage.
  • Search Query Formulation:
    • Define research topic using key terminology
    • Utilize Boolean operators (AND, OR, NOT) for comprehensive coverage
    • Example: TS=("ecological product valuation" OR "ecosystem services") AND DT=("ARTICLE" OR "REVIEW") AND LA=("ENGLISH") [36]
  • Time Span Specification: Set appropriate timeframe based on research objectives (e.g., 1990-2020 for historical trends [57])
  • Document Type Filtering: Select primarily "Article" and "Review" as these represent core research contributions [58]
  • Data Export: Download full record and cited references in plain text format
  • Data Cleaning:
    • Remove duplicate records
    • Standardize author names and affiliations
    • Verify consistency of citation information
  • Data Integration: Merge datasets from different databases, resolving reference format differences
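
Search strings like the example in the procedure above are easier to document and reproduce when they are assembled from term lists. The helper below is a small sketch of that idea; the function names are hypothetical, and it covers only the `TS`, `DT`, and `LA` field tags used in the example query.

```python
def quote_terms(terms):
    """Join terms as quoted phrases separated by OR."""
    return " OR ".join(f'"{term}"' for term in terms)


def build_wos_query(topics, doc_types=(), languages=()):
    """Assemble a query in the style of the TS/DT/LA example above."""
    parts = [f"TS=({quote_terms(topics)})"]
    if doc_types:
        parts.append(f"DT=({quote_terms(t.upper() for t in doc_types)})")
    if languages:
        parts.append(f"LA=({quote_terms(l.upper() for l in languages)})")
    return " AND ".join(parts)


query = build_wos_query(
    topics=["ecological product valuation", "ecosystem services"],
    doc_types=["Article", "Review"],
    languages=["English"],
)
print(query)
```

Keeping the term lists in a script or supplementary file makes it straightforward to report the exact search strategy and to rerun it when the dataset is updated.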

Troubleshooting Tips:

  • If result set is too large, refine search terms using more specific terminology
  • If result set is too small, broaden search terms and reduce restrictions
  • Verify database coverage periods to ensure comprehensive temporal coverage

VOSviewer Analysis Protocol

Objective: To analyze and visualize bibliometric networks using VOSviewer software.

Materials and Reagents:

  • Computer meeting VOSviewer system requirements
  • Preprocessed bibliographic data file
  • VOSviewer software (version 1.6.7 or later recommended for updated color schemes [5])

Procedure:

  • Software Setup:
    • Download and install VOSviewer from official source
    • Launch application and select "Create" to begin new analysis
  • Data Import:
    • Select appropriate data source type (e.g., Web of Science, Scopus, PubMed)
    • Import preprocessed data file
  • Analysis Type Selection:
    • Choose from co-authorship, co-occurrence, citation, or bibliographic coupling analyses
    • For research trend identification: select "keyword co-occurrence" [36]
  • Threshold Setting:
    • Define minimum number of occurrences for inclusion (typical range: 5-15) [57]
    • Adjust threshold based on dataset size and research objectives
  • Mapping Parameters:
    • Select normalization method (association strength recommended)
    • Choose clustering resolution (default typically effective)
    • Set visualization method (network, overlay, or density)
  • Visualization Refinement:
    • Apply appropriate color schemes (viridis recommended over rainbow for perceptual uniformity [5])
    • Adjust node size and label positioning for clarity
    • Utilize zoom and rotation functions to explore map details
  • Interpretation Aid:
    • Identify major clusters by color coding
    • Analyze node size as indicator of frequency or importance
    • Examine link thickness as indicator of relationship strength
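
At the data import step, VOSviewer can optionally apply a thesaurus file that merges variant labels before the map is built. The sketch below generates such a file, assuming the tab-separated two-column layout with a `label` and `replace by` header described in the VOSviewer manual; the function name, the output file name, and the replacement pairs are hypothetical.

```python
import csv
import io


def thesaurus_text(replacements):
    """Render a two-column, tab-separated thesaurus (label, replace by)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["label", "replace by"])
    for label, target in sorted(replacements.items()):
        writer.writerow([label, target])
    return buffer.getvalue()


# Hypothetical variant pairs; build the real list by inspecting the keyword table.
replacements = {
    "usa": "united states",
    "micro-plastics": "microplastics",
}
text = thesaurus_text(replacements)
print(text)

# To use it, save the text to a file and select that file in the import dialog:
# with open("thesaurus_keywords.txt", "w", encoding="utf-8") as handle:
#     handle.write(text)
```

Maintaining the thesaurus as a small script-generated file keeps keyword merging decisions explicit and repeatable across reruns of the analysis.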

Troubleshooting Tips:

  • If map is too dense, increase threshold values
  • If clusters are unclear, try different normalization methods
  • For co-authorship analyses of large datasets, use the option that ignores documents with a large number of authors, so that hyperauthored papers do not dominate the map [5]

Trend Analysis and Forecasting Protocol

Objective: To identify evolutionary trends and forecast future research directions.

Materials and Reagents:

  • VOSviewer with overlay visualization capability
  • Bibliographic dataset with temporal information
  • Additional software for complementary analysis (optional: CiteSpace, SciMAT)

Procedure:

  • Temporal Visualization:
    • Select "overlay visualization" in VOSviewer
    • Choose time-based coloring (e.g., average publication year)
    • Identify temporal patterns in research focus [5]
  • Burst Detection:
    • Analyze citation bursts or keyword frequency surges
    • Identify rapidly emerging topics and declining interests
  • Thematic Evolution Analysis:
    • Divide timeframe into sequential periods (typically 3-5 year intervals)
    • Compare keyword clusters across periods
    • Track conceptual shifts and emerging specialties [58]
  • Research Frontier Identification:
    • Combine recent frequency with high growth rates
    • Identify weakly connected concepts suggesting innovative combinations
  • Validation:
    • Cross-reference findings with expert opinion
    • Compare with policy developments and societal trends
    • Assess consistency across different analytical approaches
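
The thematic evolution step can be approximated outside VOSviewer by splitting records into periods and comparing the keyword sets. The following sketch shows one simple way to do this; the period boundaries, function names, and sample records are assumptions for illustration, and dedicated tools such as SciMAT implement more elaborate versions of the same idea.

```python
from collections import Counter


def keywords_by_period(records, periods):
    """Count keyword occurrences within each (start_year, end_year) period."""
    counts = {period: Counter() for period in periods}
    for rec in records:
        for start, end in periods:
            if start <= rec["year"] <= end:
                # Count each keyword once per document.
                counts[(start, end)].update(
                    {k.strip().lower() for k in rec["keywords"]}
                )
                break
    return counts


def emerging_keywords(counts, earlier, later):
    """Keywords present in the later period but absent from the earlier one."""
    return sorted(set(counts[later]) - set(counts[earlier]))


# Hypothetical records; keywords echo themes mentioned earlier in this guide.
records = [
    {"year": 2015, "keywords": ["Financial inclusion", "Corporate social responsibility"]},
    {"year": 2018, "keywords": ["Financial inclusion"]},
    {"year": 2024, "keywords": ["Digital economy", "Financial inclusion"]},
    {"year": 2025, "keywords": ["Blue economy", "Digital economy"]},
]
periods = [(2014, 2023), (2024, 2025)]
counts = keywords_by_period(records, periods)
emerging = emerging_keywords(counts, periods[0], periods[1])
print(emerging)
```

With real data, each period should contain enough publications for the comparison to be meaningful, in line with the troubleshooting advice on data points per period.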

Troubleshooting Tips:

  • Ensure adequate data points per time period for statistical reliability
  • Use multiple complementary indicators for trend validation
  • Consider external factors (policy changes, technological breakthroughs) that might influence trends

Visualization Schemes

Bibliometric Analysis Workflow

[Figure: workflow diagram. Define Research Objectives → Data Collection & Preprocessing (Database Selection → Query Formulation → Data Filtering → Data Export → Data Cleaning) → VOSviewer Analysis (Data Import → Analysis Type Selection → Mapping Parameters → Visual Refinement) → Visualization & Interpretation → Research & Funding Applications]

Research Agenda Development Process

[Figure: process diagram. Bibliometric Analysis feeds four activities: Trend Identification → Emerging Research Themes; Gap Analysis → Research Priority Areas; Collaboration Mapping → Strategic Partnerships; Impact Assessment → Funding Allocation. Emerging Research Themes also inform Research Priority Areas, and Strategic Partnerships also inform Funding Allocation.]

The Scientist's Toolkit

Table 2: Essential Research Reagents and Tools for Bibliometric Analysis

| Tool/Resource | Function | Application in Environmental Science |
| --- | --- | --- |
| VOSviewer Software | Network visualization and analysis | Creating co-authorship, co-occurrence, and citation networks; identifying research trends [57] |
| Web of Science Core Collection | Primary bibliographic database | Comprehensive coverage of high-impact environmental science literature [36] |
| Scopus Database | Complementary bibliographic database | Expanded journal coverage for more comprehensive analysis [16] |
| CiteSpace Software | Complementary analysis tool | Temporal pattern analysis and burst detection [58] |
| SciMAT Software | Science mapping analysis | Thematic evolution analysis and strategic diagram creation [58] |
| Boolean Search Operators | Query formulation | Precise literature retrieval using logical combinations [36] |
| Viridis Color Scheme | Visualization enhancement | Perceptually uniform coloring for trend visualization [5] |

Bibliometric analysis, particularly when implemented through specialized tools like VOSviewer, provides robust methodological frameworks for shaping research agendas and funding priorities in environmental science. The protocols outlined in this document offer standardized approaches for data collection, analysis, and interpretation that enable evidence-based decision-making in research planning and resource allocation. As environmental challenges continue to evolve in complexity, bibliometric methods will play an increasingly vital role in identifying emerging research fronts, fostering strategic collaborations, and ensuring that funding priorities align with the most pressing scientific and societal needs. The integration of these quantitative approaches with domain expertise represents a powerful paradigm for advancing environmental science in the coming decades.

Conclusion

VOSviewer emerges as an indispensable tool for navigating the complex and rapidly evolving landscape of environmental science research. By mastering its capabilities for foundational exploration, methodological application, troubleshooting, and validation, researchers can systematically decode research trends, identify influential works and collaborations, and pinpoint emerging frontiers—from resilient cities to microplastic pollution. This synthesis of visual bibliometrics and domain expertise not only enriches literature reviews but also actively shapes future research directions and resource allocation. As environmental challenges grow in complexity, the ability to conduct robust, data-driven analyses of scientific literature will be crucial for accelerating discovery and innovation. Future advancements in VOSviewer and integrated bibliometric methods promise even greater precision in tracking the development and impact of environmental research, ultimately contributing to more informed and effective scientific progress.

References